Adding prompt and completion data to spans (as high-cardinality data on Observations) was previously implemented, but we removed it in 1.0.0-RC1 and moved the data to logs due to several issues:
- The implementation used the OTel SDK directly, so it did not work with other tracers
- The implementation used deprecated OTel SDK APIs
- The OTel semantic conventions we followed were also deprecated; instead of adding this data to spans, the new conventions move it to logs
- The data was not limited in size, which could have been problematic for backends receiving it
- It did not work for ChatClient in reactive/streaming scenarios
See connected issues/changes/PRs:
- https://github.com/spring-projects/spring-ai/pull/1237
- https://github.com/spring-projects/spring-ai/issues/3401
- https://github.com/spring-projects/spring-ai/issues/3257
- https://github.com/spring-projects/spring-ai/pull/3003
- https://github.com/spring-projects/spring-ai/commit/ca843e85887aa1da6300c77550c379c103500897
- https://github.com/spring-projects/spring-ai/pull/4417
- https://github.com/spring-projects/spring-ai/pull/3612
I believe we should be able to implement this in a way that eliminates most of the issues above:
- We can use the Observation API instead of the OTel SDK, so this can work with different backends without using any deprecated APIs
- Whether or not the Langfuse documentation is stable, the OTel SemConv is not, which makes the attribute names a moving target; I'm not sure there is a good solution here, but making the keys configurable is probably good enough?
- We can limit the size of the prompt/completion attached to the Observation, with a configurable limit (see the sketch below)
- Getting the prompt/completion in the reactive/streaming use case was fixed by #4417 in 1.1.0
Older proof of concept: promt-on-span (also contains changes that were already merged in #4417).
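For the size limiting, here is a minimal sketch using Micrometer's ObservationFilter. The key names (`gen_ai.prompt`/`gen_ai.completion`) and the limit are assumptions for illustration, not actual Spring AI configuration:

```java
import java.util.Set;

import io.micrometer.common.KeyValue;
import io.micrometer.observation.Observation;
import io.micrometer.observation.ObservationFilter;

// Sketch: caps the size of selected high-cardinality key-values before they
// reach the ObservationHandlers (and thus the tracing backend).
public class ContentTruncatingObservationFilter implements ObservationFilter {

	// Hypothetical key names; the real ones would come from the (configurable) conventions
	private static final Set<String> KEYS_TO_LIMIT = Set.of("gen_ai.prompt", "gen_ai.completion");

	private final int maxLength; // e.g. bound to a configuration property

	public ContentTruncatingObservationFilter(int maxLength) {
		this.maxLength = maxLength;
	}

	@Override
	public Observation.Context map(Observation.Context context) {
		// getHighCardinalityKeyValues() returns a snapshot, so mutating the
		// context while iterating over it is safe
		for (KeyValue keyValue : context.getHighCardinalityKeyValues()) {
			String value = keyValue.getValue();
			if (KEYS_TO_LIMIT.contains(keyValue.getKey()) && value.length() > this.maxLength) {
				// Re-adding under the same key replaces the previous value
				context.addHighCardinalityKeyValue(
						KeyValue.of(keyValue.getKey(), value.substring(0, this.maxLength)));
			}
		}
		return context;
	}

}
```

Registered via `observationRegistry.observationConfig().observationFilter(new ContentTruncatingObservationFilter(8192))`, such a filter would apply to any backend that hooks into the Observation API, not just the OTel SDK.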
Comment From: markpollack
@ThomasVitale thoughts?
Comment From: zj611
Recommended metrics to add:
- ToolCallingObservationContext (see the sketch after this list):
  1. Add tool types, such as function/extension/datastore
  2. Add request types, such as Local/MCP/Skill; these tags would clearly help with observation
- ChatModelObservationContext, for Flux streaming responses:
  1. Add the initial response time
  2. Support passing relevant properties from ChatClientObservationContext
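These would presumably be tags (key-values) on the existing observations rather than separate metrics. A minimal sketch of the first idea via Micrometer's ObservationConvention; the context class and key names below are hypothetical stand-ins, since the actual accessors on Spring AI's ToolCallingObservationContext may differ:

```java
import io.micrometer.common.KeyValue;
import io.micrometer.common.KeyValues;
import io.micrometer.observation.Observation;
import io.micrometer.observation.ObservationConvention;

// Hypothetical context standing in for Spring AI's ToolCallingObservationContext
class ToolCallContext extends Observation.Context {

	private final String toolType; // e.g. "function", "extension", "datastore"

	private final String requestType; // e.g. "local", "mcp", "skill"

	ToolCallContext(String toolType, String requestType) {
		this.toolType = toolType;
		this.requestType = requestType;
	}

	String getToolType() { return this.toolType; }

	String getRequestType() { return this.requestType; }

}

// Sketch: contributes the proposed tags as low-cardinality key-values
// (bounded value sets, so they would also be usable on metrics)
class ToolCallObservationConvention implements ObservationConvention<ToolCallContext> {

	@Override
	public KeyValues getLowCardinalityKeyValues(ToolCallContext context) {
		// Illustrative key names, not an established convention
		return KeyValues.of(
				KeyValue.of("ai.tool.type", context.getToolType()),
				KeyValue.of("ai.tool.request.type", context.getRequestType()));
	}

	@Override
	public boolean supportsContext(Observation.Context context) {
		return context instanceof ToolCallContext;
	}

}
```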
Comment From: jonatan-ivanov
@zj611 I'm not sure any of these are connected to this issue, but please feel free to open a new one.