Prompt and completion data being added to spans (to Observations, as high-cardinality data) was previously implemented, but we removed it in 1.0.0-RC1 and moved the data to logs due to several issues:

- The implementation directly used the OTel SDK, so it did not work with other tracers
- The implementation used deprecated APIs of the OTel SDK
- The OTel Semantic Conventions were also deprecated; instead of adding this data to spans, they moved it to logs
- The data was not limited in size, which could have been problematic for backends receiving it
- This did not work for ChatClient in reactive/streaming scenarios

See connected issues/changes/PRs:

- https://github.com/spring-projects/spring-ai/pull/1237
- https://github.com/spring-projects/spring-ai/issues/3401
- https://github.com/spring-projects/spring-ai/issues/3257
- https://github.com/spring-projects/spring-ai/pull/3003
- https://github.com/spring-projects/spring-ai/commit/ca843e85887aa1da6300c77550c379c103500897
- https://github.com/spring-projects/spring-ai/pull/4417
- https://github.com/spring-projects/spring-ai/pull/3612

I believe we should be able to implement this in a way that eliminates most of the issues above:

- We can use the Observation API instead of the OTel SDK, so this works with different backends and uses no deprecated APIs (see the sketch after this list)
- Regardless of whether the Langfuse documentation is stable, the OTel SemConv is not, so the key names are a moving target; I'm not sure there is a good solution here, but making the keys configurable is probably good enough
- We can limit the size of the prompt/completion that is attached to the Observation (with a configurable limit)
- Getting the prompt/completion in the reactive/streaming use case was fixed by #4417 in 1.1.0
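
A minimal sketch of the first and third points, assuming Micrometer's Observation API; the class name, the key names, and the configuration wiring are hypothetical, not existing Spring AI API:

```java
import io.micrometer.common.KeyValue;
import io.micrometer.observation.Observation;

// Hypothetical helper, not part of Spring AI: attaches (truncated)
// prompt/completion content to an Observation as high-cardinality key-values.
public final class PromptContentObservationSupport {

    private final String promptKey;      // configurable key name, e.g. "gen_ai.prompt"
    private final String completionKey;  // configurable key name, e.g. "gen_ai.completion"
    private final int maxLength;         // configurable size limit

    public PromptContentObservationSupport(String promptKey, String completionKey, int maxLength) {
        this.promptKey = promptKey;
        this.completionKey = completionKey;
        this.maxLength = maxLength;
    }

    public void attach(Observation observation, String prompt, String completion) {
        observation.highCardinalityKeyValue(KeyValue.of(this.promptKey, truncate(prompt)));
        observation.highCardinalityKeyValue(KeyValue.of(this.completionKey, truncate(completion)));
    }

    private String truncate(String value) {
        if (value == null) {
            return "";
        }
        return value.length() <= this.maxLength ? value : value.substring(0, this.maxLength);
    }
}
```

Since this only goes through the Observation API, any registered handler (OTel, Brave, logging, etc.) can receive the data, and the truncation protects backends from unbounded payloads.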

Older proof of concept: promt-on-span (it also contains changes that were already merged in #4417).

Comment From: markpollack

@ThomasVitale thoughts?

Comment From: zj611

Recommended metrics to add:

1. ToolCallingObservationContext (see the sketch below)

   (1) Add tool types, such as function/extension/datastore.

   (2) Add request types, such as Local/MCP/Skill; these metrics are obviously helpful for observation.

2. ChatModelObservationContext, for Flux streaming responses:

   (1) Add the initial response time.

   (2) Support passing relevant property parameters from ChatClientObservationContext.
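
A rough sketch of how such metadata could be contributed without changing the core instrumentation, using a Micrometer ObservationFilter; the "tool.type" context entry and the key name are assumptions for illustration, not existing Spring AI conventions:

```java
import io.micrometer.common.KeyValue;
import io.micrometer.observation.Observation;
import io.micrometer.observation.ObservationFilter;

// Illustration only: enriches observations with a tool-type tag,
// assuming the instrumentation stored "tool.type" in the context.
public class ToolMetadataObservationFilter implements ObservationFilter {

    @Override
    public Observation.Context map(Observation.Context context) {
        String toolType = context.get("tool.type"); // hypothetical context entry
        if (toolType != null) {
            context.addLowCardinalityKeyValue(KeyValue.of("spring.ai.tool.type", toolType));
        }
        return context;
    }
}
```

Such a filter would be registered via observationRegistry.observationConfig().observationFilter(...).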

Comment From: jonatan-ivanov

@zj611 I'm not sure any of these are connected to this issue, but please feel free to open a new one.