Adding prompt and completion data to spans (as high-cardinality data on Observations) was previously implemented, but we removed it in 1.0.0-RC1 and moved the data to logs due to several issues:
- The implementation used the OTel SDK directly, so it did not work with other tracers
- The implementation used deprecated OTel SDK APIs
- The OTel semantic conventions we followed were also deprecated; instead of adding this data to spans, the new conventions move it to logs
- The data was not limited in size, which could have been problematic for backends receiving it
- It did not work for ChatClient in reactive/streaming scenarios
See connected issues/changes/PRs:
- https://github.com/spring-projects/spring-ai/pull/1237
- https://github.com/spring-projects/spring-ai/issues/3401
- https://github.com/spring-projects/spring-ai/issues/3257
- https://github.com/spring-projects/spring-ai/pull/3003
- https://github.com/spring-projects/spring-ai/commit/ca843e85887aa1da6300c77550c379c103500897
- https://github.com/spring-projects/spring-ai/pull/4417
- https://github.com/spring-projects/spring-ai/pull/3612
I believe we should be able to implement this in a way that eliminates most of the issues above:
- We can use the Observation API instead of the OTel SDK, so this can work with different backends without using any deprecated APIs
- Whether or not the Langfuse documentation is stable, the OTel SemConv is not, which makes the attribute names a moving target; I'm not sure there is a good solution here, but making the keys configurable is probably good enough?
- We can limit the size of the prompt/completion attached to the Observation, with a configurable limit (see the sketch below)
- Getting the prompt/completion in the reactive/streaming use case was fixed by #4417 in 1.1.0
Older proof of concept: promt-on-span (also contains changes that were already merged in #4417).
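For the size limiting, here is a minimal sketch using Micrometer's ObservationFilter. The key names (`gen_ai.prompt`/`gen_ai.completion`) and the limit are assumptions for illustration, not actual Spring AI configuration:

```java
import java.util.Set;

import io.micrometer.common.KeyValue;
import io.micrometer.observation.Observation;
import io.micrometer.observation.ObservationFilter;

// Sketch: caps the size of selected high-cardinality key-values before they
// reach the ObservationHandlers (and thus the tracing backend).
public class ContentTruncatingObservationFilter implements ObservationFilter {

	// Hypothetical key names; the real ones would come from the (configurable) conventions
	private static final Set<String> KEYS_TO_LIMIT = Set.of("gen_ai.prompt", "gen_ai.completion");

	private final int maxLength; // e.g. bound to a configuration property

	public ContentTruncatingObservationFilter(int maxLength) {
		this.maxLength = maxLength;
	}

	@Override
	public Observation.Context map(Observation.Context context) {
		// getHighCardinalityKeyValues() returns a snapshot, so mutating the
		// context while iterating over it is safe
		for (KeyValue keyValue : context.getHighCardinalityKeyValues()) {
			String value = keyValue.getValue();
			if (KEYS_TO_LIMIT.contains(keyValue.getKey()) && value.length() > this.maxLength) {
				// Re-adding under the same key replaces the previous value
				context.addHighCardinalityKeyValue(
						KeyValue.of(keyValue.getKey(), value.substring(0, this.maxLength)));
			}
		}
		return context;
	}

}
```

Registered via `observationRegistry.observationConfig().observationFilter(new ContentTruncatingObservationFilter(8192))`, such a filter would apply to any backend that hooks into the Observation API, not just the OTel SDK.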
Comment From: markpollack
@ThomasVitale thoughts?
Comment From: zj611
Recommended metrics to add:
- ToolCallingObservationContext (see the sketch after this list):
  1. Add tool types, such as function/extension/datastore
  2. Add request types, such as Local/MCP/Skill; these tags would clearly help with observation
- ChatModelObservationContext, for Flux streaming responses:
  1. Add the initial response time
  2. Support passing relevant properties from ChatClientObservationContext
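These would presumably be tags (key-values) on the existing observations rather than separate metrics. A minimal sketch of the first idea via Micrometer's ObservationConvention; the context class and key names below are hypothetical stand-ins, since the actual accessors on Spring AI's ToolCallingObservationContext may differ:

```java
import io.micrometer.common.KeyValue;
import io.micrometer.common.KeyValues;
import io.micrometer.observation.Observation;
import io.micrometer.observation.ObservationConvention;

// Hypothetical context standing in for Spring AI's ToolCallingObservationContext
class ToolCallContext extends Observation.Context {

	private final String toolType; // e.g. "function", "extension", "datastore"

	private final String requestType; // e.g. "local", "mcp", "skill"

	ToolCallContext(String toolType, String requestType) {
		this.toolType = toolType;
		this.requestType = requestType;
	}

	String getToolType() { return this.toolType; }

	String getRequestType() { return this.requestType; }

}

// Sketch: contributes the proposed tags as low-cardinality key-values
// (bounded value sets, so they would also be usable on metrics)
class ToolCallObservationConvention implements ObservationConvention<ToolCallContext> {

	@Override
	public KeyValues getLowCardinalityKeyValues(ToolCallContext context) {
		// Illustrative key names, not an established convention
		return KeyValues.of(
				KeyValue.of("ai.tool.type", context.getToolType()),
				KeyValue.of("ai.tool.request.type", context.getRequestType()));
	}

	@Override
	public boolean supportsContext(Observation.Context context) {
		return context instanceof ToolCallContext;
	}

}
```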
Comment From: jonatan-ivanov
@zj611 I'm not sure any of these are connected to this issue, but please feel free to open a new one.