Based on the Semantic Conventions 1.27.0 (see docs). This issue tracks the work I've been doing for the initial Spring AI adoption of the OpenTelemetry Semantic Conventions for GenAI Operations.
## Traces

### Span Names
Span Name | Spring AI | Requirement |
---|---|---|
{gen_ai.operation.name} {gen_ai.request.model} | ✅ | Recommended |

For example, a chat operation against the gpt-4 model produces the span name `chat gpt-4`.

### Span Kinds
Span Kind | Spring AI | Requirement | Comments |
---|---|---|---|
CLIENT/INTERNAL | ✅ | Required | The kind is currently INTERNAL (background info on why: https://github.com/open-telemetry/semantic-conventions/issues/1315). |

### Span Attributes
Span Attribute | Type | Spring AI | Requirement | Comments |
---|---|---|---|---|
gen_ai.operation.name | string | ✅ | Required | |
gen_ai.request.model | string | ✅ | Required | |
gen_ai.system | string | ✅ | Required | |
error.type | string | ⬇️ | Conditionally Required if the operation ended in an error | There isn't currently a plan to support it in Spring AI. Error information is available as a span event and in the child HTTP span auto-configured by Spring Boot. If you encounter a production use case where that is not enough, please raise an issue describing your scenario so we can evaluate if and how to add this attribute. |
server.port | int | ⬇️ | Conditionally Required if server.address is set | Spring AI treats this attribute as optional: there isn't currently a plan to support it. You can find this information in the child HTTP span auto-configured by Spring Boot. If you encounter a production use case where that is not enough, please raise an issue describing your scenario so we can evaluate if and how to add this attribute. |
gen_ai.request.frequency_penalty | double | ✅ | Recommended | |
gen_ai.request.max_tokens | int | ✅ | Recommended | |
gen_ai.request.presence_penalty | double | ✅ | Recommended | |
gen_ai.request.stop_sequences | string[] | ✅ | Recommended | |
gen_ai.request.temperature | double | ✅ | Recommended | |
gen_ai.request.top_k | double | ✅ | Recommended | |
gen_ai.request.top_p | double | ✅ | Recommended | |
gen_ai.response.finish_reasons | string[] | ✅ | Recommended | |
gen_ai.response.id | string | ✅ | Recommended | |
gen_ai.response.model | string | ✅ | Recommended | |
gen_ai.usage.input_tokens | int | ✅ | Recommended | |
gen_ai.usage.output_tokens | int | ✅ | Recommended | |
server.address | string | ⬇️ | Recommended | Spring AI treats this attribute as optional: there isn't currently a plan to support it. You can find this information in the child HTTP span auto-configured by Spring Boot. If you encounter a production use case where that is not enough, please raise an issue describing your scenario so we can evaluate if and how to add this attribute. |

### Event Names
Event Name | Spring AI | Requirement | Comments |
---|---|---|---|
gen_ai.content.prompt | ✅ | Optional | Coming soon; currently being implemented. |
gen_ai.content.completion | ✅ | Optional | Coming soon; currently being implemented. |

### Event Attributes
Event Attribute | Type | Spring AI | Requirement | Comments |
---|---|---|---|---|
gen_ai.prompt | string | ✅ | Conditionally Required if and only if the corresponding event is enabled | Coming soon; currently being implemented. For now, it's available as a span attribute. |
gen_ai.completion | string | ✅ | Conditionally Required if and only if the corresponding event is enabled | Coming soon; currently being implemented. For now, it's available as a span attribute. |
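
While the built-in events are being implemented, here is a minimal sketch of the underlying mechanism: recording an event on a Micrometer Observation, which the Micrometer-to-OTel bridge maps to a span event. The observation and event names follow the tables above, but this is not Spring AI's implementation, just an illustration:

```java
import io.micrometer.observation.Observation;
import io.micrometer.observation.ObservationRegistry;

public class PromptEventSketch {

    public static void main(String[] args) {
        // Plain registry for the sketch; a Spring Boot app would inject the
        // auto-configured ObservationRegistry instead.
        ObservationRegistry registry = ObservationRegistry.create();

        Observation observation = Observation.start("chat gpt-4", registry);
        try {
            // Recorded as a span event by the Micrometer OTel bridge.
            observation.event(Observation.Event.of("gen_ai.content.prompt"));
            // ... invoke the chat model here ...
        }
        finally {
            observation.stop();
        }
    }
}
```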

## Metrics

### Metric: Token Usage
Metric Name | Type | Unit | Spring AI | Requirement | Comments |
---|---|---|---|---|---|
gen_ai.client.token.usage | Histogram | {token} | ✅ | Recommended | Available |

Metric Attribute | Type | Spring AI | Requirement | Comments |
---|---|---|---|---|
gen_ai.operation.name | string | ✅ | Required | |
gen_ai.request.model | string | ✅ | Required | |
gen_ai.system | string | ✅ | Required | |
gen_ai.token.type | string | ✅ | Required | |
server.port | int | ⬇️ | Conditionally Required if server.address is set | Spring AI treats this attribute as optional: there isn't currently a plan to support it. You can find this information as part of the HTTP metrics auto-configured by Spring Boot. If you encounter a production use case where that is not enough, please raise an issue describing your scenario so we can evaluate if and how to add this attribute. |
gen_ai.response.model | string | ✅ | Recommended | |
server.address | string | ⬇️ | Recommended | Spring AI treats this attribute as optional: there isn't currently a plan to support it. You can find this information as part of the HTTP metrics auto-configured by Spring Boot. If you encounter a production use case where that is not enough, please raise an issue describing your scenario so we can evaluate if and how to add this attribute. |

### Metric: Operation Duration
Metric Name | Type | Unit | Spring AI | Requirement | Comments |
---|---|---|---|---|---|
gen_ai.client.operation.duration | Histogram | s | ✅ | Recommended | Available |

Metric Attribute | Type | Spring AI | Requirement | Comments |
---|---|---|---|---|
gen_ai.operation.name | string | ✅ | Required | |
gen_ai.request.model | string | ✅ | Required | |
gen_ai.system | string | ✅ | Required | |
error.type | string | ⬇️ | Conditionally Required if the operation ended in an error | There isn't currently a plan to support it in Spring AI. You can find this information as part of the HTTP metrics auto-configured by Spring Boot. If you encounter a production use case where that is not enough, please raise an issue describing your scenario so we can evaluate if and how to add this attribute. |
server.port | int | ⬇️ | Conditionally Required if server.address is set | Spring AI treats this attribute as optional: there isn't currently a plan to support it. You can find this information as part of the HTTP metrics auto-configured by Spring Boot. If you encounter a production use case where that is not enough, please raise an issue describing your scenario so we can evaluate if and how to add this attribute. |
gen_ai.response.model | string | ✅ | Recommended | |
server.address | string | ⬇️ | Recommended | Spring AI treats this attribute as optional: there isn't currently a plan to support it. You can find this information as part of the HTTP metrics auto-configured by Spring Boot. If you encounter a production use case where that is not enough, please raise an issue describing your scenario so we can evaluate if and how to add this attribute. |
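
For reference, a small sketch of how the two metrics can be looked up through Micrometer once recorded. The meter and tag names are assumed to match the conventional names in the tables above; the token histogram surfaces as a DistributionSummary and the duration as a Timer:

```java
import java.util.concurrent.TimeUnit;

import io.micrometer.core.instrument.DistributionSummary;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class GenAiMetricsLookupSketch {

    public static void main(String[] args) {
        // In a Spring Boot app this registry is auto-configured and injectable.
        MeterRegistry meterRegistry = new SimpleMeterRegistry();

        // Token usage: filter by token type to separate input from output tokens.
        DistributionSummary tokenUsage = meterRegistry.find("gen_ai.client.token.usage")
                .tag("gen_ai.token.type", "input")
                .summary();
        if (tokenUsage != null) {
            System.out.println("input tokens: " + tokenUsage.totalAmount());
        }

        // Operation duration, recorded in seconds.
        Timer duration = meterRegistry.find("gen_ai.client.operation.duration").timer();
        if (duration != null) {
            System.out.println("mean duration: " + duration.mean(TimeUnit.SECONDS) + "s");
        }
    }
}
```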
Comment From: lmolkova
A user application may be configured to disable (not enable, suppress, sample out, or drop) the HTTP/gRPC/underlying-protocol instrumentation. That's the main reason OTel uses kind CLIENT, and also one of the reasons to have server.address|port and error.type on the LLM spans/metrics.
Other reasons to report server.address|port and error.type on LLM telemetry are:
- HTTP client metrics don't have context to differentiate operations
- the error rate on the HTTP layer does not represent the error rate of the logical LLM operation (LLM operations may be retried under the hood)
- the same applies to latency/throughput
- generic HTTP instrumentations have limited capacity to describe streaming calls: they don't always know when/if the stream ends and are likely to record duration up to the response headers (not to end-of-stream). As a result, they won't include errors that happen when the connection drops or times out during an SSE stream
- users would have a much easier time analyzing their telemetry if they could build LLM-specific dashboards/alerts (i.e. if they were able to differentiate endpoints and errors on LLM telemetry instead of deriving them from the HTTP telemetry); see the sketch below for one way to add such attributes today
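
To illustrate the last point: until the framework adds these attributes, one possible workaround is a Micrometer ObservationFilter that tags chat-model observations with the endpoint. A minimal sketch, assuming Spring AI's ChatModelObservationContext type and a hard-coded host purely for illustration:

```java
import io.micrometer.common.KeyValue;
import io.micrometer.observation.ObservationFilter;
import io.micrometer.observation.ObservationRegistry;
import org.springframework.ai.chat.observation.ChatModelObservationContext;

public class ServerAddressFilterSketch {

    public static void main(String[] args) {
        ObservationRegistry registry = ObservationRegistry.create();

        // Tag every chat-model observation with the endpoint so LLM-specific
        // dashboards can differentiate by host. The host is hard-coded here
        // purely for illustration.
        ObservationFilter addServerAddress = context -> {
            if (context instanceof ChatModelObservationContext) {
                context.addLowCardinalityKeyValue(KeyValue.of("server.address", "api.openai.com"));
            }
            return context;
        };

        registry.observationConfig().observationFilter(addServerAddress);
    }
}
```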
Comment From: ThomasVitale
@lmolkova thank you for sharing more details about how those attributes would be used; that was helpful. I have updated the table with a better description of the current adoption state and suggestions for future changes.
Considering that most Spring applications will have HTTP metrics and traces enabled, and that adding those HTTP attributes would require a major refactoring of the framework (including work to avoid an increase in time-series cardinality from duplicated information), I think it would be beneficial to hear a bit more from the community about the observability needs and use cases that require those additional attributes before making any change.
In any case, I'd like to point out that the overview above tracks the work I've been doing right now, as a contributor to this project, for the initial observability support in Spring AI. It's not an official or even comprehensive roadmap, and it's definitely not final. It's a start :)
Comment From: codefromthecrypt
Took your demo for a spin. Ping back when you've got span events, as I'll be interested to take a second look.
Comment From: piotrooo
OFFTOPIC: @codefromthecrypt what tool do you use for the traces?
Comment From: codefromthecrypt
@piotrooo https://github.com/ymtdzzz/otel-tui otherwise known as "missing 1000 github stars". It is technically an otel collector extension, bundled for convenience into one binary.
Comment From: ThomasVitale
Closing this issue for now. Spring AI has implemented the initial instrumentation based on the initial version of the OpenTelemetry Semantic Conventions for GenAI. To ensure production readiness, those conventions will stay stable in Spring AI rather than change with each new version of the OTel SemConv, which is not stable yet and frequently changes in non-backward-compatible ways (and is therefore not suitable for production).
A separate, optional module could be defined to support the current version of the OTel SemConv, so that users can explicitly opt in, knowing the risks of using non-stable conventions in production.
Comment From: codefromthecrypt
FWIW it feels possible for someone to add an ObservationHandler<ChatModelObservationContext> and build their own instrumentation following a convention of their choosing, whether that's OTel's experimental semconv, OpenInference, or something else entirely. The decoupling makes this possible, and if there are kinks in accessing certain data, I feel confident the spring-ai team can manage it!
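
A rough sketch of that idea, assuming Spring AI's ChatModelObservationContext type (method names on the context may vary across versions); the handler body is where any convention of your choosing would be applied:

```java
import io.micrometer.observation.Observation;
import io.micrometer.observation.ObservationHandler;
import org.springframework.ai.chat.observation.ChatModelObservationContext;

// A handler scoped to chat-model observations; not an official example,
// just an illustration of the decoupling described above.
public class CustomConventionHandler implements ObservationHandler<ChatModelObservationContext> {

    @Override
    public boolean supportsContext(Observation.Context context) {
        return context instanceof ChatModelObservationContext;
    }

    @Override
    public void onStart(ChatModelObservationContext context) {
        // Derive and attach attributes following the convention of your choosing
        // (OTel experimental semconv, OpenInference, ...).
    }

    @Override
    public void onStop(ChatModelObservationContext context) {
        // Export or enrich the finished observation here.
    }
}
```

It can then be registered via observationRegistry.observationConfig().observationHandler(new CustomConventionHandler()).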