Expected Behavior
Spring AI should provide success/failure metrics for individual tool calls to enable production monitoring and reliability tracking.
Proposed Metrics
- Tool Result Counter: `spring.ai.tool.result`
  - Tags: `tool.name`, `success` (true/false)
  - Tracks individual tool execution outcomes
- Tool Error Counter: `spring.ai.tool.error`
  - Tags: `tool.name`, `error.type`
  - Tracks tool failures by error type (see the sketch below)
Code Example
```yaml
# Configuration (optional, enabled by default)
spring:
  ai:
    tool:
      metrics:
        enabled: true
        include-error-details: false
```
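For illustration, these keys could be bound with a standard `@ConfigurationProperties` class; the `ToolMetricsProperties` name and defaults are assumptions based on the YAML above, not an existing Spring AI class:

```java
import org.springframework.boot.context.properties.ConfigurationProperties;

// Hypothetical binding for the proposed configuration keys; would need
// @EnableConfigurationProperties or @ConfigurationPropertiesScan to take effect.
@ConfigurationProperties(prefix = "spring.ai.tool.metrics")
public class ToolMetricsProperties {

    /** Whether tool result/error metrics are recorded (proposed default: true). */
    private boolean enabled = true;

    /** Whether error details (e.g. exception type) are added as tags. */
    private boolean includeErrorDetails = false;

    public boolean isEnabled() { return enabled; }
    public void setEnabled(boolean enabled) { this.enabled = enabled; }
    public boolean isIncludeErrorDetails() { return includeErrorDetails; }
    public void setIncludeErrorDetails(boolean includeErrorDetails) { this.includeErrorDetails = includeErrorDetails; }
}
```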
```bash
# View tool success metrics
curl http://localhost:8080/actuator/metrics/spring.ai.tool.result
```

Expected response:

```json
{
  "name": "spring.ai.tool.result",
  "measurements": [
    { "statistic": "COUNT", "value": 45 }
  ],
  "availableTags": [
    { "tag": "tool.name", "values": ["getCurrentWeather", "getRecommendation"] },
    { "tag": "success", "values": ["true", "false"] }
  ]
}
```
```bash
# Filter by specific tool and outcome
curl "http://localhost:8080/actuator/metrics/spring.ai.tool.result?tag=tool.name:getCurrentWeather&tag=success:false"
```
Implementation Approach
Add a new `ToolMetricsObservationHandler` that extends the existing observation system:
```java
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.observation.Observation;
import io.micrometer.observation.ObservationHandler;
import io.micrometer.observation.ObservationRegistry;
import org.springframework.ai.tool.observation.ToolCallingObservationContext;
import org.springframework.boot.autoconfigure.condition.ConditionalOnClass;
import org.springframework.stereotype.Component;

@Component
@ConditionalOnClass({ObservationRegistry.class, MeterRegistry.class})
public class ToolMetricsObservationHandler implements ObservationHandler<Observation.Context> {

    private final MeterRegistry meterRegistry;

    public ToolMetricsObservationHandler(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }

    @Override
    public boolean supportsContext(Observation.Context context) {
        return context instanceof ToolCallingObservationContext;
    }

    @Override
    public void onStop(Observation.Context context) {
        if (context instanceof ToolCallingObservationContext toolContext) {
            String toolName = toolContext.getToolDefinition().name();
            boolean isSuccess = toolContext.getError() == null; // no error recorded => success
            Counter.builder("spring.ai.tool.result")
                .tag("tool.name", toolName)
                .tag("success", String.valueOf(isSuccess))
                .register(meterRegistry)
                .increment();
        }
    }
}
```
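Spring Boot's observation auto-configuration usually registers `ObservationHandler` beans on its own; if explicit wiring is preferred, a sketch using `ObservationRegistryCustomizer` (the configuration class name is illustrative):

```java
import io.micrometer.observation.ObservationRegistry;
import org.springframework.boot.actuate.autoconfigure.observation.ObservationRegistryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Illustrative configuration: registers the handler explicitly with the
// ObservationRegistry instead of relying on auto-registration.
@Configuration
class ToolMetricsObservationConfig {

    @Bean
    ObservationRegistryCustomizer<ObservationRegistry> toolMetricsRegistryCustomizer(
            ToolMetricsObservationHandler handler) {
        return registry -> registry.observationConfig().observationHandler(handler);
    }
}
```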
Current Behavior
Spring AI currently provides basic spring.ai.tool timer metrics through the existing observation system:
- ✅ Tool execution time (COUNT, TOTAL_TIME, MAX)
- ✅ Basic tool call tracking
- ❌ Success/failure breakdown
- ❌ Error rate monitoring
- ❌ Tool reliability insights
The existing metrics are generated automatically by Micrometer's DefaultMeterObservationHandler, but lack domain-specific insights needed for production monitoring.
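For reference, those existing timer statistics can already be read programmatically; a sketch assuming the auto-configured `MeterRegistry` and the `spring.ai.tool` metric name mentioned above (the helper class is hypothetical):

```java
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import java.util.concurrent.TimeUnit;

// Sketch: reading the existing spring.ai.tool timer (COUNT, TOTAL_TIME, MAX).
// This gives latency figures, but no success/failure breakdown.
class ExistingToolTimerReader {

    static void printToolTimings(MeterRegistry meterRegistry) {
        Timer timer = meterRegistry.find("spring.ai.tool").timer();
        if (timer != null) {
            System.out.printf("count=%d totalMs=%.1f maxMs=%.1f%n",
                    timer.count(),
                    timer.totalTime(TimeUnit.MILLISECONDS),
                    timer.max(TimeUnit.MILLISECONDS));
        }
    }
}
```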
Context
How has this issue affected you?

When running AI applications in production, it's critical to monitor tool reliability. Currently, there's no way to:
- Identify which tools are failing frequently
- Set up alerts for tool failure rates
- Track tool performance degradation over time
- Monitor SLA compliance for tool-dependent services