Amazon Bedrock supports latency-optimized inference, which speeds up response times for certain models. I would love to be able to take advantage of this through Spring AI.

AWS Blog - https://aws.amazon.com/blogs/machine-learning/optimizing-ai-responsiveness-a-practical-guide-to-amazon-bedrock-latency-optimized-inference/

AWS Docs - https://docs.aws.amazon.com/bedrock/latest/userguide/latency-optimized-inference.html
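
For context, here is a rough sketch of what the request would need to carry. As far as I understand the AWS docs linked above, the Converse API accepts a `performanceConfig` block whose `latency` field is either `standard` or `optimized`. The class and method names below are made up for illustration, the JSON is built by hand with only the JDK, and none of this is an existing Spring AI API.

```java
public class LatencyOptimizedRequestSketch {

    /**
     * Builds a Converse-style JSON request body that includes the
     * performanceConfig block. Hypothetical helper, for illustration only.
     */
    static String buildConverseBody(String userText, String latencyMode) {
        // "optimized" asks for latency-optimized inference; "standard" is the regular mode.
        if (!latencyMode.equals("standard") && !latencyMode.equals("optimized")) {
            throw new IllegalArgumentException("latency must be 'standard' or 'optimized'");
        }
        // Minimal escaping (backslashes and double quotes only); a real client
        // would use a JSON library or the AWS SDK request builders instead.
        String escaped = userText.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{"
            + "\"messages\":[{\"role\":\"user\",\"content\":[{\"text\":\"" + escaped + "\"}]}],"
            + "\"performanceConfig\":{\"latency\":\"" + latencyMode + "\"}"
            + "}";
    }

    public static void main(String[] args) {
        // Print the body that would be sent for a latency-optimized call.
        System.out.println(buildConverseBody("Hello", "optimized"));
    }
}
```

A Spring AI option for this would presumably only need to set that one field on the underlying Bedrock request, falling back to `standard` when unset.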

Comment From: JonasJSchreiber

Has this been evaluated, or have you found a way to work around this limitation?

Comment From: dsarlo-viso

@JonasJSchreiber Nothing to report here. Still waiting for someone on the Spring AI side to evaluate adding this as an option.