Tools can significantly impact a model's response speed. How is everyone addressing this issue?
For example, when I ask the model: 99999*99999=?
model: qwen-max
tool: Calculator
with tool: 12s
without tool: 3s
The response is about four times slower when the tool is used.
Comment From: ilayaperumalg
@jingbio The client application executes the tool call and sends the result back to the model before the final response is returned. The response time therefore depends on several factors, including how the tool is executed and the extra request/response round trips between the client and the model. Closing this for now, but feel free to re-open if you want to continue the discussion on this topic. Thanks!
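
To make the round-trip point concrete, here is a rough sketch of how tool calling adds latency. It is a simplified model, not a measurement of qwen-max or of any particular framework: `LatencyModel`, `model_call_s`, and `tool_exec_s` are hypothetical names, and the numbers are illustrative only. It assumes each tool round costs one extra model request (the one in which the model asks for the tool) plus the local tool execution, followed by a final model request that produces the answer.

```python
from dataclasses import dataclass


@dataclass
class LatencyModel:
    """Toy latency model for a chat request (all values are assumptions)."""

    model_call_s: float  # assumed time for one model request/response
    tool_exec_s: float   # assumed time for the client to run the tool

    def without_tool(self) -> float:
        # A single request: the model answers directly.
        return self.model_call_s

    def with_tool(self, tool_rounds: int = 1) -> float:
        # Each tool round: one model call that requests the tool,
        # plus the tool execution on the client side.
        # Afterwards, one final model call turns the tool result into an answer.
        model_time = (tool_rounds + 1) * self.model_call_s
        tool_time = tool_rounds * self.tool_exec_s
        return model_time + tool_time


if __name__ == "__main__":
    m = LatencyModel(model_call_s=3.0, tool_exec_s=0.5)
    print(m.without_tool())  # 3.0
    print(m.with_tool())     # 6.5
```

Under these assumptions a single tool round at least doubles the time spent waiting on the model, even if the tool itself is fast. The sketch does not account for the full 12s reported above. Timing each model request and the tool execution separately in the client would show where the remaining time goes.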