Spring AI has support for Hugging Face Inference Endpoints. However, this doesn't work with the 'serverless' version of the Inference API, because a hardcoded '/generate' subpath is appended to the configured URL.

**Bug description**

Configure Hugging Face and point it at a serverless inference endpoint such as:
`spring.ai.huggingface.chat.url=https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct`
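For reference, a minimal configuration sketch (the `api-key` property name is assumed from the Spring AI Hugging Face auto-configuration conventions; adjust if it differs in your version):

```properties
# Serverless Inference API endpoint for the model (note: no trailing /generate)
spring.ai.huggingface.chat.url=https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct
# Assumed property name for the Hugging Face access token
spring.ai.huggingface.chat.api-key=${HF_API_KEY}
```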

This will result in an exception with the following error:

```
404 Not Found: "{"error":"Model meta-llama/Meta-Llama-3-8B-Instruct/generate does not exist"}"
```
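To confirm the failure comes from the `/generate` suffix rather than from authentication or the model itself, here is a small standalone check (a sketch assuming a valid token in the `HF_API_KEY` environment variable and the documented serverless request body `{"inputs": "..."}`): the same POST succeeds against the model root path and returns the 404 above against the `/generate` subpath.

```java
import org.springframework.http.MediaType;
import org.springframework.web.client.RestClient;
import org.springframework.web.client.RestClientResponseException;

public class HfServerlessPathCheck {

    public static void main(String[] args) {
        String base = "https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct";
        RestClient client = RestClient.create();

        // POST the same minimal payload to the model root path and to /generate.
        for (String url : new String[] { base, base + "/generate" }) {
            try {
                String body = client.post()
                        .uri(url)
                        .header("Authorization", "Bearer " + System.getenv("HF_API_KEY"))
                        .contentType(MediaType.APPLICATION_JSON)
                        .body("{\"inputs\": \"Hello\"}")
                        .retrieve()
                        .body(String.class);
                System.out.println(url + " -> OK: " + body);
            }
            catch (RestClientResponseException e) {
                // The /generate variant fails with the 404 reported above.
                System.out.println(url + " -> " + e.getStatusCode() + ": " + e.getResponseBodyAsString());
            }
        }
    }
}
```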

This is because the chat model calls the `generate` method (which leads to the generated OpenAPI client calling `/generate`), while it seems it should call the `compatGenerate` method, which invokes the endpoint at the root `/` path.

https://github.com/spring-projects/spring-ai/blob/v1.0.0-M1/models/spring-ai-huggingface/src/main/java/org/springframework/ai/huggingface/HuggingfaceChatModel.java#L97

**Environment**

Spring AI 1.0.0-M1

**Expected behavior**

I would expect this library to work with the serverless version of the inference endpoints (much cheaper 😅). If the path has to differ between endpoint types, it should be configurable.
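Purely as an illustration of the configurability request (this property does not exist today; the name is hypothetical):

```properties
# Hypothetical: let users choose which path the chat model calls,
# e.g. '/' for serverless endpoints and '/generate' for dedicated TGI ones
spring.ai.huggingface.chat.generate-path=/
```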