When calculating the Content-Length and writing data to the OutputStream in StringHttpMessageConverter, calling str.getBytes(charset) repeatedly results in unnecessary array objects occupying memory.
https://github.com/spring-projects/spring-framework/blob/09917fad7bca9b3997522f0a75d6319203f2127f/spring-web/src/main/java/org/springframework/http/converter/StringHttpMessageConverter.java#L103-L106
https://github.com/spring-projects/spring-framework/blob/09917fad7bca9b3997522f0a75d6319203f2127f/spring-web/src/main/java/org/springframework/http/converter/StringHttpMessageConverter.java#L122-L129
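The pattern described above can be illustrated with a minimal, self-contained sketch. This is not the Spring source: the class `DoubleEncodingSketch`, its methods, and the `encodeCalls` counter are hypothetical names introduced only to make the duplicated encoding visible.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified stand-in for a String converter; not the actual Spring code.
final class DoubleEncodingSketch {

    // Counts how often the string is encoded, so the duplication can be observed.
    static int encodeCalls = 0;

    static byte[] encode(String str, Charset charset) {
        encodeCalls++;
        return str.getBytes(charset); // allocates a fresh byte[] on every call
    }

    // Header phase: encode only to learn the length, then discard the array.
    static long contentLength(String str, Charset charset) {
        return encode(str, charset).length;
    }

    // Body phase: encode the same string a second time to write it.
    static void write(String str, Charset charset, OutputStream out) {
        try {
            out.write(encode(str, charset));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long length = contentLength("hello", StandardCharsets.UTF_8);
        write("hello", StandardCharsets.UTF_8, out);
        System.out.println("length=" + length + ", encodeCalls=" + encodeCalls);
    }
}
```

In this sketch every response body is encoded twice, and the first array becomes garbage as soon as its length has been read.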
Comment From: bclozel
Superseded by #35276
We can reopen this issue if we can find a way to improve performance without breaking correct behavior.
Comment From: brucelwl
@bclozel I have identified the reason for the build failure: the signature was missing from my submission.
It has now been fixed; see https://github.com/spring-projects/spring-framework/pull/35280.
Please reopen this issue.
Thanks
Comment From: kilink
I've looked into ways around this in the past. One easy approach is to special-case ASCII Strings when the target Charset is ASCII or an ASCII superset (like UTF-8). Proof of concept here: #35290.
UTF-8 length can be completely handled without allocating the byte array if we added a utility like the one in Guava.
For ASCII and ISO_8859_1, I believe we can actually just use codePoints() to calculate the length and handle the case of surrogate pairs if we really wanted to.
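A rough sketch of the codePoints() idea follows. The class and method names are hypothetical. It assumes the behavior of String.getBytes(Charset), which substitutes a single replacement byte for each unmappable character, so that a surrogate pair or a lone surrogate contributes one byte under US_ASCII and ISO_8859_1.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class CodePointLength {

    // For single-byte charsets such as US_ASCII and ISO_8859_1, every code point
    // is assumed to encode to exactly one byte (either itself or a '?' replacement),
    // so the encoded length equals the number of code points.
    static long singleByteLength(String s) {
        return s.codePoints().count();
    }

    public static void main(String[] args) {
        String sample = "caf\u00e9 \ud83d\ude00"; // contains one surrogate pair
        System.out.println(singleByteLength(sample));
        System.out.println(sample.getBytes(StandardCharsets.ISO_8859_1).length);
    }
}
```

This avoids the byte array but still walks the whole string, so it trades an allocation for a scan.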
IMO it would probably be best to decide whether it's worth adding a utility for UTF-8 length calculation specifically, and special case that, since I imagine it's the most commonly used encoding.
Anyway, the only way I can see to preserve backwards compatibility and also avoid the extra allocation is to special case certain character sets and just do the calculations ourselves.
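As a sketch of what special-casing UTF-8 might look like, the following computes the encoded length without allocating a byte array and falls back to getBytes for every other charset. The names `Utf8LengthSketch`, `utf8Length`, and `contentLength` are hypothetical; this is neither Guava's implementation nor a proposed Spring API. Unlike a strict utility, it counts an unpaired surrogate as one byte, on the assumption that it should agree with String.getBytes(UTF_8), which writes a single '?' in that case.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical utility, for illustration only.
final class Utf8LengthSketch {

    // Number of bytes str.getBytes(UTF_8) would produce, computed without allocating.
    static long utf8Length(CharSequence s) {
        long bytes = 0;
        int n = s.length();
        for (int i = 0; i < n; i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                bytes += 1;                       // ASCII
            } else if (c < 0x800) {
                bytes += 2;                       // two-byte sequence
            } else if (Character.isHighSurrogate(c) && i + 1 < n
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                bytes += 4;                       // valid surrogate pair
                i++;                              // skip the low surrogate
            } else if (Character.isSurrogate(c)) {
                bytes += 1;                       // unpaired surrogate, encoded as '?'
            } else {
                bytes += 3;                       // rest of the BMP
            }
        }
        return bytes;
    }

    // Special-cases UTF-8 and keeps the existing behavior for all other charsets.
    static long contentLength(String s, Charset charset) {
        if (StandardCharsets.UTF_8.equals(charset)) {
            return utf8Length(s);
        }
        return s.getBytes(charset).length;        // fallback still allocates
    }
}
```

The fallback branch preserves backwards compatibility for charsets that are not special-cased, at the cost of the original allocation.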
Comment From: rstoyanchev
@kilink thanks for the suggestion, but it then becomes a trade-off between CPU and memory allocation. It is also optimized for the ASCII case, but as a side effect less optimized for non-ASCII, which would require both approaches.
I think we can try to bring setting the content-length and the actual write closer together, or at least make an optimization possible where one can be made.
We could experiment with StringHttpMessageConverter opting out of content-length writing either by returning null from getContentLength or by having content-length header logic extracted into a protected method, and then setting it from within writeInternal.
Comment From: rstoyanchev
After a closer look, my suggestion won't work.
Headers and body writing are separated as distinct phases at a deeper level with headers copied to the underlying client, and decisions about chunked vs content-length mode finalized before writing begins.
Comment From: brucelwl
Thank you very much for taking this issue seriously and reopening it. I previously submitted a PR https://github.com/spring-projects/spring-framework/pull/35280 that could solve this problem, but the optimization method may not be particularly elegant.
Another optimization approach is to add a byte[] and a long in HttpOutputMessage to store the byte array and length of the string, and only initialize it on the first use.
Perhaps you have a better way. Either way, as long as this problem can be optimized, thank you very much.
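The lazy-initialization idea above could look roughly like the sketch below. It uses a standalone, hypothetical `EncodedStringHolder` class rather than modifying HttpOutputMessage, and it leaves open where such state would live and how it would be shared between the header and body phases.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical holder, for illustration only; not an existing Spring type.
final class EncodedStringHolder {

    private final String value;
    private final Charset charset;
    private byte[] bytes;        // encoded form, created on first use
    private long length = -1;    // encoded length, -1 until computed

    EncodedStringHolder(String value, Charset charset) {
        this.value = value;
        this.charset = charset;
    }

    // Encodes at most once; later calls return the same array.
    byte[] getBytes() {
        if (bytes == null) {
            bytes = value.getBytes(charset);
            length = bytes.length;
        }
        return bytes;
    }

    // Reuses the cached array instead of encoding a second time.
    long getContentLength() {
        if (length < 0) {
            getBytes();
        }
        return length;
    }

    public static void main(String[] args) {
        EncodedStringHolder holder = new EncodedStringHolder("hello", StandardCharsets.UTF_8);
        System.out.println(holder.getContentLength());
        System.out.println(holder.getBytes() == holder.getBytes());
    }
}
```

The sketch is not thread-safe and keeps the encoded array alive between the two phases, which is the memory being traded for the saved second encoding.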
Comment From: rstoyanchev
As I mentioned, headers and body writing are separated into distinct phases, and I'm not sure whether writing during the headers phase would run into other side effects.
Comment From: kilink
> @kilink thanks for the suggestion, but it then becomes a trade-off between CPU and memory allocation. It is also optimized for the ASCII case, but as a side effect less optimized for non-ASCII, which would require both approaches.
Right, it was a proof of concept offered as an alternative to the current approach, which already takes more CPU and memory. If Spring had a utility akin to the one in Guava, it could handle UTF-8 as well, not just ASCII, and UTF-8 may be the most common character encoding. I have deployed a version of the StringHttpMessageConverter that uses the Guava Utf8 helper in the past and have seen improvements, although admittedly the String converter is not typically the most widely used converter for us.