The UTF8JsonGenerator splits a string into segments without checking whether a split falls exactly between the high and low surrogate chars of a pair. When it does, the generator escapes the two surrogates separately instead of combining them, even when that feature is enabled.

Every place where a segment is split must check whether the segment's final character is the start of a surrogate pair (_isStartOfSurrogatePair) and, if so, shorten the segment length by one (-1) so that both halves of the pair end up in the next segment.

https://github.com/FasterXML/jackson-core/blob/7ae2b8b9ea5d82c1b8d8ea543eb9e5577c0bff63/src/main/java/com/fasterxml/jackson/core/json/UTF8JsonGenerator.java#L1346
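To illustrate the adjustment described above, here is a minimal standalone sketch. It is not the actual UTF8JsonGenerator code: the class `SurrogateSafeSplit`, the method `split`, and the `maxLen` parameter are hypothetical names made up for this example, and it uses the JDK's `Character.isHighSurrogate` in place of the generator's own `_isStartOfSurrogatePair` check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of jackson-core.
final class SurrogateSafeSplit {

    // Splits text into segments of at most maxLen chars, never ending a
    // segment on a high surrogate while more input remains.
    static List<String> split(String text, int maxLen) {
        if (maxLen < 2) {
            // A limit of 1 could never hold a full surrogate pair.
            throw new IllegalArgumentException("maxLen must be at least 2");
        }
        List<String> segments = new ArrayList<>();
        final int end = text.length();
        int offset = 0;
        while (offset < end) {
            int len = Math.min(maxLen, end - offset);
            // If more input follows and the last char of this segment is a
            // high surrogate, shorten the segment by one so the pair is
            // carried over intact into the next segment.
            if (offset + len < end
                    && Character.isHighSurrogate(text.charAt(offset + len - 1))) {
                --len;
            }
            segments.add(text.substring(offset, offset + len));
            offset += len;
        }
        return segments;
    }
}
```

For example, splitting "ab\uD83D\uDE00cd" with a limit of 3 would naively cut after the high surrogate \uD83D; with the check above, the first segment is shortened to "ab" and the full pair moves to the second segment.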

Does this make sense?

Comment From: cowtowncoder

Description makes sense on its own, yes.