Search before asking
- [x] I searched in the issues and found nothing similar.
Describe the bug
I've discovered that Jackson serializes Java BigDecimal values as JSON numbers rather than strings. I can work around this in my own code, but I wanted to bring it to your attention as I consider this a serious issue that directly contradicts the explicit desire of the developer.
The JSON specification doesn't dictate how the "number" type is represented, but the vast majority of implementations across platforms use IEEE 754. This standard uses base 2 to represent the fractional part. This means that there are many values that cannot be represented but only estimated. (This is not for lack of precision, but rather because of the number base used.)
Most currencies on the other hand use base 10 for representing fractional parts. Likewise there are many fractions that base 10 cannot represent, but the significant point here is that the unrepresentable values differ based upon the base. In other words, using floating point numbers guarantees that there are many values that can only be estimated that would have been accurately represented in base 10.
For this reason it has been a well-known recommendation for decades not to use IEEE 754 to represent currency. See e.g. "Be cool. Don’t use float/double for storing monetary values." A good explanation is in this Stack Overflow answer, but there are surely dozens of articles that have already covered this in depth.
I stress that this has nothing to do with precision. We could have 128-bit or 256-bit precision or whatever, but if we are using the base 2 fractions that IEEE 754 uses, even a simple value such as $0.10 (10 cents, in whatever currency) will only be represented as an estimate in IEEE 754.
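The point about $0.10 can be checked with the JDK alone. The following is a minimal sketch (the class and method names are made up for illustration): it prints the exact value a double holds when asked to store 0.10, and compares summing ten such values in binary floating point against summing them with BigDecimal.

```java
import java.math.BigDecimal;

final class BinaryFractionDemo {
    // Exact decimal expansion of the double nearest to the given literal.
    // The BigDecimal(double) constructor converts without rounding.
    static BigDecimal exactValueOfDouble(double d) {
        return new BigDecimal(d);
    }

    // Adds 0.10 ten times using binary floating point.
    static double sumTenDimesAsDouble() {
        double total = 0.0;
        for (int i = 0; i < 10; i++) {
            total += 0.10;
        }
        return total;
    }

    // Adds 0.10 ten times using base-10 arithmetic.
    static BigDecimal sumTenDimesAsDecimal() {
        BigDecimal dime = new BigDecimal("0.10");
        BigDecimal total = BigDecimal.ZERO;
        for (int i = 0; i < 10; i++) {
            total = total.add(dime);
        }
        return total;
    }

    public static void main(String[] args) {
        // 0.1000000000000000055511151231257827021181583404541015625
        System.out.println(exactValueOfDouble(0.10));
        System.out.println(sumTenDimesAsDouble());  // 0.9999999999999999
        System.out.println(sumTenDimesAsDecimal()); // 1.00
    }
}
```

Widening the binary format would shrink the error but not remove it, since 1/10 has no finite expansion in base 2.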
In short, if a Java developer has used BigDecimal, they have expressly indicated that their application, for whatever reason, needs to accurately represent fractional parts using base 10 (hence the "decimal" in BigDecimal). Converting the decimal value to what in all practical cases will surely be represented as an IEEE 754 floating point number, representing fractional parts using base 2, is guaranteed to lose information and directly goes against the express wish of the developer.
Version Information
No response
Expected behavior
No response
Additional context
No response
Comment From: pjfanning
This comment is relevant here too - https://github.com/FasterXML/jackson-databind/issues/2517#issuecomment-2976890827
Comment From: garretwilson
Thanks for the comments here and on #2517, @pjfanning. I have already mitigated this issue in my own framework, which uses Jackson, but I wanted to report it here to bring awareness to the issue.
Comment From: JooHyukKim
@garretwilson I think this should suffice, unless you are suggesting the "default" behavior should change -- which would need some amount of up/down voting
    @Test
    public void testCustomSerializationBigDecimalAsString() throws Exception {
        // Default: BigDecimal is written as a JSON number
        assertEquals(a2q("{'value':2.20003}"),
                jsonMapperBuilder()
                        .disable(JsonWriteFeature.WRITE_NUMBERS_AS_STRINGS)
                        .build()
                        .writeValueAsString(new BigDecimalHolder("2.20003"))
        );
        // Opt-in: numbers are written as JSON strings
        assertEquals(a2q("{'value':'2.20003'}"),
                jsonMapperBuilder()
                        .enable(JsonWriteFeature.WRITE_NUMBERS_AS_STRINGS)
                        .build()
                        .writeValueAsString(new BigDecimalHolder("2.20003"))
        );
    }
Comment From: garretwilson
I think this should suffice, unless you are suggesting the "default" behavior should change -- which would need some amount of up/down voting
As I mentioned I have already added code to my framework to mitigate this issue. It uses a different approach than yours for several reasons, but your approach is fine as well.
Jackson's default representation of BigDecimal as a JSON number, which inevitably will be turned into an IEEE 754 floating point value, goes against the express representation desire of the developer and results in corrupt financial data. So yes, I am suggesting that Jackson's default should be changed. I don't intend to argue about it, as the reasons and references outlined above cover the arguments, but I wanted to reiterate what I was proposing as there seemed to be some doubt about what I was suggesting.
Comment From: pjfanning
This appears in some of our tests. So this is one option.
    mapper.configOverride(BigDecimal.class)
            .setFormat(JsonFormat.Value.forShape(JsonFormat.Shape.STRING));
I'm adding this for users who run into this issue. I'm aware that some other users want us to change the default.
You can also annotate fields with @JsonFormat(shape=Shape.STRING).
Comment From: garretwilson
Regarding JsonFormat.Shape.STRING, I haven't dug into the details, but I'd wager that it uses BigDecimal.toString(), which may not give you what you want for some values. In my own converters I'm using BigDecimal.toPlainString() to avoid the exponent form (e.g. "1.23E+5"). Just something to be aware of.
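The difference between the two BigDecimal methods mentioned above can be seen with the JDK alone. This is a small sketch with made-up class and method names; it says nothing about which of the two Jackson calls internally, which is left as the wager it was.

```java
import java.math.BigDecimal;

final class PlainStringDemo {
    // May use scientific notation when the scale is negative
    // or the value is very small.
    static String canonical(BigDecimal value) {
        return value.toString();
    }

    // Never uses an exponent.
    static String plain(BigDecimal value) {
        return value.toPlainString();
    }

    public static void main(String[] args) {
        BigDecimal amount = new BigDecimal("1.23E+5");
        System.out.println(canonical(amount)); // 1.23E+5
        System.out.println(plain(amount));     // 123000

        // stripTrailingZeros() can introduce a negative scale on its own.
        BigDecimal stripped = new BigDecimal("123000").stripTrailingZeros();
        System.out.println(canonical(stripped)); // 1.23E+5
        System.out.println(plain(stripped));     // 123000
    }
}
```

A consumer that expects a plain decimal string may reject or misread the exponent form, so the choice matters once values are written as strings.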
Comment From: garretwilson
Well I said I wasn't going to argue about this 😅, but I think it bears noting that (as I just discovered) Jackson itself presents the same arguments I was presenting above at Representing Money in JSON:
⚠️ Monetary amounts ≠ floats
Before we dive into details, always keep the following in mind. However you desire to format money in JSON, nothing changes the fact that you should...
Never hold monetary values [..] in a float variable. Floating point is not suitable for this work, and you must use either fixed-point or decimal values. Coinkite: Common Terms and Data Objects
That document is showing how to represent currency using JSR 354 types, which is a separate subject; however the explanation of why not to use JSON numbers for the currency amount is spot on, and exactly why I opened this ticket. So Jackson's documentation has already made the arguments for me. 😁 (I'm aware that this documentation may have been imported from Zalando.)
Anyway I'm fine with closing this issue if you don't intend to change the default. The ticket may prove helpful if someone finds it by searching. But no use leaving it open if you don't plan on changing the default. Either way, thanks for reading.
Comment From: cowtowncoder
Just to make it clear: I am against the suggested change to cater for inadequacies of Javascript (and some other languages/platforms). Numbers should be represented as numbers, as the default setting. Forcing them to be Strings is acceptable, but only as opt-in configuration.
Now, configuration is another thing: we should (and do) support it. There are 2 main ways, mentioned on #2517:
- For all numbers, make JsonGenerator handle conversion: JsonWriteFeature.WRITE_NUMBERS_AS_STRINGS
- For specific ones (BigDecimal here): @JsonFormat, either annotated on field/accessor (@JsonFormat(shape=Shape.STRING)) or for all properties of type: mapper.configOverride(BigDecimal.class).setFormat(JsonFormat.Value.forShape(JsonFormat.Shape.STRING));
And yes, let's still leave the issue open. I am sure there will be more discussion: this has been mentioned over the past 10+ years, and not just for Jackson but wrt API design at my daytime jobs.
Comment From: cowtowncoder
(accidentally closed, please ignore)
Comment From: cowtowncoder
This one I don't follow:
Jackson's default representation of BigDecimal as a JSON number, which inevitably will be turned into an IEEE 754 floating point value, goes against the express representation desire of the developer and results in corrupt financial data.
@garretwilson I assume this refers to Javascript (et al) clients? But fundamentally if those clients need to operate on numbers, they need to find a way to do that without using IEEE-754 doubles, no? (presumably there are Decimal type libraries in existence?)
So isn't it a question of the decoder deciding how to do conversions? In case of Java it gets mapped to BigDecimal (assuming target class, or configuration, defines it so), but similarly other platforms typically allow definition of target types to use. So I don't see it as inevitable.
Comment From: garretwilson
I assume this refers to Javascript (et al) clients?
No, I'm referring to JSON implementations.
Jackson is a library that takes an object model and serializes it to a serialization format called JSON. I am not aware of a single JSON implementation in any language that will parse a JSON number into something that represents the fractional portion in base 10. Thus Jackson is serializing numbers the developer has indicated should be represented using base 10 into a format, knowing that in the real world the JSON reader, in whatever language, will read the value into something that loses information.
But fundamentally if those clients need to operate on numbers, they need to find a way to do that without using IEEE-754 doubles, no?
The first faulty assumption is that there is something out there called "numbers" that computers are operating on. That's not really the case. All computers work on some representation of numbers, with agreed-upon semantics of what the representation means. For numbers with fractional portions, they make a choice: either represent them using base 10 or use base 2. (There are other choices: a computer could decide to represent the fractional part as a fraction, with a numerator and denominator; this in fact would be the most accurate, as it could represent values that neither base 10 nor base 2 fractional parts can represent.) The overwhelming majority of languages choose base 2 to represent the fractional part, i.e. IEEE-754. And this works really well most of the time, e.g. for making calculations for putting satellites into orbit. Scientists need to work in terms of high-precision representations where the precision is most important.
The problem is for certain values that humans discuss in terms of base 10 for the fractional part. IEEE-754 is horrible at representing those values. Money doesn't have a high precision; its precision is actually really coarse. But we want the exact value of e.g. 0.10. Again if computers represented this as integer: 0; numerator: 1; denominator: 10, then that would be an exact representation. But IEEE-754 represents that value in a completely different way. So it's a mistake to think that computers are just working with numbers. They work with representations of numbers, each with tradeoffs. The IEEE-754 tradeoffs are good for scientists, but horrible for bankers.
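The numerator/denominator representation described above can be sketched in a few lines. The Fraction class below is hypothetical, written only for illustration (it is not part of the JDK or of Jackson); it stores values in lowest terms, so 1/10 and sums of such values stay exact.

```java
import java.math.BigInteger;

// Hypothetical illustration type; it is not part of the JDK or of Jackson.
final class Fraction {
    final BigInteger numerator;
    final BigInteger denominator;

    Fraction(long numerator, long denominator) {
        this(BigInteger.valueOf(numerator), BigInteger.valueOf(denominator));
    }

    Fraction(BigInteger numerator, BigInteger denominator) {
        if (denominator.signum() == 0) {
            throw new ArithmeticException("denominator must not be zero");
        }
        // Keep the sign on the numerator and reduce to lowest terms.
        if (denominator.signum() < 0) {
            numerator = numerator.negate();
            denominator = denominator.negate();
        }
        BigInteger gcd = numerator.gcd(denominator); // never zero here
        this.numerator = numerator.divide(gcd);
        this.denominator = denominator.divide(gcd);
    }

    Fraction add(Fraction other) {
        return new Fraction(
                numerator.multiply(other.denominator)
                        .add(other.numerator.multiply(denominator)),
                denominator.multiply(other.denominator));
    }

    @Override
    public String toString() {
        return numerator + "/" + denominator;
    }

    public static void main(String[] args) {
        Fraction sum = new Fraction(1, 10).add(new Fraction(2, 10));
        System.out.println(sum);              // 3/10
        System.out.println(0.1 + 0.2 == 0.3); // false
    }
}
```

The tradeoff is that numerators and denominators can grow without bound under repeated arithmetic, which is one reason this representation is rarely the default.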
… they need to find a way to do that without using IEEE-754 doubles, no?
The developers need to find a way to tell their platform to use something other than IEEE-754 when working with currency. One way to do that in Java is to use BigDecimal, which brings me back to my point: if a developer is using BigDecimal, they have made their preference known on how they want the computer to represent values, and Jackson is going against their explicit preference by turning the value into a JSON number type, which we know in real life will be implemented using IEEE-754, which uses base 2 instead of base 10 for the fractional part, thereby corrupting the values.
In case of Java it gets mapped to BigDecimal (assuming target class, or configuration, defines it so), but similarly other platforms typically allow definition of target types to use. So I don't see it as inevitable.
Are you saying that if I store the value 0.10 as a JSON number type, and I have my deserializer set up to turn that into a BigDecimal on the other side, that the Jackson parser doesn't first parse the JSON value 0.10 as an IEEE-754 before converting it to a BigDecimal?
Even if Jackson is really smart and sees that the value should go to a BigDecimal on the other side, so that it parses the string as base 10 and feeds it to BigDecimal as a string and IEEE-754 is never involved (which I highly doubt), this still ignores the gigantic ecosystem of JSON processing. What if the JSON gets compressed into BSON in the middle? It will be represented as IEEE-754 (as per the BSON spec!) even before it gets to the Jackson deserialization. What if the Jackson output is parsed by Python or TypeScript? IEEE-754, all of them.
In summary no computers work with some abstract "number". They work with representations that have tradeoffs. By specifying BigDecimal the developer has made known which representation they want, but Jackson is overriding that preference by knowingly serializing the value into a format that will inevitably be converted to IEEE-754 at some point.
I'm not trying to over-argue this; I'm just trying to explain what I was saying, as you had questions about it. For my application I've mitigated the issue, so for me it's neither here nor there what Jackson does.
Comment From: cowtowncoder
I am not aware of a single JSON implementation in any language that will parse a JSON number into something that represents the fractional portion in base 10.
Aside from Jackson, you mean?
Are you saying that if I store the value 0.10 as a JSON number type, and I have my deserializer set up to turn that into a BigDecimal on the other side, that the Jackson parser doesn't first parse the JSON value 0.10 as an IEEE-754 before converting it to a BigDecimal?
Yes, that's exactly what I am saying.
If target is BigDecimal (or Number with suitable configuration), that is exactly what happens.
Or, at low level, if access is with JsonParser.getDecimalValue(). Jackson defers decoding until such point when access type is known.
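Why deferring the decoding matters can be illustrated without Jackson. The sketch below is not Jackson's implementation; it uses only the JDK, with made-up names, to contrast decoding a JSON number token straight from its text against routing it through a double first.

```java
import java.math.BigDecimal;

final class DeferredDecodingDemo {
    // Decodes the number token directly from its text: base 10 throughout.
    static BigDecimal decodeDirectly(String token) {
        return new BigDecimal(token);
    }

    // Decodes via an IEEE-754 double first, as a naive parser might.
    static BigDecimal decodeViaDouble(String token) {
        return new BigDecimal(Double.parseDouble(token));
    }

    public static void main(String[] args) {
        System.out.println(decodeDirectly("0.10"));  // 0.10
        // 0.1000000000000000055511151231257827021181583404541015625
        System.out.println(decodeViaDouble("0.10"));
    }
}
```

As long as the token text is kept until the target type is known, the first path is available and the value never passes through a binary fraction.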
I find it hard to believe I am the only JSON library developer capable of doing that.
And that everyone else was just going with "aww shucks, cannot do it, let's pass Number as String".
The first faulty assumption is that there is something out there called "numbers" that computers are operating on.
Maybe a question of terminology, but I mean Logical Data Type of a JSON Property. There is a clearly defined textual Number representation (without dictated physical type). Breaking this mapping seems Wrong to me.
Comment From: garretwilson
Yes, that's exactly what I am saying. If target is BigDecimal (or Number with suitable configuration), that is exactly what happens.
That's cool, and I apologize for making the assumption you didn't do that. You do a great job on your libraries, and your work on ClassMate was outstanding. After writing what I did, I started thinking, "you know, I'll bet he actually does that". 😅
And that everyone else was just going with "aww shucks, cannot do it, let's pass Number as String".
(I misread earlier what you wrote.) No, I think most developers don't have a grasp of IEEE-754 and simply represent money as floating point numbers, so the issue of conversion never comes up (because they chose the wrong type to begin with).
The issue here is that even if your library does the right thing (for someone who remembers to configure it), JSON files stand alone as an interchange format and we have to guard against all the hundreds of implementations that don't. On top of it, for developers that do know to use the correct type for currency, they might not even be aware that they need to configure Jackson specially, as they are already using the correct type in Java. That's the main point of this ticket—that it raises a surprise to the developer who has already gone out of their way to use the correct type for currency, not expecting that Jackson would subvert their intention without special action on their part.
Maybe a question of terminology, but I mean Logical Data Type of a JSON Property. There is a clearly defined textual Number representation (without dictated physical type). Breaking this mapping seems Wrong to me.
I understand what you're saying, and it would be nice if we could say, "the spec calls for a general number, so the implementations must do whatever it takes to represent that number". But that's not what happens in the real world, and that's one of the (many) downsides to the over-simplistic JSON specification. First of all, how would that even work? Say I pass a complex number type to JSON (with a real and an imaginary part); how is that supposed to be represented as a number? Or what if I have a number type that accurately represents a rational number with no finite decimal expansion, such as 53/74 as a fraction? This is a number. It cannot be represented accurately as a decimal number, nor as a floating point number. It can only be represented as a fraction. How would that even be written in JSON?
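The 53/74 case can be checked with BigDecimal itself, which refuses an exact division when the quotient has no finite decimal expansion. This is a small JDK-only sketch; the class and method names are made up, and the helper assumes a non-zero denominator.

```java
import java.math.BigDecimal;

final class NonTerminatingDemo {
    // True if numerator/denominator has a finite base-10 expansion.
    // Assumes denominator != 0 (division by zero also throws
    // ArithmeticException and would be reported as false here).
    static boolean hasExactDecimalForm(long numerator, long denominator) {
        try {
            BigDecimal.valueOf(numerator).divide(BigDecimal.valueOf(denominator));
            return true;
        } catch (ArithmeticException e) {
            // Thrown when the exact quotient does not terminate in base 10.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasExactDecimalForm(1, 10));  // true
        System.out.println(hasExactDecimalForm(1, 8));   // true
        System.out.println(hasExactDecimalForm(53, 74)); // false
    }
}
```

Since 74 = 2 × 37 and 37 divides no power of 10, any decimal rendering of 53/74 has to be rounded at some chosen precision.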
JSON's problem is that it 1) represents numbers lexically as decimal in the source file, while 2) claiming that it is an abstract number with no specific representation, but nevertheless 3) being implemented 99.99% of the time in the real world as IEEE-754 floating point. Seeing that JSON is ubiquitous and we have to use it for interchange, we have to live in that world and make appropriate decisions to protect our data (such as using strings for dates and for decimal-fraction number types such as currency). (Or create an alternate serialization superior to JSON, as I have, that has specific types for decimal representation. I've only been working on my format for a couple of decades and it's not finished yet, so we have to stick with JSON for the time being. 😁)
OK I'm heading back to my real work of the day. Thanks for the consideration and have a productive week.
Comment From: garretwilson
I find it hard to believe I am the only JSON library developer capable of doing that.
And that everyone else was just going with "aww shucks, cannot do it, let's pass Number as String".
Um … but (if they are even aware of the issue in the first place) they often do. See e.g. AWS DynamoDB's approach:
All numbers are sent across the network to DynamoDB as strings to maximize compatibility across languages and libraries.
In DynamoDB's case, they swung too far in the other direction, making the exception the rule. Instead they should have created a separate decimal number type, and only used strings for those. But they took the "aww shucks" road.
Comment From: cowtowncoder
OK I'm heading back to my real work of the day. Thanks for the consideration and have a productive week.
Ok, yes, I think we resolved some misunderstandings to at least agree on facts & situation. :)
Right now I think at least it is important to know how to make things work, regardless of (dis)agreements on defaults.