Performance impact while using Gson
First of all, thank you for the amazing job with this library.
We are using this library, both via Spring Boot and directly in some services. While troubleshooting some memory issues we found the following behaviour:
- We are generating JWTs with 3 timestamps with millisecond accuracy. Those timestamps use a double format: 12345.6789
- Every time a request is received, the JWT is parsed and validated.
- While troubleshooting the memory issues we found that the JWT parsing, via the Gson dependency, generates 3 NumberFormatExceptions for each parse
The issue is related to the way the Gson library “evaluates” whether a Number is a Long or a Double:
    public Number readNumber(JsonReader in) throws IOException, JsonParseException {
        String value = in.nextString();
        try {
            return Long.parseLong(value);
        } catch (NumberFormatException longE) {
            try {
                Double d = Double.valueOf(value);
                if ((d.isInfinite() || d.isNaN()) && !in.isLenient()) {
                    throw new MalformedJsonException(
                        "JSON forbids NaN and infinities: " + d + "; at path " + in.getPreviousPath());
                }
                return d;
            } catch (NumberFormatException doubleE) {
                throw new JsonParseException(
                    "Cannot parse " + value + "; at path " + in.getPreviousPath(), doubleE);
            }
        }
    }
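One way to avoid the exception on the common path is to look at the text before choosing a parser. The following is a minimal, JDK-only sketch of that idea, not the actual Gson fix: the class and method names (`LongOrDouble`, `looksLikeDouble`, `parse`) are hypothetical, and the NaN/infinity check and path-aware error messages from the method above are left out for brevity.

```java
final class LongOrDouble {
    private LongOrDouble() {}

    /** True when the text contains a fraction or exponent marker, so it cannot be a long. */
    static boolean looksLikeDouble(String value) {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '.' || c == 'e' || c == 'E') {
                return true;
            }
        }
        return false;
    }

    /** Returns a Long for plain integers and a Double otherwise. */
    static Number parse(String value) {
        if (looksLikeDouble(value)) {
            // Fractional input such as "12345.6789" never touches Long parsing,
            // so no NumberFormatException is created for it.
            return Double.valueOf(value);
        }
        try {
            return Long.valueOf(value);
        } catch (NumberFormatException notALong) {
            // Rare path: an integer too large for a long, or a lenient literal
            // such as "NaN". Invalid text still fails here with NumberFormatException.
            return Double.valueOf(value);
        }
    }
}
```

With this ordering, an exception is only created for input that is not a plain long and has no fraction or exponent marker. In an application that configures Gson directly, the same logic could be wrapped in a custom `ToNumberStrategy`; whether that is possible for the shaded copy inside this library is a separate question.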
The fact that each parse generates an exception directly impacts memory and performance. A simple JUnit test shows the following values:
    ToNumberStrategy strategy = ToNumberPolicy.LONG_OR_DOUBLE;
    for (var i = 0; i < 10_000_000; i++) {
        assertThat(strategy.readNumber(fromString("10"))).isEqualTo(10);
    }
Average time of 10 executions: 2s 430ms
    ToNumberStrategy strategy = ToNumberPolicy.LONG_OR_DOUBLE;
    for (var i = 0; i < 10_000_000; i++) {
        assertThat(strategy.readNumber(fromString("10.1"))).isEqualTo(10.1);
    }
Average time of 10 executions: 36s 655ms
1 execution = 1 loop with 10M asserts (10_000_000 iterations, as in the loops above)
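For anyone who wants to reproduce the effect without Gson or AssertJ on the classpath, the measurement can be approximated with the JDK alone. This is a rough sketch rather than a rigorous benchmark (a harness such as JMH would be more reliable); the class `LongFirstCost` and its members are hypothetical names, and `longFirst` only imitates the long-then-double order of the method quoted earlier.

```java
final class LongFirstCost {
    /** Number of times the long attempt failed and an exception was created. */
    static long exceptionsThrown = 0;

    /** Long first, double on failure: the same order as the quoted readNumber method. */
    static Number longFirst(String value) {
        try {
            return Long.valueOf(value);
        } catch (NumberFormatException notALong) {
            exceptionsThrown++;
            return Double.valueOf(value);
        }
    }

    /** Parses the same text repeatedly and returns the elapsed wall-clock milliseconds. */
    static long timeMillis(String value, int iterations) {
        double sink = 0; // consumed below so the JIT cannot discard the loop
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += longFirst(value).doubleValue();
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        if (Double.isNaN(sink)) {
            throw new IllegalStateException("unexpected NaN");
        }
        return elapsedMillis;
    }

    public static void main(String[] args) {
        int iterations = 1_000_000;
        timeMillis("10.1", iterations); // warm-up pass, result ignored

        exceptionsThrown = 0;
        long integerMillis = timeMillis("10", iterations);
        long integerExceptions = exceptionsThrown;

        exceptionsThrown = 0;
        long decimalMillis = timeMillis("10.1", iterations);
        long decimalExceptions = exceptionsThrown;

        System.out.println("\"10\":   " + integerMillis + " ms, exceptions: " + integerExceptions);
        System.out.println("\"10.1\": " + decimalMillis + " ms, exceptions: " + decimalExceptions);
    }
}
```

The exception counts are deterministic (none for "10", one per iteration for "10.1"). The timings depend on the JVM and hardware, so they should only be read as a relative comparison.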
The Gson library is in maintenance mode, so pushing any fix there will be problematic. From my POV this situation is bad in 2 different ways:
- The current Gson implementation performs poorly. See a comparison such as https://github.com/fabienrenaud/java-json-benchmark
- Gson is in maintenance mode, without a release in more than a year and with uncertainty about its future
For those reasons I think it would be interesting to evaluate moving to a different library. I know that in the past json-smart was used and was working relatively well. The project seems to be maintained again, so maybe it’s an option.
Comments (4)
-
-
reporter I submitted a PR to the Gson project: https://github.com/google/gson/pull/2674 Let's see if they pick it up.
Since the library is shaded, I can try to submit a patch to fix this issue as a “temporary” workaround.
-
- changed status to resolved
Commit 26527774ef13ac4246505c0c05b167f4568dfbdd bumps GSon to 2.11.0, addressing Number parsing performance issue
-
Released:
version 9.39.3 (2024-05-30) * Bumps GSon to 2.11.0, addressing Number parsing performance issue (#546).
Okay, so it looks like Gson is capable of hitting teletype-level performance in 2024.
We used JSON Smart for a number of years, but then it started receiving DoS and unchecked-exception CVEs every other month, which upset a lot of people and caused us a lot of work. Some of these vulnerabilities were known to us before that, and we had more or less mitigated around them, until years later CVEs became a thing and people started “discovering” them. The internals of JSON Smart are pretty messy. I’ve chatted with the author on several occasions. He’s a genius, but apparently that’s his style of work. I feel uneasy about going back to JSON Smart. On a few occasions, deep in prod, JSON Smart serialised utter JSON weirdness for which we had no explanation. I’d rather err on the side of performance than have us break things.
Let me know how you see things, including a short list of good candidates and how you would approach this.
Thank god that bit of the lib is shaded now and changes should be easy to pull off.