Performance impact while using Gson

Issue #546 resolved
Carlos Tasada created an issue

First of all, thank you for the amazing job with this library.

We are using this library, both via Spring Boot and directly in some services. While troubleshooting some memory issues we found the following behaviour:

  • We generate JWTs with 3 timestamps with millisecond accuracy. These timestamps use a double format: 12345.6789
  • Every time a request is received, the JWT is parsed and validated.
  • While troubleshooting the memory issues we found that JWT parsing, via the Gson dependency, generates 3 NumberFormatExceptions per parse

The issue is related to the way the Gson library “evaluates” whether a Number is a Long or a Double:

    public Number readNumber(JsonReader in) throws IOException, JsonParseException {
      String value = in.nextString();
      try {
        return Long.parseLong(value);
      } catch (NumberFormatException longE) {
        try {
          Double d = Double.valueOf(value);
          if ((d.isInfinite() || d.isNaN()) && !in.isLenient()) {
            throw new MalformedJsonException(
                "JSON forbids NaN and infinities: " + d + "; at path " + in.getPreviousPath());
          }
          return d;
        } catch (NumberFormatException doubleE) {
          throw new JsonParseException(
              "Cannot parse " + value + "; at path " + in.getPreviousPath(), doubleE);
        }
      }
    }
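One way to avoid the exception on the common decimal path is to inspect the text before attempting the long parse. The following is a minimal, stdlib-only sketch of that idea and is not presented as Gson's actual code: the class `LongOrDoubleParser` and its methods are hypothetical names, and it throws a plain `NumberFormatException` where Gson would throw its own exception types.

```java
import java.util.Objects;

// Hypothetical helper, not part of Gson or of this library.
final class LongOrDoubleParser {
    private LongOrDoubleParser() {}

    /** Returns a Long for plain integer text and a Double for everything else. */
    static Number parse(String value) {
        Objects.requireNonNull(value, "value");
        if (looksLikeDecimal(value)) {
            // Decimal point or exponent present: skip Long.parseLong entirely,
            // so no NumberFormatException is created for values like "10.1".
            return parseFiniteDouble(value);
        }
        try {
            return Long.parseLong(value);
        } catch (NumberFormatException notALong) {
            // Rare path, e.g. integers outside the long range.
            return parseFiniteDouble(value);
        }
    }

    /** Cheap textual check for a decimal point or an exponent marker. */
    static boolean looksLikeDecimal(String value) {
        return value.indexOf('.') >= 0
            || value.indexOf('e') >= 0
            || value.indexOf('E') >= 0;
    }

    /** Parses a double and rejects NaN and infinities, as strict JSON requires. */
    private static Double parseFiniteDouble(String value) {
        double d = Double.parseDouble(value); // throws NumberFormatException on malformed text
        if (Double.isNaN(d) || Double.isInfinite(d)) {
            throw new NumberFormatException("JSON forbids NaN and infinities: " + value);
        }
        return d;
    }
}
```

With this shape, `LongOrDoubleParser.parse("12345.6789")` goes straight to the double parse, and the exception path is left for inputs that are neither decimals nor in-range integers.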

The fact that each parse generates an exception directly impacts memory and performance. A simple JUnit test shows the following values:

    ToNumberStrategy strategy = ToNumberPolicy.LONG_OR_DOUBLE;
    for (var i = 0; i < 10_000_000; i++) {
      assertThat(strategy.readNumber(fromString("10"))).isEqualTo(10L);
    }

Average time of 10 executions: 2s 430ms

    ToNumberStrategy strategy = ToNumberPolicy.LONG_OR_DOUBLE;
    for (var i = 0; i < 10_000_000; i++) {
      assertThat(strategy.readNumber(fromString("10.1"))).isEqualTo(10.1);
    }

Average time of 10 executions: 36s 655ms

1 execution = 1 loop with 10M asserts
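For anyone who wants to reproduce the effect without Gson or the `fromString` helper on the classpath, here is a rough stdlib-only sketch. It assumes the same try-long-then-double shape as the quoted `readNumber`; the class `FallbackCostDemo`, its methods, and the sample timestamp values are hypothetical. Absolute timings will differ by JVM and machine, so it prints them rather than asserting anything about them.

```java
// Hypothetical reproduction harness, not taken from Gson or from this library.
final class FallbackCostDemo {
    private FallbackCostDemo() {}

    /** Same shape as the quoted fallback: try long first, use double on failure. */
    static Number longThenDouble(String value) {
        try {
            return Long.parseLong(value);
        } catch (NumberFormatException e) {
            return Double.valueOf(value);
        }
    }

    /** Counts how many of the given values cannot be parsed as a long. */
    static int countExceptionPaths(String[] values) {
        int count = 0;
        for (String v : values) {
            try {
                Long.parseLong(v);
            } catch (NumberFormatException e) {
                count++; // one exception object allocated per decimal value
            }
        }
        return count;
    }

    /** Times repeated parsing of one value; a rough measurement, not a proper benchmark. */
    static long elapsedMillis(String value, int iterations) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += longThenDouble(value).longValue();
        }
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // keeps the loop result observable
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Three made-up decimal timestamps, standing in for the JWT claims.
        String[] claims = {"1717000000.123", "1717000000.123", "1717003600.123"};
        System.out.println("exception paths per token: " + countExceptionPaths(claims));

        int iterations = 1_000_000;
        System.out.println("\"10\":   " + elapsedMillis("10", iterations) + " ms");
        System.out.println("\"10.1\": " + elapsedMillis("10.1", iterations) + " ms");
    }
}
```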

The Gson library is in maintenance mode, so pushing any fix there will be problematic. From my POV this situation is bad in 2 different ways:

For those reasons I think it would be interesting to evaluate moving to a different library. I know that json-smart was used in the past and worked relatively well. The project seems to be maintained again, so maybe it’s an option.

Comments (4)

  1. Vladimir Dzhuvinov

    Okay, so it looks like GSon is capable of hitting teletype level performance in 2024.

    We’ve had JSON Smart for a number of years, but then we got to a time when it started receiving DoS + unchecked exception CVEs every other month, and that was upsetting a lot of people and causing us a lot of work. Some of these vulns were known to us prior to that, and we kind of mitigated around them. Then, after years, CVEs became a thing and people started “discovering” them. The internals of JSON Smart are pretty messy. I’ve chatted with the author on several occasions. He’s a genius, but apparently that’s his style of work. I feel uneasy about going back to JSON Smart. On a few occasions in deep prod, Smart serialised utter JSON weirdness for which we had no explanation. I’d rather err on the side of performance than have us break things.

    Let me know how you see things, short list of good candidates. And how you see approaching this.

    Thank god that bit of the lib is shaded now and changes should be easy to pull off.

  2. Vladimir Dzhuvinov

    Released:

    version 9.39.3 (2024-05-30)
    * Bumps GSon to 2.11.0, addressing Number parsing performance issue (#546).
    
