Switch to java8's Base64 decoder for performance

Issue #380 new
Yohei Ueki created an issue

The performance of src/main/java/com/nimbusds/jose/util/Base64Codec.java is bad when comparing to java8’sjava.util.Base64.

My experiment shows that Base64Codec#decode consumes 20% of CPU time during parsing JWT.

I simply replace Base64Codec#decode as follows, and this consumes only 4% of CPU time for parsing JWTs.

    public static byte[] decode(final String b64String) {
        return Base64.getUrlDecoder().decode(b64String);
    }

I know this is blocked by issue #259.

Comments (12)

  1. Vladimir Dzhuvinov

    Interesting result. What tool was used?

    Would you be able to benchmark how much time it takes to decode a given Base64 string, with each decoder?

  2. Yohei Ueki reporter

    Hi Vladimir,

    I used jmh(benchmark runner) and async-profiler.
    During benchmarking jwt parsing, I found one of the bottleneck is base64 parsing.

    I pushed benchmark code and result of base64 parsers: https://github.com/yueki1993/nimbus-base64-benchmark, which compares java8, nimbus, guava, and commons-codec.
    The result shows java8's base64 decoder is 5 or 6 times faster than nimbus'.

  3. Yohei Ueki reporter

    Hi,

    I also pushed the async-profiler result.

    https://github.com/yueki1993/nimbus-base64-benchmark#benchmark-and-profile-signedjwt-parsing

    For large string, it takes more time to parse base64/json than rsa signature verification.

    The profiling results also imply that we should avoid toJSONObject() as less as possible.
    SignedJWT.java#getJWTClaimSet executes getPayload().toJSONObject(),
    and the jsonify result is not cached in Payload#toJSONObject() .
    This is bad when jwt.getJWTClaimsSet is multiply executed like:

    • first, we execute jwt = SignedJWT.parse(jwtStr),
    • second, we do some business logic using jwt.getJWTClaimsSet(),
    • and later, we execute DefaultJWTProcessor.process(jwt, ctx) to verify the token, and this executes jwt.getClaimsSet() internally.

    https://bitbucket.org/connect2id/nimbus-jose-jwt/src/13cd858dc48b99206de5d0a9df5d65aace4ddbc7/src/main/java/com/nimbusds/jwt/SignedJWT.java#lines-85
    https://bitbucket.org/connect2id/nimbus-jose-jwt/src/13cd858dc48b99206de5d0a9df5d65aace4ddbc7/src/main/java/com/nimbusds/jose/Payload.java#lines-356

    If ok, I want to submit a PR to cache the result of jsonify.
    [Plan A: modify Payload.java]

    //line 115
    private Map<String, Object> jsonObject; // remove `final`
    
    ...
    // line 356
    jsonObject = JSONObjectUtils.parse(s);
    return jsonObject
    

    https://bitbucket.org/connect2id/nimbus-jose-jwt/src/13cd858dc48b99206de5d0a9df5d65aace4ddbc7/src/main/java/com/nimbusds/jose/Payload.java
    This modification breaks consistency with other fields of Payload (string, bytes, base64URL, jwsObject etc; they are still final ), so this would not be good idea.

    [Plan B: modify subclasses#getJWTClaimsSet of JWT]

        private JWTClaimsSet claimsSet = null;
        ...
        @Override
        public JWTClaimsSet getJWTClaimsSet()
            throws ParseException {
            if (claimsSet != null) return claimsSet;
    
            Map<String, Object> json = getPayload().toJSONObject();
    
            if (json == null) {
                throw new ParseException("Payload of JWS object is not a valid JSON object", 0);
            }
    
            claimsSet = JWTClaimsSet.parse(json);
            return claimsSet;
        }
    
        // if we should consider that someone extends SignedJWT/PlainJWT/EncryptedJWT and override setPayload,
        // it would be more safe to purge the claimSet cache.
        @Override
        protected void setPayload(Payload payload) {
            claimsSet = null;
            super.setPayload(payload);
        }
    

    I think this would be better, but what do you think?

    Also, I pushed DefaultJWTProcessor with JWSVerificationKeySelector benchmark.
    JWSVerificationKeySelector#selectJWSKeys translates JWK → Key everytime it executes and this would consume cpu time.
    (And actually this becomes the bottleneck in my project)
    I can avoid this problem by implementing JWSKeySelector by myself,
    but if possible, I want to fix JWSVerificationKeySelector so that it can have a cache of Key object since it would be updated rarely.
    Can I submit this PR as well?

  4. Vladimir Dzhuvinov

    Thank you so much for your thorough post and investigation.

    It looks like the JDK team has done a great job with the Java 8 Base64 decoder. The Nimbus Base64 codec was refactored at some time in the past by a consultant we worked with, to prevent timing leaks. This may have affected the performance adversely, which in the early days was quite good (I cannot recall exact benchmarks right now). Perhaps one could select the safe decoder for the critical ops only, and use a performance decoder for everything else. This will require some thought.

    Could you submit a PR according to Plan B, and also for the key selector?

  5. Vladimir Dzhuvinov

    FYI: The merged RP was pushed to Maven Central as version 9.1 (2020-10-20).

    version 9.1 (2020-10-20)
        * Caches JWT.getJWTClaimsSet() result in implementations to improve
          performance (iss #380).
        * Adds new static X509CertUtils.setProvider(java.security.Provider) method
          for setting a preferred JCA provider for the certification operations.
          The X509CertUtils.getProvider method returns the currently configured JCA
          provider (iss #382).
    

  6. Yohei Ueki reporter

    Hi Vladimir,
    I am really sorry for my late reply.

    Thanks for merging & releasing my PR!

    I also tried to tune Key Selector but my modification seems unnatural.
    (The best solution would be implementing toJavaKeys() in JWK but avoiding breaking change seems hard)
    https://bitbucket.org/yueki/nimbus-jose-jwt-pr/commits/eadf6089c85eb4affe2b52fc10520235fd3fe1d7
    And what’s more profiling results imply that tuning base64 decoder should reduce CPU time of translating JWK → Key
    because most of time is consumed by base64 decoding.
    So I wait for base64 parser improvement!

    Thanks,

  7. Yohei Ueki reporter

    Hi Yavor&Vladimir,

    Really sorry for my delay.

    https://github.com/yueki1993/nimbus-base64-benchmark#result
    I attached results of base64 nimbus v5.14 (the version before base64codec refactored) and google-tink.

    Actually the performance of old nimbus base64 decoder is (about 2 times) faster than the new one
    Still, java8’s and guava’s ones are faster.

    I found nimbus already has google-tink dependency in pom.xml (it is optional dependency, though).
    If updating to java8 is hard and adding google-guava dependency is bad, using google-tink one would be good idea.
    (I am unsure google-tink’s one considers side-channel attack etc, though.)

  8. Vladimir Dzhuvinov

    Thanks for benchmarking the old implementation.

    We definitely must keep the constant-time codec for critical operations.

    One approach I imagine is to define codec interfaces for a safe and unsafe codec, and update the underlying code to call the safe when required, and allow the unsafe in all other cases.

    Thus you will be able to plug your own codec, be it the one that comes with Java 8 or some other. How does this sound to you?

    In any case we need to check how much work that is roughly going to require.

  9. Yohei Ueki reporter

    That sounds great to me.
    Almost of engineers including me are not expert on security, so it would be nice if
    nimbus' users do not need care about security and nimbus judges where safe codec must be used and where unsafe codec is allowed.

  10. Log in to comment