Switch to java8's Base64 decoder for performance

Issue #380 new

Yohei Ueki created an issue 2020-10-01

The performance of src/main/java/com/nimbusds/jose/util/Base64Codec.java is bad when comparing to java8’sjava.util.Base64.

My experiment shows that Base64Codec#decode consumes 20% of CPU time during parsing JWT.

I simply replace Base64Codec#decode as follows, and this consumes only 4% of CPU time for parsing JWTs.

    public static byte[] decode(final String b64String) {
        return Base64.getUrlDecoder().decode(b64String);
    }

I know this is blocked by issue ~~#259~~.

‌

Comments (12)

Yohei Ueki reporter
- edited description
- 2020-10-01T15:38:25+00:00
Vladimir Dzhuvinov
Interesting result. What tool was used?

Would you be able to benchmark how much time it takes to decode a given Base64 string, with each decoder?
- 2020-10-02T07:20:52+00:00
Yohei Ueki reporter
Hi Vladimir,

I used jmh(benchmark runner) and async-profiler.
During benchmarking jwt parsing, I found one of the bottleneck is base64 parsing.

I pushed benchmark code and result of base64 parsers: https://github.com/yueki1993/nimbus-base64-benchmark, which compares java8, nimbus, guava, and commons-codec.
The result shows java8's base64 decoder is 5 or 6 times faster than nimbus'.
- 2020-10-02T15:25:13+00:00
Yohei Ueki reporter
Hi,

I also pushed the async-profiler result.

https://github.com/yueki1993/nimbus-base64-benchmark#benchmark-and-profile-signedjwt-parsing

For large string, it takes more time to parse base64/json than rsa signature verification.

The profiling results also imply that we should avoid toJSONObject() as less as possible.
SignedJWT.java#getJWTClaimSet executes getPayload().toJSONObject(),
and the jsonify result is not cached in Payload#toJSONObject() .
This is bad when jwt.getJWTClaimsSet is multiply executed like:
- first, we execute jwt = SignedJWT.parse(jwtStr),
- second, we do some business logic using jwt.getJWTClaimsSet(),
- and later, we execute DefaultJWTProcessor.process(jwt, ctx) to verify the token, and this executes jwt.getClaimsSet() internally.
https://bitbucket.org/connect2id/nimbus-jose-jwt/src/13cd858dc48b99206de5d0a9df5d65aace4ddbc7/src/main/java/com/nimbusds/jwt/SignedJWT.java#lines-85
https://bitbucket.org/connect2id/nimbus-jose-jwt/src/13cd858dc48b99206de5d0a9df5d65aace4ddbc7/src/main/java/com/nimbusds/jose/Payload.java#lines-356

If ok, I want to submit a PR to cache the result of jsonify.
[Plan A: modify Payload.java]
```
//line 115
private Map<String, Object> jsonObject; // remove `final`

...
// line 356
jsonObject = JSONObjectUtils.parse(s);
return jsonObject
```
https://bitbucket.org/connect2id/nimbus-jose-jwt/src/13cd858dc48b99206de5d0a9df5d65aace4ddbc7/src/main/java/com/nimbusds/jose/Payload.java
This modification breaks consistency with other fields of Payload (string, bytes, base64URL, jwsObject etc; they are still final ), so this would not be good idea.

[Plan B: modify subclasses#getJWTClaimsSet of JWT]
```
    private JWTClaimsSet claimsSet = null;
    ...
    @Override
    public JWTClaimsSet getJWTClaimsSet()
        throws ParseException {
        if (claimsSet != null) return claimsSet;

        Map<String, Object> json = getPayload().toJSONObject();

        if (json == null) {
            throw new ParseException("Payload of JWS object is not a valid JSON object", 0);
        }

        claimsSet = JWTClaimsSet.parse(json);
        return claimsSet;
    }

    // if we should consider that someone extends SignedJWT/PlainJWT/EncryptedJWT and override setPayload,
    // it would be more safe to purge the claimSet cache.
    @Override
    protected void setPayload(Payload payload) {
        claimsSet = null;
        super.setPayload(payload);
    }
```
I think this would be better, but what do you think?

‌

Also, I pushed DefaultJWTProcessor with JWSVerificationKeySelector benchmark.
JWSVerificationKeySelector#selectJWSKeys translates JWK → Key everytime it executes and this would consume cpu time.
(And actually this becomes the bottleneck in my project)
I can avoid this problem by implementing JWSKeySelector by myself,
but if possible, I want to fix JWSVerificationKeySelector so that it can have a cache of Key object since it would be updated rarely.
Can I submit this PR as well?

‌

‌
- 2020-10-03T04:43:05+00:00
Vladimir Dzhuvinov
Thank you so much for your thorough post and investigation.

It looks like the JDK team has done a great job with the Java 8 Base64 decoder. The Nimbus Base64 codec was refactored at some time in the past by a consultant we worked with, to prevent timing leaks. This may have affected the performance adversely, which in the early days was quite good (I cannot recall exact benchmarks right now). Perhaps one could select the safe decoder for the critical ops only, and use a performance decoder for everything else. This will require some thought.

Could you submit a PR according to Plan B, and also for the key selector?

‌
- 2020-10-05T07:50:44+00:00
Yavor Vasilev
I found the PR which had the BASE64 CODEC rewrite to mitigate side channel leaks (because the original codec had a static table):
https://bitbucket.org/connect2id/nimbus-jose-jwt/pull-requests/38/mitigate-cache-timing-attacks-on-base64/diff
- 2020-10-05T17:29:04+00:00
Yohei Ueki reporter
Hi,

Sorry for the delay.
I submitted a PR of getJWTClaimsSet() caching : https://bitbucket.org/connect2id/nimbus-jose-jwt/pull-requests/71/cache-getjwtclaimsset-result-in-jwt
Later I will also submit a PR for key selector.

And thanks Yavor; I’ll add a b64 benchmark of base64 codec before the PR.

‌
- 2020-10-11T07:50:29+00:00

Vladimir Dzhuvinov

FYI: The merged RP was pushed to Maven Central as version 9.1 (2020-10-20).

version 9.1 (2020-10-20)
    * Caches JWT.getJWTClaimsSet() result in implementations to improve
      performance (iss #380).
    * Adds new static X509CertUtils.setProvider(java.security.Provider) method
      for setting a preferred JCA provider for the certification operations.
      The X509CertUtils.getProvider method returns the currently configured JCA
      provider (iss #382).

‌

2020-10-20T09:07:39+00:00

Yohei Ueki reporter
Hi Vladimir,
I am really sorry for my late reply.

Thanks for merging & releasing my PR!

I also tried to tune Key Selector but my modification seems unnatural.
(The best solution would be implementing toJavaKeys() in JWK but avoiding breaking change seems hard)
https://bitbucket.org/yueki/nimbus-jose-jwt-pr/commits/eadf6089c85eb4affe2b52fc10520235fd3fe1d7
And what’s more profiling results imply that tuning base64 decoder should reduce CPU time of translating JWK → Key
because most of time is consumed by base64 decoding.
So I wait for base64 parser improvement!

Thanks,
- 2020-10-27T14:44:05+00:00
Yohei Ueki reporter
Hi Yavor&Vladimir,

Really sorry for my delay.

https://github.com/yueki1993/nimbus-base64-benchmark#result
I attached results of base64 nimbus v5.14 (the version before base64codec refactored) and google-tink.

Actually the performance of old nimbus base64 decoder is (about 2 times) faster than the new one
Still, java8’s and guava’s ones are faster.

I found nimbus already has google-tink dependency in pom.xml (it is optional dependency, though).
If updating to java8 is hard and adding google-guava dependency is bad, using google-tink one would be good idea.
(I am unsure google-tink’s one considers side-channel attack etc, though.)

‌

‌
- 2020-11-14T05:03:43+00:00
Vladimir Dzhuvinov
Thanks for benchmarking the old implementation.

We definitely must keep the constant-time codec for critical operations.

One approach I imagine is to define codec interfaces for a safe and unsafe codec, and update the underlying code to call the safe when required, and allow the unsafe in all other cases.

Thus you will be able to plug your own codec, be it the one that comes with Java 8 or some other. How does this sound to you?

In any case we need to check how much work that is roughly going to require.
- 2020-11-16T10:35:33+00:00
Yohei Ueki reporter
That sounds great to me.
Almost of engineers including me are not expert on security, so it would be nice if
nimbus' users do not need care about security and nimbus judges where safe codec must be used and where unsafe codec is allowed.

‌
- 2020-11-23T05:22:43+00:00
Log in to comment

Assignee: –

Type: enhancement

Priority: major

Status: new

Component: –

Milestone: –

Votes: 0

Watchers: 1