should auth_req_id have limits on allowable characters?

Issue #150 resolved
Joseph Heenan created an issue

I can't find anything in the spec for auth_req_id that limits its allowable characters or maximum length.

For interoperability purposes it may be desirable to have a limited allowed character set (same as base64url allows?) and a max length (1024 for consistency with client_notification_token?).

(device_code in the OAuth 2.0 Device Flow also doesn't have any restrictions I can see, https://tools.ietf.org/html/draft-ietf-oauth-device-flow-13#section-3.2 )

Comments (11)

  1. Brian Campbell

    The auth_req_id is communicated in HTTP entity bodies with JSON or as a form encoded parameter with application/x-www-form-urlencoded format and a character encoding of UTF-8, so I don't see a compelling reason to place arbitrary limits on it. And as far as I know the same rationale applies to the Device Flow and is why there are no restrictions there on the device_code either.

  2. Joseph Heenan reporter

    I'm fine with that.

    Would this imply that when we write a test for a CIBA client, we should include a test where an auth_req_id has:

    1. U+00ff Latin Small Letter Y with Diaeresis (will trip up anyone that's not correctly applying utf-8 or is denormalising the string somehow)
    2. U+0000 Unicode Character 'NULL' (legal in a JSON string I believe but likely to cause issues for C implementations)
    3. U+0079 U+0308 Latin small letter Y, COMBINING DIAERESIS (testing that the string isn't being normalised)
    4. U+1F44D U+1F3FB THUMBS UP SIGN, EMOJI MODIFIER FITZPATRICK TYPE-1-2 (emojis are all the rage)
    5. Enough random things spread across the rest of the utf-8 valid space to give 160 bits of entropy

    These would all be correctly utf-8 and whatever else encoded.
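
A minimal sketch of the first four cases above, assuming Python and only its standard library. The `CASES` names and the `describe` helper are hypothetical and exist only for illustration; the sketch shows the UTF-8 bytes, the JSON string form, and whether NFC normalisation would alter each value.

```python
import json
import unicodedata

# Hypothetical test values mirroring cases 1-4 from the list above.
CASES = {
    "y_diaeresis": "\u00ff",
    "nul": "\u0000",
    "y_combining_diaeresis": "\u0079\u0308",
    "thumbs_up_type_1_2": "\U0001F44D\U0001F3FB",
}


def describe(value: str) -> dict:
    """Return the UTF-8 bytes, JSON string form, and NFC form of a value."""
    return {
        "utf8": value.encode("utf-8"),
        "json": json.dumps(value),  # ASCII-escaped by default
        "nfc": unicodedata.normalize("NFC", value),
    }


for name, value in CASES.items():
    info = describe(value)
    # The last column is False when normalisation would change the string.
    print(name, info["utf8"].hex(), info["json"], info["nfc"] == value)
```

Case 3 is the only one of these that NFC normalisation changes: it collapses to the single code point of case 1, so a client that normalises would send back a different auth_req_id than it received.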

    (In case it's not clear, I'm not trying to be difficult - I'm trying to figure out the limits of where we should go with interoperability testing - the aim being that if we have an RP and an OP that both pass the tests they're guaranteed to work together no matter how weirdly each one decides to interpret the specs. Also I loathe Unicode... and example '3' did cause an interoperability issue for my team just last week, albeit not in the oauth space)

  3. Joseph Heenan reporter

    This was discussed for about 20 minutes on today's call (sorry).

    There were good arguments on both sides, but I believe a preference came out to restrict the character set in a similar way to how client_notification_token is restricted.

    The argument for restricting is that:

    1) It is highly desirable to avoid the need for clients to be overly careful in how they handle the strings; decoding utf-8 characters from json that sit outside the 'safe' set, and reencoding them in url form encoding, is easy to get wrong.

    2) All servers implemented by sane engineers are likely to just send base64url things anyway

    3) Everything else already appears to be restricted in a similar way

    4) Restricting the character set means the conformance tests for CIBA can be simpler as they don't need to test the weird edge cases
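
To illustrate argument 1, here is a rough sketch in Python (standard library only) of the step a client has to get right: decode auth_req_id from the JSON acknowledgement, then re-encode it as a form parameter for the token request. The helper name `to_token_request_body` and both sample values are made up for illustration, and a real token request would carry further parameters such as client authentication.

```python
import json
from urllib.parse import parse_qs, urlencode


def to_token_request_body(ack_json: str) -> str:
    """Decode the JSON acknowledgement and re-encode auth_req_id as a form body."""
    auth_req_id = json.loads(ack_json)["auth_req_id"]
    return urlencode({
        "grant_type": "urn:openid:params:grant-type:ciba",
        "auth_req_id": auth_req_id,
    })


# Illustrative value drawn from the 'safe' set: it passes through unchanged.
safe = to_token_request_body('{"auth_req_id": "1c266114-a1be-4252-8ad1-04986c5b9ac1"}')

# Illustrative value outside the 'safe' set: every step must agree on UTF-8
# and on percent-encoding for the server to get back the same string.
awkward = to_token_request_body('{"auth_req_id": "\\u00ff y\\u0308 +/="}')

print(safe)
print(awkward)
print(parse_qs(awkward)["auth_req_id"][0] == "\u00ff y\u0308 +/=")
```

A library such as `urlencode` handles this correctly, but a client that concatenates the form body by hand would send the second value corrupted, which is the failure mode a restricted character set avoids.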

    We talked about using a similar restriction to the one for bearer tokens, but I'd slightly misremembered what the bearer token restriction was; to quote https://tools.ietf.org/html/rfc6750#section-2.1 :

    The syntax for Bearer credentials is as follows:

    b64token    = 1*( ALPHA / DIGIT /
                      "-" / "." / "_" / "~" / "+" / "/" ) *"="
    credentials = "Bearer" 1*SP b64token
    

    This is actually aligned to the normal base64 set, which wouldn't be sane for auth_req_id which does have to be url encoded.
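
A small sketch of the difference, assuming Python's standard `base64` and `urllib.parse` modules. The input bytes are arbitrary, picked only because they produce '+', '/' and padding in the standard alphabet.

```python
import base64
from urllib.parse import quote

# Arbitrary bytes chosen to exercise '+', '/' and '=' in standard base64.
raw = bytes([0xFB, 0xFF, 0xFE, 0x3E])

std = base64.b64encode(raw).decode("ascii")
url = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")  # unpadded base64url

# Percent-encode as a strict encoder would (no characters exempted).
print(std, "->", quote(std, safe=""))
print(url, "->", quote(url, safe=""))
```

The standard-alphabet value changes when it is URL encoded, while the unpadded base64url value is sent byte for byte as generated.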

    As discussed on the call, the character set used in unpadded base64url seems like a safe and expected choice, hence my suggested wording to be added to https://openid.net/specs/openid-client-initiated-backchannel-authentication-core-1_0.html#successful_authentication_request_acknowdlegment under auth_req_id something like:

    The OpenID Provider MUST restrict the characters used to 'A'-'Z', 'a'-'z', '0'-'9', '-' and '_', to reduce the chance of the client incorrectly decoding or re-encoding the auth_req_id; this character set was chosen to allow the server to use unpadded base64url if it wishes. The identifier MUST be treated as opaque by the client.

    Suggestions to improve that wording are most welcome.

  4. Joseph Heenan reporter

    I guess it might be sane to also allow '.' in case any servers decide to use JWS/JWE?

  5. Joseph Heenan reporter

    So suggested text would be (with the '.' addition):

    The OpenID Provider MUST restrict the characters used to 'A'-'Z', 'a'-'z', '0'-'9', '.', '-' and '_', to reduce the chance of the client incorrectly decoding or re-encoding the auth_req_id; this character set was chosen to allow the server to use unpadded base64url if it wishes. The identifier MUST be treated as opaque by the client.
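
As a rough sketch of what the suggested text would mean in practice, assuming Python: a conformance test could check the character set with a single regular expression, and a server could generate a compliant identifier with the standard `secrets` module. The names `AUTH_REQ_ID_PATTERN`, `is_valid_auth_req_id` and `new_auth_req_id` are hypothetical, and the 20-byte length is an assumption taken from the 160 bits of entropy mentioned earlier in this thread; the suggested text itself sets no length.

```python
import re
import secrets

# Character set from the suggested wording: A-Z, a-z, 0-9, '.', '-' and '_'.
AUTH_REQ_ID_PATTERN = re.compile(r"[A-Za-z0-9._-]+")


def is_valid_auth_req_id(value: str) -> bool:
    """Return True if value is non-empty and uses only the suggested characters."""
    return AUTH_REQ_ID_PATTERN.fullmatch(value) is not None


def new_auth_req_id() -> str:
    """Generate an unpadded base64url identifier from 20 random bytes (160 bits)."""
    return secrets.token_urlsafe(20)


print(is_valid_auth_req_id(new_auth_req_id()))
print(is_valid_auth_req_id("header.payload.signature"))
print(is_valid_auth_req_id("\u00ff"))
```

The check applies only on the test or server side; the client would still treat the value as opaque and pass it through unmodified.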
