verified_claims in an introspection response should represent input requirements for verified claims

Issue #1175 closed
Takahiko Kawasaki created an issue

A response from the introspection endpoint (RFC 7662) includes information about an access token itself. The response should not include information that may be obtained by using the access token. In that sense, the value of verified_claims in a response from the introspection endpoint should not be the end-user’s data. Instead, if included, the value of verified_claims in an introspection response should represent input requirements for verified claims. Embedding the end-user’s data in an introspection response is equivalent to embedding the end-user’s data in a JWT access token, which I don’t believe is a good practice.

Comments (12)

  1. Torsten Lodderstedt

    Hi @Takahiko Kawasaki, why do you believe embedding end-user claims in an access token is not a good practice? It allows the RS to directly use this data without another (remote) lookup. This pattern is used in several places and it is documented in the upcoming JWT access token BCP (see https://tools.ietf.org/html/draft-ietf-oauth-access-token-jwt-03#section-2.2.1).

    Privacy is preserved by encrypting the token contents. Consequently, the same holds true for introspection responses. It is not as efficient, but it allows audience-restricted token content to be implemented without the need to issue RS-specific access tokens.

  2. Takahiko Kawasaki reporter

    That part of “JSON Web Token (JWT) Profile for OAuth 2.0 Access Tokens” says the following:

    Commercial authorization servers will often include resource owner attributes directly in access tokens, so that resource servers can consume them directly for authorization or other purposes without any further roundtrips to introspection ([RFC7662]) or userinfo ([OpenID.Core]) endpoints.

    However, methods that are often used are not necessarily best practices. An example we all know well is “OAuth authentication”.

    The userinfo endpoint can return up-to-date information when it is called. Embedding user data into an access token when it is issued will make it impossible to return up-to-date information from the userinfo endpoint. In that sense, I object to the above paragraph excerpted from “JSON Web Token (JWT) Profile for OAuth 2.0 Access Tokens”, too.

    That privacy is preserved by encryption doesn’t matter. That another lookup becomes unnecessary doesn’t matter, either. It just sounds strange that the introspection endpoint returns not only information about an access token itself but also the end-user’s detailed information that may be or should be obtained by calling another API (e.g. the userinfo endpoint).

    In my opinion, including user data in an introspection response (and in an access token) is not the right approach; it is a kind of negligence, abuse, and bad architecture found in some commercial/open-source implementations.


    Here is my logic.

    When an implementation of the userinfo endpoint receives an access token, the implementation has to know what claims are requested by the access token (unless user info was embedded in the access token when the access token was issued).

    This means, as an inevitable logical consequence, an access token must be associated with the information contained in the “userinfo” property in the claims request parameter of the authorization request which was made for the issuance of the access token.

    For example, an access token must be associated with the information equivalent to the following JSON (excerpted from OIDC Core 1.0 Section 5.5).

    {
     "given_name": {"essential": true},
     "nickname": null,
     "email": {"essential": true},
     "email_verified": {"essential": true},
     "picture": null,
     "http://example.info/claims/groups": null
    }
    

    After getting the information contained in the “userinfo” property associated with the access token in some way or other, the implementation of the userinfo endpoint accesses the user database to retrieve actual values of the requested claims.
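The lookup described above can be sketched roughly as follows. This is a minimal illustration, not any particular AS implementation: `TOKEN_STORE`, `USER_DB`, the token value, and the `userinfo` function are hypothetical names, and the stored request is a shortened version of the OIDC Core example.

```python
# Hypothetical in-memory stand-ins; a real AS would use its own token and user stores.
TOKEN_STORE = {
    "token-abc": {
        "sub": "user-1",
        # The "userinfo" member of the claims request, stored when the token was issued.
        "userinfo_request": {
            "given_name": {"essential": True},
            "nickname": None,
            "email": {"essential": True},
        },
    }
}

USER_DB = {
    "user-1": {
        "given_name": "Jane",
        "family_name": "Doe",
        "email": "jane@example.com",
    },
}


def userinfo(access_token: str) -> dict:
    """Return current values for the claims that were requested for this token."""
    record = TOKEN_STORE[access_token]
    user = USER_DB[record["sub"]]
    response = {"sub": record["sub"]}
    for claim_name in record["userinfo_request"]:
        # Claims the user has no value for are omitted from the response.
        if claim_name in user:
            response[claim_name] = user[claim_name]
    return response
```

In this sketch the token record holds only the request, so the values returned are whatever the user database contains at call time.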

    If an implementation of the userinfo endpoint has to return actual values of requested verified claims too, an access token must be associated with the information like the following.

    {
     "given_name": {"essential": true},
     "nickname": null,
     "email": {"essential": true},
     "email_verified": {"essential": true},
     "picture": null,
     "http://example.info/claims/groups": null,
     "verified_claims": {
      "verification": {
       "time": null
      },
      "claims": {
       "given_name": null,
       "family_name": null
      }
     }
    }
    

    This is the very information, the information about “what verified claims are requested with what conditions”, that must be associated with an access token regarding verified claims. Actual values of requested verified claims should not be directly tied to an access token.


    Now, I can go back to the starting point. Because the introspection endpoint is an endpoint that returns information about an access token, if responses from the introspection endpoint include “verified_claims”, the value of “verified_claims” should represent the query conditions for verified claims instead of actual values of verified claims.
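As a rough sketch of this proposal, an introspection response could carry the stored request as-is. The `active`, `scope`, and `sub` members come from RFC 7662; the `introspect` function, the `token_record` layout, and the `verified_claims_request` key are assumptions made only for illustration.

```python
def introspect(token_record: dict) -> dict:
    """Build an RFC 7662-style response that describes the token only."""
    response = {
        "active": True,
        "scope": token_record["scope"],
        "sub": token_record["sub"],
    }
    requested = token_record.get("verified_claims_request")
    if requested is not None:
        # Query conditions for verified claims, not the end-user's actual values.
        response["verified_claims"] = requested
    return response


# Hypothetical token record, holding the request shown in the JSON above.
token_record = {
    "scope": "openid",
    "sub": "user-1",
    "verified_claims_request": {
        "verification": {"time": None},
        "claims": {"given_name": None, "family_name": None},
    },
}
```

Under this reading, a caller of `introspect(token_record)` learns which verified claims the token entitles it to ask for, and has to obtain the values elsewhere.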

  3. Torsten Lodderstedt

    However, methods that are often used are not necessarily best practices. An example we all know well is “OAuth authentication”.

    In contrast to “OAuth authentication”, the practice of conveying user claims in access token and introspection responses is documented in specifications of the OAuth WG. If you think the OAuth WG is wrong, I recommend you to post to the OAuth mailing list.

    Now, I can go back to the starting point. Because the introspection endpoint is an endpoint that returns information about an access token, if responses from the introspection endpoint include “verified_claims”, the value of “verified_claims” should represent the query conditions for verified claims instead of actual values of verified claims.

    I have to admit I cannot follow you. I assume we are talking about an OAuth use case, where a client obtains an access token with a certain scope relevant for invoking an RS, let’s say scope “get_open_issues” at an issue-tracking API.

    When the request hits the API, it wants to determine which user delegated the access and what user privileges were delegated to the calling client. In my opinion, it is straightforward to add the user id (sub) and at least the roles of the user to the access token; other user-specific data (aka claims) might be needed as well. The RS can directly validate the authorisation of the client to make the call and proceed.

    Please explain how this is supposed to work with your proposal.
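The RS-side check described in this comment might look roughly like the sketch below. It assumes the introspection response carries a `roles` member, which is not defined by RFC 7662 and is used here only as a hypothetical example; `is_authorized` and the role name are likewise illustrative.

```python
def is_authorized(introspection_response: dict, required_scope: str, required_role: str) -> bool:
    """Decide locally whether the call may proceed, without a further lookup."""
    if not introspection_response.get("active", False):
        return False
    # RFC 7662 represents scope as a space-separated string.
    granted_scopes = introspection_response.get("scope", "").split()
    # "roles" is a hypothetical, non-standard member assumed for this sketch.
    roles = introspection_response.get("roles", [])
    return required_scope in granted_scopes and required_role in roles
```

For the issue-tracking example, the API would call something like `is_authorized(response, "get_open_issues", "developer")` once it has the introspection response.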

  4. Torsten Lodderstedt

    @Takahiko Kawasaki Will you add further explanation as a basis for further discussion, or shall I close this ticket?

  5. Takahiko Kawasaki reporter

    This diagram illustrates what I explained previously. The introspection response should include the information about the access token only. The information about the end-user (the actual values of the end-user’s claims) should be handled somewhere else. Implementers of monolithic AS implementations don’t care about this separation, though.

  6. Torsten Lodderstedt

    How does the UserInfo Endpoint in your example determine the user id as a prerequisite to look the correct data up in the User DB?

    The information about the end-user (the actual values of the end-user’s claims) should be handled somewhere else.

    That’s the whole point. How is an ordinary RS supposed to gain access to this data in your proposal? Your illustration is based on the UserInfo endpoint that is special since it belongs to the AS/OP and has direct access to a User DB. Please describe how this is supposed to work for ordinary RSs not belonging to the AS, e.g. the tracking API I mentioned.

  7. Takahiko Kawasaki reporter

    How does the UserInfo Endpoint in your example determine the user id as a prerequisite to look the correct data up in the User DB?

    The introspection response contains a sub claim that represents the user ID.

    How is an ordinary RS supposed to gain access to this data

    The diagrams below illustrate how.

  8. Torsten Lodderstedt

    A couple of questions:

    The introspection response contains a sub claim that represents the user ID.

    So what is the reason for not including additional claims?

    Pattern 1:

    • How is this request protected?
    • Is there an interoperable standard for the Profile API? Otherwise, this would be a proprietary solution in contrast to using the Token Introspection Endpoint for claims delivery.

    Pattern 2:

    • How is this request protected?
    • This obviously is a proprietary solution.

    Pattern 3:

    • The UserInfo endpoint requires a suitable access token. How is the RS supposed to obtain this access token? In your proposal, the introspection endpoint returns the list of claims the RS needs to acquire, but how do you bridge the gap between this response and the request towards the UserInfo endpoint?

  9. Takahiko Kawasaki reporter

    So what is the reason for not including additional claims?

    Because the sub claim is special. User IDs are used not only by the database table that holds user attributes but also by other database tables that do not hold user attributes, e.g. chat history, payment history, friend list, etc. User IDs are not user attributes but keys to refer to user attributes. Some people may not care about the difference, though.

    Regarding interoperability of pattern 1, there is no standard. If it is needed, we can develop it. Extending the introspection endpoint without trying to develop a profile endpoint for RSs seems to be a kind of abuse to me, as I have mentioned repeatedly. But since others don’t agree with me, I won’t object to it any more.

    However, to say the least, AS implementations whose access tokens are directly tied to (a) actual values of verified claims which were available when the access tokens were created, instead of (b) query conditions of verified claims, are not good. This is my kind advice even to my competitors. If access token implementations are tied to (b), the value of verified_claims in the introspection response should also be (b). This is what I have been insisting on.
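The difference between (a) and (b) can be illustrated with a small sketch. Everything here is hypothetical and heavily simplified: the verified claims are flattened to plain names, and `user_db`, `token_a`, `token_b`, and the two helper functions are illustrative only.

```python
# Hypothetical user database at the time the tokens are issued.
user_db = {"user-1": {"given_name": "Jane", "family_name": "Doe"}}
requested = ["given_name", "family_name"]

# (a) The token is tied to the actual values available at issuance.
token_a = {"sub": "user-1", "claims": {name: user_db["user-1"][name] for name in requested}}

# (b) The token is tied only to the query conditions.
token_b = {"sub": "user-1", "requested_claims": list(requested)}


def claims_from_a(token: dict) -> dict:
    """Return the snapshot that was stored in the token."""
    return dict(token["claims"])


def claims_from_b(token: dict, db: dict) -> dict:
    """Resolve the requested claims against the database at call time."""
    user = db[token["sub"]]
    return {name: user[name] for name in token["requested_claims"] if name in user}


# The user's data changes after both tokens were issued.
user_db["user-1"]["family_name"] = "Smith"
```

After the update, `claims_from_a(token_a)` still returns the values captured at issuance, while `claims_from_b(token_b, user_db)` reflects the current database contents.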

  10. Torsten Lodderstedt

    Hi @Takahiko Kawasaki, I appreciate and respect your opinion as an expert. But all of your counterproposals are much more complex ways for resource servers to obtain user data associated with a certain access token (new interfaces, additional access tokens needed to authorize the query of data belonging to another access token), and I cannot see any benefit. Directly passing this data in the access token or token introspection response is simple and straightforward, and I don’t see any disadvantage. It’s a well-established best practice (btw, since the good old Kerberos times) that is supported by many implementations.

    I suggest closing this ticket.
