OAuth Access Token Implementation
1. Implementation Types
How access tokens are implemented is up to each authorization server.
That said, as the following description in “1.4. Access Token” of RFC 6749 implies,
The token may denote an identifier used to retrieve the authorization information or may self-contain the authorization information in a verifiable manner (i.e., a token string consisting of some data and a signature). …
implementations can be categorized into two major patterns: “identifier type” and “self-contained type”. There is also a “hybrid type” that combines the two.
1.1. Identifier Type
In an identifier-type implementation, information associated with access tokens is stored in a database of the authorization server. Then, unique identifiers that can identify each database record are used as access tokens.
In actual implementations, it is likely that hash values of access tokens are stored in the database instead of access tokens themselves for better security.
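The idea of storing only hash values can be sketched as follows. This is a minimal illustration, not a description of any particular server: the in-memory dictionary stands in for the database, SHA-256 is an assumed choice of hash function, and the names `token_table`, `issue_token` and `lookup_token` are hypothetical.

```python
import hashlib
import secrets

# Hypothetical in-memory stand-in for the authorization server's database.
# Keys are SHA-256 hashes of access tokens; values are the associated records.
token_table = {}

def hash_token(token: str) -> str:
    """Return the hex SHA-256 digest used as the database key."""
    return hashlib.sha256(token.encode("ascii")).hexdigest()

def issue_token(client_id: str, scopes: list) -> str:
    """Generate a random identifier and store only its hash."""
    token = secrets.token_urlsafe(32)
    token_table[hash_token(token)] = {"client_id": client_id, "scopes": scopes}
    return token

def lookup_token(token: str):
    """Return the record for a presented token, or None if it is unknown."""
    return token_table.get(hash_token(token))
```

With this layout, a leaked copy of the table does not directly reveal usable access tokens, because only the hashes are stored.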
1.2. Self-Contained Type
In a self-contained-type implementation, information associated with an access token is embedded in the access token itself. The authorization server does not have to manage information associated with access tokens in its database.
1.3. Hybrid Type
In a hybrid-type implementation, self-contained-type access tokens are generated and at the same time database records corresponding to the access tokens are stored in a database of the authorization server.
2. How To Get Access Token Information
2.1. How To Get Information about Identifier-Type Access Token
If access tokens are identifier type, resource servers must make inquiries to the authorization server about access token information unless the resource servers and the authorization server directly share the same database. The API that the authorization server provides for the inquiries is called an “introspection endpoint”.
RFC 7662 (OAuth 2.0 Token Introspection) is the standard specification for the introspection endpoint.
The standard introspection endpoint accepts POST requests with a mandatory token request parameter and an optional token_type_hint request parameter and returns access token information in JSON format. The following are a request example and a response example from RFC 7662.
The one concern frequently raised about identifier-type access tokens is performance, because making an inquiry to an introspection endpoint involves network communication. In practice, however, concerns about introspection latency are largely mitigated by good caching. Of course, it is important to delete cache entries when the corresponding access tokens are revoked.
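A resource-server-side cache of introspection responses might look like the sketch below. It rests on several assumptions: the `introspect` callable is a hypothetical function that calls the introspection endpoint and returns the parsed JSON, the 60-second maximum age is arbitrary, and how the resource server learns about revocation is left open.

```python
import time

class IntrospectionCache:
    """Caches introspection responses keyed by access token."""

    def __init__(self, introspect, max_age_seconds=60, clock=time.time):
        self._introspect = introspect    # hypothetical callable: token -> dict
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries = {}               # token -> (cached_at, response)

    def get(self, token):
        """Return a cached response if still usable, else query the server."""
        now = self._clock()
        entry = self._entries.get(token)
        if entry is not None:
            cached_at, response = entry
            fresh = now - cached_at < self._max_age
            # "exp" is the expiration time in seconds since the epoch (RFC 7662).
            not_expired = response.get("exp", float("inf")) > now
            if fresh and not_expired:
                return response
        response = self._introspect(token)
        self._entries[token] = (now, response)
        return response

    def invalidate(self, token):
        """Drop the entry, e.g. when the token is known to have been revoked."""
        self._entries.pop(token, None)
```

The shorter the maximum age, the sooner a revoked token stops being accepted by a resource server that was never told about the revocation, at the cost of more introspection calls.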
2.2. How To Get Information about Self-Contained-Type Access Token
If access tokens are self-contained type, access token information (such as expiration date) can be obtained by reading the content of the access tokens. Resource servers do not have to make inquiries to the introspection endpoint of the authorization server.
3. Verification of Self-Contained-Type Access Token
The format of self-contained-type access tokens is publicly known unless they are encrypted. Therefore, access tokens can be easily counterfeited if there is no mechanism to prevent it.
Before trusting information embedded in an access token, resource servers must verify in some way that the access token is not a forgery. This is why the sentence excerpted above from “1.4. Access Token” of RFC 6749 says “in a verifiable manner”.
A common practice to detect forgery is to attach a signature to data and verify the signature when the data is used. In the case of self-contained-type access tokens, (1) an authorization server generates a signed access token and then (2) a resource server verifies the signature.
As JWT (JSON Web Token) defined in RFC 7519 is handy as a generic format for signature-attached data, JWT is often adopted as the format of self-contained-type access tokens. In fact, there exists a specification that assumes it.
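The two steps (sign on the authorization server, verify on the resource server) can be sketched with the Python standard library alone. HS256 is used here only because HMAC needs no third-party package; the function names and the choice of algorithm are assumptions for illustration, and a production system would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json

def b64url_encode(data: bytes) -> str:
    """Base64url without padding, as used by JWS."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Inverse of b64url_encode; restores the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def sign_token(claims: dict, key: bytes) -> str:
    """Step (1): the authorization server builds a signed token."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(signature)

def verify_token(token: str, key: bytes) -> dict:
    """Step (2): the resource server checks the signature before reading claims."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        signature = b64url_decode(signature_b64)
    except ValueError as exc:
        raise ValueError("malformed token") from exc
    # Accept only the expected algorithm; never trust "alg" blindly.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signing_input = (header_b64 + "." + payload_b64).encode("ascii")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("signature mismatch")
    return json.loads(b64url_decode(payload_b64))
```

A resource server would additionally check claims such as the expiration time after the signature has been verified; that part is omitted here.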
4. Consideration Points for JWT-based Access Token
4.1. Signature Algorithm
Available choices for signature algorithm are the ones listed in “3.1. “alg” (Algorithm) Header Parameter Values for JWS” of RFC 7518 (JSON Web Algorithms). Among them, none, which means “no signature”, is of course meaningless for JWT-based access tokens.
4.1.1. Symmetric Signature Algorithm
Because HS256, HS384 and HS512 are symmetric algorithms, an authorization server (which generates JWT-based access tokens) and a resource server (which interprets JWT-based access tokens) must share the same key. At the time of this writing, there is no specification that defines a rule to determine the shared key.
“10.1. Signing” of OpenID Connect Core 1.0 states that the shared key is “the octets of the UTF-8 representation of the client_secret value” when a symmetric algorithm is used for signing. However, this rule applies only between an authorization server and a client application and cannot be applied to a symmetric key shared between an authorization server and a resource server.
Therefore, implementers have to decide their own rules as to how to determine a shared key if they want to use symmetric algorithms for signing access tokens.
Some authorization server implementations issue pairs of client ID and client secret to resource servers. By treating resource servers as clients, the existing rules and infrastructure for keys can be reused. It may work, but I’m not so sure that mixing different concepts won’t cause inconsistencies somewhere unexpected in future.
4.1.2. Asymmetric Signature Algorithm
Other algorithms are asymmetric.
An authorization server signs an access token with a private key, and a resource server verifies the signature using a public key exposed by the authorization server. The resource server therefore has to obtain the public key of the authorization server before performing signature verification.
If the authorization server provides an endpoint that exposes its JWK Set document (RFC 7517) and the document includes a public key for verifying the signatures of access tokens, resource servers can download the public key from the endpoint.
If the authorization server supports OpenID Connect Discovery 1.0, resource servers can find the URL of the JWK Set document in a response from the discovery endpoint of the authorization server ({issuer-identifier}/.well-known/openid-configuration). A discovery endpoint returns information about the server’s configuration in JSON format. The value of the jwks_uri parameter in the JSON represents the URL of the JWK Set document. Google’s discovery endpoint is a live example.
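The lookup chain from issuer identifier to verification key can be sketched as below. The functions only build the URL and parse documents that have already been downloaded; the HTTP requests themselves are left out. Selecting a key by matching the kid value is a common convention assumed here, not something stated above, and all function names are hypothetical.

```python
import json

def discovery_url(issuer: str) -> str:
    """Build the discovery endpoint URL from an issuer identifier."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"

def extract_jwks_uri(discovery_json: str) -> str:
    """Return the jwks_uri value from a discovery response body."""
    config = json.loads(discovery_json)
    jwks_uri = config.get("jwks_uri")
    if not isinstance(jwks_uri, str):
        raise ValueError("discovery document has no jwks_uri")
    return jwks_uri

def find_key(jwk_set_json: str, kid: str) -> dict:
    """Pick the JWK whose kid matches the one in the JWT header (assumed convention)."""
    for jwk in json.loads(jwk_set_json).get("keys", []):
        if jwk.get("kid") == kid:
            return jwk
    raise KeyError(kid)
```

A resource server would typically cache the downloaded JWK Set and fetch it again only when it encounters an unknown kid.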
RS256 is “Recommended” in RFC 7518, but it is better not to use asymmetric algorithms that start with RS. For security reasons, “8.6. JWS algorithm considerations” in “Financial-grade API - Part 2: Read and Write API Security Profile” says that RS algorithms should not be used. In addition, from the viewpoint of key size and performance, other algorithms are preferable.
4.2. Encryption
JWT-based access tokens can be encrypted by using RFC 7516 (JSON Web Encryption).
4.2.1. Symmetric Encryption Algorithm
As with symmetric signature algorithms, implementers have to decide on a rule for determining a key shared between an authorization server and a resource server, because there is no standard specification for this purpose.
4.2.2. Asymmetric Encryption Algorithm
If an asymmetric algorithm is used for encryption, an authorization server uses a public key of a resource server in encrypting access tokens. However, there is no standard specification that defines how to get a public key of a resource server (Note1). Therefore, implementers have to decide how to pass a public key of a resource server to an authorization server.
(Note1: A specification (OAuth 2.0 Protected Resource Metadata) defining metadata of a resource server was proposed in the past and it included jwks_uri, but the last update of the draft was more than 2 years ago (Jan. 19, 2017) and it has already expired.)
If an authorization server encrypts access tokens with an asymmetric algorithm using a public key of a resource server, the implementation of introspection endpoint of the authorization server needs a private key of the resource server that corresponds to the public key in order to decrypt the encrypted access tokens. If implementers think it is a bad practice to share a private key between an authorization server and a resource server, acceptable choices will be either (1) that the introspection endpoint returns an error when it receives an encrypted access token or (2) that the authorization server gives up providing an introspection endpoint.
4.3. Information Hidden from Client
It is easy to read the payload part of unencrypted JWT-based access tokens. Therefore, information that should not be visible to a client must not be included in an unencrypted JWT-based access token.
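How little effort this takes is shown by the following sketch, which reads the payload of an unencrypted JWT without any key. The function name is hypothetical; note that it performs no signature verification at all, which is exactly what any client holding the token is able to do.

```python
import base64
import json

def read_payload_without_key(jwt: str) -> dict:
    """Decode the middle (payload) part of a JWT; no key is needed."""
    payload_b64 = jwt.split(".")[1]
    # Restore the padding that base64url encoding in JWS strips.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```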
To associate information that should be hidden from a client with an access token, possible choices will be the following.
- encrypt access tokens
- adopt identifier-type access tokens
- adopt hybrid-type access tokens and keep secret information only in the database on the server side and not embed the information in access tokens
Regarding access token encryption, it should be noted that access tokens are sent over the network on every API call, and this fact raises another issue when encryption keys are compromised, especially if the encryption algorithm lacks PFS (perfect forward secrecy).
4.4. Access Token Revocation
It is difficult, if not impossible, to revoke self-contained-type access tokens. Because the structure is the same as that of a PKI certificate, a mechanism equivalent to PKI’s CRL (Certificate Revocation List) or OCSP (Online Certificate Status Protocol) must be in operation in order to revoke self-contained-type access tokens before their expiration.
To build a mechanism equivalent to CRL or OCSP, each access token must be uniquely identifiable. This can be achieved by utilizing the jti claim. Then, an authorization server registers the unique identifier of an access token into its “access token revocation list” when the access token is revoked. The unique identifier must be kept in the list until the original expiration date of the access token is reached. If the unique identifier were removed earlier, the revoked access token would be resurrected.
When a resource server receives an access token, it must check revocation status of the access token. If a CRL-like mechanism has been adopted, the resource server will download the list of revoked access tokens from somewhere and check whether the unique identifier of the access token is included in the list or not. On the other hand, if an OCSP-like mechanism is in operation, the resource server will pass the unique identifier of the access token to an API equivalent to “OCSP responder” and get revocation status of the access token in return.
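The CRL-like variant can be sketched as follows, under the assumption that each token carries jti and exp claims and that the list is held in memory. The class and function names are hypothetical. The point of the sketch is the retention rule: an entry is dropped only after the token's original expiration time has passed, at which point the token is rejected on expiry anyway.

```python
import time

class RevocationList:
    """Access token revocation list keyed by the jti claim."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._revoked = {}   # jti -> original expiration time (seconds since epoch)

    def revoke(self, jti, exp):
        """Register a revoked token together with its original expiration."""
        self._revoked[jti] = exp

    def purge_expired(self):
        """Remove only entries whose tokens have expired on their own."""
        now = self._clock()
        self._revoked = {j: e for j, e in self._revoked.items() if e > now}

    def is_revoked(self, jti):
        return jti in self._revoked

def is_usable(claims, revocations, clock=time.time):
    """Resource-server check: not expired and not on the revocation list."""
    return claims["exp"] > clock() and not revocations.is_revoked(claims["jti"])
```

In an OCSP-like design, `is_revoked` would instead be a network call to the authorization server, which is the cost discussed next.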
However, making an inquiry about revocation status to an authorization server involves network communication, just as making an API call to an introspection endpoint for an identifier-type access token does. If so, the biggest advantage of self-contained-type access tokens vanishes. Considering that identifier type has more merits, it is hard to find convincing reasons to actively adopt self-contained type. If implementers still had to choose self-contained type, it would be only when an authorization server and a resource server cannot communicate over the network at runtime for some reason.
Therefore, when adopting self-contained type for access token implementation, implementers often settle for a compromise: make the lifetime of access tokens as short as possible and give up revocation.
4.5. Claim
4.5.1. Claim Name
At the time of this writing, there is no standard specification describing how to embed information about an access token, such as scopes, expiration date and client ID, into the payload part of JWT-based access token (Note2).
(Note2: Recently, an arguable draft was adopted by OAuth Working Group as the starting point for discussion.)
For example, scopes associated with an access token can be represented by (1) claim name scopes with an array of scope names:
"scopes" : [ "email", "profile" ]
or by (2) claim name scope with a string of space-delimited scope names:
"scope" : "email profile"
It is even possible to devise other ways.
An unobjectionable choice would be to reuse claim names and claim types defined in RFC 7662 (OAuth 2.0 Token Introspection) and RFC 7519 (JSON Web Token).
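Because no single representation is guaranteed, a resource server that must accept tokens from more than one issuer may end up normalizing both forms. The sketch below handles the two representations shown above; the function name is hypothetical, and preferring scope over scopes when both are present is an arbitrary assumption.

```python
def extract_scopes(claims: dict) -> list:
    """Return scope names from either representation as a list of strings."""
    if "scope" in claims:
        value = claims["scope"]
        if not isinstance(value, str):
            raise ValueError("scope must be a space-delimited string")
        return value.split()
    if "scopes" in claims:
        value = claims["scopes"]
        if not isinstance(value, list) or not all(isinstance(s, str) for s in value):
            raise ValueError("scopes must be an array of strings")
        return list(value)
    # Neither claim is present: the token carries no scopes.
    return []
```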
4.5.2. Certificate Binding
“OAuth 2.0 Mutual TLS Client Authentication and Certificate-Bound Access Tokens” (hereafter “MTLS”) includes specification describing how to bind a client certificate used in a token request to the access token issued based on the request. The following diagram is an excerpt from “Financial-grade API (FAPI), explained by an implementer”, illustrating the concept of “certificate-bound access tokens”.
If an access token is bound to a client certificate and the format of the access token is JWT, the JWT should include the thumbprint of the client certificate in the payload part, MTLS says. To be concrete, the X.509 Certificate SHA-256 Thumbprint of the client certificate should be embedded as the value of the x5t#S256 sub-claim in the cnf claim (cf. RFC 7800).
The following is an example in “3.1. JWT Certificate Thumbprint Confirmation Method” of MTLS.
The cnf claim is a concrete but rare example of standard claim names defined for JWT-based access tokens.
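On the resource server side, checking the binding amounts to recomputing the thumbprint of the certificate presented on the TLS connection and comparing it with the value in the token. The sketch below assumes the certificate is available as DER-encoded bytes and that the token's claims have already been verified and parsed; the function names are hypothetical.

```python
import base64
import hashlib
import hmac

def certificate_thumbprint(der_certificate: bytes) -> str:
    """Base64url-encoded SHA-256 digest of the DER-encoded certificate."""
    digest = hashlib.sha256(der_certificate).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

def is_bound_to(claims: dict, der_certificate: bytes) -> bool:
    """Compare the token's cnf / x5t#S256 value with the presented certificate."""
    cnf = claims.get("cnf")
    expected = cnf.get("x5t#S256") if isinstance(cnf, dict) else None
    if not isinstance(expected, str):
        return False
    actual = certificate_thumbprint(der_certificate)
    return hmac.compare_digest(expected.encode("utf-8"), actual.encode("utf-8"))
```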
4.5.3. Claims Included in UserInfo Response
“UserInfo Endpoint” is defined in Section 5.3 of OpenID Connect Core 1.0. If an access token having the openid scope is given to the endpoint, information about the user associated with the access token is returned.
An authorization request may request that some claims be embedded in responses from the UserInfo endpoint. There are two ways to do so, defined in “5.4. Requesting Claims using Scope Values” and “5.5. Requesting Claims using the “claims” Request Parameter”, respectively.
An important point here is that claims requested to be embedded in responses from the UserInfo endpoint must be associated with the access token so that the claims can be referred to later when the access token is given to the UserInfo endpoint. A logical consequence is that self-contained-type access tokens that don’t have corresponding records in the server-side database must themselves include information about “claims to be embedded in responses from the UserInfo endpoint”.
For example, if the value of the scope request parameter is openid phone (Note3) and the value of the claims request parameter is the JSON shown below,
the access token generated based on the request must include information equivalent to "userinfo" in the following example.
(Note3: Including phone in the scope request parameter is equal to requesting the phone_number claim and the phone_number_verified claim. See Section 5.4 of OIDC Core for details.)
4.5.4. Potential Privacy Leakage
A logical conclusion is “a self-contained-type access token must include information about claims which are requested to be embedded in responses from user info endpoint”. The discussion we internally had regarding this conclusion is interesting, so I disclose it here. In the conversation below, justin is Justin Richer, a writer of various standard specifications and the author of “OAuth 2 in Action”, and taka is me, co-founder of Authlete, Inc.
justin> Your conclusion on JWT-based access token is incomplete. You do not need to put the user info into the access token, nor should you do so as it is potentially privacy-leaking. The userinfo endpoint will need to dereference the JWT to determine which user it applies to. This can be done with the sub claim. If the claims parameter is to be supported directly as you describe above, then that can be looked up underneath the jti claim which is transaction-specific as Joseph said. No JWT based system is fully self-contained, that I’ve seen. Any those that come closest use encrypted tokens (and the associated key management) to keep information relatively safe.
taka> Privacy-leaking only if claims in claims contain value or values.
justin> Not really - the fact that a field exists at all could be privacy leaking. This is less of a problem with common things like “address” and “family name” but more of an issue once you get to other resource types like medical and financial records.
taka> “No JWT based system is fully self-contained” is an interesting statement.
justin> You are right in the common case, but as Authlete is flexible enough we should consider other things
justin> re: self-contained, what I mean by that is that inevitably the token will be used to look up something in a database someplace
taka> Yes, I understand.
justin> :nod: ok. There is a design pressure, that I’ve seen, to put as much information into the JWT itself so as to minimize lookup and network calls. This is a dangerous pattern with often unintended consequences in security and privacy because you have an all-powerful artifact that leaks everywhere it’s used.
justin> We saw this happen with SAML
justin> Since OAuth is much more about API access we see it less, but it still can creep in
taka> So, Authlete can embed information equivalent to the userinfo property into a JWT-based access token but should dare not to do it. Thank you for your valuable comment.
justin> Yes, it would be technically feasible but I would not recommend it as either an implementation or a general pattern.
What I found interesting were the statements “No JWT based system is fully self-contained” and “the fact that a field exists at all could be privacy leaking.”
Considering the above, even if JWT is adopted as the format of access tokens, I think well-thought-out authorization server implementations will eventually adopt the hybrid pattern where the server-side database has records corresponding to access tokens.
5. Authlete’s Implementation
The implementation of access tokens issued by Authlete is identifier type. However, since Authlete 2.1, access tokens in JWT format can be issued by setting “access token signature algorithm”. In that case, the implementation becomes hybrid type.
The following are notes regarding Authlete’s implementation.
- Symmetric signature algorithms are not supported. Because there is no standard specification defining how to determine a key shared between an authorization server and a resource server, as mentioned in “4.1.1. Symmetric Signature Algorithm”, Authlete’s current stance toward symmetric signature algorithms is “wait-and-see”. This stance has been affected by the fact that “7.1.1. Signed Authentication Request” of CIBA Core 1.0 intentionally excludes symmetric signature algorithms.
- Encryption is not supported. Because there is no standard specification defining how to manage encryption keys as mentioned in “4.2. Encryption”, Authlete’s current stance toward encryption is “wait-and-see”. It is possible to implement encryption feature quickly, but I think that the right approach should start from defining resource server metadata and relationship between authorization servers and resource servers properly, and so it will not be an easy task to design the architecture for access token encryption.
- Information about requested claims is not embedded in JWT-based access tokens. Considering the concern described in “4.5.4. Potential Privacy Leakage”, Authlete deliberately does not include information about requested claims in JWT-based access tokens. For details about claims embedded in JWT-based access tokens issued by Authlete, please read the JavaDoc of the Service class in the authlete-java-common library.
- Custom claims are supported. Authlete provides a mechanism for associating arbitrary key-value pairs with an access token (see “Extra Properties” for details). Key-value pairs whose hidden attribute is false are embedded in JWT-based access tokens. By using this mechanism, developers can embed any custom claims into JWT-based access tokens. Note, however, that values of custom claims are always embedded as strings.
Authlete 2.1 is a leading-edge implementation of authorization server and OpenID provider that eagerly supports promising new technologies such as FAPI (Financial-grade API), CIBA (Client Initiated Backchannel Authentication), JARM (JWT Secured Authorization Response Mode for OAuth 2.0) and MTLS (OAuth 2.0 Mutual TLS Client Authentication and Certificate Bound Access Tokens).
In addition, Authlete is the world’s first certified Financial-grade API OpenID Provider supporting MTLS client authentication. See “OpenID Certification” for details about the certification program.
Furthermore, it should be noted that at the time of this writing, Authlete is the only vendor in the world that has signed up for “continuous conformance” of the official conformance test suite. That is, Authlete is continuously being tested by the latest official test suite and the test results are made public. See “How to be added to the regular automated tests run” for details about continuous conformance.
Please contact us (sales@authlete.com) if you are interested in a leading-edge implementation of OAuth 2.0 and OpenID Connect supporting promising new technologies.