Storing JWT signing keys in Azure Key Vault and accessing them from .NET

Summary

You cannot retrieve the private key of a "key" stored in Azure Key Vault (AKV). This makes sense for high-security encryption mechanisms: although not all keys in AKV are backed by a Hardware Security Module, this behaviour reinforces the idea that you can never "get" the private key; you can only delegate its use for encryption or decryption to AKV.

There are, however, times when you want the private key itself so that you can use it in code, and AKV "keys" are not usable for that. Instead you need to use "Certificates" and, to retrieve the private key, you need to call GetSecretAsync.

I will explain why and show code below for .NET. The basic ideas should translate to other languages, however.

Why retrieve the private key

If we delegate the encryption operation to AKV, there are two potential problems. The first is whether AKV even supports what you are doing. I am signing JWTs with a private key and I am not sure this is possible with AKV but I haven't really looked. The second problem is the cost of delegating thousands of operations to AKV when you can do them for free in your own code, which you are already paying to host!

The cheapest option is 2.3p per 10K operations for RSA2048 keys. This might not sound like much but it could quickly add up for scalable applications. Imagine 10,000 operations per minute, 24/7, and you would be paying around £1,000/month for the privilege (a 30-day month has 43,200 minutes, and 43,200 × 2.3p is roughly £994). This is a very possible number based on microservices and re-authentication taking place. If you want the premium tier for HSM-backed keys or any key other than RSA2048, it will cost even more.
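The arithmetic behind that estimate is easy to check. This is only a rough sketch: the 2.3p figure is the one quoted above, the class and method names are made up for illustration, and real pricing will differ by region, tier and date.

```csharp
using System;

// Hypothetical helper for a back-of-the-envelope AKV cost estimate.
public static class AkvCostEstimate
{
    // Price quoted in the text: 2.3p per 10,000 operations (RSA2048, standard tier).
    public const decimal PencePerTenThousandOps = 2.3m;

    // Monthly cost in pounds for a steady rate of operations per minute.
    public static decimal MonthlyCostPounds(decimal opsPerMinute, int daysInMonth)
    {
        decimal minutesPerMonth = 60m * 24m * daysInMonth;
        decimal opsPerMonth = opsPerMinute * minutesPerMonth;
        decimal pence = opsPerMonth / 10_000m * PencePerTenThousandOps;
        return pence / 100m;
    }
}
```

For example, AkvCostEstimate.MonthlyCostPounds(10_000m, 30) gives 993.60 and a 31-day month gives 1026.72, so "around £1,000" is the right order of magnitude.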

Well anyway, thanks to this blog, I found out how to store a private key in AKV that can be retrieved in code. The answer is Certificates.

How do certificates work in AKV

Certificates are X.509 certificates and, as you probably know, they consist of a private and public key pair plus an optional certificate chain to a root cert, if that is useful. In our case it is not useful: we control both ends of the signing conversation, so we do not need to check that the issuer is sound (and there is no direct mechanism to perform this check anyway). All we need is a secure way to find the public key to validate our signature.

A certificate costs just over £2.00 to issue (for some reason!) but operations on it cost the same as normal key operations at 2.3p/10K. In this case, though, by downloading and caching the keys in our services, we will not get anywhere near the numbers we would reach by delegating encryption to AKV.

An important technical detail is that if you add a certificate, it automatically adds a key and a secret with the same name. You CANNOT get a private key with GetCertificate or GetKey so you will need to use GetSecret to build your key! GetKey will get you the public key, which might be useful later.

Why use AKV at all?

This probably raises another question: if we have to work around the pricing, we could generate our own keys locally, and we are potentially exposing the private key to the world, why bother using AKV at all? For one very simple reason: to be able to rotate keys centrally at regular intervals with minimum fuss.

Have you ever updated an SSL certificate? It is a pain. Even if you have Let's Encrypt (which is designed for web sites rather than signing) there are many steps to do this and many ways to get it wrong. AKV provides the means to simply issue a new certificate, and we can write our code to automatically use the new certificate and delete the old one after a suitable time.

How to set up Key Vault

There are other guides about setting up key vault but if anything, they can be too long and complicated. If you have used Azure, you should know how to add a resource in a resource group, give it a name etc. You can choose standard or premium which currently offer the same prices except for HSM-backed keys which are only supported on premium.

Set up Application Access

This is the messiest part since adding applications in Azure AD is confusing. Although you only want an id/secret for your application, you add an application like it is an OAuth2 client with a name that needs to be a URL (use whatever you want) and an optional redirect URI (which you don't need).

Anyway, you add this under "App Registrations" in Azure AD and this will display a secret (which you can't see again so write it down or generate a new one later).

Note that if you are running on Azure, you definitely want to use Managed Service Identities to access the key store instead of normal creds but in my example, I am running on-prem and can't use them. Managed Service Identities allow you to restrict access for an app to a resource without ever seeing or being able to see its login credentials.

Once you have the App Registration (or MSI), you need to go to Access Policies in the AKV page and set up permissions. You can change these later so start with the most basic permissions. In my example, you just need GET for Secrets.

Writing the Code

Annoyingly, some of the starter guides are hard to find or follow - MS definitely have the "feel of death" on their documentation site and I have personally spotted noticeable mistakes that have been corrected, which doesn't give me good feelings when I am trying to learn a new technology.

Anyway, start by installing NuGet packages for:

Microsoft.Azure.KeyVault
Microsoft.IdentityModel.Clients.ActiveDirectory
System.IdentityModel.Tokens.Jwt (if you are signing JWTs like me!)

Note that your certificate (and key and secret) will have a base name that you give them when you create them in Azure. They also have a version guid. Both of these are required to decrypt/verify since you need the exact version but if you just want the latest, the key name is all you need.

You can create the KeyVaultClient in a number of ways but the example here is what you use with an id/secret as opposed to Managed Service Identity.

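using Microsoft.Azure.KeyVault;
using Microsoft.IdentityModel.Clients.ActiveDirectory;

// The callback runs whenever the client needs an access token for the vault.
var kvClient = new KeyVaultClient(async (authority, resource, scope) =>
{
    // Use the application id and secret from your App Registration (load them from config, not source).
    var adCredential = new ClientCredential("client-id-guid", "secret");
    var authenticationContext = new AuthenticationContext(authority, null);
    return (await authenticationContext.AcquireTokenAsync(resource, adCredential)).AccessToken;
});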

In this example, we simply create a credential that is passed to the AcquireTokenAsync method of the context. A lot of async and weirdness but easy enough to copy-and-paste!

With the client, we can now get the key in the form of the certificate's "secret". Note that there is a difference between getting the latest version, which uses this:

var secret = await kvClient.GetSecretAsync(keyVaultConfig.KeyVaultUrl, jwtKey);

and getting a specific version:

var secret = await kvClient.GetSecretAsync(keyVaultConfig.KeyVaultUrl, jwtKey, version);

where these parameters are all strings.

Once we have the secret, we can create a certificate directly from its value:

var cert = new X509Certificate2(Convert.FromBase64String(secret.Value));

and if we want an RsaSecurityKey from the certificate, we need to do this:

var key = new RsaSecurityKey(cert.GetRSAPrivateKey());

We still need to understand the concept of the key id and how it relates to finding the public key needed for verification but, by now, we have all the elements we need. It is up to you how you cache these responses. The key for a given version will never change after creation, which makes these responses highly cacheable.
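As one possible way to cache by name and version, here is a minimal sketch of an in-memory cache. VersionedKeyCache and its fetch delegate are names I have made up for illustration; in practice the delegate would wrap the GetSecretAsync call and the certificate/key construction shown above. It only makes sense for lookups with an explicit version, since "latest" changes on renewal.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical cache: TKey is whatever key type you build from the secret.
public sealed class VersionedKeyCache<TKey>
{
    private readonly ConcurrentDictionary<string, Lazy<Task<TKey>>> _cache =
        new ConcurrentDictionary<string, Lazy<Task<TKey>>>();
    private readonly Func<string, string, Task<TKey>> _fetch;

    // fetch(name, version) is the call that actually goes to the vault.
    public VersionedKeyCache(Func<string, string, Task<TKey>> fetch)
    {
        _fetch = fetch ?? throw new ArgumentNullException(nameof(fetch));
    }

    public Task<TKey> GetAsync(string name, string version)
    {
        string cacheKey = name + "/" + version;
        // Lazy ensures the fetch runs at most once per name/version,
        // even if several requests arrive at the same time.
        // Note: a failed fetch is cached too; a real implementation
        // would probably evict faulted tasks so they can be retried.
        return _cache.GetOrAdd(
            cacheKey,
            _ => new Lazy<Task<TKey>>(() => _fetch(name, version))).Value;
    }
}
```

There is no expiry here on purpose, because a versioned key never changes; you would still want to drop entries once an old version is retired.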

Key Id in Json Web Tokens

If we sign a JWT (or strictly a JWS) with a symmetric key, we don't need to do anything special, although we could set a key id if we plan to swap or rotate these shared keys. The format of the key id is not defined by the spec so you can use whatever you want as its value:

key.KeyId = "The id of your key";

Setting this before creating your JWT will set the string as the value of kid in the header.
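To make it concrete where the kid ends up, here is a minimal sketch that builds an RS256-signed token by hand using only the base class library (including System.Text.Json), rather than System.IdentityModel.Tokens.Jwt. MiniJws and its methods are names I have made up for illustration, and I would not use this in place of the real library; it just shows that the kid is a plain string in the first, base64url-encoded segment.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Hypothetical helper, for illustration only: builds a compact RS256 JWS by hand.
public static class MiniJws
{
    // base64url without padding, as used for every JWS segment.
    public static string Base64UrlEncode(byte[] bytes) =>
        Convert.ToBase64String(bytes).TrimEnd('=').Replace('+', '-').Replace('/', '_');

    public static byte[] Base64UrlDecode(string text)
    {
        string s = text.Replace('-', '+').Replace('_', '/');
        s = s.PadRight(s.Length + (4 - s.Length % 4) % 4, '=');
        return Convert.FromBase64String(s);
    }

    // Returns header.payload.signature with the kid placed in the header.
    public static string Sign(RSA privateKey, string kid, string payloadJson)
    {
        var header = new Dictionary<string, string>
        {
            ["alg"] = "RS256",
            ["typ"] = "JWT",
            ["kid"] = kid
        };
        string signingInput =
            Base64UrlEncode(JsonSerializer.SerializeToUtf8Bytes(header)) + "." +
            Base64UrlEncode(Encoding.UTF8.GetBytes(payloadJson));
        byte[] signature = privateKey.SignData(
            Encoding.ASCII.GetBytes(signingInput),
            HashAlgorithmName.SHA256,
            RSASignaturePadding.Pkcs1);
        return signingInput + "." + Base64UrlEncode(signature);
    }

    // Reads the kid back out of the (unverified) header segment.
    public static string ReadKid(string token)
    {
        string headerJson = Encoding.UTF8.GetString(Base64UrlDecode(token.Split('.')[0]));
        using (JsonDocument doc = JsonDocument.Parse(headerJson))
        {
            return doc.RootElement.GetProperty("kid").GetString();
        }
    }
}
```

Anyone holding the token can read the kid without a key, which is exactly what lets the verifier look up the right public key before checking the signature.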

If you are using asymmetrical signing (and you should!) then you can use this kid in two different ways. Firstly, in our example, you could simply put the name and version of the key as its key id (but not the vault URL!) for example: keyname/128379831798173973187193 which means when you are verifying, you will need to lookup this key in the vault.

Fortunately, this is not only easy to do but because the key id is basically a URL, you can call GetSecret with ("keyname/128379831798173973187193") and it translates to exactly the same as GetSecret(vault, "keyname","128379831798173973187193"). In other words, you can attach an IssuerSigningKeyResolver to your TokenValidationParameters (in dotnet) and simply pass this:

private IEnumerable<SecurityKey> IssuerSigningKeyResolver(string token, SecurityToken securityToken, string kid, TokenValidationParameters validationParameters)
{
    // The resolver is synchronous, so block on the async store (or cache) lookup.
    var key = Task.Run(() => GetKeyFromStore(kid)).Result;
    return new List<SecurityKey> { key };
}

which calls the normal method that goes to the store (or hits cache) and then returns an enumerable of key. I think the enumerable allows for a keyid that effectively refers to a set of keys which you might not be able to resolve into a single key so you just pass back the whole set.
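One thing to keep in mind with this approach is that the kid comes from a token that has not been verified yet, so it is attacker-controlled input. Before using it to build a vault request, it seems sensible to check that it really looks like name/version. A minimal sketch follows; KeyIdParser is a made-up name and the allowed characters are my assumption about what your names and versions look like, so adjust the pattern to suit.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper: splits a kid of the form "name/version" and rejects anything else.
public static class KeyIdParser
{
    // Assumption: names are letters, digits and hyphens; versions are letters and digits.
    // \A and \z anchor the whole string, so trailing newlines are rejected too.
    private static readonly Regex Pattern = new Regex(
        @"\A(?<name>[0-9A-Za-z-]+)/(?<version>[0-9A-Za-z]+)\z",
        RegexOptions.CultureInvariant);

    public static bool TryParse(string kid, out string name, out string version)
    {
        name = null;
        version = null;
        if (kid == null) return false;

        Match match = Pattern.Match(kid);
        if (!match.Success) return false;

        name = match.Groups["name"].Value;
        version = match.Groups["version"].Value;
        return true;
    }
}
```

GetKeyFromStore could call this first and refuse to go to the vault when TryParse returns false; the name and version can then be passed to the three-argument GetSecretAsync.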

Use cases to consider

The happy path is finished but there are other considerations that need to be implemented and tested:

I am going to create a static endpoint that serves the public keys for verification as a simple JWK file, and set the jku header to point to this list. This will avoid the need to call the key store for public keys and it can be cached, but care needs to be taken to update this list when certificates are renewed, to make sure it includes any live keys as well as the latest one, and not to cache it for so long that updated certs take too long to propagate.
I have to test what happens when I renew the certificate. I don't think my current code will work properly so I need to think about how this works.
I need to test failure scenarios. At minimum, a broken token should not work, but I would rather not cause something to crash if an attacker hits it; I would rather it failed gracefully and logged problems. There are various potential attacks, including replay attacks and downgrade attacks like someone setting the algorithm to "none". I can run unit tests to work out how they will fail but then, again, I need some better error handling that perhaps catches all exceptions, logs them and then returns a null user to the caller.
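For the last point, the shape I have in mind is roughly the following. This is only a sketch using the base class library (including System.Text.Json): TokenGuard is a made-up name, the validate delegate stands in for whatever actually validates the token (for example a call into JwtSecurityTokenHandler), and the up-front check on alg is an extra belt-and-braces step rather than a replacement for the library's own checks.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Text.Json;

// Hypothetical wrapper: never throws to the caller, returns null for any bad token.
public static class TokenGuard
{
    // True only if the header parses and its alg is on the allow-list (so "none" is rejected).
    public static bool HasAllowedAlgorithm(string token, params string[] allowed)
    {
        if (string.IsNullOrEmpty(token)) return false;
        try
        {
            string segment = token.Split('.')[0].Replace('-', '+').Replace('_', '/');
            segment = segment.PadRight(segment.Length + (4 - segment.Length % 4) % 4, '=');
            string json = Encoding.UTF8.GetString(Convert.FromBase64String(segment));
            using (JsonDocument doc = JsonDocument.Parse(json))
            {
                JsonElement root = doc.RootElement;
                return root.ValueKind == JsonValueKind.Object
                    && root.TryGetProperty("alg", out JsonElement alg)
                    && alg.ValueKind == JsonValueKind.String
                    && allowed.Contains(alg.GetString(), StringComparer.Ordinal);
            }
        }
        catch (FormatException) { return false; } // header is not valid base64
        catch (JsonException) { return false; }   // header is not valid JSON
    }

    // Runs the real validation, logging and returning null on any failure.
    public static TUser ValidateOrNull<TUser>(
        string token, Func<string, TUser> validate, Action<string> log)
        where TUser : class
    {
        try
        {
            if (!HasAllowedAlgorithm(token, "RS256"))
            {
                log("Rejected token: missing, malformed or disallowed alg header.");
                return null;
            }
            return validate(token);
        }
        catch (Exception ex)
        {
            log("Token validation failed: " + ex.GetType().Name + ": " + ex.Message);
            return null;
        }
    }
}
```

The caller then only has to handle two cases, a user or null, and the unit tests for broken tokens can assert on null plus a log entry instead of on specific exception types.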