Erlang/OTP impact of DST Root CA X3 expiration

Posted 2021-05-18 15:10:47.161830

On September 30 2021, the root CA certificate DST Root CA X3 will expire. This should not have a noticeable impact on the Internet at large, as any recently issued server certificate will have been issued with a different trust chain that’s rooted in a newer root CA.

Let’s Encrypt has relied on the DST Root CA X3 to bootstrap its services, while in parallel working to get its own root CA (ISRG Root X1) included in all OS and browser trust stores. Now that the old root is reaching its end-of-life, it is time for Let’s Encrypt to stand on its own. However, there are still devices and applications out there that do not include Let’s Encrypt’s new root CA, in particular older Android devices. So Let’s Encrypt have arranged for a fall-back solution that will work with those older devices, and it involves an ‘alternate chain’ with a ‘cross-signed’ intermediate CA.

Unfortunately Erlang/OTP applications are likely to experience TLS handshake errors when trying to connect to servers that present the longer chain. Let’s have a closer look at what is likely to happen over the next few months, and why.

Cross-signing explained

Let’s Encrypt have already started issuing certificates with the new alternate chain, as explained in this post.

Servers with the new chain will send two intermediate CA certificates, along with their own server certificate: the R3 intermediate and the cross-signed ISRG Root X1. The latter is a variant of Let’s Encrypt’s own root CA that is not self-signed, but rather was signed by DST Root CA X3. The R3 intermediate was signed by ISRG Root X1, and it is up to the client to choose whether to select the self-signed or the cross-signed variant.

So there are two possible chains, one of which ignores one of the certificates sent by the server. Check out the SSL Labs report for ‘community.letsencrypt.org’: you can see the extra certificate being sent (#3), although the certification paths shown for Mozilla/Apple/Android/Java/Windows all ignore it since they have the ISRG Root X1 in their trust store.

If you look carefully you’ll see that the cross-signed certificate is valid until September 30 2024: Let’s Encrypt says this will allow those old Android devices to continue trusting Let’s Encrypt certificates even after DST Root CA X3 expires. So we may encounter the longer chain well after September 30th this year.

The problem with Erlang/OTP

The above assumes a generic TLS client, such as a web browser. But what if the client is an Erlang or Elixir application (or other BEAM application) using the ssl application of OTP? It turns out it does not handle cross-signed certificates in quite the same way: the ssl and public_key applications only really consider the longest possible chain by default.

Some applications customize the certificate verification using a custom verify_fun and/or a partial_chain handler, but this may not be sufficient. Let’s have a look at what we can expect to see in the months before and after September 30th.

If you are not interested in the analysis, feel free to skip straight to the conclusions at the end.

Before September 30th

Over the next few weeks, as servers are renewing their Let’s Encrypt certificates, more and more servers will start to present the cross-signed chain. This should not impact existing BEAM applications that still have the DST Root CA X3 in their trust store.

Let’s first try out ssl directly, with no 3rd party dependencies:

Erlang/OTP 24 [erts-12.0] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [jit]

Eshell V12.0  (abort with ^G)

1> ssl:start().
ok

2> ssl:connect("community.letsencrypt.org", 443, [
2>   {verify, verify_peer},
2>   {cacertfile, "/etc/ssl/cert.pem"},
2>   {depth, 3}
2> ]).
{ok,{sslsocket,{gen_tcp,#Port<0.5>,tls_connection,undefined},
               [<0.117.0>,<0.116.0>]}}

In this case the ISRG Root X1, if present in the trust store, is never used. If the DST Root CA X3 certificate is removed from the trust store, the handshake fails because the built-in certificate verification in ssl does not consider the shorter path that ignores the cross-signed certificate.

Clients that customize the certificate verification may also work when only the ISRG Root X1 is present, through a partial_chain function. In particular, Hackney and its derivatives (using the ssl_verify_fun package) connect successfully regardless of whether DST Root CA X3 is present, and so does Mint.

After September 30th

To see what’s going to happen after the DST Root CA X3 certificate expires we can’t simply move the clock forward a couple of months: the server end-certificates that are currently out there would be considered expired. So we’re going to have to simulate it: I have set up a server with a certificate chain issued from Let’s Encrypt’s staging environment. It includes a cross-signed CA certificate issued by a (fake) root that has already expired.

(If you want to play along at home, you can find the files with the various trust store options here)

Let’s try connecting to the future…

Plain Erlang ssl (OTP 24.0)

Erlang/OTP 24 [erts-12.0] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [jit]

Eshell V12.0  (abort with ^G)

1> ssl:start().
ok

2> ssl:connect("future.voltone.net", 443, [
2>   {verify, verify_peer},
2>   %% Trust store with both root CA certificates:
2>   {cacertfile, "trust_store.pem"},
2>   {depth, 3}
2> ]).
{error,{tls_alert,{certificate_expired,"TLS client: In state wait_cert_cr at ssl_handshake.erl:1882 generated CLIENT ALERT: Fatal - Certificate Expired\n"}}}

3> ssl:connect("future.voltone.net", 443, [
3>   {verify, verify_peer},
3>   %% Only the new (short chain) root CA:
3>   {cacertfile, "new_root_ca.pem"},
3>   {depth, 3}
3> ]).
{error,{tls_alert,{unknown_ca,"TLS client: In state wait_cert_cr at ssl_handshake.erl:1899 generated CLIENT ALERT: Fatal - Unknown CA\n"}}}

So if the old root CA is present in the trust store, it gets selected as the root of the chain, which then gets rejected because it has expired. If the old root CA is removed from the trust store, no trust chain is found because the issuer of the last certificate in the chain sent by the server cannot be found.

Note that Erlang/OTP versions prior to 23.3 would have connected successfully in that first scenario: they did not consider expiry of a certificate in the local trust store to be a failure. In this respect they behaved just like those older Android devices that Let’s Encrypt aims to support past September 30th.
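
If you want your application logs to point at this problem quickly, the two alerts shown above can be mapped to a hint. The following is only a minimal sketch: the module name and the hint atoms are made up for illustration, and the error tuple shape is taken from the OTP 24 output above, so other releases may well return something different (which ends up in the last clause). A "Certificate Expired" alert can of course also mean that the server's own certificate has expired, so treat the result as a starting point and nothing more.

```erlang
-module(tls_alert_hint).
-export([hint/1]).

%% Map the result of ssl:connect/3 to a rough troubleshooting hint.
%% The hint atoms are hypothetical; they are not part of OTP.
hint({ok, _Socket}) ->
    connected;
hint({error, {tls_alert, {certificate_expired, _Text}}}) ->
    %% Possibly an expired root CA in the local trust store.
    check_for_expired_root_in_trust_store;
hint({error, {tls_alert, {unknown_ca, _Text}}}) ->
    %% Possibly a cross-signed chain that could not be anchored.
    check_partial_chain_and_trust_store;
hint({error, Reason}) ->
    {other, Reason}.
```

For example, wrapping the first connection attempt above as tls_alert_hint:hint(ssl:connect(...)) would yield check_for_expired_root_in_trust_store.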

Plain Erlang ssl with a partial_chain function

When ssl decides that the certificate chain it originally built cannot be completed using a certificate in its trust store, it passes the chain to the partial_chain function, if one was provided. This function can then nominate one of those certificates (that is, one of the certificates sent by the server) as trusted, and ssl will then retry chain verification with the selected trust anchor.

In this case we know the server will send the cross-signed version of the new root CA, and we can treat it as an equivalent for the purpose of validating the rest of the chain. A secure implementation of the partial_chain function would have to find a match between the public keys of the certificates passed in and the public keys of the certificates in the CA trust store. But for the purpose of our investigation we’re going to cheat, and just return the first certificate in the list, which we know is the cross-signed certificate.

Erlang/OTP 24 [erts-12.0] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [jit]

Eshell V12.0  (abort with ^G)

1> ssl:start().
ok

2> ssl:connect("future.voltone.net", 443, [
2>   {verify, verify_peer},
2>   %% Trust store with both root CA certificates:
2>   {cacertfile, "trust_store.pem"},
2>   {depth, 3},
2>   {partial_chain, fun([CrossSigned | _]) -> {trusted_ca, CrossSigned} end}
2> ]).
{error,{tls_alert,{certificate_expired,"TLS client: In state wait_cert_cr at ssl_handshake.erl:1882 generated CLIENT ALERT: Fatal - Certificate Expired\n"}}}

3> ssl:connect("future.voltone.net", 443, [
3>   {verify, verify_peer},
3>   %% Only the new (short chain) root CA:
3>   {cacertfile, "new_root_ca.pem"},
3>   {depth, 3},
3>   {partial_chain, fun([CrossSigned | _]) -> {trusted_ca, CrossSigned} end}
3> ]).
{ok,{sslsocket,{gen_tcp,#Port<0.7>,tls_connection,undefined},
               [<0.125.0>,<0.124.0>]}}

Ok, that worked for the second scenario. Again, let me stress that this is NOT a secure implementation of a partial_chain function! When used with real-world servers this implementation would make it trivial to launch MitM attacks against your client.

The first scenario still fails on OTP 23.3 or later, as the chain is rejected due to the expired DST Root CA X3 certificate before the partial_chain function is called. On older Erlang/OTP versions both scenarios pass.
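
For comparison, here is a rough sketch of what the public key matching described earlier might look like. It is not a vetted implementation: the module and function names are hypothetical, it compares only the SubjectPublicKeyInfo of each certificate (a stricter version might also compare subject names), and it simply returns the first match in the order in which ssl passes the chain. As noted above, on OTP 23.3 or later it can only help once the expired root is no longer in the trust store.

```erlang
-module(partial_chain_sketch).
-export([from_pem/1, from_ders/1, pick_anchor/2]).
-include_lib("public_key/include/public_key.hrl").

%% Build a partial_chain fun from a PEM trust store file.
from_pem(PemFile) ->
    {ok, Pem} = file:read_file(PemFile),
    from_ders([Der || {'Certificate', Der, not_encrypted}
                          <- public_key:pem_decode(Pem)]).

%% Build a partial_chain fun from DER-encoded trusted CA certificates.
from_ders(TrustedDers) ->
    TrustedKeys = [public_key_info(Der) || Der <- TrustedDers],
    fun(Chain) ->
        pick_anchor([{Der, public_key_info(Der)} || Der <- Chain],
                    TrustedKeys)
    end.

%% Return the first chain certificate whose public key also appears in
%% the trust store, or unknown_ca if there is none.
pick_anchor([{Der, Key} | Rest], TrustedKeys) ->
    case lists:member(Key, TrustedKeys) of
        true -> {trusted_ca, Der};
        false -> pick_anchor(Rest, TrustedKeys)
    end;
pick_anchor([], _TrustedKeys) ->
    unknown_ca.

%% Extract the SubjectPublicKeyInfo from a DER-encoded certificate.
public_key_info(Der) ->
    #'OTPCertificate'{tbsCertificate = TBS} =
        public_key:pkix_decode_cert(Der, otp),
    TBS#'OTPTBSCertificate'.subjectPublicKeyInfo.
```

With this module compiled, the option in the second scenario would become {partial_chain, partial_chain_sketch:from_pem("new_root_ca.pem")}. Unlike the shortcut used in the experiment, a certificate is only nominated as trust anchor if its key belongs to a CA that is already trusted locally.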

Hackney, Mint and co

Hackney by default uses the ssl_verify_fun package for certificate verification, along with a partial_chain function, as I described above. Mint brings its own certificate verification logic, including partial chain handling.

On OTP 23.2 and earlier they successfully connect to servers that send the cross-signed certificate, regardless of whether the expired DST Root CA X3 is present in the trust store. However, on OTP 23.3 or later the partial_chain functions are ineffective, just as they were in my experiments above. As a result, the handshake is aborted with a “Certificate Expired” error until the DST Root CA X3 certificate is removed from the trust store.

Conclusion

So how do you keep your application running without interruptions beyond September 30th?

First, check whether the clients used in your application are able to ignore the cross-signed certificate sent by the server, and verify the shorter chain instead. In other words, make sure they enable a suitable partial_chain function. If they do, and you’re on OTP 23.2 or earlier, you should be fine.

If you are on OTP 23.3 or later, make sure you remove the DST Root CA X3 certificate from the trust store before it expires on September 30th. Unfortunately this may not be trivial: Hackney for instance relies on the certifi package to provide its trust store, and selecting a different trust store requires passing in a whole bunch of ssl_options to Hackney, including a modified partial_chain function.
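
If editing the trust store file itself is not practical, one option might be to filter it at runtime. The sketch below reads a PEM bundle and drops any certificate whose subject common name matches a given name. The module and function names are hypothetical, and matching on the common name alone is an assumption made for brevity; matching on a certificate fingerprint would be more precise.

```erlang
-module(trust_store_sketch).
-export([cacerts_without/2, common_name/1, cn_from_rdns/1]).
-include_lib("public_key/include/public_key.hrl").

%% Read a PEM bundle and return its certificates as DER binaries,
%% leaving out those whose subject common name equals Name (a binary).
cacerts_without(PemFile, Name) ->
    {ok, Pem} = file:read_file(PemFile),
    Ders = [Der || {'Certificate', Der, not_encrypted}
                       <- public_key:pem_decode(Pem)],
    [Der || Der <- Ders, common_name(Der) =/= Name].

%% Subject common name of a DER-encoded certificate, or undefined.
common_name(Der) ->
    #'OTPCertificate'{tbsCertificate = TBS} =
        public_key:pkix_decode_cert(Der, otp),
    {rdnSequence, RDNs} = TBS#'OTPTBSCertificate'.subject,
    cn_from_rdns(RDNs).

%% First common name found in a decoded RDN sequence, as a binary.
cn_from_rdns(RDNs) ->
    CNs = [unicode:characters_to_binary(Value)
           || RDN <- RDNs,
              #'AttributeTypeAndValue'{type = ?'id-at-commonName',
                                       value = {_StringType, Value}} <- RDN],
    case CNs of
        [CN | _] -> CN;
        [] -> undefined
    end.
```

The list returned by trust_store_sketch:cacerts_without("/etc/ssl/cert.pem", <<"DST Root CA X3">>) can then be passed to ssl as {cacerts, Ders} in place of the cacertfile option, together with a suitable partial_chain function.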

If you are using a client that does not (yet) handle cross-signed certificates, you may be able to configure a custom partial_chain function. I may write a follow-up post to show an example of how to do this securely. Alternatively you may have to switch to another client.

Of course if you control the server, and you have no requirement for it to support old Android devices, you can configure the server not to send the cross-signed certificate at all. Instead, make sure it only sends its server certificate and Let’s Encrypt’s R3 intermediate. That should enable any BEAM-based clients to connect without issues, provided they have a recent trust store that includes ISRG Root X1.

