Adding quantum resistance to WireGuard

We continue our post-quantum series with a look at how we added quantum resistance to the WireGuard protocol and how the resulting software performs.

The WireGuard protocol

WireGuard is a fast and secure open-source virtual private network (VPN) solution that uses state-of-the-art cryptography. On top of being extremely fast, WireGuard secures your connection and protects your online privacy with security features such as data security and identity hiding.

The WireGuard protocol works in two steps. First, a handshake is performed between client and server. During this handshake, the participants authenticate to each other and, at the same time, securely exchange key material. The derived key material is then used to open a secure tunnel that uses symmetric-key encryption to exchange the actual network data. Through this tunnel, a client can “hide” behind the server and obtain additional privacy from other parties while surfing the web.


The problem is that the cryptography in use, modern as it is, is not quantum resistant and is therefore threatened by the arrival of quantum computers.

As it relies on symmetric encryption, the tunnel part can be made quantum secure by “simply” doubling the key size. We thus put our main focus on the more complex challenge of adding quantum resistance to the WireGuard handshake.
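The "doubling" rule of thumb comes from Grover's algorithm, which searches an unstructured space of 2^n keys in roughly 2^(n/2) quantum evaluations. A minimal sketch of that arithmetic follows; the function name is ours and purely illustrative.

```python
def grover_security_bits(key_bits: int) -> int:
    """Approximate quantum security of a symmetric key under Grover search.

    Grover's algorithm gives a quadratic speed-up for brute-force key search,
    so an n-bit key offers about n/2 bits of security. The estimate ignores
    the large practical overheads of running Grover, so it is pessimistic.
    """
    return key_bits // 2


for bits in (128, 256):
    print(f"{bits}-bit key: about {grover_security_bits(bits)}-bit quantum security")
```

In other words, a 256-bit symmetric key is expected to retain about the same margin against a quantum attacker that a 128-bit key has against a classical one.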

Some prior works already propose post-quantum handshakes for VPNs. In the table below, we report the results of works published last year that add quantum resistance to the OpenVPN and WireGuard protocols, respectively. As we can see, the proposed quantum-resistant WireGuard handshake is much more practical than the OpenVPN one. This is why we chose to continue working with WireGuard specifically and to use these results as a baseline.

Protocol	Traffic (B)	# of IPv6 packets	Client runtime (ms)	Server runtime (ms)
PQ-OpenVPN	8996	23	1277	1269
PQ-WireGuard	2532	2	0.92	0.30

Performance of post-quantum handshakes.

Performance == speed?

Before we dive into the technical details, it is important to step back and think about what we want to achieve and what we want to evaluate. The performance of a VPN handshake has two dominant dimensions: runtime and bandwidth. Both matter, but one can prevail depending on the setting. In some cases a lower runtime is preferred, either for convenience or to fit computationally limited devices. Conversely, other settings call for solutions that minimize bandwidth.

For every scenario, it is important to define the setting and constraints in order to pick the tool that maximizes performance. For instance, neglecting bandwidth usage in favor of runtime is likely to backfire, as network limitations may cause considerable delays and lead to a slow and heavy protocol.

Following the steps of Hülsing et al.

With this in mind, the first thing that we do is to repeat the methodology of the prior PQ-WireGuard. We build our implementations on top of the Go mirror implementation of WireGuard.

The PQ-WireGuard handshake is built around an authenticated key exchange (AKE). In this case the AKE follows the Fujioka construction (included below), a recipe that combines two key encapsulation mechanisms (KEMs) to obtain authentication, integrity, and confidentiality. One KEM works with static keys and is required to be CCA secure; the other works with ephemeral keys and only needs to be CPA secure. Only participants holding the correct private keys can learn the internal view of the protocol and derive the correct key material, while others can at best guess at random.

Fujioka’s transform, combining two KEMs to obtain an AKE.
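To make the message flow concrete, here is a simplified sketch of a Fujioka-style AKE. It is not our implementation: the KEM is a toy hashed Diffie-Hellman stand-in (neither post-quantum nor secure) that only plays the role Kyber plays in this post, and the randomness handling and key derivation of the real construction are omitted. All function names are hypothetical.

```python
import hashlib
import secrets

# Toy stand-in KEM (hashed Diffie-Hellman over a prime field).
# NOT post-quantum and NOT secure; for illustrating the message flow only.
P = 2**255 - 19  # a well-known prime, used here as a convenient modulus
G = 2


def keygen():
    sk = secrets.randbelow(P - 2) + 1
    return pow(G, sk, P), sk  # (public key, secret key)


def encaps(pk):
    r = secrets.randbelow(P - 2) + 1
    ct = pow(G, r, P)
    shared = hashlib.sha256(pow(pk, r, P).to_bytes(32, "big")).digest()
    return ct, shared  # (ciphertext, shared secret)


def decaps(sk, ct):
    return hashlib.sha256(pow(ct, sk, P).to_bytes(32, "big")).digest()


def derive(*parts):
    """Hash all shared secrets and the transcript into one session key."""
    h = hashlib.sha256()
    for part in parts:
        h.update(part if isinstance(part, bytes) else part.to_bytes(32, "big"))
    return h.digest()


def handshake():
    # Long-term (static) key pairs, assumed to be known to the peer beforehand.
    init_pk, init_sk = keygen()
    resp_pk, resp_sk = keygen()

    # Message 1, initiator -> responder: a fresh ephemeral public key and a
    # ciphertext under the responder's static key.
    eph_pk, eph_sk = keygen()
    ct_resp, k_resp = encaps(resp_pk)

    # Message 2, responder -> initiator: one ciphertext under the initiator's
    # static key and one under the ephemeral key.
    k_resp_r = decaps(resp_sk, ct_resp)
    ct_init, k_init_r = encaps(init_pk)
    ct_eph, k_eph_r = encaps(eph_pk)
    transcript = (eph_pk, ct_resp, ct_init, ct_eph)
    key_responder = derive(k_resp_r, k_init_r, k_eph_r, *transcript)

    # The initiator decapsulates both ciphertexts and derives the same key.
    k_init = decaps(init_sk, ct_init)
    k_eph = decaps(eph_sk, ct_eph)
    key_initiator = derive(k_resp, k_init, k_eph, *transcript)
    return key_initiator, key_responder


key_i, key_r = handshake()
print("keys match:", key_i == key_r)
```

The two static-key encapsulations provide authentication in both directions, while the ephemeral one contributes fresh key material per session, which is why only the static KEM needs the stronger CCA guarantee.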

We use plain Kyber for both KEM instances, at security level 3 (Kyber768). Kyber is a very fast finalist of the NIST post-quantum standardization process, built by applying a security-enhancing transform to a weakly secure encryption scheme. We use our own open-source implementation, introduced in our previous blog post.

This simple approach yields results that are not very competitive on the bandwidth side, but promising when looking at the runtime, especially on the client side.

Protocol	Traffic (B)	# of IPv6 packets	Client runtime (ms)	Server runtime (ms)
PQ-OpenVPN	8996	23	1277	1269
PQ-WireGuard	2532	2	0.92	0.30
Our work (K+K)	4816	4	0.64	0.43

Performance of post-quantum handshakes.

We then examine our solution further to find out how fast we can go.

Optimizing the runtime

To do so, we tweak the Kyber KEM instance that works with ephemeral keys. We start by removing the Fujisaki-Okamoto transform and build our CPA-secure Kyber KEM directly on top of the encryption scheme. We also compress the ciphertexts more aggressively. We can afford to because, in practice, a negligible decapsulation failure rate is overkill when compared, for example, with the probability of a dropped packet.

These tweaks shrink the outputs, improve performance, and actually increase security, by arguments that can be borrowed from learning-with-rounding schemes. We cannot apply this trick to the Kyber instance working with static keys, because attacks that exploit decapsulation failures could lead to recovery of the secret key. The internals of the tweaked instance are detailed below.

Variant	d_u	d_v	Failure Rate	Ciphertext Size
t-Kyber512	8	3	2^-21.5	608 B

Tweaked Kyber internals.
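The ciphertext size in the table follows directly from the compression parameters. The sketch below is not taken from our code base: it uses the public Kyber parameters (modulus 3329, 256 coefficients per polynomial, module rank 2 for Kyber512) and the round-3 Kyber512 defaults d_u = 10, d_v = 4 for comparison. It also measures how much rounding error each compression level introduces, which is what drives the failure rate up.

```python
Q = 3329  # Kyber modulus
N = 256   # coefficients per polynomial


def ciphertext_bytes(k: int, d_u: int, d_v: int) -> int:
    """Ciphertext size: k polynomials at d_u bits plus one polynomial at d_v bits."""
    return (k * N * d_u + N * d_v) // 8


def compress(x: int, d: int) -> int:
    """Round x in [0, Q) to d bits: round(2^d * x / Q) mod 2^d."""
    return ((2 * (x << d) + Q) // (2 * Q)) % (1 << d)


def decompress(y: int, d: int) -> int:
    """Map a d-bit value back to [0, Q): round(Q * y / 2^d)."""
    return (Q * y + (1 << (d - 1))) >> d


def max_rounding_error(d: int) -> int:
    """Largest centered distance (mod Q) between x and decompress(compress(x))."""
    worst = 0
    for x in range(Q):
        diff = (x - decompress(compress(x, d), d)) % Q
        worst = max(worst, min(diff, Q - diff))
    return worst


print("Kyber512 (d_u=10, d_v=4):", ciphertext_bytes(2, 10, 4), "bytes")
print("t-Kyber512 (d_u=8, d_v=3):", ciphertext_bytes(2, 8, 3), "bytes")
for d in (10, 8, 4, 3):
    print(f"d={d}: max rounding error {max_rounding_error(d)}")
```

Dropping d_u from 10 to 8 and d_v from 4 to 3 takes the ciphertext from 768 to 608 bytes, at the price of a larger rounding error per coefficient and hence a higher decapsulation failure rate.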

As a result, at a roughly equivalent security level, we obtain a protocol that improves on prior work in computational time while staying competitive in bandwidth. Since the handshake only occurs every two minutes or so, the two extra IP packets do not hurt practicality. As expected, the biggest improvement is on the client side, where we almost halve the runtime compared to the prior PQ-WireGuard.

Protocol	Traffic (B)	# of IPv6 packets	Client runtime (ms)	Server runtime (ms)
PQ-OpenVPN	8996	23	1277	1269
PQ-WireGuard	2532	2	0.92	0.30
Our work (K+K)	4816	4	0.64	0.43
Our work (K+tK)	3760	4	0.50	0.29

Performance of post-quantum handshakes.

Optimizing the bandwidth

Whereas the prior PQ-WireGuard handshake reaches the lower bound of one IP packet per message, we quickly concluded that there is no way, however aggressively we tried, to fit our Kyber-based handshake into only two IP packets. Either the security level drops, or the failure rate rises above an acceptable level. In our case the bottleneck is the size of the ciphertexts of the CCA-secure Kyber scheme.
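A back-of-the-envelope check illustrates the bottleneck. The sketch below assumes that each handshake message must fit the IPv6 minimum MTU of 1280 bytes, that a Fujioka-style initiator message carries an ephemeral public key plus a static-key ciphertext, and that the response carries one static-key and one ephemeral-key ciphertext. WireGuard's own header fields are ignored, and the tweaked instance is assumed to keep the standard Kyber512 public key size, so the real situation is slightly worse.

```python
IPV6_MIN_MTU = 1280  # bytes, minimum MTU every IPv6 link must support
IPV6_HEADER = 40
UDP_HEADER = 8
MAX_PAYLOAD = IPV6_MIN_MTU - IPV6_HEADER - UDP_HEADER  # 1232 bytes

# Public key and ciphertext sizes in bytes.
KYBER512 = {"pk": 800, "ct": 768}
KYBER768 = {"pk": 1184, "ct": 1088}
T_KYBER512 = {"pk": 800, "ct": 608}  # pk size assumed unchanged by the tweak


def fits_in_one_packet(*field_sizes: int) -> bool:
    """True if the listed fields fit in a single minimum-MTU IPv6/UDP packet."""
    return sum(field_sizes) <= MAX_PAYLOAD


def handshake_fits(static_kem: dict, ephemeral_kem: dict) -> tuple:
    """Return (initiator message fits, responder message fits)."""
    initiator = fits_in_one_packet(ephemeral_kem["pk"], static_kem["ct"])
    responder = fits_in_one_packet(static_kem["ct"], ephemeral_kem["ct"])
    return initiator, responder


print("Kyber768 + t-Kyber512:", handshake_fits(KYBER768, T_KYBER512))
print("Kyber512 + t-Kyber512:", handshake_fits(KYBER512, T_KYBER512))
```

Under these assumptions neither message fits in one packet, even with the lightest standard Kyber parameter set for the static keys, because the CCA-secure ciphertext alone consumes most of the 1232-byte budget.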

We therefore deviate from the original approach and now try to optimize the communication cost.

We study the use of the del Pino construction, a different authenticated key exchange recipe that replaces the KEM working with long term authenticating keys by a signature scheme.

del Pino’s transform, combining a DSA and a KEM to obtain an AKE.

We keep using our tweaked Kyber instance with ephemeral keys and pair it with Rainbow, a signature scheme that stands out for its very small signatures. The relatively large public key of Rainbow is not a problem in our setup: after a one-time transfer, only a hash of the key needs to be used during the frequent handshakes, so the cost of this preprocessing is amortized quickly.
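The idea of referring to a large public key by its hash can be sketched as follows. This is an illustration rather than the code of our fork: the key is random placeholder data of arbitrary length, the helper names are hypothetical, and BLAKE2s is chosen only because WireGuard already relies on it.

```python
import hashlib
import secrets

# Hypothetical stand-in for a large signature public key. The length is
# arbitrary; it only needs to be much bigger than a handshake packet.
big_public_key = secrets.token_bytes(100_000)


def key_id(public_key: bytes) -> bytes:
    """32-byte identifier of a public key, computed with BLAKE2s."""
    return hashlib.blake2s(public_key).digest()


# One-time setup: the peer receives the full key and indexes it by its hash.
known_keys = {key_id(big_public_key): big_public_key}


def lookup(identifier: bytes) -> bytes:
    """During a handshake, recover the full key from the 32-byte identifier."""
    return known_keys[identifier]


sent = key_id(big_public_key)  # what would travel in each handshake
print(len(big_public_key), "byte key referenced by", len(sent), "bytes")
print("lookup ok:", lookup(sent) == big_public_key)
```

Every handshake after the initial transfer then pays 32 bytes for the key reference instead of the full key.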

Protocol	Traffic (B)	# of IPv6 packets	Client runtime (ms)	Server runtime (ms)
PQ-OpenVPN	8996	23	1277	1269
PQ-WireGuard	2532	2	0.92	0.30
Our work (K+K)	4816	4	0.64	0.43
Our work (K+tK)	3760	4	0.50	0.29
Our work (R+tK)	1912	2	5.76	5.41

Performance of post-quantum handshakes.

As you can see, we achieve a very light protocol. With this approach, a saving of almost two thousand bytes, and hence two IP packets, comes at the cost of a few milliseconds of computational overhead, which is barely perceptible to the user.
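To get a feeling for when the lighter handshake pays off, one can plug the table's numbers into a deliberately crude model: total time is the sum of both runtimes plus the time needed to push the bytes through the link. Latency, packet loss, and per-packet overhead are ignored, so this is an assumption-laden estimate rather than a measurement.

```python
# (traffic in bytes, client runtime in ms, server runtime in ms), from the table above
VARIANTS = {
    "K+tK": (3760, 0.50, 0.29),
    "R+tK": (1912, 5.76, 5.41),
}


def handshake_ms(name: str, link_bits_per_s: float) -> float:
    """Crude estimate: compute time on both sides plus transfer time."""
    traffic, client_ms, server_ms = VARIANTS[name]
    transfer_ms = traffic * 8 / link_bits_per_s * 1000
    return client_ms + server_ms + transfer_ms


def break_even_bits_per_s(light_compute: str, light_traffic: str) -> float:
    """Link speed at which both variants take the same estimated time."""
    t_a, c_a, s_a = VARIANTS[light_compute]
    t_b, c_b, s_b = VARIANTS[light_traffic]
    extra_bits = (t_a - t_b) * 8
    extra_compute_s = ((c_b + s_b) - (c_a + s_a)) / 1000
    return extra_bits / extra_compute_s


for speed in (250_000, 10_000_000):
    for name in VARIANTS:
        print(f"{name} at {speed} bit/s: {handshake_ms(name, speed):.2f} ms")
print(f"break-even: {break_even_bits_per_s('K+tK', 'R+tK') / 1e6:.2f} Mbit/s")
```

In this model the Rainbow-based variant comes out ahead on links slower than roughly 1.4 Mbit/s, while the Kyber-only variant wins on faster links, which matches the intuition that the best choice depends on the setting.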

Security

The security properties of the original WireGuard protocol are maintained under the assumption that the users' public keys are not known to the attacker. We agree with the arguments of Hülsing et al. in Post-Quantum WireGuard and Appelbaum et al. in Tiny WireGuard tweak, which conclude that this assumption is realistic. Since only a hash is used during the handshakes, attackers such as eavesdroppers cannot threaten the deniability and anonymity properties.

Final Version

Our quantum-resistant WireGuard fork, including all the aforementioned PQ-WireGuard variants, is publicly available at https://github.com/kudelskisecurity/pq-wireguard. Furthermore, since quantum computers are still at the prototype stage, we chose for our final product instances with a light security level, namely Kyber512, which offers security roughly equivalent to AES-128, and Rainbow-1. As a result, our quantum-resistant WireGuard implementations achieve even better performance, as reported in the table below.

Protocol	Branch	Traffic (B)	# of IPv6 packets	Client runtime (ms)	Server runtime (ms)
K512 + tK512	tweakedKyber	3172	4	0.39	0.27
R1 + tK512	tweakedRainbow	1716	2	4.40	4.31

Performance of the post-quantum handshakes included in our WireGuard fork.

Takeaway

When used correctly, a VPN is a useful tool whose security should be assured even after the advent of quantum computers. Towards that goal, we presented two performant quantum-resistant WireGuard constructions. We show that there is still room for improvement, especially when optimizing one dimension, bandwidth or runtime, while staying practical in the other.

We agree that prior PQ-WireGuard looks like a very good compromise, but we are pleased to present protocols that show some improvements in different practical settings, and that provide experimental results that can be used as motivation to accelerate the transition towards post quantum alternatives.

If you want to use our PQ-WireGuard protocols, the tutorial is included in the README of our GitHub repository. For additional information, you can also watch the recording of our presentation at the NIST third post-quantum cryptography standardization conference.

As a last word, we plan to integrate recent optimizations and port our implementation to Linux kernel space, hopefully to get even better runtimes. Stay tuned!
