Christian Huitema's blog


The long slow path to QUIC multipath

28 Feb 2024

I started working on QUIC multipath in 2017 (see this old draft). Seven years and 21 drafts later, we still have not converged on a solution. We thought we were close last November with version 6 of the IETF draft, but no, we had to go back to the drawing board.

At this point, my implementation of QUIC in picoquic supports three ways to implement multipath:

The simple version made just the minimal set of changes to RFC 9000 to allow transmission over multiple paths in parallel. All application data packets were numbered from the same number space, regardless of the transmission path. This leaves the protocol simple, which is very good if we only need to support simple scenarios like “make a transition to the new path before breaking the old one.” On the other hand, if the scenario requires high speed transmission on parallel paths, implementers need to be very careful in the way they handle acknowledgements and detect packet losses. With enough care, the performance is within a few percent of the more complex solutions, but there is still a hole: we would need more work to properly track ECN marks per path. Most developers were convinced that we needed instead a solution that manages multiple number spaces, so we can keep the ACK management and loss discovery algorithms of RFC 9000, without extra complexity.
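To illustrate why acknowledgements and loss detection need care when all paths share one number space, here is a minimal Python sketch. It assumes a hypothetical scenario in which packets alternate between a fast and a slow path, and compares a naive packet-threshold rule applied to the shared number space with a variant that only compares packets sent on the same path. The function names and the scenario are made up for illustration and are not picoquic's actual code; the threshold of 3 is the value recommended in RFC 9002.

```python
PACKET_THRESHOLD = 3  # kPacketThreshold recommended by RFC 9002


def lost_naive(sent, acked):
    """Packet-threshold rule applied to the shared number space.

    sent maps packet number -> path name, acked is a set of packet numbers.
    """
    largest = max(acked)
    return sorted(pn for pn in sent
                  if pn not in acked and pn + PACKET_THRESHOLD <= largest)


def lost_per_path(sent, acked):
    """Same rule, but only comparing packets that were sent on the same path."""
    order = {}  # path name -> packet numbers in send order on that path
    for pn in sorted(sent):
        order.setdefault(sent[pn], []).append(pn)
    lost = []
    for pns in order.values():
        acked_idx = [i for i, pn in enumerate(pns) if pn in acked]
        if not acked_idx:
            continue  # nothing acknowledged on this path yet, so no evidence of loss
        largest = max(acked_idx)
        lost += [pn for i, pn in enumerate(pns)
                 if pn not in acked and i + PACKET_THRESHOLD <= largest]
    return sorted(lost)


# Hypothetical scenario: even packets go on the fast path, odd ones on the slow path.
sent = {pn: ("fast" if pn % 2 == 0 else "slow") for pn in range(8)}
acked = {0, 2, 4, 6}  # only the fast-path ACKs have arrived so far

print(lost_naive(sent, acked))     # [1, 3]
print(lost_per_path(sent, acked))  # []
```

In this sketch the naive rule declares packets 1 and 3 lost even though they are merely still in flight on the slower path, which is the kind of spurious loss detection that a single number space implementation has to avoid.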

Using multiple number spaces means that packets sent on a path can be numbered “naturally”, in the order of transmission on that path. That means we could have packets 1, 2, 3, 4, etc. on path A, and different packets with the same numbers on path B. Loss recovery is simpler, but we have a problem with encryption. QUIC relies on AEAD encryption, which requires a unique “nonce” per packet, and RFC 9001 solves that by deriving the nonce from the 64-bit packet number of QUIC. But if different packets have the same number, the nonce is not unique anymore, and simple tricks can be used to guess the contents of packets with colliding nonces. The solution is to build a 96-bit nonce that combines 64 bits of sequence number and up to 32 bits of path identifier.
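The nonce construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the layout (32 bits of path identifier followed by 64 bits of packet number) follows the description in the text, the XOR with a 12-byte IV mirrors the way RFC 9001 combines the packet number with the IV, and the function name and the placeholder IV are hypothetical. The exact bit layout in the drafts may differ.

```python
def multipath_nonce(iv: bytes, path_id: int, packet_number: int) -> bytes:
    """Build a 96-bit AEAD nonce from a path identifier and a packet number.

    Assumed layout: 32-bit path ID, then 64-bit packet number, XORed with the IV.
    """
    if len(iv) != 12:
        raise ValueError("expected a 96-bit (12-byte) IV")
    if not 0 <= path_id < 2**32:
        raise ValueError("path identifier must fit in 32 bits")
    if not 0 <= packet_number < 2**64:
        raise ValueError("packet number must fit in 64 bits")
    combined = path_id.to_bytes(4, "big") + packet_number.to_bytes(8, "big")
    return bytes(a ^ b for a, b in zip(iv, combined))


iv = bytes(range(12))  # placeholder IV, for illustration only

# The same packet number on two different paths yields two different nonces.
nonce_a = multipath_nonce(iv, path_id=0, packet_number=1)
nonce_b = multipath_nonce(iv, path_id=1, packet_number=1)
print(nonce_a != nonce_b)  # True
```

Because the path identifier occupies its own bits of the nonce, two packets can only collide if they share both the path identifier and the packet number.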

The version 6 draft does just that. The design starts by observing that all QUIC packets start with a “Connection Identifier” (CID), a string of bytes that is unique to a connection. In principle, a CID is used on only one path, if only for privacy reasons. The CIDs are allocated by one node and sent to the other in “New Connection ID” frames, and are identified by a “CID sequence number”. We can use that sequence number to identify a path, but when we do that we are also using it to identify a packet number space. That works, but it requires data structures in which packet numbers and ACK management are “per CID”, which is only an approximation of “per path”. It is a pretty good approximation and it works well most of the time, but there is a corner case.

QUIC nodes will often perform “CID renewal” for privacy reasons. Since the CID value is carried in each packet, renewing the CID breaks the easy correlation between an old series of packets with the old CID, and the new series with the new one. That’s a gain for privacy. Draft 6 handles that easily: when a packet arrives with a new CID but the same source and destination IP addresses and port numbers as some previously defined path, we can handle that as “CID renewal”. We will start a new packet number sequence, but we can retain old path properties like round trip time (RTT) measurements and congestion control state.

The paths carrying QUIC packets sometimes incorporate Network Address Translation devices (NATs). These NATs map an incoming IP address and UDP port to different outgoing values, a relation described as “port mapping”. In principle, the port mappings remain stable long enough for most connections, but sometimes they change in the middle of a connection. QUIC nodes detect a mapping change when they receive packets with the same CID as an old path, but a different IP address and port. They then apply “NAT rebinding” rules to continue the connection while keeping the timing and congestion control information of the path.

But what if the CID renewal happens at the same time as a NAT rebinding? This is not a theoretical question, because nodes will want to renew the CID when there is a significant time gap between the old series of packets and the new one, and time gaps are exactly when NATs may lose the port mappings. The simple “CID based” path identification breaks when NAT rebinding and CID renewal happen at the same time. The new series of packets will be treated as a completely new path, requiring new RTT measurements and reinitialization of the congestion control algorithm. Arguably this does not happen very often, and some rare reinitialization of congestion control is not a very big deal. But still, this is a weakness in the design.
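The three cases discussed above (CID renewal, NAT rebinding, and both at once) can be made concrete with a small Python sketch of “CID based” path identification. The data structure, the function name and the addresses are hypothetical and only follow the description in this post; this is not picoquic's actual code.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (local IP, local port, remote IP, remote port)
FourTuple = Tuple[str, int, str, int]


@dataclass
class PathState:
    cid_seq: int             # CID sequence number currently identifying the path
    four_tuple: FourTuple    # addresses and ports last seen on the path
    smoothed_rtt_ms: float   # example of state worth keeping across changes


def classify_packet(paths: List[PathState], cid_seq: int,
                    four_tuple: FourTuple) -> str:
    """Decide how an incoming packet relates to the known paths."""
    for p in paths:
        if p.cid_seq == cid_seq:
            # Same CID: either nothing changed, or the NAT mapping changed.
            return "existing path" if p.four_tuple == four_tuple else "NAT rebinding"
    for p in paths:
        if p.four_tuple == four_tuple:
            # New CID on known addresses: keep RTT and congestion state.
            return "CID renewal"
    # Neither the CID nor the addresses match: path state starts from scratch.
    return "new path"


old = ("192.0.2.1", 4433, "198.51.100.7", 50000)
rebound = ("192.0.2.1", 4433, "198.51.100.7", 50001)
paths = [PathState(cid_seq=1, four_tuple=old, smoothed_rtt_ms=40.0)]

print(classify_packet(paths, 2, old))      # CID renewal
print(classify_packet(paths, 1, rebound))  # NAT rebinding
print(classify_packet(paths, 2, rebound))  # new path
```

The last call is the corner case: with a new CID and a new port mapping at the same time, nothing links the packet to the old path, so its RTT and congestion control state are lost.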

Enter the third proposal, which starts by redefining how CIDs are organized. RFC 9000 organizes a series of CIDs with unique sequence numbers within the connection. The new design organizes the set of CIDs per path, or rather per potential path. Each CID has a path identifier, and a sequence number that differentiates it from other CIDs in the same path. When doing CID renewal, nodes must replace the CID of the path with another CID with the same path identifier. The encryption nonce is composed of the packet sequence number and the path identifier. The sequence number is per path instead of per CID. If NAT rebinding happens in the middle of a CID renewal, the path identifier derived from the CID tells us that this is the same path, and the node can retain its path information. It is a nice structure, and many path management functions become simpler.
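A minimal Python sketch of this organization, assuming a hypothetical CID table that maps each CID to a (path identifier, sequence number) pair. The class and method names are invented for illustration and do not reflect the frame formats or the picoquic implementation.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class PathState:
    path_id: int
    smoothed_rtt_ms: float
    next_packet_number: int = 0  # the number space is per path, not per CID


class Connection:
    def __init__(self) -> None:
        # CID bytes -> (path identifier, sequence number within that path)
        self.cid_table: Dict[bytes, Tuple[int, int]] = {}
        self.paths: Dict[int, PathState] = {}

    def register_cid(self, cid: bytes, path_id: int, seq: int) -> None:
        self.cid_table[cid] = (path_id, seq)

    def path_for_packet(self, cid: bytes) -> PathState:
        """Find the path from the CID alone; addresses and ports play no role."""
        path_id, _seq = self.cid_table[cid]
        return self.paths[path_id]


conn = Connection()
conn.paths[1] = PathState(path_id=1, smoothed_rtt_ms=40.0, next_packet_number=17)
conn.register_cid(b"\xaa\x01", path_id=1, seq=0)  # CID in use before the time gap
conn.register_cid(b"\xbb\x02", path_id=1, seq=1)  # replacement CID, same path ID

# After CID renewal, and whatever the NAT did to the addresses, the packet
# still resolves to the same path state.
print(conn.path_for_packet(b"\xbb\x02") is conn.path_for_packet(b"\xaa\x01"))  # True
```

Since the lookup never consults the source address or port, a simultaneous NAT rebinding does not affect it, and the RTT and packet number state of path 1 are retained.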

Of course, this has a cost. Path management is simpler, but CID management is more complex. The proposal has to replace the New Connection ID and Retire Connection ID frames with new “Multipath” variants, and to introduce a mechanism to control the flow of New Connection IDs. There are still options to be discussed, such as whether the CIDs chosen by the sender and the receiver should have the same path number, or the details of NAT rebinding. Plus there is the cost to implementers: it took me a week to implement the “unique Path ID” proposal in picoquic, and I probably need to work some more to reach the quality of the previous code.

In the old days, the IETF was inclined to find a solution that was good enough, ship it, gather experience, and then revise the standard later to fill in the gaps. If we had followed that process, we could probably have published a QUIC Multipath RFC last year, or maybe the year before: draft 6 was definitely good enough, and the previous draft was probably OK as well. But we have decided instead to discuss all details before approving the draft. The result will probably be better, although gathering experience sooner would also have helped improve quality. In any case, the process is not quick. As things go, we will be lucky if we converge on a proposal by the next IETF meeting in July 2024, and even November 2025 will be challenging!

Comments

If you want to start or join a discussion on this post, the simplest way is to send a toot on the Fediverse/Mastodon to @huitema@social.secret-wg.org.