Reviewed-by: Matt Caswell <matt@openssl.org> Reviewed-by: Tomas Mraz <tomas@openssl.org> (Merged from https://github.com/openssl/openssl/pull/19925)
QUIC Connection State Machine
FSM Model
QUIC client-side connection state can be broken down into five coarse phases (Idle, Establishing, Open, Terminating and Terminated), grouped as follows:
- The Idle state (which is simply the state before we have started trying to establish a connection);
- The Active state, which comprises two substates:
  - The Establishing state, which comprises many different substates;
  - The Open state;
- The Terminating state, which comprises several substates;
- The Terminated state, which is the terminal state.
There is monotonic progression through these phases.
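As a minimal sketch of what monotonic progression means, the five phases can be given an ordering and any backwards move rejected. The names `Phase`, `is_active` and `advance` are hypothetical and chosen for illustration only; they are not part of any implementation described here.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Coarse connection phases, ordered so that later phases compare greater."""
    IDLE = 0
    ESTABLISHING = 1
    OPEN = 2
    TERMINATING = 3
    TERMINATED = 4


def is_active(phase: Phase) -> bool:
    """The Active state is the union of Establishing and Open."""
    return phase in (Phase.ESTABLISHING, Phase.OPEN)


def advance(current: Phase, new: Phase) -> Phase:
    """Move to a new phase, rejecting any attempt to go backwards."""
    if new < current:
        raise ValueError(f"cannot move from {current.name} back to {new.name}")
    return new
```

Phases may be skipped (for example, Idle straight to Terminated), but never revisited.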
These names have been deliberately chosen to use different terminology to common QUIC terms such as 'handshake' to avoid confusion, as they are not the same concepts. For example, the Establishing state uses Initial, Handshake and 1-RTT packets.
This discussion is (currently) given from the client side perspective only. State machine considerations only relevant to servers are not mentioned. 0-RTT is also not currently modelled in this analysis.
The synthesis of this FSM is not suggested by the QUIC RFCs but has been discerned from the requirements imposed. This does not mean that the implementation of this FSM as literally presented below is an optimal or advisable implementation strategy, and a cursory examination of existing QUIC implementations suggests that such an approach is not common. Moreover, excess attention should not be given to the Open state, as 1-RTT application communication can occur even while still in the Establishing state (for example, when the handshake has been completed but not yet confirmed).
However, the state machine described herein is helpful as an aid to understanding and broadly captures the logic which our implementation will embody. The design of the actual implementation is discussed further below.
The above states and their substates are defined as follows:
- The Establishing state involves the use of Initial and Handshake packets. It is terminated when the handshake is confirmed.

  Handshake confirmation is not the same as handshake completion. Handshake confirmation occurs on the client when it receives a `HANDSHAKE_DONE` frame (which occurs in a 1-RTT packet, thus 1-RTT packets are also involved in the Establishing state). On the server, handshake confirmation occurs as soon as the handshake is considered completed (see RFC 9001 s. 4.1).

  The Establishing state is subdivided into the following substates:
  - Proactive Version Negotiation (optional): The client sends a Version Negotiation packet with a reserved version number to forcibly elicit a list of the server's supported versions. This is not expected to be commonly used, as it adds a round trip.

    If it is used, the time spent in this state is based on waiting for the server to respond, and potentially retransmitting after a timeout.

  - Pre-Initial: The client has completed proactive version negotiation (if it performed it), but has not yet sent any encrypted packet. This substate is included for exposition; no time will generally be spent in it and there is immediate transmission of the first encrypted packet and transition to Initial Exchange A.

  - Initial Exchange A: The client has sent at least one Initial packet to the server attempting to initiate a connection.

    The client is waiting for a server response, which might be:
    - a Version Negotiation packet (leading to the Reactive Version Negotiation state);
    - a Retry packet (leading to Initial Exchange B); or
    - an Initial packet (leading to the Initial Exchange Continued state).

  - Reactive Version Negotiation: The server has rejected the client's proposed version. If proactive version negotiation was used, this can be considered an error. Otherwise, we return to the Pre-Initial state and proceed as though proactive version negotiation was performed using the information in the version negotiation packet.
  - Initial Exchange B: The client has been asked to perform a Retry. It sends at least one Initial packet to the server attempting to initiate a connection. Every Initial packet contains the quoted Retry Token. Any data sent in `CRYPTO` frames in Initial Exchange A must be retransmitted, but PNs MUST NOT be reset. Note that this is still considered part of the same connection, and QUIC Transport Parameters are later used to cryptographically bind the established connection state to the original DCIDs used as part of the Retry process. A server is not allowed to respond to a Retry-triggered Initial exchange with another Retry, and if it does we ignore it, which is the major distinction of this state from Initial Exchange A.

    The client is waiting for a server response, which might be:
    - a Version Negotiation packet (invalid, ignored);
    - a Retry packet (invalid, ignored);
    - an Initial packet (leading to the Initial Exchange Continued state).
  - Initial Exchange Continued: The client has sent at least one Initial packet to the server and received at least one valid Initial packet from the server. There is no longer any possibility of a Retry (any such packet is ignored) and communications may continue via Initial packets for an arbitrarily long period until the handshake layer indicates the Handshake EL is ready.

    The client is waiting for server packets, until one of those packets causes the handshake layer (whether it is TLS 1.3 or some other hypothetical handshake layer) to emit keys for the Handshake EL. This will generally occur due to incoming Initial packets containing crypto stream segments (in the form of `CRYPTO` frames) which deliver handshake layer protocol messages to the handshake layer in use.

  - Handshake: The Handshake EL is now available to the client. Either client or server may send the first Handshake packet.

    The client is waiting to receive a Handshake packet from the server.

  - Handshake Continued: The client has received and successfully decrypted at least one Handshake packet. The client now discards the Initial EL. Communications via the Handshake EL may continue for an arbitrary period of time.

    The client is waiting to receive more Handshake packets from the server to advance the handshake layer and cause it to transition to the Handshake Completed state.
  - Handshake Completed: The handshake layer has indicated that it considers the handshake completed. For TLS 1.3, this means both parties have sent and received (and verified) TLS 1.3 Finished messages. The handshake layer must emit keys for the 1-RTT EL at this time.

    Though the handshake is not yet confirmed, the client can begin sending 1-RTT packets.

    The QUIC Transport Parameters sent by the peer are now authenticated. (Though the peer's QUIC Transport Parameters may have been received earlier in the handshake process, they are only considered authenticated at this point.)

    The client transitions to Handshake Confirmed once either
    - it receives a `HANDSHAKE_DONE` frame in a 1-RTT packet, or
    - it receives acknowledgement of any 1-RTT packet it sent.

    Though this discussion only covers the client state machine, it is worth noting that on the server, the handshake is considered confirmed as soon as it is considered completed.
  - Handshake Confirmed: The client has received confirmation from the server that the handshake is confirmed.

    The principal effect of moving to this state is that the Handshake EL is discarded. Key Update is also now permitted for the first time.

    The Establishing state is now done and there is immediate transition to the Open state.
- The Open state is the steady state of the connection. It is a single state.

  Application stream data is exchanged freely. Only 1-RTT packets are used. The Initial, Handshake (and 0-RTT) ELs have been discarded, transport parameters have been exchanged, and the handshake has been confirmed.

  The client transitions to
  - the Terminating — Closing state if the local application initiates an immediate close (a `CONNECTION_CLOSE` frame is sent);
  - the Terminating — Draining state if the remote peer initiates an immediate close (i.e., a `CONNECTION_CLOSE` frame is received);
  - the Terminated state if the idle timeout expires; a `CONNECTION_CLOSE` frame is NOT sent;
  - the Terminated state if the peer triggers a stateless reset; a `CONNECTION_CLOSE` frame is NOT sent.
- The Terminating state is used when closing the connection. This may occur due to an application request or a transport-level protocol error.

  Key updates may not be initiated in the Terminating state.

  This state is divided into two substates:

  - The Closing state, used for a locally initiated immediate close. In this state, a packet containing a `CONNECTION_CLOSE` frame is transmitted again in response to any packets received. This ensures that a `CONNECTION_CLOSE` frame is received by the peer even if the initially transmitted `CONNECTION_CLOSE` frame was lost. Note that these `CONNECTION_CLOSE` frames are not governed by QUIC's normal loss detection mechanisms; this is a bespoke mechanism unique to this state, which exists solely to ensure delivery of the `CONNECTION_CLOSE` frame.

    The endpoint progresses to the Terminated state after a timeout interval, which should not be less than three times the PTO interval.

    It is also possible for the endpoint to transition to the Draining state instead, if it receives a `CONNECTION_CLOSE` frame prior to the timeout expiring. This indicates that the peer is also closing.

  - The Draining state, used for a peer-initiated immediate close.

    The local endpoint may not send any packets of any kind in this state. It may optionally send one `CONNECTION_CLOSE` frame immediately prior to entering this state.

    The endpoint progresses to the Terminated state after a timeout interval, which should not be less than three times the PTO interval.
- The Terminated state is the terminal state of a connection. Regardless of how a connection ends (local or peer-initiated immediate close, idle timeout, stateless reset), a connection always ultimately ends up in this state. There is no longer any requirement to send or receive any packet. No timer events related to the connection will ever need to fire again. This is a totally quiescent state. The state associated with the connection may now be safely freed.
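The Closing and Draining behaviour described above can be sketched as a small helper. This is an illustration under stated assumptions, not the implementation: the class `Terminator` and its methods are hypothetical, the timeout is taken as exactly three times the PTO (the text only gives this as a lower bound), and the deadline is assumed not to restart when moving from Closing to Draining.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TermState(Enum):
    CLOSING = auto()
    DRAINING = auto()
    TERMINATED = auto()


@dataclass
class Terminator:
    """Hypothetical helper tracking the Terminating substates."""
    state: TermState
    deadline: float  # absolute time at which we move to TERMINATED

    @classmethod
    def start(cls, now: float, pto: float, locally_initiated: bool) -> "Terminator":
        # Assumption: use exactly 3 * PTO, the minimum the text allows.
        state = TermState.CLOSING if locally_initiated else TermState.DRAINING
        return cls(state=state, deadline=now + 3 * pto)

    def on_tick(self, now: float) -> None:
        """Fire the terminating timeout if it has expired."""
        if self.state is not TermState.TERMINATED and now >= self.deadline:
            self.state = TermState.TERMINATED

    def on_packet(self, now: float, has_connection_close: bool) -> bool:
        """Handle a received packet; return True if CONNECTION_CLOSE should be re-sent."""
        self.on_tick(now)
        if self.state is not TermState.CLOSING:
            return False  # Draining and Terminated never send anything
        if has_connection_close:
            self.state = TermState.DRAINING  # the peer is closing too
            return False
        return True  # re-send our CONNECTION_CLOSE in response
```

Note that the re-send decision bypasses normal loss detection entirely: it is driven only by the arrival of packets while in Closing.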
We express this state machine more concretely as a table, which makes the available transitions clear:
† Except where superseded by a more specific transition
ε means “where no other transition is applicable”.
Where an action is specified in the Transition/Action column but no new state, no state change occurs.
| State | Action On Entry/Exit | Event | Transition/Action |
|---|---|---|---|
| IDLE | | —APP:CONNECT→ | ACTIVE.ESTABLISHING.PROACTIVE_VER_NEG (if used), else ACTIVE.ESTABLISHING.PRE_INITIAL |
| | | —APP:CLOSE→ | TERMINATED |
| ACTIVE | | —IDLE_TIMEOUT→ | TERMINATED |
| | | —PROBE_TIMEOUT→ † | SendProbeIfAnySentPktsUnacked() |
| | | —APP:CLOSE→ † | TERMINATING.CLOSING |
| | | —RX:ANY[CONNECTION_CLOSE]→ | TERMINATING.DRAINING |
| | | —RX:STATELESS_RESET→ | TERMINATED |
| ACTIVE.ESTABLISHING.PROACTIVE_VER_NEG | enter:SendReqVerNeg | —RX:VER_NEG→ | ACTIVE.ESTABLISHING.PRE_INITIAL |
| | | —PROBE_TIMEOUT→ | ACTIVE.ESTABLISHING.PROACTIVE_VER_NEG (retransmit) |
| | | —APP:CLOSE→ | TERMINATED |
| ACTIVE.ESTABLISHING.PRE_INITIAL | | —ε→ | ACTIVE.ESTABLISHING.INITIAL_EXCHANGE_A |
| ACTIVE.ESTABLISHING.INITIAL_EXCHANGE_A | enter:SendPackets() (First Initial) | —RX:RETRY→ | ACTIVE.ESTABLISHING.INITIAL_EXCHANGE_B |
| | | —RX:INITIAL→ | ACTIVE.ESTABLISHING.INITIAL_EXCHANGE_CONTINUED |
| | | —RX:VER_NEG→ | ACTIVE.ESTABLISHING.REACTIVE_VER_NEG |
| | | —CAN_SEND→ | SendPackets() |
| ACTIVE.ESTABLISHING.REACTIVE_VER_NEG | | —ε→ | ACTIVE.ESTABLISHING.PRE_INITIAL |
| ACTIVE.ESTABLISHING.INITIAL_EXCHANGE_B | enter:SendPackets() (First Initial, with token; all further Initial packets contain the token; PN is not reset) | —RX:INITIAL→ | ACTIVE.ESTABLISHING.INITIAL_EXCHANGE_CONTINUED |
| | | —PROBE_TIMEOUT→ | TODO: Tail loss probe for initial packets? |
| | | —CAN_SEND→ | SendPackets() |
| ACTIVE.ESTABLISHING.INITIAL_EXCHANGE_CONTINUED | enter:SendPackets() | —RX:INITIAL→ | (packet processed, no change) |
| | | —TLS:HAVE_EL(HANDSHAKE)→ | ACTIVE.ESTABLISHING.HANDSHAKE |
| ACTIVE.ESTABLISHING.HANDSHAKE | enter:ProvisionEL(Handshake); enter:SendPackets() (First Handshake packet, if pending) | —RX:HANDSHAKE→ | ACTIVE.ESTABLISHING.HANDSHAKE_CONTINUED |
| | | —RX:INITIAL→ | (packet processed if EL is not dropped) |
| | | —CAN_SEND→ | SendPackets() |
| ACTIVE.ESTABLISHING.HANDSHAKE_CONTINUED | enter:DropEL(Initial); enter:SendPackets() | —RX:HANDSHAKE→ | (packet processed, no change) |
| | | —TLS:HANDSHAKE_COMPLETE→ | ACTIVE.ESTABLISHING.HANDSHAKE_COMPLETED |
| | | —CAN_SEND→ | SendPackets() |
| ACTIVE.ESTABLISHING.HANDSHAKE_COMPLETED | enter:ProvisionEL(1RTT); enter:HandshakeComplete(); enter[server]:Send(HANDSHAKE_DONE); enter:SendPackets() | —RX:1RTT[HANDSHAKE_DONE]→ | ACTIVE.ESTABLISHING.HANDSHAKE_CONFIRMED |
| | | —RX:1RTT→ | (packet processed, no change) |
| | | —CAN_SEND→ | SendPackets() |
| ACTIVE.ESTABLISHING.HANDSHAKE_CONFIRMED | enter:DiscardEL(Handshake); enter:Permit1RTTKeyUpdate() | —ε→ | ACTIVE.OPEN |
| ACTIVE.OPEN | | —RX:1RTT→ | (packet processed, no change) |
| | | —CAN_SEND→ | SendPackets() |
| TERMINATING | | —TERMINATING_TIMEOUT→ | TERMINATED |
| | | —RX:STATELESS_RESET→ | TERMINATED |
| TERMINATING.CLOSING | enter:QueueConnectionCloseFrame(); enter:SendPackets() | —RX:ANY[CONNECTION_CLOSE]→ | TERMINATING.DRAINING |
| | | —RX:ANY→ | QueueConnectionCloseFrame(); SendPackets() |
| | | —CAN_SEND→ | SendPackets() |
| TERMINATING.DRAINING | | | |
| TERMINATED | [terminal state] | | |
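The way a more specific transition supersedes a parent-state default can be sketched with a lookup that walks from the substate up through its parents. This is an illustrative sketch only: `TRANSITIONS` holds a small subset of the table, `lookup` and `step` are hypothetical names, entry actions are omitted, and `APP:CONNECT` is assumed to skip proactive version negotiation.

```python
# (state, event) -> next state; a subset of the table, for illustration only.
TRANSITIONS = {
    # Assumption: proactive version negotiation is not used.
    ("IDLE", "APP:CONNECT"): "ACTIVE.ESTABLISHING.PRE_INITIAL",
    ("IDLE", "APP:CLOSE"): "TERMINATED",
    ("ACTIVE", "IDLE_TIMEOUT"): "TERMINATED",
    ("ACTIVE", "APP:CLOSE"): "TERMINATING.CLOSING",
    ("ACTIVE", "RX:ANY[CONNECTION_CLOSE]"): "TERMINATING.DRAINING",
    ("ACTIVE", "RX:STATELESS_RESET"): "TERMINATED",
    ("ACTIVE.ESTABLISHING.PROACTIVE_VER_NEG", "APP:CLOSE"): "TERMINATED",
    ("ACTIVE.ESTABLISHING.PROACTIVE_VER_NEG", "RX:VER_NEG"):
        "ACTIVE.ESTABLISHING.PRE_INITIAL",
    ("ACTIVE.ESTABLISHING.INITIAL_EXCHANGE_A", "RX:RETRY"):
        "ACTIVE.ESTABLISHING.INITIAL_EXCHANGE_B",
    ("TERMINATING", "TERMINATING_TIMEOUT"): "TERMINATED",
    ("TERMINATING.CLOSING", "RX:ANY[CONNECTION_CLOSE]"): "TERMINATING.DRAINING",
}


def lookup(state, event):
    """Find the most specific handler, walking from the substate up to its parents."""
    scope = state
    while True:
        if (scope, event) in TRANSITIONS:
            return TRANSITIONS[(scope, event)]
        if "." not in scope:
            return None  # no handler anywhere: the event is ignored
        scope = scope.rsplit(".", 1)[0]  # drop the last component to reach the parent


def step(state, event):
    """Return the state after the event; unchanged if the event is ignored."""
    nxt = lookup(state, event)
    return state if nxt is None else nxt
```

For example, `step("ACTIVE.OPEN", "APP:CLOSE")` falls back to the `ACTIVE` default, whereas the same event in `PROACTIVE_VER_NEG` hits the more specific entry first.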
Notes on various events:
- `CAN_SEND` is raised when transmission of packets has been unblocked after previously having been blocked. There are broadly two reasons why transmission of packets may not have been possible:
  - Due to OS buffers or network-side write BIOs being full;
  - Due to limits imposed by the chosen congestion controller.

  `CAN_SEND` is expected to be raised due to a timeout prescribed by the congestion controller or in response to poll(2) or similar notifications, as abstracted by the BIO system and how the application has chosen to notify libssl of network I/O readiness.

  It is generally implied that processing of a packet as mentioned above may cause new packets to be queued and sent, so this is not listed explicitly in the Transition column except for the `CAN_SEND` event.
- `PROBE_TIMEOUT` is raised after the PTO interval and stimulates generation of a tail loss probe.
- `IDLE_TIMEOUT` is raised after the connection idle timeout expires. Note that the loss detector only makes a determination of loss due to an incoming ACK frame; if a peer becomes totally unresponsive, the idle timeout is the only mechanism available to terminate the connection (other than the local application choosing to close it).
- `RX:STATELESS_RESET` indicates receipt of a stateless reset, but note that it is not guaranteed that we are able to recognise a stateless reset that we receive, thus this event may not always be raised.
- `RX:ANY[CONNECTION_CLOSE]` denotes a `CONNECTION_CLOSE` frame received in any non-discarded EL.
- Any circumstance where `RX:RETRY` or `RX:VER_NEG` are not explicitly listed means that these packets are not allowed and will be ignored.
- Protocol errors, etc. can be handled identically to `APP:CLOSE` events as indicated in the above table if locally initiated. Protocol errors signalled by the peer are handled as `RX:ANY[CONNECTION_CLOSE]` events.
Notes on various actions:
- `SendPackets()` sends packets if we have anything pending for transmission, and only to the extent we are able to with regard to congestion control and available BIO buffer space, etc.
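A minimal sketch of the gating that `SendPackets()` implies is shown below. The function `send_packets` and its parameters are hypothetical: packets are modelled as byte strings, and the congestion window and BIO buffer space are reduced to two byte budgets, which is a simplification of whatever the real congestion controller and BIO layer expose.

```python
from collections import deque


def send_packets(pending: deque, cwnd_avail: int, bio_avail: int) -> list:
    """Send queued packets while both budgets allow; return the packets sent."""
    sent = []
    while pending:
        size = len(pending[0])
        if size > cwnd_avail or size > bio_avail:
            break  # blocked: wait for a later CAN_SEND event
        cwnd_avail -= size
        bio_avail -= size
        sent.append(pending.popleft())
    return sent
```

Anything left in `pending` stays queued until `CAN_SEND` is raised again.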
Non-FSM Model
Common QUIC implementations appear to prefer modelling connection state as a set of flags rather than as an FSM. It can be observed above that there is a fair degree of commonality between many states. This has been modelled above using hierarchical states with default handlers for common events. The state machine can be viewed as a diagram here (large image).
We transpose the above table to sort by events rather than states, to discern the following list of events:
- `APP:CONNECT`: Supported in `IDLE` state only.
- `RX:VER_NEG`: Handled in `ESTABLISHING.PROACTIVE_VER_NEG` and `ESTABLISHING.INITIAL_EXCHANGE_A` only, otherwise ignored.
- `RX:RETRY`: Handled in `ESTABLISHING.INITIAL_EXCHANGE_A` only.
- `PROBE_TIMEOUT`: Applicable to `OPEN` and all (non-ε) `ESTABLISHING` substates. Handled via `SendProbeIfAnySentPktsUnacked()` except in the `ESTABLISHING.PROACTIVE_VER_NEG` state, which reenters that state to trigger retransmission of a Version Negotiation packet.
- `IDLE_TIMEOUT`: Applicable to `OPEN` and all (non-ε) `ESTABLISHING` substates. Action: immediate transition to `TERMINATED` (no `CONNECTION_CLOSE` frame is sent).
- `TERMINATING_TIMEOUT`: Timeout used by the `TERMINATING` state only.
- `CAN_SEND`: Applicable to `OPEN` and all (non-ε) `ESTABLISHING` substates, as well as `TERMINATING.CLOSING`. Action: `SendPackets()`.
- `RX:STATELESS_RESET`: Applicable to all `ESTABLISHING` and `OPEN` states and the `TERMINATING.CLOSING` substate. Always causes a direct transition to `TERMINATED`.
- `APP:CLOSE`: Supported in `IDLE`, `ESTABLISHING` and `OPEN` states. (Reasonably a no-op in `TERMINATING` or `TERMINATED`.)
- `RX:ANY[CONNECTION_CLOSE]`: Supported in all `ESTABLISHING` and `OPEN` states, as well as in `TERMINATING.CLOSING`. Transition to `TERMINATING.DRAINING`.
- `RX:INITIAL`, `RX:HANDSHAKE`, `RX:1RTT`: Our willingness to process these is modelled on whether we have an EL provisioned or discarded, etc.; thus this does not require modelling as additional state.

  Once we successfully decrypt a Handshake packet, we stop processing Initial packets and discard the Initial EL, as required by RFC.
- `TLS:HAVE_EL(HANDSHAKE)`: Emitted by the handshake layer when Handshake EL keys are available.
- `TLS:HANDSHAKE_COMPLETE`: Emitted by the handshake layer when the handshake is complete. Implies connection has been authenticated. Also implies 1-RTT EL keys are available. Whether the handshake is complete, and also whether it is confirmed, is reasonably implemented as a flag.
From here we can discern state dependence of different events:
- `APP:CONNECT`: Need to know if application has invoked this event yet, as if so it is invalid.

  State: Boolean: Connection initiated?
- `RX:VER_NEG`: Only valid if we have not yet received any successfully processed encrypted packet from the server.
- `RX:RETRY`: Only valid if we have sent an Initial packet to the server, have not yet received any successfully processed encrypted packet from the server, and have not previously been asked to do a Retry as part of this connection (and the Retry Integrity Tag validates).

  Action: Note that we are now acting on a retry and start again. Do not reset packet numbers. The original CIDs used for the first connection attempt must be noted for later authentication in the QUIC Transport Parameters.

  State: Boolean: Retry requested?

  State: CID: Original SCID, DCID.
- `PROBE_TIMEOUT`: If we have already sent at least one encrypted packet, we can handle this via a standard probe-sending mechanism. Otherwise, we are still in Proactive Version Negotiation and should retransmit the Version Negotiation packet we sent.

  State: Boolean: Doing proactive version negotiation?
- `IDLE_TIMEOUT`: Only applicable in `ACTIVE` states.

  We are `ACTIVE` if a connection has been initiated (see `APP:CONNECT`) and we are not in `TERMINATING` or `TERMINATED`.
- `TERMINATING_TIMEOUT`: Timer used in `TERMINATING` state only.
- `CAN_SEND`: Stimulates transmission of packets.
- `RX:STATELESS_RESET`: Always handled unless we are in `TERMINATED`.
- `APP:CLOSE`: Usually causes a transition to `TERMINATING.CLOSING`.
- `RX:INITIAL`, `RX:HANDSHAKE`, `RX:1RTT`: Willingness to process these is implicit in whether we currently have the applicable EL provisioned.
- `TLS:HAVE_EL(HANDSHAKE)`: Handled by the handshake layer and forwarded to the record layer to provision keys.
- `TLS:HANDSHAKE_COMPLETE`: Should be noted as a flag and notification provided to various components.
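The state dependence listed above amounts to a handful of predicates over boolean flags. The following sketch shows a few of them; the `Flags` class, its field names and the helper functions are hypothetical, and validation of the Retry packet itself is reduced to a single `integrity_ok` argument.

```python
from dataclasses import dataclass


@dataclass
class Flags:
    """Hypothetical flag set; names are illustrative only."""
    connection_initiated: bool = False
    sent_initial: bool = False
    received_encrypted_packet: bool = False
    retry_requested: bool = False
    doing_proactive_ver_neg: bool = False


def may_connect(f: Flags) -> bool:
    # APP:CONNECT is invalid once it has been invoked.
    return not f.connection_initiated


def accepts_ver_neg(f: Flags) -> bool:
    # Valid only before any encrypted packet from the server was processed.
    return not f.received_encrypted_packet


def accepts_retry(f: Flags, integrity_ok: bool) -> bool:
    # Valid only after our first Initial, before any encrypted server packet,
    # at most once per connection, and only if the Retry packet validates.
    return (f.sent_initial
            and not f.received_encrypted_packet
            and not f.retry_requested
            and integrity_ok)


def probe_action(f: Flags) -> str:
    # During proactive version negotiation we retransmit the request instead
    # of using the standard probe mechanism.
    return "retransmit_ver_neg" if f.doing_proactive_ver_neg else "send_probe"
```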
We choose to model the state of the connection state machine (CSM) as follows:
- The `IDLE`, `ACTIVE`, `TERMINATING.CLOSING`, `TERMINATING.DRAINING` and `TERMINATED` states are modelled explicitly as a state variable. However, the substates of `ACTIVE` are not explicitly modelled.
- The following flags are modelled:
  - Retry Requested? (+ Original SCID, DCID if so)
  - Have Sent Any Packet?
  - Are we currently doing proactive version negotiation?
  - Have Successfully Received Any Encrypted Packet?
  - Handshake Completed?
  - Handshake Confirmed?
- The following timers are modelled:
  - PTO Timeout
  - Terminating Timeout
  - Idle Timeout
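One possible shape for this model is a single record holding the state variable, the flags and the timers. The sketch below is an assumption-laden illustration, not the actual data structure: all names are hypothetical, timers are represented as optional absolute deadlines, and the choice of which timers are relevant in which state follows the event notes above.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class State(Enum):
    IDLE = auto()
    ACTIVE = auto()
    TERMINATING_CLOSING = auto()
    TERMINATING_DRAINING = auto()
    TERMINATED = auto()


@dataclass
class ConnState:
    """Hypothetical CSM record: one state variable, six flags, three timers."""
    state: State = State.IDLE
    retry_requested: bool = False
    have_sent_any_packet: bool = False
    doing_proactive_ver_neg: bool = False
    have_received_encrypted_packet: bool = False
    handshake_completed: bool = False
    handshake_confirmed: bool = False
    # Timers as absolute deadlines; None means the timer is not armed.
    pto_deadline: Optional[float] = None
    terminating_deadline: Optional[float] = None
    idle_deadline: Optional[float] = None

    def is_open(self) -> bool:
        """The Open substate is derived from a flag rather than stored."""
        return self.state is State.ACTIVE and self.handshake_confirmed

    def next_deadline(self) -> Optional[float]:
        """Earliest armed timer relevant to the current state, if any."""
        if self.state is State.ACTIVE:
            candidates = [self.pto_deadline, self.idle_deadline]
        elif self.state in (State.TERMINATING_CLOSING, State.TERMINATING_DRAINING):
            candidates = [self.terminating_deadline]
        else:
            candidates = []  # IDLE and TERMINATED need no timers
        armed = [d for d in candidates if d is not None]
        return min(armed) if armed else None
```

Deriving `is_open()` from the flag keeps the substates of `ACTIVE` out of the state variable, in line with the model described above.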
Implementation Plan
- Phase 1: “Steady state only” model which jumps to the `ACTIVE.OPEN` state with a hardcoded key.

  Test plan: Currently uncertain, to be determined.
- Phase 2: “Dummy handshake” model which uses a one-byte protocol as the handshake layer as a stand-in for TLS 1.3, e.g. a 0x01 byte “represents” a ClientHello, a 0x02 byte “represents” a ServerHello. Keys are fixed.

  Test plan: If feasible, an existing QUIC implementation will be modified to use this protocol and E2E testing will be performed against it. (This can probably be done quickly but an alternate plan may be required if the effort needed turns out to be excessive.)
- Phase 3: Final model with TLS 1.3 handshake layer fully plumbed in.

  Test plan: Testing against real world implementations.
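To make the Phase 2 idea concrete, a client-side dummy handshake layer could look roughly like the following. Only the two byte values come from the plan above; the class `DummyClientHandshake`, its methods, the error handling and the all-zero `FIXED_KEY` are assumptions made purely for illustration.

```python
CLIENT_HELLO = b"\x01"  # stands in for a TLS 1.3 ClientHello
SERVER_HELLO = b"\x02"  # stands in for a TLS 1.3 ServerHello
FIXED_KEY = bytes(32)   # hypothetical fixed key; the real value is not specified


class DummyClientHandshake:
    """Hypothetical one-byte handshake layer for the client side."""

    def __init__(self) -> None:
        self.complete = False

    def start(self) -> bytes:
        """Return the crypto stream data for the first flight."""
        return CLIENT_HELLO

    def on_crypto_data(self, data: bytes) -> None:
        """Consume crypto stream data received from the server."""
        if data != SERVER_HELLO:
            raise ValueError("unexpected dummy handshake message")
        self.complete = True  # corresponds to TLS:HANDSHAKE_COMPLETE
```

Because the keys are fixed, the connection state machine can be exercised end to end without any dependency on the TLS 1.3 implementation.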