Just to remind everyone, this is the example protocol we’re writing:
- Protocol handshake
- Login
- Receive table lists
- Join game table
- Participate in play
- Leave table
Why a protocol handshake?
There are plenty of servers that roll the login packet and the protocol handshake into one. The disadvantage of combining them is that it becomes harder to change the login packet between protocol versions.
It might feel like over-engineering to build for advanced protocol versioning, until you realise that one of the most important considerations in protocol design is how easily the protocol can evolve.
Of course, this is for a persistent connection. For a stateless server we’d have to include protocol + login + request in each request anyway. However, for stateless servers this is not as big an issue, since the requests tend to be layered anyway – typically by using a protocol like HTTP (or HTTPS). That is an interesting topic in itself but beyond the scope of this series.
Our initial handshake
Let’s assume we won’t need more than 65535 protocol versions (and unless the protocol version is horribly misused, this will be true).
The client sends its protocol version as an unsigned short, and the server returns a single byte containing the result code. Since this forms the bootstrap part of the protocol, we try to keep it as simple and as unlikely to change as possible.
At this point the server knows the client’s protocol version, and may respond with any code that it knows the client can accept.
So, something like this:
    CLIENT                SERVER
    [protocol id]  ->
                   <-     HANDSHAKE_OK / PROTOCOL_NOT_SUPPORTED
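As a rough sketch of the server's side of this exchange, the decision could look something like the following. The result code values are the ones used in this series; the set of supported versions and the function name `handshake_result` are made up for illustration.

```python
from enum import IntEnum


class Result(IntEnum):
    """Result codes the server can return for the handshake."""
    HANDSHAKE_OK = 0
    PROTOCOL_NOT_SUPPORTED = 1


# Assumption for this sketch: the server only speaks protocol version 1.
SUPPORTED_VERSIONS = frozenset({1})


def handshake_result(client_version: int) -> Result:
    """Pick the result code for the protocol version the client announced."""
    if client_version in SUPPORTED_VERSIONS:
        return Result.HANDSHAKE_OK
    return Result.PROTOCOL_NOT_SUPPORTED
```

Keeping the decision in one small function means that adding a new version later only touches the set of supported versions and, if needed, the list of result codes.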
Supporting multiple protocols
The usefulness of this becomes clear when you deploy a new server with an updated protocol. The server can then easily support earlier protocol versions by restricting its replies to what it knows the client understands.
For example, let’s say protocol v.1 defined two possible responses to the protocol version: HANDSHAKE_OK and PROTOCOL_NOT_SUPPORTED.
After some development you decide that you want to reject clients when the server is getting full. You add the error message SERVER_FULL.
You write your server to be compatible with both v.1 and v.2, so when a client with v.2 comes to the full server, it gets SERVER_FULL and can show a nice error message. If an early v.1 client shows up, the server falls back to closing the connection.
Behaviour on server full:
    CLIENT v.1            SERVER
    [v.1]          ->
                          <connection dropped>

    CLIENT v.2            SERVER
    [v.2]          ->
                   <-     SERVER_FULL
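Here is a minimal sketch of that version-dependent behaviour. The numeric value of SERVER_FULL is an assumption (it has not been assigned in this series), and `reply_for` is a hypothetical helper name.

```python
from typing import Optional

HANDSHAKE_OK = 0
PROTOCOL_NOT_SUPPORTED = 1
SERVER_FULL = 2  # assumed value, only understood by v.2 clients


def reply_for(client_version: int, server_full: bool) -> Optional[int]:
    """Return the result code to send, or None to just drop the connection."""
    if client_version not in (1, 2):
        return PROTOCOL_NOT_SUPPORTED
    if not server_full:
        return HANDSHAKE_OK
    if client_version >= 2:
        return SERVER_FULL
    # A v.1 client has no idea what SERVER_FULL means, so we close instead.
    return None
```

The important part is that the server never sends a code the client's version did not define; everything else about how "full" is decided is left out here.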
You can go even further: the response to a v.2 client could use a different serialization format entirely. As long as the initial protocol message stays the same, we can allow arbitrary changes to the rest of the protocol depending on what version the client claims to be.
For professional-grade servers, this is a requirement if you want to be able to upgrade servers in a server cluster without downtime.
Other considerations
We need to handle a couple of errors already, the obvious first one being timeout. The client might for some reason hang and not send its handshake message, the reply might never leave the server, or someone might connect to the server using TELNET. Whatever the reason, we can’t sit and wait.
The server may also, due to some bug or because the client’s settings point it at some other service, not respond with a valid return code.
For all of these errors it’s generally enough to just log the problem and drop the connection, but you may eventually want to add additional measures to protect the server from things like accidental DoS attacks from broken clients that open a lot of connections but never complete the login.
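A minimal sketch of the "log it and drop the connection" approach on the server side, using blocking sockets. The 5 second timeout and the names `read_exact` and `read_client_version` are assumptions for illustration, and the packet layout (a 16-bit big-endian size followed by a 16-bit version) is the one used for the handshake in this series.

```python
import logging
import socket
import struct

HANDSHAKE_TIMEOUT_SECONDS = 5.0  # assumed value, tune for your own server

log = logging.getLogger("handshake")


def read_exact(conn: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes or raise if the peer goes away first."""
    data = b""
    while len(data) < count:
        chunk = conn.recv(count - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-handshake")
        data += chunk
    return data


def read_client_version(conn: socket.socket,
                        timeout: float = HANDSHAKE_TIMEOUT_SECONDS):
    """Return the client's protocol version, or None after logging and dropping."""
    # Note: the timeout applies to each recv call, not to the whole handshake.
    conn.settimeout(timeout)
    try:
        (size,) = struct.unpack(">H", read_exact(conn, 2))
        if size != 2:
            raise ValueError("unexpected handshake size %d" % size)
        (version,) = struct.unpack(">H", read_exact(conn, 2))
        return version
    except (socket.timeout, ConnectionError, ValueError) as exc:
        log.warning("dropping connection during handshake: %s", exc)
        conn.close()
        return None
```

This does nothing about clients that open many connections at once; limiting connections per address would be a separate measure on top of this.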
Our protocol so far:
    CLIENT                SERVER
    00 02 00 01    ->
    ----- -----
      |     |
      |     +--- protocol version 1
      +--- size of packet = 2

                   <-     00 01 00
                          ----- --
                            |    |
       size of packet = 1 --+    |
             HANDSHAKE_OK -------+

    HANDSHAKE_OK = 0
    PROTOCOL_NOT_SUPPORTED = 1
(As you can see, this protocol is big-endian.)
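To make the byte layout concrete, here is a small sketch that builds and parses these two packets with Python's `struct` module. The function names are made up for illustration; the layout (a 16-bit big-endian size, then the payload) is the one used for the handshake packets in this entry.

```python
import struct

HANDSHAKE_OK = 0
PROTOCOL_NOT_SUPPORTED = 1


def encode_handshake(version: int) -> bytes:
    """Client packet: 16-bit size (always 2), then the 16-bit protocol version."""
    return struct.pack(">HH", 2, version)


def decode_handshake(packet: bytes) -> int:
    """Return the protocol version from a client handshake packet."""
    size, version = struct.unpack(">HH", packet)
    if size != 2:
        raise ValueError("unexpected handshake size %d" % size)
    return version


def encode_result(code: int) -> bytes:
    """Server packet: 16-bit size (always 1), then the one-byte result code."""
    return struct.pack(">HB", 1, code)


def decode_result(packet: bytes) -> int:
    """Return the result code from a server reply packet."""
    size, code = struct.unpack(">HB", packet)
    if size != 1:
        raise ValueError("unexpected reply size %d" % size)
    return code
```

With these, `encode_handshake(1)` gives the bytes `00 02 00 01` and `encode_result(HANDSHAKE_OK)` gives `00 01 00`. The `>` in the format strings is what selects big-endian byte order.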
In the next entry we’ll look at login and regular packet serialization.