Years ago, when I first started out with network programming, I didn’t expect that setting up a protocol would be such a problem.
It’s interesting that there’s a lot written about socket programming but next to nothing about protocol design, which is a very important aspect of network programming.
Many networking tutorials show how to set up a multi-user chat, which is fairly worthless in showing how a real protocol should be constructed.
There are different aspects of protocol design, but what I want to focus on are the following areas:
- How to serialize packets.
- How to split the protocol into packet types.
- How to synchronize the client and server’s data.
- Very nice things to have in your protocol.
Narrowing it down a little
There are a lot of different types of networking in games. In some cases messages might be sent over stateless HTTP/HTTPS in some text-based format like XML or JSON, but often you need a persistent connection so the server can push information to the client when necessary.
If you do very little over the wire – no more than a handful of simple messages – you’re unlikely to run into much trouble regardless of how you design things.
This is not intended as the definitive guide to game protocols, but rather as a starting point for people who need a fairly complex protocol but don’t know where to start.
I’m going to limit myself to talking about writing a game protocol for persistent connections over a plain TCP socket, although most of this may be applicable in other situations with relevant modifications.
Building an example protocol
In this series we’ll be developing a protocol for a poker server over TCP. It won’t be anywhere near a complete protocol; it’s just there to provide some context for the concepts presented.
For a simple poker game, the flow is roughly the following:
- Protocol handshake
- Login
- Receive table lists
- Join game table
- Participate in play
- Leave table
Since we use TCP, the very first thing we need to do is delimit messages, because TCP is essentially a stream of bytes with no built-in message boundaries.
Delimiting packets
There are fundamentally two ways to delimit packets with TCP. Either you use a delimiter character (or byte sequence) or use a header to specify the packet size.
If you have a text-based protocol, this delimiter would typically be \n or a zero byte (NUL). The former allows simple use of TELNET to issue commands. Even if your game protocol is binary, you might have an admin console which uses TELNET. Delimiters are easy: just keep filling the input buffer until you reach the delimiter. Just remember to guard against buffer overflows by capping the buffer size, so that a client that never sends the delimiter cannot make the buffer grow without bound.
An essentially binary protocol might also use a text-based login handshake, then switch to binary afterwards. Often this is overkill though – it is better to run the entire protocol with a packet size header.
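As a rough illustration of the delimiter approach, here is a minimal sketch in Python. The `LineBuffer` class, its `feed` method and the 1024-byte cap are all hypothetical names and values chosen for this example, not part of any existing library or of the poker protocol itself.

```python
MAX_LINE = 1024  # assumed cap on message length; pick what suits your protocol


class LineBuffer:
    """Collects received bytes and splits them into delimiter-terminated messages."""

    def __init__(self, delimiter=b"\n", max_line=MAX_LINE):
        self.delimiter = delimiter
        self.max_line = max_line
        self.buffer = bytearray()

    def feed(self, data):
        """Add freshly received bytes and return the complete messages found so far."""
        self.buffer.extend(data)
        messages = []
        while True:
            index = self.buffer.find(self.delimiter)
            if index < 0:
                # No delimiter yet: keep the partial message, but never
                # let it grow past the cap.
                if len(self.buffer) > self.max_line:
                    raise ValueError("message exceeds maximum length")
                break
            if index > self.max_line:
                raise ValueError("message exceeds maximum length")
            messages.append(bytes(self.buffer[:index]))
            # Drop the message and its delimiter from the buffer.
            del self.buffer[:index + len(self.delimiter)]
        return messages
```

Each chunk returned by the socket is passed to `feed`, which hands back zero or more complete messages. When the cap is exceeded, the sensible reaction is usually to drop the connection.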
Packet header
One of the simplest packet designs is an 8-bit or 16-bit unsigned integer holding the payload size, followed by the serialized data payload. It might be tempting to increase this to 24 or 32 bits, but this is a bad idea: queuing packets that might be 16 MB (24 bits) or even 4 GB (32 bits) is a good way to let a bug or a malicious client exhaust the server’s memory.
Besides, delivering graphics or similar heavy-duty data should always be handled carefully, so a protocol that allows such large packets in a single send is asking for trouble. Typically the 255-byte maximum payload of an 8-bit header is sufficient for lightweight protocols, but if you find yourself sending longer lists of players or similar, the 64 KB maximum (65,535 bytes) of a 16-bit header is the better choice.
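To make the header layout concrete, here is a small sketch of the sending side with a 16-bit size header, using Python’s `struct` module. The `frame` function name is made up for this example, and big-endian (network) byte order is an assumption; the only requirement is that client and server agree on it.

```python
import struct

# 16-bit unsigned size header. "!" selects network (big-endian) byte order,
# which is an assumption made for this sketch.
HEADER = struct.Struct("!H")
MAX_PAYLOAD = 0xFFFF  # 65,535 bytes, the largest size a 16-bit header can express


def frame(payload: bytes) -> bytes:
    """Prefix a serialized payload with its size so the receiver can delimit it."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for a 16-bit size header")
    return HEADER.pack(len(payload)) + payload
```

For an 8-bit header, the format string would be `"!B"` and the maximum payload 255 bytes.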
Reading a packet now looks like this:
1. Read 1 (for 8-bit) or 2 (for 16-bit) bytes into a header buffer.
2. From the header bytes, determine the size of the payload.
3. Allocate a buffer to hold the packet payload.
4. Read into this buffer until it is full or the connection is broken.
5. If the buffer was filled, send its contents for processing.
6. Go to 1.
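The read loop above could be sketched like this in Python, assuming a blocking socket, a 16-bit header and big-endian byte order. The helper names `recv_exact` and `read_packets`, and the `handle_packet` callback, are hypothetical and only serve the example.

```python
import struct

# 16-bit unsigned size header in network (big-endian) byte order (an assumption).
HEADER = struct.Struct("!H")


def recv_exact(sock, count):
    """Read exactly count bytes, or return None if the connection closes first."""
    buffer = bytearray()
    while len(buffer) < count:
        # recv may return fewer bytes than requested, so keep asking for the rest.
        chunk = sock.recv(count - len(buffer))
        if not chunk:
            return None  # the peer closed the connection
        buffer.extend(chunk)
    return bytes(buffer)


def read_packets(sock, handle_packet):
    """Read size-prefixed packets until the connection is closed."""
    while True:
        header = recv_exact(sock, HEADER.size)
        if header is None:
            return
        (size,) = HEADER.unpack(header)
        payload = recv_exact(sock, size)
        if payload is None:
            return  # connection broke in the middle of a packet
        handle_packet(payload)
```

Note that a single `recv` call is not guaranteed to return a whole header or a whole payload, which is why both go through `recv_exact`. A server handling many clients would more likely do the same thing with non-blocking sockets and a per-connection buffer, but the steps stay the same.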
Next up
Now that we have our stream parsed into tidy packets, we can start the login process. The next entry is about the initial handshake.