Christoffer Lernö 8:12 am on April 1, 2014
Tags: game programming, server-client, TCP, UDP

Game servers: UDP vs TCP

When writing networked games, the question of UDP vs TCP will eventually come up.

Typically you will hear people say things like: “Unless you’re doing action games, you can use TCP” or “You can use TCP for your MMO, because look at WoW – it uses TCP!”

Unfortunately, these opinions don’t properly reflect the complexity of the TCP/UDP question.

Background

First off, let me state that my background is mainly TCP programming. I worked for years on a leading poker network’s game servers and we’d typically run 4,000 – 10,000 connections on each server instance during peak (with multiple instances running on a single machine) without any problems. From my point of view, TCP is the safe and well-known alternative.

Despite that, our current project is using UDP, and there is no way we could have it work well with TCP. In fact, it started out with TCP, but when it became obvious that we couldn’t get the connection quality we wanted, we switched to UDP.

What TCP means in practice

In theory, the advantages of TCP are things like:

Straightforward persistent connections
Reliable messaging
Arbitrarily sized packets

Anyone with hands-on experience with TCP knows that a solid implementation needs to handle many not-so-obvious corner cases, such as disconnect detection, packet congestion due to slow client response, various DoS attack vectors relating to establishing connections, blocking vs non-blocking IO etc.

Despite the up-front ease of use, a good TCP solution isn’t easy to code.

However, the most damning property of TCP is the congestion control. Basically TCP interprets packet loss as a result of limited bandwidth, and throttles packet sends.

On 3G or WiFi, when a packet is lost you want the replacement to be sent as soon as possible, but TCP congestion control actually does the reverse!

There is no way to get around this; it is just how TCP works at a very fundamental level. This is what can push a ping up into the 1000+ ms range on 3G or WiFi after the loss of a single packet.

Why UDP is “hard”

UDP is both easier and more difficult than TCP.

For example, UDP is packet based, which is something you’ll have to roll yourself on top of TCP. You also use a single socket for all communication, unlike TCP, which requires a socket for each connected client. These are mostly good things.
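To illustrate the “roll it yourself” part: a TCP stream has no message boundaries, so the receiver has to reconstruct them. Below is a minimal sketch in Python, assuming a 2-byte big-endian length prefix. The framing scheme and the names `frame` and `FrameDecoder` are assumptions made for illustration, not taken from any particular game.

```python
import struct

HEADER = struct.Struct("!H")  # assumed framing: 2-byte big-endian payload length


def frame(payload: bytes) -> bytes:
    """Prefix a message with its length so the receiver can find where it ends."""
    return HEADER.pack(len(payload)) + payload


class FrameDecoder:
    """Reassembles whole messages from arbitrary chunks of a TCP byte stream."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add received bytes and return every message that is now complete."""
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # the body of this message has not fully arrived yet
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages
```

Whatever `recv()` returns is passed to `feed()`, which may yield zero, one or several messages per call. With UDP each datagram already is one message, so none of this is needed.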

However, in most situations you actually need some concept of a connection, some rudimentary ordering and often also reliability. None of these is offered by UDP “out of the box”, while you get them for free with TCP.

This is why people often recommend TCP. With TCP you can get started without worrying too much about those things, at least until you start having 500+ simultaneous connections.

So yes, UDP doesn’t offer the whole kit, but as we’ll see, that’s exactly why it’s so great. In a way, TCP is to UDP what something like Hibernate is to writing your queries by hand in SQL.

The flawed case for TCP

People often give the advice to go with TCP on the idea that “TCP is just as fast as UDP” or “successful game X is using it, so it works”, without really understanding why it works in that particular game, and why the case for UDP isn’t about regular packet delivery speed.

So why does World of Warcraft work with TCP? First of all we need to rephrase that question. The question should be “why does World of Warcraft work despite the occasional 1000+ ms delay?”. Because that is the reality of TCP: on dropped packets you’ll get huge lags, as TCP first needs to detect the missing packet and then resend it, all while cutting down throughput.
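Part of the size of that delay comes from TCP’s retransmission timer. As a rough sketch, the code below follows the formulas in RFC 6298, which recommends a minimum timeout of one second and doubling the timeout each time it fires. Real TCP stacks use different minimums and have faster recovery paths such as fast retransmit, so the class name `RtoEstimator`, its parameters and the resulting numbers are illustrative only.

```python
class RtoEstimator:
    """Retransmission timeout (RTO) computed with the RFC 6298 formulas."""

    ALPHA, BETA, K = 1 / 8, 1 / 4, 4  # constants given in RFC 6298

    def __init__(self, min_rto=1.0, clock_granularity=0.001):
        self.min_rto = min_rto                # RFC 6298 recommends 1 second
        self.granularity = clock_granularity  # assumed timer resolution
        self.srtt = None                      # smoothed round-trip time
        self.rttvar = None                    # round-trip time variation
        self.rto = 1.0                        # initial RTO before any sample

    def on_rtt_sample(self, rtt):
        """Update the estimate with a measured round-trip time in seconds."""
        if self.srtt is None:
            self.srtt, self.rttvar = rtt, rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        computed = self.srtt + max(self.granularity, self.K * self.rttvar)
        self.rto = max(self.min_rto, computed)

    def on_timeout(self):
        """The timer fired without an ACK: back off exponentially."""
        self.rto *= 2
```

With a steady 100 ms round trip, the computed timeout is clamped to the 1 s minimum, and if the retransmission is lost as well, the next wait is 2 s. A sender relying on this timer alone therefore waits far longer than one round trip before resending.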

Reliable UDP will also have a delay, but since it’s a property of whatever protocol you write on top of UDP, it’s possible to reduce delays in many ways – unlike TCP, where it’s rolled into the TCP protocol itself and can’t be changed.

[At this point, some people will start talking about Nagle’s algorithm, which is pretty much the first thing you disable in any TCP implementation where latency is important.]
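For reference, disabling Nagle’s algorithm is a single socket option. A minimal Python sketch follows; the function name `make_low_latency_socket` is made up for illustration.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY sends small writes immediately instead of coalescing them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

This only removes the send-side batching delay. It does nothing about retransmission or congestion control, which is where the large lags come from.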

So why does World of Warcraft (and other games) work with these delays?

It’s simply because they’re able to hide the latency.

In the case of World of Warcraft, there are no player-to-player collisions. Such collisions can’t be reliably predicted, but player-to-environment collisions can, so the latter work fine with TCP.

Looking at combat in WoW, it’s easy to realize that commands sent to the servers are really something along the lines of attack_entity(entity_id) or cast_spell(entity_id, spell_id) – in other words, targeting is position independent. Furthermore, things like starting the attack motion or spell effect can be allowed to start without first getting confirmation from the server by showing a “fizzle” effect if the server response differs from the client prediction.

Starting an action before confirmation is a typical latency/lag hiding technique.

A few years back I wrote the client for a card game called Five Card Jazz. It was HTTP-based – which latency-wise is a lot worse than a plain persistent TCP connection.

We used the simple card draw and flip up animation to hide latency so that delays were only apparent in the case of very poor connections. The method was typical: send the request and start the animation drawing cards from the deck, but wait with the final flip up to reveal the cards until the server response arrived. WoW’s battle effects work in a similar manner.
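The pattern can be sketched with Python’s `asyncio`. The function `draw_cards` and its three callbacks are hypothetical stand-ins, not code from Five Card Jazz: the request is sent first, the animation plays while it is in flight, and the reveal waits for whichever finishes last.

```python
import asyncio


async def draw_cards(request_cards, play_draw_animation, reveal):
    """Start the animation at once; reveal only after the server has answered.

    request_cards       -- coroutine function performing the network request
    play_draw_animation -- coroutine function that runs the draw animation
    reveal              -- plain callable that flips the cards face up
    """
    # Send the request first so the round trip overlaps with the animation.
    response_task = asyncio.create_task(request_cards())
    await play_draw_animation()
    # On a good connection the response is already here; on a poor one
    # the player only sees the cards stay face down a little longer.
    cards = await response_task
    reveal(cards)
    return cards
```

The delay the player perceives is the round-trip time minus the animation time, so any response that arrives before the animation ends costs nothing.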

This means that the choice of TCP vs UDP should basically be: “Can we hide latency or not?”

When TCP doesn’t work

A game running TCP either needs to work well with occasional lags (poker clients typically do: an occasional one second lag isn’t something people will get annoyed about), or have good latency mitigation techniques.

But what if you’re running a game where you can’t really apply any latency mitigation? Player vs player action games often fall into this category, but it’s not confined to action games.

An example:

I’m currently working on a multiplayer game (War Arcana).

During typical play, you quickly move your character over a world map initially covered with a fog of war, but which is progressively revealed as you explore.

Due to certain game rules and to prevent cheating, the server can only reveal information about the character’s immediate surroundings. This means that unlike WoW, it’s not possible to fully complete the movement until the server response arrives. What makes this a hard problem, compared to the card reveal of Five Card Jazz, is that we’re allowed a latency of max 500 ms before movement feels sluggish.

When prototyping this, everything worked fine as long as everything was on the same LAN, but as soon as we went to WiFi, the movement would randomly stutter and lag. Writing a few test programs showed the WiFi occasionally dropping packets, and every time that happened, server response time shot up from 100-150 ms to 1000-2000 ms.

No amount of tweaking of TCP settings could get around this issue.

We replaced the TCP code with a custom reliable UDP implementation which cut the penalty of a lost packet down to an additional 50 ms(!) – less than the time of a complete roundtrip. And that was only possible due to having complete control of the reliability layer on top of UDP.

Myth: Reliable UDP is TCP implemented poorly

Have you heard this said: “Reliable UDP is just like TCP, so use TCP instead”?

The problem here is that this statement is false. Reliable UDP is unlikely to implement TCP’s particular brand of congestion control. In fact, this is exactly the biggest reason why you use reliable UDP instead of TCP – to get rid of its congestion control.

Another important point is how the “reliable” part of “Reliable UDP” works. There are many possible variants. I really like many of the ideas of the Quake 3 networking code which inspired the War Arcana UDP protocol.
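As one possible variant, loosely in the spirit of that approach and not the actual War Arcana protocol, the sender can repeat every unacknowledged command in each outgoing datagram until the peer acknowledges it. The sketch below leaves out sockets and binary encoding; the class names and the list-of-tuples “datagram” are assumptions made for illustration.

```python
class ReliableSender:
    """Repeats every unacknowledged command in each outgoing datagram."""

    def __init__(self):
        self.next_id = 1
        self.unacked = {}  # command id -> payload not yet acknowledged

    def queue(self, payload):
        self.unacked[self.next_id] = payload
        self.next_id += 1

    def build_datagram(self):
        # A lost datagram costs one send interval rather than a retransmit
        # timeout, because the next datagram carries the same commands again.
        return sorted(self.unacked.items())

    def on_ack(self, last_received_id):
        # Acknowledgements are cumulative: drop everything up to that id.
        for command_id in [i for i in self.unacked if i <= last_received_id]:
            del self.unacked[command_id]


class ReliableReceiver:
    """Delivers each command exactly once, in order."""

    def __init__(self):
        self.last_received_id = 0  # sent back to the peer as the ack

    def on_datagram(self, commands):
        delivered = []
        for command_id, payload in commands:
            if command_id == self.last_received_id + 1:  # skip duplicates
                delivered.append(payload)
                self.last_received_id = command_id
        return delivered
```

If the game sends a datagram every 50 ms, a dropped one is repaired by the next one without any explicit retransmit request. The price is extra bandwidth, which is acceptable when commands are small.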

You can also use one of the many UDP libraries that support reliable UDP, although the reliability layer might be more general and as such a bit less optimized than a hand-rolled implementation could be.

The bottom line

So UDP or TCP?

Use HTTP/HTTPS over TCP if you are making occasional, client-initiated stateless queries and an occasional delay is ok.
Use persistent plain TCP sockets if both client and server independently send packets but an occasional delay is ok (e.g. Online Poker, many MMOs).
Use UDP if both client and server may independently send packets and occasional lag is not ok (e.g. Most multiplayer action games, some MMOs)

These are mixable too: Your MMO client might first use HTTP to get the latest updates, then connect to the game servers using UDP.
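The rule of thumb above can be written down as a tiny function. The name `choose_transport` and its two boolean parameters are made up for illustration, and the case the list does not cover (client-initiated queries where a delay is not acceptable) is assumed here to fall under UDP.

```python
def choose_transport(client_initiated_only: bool, occasional_delay_ok: bool) -> str:
    """Map the two questions from the list above to a transport choice."""
    if not occasional_delay_ok:
        # Assumption: any case where lag is unacceptable goes to UDP.
        return "UDP"
    if client_initiated_only:
        return "HTTP/HTTPS over TCP"
    return "persistent TCP socket"
```

An online poker client, where both sides send independently and a one second lag is tolerable, gives `choose_transport(False, True)`, which returns `"persistent TCP socket"`.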

Never be afraid of using the best tool for a task.

Daniel Egger 11:54 am on April 1, 2014

Interesting article.
Could you give some links for libraries which provide reliable UDP?
- Christoffer Lernö 12:03 pm on April 1, 2014
  
  Enet (http://enet.bespin.org/) is often mentioned. I used the source code as a reference when I worked on my own custom implementation. There are a lot more, just Google for “udp game library” and you’ll find Enet, RakNet and others.
- Adam Stankiewicz 4:39 pm on April 1, 2014
  
  QUIC from Google? https://en.wikipedia.org/wiki/QUIC
  - Christoffer Lernö 4:53 pm on April 1, 2014
    
    Maybe someday. Meanwhile, why not use something like ENet?
Dan Esparza 1:16 pm on April 1, 2014

I don’t have the experience it sounds like you have to make this call, but is one (TCP or UDP) easier to route across networks than another?
- LE 3:16 pm on April 1, 2014
  
  UDP and TCP work over IP. IP and its related protocols deal with routing; TCP and UDP do not affect routing.
- L'Togue 6:00 pm on April 1, 2014
  
  TCP tends to have fewer problems traversing firewalls/NATs.
Kenji 1:40 pm on April 1, 2014

Great article. One thing though: you said they’re mixable as well. But I read that mixing TCP and UDP at the same time is a bad idea. What do you think about that?
- Christoffer Lernö 2:38 pm on April 1, 2014
  
  It depends.
  
  TCP + UDP is not recommended because they can interfere with each other on a bandwidth-limited connection. If bandwidth is sufficient, *simultaneous* TCP + UDP isn’t a problem.
  
  However, when I speak of mixing, the typical use would be different connections at different points of the client’s lifetime.
  
  For example, a client might authenticate and retrieve a login ticket from a login server using HTTPS, then retrieve game server lists and game updates from a lobby server using HTTP or plain TCP sockets.
  
  Once a game server is selected, the connection to the lobby server is dropped and the client connects to the game using UDP.
  
  In such a setup there’s no chance of the two protocols interfering with each other regardless of bandwidth.
Christopher Lord 2:42 pm on April 1, 2014

Great article! Just a minor nit: something can’t be easy and hard at the same time. I think you mean that UDP is simpler and hence harder, whereas TCP is more complex and hence easier.
- Christoffer Lernö 9:42 am on April 2, 2014
  
  No, I mean it’s easier in some ways and harder in others. Conceptually UDP is simpler than TCP, and if you just want to try throwing packets across a wire to do the first experiments with networking, then that’s easy.
  
  With TCP you need to think about packet delimiters from the beginning, since its stream-like nature is the opposite of what you usually want for a game.
hermione 4:00 pm on April 1, 2014

What about the case when your game is made in HTML5 and you have no choice but to use Websockets, which are implemented in TCP? I am planning on making a RTS, with mechanics similar to Warcraft 3, and I am curious if such a setup would be workable or not. I don’t want to spend a month making a prototype only to figure out that TCP limits me too much to continue.
- Christoffer Lernö 4:26 pm on April 1, 2014
  
  Everything boils down to “can the game survive an occasional one second lag?”
  
  It depends on your particular game, but typically an RTS sends commands that can mostly be predicted client-side and that don’t depend on the exact positions of the opponent’s units.
  
  So I feel that in general, real-time strategy games work fine over TCP, especially if you’re willing to leak some information to the game clients that the players should not see, such as information hidden by fog-of-war.
Roland Dobbins 6:27 pm on April 1, 2014

One thing which bears mentioning is ensuring that UDP-based game stack implementations can’t be leveraged for UDP reflection/amplification DDoS attacks – using nonces, and the like (the NTP monlist fix does this).

If measures aren’t taken to eliminate this problem, not only will your service be abused and perhaps blackholed by a lot of ISPs, but your game servers will end up being DDoSed by miscreants bouncing DDoS attacks through them.

Quake3 servers are infamous for this problem, FYI.
Chris Taylor 7:28 pm on April 1, 2014

It’s good to see someone else who appreciates the advantages of UDP for gaming! I’ve done some work recently to see how erasure codes can be used to improve gameplay over UDP. Please see the slides here: http://dkop.us/ErasureCodesInSoftware.pdf

The main software result is here: http://github.com/catid/shorthair/
dbrower 10:18 pm on April 1, 2014

What about SCTP?
- Christoffer Lernö 2:33 pm on April 3, 2014
  
  For uncommon protocols you’ll need to ensure that your platforms/languages support them. SCTP was defined in 2000 and availability is spotty. I think SCTP got Java support with Java 7, and that was a large improvement for that particular protocol.
  
  Then it’s the question of firewalls/NATs. As people have pointed out you might even have problems with getting *UDP* working in some environments, because things are configured for TCP first.
  
  Once you start using less common protocols, the situation is much worse. This is likely the main reason why Google’s QUIC is built on top of UDP.
  
  Maybe in a few years we will have a protocol that can compete with TCP/UDP and has broad support in firewalls etc. But we’re not there yet.
Bogdan Pou 10:19 pm on April 1, 2014

Impressive article regarding UDP and TCP, thanks for clearing that up. I never knew that a TCP connection could lag so badly.
Valentine Michael Smith 10:50 pm on April 1, 2014

You should consider NACK-Oriented Reliable Multicast (NORM) as an alternative. Despite the name, it’s really a general purpose reliable transport protocol that can be used for messaging, streaming, etc., and in unicast, multicast, etc., modes.

It’s also approved by the IETF (see RFC 5740). It rides on UDP, and provides reliable transport, flow control, and TCP-friendly congestion control. Unlike TCP, not all those things are bundled into one mechanism (i.e., ACK), and each has some configuration options.

Best of all, there is a reference implementation that’s open source and available at no cost from Naval Research Laboratory:

http://www.nrl.navy.mil/itd/ncs/products/norm

Seriously, this is one of the great overlooked network protocols/applications in existence.
leetnightshade 12:40 am on April 2, 2014

In THE BOTTOM LINE, and in a comment you talk about, “using HTTP or plain TCP sockets.” Seems weird to say considering HTTP is run over a reliable stream protocol, typically TCP, they’re not really equivalent to each other.

I get that you’re talking about preferring using an HTTP lib instead of using a TCP socket directly, but considering the article is talking about sockets I don’t think that belongs in the discussion unless you’re just talking about networking options instead of sockets. I just think it’s confusing to throw into the mix at the end of the article when it wasn’t in the discussion to begin with.
- Christoffer Lernö 9:44 am on April 2, 2014
  
  Yes, I agree it might have been confusing.
Jayson Vantuyl (@kagato) 3:12 am on April 2, 2014

If you’re doing RPC-style events like cast_spell, that’s one thing. I call this transmitting “events”. You’re transmitting an event that cannot be lost.

There are other types of information. There is an entire class of information that I call “telemetry”. This is information like position, or fog-of-war for small maps, etc.

UDP retransmits can send newer information. Why would you want to retransmit old position information when you have a current position? If you’re willing to pay attention to the different semantics for different data, UDP can be exceptionally nice for things like “telemetry”.
- Chris Taylor 5:50 pm on April 2, 2014
  
  Well said. Additionally, information about events can sometimes be 99.9% reliable instead of 100% reliable. This means that for low-latency it is possible to use redundancy instead of retransmissions.
  
  A hybrid scheme is also possible for 100% reliable messages, where redundancy can be used to lower the average latency of reliable messages and reduce perceived ploss to ~0.1%, and a [N]ACK-type message can take it the rest of the way. More details in the slides from above: http://dkop.us/ErasureCodesInSoftware.pdf
Zachary Friedman 5:26 am on April 2, 2014

I have some great advice about UDP, but you might not get it.
- Christoffer Lernö 5:03 pm on April 7, 2014
  
  Just keep sending it until I get it.
exim 10:58 am on April 2, 2014

Did you consider T/TCP?
Brad Dillman 12:08 pm on April 2, 2014

Check out QUIC, it incorporates a lot of the ideas behind the reliable UDP motivation in this article. And it has forward error correction, to reduce the frequency of retries after a dropped IP packet.

https://docs.google.com/document/d/1lmL9EF6qKrk7gbazY8bIdvq3Pno2Xj_l_YShP40GLQE/edit
Anon 1:10 pm on April 3, 2014

What about SCTP for games?

http://en.wikipedia.org/wiki/Stream_Control_Transmission_Protocol
glennfiedler 8:29 pm on January 9, 2015

This is super relevant: http://modong.github.io/pcc-page/
- Christoffer Lernö 6:17 pm on January 11, 2015
  
  Wow, that looks awesome.
as Ds 11:52 am on December 8, 2016

I’m playing Blade and Soul on the TW server and using pingzapper. Which should I use, TCP or UDP?
I don’t know much about this, please help me. mail = sdas1472@gmail.com
thanks in advance