~ubuntu-branches/ubuntu/wily/maelstrom/wily : contents of Docs/Technical

~ubuntu-branches/ubuntu/wily/maelstrom/wily : (revision 17)

---
Note:
A great deal of technical development between the Technical_Notes-v1.1
and v2.0 is not documented.  This document starts at the beginning of
the addition of networking to Maelstrom, and finishes with the final
networking algorithm used.
(See Networking.Paper for a more organized description)
---

The hardest thing about this networking project is synchronization.

My initial feeling was to go with TCP for sending synchronization packets,
but running a real-time game at 30 frames per second, I can't have the
slow-start algorithm after a packet loss and the other overhead of TCP
would be wasteful.

My thought was to take advantage of the guaranteed delivery that TCP
offers, since on a 640x480 screen, even one missed frame could have
significant reprocussions in terms of game state.

I also need the speed of UDP.  Ideally, this game would be able to run
over 14,400 bps modem speeds.  Using PPP, this averages between 600
and 900 cps, so including IP headers, we would want no more than 600
bytes of traffic every second.  Using 1 byte synchronization packets,
30 frames per second would require in (1+U+20)*30 bytes throughput
every second. (Where U = #bytes in a UDP header)  That's at least
630 bytes per second, just going one way.  Guaranteed delivery of
synchronization packets requires at least an ack, nearly doubling
the necessary throughput.

The only reason we want such tight synchronization is so that
keypresses arrive on the frame that they need to, and that frame
is the same on every machine in the game.  Since key-presses
average about 10 per second, we can probably get away with only
10 sync packets every second (1 every 3 frames).  We still want
a granularity of a keyboard poll every frame, but we can now do
lazy acks and do guaranteed delivery on 1 sync per second and
each keystroke.

Well, the problem is not if a keystroke is sent to a machine
that is slow, but if a keystroke is sent to a machine that is
fast.  If a fast machine zips through the frames, then a keystroke
at a particular frame will get missed or late, both of which resulting
in corrupt games.  Imagine pressing the turn key on your machine,
pivoting round, and pressing the fire key.  On your machine you
hit the asteroid next to you, and on another fast machine, you
crashed before you were able to hit the fire key.  Instability:
You are dead on one machine, and live on another. :-)

We need synchronization on each frame (actually on each key poll,)
to prevent keystrokes coming in a frame or two too late.

Keystrokes can ride piggyback on sync frames, so there is little
overhead for that...

The alternative is to poll the keyboard every other frame, and only
send sync signals every other frame as well.
... That doesn't work very well... you need the every frame key poll
to respond quickly while firing.


Stop and wait?   -- not exactly. :)

The conclusion?  Networked Maelstrom won't run over the modem (200 ms
ping times are too slow) and we will use UDP with stop-go protocol
for real-time speed.

The next phase is designing a good handshake protocol.
The plan:
	When a new game starts... The checksum server (player 1)
	will send a checksum to all other players.  Each other player
	will, when it receives it, send back the same message.
	How do we guarantee delivery?  We don't, but if the reply
	doesn't make it, the first player will keep sending the NEW_GAME
	packet to that player, and that player will respond, even though
	it's waiting for the first frame packet.


Another possible dropped packet condition is as follows:

	Player 1 sends a frame packet to itself and Player 2.
	Player 2 sends a frame packet to itself and Player 1.
	Player 2 receives the frame packet from Player 1, and itself.
	Player 1 only receives the packet from itself.
	Player 2 continues to the next frame
	Player 1 retransmits the packet from its current frame.

Player 2 is now deadlocked waiting for the packet for frame +1
while Player 1 is still asking for the packet for frame +0.
The solution is for each player to keep the it's last frame
packet cached, so that if it receives a request for the last
frame, it can resend the previous frame's packet, and the game
can continue.

The players can only be one frame off, because each player does
not continue to the next frame until all players have sent the
frame packet for the current frame.  If one player is one frame
behind, all other players will wait for it.

If packets arrive out of order, then an older retransmission can
arrive after a frame has been successfully negotiated.  In this
case, the receiver will send a response if the packet is from
the previous frame, and it will be ignored by the sender, who is
a frame ahead.  Thus consistency is preserved without initiating
packet storms of retransmissions and acks.

Note that this configuration really is peer-to-peer, and the only
time one player is a "server" is when the first player sends the
initial level, number of lives, and random seed to all other players.

I had a great problem with lock-stepping using the above method.
Imagine both players furiously sending frame packets at each other,
and advancing to the next frame.  Then:

	Player 1  (frame 10)		Player 2  (frame 10)
		10 -->				<-- 10
		dropped packet
	Player 1  (frame 10)		Player 2  (frame 11)
		wait				<-- 11
		get 11, timeout
	Player 1  (frame 10)		Player 2  (frame 11)
		10 -->				<-- 10 (cached)
	Player 1  (frame 11)		Player 2 (frame 11)
		11 -->				wait
	Player 1  (frame 11)		Player 2 (frame 12)
		wait				<-- 12
		get 12, timeout

This goes on, repeating, as the two players advance frames in lock-step,
one frame per timeout.

The solution is for Player 1 to immediately resend the sync packet for
it's frame if it receives a packet for the next frame.  That way it doesn't
have to time out.  If it caches the packet for the next frame, the next
time around, it just sends the packet for the next frame, and advances
immediately.  The two players are now back in sync.
This assumes that few packets will be dropped or otherwise lost.
This seems to be a valid assumption.  Even over a 32.6K link to 
Australia, very few packets were lost and though the game was unplayably
slow, dropped, lost, and timed out packets were few and far between.

The new senario:

	Player 1  (frame 10)		Player 2  (frame 10)
		10 -->				<-- 10
		dropped packet
	Player 1  (frame 10)		Player 2  (frame 11)
		get 11, cache, 10 -->		<-- 11
	Player 1  (frame 10)		Player 2  (frame 11)
		get 10				<-- 10 (cached)
	Player 1  (frame 11)		Player 2  (frame 11)
		11 -->				get 11
	Player 1  (frame 12)		Player 2 (frame 12)
		12 -->				<-- 12

The players are now synchronized again.



1 by Christoph Baumann Import upstream version 1.4.3-L3.0.5	1	---
	2	Note:
	3	A great deal of technical development between the Technical_Notes-v1.1
	4	and v2.0 is not documented. This document starts at the beginning of
	5	the addition of networking to Maelstrom, and finishes with the final
	6	networking algorithm used.
	7	(See Networking.Paper for a more organized description)
	8	---
	9
	10	The hardest thing about this networking project is synchronization.
	11
	12	My initial feeling was to go with TCP for sending synchronization packets,
	13	but running a real-time game at 30 frames per second, I can't have the
	14	slow-start algorithm after a packet loss and the other overhead of TCP
	15	would be wasteful.
	16
	17	My thought was to take advantage of the guaranteed delivery that TCP
	18	offers, since on a 640x480 screen, even one missed frame could have
	19	significant reprocussions in terms of game state.
	20
	21	I also need the speed of UDP. Ideally, this game would be able to run
	22	over 14,400 bps modem speeds. Using PPP, this averages between 600
	23	and 900 cps, so including IP headers, we would want no more than 600
	24	bytes of traffic every second. Using 1 byte synchronization packets,
	25	30 frames per second would require in (1+U+20)*30 bytes throughput
	26	every second. (Where U = #bytes in a UDP header) That's at least
	27	630 bytes per second, just going one way. Guaranteed delivery of
	28	synchronization packets requires at least an ack, nearly doubling
	29	the necessary throughput.
	30
	31	The only reason we want such tight synchronization is so that
	32	keypresses arrive on the frame that they need to, and that frame
	33	is the same on every machine in the game. Since key-presses
	34	average about 10 per second, we can probably get away with only
	35	10 sync packets every second (1 every 3 frames). We still want
	36	a granularity of a keyboard poll every frame, but we can now do
	37	lazy acks and do guaranteed delivery on 1 sync per second and
	38	each keystroke.
	39
	40	Well, the problem is not if a keystroke is sent to a machine
	41	that is slow, but if a keystroke is sent to a machine that is
	42	fast. If a fast machine zips through the frames, then a keystroke
	43	at a particular frame will get missed or late, both of which resulting
	44	in corrupt games. Imagine pressing the turn key on your machine,
	45	pivoting round, and pressing the fire key. On your machine you
	46	hit the asteroid next to you, and on another fast machine, you
	47	crashed before you were able to hit the fire key. Instability:
	48	You are dead on one machine, and live on another. :-)
	49
	50	We need synchronization on each frame (actually on each key poll,)
	51	to prevent keystrokes coming in a frame or two too late.
	52
	53	Keystrokes can ride piggyback on sync frames, so there is little
	54	overhead for that...
	55
	56	The alternative is to poll the keyboard every other frame, and only
	57	send sync signals every other frame as well.
	58	... That doesn't work very well... you need the every frame key poll
	59	to respond quickly while firing.
	60
	61
	62	Stop and wait? -- not exactly. :)
	63
	64	The conclusion? Networked Maelstrom won't run over the modem (200 ms
65	ping times are too slow) and we will use UDP with stop-go protocol
66	for real-time speed.
67
68	The next phase is designing a good handshake protocol.
69	The plan:
70	When a new game starts... The checksum server (player 1)
71	will send a checksum to all other players. Each other player
72	will, when it receives it, send back the same message.
73	How do we guarantee delivery? We don't, but if the reply
74	doesn't make it, the first player will keep sending the NEW_GAME
75	packet to that player, and that player will respond, even though
76	it's waiting for the first frame packet.
77
78
79	Another possible dropped packet condition is as follows:
80
81	Player 1 sends a frame packet to itself and Player 2.
82	Player 2 sends a frame packet to itself and Player 1.
83	Player 2 receives the frame packet from Player 1, and itself.
84	Player 1 only receives the packet from itself.
85	Player 2 continues to the next frame
86	Player 1 retransmits the packet from its current frame.
87
88	Player 2 is now deadlocked waiting for the packet for frame +1
89	while Player 1 is still asking for the packet for frame +0.
90	The solution is for each player to keep the it's last frame
91	packet cached, so that if it receives a request for the last
92	frame, it can resend the previous frame's packet, and the game
93	can continue.
94
95	The players can only be one frame off, because each player does
96	not continue to the next frame until all players have sent the
97	frame packet for the current frame. If one player is one frame
98	behind, all other players will wait for it.
99
100	If packets arrive out of order, then an older retransmission can
101	arrive after a frame has been successfully negotiated. In this
102	case, the receiver will send a response if the packet is from
103	the previous frame, and it will be ignored by the sender, who is
104	a frame ahead. Thus consistency is preserved without initiating
105	packet storms of retransmissions and acks.
106
107	Note that this configuration really is peer-to-peer, and the only
108	time one player is a "server" is when the first player sends the
109	initial level, number of lives, and random seed to all other players.
110
111	I had a great problem with lock-stepping using the above method.
112	Imagine both players furiously sending frame packets at each other,
113	and advancing to the next frame. Then:
114
115	Player 1 (frame 10) Player 2 (frame 10)
116	10 --> <-- 10
117	dropped packet
118	Player 1 (frame 10) Player 2 (frame 11)
119	wait <-- 11
120	get 11, timeout
121	Player 1 (frame 10) Player 2 (frame 11)
122	10 --> <-- 10 (cached)
123	Player 1 (frame 11) Player 2 (frame 11)
124	11 --> wait
125	Player 1 (frame 11) Player 2 (frame 12)
126	wait <-- 12
127	get 12, timeout
128
129	This goes on, repeating, as the two players advance frames in lock-step,
130	one frame per timeout.
131
132	The solution is for Player 1 to immediately resend the sync packet for
133	it's frame if it receives a packet for the next frame. That way it doesn't
134	have to time out. If it caches the packet for the next frame, the next
135	time around, it just sends the packet for the next frame, and advances
136	immediately. The two players are now back in sync.
137	This assumes that few packets will be dropped or otherwise lost.
138	This seems to be a valid assumption. Even over a 32.6K link to
139	Australia, very few packets were lost and though the game was unplayably
140	slow, dropped, lost, and timed out packets were few and far between.
141
142	The new senario:
143
144	Player 1 (frame 10) Player 2 (frame 10)
145	10 --> <-- 10
146	dropped packet
147	Player 1 (frame 10) Player 2 (frame 11)
148	get 11, cache, 10 --> <-- 11
149	Player 1 (frame 10) Player 2 (frame 11)
150	get 10 <-- 10 (cached)
151	Player 1 (frame 11) Player 2 (frame 11)
152	11 --> get 11
153	Player 1 (frame 12) Player 2 (frame 12)
154	12 --> <-- 12
155
156	The players are now synchronized again.
157
158