by Spike » Thu Feb 09, 2012 4:43 am
I could try explaining how a NAT works, if you want...
Note that when I say NAT, I mean 'NAPT', or masquerading as Linux refers to it.
True NAT doesn't have nearly as many issues, but it's also a bit more pointless. ^^
If you're sending a packet from host A to D, through NATs B and C, then the only guaranteed way for it to work is if C is set up to explicitly forward packets bound for C:26000 on to D:26000. This is a common situation with home firewalls/users.
A sends 'hey, gimme a connection' towards C, via B as a gateway. B sees a packet leaving the NAT and recognises this as a new connection. It generates a new random source port which isn't used by any other host on the LAN, rewrites the packet's source to B plus that port, and sends it out over the internet to host C (this means that eg 192.168.0.2 and 192.168.0.3 can both bind on the same port [this includes auto-assigned ports] and send to the same destination).
C sees a packet that appears to come from B, bound for port 26000. There are no services running on C; let's say it's just a NAT/firewall box (it might have stuff listening on 192.168.0.1, but that's not open to the internet). So either the user has set up some port forwarding to forward 26000 on to D, or the packet gets dropped. Assuming they did set up port forwarding (if they're hosting the game behind a NAT, they'll need to), then C will register this as a new connection (with a random port), update the destination address, and forward the packet on to D:26000. D receives the packet.
In this example, the packet has so far been sent from A to D... It no longer has either its original source or its original destination address, of course...
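The port-forwarding decision that C makes here can be sketched in a few lines of C. This is only an illustration of the idea, not how any particular router does it: forward_rule_t and apply_port_forward are made-up names, and addresses are plain host-order integers to keep it readable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical static forwarding rule: "public port X goes to LAN host Y:Z". */
typedef struct {
    uint16_t public_port; /* port on the router's internet side, e.g. 26000 */
    uint32_t lan_ip;      /* host behind the router, e.g. D */
    uint16_t lan_port;    /* port on that host, e.g. 26000 */
} forward_rule_t;

/*
 * Looks up the destination port of an unsolicited inbound packet.
 * Returns 1 and rewrites the destination if a rule matches,
 * returns 0 (packet dropped) if nothing forwards that port.
 */
int apply_port_forward(const forward_rule_t *rules, size_t count,
                       uint32_t *dst_ip, uint16_t *dst_port)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (rules[i].public_port == *dst_port) {
            *dst_ip = rules[i].lan_ip;
            *dst_port = rules[i].lan_port;
            return 1;
        }
    }
    return 0;
}
```

With a single rule for 26000, a packet sent to C:26000 gets rewritten to D:26000, while a packet sent to any other port on C matches nothing and is dropped.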
As this is a Quake-specific example, the gameserver on host D receives the packet, understands that it's a connection request from B:RNDNATB, finds a free qsocket object with a system socket bound on an automatic port (probably in the 4k-8k range if it's Windows) and replies to B:RNDNATB via gateway C saying 'you're accepted, by the way use port AUTOSV'. C receives the packet, checks its connection tracker, verifies that it's not a new connection, and changes the src to C:26000 before forwarding to B via the greater internet.
B receives the packet, checks its connection tracker, sees that the local side is actually A:AUTOCL (whatever port the client got autoassigned when it tried to bind to port 0), updates the destination to A:AUTOCL and forwards the packet over the LAN to A.
A then receives a packet from C:26000.
Which tells it to send input packets to C:AUTOSV.
Except that C doesn't know about that port.
And it won't forward the packet. The connection will time out. (The 'workaround' is to disable the firewall/router on host C, and just forward everything to D).
Meanwhile, this is NQ... so D has already started trying to send packets to B:RNDNATB from its new AUTOSV socket. Router C doesn't know the source address (D:AUTOSV), so it creates a new outbound connection, with a new random port.
Guess what...
B doesn't know it.... Connection accepted... Connection timed out. (This particular issue is fixed by the ProQuake NAT fix: the server doesn't send until it receives a packet from the client on the correct port.)
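To make the 'B doesn't know it' part concrete, here's a toy connection tracker in C. It's a sketch under assumptions, not a real NAT: all the names (nat_t, nat_outbound, nat_inbound) are invented for illustration, ports are handed out sequentially rather than randomly and never checked for reuse, entries never time out, and it models the strict kind of NAT that keys each entry on the remote address as well.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical endpoint: IPv4 address and port, host byte order for readability. */
typedef struct { uint32_t ip; uint16_t port; } endpoint_t;

typedef struct {
    int in_use;
    endpoint_t inside;  /* LAN host, e.g. A:AUTOCL */
    endpoint_t remote;  /* internet peer, e.g. C:26000 */
    uint16_t nat_port;  /* port picked on the NAT's public side, e.g. RNDNATB */
} nat_entry_t;

#define NAT_MAX 64

typedef struct {
    uint32_t public_ip;
    uint16_t next_port; /* simplification: sequential, not random */
    nat_entry_t entries[NAT_MAX];
} nat_t;

static int ep_eq(endpoint_t a, endpoint_t b)
{
    return a.ip == b.ip && a.port == b.port;
}

void nat_init(nat_t *n, uint32_t public_ip, uint16_t first_port)
{
    memset(n, 0, sizeof(*n));
    n->public_ip = public_ip;
    n->next_port = first_port;
}

/* Packet leaving the LAN: reuse or create an entry, then rewrite the source.
 * Returns 1 on success, 0 if the table is full. */
int nat_outbound(nat_t *n, endpoint_t *src, endpoint_t dst)
{
    int i, free_slot = -1;
    nat_entry_t *e;
    for (i = 0; i < NAT_MAX; i++) {
        e = &n->entries[i];
        if (!e->in_use) {
            if (free_slot < 0)
                free_slot = i;
            continue;
        }
        if (ep_eq(e->inside, *src) && ep_eq(e->remote, dst)) {
            src->ip = n->public_ip;
            src->port = e->nat_port;
            return 1;
        }
    }
    if (free_slot < 0)
        return 0;
    e = &n->entries[free_slot];
    e->in_use = 1;
    e->inside = *src;
    e->remote = dst;
    e->nat_port = n->next_port++;
    src->ip = n->public_ip;
    src->port = e->nat_port;
    return 1;
}

/* Packet arriving from the internet: only forwarded if a tracked connection
 * matches both the public port and the sender. Returns 0 when dropped. */
int nat_inbound(const nat_t *n, endpoint_t src, endpoint_t *dst)
{
    int i;
    if (dst->ip != n->public_ip)
        return 0;
    for (i = 0; i < NAT_MAX; i++) {
        const nat_entry_t *e = &n->entries[i];
        if (e->in_use && e->nat_port == dst->port && ep_eq(e->remote, src)) {
            *dst = e->inside;
            return 1;
        }
    }
    return 0;
}
```

Run A:AUTOCL -> C:26000 through nat_outbound and the reply from C:26000 gets back in via nat_inbound. A packet from C on any other source port hits the same public port but matches no entry, so it is dropped, which is the 'connection accepted... connection timed out' case.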
By having the server tell the client a port number, you generate issues with NATs.
FTE's NQ server uses the same socket for the client that the connection request was sent to, thus the client always sends to the same port, meaning the NATs use the same conntrack entry for both the request and the connection itself, and all is well... assuming the client doesn't decide to just use a random new socket for the sake of it. As far as I'm aware, DP is identical.
QW servers and clients use a single socket for everything.
There are a few things that I've not mentioned. B:RNDNATB may be the same for all connections with the same A:PORT as a source. This is very useful to reduce the number of connections that need tracking, but for TCP it's not an option, and the NAT may only track connections initiated within the LAN. This means that you can get routers that focus only on TCP and are suboptimal for UDP. This issue also prevents bouncing packets via a third party to punch a hole through both NATs and will ruin many techniques (eg: Skype will have to find some other user's computer to use as a proxy instead).
There's also a class of really idiotic NATs that do not refresh connection timeouts after they're established, and then when a new connection is made, it generates a new random port. This basically means that your client's port number changes every 2 minutes, killing your connection, which is what the qport thing is for in every version of Quake starting with QuakeWorld.
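The idea behind qport can be sketched like this: the client picks an id once and puts it in its packets, and the server matches clients on IP plus that id instead of IP plus source port. This is a minimal sketch of the idea, not the actual QuakeWorld code; client_t and find_client are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical per-client record on the server. */
typedef struct {
    int active;
    uint32_t ip;    /* client's (NAT's) public address */
    uint16_t port;  /* last source port seen; may change if the NAT remaps */
    uint16_t qport; /* id chosen by the client and carried in its packets */
} client_t;

/*
 * Finds the client a packet belongs to by address and qport, ignoring the
 * UDP source port. If the source port changed, the stored port is updated
 * so that replies go to the NAT's new mapping.
 * Returns the client index, or -1 if no client matches.
 */
int find_client(client_t *clients, int count,
                uint32_t from_ip, uint16_t from_port, uint16_t qport)
{
    int i;
    for (i = 0; i < count; i++) {
        if (!clients[i].active)
            continue;
        if (clients[i].ip == from_ip && clients[i].qport == qport) {
            if (clients[i].port != from_port)
                clients[i].port = from_port; /* NAT picked a new port */
            return i;
        }
    }
    return -1;
}
```

So when the NAT swaps the source port mid-game, the packet is still matched to the same client and the server just starts replying to the new port.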
Side note:
A UDP packet contains:
IP: srcip+dstip+fraginfo(id+offset+morefrags)+payload(UDP: srcport+dstport+payload(NQGame:flagsnlen+sequence+svcdata))
Obviously the NAT doesn't look into the UDP's actual payload other than to send the data onwards, but it will peek at the UDP header.
The IP packet's payload starts with the UDP header, which is where the actual ports are stored. This means that if you have a fragmented UDP packet, *only* the first fragment contains the port numbers. This is why so many NATs have problems with fragmentation, and why you should avoid using more than 1450 bytes or whatever it is. This is even more problematic when you have routers that are unable (or refuse) to forward ICMP packets, which means your system never even knows that there's a router somewhere refusing to fragment (there's actually a minimum sane datagram size defined at 576 bytes that every IPv4 host must be able to cope with, but generally everyone uses Ethernet, which is where the 1450 comes from, but beware ATM connections, which fragment at 2 bytes less or so).
TCP generally depends upon ICMP messages in order to detect the MTU properly. Routers that do not forward ICMP (can be a security 'feature') can thus result in TCP connections failing to transfer data (hanging) to servers with a lower MTU beyond the router.
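Here's a rough C sketch of why later fragments are a problem for a NAT: the ports can only be read when the fragment offset in the IPv4 header is zero. The function name udp_ports_from_ipv4 is made up, and this only handles a raw IPv4 header as laid out in the standard (version/IHL in byte 0, flags and fragment offset in bytes 6-7, protocol in byte 9).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Tries to read the UDP source/destination ports from a raw IPv4 packet.
 * Returns 1 and fills in the ports if this packet carries the UDP header,
 * 0 if it is not IPv4/UDP, is too short, or is a non-first fragment
 * (in which case the bytes after the IP header are mid-payload, not ports).
 */
int udp_ports_from_ipv4(const uint8_t *pkt, size_t len,
                        uint16_t *sport, uint16_t *dport)
{
    size_t ihl;
    uint16_t frag_offset;

    if (len < 20 || (pkt[0] >> 4) != 4)
        return 0;                        /* not IPv4 */
    ihl = (size_t)(pkt[0] & 0x0F) * 4;   /* header length in bytes */
    if (ihl < 20 || len < ihl + 8)
        return 0;                        /* no room for a UDP header */
    if (pkt[9] != 17)
        return 0;                        /* protocol 17 is UDP */

    /* low 13 bits of bytes 6-7: offset of this fragment in 8-byte units */
    frag_offset = (uint16_t)(((pkt[6] << 8) | pkt[7]) & 0x1FFF);
    if (frag_offset != 0)
        return 0;                        /* later fragment: no ports here */

    *sport = (uint16_t)((pkt[ihl] << 8) | pkt[ihl + 1]);
    *dport = (uint16_t)((pkt[ihl + 2] << 8) | pkt[ihl + 3]);
    return 1;
}
```

A NAT that wants to translate the later fragments has to remember the first one (matching on the IP id field) or reassemble the whole datagram, and plenty of them don't bother.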
Additional side note:
bind(sock, &addr, sizeof(addr)) binds the socket to an address. If the address is a sockaddr_in with INADDR_ANY then it'll receive packets sent to any interface on the computer, and send packets from whatever interface the system thinks is the best (see the system routing table).
If your client is unable to connect to a DP or FTE server via the host 'localhost', but can on 192.168.1.3 or whatever then your client does not support multi-homed computers (note that FTE needs sv_listen_nq 1, and possibly sv_port 26000 or it'll fail anyway. I don't remember the cvar to get DP to use vanilla protocols).
It seems to me that this is the biggest cause of connection issues discussed on quakeone.com.
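For reference, the bind() call from this side note looks roughly like this when written out in full. It's a minimal sketch for POSIX sockets (Windows would need winsock headers and closesocket instead); open_udp_any is a made-up helper name.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Opens a UDP socket bound to INADDR_ANY, so it receives packets sent to
 * any local interface (loopback included). Passing port 0 lets the system
 * auto-assign one. Returns the socket, or -1 on failure.
 */
int open_udp_any(uint16_t port)
{
    struct sockaddr_in addr;
    int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (sock < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY); /* any interface */
    addr.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```

Binding to one specific interface address instead of INADDR_ANY is what leaves a socket deaf to packets arriving on the other interfaces, localhost included.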
Additional side note:
IPv6 can be enabled by using a hybrid IPv6 socket.
If you call: setsockopt(newsocket, IPPROTO_IPV6, IPV6_V6ONLY, (char *)&_false, sizeof(_false)) on an IPv6 socket then you get a socket that can accept both IPv6 and IPv4 packets at the same time. It only accepts IPv6 addresses, but if you send to an IPv4-mapped address, eg: ::ffff:192.168.0.1, then it'll actually send an IPv4 packet instead. Just beware that your code needs to deal exclusively in IPv6 addresses internally, but still support IPv4 addresses in the user-facing parts. Oh, you'll also need to bind to in6addr_any or you'll not be listening to any IPv4 interfaces anyway, but that shouldn't be an issue.
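Put together, the hybrid socket setup might look something like the following. This is a sketch for POSIX systems, assuming the OS allows IPV6_V6ONLY to be switched off; open_udp_dual and make_v4mapped are hypothetical helper names.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Opens a UDP socket that handles both IPv6 and IPv4. Returns -1 on failure. */
int open_udp_dual(uint16_t port)
{
    struct sockaddr_in6 addr;
    int v6only = 0; /* 0 = also accept IPv4, seen as ::ffff:a.b.c.d */
    int sock = socket(AF_INET6, SOCK_DGRAM, IPPROTO_UDP);
    if (sock < 0)
        return -1;

    if (setsockopt(sock, IPPROTO_IPV6, IPV6_V6ONLY,
                   (char *)&v6only, sizeof(v6only)) < 0) {
        close(sock);
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin6_family = AF_INET6;
    addr.sin6_addr = in6addr_any; /* wildcard, needed to hear IPv4 too */
    addr.sin6_port = htons(port);

    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Builds the IPv4-mapped IPv6 address ::ffff:a.b.c.d for use with sendto(). */
void make_v4mapped(struct sockaddr_in6 *out, struct in_addr v4, uint16_t port)
{
    memset(out, 0, sizeof(*out));
    out->sin6_family = AF_INET6;
    out->sin6_port = htons(port);
    out->sin6_addr.s6_addr[10] = 0xff;
    out->sin6_addr.s6_addr[11] = 0xff;
    memcpy(&out->sin6_addr.s6_addr[12], &v4.s_addr, 4);
}
```

Internally everything stays a sockaddr_in6; an address typed by the user as 192.168.0.1 gets converted with something like make_v4mapped before it goes anywhere near sendto().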