TCP NOTES
The following are announced during the TCP three-way handshake:

  • MSS is announced (not really negotiated, just announced by each side).
  • Window scaling is also announced. Without it the window is capped at 64 kB (the header field is only 16 bits), which is far too small. That's why window scaling is ON almost all of the time.
  • SACK is also announced in the 3-way handshake and is also ON almost all of the time.
    • SACK is like a lookahead acknowledgement while we wait for slow bytes to arrive.
    • Example: “If I received bytes 1, 2, 3, 5, 6 but not 4, I acknowledge 3 and 'selectively acknowledge' 5 and 6.”
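The SACK example can be sketched in a few lines of Python. This is only an illustration: the function name `ack_and_sack` is made up for these notes, the stream is assumed to start at byte 1, and sequence-number wraparound and the limit on SACK blocks per segment are ignored. Note that on the wire the ACK number names the next byte expected, so "acknowledging 3" shows up as ACK = 4.

```python
def ack_and_sack(received):
    """Return (ack_number, sack_blocks) for a collection of received byte numbers.

    ack_number is the next byte expected in order. Each SACK block is a
    (left_edge, right_edge) pair, where right_edge is one past the last
    byte of the block. Assumes the stream starts at byte 1.
    """
    have = set(received)
    ack = 1
    while ack in have:          # walk over the contiguous prefix
        ack += 1
    blocks = []
    for b in sorted(x for x in have if x > ack):
        if blocks and blocks[-1][1] == b:
            # byte b directly extends the previous block
            blocks[-1] = (blocks[-1][0], b + 1)
        else:
            blocks.append((b, b + 1))
    return ack, blocks


ack, sack = ack_and_sack([1, 2, 3, 5, 6])
print(ack, sack)  # 4 [(5, 7)]
```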

The ACK number sent in reply to a segment is that segment's sequence number plus its data length (the SYN and FIN flags each count as one byte).
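As a quick worked example of that rule, here is a small sketch. The helper name `expected_ack` is hypothetical; it assumes the segment arrived in order and wraps at 32 bits like real sequence numbers do.

```python
SEQ_MODULO = 2 ** 32  # sequence numbers are 32-bit and wrap around


def expected_ack(seq, data_len, syn=False, fin=False):
    """ACK number that acknowledges a segment starting at `seq`.

    SYN and FIN each consume one sequence number, like one byte of data.
    """
    consumed = data_len + int(syn) + int(fin)
    return (seq + consumed) % SEQ_MODULO


print(expected_ack(1000, 500))         # 1500
print(expected_ack(0, 0, syn=True))    # 1
```

When reading a capture, this is the number to look for in the reply to check that a given segment was acknowledged.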


  • congestion window (CWND) is a sender-imposed window that was introduced to avoid overrunning routers in the middle of the network path. As ACKs come back, the sender increases the congestion window, i.e. the sender allows itself more outstanding (sent but not yet acknowledged) data.
    • You can't 'get' that value directly from the capture file, as it is NOT ADVERTISED; it lives in the sender.
  • receive window (RWND) is the amount of data the receiver can take at once without getting overwhelmed.
    • Managed by the receiver, who sends out window sizes to the sender. The window sizes announce the number of bytes still free in the receiver buffer, i.e. the number of bytes the sender can still send without needing an acknowledgement from the receiver.
    • It is quite confusing because it is the one that TRAVELS IN THE TCP HEADER. In books it appears as the advertised window, meaning that it is advertised by the receiver.
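How the two windows combine can be sketched as follows: the sender may have at most min(CWND, RWND) bytes outstanding. The function name `usable_window` and the numbers are purely illustrative, not taken from any real stack.

```python
def usable_window(cwnd, rwnd, bytes_in_flight):
    """Bytes the sender may still send right now.

    The effective send window is the smaller of the congestion window
    (sender-side, never on the wire) and the receive window (advertised
    in the TCP header), minus what is already outstanding.
    """
    return max(0, min(cwnd, rwnd) - bytes_in_flight)


# Congestion-limited: cwnd is the smaller window.
print(usable_window(cwnd=14600, rwnd=65535, bytes_in_flight=10000))   # 4600
# Receiver-limited: the advertised window is already full.
print(usable_window(cwnd=100000, rwnd=29200, bytes_in_flight=29200))  # 0
```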



TCP CONGESTION CONTROL ALGORITHMS (Tahoe, Reno, Vegas, Westwood, Cubic and, more recently, BBR) https://medium.com/@atoonk/tcp-bbr-exploring-tcp-congestion-control-84c9c11dc3a9

  • sysctl net.ipv4.tcp_congestion_control # default is usually cubic or reno
  • sysctl net.ipv4.tcp_available_congestion_control # lists the available algorithms
  • sysctl -w net.ipv4.tcp_congestion_control=bbr # BBR (by Google): a redesigned algorithm that reacts to measured bandwidth and latency instead of treating packet loss as the congestion signal
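The sysctl above sets the system-wide default. On Linux the algorithm can also be read per socket through the TCP_CONGESTION socket option, which Python exposes as `socket.TCP_CONGESTION`. A minimal sketch, assuming Linux; the helper name is made up and it returns None where the option is not supported:

```python
import socket


def socket_congestion_control(sock):
    """Return the congestion control algorithm used by `sock`, or None.

    TCP_CONGESTION is Linux-specific, so the constant may be missing
    on other platforms.
    """
    opt = getattr(socket, "TCP_CONGESTION", None)
    if opt is None:
        return None
    try:
        raw = sock.getsockopt(socket.IPPROTO_TCP, opt, 16)
    except OSError:
        return None
    # The kernel returns a NUL-padded name such as b"cubic\x00..."
    return raw.split(b"\x00", 1)[0].decode("ascii")


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    print(socket_congestion_control(s))
```

A new socket normally reports whatever net.ipv4.tcp_congestion_control is set to.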

To test performance:

tc qdisc replace dev enp0s20f0 root netem loss 1.5% latency 70ms # introduces some latency and packet loss

WINDOWING:

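  • MSS and window scaling are announced at the beginning (in the SYN segments). The scale factor is normally around ×128 (shift count 7): the window value in the header is multiplied by it.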
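A scale factor of 128 corresponds to a shift count of 7. The sketch below shows how the header value is scaled and why an unscaled 64 kB window is too small: throughput cannot exceed one window per round trip. The function names are hypothetical and the 70 ms RTT is just the figure from the netem test above.

```python
def scaled_window(raw_window, shift):
    """Real window in bytes: 16-bit header value shifted by the scale count."""
    return raw_window << shift


def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput: one full window per round trip, in bit/s."""
    return window_bytes * 8 / rtt_seconds


print(scaled_window(512, 7))                                       # 65536
# Largest unscaled window (65535 bytes) over a 70 ms path:
print(round(max_throughput_bps(65535, 0.070) / 1e6, 2))            # 7.49 (Mbit/s)
# Same header value with a scale factor of 128:
print(round(max_throughput_bps(scaled_window(65535, 7), 0.070) / 1e6, 1))  # 958.7 (Mbit/s)
```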

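When a TCP port is not available and the connection is rejected, the host normally answers the SYN with a TCP RST packet. An ICMP port unreachable message is what you typically see for UDP, or when a firewall rejects the packet.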
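From the client side, a rejected connection surfaces as "connection refused". A minimal sketch to observe it; the helper name `is_refused` is made up, and the host and port are whatever you want to probe:

```python
import socket


def is_refused(host, port, timeout=2.0):
    """Return True if a TCP connect to host:port is actively refused."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        try:
            s.connect((host, port))
        except ConnectionRefusedError:
            return True    # the connection attempt was rejected
        except OSError:
            return False   # timeout, unreachable network, etc.
        return False       # connection succeeded
```

Running it against a closed port on localhost while capturing with tcpdump shows which packet comes back.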


TCP OPTIMIZATION
https://www.extrahop.com/company/blog/2016/tcp-nodelay-nagle-quickack-best-practices/

  • NAGLE: The aim is to reduce the number of small packets sent over the network: you might want to fill up the truck instead of sending it with just one box. Small writes are held back while earlier data is still unacknowledged, which interacts badly with delayed ACKs. Hence Nagle's algorithm is undesirable in highly interactive environments.
    • The TCP_NODELAY socket option disables Nagle's algorithm, so data is sent as soon as it's available.
  • Delayed ACK: basically a bet by the receiver, lasting 200 - 500 ms, that a new packet will arrive before the delayed ACK timer expires, so that one ACK can cover both. Nagle's algorithm effectively allows only one small unacknowledged packet in flight at any given time, so the sender ends up waiting for an ACK that the receiver is deliberately delaying. This interaction tends to hold back traffic.
    • To disable delayed ACKs, use the TCP_QUICKACK socket option (Linux only; the flag is not permanent, so it may need to be set again later).
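Setting both options from Python could look like the sketch below. The function name `make_low_latency` is made up; TCP_QUICKACK is only attempted where the constant exists, since it is Linux-specific.

```python
import socket


def make_low_latency(sock):
    """Disable Nagle's algorithm and, where available, request quick ACKs."""
    # Send small writes immediately instead of coalescing them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

    quickack = getattr(socket, "TCP_QUICKACK", None)  # Linux-only constant
    if quickack is not None:
        try:
            sock.setsockopt(socket.IPPROTO_TCP, quickack, 1)
        except OSError:
            pass  # not supported on this platform or socket state
    return sock
```

Whether this helps depends on the workload: it suits chatty request/response traffic, while bulk transfers usually do fine with the defaults.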