TCP and UDP: Ports and Connections
The network layer (IP) gets a packet to the right machine. But a machine runs many services at once (web, ssh, database). How does it know which service the packet is for, and how does it ensure data arrives complete and in order? That's the job of the transport layer (layer 4) — with two main protocols: TCP and UDP, plus the concept of a port.
Port: identifying a service on a machine
A port is a 16-bit number (0–65535) that identifies a specific service/connection on a machine. IP gets the packet to the right machine; the port gets it to the right program on that machine.
Ports fall into three ranges:
0 – 1023        Well-known   standard services (need elevated privilege to open)
1024 – 49151    Registered   assigned to specific applications
49152 – 65535   Ephemeral    temporary; clients grab one when opening a connection
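The three ranges can be written down as a small lookup. This is only a sketch: the function name port_range is made up for illustration, and the labels follow the names used in this article.

```python
def port_range(port: int) -> str:
    """Return the name of the range a TCP/UDP port number falls in."""
    if not 0 <= port <= 65535:
        # A port is a 16-bit number, so anything else is invalid.
        raise ValueError(f"port must be between 0 and 65535, got {port}")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "ephemeral"


for p in (22, 443, 5432, 55696):
    print(p, port_range(p))
```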
A few standard ports worth memorizing (per IANA; 3306 and 5432 sit in the registered range, the rest are well-known):
22    SSH     53   DNS    443   HTTPS
25    SMTP    80   HTTP   3306  MySQL
123   NTP                 5432  PostgreSQL
When you go to https://example.com, port 443 is implied; http:// is 80. The server "listens" on its fixed port; the client uses a temporary (ephemeral) port for each connection.
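The "implied port" rule can be sketched in a few lines of Python. The DEFAULT_PORTS table and the effective_port helper are illustrative names, not part of any library, and the table covers only the two schemes mentioned above.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed above (illustrative table).
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        # An explicit ":port" in the URL overrides the scheme default.
        return parts.port
    # Raises KeyError for schemes this small table does not know.
    return DEFAULT_PORTS[parts.scheme]


print(effective_port("https://example.com"))
print(effective_port("http://example.com"))
print(effective_port("http://example.com:8080/status"))
```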
A connection is identified by 4 pieces of information
Here's the key idea. A TCP connection is uniquely identified by four things (a 4-tuple): source IP, source port, destination IP, destination port. View a real connection on your machine (macOS/BSD netstat syntax shown; on Linux, ss -tn lists the same information):
netstat -an -p tcp | grep ESTABLISHED
192.168.71.168.55696 → 34.149.66.137.443   ESTABLISHED
└──source IP─┘ └port┘  └──dest IP──┘ └port┘
The same machine (192.168.71.168) can open many connections to the same server on port 443, as long as the source port differs (55696, 55579...). It's this 4-tuple that lets the machine distinguish hundreds of parallel connections — and it's also how NAT (Article 5) uses ports to track each connection.
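You can watch the 4-tuple at work without leaving your machine. The sketch below opens two connections to a throwaway listener on the loopback address (an assumption chosen so it runs offline; the helper name four_tuple is made up) and prints both tuples. Only the source port should differ.

```python
import socket

# Throwaway listener on loopback; port 0 asks the OS for any free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen()
server_addr = server.getsockname()


def four_tuple(sock):
    """Return (source IP, source port, destination IP, destination port)."""
    src_ip, src_port = sock.getsockname()
    dst_ip, dst_port = sock.getpeername()
    return (src_ip, src_port, dst_ip, dst_port)


# Two parallel connections from the same machine to the same server port.
a = socket.create_connection(server_addr)
b = socket.create_connection(server_addr)
tuple_a = four_tuple(a)
tuple_b = four_tuple(b)
print(tuple_a)
print(tuple_b)

for s in (a, b, server):
    s.close()
```

The OS assigns each client socket its own ephemeral source port, which is what keeps the two connections apart.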
TCP: a reliable connection
TCP (Transmission Control Protocol) provides a reliable, ordered connection: data arrives complete, in sequence, with no duplicates. It does this by numbering every byte, acknowledging (ACK) what it received, and resending what was lost. (The current standard is RFC 9293, released 2022, replacing the classic RFC 793.)
Before transmitting data, TCP establishes a connection via the three-way handshake:
Client                                 Server
│ ──── SYN (seq=x) ──────────────────► │ "I want to connect"
│ ◄─── SYN-ACK (seq=y, ack=x+1) ─────── │ "OK, I'm ready too"
│ ──── ACK (ack=y+1) ────────────────► │ "Acknowledged, let's begin"
│ ═════════ connection ESTABLISHED ════ │
│ ◄────────── two-way data ───────────► │
These three steps (SYN → SYN-ACK → ACK) let both sides synchronize and confirm both are ready. (You can see them yourself with sudo tcpdump -n 'tcp port 443' when you open a web page.) When the work is done, the connection is closed politely with an exchange of FIN packets.
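The sequence and acknowledgment numbers in the diagram follow simple arithmetic, which the toy model below reproduces. It is a simulation only: no packets are sent, the function name and dictionary layout are invented for illustration, and the initial sequence numbers are arbitrary values. Sequence numbers are 32-bit, so the arithmetic wraps around.

```python
MOD = 2**32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as plain dictionaries."""
    # 1. Client proposes its initial sequence number x.
    syn = {"flags": "SYN", "seq": client_isn}
    # 2. Server answers with its own number y and acknowledges x + 1.
    syn_ack = {
        "flags": "SYN-ACK",
        "seq": server_isn,
        "ack": (syn["seq"] + 1) % MOD,
    }
    # 3. Client acknowledges y + 1; its own sequence number is now x + 1.
    ack = {
        "flags": "ACK",
        "seq": syn_ack["ack"],
        "ack": (syn_ack["seq"] + 1) % MOD,
    }
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```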
The cost of reliability: the handshake takes one round trip before any data can be sent — adding latency. This is one reason people optimize (keep-alive connections, HTTP/2, QUIC — Article 8).
TCP connection states
TCP is a "stateful" protocol — each connection moves through a set of states. View the statistics on your machine:
netstat -an -p tcp | awk 'NR>2{print $NF}' | sort | uniq -c | sort -rn
37 ESTABLISHED ← actively transmitting data
20 LISTEN ← server waiting for incoming connections
1 TIME_WAIT ← just closed, waiting a moment to be sure
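The same tally can be done in Python if you want to post-process the output. This sketch assumes netstat-style lines whose last column is the state; the SAMPLE text is made-up illustrative data, not real output, and count_states is a hypothetical helper name.

```python
from collections import Counter

# Made-up sample in the style of `netstat -an -p tcp` output.
SAMPLE = """\
tcp4  0  0  192.168.71.168.55696  34.149.66.137.443  ESTABLISHED
tcp4  0  0  192.168.71.168.55579  34.149.66.137.443  ESTABLISHED
tcp4  0  0  *.22                  *.*                LISTEN
tcp4  0  0  192.168.71.168.55012  34.149.66.137.443  TIME_WAIT
"""


def count_states(netstat_output):
    """Count TCP connections per state (the last column of each line)."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and blank lines; keep only tcp/tcp4/tcp6 rows.
        if fields and fields[0].startswith("tcp"):
            states[fields[-1]] += 1
    return states


for state, n in count_states(SAMPLE).most_common():
    print(f"{n:4d} {state}")
```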
A few states worth knowing when troubleshooting:
- LISTEN: a service is waiting for connections on that port (remember ss -tlnp in Article 13 of the Linux series). No LISTEN = nobody is serving that port.
- ESTABLISHED: the connection is active.
- TIME_WAIT: the connection just closed; the system holds it a moment to handle late-arriving packets. A few TIME_WAITs is normal; a lot of them can signal opening/closing connections too frequently.
- SYN_SENT / SYN_RECV: in the middle of the handshake. Stuck at SYN_SENT usually means the SYN is getting no reply (a firewall silently dropping it, or the host being down). If the host is up but nothing is listening on the port, you typically get an immediate "connection refused" instead.
UDP: fast, no guarantees
UDP (User Datagram Protocol) is the opposite approach: connectionless, no guarantees. It just sends packets out (datagrams) — no handshake, no acknowledgment, no resend if lost. In exchange: fast, low overhead, low latency.
When do you use UDP instead of TCP? When speed/latency matters more than receiving every single packet:
- DNS (Article 7) — a short question/answer; re-asking if lost is still faster than a TCP handshake.
- Video calls, online games, streaming — drop a frame and move on, nobody wants to "rewind"; latency is the enemy.
- QUIC / HTTP/3 — built on UDP to avoid TCP's handshake latency, handling reliability itself at an upper layer.
TCP                             UDP
──────────────────────────────────────────────────────
has a handshake (3 steps)       no handshake
guarantees arrival, in order    no guarantees
resends lost packets            lost is lost
slower (overhead)               fast, lightweight
web, ssh, file, database        DNS, video, game, QUIC
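To see how little ceremony UDP involves, here is a minimal sketch that sends one datagram between two sockets on the loopback address (chosen so it runs offline). There is no connect and no handshake: the sender just calls sendto. The payload and timeout are arbitrary illustrative values.

```python
import socket

# Receiver: bind to loopback; port 0 asks the OS for any free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # do not wait forever if the datagram is lost
receiver_addr = receiver.getsockname()

# Sender: no connect(), no handshake, just fire the datagram off.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"ping", receiver_addr)

# recvfrom returns the payload and the sender's (IP, port).
payload, peer = receiver.recvfrom(1024)
print(payload, "from", peer)

sender.close()
receiver.close()
```

If the datagram were dropped on a real network, recvfrom would simply time out; any retry logic is the application's job.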
Checking a port with nc
Is a port open (a service listening)? nc (netcat) tries to connect:
nc -z -G 3 1.1.1.1 443
Connection to 1.1.1.1 port 443 [tcp/https] succeeded!
-z only checks (sends no data); -G 3 sets a 3-second connection timeout in the macOS nc (other netcat builds typically use -w for this). "succeeded" = the port is open, a service is listening (the TCP handshake succeeded). This is a quick tool to check "is a service listening on that port / is a firewall blocking it" (Article 12).
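The same check is easy to script. The sketch below is a rough Python counterpart to nc -z, not a reimplementation of it: port_is_open is a hypothetical helper that reports whether a TCP handshake completes within a timeout. The demonstration uses a throwaway loopback listener so it runs offline.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP handshake with host:port completes in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refusal (nothing listening), timeouts (packets dropped),
        # and unreachable or unresolvable hosts.
        return False


# Offline demonstration against a throwaway loopback listener.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
demo_port = listener.getsockname()[1]

open_result = port_is_open("127.0.0.1", demo_port)    # listener is up
listener.close()
closed_result = port_is_open("127.0.0.1", demo_port)  # listener is gone
print(open_result, closed_result)
```

Calling port_is_open("1.1.1.1", 443) mirrors the nc command above. Note that the function cannot tell "nothing listening" apart from "firewall blocking"; both come back as False.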
Wrap-up
The transport layer gets data to the right service via a port (16-bit; well-known < 1024), and a connection is identified by a 4-tuple (source/destination IP+port). TCP gives a reliable, ordered connection through the three-way handshake (SYN/SYN-ACK/ACK) and moves through states (LISTEN, ESTABLISHED, TIME_WAIT...) — used for web/ssh/database. UDP is connectionless, no guarantees but fast — used for DNS/video/game. netstat/ss view connections, nc -z checks a port.
DNS mostly runs over UDP and is the first leg of every request (Article 0). Article 7 digs into it: how a domain name is resolved into an IP.