PORT VS SOCKET
Source: Dev.to
What Is a Port?
A port is just a number (0–65535) that identifies a service on a machine.
- IP address → identifies the machine
- Port → identifies the application inside the machine
Common ports
- 22 → SSH
- 80 → HTTP
- 443 → HTTPS
- 3306 → MySQL
When you see 192.168.1.10:443 it means:
- Machine IP = 192.168.1.10
- Service = running on port 443
A port by itself does NOT mean a connection exists. At most, an open port means that some process is listening on it.
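To make the IP/port split concrete, here is a minimal sketch of how the example address 192.168.1.10:443 could be packed into the struct sockaddr_in that the sockets API expects. The helper name make_endpoint is hypothetical; inet_pton() and htons() are the standard calls.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill `out` with an IPv4 address and a port.
 * Returns 0 on success, -1 if `ip` is not a valid dotted-quad string. */
int make_endpoint(const char *ip, uint16_t port, struct sockaddr_in *out)
{
    assert(ip != NULL && out != NULL);

    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons(port);   /* 16-bit port, stored in network byte order */

    if (inet_pton(AF_INET, ip, &out->sin_addr) != 1)
        return -1;
    return 0;
}
```

Note that filling in this structure only describes an endpoint. Nothing is opened or contacted until it is passed to a call such as bind() or connect().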
What Is a Socket?
A socket is a full communication endpoint. It includes:
- IP address
- Port
- Protocol (TCP/UDP)
A real TCP connection is uniquely identified by the 5‑tuple:
Source IP + Source Port + Destination IP + Destination Port + Protocol
Example
- Client: 10.0.0.5:51512
- Server: 192.168.1.10:443
- Protocol: TCP
That 5‑tuple defines one unique connection.
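As a rough illustration of why all five fields matter, the sketch below models the tuple as a plain struct and compares two of them. The type five_tuple and the function same_connection are illustrative names only, not kernel structures.

```c
#include <assert.h>
#include <netinet/in.h>   /* IPPROTO_TCP, IPPROTO_UDP */
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Illustrative type: the five fields that together identify one
 * connection. Addresses are kept as text for readability. */
struct five_tuple {
    char     src_ip[16];
    uint16_t src_port;
    char     dst_ip[16];
    uint16_t dst_port;
    int      protocol;     /* IPPROTO_TCP or IPPROTO_UDP */
};

/* Two tuples describe the same connection only if every field matches. */
bool same_connection(const struct five_tuple *a, const struct five_tuple *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a->src_ip, b->src_ip) == 0 &&
           a->src_port == b->src_port &&
           strcmp(a->dst_ip, b->dst_ip) == 0 &&
           a->dst_port == b->dst_port &&
           a->protocol == b->protocol;
}
```

A second browser tab on the same client talking to the same server differs only in its source port, which is enough to make it a separate connection.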
| Port | Socket |
|---|---|
| Just a number | Full communication endpoint |
| Identifies a service | Identifies a connection |
| Exists without traffic | Exists during communication |
How Is a Socket Created?
Client Side
When a browser connects to HTTPS:
- socket() – Application asks the kernel to create a socket.
  - Kernel allocates a socket structure in memory.
  - Returns a file descriptor.
- connect() – Kernel assigns an ephemeral port (e.g., 51512) and initiates the TCP 3‑way handshake: SYN → SYN‑ACK → ACK. After the handshake the connection becomes ESTABLISHED.
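The two client-side steps could be sketched in C roughly as follows. The helper name tcp_connect is hypothetical, and reading the port back with getsockname() is added here only to make the kernel-chosen ephemeral port visible.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect to ip:port over TCP.
 * On success returns the connected fd and stores the ephemeral source
 * port chosen by the kernel in *local_port. Returns -1 on failure. */
int tcp_connect(const char *ip, uint16_t port, uint16_t *local_port)
{
    assert(ip != NULL && local_port != NULL);

    struct sockaddr_in dst;
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
        return -1;

    /* Step 1: the kernel allocates a socket and hands back a descriptor. */
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Step 2: bind() was never called, so the kernel picks an ephemeral
     * port and completes the three-way handshake before returning. */
    if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
        close(fd);
        return -1;
    }

    /* Ask the kernel which local port it assigned. */
    struct sockaddr_in src;
    socklen_t len = sizeof(src);
    if (getsockname(fd, (struct sockaddr *)&src, &len) < 0) {
        close(fd);
        return -1;
    }
    *local_port = ntohs(src.sin_port);
    return fd;
}
```

The caller owns the returned descriptor and is expected to close() it when done.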
Server Side
When Nginx starts:
- socket() – Creates a socket.
- bind() – Binds it to a local address and port (e.g., 443).
- listen() – Marks the socket as listening.
- accept() – When a client connects, the kernel creates a new socket for that client while the listening socket remains open.
One new socket per client.
If 10,000 clients connect → 10,000 sockets.
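The server-side sequence, including accept() handing back a separate descriptor per client, might look like this in C. The names tcp_listen and accept_client are hypothetical, and binding to 127.0.0.1 instead of a public address is an assumption made so the sketch is safe to run locally.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: listening TCP socket on 127.0.0.1:port.
 * Pass port 0 to let the kernel choose a free port. Returns fd or -1. */
int tcp_listen(uint16_t port, int backlog)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);          /* socket() */
    if (fd < 0)
        return -1;

    /* Optional: allow quick restarts while old connections sit in TIME_WAIT. */
    int one = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||   /* bind() */
        listen(fd, backlog) < 0) {                                 /* listen() */
        close(fd);
        return -1;
    }
    return fd;
}

/* Blocks until a client connects, then returns a NEW descriptor for that
 * client. The listening descriptor stays open for further clients. */
int accept_client(int listen_fd)
{
    assert(listen_fd >= 0);
    return accept(listen_fd, NULL, NULL);                          /* accept() */
}
```

A real server would call accept_client() in a loop (or from an event loop) and close each client descriptor when that conversation ends.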
Who Manages the Socket?
The Linux kernel TCP/IP stack manages:
- TCP state (SYN_SENT, ESTABLISHED, TIME_WAIT, …)
- Send/receive buffers
- Sequence numbers
- Congestion control
- Memory allocation
Applications only:
- read
- write
- close
Everything else is the kernel’s responsibility.
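One way to see that this state lives in the kernel is to ask for it. The sketch below reads the sizes of the kernel-side send and receive buffers with getsockopt(); the helper name socket_buffer_sizes is hypothetical, and the values returned depend on the system's configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: read the kernel's buffer sizes (in bytes) for a
 * socket. Returns 0 on success, -1 on error (e.g. fd is not a socket). */
int socket_buffer_sizes(int fd, int *send_bytes, int *recv_bytes)
{
    assert(send_bytes != NULL && recv_bytes != NULL);

    socklen_t len = sizeof(int);
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, send_bytes, &len) < 0)
        return -1;

    len = sizeof(int);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, recv_bytes, &len) < 0)
        return -1;

    return 0;
}
```

The application never allocates these buffers itself. It only hands bytes to write() and takes bytes from read(), and the kernel does the queuing.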
Why Does a Socket Use a File Descriptor?
Sockets do not write to disk, yet they use file descriptors (FDs) because, in Unix/Linux, “Everything is a file.”
Linux exposes all of the following through file descriptors:
- Files
- Sockets
- Pipes
- Terminals
- Devices
- epoll, eventfd, etc.
What Is a File Descriptor Actually?
A file descriptor is:
- Just an integer
- Index into a per‑process table
- Points to a kernel object
Standard descriptors
| FD | Meaning |
|---|---|
| 0 | stdin |
| 1 | stdout |
| 2 | stderr |
| 3 | First opened socket/file |
Example (C)
int fd = socket(AF_INET, SOCK_STREAM, 0);
The kernel:
- Creates a socket object.
- Stores it in the process’s FD table.
- Returns a small integer (the handle).
That integer is merely a reference; it does not imply a disk file.
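To check what a given integer actually refers to, a program can ask the kernel with fstat(). The following is a small sketch; fd_kind and show_descriptors are hypothetical names, and the descriptor numbers printed will vary with what the process already has open.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes S_ISSOCK/S_ISFIFO under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: name the kind of kernel object behind a descriptor. */
const char *fd_kind(int fd)
{
    struct stat st;
    if (fstat(fd, &st) < 0)
        return "invalid";
    if (S_ISSOCK(st.st_mode)) return "socket";
    if (S_ISFIFO(st.st_mode)) return "pipe";
    if (S_ISCHR(st.st_mode))  return "character device";
    if (S_ISREG(st.st_mode))  return "regular file";
    if (S_ISDIR(st.st_mode))  return "directory";
    return "other";
}

/* Usage sketch: open a socket and a pipe, then print what each integer
 * refers to. Both are plain ints from the same per-process table. */
void show_descriptors(void)
{
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    int pipe_fds[2];
    int rc = pipe(pipe_fds);
    assert(sock >= 0 && rc == 0);
    assert(strcmp(fd_kind(sock), "socket") == 0);

    printf("fd %d -> %s\n", sock, fd_kind(sock));
    printf("fd %d -> %s\n", pipe_fds[0], fd_kind(pipe_fds[0]));
    printf("fd %d -> %s\n", STDIN_FILENO, fd_kind(STDIN_FILENO));

    close(sock);
    close(pipe_fds[0]);
    close(pipe_fds[1]);
}
```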
Why Reuse the File Descriptor Mechanism?
It provides a unified API:
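read(fd, buf, len);                        /* receive or read bytes */
write(fd, buf, len);                       /* send or write bytes */
close(fd);                                 /* release the descriptor */
poll(fds, nfds, timeout_ms);               /* wait on an array of struct pollfd */
epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);   /* register fd with an epoll instance (not for regular files) */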
The same read, write, and close syscalls work for files, sockets, and pipes, so I/O needs no separate “network API”; only setup calls such as socket(), bind(), and connect() are socket-specific. This abstraction is extremely powerful.
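A short sketch of that uniformity: the function below uses only write() and read(), and works unchanged whether it is handed the two ends of a pipe or a pair of connected sockets. The name send_and_receive is hypothetical, and the sketch assumes short messages that arrive in a single read().

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `msg` to wfd, then read it back from rfd.
 * Nothing here knows whether the descriptors are a pipe or a socket.
 * Returns the number of bytes read, or -1 on error. */
ssize_t send_and_receive(int wfd, int rfd, const char *msg,
                         char *out, size_t out_size)
{
    assert(msg != NULL && out != NULL && out_size > 0);

    size_t len = strlen(msg);
    if (write(wfd, msg, len) != (ssize_t)len)
        return -1;

    ssize_t n = read(rfd, out, out_size - 1);
    if (n < 0)
        return -1;

    out[n] = '\0';   /* make the result printable as a C string */
    return n;
}
```

With a pipe created by pipe(p), the call would be send_and_receive(p[1], p[0], ...); with socketpair(AF_UNIX, SOCK_STREAM, 0, sv), it would be send_and_receive(sv[0], sv[1], ...).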
Why This Matters in Real Systems
- High‑traffic services may have 50,000 concurrent connections → 50,000 sockets → 50,000 file descriptors.
- The error “Too many open files” usually indicates you have exhausted file descriptors, not disk files.
Check limits with:
ulimit -n
- Containers and Kubernetes pods share the node kernel, so node‑wide FD limits matter.
- Socket exhaustion (e.g., many sockets lingering in TIME_WAIT and tying up ephemeral ports) can kill throughput.
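A process can also read its own descriptor limit programmatically instead of shelling out to ulimit. This is a minimal sketch using getrlimit(); the helper name fd_limits is hypothetical, and it reports only the per-process limit, not the system-wide ceiling (which on Linux is exposed in /proc/sys/fs/file-max).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: per-process open-file limits. The soft limit is
 * what `ulimit -n` reports; the hard limit is the ceiling the soft limit
 * may be raised to. Returns 0 on success, -1 on error. */
int fd_limits(unsigned long long *soft, unsigned long long *hard)
{
    assert(soft != NULL && hard != NULL);

    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
        return -1;

    *soft = (unsigned long long)rl.rlim_cur;
    *hard = (unsigned long long)rl.rlim_max;
    return 0;
}
```

A server expecting 50,000 concurrent connections would want the soft limit comfortably above that figure, since listening sockets, log files, and pipes consume descriptors too.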
Visual Summary
| Concept | What It Represents |
|---|---|
| Port | Service identifier (no active communication) |
| Socket | Kernel object representing a live connection (contains TCP state + buffers) |
| File Descriptor | Integer handle pointing to a kernel object, used for unified I/O abstraction |
Final Mental Model
- IP = Building
- Port = Door
- Socket = Active phone call between two doors
- File Descriptor = The call reference number the OS uses internally