Performance Tuning: Linux Kernel Optimizations for 10k+ Connections

Published: December 20, 2025 at 04:30 PM EST
6 min read
Source: Dev.to

Introduction

In high‑concurrency real‑time architectures, the performance bottleneck inevitably shifts from the application layer to the operating system. A well‑optimized Flask‑SocketIO application running on Gevent or Eventlet can theoretically handle tens of thousands of concurrent connections. However, in a default Linux environment, such an application will usually crash or stop accepting connections long before CPU or memory resources are saturated.

This plateau occurs because the Linux kernel, out of the box, is tuned for general‑purpose computing, not for acting as a massive termination point for persistent TCP connections. For a WebSocket server—where connections are long‑lived and stateful—resource exhaustion manifests as:

  • File descriptor limits
  • Ephemeral‑port starvation
  • TCP‑stack congestion

This article outlines the specific kernel‑level tuning required to scale Flask‑SocketIO beyond the 10 000‑connection barrier.

File Descriptors

In Unix‑like operating systems, “everything is a file.” This includes TCP sockets. When a client connects to your server, the kernel allocates a file descriptor (FD) to represent that socket.

  • By default, most Linux distributions enforce a strict limit of 1024 open file descriptors per process – a legacy constraint.
  • For a WebSocket server, this means that after roughly 1 000 concurrent users (plus a few descriptors for log files and shared libraries), the application will stop accepting connections and raise
OSError: [Errno 24] Too many open files

The kernel distinguishes between:

  • Soft limit – user‑configurable ceiling.
  • Hard limit – absolute ceiling set by root.

Verification

ulimit -n
# → 1024
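
You can also query the limits from inside the Python process itself; a minimal sketch using the standard‑library resource module:

import resource

# Soft and hard limits on open file descriptors for this process
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft={soft}, hard={hard}")  # e.g. soft=1024, hard=4096 on an untuned host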

Remediation

System‑wide (/etc/security/limits.conf):

* soft nofile 65535
* hard nofile 65535

systemd Service (/etc/systemd/system/app.service):

systemd does not read /etc/security/limits.conf for services (PAM limits apply only to login sessions); you must define the limit explicitly in the unit file:

[Service]
LimitNOFILE=65535
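
If you cannot edit the unit file, a process may also raise its own soft limit up to the current hard limit at startup; a minimal sketch (raising the hard limit itself requires root):

import resource

# Raise this process's soft FD limit to its hard limit at startup.
# Only root can raise the hard limit beyond its current value.
_, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))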

Ephemeral Ports

While file descriptors limit incoming connections, ephemeral ports limit outgoing connections. This distinction is critical for Flask‑SocketIO architectures that rely on a message broker like Redis.

When the Flask app connects to Redis (or Nginx connects to your upstream Flask/Gunicorn workers), it opens a TCP socket. The kernel assigns a local port from the ephemeral‑port range.

  • Default range is often narrow (e.g., 32768–60999), providing only ~28 000 ports.
  • In high‑throughput scenarios—e.g., the Flask app publishing aggressively to Redis or Nginx proxying massive traffic—the server can run out of available local ports.
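
As a quick sanity check, the sketch below reads the configured range straight from /proc and computes the number of usable local ports per destination:

# Read the ephemeral port range and count usable local ports
with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
    low, high = map(int, f.read().split())
print(f"range: {low}-{high} ({high - low + 1} ports per destination)")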

Symptoms

  • EADDRNOTAVAIL (Cannot assign requested address) errors in logs.
  • Sudden inability of the Flask app to talk to Redis, despite Redis being healthy.
  • Nginx returning 502 Bad Gateway because it cannot open a socket to the upstream.

Tuning

# Check current range
sysctl net.ipv4.ip_local_port_range

Add to /etc/sysctl.conf to expand the range:

net.ipv4.ip_local_port_range = 1024 65535

Apply the change:

sudo sysctl -p

TIME_WAIT State

The most misunderstood aspect of TCP scaling is the TIME_WAIT state. When a TCP connection is closed, the side that initiated the close enters TIME_WAIT for 2 * MSL (Maximum Segment Lifetime), typically 60 seconds. This ensures that delayed packets are handled correctly and not mistaken for a new connection on the same port.

In a high‑churn environment (e.g., clients constantly refreshing pages or reconnecting), the server can accumulate tens of thousands of sockets in TIME_WAIT. These sockets:

  • Consume system resources.
  • Lock up the 4‑tuple (source IP, source port, dest IP, dest port), preventing new outgoing connections.
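
To quantify the problem, you can count sockets per state by parsing /proc/net/tcp (IPv4 only); a minimal sketch, where hex state 01 is ESTABLISHED and 06 is TIME_WAIT:

# Count IPv4 sockets per TCP state via /proc/net/tcp
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT"}

counts = {}
with open("/proc/net/tcp") as f:
    next(f)  # skip the header line
    for line in f:
        st = line.split()[3]  # 4th column is the hex state code
        name = TCP_STATES.get(st, st)
        counts[name] = counts.get(name, 0) + 1
print(counts)  # e.g. {'ESTABLISHED': 9500, 'TIME_WAIT': 31000, ...}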

tcp_tw_recycle – Do NOT use

Older guides suggested enabling net.ipv4.tcp_tw_recycle. It was removed in Linux kernel 4.12 because it breaks connections for users behind NAT by aggressively dropping out‑of‑order packets.

tcp_tw_reuse – Safe alternative

net.ipv4.tcp_tw_reuse allows the kernel to reclaim a TIME_WAIT socket for a new outgoing connection if the new connection’s timestamp is strictly greater than the last packet seen on the old connection. This is safe for most internal infrastructure (e.g., Flask ↔ Redis).

Configuration (/etc/sysctl.conf):

# Allow reuse of TIME_WAIT sockets for new outgoing connections
# (relies on TCP timestamps, enabled by default via net.ipv4.tcp_timestamps)
net.ipv4.tcp_tw_reuse = 1

Apply:

sudo sysctl -p

Benchmarking WebSockets

Standard HTTP benchmarking tools like ab (Apache Bench) are useless for WebSockets. They measure requests per second, whereas the primary metric for WebSockets is concurrency (simultaneous open connections) and message latency.

  • Artillery – supports WebSocket scenarios.
  • Locust – can be scripted for persistent connections.

Test methodology

  1. Ramp‑up – Don’t connect 10 k users instantly; this triggers “thundering‑herd” protection or SYN‑flood defenses. Ramp up over minutes.
  2. Sustain – Hold the connections open for an extended period.
  3. Broadcast – While connections are held, trigger a broadcast event to measure the latency of the Redis back‑plane and the Nginx proxy buffering.
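
As an illustration of the ramp‑up/sustain phases, here is a minimal asyncio sketch using the third‑party websockets package; the endpoint URL and pacing values are placeholders, not part of the original setup:

import asyncio
import websockets  # pip install websockets

URL = "ws://localhost:5000/socket.io/?EIO=4&transport=websocket"  # hypothetical endpoint
TARGET = 10_000       # connections to reach
RAMP_DELAY = 0.01     # 10 ms between connects ≈ 100 new connections/s
HOLD_SECONDS = 300    # sustain phase

async def hold_connection():
    # Open one WebSocket and keep it alive for the sustain phase
    async with websockets.connect(URL):
        await asyncio.sleep(HOLD_SECONDS)

async def main():
    tasks = []
    for _ in range(TARGET):
        tasks.append(asyncio.create_task(hold_connection()))
        await asyncio.sleep(RAMP_DELAY)  # ramp up instead of connecting all at once
    await asyncio.gather(*tasks, return_exceptions=True)

asyncio.run(main())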

Interpretation

Failure point               Likely cause
~1024 users                 File‑descriptor limit still in effect
~28 000 users               Ephemeral‑port range exhausted
>30 000 TIME_WAIT sockets   Churn problem or missing tcp_tw_reuse

Observability

Observability is the only way to confirm that kernel tuning is effective. When running high‑concurrency workloads, monitor specific OS‑level metrics.

What to watch

  • process_open_fds for the Gunicorn/uWSGI process – if this line flattens at a specific number (e.g., 1024 or 4096) while CPU is low, you have hit a hard limit.
  • Socket state counts – ESTABLISHED, TIME_WAIT, etc.
    • ESTABLISHED should match your active user count.
    • TIME_WAIT spikes to 30 k+ → churn problem or need tcp_tw_reuse.
  • Allocated sockets – from /proc/net/sockstat or the summary in ss -s.

Example commands:

# Open file descriptors used by the process (replace <pid>)
ls /proc/<pid>/fd | wc -l

# Socket statistics
ss -s

# Detailed socket list (filter by state)
ss -tan state established | wc -l
ss -tan state time-wait | wc -l
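
The same FD count is available from inside the application, e.g. for a health‑check endpoint; a small sketch reading the process's own /proc entry:

import os

# Number of file descriptors currently open in this process
fd_count = len(os.listdir("/proc/self/fd"))
print(f"open fds: {fd_count}")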

Connection Tracking (nf_conntrack)

If you are using iptables or Docker, the kernel's nf_conntrack table caps how many connections the firewall can track; once the table is full, new packets are dropped.

# Check kernel log for conntrack table overflow
dmesg | grep "nf_conntrack: table full, dropping packet"

Tune (example):

sysctl -w net.netfilter.nf_conntrack_max=131072
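
To watch how close the table is to overflowing, a small sketch that compares current usage against the ceiling (these files exist only while the conntrack module is loaded):

# Compare tracked connections against the conntrack ceiling
def read_int(path):
    with open(path) as f:
        return int(f.read())

count = read_int("/proc/sys/net/netfilter/nf_conntrack_count")
limit = read_int("/proc/sys/net/netfilter/nf_conntrack_max")
print(f"conntrack: {count}/{limit} ({100 * count / limit:.0f}% full)")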

Risks of Aggressive Kernel Tuning

  • Security – Expanding the ephemeral port range makes port scanning slightly easier (negligible inside a private VPC).
  • Stability – Setting file‑descriptor limits too high (e.g., millions) can let a memory leak in the application crash the entire server rather than just the process.
  • Connection tracking – Increasing nf_conntrack_max consumes kernel memory (RAM). Ensure the server has enough RAM to store the state of 100 k+ tracked connections.

Golden Rule:
Never apply sysctl settings blindly. Deploy them via configuration‑management tools (Ansible, Terraform), document why they are needed, and validate with load testing.

Scaling Flask‑SocketIO to 10 000+ Connections

Achieving high‑concurrency is as much a systems‑engineering problem as a software one. The default Linux configuration is conservative—geared toward desktops or low‑traffic servers. By systematically addressing:

  1. File‑descriptor limits (ulimit)
  2. Ephemeral port range (net.ipv4.ip_local_port_range)
  3. TCP TIME‑WAIT reuse (net.ipv4.tcp_tw_reuse)

you unlock the OS’s ability to handle many simultaneous sockets.

Production‑Readiness Checklist

  • File‑descriptor limit – ulimit -n > 65535 for the Gunicorn process.
  • Ephemeral port range – net.ipv4.ip_local_port_range = 1024 65535.
  • TCP TIME_WAIT reuse – net.ipv4.tcp_tw_reuse = 1.
  • TCP TIME_WAIT recycle – net.ipv4.tcp_tw_recycle = 0 (or the key absent on kernels ≥ 4.12).
  • Conntrack table – increase nf_conntrack_max if using stateful firewalls.

Example sysctl Configuration

# /etc/sysctl.d/99-flask-socketio.conf
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_reuse = 1
# tcp_tw_recycle was removed in kernel 4.12; keep this line only on older kernels
net.ipv4.tcp_tw_recycle = 0
net.netfilter.nf_conntrack_max = 131072

Apply the changes:

sudo sysctl --system

Remember:

  • Monitor memory usage after raising nf_conntrack_max.
  • Keep an eye on the number of open file descriptors (lsof, ls /proc/<pid>/fd).
  • Perform load testing (e.g., with Locust or Artillery) to verify that the system remains stable under the expected traffic.