Linux Internals Everyone *Must* Understand
Source: Dev.to
Beyond “I know Linux”
If you claim DevOps, Linux isn’t just an OS—it’s your runtime, debugger, firewall, scheduler, and autopsy report. Understanding the kernel‑exposed filesystems and process behavior is essential for production reliability.
/proc & /sys – The Kernel’s Virtual Filesystems
Linux exposes its internal state as files. /proc (procfs) and /sys (sysfs) are virtual filesystems: nothing is stored on disk, and the kernel generates their contents on each read, so they always reflect the current kernel state.
Key /proc Paths
| Path | Description |
|---|---|
| /proc/cpuinfo | CPU architecture, cores |
| /proc/meminfo | Memory statistics |
| /proc/loadavg | Load average (1, 5, 15 min) |
| /proc/<pid>/fd | Open file descriptors for a process |
| /proc/<pid>/maps | Memory mappings of a process |
Production insight
ls /proc/<pid>/fd | wc -l # count open file descriptors
A count that keeps growing toward the limit shown in /proc/<pid>/limits (Max open files) indicates a descriptor leak.
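The descriptor count is most useful when compared with the process's soft limit. The following is a minimal sketch, assuming a POSIX shell on Linux; the function names `fd_count` and `fd_soft_limit` are hypothetical helpers, not standard tools.

```shell
# Count the entries in a process's fd directory (one symlink per open descriptor).
# fd_count is a hypothetical helper; pass /proc/<pid>/fd as the argument.
fd_count() (
    n=0
    for entry in "$1"/*; do
        # -e fails on dangling symlinks (sockets, pipes), so also accept -L.
        if [ -e "$entry" ] || [ -L "$entry" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
)

# Print the soft "Max open files" limit from a limits file (/proc/<pid>/limits).
# The soft limit is the fourth whitespace-separated field on that line.
fd_soft_limit() {
    awk '/^Max open files/ { print $4 }' "$1"
}

# Usage on a live system (replace <pid> with a real process ID):
#   echo "$(fd_count /proc/<pid>/fd) of $(fd_soft_limit /proc/<pid>/limits) descriptors in use"
```

Sampling this every few seconds and watching the trend is usually more telling than a single reading.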
Key /sys Paths
/sys is used by udev, drivers, and container runtimes to inspect and tune hardware and kernel settings.
| Path | Example Use |
|---|---|
| /sys/class/net/eth0/speed | Query network interface speed |
| /sys/block/sda/queue/scheduler | View or set I/O scheduler |
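The scheduler file lists every available I/O scheduler and marks the active one with square brackets, for example `mq-deadline kyber [bfq] none`. Here is a small sketch for pulling out just the active name; `active_scheduler` is a hypothetical helper name, and the device `sda` is taken from the table above.

```shell
# Print the scheduler shown in [brackets] in a queue/scheduler file.
# active_scheduler is a hypothetical helper; pass the file path as the argument.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Usage on a live system:
#   active_scheduler /sys/block/sda/queue/scheduler
```

Writing one of the listed names back into the same file (as root) switches the scheduler for that device.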
Takeaway
/proc = What is happening now
/sys = How hardware & kernel are wired
Process Lifecycle: fork → exec → zombie
Understanding process creation and termination separates junior from senior engineers.
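pid_t pid = fork(); // child process created; fork() returns 0 in the child
if (pid == 0)
    execl("/bin/java", "java", (char *)NULL); // child replaces itself with a new program
int status;
waitpid(pid, &status, 0); // parent collects the exit status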
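If the parent never calls wait(), the exited child stays in the process table as a zombie (state Z) until the parent reaps it or exits:
ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/' # list zombie processes and their parents
Interview gold line: “Zombies hold almost no memory, but each one keeps a PID, and enough of them can exhaust the PID space.”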
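A zombie cannot be killed, because it has already exited; the fix lies with the parent that has not reaped it. The sketch below prints each zombie next to its parent PID. It assumes the procps `ps` output format with a header row, and `find_zombies` is a hypothetical helper name.

```shell
# List zombies together with the parent that has not reaped them.
# Reads `ps -eo pid,ppid,stat,comm` output on stdin and skips the header row.
find_zombies() {
    awk 'NR > 1 && $3 ~ /^Z/ { print "zombie pid=" $1 " parent=" $2 " cmd=" $4 }'
}

# Usage on a live system:
#   ps -eo pid,ppid,stat,comm | find_zombies
```

Matching on the STAT column avoids false hits from command names or user names that merely contain the letter Z.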
Load Average
uptime
# output: 1.2, 0.9, 0.7
The three numbers are the average number of processes that are runnable (running or waiting for a CPU) or in uninterruptible sleep (state D, usually blocked on I/O) over the last 1, 5, and 15 minutes.
| Load | Interpretation (on 4‑core system) |
|---|---|
| 4 | Healthy (≈1 per core) |
| 10 | Overloaded |
A high load with low CPU usage usually indicates an I/O bottleneck.
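Because the raw number only means something relative to the core count, it helps to normalize it. This is a minimal sketch; `load_per_core` is a hypothetical helper name, and the usage line assumes `nproc` from coreutils is installed.

```shell
# Divide a load average by the number of CPU cores.
# load_per_core is a hypothetical helper; arguments: <load> <cores>.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Usage on a live system (first field of /proc/loadavg is the 1-minute average):
#   load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

With the table's figures, a load of 4 on 4 cores gives 1.00 and a load of 10 gives 2.50.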
Memory Metrics
free -m
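Focus on the available column, not “free”. Linux aggressively uses otherwise idle memory for the page cache, so a low “free” value is normal; “available” estimates how much memory can be handed to applications without swapping.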
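The same figure can be read directly from /proc/meminfo, where it appears as `MemAvailable`. The sketch below reports it as a percentage of total memory; `mem_available_pct` is a hypothetical helper name, and any alert threshold built on it would be your own choice.

```shell
# Print MemAvailable as a whole-number percentage of MemTotal.
# mem_available_pct is a hypothetical helper; pass a meminfo-format file.
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }' "$1"
}

# Usage on a live system:
#   mem_available_pct /proc/meminfo
```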
Helpful tools:
vmstat 1
iostat -x
top
htop
Used together, they let you correlate CPU I/O wait time, disk latency, and run‑queue length.
Sockets – IP + Port + Protocol
ss -tulnp
Example output
LISTEN 0 128 0.0.0.0:8080 java
| State | Meaning |
|---|---|
| LISTEN | Waiting for connections |
| ESTABLISHED | Active connection |
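| TIME_WAIT | Normal close; the side that closed first holds the socket briefly |
| CLOSE_WAIT | Peer closed, but the local application has not called close() |
A socket stuck in CLOSE_WAIT means the remote side has closed the connection but the local application never closed its end, which usually points to a connection leak in the application.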
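A quick way to spot such a leak is to count sockets per state and watch whether the CLOSE_WAIT total keeps rising. The sketch below assumes `ss -tan` output with the state in the first column and a header row; `count_socket_states` is a hypothetical helper name. Note that `ss` abbreviates the state names (`ESTAB`, `TIME-WAIT`, `CLOSE-WAIT`).

```shell
# Count TCP sockets per state.
# Reads `ss -tan` output on stdin, skips the header, and prints "<state> <count>".
count_socket_states() {
    awk 'NR > 1 { count[$1]++ } END { for (s in count) print s, count[s] }' | sort
}

# Usage on a live system:
#   ss -tan | count_socket_states
```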
SELinux – Mandatory Access Control
File permissions alone are insufficient. Check the SELinux mode:
getenforce
# Enforcing | Permissive | Disabled
Common production failure: correct Unix permissions but wrong SELinux context.
Troubleshooting steps
ausearch -m avc -ts recent # find recent denials
semanage fcontext -a -t <type> <path> # record the correct context for the path
restorecon -v <path> # apply the recorded context to the file
Senior rule: Never disable SELinux in production; fix the policies instead.
systemd – Service Supervision
systemd handles process supervision, logging, dependency management, and auto‑restart.
Example unit file
[Service]
ExecStart=/app/start.sh
Restart=always
MemoryMax=2G
Check recent failures:
journalctl -u myapp --since today
Features:
- Built‑in watchdog
- CGroup resource limits
- Deterministic startup order
Interviewers may not ask “Explain /proc”; they’ll ask “Why is load high but CPU idle?” or “App restarted but port still busy?” Mastering these internals lets you answer naturally.
Bottom Line
Linux is:
- Your observability platform
- Your runtime security layer
- Your truth source
Fundamentals change slowly; mastering them compounds forever.