Diagnosing High CPU Usage | RamNode Cloud VPS

Applies to: All RamNode VPS Plans | Ubuntu / Debian / CentOS / AlmaLinux | Rev. 2025

High CPU usage is one of the most common performance complaints on a VPS. The root cause can range from a legitimate spike in traffic to a runaway process, a poorly optimized script, or a compromised server silently mining cryptocurrency. This guide walks through a structured diagnostic process: how to spot the problem, understand what the numbers mean, and decide on the appropriate response.

1. Using top and htop to Identify Offending Processes

top — The Universal Starting Point

The top command is available on every Linux system without installation. Launch it with:

Launch top

top

Key columns to focus on:

Column	What It Tells You
%CPU	CPU percentage consumed by this process. Can exceed 100% on multi-core systems (200% = 2 full cores)
%MEM	Percentage of physical RAM used by this process
PID	Process ID — required for kill commands and deeper investigation
USER	The account running the process. `www-data` or `nobody` often indicates a web process; `root` may indicate a system task
COMMAND	The executable name. Press `c` to toggle between the program name and the full command line
TIME+	Cumulative CPU time consumed since the process started — useful for spotting runaway processes

Useful top shortcuts: Press P to sort by CPU. Press M to sort by memory. Press k to kill a process by PID. Press 1 to expand per-core CPU view.
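
For logging or alerting, the same columns can be read non-interactively. The filter below is a minimal sketch, not a feature of top itself: `cpu_over` is a hypothetical helper name, and both the column list and the 50% threshold in the usage line are assumptions to adjust.

```shell
# cpu_over MIN: read rows of "PID USER %CPU COMMAND" on stdin (header row first)
# and print the rows whose %CPU is at or above MIN.
cpu_over() {
    awk -v min="$1" 'NR > 1 && ($3 + 0) >= (min + 0)'
}

# Usage with procps ps (column list and threshold are illustrative):
#   ps -eo pid,user,pcpu,comm --sort=-pcpu | cpu_over 50
```

Because the helper only filters text, it can also be run against a saved snapshot when reviewing an incident afterwards.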

htop — An Interactive Alternative

htop provides a more readable, color-coded interface with mouse support. Install it if not already present:

Install htop — Debian/Ubuntu

apt install htop -y

Install htop — CentOS/AlmaLinux

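# htop is packaged in EPEL on CentOS/AlmaLinux, so enable that repository first
dnf install epel-release -y
dnf install htop -y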

Launch htop

htop

htop advantages over top:

Horizontal bars at the top show per-core utilization at a glance — immediately reveals whether one core is pegged vs. all cores under load
Scroll through the process list and kill processes without memorizing keyboard shortcuts
F4 (Filter) narrows the list to a specific process name, e.g., php-fpm or python3
Tree view (F5) shows parent/child relationships, revealing which master process spawned multiple workers

TIP: On a fresh VPS you may not have htop installed. top is always available and is sufficient for initial triage.

2. Understanding Load Average vs. CPU Count

What Load Average Actually Means

The three load average numbers shown in top and /proc/loadavg represent the average number of runnable (or uninterruptible) tasks over the past 1 minute, 5 minutes, and 15 minutes respectively. A common misconception is that these numbers directly map to CPU percentage — they do not.

View load average

uptime
# Output: 14:32:11 up 22 days, load average: 2.41, 1.87, 1.55

cat /proc/loadavg
# Output: 2.41 1.87 1.55 3/312 18842

Interpreting Load Relative to CPU Count

The rule of thumb: a load average equal to the number of logical CPUs corresponds to roughly 100% utilization. A load higher than your CPU count means tasks are waiting, either for CPU time or, because Linux also counts uninterruptible tasks, for I/O.

Check CPU count

nproc
# Or get more detail
lscpu | grep '^CPU(s):'

Scenario	Interpretation
Load = CPU count	System is fully utilized — acceptable if short-lived
Load < CPU count	Headroom exists — CPU is not the bottleneck
Load 2× CPU count	Significant pressure — processes queuing for CPU time
Load 4×+ CPU count	Severe overload — expect sluggish SSH, slow response times
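
The table can be turned into a quick check by dividing the 1-minute load by the CPU count. This is a rough sketch: the function names are hypothetical, and the cut-offs (1×, 2×, 4×) simply mirror the rows above rather than any fixed standard.

```shell
# load_per_cpu LOAD NCPU: print load divided by CPU count, two decimals.
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# classify_load LOAD NCPU: print headroom | saturated | pressure | overload.
# Cut-offs are assumptions taken from the scenario table.
classify_load() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        ratio = load / cpus
        if (ratio < 1)      print "headroom"
        else if (ratio < 2) print "saturated"
        else if (ratio < 4) print "pressure"
        else                print "overload"
    }'
}

# Usage on a live system:
#   classify_load "$(cut -d" " -f1 /proc/loadavg)" "$(nproc)"
```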

WARNING: A VPS plan with 1 vCPU and a load average of 1.0 is sitting at 100% utilization. A load of 2.0 means that, on average, one task is running while another waits. This is why high load on small VPS plans causes noticeable degradation faster than on dedicated servers.

Reading the Trend

Compare all three numbers together, remembering that they are listed in 1-, 5-, and 15-minute order. A load of 2.0 / 4.0 / 8.0 on a 4-core system is decreasing: the spike may be passing. A load of 4.5 / 3.0 / 1.5 is increasing: something is accumulating and requires immediate attention.
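
Since the first figure is the 1-minute average and the last is the 15-minute average, a first value above the last means load is rising. A minimal sketch of that comparison follows; `load_trend` is a hypothetical name, and the 10% tolerance band for "steady" is an arbitrary assumption.

```shell
# load_trend ONE FIVE FIFTEEN: compare the 1-minute and 15-minute averages.
# The 5-minute value is accepted for convenience but not used.
load_trend() {
    awk -v one="$1" -v fifteen="$3" 'BEGIN {
        if (one > fifteen * 1.10)      print "rising"
        else if (one < fifteen * 0.90) print "falling"
        else                           print "steady"
    }'
}

# Usage on a live system (word splitting of the three fields is intentional):
#   load_trend $(cut -d" " -f1-3 /proc/loadavg)
```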

3. Distinguishing User, System, and I/O Wait

The CPU Breakdown in top

The summary line beginning with %Cpu(s) in top breaks CPU usage into several categories. Press 1 to expand to per-core view:

Example CPU breakdown

%Cpu(s):  45.2 us,  8.1 sy,  0.0 ni, 38.5 id,  7.9 wa,  0.0 hi,  0.2 si,  0.0 st

Field	What It Means
us (user)	CPU time spent running user-space code. High values point to application-level workloads: PHP, Python, Node.js, etc.
sy (system)	CPU time in kernel-space. High values suggest frequent system calls — disk I/O, network operations, or context switching
ni (nice)	CPU time for lower-priority user-space processes. Usually low.
id (idle)	Remaining free CPU. Subtract from 100 to get rough total utilization.
wa (iowait)	Time the CPU waited for I/O. High wa (above 20–30%) suggests a disk bottleneck rather than a true CPU problem.
hi (hw irq)	Hardware interrupt requests — usually near zero unless heavy network traffic.
si (sw irq)	Software interrupts. High values can indicate heavy network processing.
st (steal)	CPU time stolen by the hypervisor. Persistent steal above 5–10% suggests overselling.

Diagnostic Decision Tree

High us (user): Application code is the problem. Find which process via top → P sort, then profile the application.
High sy (system): Kernel is doing heavy work. Check for excessive forks, context switching, or filesystem churn with vmstat 1 5.
High wa (iowait): I/O is blocking. Use iostat -x 1 5 or iotop to find the disk-hungry process. Do not confuse with a CPU problem.
High st (steal): The hypervisor is giving CPU time to other guests, or enforcing a CPU limit, while your VPS has work ready to run. Consider upgrading your plan or contacting support if steal is consistently elevated.
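
The first step of the decision tree, finding which field dominates, can be sketched as a small parser for the %Cpu(s) line. The helper names are hypothetical, and the parsing assumes the procps-ng layout shown in the example above (a value followed by its label); other top builds may format the line differently.

```shell
# cpu_field NAME: read a "%Cpu(s): ..." line on stdin and print the value
# that precedes the label NAME (us, sy, ni, id, wa, hi, si, st).
cpu_field() {
    awk -v name="$1" '{
        for (i = 2; i <= NF; i++) {
            label = $i
            sub(/,$/, "", label)
            if (label == name) { print $(i - 1); exit }
        }
    }'
}

# dominant_cpu_state: print whichever of us, sy, wa, st has the largest value.
dominant_cpu_state() {
    awk '{
        best = ""; max = -1
        for (i = 2; i <= NF; i++) {
            label = $i
            sub(/,$/, "", label)
            if ((label == "us" || label == "sy" || label == "wa" || label == "st") && $(i - 1) + 0 > max) {
                max = $(i - 1) + 0
                best = label
            }
        }
        print best
    }'
}

# Usage on a live system:
#   top -b -n 1 | grep "^%Cpu" | dominant_cpu_state
```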

4. Common Culprits

Runaway PHP Processes

PHP-FPM worker processes are one of the most frequent CPU offenders on VPS stacks running WordPress, WooCommerce, or Drupal. A slow database query, an infinite loop in a plugin, or a traffic surge can cause workers to pile up.

Inspect PHP-FPM workers

# Count PHP-FPM processes (the total includes the master process)
pgrep -c php-fpm

# See which PHP processes are consuming CPU
ps aux --sort=-%cpu | grep php | head -20

# Check PHP-FPM pool status (if the status endpoint is enabled; the path is whatever pm.status_path is set to)
curl 'http://127.0.0.1/status?full'

Warning signs specific to PHP:

Dozens of php-fpm workers all stuck at the same CPU% with identical memory footprint
Workers accumulating over time without dying (check TIME+ in top — values above several minutes are suspicious)
Access logs showing a flood of POST requests to /xmlrpc.php or wp-login.php — bot traffic triggering PHP execution
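
To check for the bot traffic described in the last warning sign, POST requests to those two endpoints can be counted per client IP. This is a sketch under assumptions: `login_flood` is a hypothetical helper, the input is assumed to be in the common/combined access log format, and the log path in the usage line depends on your web server.

```shell
# login_flood: read access-log lines on stdin and print "COUNT IP" for POST
# requests to /xmlrpc.php or /wp-login.php, busiest client first.
login_flood() {
    awk '$6 == "\"POST" && $7 ~ /^\/(xmlrpc\.php|wp-login\.php)/ { hits[$1]++ }
         END { for (ip in hits) print hits[ip], ip }' | sort -rn
}

# Usage (log path is an assumption; adjust for Apache or a custom vhost log):
#   login_flood < /var/log/nginx/access.log | head
```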

WARNING: A common PHP-FPM trap: setting pm.max_children too high consumes all available RAM, which causes the kernel to swap, which causes iowait to spike, making the system appear CPU-bound when it is actually memory-bound.
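
A common rule of thumb for avoiding this trap is to divide the RAM you can spare for PHP by the average worker size. The arithmetic is sketched below; `suggest_max_children` is a hypothetical helper, the figures in the usage line are made-up examples, and the result is only a starting point to validate under real load.

```shell
# suggest_max_children TOTAL_MB RESERVED_MB PER_WORKER_MB
# Prints (TOTAL_MB - RESERVED_MB) / PER_WORKER_MB, rounded down.
# RESERVED_MB is the memory set aside for the OS, database, and other services.
suggest_max_children() {
    total_mb=$1
    reserved_mb=$2
    per_worker_mb=$3
    echo $(( (total_mb - reserved_mb) / per_worker_mb ))
}

# Example: 2048 MB VPS, 512 MB reserved, workers averaging 60 MB each:
#   suggest_max_children 2048 512 60
#
# One way to estimate the average worker size in MB (procps ps reports rss in
# KiB; the process name php-fpm is an assumption and may carry a version suffix):
#   ps -C php-fpm -o rss= | awk '{ sum += $1; n++ } END { if (n) printf "%d\n", sum / n / 1024 }'
```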

Stuck or Looping Cron Jobs

Cron jobs that fail to exit, are scheduled too frequently, or run longer than their interval can stack up and consume significant CPU.

Check cron-related processes

# Check running cron-related processes
ps aux | grep -E '(cron|curl|wget|php|python|bash)' | grep -v grep

# View all user crontabs (run as root); each line is prefixed with the owning user
for user in $(cut -f1 -d: /etc/passwd); do crontab -u "$user" -l 2>/dev/null | sed "s/^/$user: /"; done

# Check system-level cron jobs
cat /etc/crontab
ls /etc/cron.d/ /etc/cron.hourly/ /etc/cron.daily/

Symptoms of a stuck cron job:

Multiple instances of the same script in ps output with increasing PID numbers
Process has a high TIME+ value relative to how long it should legitimately run
CPU spikes occur on a predictable schedule correlating with a cron entry
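
The first symptom, several copies of the same script running at once, can be spotted by grouping a process listing by command line. A minimal sketch, assuming input in the form produced by `ps -eo pid,etime,args`; `dup_commands` is a hypothetical helper name.

```shell
# dup_commands: read "PID ELAPSED COMMAND..." rows on stdin (header row first)
# and print "COUNT COMMAND" for every command line that appears more than once.
dup_commands() {
    awk 'NR > 1 {
        cmd = $3
        for (i = 4; i <= NF; i++) cmd = cmd " " $i
        seen[cmd]++
    }
    END { for (c in seen) if (seen[c] > 1) print seen[c], c }' | sort -rn
}

# Usage on a live system:
#   ps -eo pid,etime,args | dup_commands
```

Web server and PHP-FPM workers legitimately share a command line, so treat the output as a list of candidates to compare against your crontab entries, not as proof of a stuck job.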

Cryptocurrency Miners from Compromised Servers

Cryptocurrency miners are frequently deployed on compromised servers via vulnerabilities in web applications, exposed Docker APIs, or weak SSH credentials. They typically manifest as sustained 80–100% CPU usage.

Detect cryptocurrency miners

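# Look for known miner process names
ps aux | grep -iE '(xmrig|minerd|cpuminer)' | grep -v grep

# Look for fake kernel threads: genuine kworker/kthreadd entries appear in [brackets] in ps output
ps aux | grep -E '(kworker|kthreadd)' | grep -v -e grep -e '\['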

# Check for processes with no associated file on disk (deleted binaries)
ls -la /proc/*/exe 2>/dev/null | grep deleted

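# Listening sockets that are not bound to loopback
ss -tulpn | grep -v '127.0.0.1'

# Established TCP connections with the owning process (run as root to see every PID)
ss -tnp state established
netstat -antp | grep ESTABLISHED   # equivalent, if net-tools is installed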

# Check for executable files in common drop locations
find /tmp /var/tmp /dev/shm -type f -executable 2>/dev/null

# Inspect process binary path (replace PID)
ls -la /proc/PID/exe
cat /proc/PID/cmdline | tr '\0' ' '

DANGER: Miners often disguise themselves with names that resemble legitimate system processes such as kworker, sshd, or java. Always verify suspicious high-CPU processes by checking their actual binary path via /proc/PID/exe rather than trusting the COMMAND column alone.
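
The binary-path check can be applied to every process at once. The sketch below uses hypothetical helper names and deliberately crude heuristics: a "(deleted)" target also appears for legitimate services after a package upgrade, and hidden directories are used by some normal tools, so each hit still needs manual review.

```shell
# suspicious_path PATH: succeed if PATH is under a common drop location,
# contains a hidden path component, or refers to a deleted binary.
suspicious_path() {
    case "$1" in
        /tmp/*|/var/tmp/*|/dev/shm/*) return 0 ;;
        *" (deleted)")                return 0 ;;
        */.*)                         return 0 ;;
        *)                            return 1 ;;
    esac
}

# scan_exe_paths: print "PID PATH" for every process whose /proc/PID/exe
# target looks suspicious. Run as root to read every process.
scan_exe_paths() {
    for link in /proc/[0-9]*/exe; do
        target=$(readlink "$link" 2>/dev/null) || continue   # kernel threads have no exe
        if suspicious_path "$target"; then
            pid=${link#/proc/}
            pid=${pid%/exe}
            printf '%s %s\n' "$pid" "$target"
        fi
    done
}

# Usage:
#   scan_exe_paths
```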

Red flags that suggest a miner rather than a legitimate process:

Process binary resolves to /tmp, /dev/shm, or a hidden directory
CPU usage is consistently high (85–99%) across all cores, sustained over hours
The process was started recently but the server has been running for weeks
Active outbound connections on non-standard ports (TCP 3333, 4444, 5555, 7777, 14444 — common mining pool ports)
No log entries or shell history explaining when or how the process started
crontab, /etc/rc.local, or systemd unit files contain entries pointing to the executable
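
The port-based red flag can be checked by filtering socket output for the ports listed above. This is a sketch with a hypothetical helper name; it assumes `ss -tn`-style rows in which the local address precedes the peer address, and a match only means the connection deserves a closer look, since any service may use these ports.

```shell
# pool_port_hits: read ss output rows on stdin and print those whose peer
# (second address:port column) uses one of the ports named in this guide.
pool_port_hits() {
    awk -v ports="3333 4444 5555 7777 14444" '
    BEGIN {
        n = split(ports, list, " ")
        for (i = 1; i <= n; i++) want[list[i]] = 1
    }
    {
        seen = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /:[0-9]+$/) {
                seen++
                if (seen == 2) {            # second address column is the peer
                    port = $i
                    sub(/.*:/, "", port)
                    if (port in want) print
                    break
                }
            }
        }
    }'
}

# Usage on a live system (run as root so -p can show the owning process):
#   ss -tnp state established | pool_port_hits
```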

5. When to Kill vs. When to Investigate

Decision Framework

The appropriate response depends on whether you understand what the process is and whether it is expected. Killing first and investigating later is reasonable in a production emergency — but investigation must still follow.

Scenario	Indicators
Kill Immediately	Process binary is in /tmp or /dev/shm; active connections to unknown mining pool IPs; duplicate cron processes stacking up without end; PHP/Python process consuming CPU for 30+ minutes with no user-facing request to justify it
Investigate First	Known process (mysql, nginx, php-fpm) under unexpected load; system is in a backup or indexing window; high iowait rather than user/sys CPU (different root cause); a single spike rather than sustained elevation (spikes may self-resolve)

How to Kill a Process Safely

Kill process commands

# Graceful termination (allows process to clean up) — try first
kill -15 PID

# Force kill (use if -15 has no effect after a few seconds)
kill -9 PID

# Kill all processes matching a name
pkill -9 processname

# Restart PHP-FPM: stops the master and all workers, then starts a fresh pool
# (the unit name varies by version and distro, e.g. php8.1-fpm or php-fpm)
systemctl restart php8.1-fpm

DANGER: Sending kill -9 to a database process (MySQL, PostgreSQL, Redis) without a graceful shutdown can corrupt data files or require crash recovery on the next start. Use systemctl stop servicename instead, which sends the correct signal sequence.
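
The "graceful first, force only if needed" sequence can be wrapped in a small function. This is a sketch, not a replacement for systemctl on managed services: `stop_pid` is a hypothetical name and the 5-second default grace period is an arbitrary assumption.

```shell
# stop_pid PID [SECONDS]: send SIGTERM, wait up to SECONDS (default 5) for the
# process to exit, then fall back to SIGKILL.
stop_pid() {
    pid=$1
    wait_secs=${2:-5}
    kill -15 "$pid" 2>/dev/null || return 1      # no such process, or not permitted
    while [ "$wait_secs" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # process has exited
        sleep 1
        wait_secs=$((wait_secs - 1))
    done
    kill -9 "$pid" 2>/dev/null                   # still present: force kill
    return 0
}

# Usage (replace PID):
#   stop_pid PID 10
```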

Post-Kill Investigation Steps

Whether the process was legitimate or malicious, document what happened and prevent recurrence:

For PHP/application spikes: Review the MySQL slow query log (for example /var/log/mysql/slow.log, if slow query logging is enabled), PHP-FPM access logs, and application error logs for the timeframe surrounding the spike.
For stuck cron jobs: Add a lock mechanism using flock to prevent concurrent execution, and set a maximum runtime by prefixing the command in the crontab entry with timeout.
For suspected miners/compromise: Do not just kill the process. Check for persistence mechanisms in crontab, /etc/rc.local, /etc/cron.d/, and systemd unit files. Consider the server compromised until proven otherwise, and review Fail2ban logs and the authentication log (auth.log on Debian/Ubuntu, /var/log/secure on CentOS/AlmaLinux) for unauthorized access.
In all cases: Record the PID, binary path, user, and network connections before terminating if possible, because /proc/PID disappears once the process exits. Use cp /proc/PID/exe /root/evidence.bin to preserve the binary for analysis in a location that is not world-writable.
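
The flock and timeout suggestion for stuck cron jobs can be sketched as a wrapper. The function name, lock path, schedule, time limit, and script path below are all hypothetical placeholders.

```shell
# run_exclusive LOCKFILE SECONDS COMMAND [ARGS...]
# Run COMMAND only if no other instance holds LOCKFILE (flock -n exits
# immediately when the lock is taken), and stop it after SECONDS (timeout
# exits with status 124 when the limit is hit).
run_exclusive() {
    lock=$1
    limit=$2
    shift 2
    flock -n "$lock" timeout "$limit" "$@"
}

# The same pattern written directly in a crontab entry:
#   */5 * * * * flock -n /var/lock/report.lock timeout 240 /usr/local/bin/report.sh
```

Choosing a time limit slightly shorter than the cron interval means a hung run is cleared before the next one is due.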

6. Quick Reference: Diagnostic Commands

Command	Purpose
top -b -n 1	Single-snapshot output to stdout — useful for logging
ps aux --sort=-%cpu | head	Point-in-time list of the top CPU-consuming processes
htop	Interactive process viewer with per-core bars
uptime	Load averages and system uptime
nproc	Number of logical CPUs (denominator for load averages)
vmstat 1 5	CPU breakdown, context switches, and memory, sampled once per second for 5 samples
iostat -x 1 5	Per-disk I/O stats to confirm or rule out iowait as root cause
iotop	Real-time per-process disk I/O — requires root
ss -tulpn	Open ports and associated processes
netstat -antp	All TCP connections with PIDs — identify miner pool connections
find /tmp /var/tmp /dev/shm -type f -executable	Scan for executable files in world-writable directories
ls -la /proc/PID/exe	Resolve true binary path for a process
strace -p PID	Trace system calls of a live process
lsof -p PID	List all files and sockets open by a process

Need more help? Open a support ticket at my.ramnode.com or consult the RamNode knowledge base for additional guides on server hardening, PHP-FPM tuning, and incident response procedures.