Applies to: All RamNode Cloud VPS plans | Debian, Ubuntu, AlmaLinux, Rocky, RHEL | Rev. 2026
1. Introduction
Before deploying production workloads on a VPS, you should understand what you are actually getting. Provider specifications give you a starting point, but real-world performance depends on factors like CPU steal time, storage backend architecture, neighbor density on the host, and the underlying hardware generation. Benchmarking lets you verify that the resources you are paying for are actually being delivered.
This guide covers three industry-standard tools for benchmarking a cloud VPS:
- sysbench for CPU and memory performance
- fio for disk I/O characterization
- stress-ng for stability and stress testing
By the end, you will have a reproducible benchmarking workflow you can run on any new instance to establish baselines and detect performance drift over time.
2. When to Benchmark
- Immediately after provisioning, to establish a baseline
- Before migrating production workloads to a new instance
- When investigating performance complaints from users or applications
- When comparing providers, regions, or plan tiers
- Periodically (monthly or quarterly) to detect noisy neighbor degradation — see Noisy Neighbor Symptoms vs. Real Performance Issues
3. Prerequisites
- Root or sudo access to the VPS
- At least 10 GB of free disk space on the device you plan to benchmark — see Diagnosing and Fixing Disk Space Issues
- Network connectivity for package installation
- A quiescent system: stop unnecessary services before testing to avoid skewed results
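The free-space and idle-system prerequisites can be checked with a short script. The following is a minimal sketch, not a definitive check: the function names has_free_gb and load_per_cpu are made up for illustration, and it assumes a Linux system with /proc/loadavg and a POSIX-style df.

```shell
#!/bin/bash
# Pre-flight sketch: check free space and current load before benchmarking.
# Function names and the 10 GB figure passed below are illustrative.

# Succeeds if the filesystem holding PATH has at least MIN_GB GiB available.
has_free_gb() {
    local path=$1 min_gb=$2 avail_kb
    # df -Pk prints one POSIX-format line per filesystem in 1 KiB blocks;
    # column 4 is the available space.
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( min_gb * 1024 * 1024 )) ]
}

# Prints the 1-minute load average divided by the vCPU count.
load_per_cpu() {
    awk -v cpus="$(nproc)" '{ printf "%.2f\n", $1 / cpus }' /proc/loadavg
}

if has_free_gb / 10; then
    echo "disk: at least 10 GB free on /"
else
    echo "disk: less than 10 GB free on /"
fi
echo "load per vCPU (1 min): $(load_per_cpu)"
```

A load-per-vCPU figure well above zero before you start suggests something else is still running and the results may be skewed.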
4. Installing the Tools
On Ubuntu or Debian:
apt update
apt install -y sysbench fio stress-ng
On AlmaLinux, Rocky Linux, or RHEL:
dnf install -y epel-release
dnf install -y sysbench fio stress-ng
Verify the installed versions:
sysbench --version
fio --version
stress-ng --version
5. CPU Benchmarking with sysbench
The sysbench CPU test calculates prime numbers up to a configurable ceiling. While synthetic, the results correlate well with general compute throughput and make it straightforward to compare clock speed and core efficiency between instances.
Single-thread CPU test
sysbench cpu --cpu-max-prime=20000 --threads=1 --time=60 run
Multi-threaded CPU test
Match the thread count to the available vCPU count. Check vCPUs with nproc:
THREADS=$(nproc)
sysbench cpu --cpu-max-prime=20000 --threads=$THREADS --time=60 run
Key Metrics
- events per second: primary throughput metric, higher is better
- total time: should match the --time value
- latency (avg, 95th percentile): lower is better
- threads fairness: low standard deviation across threads indicates consistent per-core performance
Reference numbers: A modern dedicated CPU core on recent Ryzen, EPYC, or Xeon Sapphire Rapids hardware should produce roughly 1,500 to 3,000 events per second at --cpu-max-prime=20000 on a single thread. Shared or burstable vCPUs often land 30 to 60 percent lower. If your numbers are unexpectedly low, check Diagnosing High CPU Usage.
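To compare the single-thread and multi-thread runs, it can help to extract the events-per-second figure and express the multi-thread result as a percentage of ideal linear scaling. This is a rough sketch that assumes the "events per second:" line printed by sysbench 1.0; the helper names are hypothetical and the sample values are invented placeholders, not measurements.

```shell
#!/bin/bash
# Sketch: pull "events per second" out of sysbench cpu output and compute
# how well the multi-thread run scales relative to the single-thread run.

# Reads sysbench cpu output on stdin, prints the events-per-second figure.
events_per_sec() {
    awk -F: '/events per second/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# scaling_pct SINGLE MULTI THREADS -> percent of ideal linear scaling.
scaling_pct() {
    awk -v s="$1" -v m="$2" -v t="$3" \
        'BEGIN { printf "%.1f\n", 100 * m / (s * t) }'
}

# Placeholder lines in the sysbench output layout. The numbers are invented
# for the demonstration and are not measurements.
sample_single='    events per second:  1000.00'
sample_multi='    events per second:  3600.00'

single=$(printf '%s\n' "$sample_single" | events_per_sec)
multi=$(printf '%s\n' "$sample_multi" | events_per_sec)
echo "scaling efficiency on 4 threads: $(scaling_pct "$single" "$multi" 4)%"
```

In practice you would pipe a real run into the parser, for example by appending "| events_per_sec" to either sysbench cpu command above. Scaling well below 100 percent on a shared plan is one hint that vCPUs are contended or are hyperthreads rather than full cores.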
6. Memory Benchmarking with sysbench
The memory test measures both sequential and random memory access performance.
Sequential write throughput
sysbench memory \
--memory-block-size=1M \
--memory-total-size=10G \
--memory-oper=write \
--memory-access-mode=seq \
run
Random read latency
sysbench memory \
--memory-block-size=1K \
--memory-total-size=10G \
--memory-oper=read \
--memory-access-mode=rnd \
run
Memory bandwidth varies significantly between hardware generations. DDR5-equipped hosts produce noticeably higher throughput than DDR4 systems, particularly at larger block sizes. Small-block random access is more sensitive to memory latency than to raw bandwidth.
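One way to see the block-size effect is to sweep a few block sizes and compare the reported throughput. The sketch below only prints the commands unless RUN=1 is set in the environment. The block sizes chosen, the helper names, and the assumption that sysbench prints a "(N MiB/sec)" summary line as in sysbench 1.0 are illustrative.

```shell
#!/bin/bash
# Sketch: sweep sysbench memory block sizes to see where throughput changes.
# Commands are only printed unless RUN=1 is set in the environment.

# mem_cmd BLOCK_SIZE ACCESS_MODE -> prints the sysbench command line.
mem_cmd() {
    echo "sysbench memory --memory-block-size=$1 --memory-total-size=10G" \
         "--memory-oper=read --memory-access-mode=$2 run"
}

# Reads sysbench memory output on stdin, prints the MiB/sec figure.
mib_per_sec() {
    sed -n 's/.*(\([0-9.]*\) MiB\/sec).*/\1/p'
}

for bs in 1K 64K 1M; do
    cmd=$(mem_cmd "$bs" seq)
    if [ "${RUN:-0}" = 1 ]; then
        # Unquoted on purpose so the string splits into command and arguments.
        echo "$bs: $($cmd | mib_per_sec) MiB/sec"
    else
        echo "would run: $cmd"
    fi
done
```

Run it as "RUN=1 ./sweep.sh" (the file name is arbitrary) once the printed commands look right.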
7. Disk I/O Benchmarking with fio
fio is a flexible, widely used disk benchmark. Disk performance is not a single number: it varies by block size, queue depth, read and write mix, and access pattern. You need multiple tests to characterize a storage device meaningfully.
Critical fio Parameters
- --direct=1: bypasses the OS page cache so you measure storage, not RAM
- --ioengine=io_uring: preferred on kernels 5.1+; fall back to libaio
- --bs: block size; 4K for random I/O, 1M for sequential
- --iodepth: queue depth; 1 for latency, 32 to 64 for peak throughput
- --numjobs: parallel workers; equal to vCPUs to maximize load
- --size: should exceed RAM to defeat caching
- --runtime with --time_based: caps each test at a fixed duration
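Two of these choices can be checked before running any test: which ioengine the kernel is likely to support, and whether the target path is on real storage. On some systems /tmp is a RAM-backed tmpfs mount, in which case a test file there would not exercise the disk at all. The sketch below is an assumption-laden helper, not part of fio: pick_ioengine and fs_type are hypothetical names, the version check follows the 5.1 threshold given above, and a new-enough version number does not guarantee that io_uring is enabled on a given kernel.

```shell
#!/bin/bash
# Sketch: choose an fio ioengine from the kernel version and warn when the
# benchmark target lives on tmpfs. Function names are illustrative.

# pick_ioengine KERNEL_RELEASE -> io_uring for 5.1 or newer, else libaio.
pick_ioengine() {
    local major minor
    major=${1%%.*}                 # text before the first dot
    minor=${1#*.}                  # text after the first dot
    minor=${minor%%[!0-9]*}        # keep only the leading digits
    if [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "${minor:-0}" -ge 1 ]; }; then
        echo io_uring
    else
        echo libaio
    fi
}

# Prints the type of the filesystem that holds the given path (GNU stat).
fs_type() {
    stat -f -c %T "$1"
}

echo "ioengine for this kernel: $(pick_ioengine "$(uname -r)")"
if [ "$(fs_type /tmp)" = tmpfs ]; then
    echo "/tmp is tmpfs: point --filename at a disk-backed path instead"
fi
```

If fio rejects io_uring despite a recent kernel, substitute --ioengine=libaio in the tests that follow.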
Test 1: 4K Random Read IOPS
The most important metric for database, web, and general application workloads.
fio --name=randread4k \
--filename=/tmp/fio-test \
--rw=randread \
--bs=4k \
--size=4G \
--numjobs=1 \
--iodepth=32 \
--direct=1 \
--ioengine=io_uring \
--runtime=60 \
--time_based \
--group_reporting
Test 2: 4K Random Write IOPS
fio --name=randwrite4k \
--filename=/tmp/fio-test \
--rw=randwrite \
--bs=4k \
--size=4G \
--numjobs=1 \
--iodepth=32 \
--direct=1 \
--ioengine=io_uring \
--runtime=60 \
--time_based \
--group_reporting
Test 3: Mixed 70/30 Random Read/Write
Approximates a typical database workload:
fio --name=mixedrw \
--filename=/tmp/fio-test \
--rw=randrw \
--rwmixread=70 \
--bs=4k \
--size=4G \
--numjobs=4 \
--iodepth=32 \
--direct=1 \
--ioengine=io_uring \
--runtime=60 \
--time_based \
--group_reporting
Test 4: Sequential Throughput
Useful for large file workloads, backups, and streaming:
fio --name=seqread1m \
--filename=/tmp/fio-test \
--rw=read \
--bs=1M \
--size=4G \
--numjobs=1 \
--iodepth=8 \
--direct=1 \
--ioengine=io_uring \
--runtime=60 \
--time_based \
--group_reporting
Test 5: Single-Queue Latency
Critical for understanding interactive responsiveness:
fio --name=latency \
--filename=/tmp/fio-test \
--rw=randread \
--bs=4k \
--size=4G \
--numjobs=1 \
--iodepth=1 \
--direct=1 \
--ioengine=io_uring \
--runtime=30 \
--time_based
Clean up
rm /tmp/fio-test
Reading fio Output
- IOPS: operations per second, higher is better
- BW: throughput in MiB/s
- clat: per-operation completion latency; review average and 99th percentile
- clat 99.99: tail latency, important for user-facing applications
Reference numbers: Modern NVMe-backed VPS storage should produce 20,000 to 100,000+ IOPS for 4K random reads at queue depth 32. With a single job keeping 32 requests in flight, mean latency is roughly 32 divided by IOPS, so sub-millisecond averages correspond to about 32,000 IOPS or more. SATA SSD-backed instances typically land in the 5,000 to 20,000 IOPS range. If random write IOPS are dramatically lower than reads, the underlying storage may use a write-through cache or be experiencing host-level write amplification.
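Two simple identities are useful for sanity-checking fio output: bandwidth equals IOPS times block size, and (by Little's law) mean latency is approximately the number of outstanding requests divided by IOPS. The following sketch works through both for the figures quoted above. The function names are made up for illustration, and the latency estimate assumes the queue stays full for the whole run.

```shell
#!/bin/bash
# Sketch: sanity-check fio numbers with two identities.
#   bandwidth (MiB/s)  = IOPS * block size (KiB) / 1024
#   mean latency (ms) ~= 1000 * outstanding I/Os / IOPS   (Little's law)

# iops_to_mibs IOPS BLOCK_KIB
iops_to_mibs() {
    awk -v iops="$1" -v kib="$2" 'BEGIN { printf "%.1f\n", iops * kib / 1024 }'
}

# mean_latency_ms IOPS IODEPTH NUMJOBS (assumes the queues stay full)
mean_latency_ms() {
    awk -v iops="$1" -v qd="$2" -v jobs="$3" \
        'BEGIN { printf "%.2f\n", 1000 * qd * jobs / iops }'
}

echo "20000 IOPS at 4K  = $(iops_to_mibs 20000 4) MiB/s"
echo "20000 IOPS, QD32  = $(mean_latency_ms 20000 32 1) ms mean latency"
echo "100000 IOPS, QD32 = $(mean_latency_ms 100000 32 1) ms mean latency"
```

If the BW and IOPS lines in a report do not roughly satisfy the first identity, the block size was probably not what you intended.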
8. Stress Testing with stress-ng
While sysbench and fio measure peak performance, stress-ng pushes the system to its limits to expose stability issues, thermal throttling on the host, and noisy neighbor effects under sustained load.
CPU stress test
stress-ng --cpu $(nproc) --cpu-method matrixprod --metrics-brief --timeout 300s
Memory stress test
Caution: Be careful with allocation percentages; over-allocating will trigger the OOM killer.
stress-ng --vm 2 --vm-bytes 75% --vm-method all --verify --metrics-brief --timeout 300s
Disk I/O stress test
stress-ng --hdd 2 --hdd-bytes 1G --metrics-brief --timeout 300s
Combined system stress
The canonical "everything at once" test:
stress-ng \
--cpu $(nproc) \
--io 2 \
--vm 1 \
--vm-bytes 25% \
--hdd 1 \
--hdd-bytes 1G \
--timeout 600s \
--metrics-brief
While stress-ng is running, open a second SSH session and monitor the system:
top
vmstat 1
iostat -xz 1
Watch for CPU steal time (%st in top), which indicates the hypervisor is preempting your vCPUs to schedule other tenants. iostat is part of the sysstat package, which may need to be installed separately. See Basic Resource Monitoring for more on these tools.
9. Interpreting Steal Time
CPU steal time is one of the most useful indicators of host contention on a shared VPS. Brief spikes during system events are normal. Consistent steal time above 5 percent during a sustained sysbench CPU run typically points to neighbor contention. If you see sustained high steal time, especially during off-peak hours when your own workload is quiet, that warrants a support ticket so the provider can investigate the host or migrate your instance. The companion guide Noisy Neighbor Symptoms vs. Real Performance Issues walks through the full diagnostic workflow.
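Steal time can also be measured directly from /proc/stat, which makes it easy to log alongside a benchmark run. The sketch below samples the aggregate cpu line twice and reports steal as a share of all CPU time elapsed in between. The function name steal_pct is an illustrative assumption, and the script assumes a Linux kernel that reports the steal column (the eighth value on the cpu line).

```shell
#!/bin/bash
# Sketch: compute steal time between two samples of the aggregate "cpu" line
# in /proc/stat. Value order: user nice system idle iowait irq softirq steal.

# steal_pct "LINE_BEFORE" "LINE_AFTER" -> steal as a percent of elapsed CPU time
steal_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # Fields 2..9 are user through steal; guest time is already
            # counted inside user, so it is not added again.
            total[NR] = 0
            for (i = 2; i <= 9; i++) total[NR] += $i
            steal[NR] = $9
        }
        END {
            dt = total[2] - total[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (steal[2] - steal[1]) / dt
        }'
}

before=$(grep '^cpu ' /proc/stat)
sleep 1
after=$(grep '^cpu ' /proc/stat)
echo "steal over the last second: $(steal_pct "$before" "$after")%"
```

Running this in a loop during a sustained sysbench CPU test gives a record you can compare against the 5 percent guideline and attach to a support ticket.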
10. Putting It All Together: A Quick Benchmark Script
Save this as quickbench.sh for repeatable baseline captures:
#!/bin/bash
# quickbench.sh: capture a CPU, memory, and disk baseline in one timestamped log.
set -e
LOG="bench-$(date +%Y%m%d-%H%M%S).log"
# Remove the fio test file even if a benchmark fails partway through.
trap 'rm -f /tmp/fio-test' EXIT
# Send all output to both the terminal and the log file.
exec > >(tee -a "$LOG") 2>&1
echo "=== System Info ==="
lscpu | grep -E "Model name|^CPU\(s\)|Thread|MHz"
free -h
df -h /
echo "=== sysbench CPU ==="
sysbench cpu --cpu-max-prime=20000 --threads=$(nproc) --time=60 run
echo "=== sysbench memory ==="
sysbench memory --memory-block-size=1M --memory-total-size=10G run
echo "=== fio 4K random read ==="
fio --name=rr4k --filename=/tmp/fio-test --rw=randread --bs=4k \
--size=4G --numjobs=1 --iodepth=32 --direct=1 \
--ioengine=io_uring --runtime=30 --time_based --group_reporting
echo "=== fio 4K random write ==="
fio --name=rw4k --filename=/tmp/fio-test --rw=randwrite --bs=4k \
--size=4G --numjobs=1 --iodepth=32 --direct=1 \
--ioengine=io_uring --runtime=30 --time_based --group_reporting
rm -f /tmp/fio-test
echo "=== Done. Log saved to: $LOG ==="
Make it executable and run:
chmod +x quickbench.sh
./quickbench.sh
The script logs results to a timestamped file so you can archive baselines for later comparison.
11. Common Pitfalls
- Testing with caching enabled: Always pass --direct=1 to fio, or you are benchmarking RAM speed instead of the storage device.
- Insufficient runtime: Anything under 30 seconds is too short to capture steady-state behavior. Queue dynamics, thermal effects, and write throttling all need time to stabilize.
- Test data smaller than RAM: fio test sizes should exceed system memory so that cached reads cannot inflate the results. On instances with 4 GB of RAM or more, raise --size above the 4G used in the examples.
- Single test runs: Always run benchmarks three or more times and evaluate consistency, not just peak values.
- Benchmarking during deployment: Run on a clean, idle system, never while configuration management, package updates, or backups are running.
- Ignoring filesystem overhead: For the most accurate disk numbers, benchmark against a raw block device when possible (see Choosing a Filesystem). Use only an unused device for this, because write tests overwrite whatever it contains.
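To evaluate consistency across repeated runs rather than peak values, the results can be summarized as a mean and a coefficient of variation (standard deviation divided by mean). A minimal sketch follows, assuming one result per line on standard input. The function name run_stats and the sample values are invented for illustration.

```shell
#!/bin/bash
# Sketch: summarize repeated benchmark results (one number per line on stdin)
# as mean, min, max, and coefficient of variation (population stddev / mean).

run_stats() {
    awk '
        NR == 1 { min = max = $1 + 0 }
        {
            v = $1 + 0
            sum += v
            sumsq += v * v
            if (v < min) min = v
            if (v > max) max = v
        }
        END {
            if (NR == 0) exit 1
            mean = sum / NR
            var = sumsq / NR - mean * mean
            if (var < 0) var = 0   # guard against floating-point rounding
            printf "mean=%.1f min=%.1f max=%.1f cv=%.1f%%\n",
                   mean, min, max, 100 * sqrt(var) / mean
        }'
}

# Invented example values standing in for three runs of the same test.
printf '%s\n' 1000 1100 900 | run_stats
```

A coefficient of variation of a few percent suggests a stable host; much larger spreads between identical runs are worth investigating before trusting any single number.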
12. Comparing Across Providers or Plans
When comparing benchmarks across providers, use identical command parameters and similar instance sizes. Document everything:
- vCPU count and model (lscpu)
- Total RAM (free -h)
- Advertised storage type (NVMe, SATA SSD, network block storage)
- Geographic region or datacenter
- Test timestamps, since load varies by time of day
- Kernel version (uname -r) and OS release (cat /etc/os-release)
A spreadsheet with rows per provider and columns for each metric makes the comparison auditable when you revisit it months later.
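Part of that record can be captured automatically. The sketch below prints one CSV row with the machine-readable items from the list above. The column layout, the metadata_row name, and the "example-provider" label are assumptions for illustration, and fields such as storage type and region still need to be filled in by hand.

```shell
#!/bin/bash
# Sketch: emit one CSV row of instance metadata for a comparison spreadsheet.
# Column choice and the provider label argument are illustrative.

metadata_row() {
    local provider=$1 os="unknown" ram_mib
    # /etc/os-release defines PRETTY_NAME on most current distributions;
    # it is sourced in a subshell so its variables do not leak.
    if [ -r /etc/os-release ]; then
        os=$(. /etc/os-release && echo "${PRETTY_NAME:-unknown}")
    fi
    # MemTotal is reported in KiB; convert to MiB.
    ram_mib=$(awk '/^MemTotal:/ { printf "%d", $2 / 1024 }' /proc/meminfo)
    printf '%s,%s,%s,%s,"%s",%s\n' \
        "$provider" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(nproc)" \
        "$ram_mib" "$os" "$(uname -r)"
}

echo 'provider,timestamp_utc,vcpus,ram_mib,os,kernel'
metadata_row "example-provider"
```

Appending the row to a shared file after each benchmark run keeps the metadata next to the numbers it describes.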
13. Next Steps
After establishing baseline numbers, save them alongside your instance documentation. Re-run the same suite periodically and compare. Significant drift, particularly in random I/O latency or sustained CPU throughput, can signal host hardware issues or increased neighbor density and may justify opening a ticket or migrating to a different node.
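Drift is easiest to judge as a percent change against the stored baseline. The following sketch shows the arithmetic. The helper names are hypothetical, the numbers are invented, and the 10 percent default threshold is an arbitrary illustration rather than a recommendation from this guide.

```shell
#!/bin/bash
# Sketch: percent change between a stored baseline and a fresh result,
# flagged when its magnitude exceeds a threshold (default 10, arbitrary).

# pct_change BASELINE CURRENT -> signed percent change
pct_change() {
    awk -v b="$1" -v c="$2" 'BEGIN { printf "%.1f\n", 100 * (c - b) / b }'
}

# drift_flag BASELINE CURRENT [THRESHOLD_PCT] -> prints OK or DRIFT
drift_flag() {
    awk -v b="$1" -v c="$2" -v t="${3:-10}" 'BEGIN {
        d = 100 * (c - b) / b
        if (d < 0) d = -d
        r = (d > t) ? "DRIFT" : "OK"
        print r
    }'
}

# Invented numbers: baseline 2000 events/s, later run 1700 events/s.
echo "change: $(pct_change 2000 1700)% -> $(drift_flag 2000 1700)"
```

Because the flag uses the magnitude of the change, it works for both throughput metrics (where lower is worse) and latency metrics (where higher is worse); the sign printed by pct_change tells you which direction the result moved.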
For workload-specific benchmarking, consider following up with:
- pgbench for PostgreSQL workloads
- sysbench oltp for MySQL or MariaDB
- wrk or k6 for HTTP application performance
- iperf3 for network throughput between instances
- redis-benchmark for Redis workloads
Synthetic benchmarks give you a baseline. Application-level benchmarks tell you whether the instance will actually handle what you are about to put on it. Run both.
