Key Features of Apache Cassandra
System Requirements
⚠️ Important: Cassandra performs significantly better on SSDs compared to HDDs due to its write-heavy workload patterns. RamNode's NVMe storage plans are ideal for Cassandra deployments.
Resource Requirements
| Component | Minimum | Recommended |
|---|---|---|
| CPU | 2 cores | 4+ cores |
| RAM | 4 GB | 8+ GB |
| Storage | 20 GB SSD | 50+ GB NVMe SSD |
| OS | Ubuntu 22.04 | Ubuntu 24.04 LTS |
| Java | OpenJDK 11 | OpenJDK 11 (Java 17 requires Cassandra 5.0 or later) |
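As a quick pre-flight check, the minimums in the table can be compared against a host's actual resources. The sketch below is illustrative only: the function name `meets_minimums` is hypothetical, and the RAM threshold of 3800 MB is an assumption that leaves room for the memory a nominal 4 GB VPS does not report as usable.

```shell
#!/usr/bin/env bash
# meets_minimums CORES RAM_MB DISK_GB
# Prints one line per check and returns 0 only if every check passes.
# Thresholds follow the "Minimum" column: 2 cores, 4 GB RAM, 20 GB storage.
meets_minimums() {
  local cores=$1 ram_mb=$2 disk_gb=$3 status=0
  if [ "$cores" -ge 2 ]; then echo "cpu: ok"; else echo "cpu: below minimum"; status=1; fi
  # Assumption: 3800 MB is treated as "4 GB" because some RAM is reserved by the kernel.
  if [ "$ram_mb" -ge 3800 ]; then echo "ram: ok"; else echo "ram: below minimum"; status=1; fi
  if [ "$disk_gb" -ge 20 ]; then echo "disk: ok"; else echo "disk: below minimum"; status=1; fi
  return "$status"
}
```

On a Linux host the arguments could be supplied from `nproc`, `free -m | awk '/^Mem:/ {print $2}'`, and the size column of `df -BG`.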
Install Prerequisites
sudo apt update && sudo apt upgrade -y
sudo apt install -y openjdk-11-jdk
# Verify installation
java -version
echo 'JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64"' | sudo tee -a /etc/environment
source /etc/environment
echo $JAVA_HOME
System Tuning
Raise the resource limits for the cassandra user and disable swap for optimal Cassandra performance:
# Increase file descriptor limits
echo 'cassandra - nofile 100000' | sudo tee -a /etc/security/limits.conf
echo 'cassandra - memlock unlimited' | sudo tee -a /etc/security/limits.conf
echo 'cassandra - nproc 32768' | sudo tee -a /etc/security/limits.conf
echo 'cassandra - as unlimited' | sudo tee -a /etc/security/limits.conf
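Because `tee -a` appends unconditionally, re-running the limit commands above adds duplicate entries. A minimal sketch of an idempotent alternative follows; the helper name `ensure_line` is hypothetical, and writing to /etc/security/limits.conf with it would require running as root.

```shell
#!/usr/bin/env bash
# ensure_line FILE LINE
# Appends LINE to FILE only when an identical line is not already present.
ensure_line() {
  local file=$1 line=$2
  # -x matches whole lines, -F treats the text literally, -q suppresses output;
  # "--" stops option parsing in case LINE begins with a dash.
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

For example, `ensure_line /etc/security/limits.conf 'cassandra - nofile 100000'` can be run repeatedly and leaves a single entry.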
# Disable swap (important for Cassandra)
sudo swapoff -a
# Comment out swap line in /etc/fstab to persist after reboot
Install Apache Cassandra
# Add Cassandra repository key
curl -fsSL https://www.apache.org/dist/cassandra/KEYS | sudo gpg --dearmor -o /usr/share/keyrings/cassandra-archive-keyring.gpg
# Add repository (Cassandra 4.1)
echo "deb [signed-by=/usr/share/keyrings/cassandra-archive-keyring.gpg] https://debian.cassandra.apache.org 41x main" | sudo tee /etc/apt/sources.list.d/cassandra.sources.list
sudo apt update
sudo apt install -y cassandra
# Check service status
sudo systemctl status cassandra
# View cluster status
nodetool status
Configuration
Edit the main configuration file at /etc/cassandra/cassandra.yaml:
sudo nano /etc/cassandra/cassandra.yaml
Essential Settings
| Parameter | Description | Example |
|---|---|---|
| cluster_name | Unique cluster identifier | 'RamNodeCluster' |
| listen_address | Node IP for inter-node communication | Your VPS private IP |
| rpc_address | IP for client connections | 0.0.0.0 (also set broadcast_rpc_address) |
| seeds | Seed nodes for discovery | IP of seed nodes |
| endpoint_snitch | Cluster topology awareness | GossipingPropertyFileSnitch |
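After editing, it can help to print the values that are actually in effect. The sketch below is a rough illustration, not a YAML parser: it assumes each setting sits on an uncommented top-level `key: value` line, so the nested `seeds` entry is not covered. The function names `yaml_setting` and `show_essentials` are hypothetical.

```shell
#!/usr/bin/env bash
# yaml_setting FILE KEY
# Prints the value of the first uncommented top-level "KEY: value" line.
yaml_setting() {
  local file=$1 key=$2
  sed -n "s/^${key}:[[:space:]]*//p" "$file" | head -n 1
}

# show_essentials [FILE]
# Prints the top-level settings from the table above, one per line.
show_essentials() {
  local file=${1:-/etc/cassandra/cassandra.yaml} key
  for key in cluster_name listen_address rpc_address endpoint_snitch; do
    printf '%s = %s\n' "$key" "$(yaml_setting "$file" "$key")"
  done
}
```

Running `show_essentials` with no argument reads /etc/cassandra/cassandra.yaml, the path used in this guide.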
Single-Node Configuration
cluster_name: 'RamNodeDev'
num_tokens: 256
listen_address: localhost
rpc_address: localhost
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "127.0.0.1:7000"
Memory Configuration
sudo nano /etc/cassandra/jvm-server.options
# For 8GB RAM VPS:
-Xms2G
-Xmx2G
⚠️ Warning: Never allocate more than 8GB to the JVM heap. Cassandra uses off-heap memory extensively, and an oversized heap can cause long GC pauses.
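The 2G figure for an 8GB VPS is consistent with the sizing heuristic described in Cassandra's cassandra-env.sh, which, as I understand it, picks max(min(RAM/2, 1024 MB), min(RAM/4, 8192 MB)). The sketch below reproduces that arithmetic for other plan sizes; the function name `suggest_heap_mb` is hypothetical and the result is a starting point, not a tuned value.

```shell
#!/usr/bin/env bash
# suggest_heap_mb TOTAL_RAM_MB
# Prints a heap size in MB: max(min(RAM/2, 1024), min(RAM/4, 8192)).
suggest_heap_mb() {
  local ram_mb=$1
  local half=$(( ram_mb / 2 ))
  local quarter=$(( ram_mb / 4 ))
  # Cap half of RAM at 1 GB and a quarter of RAM at 8 GB, then take the larger.
  if [ "$half" -gt 1024 ]; then half=1024; fi
  if [ "$quarter" -gt 8192 ]; then quarter=8192; fi
  if [ "$half" -gt "$quarter" ]; then echo "$half"; else echo "$quarter"; fi
}
```

For example, `suggest_heap_mb 8192` prints 2048, which corresponds to the -Xms2G and -Xmx2G lines above, and the 8192 MB cap matches the warning about never exceeding an 8GB heap.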
sudo systemctl restart cassandra
# Wait for startup, then verify
sleep 30 && nodetool status
Security Hardening
Enable Authentication
Enable password authentication in cassandra.yaml:
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
# Restart Cassandra after config change
sudo systemctl restart cassandra
# Log in with default credentials
cqlsh -u cassandra -p cassandra
-- Inside cqlsh: create a new superuser
CREATE ROLE admin WITH PASSWORD = 'YourSecurePassword' AND SUPERUSER = true AND LOGIN = true;
-- Exit cqlsh
EXIT;
# Back in the shell: log in as the new user
cqlsh -u admin -p 'YourSecurePassword'
-- Inside cqlsh: disable the default cassandra superuser
ALTER ROLE cassandra WITH PASSWORD = 'RandomLongString' AND SUPERUSER = false;
Enable Client Encryption (TLS)
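The keystore commands that follow use the literal password cassandra, and the role statements use placeholder strings. A minimal sketch for generating random values instead is shown here; the function name `gen_password` is hypothetical, and it assumes bash plus the coreutils `head`, `base64`, and `tr` found on Ubuntu. It is intended for lengths up to about 64 characters.

```shell
#!/usr/bin/env bash
# gen_password [LENGTH]
# Prints a random alphanumeric string (default 32 characters).
gen_password() {
  local len=${1:-32} pool
  # 96 random bytes become 128 base64 characters; tr drops '+', '/' and newlines.
  pool=$(head -c 96 /dev/urandom | base64 | tr -dc 'A-Za-z0-9')
  # Keep only the first LENGTH characters of the filtered pool.
  printf '%s\n' "${pool:0:len}"
}
```

A value captured with `STOREPASS=$(gen_password 32)` could then replace cassandra in the `-storepass` and `-keypass` arguments and in `keystore_password`.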
# Generate keystore
keytool -genkeypair -keyalg RSA -alias cassandra \
  -keystore /etc/cassandra/conf/.keystore \
  -storepass cassandra -keypass cassandra \
  -dname "CN=cassandra, OU=RamNode, O=RamNode, L=City, ST=State, C=US" \
  -validity 365
# Export certificate
keytool -export -alias cassandra -file cassandra.cer \
  -keystore /etc/cassandra/conf/.keystore -storepass cassandra
# Import to truststore
keytool -import -alias cassandra -file cassandra.cer \
  -keystore /etc/cassandra/conf/.truststore -storepass cassandra -noprompt
client_encryption_options:
  enabled: true
  optional: false
  keystore: /etc/cassandra/conf/.keystore
  keystore_password: cassandra
  require_client_auth: false
Firewall Configuration
# CQL native transport (client connections)
sudo ufw allow from YOUR_APP_IP to any port 9042
# Inter-node communication (for clusters)
sudo ufw allow from CLUSTER_SUBNET to any port 7000
sudo ufw allow from CLUSTER_SUBNET to any port 7001
# JMX monitoring (restrict to localhost or monitoring server)
sudo ufw allow from 127.0.0.1 to any port 7199
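# Allow SSH before enabling the firewall so the session is not locked out (adjust if SSH uses another port)
sudo ufw allow 22/tcp
sudo ufw enable
Basic Operations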
Connecting with CQL Shell
# Local connection
cqlsh localhost
# Authenticated connection
cqlsh localhost -u admin -p 'YourPassword'
# Remote connection
cqlsh YOUR_VPS_IP 9042 -u admin -p 'YourPassword'
Creating a Keyspace
-- Single node development
CREATE KEYSPACE myapp WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': 1
};
-- Production (multi-datacenter)
CREATE KEYSPACE myapp WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'dc1': 3,
  'dc2': 2
};
Creating Tables
USE myapp;
CREATE TABLE users (
  user_id UUID PRIMARY KEY,
  email TEXT,
  username TEXT,
  created_at TIMESTAMP
);
-- Table with clustering columns for time-series data
CREATE TABLE events (
  user_id UUID,
  event_time TIMESTAMP,
  event_type TEXT,
  event_data TEXT,
  PRIMARY KEY (user_id, event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);
Basic CRUD Operations
-- Insert data
INSERT INTO users (user_id, email, username, created_at)
VALUES (uuid(), 'user@example.com', 'johndoe', toTimestamp(now()));
-- Query data
SELECT * FROM users WHERE user_id = <uuid>;
-- Update data
UPDATE users SET email = 'newemail@example.com' WHERE user_id = <uuid>;
-- Delete data
DELETE FROM users WHERE user_id = <uuid>;
Monitoring & Maintenance
nodetool Commands
| Command | Description |
|---|---|
| nodetool status | Show cluster node status |
| nodetool info | Display node information |
| nodetool tablestats <ks> | Table-level statistics |
| nodetool tpstats | Thread pool statistics |
| nodetool compactionstats | Current compaction activity |
| nodetool repair | Run anti-entropy repair |
| nodetool cleanup | Remove keys no longer owned |
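For unattended checks, the output of `nodetool status` can be reduced to a single number. The sketch below counts nodes whose two-letter state code is anything other than UN (Up, Normal); it assumes the usual layout in which each node line begins with that code, and the function name `nodes_not_up_normal` is hypothetical.

```shell
#!/usr/bin/env bash
# nodes_not_up_normal
# Reads `nodetool status` output on stdin and prints how many nodes are not "UN".
nodes_not_up_normal() {
  # Node lines start with a status letter (U/D) followed by a state letter (N/L/J/M).
  # Header lines such as "--  Address ..." do not match that pattern and are ignored.
  awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { n++ } END { print n + 0 }'
}
```

Usage would be `nodetool status | nodes_not_up_normal`, with a non-zero result treated as a reason to look at the logs.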
Log Locations
- Main system log: /var/log/cassandra/system.log
- Debug log: /var/log/cassandra/debug.log
- GC log: /var/log/cassandra/gc.log
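When a log is long, filtering for problems is often quicker than reading it in full. The sketch below assumes the default log layout, in which each entry begins with its level (INFO, WARN, ERROR); the function name `recent_problems` is hypothetical.

```shell
#!/usr/bin/env bash
# recent_problems FILE [COUNT]
# Prints the last COUNT (default 20) lines of FILE that start with ERROR or WARN.
recent_problems() {
  local file=$1 count=${2:-20}
  awk '/^(ERROR|WARN)/' "$file" | tail -n "$count"
}
```

For example, `sudo bash -c 'source ./helpers.sh; recent_problems /var/log/cassandra/system.log 50'` would show the 50 most recent warnings and errors, where helpers.sh is a hypothetical file holding the function.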
sudo tail -f /var/log/cassandra/system.log
Backup & Recovery
Snapshot Backups
# Create snapshot
nodetool snapshot -t backup_$(date +%Y%m%d)
# Snapshots are stored in:
# /var/lib/cassandra/data/<keyspace>/<table>/snapshots/<snapshot_name>
# Copy snapshots to backup location (-R keeps the keyspace/table path so tables do not overwrite each other)
sudo rsync -avR /var/lib/cassandra/data/*/*/snapshots/backup_* /backup/cassandra/
# Stop Cassandra
sudo systemctl stop cassandra
# Clear existing data (if needed)
sudo rm -rf /var/lib/cassandra/data/<keyspace>/<table>/*
# Copy snapshot data
sudo cp -r /backup/cassandra/<snapshot>/* /var/lib/cassandra/data/<keyspace>/<table>/
# Fix ownership
sudo chown -R cassandra:cassandra /var/lib/cassandra/data/
# Start Cassandra
sudo systemctl start cassandra
# Refresh sstables
nodetool refresh <keyspace> <table>
Automated Backup Script
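The script in this section deletes backup directories older than RETENTION_DAYS. Before scheduling it, the retention rule can be previewed without deleting anything. The sketch below only lists what would be removed; the function name `list_expired_backups` is hypothetical.

```shell
#!/usr/bin/env bash
# list_expired_backups DIR DAYS
# Prints the immediate subdirectories of DIR last modified more than DAYS days ago.
list_expired_backups() {
  local dir=$1 days=$2
  # -mindepth 1 excludes DIR itself; -maxdepth 1 stops find from descending further.
  find "$dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -print
}
```

Running `list_expired_backups /backup/cassandra 7` shows the directories a 7-day retention policy would prune.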
#!/bin/bash
# /usr/local/bin/cassandra-backup.sh
set -euo pipefail
BACKUP_DIR="/backup/cassandra"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=7
# Create snapshot
nodetool snapshot -t "backup_$DATE"
# Copy snapshots (--parents keeps the keyspace/table path so tables do not overwrite each other)
mkdir -p "$BACKUP_DIR/$DATE"
find /var/lib/cassandra/data -type d -path "*/snapshots/backup_$DATE" -exec cp -r --parents {} "$BACKUP_DIR/$DATE/" \;
# Clear snapshot from Cassandra
nodetool clearsnapshot -t "backup_$DATE"
# Remove old backups (-mindepth 1 keeps $BACKUP_DIR itself out of the match)
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +"$RETENTION_DAYS" -exec rm -rf {} +
echo "Backup completed: $BACKUP_DIR/$DATE"
# Make executable
sudo chmod +x /usr/local/bin/cassandra-backup.sh
# Add to crontab (daily at 2 AM)
0 2 * * * /usr/local/bin/cassandra-backup.sh >> /var/log/cassandra-backup.log 2>&1
Troubleshooting
Cassandra Deployed Successfully!
Your Apache Cassandra deployment is ready. This setup provides a solid foundation for development and testing, with security and monitoring capabilities for production use.
Production Considerations:
- ✓ Deploy multi-node clusters for high availability
- ✓ Use NVMe storage for optimal performance
- ✓ Implement regular backup schedules
- ✓ Monitor with nodetool and external tools
- ✓ Keep Cassandra updated with security patches
Additional Resources