Database Guide

    Self-Hosted Apache Cassandra

    Deploy Apache Cassandra, a highly scalable distributed NoSQL database, on a RamNode VPS. Cassandra offers linear scalability, fault tolerance, and no single point of failure.

    Ubuntu 22.04/24.04
    Cassandra 4.1
    ⏱️ 20-30 minutes

    Key Features of Apache Cassandra

    Linear scalability – Add nodes to increase throughput
    Fault tolerance – Data automatically replicated
    Tunable consistency – Per-query consistency levels
    Flexible data model – Wide-column store with CQL
    No single point of failure – Peer-to-peer architecture
    Multi-datacenter replication support

    System Requirements

    ⚠️ Important: Cassandra performs significantly better on SSDs compared to HDDs due to its write-heavy workload patterns. RamNode's NVMe storage plans are ideal for Cassandra deployments.

    Resource Requirements

    Component   Minimum        Recommended
    CPU         2 cores        4+ cores
    RAM         4 GB           8+ GB
    Storage     20 GB SSD      50+ GB NVMe SSD
    OS          Ubuntu 22.04   Ubuntu 24.04 LTS
    Java        OpenJDK 11     OpenJDK 11 (Java 17 is officially supported only from Cassandra 5.0)

    Development

    • 4 GB RAM, 2 vCPU
    • 20 GB SSD storage
    • Single-node setup

    Production

    • 8+ GB RAM, 4+ vCPU
    • 50+ GB NVMe storage
    • Multi-node cluster

    Install Prerequisites

    Update System Packages
    sudo apt update && sudo apt upgrade -y
    Install Java (OpenJDK 11)
    sudo apt install -y openjdk-11-jdk
    
    # Verify installation
    java -version
    Configure Java Environment
    echo 'JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64"' | sudo tee -a /etc/environment
    source /etc/environment
    echo $JAVA_HOME
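The path above assumes an amd64 system; on other architectures the directory name differs. As a minimal sketch, the JDK directory can instead be derived from the installed `java` binary. The helper name `java_home_from_bin` is illustrative, not part of any package.

```shell
# java_home_from_bin PATH_TO_JAVA
# Strip the trailing /bin/java from a fully resolved java binary path.
java_home_from_bin() {
  printf '%s\n' "${1%/bin/java}"
}

# Resolve the java command through its symlinks and print the JDK directory
# (skipped silently when java is not installed)
if command -v java >/dev/null 2>&1; then
  java_home_from_bin "$(readlink -f "$(command -v java)")"
fi
```

Use the printed directory as the JAVA_HOME value in /etc/environment.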

    System Tuning

    Raise the resource limits for the cassandra user and disable swap so the JVM is never paged out:

    Configure System Limits
    # Increase file descriptor limits
    echo 'cassandra - nofile 100000' | sudo tee -a /etc/security/limits.conf
    echo 'cassandra - memlock unlimited' | sudo tee -a /etc/security/limits.conf
    echo 'cassandra - nproc 32768' | sudo tee -a /etc/security/limits.conf
    echo 'cassandra - as unlimited' | sudo tee -a /etc/security/limits.conf
    
    # Disable swap (important for Cassandra)
    sudo swapoff -a
    # Comment out swap line in /etc/fstab to persist after reboot
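The fstab edit mentioned above can be scripted. This is a minimal sketch, assuming swap entries are uncommented lines with `swap` as a whitespace-separated field and that GNU sed is available (as on Ubuntu). The function name `comment_out_swap` is illustrative; it takes the file path as an argument so it can be tried on a copy first.

```shell
# comment_out_swap FILE
# Prefix every active swap entry in an fstab-style file with '#'.
# The original file is saved as FILE.bak before editing.
comment_out_swap() {
  fstab_file="$1"
  cp "$fstab_file" "$fstab_file.bak"
  # Skip lines that are already comments; comment out lines with a 'swap' field
  sed -E -i '/^[[:space:]]*#/! s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/' "$fstab_file"
}

# Try it on a copy before touching the real file:
#   cp /etc/fstab /tmp/fstab.test && comment_out_swap /tmp/fstab.test && cat /tmp/fstab.test
```

Once the result looks right, run the same sed expression with sudo against /etc/fstab.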

    Install Apache Cassandra

    Add Cassandra Repository
    # Add Cassandra repository key
    curl -fsSL https://www.apache.org/dist/cassandra/KEYS | sudo gpg --dearmor -o /usr/share/keyrings/cassandra-archive-keyring.gpg
    
    # Add repository (Cassandra 4.1)
    echo "deb [signed-by=/usr/share/keyrings/cassandra-archive-keyring.gpg] https://debian.cassandra.apache.org 41x main" | sudo tee /etc/apt/sources.list.d/cassandra.sources.list
    Install Cassandra
    sudo apt update
    sudo apt install -y cassandra
    Verify Installation
    # Check service status
    sudo systemctl status cassandra
    
    # View cluster status
    nodetool status

    Configuration

    Edit the main configuration file at /etc/cassandra/cassandra.yaml:

    Open Configuration
    sudo nano /etc/cassandra/cassandra.yaml

    Essential Settings

    Parameter         Description                      Example
    cluster_name      Unique cluster identifier        'RamNodeCluster'
    listen_address    Node IP for inter-node traffic   Your VPS private IP
    rpc_address       IP for client connections        0.0.0.0 (also requires broadcast_rpc_address)
    seeds             Seed nodes for discovery         IPs of the seed nodes
    endpoint_snitch   Cluster topology awareness       GossipingPropertyFileSnitch
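Cassandra does not accept rpc_address: 0.0.0.0 unless broadcast_rpc_address is also set to a real address. The following sketch checks a cassandra.yaml for that combination before a restart. It assumes both settings are plain top-level `key: value` lines; the helper names `yaml_value` and `check_rpc_address` are illustrative.

```shell
# yaml_value FILE KEY
# Print the value of a top-level "KEY: value" line, without quotes or trailing comments.
yaml_value() {
  sed -n -E "s/^$2:[[:space:]]*([^#]*).*$/\1/p" "$1" \
    | sed -E "s/[[:space:]]+$//; s/^['\"]//; s/['\"]$//" \
    | head -n 1
}

# check_rpc_address FILE
# Return non-zero if rpc_address is 0.0.0.0 while broadcast_rpc_address is unset.
check_rpc_address() {
  rpc="$(yaml_value "$1" rpc_address)"
  bcast="$(yaml_value "$1" broadcast_rpc_address)"
  if [ "$rpc" = "0.0.0.0" ] && [ -z "$bcast" ]; then
    echo "rpc_address is 0.0.0.0: set broadcast_rpc_address to a routable IP" >&2
    return 1
  fi
  echo "rpc_address=$rpc broadcast_rpc_address=${bcast:-<unset>}"
}

# Example:
#   check_rpc_address /etc/cassandra/cassandra.yaml
```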

    Single-Node Configuration

    Development Setup (cassandra.yaml)
    cluster_name: 'RamNodeDev'
    num_tokens: 256
    listen_address: localhost
    rpc_address: localhost
    seed_provider:
      - class_name: org.apache.cassandra.locator.SimpleSeedProvider
        parameters:
          - seeds: "127.0.0.1:7000"

    Memory Configuration

    Edit JVM Options
    # In Cassandra 4.x the heap flags live in jvm-server.options
    sudo nano /etc/cassandra/jvm-server.options
    Set Heap Size (1/4 of RAM, max 8GB)
    # For 8GB RAM VPS:
    -Xms2G
    -Xmx2G

    ⚠️ Warning: Avoid allocating more than 8GB to the JVM heap with the default garbage collector settings. Cassandra also uses off-heap memory extensively, and an oversized heap can cause long GC pauses.
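The one-quarter rule with its 8 GB cap can be worked out from the memory the VPS actually reports. This is a small sketch; the function name `heap_size_mb` is illustrative, and the result is a starting point rather than a tuned value.

```shell
# heap_size_mb TOTAL_RAM_MB
# Apply the rule from this guide: one quarter of RAM, capped at 8 GB (8192 MB).
heap_size_mb() {
  heap=$(( $1 / 4 ))
  if [ "$heap" -gt 8192 ]; then
    heap=8192
  fi
  printf '%s\n' "$heap"
}

# Read total RAM from /proc/meminfo (reported in kB) and print matching JVM flags
if [ -r /proc/meminfo ]; then
  total_mb=$(( $(awk '/^MemTotal:/ {print $2}' /proc/meminfo) / 1024 ))
  size=$(heap_size_mb "$total_mb")
  printf '%s\n' "-Xms${size}M" "-Xmx${size}M"
fi
```

A VPS sold as 8 GB usually reports slightly less than 8192 MB, so the printed value may come out a little under 2048M.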

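    Apply Changes
    sudo systemctl restart cassandra
    
    # Fresh install only: if you changed cluster_name or num_tokens after the first start,
    # the node refuses to start until the old data is cleared (this deletes all data):
    #   sudo systemctl stop cassandra && sudo rm -rf /var/lib/cassandra/* && sudo systemctl start cassandra
    
    # Wait for startup, then verify
    sleep 30 && nodetool status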
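A fixed sleep can be too short on a slow start and needlessly long on a fast one. As an alternative sketch, the status command can be polled until a node reports UN (Up/Normal). The function name `wait_for_up` and the `POLL_INTERVAL` variable are illustrative.

```shell
# wait_for_up ATTEMPTS COMMAND [ARGS...]
# Run COMMAND repeatedly until its output contains a line starting with "UN".
# POLL_INTERVAL sets the seconds between attempts (default 5).
wait_for_up() {
  attempts="$1"
  shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" 2>/dev/null | grep -q '^UN'; then
      return 0
    fi
    i=$((i + 1))
    sleep "${POLL_INTERVAL:-5}"
  done
  echo "node did not reach UN state after $attempts attempts" >&2
  return 1
}

# Example: poll for up to 2 minutes (24 attempts, 5 seconds apart)
#   wait_for_up 24 nodetool status && nodetool status
```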

    Security Hardening

    Enable Authentication

    Enable password authentication in cassandra.yaml:

    Authentication Settings
    authenticator: PasswordAuthenticator
    authorizer: CassandraAuthorizer
    Create Admin User
    # Restart Cassandra after config change
    sudo systemctl restart cassandra
    
    # Log in with default credentials
    cqlsh -u cassandra -p cassandra
    
    -- Inside cqlsh: create a new superuser, then exit
    CREATE ROLE admin WITH PASSWORD = 'YourSecurePassword' AND SUPERUSER = true AND LOGIN = true;
    EXIT;
    
    # Back in the shell: log in as the new user
    cqlsh -u admin -p 'YourSecurePassword'
    
    -- Inside cqlsh: lock down the default cassandra role
    ALTER ROLE cassandra WITH PASSWORD = 'RandomLongString' AND SUPERUSER = false AND LOGIN = false;
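The 'RandomLongString' placeholder should be a value nobody needs to remember. One way to produce it, sketched here with only coreutils, is to read random bytes from /dev/urandom and print them as hex. The function name `random_hex` is illustrative.

```shell
# random_hex BYTES
# Print BYTES random bytes from /dev/urandom as a hex string (2 characters per byte).
random_hex() {
  head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# 24 random bytes give a 48-character string
random_hex 24
echo
```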

    Enable Client Encryption (TLS)

    Generate SSL Certificates
    # The Ubuntu package keeps its configuration in /etc/cassandra (there is no conf/ subdirectory).
    # Replace the example password "cassandra" with your own.
    
    # Generate keystore
    sudo keytool -genkeypair -keyalg RSA -alias cassandra \
      -keystore /etc/cassandra/.keystore \
      -storepass cassandra -keypass cassandra \
      -dname "CN=cassandra, OU=RamNode, O=RamNode, L=City, ST=State, C=US" \
      -validity 365
    
    # Export certificate
    sudo keytool -export -alias cassandra -file cassandra.cer \
      -keystore /etc/cassandra/.keystore -storepass cassandra
    
    # Import to truststore
    sudo keytool -import -alias cassandra -file cassandra.cer \
      -keystore /etc/cassandra/.truststore -storepass cassandra -noprompt
    
    # Let the cassandra user read both stores
    sudo chown cassandra:cassandra /etc/cassandra/.keystore /etc/cassandra/.truststore
    TLS Configuration (cassandra.yaml)
    client_encryption_options:
      enabled: true
      optional: false
      keystore: /etc/cassandra/.keystore
      keystore_password: cassandra
      require_client_auth: false
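With optional: false, cqlsh must connect over TLS as well. The sketch below writes a minimal cqlshrc for that purpose. It assumes a PEM copy of the server certificate is available (for example one exported with keytool's -rfc option); the function name `write_cqlshrc` and the example paths are illustrative. Depending on the cqlsh version, validation may also require the certificate name to match the host you connect to.

```shell
# write_cqlshrc TARGET_FILE CERT_FILE
# Write a minimal cqlshrc that turns on TLS and trusts CERT_FILE (PEM format).
write_cqlshrc() {
  cat > "$1" <<EOF
[connection]
hostname = 127.0.0.1
port = 9042
ssl = true

[ssl]
certfile = $2
validate = true
EOF
}

# Example (paths are placeholders):
#   mkdir -p ~/.cassandra
#   write_cqlshrc ~/.cassandra/cqlshrc "$HOME/cassandra.pem"
#   cqlsh --ssl -u admin
```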

    Firewall Configuration

    Configure UFW
    # Keep SSH reachable before enabling the firewall, or you may lock yourself out
    # (adjust if SSH runs on a non-default port)
    sudo ufw allow ssh
    
    # CQL native transport (client connections)
    sudo ufw allow from YOUR_APP_IP to any port 9042 proto tcp
    
    # Inter-node communication (for clusters)
    sudo ufw allow from CLUSTER_SUBNET to any port 7000 proto tcp
    sudo ufw allow from CLUSTER_SUBNET to any port 7001 proto tcp
    
    # JMX monitoring (restrict to localhost or monitoring server)
    sudo ufw allow from 127.0.0.1 to any port 7199 proto tcp
    
    sudo ufw enable

    Basic Operations

    Connecting with CQL Shell

    Connect to Cassandra
    # Local connection
    cqlsh localhost
    
    # Authenticated connection
    cqlsh localhost -u admin -p 'YourPassword'
    
    # Remote connection (add --ssl if client encryption is enabled)
    cqlsh YOUR_VPS_IP 9042 -u admin -p 'YourPassword'

    Creating a Keyspace

    Keyspace Examples
    -- Option 1: single-node development
    CREATE KEYSPACE myapp WITH replication = {
      'class': 'SimpleStrategy',
      'replication_factor': 1
    };
    
    -- Option 2: production (multi-datacenter). Run this instead of option 1.
    -- 'dc1' and 'dc2' must match the datacenter names shown by nodetool status.
    CREATE KEYSPACE myapp WITH replication = {
      'class': 'NetworkTopologyStrategy',
      'dc1': 3,
      'dc2': 2
    };
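The replication factors above determine how many replicas must answer a QUORUM or LOCAL_QUORUM request: floor(RF / 2) + 1. A short worked example for the dc1 and dc2 values used here (the function names are illustrative):

```shell
# quorum REPLICATION_FACTOR
# Replicas that must respond for a quorum: floor(RF / 2) + 1.
quorum() {
  printf '%s\n' $(( $1 / 2 + 1 ))
}

# tolerated_failures REPLICATION_FACTOR
# Replicas that can be down while quorum requests still succeed.
tolerated_failures() {
  printf '%s\n' $(( $1 - ($1 / 2 + 1) ))
}

echo "dc1 (RF=3): quorum=$(quorum 3), tolerates $(tolerated_failures 3) replica(s) down"
echo "dc2 (RF=2): quorum=$(quorum 2), tolerates $(tolerated_failures 2) replica(s) down"
```

With RF=2 a quorum needs both replicas, so LOCAL_QUORUM in dc2 cannot survive a single replica failure; RF=3 is the smallest factor that can.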

    Creating Tables

    Table Examples
    USE myapp;
    
    CREATE TABLE users (
      user_id UUID PRIMARY KEY,
      email TEXT,
      username TEXT,
      created_at TIMESTAMP
    );
    
    -- Table with clustering columns for time-series data
    CREATE TABLE events (
      user_id UUID,
      event_time TIMESTAMP,
      event_type TEXT,
      event_data TEXT,
      PRIMARY KEY (user_id, event_time)
    ) WITH CLUSTERING ORDER BY (event_time DESC);

    Basic CRUD Operations

    CRUD Examples
    -- Insert data
    INSERT INTO users (user_id, email, username, created_at)
    VALUES (uuid(), 'user@example.com', 'johndoe', toTimestamp(now()));
    
    -- Query data
    SELECT * FROM users WHERE user_id = <uuid>;
    
    -- Update data
    UPDATE users SET email = 'newemail@example.com' WHERE user_id = <uuid>;
    
    -- Delete data
    DELETE FROM users WHERE user_id = <uuid>;

    Monitoring & Maintenance

    nodetool Commands

    Command                    Description
    nodetool status            Show cluster node status
    nodetool info              Display node information
    nodetool tablestats <ks>   Table-level statistics
    nodetool tpstats           Thread pool statistics
    nodetool compactionstats   Current compaction activity
    nodetool repair            Run anti-entropy repair
    nodetool cleanup           Remove keys no longer owned
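For scripted health checks, the output of nodetool status can be reduced to a single number. This sketch counts nodes whose two-letter status code (Up/Down followed by Normal/Leaving/Joining/Moving) is anything other than UN. The function name `count_not_up` is illustrative.

```shell
# count_not_up
# Read `nodetool status` output on stdin and print how many nodes are not Up/Normal.
count_not_up() {
  awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { n++ } END { print n + 0 }'
}

# Example:
#   nodetool status | count_not_up
```

A result of 0 means every listed node is up and in the normal state.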

    Log Locations

    • Main system log: /var/log/cassandra/system.log
    • Debug log: /var/log/cassandra/debug.log
    • GC log: /var/log/cassandra/gc.log
    View Logs
    sudo tail -f /var/log/cassandra/system.log
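Besides following the log live, it can help to count how many errors and warnings it already contains. This sketch assumes the default log layout, where each entry starts with its level (INFO, WARN, ERROR). The function name `count_log_level` is illustrative.

```shell
# count_log_level FILE LEVEL
# Print the number of lines in FILE that start with LEVEL (for example ERROR or WARN).
count_log_level() {
  # grep -c prints 0 but exits non-zero when nothing matches, so mask the status
  grep -c "^$2 " "$1" || true
}

# Example:
#   count_log_level /var/log/cassandra/system.log ERROR
#   count_log_level /var/log/cassandra/system.log WARN
```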

    Backup & Recovery

    Snapshot Backups

    Create Snapshot
    # Create snapshot
    nodetool snapshot -t backup_$(date +%Y%m%d)
    
    # Snapshots are stored in:
    # /var/lib/cassandra/data/<keyspace>/<table>/snapshots/<snapshot_name>
    
    # Copy snapshots to backup location
    # (-R keeps the keyspace/table path so tables do not overwrite each other)
    sudo mkdir -p /backup/cassandra
    sudo rsync -avR /var/lib/cassandra/data/*/*/snapshots/backup_* /backup/cassandra/
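Before copying, it is worth confirming that the snapshot exists for every table. This sketch lists the snapshot directories for a given tag, relying on the data/<keyspace>/<table>/snapshots/<snapshot_name> layout described above. The function name `list_snapshots` is illustrative.

```shell
# list_snapshots DATA_DIR TAG
# Print every snapshot directory named TAG under DATA_DIR, one per line, sorted.
# Snapshot directories sit exactly four levels below DATA_DIR.
list_snapshots() {
  find "$1" -mindepth 4 -maxdepth 4 -type d -path "*/snapshots/$2" | sort
}

# Example:
#   list_snapshots /var/lib/cassandra/data "backup_$(date +%Y%m%d)"
```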
    Restore from Snapshot
    # Stop Cassandra
    sudo systemctl stop cassandra
    
    # Clear existing data (if needed); on disk, the <table> directory is the table name plus an ID suffix
    sudo rm -rf /var/lib/cassandra/data/<keyspace>/<table>/*
    
    # Copy snapshot data (the snapshot directory saved for this table)
    sudo cp -r /backup/cassandra/<snapshot>/* /var/lib/cassandra/data/<keyspace>/<table>/
    
    # Fix ownership
    sudo chown -R cassandra:cassandra /var/lib/cassandra/data/
    
    # Start Cassandra
    sudo systemctl start cassandra
    
    # Refresh sstables (takes the table name without the ID suffix)
    nodetool refresh <keyspace> <table>

    Automated Backup Script

    Backup Script
    #!/bin/bash
    # /usr/local/bin/cassandra-backup.sh
    # Stop on the first error so a failed snapshot is never cleared or rotated away
    set -euo pipefail
    
    BACKUP_DIR="/backup/cassandra"
    DATE=$(date +%Y%m%d_%H%M%S)
    RETENTION_DAYS=7
    
    # Create snapshot
    nodetool snapshot -t "backup_$DATE"
    
    # Copy snapshots, keeping the keyspace/table path so tables do not overwrite each other
    mkdir -p "$BACKUP_DIR/$DATE"
    find /var/lib/cassandra/data -type d -path "*/snapshots/backup_$DATE" -prune \
      -exec cp -r --parents {} "$BACKUP_DIR/$DATE/" \;
    
    # Clear snapshot from Cassandra
    nodetool clearsnapshot -t "backup_$DATE"
    
    # Remove old backups (-mindepth 1 protects $BACKUP_DIR itself)
    find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +"$RETENTION_DAYS" -exec rm -rf {} \;
    
    echo "Backup completed: $BACKUP_DIR/$DATE"
    Schedule with Cron
    # Make executable
    sudo chmod +x /usr/local/bin/cassandra-backup.sh
    
    # Open root's crontab with "sudo crontab -e" and add this line (daily at 2 AM)
    0 2 * * * /usr/local/bin/cassandra-backup.sh >> /var/log/cassandra-backup.log 2>&1

    Troubleshooting

    Cassandra Deployed Successfully!

    Your Apache Cassandra deployment is ready. This setup provides a solid foundation for development and testing, with security and monitoring capabilities for production use.

    Production Considerations:

    • ✓ Deploy multi-node clusters for high availability
    • ✓ Use NVMe storage for optimal performance
    • ✓ Implement regular backup schedules
    • ✓ Monitor with nodetool and external tools
    • ✓ Keep Cassandra updated with security patches
