Database Guide

    Self-Hosted ScyllaDB

    Deploy ScyllaDB, a high-performance, Cassandra-compatible NoSQL database, on a RamNode VPS. ScyllaDB advertises up to 10x the throughput of Cassandra while keeping the same CQL APIs.

    Ubuntu 22.04/24.04
    ScyllaDB 5.4
    ⏱️ 15-25 minutes

    Key Features of ScyllaDB

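    Up to 10x the throughput of Apache Cassandra, according to ScyllaDB's own benchmarks
    Compatible with Cassandra CQL and existing Cassandra drivers
    Shard-per-core architecture for maximum efficiency
    Self-tuning C++ implementation with no JVM garbage collection pauses
    Designed for low, predictable P99 latencies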
    Linear scalability with consistent performance

    System Requirements

    ⚠️ Important: ScyllaDB is designed to maximize hardware utilization. NVMe SSDs are strongly recommended for production workloads due to ScyllaDB's high I/O requirements.

    Resource Requirements

    Workload              RAM      CPU         Storage        RamNode Plan
    Development/Testing   4 GB     2 vCPUs     40 GB SSD      Premium 4GB
    Light Production      8 GB     4 vCPUs     80 GB NVMe     Premium 8GB
    Standard Production   16 GB    8 vCPUs     160 GB NVMe    Premium 16GB
    High Performance      32 GB+   16+ vCPUs   500 GB+ NVMe   Premium 32GB+

    Network Ports

    Port    Service          Description
    9042    CQL              Native CQL client connections
    9142    CQL SSL          Encrypted CQL connections
    7000    Inter-node       Cluster communication
    7001    Inter-node SSL   Encrypted cluster communication
    7199    JMX              Monitoring and management
    10000   REST API         ScyllaDB REST API
    9180    Prometheus       Metrics endpoint

    Development

    • 4GB RAM, 2 vCPU
    • 40GB SSD storage
    • Single-node setup

    Production

    • 16GB+ RAM, 8+ vCPU
    • 160GB+ NVMe storage
    • Multi-node cluster
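
Before installing, it can help to check a VPS against the development minimums listed above. The following is a minimal sketch, assuming a Linux host that provides /proc/meminfo and nproc. The function names meets_minimum and preflight are illustrative and not part of ScyllaDB.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare RAM (GB) and vCPU count against minimums.
# Usage: meets_minimum <ram_gb> <vcpus> <min_ram_gb> <min_vcpus>
meets_minimum() {
  local ram_gb=$1 vcpus=$2 min_ram=$3 min_cpus=$4
  [ "$ram_gb" -ge "$min_ram" ] && [ "$vcpus" -ge "$min_cpus" ]
}

# Read this host's resources and check them against the 4 GB / 2 vCPU
# development minimum from the requirements above.
preflight() {
  local mem_kb ram_gb vcpus
  mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  # Round to the nearest GB: a "4 GB" VPS usually reports slightly less.
  ram_gb=$(( (mem_kb + 524288) / 1048576 ))
  vcpus=$(nproc)
  if meets_minimum "$ram_gb" "$vcpus" 4 2; then
    echo "OK: ${ram_gb} GB RAM, ${vcpus} vCPUs"
  else
    echo "Below development minimum: ${ram_gb} GB RAM, ${vcpus} vCPUs"
    return 1
  fi
}
```

Running preflight on the target VPS prints a one-line verdict. For a production check, the last two arguments to meets_minimum could be changed to 16 and 8.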

    Installation

    Update System Packages
    sudo apt update && sudo apt upgrade -y
    Install Prerequisites
    sudo apt install -y apt-transport-https wget gnupg2 curl
    sudo apt install -y openjdk-11-jre-headless

    Add ScyllaDB Repository

    Add ScyllaDB APT Repository
    # Import the ScyllaDB GPG key
    sudo mkdir -p /etc/apt/keyrings
    sudo gpg --homedir /tmp --no-default-keyring --keyring /etc/apt/keyrings/scylladb.gpg \
      --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys d0a112e067426ab2
    
    # Add the repository (Ubuntu 22.04)
    sudo wget -O /etc/apt/sources.list.d/scylla.list \
      http://downloads.scylladb.com/deb/debian/scylla-5.4.list
    Install ScyllaDB
    sudo apt update
    sudo apt install -y scylla

    Run ScyllaDB Setup

    ScyllaDB includes an automatic setup script that optimizes your system:

    Run Setup Wizard
    sudo scylla_setup
    
    # The wizard will configure:
    # - NTP synchronization
    # - RAID setup (if applicable)
    # - Filesystem optimization (XFS recommended)
    # - Network settings
    # - CPU pinning for optimal performance

    💡 Tip: For development, you can answer "no" to most optimization questions. For production, let the wizard optimize everything for maximum performance.

    Start ScyllaDB Service
    sudo systemctl start scylla-server
    sudo systemctl enable scylla-server
    
    # Check status
    sudo systemctl status scylla-server

    Configuration

    Edit the main configuration file at /etc/scylla/scylla.yaml:

    Open Configuration File
    sudo nano /etc/scylla/scylla.yaml

    Essential Configuration Options

    Parameter         Description                      Example
    cluster_name      Unique cluster identifier        'RamNodeScylla'
    listen_address    Node IP for inter-node traffic   Your VPS private IP
    rpc_address       IP for client connections        0.0.0.0
    seeds             Seed nodes for discovery         IP of seed nodes
    endpoint_snitch   Topology awareness               GossipingPropertyFileSnitch

    Single-Node Configuration

    Development Setup (scylla.yaml)
    cluster_name: 'RamNodeDev'
    num_tokens: 256
    listen_address: localhost
    rpc_address: localhost
    seed_provider:
      - class_name: org.apache.cassandra.locator.SimpleSeedProvider
        parameters:
          - seeds: "127.0.0.1"
    
    # Development-only settings
    developer_mode: true  # Relaxes environment checks; disable for production
    experimental: true    # Enables experimental features; avoid in production

    Production Configuration

    Production Setup (scylla.yaml)
    cluster_name: 'RamNodeProd'
    num_tokens: 256
    listen_address: YOUR_PRIVATE_IP
    rpc_address: 0.0.0.0
    broadcast_rpc_address: YOUR_PUBLIC_IP
    seed_provider:
      - class_name: org.apache.cassandra.locator.SimpleSeedProvider
        parameters:
          - seeds: "SEED_NODE_IP"
    
    endpoint_snitch: GossipingPropertyFileSnitch
    
    # Disable developer mode for production
    developer_mode: false
    
    # Enable authentication
    authenticator: PasswordAuthenticator
    authorizer: CassandraAuthorizer
    Apply Configuration Changes
    sudo systemctl restart scylla-server
    
    # Wait for startup, then verify
    sleep 30 && nodetool status
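
A fixed sleep may be too short or needlessly long depending on the host. As a sketch, the loop below polls the CQL port until it accepts connections or a timeout expires. It assumes bash (for /dev/tcp) and the timeout utility from coreutils. The name wait_for_port is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait until host:port accepts TCP connections.
# Usage: wait_for_port <host> <port> <timeout_seconds>
wait_for_port() {
  local host=$1 port=$2 deadline=$(( SECONDS + $3 ))
  while [ "$SECONDS" -lt "$deadline" ]; do
    # bash opens a TCP connection when redirecting to /dev/tcp/<host>/<port>;
    # `timeout 2` guards against connection attempts that hang.
    if timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}
```

One possible use after a restart is: wait_for_port 127.0.0.1 9042 120 && nodetool status. The address should match the rpc_address configured for the node.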

    Security Hardening

    Enable Authentication

    After enabling PasswordAuthenticator in scylla.yaml:

    Create Admin User
    # Shell: restart ScyllaDB after the config change
    sudo systemctl restart scylla-server
    
    # Shell: log in with the default credentials
    cqlsh localhost -u cassandra -p cassandra
    
    -- cqlsh: create a new superuser
    CREATE ROLE admin WITH PASSWORD = 'YourSecurePassword123!'
      AND SUPERUSER = true AND LOGIN = true;
    
    -- cqlsh: exit the session
    EXIT;
    
    # Shell: log in as the new user
    cqlsh localhost -u admin -p 'YourSecurePassword123!'
    
    -- cqlsh: change the default cassandra role's password and remove its superuser rights
    ALTER ROLE cassandra WITH PASSWORD = 'RandomLongString123!'
      AND SUPERUSER = false;
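
Placeholder passwords such as the ones above should be replaced with generated values. One possible way, assuming /dev/urandom and the coreutils head, base64 and tr tools are available (the function name gen_password is illustrative):

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a random 32-character password.
gen_password() {
  # 24 random bytes encode to exactly 32 base64 characters (no padding);
  # '+' and '/' are mapped to '-' and '_' so the value contains no quotes
  # or other characters that need escaping in a CQL string literal.
  head -c 24 /dev/urandom | base64 | tr '+/' '-_'
}
```

The output can be pasted between the single quotes of the CREATE ROLE or ALTER ROLE statement and stored in a password manager.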

    Firewall Configuration

    Configure UFW Firewall
    # Allow SSH first so that enabling the firewall does not lock you out
    sudo ufw allow 22/tcp
    
    # Enable UFW
    sudo ufw enable
    
    # CQL native transport (restrict to app servers)
    sudo ufw allow from YOUR_APP_IP to any port 9042
    
    # Inter-node communication (for clusters)
    sudo ufw allow from CLUSTER_SUBNET to any port 7000
    sudo ufw allow from CLUSTER_SUBNET to any port 7001
    
    # Prometheus metrics (restrict to monitoring server)
    sudo ufw allow from MONITORING_IP to any port 9180
    
    # Verify rules
    sudo ufw status verbose
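
When cluster nodes do not share a single subnet, the inter-node rules can be generated per peer instead. The sketch below only prints the ufw commands so they can be reviewed before running. The function name print_internode_rules and any IPs passed to it are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) ufw rules that open the
# inter-node ports 7000 and 7001 to each peer IP given as an argument.
print_internode_rules() {
  local ip port
  for ip in "$@"; do
    for port in 7000 7001; do
      echo "sudo ufw allow from ${ip} to any port ${port} proto tcp"
    done
  done
}
```

For example, print_internode_rules 10.0.0.2 10.0.0.3 prints four commands, which can be piped to sh once they look right.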

    Enable TLS Encryption

    Generate SSL Certificates
    # Create certificate directory
    sudo mkdir -p /etc/scylla/certs
    
    # Generate self-signed certificate (for testing)
    sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
      -keyout /etc/scylla/certs/scylla.key \
      -out /etc/scylla/certs/scylla.crt \
      -subj "/CN=scylla.yourdomain.com"
    
    # Set proper permissions
    sudo chown scylla:scylla /etc/scylla/certs/*
    sudo chmod 600 /etc/scylla/certs/scylla.key
    TLS Configuration (scylla.yaml)
    client_encryption_options:
      enabled: true
      certificate: /etc/scylla/certs/scylla.crt
      keyfile: /etc/scylla/certs/scylla.key
      require_client_auth: false
    
    server_encryption_options:
      internode_encryption: all
      certificate: /etc/scylla/certs/scylla.crt
      keyfile: /etc/scylla/certs/scylla.key

    💡 Production Tip: Use certificates from a trusted Certificate Authority for production deployments.


    Multi-Node Cluster Setup

    For high availability and performance, deploy ScyllaDB across multiple VPS nodes.

    Configure Seed Nodes

    On all nodes, update scylla.yaml with the seed node addresses:

    Cluster Configuration (scylla.yaml)
    cluster_name: 'RamNodeCluster'
    listen_address: THIS_NODE_PRIVATE_IP
    rpc_address: 0.0.0.0
    broadcast_rpc_address: THIS_NODE_PUBLIC_IP
    
    seed_provider:
      - class_name: org.apache.cassandra.locator.SimpleSeedProvider
        parameters:
          - seeds: "SEED_NODE_1_IP,SEED_NODE_2_IP"
    
    endpoint_snitch: GossipingPropertyFileSnitch
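
Because only the addresses differ between nodes, the cluster settings above can be rendered from a small template. This is a sketch, not an official tool: render_node_config is a hypothetical name, and its output is a fragment to merge into /etc/scylla/scylla.yaml by hand rather than a complete file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the per-node cluster settings shown above.
# Usage: render_node_config <private_ip> <public_ip> <comma_separated_seeds>
render_node_config() {
  local private_ip=$1 public_ip=$2 seeds=$3
  cat <<EOF
cluster_name: 'RamNodeCluster'
listen_address: ${private_ip}
rpc_address: 0.0.0.0
broadcast_rpc_address: ${public_ip}

seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "${seeds}"

endpoint_snitch: GossipingPropertyFileSnitch
EOF
}
```

Calling the function once per node with that node's private and public IPs and the shared seed list keeps cluster_name and the seeds identical everywhere, which is what joining nodes depend on.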
    Configure Datacenter (cassandra-rackdc.properties)
    sudo nano /etc/scylla/cassandra-rackdc.properties
    
    # Add:
    dc=ramnode-dc1
    rack=rack1
    Start Nodes and Verify Cluster
    # Start ScyllaDB on the seed nodes first, then on the remaining nodes one at a time
    sudo systemctl start scylla-server
    
    # Check cluster status from any node; healthy nodes are listed as UN (Up/Normal)
    nodetool status
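
To check the result in a script rather than by eye, the status output can be reduced to a count of healthy nodes. A minimal sketch, assuming the usual nodetool status layout in which each node line begins with a two-letter state such as UN or DN. The name count_up_normal is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count nodes reported as Up/Normal ("UN") in
# `nodetool status` output read from stdin.
count_up_normal() {
  awk '$1 == "UN" { n++ } END { print n + 0 }'
}
```

For a three-node cluster, nodetool status | count_up_normal should print 3 once every node has joined.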

    Recommended Cluster Topology

    Nodes      Replication Factor   Use Case
    3 nodes    RF=3                 Standard production
    5 nodes    RF=3                 High availability
    6+ nodes   RF=3                 High throughput
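
The table uses RF=3 throughout. One reason is quorum arithmetic: at QUORUM consistency a majority of replicas must respond, so RF=3 keeps working with one replica down. The sketch below shows the calculation. The function names are illustrative, and QUORUM is assumed as the consistency level, which this guide does not otherwise configure.

```shell
#!/usr/bin/env bash
# Quorum size for a replication factor: floor(RF / 2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Replicas that can be unavailable while QUORUM operations still succeed.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}
```

With RF=3, quorum is 2 and one replica may be down. Adding nodes beyond three at RF=3 increases capacity and spreads load, but does not change that tolerance for any single piece of data.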

    Basic Operations

    Connecting with CQL Shell

    Connect to ScyllaDB
    # Local connection
    cqlsh localhost
    
    # Authenticated connection
    cqlsh localhost -u admin -p 'YourPassword'
    
    # Remote connection
    cqlsh YOUR_VPS_IP 9042 -u admin -p 'YourPassword'

    Creating a Keyspace

    Create Keyspace and Table
    -- Create keyspace with replication
    CREATE KEYSPACE myapp WITH replication = {
      'class': 'NetworkTopologyStrategy',
      'ramnode-dc1': 3
    };
    
    -- Use the keyspace
    USE myapp;
    
    -- Create a table
    CREATE TABLE users (
      user_id UUID PRIMARY KEY,
      email TEXT,
      name TEXT,
      created_at TIMESTAMP
    );
    
    -- Insert data
    INSERT INTO users (user_id, email, name, created_at) 
    VALUES (uuid(), 'user@example.com', 'John Doe', toTimestamp(now()));
    
    -- Query data
    SELECT * FROM users;

    Cluster Management Commands

    Nodetool Commands
    # Check cluster status
    nodetool status
    
    # View node information
    nodetool info
    
    # Check compaction status
    nodetool compactionstats
    
    # Repair a node
    nodetool repair
    
    # Decommission a node
    nodetool decommission

    Backup and Recovery

    Snapshot Backup

    Create Snapshot
    # Create snapshot of all keyspaces
    nodetool snapshot
    
    # Create snapshot of specific keyspace
    nodetool snapshot -t backup_$(date +%Y%m%d) myapp
    
    # Find snapshot location
    ls -la /var/lib/scylla/data/myapp/*/snapshots/

    Automated Backup Script

    /opt/scripts/scylla-backup.sh
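    #!/bin/bash
    # Snapshot the listed keyspaces and copy the snapshot files to BACKUP_DIR.
    # Stop on the first error so a failed snapshot is never pruned or logged as complete.
    set -euo pipefail
    
    DATE=$(date +%Y%m%d_%H%M%S)
    BACKUP_DIR="/backup/scylla"
    KEYSPACES="myapp"   # Space-separated list of keyspaces
    LOG_FILE="/var/log/scylla-backup.log"
    
    echo "[$DATE] Starting backup..." >> "$LOG_FILE"
    
    # Create snapshot ($KEYSPACES is intentionally unquoted so each name is a separate argument)
    nodetool snapshot -t "backup_$DATE" $KEYSPACES
    
    # Copy snapshots to backup directory
    for ks in $KEYSPACES; do
      for table_dir in /var/lib/scylla/data/"$ks"/*/; do
        table=$(basename "$table_dir")
        snapshot_dir="${table_dir}snapshots/backup_$DATE"
        if [ -d "$snapshot_dir" ]; then
          mkdir -p "$BACKUP_DIR/$DATE/$ks/$table"
          # "/." copies the directory contents and also works when the snapshot is empty
          cp -r "$snapshot_dir"/. "$BACKUP_DIR/$DATE/$ks/$table/"
        fi
      done
    done
    
    # Remove the on-disk snapshot now that it has been copied
    nodetool clearsnapshot -t "backup_$DATE"
    
    # Remove backup runs older than 7 days (top-level run directories only)
    find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +
    
    echo "[$DATE] Backup completed" >> "$LOG_FILE"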
    Schedule Daily Backup
    sudo chmod +x /opt/scripts/scylla-backup.sh
    sudo crontab -e
    
    # Add this line (runs daily at 3 AM):
    0 3 * * * /opt/scripts/scylla-backup.sh
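
The retention step of the backup script deletes directories with find, so a mistake there removes data. It can be worth exercising the same logic on a scratch directory first. The sketch below assumes GNU find. The name prune_backups is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete top-level backup directories older than N days.
# Usage: prune_backups <backup_dir> <days>
prune_backups() {
  local dir=$1 days=$2
  # -mindepth 1 / -maxdepth 1 limit matches to the per-run directories
  # directly under <backup_dir>, so the root itself and the nested
  # keyspace/table folders are never matched on their own.
  find "$dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -exec rm -rf {} +
}
```

A quick trial is to create two directories under a temporary folder, age one of them with touch -d '10 days ago', and confirm that prune_backups with a 7-day limit removes only the aged one.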

    Restore from Snapshot

    Restore Snapshot
    # Stop ScyllaDB
    sudo systemctl stop scylla-server
    
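    # Clear existing data and the commitlog (so commitlog replay does not overwrite the restored data)
    sudo rm -rf /var/lib/scylla/data/myapp/*
    sudo rm -rf /var/lib/scylla/commitlog/*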
    
    # Copy snapshot data back
    sudo cp -r /backup/scylla/YYYYMMDD_HHMMSS/myapp/* /var/lib/scylla/data/myapp/
    
    # Fix ownership
    sudo chown -R scylla:scylla /var/lib/scylla/data/
    
    # Start ScyllaDB
    sudo systemctl start scylla-server
    
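    # Rebuild secondary indexes if needed
    # (users_email_idx is an example index name; this guide does not create one, so substitute your own)
    nodetool rebuild_index myapp users users_email_idx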

    Monitoring

    Built-in Prometheus Metrics

    ScyllaDB exposes Prometheus metrics on port 9180:

    Prometheus Scrape Configuration
    scrape_configs:
      - job_name: 'scylla'
        static_configs:
          - targets: ['localhost:9180']
        honor_labels: true
        metrics_path: /metrics

    Key Metrics to Monitor

    • scylla_storage_proxy_coordinator_read_latency - Read latency
    • scylla_storage_proxy_coordinator_write_latency - Write latency
    • scylla_reactor_utilization - CPU utilization per shard
    • scylla_memory_allocated_memory - Memory usage
    • scylla_compaction_manager_compactions - Compaction activity
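
To look at one of these metrics without a Prometheus server, the endpoint's text output can be filtered directly. A minimal sketch, assuming the standard Prometheus text exposition format. The name metric_samples is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the samples of one metric from
# Prometheus text-format input on stdin, skipping "# HELP" and
# "# TYPE" comment lines.
# Usage: metric_samples <metric_name>
metric_samples() {
  local name=$1
  awk -v name="$name" '
    /^#/ { next }
    # A sample line starts with the metric name followed by "{" or a space.
    index($0, name "{") == 1 || index($0, name " ") == 1 { print }
  '
}
```

For example, curl -s http://localhost:9180/metrics | metric_samples scylla_reactor_utilization lists the per-shard utilization samples.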

    ScyllaDB Monitoring Stack

    ScyllaDB provides a pre-built monitoring stack:

    Deploy ScyllaDB Monitoring
    # Clone the monitoring stack
    git clone https://github.com/scylladb/scylla-monitoring.git
    cd scylla-monitoring
    
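    # List your node addresses in prometheus/scylla_servers.yml, then start the stack
    # (includes Grafana dashboards; requires Docker). The -s option takes the path
    # to that targets file rather than an IP address.
    ./start-all.sh -s prometheus/scylla_servers.yml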
    
    # Access Grafana at http://localhost:3000
    # Default credentials: admin/admin

    Troubleshooting

    Service Won't Start

    Check Logs and Status
    # Check service status
    sudo systemctl status scylla-server
    
    # View detailed logs
    sudo journalctl -u scylla-server -n 100 --no-pager
    
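    # Check the ScyllaDB log file (present only if file logging is configured;
    # on systemd hosts the journalctl command above is the primary source)
    sudo tail -100 /var/log/scylla/scylla.log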

    Node Not Joining Cluster

    Verify Network Connectivity
    # Test connectivity to seed nodes
    nc -zv SEED_NODE_IP 7000
    nc -zv SEED_NODE_IP 9042
    
    # Check gossip status
    nodetool gossipinfo
    
    # Verify cluster name matches on all nodes
    grep cluster_name /etc/scylla/scylla.yaml

    High Latency Issues

    Diagnose Performance
    # Check compaction status
    nodetool compactionstats
    
    # View thread pool status
    nodetool tpstats
    
    # Check network and streaming activity
    nodetool netstats
    
    # View per-shard CPU utilization (from the Prometheus metrics endpoint)
    curl -s http://localhost:9180/metrics | grep scylla_reactor_utilization

    Common Commands Reference

    Command            Purpose
    nodetool status    Cluster status overview
    nodetool info      Node information
    nodetool ring      Token ring information
    nodetool repair    Anti-entropy repair
    nodetool cleanup   Remove unwanted data
    nodetool flush     Flush memtables to disk

    Deployment Complete!

    You've successfully deployed ScyllaDB on your RamNode VPS. With its shard-per-core architecture and automatic tuning, ScyllaDB delivers consistent low-latency performance at scale.

    Additional Resources