
    Auto-Scaling & Load Balancing


    Scale your applications automatically to handle traffic spikes and distribute load across multiple instances for high availability.

    Understanding Scaling

    Vertical Scaling (Scale Up)

    Add more resources to existing instance:

    • More CPU cores
    • More RAM
    • Larger disk

    Pro: Simple to implement

    Con: Limited by instance size, requires downtime
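
    On an OpenStack-based platform, vertical scaling is typically a resize to a larger flavor. A sketch (server and flavor names are placeholders; older clients use --confirm instead of the confirm subcommand):

```shell
# Resize the instance to a larger flavor (the instance reboots during resize)
openstack server resize --flavor m1.large web-server-1

# After verifying the instance runs correctly on the new flavor, confirm
openstack server resize confirm web-server-1
```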

    Horizontal Scaling (Scale Out)

    Add more instances:

    • Multiple identical instances
    • Load balancer distributes traffic
    • Better redundancy

    Pro: Nearly unlimited scaling, no downtime

    Con: More complex architecture

    Load Balancer Setup

    Our cloud platform includes Cloud Load Balancers for distributing traffic across multiple instances.

    Creating a Load Balancer

    1. Log into the Cloud Control Panel
    2. Navigate to Network → Load Balancers
    3. Click Create Load Balancer
    4. Configure:
      • Name and description
      • Region (must match your instances)
      • Protocol (HTTP, HTTPS, TCP)
      • Health check settings
    5. Add instances to the pool
    6. Click Create
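
    If your cloud exposes the Octavia load-balancer API, the same steps can be scripted from the CLI. A sketch with placeholder names, subnet, and addresses:

```shell
# Create the load balancer on a subnet reachable from your instances
openstack loadbalancer create --name web-lb --vip-subnet-id public-subnet

# The listener accepts client traffic; the pool holds the backend instances
openstack loadbalancer listener create --name web-listener \
  --protocol HTTP --protocol-port 80 web-lb
openstack loadbalancer pool create --name web-pool \
  --lb-algorithm ROUND_ROBIN --listener web-listener --protocol HTTP

# Health checks remove unhealthy members from rotation automatically
openstack loadbalancer healthmonitor create --delay 5 --timeout 3 \
  --max-retries 3 --type HTTP web-pool

# Register each instance by its private address
openstack loadbalancer member create --address 10.0.0.11 \
  --protocol-port 80 web-pool
```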

    Load Balancing Algorithms

    Round Robin

    Distributes evenly across all instances

    Least Connections

    Sends to instance with fewest connections

    Source IP Hash

    Same client to same instance
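
    The difference between these algorithms is easy to see in a toy simulation (plain bash; backend IPs and the client address are made up):

```shell
#!/bin/bash
backends=(10.0.0.11 10.0.0.12 10.0.0.13)
n=${#backends[@]}

# Round robin: request i goes to backend (i mod n), cycling evenly
for i in 0 1 2 3; do
    echo "round-robin request $i -> ${backends[$((i % n))]}"
done

# Source IP hash: hashing the client address always yields the same backend
client="203.0.113.42"
hash=$(echo -n "$client" | cksum | awk '{print $1}')
echo "ip-hash $client -> ${backends[$((hash % n))]}"
```

    Least connections needs live connection counts per backend, so it only makes sense inside a real load balancer; the two above are pure functions of the request sequence and client address.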

    Manual Horizontal Scaling

    While the platform does not include built-in auto-scaling, you can implement horizontal scaling manually:

    Step 1: Create a Golden Image

    1. Set up one instance with your application fully configured
    2. Create a snapshot of this instance
    3. This becomes your template for new instances
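
    Snapshotting can also be done from the CLI. A sketch (server and image names are placeholders):

```shell
# Stop the instance first for a consistent filesystem snapshot (optional but safer)
openstack server stop web-server-1

# Create the golden image that later instances will boot from
openstack server image create --name my-golden-image web-server-1

# Restart the original instance
openstack server start web-server-1
```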

    Step 2: Launch Additional Instances

    # Using OpenStack CLI
    openstack server create \
      --image my-golden-image \
      --flavor m1.small \
      --network public \
      web-server-2

    Step 3: Add to Load Balancer

    Add the new instance to your load balancer pool through the Cloud Control Panel or API.

    Implementing Auto-Scaling with Scripts

    Create your own auto-scaling solution using monitoring and the OpenStack API:

    #!/bin/bash
    # Simple auto-scaling script (illustrative sketch).
    # Assumes the OpenStack CLI is configured and the telemetry service
    # (Ceilometer/Gnocchi) is collecting the cpu_util metric; adapt the
    # metric lookup to your monitoring stack. Requires bc.
    
    LB_POOL_ID="your-load-balancer-pool-id"   # new instances must also be added to this pool
    IMAGE_ID="your-golden-image-id"
    FLAVOR_ID="m1.small"
    
    # Average cpu_util across all instances
    AVG_CPU=$(openstack server list -f value -c ID | while read -r id; do
        # Latest mean cpu_util measure for this server
        gnocchi measures show --resource-id "$id" --aggregation mean cpu_util -f value 2>/dev/null \
            | tail -1 | awk '{print $3}'
    done | awk '{sum+=$1; count++} END {if (count) print sum/count; else print 0}')
    
    # Scale up if average CPU > 80%
    if [ "$(echo "$AVG_CPU > 80" | bc)" -eq 1 ]; then
        echo "High CPU detected, scaling up..."
        openstack server create --image "$IMAGE_ID" --flavor "$FLAVOR_ID" \
            --network public "web-server-$(date +%s)"
    fi
    
    # Scale down if average CPU < 20% and more than 2 instances remain
    INSTANCE_COUNT=$(openstack server list --status ACTIVE -f value -c ID | wc -l)
    if [ "$(echo "$AVG_CPU < 20" | bc)" -eq 1 ] && [ "$INSTANCE_COUNT" -gt 2 ]; then
        echo "Low CPU detected, scaling down..."
        # Instance names embed a timestamp, so the lexically smallest is the oldest
        OLDEST=$(openstack server list --status ACTIVE -f value -c Name | grep '^web-server-' | sort | head -1)
        openstack server delete "$OLDEST"
    fi

    Run as Cron Job

    # Check every 5 minutes
    */5 * * * * /usr/local/bin/autoscale.sh >> /var/log/autoscale.log 2>&1

    Application-Level Considerations

    For horizontal scaling to work effectively, your application needs to be designed properly:

    Stateless Architecture

    Your application should not store session data locally on the instance.

    Solutions:

    • Use Redis or Memcached for session storage
    • Store sessions in a shared database
    • Use JWT tokens for stateless authentication
    • Enable sticky sessions on load balancer (less ideal)

    Shared Storage

    User-uploaded files and assets should be stored centrally:

    • Object storage (S3-compatible)
    • Shared NFS/GlusterFS volume
    • CDN for static assets
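
    As one example, a shared NFS volume can be mounted at the upload path on every web instance (hostname and paths below are placeholders):

```shell
# Mount the shared export where the application writes uploads
sudo mount -t nfs fileserver.internal:/exports/uploads /var/www/uploads

# Persist the mount across reboots; _netdev waits for networking
echo "fileserver.internal:/exports/uploads /var/www/uploads nfs defaults,_netdev 0 0" \
  | sudo tee -a /etc/fstab
```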

    Database Considerations

    Your database should be separate from your web servers:

    • Use a dedicated database instance
    • Consider read replicas for read-heavy workloads
    • Implement connection pooling
    • Use caching layers (Redis, Memcached)

    Third-Party Auto-Scaling Tools

    Several third-party tools can help implement auto-scaling:

    Kubernetes

    Deploy a Kubernetes cluster for container orchestration with built-in auto-scaling.

    • Horizontal Pod Autoscaler
    • Cluster Autoscaler
    • Vertical Pod Autoscaler
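
    For instance, a Horizontal Pod Autoscaler can be attached to an existing deployment with a single command (deployment name and thresholds are placeholders):

```shell
# Keep between 2 and 10 replicas, targeting 70% average CPU utilization
kubectl autoscale deployment web --min=2 --max=10 --cpu-percent=70

# Inspect the autoscaler's current state
kubectl get hpa web
```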

    Docker Swarm

    Lighter alternative to Kubernetes with basic scaling capabilities.

    • Service scaling
    • Rolling updates
    • Load balancing
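
    Swarm scaling is manual, but it is a single command (service name and replica count are placeholders):

```shell
# Run 5 replicas of the service; Swarm's ingress mesh load-balances across them
docker service scale web=5
```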

    Terraform

    Infrastructure as Code for managing instance lifecycle.

    • Declarative configuration
    • State management
    • OpenStack provider

    Ansible

    Automation and orchestration for scaling operations.

    • Playbook automation
    • Dynamic inventory
    • OpenStack modules

    Monitoring & Cost Management

    Key Metrics to Monitor

    CPU Usage

    Scale up when consistently above 70-80%

    Memory Usage

    Watch for memory pressure and swapping

    Request Rate

    Requests per second across all instances

    Response Time

    Average and 95th percentile latency
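
    A quick way to get a nearest-rank 95th-percentile figure from a batch of response times using only standard tools (the sample values are made up):

```shell
# Nearest-rank p95: sort ascending and take the value at position ceil(0.95 * N)
times="120 85 90 400 100 95 110 105 98 102"
p95=$(printf '%s\n' $times | sort -n \
    | awk '{a[NR]=$1} END {i=int(NR*0.95); if (i*100 < NR*95) i++; print a[i]}')
echo "p95=${p95}ms"
```

    With 10 samples this picks the 10th sorted value (400), which is exactly why the p95 is a better alarm signal than the average: one slow outlier shows up immediately.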

    Cost Considerations

    • Each additional instance adds to your hourly cost
    • Load balancers have their own pricing
    • Set maximum instance limits to control costs
    • Monitor spending in the Cloud Control Panel

    Best Practices

    • Start Simple - Begin with 2-3 instances before complex auto-scaling
    • Test Failover - Regularly test that load balancer handles instance failures
    • Gradual Scaling - Add/remove instances one at a time
    • Cool-down Periods - Wait before scaling again (5-10 minutes)
    • Plan for Peak - Have capacity ready before expected traffic spikes
    • Monitor Costs - Track spending as you scale

    Need help designing a scalable architecture? Contact our support team or check out our Professional Services for architecture consulting.