Scale your applications to handle traffic spikes efficiently
Scale your applications automatically to handle traffic spikes and distribute load across multiple instances for high availability.
Vertical scaling: add more resources (CPU, RAM) to an existing instance.
Pro: Simple to implement
Con: Limited by the largest available instance size; resizing usually requires downtime
Horizontal scaling: add more instances behind a load balancer.
Pro: Nearly unlimited scaling, no downtime
Con: More complex architecture
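For the vertical option, a sketch using the standard OpenStack CLI resize workflow (the flavor and server names are placeholders; the commands are printed as a dry run so you can review them before executing):

```shell
#!/bin/sh
# Vertical scaling sketch: move an instance to a larger flavor.
# "m1.large" and "web-server-1" are placeholder names.
# run() prints each command instead of executing it (dry run);
# remove the echo to perform the resize for real.
run() { echo "$@"; }

run openstack server resize --flavor m1.large web-server-1
# Once the instance has rebooted into the new flavor, confirm the resize:
run openstack server resize confirm web-server-1
```

The confirm step is what makes the resize permanent; until then OpenStack keeps the old instance around so the operation can be reverted.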
Our cloud platform includes Cloud Load Balancers for distributing traffic across multiple instances.
Round Robin: distributes requests evenly across all instances
Least Connections: sends new requests to the instance with the fewest active connections
Source IP Hash: routes the same client to the same instance (useful for session affinity)
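If you manage the load balancer through the OpenStack (Octavia) CLI rather than the Control Panel, the algorithm is chosen when the pool is created. A sketch, with placeholder names and a dry-run wrapper so nothing is executed as written:

```shell
#!/bin/sh
# Selecting a balancing algorithm with the OpenStack load balancer CLI.
# Valid --lb-algorithm values: ROUND_ROBIN, LEAST_CONNECTIONS, SOURCE_IP.
# "web-pool" and "my-listener" are placeholder names.
run() { echo "$@"; }   # dry run: print the command instead of executing it

run openstack loadbalancer pool create \
  --name web-pool \
  --lb-algorithm ROUND_ROBIN \
  --listener my-listener \
  --protocol HTTP
```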
While the platform does not include built-in auto-scaling, you can implement manual horizontal scaling:
# Using OpenStack CLI
openstack server create \
  --image my-golden-image \
  --flavor m1.small \
  --network public \
  web-server-2

Add the new instance to your load balancer pool through the Cloud Control Panel or API.
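Registering the new instance with the load balancer can also be scripted. A sketch assuming an Octavia-style CLI; the pool name and address are placeholders (look up the real fixed IP with `openstack server show web-server-2`), and the command is printed as a dry run:

```shell
#!/bin/sh
# Add the newly created instance to a load balancer pool.
# "web-pool" and 192.0.2.12 are placeholders for your pool and instance IP.
run() { echo "$@"; }   # dry run: print the command instead of executing it

run openstack loadbalancer member create \
  --name web-server-2 \
  --address 192.0.2.12 \
  --protocol-port 80 \
  web-pool
```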
Create your own auto-scaling solution using monitoring and the OpenStack API:
#!/bin/bash
# Simple auto-scaling script
LB_POOL_ID="your-load-balancer-pool-id"
IMAGE_ID="your-golden-image-id"
FLAVOR_ID="m1.small"

# Get average CPU usage across all active instances
# (cpu_util must be exposed by your telemetry; adjust the lookup if your
# deployment reports the metric elsewhere)
AVG_CPU=$(nova list | awk -F'|' '/ACTIVE/ {gsub(/ /, "", $2); print $2}' | while read -r id; do
    nova show "$id" | awk -F'|' '/cpu_util/ {gsub(/ /, "", $3); print $3}'
done | awk '{sum+=$1; count++} END {if (count) print sum/count; else print 0}')

# Scale up if average CPU > 80%
if [ "$(echo "$AVG_CPU > 80" | bc)" -eq 1 ]; then
    echo "High CPU detected, scaling up..."
    nova boot --image "$IMAGE_ID" --flavor "$FLAVOR_ID" "web-server-$(date +%s)"
    # Remember to add the new server to the $LB_POOL_ID pool as well
fi

# Scale down if average CPU < 20% and more than 2 instances are running
INSTANCE_COUNT=$(nova list | grep -c ACTIVE)
if [ "$(echo "$AVG_CPU < 20" | bc)" -eq 1 ] && [ "$INSTANCE_COUNT" -gt 2 ]; then
    echo "Low CPU detected, scaling down..."
    OLDEST=$(nova list --sort created_at:asc | awk -F'|' '/ACTIVE/ {gsub(/ /, "", $2); print $2; exit}')
    nova delete "$OLDEST"
fi

Schedule the script with cron:

# Check every 5 minutes
*/5 * * * * /usr/local/bin/autoscale.sh >> /var/log/autoscale.log 2>&1

For horizontal scaling to work effectively, your application needs to be designed properly:
Your application should not store session data locally on the instance.
Solutions: store sessions in a shared cache such as Redis or Memcached, in the database, or client-side in signed cookies.
User-uploaded files and assets should be stored centrally, for example on object storage or a shared network filesystem, so every instance serves the same content.
Your database should be separate from your web servers, running on its own instance, so web instances can be added and removed without affecting your data.
Several third-party tools can help implement auto-scaling:
Deploy a Kubernetes cluster for container orchestration with built-in auto-scaling.
Lighter alternative to Kubernetes with basic scaling capabilities.
Infrastructure as Code for managing instance lifecycle.
Automation and orchestration for scaling operations.
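As an illustration of the Kubernetes option, its built-in Horizontal Pod Autoscaler can be configured from the command line. A sketch with a placeholder deployment name, printed as a dry run:

```shell
#!/bin/sh
# Kubernetes Horizontal Pod Autoscaler sketch.
# "web" is a placeholder deployment name.
run() { echo "$@"; }   # dry run: print the command instead of executing it

# Keep between 2 and 10 replicas, targeting 80% average CPU utilisation
run kubectl autoscale deployment web --min=2 --max=10 --cpu-percent=80
```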
CPU Usage: scale up when consistently above 70-80%
Memory Usage: watch for memory pressure and swapping
Request Rate: requests per second across all instances
Response Time: average and 95th percentile latency
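The response-time metrics above can be computed from a plain file of per-request latencies. A minimal sketch in shell; the sample values are made up for illustration:

```shell
#!/bin/sh
# Compute average and 95th-percentile latency from one-value-per-line data.
# Sample latencies in milliseconds, made up for illustration:
printf '%s\n' 100 120 110 105 400 115 108 112 109 111 > /tmp/latencies.txt

sort -n /tmp/latencies.txt > /tmp/latencies.sorted
COUNT=$(wc -l < /tmp/latencies.sorted)
AVG=$(awk '{sum+=$1} END {printf "%.0f", sum/NR}' /tmp/latencies.sorted)
P95_LINE=$(( (COUNT * 95 + 99) / 100 ))   # ceiling of 0.95 * COUNT
P95=$(sed -n "${P95_LINE}p" /tmp/latencies.sorted)
echo "avg=${AVG}ms p95=${P95}ms"
```

Note how the single slow request (400 ms) barely moves the average but dominates the 95th percentile; that is why both are worth watching.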
Need help designing a scalable architecture? Contact our support team or check out our Professional Services for architecture consulting.