Module 10: Production Deployment

Learning Objectives

By the end of this module, you will be able to:

  • Select and configure a VPS host suitable for OpenClaw
  • Deploy using Docker or Podman containerization
  • Use Nix for reproducible build environments
  • Set up a systemd service to keep your Agent running
  • Configure comprehensive monitoring, logging, and backup strategies
  • Complete a full development-to-production deployment workflow

Core Concepts

Deployment Method Comparison

| Method | Pros | Cons | Best For |
| --- | --- | --- | --- |
| Direct install | Simple, low overhead | Environment pollution, hard to reproduce | Dev/test |
| Docker | Mature ecosystem, rich images | Requires root daemon | General deployment |
| Podman | Rootless, no daemon, secure | Smaller ecosystem | Security-sensitive environments |
| Nix | Fully reproducible, declarative | Steep learning curve | Advanced deployment, CI/CD |
| systemd | Native Linux service management | Linux only | Complement to any of the above |

Hardware Requirements

| Scale | CPU | RAM | Storage | Notes |
| --- | --- | --- | --- | --- |
| Minimum | 2 vCPU | 4 GB | 20 GB SSD | Single Agent, no browser |
| Recommended | 4 vCPU | 8 GB | 50 GB SSD | Single Agent with browser |
| Multi-Agent | 8 vCPU | 16 GB | 100 GB SSD | 2-3 Agents with browser |
| Enterprise | 16+ vCPU | 32+ GB | 200+ GB NVMe | Multi-Agent + monitoring + logging |

| Provider | Entry Plan | Monthly Cost (approx.) | Notes |
| --- | --- | --- | --- |
| Hetzner | CX22 (2vCPU/4GB) | ~$4.5 | European DCs, best value |
| DigitalOcean | Basic (2vCPU/4GB) | ~$24 | Simple and user-friendly |
| Linode/Akamai | Nanode (1vCPU/2GB) | ~$5 | Lowest entry point |
| Vultr | Cloud Compute | ~$6 | Many global locations |
| AWS EC2 | t3.medium | ~$30 | Enterprise needs |
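
Before installing anything, it can help to confirm that the VPS you were given actually matches the "Minimum" row of the hardware table. The following is a minimal sketch, not an official OpenClaw tool: `meets_minimum` and `host_specs` are hypothetical helper names, and the thresholds (2 vCPU, 4 GB, 20 GB) are taken from the table above.

```shell
#!/bin/sh
# meets_minimum CPUS RAM_GB DISK_GB
# Succeeds when all three values reach the "Minimum" row of the table
# (2 vCPU, 4 GB RAM, 20 GB storage).
meets_minimum() {
  [ "$1" -ge 2 ] && [ "$2" -ge 4 ] && [ "$3" -ge 20 ]
}

# host_specs prints "CPUS RAM_GB DISK_GB" for the current Linux machine.
# RAM and disk are rounded up to whole GiB, because a "4 GB" plan usually
# reports slightly less than 4 GiB of usable memory.
host_specs() {
  cpus=$(nproc)
  ram_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  disk_kb=$(df -Pk / | awk 'NR==2 {print $2}')
  echo "$cpus $(( (ram_kb + 1048575) / 1048576 )) $(( (disk_kb + 1048575) / 1048576 ))"
}
```

On the VPS, `meets_minimum $(host_specs) && echo "OK" || echo "Below minimum"` prints a one-line verdict.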

Implementation: VPS + Podman Deployment

Step 1: VPS Initial Setup

# Example using Hetzner CX22, connect to VPS
ssh root@YOUR_VPS_IP

# Create a non-root user
adduser openclaw
usermod -aG sudo openclaw

# Set up SSH key login (more secure)
mkdir -p /home/openclaw/.ssh
cp ~/.ssh/authorized_keys /home/openclaw/.ssh/
chown -R openclaw:openclaw /home/openclaw/.ssh
chmod 700 /home/openclaw/.ssh
chmod 600 /home/openclaw/.ssh/authorized_keys

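# Disable password login (first confirm key login works from a second terminal)
# The pattern also matches the commented-out default "#PasswordAuthentication yes"
sed -i -E 's/^#?[[:space:]]*PasswordAuthentication[[:space:]]+.*/PasswordAuthentication no/' /etc/ssh/sshd_config
# Validate the config before restarting; the unit is named "ssh" on Ubuntu/Debian
sshd -t && systemctl restart ssh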

# Configure firewall
ufw default deny incoming
ufw default allow outgoing
ufw allow ssh
ufw enable

# DO NOT open port 18789!
# Use an SSH tunnel for remote access instead

Never Expose Port 18789

Recall the lesson from Module 9 (Security): over 135,000 OpenClaw instances worldwide were compromised because port 18789 was exposed. Always use an SSH tunnel for remote access.

# Create an SSH tunnel from your local machine
ssh -L 18789:127.0.0.1:18789 openclaw@YOUR_VPS_IP
# Then access http://127.0.0.1:18789 locally
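
Once OpenClaw is running, you can verify on the VPS that port 18789 is bound to the loopback interface only. The sketch below parses the output of `ss -ltnH` (listening TCP sockets, numeric, no header); `loopback_only` is a hypothetical helper name, not part of OpenClaw.

```shell
#!/bin/sh
# loopback_only PORT
# Reads `ss -ltnH` output on stdin. Succeeds only if PORT has at least one
# listener and every listener on it is bound to 127.0.0.1 or [::1].
loopback_only() {
  awk -v port="$1" '
    {
      addr = $4                        # "Local Address:Port" column
      n = split(addr, parts, ":")
      if (parts[n] != port) next       # listener on some other port
      found = 1
      host = substr(addr, 1, length(addr) - length(port) - 1)
      if (host != "127.0.0.1" && host != "[::1]") exposed = 1
    }
    END { exit ((found && !exposed) ? 0 : 1) }
  '
}
```

Usage: `ss -ltnH | loopback_only 18789 && echo "loopback only" || echo "exposed or not listening"`.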

Step 2: Install Podman

# Switch to the openclaw user
su - openclaw

# Install Podman (Ubuntu/Debian)
sudo apt-get update
sudo apt-get install -y podman slirp4netns fuse-overlayfs

# Confirm Podman version
podman --version

# Confirm rootless mode is available
podman info | grep -i rootless

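# Configure subuid/subgid (required for rootless; skip if /etc/subuid already lists openclaw)
sudo usermod --add-subuids 100000-165535 openclaw
sudo usermod --add-subgids 100000-165535 openclaw

# Make Podman pick up the new ID ranges
podman system migrate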
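
To check that the subordinate ID ranges are in place, you can inspect `/etc/subuid` and `/etc/subgid`, whose lines have the form `name:start:count`. The range 100000-165535 used above corresponds to a count of 65536. The helper name `has_subid_range` in this sketch is hypothetical.

```shell
#!/bin/sh
# has_subid_range USER FILE
# Succeeds if FILE (/etc/subuid or /etc/subgid) has an entry for USER with
# at least 65536 subordinate IDs.
has_subid_range() {
  awk -F: -v user="$1" '
    $1 == user && $3 >= 65536 { ok = 1 }
    END { exit (ok ? 0 : 1) }
  ' "$2"
}
```

Usage: `has_subid_range openclaw /etc/subuid && has_subid_range openclaw /etc/subgid && echo "rootless ID ranges OK"`.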

Step 3: Prepare Configuration Files

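# Create directory structure
# data/logs holds openclaw.log (/data/logs inside the container); scripts is used in Steps 7 and 8
mkdir -p ~/openclaw/{config,data/logs,logs,scripts,skills,tls}

Create ~/openclaw/config/settings.json: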

{
  "server": {
    "host": "127.0.0.1",
    "port": 18789,
    "auth": {
      "enabled": true,
      "api_key": "${OPENCLAW_API_KEY}"
    }
  },
  "llm": {
    "provider": "openai",
    "model": "gpt-4o",
    "api_key": "${OPENAI_API_KEY}",
    "max_tokens": 4096,
    "temperature": 0.7
  },
  "channels": {
    "discord": {
      "enabled": true,
      "token": "${DISCORD_BOT_TOKEN}",
      "guild_id": "${DISCORD_GUILD_ID}"
    }
  },
  "browser": {
    "enabled": true,
    "headless": true,
    "launch_options": {
      "args": ["--no-sandbox", "--disable-dev-shm-usage"]
    }
  },
  "logging": {
    "level": "info",
    "file": "/data/logs/openclaw.log",
    "max_size_mb": 100,
    "max_files": 10,
    "rotation": "daily"
  },
  "data_dir": "/data"
}

Create an environment variables file:

# ~/openclaw/.env (ensure permissions are 600)
cat > ~/openclaw/.env << 'EOF'
OPENCLAW_API_KEY=your_api_key_here
OPENAI_API_KEY=sk-your-openai-key
DISCORD_BOT_TOKEN=your-discord-bot-token
DISCORD_GUILD_ID=your-guild-id
EOF

chmod 600 ~/openclaw/.env
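
A common cause of the container exiting at startup is a `${NAME}` placeholder in settings.json that has no matching entry in .env. The following sketch lists such placeholders. It assumes placeholders use upper-case names as in the example above; `missing_env_vars` is a hypothetical helper name.

```shell
#!/bin/sh
# missing_env_vars SETTINGS_JSON ENV_FILE
# Prints each ${NAME} placeholder used in SETTINGS_JSON that has no
# non-empty NAME=value line in ENV_FILE. Succeeds only if none is missing.
missing_env_vars() {
  settings=$1
  env_file=$2
  missing=0
  for name in $(grep -oE '\$\{[A-Z_][A-Z0-9_]*\}' "$settings" | tr -d '${}' | sort -u); do
    if ! grep -qE "^${name}=.+" "$env_file"; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: `missing_env_vars ~/openclaw/config/settings.json ~/openclaw/.env || echo "fill in the variables listed above"`.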

Step 4: Launch with Podman

# Pull the OpenClaw image
podman pull ghcr.io/openclaw/openclaw:latest

# Start the container
podman run -d \
  --name openclaw \
  --userns=keep-id \
  --security-opt=no-new-privileges \
  --cap-drop=ALL \
  --cap-add=NET_BIND_SERVICE \
  --cap-add=SYS_ADMIN \
  --read-only \
  --tmpfs /tmp:rw,size=200m \
  --tmpfs /run:rw,size=50m \
  -p 127.0.0.1:18789:18789 \
  -v ~/openclaw/config/settings.json:/app/settings.json:ro,Z \
  -v ~/openclaw/config/soul.md:/app/soul.md:ro,Z \
  -v ~/openclaw/data:/data:Z \
  -v ~/openclaw/skills:/app/skills:ro,Z \
  --env-file ~/openclaw/.env \
  --memory=4g \
  --cpus=2 \
  --restart=unless-stopped \
  ghcr.io/openclaw/openclaw:latest

# Confirm container status
podman ps

# View logs
podman logs -f openclaw

SYS_ADMIN Capability

--cap-add=SYS_ADMIN is included so that headless Chromium can run inside the container. It is a very broad capability: if you don't need browser capabilities, remove it for stronger security.
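
One way to avoid granting SYS_ADMIN by accident is to derive the capability flags from a single switch. This is a minimal sketch under the assumption that you track the browser setting yourself (for example in a shell variable); `cap_flags` is a hypothetical helper name.

```shell
#!/bin/sh
# cap_flags BROWSER_ENABLED
# Prints the capability flags for `podman run`. SYS_ADMIN is added only
# when the argument is exactly "true".
cap_flags() {
  printf '%s' '--cap-drop=ALL --cap-add=NET_BIND_SERVICE'
  if [ "$1" = "true" ]; then
    printf ' %s' '--cap-add=SYS_ADMIN'
  fi
  printf '\n'
}
```

In the `podman run` command, the three capability lines can then be replaced by an unquoted `$(cap_flags false)` so that the shell splits the output into separate flags.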

Step 5: systemd User Service

Create a systemd service so the Podman container starts automatically on boot:

# Create systemd user service directory
mkdir -p ~/.config/systemd/user/

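# Auto-generate a systemd service from the Podman container
# (podman generate systemd is deprecated since Podman 4.4 in favor of Quadlet, but still works)
podman generate systemd --name openclaw --new --files
mv container-openclaw.service ~/.config/systemd/user/

# Or manually create the service file (use one approach, not both,
# because both units would manage a container named "openclaw")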
cat > ~/.config/systemd/user/openclaw.service << 'EOF'
[Unit]
Description=OpenClaw AI Agent
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
Restart=always
RestartSec=10
TimeoutStartSec=60
TimeoutStopSec=30

ExecStartPre=-/usr/bin/podman rm -f openclaw
ExecStart=/usr/bin/podman run \
  --name openclaw \
  --userns=keep-id \
  --security-opt=no-new-privileges \
  --cap-drop=ALL \
  --cap-add=NET_BIND_SERVICE \
  --cap-add=SYS_ADMIN \
  --read-only \
  --tmpfs /tmp:rw,size=200m \
  --tmpfs /run:rw,size=50m \
  -p 127.0.0.1:18789:18789 \
  -v %h/openclaw/config/settings.json:/app/settings.json:ro,Z \
  -v %h/openclaw/config/soul.md:/app/soul.md:ro,Z \
  -v %h/openclaw/data:/data:Z \
  -v %h/openclaw/skills:/app/skills:ro,Z \
  --env-file %h/openclaw/.env \
  --memory=4g \
  --cpus=2 \
  ghcr.io/openclaw/openclaw:latest

ExecStop=/usr/bin/podman stop -t 30 openclaw
ExecStopPost=-/usr/bin/podman rm -f openclaw

[Install]
WantedBy=default.target
EOF

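# Enable and start the service
# Run these from a real login session (e.g. ssh openclaw@YOUR_VPS_IP); after
# "su - openclaw", systemctl --user usually cannot reach the user manager.
# If you used the auto-generated unit, substitute container-openclaw.service.
systemctl --user daemon-reload
systemctl --user enable openclaw.service
systemctl --user start openclaw.service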

# Ensure the user's systemd services run even when not logged in
loginctl enable-linger openclaw

# Check status
systemctl --user status openclaw.service
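
Because a missing linger setting only shows up after the next reboot or logout, it is worth checking explicitly. `loginctl show-user openclaw --property=Linger` prints either `Linger=yes` or `Linger=no`; the small wrapper below (hypothetical name `linger_enabled`) turns that into an exit status.

```shell
#!/bin/sh
# linger_enabled
# Reads `loginctl show-user USER --property=Linger` output on stdin and
# succeeds only if it reports exactly "Linger=yes".
linger_enabled() {
  grep -qx 'Linger=yes'
}
```

Usage: `loginctl show-user openclaw --property=Linger | linger_enabled || echo "run: loginctl enable-linger openclaw"`.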

Step 6: Nix Reproducible Deployment (Advanced)

If you prefer Nix's declarative deployment:

# flake.nix
{
  description = "OpenClaw production deployment";

  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    openclaw.url = "github:openclaw/openclaw";
  };

  outputs = { self, nixpkgs, openclaw }: {
    nixosConfigurations.openclaw-server = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        ./hardware-configuration.nix
        openclaw.nixosModules.default
        ({ config, pkgs, ... }: {
          services.openclaw = {
            enable = true;
            settings = {
              server.host = "127.0.0.1";
              server.port = 18789;
              llm.provider = "openai";
              llm.model = "gpt-4o";
              browser.enabled = true;
              browser.headless = true;
            };
            environmentFile = "/run/secrets/openclaw.env";
          };

          # Firewall
          networking.firewall = {
            enable = true;
            allowedTCPPorts = [ 22 ];
            # Do NOT open 18789
          };

          # Auto-update: this must reference the flake that defines
          # nixosConfigurations.openclaw-server (your deployment repository),
          # not the upstream openclaw flake
          system.autoUpgrade = {
            enable = true;
            flake = "github:YOUR_ORG/YOUR_DEPLOY_REPO#openclaw-server";
            dates = "04:00";
          };
        })
      ];
    };
  };
}

Deploy:

# Build and deploy
nixos-rebuild switch --flake .#openclaw-server

# Or deploy remotely (--use-remote-sudo because openclaw is not root on the target)
nixos-rebuild switch --flake .#openclaw-server \
  --target-host openclaw@YOUR_VPS_IP \
  --use-remote-sudo

Step 7: Monitoring & Logging

Log aggregation:

# View systemd service logs with journalctl
journalctl --user -u openclaw.service -f

Metrics and dashboards with Prometheus + Grafana (Loki can be added to the same stack for log aggregation):

# monitoring/docker-compose.yml
# prometheus.yml must exist next to this file
version: '3'
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "127.0.0.1:9090:9090"

  grafana:
    image: grafana/grafana:latest
    ports:
      - "127.0.0.1:3000:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: "${GRAFANA_PASSWORD}"

Health check script:

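#!/bin/bash
# ~/openclaw/scripts/health-check.sh

# cron starts with an almost empty environment, so load the secrets file.
# DISCORD_WEBHOOK_URL must be added to ~/openclaw/.env for alerts to work.
set -a
. "$HOME/openclaw/.env"
set +a

# Lets systemctl --user reach the user service manager when run from cron
export XDG_RUNTIME_DIR="/run/user/$(id -u)"

OPENCLAW_URL="http://127.0.0.1:18789/api/health"
ALERT_WEBHOOK="${DISCORD_WEBHOOK_URL}"

# Prints 000 if the connection fails or times out
response=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 \
  -H "Authorization: Bearer ${OPENCLAW_API_KEY}" \
  "$OPENCLAW_URL")

if [ "$response" != "200" ]; then
  # Attempt restart
  systemctl --user restart openclaw.service

  # Send Discord notification
  curl -s -X POST "$ALERT_WEBHOOK" \
    -H "Content-Type: application/json" \
    -d "{\"content\": \"OpenClaw health check failed (HTTP $response). Restart attempted.\"}"
fi

# Add to crontab
chmod +x ~/openclaw/scripts/health-check.sh
crontab -e
# Add:
# */5 * * * * /home/openclaw/openclaw/scripts/health-check.sh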
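
The health check script restarts the service on the first failed probe, which can cause needless restarts during a brief hiccup. One possible refinement, sketched below, is to restart only after several consecutive failures. The helper `should_restart`, the state file path, and the default threshold of 3 are all assumptions for illustration.

```shell
#!/bin/sh
# should_restart STATE_FILE HTTP_CODE [THRESHOLD]
# Tracks consecutive failed checks in STATE_FILE. Succeeds (meaning
# "restart now") only when the failure count reaches THRESHOLD (default 3),
# and then resets the counter. A 200 response also resets the counter.
should_restart() {
  state=$1
  code=$2
  threshold=${3:-3}
  fails=0
  if [ "$code" = "200" ]; then
    echo 0 > "$state"
    return 1
  fi
  if [ -f "$state" ]; then
    fails=$(cat "$state")
  fi
  fails=$((fails + 1))
  if [ "$fails" -ge "$threshold" ]; then
    echo 0 > "$state"
    return 0
  fi
  echo "$fails" > "$state"
  return 1
}
```

In the script, the restart and notification would then run only when `should_restart "$HOME/openclaw/data/health.fails" "$response"` succeeds. With a 5-minute cron interval and a threshold of 3, a restart happens after roughly 15 minutes of continuous failure.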

Step 8: Backup Strategy

#!/bin/bash
# ~/openclaw/scripts/backup.sh

BACKUP_DIR="/home/openclaw/backups"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30

mkdir -p "$BACKUP_DIR"

# Backup data (excluding temp files and rotated logs)
tar -czf "$BACKUP_DIR/openclaw-data-$DATE.tar.gz" \
  -C /home/openclaw/openclaw \
  --exclude='*.tmp' \
  --exclude='logs/*.log.*' \
  data/ config/ skills/

# Backup soul.md
cp ~/openclaw/config/soul.md "$BACKUP_DIR/soul-$DATE.md"

# Clean old backups
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +$RETENTION_DAYS -delete
find "$BACKUP_DIR" -name "*.md" -mtime +$RETENTION_DAYS -delete

# Optional but recommended: upload to remote storage
# (backups kept only on the VPS do not survive a disk failure)
# rclone copy "$BACKUP_DIR/openclaw-data-$DATE.tar.gz" remote:openclaw-backups/

echo "Backup complete: openclaw-data-$DATE.tar.gz"

# Daily automatic backup
chmod +x ~/openclaw/scripts/backup.sh
crontab -e
# Add:
# 0 3 * * * /home/openclaw/openclaw/scripts/backup.sh >> /home/openclaw/openclaw/logs/backup.log 2>&1
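
A backup is only useful if it can be read back. The following sketch checks that an archive produced by the script is a valid gzip tarball and contains the three directories the script is meant to capture. `verify_backup` is a hypothetical helper name.

```shell
#!/bin/sh
# verify_backup ARCHIVE
# Succeeds if ARCHIVE can be listed by tar and contains the data/, config/
# and skills/ top-level directories.
verify_backup() {
  archive=$1
  listing=$(tar -tzf "$archive") || return 1
  for dir in data config skills; do
    echo "$listing" | grep -q "^${dir}/" || return 1
  done
}
```

Usage: `verify_backup "$BACKUP_DIR/openclaw-data-$DATE.tar.gz" || echo "Backup verification failed"`, for example as the last step of backup.sh before the upload.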

Common Errors

| Issue | Cause | Solution |
| --- | --- | --- |
| Container exits immediately after start | Environment variables not set | Verify .env file contents and path |
| Chromium won't start | Missing SYS_ADMIN capability | Add --cap-add=SYS_ADMIN |
| systemd service doesn't start on boot | Linger not enabled | Run loginctl enable-linger |
| Disk space exhausted | Logs not rotated | Configure max_size_mb and max_files |
| DNS resolution fails inside container | Podman network configuration issue | Add --dns=8.8.8.8 |
| SELinux blocks volume mounts | SELinux label mismatch | Use the :Z flag when mounting volumes |

Troubleshooting

# Container won't start -- view detailed logs
podman logs openclaw 2>&1 | tail -50

# Network issues
podman exec openclaw curl -s http://127.0.0.1:18789/api/health

# File permission issues
podman exec openclaw ls -la /data/
podman unshare ls -la ~/openclaw/data/

# systemd service failure
systemctl --user status openclaw.service
journalctl --user -u openclaw.service --no-pager -n 50

# Resource usage
podman stats openclaw --no-stream

Exercises

Exercise 1: Basic VPS Deployment

Deploy OpenClaw on a VPS using Podman, ensuring:

  • Listens only on 127.0.0.1
  • Managed by systemd
  • Health check script configured

Exercise 2: Full Monitoring

Add Prometheus + Grafana monitoring to your Exercise 1 deployment. Build dashboards for:

  • Agent response time
  • LLM API usage
  • Memory / CPU utilization
  • Error rate

Exercise 3: Disaster Recovery

Simulate the following scenarios and develop recovery plans:

  • VPS hard disk failure
  • LLM API Key leak
  • Agent injected with a malicious prompt

Quiz

  1. Why is loginctl enable-linger recommended?

    • A) Performance improvement
    • B) Keeps the user's systemd services running even when the user is not logged in
    • C) Enables root privileges
    • D) Auto-updates the system
    Answer: B) By default, a user's systemd services start at login and stop when the user's last session ends. enable-linger starts the user's service manager at boot and keeps it running without a login session.
  2. What is the purpose of Podman's --read-only combined with --tmpfs?

    • A) Improve disk performance
    • B) The read-only filesystem prevents malicious file writes, while tmpfs provides necessary temporary storage
    • C) Save disk space
    • D) Encrypt the filesystem
    Answer: B) --read-only ensures attackers cannot write backdoors or malicious programs inside the container. --tmpfs provides in-memory temporary storage for directories like /tmp that need write access.
  3. What does max_files: 10 combined with max_size_mb: 100 mean for log rotation?

    • A) A maximum of 10 files, each up to 100MB, totaling approximately 1GB of logs
    • B) 10MB of total logs
    • C) All logs kept forever
    • D) A new file every 10 minutes
    Answer: A) When a log file reaches 100MB it automatically rotates, with a maximum of 10 historical files retained, so logs occupy approximately 1GB of disk space at most.
  4. Which of the following is NOT an essential backup item?

    • A) data/ directory (Agent memory and data)
    • B) settings.json
    • C) soul.md
    • D) Podman image cache
    Answer: D) Podman images can be re-pulled from the registry at any time and don't need to be backed up. What matters is the Agent's data, configuration, and soul.md.

Next Steps