Operations Guide¶
Day-to-day operations, maintenance tasks, and best practices for running Tinkero.
Table of Contents¶
- Disk Space Management
- Build Management
- Site Management
- Backup and Restore
- Updating Tinkero
- Service Management
- Log Management
- Security Operations
Disk Space Management¶
Monitoring Disk Space¶
Via Central Grafana Dashboard¶
- Open the central Grafana instance
- Navigate to Dashboards > Tinkero Overview
- Check the Disk Usage panel
Via Command Line¶
# Overall disk usage
df -h /
# Docker disk usage
docker system df
# Tinkero sites disk usage
du -sh /srv/tinkero/sites/*
# Detailed breakdown per site
for site in /srv/tinkero/sites/*/; do
echo "=== $(basename $site) ==="
du -sh "$site"releases/* 2>/dev/null | sort -hr | head -5
done
Manual Cleanup with tinkero cleanup¶
The CLI provides an interactive cleanup tool to remove old releases.
Interactive Mode¶
Run tinkero cleanup and follow the prompts. Example output:
🧹 Tinkero Release Cleanup
ℹ️ Step 1/4: Selecting site...
📋 Available Sites:
[1] my-app
[2] another-site
[3] docs-site
Select site (1-3): 1
Direct Site Mode¶
A site can also be specified directly on the command line to skip the interactive selection; see the CLI Reference for the exact invocation.
Cleanup Workflow¶
1. View releases:
================================================================================
📦 Available Releases
================================================================================
 #   Release                    Modified              Size      Status
--------------------------------------------------------------------------------
 1   release-20240108-143022    2024-01-08 14:30:22   45.2 MiB  ✓ CURRENT
 2   release-20240107-091533    2024-01-07 09:15:33   43.8 MiB
 3   release-20240106-163045    2024-01-06 16:30:45   44.1 MiB
 4   release-20240105-120332    2024-01-05 12:03:32   42.7 MiB
2. Select releases to delete:
  - Enter specific numbers: 2,3,4
  - Delete all old releases: all
  - Cancel: press Enter
3. Confirm the deletion when prompted.
4. Review the summary of deleted releases.
Automated Cleanup¶
For automated cleanup, you can create a cron job:
# Edit crontab
crontab -e
# Add cleanup job (runs daily at 3 AM, keeps last 5 releases per site)
0 3 * * * /usr/local/bin/tinkero-auto-cleanup.sh
Example cleanup script:
#!/bin/bash
# /usr/local/bin/tinkero-auto-cleanup.sh
SITES_DIR="/srv/tinkero/sites"
KEEP_RELEASES=5
for site in "$SITES_DIR"/*/; do
site_name=$(basename "$site")
releases_dir="$site/releases"
if [[ -d "$releases_dir" ]]; then
# Get current release
current=$(readlink -f "$site/current" 2>/dev/null | xargs basename 2>/dev/null)
# List releases, exclude current, keep newest N
releases=$(ls -t "$releases_dir" | grep -v "^$current$" | tail -n +$((KEEP_RELEASES + 1)))
for release in $releases; do
echo "Deleting $site_name/$release"
rm -rf "$releases_dir/$release"
done
fi
done
Docker Cleanup¶
# Remove unused Docker resources
docker system prune -f
# Remove unused images
docker image prune -a -f
# Remove unused volumes (careful - may remove data)
docker volume prune -f
# Full cleanup (removes everything unused)
docker system prune -a --volumes -f
Build Management¶
Concurrent Build Limits¶
Tinkero supports concurrent builds with a default limit of 3 simultaneous builds. Additional builds are queued until a slot becomes available.
| Setting | Default Value | Description |
|---|---|---|
| MaxConcurrent | 3 | Maximum parallel builds |
| QueueSize | 100 | Maximum queued build requests |
Checking Build Queue¶
# View webhook handler logs
docker compose logs -f webhook-handler
# Check active and queued builds via Prometheus metrics
curl -s http://localhost:8080/metrics | grep -E "active_builds|queued_builds"
Build Queue Monitoring in Central Grafana¶
- Navigate to Dashboards > Tinkero Builds in the central Grafana instance
- Check the Active Builds and Queued Builds panels
- Set up alerts for queue backlog > 10
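Outside Grafana, a small script can poll the same metrics endpoint shown above and warn when the queue grows. This is a sketch only: the metric name is taken from the grep pattern in the earlier snippet and may differ in your setup, and the threshold mirrors the suggested alert value.
#!/bin/bash
# Sketch: warn when the build queue exceeds a threshold.
THRESHOLD=10
queued=$(curl -s http://localhost:8080/metrics | awk '/queued_builds/ && !/^#/ {print $2; exit}')
if [[ -n "$queued" ]] && (( ${queued%.*} > THRESHOLD )); then
  echo "WARNING: $queued builds queued (threshold: $THRESHOLD)"
fi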
Build Timeout Behavior¶
Each build has a single 15-minute timeout that covers the entire build process (dependency installation and build command execution). The timeout is applied to the Docker container running the build.
| Setting | Default | Description |
|---|---|---|
| Build Timeout | 15 minutes | Total time for install + build |
Note: There are no separate phase timeouts. The 15-minute limit applies to the combined installCommand && buildCommand execution.
Timeout Error Example¶
Error: Build timed out after 15 minutes
Hint: Check if your build requires more time or has infinite loops
Handling Long Builds¶
For projects that legitimately need longer build times:
- Optimize the build:
  - Use npm ci instead of npm install
  - Enable build caching in your framework
  - Remove unnecessary dependencies
  - Minimize the number of assets processed
- Use pre-built sites: build locally or in CI/CD and commit the built files (a sketch follows this list).
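For example, with an npm-based project the built output can be generated ahead of time and committed; the dist/ directory and npm scripts here are illustrative and depend on your framework.
# Build ahead of time and commit the output (assumes the build writes to ./dist
# and that dist/ is not listed in .gitignore)
npm ci
npm run build
git add dist/
git commit -m "Add pre-built site output"
git push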
Monitoring Active Builds¶
# View running containers
docker ps | grep -E "node|npm"
# View build logs
docker compose logs -f webhook-handler
# Check for stuck builds
docker ps --filter "status=running" --format "{{.Names}} {{.RunningFor}}"
Site Management¶
Adding New Sites¶
- Install the GitHub App on the repository (see GitHub App Guide)
- Add .tinkero.yml to the repository (a minimal sketch follows this list)
- Push to trigger the first deployment
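A minimal .tinkero.yml sketch; the field names (installCommand, buildCommand, path) are the ones referenced elsewhere in this guide, but check the configuration reference for the exact schema before relying on this:
# Hypothetical minimal configuration — verify field names against the
# configuration reference.
cat > .tinkero.yml <<'EOF'
installCommand: npm ci
buildCommand: npm run build
path: /my-app
EOF
git add .tinkero.yml && git commit -m "Add Tinkero config" && git push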
Viewing Deployed Sites¶
# List all sites
ls -la /srv/tinkero/sites/
# View site details
ls -la /srv/tinkero/sites/my-app/
# Check current release
readlink /srv/tinkero/sites/my-app/current
Site Directory Structure¶
/srv/tinkero/sites/
└── my-app/
    ├── current -> releases/release-20240108-143022/
    ├── releases/
    │   ├── release-20240108-143022/
    │   ├── release-20240107-091533/
    │   └── release-20240106-163045/
    └── error.html (generated on build failure)
Deleting a Site¶
Currently, site deletion is a manual process:
# 1. Stop serving the site (remove from Caddy)
docker exec caddy wget -qO- "http://localhost:2019/config/apps/http/servers/srv0/routes" | \
jq 'del(.[] | select(.match[].path[] | contains("/my-app")))'
# 2. Remove site files
sudo rm -rf /srv/tinkero/sites/my-app/
# 3. Remove from Redis
docker exec redis redis-cli DEL "site:my-app"
# 4. Reload Caddy
docker exec caddy caddy reload --config /etc/caddy/Caddyfile
Caution: This is destructive and cannot be undone. Always backup before deleting.
Changing Deployment Paths¶
If you need to change a site's URL path:
- Update .tinkero.yml in the repository (if using the path field)
- Delete the old site directory (see above)
- Push to trigger a new deployment with the new path
Backup and Restore¶
What to Backup¶
| Component | Location | Priority |
|---|---|---|
| Configuration | .env | Critical |
| GitHub App key | /srv/tinkero/github-app-key.pem | Critical |
| Traefik certs | ./data/acme.json | High |
| Site content | /srv/tinkero/sites/ | Medium |
| Redis data | Docker volume redis_data | Low |
Backup Commands¶
# Create backup directory
mkdir -p ~/tinkero-backup/$(date +%Y%m%d)
cd ~/tinkero-backup/$(date +%Y%m%d)
# Backup configuration
cp /path/to/tinkero/.env ./
cp /srv/tinkero/github-app-key.pem ./
# Backup Traefik certificates
cp /path/to/tinkero/data/acme.json ./
# Backup sites (may be large)
tar -czf sites.tar.gz /srv/tinkero/sites/
# Backup Redis data
docker exec redis redis-cli BGSAVE
sleep 5  # give the background save a moment to finish
docker cp redis:/data/dump.rdb ./redis-dump.rdb
Automated Backup Script¶
#!/bin/bash
# /usr/local/bin/tinkero-backup.sh
BACKUP_DIR="/backup/tinkero"
DATE=$(date +%Y%m%d-%H%M%S)
TINKERO_DIR="/path/to/tinkero"
# Create backup directory
mkdir -p "$BACKUP_DIR/$DATE"
cd "$BACKUP_DIR/$DATE"
# Backup configuration
cp "$TINKERO_DIR/.env" ./
cp /srv/tinkero/github-app-key.pem ./
# Backup certificates
cp "$TINKERO_DIR/data/acme.json" ./
# Backup Redis
docker exec redis redis-cli BGSAVE
sleep 5
docker cp redis:/data/dump.rdb ./redis-dump.rdb
# Create archive
cd "$BACKUP_DIR"
tar -czf "backup-$DATE.tar.gz" "$DATE"
rm -rf "$DATE"
# Keep only last 7 backups
ls -t "$BACKUP_DIR"/backup-*.tar.gz | tail -n +8 | xargs -r rm
echo "Backup completed: $BACKUP_DIR/backup-$DATE.tar.gz"
Restore Procedure¶
# 1. Stop services
cd /path/to/tinkero
docker compose down
# 2. Extract backup
tar -xzf backup-20240108-120000.tar.gz
cd 20240108-120000/
# 3. Restore configuration
cp .env /path/to/tinkero/
cp github-app-key.pem /srv/tinkero/
# 4. Restore certificates
cp acme.json /path/to/tinkero/data/
chmod 600 /path/to/tinkero/data/acme.json
# 5. Restore sites (if backed up)
tar -xzf sites.tar.gz -C /
# 6. Start services
docker compose up -d
# 7. Restore Redis (stop it first so the restored dump is not overwritten on shutdown)
docker compose stop redis
docker cp redis-dump.rdb redis:/data/dump.rdb
docker compose start redis
# 8. Verify
tinkero health
Updating Tinkero¶
Updating the CLI¶
# Download latest binary
wget https://github.com/yourusername/tinkero/releases/latest/download/tinkero-linux-amd64
chmod +x tinkero-linux-amd64
sudo mv tinkero-linux-amd64 /usr/local/bin/tinkero
# Verify version
tinkero --version
Updating Services¶
# Navigate to Tinkero directory
cd /path/to/tinkero
# Pull latest changes
git pull origin main
# Pull latest Docker images
docker compose pull
# Rebuild custom images (webhook-handler)
docker compose build --no-cache webhook-handler
# Restart services with new images
docker compose up -d
# Verify services
tinkero health
Rolling Update (Zero Downtime)¶
# Update one service at a time
docker compose up -d --no-deps --build webhook-handler
docker compose up -d --no-deps caddy
Rollback¶
If an update causes issues:
# Rollback to previous commit
git checkout HEAD~1
# Rebuild and restart
docker compose up -d --build
# If images were pulled, pin the previous image tags in docker-compose.yml, then
docker compose pull
docker compose up -d
Service Management¶
Viewing Service Status¶
# Quick health check
tinkero health
# Docker Compose status
docker compose ps
# Detailed container info
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
Restarting Services¶
# Restart all services
docker compose restart
# Restart specific service
docker compose restart webhook-handler
# Full restart (down + up)
docker compose down && docker compose up -d
Viewing Logs¶
# All services
docker compose logs -f
# Specific service
docker compose logs -f webhook-handler
# With timestamps
docker compose logs -f --timestamps webhook-handler
# Last N lines
docker compose logs --tail 100 webhook-handler
Scaling Services¶
Currently, Tinkero runs a single instance of each service. Scaling a service out is possible with Docker Compose, but it requires additional configuration for load balancing across the replicas.
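As a sketch only: Docker Compose can start multiple replicas of a stateless service with the --scale flag, assuming the service has no fixed container_name or host-port binding and that a reverse proxy (e.g. Traefik) balances traffic between the replicas.
# Start three replicas of the webhook handler (illustrative; remove any fixed
# container_name or host-port mapping for the service before scaling)
docker compose up -d --scale webhook-handler=3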
Log Management¶
Log Locations¶
| Service | Log Access |
|---|---|
| All services | docker compose logs |
| Traefik | docker compose logs traefik |
| webhook-handler | docker compose logs webhook-handler |
| Caddy | docker compose logs caddy |
Log Retention¶
Logs are managed by Docker's logging driver; with the default json-file driver and no limits configured, retention is unlimited.
Configure log rotation per service in docker-compose.yml by setting the json-file driver's max-size and max-file options.
Querying Logs in Loki¶
Access Loki via central Grafana:
- SSH tunnel: ssh -L 3000:localhost:3000 user@server
- Open the central Grafana instance > Explore
- Select Loki datasource
- Query logs:
# All webhook handler logs
{container_name="webhook-handler"}
# Error logs only
{container_name="webhook-handler"} |= "error"
# Deployment logs
{container_name="webhook-handler"} |= "deployment"
Security Operations¶
Rotating GitHub App Key¶
- Go to GitHub App settings
- Generate new private key
- Download and store securely
- Update Tinkero with the new key (a sketch follows this list)
- Delete old key from GitHub
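A sketch of the update step, using the key path and service name referenced elsewhere in this guide (the downloaded key's filename is illustrative):
# Copy the newly downloaded key into place and lock down its permissions
sudo cp ~/downloads/your-app.private-key.pem /srv/tinkero/github-app-key.pem
sudo chmod 600 /srv/tinkero/github-app-key.pem
# Restart the service that authenticates to GitHub
docker compose restart webhook-handler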
Rotating Webhook Secret¶
- Generate a new secret
- Update the GitHub App webhook settings
- Update .env with the new value
- Restart the webhook handler (see the sketch below)
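A sketch of the server-side steps; the .env variable name shown is illustrative, so use whatever name your .env actually defines for the webhook secret:
# Generate a new random secret (one common approach)
openssl rand -hex 32
# Put the new value into .env (variable name shown is illustrative)
nano .env    # e.g. GITHUB_WEBHOOK_SECRET=<new value>
# Restart the webhook handler so it picks up the change
docker compose restart webhook-handler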
Rotating Passwords¶
Traefik Dashboard:
# Generate new bcrypt hash
htpasswd -nbB admin 'new-password'
# Update .env with new hash
nano .env
# Restart Traefik
docker compose restart traefik
Security Audit Checklist¶
- Firewall only allows ports 22, 80, 443
- SSH key authentication enabled
- GitHub App private key has 600 permissions
- .env file has 600 permissions
- No services exposed on public ports except Traefik
- SSL certificates are valid and auto-renewing
- Passwords are strong and unique
- Logs are being collected in Loki
- Backup system is working
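A quick spot-check for the file-permission items above (paths as used elsewhere in this guide; substitute your actual Tinkero directory):
# Both files should report 600
stat -c "%a %n" /srv/tinkero/github-app-key.pem /path/to/tinkero/.env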
See Also: - Monitoring Guide - Detailed metrics and dashboards - Troubleshooting - Common issues - CLI Reference - Command documentation