Service Overview
RunOS makes deploying complex open-source services as simple as clicking a button. The service marketplace includes databases, caches, message queues, monitoring tools, AI platforms, and more, all with intelligent defaults, automatic configuration, and integrated backups.
What are RunOS Services?
RunOS services are pre-configured, production-ready deployments of popular open-source software. Each service includes:
- Intelligent defaults - Optimized configuration for common use cases
- Tiered options - Lightweight, HA, or enterprise configurations
- Automatic networking - Services can discover and connect to each other
- Integrated monitoring - Metrics and dashboards included
- Backup support - Automated backups for supported services
- Health checks - Automatic monitoring and restart on failure
Every deployed service gets its own namespace (named with the OSID) for complete isolation and easy management.
Deploying Services
Accessing the Marketplace
- Navigate to Services - Click "Services" in main navigation
- Click "Add Service" or "Browse Marketplace"
- Browse or search - Find the service you need
- Click service card - View details and deployment options
Basic Deployment Steps
Step 1: Select Service
- Browse marketplace categories
- Use search to find specific services
- Click the service to deploy
Step 2: Configure Service
Basic Configuration:
- Service name - Descriptive name (e.g., postgres-prod, redis-cache)
- Configuration tier - Lightweight, High Availability, or Enterprise
- Resource allocation - CPU, memory, and storage
Advanced Configuration:
- Storage class - OpenEBS (default, high-performance) or Longhorn (distributed)
- Node affinity - Optional tags to run on specific nodes
- Service dependencies - Link to compatible services already deployed
- Authentication - Enable/disable with auto-generated or custom credentials
Step 3: Review and Deploy
- Review all settings
- Click "Deploy Service"
- Wait for provisioning (typically 2-5 minutes)
- Access connection info and credentials
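If you prefer to watch provisioning from a terminal instead of the console, a minimal sketch along these lines may help. The helper name wait_for_service is hypothetical, and the 300-second timeout is an assumption chosen to match the "2-5 minutes" estimate above.

```shell
# Hypothetical helper: block until every pod in the service namespace is Ready.
# The 300s timeout mirrors the "typically 2-5 minutes" estimate and is an assumption.
wait_for_service() {
  osid=$1
  kubectl wait --for=condition=Ready pod --all -n "$osid" --timeout=300s
}

# Usage: wait_for_service <OSID>
```

kubectl wait returns an error if the namespace has no pods yet, so run it after the deployment has started creating them.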
Configuration Tiers
Lightweight:
- Single instance
- Minimal resource allocation
- Local storage
- Use for: Development, testing, low-traffic applications
High Availability:
- Multiple replicas (2-3 depending on service)
- Automatic failover
- Load balancing
- Use for: Production applications, business-critical services
Enterprise (select services):
- Maximum replicas
- Advanced replication
- Performance optimizations
- Use for: High-traffic production, compliance requirements
Service Intelligence Features
Automatic Dependency Discovery:
- Services automatically discover compatible dependencies
- Dropdowns only show compatible services (e.g., services requiring Redis with auth only show password-enabled instances)
- Connection configured automatically
Context-Aware Storage:
- Databases (PostgreSQL, MySQL): Default OpenEBS for performance
- Bulk storage (MinIO, Harbor): Longhorn option for distributed storage
- Smart recommendations based on service type
Deployment Example
PostgreSQL Production Database:
Service name: postgres-prod
Tier: High Availability
CPU: 2 cores
Memory: 4Gi
Storage: 50Gi
Storage class: OpenEBS
Authentication: Enabled
Result: PostgreSQL HA cluster with 2 replicas, automatic failover, daily backups
Connection: postgres://username:password@postgres-prod.<OSID>.svc.cluster.local:5432
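The connection string above follows the pattern service-name.OSID.svc.cluster.local. As a small sketch, it can be assembled from its parts like this. The function name runos_pg_url and all argument values are placeholders for illustration, not part of RunOS.

```shell
# Build the in-cluster PostgreSQL URL shown above from its parts.
# runos_pg_url is a hypothetical helper, not a RunOS command.
runos_pg_url() {
  user=$1 password=$2 service=$3 osid=$4
  printf 'postgres://%s:%s@%s.%s.svc.cluster.local:5432\n' \
    "$user" "$password" "$service" "$osid"
}

# Example with placeholder values:
runos_pg_url app s3cret postgres-prod my-osid
# prints postgres://app:s3cret@postgres-prod.my-osid.svc.cluster.local:5432
```

Passwords that contain characters such as @, :, or / need to be percent-encoded before they are placed in a URL.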
Managing Services
Viewing Service Status
Service List View: Navigate to Services to see all deployed services with:
- Service name and OSID
- Service type (PostgreSQL, Redis, etc.)
- Status (Running, Pending, Failed, Degraded)
- Replica count and health
- Resource usage
Service Detail View: Click any service to view:
- Overview - Status, connection info, resource charts, events
- Metrics - Service-specific metrics and Grafana dashboards
- Logs - Real-time log streaming and search
- Configuration - Current settings and resource allocations
- Events - Kubernetes events and deployment history
Modifying Resources
Adjust CPU and Memory:
- Navigate to service detail page
- Click "Edit Configuration"
- Modify CPU/memory requests and limits
- Save changes
- The service applies the change as a rolling update; multi-replica services stay available throughout
Expand Storage:
- Edit service configuration
- Update storage size (must be larger, cannot shrink)
- Save changes
- Volume expands automatically
Restarting Services
Graceful Restart (Recommended):
- Navigate to service detail page
- Click "Restart Service"
- Confirm restart
- Rolling restart of all pods
Via kubectl:
# Graceful rolling restart
kubectl rollout restart statefulset -n <OSID>
# Monitor progress
kubectl rollout status statefulset -n <OSID>
Service Tagging
Control which nodes run your service by using tags:
Why use tags:
- Hardware requirements (GPU, SSD, high memory)
- Dedicated database nodes
- Regional placement
- Development vs production isolation
Add tags:
- Edit service configuration
- Enter tags in "Node Affinity" section (e.g., database,ssd)
- Save changes
- Service reschedules to nodes with matching tags
Tag matching: All tags must match. A service with tags database,ssd only runs on nodes with both tags.
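The matching rule amounts to a subset check: every tag on the service must be present on the node. The sketch below illustrates that rule only. tags_match is a hypothetical helper and says nothing about how RunOS implements scheduling internally.

```shell
# Return success (0) only if every service tag appears in the node's tag list.
# Both arguments are comma-separated lists; tags_match is a hypothetical helper.
tags_match() {
  service_tags=$1
  node_tags=$2
  old_ifs=$IFS
  IFS=,
  for tag in $service_tags; do
    case ",$node_tags," in
      *",$tag,"*) ;;                    # tag present on the node
      *) IFS=$old_ifs; return 1 ;;      # a required tag is missing
    esac
  done
  IFS=$old_ifs
  return 0
}

tags_match "database,ssd" "database,ssd,gpu" && echo "schedulable"
tags_match "database,ssd" "database"         || echo "not schedulable"
```

A node may carry extra tags (gpu in the first call) without affecting the match. Only a missing tag prevents scheduling.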
Deleting Services
Before deletion:
- Export all important data
- Verify backups are complete
- Update applications to remove dependencies
Deletion process:
- Navigate to service detail page
- Click "Delete Service"
- Type service name to confirm
- Click "Permanently Delete"
What gets deleted: All pods, persistent volumes, configuration, networking, and monitoring. Backups stored externally are preserved.
Service Backups
Backup Support by Service
Full backup support:
- PostgreSQL - CloudNativePG operator with automated backups, point-in-time recovery
- MySQL - Operator-managed backups with S3 storage
- MinIO - Object versioning and replication
Manual backup procedures available for all other services.
PostgreSQL Backups
PostgreSQL backups are managed by the CloudNativePG operator with automated backup capabilities and point-in-time recovery.
Configure backups:
- During deployment, expand "Backup Configuration"
- Set backup schedule using cron expression (e.g., 0 2 * * * for daily at 2 AM)
- Set retention period in days
- Configure S3-compatible storage (recommended for production)
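A standard cron expression has five fields: minute, hour, day of month, month, and day of week. The sketch below labels the fields of the example schedule so it is easier to adapt. describe_cron is a hypothetical helper, and it only checks the field count, not the validity of each value.

```shell
# Label the five fields of a standard cron expression.
# describe_cron is a hypothetical helper for illustration.
describe_cron() {
  set -f            # keep "*" literal while splitting
  set -- $1         # split the expression on whitespace
  set +f
  [ "$#" -eq 5 ] || { echo "expected 5 fields, got $#" >&2; return 1; }
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' "$@"
}

describe_cron "0 2 * * *"
# prints minute=0 hour=2 day-of-month=* month=* day-of-week=*
```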
Backup method:
- Continuous archiving using Write-Ahead Logs (WAL)
- Storage options: Local persistent volume or S3-compatible storage
- Point-in-time recovery to any moment within WAL retention period
Manual backup:
- Navigate to service → Backups tab
- Click "Create Backup Now"
- Wait for completion
Restore from backup:
- Deploy new PostgreSQL service
- Select "Restore from backup"
- Choose backup source and optionally specify point-in-time recovery target
- Complete deployment
- Verify data
MySQL Backups
MySQL backups are managed by the MySQL operator with S3-compatible storage and point-in-time recovery capabilities.
Configuration:
- Requires S3-compatible storage
- Backup schedule is configurable using cron expressions
- Retention period is configurable based on your requirements
- Uses mysqldump or XtraBackup for backup operations
Backup method:
- Binary log archiving enables point-in-time recovery
- Storage options: S3-compatible storage
- Point-in-time recovery to any moment within binary log retention period (near-minute recovery precision)
Restore: Create a new MySQL service and select "Restore from backup" during initialization. You can optionally specify a point-in-time recovery target to restore to a specific moment.
MinIO Backups
Object Versioning:
- Access MinIO console
- Navigate to bucket → Settings
- Enable versioning
- Set retention policy
Benefits: recovery from accidental deletion, restoration of previous versions, and protection against ransomware.
Replication:
- Replicate to another MinIO instance for disaster recovery
- Geographic redundancy
- Active-active or active-passive setup
Manual Backup Procedures
For services not listed above (Redis/Valkey, ClickHouse, etc.), you can perform manual backups:
- Use service-specific backup tools via kubectl exec
- Export data using the service's native backup commands
- Copy backup files to external storage for safekeeping
- Refer to the service's official documentation for best practices
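As one possible shape for the steps above, here is a sketch for Redis/Valkey. It assumes redis-cli is available inside the service container, that /tmp is writable there, and that authentication is disabled (with a password you would need to supply credentials to redis-cli). The helper name and the pod name are hypothetical.

```shell
# Hypothetical helper: dump a Redis/Valkey dataset and copy it out of the cluster.
redis_manual_backup() {
  osid=$1 pod=$2 dest=$3
  # Ask the server for an RDB snapshot and write it to a file inside the pod.
  kubectl exec -n "$osid" "$pod" -- redis-cli --rdb /tmp/dump.rdb || return 1
  # Copy the snapshot to the local machine (kubectl cp needs tar in the container).
  kubectl cp "$osid/$pod:/tmp/dump.rdb" "$dest"
}

# Usage: redis_manual_backup <OSID> <pod-name> ./redis-backup.rdb
```

After the copy completes, move the file to external storage so that the backup does not live only on the machine that ran the command.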
Backup Best Practices
3-2-1 Rule:
- 3 copies of data (production + local backup + off-site)
- 2 different media types (volumes + object storage)
- 1 copy off-site (different cluster/datacenter)
Retention policies: Set retention based on your requirements. Example strategies:
- Production environments: 7 daily, 4 weekly, 12 monthly backups
- Development environments: 3 daily backups
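For manually managed backups, a "keep the N newest" policy such as the development example above can be sketched as follows. The helper name and the backup names are made up for illustration, and the sketch assumes names sort chronologically (for example, by embedding an ISO date).

```shell
# Print the backups that fall outside a "keep the N newest" policy.
# Input: one backup name per line, named so that lexical order is chronological.
# backups_to_prune is a hypothetical helper.
backups_to_prune() {
  keep=$1
  sort -r | tail -n "+$((keep + 1))"
}

# Five illustrative daily backups, keep the newest three:
printf '%s\n' backup-2024-05-01 backup-2024-05-02 backup-2024-05-03 \
              backup-2024-05-04 backup-2024-05-05 | backups_to_prune 3
# prints backup-2024-05-02 and backup-2024-05-01, one per line
```

Operator-managed PostgreSQL and MySQL backups apply their configured retention period themselves, so a script like this is only relevant to manual backup files.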
Regular testing:
- Monthly restore tests for production
- Quarterly disaster recovery drills
- Document test results
Monitor backups:
- Alert on backup failures
- Track backup sizes
- Monitor storage usage
- Verify backup completion
Recovery Time and Data Loss
PostgreSQL:
- Recovery time: 5 minutes to 4 hours (depends on size)
- Data loss (daily backups): Up to 24 hours
- Data loss (WAL archiving): Minutes or less
MySQL:
- Recovery time and data loss are similar to PostgreSQL, with binary log archiving playing the role of WAL archiving
Redis:
- Recovery time: 1-5 minutes
- Data loss: Last snapshot interval
Common Service Patterns
Database + Application:
- Deploy database (PostgreSQL, MySQL)
- Note connection string
- Deploy application with database URL
- Application automatically connects
AI Platform:
- Deploy Ollama with GPU node affinity
- Deploy Open WebUI connected to Ollama
- Configure applications to use Ollama API
Analytics Stack:
- Deploy ClickHouse for analytics
- Deploy Grafana for visualization
- Deploy Kafka for data ingestion
- Connect pipelines to ClickHouse
Viewing Logs and Metrics
Logs
Basic logs (always available):
- Navigate to service → Logs tab
- Select pod
- View streaming logs
Via kubectl:
# View logs
kubectl logs -n <OSID> <pod-name>
# Stream logs
kubectl logs -n <OSID> <pod-name> -f
# Previous instance (if crashed)
kubectl logs -n <OSID> <pod-name> --previous
Advanced logs (with Vector):
- Full-text search across all services
- Historical retention
- Log correlation
- Export capabilities
Metrics
Console integration:
- CPU and memory usage
- Network and storage metrics
- Service-specific metrics (queries/sec, connections, cache hits)
Grafana dashboards:
- Click "View in Grafana" on service page
- Pre-built dashboards for each service type
- Custom dashboard creation
- Alerting capabilities
Prometheus queries:
# CPU usage by pod
sum by (pod) (rate(container_cpu_usage_seconds_total{namespace="<OSID>"}[5m]))
# Memory usage
container_memory_usage_bytes{namespace="<OSID>"}
# Service-specific metrics
rate(http_requests_total{namespace="<OSID>"}[5m])
Troubleshooting
Service Won't Start
Check pod status:
kubectl get pods -n <OSID>
kubectl describe pod <pod-name> -n <OSID>
kubectl logs <pod-name> -n <OSID>
Common causes:
- Insufficient resources (CPU/memory)
- Storage provisioning failed
- Image pull failure
- Configuration error
- No nodes matching tags
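To narrow down which of these causes applies, it can help to list only the pods that are not running and then read the namespace events in time order. The sketch below wraps two standard kubectl queries. The helper name diagnose_service is hypothetical.

```shell
# Hypothetical helper: surface the most likely reasons a service is not starting.
diagnose_service() {
  osid=$1
  # Pods that are not in the Running phase (Pending, Failed, and so on).
  kubectl get pods -n "$osid" --field-selector=status.phase!=Running
  # Events sorted oldest to newest: scheduling, image pull, and volume
  # provisioning errors usually appear here.
  kubectl get events -n "$osid" --sort-by=.lastTimestamp
}

# Usage: diagnose_service <OSID>
```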
Cannot Connect to Service
Check service endpoints:
kubectl get svc -n <OSID>
kubectl get endpoints -n <OSID>
Common causes:
- Pods not ready yet
- Health checks failing
- Network policy blocking traffic
- Wrong connection string
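One way to separate a wrong connection string from a network problem is to probe the service from a throwaway pod inside the cluster. The sketch below does this for PostgreSQL using pg_isready. The helper name, the pod name pg-check, and the postgres:16 image are assumptions, and any image that ships pg_isready should work.

```shell
# Hypothetical helper: probe a PostgreSQL service from a temporary pod in the
# same namespace. The postgres:16 image is an assumption.
check_postgres_reachable() {
  osid=$1 service=$2
  kubectl run pg-check --rm -i --restart=Never --image=postgres:16 -n "$osid" -- \
    pg_isready -h "$service.$osid.svc.cluster.local" -p 5432
}

# Usage: check_postgres_reachable <OSID> postgres-prod
```

If the probe reports that the server is accepting connections but your application still cannot connect, compare the application's host, port, and credentials against the connection info on the service page.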
High Resource Usage
- View metrics to identify the bottleneck
- Check for runaway queries or processes
- Scale up resources if the load is legitimate
- Restart the service to clear memory leaks
Quick Reference
# View all services (OSIDs are namespaces)
kubectl get namespaces
# View service resources
kubectl get all -n <OSID>
# Check service status
kubectl get pods -n <OSID>
# View service logs
kubectl logs -n <OSID> <pod-name> -f
# View service endpoints
kubectl get svc,endpoints -n <OSID>
# Resource usage
kubectl top pods -n <OSID>
# Restart service
kubectl rollout restart statefulset -n <OSID>
# Delete service
kubectl delete namespace <OSID>
RunOS services combine simplicity with the power and flexibility of Kubernetes, providing production-ready deployments with minimal configuration.