Service Overview

RunOS makes deploying complex open-source services as simple as clicking a button. The service marketplace includes databases, caches, message queues, monitoring tools, AI platforms, and more - all with intelligent defaults, automatic configuration, and integrated backups.

What are RunOS Services?

RunOS services are pre-configured, production-ready deployments of popular open-source software. Each service includes:

Intelligent defaults - Optimized configuration for common use cases
Tiered options - Lightweight, HA, or enterprise configurations
Automatic networking - Services can discover and connect to each other
Integrated monitoring - Metrics and dashboards included
Backup support - Automated backups for supported services
Health checks - Automatic monitoring and restart on failure

Every deployed service gets its own Kubernetes namespace, named with the service's OSID, for complete isolation and easy management.

Deploying Services

Accessing the Marketplace

Navigate to Services - Click "Services" in main navigation
Click "Add Service" or "Browse Marketplace"
Browse or search - Find the service you need
Click service card - View details and deployment options

Basic Deployment Steps

Step 1: Select Service

Browse marketplace categories
Use search to find specific services
Click the service to deploy

Step 2: Configure Service

Basic Configuration:

Service name - Descriptive name (e.g., postgres-prod, redis-cache)
Configuration tier - Lightweight, High Availability, or Enterprise
Resource allocation - CPU, memory, and storage

Advanced Configuration:

Storage class - OpenEBS (default, high-performance) or Longhorn (distributed)
Node affinity - Optional tags to run on specific nodes
Service dependencies - Link to compatible services already deployed
Authentication - Enable/disable with auto-generated or custom credentials

Step 3: Review and Deploy

Review all settings
Click "Deploy Service"
Wait for provisioning (typically 2-5 minutes)
Access connection info and credentials
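If you prefer to watch provisioning from a terminal, the wait can be sketched with `kubectl wait`. This is a minimal sketch, not part of the RunOS console: the function name `wait_for_service` and the 300-second default (chosen to match the upper end of the typical 2-5 minutes) are assumptions.

```shell
# wait_for_service OSID [TIMEOUT]
# Hypothetical helper: blocks until every pod in the service's namespace
# reports Ready, or until the timeout expires (default 300s, an assumption).
wait_for_service() {
  osid=$1
  timeout=${2:-300s}
  kubectl wait --for=condition=Ready pods --all -n "$osid" --timeout="$timeout"
}
```

Call it as `wait_for_service <OSID>`. Note that `kubectl wait` returns an error immediately if the namespace has no pods yet, so run it after the first pods appear in `kubectl get pods -n <OSID>`.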

Configuration Tiers

Lightweight:

Single instance
Minimal resource allocation
Local storage
Use for: Development, testing, low-traffic applications

High Availability:

Multiple replicas (2-3 depending on service)
Automatic failover
Load balancing
Use for: Production applications, business-critical services

Enterprise (select services):

Maximum replicas
Advanced replication
Performance optimizations
Use for: High-traffic production, compliance requirements

Service Intelligence Features

Automatic Dependency Discovery:

Services automatically discover compatible dependencies
Dropdowns only show compatible services (e.g., services requiring Redis with auth only show password-enabled instances)
Connection configured automatically

Context-Aware Storage:

Databases (PostgreSQL, MySQL): Default OpenEBS for performance
Bulk storage (MinIO, Harbor): Longhorn option for distributed storage
Smart recommendations based on service type

Deployment Example

PostgreSQL Production Database:

Service name: postgres-prod
Tier: High Availability
CPU: 2 cores
Memory: 4Gi
Storage: 50Gi
Storage class: OpenEBS
Authentication: Enabled
Result: PostgreSQL HA cluster with 2 replicas, automatic failover, daily backups
Connection: postgres://username:password@postgres-prod.<OSID>.svc.cluster.local:5432
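
The connection string above follows a fixed pattern, so it can be assembled from its parts. The sketch below assumes the credentials contain no characters that need URL-encoding; the function name `pg_url` and the sample values in the usage note are illustrative only.

```shell
# pg_url USERNAME PASSWORD SERVICE_NAME OSID
# Hypothetical helper: prints a connection URL in the format shown above,
# using the in-cluster DNS name <service-name>.<OSID>.svc.cluster.local.
# Assumes username and password need no URL-encoding.
pg_url() {
  printf 'postgres://%s:%s@%s.%s.svc.cluster.local:5432\n' "$1" "$2" "$3" "$4"
}
```

For example, `pg_url appuser s3cret postgres-prod <OSID>` prints the URL for the deployment described above, with your real OSID substituted.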

Managing Services

Viewing Service Status

Service List View: Navigate to Services to see all deployed services with:

Service name and OSID
Service type (PostgreSQL, Redis, etc.)
Status (Running, Pending, Failed, Degraded)
Replica count and health
Resource usage

Service Detail View: Click any service to view:

Overview - Status, connection info, resource charts, events
Metrics - Service-specific metrics and Grafana dashboards
Logs - Real-time log streaming and search
Configuration - Current settings and resource allocations
Events - Kubernetes events and deployment history

Modifying Resources

Adjust CPU and Memory:

Navigate to service detail page
Click "Edit Configuration"
Modify CPU/memory requests and limits
Save changes
The service performs a rolling update, with no downtime for multi-replica services

Expand Storage:

Edit service configuration
Update storage size (must be larger, cannot shrink)
Save changes
Volume expands automatically
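
Because storage can only grow, it can help to check a requested size before saving. The following is a small sketch of that rule, assuming both sizes are whole numbers expressed in Gi; the function name `can_expand` is hypothetical.

```shell
# can_expand CURRENT_SIZE REQUESTED_SIZE
# Hypothetical helper: succeeds only if the requested size is strictly larger
# than the current one. Assumes both values are whole numbers in Gi (e.g. 50Gi).
can_expand() {
  current=${1%Gi}      # strip the Gi suffix
  requested=${2%Gi}
  [ "$requested" -gt "$current" ]
}
```

`can_expand 50Gi 100Gi` succeeds, while `can_expand 50Gi 20Gi` fails. After saving a change, `kubectl get pvc -n <OSID>` shows the capacity currently reported for each volume claim.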

Restarting Services

Graceful Restart (Recommended):

Navigate to service detail page
Click "Restart Service"
Confirm restart
Rolling restart of all pods

Via kubectl:

# Graceful rolling restart
kubectl rollout restart statefulset -n <OSID>

# Monitor progress
kubectl rollout status statefulset -n <OSID>

Service Tagging

Control which nodes run your service by using tags:

Why use tags:

Hardware requirements (GPU, SSD, high memory)
Dedicated database nodes
Regional placement
Development vs production isolation

Add tags:

Edit service configuration
Enter tags in "Node Affinity" section (e.g., database,ssd)
Save changes
Service reschedules to nodes with matching tags

Tag matching: All tags must match. A service with tags database,ssd only runs on nodes with both tags.
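
The all-tags-must-match rule can be illustrated with a short shell function. This is a sketch of the rule as described here, not RunOS's scheduler: the name `tags_match` is hypothetical, tags are compared exactly (no spaces around commas), and a service with no tags is assumed to match any node.

```shell
# tags_match SERVICE_TAGS NODE_TAGS
# Both arguments are comma-separated tag lists. Succeeds only if every
# service tag also appears in the node's list.
# The body runs in a subshell (parentheses) so the IFS and globbing changes
# do not leak into the caller's shell.
tags_match() (
  set -f              # do not glob-expand tag text
  IFS=,               # split the service tag list on commas
  for tag in $1; do
    case ",$2," in
      *",$tag,"*) ;;  # this tag is present on the node
      *) exit 1 ;;    # a required tag is missing
    esac
  done
  exit 0
)
```

With this sketch, `tags_match database,ssd database,ssd,gpu` succeeds, while `tags_match database,ssd database` fails because the node lacks the ssd tag.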

Deleting Services

Before deletion:

Export all important data
Verify backups are complete
Update applications to remove dependencies

Deletion process:

Navigate to service detail page
Click "Delete Service"
Type service name to confirm
Click "Permanently Delete"

What gets deleted: All pods, persistent volumes, configuration, networking, and monitoring. Backups stored externally are preserved.

Service Backups

Backup Support by Service

Full backup support:

PostgreSQL - CloudNativePG operator with automated backups, point-in-time recovery
MySQL - Operator-managed backups with S3 storage
MinIO - Object versioning and replication

Manual backup procedures are available for all other services.

PostgreSQL Backups

PostgreSQL backups are managed by the CloudNativePG operator with automated backup capabilities and point-in-time recovery.

Configure backups:

During deployment, expand "Backup Configuration"
Set backup schedule using cron expression (e.g., 0 2 * * * for daily at 2 AM)
Set retention period in days
Configure S3-compatible storage (recommended for production)
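
For readers less familiar with cron syntax, the schedule example above can be broken into its five standard fields. The helper below is only a reading aid under the assumption that the schedule uses the standard five-field format; `describe_cron` is a hypothetical name.

```shell
# describe_cron "EXPRESSION"
# Hypothetical helper: labels the five fields of a standard cron expression.
# Runs in a subshell so that disabling globbing (needed because "*" is a
# valid field value) does not affect the caller.
describe_cron() (
  set -f
  set -- $1           # split the expression on whitespace
  if [ $# -ne 5 ]; then
    echo "expected 5 fields, got $#" >&2
    exit 1
  fi
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' "$@"
)
```

Running `describe_cron '0 2 * * *'` prints `minute=0 hour=2 day-of-month=* month=* day-of-week=*`, which reads as "at minute 0 of hour 2, every day".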

Backup method:

Continuous archiving using Write-Ahead Logs (WAL)
Storage options: Local persistent volume or S3-compatible storage
Point-in-time recovery to any moment within WAL retention period

Manual backup:

Navigate to service → Backups tab
Click "Create Backup Now"
Wait for completion

Restore from backup:

Deploy new PostgreSQL service
Select "Restore from backup"
Choose backup source and optionally specify point-in-time recovery target
Complete deployment
Verify data

MySQL Backups

MySQL backups are managed by the MySQL operator with S3-compatible storage and point-in-time recovery capabilities.

Configuration:

Requires S3-compatible storage
Backup schedule is configurable using cron expressions
Retention period is configurable based on your requirements
Uses mysqldump or Percona XtraBackup for backup operations

Backup method:

Binary log archiving enables point-in-time recovery
Storage options: S3-compatible storage
Point-in-time recovery to any moment within binary log retention period (near-minute recovery precision)

Restore: Create a new MySQL service and select "Restore from backup" during initialization. You can optionally specify a point-in-time recovery target to restore to a specific moment.

MinIO Backups

Object Versioning:

Access MinIO console
Navigate to bucket → Settings
Enable versioning
Set retention policy

Benefits: Recover from accidental deletion, restore previous versions, ransomware protection.

Replication:

Replicate to another MinIO instance for disaster recovery
Geographic redundancy
Active-active or active-passive setup

Manual Backup Procedures

For services not listed above (Redis/Valkey, ClickHouse, etc.), you can perform manual backups:

Use service-specific backup tools via kubectl exec
Export data using the service's native backup commands
Copy backup files to external storage for safekeeping
Refer to the service's official documentation for best practices
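
As one illustration of these steps, a manual Redis snapshot might be taken and copied out as sketched below. Several details are assumptions rather than RunOS facts: the function name `redis_manual_backup`, the snapshot path `/data/dump.rdb` (the default in the official Redis image), and the presence of `tar` in the container, which `kubectl cp` requires. If authentication is enabled, `redis-cli` also needs the service's password.

```shell
# redis_manual_backup OSID POD DEST_DIR
# Hypothetical helper: writes a snapshot inside the pod, then copies it out.
redis_manual_backup() {
  osid=$1
  pod=$2
  dest=$3
  # SAVE blocks until the snapshot has been written to disk.
  kubectl exec -n "$osid" "$pod" -- redis-cli SAVE &&
  # /data/dump.rdb is an assumed location; check the service's configuration.
  kubectl cp "$osid/$pod:/data/dump.rdb" "$dest/dump.rdb"
}
```

After the copy completes, move the file from the destination directory to external storage, as described above.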

Backup Best Practices

3-2-1 Rule:

3 copies of data (production + local backup + off-site)
2 different media types (volumes + object storage)
1 copy off-site (different cluster/datacenter)

Retention policies: Configure retention policies based on your requirements. Example retention strategies:

Production environments: 7 daily, 4 weekly, 12 monthly backups
Development environments: 3 daily backups

Regular testing:

Monthly restore tests for production
Quarterly disaster recovery drills
Document test results

Monitor backups:

Alert on backup failures
Track backup sizes
Monitor storage usage
Verify backup completion

Recovery Time and Data Loss

PostgreSQL:

Recovery time: 5 minutes to 4 hours (depends on size)
Data loss (daily backups): Up to 24 hours
Data loss (WAL archiving): Minutes or less

MySQL:

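Similar to PostgreSQL: recovery time depends on database size, and binary log archiving limits data loss in the way WAL archiving does for PostgreSQL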

Redis:

Recovery time: 1-5 minutes
Data loss: Last snapshot interval

Common Service Patterns

Database + Application:

Deploy database (PostgreSQL, MySQL)
Note connection string
Deploy application with database URL
Application automatically connects

AI Platform:

Deploy Ollama with GPU node affinity
Deploy Open WebUI connected to Ollama
Configure applications to use Ollama API

Analytics Stack:

Deploy ClickHouse for analytics
Deploy Grafana for visualization
Deploy Kafka for data ingestion
Connect pipelines to ClickHouse

Viewing Logs and Metrics

Logs

Basic logs (always available):

Navigate to service → Logs tab
Select pod
View streaming logs

Via kubectl:

# View logs
kubectl logs -n <OSID> <pod-name>

# Stream logs
kubectl logs -n <OSID> <pod-name> -f

# Previous instance (if crashed)
kubectl logs -n <OSID> <pod-name> --previous

Advanced logs (with Vector):

Full-text search across all services
Historical retention
Log correlation
Export capabilities

Metrics

Console integration:

CPU and memory usage
Network and storage metrics
Service-specific metrics (queries/sec, connections, cache hits)

Grafana dashboards:

Click "View in Grafana" on service page
Pre-built dashboards for each service type
Custom dashboard creation
Alerting capabilities

Prometheus queries:

# CPU usage by pod
sum by (pod) (rate(container_cpu_usage_seconds_total{namespace="<OSID>"}[5m]))

# Memory usage
container_memory_usage_bytes{namespace="<OSID>"}

# Service-specific metrics
rate(http_requests_total{namespace="<OSID>"}[5m])

Troubleshooting

Service Won't Start

Check pod status:

kubectl get pods -n <OSID>
kubectl describe pod <pod-name> -n <OSID>
kubectl logs <pod-name> -n <OSID>

Common causes:

Insufficient resources (CPU/memory)
Storage provisioning failed
Image pull failure
Configuration error
No nodes matching tags
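
Several of these causes, including scheduling failures, show up as Kubernetes events. A small sketch for narrowing them down, using a hypothetical function name `show_startup_problems`:

```shell
# show_startup_problems OSID
# Hypothetical helper: lists pods stuck in Pending, then the namespace's
# events ordered by time so the most recent problems appear last.
show_startup_problems() {
  osid=$1
  kubectl get pods -n "$osid" --field-selector=status.phase=Pending
  kubectl get events -n "$osid" --sort-by=.lastTimestamp
}
```

Unschedulable pods typically produce FailedScheduling events whose message names the unmet requirement, such as insufficient CPU or no matching nodes.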

Cannot Connect to Service

Check service endpoints:

kubectl get svc -n <OSID>
kubectl get endpoints -n <OSID>

Common causes:

Pods not ready yet
Health checks failing
Network policy blocking traffic
Wrong connection string
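
The first two causes can be checked directly: a Kubernetes Service only routes traffic to pods that are Ready, and those appear as addresses in its endpoints. The sketch below assumes the Kubernetes Service carries the same name as the RunOS service (as in the postgres-prod connection string); `has_ready_endpoints` is a hypothetical name.

```shell
# has_ready_endpoints OSID SERVICE_NAME
# Hypothetical helper: succeeds if the Service has at least one Ready
# backend address. Pods that are not Ready are excluded from this list.
has_ready_endpoints() {
  ips=$(kubectl get endpoints "$2" -n "$1" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')
  [ -n "$ips" ]
}
```

If `has_ready_endpoints <OSID> <service-name>` fails, look at pod readiness and health checks first; if it succeeds, a network policy or an incorrect connection string is the more likely cause.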

High Resource Usage

View metrics to identify bottleneck
Check for runaway queries or processes
Scale up resources if the load is legitimate
Restart the service to reclaim leaked memory as a short-term measure

Quick Reference

# List namespaces (each service's OSID is a namespace; system namespaces are listed too)
kubectl get namespaces

# View service resources
kubectl get all -n <OSID>

# Check service status
kubectl get pods -n <OSID>

# View service logs
kubectl logs -n <OSID> <pod-name> -f

# View service endpoints
kubectl get svc,endpoints -n <OSID>

# Resource usage
kubectl top pods -n <OSID>

# Restart service
kubectl rollout restart statefulset -n <OSID>

# Delete service (irreversible: removes every resource in the namespace)
kubectl delete namespace <OSID>

RunOS services combine simplicity with the power and flexibility of Kubernetes, providing production-ready deployments with minimal configuration.
