Health Check

GET /v1/agents/{agent_id}/health

Check the health status of a deployed agent.

Request

GET https://api.run-agent.ai/v1/agents/{agent_id}/health
Authorization: Bearer YOUR_API_KEY

Path Parameters

agent_id (string, required): The unique identifier of the agent

Response

status (string, required): Overall health status: healthy, degraded, or unhealthy
checks (object, required): Individual health check results
version (string): Current agent version
uptime (number): Uptime in seconds

Examples

Basic Health Check

curl https://api.run-agent.ai/v1/agents/agent-123/health \
  -H "Authorization: Bearer YOUR_API_KEY"

Response Examples

Healthy Agent

{
  "status": "healthy",
  "checks": {
    "agent": {
      "status": "healthy",
      "response_time_ms": 45
    },
    "dependencies": {
      "openai_api": "healthy",
      "database": "healthy"
    },
    "resources": {
      "memory_usage_percent": 65,
      "cpu_usage_percent": 20
    }
  },
  "version": "1.2.3",
  "uptime": 3600,
  "last_request": "2024-01-01T12:00:00Z"
}

Degraded Agent

{
  "status": "degraded",
  "checks": {
    "agent": {
      "status": "healthy",
      "response_time_ms": 150
    },
    "dependencies": {
      "openai_api": "healthy",
      "database": "slow"
    },
    "resources": {
      "memory_usage_percent": 85,
      "cpu_usage_percent": 75
    }
  },
  "version": "1.2.3",
  "uptime": 7200,
  "warnings": ["High memory usage", "Database latency detected"]
}
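The two responses above differ in which fields are present: warnings appears only in the degraded example, and version and uptime are not marked required. A minimal sketch of a client-side helper that tolerates those differences is shown here. The function name summarize_health and the output format are illustrative assumptions, not part of the API.

```python
from typing import Any


def summarize_health(payload: dict[str, Any]) -> str:
    """Build a one-line summary from a health response body (hypothetical helper)."""
    # `status` and `checks` are documented as required.
    status = payload["status"]
    checks = payload["checks"]

    # `version` and `uptime` are not marked required, and `warnings` appears
    # only in the degraded example, so all three are treated as optional.
    version = payload.get("version", "unknown")
    uptime = payload.get("uptime")
    warnings = payload.get("warnings", [])

    # Dependency values are plain strings in the examples ("healthy", "slow").
    dependencies = checks.get("dependencies", {})
    troubled = sorted(
        name for name, state in dependencies.items() if state != "healthy"
    )

    parts = [f"status={status}", f"version={version}"]
    if uptime is not None:
        parts.append(f"uptime={uptime}s")
    if troubled:
        parts.append("dependencies=" + ",".join(troubled))
    if warnings:
        parts.append("warnings=" + "; ".join(warnings))
    return " ".join(parts)
```

Reading optional fields with .get keeps the helper from raising KeyError when a response omits them.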

Health Check Logic

Status is determined by:

Healthy: All checks pass
Degraded: Some checks show warnings but agent is functional
Unhealthy: Critical checks fail
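One way to read the three rules above is sketched below. This is an illustration of the stated logic, not the service's implementation: the CheckResult type, the "pass"/"warn"/"fail" outcome values, the critical flag, and the choice to treat a non-critical failure as degraded are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    """Hypothetical shape for a single check outcome."""
    name: str
    outcome: str            # assumed values: "pass", "warn", or "fail"
    critical: bool = False  # assumed flag marking checks the agent cannot run without


def overall_status(results: list[CheckResult]) -> str:
    """Combine individual check results into healthy, degraded, or unhealthy."""
    # Unhealthy: a critical check fails.
    if any(r.outcome == "fail" and r.critical for r in results):
        return "unhealthy"
    # Degraded: something is not passing, but nothing critical has failed.
    if any(r.outcome != "pass" for r in results):
        return "degraded"
    # Healthy: all checks pass.
    return "healthy"
```

The critical-failure rule is evaluated first so that a warning elsewhere can never mask an unhealthy agent.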

Monitoring Integration

Automated Monitoring

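import time

import requests

API_KEY = "YOUR_API_KEY"

def send_alert(message):
    # Placeholder: replace with your own alerting integration
    print(f"ALERT: {message}")

def monitor_agent(agent_id, interval=60):
    """Poll the health endpoint and alert whenever the agent is not healthy."""
    while True:
        try:
            response = requests.get(
                f"https://api.run-agent.ai/v1/agents/{agent_id}/health",
                headers={"Authorization": f"Bearer {API_KEY}"},
                timeout=10,  # do not hang forever on a stalled connection
            )
            # Surface HTTP errors (for example 401 or 404) as failed checks
            response.raise_for_status()

            health = response.json()

            if health['status'] != 'healthy':
                send_alert(f"Agent {agent_id} is {health['status']}")

        except Exception as e:
            # Covers network errors, HTTP errors, and malformed responses
            send_alert(f"Health check failed: {e}")

        time.sleep(interval)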

Prometheus Integration

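# Expose metrics for Prometheus
import requests
from prometheus_client import Gauge

agent_health = Gauge('agent_health_status', 'Agent health status', ['agent_id'])
memory_usage = Gauge('agent_memory_usage', 'Memory usage percentage', ['agent_id'])

# Map the status string to a number so it can be graphed and alerted on
STATUS_VALUES = {'healthy': 1, 'degraded': 0.5, 'unhealthy': 0}

def get_agent_health(agent_id):
    """Fetch the health payload for one agent."""
    response = requests.get(
        f"https://api.run-agent.ai/v1/agents/{agent_id}/health",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        timeout=10,
    )
    return response.json()

def update_metrics(agent_id):
    health = get_agent_health(agent_id)

    # Unrecognized status values are reported as 0 (unhealthy)
    agent_health.labels(agent_id=agent_id).set(STATUS_VALUES.get(health['status'], 0))

    memory = health['checks']['resources']['memory_usage_percent']
    memory_usage.labels(agent_id=agent_id).set(memory)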

Best Practices

Regular Monitoring: Check health every 30-60 seconds
Set Alerts: Alert on status changes
Track Trends: Monitor resource usage over time
Implement Retries: Handle temporary network issues
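
The retry and alert-on-change practices above could be sketched as follows. The helper names fetch_with_retries and status_changed, the exponential backoff schedule, and the default of three attempts are assumptions chosen for illustration. The fetch argument stands in for whatever function calls the health endpoint.

```python
import time
from typing import Callable, Optional


def fetch_with_retries(
    fetch: Callable[[], dict],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `fetch`, retrying with exponential backoff on any exception."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


def status_changed(previous: Optional[str], current: str) -> bool:
    """Return True when the status differs from the last observed one."""
    # The first observation (previous is None) is not treated as a change.
    return previous is not None and previous != current
```

Alerting only when status_changed returns True avoids sending the same alert on every poll while an agent stays degraded. Passing sleep as a parameter keeps the backoff testable without real delays.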
