Mission Control

GPU Cluster Dashboard

Connecting...

Queue Depth

Recent Runs

0

Workers Online

Models Available

Cluster Health

Telemetry offline

RTX 3090 CUDA
Status: Offline
Model:
VRAM: — / 24 GB
Load: — %
Temp: —°C
Power: — W
Errors:
Jobs Done:

Vega 64 #1 ROCm
Status: Offline
Model:
VRAM: — / 8 GB
Load: — %
Temp: —°C
Power: — W
Errors:
Jobs Done:

Vega 64 #2 ROCm
Status: Offline
Model:
VRAM: — / 8 GB
Load: — %
Temp: —°C
Power: — W
Errors:
Jobs Done:

Cluster Intel

Observed lifetime counters and recent execution outcomes

Submitted

Completed

Failed

Observed Active

Last 20 Cluster Runs

Finished | GPU | Kind | Status | Model | Queue | Run | Tokens | Failure / Notes

Loading…

Failure Intelligence

Recent retained history

Failure Buckets

Bucket | Failed | Examples

Loading…

Model Failure Rates

Model | Total | Failed | Rate | Buckets

Loading…

Worker Failure Rates

Worker / GPU | Total | Failed | Rate | Buckets

Loading…

Quick Fire

One-click test prompts

Current Queue

Central Queue
Observed Active
Running Now
Queued / Assigned
Pos | Queued | Kind | Status | GPU | Model | Progress | Wait | Notes

Loading…

GPU Lifetime Metrics

Loading workers...

Call Types By GPU

Success/failure split for every observed call type
GPU | Call Type | Total | Success | Failed | Avg Queue | Avg Run | Tokens

Loading…

Status
Mode
Status | Started | Mode | Model | Latency | Queue | Tokens | Tok/s | GPU | Trace ID

Loading…

Available Models

Loading models...

Download New Model

• HuggingFace: Qwen/Qwen2.5-7B-Instruct
• CivitAI: https://civitai.com/models/...
• Direct: https://example.com/model.safetensors
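
The three source formats listed above could be told apart with a small classifier. The following is only a sketch under stated assumptions: the function name `classify_model_source`, the returned labels, and the repo ID pattern are hypothetical and are not taken from the dashboard's actual backend, which is not shown here.

```python
import re
from urllib.parse import urlparse

# Assumption: a HuggingFace repo ID looks like "owner/name", using only
# letters, digits, underscores, dots and hyphens. This pattern is illustrative.
HF_REPO_ID = re.compile(r"[\w.-]+/[\w.-]+")


def classify_model_source(source: str) -> str:
    """Return 'huggingface', 'civitai', 'direct', or 'unknown' (hypothetical labels)."""
    text = source.strip()
    parsed = urlparse(text)

    # Anything with an http(s) scheme and a host is treated as a URL.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        host = parsed.hostname or ""
        if host == "civitai.com" or host.endswith(".civitai.com"):
            return "civitai"
        return "direct"

    # Otherwise accept only strings shaped like "owner/name".
    if HF_REPO_ID.fullmatch(text):
        return "huggingface"
    return "unknown"


if __name__ == "__main__":
    for example in (
        "Qwen/Qwen2.5-7B-Instruct",
        "https://civitai.com/models/...",
        "https://example.com/model.safetensors",
    ):
        print(example, "->", classify_model_source(example))
```

Checking the URL case first keeps a CivitAI or direct link from being mistaken for a repo ID, since URLs also contain slashes.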