Queue Depth
—
Recent Runs
0
Workers Online
—
Models Available
—
Cluster Health
| GPU | Backend | Status | Model | VRAM | Load | Temp | Power | Errors | Jobs Done |
|---|---|---|---|---|---|---|---|---|---|
| RTX 3090 | CUDA | Offline | — | — / 24 GB | — % | —°C | — W | — | — |
| Vega 64 #1 | ROCm | Offline | — | — / 8 GB | — % | —°C | — W | — | — |
| Vega 64 #2 | ROCm | Offline | — | — / 8 GB | — % | —°C | — W | — | — |
Cluster Intel
Submitted
—
Completed
—
Failed
—
Observed Active
—
Last 20 Cluster Runs
| Finished | GPU | Kind | Status | Model | Queue | Run | Tokens | Failure / Notes |
|---|---|---|---|---|---|---|---|---|
Failure Intelligence
Failure Buckets
| Bucket | Failed | Examples |
|---|---|---|
Model Failure Rates
| Model | Total | Failed | Rate | Buckets |
|---|---|---|---|---|
Worker Failure Rates
| Worker / GPU | Total | Failed | Rate | Buckets |
|---|---|---|---|---|
Quick Fire
Current Queue
Central Queue
—
Observed Active
—
Running Now
—
Queued / Assigned
—
| Pos | Queued | Kind | Status | GPU | Model | Progress | Wait | Notes |
|---|---|---|---|---|---|---|---|---|
GPU Lifetime Metrics
Call Types By GPU
| GPU | Call Type | Total | Success | Failed | Avg Queue | Avg Run | Tokens |
|---|---|---|---|---|---|---|---|
| Status | Started | Mode | Model | Latency | Queue | Tokens | Tok/s | GPU | Trace ID |
|---|---|---|---|---|---|---|---|---|---|