Quickstart
This page gets you from a blank environment to a written benchmark report in about 60 seconds.
1. Install
See Installation for the full extras matrix.
2. List what's available
rpx ls # tasks + ESD splits + required modalities
rpx models # runnable adapter names (also shows deferred stubs)
Example output of rpx models:
Runnable models:
depth_anything_v2_metric_indoor_base
depth_anything_v2_metric_indoor_large
depth_anything_v2_metric_indoor_small
depth_pro
metric3d_v2_vit_giant2
metric3d_v2_vit_large
metric3d_v2_vit_small
unidepth_v2_vitb
unidepth_v2_vitl
zoedepth_nyu
Deferred (registered for visibility; raise on resolve):
depth_anything_3
prompt_depth_anything_vits
video_depth_anything_large
3. Run a registered model on the Hard split
4. Or run ANY HuggingFace depth checkpoint, with zero code
rpx bench monocular_depth \
--hf-checkpoint depth-anything/Depth-Anything-V2-Metric-Indoor-Large-hf \
--split hard
Works with any model loadable via
transformers.AutoModelForDepthEstimation.
5. Run segmentation
rpx bench object_segmentation \
--hf-checkpoint facebook/mask2former-swin-tiny-coco-instance \
--split hard
6. Read the output
Every run writes two files under
./rpx_results/<model>/<split>/:
- result.json: machine-readable. Contains:
  - aggregated: metrics averaged over the split.
  - per_sample: one row per sample with metric values and id / phase / difficulty metadata, so downstream analysis can group back to scenes and phases without re-reading the manifest.
  - deployment_readiness: Weighted Phase Score, State-Transition Robustness, Temporal Stability, FLOPs, median latency, parameter count.
- summary.md: human-readable. Rendered with the same tables the CLI prints.
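As a rough illustration of how the per_sample rows could be grouped back to phases, here is a minimal Python sketch. The JSON layout is an assumption: it supposes per_sample is a list of objects with a phase key and one numeric field per metric. The metric name abs_rel, the phase labels p1 / p2, and the helpers load_result and mean_metric_by_phase are hypothetical, so adjust them to what your result.json actually contains.

```python
import json
from collections import defaultdict
from pathlib import Path
from statistics import mean


def load_result(model: str, split: str, root: str = "rpx_results") -> dict:
    """Load result.json from <root>/<model>/<split>/ (directory layout as described above)."""
    path = Path(root) / model / split / "result.json"
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def mean_metric_by_phase(per_sample: list, metric: str) -> dict:
    """Average one metric per phase.

    Assumes each row is a dict with a 'phase' key and a numeric value
    stored under `metric` (hypothetical schema). Rows missing either
    key are skipped.
    """
    buckets = defaultdict(list)
    for row in per_sample:
        if "phase" in row and metric in row:
            buckets[row["phase"]].append(row[metric])
    return {phase: mean(values) for phase, values in buckets.items()}


# Illustrative rows only; real rows would come from
# load_result("<model>", "hard")["per_sample"].
example_rows = [
    {"id": "a", "phase": "p1", "difficulty": "hard", "abs_rel": 0.25},
    {"id": "b", "phase": "p1", "difficulty": "hard", "abs_rel": 0.75},
    {"id": "c", "phase": "p2", "difficulty": "hard", "abs_rel": 0.5},
]
print(mean_metric_by_phase(example_rows, "abs_rel"))
```

Because the grouping only relies on keys stored in each row, it does not need to re-read the dataset manifest.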
7. Live terminal output
If rich is installed (pulled in by [depth]), the CLI renders
Claude-Code-style panels:
- A header panel with model / split / device
- Live progress bar during inference
- Aggregated metric table
- ESD-weighted phase score table with coloured Δ_int / Δ_rec
- Efficiency table (params / FLOPs / latency)
- Footer pointing at the output files
Pass --plain to disable the rich UI, for example in CI logs, over a plain ssh
session, or when tmux renders the panels incorrectly.
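One way to apply this automatically is to add --plain whenever output is not going to an interactive terminal. The sketch below is only an illustration: the helpers bench_argv and want_plain are hypothetical, and placing --plain after the subcommand options is an assumption about the CLI, so check rpx's help output for where the flag is actually accepted.

```python
import io
import sys


def want_plain(stream=sys.stdout) -> bool:
    """Return True when the stream is not an interactive terminal (e.g. a CI log)."""
    return not stream.isatty()


def bench_argv(task: str, checkpoint: str, split: str, plain: bool) -> list:
    """Build an argv list for `rpx bench` using the flags shown on this page."""
    argv = ["rpx", "bench", task, "--hf-checkpoint", checkpoint, "--split", split]
    if plain:
        # Assumption: --plain may follow the subcommand options.
        argv.append("--plain")
    return argv


# An in-memory stream is never a TTY, so it stands in for a CI log here.
captured = io.StringIO()
argv = bench_argv(
    "monocular_depth",
    "depth-anything/Depth-Anything-V2-Metric-Indoor-Large-hf",
    "hard",
    plain=want_plain(captured),
)
print(" ".join(argv))
```

The resulting list can be passed to subprocess.run as-is, which avoids shell quoting issues.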
8. Global flags
Exit codes:
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | RPXError (config / dataset / model / metric / download failure) |
| 2 | CLI argument error (handled by argparse) |
| 130 | KeyboardInterrupt |
All errors subclass rpx.RPXError and carry a hint line telling
the user exactly what to fix.
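A wrapper script can branch on these exit codes. The following Python sketch maps the documented codes to short descriptions and leaves stderr untouched so the hint line stays visible. The helper names describe_exit and run_and_report are hypothetical and not part of rpx.

```python
import subprocess
import sys

# Exit codes as documented in the table above.
EXIT_MEANINGS = {
    0: "success",
    1: "RPXError (config / dataset / model / metric / download failure)",
    2: "CLI argument error (argparse)",
    130: "KeyboardInterrupt",
}


def describe_exit(code: int) -> str:
    """Translate an exit code into the meaning documented above."""
    return EXIT_MEANINGS.get(code, f"undocumented exit code {code}")


def run_and_report(argv: list) -> int:
    """Run a command, report what its exit code means, and return the code.

    Output is not captured, so any hint line printed by the command
    still reaches the terminal or CI log.
    """
    completed = subprocess.run(argv, check=False)
    print(
        f"{argv[0]} exited with {completed.returncode}: "
        f"{describe_exit(completed.returncode)}",
        file=sys.stderr,
    )
    return completed.returncode
```

For example, `run_and_report(["rpx", "models"])` would return 0 on success, and a CI job could fail fast on any non-zero value.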