
Bring Your Own Model

The toolkit is built so the only variable is the model: datasets, splits, metrics, reports, and deployment-readiness scoring are all fixed. You supply the model and nothing else; input and output adapters already ship for the two common model families.

Adapter framework at a glance

Sample ───► InputAdapter.prepare ───► PreparedInput(payload, context)
                                       model(payload)
Sample, context, model_output ───► OutputAdapter.finalize ───► Prediction

A BenchmarkableModel composes three things: an input adapter, a model callable (anything invocable as model(payload) or model(**payload)), and an output adapter. Together they satisfy the BenchmarkModel ABC that the runner consumes.

Three fast paths

Path 1: any Hugging Face depth checkpoint from the CLI

rpx bench monocular_depth \
    --hf-checkpoint my-org/my-depth-model \
    --split hard

Works with any checkpoint loadable via transformers.AutoModelForDepthEstimation. The shipped HFDepthInputAdapter / HFDepthOutputAdapter pair handles preprocessing, postprocessing, and resizing predictions back to the ground-truth resolution.

The output adapter introspects the processor signature at setup time so it correctly handles checkpoints with different post-process kwargs (DA-v2 takes target_sizes only; ZoeDepth also requires source_sizes; PromptDA takes an optional prompt_depth).
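The shipped adapter's exact mechanics aren't reproduced here, but the idea can be sketched with the standard-library inspect module (the helper name and the exclusion of the outputs parameter are illustrative assumptions, not the toolkit's real code):

```python
import inspect

def supported_postprocess_kwargs(processor) -> set:
    """Illustrative sketch: collect the keyword names accepted by this
    processor's post_process_depth_estimation method, so an adapter can
    pass target_sizes / source_sizes / prompt_depth only when the
    checkpoint's processor actually understands them."""
    sig = inspect.signature(processor.post_process_depth_estimation)
    return {
        name
        for name, p in sig.parameters.items()
        if p.kind in (p.POSITIONAL_OR_KEYWORD, p.KEYWORD_ONLY)
        and name != "outputs"  # the model output itself, not a size kwarg
    }
```

At finalize time the adapter can then assemble its kwargs dict from whichever subset the processor supports.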

Path 2: wrap a plain NumPy function

import numpy as np
import rpx_benchmark as rpx

def my_depth(rgb: np.ndarray) -> np.ndarray:
    """rgb: H x W x 3 uint8 → returns H x W float32 in metres."""
    ...
    return depth_map

bm = rpx.make_numpy_depth_model(my_depth, name="my_numpy_depth")

cfg = rpx.MonocularDepthRunConfig(model=bm, split="hard", device="cpu")
result, report, paths = rpx.run_monocular_depth(cfg)

print(result.aggregated)                     # absrel, rmse, delta1..3
print(report.weighted_phase_score.to_dict()) # phase-stratified scores

For segmentation, the symmetric helper is rpx.make_numpy_mask_model(fn), where fn(rgb) returns an H x W int32 mask.
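As a concrete stand-in for such an fn, here is a toy thresholding segmenter with that calling convention (the segmentation logic is obviously a placeholder, not a real model):

```python
import numpy as np

def my_seg(rgb: np.ndarray) -> np.ndarray:
    """rgb: H x W x 3 uint8 -> H x W int32 mask.
    Toy rule: label a pixel 1 where its mean channel value exceeds 127,
    else 0. A real segmenter would run inference here instead."""
    return (rgb.mean(axis=-1) > 127).astype(np.int32)

# bm = rpx.make_numpy_mask_model(my_seg, name="my_numpy_seg")
```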

Path 3: compose a BenchmarkableModel by hand

from transformers import AutoImageProcessor, AutoModelForDepthEstimation
import rpx_benchmark as rpx
from rpx_benchmark.adapters.depth_hf import (
    HFDepthInputAdapter, HFDepthOutputAdapter,
)

processor = AutoImageProcessor.from_pretrained("my-org/my-model")
model = (
    AutoModelForDepthEstimation
    .from_pretrained("my-org/my-model")
    .to("cuda").eval()
)

bm = rpx.BenchmarkableModel(
    task=rpx.TaskType.MONOCULAR_DEPTH,
    input_adapter=HFDepthInputAdapter(processor=processor, device="cuda"),
    model=model,
    output_adapter=HFDepthOutputAdapter(processor=processor),
    name="my_model",
)

rpx.run_monocular_depth(
    rpx.MonocularDepthRunConfig(model=bm, split="hard")
)

For non-HuggingFace model families, clone rpx_benchmark/adapters/depth_unidepth.py (UniDepth pattern) or rpx_benchmark/adapters/depth_metric3d.py (torch.hub + letterbox pattern).
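The letterbox pattern mentioned above (resize the long side to the network's input size, pad to a square, and remember the transform so the output adapter can undo it) can be sketched in plain NumPy. The default size and the nearest-neighbour resize are illustrative simplifications; the shipped Metric3D adapter is the reference:

```python
import numpy as np

def letterbox(rgb: np.ndarray, size: int = 518, pad_value: int = 0):
    """Resize the longer side to `size`, pad to a square canvas, and
    return (padded_image, scale, (pad_top, pad_left)) so a downstream
    output adapter can invert the transform. Nearest-neighbour resize
    keeps the sketch dependency-free."""
    h, w = rgb.shape[:2]
    scale = size / max(h, w)
    nh, nw = round(h * scale), round(w * scale)
    # Nearest-neighbour index maps for the resize.
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = rgb[ys][:, xs]
    # Centre the resized image on a constant-valued square canvas.
    out = np.full((size, size, 3), pad_value, dtype=rgb.dtype)
    top, left = (size - nh) // 2, (size - nw) // 2
    out[top:top + nh, left:left + nw] = resized
    return out, scale, (top, left)
```

An input adapter would stash scale and the pad offsets in PreparedInput.context; the output adapter crops and rescales the predicted depth back to the original resolution.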

Writing your own input / output adapter

The protocols are minimal:

from typing import Any, Dict
from rpx_benchmark.adapters import (
    BenchmarkableModel, InputAdapter, OutputAdapter, PreparedInput,
)
from rpx_benchmark.api import DepthPrediction, Sample, TaskType

class MyInputAdapter(InputAdapter):
    def setup(self) -> None: ...           # optional one-time init
    def prepare(self, sample: Sample) -> PreparedInput:
        # Return whatever your model wants plus any context you need
        # later for post-processing.
        return PreparedInput(
            payload={"pixel_values": some_tensor},
            context={"target_hw": sample.rgb.shape[:2]},
        )

class MyOutputAdapter(OutputAdapter):
    def setup(self) -> None: ...
    def finalize(self, model_output: Any, context: Dict[str, Any],
                 sample: Sample) -> DepthPrediction:
        # Run your post-processing and return a task Prediction dataclass.
        return DepthPrediction(depth_map=...)

bm = BenchmarkableModel(
    task=TaskType.MONOCULAR_DEPTH,
    input_adapter=MyInputAdapter(),
    model=my_model_object,                 # any callable or nn.Module
    output_adapter=MyOutputAdapter(),
    name="my_custom_model",
)

The default invoker calls model(**payload) when the payload is a dict, otherwise model(payload). Override with invoker= if your model needs a different calling convention (see how make_unidepth_v2_model uses a custom invoker to call .infer(...)).
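For illustration, a custom invoker might look like the following, assuming it receives (model, payload); check the make_unidepth_v2_model source for the actual signature:

```python
class InferStyleModel:
    """Hypothetical model whose entry point is .infer, not __call__."""
    def infer(self, pixel_values):
        # Echoes its input; stands in for real inference.
        return {"depth": pixel_values}

def infer_invoker(model, payload):
    # Route the prepared dict payload to .infer instead of the
    # default model(**payload) / model(payload) convention.
    return model.infer(**payload)

# bm = BenchmarkableModel(..., invoker=infer_invoker, ...)
```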