Tasks (rpx_benchmark.tasks)¶
End-to-end benchmark pipelines plus the task plugin registry. Nine of ten tasks have a runnable pipeline; object tracking is deferred until a sequence-per-sample protocol is decided.
Registry¶
registry
¶
Pluggable task registry for RPX benchmark pipelines.
Each benchmark task (monocular depth, segmentation, tracking, …)
registers a :class:TaskSpec describing:
- which modalities must be downloaded from HuggingFace,
- which metric drives the ESD-weighted phase score,
- how to parse CLI arguments into a typed config,
- how to run the end-to-end pipeline (download → load → evaluate → report).
The CLI auto-discovers every registered task at startup and
generates one rpx bench <task> subcommand per entry, so adding a
new task requires zero changes to cli.py — you write a new
module under rpx_benchmark/tasks/ and decorate the runner with
@register_task(...).
Example
::

    from rpx_benchmark.api import TaskType
    from rpx_benchmark.tasks.registry import TaskSpec, register_task

    @register_task(
        TaskSpec(
            task=TaskType.OBJECT_SEGMENTATION,
            display_name="Object Segmentation",
            primary_metric="miou",
            higher_is_better=True,
            required_modalities=["rgb", "mask"],
            build_config=_build_seg_config,
            run=_run_segmentation,
            add_cli_arguments=_add_seg_cli_arguments,
        )
    )
    def _run_segmentation(cfg): ...
TaskSpec(task: TaskType, display_name: str, primary_metric: str, required_modalities: List[str], build_config: BuildConfigFn, run: RunFn, add_cli_arguments: AddCliArgsFn, higher_is_better: bool = False, description: str = '')
dataclass
¶
Everything the CLI + runner need to drive a task end-to-end.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `task` | `TaskType` | Enum value identifying the task. | *required* |
| `display_name` | `str` | Human-readable name shown in CLI help and UI headers. | *required* |
| `primary_metric` | `str` | Metric key (as produced by the task's calculators) used to drive the ESD-weighted phase score. Interpreted lower-is-better unless `higher_is_better` is set. | *required* |
| `required_modalities` | `list[str]` | Hint describing which RPX modalities (e.g. `"rgb"`, `"mask"`) must be downloaded from HuggingFace. | *required* |
| `build_config` | `callable` | Parses CLI arguments into the task's typed config. | *required* |
| `run` | `callable` | Runs the task's end-to-end pipeline. | *required* |
| `add_cli_arguments` | `callable` | Adds the task's extra arguments to its CLI subparser. | *required* |
| `higher_is_better` | `bool` | When True, the primary metric is interpreted higher-is-better in reports and deltas. Default False (error metrics). | `False` |
| `description` | `str` | One-line description used in subparser help. | `''` |
register_task(spec: TaskSpec) -> TaskSpec
¶
Install a :class:TaskSpec into the global registry.
Returns the spec unchanged, so it can be used as a decorator or a direct call.
Raises:

| Type | Description |
|---|---|
| `ConfigError` | If another spec is already registered under the same task. |
Source code in rpx_benchmark/tasks/registry.py
unregister_task(task: TaskType) -> bool
¶
Remove the spec registered under task, if one exists.
Returns True when a spec was removed, False when nothing was registered.
get_task_spec(task: TaskType) -> TaskSpec
¶
Look up a registered spec.
Raises:

| Type | Description |
|---|---|
| `ConfigError` | If the task is not registered. The error lists every registered task so the user can quickly spot typos. |
Source code in rpx_benchmark/tasks/registry.py
available_tasks() -> List[TaskType]
¶
Return the list of currently registered tasks.
Shared pipeline helper¶
_pipeline
¶
Shared plumbing for RPX task-level pipelines.
Every task runner under :mod:rpx_benchmark.tasks executes the same
five-step flow — download the split, load the dataset, resolve the
model, run the benchmark runner, write JSON + markdown reports.
That flow lives here so each task module only has to describe what
is task-specific (config subclass, model resolver, primary metric,
CLI argument parser) and delegate the boilerplate.
Design

- :class:`TaskRunConfig` owns every field that every task's run function takes. Subclass it with `@dataclass` when a task wants extra fields.
- :func:`validate_model_selector` enforces the three-way "exactly one of `model` / `model_name` / `hf_checkpoint`" contract shared by every task.
- :func:`normalise_split` turns a string like `"hard"` into :class:`Difficulty.HARD` at config construction time.
- :func:`resolve_device` applies the CUDA-availability fallback before any weights are downloaded.
- :func:`run_pipeline` is the entry point every task runner calls. It takes the task enum, the config, a primary metric key, and a task-specific callable that turns the config into a :class:`BenchmarkableModel`.
Refactor history: the original run_monocular_depth and
run_segmentation each duplicated ~250 lines of plumbing. Adding
a new task used to mean cloning + editing a big file; now it is
roughly 80 lines of subclass + resolver + registration.
TaskRunConfig(model: Optional[BenchmarkableModel] = None, model_name: Optional[str] = None, hf_checkpoint: Optional[str] = None, split: Difficulty | str = Difficulty.HARD, repo_id: Optional[str] = None, cache_dir: Optional[str] = None, revision: Optional[str] = None, batch_size: int = 1, device: str = 'cuda', output_dir: Optional[str] = None, model_kwargs: Dict[str, Any] = dict(), progress: Optional[ProgressCallback] = None)
dataclass
¶
Base configuration shared by every task runner.
Task-specific config classes should inherit from this dataclass
and add their own fields. The common fields below are consumed
by :func:run_pipeline and cover the download, load, model
resolution, and reporting steps.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `BenchmarkableModel` | A fully-constructed model object. Takes precedence over the other two selectors when provided. | `None` |
| `model_name` | `str` | Name of a registered model factory. | `None` |
| `hf_checkpoint` | `str` | HuggingFace Hub checkpoint id. Only meaningful when the task runner ships an HF fast path. | `None` |
| `split` | `Difficulty or str` | ESD difficulty split to evaluate on. String values are normalised to :class:`Difficulty` at config construction time. | `HARD` |
| `repo_id` | `str` | HuggingFace dataset repo hosting RPX. Defaults to the library's default repo id when None. | `None` |
| `cache_dir` | `str` | HuggingFace cache directory override. | `None` |
| `revision` | `str` | HF revision (branch / tag) to pin. | `None` |
| `batch_size` | `int` | Dataset batch size. | `1` |
| `device` | `str` | Torch device string. Auto-falls back to `'cpu'` when CUDA is unavailable. | `'cuda'` |
| `output_dir` | `str` | Directory the JSON and markdown reports are written into. | `None` |
| `model_kwargs` | `dict` | Extra kwargs forwarded to the model factory when resolving from `model_name`. | `dict()` |
| `progress` | `ProgressCallback` | Per-sample progress callback the runner will invoke. | `None` |
Raises:

| Type | Description |
|---|---|
| `ConfigError` | If the model selectors or split are invalid. Subclasses must call :func:`validate_model_selector` from their `__post_init__`. |
validate_model_selector(cfg: TaskRunConfig) -> None
¶
Raise :class:ConfigError unless exactly one selector is set.
Called from every task config's __post_init__. The three
selectors (model, model_name, hf_checkpoint) are
mutually exclusive — supplying zero or two+ is a user error.
Source code in rpx_benchmark/tasks/_pipeline.py
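The selector contract can be sketched as follows (the `ConfigError` class and the minimal `Cfg` dataclass are hypothetical stand-ins for the real types):

```python
from dataclasses import dataclass
from typing import Any, Optional


class ConfigError(ValueError):  # stand-in for rpx_benchmark's ConfigError
    pass


@dataclass
class Cfg:  # hypothetical minimal config carrying just the three selectors
    model: Optional[Any] = None
    model_name: Optional[str] = None
    hf_checkpoint: Optional[str] = None


def validate_model_selector(cfg) -> None:
    # Exactly one of the three mutually exclusive selectors must be set;
    # zero or two-plus is a user error.
    n = sum(s is not None
            for s in (cfg.model, cfg.model_name, cfg.hf_checkpoint))
    if n != 1:
        raise ConfigError(
            f"exactly one of model / model_name / hf_checkpoint required, got {n}"
        )
```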
normalise_split(split: Difficulty | str) -> Difficulty
¶
Turn a string split into a :class:Difficulty enum.
Raises :class:ConfigError with a usable hint when the string
does not match one of the three ESD difficulty levels.
Source code in rpx_benchmark/tasks/_pipeline.py
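A sketch of the normalisation, assuming illustrative level names (only `"hard"` is confirmed by this page; `EASY` and `MEDIUM` are placeholders for the real ESD levels):

```python
from enum import Enum


class Difficulty(Enum):  # stand-in; "hard" is the only name confirmed above
    EASY = "easy"
    MEDIUM = "medium"
    HARD = "hard"


class ConfigError(ValueError):  # stand-in for rpx_benchmark's ConfigError
    pass


def normalise_split(split):
    # Pass enum values through; map strings case-insensitively.
    if isinstance(split, Difficulty):
        return split
    try:
        return Difficulty(str(split).lower())
    except ValueError:
        valid = ", ".join(d.value for d in Difficulty)
        raise ConfigError(
            f"unknown split {split!r}; expected one of: {valid}"
        ) from None
```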
resolve_device(requested: str) -> str
¶
Return requested or fall back to 'cpu' when CUDA is unavailable.
Emits a WARNING log line on fallback so users see why their
--device cuda request quietly landed on CPU.
Source code in rpx_benchmark/tasks/_pipeline.py
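The fallback logic, sketched with the CUDA probe injected as a parameter (the real helper would read `torch.cuda.is_available()`; it is a plain argument here so the sketch carries no torch dependency):

```python
import logging


def resolve_device(requested: str, cuda_available: bool) -> str:
    # Fall back to CPU when CUDA was requested but is not present,
    # and warn so the user sees why their request landed on CPU.
    if requested.startswith("cuda") and not cuda_available:
        logging.getLogger("rpx").warning(
            "--device %s requested but CUDA is unavailable; using cpu",
            requested,
        )
        return "cpu"
    return requested
```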
display_name(cfg: TaskRunConfig, model: BenchmarkableModel) -> str
¶
Pick the best human name for a model across the three selectors.
Source code in rpx_benchmark/tasks/_pipeline.py
run_pipeline(*, task: TaskType, primary_metric: str, cfg: TaskRunConfig, model_resolver: Callable[[TaskRunConfig], BenchmarkableModel], compute_ts: bool = True, compute_sgc: bool = False) -> PipelineResult
¶
Execute a task end-to-end.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `task` | `TaskType` | The task enum. Used for the HuggingFace download scope and for metric suite construction. | *required* |
| `primary_metric` | `str` | Metric key that drives the ESD-weighted phase score (lower-is-better metrics such as absrel, rmse, rotation_error_deg are the default). | *required* |
| `cfg` | `TaskRunConfig` | A task-specific subclass of :class:`TaskRunConfig`. | *required* |
| `model_resolver` | `callable` | Turns the config into a :class:`BenchmarkableModel`. | *required* |
| `compute_ts` | `bool` | Whether to compute Temporal Stability. Set False for tasks where TS is meaningless (single-sample predictions across unrelated pairs, e.g. pose / keypoint matching). | `True` |
| `compute_sgc` | `bool` | Whether to compute Stack-Level Geometric Coherence. Only meaningful when running segmentation + depth jointly. | `False` |

Returns:

| Type | Description |
|---|---|
| `(BenchmarkResult, DeploymentReadinessReport, dict)` | Same shape for every task. The dict contains the `json` and `markdown` report paths. |

Raises:

| Type | Description |
|---|---|
| `(ConfigError, DownloadError, ManifestError, ModelError, MetricError)` | Propagated from the respective subsystem with a hint. |
Source code in rpx_benchmark/tasks/_pipeline.py
Monocular absolute depth¶
monocular_depth
¶
Monocular absolute depth benchmark pipeline.
Reference implementation for a task module. Every other task follows
exactly the same structure: subclass :class:TaskRunConfig, define a
model resolver, wire a :class:TaskSpec. The heavy lifting lives in
:mod:rpx_benchmark.tasks._pipeline so each task file stays small.
Three usage paths

1. Zero-code — any HuggingFace depth checkpoint via the CLI::

       rpx bench monocular_depth \
           --hf-checkpoint depth-anything/Depth-Anything-V2-Metric-Indoor-Large-hf \
           --split hard

2. One-line Python — a plain numpy callable::

       import numpy as np
       import rpx_benchmark as rpx

       def my_depth(rgb: np.ndarray) -> np.ndarray:
           return ...  # H x W float32, metres

       bm = rpx.make_numpy_depth_model(my_depth, name="my_model")
       cfg = rpx.MonocularDepthRunConfig(model=bm, split="hard", device="cpu")
       result, report, paths = rpx.run_monocular_depth(cfg)

3. Custom adapter stack — provide your own
   :class:`~rpx_benchmark.adapters.BenchmarkableModel` and hand it to
   the config via model=.
MonocularDepthRunConfig(model: Optional[BenchmarkableModel] = None, model_name: Optional[str] = None, hf_checkpoint: Optional[str] = None, split: Difficulty | str = Difficulty.HARD, repo_id: Optional[str] = None, cache_dir: Optional[str] = None, revision: Optional[str] = None, batch_size: int = 1, device: str = 'cuda', output_dir: Optional[str] = None, model_kwargs: Dict[str, Any] = dict(), progress: Optional[ProgressCallback] = None)
dataclass
¶
Bases: TaskRunConfig
Runtime configuration for :func:run_monocular_depth.
Inherits every standard field from :class:TaskRunConfig. Adds
no task-specific fields — monocular depth is the "base case" a
new user encounters.
Examples:
>>> from rpx_benchmark import MonocularDepthRunConfig
>>> cfg = MonocularDepthRunConfig(
... hf_checkpoint="depth-anything/Depth-Anything-V2-Metric-Indoor-Small-hf",
... split="hard",
... device="cpu",
... )
run_monocular_depth(cfg: MonocularDepthRunConfig) -> PipelineResult
¶
Run the monocular absolute depth benchmark end-to-end.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `cfg` | `MonocularDepthRunConfig` | Fully-populated run configuration. | *required* |

Returns:

| Type | Description |
|---|---|
| `(BenchmarkResult, DeploymentReadinessReport, dict)` | Same tuple shape as every task; the dict holds the `json` and `markdown` report paths. |

Raises:

| Type | Description |
|---|---|
| `(ConfigError, DownloadError, ManifestError, ModelError, MetricError)` | Propagated from the respective subsystem with actionable hints. |
Examples:
>>> import rpx_benchmark as rpx
>>> import numpy as np
>>> def my_depth(rgb): return np.ones(rgb.shape[:2], dtype=np.float32) * 2.0
>>> bm = rpx.make_numpy_depth_model(my_depth, name="unit")
>>> cfg = rpx.MonocularDepthRunConfig(model=bm, split="hard", device="cpu")
>>> result, report, paths = rpx.run_monocular_depth(cfg)
Source code in rpx_benchmark/tasks/monocular_depth.py
Object segmentation¶
segmentation
¶
Object segmentation benchmark pipeline.
Same template as :mod:rpx_benchmark.tasks.monocular_depth — swap
the task-specific knobs (TaskType, primary metric,
higher_is_better, model resolver), inherit everything else from
:func:rpx_benchmark.tasks._pipeline.run_pipeline.
Usage
::

    rpx bench object_segmentation \
        --hf-checkpoint facebook/mask2former-swin-tiny-coco-instance \
        --split hard
SegmentationRunConfig(model: Optional[BenchmarkableModel] = None, model_name: Optional[str] = None, hf_checkpoint: Optional[str] = None, split: Difficulty | str = Difficulty.HARD, repo_id: Optional[str] = None, cache_dir: Optional[str] = None, revision: Optional[str] = None, batch_size: int = 1, device: str = 'cuda', output_dir: Optional[str] = None, model_kwargs: Dict[str, Any] = dict(), progress: Optional[ProgressCallback] = None)
dataclass
¶
Bases: TaskRunConfig
Runtime configuration for :func:run_segmentation.
See :class:TaskRunConfig for the full field reference.
Raises:

| Type | Description |
|---|---|
| `ConfigError` | If zero or more than one model selector is set, or if the split is not a valid ESD difficulty. |
run_segmentation(cfg: SegmentationRunConfig) -> PipelineResult
¶
Run the object-segmentation benchmark end-to-end.
Returns the same (BenchmarkResult, DeploymentReadinessReport,
{json, markdown}) tuple shape as :func:run_monocular_depth.
primary_metric="miou" and higher_is_better=True so the
deployment-readiness report interprets deltas accordingly.
Source code in rpx_benchmark/tasks/segmentation.py
Object detection + open-vocab detection¶
detection
¶
Object detection + open-vocabulary detection pipeline.
Handles both :attr:TaskType.OBJECT_DETECTION and
:attr:TaskType.OPEN_VOCAB_DETECTION — they share the same
Prediction / GroundTruth contract, only the ground-truth vocabulary
differs.
ObjectDetectionRunConfig(model: Optional[BenchmarkableModel] = None, model_name: Optional[str] = None, hf_checkpoint: Optional[str] = None, split: Difficulty | str = Difficulty.HARD, repo_id: Optional[str] = None, cache_dir: Optional[str] = None, revision: Optional[str] = None, batch_size: int = 1, device: str = 'cuda', output_dir: Optional[str] = None, model_kwargs: Dict[str, Any] = dict(), progress: Optional[ProgressCallback] = None)
dataclass
¶
Bases: TaskRunConfig
Runtime configuration for :func:run_object_detection.
See :class:TaskRunConfig for the shared field reference.
Detection currently has no registered model factories, so the
only supported selector is model= (a pre-built
:class:BenchmarkableModel, typically from
:func:rpx_benchmark.make_numpy_detection_model).
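A sketch of the kind of numpy callable one might hand to `make_numpy_detection_model`. The `(boxes, scores, labels)` return format is an assumption for illustration; check the wrapper's actual contract before relying on it:

```python
import numpy as np


def no_op_detector(rgb: np.ndarray):
    # Degenerate baseline returning zero detections. Assumed format:
    # (N, 4) xyxy pixel boxes, (N,) confidence scores, (N,) class ids.
    boxes = np.zeros((0, 4), dtype=np.float32)
    scores = np.zeros((0,), dtype=np.float32)
    labels = np.zeros((0,), dtype=np.int64)
    return boxes, scores, labels
```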
run_object_detection(cfg: ObjectDetectionRunConfig) -> PipelineResult
¶
Run the object-detection benchmark end-to-end.
The returned BenchmarkResult.aggregated has precision,
recall, and f1 keys produced by
:class:rpx_benchmark.metrics.detection.DetectionMetrics. The
deployment report uses f1 as the primary metric and treats
higher as better.
Source code in rpx_benchmark/tasks/detection.py
run_open_vocab_detection(cfg: ObjectDetectionRunConfig) -> PipelineResult
¶
Run the open-vocabulary detection benchmark end-to-end.
Uses the same metric suite as :func:run_object_detection but
evaluates on :attr:TaskType.OPEN_VOCAB_DETECTION manifests.
Source code in rpx_benchmark/tasks/detection.py
Visual grounding¶
visual_grounding
¶
Visual grounding benchmark pipeline.
VisualGroundingRunConfig(model: Optional[BenchmarkableModel] = None, model_name: Optional[str] = None, hf_checkpoint: Optional[str] = None, split: Difficulty | str = Difficulty.HARD, repo_id: Optional[str] = None, cache_dir: Optional[str] = None, revision: Optional[str] = None, batch_size: int = 1, device: str = 'cuda', output_dir: Optional[str] = None, model_kwargs: Dict[str, Any] = dict(), progress: Optional[ProgressCallback] = None)
dataclass
¶
Bases: TaskRunConfig
Runtime configuration for :func:run_visual_grounding.
Grounding models take (rgb, text) and return the referred
bounding box. The referring expression is plucked from
sample.ground_truth.text by the adapter so the model never
sees the GT boxes.
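The `(rgb, text)` contract can be illustrated with a degenerate baseline; the `(x_min, y_min, x_max, y_max)` pixel convention assumed here is not confirmed by this page:

```python
import numpy as np


def full_image_box(rgb: np.ndarray, text: str) -> np.ndarray:
    # Always ground the referring expression to the whole frame,
    # assuming an (x_min, y_min, x_max, y_max) pixel box convention.
    h, w = rgb.shape[:2]
    return np.array([0.0, 0.0, float(w), float(h)], dtype=np.float32)
```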
run_visual_grounding(cfg: VisualGroundingRunConfig) -> PipelineResult
¶
Run the visual-grounding benchmark end-to-end.
Source code in rpx_benchmark/tasks/visual_grounding.py
Relative camera pose¶
relative_pose
¶
Relative camera pose benchmark pipeline.
RelativePoseRunConfig(model: Optional[BenchmarkableModel] = None, model_name: Optional[str] = None, hf_checkpoint: Optional[str] = None, split: Difficulty | str = Difficulty.HARD, repo_id: Optional[str] = None, cache_dir: Optional[str] = None, revision: Optional[str] = None, batch_size: int = 1, device: str = 'cuda', output_dir: Optional[str] = None, model_kwargs: Dict[str, Any] = dict(), progress: Optional[ProgressCallback] = None)
dataclass
¶
Bases: TaskRunConfig
Runtime configuration for :func:run_relative_pose.
Pose models receive two RGB frames (rgb_a and rgb_b
plucked from sample.metadata by the adapter) and return the
predicted rotation + translation from frame A to frame B.
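A "no motion" baseline illustrating the two-frame contract; the `(3, 3)` rotation matrix plus `(3,)` translation return shape is an assumption about the adapter's expected format:

```python
import numpy as np


def static_camera_pose(rgb_a: np.ndarray, rgb_b: np.ndarray):
    # Predicts zero camera motion from frame A to frame B:
    # identity rotation and zero translation.
    return np.eye(3, dtype=np.float32), np.zeros(3, dtype=np.float32)
```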
run_relative_pose(cfg: RelativePoseRunConfig) -> PipelineResult
¶
Run the relative-camera-pose benchmark end-to-end.
Source code in rpx_benchmark/tasks/relative_pose.py
Keypoint matching¶
keypoint_matching
¶
Keypoint matching benchmark pipeline.
KeypointMatchingRunConfig(model: Optional[BenchmarkableModel] = None, model_name: Optional[str] = None, hf_checkpoint: Optional[str] = None, split: Difficulty | str = Difficulty.HARD, repo_id: Optional[str] = None, cache_dir: Optional[str] = None, revision: Optional[str] = None, batch_size: int = 1, device: str = 'cuda', output_dir: Optional[str] = None, model_kwargs: Dict[str, Any] = dict(), progress: Optional[ProgressCallback] = None)
dataclass
¶
Bases: TaskRunConfig
Runtime configuration for :func:run_keypoint_matching.
Matching models receive two RGB frames (rgb_a + rgb_b)
and return corresponding points in each image's pixel grid.
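The contract can be sketched with a degenerate matcher; returning two `(N, 2)` arrays of `(x, y)` pixels, one per image, is an assumed convention:

```python
import numpy as np


def centre_match(rgb_a: np.ndarray, rgb_b: np.ndarray):
    # A single "match" at each image centre; real matchers return
    # N corresponding points per image.
    ha, wa = rgb_a.shape[:2]
    hb, wb = rgb_b.shape[:2]
    pts_a = np.array([[wa / 2.0, ha / 2.0]], dtype=np.float32)
    pts_b = np.array([[wb / 2.0, hb / 2.0]], dtype=np.float32)
    return pts_a, pts_b
```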
run_keypoint_matching(cfg: KeypointMatchingRunConfig) -> PipelineResult
¶
Run the keypoint-matching benchmark end-to-end.
Source code in rpx_benchmark/tasks/keypoint_matching.py
Sparse depth¶
sparse_depth
¶
Sparse depth benchmark pipeline.
SparseDepthRunConfig(model: Optional[BenchmarkableModel] = None, model_name: Optional[str] = None, hf_checkpoint: Optional[str] = None, split: Difficulty | str = Difficulty.HARD, repo_id: Optional[str] = None, cache_dir: Optional[str] = None, revision: Optional[str] = None, batch_size: int = 1, device: str = 'cuda', output_dir: Optional[str] = None, model_kwargs: Dict[str, Any] = dict(), progress: Optional[ProgressCallback] = None)
dataclass
¶
Bases: TaskRunConfig
Runtime configuration for :func:run_sparse_depth.
Sparse-depth models receive (rgb, coordinates) where
coordinates is the (N, 2) float array of pixel locations
where depth is queried, and return an (N,) array of depth
values in metres at those exact coordinates.
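A constant-depth baseline honouring the `(rgb, coordinates)` → `(N,)` contract described above:

```python
import numpy as np


def constant_sparse_depth(rgb: np.ndarray, coordinates: np.ndarray) -> np.ndarray:
    # One depth value (metres) per queried (N, 2) pixel location.
    assert coordinates.ndim == 2 and coordinates.shape[1] == 2
    return np.full(coordinates.shape[0], 2.0, dtype=np.float32)
```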
run_sparse_depth(cfg: SparseDepthRunConfig) -> PipelineResult
¶
Run the sparse-depth benchmark end-to-end.
Source code in rpx_benchmark/tasks/sparse_depth.py
Novel view synthesis¶
novel_view_synthesis
¶
Novel view synthesis benchmark pipeline.
NovelViewSynthesisRunConfig(model: Optional[BenchmarkableModel] = None, model_name: Optional[str] = None, hf_checkpoint: Optional[str] = None, split: Difficulty | str = Difficulty.HARD, repo_id: Optional[str] = None, cache_dir: Optional[str] = None, revision: Optional[str] = None, batch_size: int = 1, device: str = 'cuda', output_dir: Optional[str] = None, model_kwargs: Dict[str, Any] = dict(), progress: Optional[ProgressCallback] = None)
dataclass
¶
Bases: TaskRunConfig
Runtime configuration for :func:run_novel_view_synthesis.
NVS models receive (rgb_source, target_pose) where the
target pose is a 4×4 SE(3) camera-to-world matrix plucked from
sample.ground_truth.camera_pose by the adapter. They return
a synthesised RGB image at the target viewpoint.
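The `(rgb_source, target_pose)` contract, illustrated with a trivial copy-the-source baseline; a real NVS model would re-render the scene from the 4×4 target pose:

```python
import numpy as np


def copy_source_view(rgb_source: np.ndarray, target_pose: np.ndarray) -> np.ndarray:
    # Ignores the 4x4 SE(3) camera-to-world target pose and echoes
    # the source view back as the "synthesised" image.
    assert target_pose.shape == (4, 4)
    return rgb_source.copy()
```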
run_novel_view_synthesis(cfg: NovelViewSynthesisRunConfig) -> PipelineResult
¶
Run the novel-view-synthesis benchmark end-to-end.