
Loader (rpx_benchmark.loader)

Reads a manifest JSON and yields Sample batches to the runner.

loader

Data loading utilities for RPX benchmark.

Reads a JSON manifest describing a slice of the RPX dataset (usually produced by :func:`rpx_benchmark.hub.download_split`) and yields :class:`~rpx_benchmark.api.Sample` batches that the :class:`BenchmarkRunner` can feed to a model.

Manifest format

::

{
  "task":   "monocular_depth",
  "root":   "/absolute/path/or/snapshot/root",
  "samples": [
    {
      "id":         "scene_001_clutter_00000",
      "scene":      "scene_001",
      "phase":      "clutter",
      "difficulty": "hard",
      "rgb":   "scenes/scene_001/0/rgb/00000.png",
      "depth": "scenes/scene_001/0/depth/00000.png",
      "mask":  "scenes/scene_001/0/mask/00000.png",
      "pose":  "scenes/scene_001/0/pose/00000.npz"
    },
    ...
  ]
}

All sample paths are resolved relative to root unless they are absolute. depth files are 16-bit PNGs in millimetres (as saved by save_device_data.py); the loader converts to float32 metres on read. pose files are .npz with position and orientation (T265 convention, [x, y, z, w] quaternion).
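The two rules above (path resolution and depth unit conversion) can be sketched as follows. This is a minimal illustration, not the loader's actual code: the helper names `resolve_sample_path` and `depth_mm_to_metres` are hypothetical, the root `/data/rpx` is made up, and a small synthetic array stands in for a decoded 16-bit PNG.

```python
from pathlib import PurePosixPath

import numpy as np


def resolve_sample_path(root: PurePosixPath, entry: str) -> PurePosixPath:
    """Join ``entry`` onto ``root``; pathlib keeps absolute entries unchanged."""
    # When the right-hand operand is absolute, pathlib discards the left side,
    # which matches "relative to root unless they are absolute".
    return root / entry


def depth_mm_to_metres(depth_mm: np.ndarray) -> np.ndarray:
    """Convert a uint16 depth image in millimetres to float32 metres."""
    return depth_mm.astype(np.float32) / 1000.0


root = PurePosixPath("/data/rpx")  # hypothetical root
rel = resolve_sample_path(root, "scenes/scene_001/0/depth/00000.png")
abs_ = resolve_sample_path(root, "/mnt/other/00000.png")

# Synthetic stand-in for a decoded 16-bit depth PNG (values in millimetres).
raw = np.array([[0, 500], [1500, 65535]], dtype=np.uint16)
depth_m = depth_mm_to_metres(raw)
```

`PurePosixPath` is used here only so the example behaves the same on every platform; real code would normally use `Path`.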

Errors

All load-time failures are raised as :class:`~rpx_benchmark.exceptions.ManifestError` so user code can catch them specifically.
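The catch pattern this enables might look like the sketch below. So that it runs without the package installed, it defines a stand-in `ManifestError` and a hypothetical `load_manifest_text` helper; the `hint` attribute is an assumption based on the `hint=` keyword visible in the source listings on this page, not a documented part of the exception's interface.

```python
import json
from typing import Any, Dict, Optional, Tuple


class ManifestError(Exception):
    """Stand-in for rpx_benchmark.exceptions.ManifestError (assumed shape)."""

    def __init__(self, message: str, hint: Optional[str] = None) -> None:
        super().__init__(message)
        self.hint = hint  # assumed attribute name


def load_manifest_text(text: str) -> Dict[str, Any]:
    """Hypothetical loader that funnels every failure into ManifestError."""
    try:
        manifest = json.loads(text)
    except json.JSONDecodeError as e:
        raise ManifestError(f"Manifest is not valid JSON: {e}") from e
    if "task" not in manifest:
        raise ManifestError(
            "Manifest is missing required field 'task'.",
            hint="Add a top-level 'task' entry.",
        )
    return manifest


def try_load(text: str) -> Tuple[Optional[Dict[str, Any]], Optional[str]]:
    """Catch only manifest problems; any other exception still propagates."""
    try:
        return load_manifest_text(text), None
    except ManifestError as err:
        return None, str(err)


ok, no_error = try_load('{"task": "monocular_depth", "samples": []}')
_, bad_json = try_load("{not json")
_, no_task = try_load('{"samples": []}')
```

With the real package, the same `except ManifestError` clause would wrap a call to `RPXDataset.from_manifest` or `RPXDataset.from_dict`.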

RPXDataset(samples: List[Dict[str, Any]], task: TaskType, root: Path, batch_size: int = 1) dataclass

Iterates over RPX samples for a specific task.

Manifest format (JSON)::

{
  "task": "object_segmentation",
  "root": "/path/to/data",
  "samples": [
    {
      "id": "scene_001_clutter_00000",
      "scene": "scene_001",
      "phase": "clutter",
      "difficulty": "hard",
      "rgb":   "scene_001/0/rgb/00000.png",
      "depth": "scene_001/0/depth/00000.png",
      "mask":  "scene_001/0/mask/00000.png",
      "pose":  "scene_001/0/pose/00000.npz",
      ...
    }
  ]
}

All paths are resolved relative to root unless they are absolute. depth files are 16-bit PNGs in millimetres (as saved by save_device_data.py). pose files are NPZ archives with keys position ([x, y, z] in metres) and orientation ([x, y, z, w] quaternion) from the T265 tracker.
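To make the pose layout concrete, the sketch below writes a synthetic pose archive to memory and reads it back with the two keys named above. The numeric values are invented for illustration; only the key names and the scalar-last `[x, y, z, w]` ordering come from the description.

```python
import io

import numpy as np

# Write a synthetic pose file to memory so the sketch needs no dataset on disk.
buf = io.BytesIO()
np.savez(
    buf,
    position=np.array([0.1, -0.2, 1.5]),          # [x, y, z] in metres
    orientation=np.array([0.0, 0.0, 0.0, 1.0]),   # [x, y, z, w] quaternion
)
buf.seek(0)

# np.load returns a lazy archive; index it by key to get the arrays.
with np.load(buf) as pose:
    position = pose["position"]
    orientation = pose["orientation"]

# Scalar-last layout: the real part w is the final element.
qx, qy, qz, qw = orientation

# A rotation quaternion should have unit norm; [0, 0, 0, 1] is the identity.
norm = float(np.linalg.norm(orientation))
```

Libraries that expect scalar-first quaternions (`[w, x, y, z]`) would need the components reordered before use.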

from_manifest(manifest_path: str | Path, batch_size: int = 1) -> 'RPXDataset' classmethod

Load a manifest JSON file from disk and return a dataset.

Parameters:

Name | Type | Description | Default
manifest_path | str or Path | Path to the manifest JSON. Produced either by :func:`rpx_benchmark.hub.download_split` or by a custom upload script. | required
batch_size | int | Number of samples per iteration. Default 1. | 1

Returns:

Type Description
RPXDataset

Raises:

Type | Description
ManifestError | If the manifest file is missing, not valid JSON, or is missing required top-level fields.

Source code in rpx_benchmark/loader.py
@classmethod
def from_manifest(cls, manifest_path: str | Path, batch_size: int = 1) -> "RPXDataset":
    """Load a manifest JSON file from disk and return a dataset.

    Parameters
    ----------
    manifest_path : str or Path
        Path to the manifest JSON. Produced either by
        :func:`rpx_benchmark.hub.download_split` or by a custom
        upload script.
    batch_size : int
        Number of samples per iteration. Default 1.

    Returns
    -------
    RPXDataset

    Raises
    ------
    ManifestError
        If the manifest file is missing, not valid JSON, or is
        missing required top-level fields.
    """
    manifest_path = Path(manifest_path)
    if not manifest_path.is_file():
        raise ManifestError(
            f"Manifest file not found: {manifest_path}",
            hint="Did the HuggingFace download fail? Try rerunning with "
                 "--cache-dir pointing at a writable location.",
        )
    try:
        with manifest_path.open("r", encoding="utf-8") as f:
            manifest = json.load(f)
    except json.JSONDecodeError as e:
        raise ManifestError(
            f"Manifest at {manifest_path} is not valid JSON: {e}",
        ) from e
    return cls.from_dict(
        manifest, batch_size=batch_size, default_root=manifest_path.parent
    )
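As a usage sketch for `from_manifest`, the snippet below writes a one-sample manifest to a temporary directory and repeats the file checks shown in the listing above. The manifest deliberately omits `root` to illustrate the fallback to the manifest's parent directory. The final call is guarded by an import check, so the snippet still runs when `rpx_benchmark` is not installed; the referenced image files do not exist, so nothing beyond manifest parsing is exercised.

```python
import json
import tempfile
from pathlib import Path

manifest = {
    "task": "monocular_depth",
    # "root" is omitted on purpose: from_manifest passes manifest_path.parent
    # as default_root, and from_dict falls back to it.
    "samples": [
        {
            "id": "scene_001_clutter_00000",
            "scene": "scene_001",
            "phase": "clutter",
            "difficulty": "hard",
            "rgb": "scenes/scene_001/0/rgb/00000.png",
            "depth": "scenes/scene_001/0/depth/00000.png",
            "mask": "scenes/scene_001/0/mask/00000.png",
            "pose": "scenes/scene_001/0/pose/00000.npz",
        }
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    manifest_path = Path(tmp) / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")

    # The same steps from_manifest performs before delegating to from_dict.
    file_exists = manifest_path.is_file()
    with manifest_path.open("r", encoding="utf-8") as f:
        loaded = json.load(f)
    fallback_root = Path(loaded.get("root") or manifest_path.parent or ".")

    try:
        from rpx_benchmark.loader import RPXDataset  # only if installed
    except ImportError:
        dataset = None
    else:
        dataset = RPXDataset.from_manifest(manifest_path, batch_size=4)
```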

from_dict(manifest: Dict[str, Any], batch_size: int = 1, default_root: str | Path | None = None) -> 'RPXDataset' classmethod

Build a dataset from an already-parsed manifest dict.

Raises:

Type | Description
ManifestError | If task is missing or unknown, or if samples is missing or is not a list.

Source code in rpx_benchmark/loader.py
@classmethod
def from_dict(
    cls,
    manifest: Dict[str, Any],
    batch_size: int = 1,
    default_root: str | Path | None = None,
) -> "RPXDataset":
    """Build a dataset from an already-parsed manifest dict.

    Raises
    ------
    ManifestError
        If ``task`` is missing or unknown, or if ``samples`` is
        missing or is not a list.
    """
    if "task" not in manifest:
        raise ManifestError(
            "Manifest is missing required field 'task'.",
            hint="Task must be one of: " +
                 ", ".join(t.value for t in TaskType),
        )
    try:
        task = TaskType(manifest["task"])
    except ValueError as e:
        raise ManifestError(
            f"Manifest task {manifest['task']!r} is not a known TaskType.",
            hint="Expected one of: " +
                 ", ".join(t.value for t in TaskType),
        ) from e

    if "samples" not in manifest:
        raise ManifestError(
            "Manifest is missing required field 'samples'.",
        )

    root = Path(manifest.get("root") or default_root or ".")
    samples = manifest["samples"]
    if not isinstance(samples, list):
        raise ManifestError(
            f"Manifest 'samples' must be a list, got {type(samples).__name__}",
        )
    log.debug("loaded manifest: task=%s root=%s samples=%d",
              task.value, root, len(samples))
    return cls(samples=samples, task=task, root=root, batch_size=batch_size)
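The iteration code is not shown on this page, so how `batch_size` is applied is not documented here. One plausible reading of "number of samples per iteration" is plain chunking of the `samples` list, sketched below. The helper `iter_batches` is hypothetical, and keeping a shorter final batch (rather than dropping it) is an assumption.

```python
from typing import Any, Dict, Iterator, List


def iter_batches(
    samples: List[Dict[str, Any]], batch_size: int = 1
) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of ``samples`` of length ``batch_size``.

    The last batch may be shorter when len(samples) is not a multiple
    of batch_size (assumed behaviour, not confirmed by the source).
    """
    if batch_size < 1:
        raise ValueError(f"batch_size must be >= 1, got {batch_size}")
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


# Five synthetic manifest entries, following the id pattern used above.
samples = [{"id": f"scene_001_clutter_{i:05d}"} for i in range(5)]
batches = list(iter_batches(samples, batch_size=2))
batch_sizes = [len(b) for b in batches]
```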