Forge Roadmap

Current Status (v0.1)

Supported Formats

Format	Read	Write	Notes
RLDS	✅	✅	Open X-Embodiment, OpenVLA, Octo
LeRobot v2	✅	-	HuggingFace robotics
LeRobot v3	✅	✅	Target format for all conversions
Zarr	✅	-	Diffusion Policy datasets
HDF5	✅	-	robomimic, ACT/ALOHA datasets
ROS bags	✅	-	Raw robot recordings

Core Features

Format auto-detection
CLI (forge inspect, convert, visualize, formats, hub)
YAML configuration files
Parallel episode processing (--workers N)
Unified visualizer
HuggingFace Hub integration (hf:// URLs)

Completed (v0.1.1)

✅ HuggingFace Hub Integration

Status: Done

# Download and inspect
forge inspect hf://lerobot/pusht

# Download and convert
forge convert hf://openvla/modified_libero_rlds output/ --format lerobot-v3

# Search datasets
forge hub "robot manipulation"
forge hub --author lerobot

Features:

hf:// URL scheme for HuggingFace datasets
Automatic download and caching (~/.cache/forge/datasets/)
Dataset search/discovery (forge hub command)
Streaming support for large datasets (future)

Completed (v0.2)

✅ RLDS Writer

Status: Done

# Convert LeRobot to RLDS
forge convert lerobot_dataset/ output/ --format rlds

# Round-trip conversion (RLDS → LeRobot v3 → RLDS)
forge convert original_rlds/ temp/ --format lerobot-v3
forge convert temp/ reconverted_rlds/ --format rlds

Features:

Write RLDS format (TFRecord)
Generate proper episode metadata (dataset_info.json, features.json)
Support OXE-compatible schema
Round-trip conversion verified

High Priority (v0.3)

1. Validation Tools

Why: Ensure converted datasets are correct and complete.

forge validate output/ --check-frames --check-videos

Frame count validation
Video integrity checks
Schema compatibility checks
Compare source vs converted statistics

Medium Priority (v0.3)

4. Additional Formats

DROID Dataset

Large-scale bimanual manipulation dataset
Already converted to LeRobot: IPEC-COMMUNITY/droid_lerobot

BridgeData V2

60k+ trajectories from WidowX robots
Core component of Octo training

LIBERO

Simulation benchmark used by OpenVLA and Pi0
Pre-converted versions exist on HF Hub

5. Streaming & Large Dataset Support

Lazy loading for datasets > RAM
Chunked processing
Resume interrupted conversions

✅ Statistics & Analysis

forge stats dataset/ --plot
forge stats dataset/ --sample 100 --output stats.json

Action/state distributions (min, max, mean, std per dimension)
Episode length histograms
Coverage metrics (language, success labels, rewards)
JSON export for programmatic access
Matplotlib visualization with --plot flag

Low Priority (v0.4+)

7. HuggingFace Upload

forge upload output/ --repo my-org/my-dataset

8. Data Augmentation

Image augmentations during conversion
Action noise injection
Temporal subsampling

9. Format Converters

HDF5 reader (robomimic, ACT/ALOHA) ✅ Done
MuJoCo dataset support
Isaac Gym recordings

VLA Model Compatibility Matrix

Model	Training Format	Fine-tune Format	Forge Support
OpenVLA	RLDS	RLDS	✅ Read
Octo	RLDS	RLDS	✅ Read
Pi0/OpenPI	Proprietary	LeRobot v2	✅ Read/Write
ACT	HDF5	HDF5	✅ Read
Diffusion Policy	Zarr	Zarr	✅ Read

References

Open X-Embodiment - 1M+ trajectories, RLDS format
OpenVLA - Uses RLDS, fine-tunes on RLDS
Octo - Uses RLDS from OXE
OpenPI (Pi0) - Uses LeRobot for fine-tuning
LeRobot - HuggingFace robotics standard

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Forge Roadmap

Current Status (v0.1)

Supported Formats

Core Features

Completed (v0.1.1)

✅ HuggingFace Hub Integration

Completed (v0.2)

✅ RLDS Writer

High Priority (v0.3)

1. Validation Tools

Medium Priority (v0.3)

4. Additional Formats

DROID Dataset

BridgeData V2

LIBERO

5. Streaming & Large Dataset Support

✅ Statistics & Analysis

Low Priority (v0.4+)

7. HuggingFace Upload

8. Data Augmentation

9. Format Converters

VLA Model Compatibility Matrix

References

FilesExpand file tree

ROADMAP.md

Latest commit

History

ROADMAP.md

File metadata and controls

Forge Roadmap

Current Status (v0.1)

Supported Formats

Core Features

Completed (v0.1.1)

✅ HuggingFace Hub Integration

Completed (v0.2)

✅ RLDS Writer

High Priority (v0.3)

1. Validation Tools

Medium Priority (v0.3)

4. Additional Formats

DROID Dataset

BridgeData V2

LIBERO

5. Streaming & Large Dataset Support

✅ Statistics & Analysis

Low Priority (v0.4+)

7. HuggingFace Upload

8. Data Augmentation

9. Format Converters

VLA Model Compatibility Matrix

References