All real hardware, no simulation. Franka FR3 arm with a Robotiq gripper, physical totes, real objects. Every run is recorded with synced video and telemetry (you can watch any episode on the site).
That's the whole point – simulation benchmarks exist, but operators deploying robots care about real-world performance.
We're working on adding DreamZero (NVIDIA's latest) next. The leaderboard is open to any model – both open-source and closed-source. If you have a checkpoint, we'll run it on the same hardware under the same blind protocol. Closed-source participants can submit their model as a container and we evaluate it without accessing the weights. Reach out at hi@phail.ai if you want to submit.
I built this after hitting a wall with ML pipelines: I needed to feed S3 data into libraries that only understand local paths (like OpenCV's imread, pandas, or PyTorch), and I didn't want to rewrite all my I/O code around boto3 or s3fs.
Unlike s3fs, which mounts S3 as a virtual filesystem (often slow for heavy random access), pos3 mirrors the specific data you need to a local cache before your code block runs. Once the data is mirrored, your script reads from local disk at native speed.
It handles the diffing/syncing automatically using a context manager.
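As a rough sketch of the mirror-then-read pattern described above (this is not pos3's actual API): `FakeBucket` and `mirror` are hypothetical names, and an in-memory dict stands in for S3 so the example runs offline. The assumption is that the remote listing returns a per-object fingerprint, similar in spirit to an ETag, which is compared against a small manifest kept in the cache.

```python
import contextlib
import hashlib
import json
import tempfile
from pathlib import Path


class FakeBucket:
    """In-memory stand-in for an object store (hypothetical, not a real S3 client)."""

    def __init__(self, objects):
        self.objects = dict(objects)  # key -> bytes
        self.downloads = 0            # counts get() calls so the diffing is observable

    def list(self, prefix):
        # key -> content fingerprint for every object under the prefix
        return {k: hashlib.md5(v).hexdigest()
                for k, v in self.objects.items() if k.startswith(prefix)}

    def get(self, key):
        self.downloads += 1
        return self.objects[key]


@contextlib.contextmanager
def mirror(bucket, prefix, cache_dir):
    """Sync everything under `prefix` into `cache_dir`, then yield the local folder."""
    root = Path(cache_dir)
    root.mkdir(parents=True, exist_ok=True)
    manifest_path = root / ".manifest.json"
    known = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}

    remote = bucket.list(prefix)
    # Download only objects that are new or whose fingerprint changed.
    for key, tag in remote.items():
        local = root / key
        if known.get(key) != tag or not local.exists():
            local.parent.mkdir(parents=True, exist_ok=True)
            local.write_bytes(bucket.get(key))
    # Drop cached files that no longer exist remotely under this prefix.
    for key in [k for k in known if k.startswith(prefix) and k not in remote]:
        (root / key).unlink(missing_ok=True)
        del known[key]

    known.update(remote)
    manifest_path.write_text(json.dumps(known))
    yield root / prefix  # code inside the `with` block sees ordinary local paths


bucket = FakeBucket({"data/a.txt": b"hello", "data/b.txt": b"world"})
with tempfile.TemporaryDirectory() as cache:
    with mirror(bucket, "data/", cache) as local:
        first = (local / "a.txt").read_text()   # plain local read, no S3 code
    bucket.objects["data/a.txt"] = b"changed"
    with mirror(bucket, "data/", cache) as local:
        second = (local / "a.txt").read_text()  # only a.txt was fetched again
print(first, second, bucket.downloads)  # hello changed 3
```

The second `with` block downloads one object instead of two, which is the point of diffing against the cache: anything that takes a local path (imread, pandas, a PyTorch dataset) can be pointed at `local` unchanged.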