Robometer: Scaling General-Purpose Robotic Reward Models via Trajectory Comparisons

Anthony Liang, Yigit Korkmaz, Jiahui Zhang, Minyoung Hwang, Abrar Anwar, Sid Kaushik, Adit Shah, Alex Huang, Luke S. Zettlemoyer, Dieter Fox, Yu Xiang, Anqi Li, Andreea Bobu, Abhishek Gupta, Stephen Tu, Erdem Biyik, Jesse Zhang

Published 2026 in Unknown venue

ABSTRACT

General-purpose robot reward models are typically trained to predict absolute task progress from expert demonstrations, providing only local, frame-level supervision. While effective for expert demonstrations, this paradigm scales poorly to large-scale robotics datasets where failed and suboptimal trajectories are abundant and assigning dense progress labels is ambiguous. We introduce Robometer, a scalable reward modeling framework that combines intra-trajectory progress supervision with inter-trajectory preference supervision. Robometer is trained with a dual objective: a frame-level progress loss that anchors reward magnitude on expert data, and a trajectory-comparison preference loss that imposes global ordering constraints across trajectories of the same task, enabling effective learning from both real and augmented failed trajectories. To support this formulation at scale, we curate RBM-1M, a reward-learning dataset comprising over one million trajectories spanning diverse robot embodiments and tasks, including substantial suboptimal and failure data. Across benchmarks and real-world evaluations, Robometer learns more generalizable reward functions than prior methods and improves robot learning performance across a diverse set of downstream applications. Code, model weights, and videos at https://robometer.github.io/.
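
The dual objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss forms, so everything specific here is an assumption: mean squared error stands in for the frame-level progress loss, a Bradley-Terry log-likelihood stands in for the trajectory-comparison preference loss, and the names `progress_loss`, `preference_loss`, `dual_objective`, and `pref_weight` are hypothetical, not taken from the Robometer code.

```python
import numpy as np


def progress_loss(pred_progress, target_progress):
    """Frame-level term (assumed MSE): predicted vs. labeled progress per frame."""
    pred = np.asarray(pred_progress, dtype=float)
    target = np.asarray(target_progress, dtype=float)
    return float(np.mean((pred - target) ** 2))


def preference_loss(score_preferred, score_rejected):
    """Trajectory-level term (assumed Bradley-Terry): negative log-probability
    that the preferred trajectory scores higher than the rejected one."""
    diff = np.asarray(score_preferred, dtype=float) - np.asarray(score_rejected, dtype=float)
    # -log(sigmoid(diff)) equals log(1 + exp(-diff)); logaddexp keeps it stable.
    return float(np.mean(np.logaddexp(0.0, -diff)))


def dual_objective(pred_progress, target_progress,
                   score_preferred, score_rejected, pref_weight=1.0):
    """Weighted sum of both terms; pref_weight is a hypothetical mixing coefficient."""
    return (progress_loss(pred_progress, target_progress)
            + pref_weight * preference_loss(score_preferred, score_rejected))


# Hypothetical batch: one expert clip with progress labels, and two
# (preferred, rejected) trajectory pairs from the same task.
loss = dual_objective(
    pred_progress=[0.1, 0.4, 0.9],
    target_progress=[0.0, 0.5, 1.0],
    score_preferred=[2.0, 0.5],
    score_rejected=[-1.0, 0.3],
)
```

In this sketch the progress term needs per-frame labels and so applies only to expert data, while the preference term needs only a relative ordering between two trajectories, which is why failed or suboptimal trajectories can contribute without dense progress labels.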

PUBLICATION RECORD

Publication year: 2026
Venue: Unknown venue
Publication date: 2026-03-02
Fields of study: Computer Science, Engineering
Identifiers: arXiv 2603.02115
Source metadata: Semantic Scholar
