CalibDB: Diverse Multimodal Calibration Benchmark

Justin Yue Ayoub Elidrissi Divyank Shah Jerin Peter Konstantinos Karydis Hang Qiu

University of California, Riverside

In submission, RSS 2025 Demo

CalibDB is a diverse and challenging multimodal calibration dataset and benchmark. The dataset contains discrete and continuous traces from cameras and LiDARs that are placed in different poses, with dynamic extrinsics, through robotic arm manipulation. The benchmark evaluates state-of-the-art multimodal calibration methods and exposes the research gaps and challenges that existing methods face. The proposed pipeline and dataset pave the way for the community to develop more accurate, robust, and domain-transferable multimodal calibration methods.

CalibDB Intro

CalibDB Overview

CalibDB's platform is designed for collecting a multimodal calibration dataset. Placed in a motion capture (MoCap) environment, the platform mounts a LiDAR and camera sensors on two Kinova Gen3 Lite arms. We place MoCap markers on both the sensors and the robot arms' end effectors to track their poses precisely. The robot arms position their respective sensors automatically and accurately throughout the scene, which yields a diverse set of extrinsics between the sensors. The collected data falls into four categories: discrete traces with static extrinsics, discrete traces with dynamic extrinsics, continuous traces with static extrinsics, and continuous traces with dynamic extrinsics. Some "discrete sequences w/ static extrinsics" samples can be found here.
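As an illustration of how a sensor-to-sensor extrinsic can be derived from tracked poses, the sketch below composes two world-frame poses into a LiDAR-to-camera transform. It assumes each sensor pose is available as a 4x4 homogeneous matrix in the MoCap world frame and ignores any offset between the markers and the sensor origin. The function names are hypothetical and are not part of the CalibDB pipeline.

```python
import numpy as np

def invert_se3(T):
    """Invert a 4x4 rigid transform using the rotation transpose."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv

def lidar_to_camera_extrinsic(T_world_cam, T_world_lidar):
    """Return T_cam_lidar, which maps LiDAR-frame points into the camera frame.

    Assumption: both inputs are sensor poses expressed in the same
    MoCap world frame (hypothetical input convention).
    """
    return invert_se3(T_world_cam) @ T_world_lidar

# Example with made-up poses: both sensors axis-aligned, offset in translation.
T_world_cam = np.eye(4)
T_world_cam[:3, 3] = [1.0, 0.0, 0.0]
T_world_lidar = np.eye(4)
T_world_lidar[:3, 3] = [1.0, 2.0, 0.0]
T_cam_lidar = lidar_to_camera_extrinsic(T_world_cam, T_world_lidar)
```

With static extrinsics this transform stays constant across a trace. With dynamic extrinsics it would be recomputed for each pair of tracked poses.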

Data Collection

Discrete Sequences w/ Static Extrinsics

Continuous Sequences w/ Static Extrinsics

Discrete Sequences w/ Dynamic Extrinsics

Qualitative Results

The baseline methods' effectiveness can be checked by performing LiDAR-to-camera overlays using their predicted extrinsics. Surprisingly, these predicted extrinsics do not produce good overlays. Koide3 tends to change the orientation of the point cloud, suggesting poor transfer to the indoor setting. In some cases, CalibAnything's overlays appear closest to the ground-truth overlay, possibly because it uses the ground-truth transform as the initial guess. Regnet's incorrect extrinsics are especially obvious from the large error in depth. Calibnet's results are omitted because no points from the LiDAR point cloud project into the image.
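A LiDAR-to-camera overlay of this kind can be sketched as a pinhole projection. The example below is a minimal version that assumes an undistorted camera with a 3x3 intrinsic matrix `K` and a 4x4 extrinsic `T_cam_lidar`. The function name, parameters, and sample values are hypothetical and do not reflect the benchmark's actual evaluation code.

```python
import numpy as np

def project_lidar_to_image(points_lidar, T_cam_lidar, K, width, height):
    """Project Nx3 LiDAR points into the image plane.

    Returns the pixel coordinates and depths of the points that lie in
    front of the camera and inside the image bounds.
    """
    # Move points into the camera frame using homogeneous coordinates.
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]

    # Discard points behind the camera before dividing by depth.
    in_front = pts_cam[:, 2] > 0
    pts_cam = pts_cam[in_front]
    depth = pts_cam[:, 2]

    # Pinhole projection (lens distortion is ignored in this sketch).
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]

    inside = (
        (uv[:, 0] >= 0) & (uv[:, 0] < width)
        & (uv[:, 1] >= 0) & (uv[:, 1] < height)
    )
    return uv[inside], depth[inside]

# Made-up intrinsics and points for illustration only.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
points = np.array([[0.0, 0.0, 2.0],    # in view
                   [0.0, 0.0, -1.0],   # behind the camera
                   [10.0, 0.0, 1.0]])  # outside the image
uv, depth = project_lidar_to_image(points, np.eye(4), K, width=100, height=100)
```

If the predicted extrinsic is far from the true one, the returned arrays can be empty, which corresponds to the case where no LiDAR points project into the image.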

Citation

TBD