SIREN: Semantic, Initialization-Free Registration of Multi-Robot Gaussian Splatting Maps

We introduce SIREN, a semantic, initialization-free registration algorithm for multi-robot Gaussian Splatting (GSplat) maps. SIREN enables robust registration of multi-robot GSplat maps without access to camera poses, images, or inter-map relative poses, via semantics-grounded optimization centered on feature-rich regions.

Abstract

We present SIREN for registration of multi-robot Gaussian Splatting (GSplat) maps, with zero access to camera poses, images, or inter-map transforms for initialization or fusion of local submaps. To realize these capabilities, SIREN harnesses the versatility and robustness of semantics in three critical ways to derive a rigorous registration pipeline for multi-robot GSplat maps. First, SIREN utilizes semantics to identify feature-rich regions of the local maps where the registration problem is better posed, eliminating the initialization that prior work generally requires. Second, SIREN identifies candidate correspondences between Gaussians in the local maps using robust semantic features. These correspondences form the foundation for a robust geometric optimization that coarsely aligns 3D Gaussian primitives extracted from the local maps. Third, this coarse alignment enables subsequent photometric refinement of the transformation between the submaps, in which SIREN leverages novel-view synthesis in GSplat maps, together with a semantics-based image filter, to compute a high-accuracy non-rigid transformation for generating a high-fidelity fused map. We demonstrate the superior performance of SIREN compared to competing baselines across a range of real-world datasets and, in particular, across the most widely used robot hardware platforms, including a manipulator, a drone, and a quadruped. In our experiments, SIREN achieves about 90x smaller rotation errors, 300x smaller translation errors, and 44x smaller scale errors in the most challenging scenes, where competing methods struggle.

SIREN: Robust Registration Pipeline

At its core, SIREN leverages open-vocabulary semantics within a principled optimization-based framework to enable the robust registration of multi-robot GSplat maps via three steps, illustrated in the figure: (a) semantic feature extraction and matching, (b) coarse Gaussian-to-Gaussian geometric registration, and (c) fine photometric registration. In the first step, SIREN identifies corresponding pairs of Gaussians in a pair of GSplat maps via semantic matching. In the second step, SIREN solves a Gaussian-to-Gaussian optimization problem to compute the optimal transformation aligning the pair of multi-robot maps, using a robust objective function to guard against outliers. Finally, SIREN harnesses the novel-view synthesis capabilities of Gaussian Splatting for fine registration via a structure-from-motion-based approach.
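To make the first two steps concrete, here is a minimal sketch of semantic matching followed by robust coarse alignment. It is not SIREN's implementation and rests on several assumptions: each Gaussian is reduced to its 3D mean plus a semantic feature vector, correspondences are mutual nearest neighbors under cosine similarity, the transformation is a similarity transform (scale, rotation, translation) estimated in closed form with the Umeyama method, and robustness comes from Cauchy-weighted iteratively reweighted least squares. All function names are hypothetical.

```python
import numpy as np


def mutual_nearest_matches(feat_a, feat_b):
    """Index pairs (i, j) that are mutual nearest neighbors under cosine similarity."""
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    sim = a @ b.T                    # (Na, Nb) cosine similarities
    nn_ab = sim.argmax(axis=1)       # best match in map B for each Gaussian in map A
    nn_ba = sim.argmax(axis=0)       # best match in map A for each Gaussian in map B
    idx_a = np.arange(len(a))
    keep = nn_ba[nn_ab] == idx_a     # keep only mutually consistent pairs
    return idx_a[keep], nn_ab[keep]


def weighted_similarity(src, dst, w):
    """Weighted Umeyama alignment: finds s, R, t with dst ~ s * R @ src + t."""
    w = w / w.sum()
    mu_s, mu_d = w @ src, w @ dst                # weighted centroids
    xs, xd = src - mu_s, dst - mu_d
    cov = (w[:, None] * xd).T @ xs               # 3x3 weighted cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                           # avoid returning a reflection
    R = U @ S @ Vt
    s = float((D * np.diag(S)).sum() / (w @ (xs ** 2).sum(axis=1)))
    t = mu_d - s * R @ mu_s
    return s, R, t


def robust_register(src, dst, iters=30):
    """IRLS with Cauchy weights (an assumed robust loss, not taken from the paper)."""
    w = np.ones(len(src))
    for _ in range(iters):
        s, R, t = weighted_similarity(src, dst, w)
        r = np.linalg.norm(dst - (s * src @ R.T + t), axis=1)   # per-pair residuals
        c = max(1.4826 * np.median(r), 1e-6)                    # robust residual scale
        w = 1.0 / (1.0 + (r / c) ** 2)                          # down-weight outliers
    return s, R, t
```

In use, the matched indices from `mutual_nearest_matches` would select the Gaussian means passed to `robust_register` as `src` and `dst`. A result like this would then only serve as the starting point for the fine photometric stage.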


Semantics Visualization: Drone Mapping

Experiments

We compare SIREN to state-of-the-art baselines (GaussReg, PhotoReg, and RANSAC and ICP methods) using rotation, translation, and scale errors and photometric scores (PSNR, SSIM, LPIPS). We evaluate each method on standard benchmark datasets and real-world robot datasets and discuss the results in our paper. We find that SIREN outperforms all baselines by significant margins.
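For readers unfamiliar with these metrics, the sketch below shows one common way to compute the geometric errors and PSNR. The exact definitions used in the paper are not given on this page, so these formulas are assumptions: rotation error as the geodesic angle between rotation matrices, translation error as a Euclidean distance, scale error as an absolute difference, and PSNR for images with values in [0, 1]. The function names are hypothetical.

```python
import numpy as np


def rotation_error_deg(R_est, R_gt):
    """Geodesic angle (degrees) between two 3x3 rotation matrices."""
    cos = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))


def translation_error(t_est, t_gt):
    """Euclidean distance between estimated and ground-truth translations."""
    return float(np.linalg.norm(np.asarray(t_est) - np.asarray(t_gt)))


def scale_error(s_est, s_gt):
    """Absolute difference between estimated and ground-truth scale (assumed definition)."""
    return abs(s_est - s_gt)


def psnr(img, ref, max_val=1.0):
    """Peak signal-to-noise ratio in dB between a rendered image and a reference."""
    mse = float(np.mean((np.asarray(img) - np.asarray(ref)) ** 2))
    return float("inf") if mse == 0.0 else 10.0 * np.log10(max_val ** 2 / mse)
```

SSIM and LPIPS are omitted here because they are usually taken from existing libraries rather than reimplemented.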


Drone Mapping

Playroom Scene

Outdoor Scene

The website design was adapted from Nerfies.