IndoorR2X: Indoor Robot-to-Everything Coordination with LLM-Driven Planning

Abstract

Although robot-to-robot (R2R) communication improves indoor scene understanding beyond what a single robot can achieve, R2R alone cannot overcome partial observability without either substantial exploration overhead or a larger team. In contrast, many indoor environments already include low-cost Internet of Things (IoT) sensors (e.g., cameras) that provide persistent, building-wide context beyond onboard perception.

We therefore introduce IndoorR2X, the first benchmark and simulation framework for Large Language Model (LLM)-driven multi-robot task planning with Robot-to-Everything (R2X) perception and communication in indoor environments. IndoorR2X integrates observations from mobile robots and static IoT devices to construct a global semantic state that supports scalable scene understanding, reduces redundant exploration, and enables high-level coordination through LLM-based planning.

Extensive experiments across diverse settings demonstrate that IoT-augmented world modeling improves multi-robot efficiency and reliability, and we highlight key insights and failure modes for advancing LLM-based collaboration between robot teams and indoor IoT sensors.

Our IndoorR2X Framework

Our main contributions are threefold:

Novel R2X Benchmark: We introduce IndoorR2X, the first indoor multi-robot benchmark that strictly enforces partial observability and integrates configurable IoT sensors to evaluate realistic team coordination.
LLM-Driven Semantic Fusion: We propose a centralized framework that fuses onboard robot perception with ambient IoT signals into a shared global semantic state, enabling LLMs to plan parallel tasks without exhaustive physical exploration.
Empirical & Real-World Validation: Extensive simulations and physical deployments demonstrate that our framework significantly reduces path length, action steps, and LLM token costs, while remaining highly resilient to missing sensor data.
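
To make the idea of a shared global semantic state more concrete, the following is a minimal Python sketch of how reports from robots and static IoT sensors might be merged and then serialized for an LLM planner. It is an illustration under stated assumptions, not the IndoorR2X implementation: the `Observation` schema, the `GlobalSemanticState` class, the "newest report wins" rule, and the prompt format are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Observation:
    """One semantic report from a robot or an IoT sensor (hypothetical schema)."""
    source: str             # reporter, e.g. "robot_1" or "iot_cam_kitchen"
    obj: str                # object identifier, e.g. "tv"
    room: str               # room in which the object was observed
    status: Optional[str]   # semantic status such as "on"; None if not observed
    timestamp: float        # time of the report


@dataclass
class GlobalSemanticState:
    """Shared world model that keeps the latest known report per object."""
    objects: Dict[str, Observation] = field(default_factory=dict)

    def fuse(self, observations: List[Observation]) -> None:
        """Merge new reports; assumed rule: the newest report for an object wins."""
        for obs in observations:
            current = self.objects.get(obs.obj)
            if current is None or obs.timestamp >= current.timestamp:
                self.objects[obs.obj] = obs

    def to_prompt(self) -> str:
        """Serialize the state as text that could be placed in a planner prompt."""
        ordered = sorted(self.objects.values(), key=lambda o: o.obj)
        return "\n".join(
            f"- {o.obj} in {o.room}, status={o.status or 'unknown'} (from {o.source})"
            for o in ordered
        )


state = GlobalSemanticState()
state.fuse([
    Observation("robot_1", "tv", "family_room", "on", 1.0),
    Observation("iot_cam_kitchen", "milk", "kitchen", None, 2.0),
    Observation("iot_cam_family", "tv", "family_room", "off", 3.0),
])
print(state.to_prompt())
```

In this sketch the later camera report overrides the robot's earlier view of the TV, and the milk is known to the planner even though no robot has entered the kitchen. A timestamp-only rule also means that a wrong but recent status report silently replaces a correct older one, so a real system would likely need source confidence or cross-checking on top of it.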

Qualitative Demonstrations

**Fig. 2 — Qualitative demonstration of IndoorR2X (simulation environment).** Three robots and IoT sensors coordinate to efficiently dispose of perishables, power down devices, and consolidate items in the family room.

**Fig. 3 — Illustration of our real-world experiment.** Two mobile Stretch robots jointly perform tasks in a three-room environment, utilizing two web cameras for out-of-sight visibility. A third Stretch robot stands stationary by a robot dog as a target.

Performance & Scalability

**Fig. 4 — Scalability analysis.** Success rate and efficiency metrics as a function of team size ($N=2$ to $6$). While success remains stable up to $N=5$, the coordination overhead (total distance traveled) increases with fleet size.

**Fig. 5 — Robustness to “X” failures.** The system is resilient to missing detections, maintaining a constant success rate at the cost of increased travel. However, incorrect semantic status reports significantly impact success, as false positives can lead to unrecoverable planning errors.

Citation

If you find this project useful for your research, please use the following BibTeX entry:

@misc{yang2026indoorr2xindoorrobottoeverythingcoordination,
      title={IndoorR2X: Indoor Robot-to-Everything Coordination with LLM-Driven Planning}, 
      author={Fan Yang and Soumya Teotia and Shaunak A. Mehta and Prajit KrisshnaKumar and Quanting Xie and Jun Liu and Yueqi Song and Wenkai Li and Atsunori Moteki and Kanji Uchino and Yonatan Bisk},
      year={2026},
      eprint={2603.20182},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2603.20182}, 
}