IndoorR2X: Indoor Robot-to-Everything Coordination
with LLM-Driven Planning

Fan Yang1, Soumya Teotia2, Shaunak A. Mehta1, Prajit KrisshnaKumar1, Quanting Xie2,
Jun Liu2, Yueqi Song2, Wenkai Li2, Atsunori Moteki1, Kanji Uchino1, and Yonatan Bisk2

1Fujitsu Research, 2Carnegie Mellon University

Abstract

Although robot-to-robot (R2R) communication improves indoor scene understanding beyond what a single robot can achieve, R2R alone cannot overcome partial observability without either substantial exploration overhead or a larger team. In contrast, many indoor environments already include low-cost Internet of Things (IoT) sensors (e.g., cameras) that provide persistent, building-wide context beyond onboard perception.

We therefore introduce IndoorR2X, the first benchmark and simulation framework for Large Language Model (LLM)-driven multi-robot task planning with Robot-to-Everything (R2X) perception and communication in indoor environments. IndoorR2X integrates observations from mobile robots and static IoT devices to construct a global semantic state that supports scalable scene understanding, reduces redundant exploration, and enables high-level coordination through LLM-based planning.

Extensive experiments across diverse settings demonstrate that IoT-augmented world modeling improves multi-robot efficiency and reliability, and we highlight key insights and failure modes for advancing LLM-based collaboration between robot teams and indoor IoT sensors.

Our IndoorR2X Framework

Fig. 1 — The IndoorR2X Framework. CCTV observations and other IoT device signals are collected to augment the world model beyond the perception range of the robots’ ego cameras. These heterogeneous observations are synchronized through a coordination hub, where an LLM-based online planner generates parallel actions that the robots then execute to perform their respective tasks. As an example scenario, robots are assigned household tasks in the morning. Because object locations or device statuses may have changed overnight, the robots first update their indoor world model using the “X” observations.

Our main contributions are threefold:

  • Novel R2X Benchmark: We introduce IndoorR2X, the first indoor multi-robot benchmark that strictly enforces partial observability and integrates configurable IoT sensors to evaluate realistic team coordination.
  • LLM-Driven Semantic Fusion: We propose a centralized framework that fuses onboard robot perception with ambient IoT signals into a shared global semantic state, enabling LLMs to plan parallel tasks without exhaustive physical exploration.
  • Empirical & Real-World Validation: Extensive simulations and physical deployments demonstrate that our framework significantly reduces path length, action steps, and LLM token costs, while remaining highly resilient to missing sensor data.
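
To make the idea of a shared global semantic state more concrete, the following is a minimal sketch of how robot and IoT observations might be merged and serialized for an LLM planner. It is not the authors' implementation: the `Observation` and `GlobalState` classes, the "latest timestamp wins" merge rule, the source identifiers, and the prompt format are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Observation:
    """One semantic observation (hypothetical schema, not from the paper)."""
    source: str                   # e.g. "robot_1" or "cctv_dining_room"
    timestamp: float              # seconds; any shared clock works
    obj: str                      # object or device name
    room: str                     # room where it was seen
    status: Optional[str] = None  # e.g. "on" / "off" for devices


@dataclass
class GlobalState:
    """Shared world model: the most recent belief per object."""
    objects: Dict[str, Observation] = field(default_factory=dict)

    def fuse(self, observations: List[Observation]) -> None:
        # Assumed merge rule: the newest observation of an object wins,
        # regardless of whether it came from a robot or an IoT sensor.
        for obs in observations:
            current = self.objects.get(obs.obj)
            if current is None or obs.timestamp >= current.timestamp:
                self.objects[obs.obj] = obs

    def unknown(self, targets: List[str]) -> List[str]:
        # Objects never observed by anyone would still need exploration.
        return [t for t in targets if t not in self.objects]

    def to_prompt(self) -> str:
        # Serialize the state as plain text for an LLM planner prompt.
        lines = []
        for name in sorted(self.objects):
            o = self.objects[name]
            status = f", status={o.status}" if o.status else ""
            lines.append(f"- {name}: room={o.room}{status} (seen by {o.source})")
        return "\n".join(lines)


state = GlobalState()
state.fuse([
    Observation("robot_1", 10.0, "milk", "kitchen"),
    Observation("cctv_living_room", 12.0, "tv", "living_room", "on"),
    Observation("cctv_dining_room", 15.0, "milk", "dining_room"),
])
print(state.to_prompt())
print(state.unknown(["milk", "tv", "lamp"]))
```

In this toy run the later camera observation overrides the robot's older belief about the milk, and only the lamp remains unobserved, so a planner reading this state would need exploration for that single item rather than for the whole scene.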

Qualitative Demonstrations

Fig. 2 — Qualitative demonstration of IndoorR2X (simulation environment). Three robots and IoT sensors coordinate to efficiently dispose of perishables, power down devices, and consolidate items in the family room.
Fig. 3 — Illustration of our real-world experiment. Two mobile Stretch robots jointly perform tasks in a three-room environment, utilizing two web cameras for out-of-sight visibility. A third Stretch robot stands stationary by a robot dog as a target.

Performance & Scalability

Fig. 4 — Scalability analysis. Success rate and efficiency metrics as a function of team size ($N=2$ to $6$). While success remains stable up to $N=5$, the coordination overhead (total distance traveled) increases with fleet size.
Fig. 5 — Robustness to “X” failures. The system is resilient to missing detections, maintaining a constant success rate at the cost of increased travel. However, incorrect semantic status reports significantly impact success, as false positives can lead to unrecoverable planning errors.
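
One way to see why these two failure types could behave so differently is the toy planner below. It is purely illustrative and makes several assumptions that do not come from the paper: the `plan_power_down` function, the dictionary-based world model, and the action names are all hypothetical. A missing detection leaves the device absent from the world model, so the planner can fall back to searching; a wrong status report is trusted as fact, so the needed action is never planned.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str]


def plan_power_down(device: str,
                    world_model: Dict[str, Dict[str, str]],
                    rooms: List[str]) -> List[Action]:
    """Return high-level actions for powering down one device (toy planner)."""
    belief = world_model.get(device)
    if belief is None:
        # Missing detection: nothing is known, so search the rooms first.
        # The task can still succeed, at the cost of extra travel.
        return [("explore", room) for room in rooms] + [("power_off", device)]
    if belief["status"] == "off":
        # Reported status is trusted. If the report is wrong, no action is
        # ever issued and the error goes unnoticed.
        return []
    return [("goto", belief["room"]), ("power_off", device)]


rooms = ["kitchen", "living_room"]
missing = plan_power_down("tv", {}, rooms)
wrong = plan_power_down("tv", {"tv": {"room": "living_room", "status": "off"}}, rooms)
correct = plan_power_down("tv", {"tv": {"room": "living_room", "status": "on"}}, rooms)
print(missing)
print(wrong)
print(correct)
```

Under these assumptions, the missing-detection plan is longer than the correct one but still ends with the device powered off, whereas the wrong-status plan is empty.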

Citation

If you find this project useful for your research, please use the following BibTeX entry:

@misc{yang2026indoorr2xindoorrobottoeverythingcoordination,
      title={IndoorR2X: Indoor Robot-to-Everything Coordination with LLM-Driven Planning}, 
      author={Fan Yang and Soumya Teotia and Shaunak A. Mehta and Prajit KrisshnaKumar and Quanting Xie and Jun Liu and Yueqi Song and Wenkai Li and Atsunori Moteki and Kanji Uchino and Yonatan Bisk},
      year={2026},
      eprint={2603.20182},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2603.20182}, 
}