
CompassNav: Steering from Path Imitation to Decision Understanding in Navigation (ICLR 2026)

LinFeng Li<sup>1,2</sup>, Jian Zhao<sup>2</sup>, Yuan Xie<sup>1</sup>, Xin Tan<sup>1</sup>, Xuelong Li<sup>2</sup>

<sup>1</sup>East China Normal University; <sup>2</sup>The Institute of Artificial Intelligence (TeleAI), China Telecom

ArXiv | Webpage | Dataset | Model


The top panel contrasts our End-to-End Goal Navigation paradigm with traditional approaches. Unlike Vision-Language Navigation (VLN), which relies on dense step-by-step instructions, and unlike complex modular navigation pipelines, CompassNav directly maps a high-level goal (e.g., "find the plant") to an action through integrated spatial and logical reasoning.

The bottom panel details our core contribution, how to stimulate the model's reasoning ability: a paradigm shift from "Path Imitation" to "Decision Understanding." While traditional methods train agents to replicate a single expert trajectory and penalize any deviation, our agent learns to evaluate the relative quality of all feasible paths at each decision point. This cultivates a true "internal compass," enabling the agent to make more intelligent and flexible decisions in unseen environments.

CompassNav

Official code release for the paper CompassNav.

This repository contains two main components:

  • train/: training code for CompassNav-style policy learning, including SFT and RL stages.
  • Navigation/: evaluation code for embodied ObjectNav inference (built on top of VLMnav).

Repository URL: https://github.com/linengcs/CompassNav

Project Structure

CompassNav/
├── train/                     # Training code
│   ├── train.sh               # One-command training launcher
│   ├── examples/              # Prompt + reward examples
│   ├── verl/                  # Internal training runtime
│   ├── scripts/model_merger.py
│   ├── Dockerfile             # Recommended training container
│   └── requirements.txt
└── Navigation/                # Navigation evaluation/inference
    ├── config/ObjectNav.yaml  # Main eval config
    ├── scripts/main.py        # Single-process entry
    ├── scripts/aggregator.py  # Parallel logging aggregator
    ├── parallel.sh            # Multi-instance launcher (tmux)
    └── src/                   # Env, agent, simulator wrapper, VLM API

1) Training

The training pipeline in this repository contains two stages:

  • SFT: based on LLaMA-Factory
  • RL: configuration adapted from EasyR1, a simplified fork of veRL

The recommended RL entry point is:

  • train/train.sh

1.1 Installation

For environment setup and dependency installation:

  • For SFT, please follow the official LLaMA-Factory instructions.
  • For RL, please follow the official EasyR1 instructions first, then return to this repository for task-specific configuration and launch.

This repository also provides train/Dockerfile if you prefer preparing the training environment with Docker.

1.2 Supervised Fine-Tuning (SFT)

The SFT part is based on LLaMA-Factory. The detailed CompassNav-specific SFT configuration is currently being organized and will be added in a future update.

1.3 Reinforcement Learning (RL) Configuration

Before launching training, update the following items:

  • Set MODEL_PATH in train/train.sh to your local base model path.
  • Prepare the config file referenced by the launcher: train/examples/config.yaml.
  • Adjust data.train_files and data.val_files in train/train.sh.
  • Keep or modify train/examples/format_prompt/nav.jinja as the prompt template.
  • Keep or modify train/examples/reward_function/nav.py as the reward function.
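
To make the role of the reward function concrete, here is a minimal sketch of a distance-based scoring rule. This is an illustration only, not the actual code in train/examples/reward_function/nav.py; the real function must follow the EasyR1/veRL reward-function interface, and the signature below is a hypothetical simplification.

```python
# Hypothetical reward sketch: score a predicted action by how much closer it
# moves the agent to the goal, relative to the best and worst available actions.
# The real reward in train/examples/reward_function/nav.py may differ.
def nav_reward(predicted_action: str, action_to_goal_dis: dict) -> float:
    if predicted_action not in action_to_goal_dis:
        return 0.0  # unparseable or invalid action
    best = min(action_to_goal_dis.values())   # distance left after the best action
    worst = max(action_to_goal_dis.values())  # distance left after the worst action
    if worst == best:
        return 1.0  # all candidate actions are equally good
    d = action_to_goal_dis[predicted_action]
    return (worst - d) / (worst - best)  # 1.0 = best choice, 0.0 = worst
```

Grading every candidate action against the alternatives, rather than rewarding only an exact match with one expert step, is what the paper frames as the move from "Path Imitation" to "Decision Understanding."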

1.4 RL Data Format (Key Fields)

The training dataset loader expects fields compatible with train/verl/utils/dataset.py.

Required/important keys:

  • prompt (or custom prompt_key)
  • answer (or custom answer_key)
  • action_to_goal_dis (dict, used by reward function)
  • optional images for VLM training samples
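
For concreteness, a single RL sample compatible with these keys might look like the record below. The field names follow the list above; the values (goal text, action names, distances, image path) are illustrative assumptions, not entries from the released dataset.

```python
# Hypothetical training record; all values are placeholders for illustration.
sample = {
    "prompt": "Goal: find the plant. Choose one of the annotated actions.",
    "answer": "action_2",                       # reference best action
    "action_to_goal_dis": {                     # consumed by the reward function
        "action_0": 4.7,                        # remaining distance to goal (meters)
        "action_1": 3.9,
        "action_2": 2.1,
    },
    "images": ["episodes/ep0001/step_03.png"],  # optional, for VLM samples
}
```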

1.5 Launch RL Training

Use the bundled one-command launcher:

cd train
bash train.sh

After distributed training, merge the sharded checkpoint into a single model with the bundled utility:

python scripts/model_merger.py --local_dir /path/to/checkpoint_dir
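
This consolidates the per-rank checkpoint shards under the given directory into a single model directory; consult scripts/model_merger.py for any additional arguments it accepts.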

2) Navigation (Evaluation)

2.1 Environment Setup

For navigation environment setup, please refer to the official VLMnav repository. Their installation process is based on a Conda environment with Habitat-Sim:

conda create -n vlm_nav python=3.9 cmake=3.14.0
conda activate vlm_nav
conda install habitat-sim=0.3.1 withbullet headless -c conda-forge -c aihabitat

cd Navigation
pip install -e .
pip install -r requirements.txt
export PYTHONPATH=$(pwd)/src:$PYTHONPATH

Notes:

  • The code is designed for Habitat-Sim-based ObjectNav evaluation.
  • VLMnav also requires downloading the Habitat-Matterport 3D Research dataset and benchmark datasets such as ObjectNav/GOAT-Bench. Please follow the dataset instructions in the official VLMnav repository.
  • VLM inference endpoints are configured in config/ObjectNav.yaml (base_url, model).

2.2 Configure Model Endpoints

Edit Navigation/config/ObjectNav.yaml:

  • agent_cfg.vlm_cfg.model_kwargs.base_url
  • agent_cfg.vlm_cfg.model_kwargs.model
  • agent_cfg.vlm_cfg.stop_model_kwargs.base_url
  • agent_cfg.vlm_cfg.stop_model_kwargs.model
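
The base_url/model pair suggests an OpenAI-compatible serving endpoint (e.g., a vLLM server); that compatibility is an assumption here, as are the placeholder URL and model name below. Under that assumption, a quick connectivity check could look like:

```python
# Hypothetical sanity check for an OpenAI-compatible VLM endpoint.
# BASE_URL and MODEL are placeholders; substitute the values from ObjectNav.yaml.
import requests

BASE_URL = "http://localhost:8000/v1"
MODEL = "your-model-name"

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    json={
        "model": MODEL,
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 8,
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```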

2.3 Run Evaluation

Single process:

cd Navigation
export PYTHONPATH=$(pwd)/src:$PYTHONPATH
python scripts/main.py --config ObjectNav --name your_run_name -ne 50 -ms 500
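
Here -ne and -ms presumably control the number of evaluation episodes and the maximum steps per episode; check scripts/main.py for the authoritative flag definitions.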

Parallel (multi-instance, tmux-based):

cd Navigation
bash parallel.sh

Before using parallel.sh, set your hardware/runtime values in the script (e.g., NUM_GPU, INSTANCES, VENV_NAME, PORT).

3) Acknowledgements

This codebase builds on and adapts components from:

  • VLMnav (navigation framework basis for Navigation/)
  • LLaMA-Factory (SFT framework basis for train/)
  • EasyR1 (RL configuration/reference basis for train/)

Please also cite these projects where appropriate.

4) Citation

If you find this code useful, please cite our paper:

@article{compassnav2026,
  title={CompassNav: [Paper Title]},
  author={[Authors]},
  journal={[Venue]},
  year={2026}
}

(Replace placeholders with final camera-ready metadata.)

5) License

The repository currently bundles upstream components that carry their own licenses (e.g., train/LICENSE).

Please ensure that your usage and redistribution comply with the corresponding upstream license terms.

TODO

  • Release CompassNav training code
  • Release CompassNav Object Goal Nav/Instance Image-Goal Nav test code
  • Release Compass-Data-22k
  • Release CompassNav-7B
