We introduce a novel task, Vision-and-Language Driving (VLD), which aims to enable vehicles to autonomously navigate traffic environments by following long-horizon natural-language instructions from humans.
The paper for this work is under review.
This repository provides a dataset for the VLD task. We collect data with CARLA 0.9.15 using the Leaderboard 2.0 framework in Town 12, and design driving routes by configuring start points, intermediate waypoints, and end points.
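As a rough illustration, a route in the Leaderboard 2.0 framework is typically defined in an XML file as an ordered list of keypoints. The sketch below follows that general format; the coordinates are placeholders, not actual values from the dataset:

```xml
<routes>
  <!-- One route through Town 12: the first and last positions are the
       start and end points, intermediate positions are waypoints.
       All coordinates here are placeholders. -->
  <route id="0" town="Town12">
    <waypoints>
      <position x="0.0" y="0.0" z="0.0"/>
      <position x="100.0" y="0.0" z="0.0"/>
      <position x="100.0" y="50.0" z="0.0"/>
    </waypoints>
  </route>
</routes>
```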
The vehicle is equipped with multiple sensors: four RGB cameras, four semantic segmentation cameras, four depth cameras, and one LiDAR. Our sensor placement is adapted from the DriveLM placement scheme and is shown in the figure below. The specifications of the sensors are listed in the table below.
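The rig above (4 RGB + 4 semantic + 4 depth cameras + 1 LiDAR) can be sketched as a CARLA-style sensor specification. The blueprint IDs below are standard CARLA sensor blueprints; the camera view names and mounting transforms are placeholders, not the actual values from the paper (see the figure and table for those):

```python
# Sketch of the VLD sensor rig as a list of CARLA-style sensor specs.
# Blueprint IDs are standard CARLA sensors; transforms and view names
# are placeholders, not the real placement used in the dataset.

CAMERA_VIEWS = ["front", "left", "right", "back"]  # assumed view layout

CAMERA_BLUEPRINTS = {
    "rgb": "sensor.camera.rgb",
    "semantic": "sensor.camera.semantic_segmentation",
    "depth": "sensor.camera.depth",
}


def build_sensor_rig():
    """Return sensor specs: 4 RGB + 4 semantic + 4 depth cameras + 1 LiDAR."""
    rig = []
    for name, blueprint in CAMERA_BLUEPRINTS.items():
        for view in CAMERA_VIEWS:
            rig.append({
                "id": f"{name}_{view}",
                "type": blueprint,
                # Placeholder mounting transform (meters / degrees).
                "transform": {"x": 0.0, "y": 0.0, "z": 2.0, "yaw": 0.0},
            })
    rig.append({
        "id": "lidar_top",
        "type": "sensor.lidar.ray_cast",
        "transform": {"x": 0.0, "y": 0.0, "z": 2.5, "yaw": 0.0},
    })
    return rig
```

Each spec could then be fed to CARLA's blueprint library to spawn the sensor and attach it to the ego vehicle.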
Please use the password provided in the manuscript to access the download pages and to unzip the archives.
The full VLD dataset: download link
The preprocessed raw data: download link
Note: The preprocessed raw data contains additional information collected from the CARLA simulator. You may build your own datasets for other tasks from this raw material.
A mini sample data with a single route: download link
The following video is an RGB recording of a vehicle's driving route selected from the dataset, with the instruction:
Go straight along the current road, turn left at the T-junction after passing a blue-purple kiosk, then go straight, turn right at the T-junction after passing a white plastic table, then go straight, turn left at the T-junction at the end of the road, then go straight, turn left at the crossroads after passing a Coca-Cola vending machine, then go straight, turn right at the T-junction at the end of the road, then go straight and stop near the mailbox on the right side of the road.
A.collected.route.in.the.VLD.dataset.mp4
For more details, please refer to the paper.
The prediction results:
