This project was developed with a classmate, Naria SAVARY, during my 3rd year at Blagnac's IUT (University Institute of Technology) in Computer Science in France.
The goal was to compare the performance of several pathfinding algorithms (such as Dijkstra, A*, etc.) using real-world OpenStreetMap (OSM) data.
Unlike a full OSM processing pipeline, the focus here is entirely on algorithmic implementation and benchmarking, using CSV datasets that were preprocessed and provided by our teacher. You can find the code to preprocess the datasets here in French.
This project was made in a short period of time and with limited knowledge. Some parts could be greatly improved by optimizing some loops or using different data structures.
The project implements and compare the 3 following algorithms:
- Dijkstra algorithm
- A* algorithm
- Greedy Dijkstra algorithm
All the project is written in Python. To compare the algorithms they are run one after the other with the same dataset and start and end coordinates (nodes).
The last algorithm almost always fails because it only searches for the next node with the smallest distance without looking at previous, smaller paths.
-
main.py: run this file to test the project. -
constants.py: pretty straightforward, the file containing all the project constants. Use this file to change which datasets, start and destination nodes are used. -
global_file.py: file containing methods and classes used by all the algorithms. -
dijksta.py: file containing the Dijkstra algorithm implementation. -
greedy_dijksta.py: file containing the Greedy Dijkstra algorithm implementation. -
a_star.py: file containing the A * algorithm implementation. -
exporting_OSM_data_to_CSV/osm2csv.py: file (in French) used to transform pbf files to csv.
You can look at the report here to see the results of our benchmark and further explanations on the project.
The project requires the Python dependencies pandas and haversine, which can be installed in a virtual environment using the following commands:
pip install pandas
pip install pandas-stubspip install haversineTo run the project, you will also need a dataset. You can download them from the releases. In this report, we used the following files:
data_ariege/osm_nodes.csvdata_ariege/osm_ways.csv.gzdata_serres/osm_nodes.csvdata_serres/osm_ways.csv
Place the folder containing the data at the same level as the project files.
The data files can have either the .csv or .csv.gz extension.
In the constants.py file, update the relative paths for NODES and WAYS so they match your data.
You can then comment or uncomment the STARTING_NODE_ID and DESTINATION_NODE_ID constants depending on which nodes you want to use.
Finally, you can run the program with the following command:
python3 main.py