
Overview
This document collects the Spatial Ninjas literature summaries, synthesis notes, and next steps.

Project Overview (updated 13/04 2026)

LLMs chosen for this project: GPT and Gemini.
Gemma3 is used as the example model for testing in Test_LLM_for_routing.py, although the code can be used with any LLM. The routingpy module serves as ground truth, and an area of Helsinki taken from OpenStreetMap (OSM) is the reference map.

Our Current Script Inventory
Our team has developed three scripts that connect raw map data with AI-driven navigation:

Data Processing (network_to_ssal.py): We extract road data from OpenStreetMap (OSM) for Southern Helsinki. To save tokens, we convert complex geographical data into SSAL (Simplified Semantic Adjacency List). SSAL only keeps essentials: Node IDs, street names, lengths, and one-way status.

The Engine (Test_LLM_for_routing.py): This is the script that feeds the SSAL data and a routing prompt to the LLM. It asks the LLM to act as a GPS and output a route in JSON format.

The Interface (app.py): A GUI that allows us to compare OpenAI and Gemini models side-by-side. It persists all test history in a SQLite database (history.db) for later analysis.
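The history persistence described for app.py could look roughly like the sketch below. The table layout, column names, and the helper names open_history, save_run, and runs_for_model are assumptions made for illustration; the actual schema of history.db is not described here.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: the real layout of history.db may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    created_at TEXT NOT NULL,
    model TEXT NOT NULL,
    origin TEXT NOT NULL,
    destination TEXT NOT NULL,
    response_text TEXT NOT NULL
)
"""


def open_history(path=":memory:"):
    """Open (or create) the history database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_run(conn, model, origin, destination, response_text):
    """Store one LLM routing attempt and return its row id."""
    cur = conn.execute(
        "INSERT INTO history (created_at, model, origin, destination, response_text) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),
            model,
            str(origin),
            str(destination),
            response_text,
        ),
    )
    conn.commit()
    return cur.lastrowid


def runs_for_model(conn, model):
    """Return (origin, destination, response_text) tuples for one model, oldest first."""
    rows = conn.execute(
        "SELECT origin, destination, response_text FROM history "
        "WHERE model = ? ORDER BY id",
        (model,),
    )
    return rows.fetchall()
```

Storing the raw response text rather than only the parsed route keeps failed or truncated answers available for later analysis.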

Technical Specs & Data Logic
To keep the models efficient, we have pruned the OSM attributes:

Feature | Logic
SSAL Format | Node: Neighbor {Length, Name, Direction}. Minimalist for token efficiency.
Edge Filtering | We keep u, v, name, length, and oneway. We discard speed limits, lane counts, and road types.
Node Filtering | We keep osmid and x/y coordinates so the LLM understands "North/South" and relative positions.
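As an illustration of the SSAL idea, here is a minimal sketch that turns pruned nodes and edges into a "Node: Neighbor {Length, Name, Direction}" listing. It is not the code from network_to_ssal.py. It assumes that "Direction" means a compass heading derived from the x/y (longitude/latitude) coordinates, and that one-way edges are simply listed in their allowed direction only. The function names and the demo data are made up.

```python
import math


def compass(dx, dy):
    """Map a displacement (dx towards east, dy towards north) to an 8-point compass label."""
    labels = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]
    angle = math.degrees(math.atan2(dy, dx)) % 360
    return labels[int((angle + 22.5) // 45) % 8]


def build_ssal(nodes, edges):
    """Build an adjacency list from pruned data.

    nodes: {osmid: (x, y)} with x = longitude, y = latitude.
    edges: list of dicts with the keys u, v, name, length, oneway.
    """
    ssal = {osmid: [] for osmid in nodes}
    for e in edges:
        pairs = [(e["u"], e["v"])]
        if not e["oneway"]:
            # Two-way streets are traversable in both directions.
            pairs.append((e["v"], e["u"]))
        for a, b in pairs:
            (x1, y1), (x2, y2) = nodes[a], nodes[b]
            # Scale longitude differences by cos(latitude) so headings are not skewed.
            dx = (x2 - x1) * math.cos(math.radians((y1 + y2) / 2))
            ssal[a].append({
                "to": b,
                "length": round(e["length"], 1),
                "name": e["name"],
                "direction": compass(dx, y2 - y1),
            })
    return ssal


def ssal_to_text(ssal):
    """Serialize the adjacency list, one node per line, for use in a prompt."""
    lines = []
    for node, nbrs in ssal.items():
        parts = [
            f'{n["to"]} {{{n["length"]}, {n["name"]}, {n["direction"]}}}'
            for n in nbrs
        ]
        lines.append(f"{node}: " + "; ".join(parts))
    return "\n".join(lines)


# Made-up demo data (not real OSM ids, names, or coordinates).
demo_nodes = {1: (24.940, 60.165), 2: (24.942, 60.165), 3: (24.942, 60.166)}
demo_edges = [
    {"u": 1, "v": 2, "name": "A Street", "length": 110.0, "oneway": False},
    {"u": 2, "v": 3, "name": "B Street", "length": 111.2, "oneway": True},
]
print(ssal_to_text(build_ssal(demo_nodes, demo_edges)))
```

Encoding one-way status by omitting the reverse entry keeps the listing short, at the cost of the LLM never seeing that a forbidden direction exists.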

Update 15/04 2026

Compare Routes.py was created and successfully tested. The script takes the JSON output that the GUI receives from the LLM, converts the nodes into coordinates, asks routingpy to compute the same route, and compares the two results on both the correctly selected nodes and the distance estimate, each expressed as a percentage.
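The exact formulas behind the two scores are not spelled out here, so the sketch below shows one plausible way to compute them. Both definitions are assumptions: node accuracy is taken as the share of reference nodes that also appear in the LLM route, and distance precision as 100% minus the relative distance error. The reference node list and distance are passed in directly; calling routingpy is left out.

```python
def node_sequence_accuracy(llm_nodes, ref_nodes):
    """Percentage of reference nodes that also appear in the LLM route (assumed definition)."""
    if not ref_nodes:
        return 0.0
    llm_set = set(llm_nodes)
    hits = sum(1 for n in ref_nodes if n in llm_set)
    return 100.0 * hits / len(ref_nodes)


def distance_precision(llm_m, ref_m):
    """100% minus the relative error of the LLM distance, floored at 0 (assumed definition)."""
    if ref_m <= 0:
        return 0.0
    return max(0.0, 100.0 * (1.0 - abs(llm_m - ref_m) / ref_m))


# Made-up example values, not taken from the tests in this document.
print(round(node_sequence_accuracy([1, 2, 3, 5], [1, 2, 3, 4]), 1))
print(round(distance_precision(950.0, 1000.0), 1))
```

A set-based overlap ignores node order, so a route that visits the right nodes in the wrong sequence would still score well; an order-aware measure such as a longest common subsequence would be stricter.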

The following tests have been run so far:

Test 1
Find a path from Origin Node: 25291537 (Bulevardi) to Destination Node: 313984198 (Bulevardi).

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response.

GPT 5.4 Mini: Node Sequence Accuracy: 66.7%, Distance Precision: 99.1%

Test 2
Find a path from Origin Node: 25291537 (Bulevardi) to Destination Node: 313984198 (Bulevardi).

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response.

GPT 5.4: Node Sequence Accuracy: 66.7%, Distance Precision: 99.1%


Test 3
Find a path from Origin Node: 25291537 (Bulevardi) to Destination Node: 311112501 (Korkeavuorenkatu).

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response.

GPT 5.4: Node Sequence Accuracy: 8.1%, Distance Precision: 2.9%

Test 4
Find a path from Origin Node: 25291564 (Bulevardi) to Destination Node: 25291567 (Yrjönkatu).

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response.

GPT 5.4: Node Sequence Accuracy: 25.0%, Distance Precision: 24.1%

Test 5
Find a path from Origin Node: 25291537 (Bulevardi) to Destination Node: 25291550 (Uudenmaankatu).

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response.

GPT 5.4 Mini: Node Sequence Accuracy: 15.8%, Distance Precision: 4.2%

Test 6
Find a path from Origin Node: 313984203 (Bulevardi) to Destination Node: 3232013778 (Annankatu).

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response.

GPT 5.4: Failed to deliver correct format in response.

Test 7
Find a path from Origin Node: 25291537 (Bulevardi) to Destination Node: 3228745582 (Hietalahdenranta).

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response.

GPT 5.4: Failed to deliver correct format in response.

Comment: Accuracy appears to drop significantly as the tests become more difficult, even for GPT 5.4. Gemini 2.5 Flash fails every time to deliver a correctly formatted response: it stops short before delivering the full answer and keeps getting stuck on a response that looks like this:

response_text"```json { "origin": "25291564", "
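A truncated answer like the one above can be detected before it reaches the comparison step. The following is a minimal sketch, assuming the model wraps its JSON in a Markdown code fence as in the sample; the function name extract_route_json and the "route" key in the usage lines are hypothetical.

```python
import json
import re


def extract_route_json(response_text):
    """Return (data, error): data is a dict on success, otherwise None plus a reason."""
    text = response_text.strip()
    # Drop an opening ```json fence and a closing ``` fence if present.
    text = re.sub(r"^```(?:json)?\s*", "", text)
    text = re.sub(r"\s*```$", "", text)
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return None, f"invalid or truncated JSON: {exc.msg}"
    if not isinstance(data, dict):
        return None, "expected a JSON object"
    return data, None


# The truncated sample from above fails cleanly instead of raising.
data, error = extract_route_json('```json { "origin": "25291564", "')
print(data, error)

# A complete answer (hypothetical keys and values) parses normally.
data, error = extract_route_json('```json\n{"origin": "1", "route": ["1", "2"]}\n```')
print(data, error)
```

Logging the error string alongside the raw response makes it easier to separate format failures from routing failures. One possible cause to rule out is a low output-token limit on the request; that is a guess, not something the tests above establish.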