[SIP-28] Proposal for Geocoding #8544

@Sascha-Gschwind

[SIP] Proposal for Geocoding

Motivation

A Superset user who wants to build charts such as the deck.gl Scatterplot from address-based data must first convert that data with an external tool so that it has the required latitude/longitude columns.

Many other BI tools can convert addresses automatically.

Proposed Change

We want to implement a feature, using the GeoPy package with the Mapbox Geocoding API, that converts addresses to latitude/longitude values and saves them either as additional columns or by overwriting existing columns in the same table.

To make the API calls, we plan to use the same API key that is already used for the background maps (the Mapbox API key).
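
As a rough illustration of the conversion step, the sketch below geocodes rows held as plain dictionaries and either adds or overwrites coordinate columns. It is not the proposed implementation: `add_coordinates`, `mapbox_geocoder`, the `lat`/`lon` column names, and the in-memory row format are assumptions made for this example. Only `geopy.geocoders.MapBox` and its `geocode` method come from GeoPy, and GeoPy is imported lazily so the rest of the sketch runs without it.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical type: any callable mapping an address to (lat, lon) or None.
Geocoder = Callable[[str], Optional[Tuple[float, float]]]


def mapbox_geocoder(api_key: str) -> Geocoder:
    """Build a Geocoder backed by GeoPy's MapBox geocoder."""
    from geopy.geocoders import MapBox  # imported lazily, needs network at call time

    client = MapBox(api_key=api_key)

    def geocode(address: str) -> Optional[Tuple[float, float]]:
        location = client.geocode(address)
        if location is None:
            return None
        return (location.latitude, location.longitude)

    return geocode


def add_coordinates(
    rows: List[Dict[str, object]],
    address_columns: Sequence[str],
    geocode: Geocoder,
    lat_column: str = "lat",  # assumed column name
    lon_column: str = "lon",  # assumed column name
    overwrite: bool = False,
) -> List[Dict[str, object]]:
    """Return copies of the rows with latitude/longitude columns filled in."""
    result = []
    for row in rows:
        new_row = dict(row)
        already_set = (
            new_row.get(lat_column) is not None
            and new_row.get(lon_column) is not None
        )
        if overwrite or not already_set:
            address = ", ".join(str(row[col]) for col in address_columns)
            coords = geocode(address)
            if coords is not None:
                new_row[lat_column], new_row[lon_column] = coords
            else:
                # Address not found: keep existing values, or add empty columns.
                new_row.setdefault(lat_column, None)
                new_row.setdefault(lon_column, None)
        result.append(new_row)
    return result
```

With a real key, `add_coordinates(rows, ["street", "city"], mapbox_geocoder(key))` would call Mapbox once per row that still lacks coordinates. Passing the geocoder in as a callable also makes it easy to substitute a stub in tests.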

The feature will be available under the menu "Sources" > "Geocode Addresses" and will run asynchronously. However, only one geocoding job can be in progress at a time, for several reasons:

  • Most geocoding APIs limit the number of requests per second
  • Most geocoding APIs limit the number of requests that can be made over a given time period
  • Depending on the amount of data in the table, the process can take a very long time
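
The constraints above can be sketched with two small helpers: a throttle that spaces out requests, and a non-blocking lock that rejects a second job while one is running. Both `Throttle` and the `try_start_job`/`finish_job` functions are hypothetical names for this example, and a process-local lock is itself an assumption, since a multi-worker deployment would need shared state. GeoPy also ships its own `RateLimiter` in `geopy.extra.rate_limiter`, which could be used for the throttling part instead.

```python
import threading
import time


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, max_per_second: float, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = 1.0 / max_per_second
        self._clock = clock  # injectable for testing
        self._sleep = sleep  # injectable for testing
        self._last_call = None

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self._min_interval - (now - self._last_call))
        if delay > 0:
            self._sleep(delay)
        self._last_call = now + delay
        return delay


# One lock per process: at most one geocoding job may hold it.
_job_lock = threading.Lock()


def try_start_job() -> bool:
    """Return True if the caller may start a job, False if one is running."""
    return _job_lock.acquire(blocking=False)


def finish_job() -> None:
    """Release the job slot once the running job has ended."""
    _job_lock.release()
```

A worker would call `try_start_job()` before starting, `Throttle.wait()` before each API request, and `finish_job()` in a `finally` block.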

If a geocoding job is in progress and the user navigates to the "Geocode Addresses" URL, they will see a progress bar and can cancel the job. If no job is running, the geocoding form is shown.

The user can decide what happens if anything goes wrong (for example, the call limit is reached or the connection fails) or the process is interrupted: either save the data converted so far or discard it.
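
The save-or-discard choice could look roughly like the following. This is a minimal sketch under assumptions: `run_geocoding` and the `keep_partial_on_error` flag are hypothetical names, results are collected in a dictionary rather than written to a table, and any exception raised by the geocoder stands in for a rate-limit or connection error.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Coords = Optional[Tuple[float, float]]


def run_geocoding(
    addresses: Iterable[str],
    geocode: Callable[[str], Coords],
    keep_partial_on_error: bool,
) -> Tuple[Dict[str, Coords], Optional[Exception]]:
    """Geocode addresses in order; on failure, keep or discard partial results.

    Returns the results to persist and the error that stopped the run, if any.
    """
    results: Dict[str, Coords] = {}
    try:
        for address in addresses:
            results[address] = geocode(address)
    except Exception as exc:  # e.g. call limit reached or connection issue
        if keep_partial_on_error:
            return results, exc
        return {}, exc
    return results, None
```

Returning the error alongside the results lets the caller report what went wrong while still honouring the user's choice.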

New or Changed Public Interfaces

  • There will be a new React form for geocoding
  • There will be a new REST API that geocodes the address-based data in a specific table and adds or overwrites columns
  • There will be a new REST API that tells the caller whether a geocoding job is already in progress (a boolean, plus an integer representing the progress in %)
  • There will be a new REST API with which a user can interrupt a running geocoding job
  • There will be a new REST API that returns the list of columns for a selected table
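
The state behind the progress and interrupt endpoints listed above might be tracked as sketched here. The class name `GeocodingProgress`, its methods, and the response keys `is_in_progress` and `progress_percent` are assumptions for illustration. The proposal only specifies that the status call returns a boolean and an integer percentage.

```python
import threading
from typing import Dict, Union


class GeocodingProgress:
    """Thread-safe progress state shared by the worker and the REST handlers."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._interrupt = threading.Event()
        self._in_progress = False
        self._total = 0
        self._done = 0

    def start(self, total: int) -> None:
        """Mark a job as running with `total` rows to geocode."""
        with self._lock:
            self._in_progress = True
            self._total = total
            self._done = 0
            self._interrupt.clear()

    def advance(self) -> None:
        """Record that one more row has been processed."""
        with self._lock:
            self._done += 1

    def finish(self) -> None:
        """Mark the job as ended (completed, failed, or interrupted)."""
        with self._lock:
            self._in_progress = False

    def interrupt(self) -> None:
        """Called by the interrupt endpoint; the worker polls should_stop()."""
        self._interrupt.set()

    def should_stop(self) -> bool:
        return self._interrupt.is_set()

    def status(self) -> Dict[str, Union[bool, int]]:
        """Payload for the status endpoint: a boolean and an integer percent."""
        with self._lock:
            percent = int(100 * self._done / self._total) if self._total else 0
            return {
                "is_in_progress": self._in_progress,
                "progress_percent": percent,
            }
```

The worker loop would check `should_stop()` between rows, so an interrupt takes effect after the current request rather than mid-call.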

New dependencies

  • No new dependency is needed, because Superset already uses GeoPy

Migration Plan and Compatibility

Documentation describing the usage of this feature will likely need to be added once it is merged into master.

Rejected Alternatives

  • We considered on-the-fly geocoding, that is, accepting address data directly in certain charts such as the deck.gl Scatterplot, but rejected the idea because geocoding is an expensive operation that should be done only once per dataset.

Metadata

Labels

  • enhancement:request (Enhancement request submitted by anyone from the community)
  • inactive (Inactive for >= 30 days)
  • sip (Superset Improvement Proposal)

Projects

Status

Denied / Closed / Discarded
