Indy Explorer is a Streamlit app that helps you browse and compare the resorts included in the Indy Pass.
- Resort Data Extraction: Uses BeautifulSoup to scrape data from the Indy Pass resort pages.
- Location Normalization: Utilizes the Google Maps Geocoding API to normalize location data.
- Interactive UI: Built with Streamlit so you can filter and explore resort information interactively.
Data is sourced from indyskipass.com as of December 23, 2025.
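
As a rough illustration of the location-normalization step, here is a minimal geocoding sketch using the `googlemaps` client. The query string and key handling are illustrative, not the app's actual pipeline.

```python
# Sketch: normalize a free-form resort location into a canonical address
# and coordinates via the Google Maps Geocoding API.
import os

import googlemaps

gmaps = googlemaps.Client(key=os.environ["GOOGLE_MAPS_API_KEY"])

results = gmaps.geocode("Jay Peak Resort, Vermont")  # illustrative query
if results:
    top = results[0]
    print(top["formatted_address"])                  # normalized address
    location = top["geometry"]["location"]
    print(location["lat"], location["lng"])          # coordinates for the map
```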
- Clone the repository:

  ```bash
  git clone https://github.com/jonathanstelman/indy-explorer.git
  cd indy-explorer
  ```

- Install Poetry:

  ```bash
  pipx install poetry
  ```
- Install the required dependencies:

  ```bash
  poetry install
  ```
- Set up your Mapbox API token:
  - Sign up for a free Mapbox account.
  - Create an access token in your Mapbox account dashboard.
  - Add your token to `.streamlit/secrets.toml`:

    ```toml
    MAPBOX_TOKEN = "your_mapbox_token_here"
    ```

    Or set it as an environment variable:

    ```bash
    export MAPBOX_TOKEN=your_mapbox_token_here
    ```
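
  For reference, here is a minimal sketch of how a Streamlit app can read the token from either source. The lookup order and error handling below are assumptions for illustration, not necessarily what `src/app.py` does.

  ```python
  # Sketch: resolve the Mapbox token from Streamlit secrets, falling back
  # to the environment. The fallback order here is an assumption.
  import os

  import streamlit as st

  # st.secrets behaves like a mapping; .get() avoids a KeyError when the
  # key is absent (assumes a secrets file exists).
  mapbox_token = st.secrets.get("MAPBOX_TOKEN") or os.environ.get("MAPBOX_TOKEN")
  if not mapbox_token:
      st.error("MAPBOX_TOKEN is not set; see the README for setup steps.")
  ```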
- Run the Streamlit app:

  ```bash
  poetry run streamlit run src/app.py
  ```
- Fetch Resort Data: Use the `get_page_html` function to fetch resort data from the web or cache.
- Parse Resort Data: Use the `parse_resort_page` function to extract the relevant resort fields.
- View Data: The data is displayed in an interactive Streamlit app.
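
As a quick sketch of that flow (assuming both functions live in `src/page_scraper.py` and take the arguments shown; check the real signatures before relying on this):

```python
# Sketch of the fetch/parse flow. The module path, URL, argument shapes,
# and return type are assumptions; verify against src/page_scraper.py.
from src.page_scraper import get_page_html, parse_resort_page

html = get_page_html("https://www.indyskipass.com/our-resorts")  # web or cache
resort_data = parse_resort_page(html)  # assumed to return a dict of fields
print(resort_data)
```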
To update all resort data (recommended about once per year), follow these steps:
- Back up your web cache and resort data (optional but recommended):

  ```bash
  cp -r cache backups/cache_backup_$(date +%Y%m%d_%H%M%S)
  cp -r data backups/data_backup_$(date +%Y%m%d_%H%M%S)
  ```
- Remove the old cache and data folders (optional, for a clean refresh):

  ```bash
  rm -rf cache
  rm -rf data
  ```
- Recreate the required directories:

  ```bash
  mkdir -p cache data/resort_page_extracts
  ```
- Fetch and cache the latest resort data:

  ```bash
  poetry run python src/page_scraper.py
  ```

  This will:
  - Download and cache the latest "our resorts" page.
  - Parse and save `data/resorts_raw.json`.
  - Download, cache, and parse each individual resort page, saving to `data/resort_page_extracts/<slug>.json`.
  Notes:
  - Use live mode to re-download everything:

    ```bash
    poetry run python src/page_scraper.py --read-mode live
    ```

  - Cached HTML files are not overwritten (the scraper uses `open(..., 'x')`). Delete `cache/*.html` if you want to re-fetch.
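
  The no-overwrite behavior comes from Python's exclusive-create mode. A small sketch of the pattern (the helper name and paths are illustrative, not the scraper's actual code):

  ```python
  # Sketch of exclusive-create caching: mode 'x' raises FileExistsError
  # instead of overwriting, so previously cached pages are left untouched.
  from pathlib import Path

  def cache_html(path: Path, html: str) -> bool:
      """Write html to path only if the file does not already exist."""
      try:
          with open(path, "x", encoding="utf-8") as f:
              f.write(html)
          return True   # newly cached
      except FileExistsError:
          return False  # already cached; left as-is

  # cache_html(Path("cache/our-resorts.html"), "<html>...</html>")
  ```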
- Placeholder: update blackout date handling (next task):
  - The blackout Google Sheet format has changed; revisit and update the blackout pipeline here.
- Fetch blackout dates (optional):
  - `src/prep_resort_data.py` will auto-fetch `data/blackout_dates_raw.csv` if it is missing.
  - To force a refresh, run:

    ```bash
    poetry run python src/prep_resort_data.py --refresh-blackout
    ```

  - To increase verbosity, add `--log-level DEBUG`.
  - You can still use `src/blackout.py` to fetch the sheet and print QA output if you want to inspect it.
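
  For context, a sheet published to the web can be read directly as CSV. A minimal sketch with a placeholder URL (the real sheet URL lives in the project code):

  ```python
  # Sketch: read a published Google Sheet's CSV export with pandas.
  # The URL below is a placeholder, not the project's actual sheet.
  import pandas as pd

  SHEET_CSV_URL = "https://docs.google.com/spreadsheets/d/<sheet-id>/export?format=csv"
  blackout_df = pd.read_csv(SHEET_CSV_URL)
  print(blackout_df.head())
  ```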
- Fetch reservation requirements (optional):
  - Cache the HTML page:

    ```bash
    poetry run python src/cache_blackout_reservations.py --read-mode live
    ```

  - Parse the cached reservations list into JSON:

    ```bash
    poetry run python src/reservations.py --read-mode cache
    ```

    This produces `data/reservations_raw.json`.
- Prepare the final CSV for the Streamlit app:

  ```bash
  poetry run python src/prep_resort_data.py
  ```

  This will:
  - Use the Google Maps API to retrieve and save normalized location data if `data/resort_locations.csv` is missing.
  - Merge all resort data and location info.
  - Produce `data/resorts.csv` for the Streamlit app.
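
  Conceptually, the merge step looks something like the sketch below. The file formats and the join column (`name`) are assumptions, not the script's actual schema.

  ```python
  # Sketch: join scraped resort data with normalized locations, then write
  # the CSV the app reads. Column names here are assumptions.
  import pandas as pd

  resorts = pd.read_json("data/resorts_raw.json")
  locations = pd.read_csv("data/resort_locations.csv")

  merged = resorts.merge(locations, on="name", how="left")
  merged.to_csv("data/resorts.csv", index=False)
  ```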
- Run the Streamlit app:

  ```bash
  poetry run streamlit run src/app.py
  ```
Notes:
- If you encounter errors related to missing or outdated data files, re-run the above steps in order.
- The Google Maps API key must be set in your environment (see `.env.example`).
- A Mapbox API token is required for map rendering in the Streamlit app.
- If you have a limited Google Maps API quota, be aware that regenerating `data/resort_locations.csv` makes one API call per unique resort location.
- If you want to preserve previous location lookups, back up `data/resort_locations.csv` as well:

  ```bash
  cp data/resort_locations.csv data_backup_$(date +%Y%m%d_%H%M%S)_resort_locations.csv
  ```

- If you add new resorts or change location names, you may need to manually review or update `data/resort_locations.csv`.
Blackout dates are sourced from a published Google Sheet and merged into the resort data.
- Fetch the latest sheet data and print QA output:

  ```bash
  poetry run python src/blackout.py
  ```

  This currently prints QA output only (it does not write `data/blackout_dates_raw.csv`).

- Run the prep script to merge blackout dates into `data/resorts.csv`:

  ```bash
  poetry run python src/prep_resort_data.py
  ```

- Print the raw sheet and name-mismatch diagnostics:

  ```bash
  poetry run python src/blackout.py
  ```

- If blackout resort names don't match `data/resorts.csv`, update `BLACKOUT_RESORT_NAME_MAP` in `src/blackout.py`.
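
To illustrate the kind of mapping `BLACKOUT_RESORT_NAME_MAP` provides, here is a hedged sketch; the entries and helper below are invented for illustration and are not the module's actual contents.

```python
# Sketch: reconcile sheet resort names with the names used in
# data/resorts.csv. The example entry is invented.
BLACKOUT_RESORT_NAME_MAP = {
    "Example Mtn. Resort": "Example Mountain",  # sheet name -> resorts.csv name
}

def normalize_resort_name(sheet_name: str) -> str:
    """Map a sheet name to its resorts.csv counterpart, if one is defined."""
    return BLACKOUT_RESORT_NAME_MAP.get(sheet_name, sheet_name)
```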
We welcome contributions! Please follow these steps:
- Fork the repository.
- Create a new branch (`git checkout -b feature-branch`).
- Make your changes.
- Commit your changes (`git commit -am 'Add new feature'`).
- Push to the branch (`git push origin feature-branch`).
- Open a new Pull Request.
We include unit tests and CI for this project. Here are the quick commands and notes you need to run the tests, generate reports, and keep code formatted:
- Install dependencies with Poetry:

  ```bash
  poetry install
  ```

- Run the full test suite:

  ```bash
  poetry run pytest -q
  ```

- Run tests and generate JUnit + coverage reports (matches CI):

  ```bash
  mkdir -p reports
  poetry run pytest --junitxml=reports/junit.xml --cov=src --cov-report=xml:reports/coverage.xml -q
  ```
Notes:
- Tests live in `tests/` and use fixtures in `tests/fixtures/` (e.g., `powder_ridge_fixture.html`, `powder_ridge_malformed.html`).
- `tests/conftest.py` ensures the project root is on `sys.path` so `import src` works in CI.
- The project includes `pytest-cov` for coverage reporting (installed in the dev group).
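
A conftest of this kind typically looks like the following sketch; the actual `tests/conftest.py` may differ.

```python
# tests/conftest.py (sketch): put the project root on sys.path so that
# `import src` resolves in CI without installing the package.
import sys
from pathlib import Path

PROJECT_ROOT = Path(__file__).resolve().parent.parent
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))
```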
We use Black for code formatting. To check formatting locally:

```bash
poetry run black --check .
```

To run Black and reformat files:

```bash
poetry run black .
```

Black configuration lives in `pyproject.toml` (we set `skip-string-normalization = true` to preserve single quotes).
- GitHub Actions runs tests and checks formatting on push/PRs. The CI workflow:
  - Installs dependencies with Poetry
  - Runs `pytest` and generates `reports/junit.xml` and `reports/coverage.xml`
  - Uploads a test artifact named `test-reports`
  - Publishes a GitHub Check Run summary using `dorny/test-reporter@v2` (configured for `python-xunit`)
  - Optionally uploads coverage to Codecov if `CODECOV_TOKEN` is set in repository secrets
`src/utils.py` now avoids creating the `googlemaps.Client` at import time when `GOOGLE_MAPS_API_KEY` is missing, which prevents import failures in CI and local environments without the key. Tests monkeypatch `utils.gmaps` when needed. `beautifulsoup4` is declared in `pyproject.toml`, so `bs4` is available in CI.
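
A sketch of the lazy-initialization pattern described above; the helper name and error handling are hypothetical, not the actual contents of `src/utils.py`.

```python
# Sketch: create the googlemaps.Client on first use rather than at import
# time, so importing the module never fails when the key is unset.
import os

import googlemaps

gmaps = None  # created lazily; tests can monkeypatch this attribute

def get_gmaps_client():
    global gmaps
    if gmaps is None:
        key = os.environ.get("GOOGLE_MAPS_API_KEY")
        if not key:
            raise RuntimeError("GOOGLE_MAPS_API_KEY is not set")
        gmaps = googlemaps.Client(key=key)
    return gmaps
```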
This project is licensed under the MIT License. See the LICENSE file for details.
Made by Jonathan Stelman