Support Lancaster (UK) food waste collection#1895
Conversation
📝 WalkthroughWalkthroughAdded module-level Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #1895 +/- ##
=======================================
Coverage 86.67% 86.67%
=======================================
Files 9 9
Lines 1141 1141
=======================================
Hits 989 989
Misses 152 152 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py (1)
74-79: Raise a parser-specific error here.
Exceptionmakes expected markup drift look the same as any unrelated runtime failure. Using a specific parse error such asValueErrorgives callers and tests a much clearer contract.Minimal cleanup
- raise Exception("Could not find scheduled collections section on the page") + raise ValueError("Could not find scheduled collections section on the page") - raise Exception("No collection services found") + raise ValueError("No collection services found")Based on learnings, in uk_bin_collection/**/*.py, when parsing council bin collection data, prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors. This ensures format changes are detected early. Use clear exception types, document error causes, and add tests to verify that invalid inputs raise as expected.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py` around lines 74 - 79, Replace the generic Exception raises in LancasterCityCouncil.py parsing logic with a parser-specific error to make failures explicit: change the two raise Exception(...) occurrences triggered when services is None and when services_sub is empty to a more specific type (e.g., ValueError or a custom ParseError class) and update any tests or callers that expect these failures to catch the new error; locate the checks around the variables services and services_sub inside the LancasterCityCouncil parsing function and replace the exception type and message accordingly so callers can distinguish parse errors from unrelated runtime exceptions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py`:
- Around line 113-115: The new sort uses date_format via a star import, causing
Ruff F405; replace the wildcard import with an explicit import of date_format
from the common module and ensure datetime is imported explicitly too if not
already; update the import statement to explicitly import date_format (and
datetime if missing) so the lambda using
datetime.strptime(x.get("collectionDate"), date_format) no longer relies on
import *.
- Around line 83-111: The parser currently swallows malformed rows by skipping
when date_text is blank or when a ValueError occurs; update the parsing logic in
the LancasterCityCouncil parsing method so that instead of continuing silently
you raise a clear exception (e.g., ValueError or a custom ParseError) with
contextual details (include the offending date_text, index i, and raw
services_sub snippet) whenever date_text is missing, datetime.strptime fails, or
expected sibling elements (type_text) are absent; ensure the raised message
references the symbols BIN_TYPES, SUFFIXES, and date_format and that
callers/tests expect and assert these failures rather than allowing partial
schedules to be returned.
---
Nitpick comments:
In `@uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py`:
- Around line 74-79: Replace the generic Exception raises in
LancasterCityCouncil.py parsing logic with a parser-specific error to make
failures explicit: change the two raise Exception(...) occurrences triggered
when services is None and when services_sub is empty to a more specific type
(e.g., ValueError or a custom ParseError class) and update any tests or callers
that expect these failures to catch the new error; locate the checks around the
variables services and services_sub inside the LancasterCityCouncil parsing
function and replace the exception type and message accordingly so callers can
distinguish parse errors from unrelated runtime exceptions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e2a00efc-5947-417e-b636-3cc26df81693
📒 Files selected for processing (1)
uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py
Recently Lancaster City Council (UK) started doing food waste collection :) Now they have added it to the web page thing. They also decided to mess up the text that is given for other recycling. This adds support for the food waste collection and changes the logic to trim the text for the collection type to bypass some of the issues the new formatting gives. This also updates the logic to better match that found in the similar project “mampfes/hacs_waste_collection_schedule”.
There was a problem hiding this comment.
♻️ Duplicate comments (2)
uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py (2)
6-6:⚠️ Potential issue | 🟠 MajorReplace wildcard import to resolve Ruff F405 for
date_format.Line 6 still uses
import *, whiledate_formatis referenced at Line 106 and Line 114; Ruff correctly flags this as F405.Minimal fix
-from uk_bin_collection.uk_bin_collection.common import * +from uk_bin_collection.uk_bin_collection.common import date_format#!/bin/bash # Verify wildcard import and date_format references in this file. rg -n 'from uk_bin_collection\.uk_bin_collection\.common import \*|date_format' \ uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.pyExpected result after fix: no wildcard import line;
date_formatremains explicitly referenced.Also applies to: 106-106, 114-114
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py` at line 6, The file currently uses a wildcard import "from uk_bin_collection.uk_bin_collection.common import *" which triggers Ruff F405 for date_format; replace the wildcard import with an explicit import of the symbols actually used (at minimum import date_format) from uk_bin_collection.uk_bin_collection.common so LancasterCityCouncil (the class/functions referencing date_format at the usages) can reference date_format directly; scan the file for any other names coming from common and add them to the explicit import list to avoid future F405 errors.
83-111:⚠️ Potential issue | 🟠 MajorStop silently dropping malformed schedule rows.
Line 86 and Line 109-Line 111 still allow malformed rows to be skipped, which can return partial schedules instead of surfacing a parser break.
Proposed fix
- date_text = ( - services_sub[i + 1].text.strip() if services_sub[i + 1] else None - ) - if date_text: - try: - dt = datetime.strptime(date_text, "%d/%m/%Y").date() - type_text = services_sub[i + 2].text.strip() + try: + date_text = services_sub[i + 1].text.strip() + type_text = services_sub[i + 2].text.strip() + if not date_text or not type_text: + raise ValueError( + f"Incomplete Lancaster row at index {i}: " + f"date_text={date_text!r}, type_text={type_text!r}" + ) + dt = datetime.strptime(date_text, "%d/%m/%Y").date() collection_type = next( ( bin_type for bin_type in BIN_TYPES if type_text.startswith(bin_type) ), None, ) if collection_type is None: collection_type = type_text for suffix in SUFFIXES: collection_type = collection_type.removesuffix(suffix) - if collection_type is not None: - data["bins"].append( - { - "type": collection_type, - "collectionDate": dt.strftime(date_format), - } - ) - except ValueError: - # Skip invalid date or missing elements - continue + if not collection_type: + raise ValueError( + f"Unresolved Lancaster collection type at index {i}: " + f"type_text={type_text!r}, BIN_TYPES={BIN_TYPES}, SUFFIXES={SUFFIXES}" + ) + data["bins"].append( + { + "type": collection_type, + "collectionDate": dt.strftime(date_format), + } + ) + except (IndexError, AttributeError, ValueError) as exc: + snippet = [li.get_text(strip=True) for li in services_sub[i : i + 3]] + raise ValueError( + f"Unexpected Lancaster scheduled row at index {i} with " + f"date_text={locals().get('date_text', None)!r}; " + f"expected format date={date_format}; row={snippet}" + ) from excBased on learnings, in uk_bin_collection/**/*.py, when parsing council bin collection data, prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py` around lines 83 - 111, The parser currently swallows malformed rows in LancasterCityCouncil.py by continuing on ValueError or missing elements; instead, in the block handling services_sub (look for services_sub, BIN_TYPES, SUFFIXES, data["bins"], and date_format) ensure you validate presence of services_sub[i+1] and services_sub[i+2] and raise a descriptive exception if they are missing, and when datetime.strptime(date_text, "%d/%m/%Y") fails raise a ValueError with contextual info (e.g., include date_text and surrounding type_text) rather than continue, so callers get explicit failures for unexpected schedule formats.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py`:
- Line 6: The file currently uses a wildcard import "from
uk_bin_collection.uk_bin_collection.common import *" which triggers Ruff F405
for date_format; replace the wildcard import with an explicit import of the
symbols actually used (at minimum import date_format) from
uk_bin_collection.uk_bin_collection.common so LancasterCityCouncil (the
class/functions referencing date_format at the usages) can reference date_format
directly; scan the file for any other names coming from common and add them to
the explicit import list to avoid future F405 errors.
- Around line 83-111: The parser currently swallows malformed rows in
LancasterCityCouncil.py by continuing on ValueError or missing elements;
instead, in the block handling services_sub (look for services_sub, BIN_TYPES,
SUFFIXES, data["bins"], and date_format) ensure you validate presence of
services_sub[i+1] and services_sub[i+2] and raise a descriptive exception if
they are missing, and when datetime.strptime(date_text, "%d/%m/%Y") fails raise
a ValueError with contextual info (e.g., include date_text and surrounding
type_text) rather than continue, so callers get explicit failures for unexpected
schedule formats.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3da11b44-af3c-41b5-b8b0-786f579d592d
📒 Files selected for processing (1)
uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py
If date parsing errors occur, expose the error correctly. Also, only import required parts of the common lib.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
♻️ Duplicate comments (2)
uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py (2)
90-119:⚠️ Potential issue | 🟠 MajorReject blank collection types before appending.
Line 113 only guards against
None, but an emptytype_textstill reaches the append path as"". That hides a malformed row and produces a bin with notype.Based on learnings, council parsers in this repo should fail explicitly on unexpected formats rather than returning partial data.Minimal fix
if collection_type is None: collection_type = type_text for suffix in SUFFIXES: collection_type = collection_type.removesuffix(suffix) if collection_type is not None: + if not collection_type: + raise ValueError( + "Unexpected blank Lancaster collection type after normalization: " + f"index={i}, raw_type={type_text!r}" + ) data["bins"].append( { "type": collection_type, "collectionDate": dt.strftime(date_format), }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py` around lines 90 - 119, The code currently only checks for collection_type is None before appending but allows empty strings (from type_item.text) to be treated as valid, producing bins with no type; update the logic in the LancasterCityCouncil parser (use variables date_item, type_item, BIN_TYPES, SUFFIXES, collection_type) to explicitly reject empty or whitespace-only collection types after trimming (e.g., if not collection_type: raise ValueError with context including date_text and type_text) instead of appending, so malformed rows cause a clear failure rather than producing a blank "type" in data["bins"].
81-88:⚠️ Potential issue | 🟠 MajorRaise on truncated service rows.
Line 82 still silently drops a trailing 1-2
<li>group. Because that guard skips the body, the missing-element checks at Lines 85-88 never run, so a page shape change can still return a partial schedule instead of failing.Based on learnings, council parsers in this repo should fail explicitly on unexpected formats rather than returning partial data.Minimal fix
for i in range(0, len(services_sub), 3): + if i + 2 >= len(services_sub): + raise ValueError( + "Incomplete Lancaster scheduled row: " + f"index={i}, items=" + f"{[item.get_text(strip=True) for item in services_sub[i:]]}" + ) if i + 2 < len(services_sub): date_item = services_sub[i + 1] type_item = services_sub[i + 2] if date_item is None: raise Exception(f"Missing collection date element at index {i + 1}") if type_item is None: raise Exception(f"Missing collection type element at index {i + 2}")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py` around lines 81 - 88, The loop in LancasterCityCouncil (iterating services_sub) currently uses "if i + 2 < len(services_sub)" which silently skips a trailing 1–2 <li> items and prevents the subsequent missing-element checks from running; change the guard so that instead of skipping incomplete groups you detect a truncated tail and raise an explicit error (e.g., when i+1 or i+2 are out of range or when date_item/type_item is None) so that the parser fails fast on unexpected page shape; specifically update the loop that processes services_sub in LancasterCityCouncil (the block that sets date_item and type_item) to throw a descriptive Exception when the group is incomplete rather than skipping it so the missing-element checks always run.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py`:
- Around line 90-119: The code currently only checks for collection_type is None
before appending but allows empty strings (from type_item.text) to be treated as
valid, producing bins with no type; update the logic in the LancasterCityCouncil
parser (use variables date_item, type_item, BIN_TYPES, SUFFIXES,
collection_type) to explicitly reject empty or whitespace-only collection types
after trimming (e.g., if not collection_type: raise ValueError with context
including date_text and type_text) instead of appending, so malformed rows cause
a clear failure rather than producing a blank "type" in data["bins"].
- Around line 81-88: The loop in LancasterCityCouncil (iterating services_sub)
currently uses "if i + 2 < len(services_sub)" which silently skips a trailing
1–2 <li> items and prevents the subsequent missing-element checks from running;
change the guard so that instead of skipping incomplete groups you detect a
truncated tail and raise an explicit error (e.g., when i+1 or i+2 are out of
range or when date_item/type_item is None) so that the parser fails fast on
unexpected page shape; specifically update the loop that processes services_sub
in LancasterCityCouncil (the block that sets date_item and type_item) to throw a
descriptive Exception when the group is incomplete rather than skipping it so
the missing-element checks always run.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2f5b6279-ab32-4af2-98a1-1fd6e2504e9a
📒 Files selected for processing (1)
uk_bin_collection/uk_bin_collection/councils/LancasterCityCouncil.py
|
Merged into the combined March 2026 release PR #1898. Closing to reduce release noise. Thank you for the contribution! |
Recently Lancaster City Council (UK) started doing food waste collection :)
Now they have added it to the web page thing. They also decided to mess up the text that is given for other recycling. This adds support for the food waste collection and changes the logic to trim the text for the collection type to bypass some of the issues the new formatting gives.
This also updates the logic to better match that found here.
Summary by CodeRabbit