Conversation
fix: #1836 - London Borough Redbridge
fix: #1831 - Harborough District Council
fix: #1504 - Adding Hammersmith & Fulham
fix: HarboroughDistrictCouncil
Bumps [pip](https://github.com/pypa/pip) from 25.3 to 26.0. - [Changelog](https://github.com/pypa/pip/blob/main/NEWS.rst) - [Commits](pypa/pip@25.3...26.0) --- updated-dependencies: - dependency-name: pip dependency-version: '26.0' dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
fix: #1846 - Powys Council
fix: #1845 - Mid Suffolk District Council
Richmond upon Thames doesn't work anymore. Bin collections can now be queried with a Unique Property Reference Number (UPRN) in the query string. Use "https://www.richmond.gov.uk/my_richmond" as the URL; the scraper will request https://www.richmond.gov.uk/my_richmond/?pid=<YourUKPropertyID> and parse the response. "https://www.findmyaddress.co.uk/search" can be used to find the UPRN.
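A minimal sketch of that query-string flow, in the style of the repo's other requests-based scrapers; the header value and the parsing step are illustrative assumptions, not the scraper's actual implementation:

```python
import requests
from bs4 import BeautifulSoup

BASE_URL = "https://www.richmond.gov.uk/my_richmond"

def fetch_my_richmond(pid: str) -> BeautifulSoup:
    """Fetch the my_richmond page for one property and return parsed HTML."""
    response = requests.get(
        BASE_URL,
        params={"pid": pid},  # becomes ?pid=<YourUKPropertyID>
        headers={"User-Agent": "Mozilla/5.0"},  # assumed; some councils require a browser UA
        timeout=30,
    )
    response.raise_for_status()  # fail fast on HTTP errors
    return BeautifulSoup(response.text, "html.parser")
```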
Bumps [pillow](https://github.com/python-pillow/Pillow) from 12.0.0 to 12.1.1. - [Release notes](https://github.com/python-pillow/Pillow/releases) - [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst) - [Commits](python-pillow/Pillow@12.0.0...12.1.1) --- updated-dependencies: - dependency-name: pillow dependency-version: 12.1.1 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
fix: #1851 Bromley Borough Council
fix: #1853 - Wakefield City Council
…verification errors
fix: #1848 - Redcar and Cleveland Council
fix: CumberlandCouncil - correct year assignment for all months; the previous version accidentally hardcoded the year as 2025 for every month except Jan/Feb
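For reference, the rollover rule this fix implies can be sketched as a small helper; the function name and signature are hypothetical, not the scraper's actual code:

```python
from datetime import datetime

def assign_year(parsed: datetime, today: datetime) -> datetime:
    """For a date parsed without a year, choose this year or next.

    Hypothetical helper: months earlier than the current month are
    assumed to roll over into next year instead of being pinned to a
    hardcoded year.
    """
    if parsed.month < today.month:
        return parsed.replace(year=today.year + 1)
    return parsed.replace(year=today.year)
```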
Further feedback from the AI agent; this fixes the flagged issues and builds on the previous commit.
fix: #1855 - Barking & Dagenham
fix: #1858 - Cumberland Council
fix: #1861 - North East Derbyshire District Council
fix: #1864 - Leeds City Council
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 6 to 7. - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](actions/upload-artifact@v6...v7) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-version: '7' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>
fix: #1876 - Bath and North East Somerset
fix: #1869 - Adding North Warwickshire Borough Council
fix: #1872 - Broxtowe Borough Council
Bumps [black](https://github.com/psf/black) from 25.1.0 to 26.3.1. - [Release notes](https://github.com/psf/black/releases) - [Changelog](https://github.com/psf/black/blob/main/CHANGES.md) - [Commits](psf/black@25.1.0...26.3.1) --- updated-dependencies: - dependency-name: black dependency-version: 26.3.1 dependency-type: direct:development ... Signed-off-by: dependabot[bot] <support@github.com>
Another update, as the previous change didn't work in execution.
📝 Walkthrough

This PR updates GitHub Actions in CI workflows, modifies the Poetry configuration structure, and refactors multiple UK council bin collection scrapers. Changes include replacing web scraping with REST API calls for some councils, adopting calendar-based data sources for others, updating HTML selectors and request parameters, adding new councils, and improving user-agent handling. Wiki documentation and test data are correspondingly updated.
Sequence Diagram(s)

```mermaid
sequenceDiagram
actor User
participant Client
participant Login API
participant Data Broker API
User->>Client: Request bin data with UPRN
activate Client
Client->>Login API: GET (establish session & extract auth ID)
activate Login API
Login API-->>Client: Session + Auth ID
deactivate Login API
Client->>Data Broker API: POST food waste request (with auth ID)
activate Data Broker API
Data Broker API-->>Client: Food waste items
deactivate Data Broker API
Client->>Data Broker API: POST refuse request (with auth ID)
activate Data Broker API
Data Broker API-->>Client: Refuse items
deactivate Data Broker API
Client->>Data Broker API: POST recycling request (with auth ID)
activate Data Broker API
Data Broker API-->>Client: Recycling items
deactivate Data Broker API
Client->>Data Broker API: POST garden request (with auth ID)
activate Data Broker API
Data Broker API-->>Client: Garden items
deactivate Data Broker API
Client->>Client: Aggregate and sort by date
Client-->>User: Unified bin collection list
deactivate Client
```
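A rough Python sketch of this first flow, assuming a plain requests session; the endpoint paths, the auth-session field, and the payload shapes below are illustrative assumptions, not the council's real API:

```python
import requests

def fetch_all_streams(uprn: str) -> list[dict]:
    """Log in once, then POST one request per waste stream."""
    s = requests.Session()
    # Establish a session and extract the auth ID from the login response.
    r = s.get("https://example-council.gov.uk/authapi/isauthenticated", timeout=10)
    r.raise_for_status()
    auth_id = r.json()["auth-session"]  # assumed response shape

    bins: list[dict] = []
    for stream in ("food waste", "refuse", "recycling", "garden"):
        # One authenticated POST per stream, as in the diagram above.
        resp = s.post(
            "https://example-council.gov.uk/apibroker/runLookup",
            params={"sid": auth_id},
            json={"uprn": uprn, "stream": stream},  # assumed payload
            timeout=10,
        )
        resp.raise_for_status()
        bins.extend(resp.json().get("rows", []))  # assumed payload shape
    # Aggregate and sort by date.
    return sorted(bins, key=lambda row: row.get("Date", ""))
```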
```mermaid
sequenceDiagram
actor User
participant Client
participant Web Server
participant ICS Server
User->>Client: Request bin data with UPRN
activate Client
Client->>Client: Generate ICS URL using UPRN
Client->>ICS Server: Fetch ICS calendar (365 days)
activate ICS Server
ICS Server-->>Client: ICS events
deactivate ICS Server
Client->>Client: Parse events, extract bin types
Client->>Client: Normalize collection type names
Client->>Client: Aggregate and sort by date
Client-->>User: Structured bin collection list
deactivate Client
```
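The second flow matches the ICS-based scrapers; a minimal sketch using the icalevents package these refactors import, where the calendar URL template and the type normalisation are assumptions:

```python
from datetime import datetime, timedelta

from icalevents.icalevents import events

def get_ics_collections(uprn: str) -> list[dict]:
    """Fetch 365 days of ICS events for a UPRN and normalise them."""
    # Hypothetical calendar endpoint keyed on UPRN.
    url = f"https://example-council.gov.uk/calendar/{uprn}.ics"
    start = datetime.now()
    upcoming = events(url, start=start, end=start + timedelta(days=365))

    bins = []
    for event in sorted(upcoming, key=lambda e: e.start):
        # Normalise a summary like "Recycling collection" to "Recycling".
        bin_type = event.summary.strip().replace(" collection", "")
        bins.append(
            {
                "type": bin_type,
                "collectionDate": event.start.date().strftime("%d/%m/%Y"),
            }
        )
    return bins
```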
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
❌ 11 Tests Failed (including 8 ❄️ flaky tests); the shortest-running failures and the full flaky list are in the Test Analytics Dashboard.
Actionable comments posted: 1
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (10)
uk_bin_collection/uk_bin_collection/councils/RedcarandClevelandCouncil.py (1)
40-47: ⚠️ Potential issue | 🔴 Critical — Fix unsafe fallback for `place_id` selection (can crash or silently misroute request).

At Line 46, `addresses[1]` is dereferenced in the `next()` default expression, which raises `IndexError` when the API returns fewer than 2 addresses. Also, falling back to `None` can produce a misleading downstream request to `/places/None/...` instead of failing clearly at parse time.

Suggested fix
```diff
-        place_id = next(
-            (
-                item["place_id"]
-                for item in addresses
-                if item.get("name", "").startswith(user_paon)
-            ),
-            addresses[1]["place_id"] if addresses[1] else None,
-        )
+        place_id = next(
+            (
+                item.get("place_id")
+                for item in addresses
+                if item.get("name", "").startswith(user_paon)
+            ),
+            None,
+        )
+
+        if place_id is None:
+            if len(addresses) > 1 and addresses[1].get("place_id"):
+                place_id = addresses[1]["place_id"]
+            else:
+                raise ValueError(
+                    f"Unable to resolve place_id for postcode={user_postcode}, paon={user_paon}"
+                )
```

Based on learnings: In uk_bin_collection/**/*.py, when parsing council bin collection data, prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/RedcarandClevelandCouncil.py` around lines 40 - 47, The current selection of place_id uses addresses[1] in the next() default which can raise IndexError or silently pass None into downstream requests; update the logic around place_id (the addresses list and the next(...) call in RedcarandClevelandCouncil) to explicitly validate addresses length and contents before choosing a fallback: if addresses is empty or has no matching name, raise a descriptive exception (e.g., ValueError) rather than using addresses[1] or None; ensure the code checks that items contain "place_id" and that user_paon matching is correct, then assign place_id from the validated item or fail fast with a clear error message.

uk_bin_collection/uk_bin_collection/councils/LondonBoroughRedbridge.py (2)
1-1: ⚠️ Potential issue | 🟡 Minor — Incorrect comment references Bromley Council.

This file is `LondonBoroughRedbridge.py` but the comment says "Bromley Council Bins Data" — appears to be a copy-paste artifact.

📝 Suggested fix
```diff
-# This script pulls (in one hit) the data from Bromley Council Bins Data
+# This script pulls (in one hit) the data from London Borough of Redbridge Bins Data
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/LondonBoroughRedbridge.py` at line 1, The file header comment incorrectly references "Bromley Council Bins Data"; update the top-of-file comment in LondonBoroughRedbridge.py to accurately describe the file (e.g., "This script pulls the data for London Borough of Redbridge bin collections" or similar), removing the Bromley reference so the comment matches the module purpose and filename.
120-124: ⚠️ Potential issue | 🟡 Minor — Consider explicit failure instead of silent continuation on date parse errors.

When `ValueError` is raised during date parsing, the error is printed but processing continues, potentially returning incomplete data without the caller knowing. Based on learnings, the project prefers explicit failures to ensure format changes are detected early.

If partial data is acceptable, consider at minimum logging at a warning level or collecting errors to report at the end.
🛡️ Option: Re-raise to fail explicitly
```diff
         except ValueError as e:
-            # Handle the case where the date format is invalid
-            print(
-                f"Error parsing date '{date_string}' for {collection_type}: {e}"
-            )
+            # Fail explicitly on unexpected date format
+            raise ValueError(
+                f"Failed to parse date '{date_string}' for {collection_type}"
+            ) from e
```

Based on learnings: "In uk_bin_collection/**/*.py, when parsing council bin collection data, prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/LondonBoroughRedbridge.py` around lines 120 - 124, Replace the silent print in the ValueError handler with an explicit failure: in the LondonBoroughRedbridge code path where you catch ValueError (the except ValueError as e block referencing date_string and collection_type), re-raise a descriptive exception (or raise the original exception) instead of printing so callers fail fast on unexpected date formats; alternatively, if partial results are acceptable, change the print to a processLogger.warning and collect the error for later reporting, but prefer raising an exception to detect format changes early.

uk_bin_collection/uk_bin_collection/councils/NuneatonBedworthBoroughCouncil.py (2)
54-59: ⚠️ Potential issue | 🔴 Critical — Exceptions are created but never raised — errors are silently ignored.

These `Exception(...)` calls instantiate exception objects but do not raise them. The code continues execution as if no error occurred, returning an empty `data` dict and masking failures.

Based on learnings: prefer explicit failures over silent defaults to ensure format changes are detected immediately.
🐛 Proposed fix to raise exceptions
```diff
         if len(matches) == 1:
             street_url = matches[0]["href"]
             full_url = base_url.rstrip("/") + street_url
             bin_data = self.get_bin_data(full_url)
             for k, v in bin_data.items():
                 for date in v:
                     dict_data = {
                         "type": k,
                         "collectionDate": datetime.strptime(
                             date, "%Y-%m-%d"
                         ).strftime(date_format),
                     }
                     data["bins"].append(dict_data)
             return data
         elif len(matches) > 1:
-            Exception("Multiple street URLs found. Please refine your search.")
+            raise ValueError("Multiple street URLs found. Please refine your search.")
         else:
-            Exception("Street URL not found.")
+            raise ValueError("Street URL not found.")
     else:
-        Exception("Failed to retrieve search results.")
+        raise ConnectionError(f"Failed to retrieve search results: HTTP {search_response.status_code}")

     return data
```
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/NuneatonBedworthBoroughCouncil.py` around lines 54 - 59, The current error branches instantiate Exception objects but don't raise them, so failures are swallowed; update the error handling in NuneatonBedworthBoroughCouncil.py (the block that checks matches and the else branch for failed search results) to raise the exceptions (e.g., raise Exception("Multiple street URLs found. Please refine your search."), raise Exception("Street URL not found."), and raise Exception("Failed to retrieve search results.")) instead of just calling Exception(...), so the function fails fast and surfaces format or retrieval errors; ensure these raises occur before any return of the empty data dict.
938-946: ⚠️ Potential issue | 🔴 Critical — Multiple critical error-handling bugs: missing `raise`, potential `KeyError`, and `UnboundLocalError`.

- Line 938: If `filename` doesn't match a key in `bin_data`, a `KeyError` is raised with no meaningful message.
- Lines 941, 944: `Exception(...)` without `raise` — errors are silently ignored.
- Line 946: If the response status is not 200 or `download_link` is `None`, `output` is never assigned, causing `UnboundLocalError`.

Based on learnings: prefer explicit failures to ensure format changes are detected immediately.
🐛 Proposed fix
```diff
             output = bin_data[filename]
+            return output
         else:
-            print(bin_day_response.content)
-            Exception("Bin data Download link not found.")
-
-    else:
-        Exception("Failed to retrieve bin data.")
-
-    return output
+            raise ValueError(f"Bin data download link not found on page: {url}")
+    else:
+        raise ConnectionError(f"Failed to retrieve bin data: HTTP {bin_day_response.status_code}")
```

Additionally, consider wrapping the dictionary lookup for a clearer error message:

```python
if filename not in bin_data:
    raise ValueError(f"Unknown calendar filename '{filename}'. Expected one of: {list(bin_data.keys())}")
output = bin_data[filename]
```
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/NuneatonBedworthBoroughCouncil.py` around lines 938 - 946, The current block silently swallows errors and can leave `output` uninitialized; fix it by explicitly raising exceptions and validating lookups: after getting `download_link` check response.status_code == 200 and raise a descriptive exception (including `bin_day_response.content`) if not; ensure `download_link` is not None and raise if missing; replace the bare dict access `output = bin_data[filename]` with an explicit membership check (e.g. if `filename not in bin_data: raise ValueError(f"Unknown calendar filename {filename}; expected {list(bin_data.keys())}")`) then assign `output`; this prevents KeyError and UnboundLocalError and makes failures explicit for `filename`, `bin_data`, `bin_day_response`, `download_link`, and `output`.

uk_bin_collection/uk_bin_collection/councils/BroxbourneCouncil.py (1)
107-111: ⚠️ Potential issue | 🟡 Minor — Year assignment logic may incorrectly assign past dates.
The current logic only handles January correctly. If the current month is December and a parsed date is February-December, the code assigns the current year, placing the date in the past.
For example: In December 2026, parsing "Sat 15 Feb" yields February 2026 (past) instead of February 2027.
🐛 Suggested fix for year rollover
```diff
             collection_date = datetime.strptime(collection_date_text, "%a %d %b")
-            if collection_date.month == 1 and current_month != 1:
+            # If parsed month is before current month, it's likely next year
+            if collection_date.month < current_month:
                 collection_date = collection_date.replace(year=current_year + 1)
             else:
                 collection_date = collection_date.replace(year=current_year)
```
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/BroxbourneCouncil.py` around lines 107 - 111, The year-assignment only handles January rollover and can yield past dates; update the logic in BroxbourneCouncil.py after parsing collection_date_text (where collection_date = datetime.strptime(collection_date_text, "%a %d %b")) so the year is set to current_year + 1 whenever the parsed month is less than the current_month (i.e., collection_date.month < current_month), otherwise use current_year; also consider that if parsed month equals current_month you can keep current_year (or optionally compare day to today to bump to next year), but the primary fix is replacing the existing January-only branch with the generic month comparison using collection_date.month, current_month, and current_year.

uk_bin_collection/uk_bin_collection/councils/PowysCouncil.py (1)
133-145: ⚠️ Potential issue | 🟠 Major — Do not silently skip garden-waste parse failures.

The bare `except: continue` drops data without surfacing format drift, which can hide scraper breakage.

Proposed fix
```diff
         for date in garden_waste_dates:
-            try:
-                dict_data = {
-                    "type": "Garden Waste",
-                    "collectionDate": datetime.strptime(
-                        remove_ordinal_indicator_from_date_string(
-                            date.split(" (")[0]
-                        ),
-                        "%A %d %B %Y",
-                    ).strftime(date_format),
-                }
-                data["bins"].append(dict_data)
-            except:
-                continue
+            cleaned_date = remove_ordinal_indicator_from_date_string(
+                date.split(" (")[0]
+            )
+            try:
+                parsed_date = datetime.strptime(cleaned_date, "%A %d %B %Y")
+            except ValueError as exc:
+                raise ValueError(
+                    f"Unexpected garden waste date format: '{date}'"
+                ) from exc
+
+            data["bins"].append(
+                {
+                    "type": "Garden Waste",
+                    "collectionDate": parsed_date.strftime(date_format),
+                }
+            )
```

Based on learnings: In uk_bin_collection/**/*.py, when parsing council bin collection data, prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/PowysCouncil.py` around lines 133 - 145, The try/except block around building dict_data in PowysCouncil should not silently swallow parse errors; replace the bare except with catching the specific parsing error (ValueError) and re-raise a descriptive exception (or at least log an error) that includes the raw date string and context so format drift is visible. Specifically, change the code around remove_ordinal_indicator_from_date_string(...) and datetime.strptime(..., "%A %d %B %Y") that builds dict_data (and references date_format and data["bins"]) to catch ValueError as e and raise a new ValueError (or logging + raise) that mentions "PowysCouncil", the offending date, and the original exception (use "from e") instead of continuing silently.

uk_bin_collection/uk_bin_collection/councils/EastleighBoroughCouncil.py (1)
80-87: ⚠️ Potential issue | 🟠 Major — Don't return synthetic `"Error"` bins on parser failures.

Lines 80-87 convert hard scraping failures into normal-looking data, which can mask upstream breakages and bad collections output. Prefer raising an exception after logging context.
💡 Suggested fix
```diff
-        except Exception as e:
-            import traceback
-
-            error_message = f"Error fetching/parsing data for Eastleigh: {str(e)}\n{traceback.format_exc()}"
-            print(error_message)
-            # Use the correct date format for the error fallback
-            today = datetime.now().strftime("%d/%m/%Y")
-            return {"bins": [{"type": "Error", "collectionDate": today}]}
+        except Exception as e:
+            raise RuntimeError(
+                f"Error fetching/parsing data for Eastleigh: {e}"
+            ) from e
```

Based on learnings: In uk_bin_collection/**/*.py, prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/EastleighBoroughCouncil.py` around lines 80 - 87, The except block in EastleighBoroughCouncil.py currently catches exceptions and returns a synthetic bin entry (the "Error" bin); instead, log the error context (include traceback) and then re-raise the exception (or raise a descriptive custom exception) so failures surface to callers; update the except Exception as e handler in the EastleighBoroughCouncil class/method where the try/except appears to stop returning {"bins":[{"type":"Error",...}]} and instead log the error (avoid print if a logger is available) and raise the original or a new exception after logging.

uk_bin_collection/uk_bin_collection/councils/HarboroughDistrictCouncil.py (1)
60-87: ⚠️ Potential issue | 🟠 Major — Stop swallowing row-parse failures.

Line 74 and Lines 86-87 currently drop unexpected row formats/date errors and keep going. That can turn a Harborough site change into partial or empty data with no signal. Raise with the offending row text instead of `continue`.

🧪 Suggested fail-fast parsing
```diff
         lis = bin_collection.find_all("li")
         for li in lis:
-            try:
+            row_text = li.get_text(" ", strip=True)
+            try:
                 # Try the new format first (with span.pull-right)
                 date_span = li.find("span", {"class": "pull-right"})
                 if date_span:
                     date_text = date_span.text.strip()
                     date = datetime.strptime(date_text, "%d %B %Y").strftime("%d/%m/%Y")
                     # Extract bin type from the text before the span
-                    bin_type = li.text.replace(date_text, "").strip()
+                    bin_type = row_text.replace(date_text, "").strip()
                 else:
                     # Fall back to old format (regex match)
-                    split = re.match(r"(.+)\s(\d{1,2} \w+ \d{4})$", li.text)
+                    split = re.match(r"(.+)\s(\d{1,2} \w+ \d{4})$", row_text)
                     if not split:
-                        continue
+                        raise ValueError(f"Unexpected Harborough row format: {row_text!r}")
                     bin_type = split.group(1).strip()
                     date = datetime.strptime(
                         split.group(2),
                         "%d %B %Y",
                     ).strftime("%d/%m/%Y")
@@
-            except Exception:
-                continue
+            except Exception as exc:
+                raise ValueError(f"Failed to parse Harborough row: {row_text!r}") from exc
```
uk_bin_collection/**/*.py, prefer explicit failures over silent defaults when parsing council data so format changes surface immediately.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/HarboroughDistrictCouncil.py` around lines 60 - 87, In the for-loop over lis (for li in lis) replace the silent continues on parse failures with an explicit raise that includes the offending li text so failures surface; specifically, when date_span parsing or the regex match fails (the branches using date_span, date_text, split, bin_type, date, dict_data and the append to bindata["bins"]) do not catch Exception and continue — instead let parsing errors bubble or raise a ValueError/RuntimeError containing li.text (or the problematic date_text) so the caller/tester can see which row failed; remove the broad try/except that swallows errors and only catch/rethrow with the li content if you need to add context.

uk_bin_collection/uk_bin_collection/councils/BathAndNorthEastSomersetCouncil.py (1)
60-75: ⚠️ Potential issue | 🟠 Major — Add timeout and status checks before iterating the BathNES payload.

`response.text == ""` is insufficient protection for this endpoint. A 4xx/5xx response, HTML error page, or non-list JSON payload will currently fail with a generic error. The request also has no timeout, and the response is not validated with `raise_for_status()` like nearly all other council integrations in the codebase.

🛠 Suggested hardening
```diff
         response = session.get(
             f"https://api.bathnes.gov.uk/webapi/api/BinsAPI/v2/BartecFeaturesandSchedules/CollectionSummary/{user_uprn}",
             headers=headers,
+            timeout=30,
         )
-        if response.text == "":
+        response.raise_for_status()
+        if not response.text.strip():
             raise ValueError(
                 "Error parsing data. Please check the provided UPRN. "
                 "If this error continues please open an issue on GitHub."
             )
-        json_data = json.loads(response.text)
+        try:
+            json_data = json.loads(response.text)
+        except json.JSONDecodeError as exc:
+            raise ValueError(
+                f"Unexpected non-JSON BathNES response for UPRN {user_uprn}"
+            ) from exc
+        if not isinstance(json_data, list):
+            raise ValueError(
+                f"Unexpected BathNES collection payload for UPRN {user_uprn}"
+            )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/BathAndNorthEastSomersetCouncil.py` around lines 60 - 75, The current GET to the BathNES CollectionSummary endpoint should be hardened: call session.get with a sensible timeout and then call response.raise_for_status() to fail on 4xx/5xx; validate the Content-Type and handle JSONDecodeError when calling json.loads(response.text); ensure the parsed json_data is a list before iterating (raise a clear ValueError if it's not) and preserve the existing error message but include context (e.g., endpoint/UPrn); reference the session.get call, response, json.loads, json_data and the loop that uses datetime.fromisoformat(collection["nextCollectionDate"]) so you update those spots.
🟠 Major comments (21)
uk_bin_collection/uk_bin_collection/councils/HinckleyandBosworthBoroughCouncil.py-31-43 (1)
31-43: ⚠️ Potential issue | 🟠 Major — Fail fast on malformed ICS events instead of silently dropping data.

On Line 32, events without `summary`/`start` are silently ignored, and on Lines 33-38 an empty token (e.g., trailing comma) can produce a blank `type`. This can mask upstream format changes and return incomplete results.

💡 Suggested fix (explicit validation)
```diff
-        for event in sorted(upcoming_events, key=lambda e: e.start):
-            if event.summary and event.start:
-                collections = event.summary.split(",")
-                for collection in collections:
-                    if collection.strip() == "bin collection":
-                        collection = "food waste caddy"
-                    collection = collection.strip().replace(" collection", "")
-                    bindata["bins"].append(
-                        {
-                            "type": collection,
-                            "collectionDate": event.start.date().strftime(date_format),
-                        }
-                    )
+        for event in sorted(upcoming_events, key=lambda e: e.start):
+            if not event.summary or not event.start:
+                raise ValueError("Unexpected ICS event format: missing summary or start")
+
+            collections = [c.strip() for c in event.summary.split(",")]
+            if any(not c for c in collections):
+                raise ValueError(f"Unexpected ICS event summary format: {event.summary!r}")
+
+            for collection in collections:
+                if collection == "bin collection":
+                    collection = "food waste caddy"
+                collection = collection.replace(" collection", "")
+                bindata["bins"].append(
+                    {
+                        "type": collection,
+                        "collectionDate": event.start.date().strftime(date_format),
+                    }
+                )
```

Based on learnings: In uk_bin_collection/**/*.py, when parsing council bin collection data, prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/HinckleyandBosworthBoroughCouncil.py` around lines 31 - 43, Validate each ICS event before processing in the loop over upcoming_events: if an event lacks event.summary or event.start raise an exception (e.g., ValueError) with details about the offending event so we fail fast instead of silently ignoring it; when splitting event.summary into collections filter out empty tokens (after .strip()) and if any token becomes empty or unexpected after replacing " collection" (i.e., results in an empty type) raise an exception mentioning the raw event and date_format rather than appending a blank entry to bindata["bins"]; update the logic around upcoming_events, event.summary, event.start, and bindata["bins"] accordingly so malformed ICS rows surface as errors.

uk_bin_collection/uk_bin_collection/councils/HinckleyandBosworthBoroughCouncil.py-25-26 (1)
25-26: ⚠️ Potential issue | 🟠 Major — Replace wildcard import with explicit imports to resolve F405 linting errors.
Lines 19, 25, 26, 41, and 46 use symbols from the wildcard import that Ruff flags as undefined (F405). Make imports explicit:
Suggested fix
```diff
+from datetime import datetime, timedelta
 from icalevents.icalevents import events
-from uk_bin_collection.uk_bin_collection.common import *
+from uk_bin_collection.uk_bin_collection.common import check_uprn, date_format
 from uk_bin_collection.uk_bin_collection.get_bin_data import AbstractGetBinDataClass
```

This eliminates F405 errors and removes fragility from implicit transitive coupling through wildcard imports.
Also applies to: 41-41, 46-46
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/HinckleyandBosworthBoroughCouncil.py` around lines 25 - 26, The file currently relies on a wildcard import which causes F405 lint errors; replace the star import with explicit imports for the exact symbols used (e.g., import datetime and timedelta for the expressions datetime.now() and timedelta(...), plus the other named symbols referenced on lines 19, 41 and 46). Update the top-level import statement to list those specific symbols (include datetime, timedelta and any classes/functions/constants referenced in this module) instead of using `from ... import *` so Ruff no longer flags undefined names.

uk_bin_collection/uk_bin_collection/councils/WakefieldCityCouncil.py-22-22 (1)
22-22: ⚠️ Potential issue | 🟠 Major — Replace wildcard import with explicit symbols to resolve Ruff F405.

The wildcard import at line 5 causes Ruff to report unresolved symbols when `create_webdriver` is used at line 22. The file also uses `date_format` from the common module at lines 62, 86, and 106. Replace the wildcard import with both explicit symbols:

Suggested fix
```diff
-from uk_bin_collection.uk_bin_collection.common import *
+from uk_bin_collection.uk_bin_collection.common import create_webdriver, date_format
```
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/WakefieldCityCouncil.py` at line 22, Replace the wildcard import from the common module with explicit imports for the symbols actually used: import create_webdriver and date_format instead of "from ...common import *"; update the top of WakefieldCityCouncil.py to import create_webdriver (used where driver = create_webdriver(...)) and date_format (used at the calls around lines referenced) so Ruff F405 is resolved and only the required symbols are imported.

uk_bin_collection/uk_bin_collection/councils/EastleighBoroughCouncil.py-27-27 (1)
27-27: ⚠️ Potential issue | 🟠 Major — Replace wildcard import with explicit imports to avoid Ruff F405 warnings.

Line 6 uses a wildcard import that makes the code lint-fragile. Explicitly importing only what is used (`create_webdriver`, `check_uprn`, `date_format`) prevents undefined name errors and makes dependencies clear.

💡 Suggested fix
```diff
+from datetime import datetime
 from bs4 import BeautifulSoup
 from selenium.webdriver.common.by import By
 from selenium.webdriver.support import expected_conditions as EC
 from selenium.webdriver.support.ui import Select, WebDriverWait
-from uk_bin_collection.uk_bin_collection.common import *
+from uk_bin_collection.uk_bin_collection.common import (
+    check_uprn,
+    create_webdriver,
+    date_format,
+)
```
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/EastleighBoroughCouncil.py` at line 27, Replace the wildcard import on line 6 with explicit imports to avoid Ruff F405: instead of "from ... import *" import only the used symbols (create_webdriver, check_uprn, date_format). Update the import statement that currently provides those utilities so it explicitly lists create_webdriver, check_uprn, date_format; this will remove the undefined-name/lint warnings and make dependencies for EastleighBoroughCouncil clearer.

uk_bin_collection/uk_bin_collection/councils/BromleyBoroughCouncil.py-35-35 (1)
35-35: ⚠️ Potential issue | 🟠 Major — Import `create_webdriver` and `remove_ordinal_indicator_from_date_string` explicitly instead of relying on `*` import.

Line 35 calls `create_webdriver` (and line 68 calls `remove_ordinal_indicator_from_date_string`), but both are only resolved via `from uk_bin_collection.uk_bin_collection.common import *`, which is what Ruff F405 is flagging and can block CI.

💡 Suggested fix
```diff
-from uk_bin_collection.uk_bin_collection.common import *
+from uk_bin_collection.uk_bin_collection.common import (
+    create_webdriver,
+    remove_ordinal_indicator_from_date_string,
+)
```
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/BromleyBoroughCouncil.py` at line 35, Replace the wildcard import that brings in common helpers with explicit imports: import create_webdriver and remove_ordinal_indicator_from_date_string from the common module instead of using from ...common import *; update the import statement used by BromleyBoroughCouncil.py (the module that calls create_webdriver and remove_ordinal_indicator_from_date_string) to explicitly import those two symbols so Ruff F405 is resolved and CI no longer flags the star import.

uk_bin_collection/uk_bin_collection/councils/HarboroughDistrictCouncil.py-29-37 (1)
29-37: ⚠️ Potential issue | 🟠 Major — Add explicit timeouts to the FCC session requests.

Lines 31–37 add two outbound HTTP calls without timeouts. This can cause the scrape to hang indefinitely on a stalled endpoint. Both `session.get()` and `session.post()` should include explicit `timeout` parameters.

Note: The `verify=False` bypass and the module-level `urllib3.disable_warnings()` were intentionally added in #1831 to resolve connectivity issues with this endpoint and should remain.

🔧 Add timeouts
```diff
         timeout = (10, 30)
         session = requests.session()
         response = session.get(
-            URI1, verify=False
+            URI1, verify=False, timeout=timeout
         )  # Initialize session state (cookies) required by URI2
         response.raise_for_status()  # Validate session initialization
         params = {"Uprn": user_uprn}
-        response = session.post(URI2, data=params, verify=False)
+        response = session.post(URI2, data=params, verify=False, timeout=timeout)
```
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/HarboroughDistrictCouncil.py` around lines 29 - 37, The two outbound requests using session.get(URI1, verify=False) and session.post(URI2, data=params, verify=False) in HarboroughDistrictCouncil (the block that initializes session state and posts the Uprn) must include explicit timeout values to avoid hanging; update the session.get and session.post calls to add a timeout argument (e.g., timeout=10 or a connect/read tuple like timeout=(5,30)) while keeping verify=False and the existing session usage intact so session state (cookies) remains initialized before the post.

uk_bin_collection/uk_bin_collection/councils/CumberlandCouncil.py-34-34 (1)
34-34: ⚠️ Potential issue | 🟠 Major — Filter list items to those with expected structure and add explicit validation.

The current code assumes every `<li>` element has the expected span/time structure. An unrelated `<li>` will cause cryptic `AttributeError` or `TypeError` instead of a diagnostic error.
🔧 Proposed fix
```diff
-        lis = content_region.find_all("li")
-        for li in lis:
+        rows = [
+            li
+            for li in content_region.find_all("li")
+            if li.find("span", class_="waste-collection__day--day")
+        ]
+        if not rows:
+            raise ValueError("No waste collection rows found in Cumberland response")
+
+        for li in rows:
             collection_day = li.find("span", class_="waste-collection__day--day")
             collection_type_str = li.find("span", class_="waste-collection__day--type")
-
-            collection_date = collection_day.find("time")["datetime"]
+            time_tag = collection_day.find("time") if collection_day else None
+            if not collection_type_str or not time_tag or "datetime" not in time_tag.attrs:
+                raise ValueError("Unexpected waste collection row structure in Cumberland response")
+            collection_date = time_tag["datetime"]
```
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/CumberlandCouncil.py` at line 34, The code currently sets lis = content_region.find_all("li") and then assumes each li has the expected span/time structure; change this to only collect list items that contain the required children (e.g., a span with the expected class and a time element) by replacing the naive find_all with a filtered list comprehension or generator that checks li.find("span", class_=...) and li.find("time") exist, then add explicit validation before accessing attributes in the parsing function (raise or log a clear ValueError mentioning the offending li when structure is missing); reference the variable lis and content_region.find_all("li") and update the parsing code that consumes lis to handle missing fields gracefully.

uk_bin_collection/uk_bin_collection/councils/CumberlandCouncil.py-1-1 (1)
1-1: ⚠️ Potential issue | 🟠 Major — Replace star import with explicit imports to resolve Ruff F405 flags.

This parser uses `datetime.strptime()` at lines 43 and 53, and `date_format` at lines 47 and 53, along with `check_uprn()` at line 19 — all currently sourced via the star import from `common`. The imported `date` at line 1 is unused. Ruff correctly flags the undefined `datetime` and `date_format` as F405.

Proposed fix
```diff
-from datetime import date
+from datetime import datetime
 import requests
 from bs4 import BeautifulSoup
-from uk_bin_collection.uk_bin_collection.common import *
+from uk_bin_collection.uk_bin_collection.common import check_uprn, date_format
 from uk_bin_collection.uk_bin_collection.get_bin_data import AbstractGetBinDataClass
```
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/CumberlandCouncil.py` at line 1, Remove the unused "from datetime import date" and the star import from common; instead import only the symbols you use: add "from datetime import datetime" to allow calls to datetime.strptime, and import date_format and check_uprn from common (e.g., from common import date_format, check_uprn) so references to date_format (lines ~47/53) and check_uprn (line ~19) are defined and Ruff F405 warnings are resolved.

pyproject.toml-27-27 (1)
27-27: ⚠️ Potential issue | 🟠 Major — Regenerate and commit `poetry.lock` after this dependency-section migration.

The pipeline is already failing because the lock file no longer matches `pyproject.toml` after Line 27's section change.

✅ Required fix
```bash
poetry lock --no-update
git add poetry.lock
```
Verify each finding against the current code and only fix it if needed. In `@pyproject.toml` at line 27, The CI is failing because poetry.lock is out of sync with the pyproject.toml change to the [tool.poetry.group.dev.dependencies] section; regenerate the lockfile by running the lock command (e.g., use poetry lock --no-update), then stage and commit the updated poetry.lock so it matches pyproject.toml and fixes the pipeline.

uk_bin_collection/uk_bin_collection/councils/MertonCouncil.py-115-127 (1)
115-127: ⚠️ Potential issue | 🟠 Major — Add explicit DOM guards before dereferencing parsed nodes.

Line 118 and Line 126 can crash with `AttributeError` if the council DOM shifts. Raise explicit parser errors with clear messages instead.

🔧 Suggested fix
```diff
         govuk_grid_column_two_thirds = soup.find(
             "div", class_="govuk-grid-column-two-thirds"
         )
+        if govuk_grid_column_two_thirds is None:
+            raise ValueError("Expected Merton container 'govuk-grid-column-two-thirds' not found")
+
         waste_service_grids = govuk_grid_column_two_thirds.find_all(
             "div", class_="waste-service-grid"
         )
+        if not waste_service_grids:
+            raise ValueError("No waste service grids found in Merton response")
         for waste_service_grid in waste_service_grids:
             h3 = waste_service_grid.find("h3", class_="waste-service-name")
+            if h3 is None:
+                raise ValueError("Missing waste service name header in Merton response")
             bin_type = h3.get_text().strip()
```

Based on learnings: prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/MertonCouncil.py` around lines 115 - 127, The code dereferences parsed nodes without guarding for missing elements; add explicit DOM guards and raise clear parser exceptions when expectations fail: check that govuk_grid_column_two_thirds (result of soup.find("div", class_="govuk-grid-column-two-thirds")) is not None before calling .find_all and raise a ParserError (or ValueError) with a descriptive message if missing; similarly, inside the loop ensure waste_service_grid.find("h3", class_="waste-service-name") returns a non-None h3 before calling .get_text(), and raise a parser error naming the missing element (e.g., "missing waste-service-name h3 in waste_service_grid") so downstream callers fail fast and with useful diagnostics.

uk_bin_collection/uk_bin_collection/councils/LeedsCityCouncil.py-33-35 (1)
33-35: ⚠️ Potential issue | 🟠 Major — Do not hardcode the subscription key in source code.
Line 34 embeds an API key directly in code. Move it to runtime configuration (env var/secret store) and fail explicitly if it is missing.
🔧 Suggested fix
```diff
+import os
 ...
-        headers = {
-            "ocp-apim-subscription-key": "ad8dd80444fe45fcad376f82cf9a5ab4",
+        subscription_key = os.environ["LEEDS_WASTE_API_SUBSCRIPTION_KEY"]
+        headers = {
+            "ocp-apim-subscription-key": subscription_key,
             "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/145.0.0.0 Safari/537.36",
         }
```
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/LeedsCityCouncil.py` around lines 33 - 35, The subscription key is hardcoded in the headers dict inside LeedsCityCouncil.py (the "ocp-apim-subscription-key" value); change this to read from a runtime configuration (e.g., an environment variable like LEEDS_API_SUBSCRIPTION_KEY or a secrets store) when building headers, and make the code raise an explicit error (or exit) if that variable is missing or empty so the process fails fast rather than using a baked-in secret.

uk_bin_collection/tests/input.json-1851-1859 (1)
1851-1859: ⚠️ Potential issue | 🟠 Major — `NorthWarwickshireBoroughCouncil` LAD code appears wrong.

Line 1858 sets `LAD24CD` to `E07000220`, which is already used by Rugby. North Warwickshire should be `E07000218`.

🔧 Suggested fix
- "LAD24CD": "E07000220" + "LAD24CD": "E07000218"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/tests/input.json` around lines 1851 - 1859, The LAD24CD for the NorthWarwickshireBoroughCouncil entry is incorrect (currently "E07000220" which is Rugby); update the NorthWarwickshireBoroughCouncil object's LAD24CD value to the correct code "E07000218" so it no longer conflicts with Rugby, ensuring you edit the "NorthWarwickshireBoroughCouncil" JSON object and replace the existing LAD24CD string with "E07000218".

uk_bin_collection/uk_bin_collection/councils/SwaleBoroughCouncil.py-49-56 (1)
49-56: ⚠️ Potential issue | 🟠 Major — Do not swallow page-load/parser failures in this block.

Line 54 catches `Exception` and only prints, then execution continues. Fail fast here with a typed exception so site-format changes are surfaced immediately.

🔧 Suggested fix
```diff
+from selenium.common.exceptions import TimeoutException
 ...
-        except Exception:
-            print("Page failed to load. Probably due to Cloudflare robot check!")
+        except TimeoutException as exc:
+            raise RuntimeError(
+                "Swale page did not expose postcode field; selector/layout or bot protection likely changed."
+            ) from exc
```

Based on learnings: prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/SwaleBoroughCouncil.py` around lines 49 - 56, The current try/except around WebDriverWait and inputElement_postcode.send_keys swallows all Exceptions and only prints a message; change this to catch a specific exception (e.g., selenium.common.exceptions.TimeoutException or NoSuchElementException) and re-raise or raise a new descriptive RuntimeError so failures surface immediately; update the except block in the block containing WebDriverWait(driver, 10).until(...), inputElement_postcode and send_keys to stop swallowing errors — log if needed, then raise a typed exception with context (include user_postcode and the element id "q499089_q1") so site-format changes are not silently ignored.

uk_bin_collection/uk_bin_collection/councils/NewhamCouncil.py-22-23 (1)
22-23: ⚠️ Potential issue | 🟠 Major — Add timeout and HTTP status checking to the request.

Line 22 makes an HTTP request without timeout or status validation, allowing silent failures and potential hangs. While `verify=False` is used across multiple council parsers in the codebase (likely for SSL cert workarounds), the request must call `raise_for_status()` after the request to enforce explicit failure on HTTP errors.

🔧 Suggested fix
```diff
-        page = requests.get(url, verify=False)
+        page = requests.get(url, verify=False, timeout=30)
+        page.raise_for_status()
```
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/NewhamCouncil.py` around lines 22 - 23, The HTTP request in NewhamCouncil.py uses requests.get(url, verify=False) without timeout or status checking; update the call to include a sensible timeout and immediately call response.raise_for_status() on the returned object (the variable currently named page) before parsing with BeautifulSoup to ensure HTTP errors raise exceptions; if there is existing exception handling around this code, keep it and ensure it catches requests.RequestException to handle timeouts and HTTP errors from raise_for_status().

uk_bin_collection/uk_bin_collection/councils/LeedsCityCouncil.py-41-45 (1)
41-45: ⚠️ Potential issue | 🟠 Major — Harden this API request and remove payload logging.

Line 41 has no timeout/status check, Line 43 prints raw response content (potentially sensitive data), and Line 34 exposes the API key in source code. Use explicit error handling with `raise_for_status()` and `response.json()`, and move credentials to environment variables.

🔧 Suggested fix
```diff
-        response = requests.get(URI, params=params, headers=headers)
-
-        print(response.content)
-
-        collections = json.loads(response.content)
+        response = requests.get(URI, params=params, headers=headers, timeout=30)
+        response.raise_for_status()
+        collections = response.json()
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/LeedsCityCouncil.py` around lines 41 - 45, The API call in LeedsCityCouncil (around the requests.get call that uses URI, params, headers and the hardcoded subscription key) must be hardened: remove the debug print(response.content); replace json.loads(response.content) with response.json(); call response.raise_for_status() after the get and wrap the request in a try/except catching requests.exceptions.RequestException to surface/log errors; add an explicit timeout argument to requests.get; and move the hardcoded subscription key into an environment variable (read it via os.environ) and inject it into headers rather than keeping the literal string in source.

uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py-35-60 (1)
35-60: ⚠️ Potential issue | 🟠 Major — Don't treat an empty scrape as success.

If the selector drifts and `find_all("a")` yields nothing, this parser currently returns `{"bins": []}` as if it succeeded. Please raise once no collections were extracted.

Suggested change
```diff
             dict_data = {
                 "type": collection_type,
                 "collectionDate": collection_day,
             }
             bindata["bins"].append(dict_data)

+        if not bindata["bins"]:
+            raise RuntimeError("LBHF: no bin data found in page HTML.")
+
         bindata["bins"].sort(
             key=lambda x: datetime.strptime(x.get("collectionDate"), "%d/%m/%Y")
         )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py` around lines 35 - 60, The parser currently treats an empty result from ol.find_all("a") as success; after computing bin_collections (the result of ol.find_all("a") in the function that builds bindata) add an explicit failure: if bin_collections is empty or None, raise an exception (e.g., ValueError or a custom ParseError) with a clear message that no bin collections were extracted so the caller/test harness can surface selector drift; do this check before the for loop that populates bindata["bins"] and only proceed to sort and return when collections were found.

uk_bin_collection/uk_bin_collection/councils/LondonBoroughOfRichmondUponThames.py-25-34 (1)
25-34: ⚠️ Potential issue | 🟠 Major — Use the parsed PID instead of a raw substring check.

This branch currently treats any URL containing `pid=` as authoritative, even when the value is empty or malformed, so a bad `base_url` can ignore a valid `pid`/`paon` fallback and fetch the wrong page.

Suggested change
- if "pid=" in (base_url or ""): + if pid_from_url: target_url = base_url elif pid_arg or pid_from_paon: pid = pid_arg or pid_from_paon sep = "&" if "?" in (base_url or "") else "?"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/LondonBoroughOfRichmondUponThames.py` around lines 25 - 34, The branch that currently checks "pid=" in base_url should instead use the parsed pid_from_url (returned by _pid_from_url) to determine if a valid PID is present; replace the raw substring check with a truthy check of pid_from_url and, if present and valid, use base_url as target_url, otherwise fall back to pid_arg or pid_from_paon (from _pid_from_paon) to construct the target_url with the proper separator logic; ensure references to pid_from_url, pid_from_paon, pid_arg, _pid_from_url, _pid_from_paon, and target_url are used so the code prefers a parsed PID over a mere "pid=" substring.

uk_bin_collection/uk_bin_collection/councils/NorthWarwickshireBoroughCouncil.py-162-195 (1)
162-195: ⚠️ Potential issue | 🟠 Major — Raise if all lookup calls produce no bins.
Right now auth/API regressions can fall through every stream block and return an apparently valid empty payload. Please fail fast once aggregation produced nothing.
Suggested change
```diff
         if garden:
             for key, item in garden.items():
                 dict_data = {
                     "type": item["JobName"].strip(),
                     "collectionDate": item["Date"],
                 }
                 bindata["bins"].append(dict_data)

+        if not bindata["bins"]:
+            raise RuntimeError("North Warwickshire: no bin data returned from API.")
+
         bindata["bins"].sort(
             key=lambda x: datetime.strptime(x.get("collectionDate"), date_format)
         )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/NorthWarwickshireBoroughCouncil.py` around lines 162 - 195, After aggregating into bindata['bins'] (from food_waste, refuse, recycling, garden), add a fail-fast check immediately before the sort/return: if bindata["bins"] is empty, raise a clear exception (e.g., ValueError or a CouncilDataError) with a message indicating no bins were returned for NorthWarwickshireBoroughCouncil; keep the check between the aggregation loop and the existing bindata["bins"].sort(...) so missing data fails fast and prevents returning an empty payload.

uk_bin_collection/uk_bin_collection/councils/NorthWarwickshireBoroughCouncil.py-35-35 (1)
35-35: ⚠️ Potential issue | 🟠 Major — Add timeouts to every Achieve call.

Lines 35, 60, 102, 120, 138, and 156 all make network requests via the requests library without specifying a timeout. Without explicit timeouts, a slow or unresponsive upstream server will cause the entire scrape to hang indefinitely. Add timeout parameters (e.g., `timeout=10`) to each `s.get()` and `s.post()` call.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/NorthWarwickshireBoroughCouncil.py` at line 35, Network requests in NorthWarwickshireBoroughCouncil.py use requests.Session methods without timeouts (e.g., the call r = s.get(SESSION_URL) and the other s.get()/s.post() calls at the same file); update every s.get(...) and s.post(...) invocation in this module to include a timeout parameter (for example timeout=10) so each network call (including the one using SESSION_URL and the calls around lines 60, 102, 120, 138, 156) becomes s.get(..., timeout=10) or s.post(..., timeout=10).

uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py-28-32 (1)
28-32: ⚠️ Potential issue | 🟠 Major — Add a timeout to the LBHF request.

`session.get()` lacks a timeout parameter, risking indefinite hangs that block the entire scrape operation.

Suggested change
```diff
-        response = session.get(URI)
+        response = session.get(URI, timeout=30)
```
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py` around lines 28 - 32, The session.get(URI) call in LondonBoroughHammersmithandFulham (where session is created and headers set) has no timeout and can hang; update the call to pass a timeout parameter (e.g., session.get(URI, timeout=10) or use an existing DEFAULT_TIMEOUT constant if one exists) so the request will fail fast, and keep the existing response.raise_for_status() logic unchanged; ensure any surrounding exception handling will catch requests.exceptions.Timeout if needed.

uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py-6-7 (1)
6-7: ⚠️ Potential issue | 🟠 Major — Replace the star import with explicit imports before merging.

The star import at line 6 causes F403/F405 lint violations in Ruff and obscures the module's dependencies. The file uses four symbols from `common`: `check_postcode`, `date_format`, `days_of_week`, and `get_next_day_of_week`. Replace with explicit imports to resolve linting failures.

Suggested change
```diff
-from uk_bin_collection.uk_bin_collection.common import *
+from uk_bin_collection.uk_bin_collection.common import (
+    check_postcode,
+    date_format,
+    days_of_week,
+    get_next_day_of_week,
+)
```
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py` around lines 6 - 7, Replace the star import from uk_bin_collection.uk_bin_collection.common with explicit imports to fix lint F403/F405: import check_postcode, date_format, days_of_week, and get_next_day_of_week instead of using *. Update the top of LondonBoroughHammersmithandFulham.py where common is imported and keep the existing import of AbstractGetBinDataClass unchanged so the class and functions used in this file resolve without wildcard imports.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b2d6c013-1c4a-4248-aabd-c391b68dad66
⛔ Files ignored due to path filters (1)
`poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (30)
- .github/workflows/ha_compatibility_test.yml
- .github/workflows/release.yml
- pyproject.toml
- uk_bin_collection/tests/input.json
- uk_bin_collection/uk_bin_collection/councils/BarkingDagenham.py
- uk_bin_collection/uk_bin_collection/councils/BathAndNorthEastSomersetCouncil.py
- uk_bin_collection/uk_bin_collection/councils/BromleyBoroughCouncil.py
- uk_bin_collection/uk_bin_collection/councils/BroxbourneCouncil.py
- uk_bin_collection/uk_bin_collection/councils/BroxtoweBoroughCouncil.py
- uk_bin_collection/uk_bin_collection/councils/CumberlandCouncil.py
- uk_bin_collection/uk_bin_collection/councils/EastleighBoroughCouncil.py
- uk_bin_collection/uk_bin_collection/councils/HarboroughDistrictCouncil.py
- uk_bin_collection/uk_bin_collection/councils/HinckleyandBosworthBoroughCouncil.py
- uk_bin_collection/uk_bin_collection/councils/LeedsCityCouncil.py
- uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py
- uk_bin_collection/uk_bin_collection/councils/LondonBoroughHavering.py
- uk_bin_collection/uk_bin_collection/councils/LondonBoroughOfRichmondUponThames.py
- uk_bin_collection/uk_bin_collection/councils/LondonBoroughRedbridge.py
- uk_bin_collection/uk_bin_collection/councils/MertonCouncil.py
- uk_bin_collection/uk_bin_collection/councils/MidSuffolkDistrictCouncil.py
- uk_bin_collection/uk_bin_collection/councils/MidlothianCouncil.py
- uk_bin_collection/uk_bin_collection/councils/NewhamCouncil.py
- uk_bin_collection/uk_bin_collection/councils/NorthEastDerbyshireDistrictCouncil.py
- uk_bin_collection/uk_bin_collection/councils/NorthWarwickshireBoroughCouncil.py
- uk_bin_collection/uk_bin_collection/councils/NuneatonBedworthBoroughCouncil.py
- uk_bin_collection/uk_bin_collection/councils/PowysCouncil.py
- uk_bin_collection/uk_bin_collection/councils/RedcarandClevelandCouncil.py
- uk_bin_collection/uk_bin_collection/councils/SwaleBoroughCouncil.py
- uk_bin_collection/uk_bin_collection/councils/WakefieldCityCouncil.py
- wiki/Councils.md
| URI = "https://api-prd.havering.gov.uk" | ||
| endpoint = f"{URI}/whitespace/GetCollectionByUprnAndDate" | ||
| subscription_key = "2ea6a75f9ea34bb58d299a0c9f84e72e" | ||
| subscription_key = "545bcf53c9094dfd980dd9da72b0514d" |
Remove the hardcoded subscription key and fail fast on missing secret.
Line 26 introduces a plaintext API credential in source control, which is a release-blocking security issue. Load it from environment/secret storage and raise an explicit error when it is absent.
🔐 Proposed fix
```diff
+import os
 import time
 import requests
 from bs4 import BeautifulSoup
@@
-    URI = "https://api-prd.havering.gov.uk"
+    URI = "https://api-prd.havering.gov.uk"
     endpoint = f"{URI}/whitespace/GetCollectionByUprnAndDate"
-    subscription_key = "545bcf53c9094dfd980dd9da72b0514d"
+    subscription_key = os.getenv("HAVERING_API_SUBSCRIPTION_KEY")
+    if not subscription_key:
+        raise RuntimeError("Missing HAVERING_API_SUBSCRIPTION_KEY")
```

Based on learnings: In uk_bin_collection/**/*.py, prefer explicit failures over silent behavior so upstream changes/issues are detected immediately.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
URI = "https://api-prd.havering.gov.uk"
endpoint = f"{URI}/whitespace/GetCollectionByUprnAndDate"
subscription_key = os.getenv("HAVERING_API_SUBSCRIPTION_KEY")
if not subscription_key:
    raise RuntimeError("Missing HAVERING_API_SUBSCRIPTION_KEY")
```
🧰 Tools
🪛 Gitleaks (8.30.0)
[high] 26-26: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@uk_bin_collection/uk_bin_collection/councils/LondonBoroughHavering.py` around
lines 24 - 26, Replace the hardcoded subscription_key with a runtime secret
lookup and fail fast: in the LondonBoroughHavering class/module replace the
literal subscription_key variable with code that reads the key from
environment/secret storage (e.g. using os.environ or your project's secrets
loader) and if the key is missing raise an explicit error/exception immediately
so initialization fails loudly; keep the existing URI and endpoint variables
(URI, endpoint) unchanged and reference the same subscription_key symbol so
callers that use subscription_key continue to work.
…across 20 more council scrapers
Combined release merging 15 PRs into a single release branch.
Fix
Feat
Chore
Closes
Closes #1836, closes #1831, closes #1844, closes #1846, closes #1845, closes #1851, closes #1853, closes #1848, closes #1855, closes #1858, closes #1861, closes #1864, closes #1863, closes #1867, closes #1870, closes #1872, closes #1868, closes #1876, closes #1879, closes #1880, closes #1504, closes #1869
PRs Included
#1841, #1843, #1847, #1849, #1852, #1854, #1856, #1857, #1860, #1866, #1871, #1873, #1877, #1878, #1882
Merge Conflicts Resolved
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Documentation