fix: Cumberland Council by robbrad · Pull Request #1764 · robbrad/UKBinCollectionData

robbrad · 2025-12-08T00:35:34Z

Summary by CodeRabbit

Bug Fixes

Cumberland Council bin collection data now retrieves from the official council schedule page.
Postcode field is no longer required for Cumberland Council queries.

coderabbitai · 2025-12-08T00:35:43Z

Walkthrough

Updated CumberlandCouncil integration from form-based bin collection retrieval to direct GET requests. Removed postcode from test data and simplified URL to official schedule page. Replaced form token parsing with direct HTML content extraction and line-based bin schedule parsing.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CumberlandCouncil Test Data `uk_bin_collection/tests/input.json` | Removed postcode field and updated URL from renderform endpoint to official Cumberland bin-collection schedule page; UPRN identifier preserved. |
| CumberlandCouncil Implementation `uk_bin_collection/uk_bin_collection/councils/CumberlandCouncil.py` | Refactored data retrieval from multi-step form-based POST/GET sequence to direct GET using UPRN. Replaced form token extraction with simplified HTML parsing targeting `lgd-region--content` div. Introduced month/year context detection and line-based bin type/date mapping. |
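
The "line-based bin type/date mapping" described above could look roughly like the following. This is a minimal sketch, not the code from the PR: the line layout (a month-name line followed by lines such as `12 Refuse`), the function name `parse_schedule`, and the `type` key are assumptions made for illustration. Only `collectionDate` and the `%d/%m/%Y` format appear in the review itself.

```python
from datetime import datetime

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def parse_schedule(lines, year):
    """Turn month-header lines and 'day bin-type' lines into bin entries.

    Assumed (hypothetical) layout: a line holding only a month name sets
    the context for the following lines, each shaped like '12 Refuse'.
    """
    bins = []
    current_month = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line in MONTHS:
            current_month = line
            continue
        if current_month is None:
            continue  # Ignore anything before the first month header.
        day, _, bin_type = line.partition(" ")
        if not day.isdigit() or not bin_type:
            continue  # Not a 'day bin-type' line.
        try:
            when = datetime(year, MONTHS.index(current_month) + 1, int(day))
        except ValueError:
            continue  # Impossible date such as 30 February.
        bins.append(
            {"type": bin_type.strip(), "collectionDate": when.strftime("%d/%m/%Y")}
        )
    return bins
```

The real page text may be laid out differently, so the line-matching rules would need checking against the live `lgd-region--content` output.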

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Parsing logic changes: New month/year heuristic (2026 context) and line-based tokenization require validation against expected output
HTML structure dependency: Direct reliance on lgd-region--content class selector may be fragile; verify with council's current page structure
Test coverage: Confirm test data (input.json) aligns with parsing expectations and covers edge cases

Suggested reviewers

dp247

Poem

🐰 A form-based fetch now becomes a sprint,
Direct HTML parsing leaves no hint,
The UPRN guides us down the street,
Where bin schedules and logic sweetly meet!
No tokens or tokens, just content so neat! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |
| Title check | ❓ Inconclusive | The title 'fix: Cumberland Council' is vague and does not clearly specify what aspect of Cumberland Council functionality is being fixed or improved. | Consider using a more specific title that describes the actual fix, such as 'fix: Replace form-based bin collection retrieval with direct UPRN lookup for Cumberland Council' or 'fix: Simplify Cumberland Council bin collection parsing logic'. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

codecov · 2025-12-08T00:38:08Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.79%. Comparing base (7970654) to head (5849bac).
⚠️ Report is 14 commits behind head on master.
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #1764   +/-   ##
=======================================
  Coverage   86.79%   86.79%           
=======================================
  Files           9        9           
  Lines        1136     1136           
=======================================
  Hits          986      986           
  Misses        150      150

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

uk_bin_collection/uk_bin_collection/councils/CumberlandCouncil.py (2)
79-80: Silent exception swallowing may hide parsing failures.

Catching ValueError with a bare pass means malformed dates fail silently. Consider logging to aid debugging when collection dates aren't being captured correctly.
                     except ValueError:
-                        pass
+                        # Log or handle malformed date gracefully
+                        pass  # Consider: logging.warning(f"Failed to parse date: {date_str}")
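
A minimal sketch of what logging the failure instead of passing silently might look like. The helper name `parse_collection_date` and the constant `DATE_FORMAT` are assumptions for illustration; `DATE_FORMAT` stands in for the project's `date_format`, whose value is assumed to match the `"%d/%m/%Y"` string hardcoded in the sort call.

```python
import logging
from datetime import datetime

logger = logging.getLogger(__name__)

# Stand-in for the project's shared date_format (assumed value).
DATE_FORMAT = "%d/%m/%Y"


def parse_collection_date(date_str, fmt=DATE_FORMAT):
    """Return a datetime for date_str, or None after logging a warning."""
    try:
        return datetime.strptime(date_str, fmt)
    except ValueError:
        # The malformed value is recorded so missing collections can be traced.
        logger.warning("Failed to parse date: %r", date_str)
        return None
```

Callers would then skip `None` results rather than appending them to the bin list.
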
88-90: Use consistent date format reference.

Line 76 uses date_format variable, but line 89 hardcodes "%d/%m/%Y". For maintainability, use the same format constant in both places.
         bindata["bins"].sort(
-            key=lambda x: datetime.strptime(x.get("collectionDate"), "%d/%m/%Y")
+            key=lambda x: datetime.strptime(x.get("collectionDate"), date_format)
         )
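
As a small illustration of the point above, the same constant can drive both formatting and sorting. The sample entries and the `type` key are made up for the example, and `DATE_FORMAT` is again a stand-in for the project's `date_format`. Parsing before sorting matters because `dd/mm/yyyy` strings do not sort chronologically as plain text.

```python
from datetime import datetime

# Stand-in for the project's shared date_format (assumed value).
DATE_FORMAT = "%d/%m/%Y"

# Hypothetical sample data; as plain strings, "05/01/2026" sorts before "22/12/2025".
bindata = {
    "bins": [
        {"type": "Refuse", "collectionDate": "05/01/2026"},
        {"type": "Recycling", "collectionDate": "22/12/2025"},
    ]
}

# Sort chronologically using the same format constant used to write the dates.
bindata["bins"].sort(
    key=lambda b: datetime.strptime(b["collectionDate"], DATE_FORMAT)
)
```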

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a0a7f11 and 5849bac.

📒 Files selected for processing (2)

uk_bin_collection/tests/input.json (1 hunks)
uk_bin_collection/uk_bin_collection/councils/CumberlandCouncil.py (1 hunks)

🧰 Additional context used

🪛 Ruff (0.14.7)

uk_bin_collection/uk_bin_collection/councils/CumberlandCouncil.py

24-24: Probable use of requests call without timeout (S113)
54-54: datetime may be undefined, or defined from star imports (F405)
72-72: datetime may be undefined, or defined from star imports (F405)
76-76: date_format may be undefined, or defined from star imports (F405)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Run Integration Tests (3.12, 1.8.4)
GitHub Check: Run Unit Tests (3.12, 1.8.4)

🔇 Additional comments (1)

uk_bin_collection/tests/input.json (1)

629-635: Approved: CumberlandCouncil test data updated correctly for direct-GET approach.

The test configuration update correctly reflects the refactored CumberlandCouncil integration:

URL simplified to the official schedule page (line 631)

Postcode field removed (aligns with UPRN-only direct GET approach)

UPRN retained (required for new parsing logic)

Wiki guidance unchanged (still directs to FindMyAddress)

To confirm the changes work end-to-end, verify that the new URL structure at line 631 returns expected HTML content when accessed with the UPRN (10009457328) and that the parser implementation in CumberlandCouncil.py correctly extracts bin schedule data from the page's div.lgd-region--content element (as noted in the summary).

coderabbitai · 2025-12-08T00:38:43Z

+        response = requests.get(url)
+        soup = BeautifulSoup(response.content, "html.parser")


⚠️ Potential issue | 🟠 Major

Add timeout and error handling for the HTTP request.

The requests.get() call lacks a timeout, which can cause indefinite hangs if the server is unresponsive. Additionally, there's no error handling for failed requests.

-        response = requests.get(url)
-        soup = BeautifulSoup(response.content, "html.parser")
+        response = requests.get(url, timeout=30)
+        response.raise_for_status()
+        soup = BeautifulSoup(response.content, "html.parser")

If you prefer graceful degradation instead of raising on error:

        response = requests.get(url, timeout=30)
        if not response.ok:
            return bindata
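
The two options above can be sketched as small helpers. The function names are hypothetical, and `session` is assumed to be something like a `requests.Session()` (any object exposing `get(url, timeout=...)` and returning a response with `ok`, `content`, and `raise_for_status()`). Passing the session in, rather than calling `requests.get` directly, is a choice made here so the sketch can be exercised without network access.

```python
def fetch_schedule_html(session, url, timeout=30):
    """Fetch a page, raising on a timeout or a non-2xx response."""
    response = session.get(url, timeout=timeout)
    response.raise_for_status()
    return response.content


def fetch_or_none(session, url, timeout=30):
    """Fetch a page, returning None on a non-2xx response (graceful variant)."""
    response = session.get(url, timeout=timeout)
    return response.content if response.ok else None
```

With the real library this would be called as `fetch_schedule_html(requests.Session(), url)`. Connection errors and timeouts still propagate as exceptions in both variants, so a caller wanting full graceful degradation would also need to catch `requests.exceptions.RequestException`.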

🧰 Tools

🪛 Ruff (0.14.7)

24-24: Probable use of requests call without timeout (S113)

🤖 Prompt for AI Agents

In uk_bin_collection/uk_bin_collection/councils/CumberlandCouncil.py around lines 24-25 the requests.get call has no timeout and no error handling; update the code to call requests.get(url, timeout=30) (or another sensible timeout), wrap the request in a try/except to catch requests.exceptions.RequestException, and handle non-2xx responses by checking response.ok and either returning the existing bindata for graceful degradation or raising a clear exception; also ensure any caught exception is logged or re-raised with context so failures are observable.

coderabbitai · 2025-12-08T00:38:43Z

+        # Determine the year range from the page header
+        year_2026 = "2026" in text_content
+
+        while i < len(lines):
+            line = lines[i]
+
+            # Check if this is a month name
+            if line in ["January", "February", "March", "April", "May", "June", 
+                       "July", "August", "September", "October", "November", "December"]:
+                current_month = line
+                # Determine year based on month and context
+                if year_2026:
+                    current_year = "2026" if line in ["January", "February"] else "2025"
+                else:
+                    current_year = str(datetime.now().year)


⚠️ Potential issue | 🟠 Major

Year inference logic is brittle and will break over time.

The current approach has two issues:

Checking "2026" in text_content may match unrelated occurrences (addresses, IDs).

Hardcoded years ("2025", "2026") will produce incorrect dates after this period.

Consider deriving the year dynamically based on the current date and whether the month has passed:

-        # Determine the year range from the page header
-        year_2026 = "2026" in text_content
+        # Base year for date inference
+        today = datetime.now()
+        base_year = today.year

Then in the month handling section:

-                # Determine year based on month and context
-                if year_2026:
-                    current_year = "2026" if line in ["January", "February"] else "2025"
-                else:
-                    current_year = str(datetime.now().year)
+                # Determine year: if month is earlier in year than current month,
+                # assume it's next year (for schedules spanning year boundary)
+                month_num = datetime.strptime(line, "%B").month
+                if month_num < today.month:
+                    current_year = str(base_year + 1)
+                else:
+                    current_year = str(base_year)
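
The rule suggested above can be isolated into a helper that takes "today" as a parameter, which makes it easy to check around the year boundary. This is a sketch under assumptions: the name `infer_year` is hypothetical, and it looks the month up in an explicit English list rather than using `strptime(line, "%B")`, a substitution made here to avoid depending on the process locale.

```python
from datetime import date

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def infer_year(month_name, today):
    """Guess the year for a month heading on a schedule spanning a year boundary.

    A month earlier in the calendar than the current month is assumed to
    belong to next year; otherwise the current year is used.
    """
    month_num = MONTHS.index(month_name) + 1
    return today.year + 1 if month_num < today.month else today.year
```

For example, on 8 December 2025 a "January" heading maps to 2026 and "December" to 2025. One caveat of this rule: if the page still lists a month that has already passed (say, "December" viewed in January), it would be assigned to the wrong year, so a year stated on the page, where available, is likely the more reliable signal.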

🧰 Tools

🪛 Ruff (0.14.7)

54-54: datetime may be undefined, or defined from star imports (F405)

jasperatus16 · 2025-12-08T07:49:20Z

Thanks for the fix. I follow the logic and assume it works without problems.

However, I have tried to set up the integration again, and both entries for Cumberland Council (I don't know why there are still two) are requesting "House Number" and "Postcode". This fix suggests that it now needs only a UPRN.

Has this been made live?

fix: Cumberland Council

5849bac

This was referenced Dec 8, 2025

Cumberland council URL/data change #1456

Closed

Cumberland Council - No longer detecting recycling #1620

Closed

Cumberland Council Duplicate Entries - both act same in config #1627

Closed

robbrad merged commit 0b13739 into master Dec 8, 2025
14 of 16 checks passed

coderabbitai Bot reviewed Dec 8, 2025

robbrad mentioned this pull request Dec 8, 2025

West Oxfordshire Council - ID #1468

Closed

4 tasks

coderabbitai Bot mentioned this pull request Dec 22, 2025

Change of bin collection provider for Lancaster City Council #1785

Open

4 tasks

This was referenced Jan 15, 2026

Remove Selenium from Arun #1812

Draft

Remove Selenium from BarkingDagenham #1813

Draft

coderabbitai Bot mentioned this pull request Jan 23, 2026

fix: UttlesfordDistrictCouncil year and bin type parsing #1823

Closed

coderabbitai Bot mentioned this pull request Feb 19, 2026

Update CumberlandCouncil.py #1860

Closed

coderabbitai Bot mentioned this pull request Mar 14, 2026

March 2026 Release #1883

Merged

robbrad merged 1 commit into master from feat_dec_fixes
