Fix for Bromley council, closes #1829 by oliyh · Pull Request #1832 · robbrad/UKBinCollectionData

oliyh · 2026-01-30T11:15:56Z

Minimal change to work with new structure of website

Summary by CodeRabbit

Bug Fixes
- Updated data parsing logic for Bromley Borough Council to ensure accurate bin collection information retrieval.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2026-01-30T11:16:13Z

📝 Walkthrough

Walkthrough

This change updates HTML selectors and extraction logic in the BromleyBoroughCouncil scraper. Selectors were modified to target a waste-service-grid container structure, service title extraction now uses h3 elements with specific classes, and next collection data retrieval was refactored. An unused import was removed.

Changes

Cohort / File(s)	Summary
Bromley Borough Council Scraper Update `uk_bin_collection/uk_bin_collection/councils/BromleyBoroughCouncil.py`	Updated HTML selectors to target waste-service-grid container instead of govuk-summary-list. Modified service title extraction to locate h3 elements with govuk-heading-m and waste-service-name classes. Refactored next collection extraction to search for dt elements with "Next collection" text. Removed unused dateutil.relativedelta import.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested reviewers

dp247

Poem

🐰 The selectors dance, the grid takes shape,
New classes found where old ones scrape,
With waste-service precision, h3 heights,
Our Bromley bins now gleam so bright! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Fix for Bromley council, closes `#1829`' directly relates to the main change in the PR, which updates selectors for the Bromley Borough Council web scraper due to website structure changes. However, it is somewhat generic and doesn't convey the specific nature of the fix.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Tip

🧪 Unit Test Generation v2 is now available!

We have significantly improved our unit test generation capabilities.

To enable: Add this to your .coderabbit.yaml configuration:

reviews:
  finishing_touches:
    unit_tests:
      enabled: true

Try it out by using the @coderabbitai generate unit tests command on your code files or under ✨ Finishing Touches on the walkthrough!

Have feedback? Share your thoughts on our Discord thread!

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@uk_bin_collection/uk_bin_collection/councils/BromleyBoroughCouncil.py`:
- Around line 43-46: The parser currently calls soup.find_all("div",
class_="waste-service-grid") and assigns it to waste_services but silently
continues if the list is empty; change this to fail fast by checking if
waste_services is empty right after that call and raise a descriptive exception
(e.g., ValueError or a custom ParseError) that includes context such as "no
waste-service-grid elements found" and the council identifier; update the
function containing waste_services (the parsing routine in
BromleyBoroughCouncil.py) to raise this error so callers can detect site/format
changes rather than receiving an empty result.
- Around line 49-55: The service element lookups in BromleyBoroughCouncil.py can
raise AttributeError and are brittle: replace service.find("h3",
class_="govuk-heading-m waste-service-name").get_text(...) with a CSS selector
via service.select_one("h3.govuk-heading-m.waste-service-name") and check the
returned node is not None before calling get_text; similarly ensure
next_collection = service.find("dt", string="Next collection") is validated (or
use a selector) and raise or log a clear, contextual error if either element is
missing so callers get informative failure messages; update any surrounding code
that assumes these variables (service_title, next_collection) to handle the None
case consistently.

coderabbitai · 2026-01-30T11:20:33Z

+            # Find all elements with class 'waste-service-grid'
            waste_services = soup.find_all(
-                "h3", class_="govuk-heading-m waste-service-name"
+                "div", class_="waste-service-grid"
            )


⚠️ Potential issue | 🟠 Major

Fail fast when no services are found.

If the selector yields zero services, the parser silently returns an empty list and masks site changes. Raise a clear exception when waste_services is empty.

🔧 Proposed fix

- waste_services = soup.find_all( - "div", class_="waste-service-grid" - ) + waste_services = soup.find_all( + "div", class_="waste-service-grid" + ) + if not waste_services: + raise ValueError( + "BromleyBoroughCouncil: no waste-service-grid elements found" + )

Based on learnings: In uk_bin_collection/**/*.py, when parsing council bin collection data, prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors.

🤖 Prompt for AI Agents

In `@uk_bin_collection/uk_bin_collection/councils/BromleyBoroughCouncil.py` around lines 43 - 46, The parser currently calls soup.find_all("div", class_="waste-service-grid") and assigns it to waste_services but silently continues if the list is empty; change this to fail fast by checking if waste_services is empty right after that call and raise a descriptive exception (e.g., ValueError or a custom ParseError) that includes context such as "no waste-service-grid elements found" and the council identifier; update the function containing waste_services (the parsing routine in BromleyBoroughCouncil.py) to raise this error so callers can detect site/format changes rather than receiving an empty result.

coderabbitai · 2026-01-30T11:20:34Z

+                service_title = service.find(
+                    "h3", class_="govuk-heading-m waste-service-name"
+                ).get_text(strip=True)
+
+                next_collection = service.find(
                    "dt", string="Next collection"
                )


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n uk_bin_collection/uk_bin_collection/councils/BromleyBoroughCouncil.py | sed -n '40,65p'

Repository: robbrad/UKBinCollectionData

Length of output: 1303

🏁 Script executed:

# Check if the file exists and get its full line count wc -l uk_bin_collection/uk_bin_collection/councils/BromleyBoroughCouncil.py

Repository: robbrad/UKBinCollectionData

Length of output: 144

🏁 Script executed:

# Look for the complete function context around lines 49-55 cat -n uk_bin_collection/uk_bin_collection/councils/BromleyBoroughCouncil.py | sed -n '30,80p'

Repository: robbrad/UKBinCollectionData

Length of output: 2463

🌐 Web query:

BeautifulSoup class_ parameter multiple classes matching behavior

💡 Result:

In BeautifulSoup (bs4), the class_ filter has three different behaviors depending on what you pass:

Single class name (matches ANY class on the element)

soup.find_all("div", class_="btn")

Matches elements whose class list contains "btn" (even if they also have other classes). [1]

List/iterable of class names (OR / union match)

soup.find_all("div", class_=["btn", "primary"])

Matches elements that have any one of the listed classes (btn or primary, or both). [2]

Space-separated string (exact attribute string match; order-sensitive)

soup.find_all("div", class_="btn primary")

Matches only if the class attribute string is exactly "btn primary" (so "primary btn" will not match). [1]

To require multiple classes (AND / intersection)

Use a CSS selector:

soup.select("div.btn.primary")

This matches elements that have both classes, in any order. [1]

Gotcha: attrs={"class": lambda ...} may not see the whole class list

When you pass a function for attrs["class"], BeautifulSoup may call it on each individual class string, not the full list—so exclusion/combination logic can behave unexpectedly. [3]

Sources:
[1] Beautiful Soup documentation (class matching + exact-string + selector recommendation) (crummy.com)
[2] Stack Overflow discussion of list meaning OR/union for class_ (stackoverflow.com)
[3] Beautiful Soup bug report explaining callable applied per-class string (bugs.launchpad.net)

Citations:

1: https://www.crummy.com/software/BeautifulSoup/bs4/doc/?utm_source=openai

2: https://stackoverflow.com/questions/5041008/how-to-find-elements-by-class?utm_source=openai

3: https://bugs.launchpad.net/bugs/2075974?utm_source=openai

🏁 Script executed:

# Check the HuntingdonDistrictCouncil.py file mentioned in learnings (lines 47-54) cat -n uk_bin_collection/uk_bin_collection/councils/HuntingdonDistrictCouncil.py | sed -n '40,60p'

Repository: robbrad/UKBinCollectionData

Length of output: 972

🏁 Script executed:

# Search for similar error handling patterns in other council files rg "\.find\(" uk_bin_collection/uk_bin_collection/councils/ -A 2 -B 1 | head -50

Repository: robbrad/UKBinCollectionData

Length of output: 4597

🏁 Script executed:

# Check if there are any try-except patterns around find() calls in council files rg "try:" uk_bin_collection/uk_bin_collection/councils/ -A 5 | head -60

Repository: robbrad/UKBinCollectionData

Length of output: 6326

🏁 Script executed:

# Check if there are any HTML examples or test files that show the structure fd -e html -e json uk_bin_collection/ | head -20

Repository: robbrad/UKBinCollectionData

Length of output: 448

🏁 Script executed:

# Look for any tests or documentation about BromleyBoroughCouncil rg "BromleyBoroughCouncil" --type py -l

Repository: robbrad/UKBinCollectionData

Length of output: 53

🏁 Script executed:

# Check the broader context - is there a PR or issue reference? git log --oneline --all -- uk_bin_collection/uk_bin_collection/councils/BromleyBoroughCouncil.py | head -10

Repository: robbrad/UKBinCollectionData

Length of output: 605

Add explicit error handling and use CSS selectors for robust element matching.

Line 49–51 lacks a None check and will raise an opaque AttributeError if the element isn't found. Additionally, the class_ parameter with a space-separated string is order-sensitive, so using a CSS selector with select_one() is more robust. The code also has inconsistent error handling: line 57 checks if next_collection: but line 49 does not. Add explicit checks with informative errors at all element lookups.

🔧 Proposed fix

- service_title = service.find( - "h3", class_="govuk-heading-m waste-service-name" - ).get_text(strip=True) + title_el = service.select_one( + "h3.govuk-heading-m.waste-service-name" + ) + if not title_el: + raise ValueError( + "BromleyBoroughCouncil: service title not found" + ) + service_title = title_el.get_text(strip=True) - next_collection = service.find( - "dt", string="Next collection" - ) + next_collection = service.find( + "dt", string=lambda s: s and s.strip() == "Next collection" + ) + if not next_collection: + raise ValueError( + "BromleyBoroughCouncil: 'Next collection' label not found" + ) - if next_collection: - next_collection_date = next_collection.find_next_sibling().get_text( - strip=True - ) + next_collection_value = next_collection.find_next_sibling("dd") + if not next_collection_value: + raise ValueError( + "BromleyBoroughCouncil: 'Next collection' value not found" + ) + next_collection_date = next_collection_value.get_text(strip=True)

🤖 Prompt for AI Agents

In `@uk_bin_collection/uk_bin_collection/councils/BromleyBoroughCouncil.py` around lines 49 - 55, The service element lookups in BromleyBoroughCouncil.py can raise AttributeError and are brittle: replace service.find("h3", class_="govuk-heading-m waste-service-name").get_text(...) with a CSS selector via service.select_one("h3.govuk-heading-m.waste-service-name") and check the returned node is not None before calling get_text; similarly ensure next_collection = service.find("dt", string="Next collection") is validated (or use a selector) and raise or log a clear, contextual error if either element is missing so callers get informative failure messages; update any surrounding code that assumes these variables (service_title, next_collection) to handle the None case consistently.

#1821 changes as they are more comprehensive)

robbrad · 2026-02-01T17:54:33Z

This PR has been merged into the February 2026 consolidated release PR #1837.

Thank you for your contribution!

Fix for Bromley council, closes robbrad#1829

df3d08a

coderabbitai Bot reviewed Jan 30, 2026

View reviewed changes

robbrad added a commit that referenced this pull request Feb 1, 2026

Merge PR #1832: Fix for Bromley council (conflict resolved - keeping PR

4daf6aa

#1821 changes as they are more comprehensive)

robbrad mentioned this pull request Feb 1, 2026

February 2026 Release - Consolidated Council Fixes #1837

Merged

robbrad closed this Feb 1, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix for Bromley council, closes #1829#1832

Fix for Bromley council, closes #1829#1832
oliyh wants to merge 1 commit into
robbrad:masterfrom
oliyh:master

oliyh commented Jan 30, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jan 30, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jan 30, 2026

Uh oh!

coderabbitai Bot Jan 30, 2026

Uh oh!

robbrad commented Feb 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

oliyh commented Jan 30, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jan 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jan 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jan 30, 2026

Choose a reason for hiding this comment

To require multiple classes (AND / intersection)

Gotcha: attrs={"class": lambda ...} may not see the whole class list

Uh oh!

robbrad commented Feb 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

oliyh commented Jan 30, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jan 30, 2026 •

edited

Loading

Gotcha: `attrs={"class": lambda ...}` may not see the whole class list