fix: Kingston-upon-Thames website HTML format change by andypiper · Pull Request #1828 · robbrad/UKBinCollectionData

andypiper · 2026-01-25T19:26:03Z

The Kingston upon Thames page layout slightly changed to a structure using a div.waste-service-grid container that now wraps both the service name and the collection details. This was breaking the data retrieval.

While I was in there I also updated the comment with the URL for the help page on the Kingston website about bin collection (was 404).

Tested locally on Home Assistant Green / 2026.1.3 and this successfully now returns a calendar again.

Fixes #1824

Summary by CodeRabbit

Bug Fixes
- Improved Kingston Upon Thames bin collection parsing for greater resilience to website layout changes, with clearer messages when data is missing and more accurate next-collection date detection and formatting.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Signed-off-by: Andy Piper <andypiper@users.noreply.github.com>

coderabbitai · 2026-01-25T19:26:22Z

📝 Walkthrough

Walkthrough

Parsing for Kingston Upon Thames Council was changed to iterate over div.waste-service-grid elements, extracting h3.waste-service-name and dl.govuk-summary-list per grid with explicit validation and descriptive ValueError raises when expected DOM elements are missing.

Changes

Cohort / File(s)	Summary
DOM Parsing Logic Refactor `uk_bin_collection/uk_bin_collection/councils/KingstonUponThamesCouncil.py`	Replaced previous sibling-based DOM traversal with iteration over `div.waste-service-grid`. Extracts service type from inner `h3.waste-service-name`, finds collection info in `dl.govuk-summary-list` rows, constructs date string including current year, uses `get_next_occurrence_from_day_month`, and adds explicit validation with descriptive ValueError messages for missing elements.

Sequence Diagram(s)

(Skipped — changes are localized parsing updates and do not introduce multi-component sequential flows.)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested reviewers

dp247

Poem

🐰 I hopped through grids where h3s hide,
Found names and dates with checks beside.
No more NoneType scares in the night,
Kingston's bins will pop up right! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title accurately summarizes the main change: fixing Kingston Council's parser due to an HTML format change on their website.
Linked Issues check	✅ Passed	The code changes directly address issue `#1824` by updating the parser to handle the new div.waste-service-grid structure, fixing the AttributeError and restoring bin collection data retrieval.
Out of Scope Changes check	✅ Passed	All changes are scoped to the Kingston Council parser to address the HTML structure change; the help-page URL comment update is a necessary correction directly related to the parsing issue.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-01-25T19:28:02Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.67%. Comparing base (274152f) to head (b057b08).
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #1828   +/-   ##
=======================================
  Coverage   86.67%   86.67%           
=======================================
  Files           9        9           
  Lines        1141     1141           
=======================================
  Hits          989      989           
  Misses        152      152

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@uk_bin_collection/uk_bin_collection/councils/KingstonUponThamesCouncil.py`:
- Line 75: Replace the use of .capitalize() on service_name in the dict
construction (the line that sets the "type" key) so the original casing is
preserved; in KingstonUponThamesCouncil.py stop calling
service_name.capitalize() and assign service_name directly to the "type" field
(keeping the rest of the code that builds the service dict intact), ensuring any
user-visible labels remain as provided by the source.
- Around line 47-48: The parser currently uses soup.find_all("div", {"class":
"waste-service-grid"}) and then continues silently if service_grids is empty;
modify the code around the service_grids variable (the result of find_all) to
check if not service_grids and immediately raise a descriptive exception (e.g.,
ValueError or a custom ParseError) with a message like "No waste-service-grid
elements found on KingstonUponThamesCouncil page" so failures are explicit;
ensure the check sits before any iteration over service_grids (the for grid in
service_grids loop) so the function fails fast when the page structure changes.
- Around line 65-69: The code that extracts the "next collection" date (inside
KingstonUponThamesCouncil.py where dt = row.find("dt") and collection_date is
built using row.find("dd")) does not guard against a missing <dd> and will raise
an AttributeError; update the logic to check that dd = row.find("dd") is not
None before calling get_text(), and if it is None raise a clear, descriptive
error (or handle gracefully) that includes context (e.g., the offending row or
dt text); ensure you still call
remove_ordinal_indicator_from_date_string(dd.get_text()) only when dd is
present.

Signed-off-by: Andy Piper <andypiper@users.noreply.github.com>

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

uk_bin_collection/uk_bin_collection/councils/KingstonUponThamesCouncil.py (1)

67-94: Fail fast if no “next collection” row is found per service.
Right now, if a grid lacks a “next collection” row, the parser silently skips that bin. That can yield partial data without surfacing a format change. Suggest tracking whether a match was found and raising when absent. Based on learnings, explicit failures are preferred.

🔧 Proposed fix

-                rows = summary_list.find_all("div", {"class": "govuk-summary-list__row"})
+                rows = summary_list.find_all("div", {"class": "govuk-summary-list__row"})
+                found_next_collection = False
                 for row in rows:
                     dt = row.find("dt")
                     if dt and dt.get_text().strip().lower() == "next collection":
+                        found_next_collection = True
                         dd = row.find("dd")
                         if not dd:
                             raise ValueError(
                                 f"Kingston parser: missing dd element for 'next collection' in {service_name}"
                             )
                         collection_date = remove_ordinal_indicator_from_date_string(
                             dd.get_text()
                         ).strip().replace(" (In progress)", "")
                         # strip out any text inside of the date string
                         collection_date = re.sub(
                             r"\n\s*\(this.*?\)", "", collection_date
                         )
                         dict_data = {
                             "type": service_name,
                             "collectionDate": get_next_occurrence_from_day_month(
                                 datetime.strptime(
                                     collection_date
                                     + " "
                                     + datetime.now().strftime("%Y"),
                                     "%A, %d %B %Y",
                                 )
                             ).strftime(date_format),
                         }
                         data["bins"].append(dict_data)
+                if not found_next_collection:
+                    raise ValueError(
+                        f"Kingston parser: no 'next collection' row found for {service_name}"
+                    )

robbrad · 2026-02-01T17:54:32Z

This PR has been merged into the February 2026 consolidated release PR #1837.

Thank you for your contribution!

fix: Kingston-upon-Thames website HTML format change

41eb303

Signed-off-by: Andy Piper <andypiper@users.noreply.github.com>

coderabbitai Bot reviewed Jan 25, 2026

View reviewed changes

Comment thread uk_bin_collection/uk_bin_collection/councils/KingstonUponThamesCouncil.py

Comment thread uk_bin_collection/uk_bin_collection/councils/KingstonUponThamesCouncil.py

Comment thread uk_bin_collection/uk_bin_collection/councils/KingstonUponThamesCouncil.py Outdated

fix: Kingston parser for HTML format change with explicit error handling

b057b08

Signed-off-by: Andy Piper <andypiper@users.noreply.github.com>

coderabbitai Bot reviewed Jan 25, 2026

View reviewed changes

robbrad added a commit that referenced this pull request Feb 1, 2026

Merge PR #1828: Fix Kingston-upon-Thames website HTML format change

278832f

robbrad mentioned this pull request Feb 1, 2026

February 2026 Release - Consolidated Council Fixes #1837

Merged

robbrad closed this Feb 1, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Kingston-upon-Thames website HTML format change#1828

fix: Kingston-upon-Thames website HTML format change#1828
andypiper wants to merge 2 commits into
robbrad:masterfrom
andypiper:fix-kingston-council

andypiper commented Jan 25, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jan 25, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Suggested reviewers

Poem

Uh oh!

codecov Bot commented Jan 25, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

robbrad commented Feb 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

andypiper commented Jan 25, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jan 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Suggested reviewers

Poem

Uh oh!

codecov Bot commented Jan 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

robbrad commented Feb 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

andypiper commented Jan 25, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jan 25, 2026 •

edited

Loading

codecov Bot commented Jan 25, 2026 •

edited

Loading