fix: Update NorthTynesideCouncil to reflect changes to collection schedule pages #1659

Merged
dp247 merged 2 commits into robbrad:master from charliejones1:fix/1654-north-tyneside-schedule on Oct 18, 2025

Conversation

@charliejones1
Contributor

@charliejones1 charliejones1 commented Oct 17, 2025

Summary:

  • Fix North Tyneside council integration following changes to collection schedule pages

Changes:

  • Updated URL from https://my.northtyneside.gov.uk/category/81/bin-collection-dates to https://www.northtyneside.gov.uk/waste-collection-schedule
  • Adjusted parsing to match the new page structure.
  • Updated input.json with the new URL and removed the unneeded postcode parameter.

Fixes: #1654

Summary by CodeRabbit

  • Bug Fixes
    • North Tyneside waste lookup now works with UPRN only (postcode removed); schedule is fetched from the updated service and parsed directly, improving reliability of dates, bin types and colours with better error handling.
  • Documentation
    • Help/wiki text and lookup URL updated; guidance now points to FindMyAddress for UPRN lookup.

@coderabbitai
Contributor

coderabbitai Bot commented Oct 17, 2025

Walkthrough

Replaced North Tyneside Council's multi-step postcode form flow with a direct UPRN-based fetch of the waste-collection schedule page and implemented a new parser for the page's structured schedule block to produce chronological bin entries.
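
As a rough illustration of the direct UPRN-based fetch described above, the sketch below builds the schedule URL from a UPRN. The exact URL layout is an assumption: the review in this thread suggests a `/view/<UPRN>` path, while the sequence diagram shows a query string. The helper name `build_schedule_url` is hypothetical and is not part of the project.

```python
# Base URL taken from the PR description.
BASE_URL = "https://www.northtyneside.gov.uk/waste-collection-schedule"


def build_schedule_url(uprn) -> str:
    """Return the schedule view URL for a UPRN.

    Assumption: the schedule lives at <BASE_URL>/view/<UPRN>, as suggested
    in the review comments. This is not confirmed against the live site.
    """
    uprn = str(uprn).strip()
    # A UPRN is a purely numeric identifier, so reject anything else early.
    if not (uprn.isascii() and uprn.isdigit()):
        raise ValueError(f"UPRN must be numeric, got {uprn!r}")
    return f"{BASE_URL}/view/{uprn}"


print(build_schedule_url("47097627"))
```

Validating the UPRN before building the URL keeps malformed input from reaching the council site at all, which is roughly the role `check_uprn` plays in the real integration.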

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Test Configuration: uk_bin_collection/tests/input.json | Updated NorthTynesideCouncil test entry: removed postcode, changed URL to the waste-collection-schedule endpoint, added wiki_command_url_override, adjusted wiki_name formatting and updated wiki_note to instruct UPRN-only input and point to FindMyAddress for UPRN lookup. |
| Council Implementation: uk_bin_collection/uk_bin_collection/councils/NorthTynesideCouncil.py | Replaced postcode/form_build_id submission flow with direct GET using UPRN to the schedule view; removed old POST/form parsing; added parsing of waste-collection__schedule and waste-collection__day items (date via <time>, bin type, colour); added per-item parsing guards and logger warnings; builds sorted bin list from parsed dates. |
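
To make the parsing step summarised above more concrete, here is a minimal sketch that extracts the date, bin type and colour from each `waste-collection__day` item. It is not the PR's implementation: the PR uses BeautifulSoup, whereas this sketch uses the standard library's `html.parser` so it runs without extra dependencies. The sample markup in `SAMPLE_HTML` is hypothetical, built only from the class names mentioned in this thread, and `ScheduleParser` is an illustrative name.

```python
from datetime import datetime
from html.parser import HTMLParser

# Hypothetical markup using only the class names mentioned in this thread.
SAMPLE_HTML = """
<ul class="waste-collection__schedule">
  <li class="waste-collection__day">
    <time datetime="2025-10-21">Tuesday 21 October</time>
    <span class="waste-collection__day--type">Recycling <span>Bank holiday change</span></span>
    <span class="waste-collection__day--colour">Grey</span>
  </li>
  <li class="waste-collection__day">
    <span class="waste-collection__day--colour">Green</span>
  </li>
</ul>
"""


class ScheduleParser(HTMLParser):
    """Collect (date, type, colour) tuples from waste-collection__day items."""

    def __init__(self):
        super().__init__()
        self.days = []      # completed entries
        self._day = None    # entry being built, or None outside a day item
        self._field = None  # "type" or "colour" while inside that span
        self._depth = 0     # nesting depth of spans inside the current field

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "li" and "waste-collection__day" in classes:
            self._day = {"date": None, "type": "", "colour": ""}
        elif self._day is None:
            return
        elif tag == "time":
            try:
                self._day["date"] = datetime.strptime(attrs.get("datetime") or "", "%Y-%m-%d")
            except ValueError:
                pass  # leave the date unset; the item is dropped when its <li> closes
        elif tag == "span":
            if self._field:
                self._depth += 1  # nested span, e.g. a bank-holiday note
            elif "waste-collection__day--type" in classes:
                self._field, self._depth = "type", 0
            elif "waste-collection__day--colour" in classes:
                self._field, self._depth = "colour", 0

    def handle_data(self, data):
        # Keep only text that sits directly inside the field span.
        if self._day is not None and self._field and self._depth == 0:
            self._day[self._field] += data

    def handle_endtag(self, tag):
        if self._day is None:
            return
        if tag == "span" and self._field:
            if self._depth:
                self._depth -= 1
            else:
                self._field = None
        elif tag == "li":
            day, self._day = self._day, None
            bin_type, colour = day["type"].strip(), day["colour"].strip()
            # Guard: skip items missing any of the three pieces.
            if day["date"] and bin_type and colour:
                self.days.append((day["date"], bin_type, colour))


parser = ScheduleParser()
parser.feed(SAMPLE_HTML)
print(parser.days)
```

The depth counter plays the same role as reading only the direct text of the type span: text inside a nested span is ignored, and an item with no date or type is skipped rather than aborting the whole parse.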

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant OldFlow as Old Flow
    participant NewFlow as New Flow
    participant Website

    rect rgb(230,240,255)
    Note over OldFlow: Old: Multi-step form (postcode)
    Client->>OldFlow: submit postcode
    OldFlow->>Website: POST form (retrieve form_build_id)
    Website->>OldFlow: form response (form_build_id)
    OldFlow->>Website: POST with form_build_id
    Website->>OldFlow: HTML response
    OldFlow->>Client: compute bins (calendar logic)
    end

    rect rgb(230,255,240)
    Note over NewFlow: New: Direct UPRN schedule fetch
    Client->>NewFlow: provide UPRN
    NewFlow->>Website: GET /waste-collection-schedule?uprn=...
    Website->>NewFlow: HTML with waste-collection__schedule
    NewFlow->>NewFlow: parse days (time, type, colour) ✓ / warn ✖
    NewFlow->>Client: return chronological bins
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped from form to straight-up UPRN,
No postcode dance, no POST again.
Schedule blocks now sing the dates,
Bins lined up — tidy fates. 🎋

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "fix: Update NorthTynesideCouncil to reflect changes to collection schedule pages" directly and clearly summarizes the main change: updating the NorthTynesideCouncil integration to work with the council's updated website structure. The title is concise, specific, and accurately reflects the core purpose of the changeset without unnecessary noise. It provides sufficient context that a reviewer scanning the history would immediately understand this addresses a website integration update.
Linked Issues Check ✅ Passed The changes directly address the requirements from issue #1654. The PR updates the URL from the outdated domain (https://my.northtyneside.gov.uk/category/81/bin-collection-dates) to the suspected new endpoint (https://www.northtyneside.gov.uk/waste-collection-schedule), replaces the outdated form submission logic with direct page retrieval, removes the postcode-based approach in favor of UPRN-only lookups, and parses the new schedule structure from the page. The modifications to both input.json and NorthTynesideCouncil.py align with the reporter's findings and the suspected new form location, effectively resolving the broken integration described in the issue.
Out of Scope Changes Check ✅ Passed All changes in this pull request are directly related to fixing the North Tyneside Council integration. The modifications are contained to the NorthTynesideCouncil entry in input.json and the NorthTynesideCouncil.py implementation file, with no changes to other councils or unrelated components. The summary explicitly notes that no other public/exported-entity signatures changed, confirming that the scope is appropriately limited to the integration update described in issue #1654.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
uk_bin_collection/uk_bin_collection/councils/NorthTynesideCouncil.py (1)

76-84: Replace wildcard import with explicit imports including date_format.

The date_format constant is defined in common.py (line 17: date_format = "%d/%m/%Y"), making it available via the current wildcard import. However, using explicit imports improves code clarity and maintainability. Replace line 5:

from uk_bin_collection.uk_bin_collection.common import date_format, check_uprn

(and include any other symbols needed from common.py), removing the wildcard import.

🧹 Nitpick comments (5)
uk_bin_collection/tests/input.json (1)

1762-1764: Add a wiki_command_url_override to show the exact view endpoint that requires the UPRN.

  • Current change looks good. To help users, add a command override that includes the UPRN placeholder.

Apply:

     "NorthTynesideCouncil": {
       "skip_get_url": true,
       "uprn": "47097627",
-      "url": "https://www.northtyneside.gov.uk/waste-collection-schedule",
+      "url": "https://www.northtyneside.gov.uk/waste-collection-schedule",
+      "wiki_command_url_override": "https://www.northtyneside.gov.uk/waste-collection-schedule/view/XXXXXXXX",
       "wiki_name": "North Tyneside",
-      "wiki_note": "Pass the UPRN. You can find the UPRN using [FindMyAddress](https://www.findmyaddress.co.uk/search).",
+      "wiki_note": "Pass only the UPRN (no postcode). Example view URL: /waste-collection-schedule/view/UPRN.",
       "LAD24CD": "E08000022"
     },
uk_bin_collection/uk_bin_collection/councils/NorthTynesideCouncil.py (4)

13-17: Silence unused-arg warning for page.

  • Keep signature but mark/handle unused param.

Apply:

 def parse_data(self, page: str, **kwargs) -> dict:
 
     user_uprn = kwargs.get("uprn")
     check_uprn(user_uprn)
+    # `page` is unused because we construct the view URL directly.
+    # del page  # noqa: F841  # (alternatively use ruff: # noqa: ARG002 on def line)

As per coding guidelines.


18-20: Avoid redundant global warning toggles here.

  • You already set verify=False on requests or can rely on AbstractGetBinDataClass.get_data (which disables warnings centrally). This local toggle via requests.packages... is unnecessary.

Apply:

-        # diable warnings so that we can ignore cert verification
-        requests.packages.urllib3.disable_warnings()
+        # TLS warning suppression is handled in AbstractGetBinDataClass.get_data.

30-34: Prefer a specific exception type and include URL context.

Apply:

-        if schedule is None:
-            raise Exception("No waste-collection schedule info found - has the page changed?")
+        if schedule is None:
+            raise ValueError("No waste-collection schedule found. The page structure may have changed.")

56-75: Harden parsing and avoid brittle "NoneType" string checks.

  • Guard each element fetch and continue when missing.
  • Catch narrow exceptions instead of generic Exception.

Apply:

-        for day in schedule.find_all("li", {"class": "waste-collection__day"}):
-            try:
-                # extract the date, bin type and colour
-                collection_date = datetime.strptime(day.find("time")["datetime"], "%Y-%m-%d")
-
-                # for the collection type we only want the text before any nested span
-                type_span = day.find("span", {"class": "waste-collection__day--type"})
-                bin_type = next(type_span.strings).strip()
-
-                bin_colour = day.find("span", {"class": "waste-collection__day--colour"}).text.strip()
-
-                collections.append((f'{bin_type} ({bin_colour})', collection_date))
-
-            except Exception as e:
-                # here NoneType typically suggests parsing errors so report them and continue
-                if "NoneType" in str(e):
-                    logger.warning(f'Error while processing {day}: {e}')
-                    continue
-                raise
+        for day in schedule.find_all("li", {"class": "waste-collection__day"}):
+            try:
+                time_el = day.find("time")
+                if not time_el or not time_el.get("datetime"):
+                    logger.warning("Skipping day: missing time/datetime")
+                    continue
+                collection_date = datetime.strptime(time_el["datetime"], "%Y-%m-%d")
+
+                type_span = day.find("span", {"class": "waste-collection__day--type"})
+                # Direct text only (exclude nested spans, e.g., bank-holiday note)
+                bin_type_text = type_span.find(text=True, recursive=False) if type_span else None
+                if not bin_type_text:
+                    logger.warning("Skipping day: missing type")
+                    continue
+                bin_type = bin_type_text.strip()
+
+                colour_span = day.find("span", {"class": "waste-collection__day--colour"})
+                if not colour_span:
+                    logger.warning("Skipping day: missing colour")
+                    continue
+                bin_colour = colour_span.get_text(strip=True)
+
+                collections.append((f"{bin_type} ({bin_colour})", collection_date))
+            except (AttributeError, KeyError, TypeError, ValueError) as e:
+                logger.warning(f"Skipping unparsable day node: {e}")
+                continue
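
The guard-and-continue pattern suggested above can be tried out in isolation. The following is a simplified sketch, assuming each day has already been reduced to a plain dict with `date`, `type` and `colour` keys (a stand-in for the HTML nodes, not the PR's actual data structure). It catches a narrow set of exception types instead of inspecting the exception message for "NoneType". The function name `parse_days` is hypothetical.

```python
import logging
from datetime import datetime

logger = logging.getLogger(__name__)


def parse_days(raw_days):
    """Return (label, date) tuples, skipping records that cannot be parsed.

    Assumption: each record is a dict with "date", "type" and "colour" keys,
    standing in for the parsed HTML nodes.
    """
    collections = []
    for raw in raw_days:
        try:
            collection_date = datetime.strptime(raw["date"], "%Y-%m-%d")
            label = f"{raw['type'].strip()} ({raw['colour'].strip()})"
        except (AttributeError, KeyError, TypeError, ValueError) as exc:
            # Missing key, None value or malformed date: warn and move on.
            logger.warning("Skipping unparsable day %r: %s", raw, exc)
            continue
        collections.append((label, collection_date))
    return collections


sample = [
    {"date": "2025-10-21", "type": " Recycling ", "colour": "Grey"},
    {"date": "not-a-date", "type": "Refuse", "colour": "Green"},  # ValueError
    {"type": "Refuse"},                                           # KeyError
    {"date": "2025-10-28", "type": None, "colour": "Green"},      # AttributeError
]
print(parse_days(sample))
```

Matching on exception types rather than message text means the behaviour does not depend on how a particular Python version words its error messages, and unexpected errors still propagate.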
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 11ee035 and c231f8b.

📒 Files selected for processing (2)
  • uk_bin_collection/tests/input.json (1 hunks)
  • uk_bin_collection/uk_bin_collection/councils/NorthTynesideCouncil.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
uk_bin_collection/uk_bin_collection/councils/NorthTynesideCouncil.py (2)
uk_bin_collection/uk_bin_collection/get_bin_data.py (1)
  • AbstractGetBinDataClass (43-146)
uk_bin_collection/uk_bin_collection/common.py (1)
  • check_uprn (67-78)
🪛 Ruff (0.14.0)
uk_bin_collection/uk_bin_collection/councils/NorthTynesideCouncil.py

5-5: from uk_bin_collection.uk_bin_collection.common import * used; unable to detect undefined names

(F403)


13-13: Unused method argument: page

(ARG002)


16-16: check_uprn may be undefined, or defined from star imports

(F405)


19-19: requests may be undefined, or defined from star imports

(F405)


22-22: requests may be undefined, or defined from star imports

(F405)


33-33: Create your own exception

(TRY002)


33-33: Avoid specifying long messages outside the exception class

(TRY003)

Comment thread uk_bin_collection/uk_bin_collection/councils/NorthTynesideCouncil.py Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
uk_bin_collection/uk_bin_collection/councils/NorthTynesideCouncil.py (1)

59-65: Schedule extraction logic is sound.

The code correctly validates the presence of the expected structure and provides a helpful error message for debugging. The static analysis hint (TRY003) about the long exception message is a minor style preference—the descriptive message is actually valuable for users when the page structure changes.

If you prefer to address the TRY003 hint, you could define a custom exception class, but this is optional:

class ScheduleNotFoundError(ValueError):
    """Raised when waste collection schedule is not found on the page."""
    pass

Then on line 64:

-            raise ValueError("No waste-collection schedule found. The page structure may have changed.")
+            raise ScheduleNotFoundError("The page structure may have changed.")
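
One possible variation on the custom exception above, offered only as a sketch: the message can be built inside the exception class itself, which is the usual way to address a TRY003-style hint, and the failing URL can be attached for context as an earlier comment in this review suggested. The `url` parameter and attribute are assumptions added for illustration and are not part of the PR.

```python
class ScheduleNotFoundError(ValueError):
    """Raised when the waste collection schedule is not found on the page."""

    def __init__(self, url: str):
        # Hypothetical addition: keep the URL so callers can log or inspect it.
        self.url = url
        super().__init__(
            f"No waste-collection schedule found at {url}. "
            "The page structure may have changed."
        )


# Example URL is a placeholder, not a real endpoint.
try:
    raise ScheduleNotFoundError("https://example.invalid/schedule")
except ValueError as exc:
    print(exc)
```

Because the class still derives from ValueError, existing callers that catch ValueError keep working.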
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c231f8b and 4720875.

📒 Files selected for processing (2)
  • uk_bin_collection/tests/input.json (1 hunks)
  • uk_bin_collection/uk_bin_collection/councils/NorthTynesideCouncil.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • uk_bin_collection/tests/input.json
🧰 Additional context used
🧬 Code graph analysis (1)
uk_bin_collection/uk_bin_collection/councils/NorthTynesideCouncil.py (2)
uk_bin_collection/uk_bin_collection/common.py (1)
  • check_uprn (67-78)
uk_bin_collection/uk_bin_collection/get_bin_data.py (2)
  • AbstractGetBinDataClass (43-146)
  • get_data (110-128)
🪛 Ruff (0.14.0)
uk_bin_collection/uk_bin_collection/councils/NorthTynesideCouncil.py

64-64: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (4)
uk_bin_collection/uk_bin_collection/councils/NorthTynesideCouncil.py (4)

1-8: LGTM! Import concerns from previous review are resolved.

The explicit imports on line 5 address the previous wildcard import concern. The previous request to add import requests is now obsolete since the code correctly uses self.get_data() (inherited from the parent class) instead of calling requests.get() directly.


50-56: LGTM! HTTP request handling correctly addresses previous review.

The code now uses self.get_data() which includes proper headers, timeout, and SSL verification settings. The defensive raise_for_status() check ensures fast failure on HTTP errors.


114-122: LGTM! Return structure is correct.

The chronological sorting and date formatting using the shared date_format ensure consistent output across the codebase.
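
For readers unfamiliar with the return step, the sketch below shows chronological sorting followed by formatting with the `%d/%m/%Y` pattern quoted from common.py earlier in this review. The output shape (a `bins` list of dicts with `type` and `collectionDate` keys) is an assumption about the project's convention, and `build_bin_data` is a hypothetical helper name.

```python
from datetime import datetime

# Mirrors the date_format value quoted from common.py in this review.
DATE_FORMAT = "%d/%m/%Y"


def build_bin_data(collections):
    """Sort (label, date) tuples by date and format them for output.

    Assumption: the consumer expects {"bins": [{"type", "collectionDate"}]}.
    """
    ordered = sorted(collections, key=lambda item: item[1])
    return {
        "bins": [
            {"type": label, "collectionDate": when.strftime(DATE_FORMAT)}
            for label, when in ordered
        ]
    }


example = [
    ("Refuse (Green)", datetime(2025, 10, 28)),
    ("Recycling (Grey)", datetime(2025, 10, 21)),
]
print(build_bin_data(example))
```

Sorting on the datetime objects before formatting matters here: day-first strings such as 28/10/2025 do not sort chronologically as text.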


87-112: Well-structured parsing loop with good defensive coding.

The parsing logic correctly handles the documented HTML structure, including the bank holiday notification edge case (lines 95-101). The recursive=False pattern on line 97 elegantly extracts only the direct text content while skipping nested spans. Exception handling with logging ensures invalid entries don't break the entire parse.

@codecov

codecov Bot commented Oct 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.79%. Comparing base (11ee035) to head (4720875).
⚠️ Report is 8 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1659   +/-   ##
=======================================
  Coverage   86.79%   86.79%           
=======================================
  Files           9        9           
  Lines        1136     1136           
=======================================
  Hits          986      986           
  Misses        150      150           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@dp247 dp247 merged commit 4f75e78 into robbrad:master Oct 18, 2025
12 of 14 checks passed

Development

Successfully merging this pull request may close these issues.

North Tyneside council website update breaks integration

2 participants