Skip to content

fix(MertonCouncil): add postcode search for property ID resolution#2034

Open
InertiaUK wants to merge 1 commit into
robbrad:masterfrom
InertiaUK:fix/merton-postcode-search
Open

fix(MertonCouncil): add postcode search for property ID resolution#2034
InertiaUK wants to merge 1 commit into
robbrad:masterfrom
InertiaUK:fix/merton-postcode-search

Conversation

@InertiaUK
Copy link
Copy Markdown
Contributor

@InertiaUK InertiaUK commented May 12, 2026

Summary

  • Merton uses internal SocietyWorks IDs, not ONSUD UPRNs — direct UPRN lookup fails for most users
  • Added postcode search: POST to /waste with postcode, parse address dropdown, match by house number
  • Same fix pattern as LondonBoroughSutton (same SocietyWorks platform)
  • Updated input.json with postcode + house_number test params

Summary by CodeRabbit

  • New Features

    • Merton Council: Added support for property lookup via postcode and house number as an alternative to UPRN.
  • Bug Fixes

    • Merton Council: Enhanced waste collection data polling reliability.

Review Change Stack

Merton uses internal SocietyWorks IDs, not ONSUD UPRNs. Old scraper
required manual UPRN lookup. Added postcode search: POST to /waste
with postcode, parse address dropdown, match by house number.
Falls back to direct UPRN if provided and valid.

Same fix pattern as LondonBoroughSutton (same SocietyWorks platform).
Updated input.json with postcode + house_number test params.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 12, 2026

📝 Walkthrough

Walkthrough

The Merton Council bin collection scraper is refactored to resolve property identifiers via postcode and house number lookup against the FixMyStreet endpoint as a fallback when numeric UPRN is unavailable, updates polling and date extraction logic, and adjusts test input to reflect the new input format.

Changes

Merton Council Scraper Refactor

Layer / File(s) Summary
Property resolution via postcode lookup
uk_bin_collection/uk_bin_collection/councils/MertonCouncil.py
New BASE_URL constant and _resolve_property_id() function POST postcode and paon data to FixMyStreet endpoint, parse the address select element, and return matching property ID or None.
Parse data integration and polling config
uk_bin_collection/uk_bin_collection/councils/MertonCouncil.py
parse_data() refactored to resolve property_id from numeric UPRN or postcode+paon lookup, builds waste URL using BASE_URL, raises ValueError when unresolvable. Polling sleep time increased to 3 seconds and loop now waits for waste-service heading elements, raising RuntimeError on timeout.
Collection data extraction and output construction
uk_bin_collection/uk_bin_collection/councils/MertonCouncil.py
HTML extraction finds grid parent, iterates service grids, filters "Next collection" rows, parses dates by text split with year-boundary adjustment. Output constructed directly from sorted (bin_type, datetime) tuples with formatted collectionDate values.
Test input with postcode and house number fields
uk_bin_collection/tests/input.json
MertonCouncil test entry now includes example house_number and postcode fields. Wiki note updated to indicate postcode and house number should be provided (Merton UPRN still accepted).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • robbrad/UKBinCollectionData#1729: Implements session-based lookup and adds postcode+PAON → UPRN resolution for council scrapers with BASE_URL/lookup helper functions.
  • robbrad/UKBinCollectionData#1716: Modifies MertonCouncil.py to adapt scraper to FixMyStreet site with updated UPRN/postcode handling and polling for JS-loaded collection data.

Suggested reviewers

  • dp247

Poem

🐰 A rabbit hops through Merton's data,
Postcode lookups now come later—
No UPRN? No need to wait,
Just house number seals the gate!
Collection dates now parse just right,
Another borough brought to light! 🗑️

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding postcode search capability for property ID resolution in MertonCouncil, which is the primary functionality introduced.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

Review ran into problems

🔥 Problems

Timed out fetching pipeline failures after 30000ms


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.67%. Comparing base (8ecf878) to head (249b868).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2034   +/-   ##
=======================================
  Coverage   86.67%   86.67%           
=======================================
  Files           9        9           
  Lines        1141     1141           
=======================================
  Hits          989      989           
  Misses        152      152           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@uk_bin_collection/uk_bin_collection/councils/MertonCouncil.py`:
- Around line 21-33: The current PAON selection in MertonCouncil (paon_lower,
select.find_all("option"), text.startswith(paon_lower), best) silently falls
back to the first non-empty option; change this to require an exact or uniquely
identifying match: collect all option values where the visible text equals or
matches the PAON more strictly (e.g., exact match or normalized equality), do
not use the loose startswith heuristic, and if zero or more than one match is
found raise a descriptive exception (e.g., ValueError) instead of returning
best; remove/stop using the fallback variable best and ensure calling code can
handle the raised error.
- Around line 42-43: The parser currently only reads paon from kwargs so house
numbers from the new input contract are ignored; update the code in
MertonCouncil.py to also read kwargs.get("house_number") and, if present,
normalize/assign it into the paon variable (or prefer house_number over paon)
before any postcode lookup logic (same change where kwargs.get("paon") is read
around the other occurrence at the block that currently reads paon on lines
~60-61); ensure you handle None and empty values so existing fallback behavior
remains when no house_number is provided.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 34fafc45-e199-4471-a1b8-016990e1393c

📥 Commits

Reviewing files that changed from the base of the PR and between 8ecf878 and 249b868.

📒 Files selected for processing (2)
  • uk_bin_collection/tests/input.json
  • uk_bin_collection/uk_bin_collection/councils/MertonCouncil.py

Comment on lines +21 to +33
paon_lower = (paon or "").strip().lower()
best = None
for opt in select.find_all("option"):
val = opt.get("value", "")
if not val or val == "missing":
continue
text = opt.get_text(strip=True).lower()
if paon_lower and text.startswith(paon_lower):
return val
if not best and val:
best = val

Required Parameters:
uprn (str): Unique Property Reference Number (numeric only)
return best
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail instead of guessing the property from postcode results.

text.startswith(paon_lower) will match prefixes like 1 against 10 ..., and if nothing matches this returns the first non-empty option anyway. On any postcode with multiple addresses, that can silently resolve the wrong property and return another household's collection dates. Please make this path require a unique match and raise when the PAON cannot be matched exactly enough.

Based on learnings: "prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@uk_bin_collection/uk_bin_collection/councils/MertonCouncil.py` around lines
21 - 33, The current PAON selection in MertonCouncil (paon_lower,
select.find_all("option"), text.startswith(paon_lower), best) silently falls
back to the first non-empty option; change this to require an exact or uniquely
identifying match: collect all option values where the visible text equals or
matches the PAON more strictly (e.g., exact match or normalized equality), do
not use the loose startswith heuristic, and if zero or more than one match is
found raise a descriptive exception (e.g., ValueError) instead of returning
best; remove/stop using the fallback variable best and ensure calling code can
handle the raised error.

Comment on lines +42 to +43
postcode = kwargs.get("postcode")
paon = kwargs.get("paon")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Read house_number here too.

The PR updates input.json and the wiki note to use house_number, but this parser only reads paon. With the current code, postcode lookups ignore the supplied house number and rely on the ambiguous fallback above. A local normalization here would keep the new input contract working:

Suggested fix
-        paon = kwargs.get("paon")
+        paon = kwargs.get("paon") or kwargs.get("house_number")

Also applies to: 60-61

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@uk_bin_collection/uk_bin_collection/councils/MertonCouncil.py` around lines
42 - 43, The parser currently only reads paon from kwargs so house numbers from
the new input contract are ignored; update the code in MertonCouncil.py to also
read kwargs.get("house_number") and, if present, normalize/assign it into the
paon variable (or prefer house_number over paon) before any postcode lookup
logic (same change where kwargs.get("paon") is read around the other occurrence
at the block that currently reads paon on lines ~60-61); ensure you handle None
and empty values so existing fallback behavior remains when no house_number is
provided.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant