fix: NorthKestevenDistrictCouncil - make own HTTP request instead of using page param #1987
InertiaUK wants to merge 1 commit into
📝 Walkthrough
The NorthKestevenDistrictCouncil implementation shifts from using pre-provided page content to fetching live HTML directly. The test configuration in input.json is updated to match.
Sequence Diagram(s)
sequenceDiagram
actor Client
participant Parser as NorthKestevenDistrictCouncil
participant HTTP as HTTP Client
participant API as Council API
participant Parser as DOM Parser
Client->>Parser: parse_data(page, uprn=...)
Parser->>Parser: check_uprn(uprn)
Parser->>HTTP: GET /bins/display?uprn=...
Note over HTTP: User-Agent Header
HTTP->>API: Request with custom headers
API-->>HTTP: HTML Response
HTTP-->>Parser: response.text
Parser->>Parser: Extract bin types & dates
Parser->>Parser: Parse collection dates
Parser-->>Client: Return bin collection dict
Estimated Code Review Effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@uk_bin_collection/uk_bin_collection/councils/NorthKestevenDistrictCouncil.py`:
- Around line 10-18: Validate the uprn variable locally before constructing the
URL and making the HTTP request: do not rely on check_uprn() since it can
swallow bad input — instead explicitly check uprn (e.g., ensure it's present and
matches the expected format) and raise an exception (ValueError or a custom
exception) if invalid, then only build the url = f"...?uprn={uprn}" and call
requests.get when validation passes; reference the uprn variable, the
check_uprn() call, and the URL/request block to locate where to add this
early-fail validation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fc99435e-0cac-4f3c-a92c-8528165abcfd
📒 Files selected for processing (2)
uk_bin_collection/tests/input.json
uk_bin_collection/uk_bin_collection/councils/NorthKestevenDistrictCouncil.py
uprn = kwargs.get("uprn")
check_uprn(uprn)

url = f"https://www.n-kesteven.org.uk/bins/display?uprn={uprn}"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.0.0 Safari/537.36"
}
response = requests.get(url, headers=headers, timeout=30)
response.raise_for_status()
Validate uprn here instead of relying on check_uprn().
check_uprn() currently swallows invalid input, so a missing fixture can still fall through to ...?uprn=None and only fail after making a remote request. Raise locally before building the URL so malformed inputs fail fast. Based on learnings: prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors.
🔧 Suggested fix
  uprn = kwargs.get("uprn")
- check_uprn(uprn)
+ if uprn in (None, ""):
+     raise ValueError("Invalid UPRN")
  url = f"https://www.n-kesteven.org.uk/bins/display?uprn={uprn}"
📝 Committable suggestion
  uprn = kwargs.get("uprn")
- check_uprn(uprn)
+ if uprn in (None, ""):
+     raise ValueError("Invalid UPRN")
  url = f"https://www.n-kesteven.org.uk/bins/display?uprn={uprn}"
  headers = {
      "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.0.0 Safari/537.36"
  }
  response = requests.get(url, headers=headers, timeout=30)
  response.raise_for_status()
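The review comment also asks that the UPRN "matches the expected format", which the suggested `in (None, "")` check does not cover. A minimal sketch of a stricter check follows. It assumes a UPRN is purely numeric and at most 12 digits long, and `validate_uprn` is a hypothetical helper, not an existing function in uk_bin_collection:

```python
import re

# Assumption: a UPRN consists of 1 to 12 decimal digits.
_UPRN_RE = re.compile(r"[0-9]{1,12}")


def validate_uprn(uprn) -> str:
    """Return the UPRN as a string, or raise ValueError if missing or malformed.

    Hypothetical helper for illustration; not part of the project.
    """
    if uprn is None:
        raise ValueError("UPRN is required")
    text = str(uprn).strip()
    if not _UPRN_RE.fullmatch(text):
        raise ValueError(f"Invalid UPRN: {uprn!r}")
    return text
```

Called before the URL is built, this would reject a missing value, an empty string, or the literal string "None" without making a remote request.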
Included in May 2026 Release PR #1992. Closing.
The scraper calls page.text but page is a string, not a Response object, so it throws an AttributeError. Switched to making a direct requests.get() with the UPRN embedded in the URL and added skip_get_url + uprn to input.json.
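The failure described above can be reproduced without network access, since a plain `str` has no `.text` attribute. The sketch below shows the error, plus a hypothetical `html_from` helper that accepts either a string or a Response-like object. It is an illustration only and not the approach taken in this PR, which fetches the page itself; `FakeResponse` is a stand-in invented for the example:

```python
def html_from(page) -> str:
    """Return HTML from either a raw string or an object exposing .text.

    Hypothetical helper for illustration; not part of the project.
    """
    if isinstance(page, str):
        return page
    return page.text


class FakeResponse:
    """Stand-in for a Response-like object, used to avoid a network call."""

    def __init__(self, text: str) -> None:
        self.text = text


raw = "<html><body>bins</body></html>"

# Accessing .text on a str raises AttributeError, which is the reported bug.
try:
    raw.text
except AttributeError as exc:
    error_message = str(exc)

# The helper handles both input shapes.
from_string = html_from(raw)
from_response = html_from(FakeResponse(raw))
```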
Tested against UPRN 10006545854 (Sleaford area) on 2026-04-30.