fix: add User-Agent header to KingsLynnandWestNorfolkBC scraper #1733
Conversation
The Kings Lynn and West Norfolk council scraper was returning empty bin data because the website (https://www.west-norfolk.gov.uk) was blocking requests that lacked a proper User-Agent header, resulting in an HTTP 403 Forbidden error.

Root cause:
- The scraper was sending HTTP requests with only a Cookie header
- The council website's server requires a User-Agent header to identify the client
- Without this header, the server rejected the request with HTTP 403 Forbidden
- This caused BeautifulSoup to parse an error page instead of bin collection data
- The scraper found zero bin_date_container divs, resulting in an empty bins array

Solution (illustrated in the sketch below):
- Added a standard Chrome User-Agent string to the request headers
- The website now accepts the request and returns the expected HTML content
- The scraper successfully parses bin collection dates from the response

Testing:
- Verified with a test UPRN: now returns bin collections successfully
- Integration test passes
- All unit tests continue to pass (76/77; the single failure is an unrelated Chrome driver issue)
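For illustration, a minimal sketch of the failure mode and the fix, assuming the `requests` library and BeautifulSoup; the endpoint URL, UPRN value, and exact parsing call are placeholders, not taken verbatim from the scraper:

```python
import requests
from bs4 import BeautifulSoup

# Placeholder endpoint: the real scraper targets a page on
# https://www.west-norfolk.gov.uk (exact path not shown in this PR).
URL = "https://www.west-norfolk.gov.uk"
UPRN = "100012345678"  # hypothetical UPRN, for illustration only

# Before the fix: Cookie-only headers. The council server rejected such
# requests with HTTP 403 Forbidden, so BeautifulSoup ended up parsing an
# error page and found zero bin_date_container divs.
before = requests.get(URL, headers={"Cookie": f"bcklwn_uprn={UPRN}"})

# After the fix: adding a standard Chrome User-Agent string makes the
# server return the expected HTML containing the bin collection dates.
after = requests.get(
    URL,
    headers={
        "Cookie": f"bcklwn_uprn={UPRN}",
        "User-Agent": (
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
            "AppleWebKit/537.36 (KHTML, like Gecko) "
            "Chrome/91.0.4472.124 Safari/537.36"
        ),
    },
)
soup = BeautifulSoup(after.text, "html.parser")
containers = soup.find_all("div", {"class": "bin_date_container"})
```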
Walkthrough

A User-Agent header is added to the HTTP request in the Kings Lynn and West Norfolk Borough Council bin collection module, while preserving the existing Cookie header carrying the UPRN value. There are no changes to parsing, query logic, or data extraction.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~5 minutes
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 0
🧹 Nitpick comments (1)
uk_bin_collection/uk_bin_collection/councils/KingsLynnandWestNorfolkBC.py (1)
25-28: User-Agent header fix looks good; consider extracting the UA to a constant.

The added User-Agent together with the existing Cookie header is a sensible, minimal fix for the 403s. To make future maintenance easier (and to reuse this UA across any other scrapers if needed), you could move the UA string to a module-level constant and reference it here:
```diff
@@
-from uk_bin_collection.uk_bin_collection.common import *
-from uk_bin_collection.uk_bin_collection.get_bin_data import AbstractGetBinDataClass
-
-
-# import the wonderful Beautiful Soup and the URL grabber
-class CouncilClass(AbstractGetBinDataClass):
+from uk_bin_collection.uk_bin_collection.common import *
+from uk_bin_collection.uk_bin_collection.get_bin_data import AbstractGetBinDataClass
+
+USER_AGENT = (
+    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
+    "AppleWebKit/537.36 (KHTML, like Gecko) "
+    "Chrome/91.0.4472.124 Safari/537.36"
+)
+
+
+# import the wonderful Beautiful Soup and the URL grabber
+class CouncilClass(AbstractGetBinDataClass):
@@
-        headers = {
-            "Cookie": f"bcklwn_uprn={user_uprn}",
-            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
-        }
+        headers = {
+            "Cookie": f"bcklwn_uprn={user_uprn}",
+            "User-Agent": USER_AGENT,
+        }
```
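For context, a minimal sketch of how a request helper might use the extracted constant; the function name and URL parameter are hypothetical, and only the headers dict mirrors the diff above:

```python
import requests

USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/91.0.4472.124 Safari/537.36"
)


def fetch_collection_page(url: str, user_uprn: str) -> str:
    """Hypothetical helper: fetch a council page with the headers from the diff."""
    headers = {
        "Cookie": f"bcklwn_uprn={user_uprn}",
        "User-Agent": USER_AGENT,
    }
    response = requests.get(url, headers=headers)
    response.raise_for_status()  # a missing User-Agent previously surfaced here as a 403
    return response.text
```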
Any update on this please? Keen to get it implemented to fix my local council. Thank you.