fix: fix UPRN param encoding for SouthamptonCityCouncil#1722
Conversation
WalkthroughReplaced set-like UPRN parameter with a plain string in the request and added explicit regex matching and an error guard when extracting the calendar view to avoid NoneType subscripting. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Parser as SouthamptonCityCouncil.parse_data
participant API as Waste Calendar API
Note over Parser,API: Request UPRN formatting fix
Parser->>API: POST /calendar {"UPRN": "100060745730"}
API-->>Parser: HTML response
Note over Parser: Calendar extraction with regex and guard
Parser->>Parser: re.search(calendar_pattern, html) -> calendar_match
alt match found
Parser->>Parser: use calendar_match.group(0) -> parse events
else no match
Parser->>Parser: raise ValueError("Could not find calendar view")
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Poem
Pre-merge checks and finishing touches✅ Passed checks (5 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## dec_release #1722 +/- ##
============================================
Coverage 86.79% 86.79%
============================================
Files 9 9
Lines 1136 1136
============================================
Hits 986 986
Misses 150 150 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
uk_bin_collection/uk_bin_collection/councils/SouthamptonCityCouncil.py (1)
68-70: Add error handling for failed regex match.While the line 57 fix should resolve the immediate issue, there's no defensive error handling here. If
re.search()returnsNone(e.g., if the council website HTML structure changes), attempting to subscript with[0]will raise the same 'NoneType' object is not subscriptable error reported in issue #1719.Apply this diff to add proper error handling:
- calendar_view_only = re.search( - r"#calendar1.*?listView", r.text, flags=re.DOTALL - )[0] + calendar_match = re.search( + r"#calendar1.*?listView", r.text, flags=re.DOTALL + ) + if not calendar_match: + raise ValueError( + "Unable to find calendar view in response. The council website structure may have changed." + ) + calendar_view_only = calendar_match.group(0)
🧹 Nitpick comments (1)
uk_bin_collection/uk_bin_collection/councils/SouthamptonCityCouncil.py (1)
74-82: Consider adding error handling for date parsing.While the regex pattern on line 37 validates the date format, if the council website changes its date format, the
strptime()call on line 78 could raise aValueError. Consider wrapping the date parsing in a try-except block to provide a more informative error message.Example:
for item in results: - - dict_data = { - "type": item[0], - "collectionDate": datetime.strptime(item[1], "%m/%d/%Y").strftime( - "%d/%m/%Y" - ), - } - bindata["bins"].append(dict_data) + try: + dict_data = { + "type": item[0], + "collectionDate": datetime.strptime(item[1], "%m/%d/%Y").strftime( + "%d/%m/%Y" + ), + } + bindata["bins"].append(dict_data) + except ValueError as e: + raise ValueError( + f"Failed to parse date '{item[1]}' for waste type '{item[0]}': {e}" + )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
uk_bin_collection/uk_bin_collection/councils/SouthamptonCityCouncil.py(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Run Unit Tests (3.12, 1.8.4)
- GitHub Check: Run Integration Tests (3.12, 1.8.4)
🔇 Additional comments (1)
uk_bin_collection/uk_bin_collection/councils/SouthamptonCityCouncil.py (1)
56-58: Critical fix correctly applied!The change from
{"UPRN": {user_uprn}}to{"UPRN": user_uprn}is the correct fix. The old code was constructing a set ({user_uprn}) instead of passing the string directly, which would have caused the UPRN to be serialized incorrectly in the URL parameters. This prevented the server from returning valid HTML, leading to the downstream 'NoneType' error.
|
I have no idea how to appease this CI. It's a black box. |
Signed-off-by: Tim Collins <tim@thecollins.team> formatting Signed-off-by: Tim Collins <tim@thecollins.team> chore: trigger CI for SouthamptonCityCouncil fix
a7e8b77 to
8e7fdb4
Compare
Signed-off-by: Tim Collins <tim@thecollins.team>
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
uk_bin_collection/uk_bin_collection/councils/SouthamptonCityCouncil.py (1)
5-5: Consider adding explicit import forremodule.The static analysis tool flags that
remay be undefined due to the star import on line 5. While this is a pre-existing issue not introduced by this PR, adding an explicitimport restatement would improve code clarity and eliminate the static analysis warning.Apply this diff to add an explicit import:
import time import requests +import re from uk_bin_collection.uk_bin_collection.common import *
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
uk_bin_collection/uk_bin_collection/councils/SouthamptonCityCouncil.py(4 hunks)
🧰 Additional context used
🪛 Ruff (0.14.4)
uk_bin_collection/uk_bin_collection/councils/SouthamptonCityCouncil.py
68-68: re may be undefined, or defined from star imports
(F405)
69-69: re may be undefined, or defined from star imports
(F405)
72-74: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (3)
uk_bin_collection/uk_bin_collection/councils/SouthamptonCityCouncil.py (3)
56-58: Critical bug fix: Correct UPRN parameter encoding.The old code
{"UPRN": {user_uprn}}created a set object as the parameter value due to Python's set literal syntax. Whenrequests.getserialized this to a query string, it would produce malformed output that the council website couldn't parse correctly. This was the root cause of issue #1719.The fix correctly passes the UPRN as a plain string value.
68-75: Good defensive programming: Added error guard for calendar view extraction.The new code properly checks if the regex search returns None before attempting to access the match result. This prevents the 'NoneType' object is not subscriptable error reported in issue #1719 when the calendar view is not found in the response.
The error message is clear and helps diagnose issues if the council website structure changes.
17-93: Request verification: Confirm fix resolves issue #1719.The author was unable to test locally due to CAPTCHA issues. Please verify that these changes successfully resolve the setup failure reported in issue #1719 using the example UPRN
100060745730or similar test data.The two fixes (UPRN parameter encoding + error guard) should address the root cause, but confirmation with real data is essential.
If you have access to test the integration, you can verify by:
- Setting up the Southampton City Council integration with a valid UPRN
- Confirming that setup completes without the 'NoneType' object is not subscriptable error
- Verifying that bin collection data is retrieved successfully
I'm hoping this will fix #1719
I have been unable to test locally due to Captcha issues.
Summary by CodeRabbit