@equeim (Contributor) commented Feb 14, 2025

This PR mainly does two things:

  1. Makes the search for the <head> tag case-insensitive, since HTML tag names are not guaranteed to be lowercase.
  2. Adds support for non-UTF-8 encodings by using ResponseBody.charStream(), which automatically handles the charset parameter in the Content-Type header.
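
A minimal sketch of the first point, assuming a plain string search (the PR's actual code in HtmlParser.kt is not shown here, and `indexOfHeadTag` is a hypothetical name). Kotlin's `String.indexOf` accepts an `ignoreCase` flag, so `<head>`, `<HEAD>` and `<Head>` are all found:

```kotlin
// Hypothetical helper, not the PR's code: returns the index where the <head> tag
// starts, ignoring the case of the tag name, or -1 if there is none.
// Searching for "<head" (no closing bracket) also matches tags with attributes,
// but note that it would match "<header" as well.
fun indexOfHeadTag(html: String): Int =
    html.indexOf("<head", ignoreCase = true)

fun main() {
    println(indexOfHeadTag("<html><head><title>t</title></head></html>")) // 6
    println(indexOfHeadTag("<HTML><HEAD><TITLE>t</TITLE></HEAD></HTML>")) // 6
    println(indexOfHeadTag("<p>no head here</p>"))                        // -1
}
```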

Also removes unnecessary close() calls.
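
To illustrate why an explicit close() is redundant inside a `use {}` block, here is a small sketch with a hypothetical `TrackedResource` class (not part of the PR) that records whether it was closed:

```kotlin
import java.io.Closeable

// Hypothetical resource that records whether close() was called.
class TrackedResource : Closeable {
    var closed = false
        private set

    fun read(): String = "data"

    override fun close() {
        closed = true
    }
}

fun main() {
    val resource = TrackedResource()
    // use {} closes the resource when the block exits, whether normally or by exception.
    val value = resource.use { it.read() }
    println(value)           // data
    println(resource.closed) // true
}
```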

OkHttp already parses it into a MediaType object, so make use of it.
use {} already takes care of that.
HTML tags can be capitalized.
…encodings

The charStream() function creates a Reader with the charset from the Content-Type header (if the system supports it).
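
The idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not OkHttp's implementation: `charsetFromContentType` and `decode` are hypothetical helpers, the header parsing is deliberately simplified, and the real charStream() also handles details such as a byte order mark, which this sketch ignores.

```kotlin
import java.io.ByteArrayInputStream
import java.nio.charset.Charset

// Hypothetical helper: extracts the charset parameter from a Content-Type value,
// falling back to UTF-8 when it is missing or not supported by the JVM.
fun charsetFromContentType(contentType: String?): Charset {
    val name = contentType
        ?.split(';')
        ?.drop(1) // skip the media type itself, e.g. "text/html"
        ?.map { it.trim() }
        ?.firstOrNull { it.startsWith("charset=", ignoreCase = true) }
        ?.substringAfter('=')
        ?.trim('"', ' ')
    return try {
        if (name != null && Charset.isSupported(name)) Charset.forName(name) else Charsets.UTF_8
    } catch (e: IllegalArgumentException) {
        // Thrown for syntactically illegal charset names.
        Charsets.UTF_8
    }
}

// Hypothetical helper: decodes a body through a Reader built with the header's charset.
fun decode(bytes: ByteArray, contentType: String?): String =
    ByteArrayInputStream(bytes).reader(charsetFromContentType(contentType)).use { it.readText() }

fun main() {
    val bytes = "café".toByteArray(Charsets.ISO_8859_1)
    println(decode(bytes, "text/html; charset=ISO-8859-1"))  // café
    println(decode(bytes, "text/html") == "café")            // false: decoded as UTF-8
}
```

Decoding the same bytes without the charset parameter falls back to UTF-8 and garbles the non-ASCII character, which is the failure mode that hard-coded UTF-8 decoding produces for pages in other encodings.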

codecov bot commented Feb 15, 2025

Codecov Report

Attention: Patch coverage is 42.85714% with 4 lines in your changes missing coverage. Please review.

Project coverage is 19.70%. Comparing base (9df5ca1) to head (0e5d682).
Report is 2 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...src/main/java/com/readrops/api/utils/HtmlParser.kt | 50.00% | 0 Missing and 3 partials ⚠️ |
| ...i/src/main/java/com/readrops/api/utils/ApiUtils.kt | 0.00% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@              Coverage Diff              @@
##             develop     #273      +/-   ##
=============================================
- Coverage      19.77%   19.70%   -0.08%     
- Complexity       448      450       +2     
=============================================
  Files            190      190              
  Lines          10002     9998       -4     
  Branches        1564     1564              
=============================================
- Hits            1978     1970       -8     
- Misses          7908     7909       +1     
- Partials         116      119       +3     


@Shinokuni (Member)

LGTM, also thanks for the tests!

@Shinokuni Shinokuni merged commit 02621a1 into readrops:develop Feb 15, 2025
1 check passed
@equeim equeim deleted the fix-html-parsing branch February 15, 2025 16:45