add last1K limit to readNextEndLine #439

jonnythebard · 2018-06-30T05:05:10Z

I've been processing over 50k PDF files with PyPDF2 for the last several weeks and found that it isn't filtering out some malformed PDF files. The problem with the malformed files was that they had the %%EOF marker at the beginning, followed by 30M bytes of b'\x00'. The current version of PyPDF2 tries to travel all the way through the 30M bytes of b'\x00' to find %%EOF. Since the %%EOF marker should appear in the last 1K of the file, I thought it would make sense to add a last1K limit to the readNextEndLine function. I applied this in my application and it works fine.
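A minimal sketch of the idea, for illustration only: it is not the code changed in this pull request and it does not reproduce readNextEndLine. The function name `find_eof_in_tail` and the `window` parameter are hypothetical. It assumes a seekable binary stream and looks for the marker only in the final 1024 bytes, so a file that has %%EOF at the start followed by a long run of null bytes is rejected without scanning the whole file.

```python
import io

EOF_MARKER = b"%%EOF"
LAST_1K = 1024  # assumed search window, taken from the "last 1K" rule described above


def find_eof_in_tail(stream, window=LAST_1K):
    """Return the absolute offset of the last %%EOF within the final
    `window` bytes of `stream`, or -1 if the marker is not there.

    Hypothetical helper: only the tail of the stream is read, so the
    cost does not grow with the size of the file.
    """
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    start = max(0, size - window)
    stream.seek(start)
    tail = stream.read(size - start)
    pos = tail.rfind(EOF_MARKER)
    return -1 if pos == -1 else start + pos


# A malformed file like the one described: marker first, then null bytes.
malformed = io.BytesIO(b"%%EOF" + b"\x00" * 100_000)

# A stand-in for a well-formed file: the marker sits at the very end.
well_formed = io.BytesIO(b"%PDF-1.4\n" + b"x" * 5000 + b"\nstartxref\n0\n%%EOF\n")

print(find_eof_in_tail(malformed))    # marker is outside the window
print(find_eof_in_tail(well_formed))  # offset of the trailing marker
```

A caller could treat a result of -1 as "EOF marker not found" and report the file as malformed instead of continuing to scan backwards.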

elyssonmr · 2018-10-04T17:44:43Z

Looking forward to this pull request being accepted.

MartinThoma · 2022-04-09T06:39:44Z

Have you seen #642? What do you think about it?

jonnythebard · 2022-04-09T06:51:36Z

@MartinThoma Yes, it looks much better than mine, because my commit replaces a condition in the if statement instead of adding one. I didn't notice my mistake. Glad that someone is finally aware of this issue though 😂

codecov-commenter · 2022-04-16T05:33:36Z

Codecov Report

Merging #439 (8c2cc97) into main (d5a5eea) will not change coverage.
The diff coverage is 80.00%.

@@           Coverage Diff           @@
##             main     #439   +/-   ##
=======================================
  Coverage   70.59%   70.59%           
=======================================
  Files          10       10           
  Lines        3425     3425           
  Branches      798      798           
=======================================
  Hits         2418     2418           
  Misses        763      763           
  Partials      244      244

| Impacted Files | Coverage Δ |
|---|---|
| PyPDF2/pdf.py | `72.42% <80.00%> (ø)` |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d5a5eea...8c2cc97.

add last1K limit to readNextEndLine

bd3ae44

jonnythebard closed this Jun 30, 2018

jonnythebard reopened this Jun 30, 2018

jonnythebard force-pushed the master branch 2 times, most recently from 272851b to bd3ae44 December 19, 2019 04:54

rltpoa mentioned this pull request Oct 9, 2021

Fix reading more than last1K for EOF #642

Merged

MartinThoma added the is-bug (from a user's perspective, this is a bug: a violation of the expected behavior with a compliant PDF) and Tiny (pull requests that make a tiny change and thus should be easy to merge) labels Apr 6, 2022

MartinThoma linked an issue Apr 9, 2022 that may be closed by this pull request

PdfFileReader keep looking for "%%EOF" on more than the last 1024 bytes of stream in malformed PDF files #639

Closed

Merge branch 'main' into master

a50d79e

Merge branch 'main' into master

8c2cc97

MartinThoma closed this in 03ea3ec Apr 16, 2022
