Closed
Labels
PdfWriter: The PdfWriter component is affected
is-bug: From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF
Description
STEPS TO REPRODUCE:
- Download the test PDF file:
  https://suokunlong.cn/owncloud/index.php/s/bWyTHYfoMii3Yh9
  The file is named 2017-Textbook-EconomicLaw.pdf; it is 65.1 MB and has 511 pages.
- Run the following code:
from PyPDF2 import PdfFileWriter, PdfFileReader

pdf_in_filename = r"/path/to/2017-Textbook-EconomicLaw.pdf"
pdf_out_filename = r"/path/to/2017-Textbook-EconomicLaw-new.pdf"

pdf_out = PdfFileWriter()
pdf_in = PdfFileReader(open(pdf_in_filename, 'rb'))
numpages = pdf_in.getNumPages()

# Copy every page of the input into the writer.
for i in range(numpages):
    pdf_out.addPage(pdf_in.getPage(i))

with open(pdf_out_filename, 'wb') as outputStream:
    pdf_out.write(outputStream)  # never returns for this file

- The code runs forever on the last line (the call to pdf_out.write).
OTHER USEFUL INFORMATION:
- I noticed that if I change the line:
      for i in range(numpages):
  to:
      for i in range(3):
  then the output file is written very quickly.
- I also noticed that if I open the test PDF file with Evince on my Linux desktop and print it to a new PDF file, the above code finishes within 5 s on that new file.
PyPDF2.__version__
'1.25.1'
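To help narrow this down, here is a minimal sketch that times the write for increasing page counts. It builds on the observation above that three pages are written quickly while all 511 are not. The helper names `time_writes` and `write_first_n_pages_pypdf2` are made up for illustration. The PyPDF2 calls are the same legacy 1.x ones used in the reproduction code, and the input path is the same placeholder.

```python
import io
import time


def time_writes(write_first_n_pages, page_counts):
    """Return (page_count, seconds) pairs, one per requested page count.

    write_first_n_pages is any callable taking a page count; it is a
    hypothetical hook so the timing loop does not depend on PyPDF2.
    """
    timings = []
    for n in page_counts:
        start = time.perf_counter()
        write_first_n_pages(n)
        timings.append((n, time.perf_counter() - start))
    return timings


def write_first_n_pages_pypdf2(n, pdf_in_filename="/path/to/2017-Textbook-EconomicLaw.pdf"):
    """Copy the first n pages into an in-memory PDF (legacy PyPDF2 1.x API)."""
    # Imported lazily so time_writes can be used without PyPDF2 installed.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    with open(pdf_in_filename, "rb") as input_stream:
        pdf_in = PdfFileReader(input_stream)
        pdf_out = PdfFileWriter()
        for i in range(n):
            pdf_out.addPage(pdf_in.getPage(i))
        # Write to memory so disk speed does not affect the measurement.
        # The input file must still be open here, since pages are read lazily.
        pdf_out.write(io.BytesIO())
```

Calling `time_writes(write_first_n_pages_pypdf2, [3, 10, 30, 100])` and printing the result should show whether the write time grows roughly linearly with the page count or much faster. The page counts here are arbitrary; start small, since the larger ones may take a long time with this file.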