All images are included in split up documents #314

itb-dev-sn · 2025-10-17T12:02:52Z

Hi,

We have a problem when splitting a PDF into several smaller PDF files.
We use essentially the code below to split the PDF according to the provided page ranges:

// Requires the PDFsharp package. inputFile and targetDirectory are defined elsewhere;
// PageRange (with LowerBound/UpperBound) is not defined in this snippet.
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

var filenameBase = Path.GetFileNameWithoutExtension(inputFile);
var pageRanges = new List<PageRange>() {
  new PageRange() { LowerBound = 1, UpperBound = 2 },
  new PageRange() { LowerBound = 3, UpperBound = 4 },
  ...
};

// Import mode is required to copy pages into another document.
using var inputDocument = PdfReader.Open(inputFile, PdfDocumentOpenMode.Import);

foreach (var range in pageRanges.Select((pageRange, index) => new { Pages = pageRange, Index = index }))
{
  // One output document per page range; disposed at the end of each iteration.
  using var outputDocument = new PdfDocument();
  outputDocument.Options.UseFlateDecoderForJpegImages = PdfUseFlateDecoderForJpegImages.Automatic;
  outputDocument.Options.FlateEncodeMode = PdfFlateEncodeMode.BestCompression;
  outputDocument.Options.EnableCcittCompressionForBilevelImages = true;
  outputDocument.Options.CompressContentStreams = true;
  outputDocument.Options.NoCompression = false;
  outputDocument.Version = inputDocument.Version;
  var pageInfo = $"Pages {range.Pages.LowerBound}-{range.Pages.UpperBound}";
  outputDocument.Info.Title = $"{pageInfo} of {inputDocument.Info.Title}";
  outputDocument.Info.Creator = inputDocument.Info.Creator;

  // Page ranges are 1-based and inclusive; the Pages collection is 0-based.
  for (var pageNumber = range.Pages.LowerBound; pageNumber <= range.Pages.UpperBound; pageNumber++)
  {
    outputDocument.AddPage(inputDocument.Pages[pageNumber - 1]);
  }

  var outputPath = Path.Combine(targetDirectory, $"{filenameBase}_part_{range.Index + 1}.pdf");
  outputDocument.Save(outputPath);
}

This creates several documents, each containing only the pages of its page range. However, some of the output documents are almost as large as the original document.
I am not familiar with the PDF specs, but in the debugger I could see that some of the pages seem to have references to all images contained in the original document, even if only some of them are visible, e.g. in a PDF reader. It seems that all of these images are then included in the split output document, even if they are not required for its pages.
When we split the file with other tools, they drop all references that are not actually used on the split pages.
The document Sample Catalog.pdf shows this behavior.

Is this intentional behavior?
Is there any way such that only required images are included in the output documents?

Thank you!

ThomasHoevel · 2025-10-20T11:05:10Z

Maintainer

Is this intentional behavior? Is there any way such that only required images are included in the output documents?

The creators of that PDF document took a simple approach: there is only a single resource dictionary that lists all images, and that dictionary is used for all pages.

PDFsharp does not analyze the contents of a page; it includes all images listed in the page's resources.

Normally, PDF pages should only list the resources they actually use. In that case, PDFsharp will only include the images needed by the imported pages.
Other tools may be smarter and analyze the page contents. PDFsharp does not do that yet, and it is not planned for the near future.
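
To illustrate what "analyzing the page contents" could look like, here is a minimal sketch in plain C# with no PDFsharp dependency. It assumes you already have a page's decoded content stream as text and the list of names from the page's XObject resources; how you obtain both is left open. The class `XObjectUsage` and the method `FindUnused` are hypothetical names made up for this example and are not part of PDFsharp. The sketch looks for `/Name Do` sequences, where `Do` is the PDF operator that paints an XObject such as an image, and reports the resource names that are never painted.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of PDFsharp.
public static class XObjectUsage
{
    // Matches a name operand followed by the Do operator, e.g. "/Im1 Do".
    // The lookahead ensures "Do" is a complete token (not "Done", for example).
    private static readonly Regex DoOperator = new Regex(
        @"/([^\s/\[\]()<>{}%]+)\s+Do(?![^\s/\[\]()<>{}%])",
        RegexOptions.Compiled);

    // Returns the resource names that are never painted by the content stream.
    // resourceNames: XObject names without the leading slash (assumed input).
    // contentStream: decoded (uncompressed) page content as text (assumed input).
    public static ISet<string> FindUnused(IEnumerable<string> resourceNames, string contentStream)
    {
        var used = new HashSet<string>();
        foreach (Match match in DoOperator.Matches(contentStream))
        {
            used.Add(match.Groups[1].Value);
        }

        var unused = new HashSet<string>(resourceNames);
        unused.ExceptWith(used);
        return unused;
    }
}
```

For example, calling `XObjectUsage.FindUnused(new[] { "Im1", "Im2", "Im3" }, "q 200 0 0 100 50 700 cm /Im1 Do Q q /Im3 Do Q")` reports `Im2` as the only unused name. A text scan like this is deliberately simple: it ignores form XObjects that paint further images, names written with `#` escapes, and pages whose content is spread over several streams, so a real tool would need a proper content stream parser. A match inside a string literal can only cause an image to be kept unnecessarily, never dropped.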
