Stages
Brian "Moses" Hall edited this page Nov 3, 2021 · 4 revisions
- Verifies that the shipment contains at least one objid and records objids in shipment.metadata[:initial_barcodes].
- Iterates non-directory files at the shipment's top level and deals with those that can be ignored or deleted; anything else causes an error.
- Runs a well-formedness check on each objid (for non-DLXS objects this is a Luhn check).
- Iterates each objid directory, checking for unknown (non-ignorable, non-deletable, non-image) files or directories. TIFF and JP2 files must conform to the 8.3 lowercase naming convention.
- Bails out if an error is detected at this point.
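The Luhn check mentioned above can be sketched in a few lines of Ruby. This is a minimal illustration of the standard algorithm, not the project's own code: the method name `luhn_valid?` is hypothetical, and it assumes the objid is a plain string of decimal digits whose last digit is the check digit.

```ruby
# Hypothetical helper: returns true if a digit string passes the Luhn check.
# Assumes the rightmost digit is the check digit.
def luhn_valid?(barcode)
  return false unless barcode.match?(/\A\d+\z/)

  digits = barcode.reverse.each_char.with_index.map do |char, index|
    digit = char.to_i
    # Double every second digit, counting from the right; the check digit
    # itself (index 0) is left alone.
    if index.odd?
      digit *= 2
      digit -= 9 if digit > 9
    end
    digit
  end
  (digits.sum % 10).zero?
end
```

Rejecting non-digit input up front keeps a stray directory name from being treated as a barcode that merely fails the checksum.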
- Creates and populates the source directory for image masters, if it doesn't already exist.
Iterates all TIFF and JP2 files in the shipment, making sure bits per sample, samples per pixel, and resolution conform to spec.
Detects skipped and duplicate pagination (the latter should not happen).
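As a rough sketch of how skipped and duplicate pages might be detected: the method name `pagination_problems` is hypothetical, and the code assumes image files are named by zero-padded sequence number starting at 1 (for example 00000001.tif), which the page does not spell out. It needs Ruby 2.7 or later for `tally`.

```ruby
# Hypothetical helper: reports missing and duplicated page sequence numbers.
# Assumes file names are zero-padded sequence numbers plus an extension.
def pagination_problems(filenames)
  sequences = filenames.map { |name| File.basename(name, ".*").to_i }
  # A sequence number is duplicated when it appears under more than one file,
  # for example as both a TIFF and a JP2.
  duplicates = sequences.tally.select { |_seq, count| count > 1 }.keys.sort
  expected = sequences.empty? ? [] : (1..sequences.max).to_a
  { missing: expected - sequences, duplicates: duplicates }
end
```

For the list `00000001.tif`, `00000002.tif`, `00000002.jp2`, `00000004.tif`, this reports page 3 as missing and page 2 as duplicated.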
Uses the external tiffset program (via lib/tiff.rb) to add metadata to all TIFF files in the shipment.
- Sets the 274 orientation tag (to 1) and the 315 artist tag (to dcu) by default.
- Supports options like --tagger-scanner=X and --tagger-software=X for further customization. See lib/tag_data.rb for a complete list of artist/make/model/software codes.
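The default tagging above corresponds to two tiffset calls per file, using tiffset's `-s tag value file` form. The sketch below only illustrates that shape; it is not the contents of lib/tiff.rb, and the names `DEFAULT_TAGS`, `tiffset_commands`, and `apply_tags` are hypothetical.

```ruby
# Hypothetical defaults: TIFF tag number => value, per the list above.
DEFAULT_TAGS = { 274 => "1", 315 => "dcu" }.freeze

# Builds one tiffset argument list per tag without running anything.
def tiffset_commands(path, tags = DEFAULT_TAGS)
  tags.map { |tag, value| ["tiffset", "-s", tag.to_s, value, path] }
end

# Runs the commands. Passing an argument list to system bypasses the shell,
# so file names with spaces or metacharacters need no quoting.
def apply_tags(path, tags = DEFAULT_TAGS)
  tiffset_commands(path, tags).each do |command|
    system(*command) || raise("tiffset failed: #{command.join(' ')}")
  end
end
```

Keeping command construction separate from execution makes the argument lists easy to inspect or test without libtiff installed.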
Compresses bitonal TIFF files and converts contone TIFF files to JP2. This is a complex class with heavy reliance on third-party executables: tiffinfo, tiffset, exiftool, ImageMagick, Kakadu.
Note: this stage is only run when invoked with --config-profile=dlxs.
Further compresses JP2 files into bitonal TIFFs and renames JP2s with a "p" prefix.
- Runs JHOVE against each objid (config key feed_validate_script points to the Perl code in the HathiTrust feed repo).
- Checks shipment.metadata[:initial_barcodes] objid list to make sure nothing was deleted in the course of processing.
- Runs fixity check with SHA checksums of each file in source against shipment metadata (from Preflight) for added/removed/modified source files.
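The fixity check in the last item amounts to comparing two maps of path to checksum. The following is a minimal sketch under stated assumptions: the page says only "SHA", so SHA-256 is a guess; the method names `checksums_for` and `fixity_changes` are hypothetical; and the way checksums are actually stored in the shipment metadata is not shown here.

```ruby
require "digest"

# Hypothetical helper: maps each file under dir (by relative path) to its
# SHA-256 hex digest. SHA-256 is an assumption; the page says only "SHA".
def checksums_for(dir)
  Dir.glob("**/*", base: dir).sort.each_with_object({}) do |relative, sums|
    full = File.join(dir, relative)
    sums[relative] = Digest::SHA256.file(full).hexdigest if File.file?(full)
  end
end

# Compares checksums recorded earlier against freshly computed ones.
def fixity_changes(recorded, current)
  shared = recorded.keys & current.keys
  {
    added: (current.keys - recorded.keys).sort,
    removed: (recorded.keys - current.keys).sort,
    modified: shared.reject { |path| recorded[path] == current[path] }.sort
  }
end
```

If all three lists come back empty, the source files are unchanged since the checksums were recorded.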