A robust Java CLI application for batch processing customer CSV files with validation, transformation, and error reporting capabilities. Perfect for data migration, ETL pipelines, and data quality assurance workflows.
- Features
- Quick Start
- Installation
- Usage
- Data Validation Rules
- Output Examples
- Command Line Options
- Exit Codes
- Troubleshooting
- Batch Processing: Process multiple CSV files in one operation
- Data Validation: Comprehensive validation with detailed error reporting
- Data Transformation: Clean and standardize customer data
- Error Reporting: Separate files for valid data and validation errors
- File Archiving: Automatically archive processed files
- Performance Metrics: Processing statistics and timing information
- Error Handling: Graceful handling of malformed data and edge cases
- Java 17 or higher
- Maven 3.9 or higher
git clone https://github.com/PixelPerfectDesigns/batch-file-processor.git
cd batch-file-processor
mvn clean package

mkdir -p data/in data/out data/archive

# Create a test CSV file
cat > data/in/customers.csv << EOF
customer_id,full_name,email,signup_date
123,john doe,JOHN.DOE@gmail.com,2023-01-15
456,jane smith,jane.smith@example.com,2023-02-20
abc,bob johnson,invalid-email,2023-03-10
789,alice brown,alice@company.org,2023-04-05
EOF

java -jar target/batch-file-processor.jar \
  --input ./data/in \
  --output ./data/out \
  --archive ./data/archive \
  --pattern "*.csv" \
  --failOnError=false

# Clone the repository
git clone https://github.com/PixelPerfectDesigns/batch-file-processor.git
cd batch-file-processor
# Run tests
mvn clean verify
# Build executable JAR
mvn clean package

The executable JAR will be created at `target/batch-file-processor.jar`.

java -jar target/batch-file-processor.jar [OPTIONS]

java -jar target/batch-file-processor.jar \
  --input ./data/input \
  --output ./data/processed \
  --archive ./data/archive \
  --pattern "customer*.csv" \
  --failOnError=true

Your CSV files must include these header columns:
| Column | Type | Description | Example |
|---|---|---|---|
| `customer_id` | Integer | Unique customer identifier (positive) | 12345 |
| `full_name` | String | Customer's full name (2-80 chars) | John Smith |
| `email` | String | Valid email address | john.smith@email.com |
| `signup_date` | Date | Account signup date (ISO format) | 2023-01-15 |
customer_id,full_name,email,signup_date
123,john doe,JOHN.DOE@gmail.com,2023-01-15
456,jane smith,jane.smith@example.com,2023-02-20
789,bob johnson,bob.johnson@company.org,2023-03-10

The processor validates each record against these rules:

customer_id
- Required: Must not be empty
- Format: Must be a valid integer
- Range: Must be greater than 0

full_name
- Required: Must not be empty
- Length: Must be between 2 and 80 characters

email
- Required: Must not be empty
- Format: Must contain '@' and not end with '@'
- Structure: Basic email format validation

signup_date
- Required: Must not be empty
- Format: Must be a valid date in ISO format (for example, 2023-01-15)
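
The rules above can be sketched in plain Java. This is an illustration only, not the project's actual code: the class name `CustomerRowValidator`, the method `validate`, and the "Must not be empty" style messages are assumptions made up for this example. Only "Must be an integer", "Invalid format", and "Invalid date format" appear in the sample error report.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not taken from the project's source.
class CustomerRowValidator {

    // Returns "field: message" strings; an empty list means the row passed every rule.
    static List<String> validate(String customerId, String fullName,
                                 String email, String signupDate) {
        List<String> errors = new ArrayList<>();

        // customer_id: required, integer, greater than 0
        if (isBlank(customerId)) {
            errors.add("customer_id: Must not be empty");
        } else {
            try {
                if (Integer.parseInt(customerId.trim()) <= 0) {
                    errors.add("customer_id: Must be greater than 0");
                }
            } catch (NumberFormatException ex) {
                errors.add("customer_id: Must be an integer");
            }
        }

        // full_name: required, 2 to 80 characters
        if (isBlank(fullName)) {
            errors.add("full_name: Must not be empty");
        } else {
            int length = fullName.trim().length();
            if (length < 2 || length > 80) {
                errors.add("full_name: Must be between 2 and 80 characters");
            }
        }

        // email: required, contains '@', does not end with '@'
        if (isBlank(email)) {
            errors.add("email: Must not be empty");
        } else {
            String trimmed = email.trim();
            if (!trimmed.contains("@") || trimmed.endsWith("@")) {
                errors.add("email: Invalid format");
            }
        }

        // signup_date: required, ISO date (LocalDate.parse expects yyyy-MM-dd)
        if (isBlank(signupDate)) {
            errors.add("signup_date: Must not be empty");
        } else {
            try {
                LocalDate.parse(signupDate.trim());
            } catch (DateTimeParseException ex) {
                errors.add("signup_date: Invalid date format");
            }
        }
        return errors;
    }

    private static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }

    public static void main(String[] args) {
        // The invalid row used in this README's error report example.
        validate("abc", "invalid user", "bad@", "2023-bad-date")
                .forEach(System.out::println);
    }
}
```

Collecting every failure for a row, rather than stopping at the first one, matches the error report format, where a single row can produce several lines.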
Clean, validated, and transformed records:
customer_id,full_name,email,signup_date,processed_at
123,John Doe,john.doe@gmail.com,2023-01-15,2026-01-30T10:30:00
456,Jane Smith,jane.smith@example.com,2023-02-20,2026-01-30T10:30:00
789,Bob Johnson,bob.johnson@company.org,2023-03-10,2026-01-30T10:30:00

Records that failed validation:
row_number,field,error,original_data
4,customer_id,Must be an integer,"abc,invalid user,bad@,2023-bad-date"
4,email,Invalid format,"abc,invalid user,bad@,2023-bad-date"
4,signup_date,Invalid date format,"abc,invalid user,bad@,2023-bad-date"

2026-01-30 10:30:15 INFO Discovered 1 file(s) in ./data/in
2026-01-30 10:30:15 INFO Processing: customers.csv
2026-01-30 10:30:15 WARN customers.csv had 1 validation error(s)
2026-01-30 10:30:15 INFO Run complete: files=1 totalRows=4 validRows=3 invalidRows=1 timeMs=245
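
Judging by the valid-records sample above, the transformation step title-cases names, lower-cases emails, and appends a `processed_at` timestamp. A minimal sketch of that behaviour follows. The class `CustomerRowTransformer` and its methods are hypothetical names chosen for this example, and the real processor may apply further clean-up not shown here.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper for illustration; not taken from the project's source.
class CustomerRowTransformer {

    // An explicit pattern keeps the seconds even when they are zero
    // (LocalDateTime.toString() would print 2026-01-30T10:30 instead).
    private static final DateTimeFormatter PROCESSED_AT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss");

    // "john doe" -> "John Doe"
    static String titleCase(String name) {
        StringBuilder result = new StringBuilder();
        for (String word : name.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            if (result.length() > 0) {
                result.append(' ');
            }
            result.append(Character.toUpperCase(word.charAt(0)))
                  .append(word.substring(1).toLowerCase(Locale.ROOT));
        }
        return result.toString();
    }

    // "JOHN.DOE@gmail.com" -> "john.doe@gmail.com"
    static String normalizeEmail(String email) {
        return email.trim().toLowerCase(Locale.ROOT);
    }

    // Builds one output line. Assumes no field contains a comma or quote;
    // real CSV output would need proper quoting.
    static String toOutputLine(String customerId, String fullName, String email,
                               String signupDate, LocalDateTime processedAt) {
        return String.join(",",
                customerId.trim(),
                titleCase(fullName),
                normalizeEmail(email),
                signupDate.trim(),
                PROCESSED_AT.format(processedAt));
    }

    public static void main(String[] args) {
        System.out.println(toOutputLine("123", "john doe", "JOHN.DOE@gmail.com",
                "2023-01-15", LocalDateTime.now()));
    }
}
```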
| Option | Required | Description | Example |
|---|---|---|---|
| `--input` | Yes | Directory containing CSV files to process | `./data/in` |
| `--output` | Yes | Directory for processed output files | `./data/out` |
| `--archive` | Yes | Directory to archive original files | `./data/archive` |
| `--pattern` | No | File pattern to match (default: `*`) | `customer*.csv` |
| `--failOnError` | No | Exit with error code if validation fails (default: `false`) | `true` |
- `*.csv` - All CSV files
- `customer*.csv` - Files starting with "customer"
- `*2023*.csv` - Files containing "2023"
- `data_*.csv` - Files starting with "data_"
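
To check which file names a pattern would select before running a batch, the examples above can be tried against Java's built-in glob matcher. This sketch assumes the processor uses standard `java.nio` glob semantics, which the README does not confirm. `PatternMatchSketch` and the sample file names are made up for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not taken from the project's source.
class PatternMatchSketch {

    // True when fileName (a bare name, no directories) matches the glob pattern.
    static boolean matches(String pattern, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        String[] patterns = {"*.csv", "customer*.csv", "*2023*.csv", "data_*.csv"};
        String[] files = {"customers.csv", "sales_2023_q1.csv", "data_export.csv", "notes.txt"};
        for (String pattern : patterns) {
            for (String file : files) {
                System.out.println(pattern + " vs " + file + ": " + matches(pattern, file));
            }
        }
    }
}
```

Remember to quote the pattern on the command line, as in `--pattern "*.csv"`, so the shell does not expand it first.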
| Code | Status | Description |
|---|---|---|
| 0 | Success | All files processed successfully |
| 2 | Configuration Error | Invalid arguments or configuration |
| 3 | Runtime Error | Unexpected failure during processing |
| 4 | Validation Error | Validation failures when `--failOnError=true` |
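
For callers that wrap the tool in their own Java code, the table above maps naturally onto an enum. The sketch below is an assumption about how the codes relate to `--failOnError`, based only on the table. `ExitCode` and `forRun` are hypothetical names, not part of the project.

```java
// Hypothetical enum for illustration; codes are copied from the table above.
enum ExitCode {
    SUCCESS(0),
    CONFIGURATION_ERROR(2),
    RUNTIME_ERROR(3),
    VALIDATION_ERROR(4);

    final int code;

    ExitCode(int code) {
        this.code = code;
    }

    // Assumed rule: invalid rows only change the exit code when failOnError is set.
    static ExitCode forRun(int invalidRows, boolean failOnError) {
        return (failOnError && invalidRows > 0) ? VALIDATION_ERROR : SUCCESS;
    }

    public static void main(String[] args) {
        ExitCode result = forRun(1, true);
        System.out.println(result + " -> " + result.code);
    }
}
```

Under this reading, a run with `--failOnError=false` still exits with 0 even when some rows are invalid, so scripts that need to detect bad rows should set the flag to `true` or inspect the error report.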
"No files found"
# Check file pattern and directory
ls ./data/in/
java -jar target/batch-file-processor.jar --input ./data/in --pattern "*.csv" ...

"Invalid CSV format"
- Ensure your CSV has the required headers: customer_id,full_name,email,signup_date
- Check for proper CSV formatting (commas, quotes)

"Permission denied"
# Ensure directories are writable
chmod 755 data/out data/archive

"Java not found"
# Verify Java installation
java -version
# Should show Java 17 or higher

"Build failed"
# Clean and rebuild
mvn clean
mvn compile
mvn package

- Check the documentation
- Report bugs via GitHub Issues
- Request features via GitHub Discussions

Built with ❤️ by PixelPerfectDesigns