Professional CSV Data Validation Tool with Beautiful Web Interface
A modern, enterprise-grade CSV validation application that combines powerful command-line functionality with an intuitive web dashboard. Perfect for data teams, developers, and businesses who need reliable data quality assurance.
- Modern Web UI: Beautiful drag-and-drop interface with real-time validation
- CLI Tool: Pipeline-friendly command-line interface for automation
- Customer ID: Integer validation, positive values only
- Names: Length validation (2-80 characters), handles Unicode
- Emails: Format validation with proper @ symbol checks
- Dates: YYYY-MM-DD format, future date prevention
- Interactive Dashboard: Real-time statistics and charts
- CSV Error Reports: Detailed per-file error exports
- JSON Summaries: Machine-readable validation results
- Color-coded Results: Instant visual feedback
- 22 Comprehensive Tests: Full test coverage
- CI/CD Pipeline: GitHub Actions integration
- Unicode Support: International characters and names
- Responsive Design: Works on desktop and mobile
Beautiful, user-friendly interface for CSV validation
Interactive error reporting with detailed explanations
Clean, successful validation results
# Clone the repository
git clone https://github.com/yourusername/data-validator-app.git
cd data-validator-app
# Create virtual environment
python -m venv .venv
# Activate virtual environment
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt

# Start the web server
python -m validator.web
# Open http://localhost:8000 in your browser

# Validate CSV files
python -m validator.cli --input data --reports reports --pattern "*.csv" --failOnError true

Your CSV files must contain these exact column headers:
| Column | Type | Requirements | Example |
|---|---|---|---|
| customer_id | Integer | Positive number > 0 | 1, 25, 1000 |
| full_name | String | 2-80 characters | "Alice Johnson" |
| email | | Valid format with @ | "user@domain.com" |
| signup_date | Date | YYYY-MM-DD, not future | "2025-01-30" |
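
The column rules in the table above can be sketched in plain Python. This is a minimal illustration using only the standard library, not the project's actual implementation (which, per the tech stack, is built on Pydantic models). The function name `validate_row` and the error messages are hypothetical.

```python
from __future__ import annotations

import re
from datetime import date, datetime


def validate_row(row: dict[str, str], today: date | None = None) -> list[str]:
    """Return error messages for one CSV row (hypothetical helper, not the project's API)."""
    today = today or date.today()
    errors: list[str] = []

    # customer_id: must parse as an integer and be strictly positive
    try:
        if int(row.get("customer_id", "")) <= 0:
            errors.append("customer_id must be greater than 0")
    except ValueError:
        errors.append("customer_id must be an integer")

    # full_name: 2-80 characters; len() counts Unicode code points
    name = row.get("full_name", "").strip()
    if not 2 <= len(name) <= 80:
        errors.append("full_name must be 2-80 characters")

    # email: deliberately minimal check, exactly one @ with text on both sides
    local, sep, domain = row.get("email", "").partition("@")
    if not (sep and local and domain) or "@" in domain:
        errors.append("email must contain exactly one @ with text on both sides")

    # signup_date: strict YYYY-MM-DD shape, a real calendar date, not in the future
    raw = row.get("signup_date", "")
    try:
        if not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", raw):
            raise ValueError("wrong shape")
        if datetime.strptime(raw, "%Y-%m-%d").date() > today:
            errors.append("signup_date must not be in the future")
    except ValueError:
        errors.append("signup_date must be a valid YYYY-MM-DD date")

    return errors
```

Passing `today` explicitly keeps the future-date check deterministic in tests; in normal use it defaults to the current date.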
customer_id,full_name,email,signup_date
1,Alice Johnson,alice@example.com,2025-01-15
2,Bob Martinez,bob@example.com,2025-01-16
3,Catherine Chen,catherine@example.com,2025-01-17

- Upload: Drag & drop CSV files or click to browse
- Validate: Click "Validate Files" button
- Review: View interactive results with error details
- Export: Download error reports and summaries
# Basic validation
python -m validator.cli --input ./data --reports ./reports
# With custom pattern and error handling
python -m validator.cli \
--input ./customer_data \
--reports ./validation_reports \
--pattern "customers_*.csv" \
--failOnError true \
--logLevel DEBUG

- 0: Success (no validation errors)
- 2: Invalid arguments or file format
- 3: Unexpected runtime failure
- 4: Validation errors found (when --failOnError is true)
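
For automation, the exit codes listed above can be interpreted from a wrapper script. The following is a rough sketch: the codes and flags are taken from this README, while the helper names `describe_exit_code` and `run_validator` are hypothetical and not part of the project.

```python
import subprocess
import sys

# Exit codes as documented in this README
EXIT_MEANINGS = {
    0: "Success (no validation errors)",
    2: "Invalid arguments or file format",
    3: "Unexpected runtime failure",
    4: "Validation errors found",
}


def describe_exit_code(code: int) -> str:
    """Map a validator exit code to a readable message (hypothetical helper)."""
    return EXIT_MEANINGS.get(code, f"Unknown exit code {code}")


def run_validator(input_dir: str, reports_dir: str) -> int:
    """Run the CLI in a subprocess and return its exit code (hypothetical helper)."""
    cmd = [
        sys.executable, "-m", "validator.cli",
        "--input", input_dir,
        "--reports", reports_dir,
        "--failOnError", "true",
    ]
    return subprocess.run(cmd).returncode
```

A pipeline step could call `run_validator("./data", "./reports")`, log `describe_exit_code(code)`, and treat code 4 as a data-quality failure rather than a tooling failure.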
Run the comprehensive test suite:
# Run all tests
pytest -v
# Run with coverage
pytest --cov=validator --cov-report=html
# Run specific test categories
pytest tests/test_validation.py -v

Test Coverage: 22 tests covering validation logic, CLI interface, and file operations.
The application includes 6 sample CSV files to help you understand validation:
| File | Description | Purpose |
|---|---|---|
| sample_perfect.csv | All valid data (8 records) | See successful validation |
| sample_mixed_errors.csv | Various validation errors | Learn about error types |
| sample_large_dataset.csv | Performance test (20 records) | Test with more data |
| sample_edge_cases.csv | Boundary conditions | Unicode & special cases |
| sample_critical_errors.csv | Worst-case scenarios | See error handling |
| sample_international.csv | Unicode characters | International names |
- Open http://localhost:8000
- Click "Browse Samples" in the green section
- Preview files to see their content
- Download files to your computer, or
- Try Now to test them immediately
Sample files are located in the data/ folder of the project
# Test a perfect file
python -m validator.cli --input data --reports reports --pattern "sample_perfect.csv"
# Test a file with errors
python -m validator.cli --input data --reports reports --pattern "sample_mixed_errors.csv"

The repository includes sample CSV files for testing:
- sample_perfect.csv: All valid data
- sample_mixed_errors.csv: Various validation errors
- sample_large_dataset.csv: Performance testing (20 records)
- sample_edge_cases.csv: Boundary conditions
- sample_critical_errors.csv: Worst-case scenarios
- sample_international.csv: Unicode characters
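
The `--input` and `--pattern` options used in the commands above select which files get validated. A minimal sketch of how such glob-style selection could work is shown below. It assumes shell-style patterns matched against file names in the top level of the input directory; the function name `discover_csv_files` is hypothetical, and the project's own `discovery.py` may behave differently.

```python
from pathlib import Path


def discover_csv_files(input_dir: str, pattern: str = "*.csv") -> list[Path]:
    """Return files in input_dir whose names match pattern, sorted by path (hypothetical helper)."""
    root = Path(input_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"Input directory not found: {input_dir}")
    # Path.glob matches relative to root; directories are filtered out
    return sorted(p for p in root.glob(pattern) if p.is_file())
```

With this sketch, `discover_csv_files("data", "sample_perfect.csv")` would return at most one path, which mirrors how the commands above target a single sample file.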
data-validator-app/
├── validator/              # Core application package
│   ├── cli.py              # Command-line interface
│   ├── web.py              # FastAPI web server
│   ├── models.py           # Data models (Pydantic)
│   ├── validation.py       # Core validation logic
│   ├── csv_io.py           # CSV file operations
│   ├── config.py           # Configuration management
│   ├── discovery.py        # File discovery utilities
│   ├── runner.py           # Validation orchestration
│   └── templates/          # HTML templates
├── tests/                  # Comprehensive test suite
├── data/                   # Sample CSV files
├── .github/workflows/      # CI/CD pipeline
├── requirements.txt        # Python dependencies
└── README.md               # This file
- FastAPI Backend: Modern async web framework
- Jinja2 Templates: Server-side HTML rendering
- CSV Processing: Pandas-powered data handling
- Pydantic Models: Type-safe data validation
- Pytest Testing: Comprehensive test coverage
# Install development dependencies
pip install -r requirements.txt
pip install pytest pytest-cov black isort flake8 mypy
# Run code formatting
black .
isort .
# Run linting
flake8 validator tests
# Type checking
mypy validator

- Update validation logic in validator/validation.py
- Add corresponding tests in tests/test_validation.py
- Update documentation
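
As an illustration of the steps above, here is what one added rule might look like. This is a sketch under stated assumptions: the `CustomerRecord` and `ValidationError` classes below are simplified stand-ins written for this example (the real ones live in `validator/models.py` and may differ), and the rule itself, `validate_email_domain`, is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CustomerRecord:
    """Stand-in for the project's record model (assumed field names)."""
    customer_id: int
    full_name: str
    email: str
    signup_date: str


@dataclass
class ValidationError:
    """Stand-in for the project's error type (assumed field order)."""
    row_number: int
    field: str
    message: str


def validate_email_domain(record: CustomerRecord, row_number: int) -> list[ValidationError]:
    """Hypothetical extra rule: the part after the last @ must contain an inner dot."""
    errors = []
    _, _, domain = record.email.rpartition("@")
    # strip(".") ignores leading/trailing dots, so ".com" and "host." still fail
    if "." not in domain.strip("."):
        errors.append(ValidationError(row_number, "email", "Email domain must contain a dot"))
    return errors
```

A matching test would assert that `alice@example.com` yields no errors and that `bob@localhost` yields exactly one error on the `email` field.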
def validate_customer(record: CustomerRecord, row_number: int) -> list[ValidationError]:
    errors = []
    # Add your validation rule here
    if some_condition:
        errors.append(ValidationError(row_number, "field_name", "Error message"))
    return errors

# Start web server with hot reload
python -m validator.web

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python", "-m", "validator.web"]

GitHub Actions workflow included:
- Automated testing on Python 3.10, 3.11, 3.12
- Code quality checks (Black, isort, flake8)
- Test coverage reporting
- Integration testing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Write tests for new features
- Follow PEP 8 style guidelines
- Update documentation for API changes
- Ensure all tests pass before submitting
- Fast Processing: Handles thousands of records efficiently
- Memory Optimized: Streaming CSV processing
- Async Operations: Non-blocking file uploads
- Real-time Feedback: Instant validation results
Benchmarks:
- 1,000 records: ~0.5 seconds
- 10,000 records: ~2.1 seconds
- 100,000 records: ~15.8 seconds
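
To illustrate what streaming CSV processing means in practice, the sketch below reads one row at a time instead of loading the whole file into memory. Note that it uses the standard library `csv` module rather than Pandas, which the project lists in its stack, so it is an illustration of the idea and not the project's code. The name `stream_rows` is hypothetical; the required column names come from the column table in this README.

```python
import csv
from typing import Iterator, TextIO

REQUIRED_COLUMNS = ("customer_id", "full_name", "email", "signup_date")


def stream_rows(handle: TextIO) -> Iterator[tuple[int, dict[str, str]]]:
    """Yield (line_number, row) pairs lazily from an open CSV file (hypothetical helper)."""
    reader = csv.DictReader(handle)
    # Accessing fieldnames reads only the header line; it is None for an empty file
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Missing required columns: {', '.join(missing)}")
    for row in reader:
        # line_num is the number of physical lines read so far, header included
        yield reader.line_num, row
```

A caller would open the file with `open(path, newline="", encoding="utf-8")` and loop over `stream_rows(f)`, so memory use stays roughly constant regardless of file size.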
This project is licensed under the MIT License - see the LICENSE file for details.
- FastAPI for the excellent web framework
- Tailwind CSS for beautiful, responsive styling
- Alpine.js for reactive frontend functionality
- Font Awesome for professional icons
- Bug Reports: Open an issue
- Feature Requests: Start a discussion
- Email: alex@pixelperfect-designs.com
If you found this project helpful, please give it a star!
Made with ❤️ by Alex Staples