Command-line toolkit for Excel data manipulation and analysis.
Excel CLI Toolkit (xl) is a powerful command-line interface for performing data wrangling and analysis operations on Excel files. Designed for both human users and AI systems, it provides fast, predictable operations without requiring scripts or programming.
- Filter & Search: Filter rows by conditions, search for values
- Sort & Group: Sort by columns, group and aggregate data
- Transform: Apply transformations, clean data, deduplicate
- Multi-format: Support for XLSX, CSV, JSON, and Parquet
- Functional Programming: Built with Result/Maybe types for robust error handling
- AI-Friendly: Simple, composable commands perfect for AI automation
- Fast: Efficient processing of large files
pip install excel-toolkit-cwduv pip install excel-toolkit-cwdgit clone https://github.com/wareflowx/excel-toolkit.git
cd excel-toolkit
uv pip install -e ".[dev]"pip install "excel-toolkit-cwd[parquet]"# Filter rows where Amount > 1000
xl filter sales.xlsx --where "Amount > 1000" --output filtered.xlsx
# Filter by multiple conditions
xl filter data.xlsx --where "Region == 'North' and Price > 100" -o result.xlsx# Sort by column
xl sort data.xlsx --by "Date" --descending --output sorted.xlsx
# Group and aggregate
xl group sales.xlsx --by "Region" --aggregate "Amount:sum" --output grouped.xlsx# Remove duplicates
xl dedupe data.xlsx --by "Email" --output unique.xlsx
# Clean whitespace and standardize
xl clean data.xlsx --trim --lowercase --columns "Name,Email" --output clean.xlsx# Convert Excel to CSV
xl convert data.xlsx --output data.csv
# Convert CSV to Excel
xl convert data.csv --output data.xlsx# Chain operations
xl filter sales.xlsx --where "Amount > 1000" | \
xl sort --by "Date" --descending | \
xl group --by "Region" --aggregate "Amount:sum" \
--output final.xlsxFor detailed documentation, see the docs/ directory:
- FEATURES.md - Complete feature list
- PROJECT_STRUCTURE.md - Architecture overview
- CONTRIBUTING.md - Contributing guidelines
# Clone repository
git clone https://github.com/wareflowx/excel-toolkit.git
cd excel-toolkit
# Install dependencies
uv sync --all-extras
# Install pre-commit hooks
pre-commit install# Run all tests
uv run pytest
# Run with coverage
uv run pytest --cov=excel_toolkit
# Run specific test file
uv run pytest tests/unit/test_filtering.py# Linting
uv run ruff check .
# Formatting
uv run ruff format .
# Type checking
uv run mypy excel_toolkit/xl filter- Filter rows based on conditionsxl select- Select specific columnsxl sort- Sort by column(s)xl group- Group and aggregate dataxl join- Join multiple filesxl clean- Clean data (trim, case, etc.)xl dedupe- Remove duplicatesxl transform- Apply transformationsxl convert- Convert between formatsxl merge- Merge multiple filesxl stats- Calculate statisticsxl info- Display file informationxl validate- Validate data
# General help
xl --help
# Command-specific help
xl filter --help# Extract sales data for Q1, filter high-value orders, group by region
xl filter sales.xlsx --where "Date >= '2024-01-01' and Date <= '2024-03-31'" | \
xl filter --where "Amount > 1000" | \
xl group --by "Region" --aggregate "Amount:sum,Orders:count" \
--output q1_high_value_by_region.xlsx# Clean messy CSV file
xl clean messy_data.csv \
--trim \
--lowercase \
--columns "email,name" \
--output cleaned.csv
# Remove duplicates and validate
xl dedupe cleaned.csv --by "email" --output unique.csv
xl validate unique.csv --columns "email:email,age:int:0-120" --output final.csvBuilt with a functional programming approach:
- Result types: Explicit error handling without exceptions
- Maybe types: Safe handling of optional values
- Immutable configuration: Predictable behavior
- Composable operations: Chain commands efficiently
For more details, see FUNCTIONAL_ANALYSIS.md.
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
This project is licensed under the MIT License - see the LICENSE file for details.
Built with: