-
Notifications
You must be signed in to change notification settings - Fork 0
Closed
Labels
criticalCritical priority issues that block functionalityCritical priority issues that block functionalityfeature-requestNew feature or functionality requestNew feature or functionality requesti18nInternationalization and localization issuesInternationalization and localization issues
Description
Description
The xl transform command does not provide date/time extraction functions. Users cannot extract year, month, quarter, day of week from datetime columns, making time series analysis (year-over-year comparison, seasonal analysis, etc.) impossible.
Current Behavior
# Attempt to extract year from date
$ xl transform "sales.xlsx" -c "DateVente" -o "Year" -f "extract_year"
Usage: xl transform [OPTIONS] FILE_PATH
Try 'xl transform --help' for help
+- Error ---------------------------------------------------------------------+
| Invalid transformation type: 'extract_year' |No date extraction functions are available.
Expected Behavior
Users should be able to extract date components:
# Extract year
xl transform "sales.xlsx" -c "DateVente" -o "Year" -f "year"
# Extract month
xl transform "sales.xlsx" -c "DateVente" -o "Month" -f "month"
# Extract quarter
xl transform "sales.xlsx" -c "DateVente" -o "Quarter" -f "quarter"
# Extract day of week
xl transform "sales.xlsx" -c "DateVente" -o "Weekday" -f "weekday"Steps to Reproduce
- Create Excel file with datetime column (e.g., "2023-04-13 00:00:00")
- Try to extract year:
xl transform file.xlsx -c "Date" -o "Year" -f "year" - Observe error: "Invalid transformation type"
Real-World Impact
Use Case: Year-Over-Year Sales Analysis
Business Question: How did each circuit perform in 2023 vs 2024 vs 2025?
Current Approach (Manual):
# Option 1: Manual extraction from pre-aggregated TCD sheet
# Time: 30 minutes of manual data entry
# Error-prone, not scalable
# Option 2: Export to Python/pandas
xl export "sales.xlsx" --output sales.csv
# Python script:
import pandas as pd
df = pd.read_csv('sales.csv')
df['Year'] = pd.to_datetime(df['DateVente']).dt.year
yearly_sales = df.groupby('Year')['Sales'].sum()
yearly_sales.to_excel('yearly_sales.xlsx')Proposed Approach (with xl):
# One command chain:
xl transform "sales.xlsx" -c "DateVente" -o "Year" -f "year" --output with_year.xlsx | \
xl group --by "Year,Circuit" --aggregate "Sales:sum"Critical Business Analyses Blocked
-
Year-over-Year Growth Rates
- Cannot calculate:
(2024_sales - 2023_sales) / 2023_sales - Essential for business planning
- Cannot calculate:
-
Seasonal Analysis
- Cannot extract month/quarter
- Cannot identify seasonal patterns
-
Day-of-Week Patterns
- Cannot extract weekday
- Cannot analyze weekend vs weekday performance
-
Trend Analysis
- Cannot group by time periods
- Cannot identify upward/downward trends
Proposed Solution
Add date/time extraction functions to xl transform command.
Function Syntax
xl transform FILE_PATH -c COLUMN -o OUTPUT_COLUMN -f FUNCTIONSupported Functions
| Function | Description | Example Input | Example Output |
|---|---|---|---|
year |
Extract year (4-digit) | 2023-04-13 | 2023 |
month |
Extract month (1-12) | 2023-04-13 | 4 |
quarter |
Extract quarter (1-4) | 2023-04-13 | 2 |
day |
Extract day of month (1-31) | 2023-04-13 | 13 |
weekday |
Extract day of week (0-6, Mon=0) | 2023-04-13 | 3 (Thursday) |
week |
Extract ISO week number (1-53) | 2023-04-13 | 15 |
dayofyear |
Extract day of year (1-366) | 2023-04-13 | 103 |
hour |
Extract hour (0-23) | 2023-04-13 14:30:00 | 14 |
minute |
Extract minute (0-59) | 2023-04-13 14:30:00 | 30 |
date |
Extract date only | 2023-04-13 14:30:00 | 2023-04-13 |
time |
Extract time only | 2023-04-13 14:30:00 | 14:30:00 |
Extended Functions (Future)
| Function | Description | Example |
|---|---|---|
year_month |
Combined year-month (YYYY-MM) | 2023-04 |
year_quarter |
Combined year-quarter (YYYY-QX) | 2023-Q2 |
month_name |
Month name (January, February, ...) | April |
weekday_name |
Weekday name (Monday, Tuesday, ...) | Thursday |
is_weekend |
Boolean: is weekend? | true/false |
Acceptance Criteria
- Can extract year from datetime column
- Can extract month from datetime column
- Can extract quarter from datetime column
- Can extract day of week from datetime column
- Can extract hour/minute from datetime column
- Works with Excel datetime format
- Works with CSV datetime format
- Handles NULL/missing datetime values gracefully
- Output column has correct data type (integer for year/month, etc.)
- Documentation updated with examples
- Error handling: clear error if column is not datetime
Test Cases
# Test 1: Extract year
xl transform sales.xlsx -c "DateVente" -o "Year" -f "year"
# Expected: New column "Year" with values 2023, 2024, 2025
# Test 2: Extract month
xl transform sales.xlsx -c "DateVente" -o "Month" -f "month"
# Expected: New column "Month" with values 1-12
# Test 3: Extract quarter
xl transform sales.xlsx -c "DateVente" -o "Quarter" -f "quarter"
# Expected: New column "Quarter" with values 1-4
# Test 4: Extract weekday
xl transform sales.xlsx -c "DateVente" -o "Weekday" -f "weekday"
# Expected: New column "Weekday" with values 0-6
# Test 5: Chain for time series analysis
xl transform sales.xlsx -c "DateVente" -o "Year" -f "year" --output with_year.xlsx
xl group with_year.xlsx --by "Year" --aggregate "Sales:sum"
# Expected: Yearly totals
# Test 6: Multiple extractions
xl transform sales.xlsx -c "DateVente" -o "Year" -f "year" \
-c "DateVente" -o "Month" -f "month" \
-c "DateVente" -o "Quarter" -f "quarter"
# Expected: Three new columns
# Test 7: Handle NULL dates
xl transform sales.xlsx -c "DateVente" -o "Year" -f "year"
# Expected: NULL or NaN for missing datesReal-World Examples
Example 1: Year-Over-Year Analysis
# Extract year and aggregate
xl transform sales.xlsx -c "DateVente" -o "Year" -f "year" --output temp.xlsx
xl group temp.xlsx --by "Year,Circuit" --aggregate "Sales:sum"
# Result:
# Year | Circuit | Sales_sum
# 2023 | 24H | 2299
# 2024 | 24H | 1518
# 2025 | 24H | 2073Example 2: Monthly Seasonality
# Extract month and analyze
xl transform sales.xlsx -c "DateVente" -o "Month" -f "month" --output temp.xlsx
xl group temp.xlsx --by "Month" --aggregate "Sales:sum" --sort desc
# Result: Identify peak months (e.g., April, May, June)Example 3: Weekend vs Weekday
# Extract weekday
xl transform sales.xlsx -c "DateVente" -o "Weekday" -f "weekday" --output temp.xlsx
# Add weekend flag (requires conditional transform - future feature)
# xl transform temp.xlsx -c "Weekday" -o "IsWeekend" -f "if gt 4 1 0"Implementation Priority
Phase 1: Critical Functions (MVP)
year- Most common use casemonth- Seasonal analysisquarter- Quarterly reporting
Phase 2: Highly Useful
weekday- Day of week analysisday- Day-level analysisweek- Weekly reporting
Phase 3: Nice to Have
hour,minute- Intra-day analysisyear_month,year_quarter- Pre-combined formatsmonth_name,weekday_name- Human-readable formats
Related Features
This feature enables:
- Time Series Analysis: Group by year/month/quarter
- Seasonal Pattern Detection: Identify cyclical patterns
- Trend Analysis: Identify upward/downward trends
- Period-over-Period Comparison: YoY, MoM, QoQ growth
- Business Intelligence: Time-based reporting
Workarounds
Current Workaround 1: Manual Pivot Table
# Use TCD sheet if already aggregated by year
# Time: 30 minutes manual copy-paste
# Not scalable for different date rangesCurrent Workaround 2: Export to Python
import pandas as pd
df = pd.read_excel('sales.xlsx')
df['Year'] = pd.to_datetime(df['Date']).dt.year
df['Month'] = pd.to_datetime(df['Date']).dt.month
df.to_excel('sales_with_dates.xlsx')Current Workaround 3: Use Excel Formulas
# Open in Excel
# Add formula: =YEAR(A2) in new column
# Drag down for all rows
# Save and use xl groupBenefits
- Time Series Analysis: Enables YoY, MoM, QoQ analysis
- Seasonal Insights: Identify seasonal patterns
- Business Intelligence: Critical for business decisions
- Workflow Efficiency: No need to export to Python/Excel
- Competitive Parity: SQL, pandas, Excel all have this
Industry Standards
SQL Equivalent
SELECT
YEAR(Date) as Year,
MONTH(Date) as Month,
SUM(Sales) as Total
FROM data
GROUP BY YEAR(Date), MONTH(Date)pandas Equivalent
df['Year'] = df['Date'].dt.year
df['Month'] = df['Date'].dt.monthExcel Equivalent
=YEAR(A2)
=MONTH(A2)
xl should match these standard capabilities.
Related Issues
- XL-006: No growth rate calculations (depends on year extraction)
- XL-003: No sorting in group (time series needs sorting by date)
- XL-007: Multiple aggregations (time series with sum, count, mean)
Labels
feature-request critical time-series datetime transform analysis
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
criticalCritical priority issues that block functionalityCritical priority issues that block functionalityfeature-requestNew feature or functionality requestNew feature or functionality requesti18nInternationalization and localization issuesInternationalization and localization issues