xl group command has no sorting option - results appear in arbitrary order #5

@AliiiBenn

Description

The xl group command does not support sorting the aggregated results. Users must save output to a file and then run a separate xl sort command to see ranked results (e.g., top products by sales).

Current Behavior

$ xl group "sales.xlsx" -s DONNEES --by "Circuit" --aggregate "Qté VTE:sum"

+-----------+---------------+
| Circuit   |   Qté VTE_sum |
+===========+===============+
| 24H       |          5890 |
+-----------+---------------+
| BOL       |          3357 |
+-----------+---------------+
| GP        |         23788 |
+-----------+---------------+
| S2R       |         10901 |
+-----------+---------------+
| SPA       |          2845 |
+-----------+---------------+

Results are in arbitrary order (alphabetical in this case, but not guaranteed). To get ranked results, users need a second command.

Expected Behavior

Users should be able to sort results directly:

$ xl group "sales.xlsx" -s DONNEES --by "Circuit" --aggregate "Qté VTE:sum" --sort desc

+-----------+---------------+
| Circuit   |   Qté VTE_sum |
+===========+===============+
| GP        |         23788 |  # ← Highest first
+-----------+---------------+
| S2R       |         10901 |
+-----------+---------------+
| 24H       |          5890 |
+-----------+---------------+
| BOL       |          3357 |
+-----------+---------------+
| SPA       |          2845 |  # ← Lowest last
+-----------+---------------+

Steps to Reproduce

  1. Create an Excel file with categorical data
  2. Run xl group file.xlsx --by "Category" --aggregate "Sales:sum"
  3. Observe results are not sorted by Sales

Real-World Impact

Use Case: Top 10 Products by Sales

Without Sorting (Current):

# Step 1: Group all products
xl group "sales.xlsx" --by "Product" --aggregate "Sales:sum" --output all_products.xlsx

# Step 2: Sort
xl sort all_products.xlsx --by "Sales_sum" --order desc --output sorted_products.xlsx

# Step 3: View top 10
xl head sorted_products.xlsx -n 10

# Step 4: Cleanup
rm all_products.xlsx sorted_products.xlsx

With Sorting (Proposed):

# Single command:
xl group "sales.xlsx" --by "Product" --aggregate "Sales:sum" --sort desc --limit 10

Impact of the current workflow:

  • 4 commands instead of 1
  • Requires intermediate file management
  • More error-prone
  • Top items cannot be seen at a glance

Affected Commands

  • xl group - Most critical use case
  • xl count - Also lacks sorting (see XL-006)

Proposed Solution

Add a --sort flag to the xl group command:

xl group FILE_PATH --by COLUMNS --aggregate SPEC --sort [asc|desc] [--sort-column COLUMN]

Syntax Options

Option 1: Simple sort (default to aggregation column)

xl group data.xlsx --by "Category" --aggregate "Sales:sum" --sort desc
# Sorts by Sales_sum descending

Option 2: Explicit column

xl group data.xlsx --by "Category" --aggregate "Sales:sum" --sort desc --sort-column "Sales_sum"

Option 3: Multiple aggregations (future)

xl group data.xlsx --by "Category" --aggregate "Sales:sum,Count:count" --sort desc
# Sorts by first aggregation column (Sales_sum)
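
The default-column rule in the options above could be sketched as follows. This is only an illustration: `resolve_sort_column` is a hypothetical helper, not an existing xl function, and it assumes aggregation output columns are named `<column>_<function>`, as the `Sales_sum` and `Qté VTE_sum` examples in this issue suggest.

```python
from typing import Optional


def resolve_sort_column(aggregate_spec: str, sort_column: Optional[str] = None) -> str:
    """Pick the column to sort by (hypothetical helper, not part of xl)."""
    # An explicit --sort-column always wins.
    if sort_column is not None:
        return sort_column
    # Otherwise fall back to the first aggregation in the spec.
    # Assumption: output columns are named "<column>_<function>".
    first = aggregate_spec.split(",")[0]
    column, func = first.rsplit(":", 1)
    return f"{column.strip()}_{func.strip()}"


print(resolve_sort_column("Sales:sum,Count:count"))
print(resolve_sort_column("Sales:sum", sort_column="Sales_sum"))
```

Both calls resolve to `Sales_sum`, so Options 1, 2 and 3 sort on the same column.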

Acceptance Criteria

  • Can sort grouped results ascending
  • Can sort grouped results descending
  • Default behavior: if no --sort, results in group-by column order
  • With --sort, results sorted by aggregation column
  • Works with single aggregation
  • Works with multiple group-by columns
  • Documentation updated with examples
  • Backwards compatible (no --sort = current behavior)

Test Cases

# Test 1: Sort descending
xl group test.xlsx --by "Category" --aggregate "Sales:sum" --sort desc
# Expected: Categories with highest Sales first

# Test 2: Sort ascending
xl group test.xlsx --by "Category" --aggregate "Sales:sum" --sort asc
# Expected: Categories with lowest Sales first

# Test 3: Multiple group-by columns
xl group test.xlsx --by "Region,Category" --aggregate "Sales:sum" --sort desc
# Expected: Sorted by Sales_sum, grouped by Region then Category

# Test 4: No sort flag (backwards compatible)
xl group test.xlsx --by "Category" --aggregate "Sales:sum"
# Expected: Same output order as today (no sorting by the aggregate)

# Test 5: Works with different aggregations
xl group test.xlsx --by "Product" --aggregate "Qty:mean" --sort desc
# Expected: Products sorted by mean quantity

Implementation Notes

Flag Syntax

--sort [asc|desc]
# OR
--sort-desc
--sort-asc

Recommendation: Use --sort [asc|desc] for flexibility
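
For illustration, the recommended flag shape could be declared like this. The sketch uses Python's standard `argparse` purely as an assumption; the CLI framework xl actually uses may differ, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser for "xl group"; only the options discussed here.
    parser = argparse.ArgumentParser(prog="xl group")
    parser.add_argument("file_path")
    parser.add_argument("--by", required=True)
    parser.add_argument("--aggregate", required=True)
    # Omitting --sort leaves the value as None, i.e. current behavior.
    parser.add_argument("--sort", choices=["asc", "desc"], default=None)
    parser.add_argument("--sort-column", default=None)
    return parser


args = build_parser().parse_args(
    ["data.xlsx", "--by", "Category", "--aggregate", "Sales:sum", "--sort", "desc"]
)
print(args.sort, args.sort_column)
```

A single `--sort` option with two choices keeps the interface to one flag, and the parser rejects any value other than `asc` or `desc`.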

Sorting Logic

# Sketch (pandas): df holds the grouped result; sort_order is None when --sort is omitted
if sort_order in ("asc", "desc"):
    # Sort by the first aggregation column; ascending only for "asc"
    df = df.sort_values(by=aggregation_columns[0], ascending=(sort_order == "asc"))

Default: Sort by first aggregation column
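
A self-contained version of this logic, as a sketch only: it assumes the grouped result is a pandas DataFrame, and `sort_grouped` is a hypothetical name rather than an existing xl function. The sample rows are the aggregated values from the Current Behavior table.

```python
import pandas as pd


def sort_grouped(df, aggregation_columns, sort_order=None):
    """Sort a grouped result by its first aggregation column (hypothetical helper)."""
    if sort_order is None:
        # No --sort flag: return the result untouched (backwards compatible).
        return df
    if sort_order not in ("asc", "desc"):
        raise ValueError(f"sort order must be 'asc' or 'desc', got {sort_order!r}")
    return df.sort_values(
        by=aggregation_columns[0],
        ascending=(sort_order == "asc"),
        kind="stable",  # tied groups keep their original relative order
    ).reset_index(drop=True)


circuits = pd.DataFrame({
    "Circuit": ["24H", "BOL", "GP", "S2R", "SPA"],
    "Qté VTE_sum": [5890, 3357, 23788, 10901, 2845],
})
ranked = sort_grouped(circuits, ["Qté VTE_sum"], "desc")
print(ranked)
```

With `"desc"` the rows come out as GP, S2R, 24H, BOL, SPA, matching the table under Expected Behavior.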

Multiple Group-By Considerations

xl group data.xlsx --by "Year,Region" --aggregate "Sales:sum" --sort desc

Should this sort:

  • Within each group (by Region within Year)?
  • Globally (all Year-Region combinations)?

Recommendation: Global sort by default, add --sort-within flag later if needed
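
The difference between the two interpretations can be seen on a small example. This is a sketch with made-up numbers, assuming a pandas DataFrame as the grouped result; `--sort-within` is only the flag name floated above, not an existing option.

```python
import pandas as pd

# Made-up grouped result for --by "Year,Region" --aggregate "Sales:sum".
grouped = pd.DataFrame({
    "Year": [2023, 2023, 2024, 2024],
    "Region": ["North", "South", "North", "South"],
    "Sales_sum": [100, 300, 400, 200],
})

# Global sort: rank every Year-Region combination together.
global_sort = grouped.sort_values("Sales_sum", ascending=False).reset_index(drop=True)

# Within-group sort: keep years together, rank regions inside each year.
within_sort = grouped.sort_values(
    ["Year", "Sales_sum"], ascending=[True, False]
).reset_index(drop=True)

print(global_sort)
print(within_sort)
```

The global sort interleaves years (2024 North, 2023 South, 2024 South, 2023 North), while the within-group sort keeps all 2023 rows ahead of the 2024 rows.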

Related Features

This feature becomes even more valuable when combined with:

  • XL-005: Limit/Top-N in group command
  • XL-002: Command piping support
  • XL-007: Multiple aggregations per command

Combined Use Case:

# Get top 10 products by sales
xl group sales.xlsx --by "Product" --aggregate "Sales:sum" --sort desc --limit 10
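
For comparison, the same group, sort and limit pipeline is short in pandas. This sketch uses made-up sample data and does not reflect how xl is implemented; `nlargest` combines the descending sort and the limit in one call.

```python
import pandas as pd

# Made-up raw sales rows.
sales = pd.DataFrame({
    "Product": ["A", "B", "A", "C", "B", "D"],
    "Sales": [5, 7, 10, 1, 2, 30],
})

# Group and sum, naming the output column like xl does ("Sales_sum").
totals = sales.groupby("Product", as_index=False).agg(Sales_sum=("Sales", "sum"))

# Equivalent of --sort desc --limit 2.
top = totals.nlargest(2, "Sales_sum").reset_index(drop=True)
print(top)
```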

Workarounds

Current Workaround 1: Two Commands

xl group data.xlsx --by "Category" --aggregate "Sales:sum" --output temp.xlsx
xl sort temp.xlsx --by "Sales_sum" --order desc
rm temp.xlsx

Current Workaround 2: Sort manually in Excel after export

xl group data.xlsx --by "Category" --aggregate "Sales:sum" --output grouped.xlsx
# Open in Excel and sort manually

Current Workaround 3: Use different tool

xl export data.xlsx --output data.csv
csvsort -c "Sales" -r data.csv

Benefits

  1. Workflow Efficiency: 1 command instead of 3 or 4
  2. Immediate Insights: See top/bottom items at a glance
  3. Better UX: Natural user expectation
  4. Consistency: Matches SQL ORDER BY behavior
  5. Intermediate Files: No need for temp files

Related Issues

  • XL-002: No command piping (piping would offer a workaround for this issue)
  • XL-005: Limit/Top-N feature (complementary feature)
  • XL-006: xl count also needs sorting
  • XL-007: Multiple aggregations (makes sorting more complex)

Labels

feature-request high-priority ux sorting group-by workflow
