RFC: Improve tabular output formats #3043

@pombredanne

The current CSV output is a mess, albeit a convenient one. We need something better, and quickly. I suggest these short-term and long-term actions:

For now in v31:

In v32:

  • Create a new --csv-file option that would only list file level details in this way:

    • file info columns, license expression, copyright, holder, urls, emails, "for_package".
    • no line number columns, no package data
    • exactly one row per file
    • multiple values are joined with a line break in a single cell
    • long values that exceed office tools' cell limits are truncated
  • Create a new --csv-package option that would only list package details:

    • one row per package instance
  • Create a new --csv-dependency option that would only list dependency details:

    • one row per dependency instance
  • Create a new --csv-license option that would only list file level license scan information, used for debugging and hidden from the CLI help:

    • one row for each license match with diagnostic details for each match (rule, scores, etc)
  • Drop the hidden --csv option in v32

  • Add a new XLSX option that creates a proper spreadsheet with multiple tabs

    • files
    • packages
    • dependencies
    • summary
    • and possibly: file "package_data", file "license_detections"
      where these essentially mirror the new csv-* options
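
As a rough illustration of the proposed --csv-file behavior (one row per file, multiple values joined with a line break in one cell, long values truncated), here is a minimal sketch using only the Python standard library. The column names, the `to_cell` and `write_file_rows` helpers, and the sample data are all hypothetical and not part of any existing code. The 32,767-character limit is Excel's documented per-cell maximum, and other office tools may need a different value.

```python
import csv
import io

# Excel's documented per-cell limit; other office tools may differ (assumption).
MAX_CELL_LENGTH = 32767

# Hypothetical column names, for illustration only.
COLUMNS = [
    "path", "license_expression", "copyright",
    "holder", "url", "email", "for_package",
]


def to_cell(values, max_length=MAX_CELL_LENGTH):
    """Join multiple values with line breaks into one cell, then truncate."""
    if isinstance(values, str):
        values = [values]
    cell = "\n".join(str(v) for v in values or [])
    return cell[:max_length]


def write_file_rows(files, output):
    """Write a header, then exactly one CSV row per file mapping."""
    writer = csv.DictWriter(output, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for scanned_file in files:
        writer.writerow(
            {name: to_cell(scanned_file.get(name)) for name in COLUMNS}
        )


# Made-up sample data: one file with multiple copyrights, one with none.
files = [
    {
        "path": "src/a.c",
        "license_expression": "mit",
        "copyright": ["Copyright (c) A", "Copyright (c) B"],
        "holder": ["A", "B"],
    },
    {"path": "src/b.c"},
]

buffer = io.StringIO()
write_file_rows(files, buffer)

# Read the CSV back to check that each file still occupies a single row.
rows = list(csv.DictReader(io.StringIO(buffer.getvalue())))
```

Joining before truncating keeps the cell limit applied to the final cell content rather than to each individual value. A real implementation would probably also want to signal that a value was truncated.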

For reference, we have these related issues:
