chore: Rust lint clippy::result_large_err flags all uses of Result<T, DataFusionError> with features = ["avro"] #15860

@rroelke

Description

Minimal reproducer:

Cargo.toml

[package]
name = "datafusion-bug-repro"
edition = "2024"

[dependencies]
datafusion-common = { version = "47", features = ["avro"] }

src/lib.rs

use datafusion_common::error::DataFusionError;

pub fn foo() -> Result<(), DataFusionError> {
    Ok(())
}

And then:

$ rustc --version
rustc 1.88.0-nightly (d7ea436a0 2025-04-24)

$ cargo clippy
warning: the `Err`-variant returned from this function is very large
 --> src/lib.rs:3:17
  |
3 | pub fn foo() -> Result<(), DataFusionError> {
  |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^ the `Err`-variant is at least 256 bytes
  |
  = help: try reducing the size of `datafusion_common::DataFusionError`, for example by boxing large elements or replacing it with `Box<datafusion_common::DataFusionError>`
  = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#result_large_err
  = note: `#[warn(clippy::result_large_err)]` on by default

warning: `datafusion-bug-repro` (lib) generated 1 warning

From the lint documentation:

Why is this bad?
A Result is at least as large as the Err-variant. While we expect that variant to be seldom used, the compiler needs to reserve and move that much memory every single time. Furthermore, errors are often simply passed up the call-stack, making use of the ?-operator and its type-conversion mechanics. If the Err-variant further up the call-stack stores the Err-variant in question (as library code often does), it itself needs to be at least as large, propagating the problem.
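
To make the size argument concrete, here is a minimal sketch using only the standard library. `LargeError` and `SmallError` are hypothetical stand-ins, not DataFusion types; the 256-byte payload is chosen only to mirror the size reported in the lint output above.

```rust
use std::mem::size_of;

// Hypothetical error type with a large inline payload (assumption for illustration).
#[allow(dead_code)]
pub struct LargeError {
    payload: [u8; 256],
}

// Hypothetical compact error type (assumption for illustration).
#[allow(dead_code)]
pub struct SmallError {
    code: u32,
}

// Returns the sizes of `Result<(), LargeError>` and `Result<(), SmallError>`.
// The `Ok` payload is `()` in both cases, so any difference comes from the
// `Err` variant alone.
pub fn result_sizes() -> (usize, usize) {
    (
        size_of::<Result<(), LargeError>>(),
        size_of::<Result<(), SmallError>>(),
    )
}
```

Every function returning `Result<(), LargeError>` has to reserve space for the large variant, even on the success path where the result carries no data.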

That is probably reason enough to fix this, but I'm filing the issue mainly because of its impact on downstream projects. Any downstream project that returns Result<T, DataFusionError> (very likely) will see this lint in its own code, and disabling the lint is undesirable for the reasons stated above.

Exit criteria

This issue can be closed when this lint is fixed (not ignored). Specifically, DataFusionError must be small enough that clippy::result_large_err no longer fires with features = ["avro"] enabled.

Investigation and recommended fix

The lint only fires when features = ["avro"] is enabled, because that feature adds the AvroError variant to DataFusionError. The easiest fix would be to box the AvroError payload, as the lint suggests.
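
A rough sketch of what boxing buys, using only the standard library. `BigSourceError`, `UnboxedError`, and `BoxedError` are hypothetical names invented for this illustration; they are not the real Avro error or DataFusionError definitions, and the actual change in datafusion-common may look different.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a large third-party error type (assumption).
#[allow(dead_code)]
pub struct BigSourceError {
    details: [u8; 256],
}

// The large error is stored inline, so the enum is at least as large as it.
#[allow(dead_code)]
pub enum UnboxedError {
    Source(BigSourceError),
    Other(String),
}

// The large error lives on the heap; the variant holds only a pointer.
#[allow(dead_code)]
pub enum BoxedError {
    Source(Box<BigSourceError>),
    Other(String),
}

// Keeps the `?` operator working for callers after the variant is boxed.
impl From<BigSourceError> for BoxedError {
    fn from(e: BigSourceError) -> Self {
        BoxedError::Source(Box::new(e))
    }
}

// Returns the sizes of the unboxed and boxed enums, in that order.
pub fn error_sizes() -> (usize, usize) {
    (size_of::<UnboxedError>(), size_of::<BoxedError>())
}

// Example of propagating the large error through the boxed enum with `?`.
pub fn fallible() -> Result<(), BoxedError> {
    let inner: Result<(), BigSourceError> = Err(BigSourceError { details: [0; 256] });
    inner?;
    Ok(())
}
```

The trade-off is one heap allocation on the error path, in exchange for a small `Result` on every path. Code that constructs or pattern-matches the variant directly would need to account for the `Box`.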
