Labels
bug: Something isn't working
Description
Describe the bug
When a lag window expression is added to a DataFrame via with_column, the result contains an unwanted extra column named after the raw window expression, alongside the requested alias column. See the minimal reproducible example below.
This appears related to #12000
To Reproduce
use arrow::array::{Int32Array, RecordBatch};
use arrow::datatypes::{DataType, Field, Schema};
use datafusion::error::Result as DataFusionResult;
use datafusion::prelude::SessionContext;
use datafusion_catalog::MemTable;
use datafusion_expr::col;
use datafusion_functions_window::expr_fn::lag;
use std::sync::Arc;

#[tokio::test]
async fn with_column_lag() -> DataFusionResult<()> {
    let schema = Schema::new(vec![Field::new("a", DataType::Int32, true)]);
    let batch = RecordBatch::try_new(
        Arc::new(schema.clone()),
        vec![Arc::new(Int32Array::from(vec![1, 2, 3, 4, 5]))],
    )?;

    let ctx = SessionContext::new();
    let provider = MemTable::try_new(Arc::new(schema), vec![vec![batch]])?;
    ctx.register_table("t", Arc::new(provider))?;

    let df = ctx.table("t").await?;
    let lag_expr = lag(col("a"), Some(1), None);
    df.with_column("lag_val", lag_expr)?.show().await?;

    Ok(())
}

Generates output:
+---+---------------------------------------------------------------------------------+---------+
| a | lag(t.a,Int64(1),NULL) ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING | lag_val |
+---+---------------------------------------------------------------------------------+---------+
| 1 | | |
| 2 | 1 | 1 |
| 3 | 2 | 2 |
| 4 | 3 | 3 |
| 5 | 4 | 4 |
+---+---------------------------------------------------------------------------------+---------+
Expected behavior
The output should contain only two columns, a and lag_val.
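To make the expected result concrete, here is a minimal sketch in plain Rust (standard library only, no DataFusion) of the values the two expected columns should hold for the input above. The helper names lag_column and expected_rows are hypothetical and exist only for illustration; they are not DataFusion APIs, and this does not reflect how DataFusion evaluates lag internally.

```rust
/// Hypothetical helper (not part of DataFusion): shifts `values` down by
/// `offset` rows, filling the first `offset` rows with `None` (NULL),
/// which mirrors lag(col, offset, NULL) over a single unordered partition.
fn lag_column(values: &[i32], offset: usize) -> Vec<Option<i32>> {
    (0..values.len())
        // `checked_sub` yields None for rows with no earlier row to look at.
        .map(|i| i.checked_sub(offset).map(|j| values[j]))
        .collect()
}

/// Hypothetical helper: pairs each value of `a` with its lag-by-one value,
/// giving the rows of the expected two-column output (a, lag_val).
fn expected_rows(a: &[i32]) -> Vec<(i32, Option<i32>)> {
    let lagged = lag_column(a, 1);
    a.iter().copied().zip(lagged).collect()
}
```

Calling expected_rows(&[1, 2, 3, 4, 5]) gives one (a, lag_val) pair per input row, with a NULL lag_val in the first row, and no third column.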
Additional context
No response