Disable create_default_catalog when the existing session state has a default catalog for SessionStateBuilder #11988

@goldmedal

Is your feature request related to a problem or challenge?

I found that a state built with SessionStateBuilder::new_from_existing loses the default catalog of the existing state. Consider the following case:

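    use std::sync::Arc;

    use datafusion::arrow::array::{ArrayRef, Int32Array, StringArray};
    use datafusion::arrow::record_batch::RecordBatch;
    use datafusion::error::Result;
    use datafusion::execution::session_state::SessionStateBuilder;
    use datafusion::prelude::SessionContext;

    fn employee_batch() -> RecordBatch {
        let name: ArrayRef = Arc::new(StringArray::from_iter_values([
            "Andy",
            "Andrew",
            "Oleks",
            "Chunchun",
            "Xiangpeng",
        ]));
        let age: ArrayRef = Arc::new(Int32Array::from(vec![11, 22, 33, 44, 55]));
        let position: ArrayRef = Arc::new(StringArray::from_iter_values([
            "Engineer", "Manager", "Engineer", "Manager", "Engineer",
        ]));
        RecordBatch::try_from_iter(vec![("name", name), ("age", age), ("position", position)])
            .unwrap()
    }

    // Assumes a tokio runtime, as in the DataFusion examples.
    #[tokio::main]
    async fn main() -> Result<()> {
        let ctx = SessionContext::new();
        ctx.register_batch("employee", employee_batch())?;
        // Look the table up through the original context's default catalog.
        let table = ctx.catalog("datafusion").unwrap().schema("public").unwrap().table("employee").await?;
        println!("{}", table.is_some());
        // Rebuild a state from the existing one and look the table up again.
        let new_state = SessionStateBuilder::new_from_existing(ctx.state()).build();
        let table = new_state.catalog_list().catalog("datafusion").unwrap().schema("public").unwrap().table("employee").await?;
        println!("{}", table.is_some());
        Ok(())
    }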

The output is:

true
false

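This behavior confused me. After some research, I found that we have a configuration option, create_default_catalog_and_schema, which is true by default. As far as I can tell, because the option is still true when the new state is built, a fresh default catalog is created and replaces the existing one, so the registered table is no longer found. However, I think the user would expect the new state to be exactly the same as the existing one, except for the session ID.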
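Until the behavior changes, one possible workaround is to switch the option off explicitly when rebuilding the state. The sketch below assumes that `SessionConfig::with_create_default_catalog_and_schema` and `SessionStateBuilder::with_config` behave as their names suggest, so that the catalog list cloned from the existing state should be left untouched. It also assumes a tokio runtime and uses a reduced single-column batch for brevity; I have not verified it against every DataFusion version.

```rust
use std::sync::Arc;

use datafusion::arrow::array::{ArrayRef, Int32Array};
use datafusion::arrow::record_batch::RecordBatch;
use datafusion::error::Result;
use datafusion::execution::session_state::SessionStateBuilder;
use datafusion::prelude::SessionContext;

#[tokio::main]
async fn main() -> Result<()> {
    let ctx = SessionContext::new();
    // Reduced single-column stand-in for the employee batch used in this issue.
    let age: ArrayRef = Arc::new(Int32Array::from(vec![11, 22, 33]));
    let batch = RecordBatch::try_from_iter(vec![("age", age)])?;
    ctx.register_batch("employee", batch)?;

    let state = ctx.state();
    // Copy the existing config and turn off default catalog creation,
    // so that `build()` does not register a new default catalog.
    let config = state
        .config()
        .clone()
        .with_create_default_catalog_and_schema(false);
    let new_state = SessionStateBuilder::new_from_existing(state)
        .with_config(config)
        .build();

    // Same lookup as in the reproduction, against the rebuilt state.
    let table = new_state
        .catalog_list()
        .catalog("datafusion")
        .unwrap()
        .schema("public")
        .unwrap()
        .table("employee")
        .await?;
    println!("{}", table.is_some());
    Ok(())
}
```

Note that this changes the config of the new state, so it is a workaround rather than a way to get an exact copy of the existing state.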

Describe the solution you'd like

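I plan to add a check in SessionStateBuilder::new_from_existing. If the default catalog already exists in the existing state, the new builder disables create_default_catalog_and_schema, so that building the new state keeps the existing default catalog instead of creating a new one.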
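To make the intended logic concrete, here is a std-only toy model of the check. It is not DataFusion code: `Catalog`, `State`, `Builder`, and `new_from_existing_checked` are hypothetical stand-ins, and `build` only imitates the replacement behavior described in this issue.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a catalog: just a list of table names.
#[derive(Clone, Debug, Default)]
struct Catalog {
    tables: Vec<String>,
}

/// Hypothetical stand-in for `SessionState`.
#[derive(Clone, Debug)]
struct State {
    default_catalog: String,
    create_default_catalog_and_schema: bool,
    catalogs: HashMap<String, Catalog>,
}

/// Hypothetical stand-in for `SessionStateBuilder`.
struct Builder {
    state: State,
}

impl Builder {
    /// Models the current behavior: every field is carried over unchanged.
    fn new_from_existing(existing: State) -> Self {
        Builder { state: existing }
    }

    /// Models the proposal: if the default catalog already exists,
    /// switch off default catalog creation for the new state.
    fn new_from_existing_checked(existing: State) -> Self {
        let mut state = existing;
        if state.catalogs.contains_key(&state.default_catalog) {
            state.create_default_catalog_and_schema = false;
        }
        Builder { state }
    }

    /// Imitates the build step: with the flag on, an empty default catalog
    /// is registered, replacing any catalog with the same name.
    fn build(mut self) -> State {
        if self.state.create_default_catalog_and_schema {
            let name = self.state.default_catalog.clone();
            self.state.catalogs.insert(name, Catalog::default());
        }
        self.state
    }
}

/// Returns true if `table` is registered in the state's default catalog.
fn has_table(state: &State, table: &str) -> bool {
    state
        .catalogs
        .get(&state.default_catalog)
        .map_or(false, |c| c.tables.iter().any(|t| t == table))
}

/// A state whose default catalog already contains the `employee` table.
fn existing_state() -> State {
    let mut catalogs = HashMap::new();
    catalogs.insert(
        "datafusion".to_string(),
        Catalog { tables: vec!["employee".to_string()] },
    );
    State {
        default_catalog: "datafusion".to_string(),
        create_default_catalog_and_schema: true,
        catalogs,
    }
}

fn main() {
    let current = Builder::new_from_existing(existing_state()).build();
    let proposed = Builder::new_from_existing_checked(existing_state()).build();
    println!("{}", has_table(&current, "employee")); // false
    println!("{}", has_table(&proposed, "employee")); // true
}
```

In this model the check lives in the constructor rather than in `build`, so a caller who really wants a fresh default catalog could still turn the option back on before building.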

Describe alternatives you've considered

If we don't make this change, I think we should enhance the documentation for SessionStateBuilder::new_from_existing. Currently, the documentation only mentions that the session ID will be unset, while other settings remain the same.

/// Returns a new [SessionStateBuilder] based on an existing [SessionState]
/// The session id for the new builder will be unset; all other fields will
/// be cloned from what is set in the provided session state

Additional context

No response

Labels: enhancement