From 3ee1ed949e33907dece42664ac9e49d86aeb2940 Mon Sep 17 00:00:00 2001 From: Sergey Zhukov Date: Wed, 10 Dec 2025 15:34:41 +0300 Subject: [PATCH 1/3] Update README according to the new examples (#18529) --- datafusion-examples/README.md | 185 +++++++++++++++++++++++----------- 1 file changed, 125 insertions(+), 60 deletions(-) diff --git a/datafusion-examples/README.md b/datafusion-examples/README.md index a8611c8c34602..7508e8f565d53 100644 --- a/datafusion-examples/README.md +++ b/datafusion-examples/README.md @@ -39,66 +39,131 @@ git submodule update --init # Change to the examples directory cd datafusion-examples/examples -# Run the `dataframe` example: -# ... use the equivalent for other examples +# Run all examples in a group +cargo run --example <group> -- all + +# Run a specific example within a group +cargo run --example <group> -- <subcommand> + +# Run all examples in the `dataframe` group +cargo run --example dataframe -- all + +# Run a single example from the `dataframe` group +# (apply the same pattern for any other group) cargo run --example dataframe -- dataframe ``` -## Single Process - -- [`examples/udf/advanced_udaf.rs`](examples/udf/advanced_udaf.rs): Define and invoke a more complicated User Defined Aggregate Function (UDAF) -- [`examples/udf/advanced_udf.rs`](examples/udf/advanced_udf.rs): Define and invoke a more complicated User Defined Scalar Function (UDF) -- [`examples/udf/advanced_udwf.rs`](examples/udf/advanced_udwf.rs): Define and invoke a more complicated User Defined Window Function (UDWF) -- [`examples/data_io/parquet_advanced_index.rs`](examples/data_io/parquet_advanced_index.rs): Creates a detailed secondary index that covers the contents of several parquet files -- [`examples/udf/async_udf.rs`](examples/udf/async_udf.rs): Define and invoke an asynchronous User Defined Scalar Function (UDF) -- [`examples/query_planning/analyzer_rule.rs`](examples/query_planning/analyzer_rule.rs): Use a custom AnalyzerRule to change a query's semantics (row level access
control) -- [`examples/data_io/catalog.rs`](examples/data_io/catalog.rs): Register the table into a custom catalog -- [`examples/data_io/json_shredding.rs`](examples/data_io/json_shredding.rs): Shows how to implement custom filter rewriting for JSON shredding -- [`examples/proto/composed_extension_codec`](examples/proto/composed_extension_codec.rs): Example of using multiple extension codecs for serialization / deserialization -- [`examples/custom_data_source/csv_sql_streaming.rs`](examples/custom_data_source/csv_sql_streaming.rs): Build and run a streaming query plan from a SQL statement against a local CSV file -- [`examples/custom_data_source/csv_json_opener.rs`](examples/custom_data_source/csv_json_opener.rs): Use low level `FileOpener` APIs to read CSV/JSON into Arrow `RecordBatch`es -- [`examples/custom_data_source/custom_datasource.rs`](examples/custom_data_source/custom_datasource.rs): Run queries against a custom datasource (TableProvider) -- [`examples/custom_data_source/custom_file_casts.rs`](examples/custom_data_source/custom_file_casts.rs): Implement custom casting rules to adapt file schemas -- [`examples/custom_data_source/custom_file_format.rs`](examples/custom_data_source/custom_file_format.rs): Write data to a custom file format -- [`examples/external_dependency/dataframe_to_s3.rs`](examples/external_dependency/dataframe_to_s3.rs): Run a query using a DataFrame against a parquet file from s3 and writing back to s3 -- [`dataframe.rs`](examples/dataframe.rs): Run a query using a DataFrame API against parquet files, csv files, and in-memory data, including multiple subqueries. Also demonstrates the various methods to write out a DataFrame to a table, parquet file, csv file, and json file. 
-- [`examples/builtin_functions/date_time`](examples/builtin_functions/date_time.rs): Examples of date-time related functions and queries -- [`examples/custom_data_source/default_column_values.rs`](examples/custom_data_source/default_column_values.rs): Implement custom default value handling for missing columns using field metadata and PhysicalExprAdapter -- [`examples/dataframe/deserialize_to_struct.rs`](examples/dataframe/deserialize_to_struct.rs): Convert query results (Arrow ArrayRefs) into Rust structs -- [`examples/query_planning/expr_api.rs`](examples/query_planning/expr_api.rs): Create, execute, simplify, analyze and coerce `Expr`s -- [`examples/custom_data_source/file_stream_provider.rs`](examples/custom_data_source/file_stream_provider.rs): Run a query on `FileStreamProvider` which implements `StreamProvider` for reading and writing to arbitrary stream sources / sinks. -- [`flight/sql_server.rs`](examples/flight/sql_server.rs): Run DataFusion as a standalone process and execute SQL queries from Flight and and FlightSQL (e.g. 
JDBC) clients -- [`examples/builtin_functions/function_factory.rs`](examples/builtin_functions/function_factory.rs): Register `CREATE FUNCTION` handler to implement SQL macros -- [`examples/execution_monitoring/memory_pool_tracking.rs`](examples/execution_monitoring/memory_pool_tracking.rs): Demonstrates TrackConsumersPool for memory tracking and debugging with enhanced error messages -- [`examples/execution_monitoring/memory_pool_execution_plan.rs`](examples/execution_monitoring/memory_pool_execution_plan.rs): Shows how to implement memory-aware ExecutionPlan with memory reservation and spilling -- [`examples/execution_monitoring/tracing.rs`](examples/execution_monitoring/tracing.rs): Demonstrates the tracing injection feature for the DataFusion runtime -- [`examples/query_planning/optimizer_rule.rs`](examples/query_planning/optimizer_rule.rs): Use a custom OptimizerRule to replace certain predicates -- [`examples/data_io/parquet_embedded_index.rs`](examples/data_io/parquet_embedded_index.rs): Store a custom index inside a Parquet file and use it to speed up queries -- [`examples/data_io/parquet_encrypted.rs`](examples/data_io/parquet_encrypted.rs): Read and write encrypted Parquet files using DataFusion -- [`examples/data_io/parquet_encrypted_with_kms.rs`](examples/data_io/parquet_encrypted_with_kms.rs): Read and write encrypted Parquet files using an encryption factory -- [`examples/data_io/parquet_index.rs`](examples/data_io/parquet_index.rs): Create an secondary index over several parquet files and use it to speed up queries -- [`examples/data_io/parquet_exec_visitor.rs`](examples/data_io/parquet_exec_visitor.rs): Extract statistics by visiting an ExecutionPlan after execution -- [`examples/query_planning/parse_sql_expr.rs`](examples/query_planning/parse_sql_expr.rs): Parse SQL text into DataFusion `Expr`. 
-- [`examples/query_planning/plan_to_sql.rs`](examples/query_planning/plan_to_sql.rs): Generate SQL from DataFusion `Expr` and `LogicalPlan` -- [`examples/query_planning/planner_api.rs`](examples/query_planning/planner_api.rs) APIs to manipulate logical and physical plans -- [`examples/query_planning/pruning.rs`](examples/query_planning/pruning.rs): Use pruning to rule out files based on statistics -- [`examples/query_planning/thread_pools.rs`](examples/query_planning/thread_pools.rs): Demonstrates TrackConsumersPool for memory tracking and debugging with enhanced error messages and shows how to implement memory-aware ExecutionPlan with memory reservation and spilling -- [`examples/external_dependency/query_aws_s3.rs`](examples/external_dependency/query_aws_s3.rs): Configure `object_store` and run a query against files stored in AWS S3 -- [`examples/data_io/query_http_csv.rs`](examples/data_io/query_http_csv.rs): Configure `object_store` and run a query against files via HTTP -- [`examples/builtin_functions/regexp.rs`](examples/builtin_functions/regexp.rs): Examples of using regular expression functions -- [`examples/relation_planner/match_recognize.rs`](examples/relation_planner/match_recognize.rs): Use custom relation planner to implement MATCH_RECOGNIZE pattern matching -- [`examples/relation_planner/pivot_unpivot.rs`](examples/relation_planner/pivot_unpivot.rs): Use custom relation planner to implement PIVOT and UNPIVOT operations -- [`examples/relation_planner/table_sample.rs`](examples/relation_planner/table_sample.rs): Use custom relation planner to implement TABLESAMPLE clause -- [`examples/data_io/remote_catalog.rs`](examples/data_io/remote_catalog.rs): Examples of interfacing with a remote catalog (e.g. 
over a network) -- [`examples/udf/simple_udaf.rs`](examples/udf/simple_udaf.rs): Define and invoke a User Defined Aggregate Function (UDAF) -- [`examples/udf/simple_udf.rs`](examples/udf/simple_udf.rs): Define and invoke a User Defined Scalar Function (UDF) -- [`examples/udf/simple_udtf.rs`](examples/udf/simple_udtf.rs): Define and invoke a User Defined Table Function (UDTF) -- [`examples/udf/simple_udfw.rs`](examples/udf/simple_udwf.rs): Define and invoke a User Defined Window Function (UDWF) -- [`examples/sql_ops/analysis.rs`](examples/sql_ops/analysis.rs): Analyse SQL queries with DataFusion structures -- [`examples/sql_ops/frontend.rs`](examples/sql_ops/frontend.rs): Create LogicalPlans (only) from sql strings -- [`examples/sql_ops/dialect.rs`](examples/sql_ops/dialect.rs): Example of implementing a custom SQL dialect on top of `DFParser` -- [`examples/sql_ops/query.rs`](examples/sql_ops/query.rs): Query data using SQL (in memory `RecordBatches`, local Parquet files) - -## Distributed - -- [`examples/flight/client.rs`](examples/flight/client.rs) and [`examples/flight/server.rs`](examples/flight/server.rs): Run DataFusion as a standalone process and execute SQL queries from a client using the Arrow Flight protocol. 
+## Builtin Functions Examples + +| Group | Subcommand | Category | File Path | Description | +| ----------------- | ---------------- | -------------- | ------------------------------------------------ | ---------------------------------------------------------- | +| builtin_functions | date_time | Single Process | `examples/builtin_functions/date_time.rs` | Examples of date-time related functions and queries | +| builtin_functions | function_factory | Single Process | `examples/builtin_functions/function_factory.rs` | Register `CREATE FUNCTION` handler to implement SQL macros | +| builtin_functions | regexp | Single Process | `examples/builtin_functions/regexp.rs` | Examples of using regular expression functions | + +## Custom Data Source Examples + +| Group | Subcommand | Category | File Path | Description | +| ------------------ | --------------------- | -------------- | ------------------------------------------------------ | --------------------------------------------- | +| custom_data_source | csv_sql_streaming | Single Process | `examples/custom_data_source/csv_sql_streaming.rs` | Run a streaming SQL query against CSV data | +| custom_data_source | csv_json_opener | Single Process | `examples/custom_data_source/csv_json_opener.rs` | Use low-level FileOpener APIs for CSV/JSON | +| custom_data_source | custom_datasource | Single Process | `examples/custom_data_source/custom_datasource.rs` | Query a custom TableProvider | +| custom_data_source | custom_file_casts | Single Process | `examples/custom_data_source/custom_file_casts.rs` | Implement custom casting rules for schemas | +| custom_data_source | custom_file_format | Single Process | `examples/custom_data_source/custom_file_format.rs` | Write to a custom file format | +| custom_data_source | default_column_values | Single Process | `examples/custom_data_source/default_column_values.rs` | Custom default values using metadata | +| custom_data_source | file_stream_provider | Single Process | 
`examples/custom_data_source/file_stream_provider.rs` | Read/write via FileStreamProvider for streams | + +## Data IO Examples + +| Group | Subcommand | Category | File Path | Description | +| ------- | -------------------------- | -------------- | ------------------------------------------------ | ---------------------------------------------------------------- | +| data_io | catalog | Single Process | `examples/data_io/catalog.rs` | Register tables into a custom catalog | +| data_io | json_shredding | Single Process | `examples/data_io/json_shredding.rs` | Implement custom filter rewriting for JSON shredding | +| data_io | parquet_adv_idx | Single Process | `examples/data_io/parquet_advanced_index.rs` | Creates a detailed secondary index across multiple parquet files | +| data_io | parquet_emb_idx | Single Process | `examples/data_io/parquet_embedded_index.rs` | Store a custom index inside Parquet files | +| data_io | parquet_enc | Single Process | `examples/data_io/parquet_encrypted.rs` | Read & write encrypted Parquet files | +| data_io | parquet_enc_with_kms | Single Process | `examples/data_io/parquet_encrypted_with_kms.rs` | Encrypted Parquet I/O using a KMS-backed encryption factory | +| data_io | parquet_exec_visitor | Single Process | `examples/data_io/parquet_exec_visitor.rs` | Extract statistics by visiting an ExecutionPlan | +| data_io | parquet_idx | Single Process | `examples/data_io/parquet_index.rs` | Create a secondary index over several parquet files | +| data_io | query_http_csv | Single Process | `examples/data_io/query_http_csv.rs` | Query CSV files via HTTP using object_store | +| data_io | remote_catalog | Single Process | `examples/data_io/remote_catalog.rs` | Interact with a remote catalog | + +## DataFrame Examples + +| Group | Subcommand | Category | File Path | Description | +| --------- | --------------------- | -------------- | --------------------------------------------- | 
----------------------------------------------------------------------------- | +| dataframe | dataframe | Single Process | `examples/dataframe.rs` | Query DataFrames from Parquet/CSV/memory and write output to multiple formats | +| dataframe | deserialize_to_struct | Single Process | `examples/dataframe/deserialize_to_struct.rs` | Convert Arrow arrays into Rust structs | + +## Execution Monitoring Examples + +| Group | Subcommand | Category | File Path | Description | +| -------------------- | -------------------------- | -------------- | ------------------------------------------------------------- | ---------------------------------------------------- | +| execution_monitoring | mem_pool_exec_plan | Single Process | `examples/execution_monitoring/memory_pool_execution_plan.rs` | Memory-aware ExecutionPlan with spilling | +| execution_monitoring | mem_pool_tracking | Single Process | `examples/execution_monitoring/memory_pool_tracking.rs` | Demonstrates memory tracking with TrackConsumersPool | +| execution_monitoring | tracing | Single Process | `examples/execution_monitoring/tracing.rs` | Demonstrates tracing injection in DataFusion runtime | + +## External Dependency Examples + +| Group | Subcommand | Category | File Path | Description | +| ------------------- | --------------- | -------------- | ------------------------------------------------- | ---------------------------------------- | +| external_dependency | dataframe_to_s3 | Single Process | `examples/external_dependency/dataframe_to_s3.rs` | Query DataFrames and write results to S3 | +| external_dependency | query_aws_s3 | Single Process | `examples/external_dependency/query_aws_s3.rs` | Query S3-backed data using object_store | + +## Flight Examples + +| Group | Subcommand | Category | File Path | Description | +| ------ | ------------- | ----------- | --------------------------- | ------------------------------------------------------ | +| flight | server | Distributed | `examples/flight/server.rs` | 
Run DataFusion server accepting FlightSQL/JDBC queries | +| flight | client | Distributed | `examples/flight/client.rs` | Execute SQL queries using the Arrow Flight protocol | +| flight | sql_server | Distributed | `examples/flight/sql_server.rs` | Run DataFusion as a standalone process and execute SQL queries from JDBC clients | + +## Proto Examples + +| Group | Subcommand | Category | File Path | Description | +| ----- | ------------------------ | -------------- | -------------------------------------------- | --------------------------------------------------------------- | +| proto | composed_extension_codec | Single Process | `examples/proto/composed_extension_codec.rs` | Use multiple extension codecs for serialization/deserialization | + +## Query Planning Examples + +| Group | Subcommand | Category | File Path | Description | +| -------------- | -------------- | -------------- | ------------------------------------------- | -------------------------------------------------------- | +| query_planning | analyzer_rule | Single Process | `examples/query_planning/analyzer_rule.rs` | Custom AnalyzerRule to change query semantics | +| query_planning | expr_api | Single Process | `examples/query_planning/expr_api.rs` | Create, execute, analyze, and coerce Exprs | +| query_planning | optimizer_rule | Single Process | `examples/query_planning/optimizer_rule.rs` | Replace predicates via a custom OptimizerRule | +| query_planning | parse_sql_expr | Single Process | `examples/query_planning/parse_sql_expr.rs` | Parse SQL text into DataFusion Expr | +| query_planning | plan_to_sql | Single Process | `examples/query_planning/plan_to_sql.rs` | Generate SQL from expressions or plans | +| query_planning | planner_api | Single Process | `examples/query_planning/planner_api.rs` | APIs for manipulating logical and physical plans | +| query_planning | pruning | Single Process | `examples/query_planning/pruning.rs` | Use pruning to skip irrelevant files | +| query_planning | 
thread_pools | Single Process | `examples/query_planning/thread_pools.rs` | Demonstrates memory tracking, spilling & execution pools | + +## Relation Planner Examples + +| Group | Subcommand | Category | File Path | Description | +| ---------------- | --------------- | -------------- | ---------------------------------------------- | ------------------------------------------ | +| relation_planner | match_recognize | Single Process | `examples/relation_planner/match_recognize.rs` | Implement MATCH_RECOGNIZE pattern matching | +| relation_planner | pivot_unpivot | Single Process | `examples/relation_planner/pivot_unpivot.rs` | Implement PIVOT / UNPIVOT | +| relation_planner | table_sample | Single Process | `examples/relation_planner/table_sample.rs` | Implement TABLESAMPLE | + +## SQL Ops Examples + +| Group | Subcommand | Category | File Path | Description | +| ------- | ---------- | -------------- | ------------------------------ | ---------------------------------------------- | +| sql_ops | analysis | Single Process | `examples/sql_ops/analysis.rs` | Analyze SQL queries with DataFusion structures | +| sql_ops | dialect | Single Process | `examples/sql_ops/dialect.rs` | Implement a custom SQL dialect | +| sql_ops | frontend | Single Process | `examples/sql_ops/frontend.rs` | Build LogicalPlans from SQL strings | +| sql_ops | query | Single Process | `examples/sql_ops/query.rs` | Query data via SQL | + + +## UDF Examples + +| Group | Subcommand | Category | File Path | Description | +| ----- | ------------- | -------------- | ------------------------------- | ---------------------------------------------------------------------------- | +| udf | adv_udaf | Single Process | `examples/udf/advanced_udaf.rs` | Define and invoke a more complicated User Defined Aggregate Function (UDAF) | +| udf | adv_udf | Single Process | `examples/udf/advanced_udf.rs` | Define and invoke a more complicated User Defined Scalar Function (UDF) | +| udf | adv_udwf | Single Process | 
`examples/udf/advanced_udwf.rs` | Define and invoke a more complicated User Defined Window Function (UDWF) | +| udf | async_udf | Single Process | `examples/udf/async_udf.rs` | Define and invoke an asynchronous User Defined Scalar Function (UDF) | +| udf | udaf | Single Process | `examples/udf/simple_udaf.rs` | Define and invoke a User Defined Aggregate Function (UDAF) | +| udf | udf | Single Process | `examples/udf/simple_udf.rs` | Define and invoke a User Defined Scalar Function (UDF) | +| udf | udtf | Single Process | `examples/udf/simple_udtf.rs` | Define and invoke a User Defined Table Function (UDTF) | +| udf | udfw | Single Process | `examples/udf/simple_udfw.rs` | Define and invoke a User Defined Window Function (UDWF) | From 439c9d6a39214392d36a2a84692cfed3d6ac4479 Mon Sep 17 00:00:00 2001 From: Sergey Zhukov Date: Thu, 11 Dec 2025 17:02:52 +0300 Subject: [PATCH 2/3] remove Group & Category, file path as a link --- datafusion-examples/README.md | 182 +++++++++++++++++++--------------- 1 file changed, 102 insertions(+), 80 deletions(-) diff --git a/datafusion-examples/README.md b/datafusion-examples/README.md index 7508e8f565d53..d0a75322a1c67 100644 --- a/datafusion-examples/README.md +++ b/datafusion-examples/README.md @@ -54,116 +54,138 @@ cargo run --example dataframe -- dataframe ``` ## Builtin Functions Examples +### Group: `builtin_functions` +#### Category: Single Process +| Subcommand | File Path | Description | +| ---------------- | -------------------------------------------------------------------------------------------------- | ---------------------------------------------------------- | +| date_time | [`builtin_functions/date_time.rs`](examples/builtin_functions/date_time.rs) | Examples of date-time related functions and queries | +| function_factory | [`builtin_functions/function_factory.rs`](examples/builtin_functions/function_factory.rs) | Register `CREATE FUNCTION` handler to implement SQL macros | +| regexp | 
[`builtin_functions/regexp.rs`](examples/builtin_functions/regexp.rs) | Examples of using regular expression functions | -| Group | Subcommand | Category | File Path | Description | -| ----------------- | ---------------- | -------------- | ------------------------------------------------ | ---------------------------------------------------------- | -| builtin_functions | date_time | Single Process | `examples/builtin_functions/date_time.rs` | Examples of date-time related functions and queries | -| builtin_functions | function_factory | Single Process | `examples/builtin_functions/function_factory.rs` | Register `CREATE FUNCTION` handler to implement SQL macros | -| builtin_functions | regexp | Single Process | `examples/builtin_functions/regexp.rs` | Examples of using regular expression functions | ## Custom Data Source Examples +### Group: `custom_data_source` +#### Category: Single Process +| Subcommand | File Path | Description | +| --------------------- | -------------------------------------------------------------------------------------------------------------- | --------------------------------------------- | +| csv_sql_streaming | [`custom_data_source/csv_sql_streaming.rs`](examples/custom_data_source/csv_sql_streaming.rs) | Run a streaming SQL query against CSV data | +| csv_json_opener | [`custom_data_source/csv_json_opener.rs`](examples/custom_data_source/csv_json_opener.rs) | Use low-level FileOpener APIs for CSV/JSON | +| custom_datasource | [`custom_data_source/custom_datasource.rs`](examples/custom_data_source/custom_datasource.rs) | Query a custom TableProvider | +| custom_file_casts | [`custom_data_source/custom_file_casts.rs`](examples/custom_data_source/custom_file_casts.rs) | Implement custom casting rules | +| custom_file_format | [`custom_data_source/custom_file_format.rs`](examples/custom_data_source/custom_file_format.rs) | Write to a custom file format | +| default_column_values | 
[`custom_data_source/default_column_values.rs`](examples/custom_data_source/default_column_values.rs) | Custom default values using metadata | +| file_stream_provider | [`custom_data_source/file_stream_provider.rs`](examples/custom_data_source/file_stream_provider.rs) | Read/write via FileStreamProvider for streams | -| Group | Subcommand | Category | File Path | Description | -| ------------------ | --------------------- | -------------- | ------------------------------------------------------ | --------------------------------------------- | -| custom_data_source | csv_sql_streaming | Single Process | `examples/custom_data_source/csv_sql_streaming.rs` | Run a streaming SQL query against CSV data | -| custom_data_source | csv_json_opener | Single Process | `examples/custom_data_source/csv_json_opener.rs` | Use low-level FileOpener APIs for CSV/JSON | -| custom_data_source | custom_datasource | Single Process | `examples/custom_data_source/custom_datasource.rs` | Query a custom TableProvider | -| custom_data_source | custom_file_casts | Single Process | `examples/custom_data_source/custom_file_casts.rs` | Implement custom casting rules for schemas | -| custom_data_source | custom_file_format | Single Process | `examples/custom_data_source/custom_file_format.rs` | Write to a custom file format | -| custom_data_source | default_column_values | Single Process | `examples/custom_data_source/default_column_values.rs` | Custom default values using metadata | -| custom_data_source | file_stream_provider | Single Process | `examples/custom_data_source/file_stream_provider.rs` | Read/write via FileStreamProvider for streams | ## Data IO Examples +### Group: `data_io` +#### Category: Single Process +| Subcommand | File Path | Description | +| -------------------- | -------------------------------------------------------------------------------------------------- | ------------------------------------------------------ | +| catalog | 
[`data_io/catalog.rs`](examples/data_io/catalog.rs) | Register tables into a custom catalog | +| json_shredding | [`data_io/json_shredding.rs`](examples/data_io/json_shredding.rs) | Implement filter rewriting for JSON shredding | +| parquet_adv_idx | [`data_io/parquet_advanced_index.rs`](examples/data_io/parquet_advanced_index.rs) | Create a secondary index across multiple parquet files | +| parquet_emb_idx | [`data_io/parquet_embedded_index.rs`](examples/data_io/parquet_embedded_index.rs) | Store a custom index inside Parquet files | +| parquet_enc | [`data_io/parquet_encrypted.rs`](examples/data_io/parquet_encrypted.rs) | Read & write encrypted Parquet files | +| parquet_enc_with_kms | [`data_io/parquet_encrypted_with_kms.rs`](examples/data_io/parquet_encrypted_with_kms.rs) | Encrypted Parquet I/O using a KMS-backed factory | +| parquet_exec_visitor | [`data_io/parquet_exec_visitor.rs`](examples/data_io/parquet_exec_visitor.rs) | Extract statistics by visiting an ExecutionPlan | +| parquet_idx | [`data_io/parquet_index.rs`](examples/data_io/parquet_index.rs) | Create a secondary index | +| query_http_csv | [`data_io/query_http_csv.rs`](examples/data_io/query_http_csv.rs) | Query CSV files via HTTP | +| remote_catalog | [`data_io/remote_catalog.rs`](examples/data_io/remote_catalog.rs) | Interact with a remote catalog | -| Group | Subcommand | Category | File Path | Description | -| ------- | -------------------------- | -------------- | ------------------------------------------------ | ---------------------------------------------------------------- | -| data_io | catalog | Single Process | `examples/data_io/catalog.rs` | Register tables into a custom catalog | -| data_io | json_shredding | Single Process | `examples/data_io/json_shredding.rs` | Implement custom filter rewriting for JSON shredding | -| data_io | parquet_adv_idx | Single Process | `examples/data_io/parquet_advanced_index.rs` | Creates a detailed secondary index across multiple parquet files | -| 
data_io | parquet_emb_idx | Single Process | `examples/data_io/parquet_embedded_index.rs` | Store a custom index inside Parquet files | -| data_io | parquet_enc | Single Process | `examples/data_io/parquet_encrypted.rs` | Read & write encrypted Parquet files | -| data_io | parquet_enc_with_kms | Single Process | `examples/data_io/parquet_encrypted_with_kms.rs` | Encrypted Parquet I/O using a KMS-backed encryption factory | -| data_io | parquet_exec_visitor | Single Process | `examples/data_io/parquet_exec_visitor.rs` | Extract statistics by visiting an ExecutionPlan | -| data_io | parquet_idx | Single Process | `examples/data_io/parquet_index.rs` | Create a secondary index over several parquet files | -| data_io | query_http_csv | Single Process | `examples/data_io/query_http_csv.rs` | Query CSV files via HTTP using object_store | -| data_io | remote_catalog | Single Process | `examples/data_io/remote_catalog.rs` | Interact with a remote catalog | ## DataFrame Examples +### Group: `dataframe` +#### Category: Single Process +| Subcommand | File Path | Description | +| --------------------- | -------------------------------------------------------------------------------------------- | ------------------------------------------------------ | +| dataframe | [`dataframe/dataframe.rs`](examples/dataframe/dataframe.rs) | Query DataFrames from various sources and write output | +| deserialize_to_struct | [`dataframe/deserialize_to_struct.rs`](examples/dataframe/deserialize_to_struct.rs) | Convert Arrow arrays into Rust structs | -| Group | Subcommand | Category | File Path | Description | -| --------- | --------------------- | -------------- | --------------------------------------------- | ----------------------------------------------------------------------------- | -| dataframe | dataframe | Single Process | `examples/dataframe.rs` | Query DataFrames from Parquet/CSV/memory and write output to multiple formats | -| dataframe | deserialize_to_struct | Single Process | 
`examples/dataframe/deserialize_to_struct.rs` | Convert Arrow arrays into Rust structs |
 ## Execution Monitoring Examples
+### Group: `execution_monitoring`
+#### Category: Single Process
+| Subcommand | File Path | Description |
+| ------------------ | ---------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------- |
+| mem_pool_exec_plan | [`execution_monitoring/memory_pool_execution_plan.rs`](examples/execution_monitoring/memory_pool_execution_plan.rs) | Memory-aware ExecutionPlan with spilling |
+| mem_pool_tracking | [`execution_monitoring/memory_pool_tracking.rs`](examples/execution_monitoring/memory_pool_tracking.rs) | Demonstrates memory tracking |
+| tracing | [`execution_monitoring/tracing.rs`](examples/execution_monitoring/tracing.rs) | Demonstrates tracing integration |
-| Group | Subcommand | Category | File Path | Description |
-| -------------------- | -------------------------- | -------------- | ------------------------------------------------------------- | ---------------------------------------------------- |
-| execution_monitoring | mem_pool_exec_plan | Single Process | `examples/execution_monitoring/memory_pool_execution_plan.rs` | Memory-aware ExecutionPlan with spilling |
-| execution_monitoring | mem_pool_tracking | Single Process | `examples/execution_monitoring/memory_pool_tracking.rs` | Demonstrates memory tracking with TrackConsumersPool |
-| execution_monitoring | tracing | Single Process | `examples/execution_monitoring/tracing.rs` | Demonstrates tracing injection in DataFusion runtime |
 ## External Dependency Examples
+### Group: `external_dependency`
+#### Category: Single Process
+| Subcommand | File Path | Description |
+| --------------- | ---------------------------------------------------------------------------------------------------- | ---------------------------------------- |
+| dataframe_to_s3 | [`external_dependency/dataframe_to_s3.rs`](examples/external_dependency/dataframe_to_s3.rs) | Query DataFrames and write results to S3 |
+| query_aws_s3 | [`external_dependency/query_aws_s3.rs`](examples/external_dependency/query_aws_s3.rs) | Query S3-backed data using object_store |
-| Group | Subcommand | Category | File Path | Description |
-| ------------------- | --------------- | -------------- | ------------------------------------------------- | ---------------------------------------- |
-| external_dependency | dataframe_to_s3 | Single Process | `examples/external_dependency/dataframe_to_s3.rs` | Query DataFrames and write results to S3 |
-| external_dependency | query_aws_s3 | Single Process | `examples/external_dependency/query_aws_s3.rs` | Query S3-backed data using object_store |
 ## Flight Examples
+### Group: `flight`
+#### Category: Distributed
+| Subcommand | File Path | Description |
+| ---------- | ---------------------------------------------------------------- | ------------------------------------------------------ |
+| server | [`flight/server.rs`](examples/flight/server.rs) | Run DataFusion server accepting FlightSQL/JDBC queries |
+| client | [`flight/client.rs`](examples/flight/client.rs) | Execute SQL queries via Arrow Flight protocol |
+| sql_server | [`flight/sql_server.rs`](examples/flight/sql_server.rs) | Standalone SQL server for JDBC clients |
-| Group | Subcommand | Category | File Path | Description |
-| ------ | ------------- | ----------- | --------------------------- | ------------------------------------------------------ |
-| flight | server | Distributed | `examples/flight/server.rs` | Run DataFusion server accepting FlightSQL/JDBC queries |
-| flight | client | Distributed | `examples/flight/client.rs` | Execute SQL queries using the Arrow Flight protocol |
-| flight | sql_server | Distributed | `examples/flight/sql_server.rs` | Run DataFusion as a standalone process and execute SQL queries from JDBC clients |
 ## Proto Examples
+### Group: `proto`
+#### Category: Single Process
+| Subcommand | File Path | Description |
+| ------------------------ | ------------------------------------------------------------------------------------------ | --------------------------------------------------------------- |
+| composed_extension_codec | [`proto/composed_extension_codec.rs`](examples/proto/composed_extension_codec.rs) | Use multiple extension codecs for serialization/deserialization |
-| Group | Subcommand | Category | File Path | Description |
-| ----- | ------------------------ | -------------- | -------------------------------------------- | --------------------------------------------------------------- |
-| proto | composed_extension_codec | Single Process | `examples/proto/composed_extension_codec.rs` | Use multiple extension codecs for serialization/deserialization |
 ## Query Planning Examples
+### Group: `query_planning`
+#### Category: Single Process
+| Subcommand | File Path | Description |
+| -------------- | ---------------------------------------------------------------------------------------- | -------------------------------------------------------- |
+| analyzer_rule | [`query_planning/analyzer_rule.rs`](examples/query_planning/analyzer_rule.rs) | Custom AnalyzerRule to change query semantics |
+| expr_api | [`query_planning/expr_api.rs`](examples/query_planning/expr_api.rs) | Create, execute, analyze, and coerce Exprs |
+| optimizer_rule | [`query_planning/optimizer_rule.rs`](examples/query_planning/optimizer_rule.rs) | Replace predicates via a custom OptimizerRule |
+| parse_sql_expr | [`query_planning/parse_sql_expr.rs`](examples/query_planning/parse_sql_expr.rs) | Parse SQL into DataFusion Expr |
+| plan_to_sql | [`query_planning/plan_to_sql.rs`](examples/query_planning/plan_to_sql.rs) | Generate SQL from expressions or plans |
+| planner_api | [`query_planning/planner_api.rs`](examples/query_planning/planner_api.rs) | APIs for logical and physical plan manipulation |
+| pruning | [`query_planning/pruning.rs`](examples/query_planning/pruning.rs) | Use pruning to skip irrelevant files |
+| thread_pools | [`query_planning/thread_pools.rs`](examples/query_planning/thread_pools.rs) | Configure custom thread pools for DataFusion execution |
-| Group | Subcommand | Category | File Path | Description |
-| -------------- | -------------- | -------------- | ------------------------------------------- | -------------------------------------------------------- |
-| query_planning | analyzer_rule | Single Process | `examples/query_planning/analyzer_rule.rs` | Custom AnalyzerRule to change query semantics |
-| query_planning | expr_api | Single Process | `examples/query_planning/expr_api.rs` | Create, execute, analyze, and coerce Exprs |
-| query_planning | optimizer_rule | Single Process | `examples/query_planning/optimizer_rule.rs` | Replace predicates via a custom OptimizerRule |
-| query_planning | parse_sql_expr | Single Process | `examples/query_planning/parse_sql_expr.rs` | Parse SQL text into DataFusion Expr |
-| query_planning | plan_to_sql | Single Process | `examples/query_planning/plan_to_sql.rs` | Generate SQL from expressions or plans |
-| query_planning | planner_api | Single Process | `examples/query_planning/planner_api.rs` | APIs for manipulating logical and physical plans |
-| query_planning | pruning | Single Process | `examples/query_planning/pruning.rs` | Use pruning to skip irrelevant files |
-| query_planning | thread_pools | Single Process | `examples/query_planning/thread_pools.rs` | Demonstrates memory tracking, spilling & execution pools |
 ## Relation Planner Examples
+### Group: `relation_planner`
+#### Category: Single Process
+| Subcommand | File Path | Description |
+| --------------- | ---------------------------------------------------------------------------------------------- | ------------------------------------------ |
+| match_recognize | [`relation_planner/match_recognize.rs`](examples/relation_planner/match_recognize.rs) | Implement MATCH_RECOGNIZE pattern matching |
+| pivot_unpivot | [`relation_planner/pivot_unpivot.rs`](examples/relation_planner/pivot_unpivot.rs) | Implement PIVOT / UNPIVOT |
+| table_sample | [`relation_planner/table_sample.rs`](examples/relation_planner/table_sample.rs) | Implement TABLESAMPLE |
-| Group | Subcommand | Category | File Path | Description |
-| ---------------- | --------------- | -------------- | ---------------------------------------------- | ------------------------------------------ |
-| relation_planner | match_recognize | Single Process | `examples/relation_planner/match_recognize.rs` | Implement MATCH_RECOGNIZE pattern matching |
-| relation_planner | pivot_unpivot | Single Process | `examples/relation_planner/pivot_unpivot.rs` | Implement PIVOT / UNPIVOT |
-| relation_planner | table_sample | Single Process | `examples/relation_planner/table_sample.rs` | Implement TABLESAMPLE |
 ## SQL Ops Examples
-
-| Group | Subcommand | Category | File Path | Description |
-| ------- | ---------- | -------------- | ------------------------------ | ---------------------------------------------- |
-| sql_ops | analysis | Single Process | `examples/sql_ops/analysis.rs` | Analyze SQL queries with DataFusion structures |
-| sql_ops | dialect | Single Process | `examples/sql_ops/dialect.rs` | Implement a custom SQL dialect |
-| sql_ops | frontend | Single Process | `examples/sql_ops/frontend.rs` | Build LogicalPlans from SQL strings |
-| sql_ops | query | Single Process | `examples/sql_ops/query.rs` | Query data via SQL |
+### Group: `sql_ops`
+#### Category: Single Process
+| Subcommand | File Path | Description |
+| ---------- | -------------------------------------------------------------- | ------------------------------ |
+| analysis | [`sql_ops/analysis.rs`](examples/sql_ops/analysis.rs) | Analyze SQL queries |
+| dialect | [`sql_ops/dialect.rs`](examples/sql_ops/dialect.rs) | Implement a custom SQL dialect |
+| frontend | [`sql_ops/frontend.rs`](examples/sql_ops/frontend.rs) | Build LogicalPlans from SQL |
+| query | [`sql_ops/query.rs`](examples/sql_ops/query.rs) | Query data using SQL |
 ## UDF Examples
-
-| Group | Subcommand | Category | File Path | Description |
-| ----- | ------------- | -------------- | ------------------------------- | ---------------------------------------------------------------------------- |
-| udf | adv_udaf | Single Process | `examples/udf/advanced_udaf.rs` | Define and invoke a more complicated User Defined Aggregate Function (UDAF) |
-| udf | adv_udf | Single Process | `examples/udf/advanced_udf.rs` | Define and invoke a more complicated User Defined Scalar Function (UDF) |
-| udf | adv_udwf | Single Process | `examples/udf/advanced_udwf.rs` | Define and invoke a more complicated User Defined Window Function (UDWF) |
-| udf | async_udf | Single Process | `examples/udf/async_udf.rs` | Define and invoke an asynchronous User Defined Scalar Function (UDF) |
-| udf | udaf | Single Process | `examples/udf/simple_udaf.rs` | Define and invoke a User Defined Aggregate Function (UDAF) |
-| udf | udf | Single Process | `examples/udf/simple_udf.rs` | Define and invoke a User Defined Scalar Function (UDF) |
-| udf | udtf | Single Process | `examples/udf/simple_udtf.rs` | Define and invoke a User Defined Table Function (UDTF) |
-| udf | udfw | Single Process | `examples/udf/simple_udfw.rs` | Define and invoke a User Defined Window Function (UDWF) |
+### Group: `udf`
+#### Category: Single Process
+| Subcommand | File Path | Description |
+| ---------- | ---------------------------------------------------------------- | ----------------------------------------------- |
+| adv_udaf | [`udf/advanced_udaf.rs`](examples/udf/advanced_udaf.rs) | Advanced User Defined Aggregate Function (UDAF) |
+| adv_udf | [`udf/advanced_udf.rs`](examples/udf/advanced_udf.rs) | Advanced User Defined Scalar Function (UDF) |
+| adv_udwf | [`udf/advanced_udwf.rs`](examples/udf/advanced_udwf.rs) | Advanced User Defined Window Function (UDWF) |
+| async_udf | [`udf/async_udf.rs`](examples/udf/async_udf.rs) | Asynchronous User Defined Scalar Function |
+| udaf | [`udf/simple_udaf.rs`](examples/udf/simple_udaf.rs) | Simple UDAF example |
+| udf | [`udf/simple_udf.rs`](examples/udf/simple_udf.rs) | Simple UDF example |
+| udtf | [`udf/simple_udtf.rs`](examples/udf/simple_udtf.rs) | Simple UDTF example |
+| udwf | [`udf/simple_udwf.rs`](examples/udf/simple_udwf.rs) | Simple UDWF example |

From d03e3c4658ec5f997033f1a786e0be337c48c9ba Mon Sep 17 00:00:00 2001
From: Andrew Lamb
Date: Thu, 11 Dec 2025 15:19:38 -0500
Subject: [PATCH 3/3] prettier
---
 datafusion-examples/README.md | 109 +++++++++++++++++++++-------------
 1 file changed, 67 insertions(+), 42 deletions(-)
diff --git a/datafusion-examples/README.md b/datafusion-examples/README.md
index d0a75322a1c67..1469aad5417b8 100644
--- a/datafusion-examples/README.md
+++ b/datafusion-examples/README.md
@@ -54,20 +54,25 @@ cargo run --example dataframe -- dataframe
 ```
 ## Builtin Functions Examples
+
 ### Group: `builtin_functions`
+
 #### Category: Single Process
-| Subcommand | File Path | Description |
-| ---------------- | -------------------------------------------------------------------------------------------------- | ---------------------------------------------------------- |
+
+| Subcommand | File Path | Description |
+| ---------------- | ----------------------------------------------------------------------------------------- | ---------------------------------------------------------- |
 | date_time | [`builtin_functions/date_time.rs`](examples/builtin_functions/date_time.rs) | Examples of date-time related functions and queries |
 | function_factory | [`builtin_functions/function_factory.rs`](examples/builtin_functions/function_factory.rs) | Register `CREATE FUNCTION` handler to implement SQL macros |
 | regexp | [`builtin_functions/regexp.rs`](examples/builtin_functions/regexp.rs) | Examples of using regular expression functions |
-
 ## Custom Data Source Examples
+
 ### Group: `custom_data_source`
+
 #### Category: Single Process
-| Subcommand | File Path | Description |
-| --------------------- | -------------------------------------------------------------------------------------------------------------- | --------------------------------------------- |
+
+| Subcommand | File Path | Description |
+| --------------------- | ----------------------------------------------------------------------------------------------------- | --------------------------------------------- |
 | csv_sql_streaming | [`custom_data_source/csv_sql_streaming.rs`](examples/custom_data_source/csv_sql_streaming.rs) | Run a streaming SQL query against CSV data |
 | csv_json_opener | [`custom_data_source/csv_json_opener.rs`](examples/custom_data_source/csv_json_opener.rs) | Use low-level FileOpener APIs for CSV/JSON |
 | custom_datasource | [`custom_data_source/custom_datasource.rs`](examples/custom_data_source/custom_datasource.rs) | Query a custom TableProvider |
@@ -76,12 +81,14 @@ cargo run --example dataframe -- dataframe
 | default_column_values | [`custom_data_source/default_column_values.rs`](examples/custom_data_source/default_column_values.rs) | Custom default values using metadata |
 | file_stream_provider | [`custom_data_source/file_stream_provider.rs`](examples/custom_data_source/file_stream_provider.rs) | Read/write via FileStreamProvider for streams |
-
 ## Data IO Examples
+
 ### Group: `data_io`
+
 #### Category: Single Process
-| Subcommand | File Path | Description |
-| -------------------- | -------------------------------------------------------------------------------------------------- | ------------------------------------------------------ |
+
+| Subcommand | File Path | Description |
+| -------------------- | ----------------------------------------------------------------------------------------- | ------------------------------------------------------ |
 | catalog | [`data_io/catalog.rs`](examples/data_io/catalog.rs) | Register tables into a custom catalog |
 | json_shredding | [`data_io/json_shredding.rs`](examples/data_io/json_shredding.rs) | Implement filter rewriting for JSON shredding |
 | parquet_adv_idx | [`data_io/parquet_advanced_index.rs`](examples/data_io/parquet_advanced_index.rs) | Create a secondary index across multiple parquet files |
@@ -93,94 +100,112 @@ cargo run --example dataframe -- dataframe
 | query_http_csv | [`data_io/query_http_csv.rs`](examples/data_io/query_http_csv.rs) | Query CSV files via HTTP |
 | remote_catalog | [`data_io/remote_catalog.rs`](examples/data_io/remote_catalog.rs) | Interact with a remote catalog |
-
 ## DataFrame Examples
+
 ### Group: `dataframe`
+
 #### Category: Single Process
-| Subcommand | File Path | Description |
-| --------------------- | -------------------------------------------------------------------------------------------- | ------------------------------------------------------ |
+
+| Subcommand | File Path | Description |
+| --------------------- | ----------------------------------------------------------------------------------- | ------------------------------------------------------ |
 | dataframe | [`dataframe/dataframe.rs`](examples/dataframe/dataframe.rs) | Query DataFrames from various sources and write output |
 | deserialize_to_struct | [`dataframe/deserialize_to_struct.rs`](examples/dataframe/deserialize_to_struct.rs) | Convert Arrow arrays into Rust structs |
-
 ## Execution Monitoring Examples
+
 ### Group: `execution_monitoring`
+
 #### Category: Single Process
-| Subcommand | File Path | Description |
-| ------------------ | ---------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------- |
+
+| Subcommand | File Path | Description |
+| ------------------ | ------------------------------------------------------------------------------------------------------------------- | ---------------------------------------- |
 | mem_pool_exec_plan | [`execution_monitoring/memory_pool_execution_plan.rs`](examples/execution_monitoring/memory_pool_execution_plan.rs) | Memory-aware ExecutionPlan with spilling |
 | mem_pool_tracking | [`execution_monitoring/memory_pool_tracking.rs`](examples/execution_monitoring/memory_pool_tracking.rs) | Demonstrates memory tracking |
 | tracing | [`execution_monitoring/tracing.rs`](examples/execution_monitoring/tracing.rs) | Demonstrates tracing integration |
-
 ## External Dependency Examples
+
 ### Group: `external_dependency`
+
 #### Category: Single Process
-| Subcommand | File Path | Description |
-| --------------- | ---------------------------------------------------------------------------------------------------- | ---------------------------------------- |
+
+| Subcommand | File Path | Description |
+| --------------- | ------------------------------------------------------------------------------------------- | ---------------------------------------- |
 | dataframe_to_s3 | [`external_dependency/dataframe_to_s3.rs`](examples/external_dependency/dataframe_to_s3.rs) | Query DataFrames and write results to S3 |
 | query_aws_s3 | [`external_dependency/query_aws_s3.rs`](examples/external_dependency/query_aws_s3.rs) | Query S3-backed data using object_store |
-
 ## Flight Examples
+
 ### Group: `flight`
+
 #### Category: Distributed
-| Subcommand | File Path | Description |
-| ---------- | ---------------------------------------------------------------- | ------------------------------------------------------ |
+
+| Subcommand | File Path | Description |
+| ---------- | ------------------------------------------------------- | ------------------------------------------------------ |
 | server | [`flight/server.rs`](examples/flight/server.rs) | Run DataFusion server accepting FlightSQL/JDBC queries |
 | client | [`flight/client.rs`](examples/flight/client.rs) | Execute SQL queries via Arrow Flight protocol |
 | sql_server | [`flight/sql_server.rs`](examples/flight/sql_server.rs) | Standalone SQL server for JDBC clients |
-
 ## Proto Examples
+
 ### Group: `proto`
+
 #### Category: Single Process
-| Subcommand | File Path | Description |
-| ------------------------ | ------------------------------------------------------------------------------------------ | --------------------------------------------------------------- |
-| composed_extension_codec | [`proto/composed_extension_codec.rs`](examples/proto/composed_extension_codec.rs) | Use multiple extension codecs for serialization/deserialization |
+| Subcommand | File Path | Description |
+| ------------------------ | --------------------------------------------------------------------------------- | --------------------------------------------------------------- |
+| composed_extension_codec | [`proto/composed_extension_codec.rs`](examples/proto/composed_extension_codec.rs) | Use multiple extension codecs for serialization/deserialization |
 ## Query Planning Examples
+
 ### Group: `query_planning`
+
 #### Category: Single Process
-| Subcommand | File Path | Description |
-| -------------- | ---------------------------------------------------------------------------------------- | -------------------------------------------------------- |
-| analyzer_rule | [`query_planning/analyzer_rule.rs`](examples/query_planning/analyzer_rule.rs) | Custom AnalyzerRule to change query semantics |
-| expr_api | [`query_planning/expr_api.rs`](examples/query_planning/expr_api.rs) | Create, execute, analyze, and coerce Exprs |
-| optimizer_rule | [`query_planning/optimizer_rule.rs`](examples/query_planning/optimizer_rule.rs) | Replace predicates via a custom OptimizerRule |
-| parse_sql_expr | [`query_planning/parse_sql_expr.rs`](examples/query_planning/parse_sql_expr.rs) | Parse SQL into DataFusion Expr |
-| plan_to_sql | [`query_planning/plan_to_sql.rs`](examples/query_planning/plan_to_sql.rs) | Generate SQL from expressions or plans |
-| planner_api | [`query_planning/planner_api.rs`](examples/query_planning/planner_api.rs) | APIs for logical and physical plan manipulation |
-| pruning | [`query_planning/pruning.rs`](examples/query_planning/pruning.rs) | Use pruning to skip irrelevant files |
-| thread_pools | [`query_planning/thread_pools.rs`](examples/query_planning/thread_pools.rs) | Configure custom thread pools for DataFusion execution |
+| Subcommand | File Path | Description |
+| -------------- | ------------------------------------------------------------------------------- | ------------------------------------------------------ |
+| analyzer_rule | [`query_planning/analyzer_rule.rs`](examples/query_planning/analyzer_rule.rs) | Custom AnalyzerRule to change query semantics |
+| expr_api | [`query_planning/expr_api.rs`](examples/query_planning/expr_api.rs) | Create, execute, analyze, and coerce Exprs |
+| optimizer_rule | [`query_planning/optimizer_rule.rs`](examples/query_planning/optimizer_rule.rs) | Replace predicates via a custom OptimizerRule |
+| parse_sql_expr | [`query_planning/parse_sql_expr.rs`](examples/query_planning/parse_sql_expr.rs) | Parse SQL into DataFusion Expr |
+| plan_to_sql | [`query_planning/plan_to_sql.rs`](examples/query_planning/plan_to_sql.rs) | Generate SQL from expressions or plans |
+| planner_api | [`query_planning/planner_api.rs`](examples/query_planning/planner_api.rs) | APIs for logical and physical plan manipulation |
+| pruning | [`query_planning/pruning.rs`](examples/query_planning/pruning.rs) | Use pruning to skip irrelevant files |
+| thread_pools | [`query_planning/thread_pools.rs`](examples/query_planning/thread_pools.rs) | Configure custom thread pools for DataFusion execution |
 ## Relation Planner Examples
+
 ### Group: `relation_planner`
+
 #### Category: Single Process
-| Subcommand | File Path | Description |
-| --------------- | ---------------------------------------------------------------------------------------------- | ------------------------------------------ |
+
+| Subcommand | File Path | Description |
+| --------------- | ------------------------------------------------------------------------------------- | ------------------------------------------ |
 | match_recognize | [`relation_planner/match_recognize.rs`](examples/relation_planner/match_recognize.rs) | Implement MATCH_RECOGNIZE pattern matching |
 | pivot_unpivot | [`relation_planner/pivot_unpivot.rs`](examples/relation_planner/pivot_unpivot.rs) | Implement PIVOT / UNPIVOT |
 | table_sample | [`relation_planner/table_sample.rs`](examples/relation_planner/table_sample.rs) | Implement TABLESAMPLE |
-
 ## SQL Ops Examples
+
 ### Group: `sql_ops`
+
 #### Category: Single Process
-| Subcommand | File Path | Description |
-| ---------- | -------------------------------------------------------------- | ------------------------------ |
+
+| Subcommand | File Path | Description |
+| ---------- | ----------------------------------------------------- | ------------------------------ |
 | analysis | [`sql_ops/analysis.rs`](examples/sql_ops/analysis.rs) | Analyze SQL queries |
 | dialect | [`sql_ops/dialect.rs`](examples/sql_ops/dialect.rs) | Implement a custom SQL dialect |
 | frontend | [`sql_ops/frontend.rs`](examples/sql_ops/frontend.rs) | Build LogicalPlans from SQL |
 | query | [`sql_ops/query.rs`](examples/sql_ops/query.rs) | Query data using SQL |
-
 ## UDF Examples
+
 ### Group: `udf`
+
 #### Category: Single Process
-| Subcommand | File Path | Description |
-| ---------- | ---------------------------------------------------------------- | ----------------------------------------------- |
+
+| Subcommand | File Path | Description |
+| ---------- | ------------------------------------------------------- | ----------------------------------------------- |
 | adv_udaf | [`udf/advanced_udaf.rs`](examples/udf/advanced_udaf.rs) | Advanced User Defined Aggregate Function (UDAF) |
 | adv_udf | [`udf/advanced_udf.rs`](examples/udf/advanced_udf.rs) | Advanced User Defined Scalar Function (UDF) |
 | adv_udwf | [`udf/advanced_udwf.rs`](examples/udf/advanced_udwf.rs) | Advanced User Defined Window Function (UDWF) |