Skip to content

Conversation

@nicosuave
Copy link
Member

Fix GROUP BY spacing detection in semantic expansion and view processing. Add SQLLogic coverage for GROUP BY spacing in semantic queries. Tests: cd /Users/nico/conductor/workspaces/yardstick/yeosu-v1/yardstick-rs && cargo build --release
mkdir -p build/release
cmake -DEXTENSION_STATIC_BUILD=1 -DDUCKDB_EXTENSION_CONFIGS='/Users/nico/conductor/workspaces/yardstick/yeosu-v1/extension_config.cmake' -DOSX_BUILD_ARCH= -DDUCKDB_EXPLICIT_PLATFORM='' -DCUSTOM_LINKER= -DOVERRIDE_GIT_DESCRIBE="" -DUNITTEST_ROOT_DIRECTORY="/Users/nico/conductor/workspaces/yardstick/yeosu-v1/" -DBENCHMARK_ROOT_DIRECTORY="/Users/nico/conductor/workspaces/yardstick/yeosu-v1/" -DENABLE_UNITTEST_CPP_TESTS=FALSE -DBUILD_EXTENSION_TEST_DEPS=default -DCMAKE_BUILD_TYPE=Release -S "./duckdb/" -B build/release
-- git hash d1dc88f950, version v1.4.3, extension folder v1.4.3
-- Extensions will be deployed to: /Users/nico/conductor/workspaces/yardstick/yeosu-v1/build/release/repository
-- Load extension 'yardstick' from '/Users/nico/conductor/workspaces/yardstick/yeosu-v1' @ 4ab83db
-- Load extension 'core_functions' from '/Users/nico/conductor/workspaces/yardstick/yeosu-v1/duckdb/extensions' @ v1.4.3
-- Load extension 'parquet' from '/Users/nico/conductor/workspaces/yardstick/yeosu-v1/duckdb/extensions' @ v1.4.3
-- Rust_CARGO_TARGET: aarch64-apple-darwin
-- OS_NAME: osx
-- OS_ARCH: arm64
-- OSX_BUILD_ARCH:
-- Rust Toolchain: stable-aarch64-apple-darwin
-- Using Corrosion as a subdirectory
-- Extensions linked into DuckDB: [yardstick, core_functions, parquet]
-- Tests loaded for extensions: [yardstick]
-- Configuring done (1.7s)
-- Generating done (0.8s)
-- Build files have been written to: /Users/nico/conductor/workspaces/yardstick/yeosu-v1/build/release
cmake --build build/release --config Release
[ 8%] Built target duckdb_zstd
[ 8%] Built target duckdb_platform_binary
[ 8%] Built target duckdb_platform
[ 8%] Built target cargo-prebuild_yardstick
Copying byproducts libyardstick.a to /Users/nico/conductor/workspaces/yardstick/yeosu-v1/build/release/extension/yardstick
[ 8%] Built target _cargo-build_yardstick
[ 8%] Built target cargo-build_yardstick
[ 8%] Built target duckdb_fmt
[ 10%] Built target duckdb_pg_query
[ 10%] Built target duckdb_sorting
[ 10%] Built target duckdb_common_types
[ 10%] Built target duckdb_common_types_column
[ 10%] Built target duckdb_common_types_row
[ 10%] Built target duckdb_value_operations
[ 13%] Built target duckdb_vector_operations
[ 13%] Built target duckdb_logging
[ 13%] Built target duckdb_execution
[ 13%] Built target duckdb_expression_executor
[ 14%] Built target duckdb_nested_loop_join
[ 14%] Built target duckdb_operator_aggregate
[ 14%] Built target duckdb_csv_buffer_manager
[ 14%] Built target duckdb_csv_encode
[ 14%] Built target duckdb_csv_scanner
[ 15%] Built target duckdb_csv_sniffer
[ 15%] Built target duckdb_csv_state_machine
[ 15%] Built target duckdb_operator_csv_table_function
[ 15%] Built target duckdb_csv_util
[ 15%] Built target duckdb_operator_filter
[ 16%] Built target duckdb_operator_helper
[ 16%] Built target duckdb_operator_join
[ 16%] Built target duckdb_operator_order
[ 16%] Built target duckdb_operator_persistent
[ 16%] Built target duckdb_operator_projection
[ 17%] Built target duckdb_operator_scan
[ 17%] Built target duckdb_operator_schema
[ 17%] Built target duckdb_operator_set
[ 17%] Built target duckdb_physical_plan
[ 18%] Built target duckdb_execution_index
[ 18%] Built target duckdb_execution_index_art
[ 18%] Built target duckdb_sample
[ 18%] Built target duckdb_main
[ 19%] Built target duckdb_main_capi
[ 19%] Built target duckdb_main_capi_cast
[ 19%] Built target duckdb_generated_extension_loader
[ 20%] Built target duckdb_main_extension
[ 21%] Built target duckdb_common_http
[ 21%] Built target duckdb_main_relation
[ 21%] Built target duckdb_main_secret
[ 22%] Built target duckdb_main_settings
[ 22%] Built target duckdb_main_buffered_data
[ 22%] Built target duckdb_main_chunk_scan_state
[ 22%] Built target duckdb_parallel
[ 22%] Built target duckdb_storage
[ 22%] Built target duckdb_storage_buffer
[ 22%] Built target duckdb_storage_checkpoint
[ 23%] Built target duckdb_storage_compression
[ 23%] Built target duckdb_storage_compression_chimp
[ 23%] Built target duckdb_storage_compression_alp
[ 24%] Built target duckdb_storage_compression_roaring
[ 24%] Built target duckdb_storage_compression_dictionary
[ 24%] Built target duckdb_storage_compression_dict_fsst
[ 24%] Built target duckdb_storage_metadata
[ 24%] Built target duckdb_storage_serialization
[ 24%] Built target duckdb_storage_statistics
[ 25%] Built target duckdb_storage_table
[ 25%] Built target duckdb_transaction
[ 27%] Built target duckdb_verification
[ 28%] Built target duckdb_core_functions_algebraic
[ 28%] Built target duckdb_core_functions_distributive
[ 29%] Built target duckdb_core_functions_holistic
[ 30%] Built target duckdb_core_functions_nested
[ 30%] Built target duckdb_core_functions_regression
[ 30%] Built target duckdb_core_functions_array
[ 30%] Built target duckdb_core_functions_bit
[ 30%] Built target duckdb_core_functions_blob
[ 31%] Built target duckdb_core_functions_date
[ 31%] Built target duckdb_core_functions_debug
[ 31%] Built target duckdb_core_functions_enum
[ 31%] Built target duckdb_core_functions_generic
[ 31%] Built target duckdb_core_functions_list
[ 31%] Built target duckdb_core_functions_map
[ 31%] Built target duckdb_core_functions_math
[ 31%] Built target duckdb_core_functions_operators
[ 31%] Built target duckdb_core_functions_random
[ 31%] Built target duckdb_core_functions_string
[ 32%] Built target duckdb_core_functions_struct
[ 32%] Built target duckdb_core_functions_union
[ 33%] Built target duckdb_parquet_decoders
[ 33%] Built target duckdb_parquet_readers
[ 33%] Built target duckdb_parquet_reader_variant
[ 33%] Built target duckdb_parquet_writers
[ 33%] Built target duckdb_optimizer
[ 34%] Built target duckdb_optimizer_compressed_materialization
[ 34%] Built target duckdb_optimizer_join_order
[ 34%] Built target duckdb_optimizer_matcher
[ 34%] Built target duckdb_optimizer_pullup
[ 34%] Built target duckdb_optimizer_pushdown
[ 35%] Built target duckdb_optimizer_rules
[ 35%] Built target duckdb_optimizer_statistics_expr
[ 35%] Built target duckdb_optimizer_statistics_op
[ 36%] Built target duckdb_planner
[ 36%] Built target duckdb_planner_expression
[ 37%] Built target duckdb_bind_expression
[ 37%] Built target duckdb_bind_query_node
[ 37%] Built target duckdb_bind_statement
[ 37%] Built target duckdb_bind_tableref
[ 37%] Built target duckdb_expression_binders
[ 37%] Built target duckdb_planner_filter
[ 37%] Built target duckdb_planner_operator
[ 38%] Built target duckdb_planner_subquery
[ 39%] Built target duckdb_parser
[ 39%] Built target duckdb_constraints
[ 39%] Built target duckdb_expression
[ 39%] Built target duckdb_parsed_data
[ 40%] Built target duckdb_query_node
[ 41%] Built target duckdb_statement
[ 41%] Built target duckdb_parser_tableref
[ 41%] Built target duckdb_transformer_constraint
[ 41%] Built target duckdb_transformer_expression
[ 42%] Built target duckdb_transformer_helpers
[ 42%] Built target duckdb_transformer_statement
[ 42%] Built target duckdb_transformer_tableref
[ 42%] Built target duckdb_function
[ 42%] Built target duckdb_func_aggr
[ 42%] Built target duckdb_aggr_distr
[ 42%] Built target duckdb_func_cast
[ 42%] Built target duckdb_union_cast
[ 42%] Built target duckdb_variant_cast
[ 43%] Built target duckdb_func_pragma
[ 43%] Built target duckdb_func_scalar
[ 43%] Built target duckdb_func_compressed_materialization
[ 44%] Built target duckdb_func_date
[ 44%] Built target duckdb_func_generic_main
[ 44%] Built target duckdb_func_list_nested
[ 44%] Built target duckdb_function_map
[ 44%] Built target duckdb_func_ops_main
[ 44%] Built target duckdb_func_seq
[ 44%] Built target duckdb_func_string_main
[ 44%] Built target duckdb_func_string_regexp
[ 45%] Built target duckdb_func_struct_main
[ 46%] Built target duckdb_func_variant_main
[ 46%] Built target duckdb_func_system
[ 46%] Built target duckdb_func_table
[ 46%] Built target duckdb_table_func_system
[ 46%] Built target duckdb_func_table_version
[ 46%] Built target duckdb_arrow_conversion
[ 46%] Built target duckdb_func_window
[ 46%] Built target duckdb_catalog
[ 46%] Built target duckdb_catalog_entries
[ 46%] Built target duckdb_catalog_entries_dependency
[ 47%] Built target duckdb_catalog_default_entries
[ 47%] Built target duckdb_common
[ 48%] Built target duckdb_adbc
[ 48%] Built target duckdb_adbc_nanoarrow
[ 49%] Built target duckdb_common_arrow
[ 49%] Built target duckdb_common_arrow_appender
[ 49%] Built target duckdb_common_crypto
[ 49%] Built target duckdb_common_enums
[ 49%] Built target duckdb_common_exception
[ 49%] Built target duckdb_common_multi_file
[ 49%] Built target duckdb_common_operators
[ 49%] Built target duckdb_progress_bar
[ 50%] Built target duckdb_common_tree_renderer
[ 50%] Built target duckdb_row_operations
[ 50%] Built target duckdb_common_serializer
[ 50%] Built target duckdb_sort
[ 55%] Built target duckdb_re2
[ 55%] Built target duckdb_miniz
[ 56%] Built target duckdb_utf8proc
[ 57%] Built target duckdb_hyperloglog
[ 58%] Built target duckdb_skiplistlib
[ 59%] Built target duckdb_fastpforlib
[ 64%] Built target duckdb_mbedtls
[ 65%] Built target duckdb_fsst
[ 65%] Built target duckdb_yyjson
[ 77%] Built target parquet_extension
[ 77%] Built target core_functions_extension
[ 77%] Building CXX object extension/yardstick/CMakeFiles/yardstick_extension.dir/src/yardstick_extension.cpp.o
[ 78%] Building CXX object extension/yardstick/CMakeFiles/yardstick_extension.dir/src/yardstick_parser_ffi.cpp.o
[ 78%] Linking CXX static library libyardstick_extension.a
[ 78%] Built target yardstick_extension
[ 78%] Built target duckdb_static
[ 78%] Linking CXX shared library loadable_extension_optimizer_demo.duckdb_extension
[ 79%] Built target loadable_extension_optimizer_demo_loadable_extension
[ 79%] Building CXX object extension/yardstick/CMakeFiles/yardstick_loadable_extension.dir/src/yardstick_extension.cpp.o
[ 79%] Building CXX object extension/yardstick/CMakeFiles/yardstick_loadable_extension.dir/src/yardstick_parser_ffi.cpp.o
[ 80%] Linking CXX shared library yardstick.duckdb_extension
[ 80%] Built target yardstick_loadable_extension
[ 80%] Linking CXX shared library core_functions.duckdb_extension
[ 81%] Built target core_functions_loadable_extension
[ 82%] Linking CXX shared library parquet.duckdb_extension
[ 93%] Built target parquet_loadable_extension
[ 93%] Linking CXX shared library loadable_extension_demo.duckdb_extension
[ 93%] Built target loadable_extension_demo_loadable_extension
[ 94%] repository
[ 94%] Built target duckdb_local_extension_repo
[ 94%] Linking CXX shared library libduckdb.dylib
[ 94%] Built target duckdb
[ 95%] Built target sqlite3_api_wrapper_sqlite3
[ 96%] Built target sqlite3_udf_api
[ 96%] Built target sqlite3_api_wrapper_static
[ 96%] Linking CXX shared library libsqlite3_api_wrapper.dylib
[ 96%] Built target sqlite3_api_wrapper
[ 96%] Linking CXX executable test_sqlite3_api_wrapper
[ 97%] Built target test_sqlite3_api_wrapper
[ 98%] Built target duckdb_linenoise
[ 99%] Linking CXX executable ../../duckdb
[ 99%] Built target shell
[100%] Built target test_helpers
[100%] Built target test_sqlite
[100%] Linking CXX executable unittest
[100%] Built target unittest
[100%] Built target imdb
./build/release/"/test/unittest" ""test/""
Filters: test/

[0/1] (0%): test/sql/measures.test
[1/1] (100%): test/sql/measures.test

All tests passed (761 assertions in 1 test case)

./build/release/test/unittest

[0/2] (0%): test/sql/measures.test
[1/2] (50%): /Users/nico/conductor/workspaces/yardstick/yeosu-v1/test/sql/mea...
[2/2] (100%): /Users/nico/conductor/workspaces/yardstick/yeosu-v1/test/sql/me...

All tests passed (1522 assertions in 2 test cases).

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4ab83dbd26

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +869 to +873
fn matches_keyword_at(upper: &str, idx: usize, keyword: &str) -> bool {
if idx + keyword.len() > upper.len() {
return false;
}
if &upper[idx..idx + keyword.len()] != keyword {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid UTF-8 slicing panic in GROUP BY scan

has_group_by_anywhere advances idx byte-by-byte over the uppercased SQL and calls matches_keyword_at, which slices the upper string using idx (&upper[idx..idx + keyword.len()]). If the SQL contains any non-ASCII characters (e.g., Unicode identifiers or string literals) before the GROUP BY, idx can land on a non‑UTF‑8 boundary and this slice will panic. This is a regression from the previous contains("GROUP BY") check and can crash semantic expansion for otherwise valid queries that include Unicode content.

Useful? React with 👍 / 👎.

@nicosuave nicosuave merged commit f7d1760 into main Jan 9, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants