Skip to content

Minor: rename dictionary_coercion to dictionary_comparison_coercion, add comments#12102

Merged
alamb merged 1 commit intoapache:mainfrom
alamb:alamb/rename-coercsion
Aug 22, 2024
Merged

Minor: rename dictionary_coercion to dictionary_comparison_coercion, add comments#12102
alamb merged 1 commit intoapache:mainfrom
alamb:alamb/rename-coercsion

Conversation

@alamb
Copy link
Copy Markdown
Contributor

@alamb alamb commented Aug 21, 2024

Which issue does this PR close?

Closes #.

Rationale for this change

dictionary_coercion implies (to me) that this function applies to all dictionary coercions rather than just comparison coercion.

What changes are included in this PR?

Changes

  1. rename dictionary_coercion to dictionary_comparison_coercion
  2. Add some more comments to clarify the purpose of the coercion

Are these changes tested?

Are there any user-facing changes?

/// Not all operators support dictionaries, if `preserve_dictionaries` is true
/// dictionaries will be preserved if possible
fn dictionary_coercion(
fn dictionary_comparison_coercion(
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the only code change (and it is not a public function and thus this is not an API change)

@alamb alamb marked this pull request as ready for review August 21, 2024 18:56
fn dictionary_comparison_coercion(
lhs_type: &DataType,
rhs_type: &DataType,
preserve_dictionaries: bool,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ideally dictionary_comparison_coercion should not preserve the dictionary (always false), only value type matters. But since it is used in comparison_coercion, the name looks good to me

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind, most single argument functions / operations / coercions would preserve dictionary encoding if possible as that would keep the data encoded in a more efficent form (dictionary) longer

For example substr(dict_column, 3, 10) would ideally produce a Dictionary as output (after applying the function to the values)

However, most of our functions don't know how to work with Dictionary arrays, so this would be a larger change.

@alamb alamb merged commit 902f1c6 into apache:main Aug 22, 2024
@alamb
Copy link
Copy Markdown
Contributor Author

alamb commented Aug 22, 2024

Thank you for the reviews @jayzhan211 and @andygrove

@alamb alamb deleted the alamb/rename-coercsion branch August 22, 2024 10:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants