Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 32 additions & 19 deletions datafusion/expr/src/built_in_function.rs
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@
use crate::Volatility;
use datafusion_common::{DataFusionError, Result};
use lazy_static::lazy_static;
use std::collections::HashMap;
use std::fmt;
use std::str::FromStr;

Expand Down Expand Up @@ -205,9 +206,12 @@ pub enum BuiltinScalarFunction {

lazy_static! {
/// Mapping between SQL function names to `BuiltinScalarFunction` types.
/// Note that multiple SQL function names can represent the same `BuiltinScalarFunction`. These are treated as aliases.
/// In case of such aliases, the first SQL function name in the vector is used when displaying the function.
static ref NAME_TO_FUNCTION: Vec<(&'static str, BuiltinScalarFunction)> = vec![
///
/// Note that multiple SQL function names can represent the same
/// `BuiltinScalarFunction`. These are treated as aliases and the
/// first SQL function name in the vector is used when displaying
/// the function.
static ref ALL_FUNCTIONS: Vec<(&'static str, BuiltinScalarFunction)> = vec![
// math functions
("abs", BuiltinScalarFunction::Abs),
("acos", BuiltinScalarFunction::Acos),
Expand Down Expand Up @@ -317,6 +321,23 @@ lazy_static! {
];
}

lazy_static! {
/// Maps the sql function name to `BuiltinScalarFunction`
static ref NAME_TO_FUNCTION: HashMap<&'static str, BuiltinScalarFunction> = ALL_FUNCTIONS
.iter()
.cloned()
.collect();


/// Maps `BuiltinScalarFunction` --> canonical sql function
static ref FUNCTION_TO_NAME: HashMap<BuiltinScalarFunction, &'static str> = ALL_FUNCTIONS
.iter()
.map(|(name, func)| (func.clone(), *name))
// reverse so the first entry in the array is the one that is in the map
.rev()
.collect();
}

impl BuiltinScalarFunction {
/// an allowlist of functions to take zero arguments, so that they will get special treatment
/// while executing.
Expand Down Expand Up @@ -431,29 +452,21 @@ impl BuiltinScalarFunction {

impl fmt::Display for BuiltinScalarFunction {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
for (func_name, func) in NAME_TO_FUNCTION.iter() {
if func == self {
return write!(f, "{}", func_name);
}
if let Some(func_name) = FUNCTION_TO_NAME.get(self) {
write!(f, "{}", func_name)
} else {
// Should not be reached
panic!("Internal error: {self:?} not in ALL_FUNCTIONS list");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything is great and needed, one nit, do we need to panic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This panic will occur if someone adds a new BuiltInFunction and doesn't add it to the built in list.

I can remove the panic but then if someone doesn't add the function to the new LIST I was worried the error would be masked (the wrong function name would be used or the lookup might fail). But maybe that is preferable 🤔

Copy link
Contributor

@comphead comphead May 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking wouldn't be that better to implement a function name

fn name(func: BuiltinScalarFunction) -> Result<String> {
 match func {
   BuiltinScalarFunction::Abs => "abs"
   ....
 }
}

This will ensure on compile time the newly added function gets its string alias

Then we can build the hashmaps the same way over iterating BuiltinScalarFunction enum

    for func in BuiltinScalarFunction::iter() {
        let name = name(func);
        hashmapLookup.insert(name, func);
        hashmapRevLookup.insert(func, name);
    }

I can play with this and create another PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can play with this and create another PR

Thank you 🙏

Note one gotcha you might hit is that the mapping is one-> many

So you might need something like

/// returns all the names that can be used for this function
fn names(func: BuiltinScalarFunction) -> &[&'str]] {
 match func {
   BuiltinScalarFunction::Abs => "abs"
   ....
 }
}

}

// Should not be reached
write!(f, "{}", format!("{self:?}").to_lowercase())
}
}

impl FromStr for BuiltinScalarFunction {
type Err = DataFusionError;
fn from_str(name: &str) -> Result<BuiltinScalarFunction> {
for (func_name, func) in NAME_TO_FUNCTION.iter() {
if name == *func_name {
return Ok(func.clone());
}
}

Err(DataFusionError::Plan(format!(
"There is no built-in function named {name}"
)))
NAME_TO_FUNCTION.get(name).cloned().ok_or_else(|| {
DataFusionError::Plan(format!("There is no built-in function named {name}"))
})
}
}

Expand Down