sort: numeric sort (-n) does not recognize thousand separators
Component
sort
Description
GNU sort uses the locale's thousands separator when parsing numbers in numeric sort mode (-n). It retrieves the separator from localeconv() and passes it to the number comparison function.
In GNU sort, the separator is read from the locale:
struct lconv const *locale = localeconv ();
...
thousands_sep = locale->thousands_sep[0];
This separator is then used in numeric comparison in numcompare.
return strnumcmp (a, b, decimal_point, thousands_sep);
However, in uutils sort, the NumInfoParseSettings struct has a thousands_separator field, but it defaults to None.
impl Default for NumInfoParseSettings {
    fn default() -> Self {
        Self {
            accept_si_units: false,
            thousands_separator: None,
            decimal_pt: Some(b'.'),
        }
    }
}
When parsing numbers for numeric sort at lines 898-903, the default settings are used without setting the thousands separator from the locale.
let (info, num_range) = NumInfo::parse(
    range_str,
    &NumInfoParseSettings {
        accept_si_units: self.settings.mode == SortMode::HumanNumeric,
        ..Default::default()
    },
);
As a result, "1,000" is parsed as "1" because the comma terminates number parsing.
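The effect can be illustrated with a minimal sketch. `parse_leading_digits` is a hypothetical stand-in written for this issue, not the uutils `NumInfo::parse` implementation; it only assumes that parsing stops at the first byte that is neither a digit nor the configured separator.

```rust
/// Hypothetical stand-in for a numeric-prefix parser (not the uutils code).
/// Reads leading ASCII digits, skips `thousands_separator` when one is set,
/// and stops at the first other byte.
fn parse_leading_digits(input: &[u8], thousands_separator: Option<u8>) -> u64 {
    let mut value: u64 = 0;
    for &byte in input {
        if byte.is_ascii_digit() {
            // Saturating arithmetic keeps very long digit runs from overflowing.
            value = value
                .saturating_mul(10)
                .saturating_add(u64::from(byte - b'0'));
        } else if thousands_separator != Some(byte) {
            // Any byte that is not the configured separator ends the number.
            break;
        }
        // Otherwise the byte is the configured separator and is skipped.
    }
    value
}
```

With no separator configured, `parse_leading_digits(b"1,000", None)` yields 1, which is the behaviour reported here. With `Some(b',')` the same input yields 1000.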
Test / Reproduction Steps
# GNU (in a locale whose thousands separator is ',', e.g. en_US.UTF-8)
$ printf '1,000\n500\n2,000\n100\n' | sort -n
100
500
1,000
2,000
# uutils
$ printf '1,000\n500\n2,000\n100\n' | coreutils sort -n
1,000
2,000
100
500
Impact
Numbers with thousands separators are parsed incorrectly, which causes a wrong sort order.
Recommendations
Retrieve the locale's thousands separator and pass it to NumInfoParseSettings when parsing numbers for numeric sort.
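One possible shape for this change is sketched below. The struct is a local mirror of the one quoted above so that the example compiles on its own. `single_byte_separator` and `numeric_parse_settings` are hypothetical helpers, not existing uutils functions. The sketch assumes the locale's separator is already available as a byte string (how it is obtained is not shown), and that an empty or multi-byte separator should be treated as "no separator".

```rust
/// Local mirror of the struct quoted in this issue, for a self-contained sketch.
#[derive(Debug, PartialEq)]
struct NumInfoParseSettings {
    accept_si_units: bool,
    thousands_separator: Option<u8>,
    decimal_pt: Option<u8>,
}

impl Default for NumInfoParseSettings {
    fn default() -> Self {
        Self {
            accept_si_units: false,
            thousands_separator: None,
            decimal_pt: Some(b'.'),
        }
    }
}

/// Hypothetical helper: reduce the locale's separator string to a single byte.
/// Assumption: empty or multi-byte separators are mapped to `None`.
fn single_byte_separator(locale_sep: &[u8]) -> Option<u8> {
    match locale_sep {
        [byte] => Some(*byte),
        _ => None,
    }
}

/// Hypothetical helper: build parse settings that carry the locale's separator.
fn numeric_parse_settings(human_numeric: bool, locale_sep: &[u8]) -> NumInfoParseSettings {
    NumInfoParseSettings {
        accept_si_units: human_numeric,
        thousands_separator: single_byte_separator(locale_sep),
        ..Default::default()
    }
}
```

Building the settings in one place would let the numeric and human-numeric modes share the same locale handling, while the default of `None` stays unchanged for callers that do not want locale-aware parsing.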