Revisit support for globalization invariant mode #3742

@edwardneal

Description

Is your feature request related to a problem? Please describe.

Support for globalization invariant mode was previously rejected in #220 and #1249, on the grounds that the ICU libraries are necessary in order to map LCIDs to codepages (and thus to encodings). Following this, #2043 was raised, requesting an error diagnostic when globalization invariant mode is enabled. After looking at other TDS client libraries, I'd like to look again at support for this.

Other TDS client libraries don't use the OS-specific localization data:

Of these, the JDBC driver has the most complete set of collation-to-codepage mappings. The other drivers listed have a varying subset of these (or have lifted their mappings from the JDBC driver.)

The Microsoft Drivers for PHP for SQL Server use a hybrid approach which normalizes some encodings to/from UTF-16 manually and uses iconv as a fallback.

Finally, both SqlClient and the Linux build of the Microsoft ODBC Driver for SQL Server use the OS-provided localization data. SqlClient maps collations to code pages in TdsParser.GetCodePage using CultureInfo.GetCultureInfo.

Describe the solution you'd like

I'd like SqlClient to support globalization invariant mode in netcore. To handle the core problem which makes it unsupported, SqlClient would stop using CultureInfo.GetCultureInfo and instead convert LCIDs to codepage IDs using the same mappings as the JDBC driver. All test projects would then also run in globalization invariant mode under netcore.
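To illustrate the shape of the idea, here is a minimal sketch of a static LCID-to-codepage lookup which needs no OS or ICU culture data. It's written in Python purely for brevity; it isn't SqlClient or JDBC driver code. The names `LCID_TO_CODE_PAGE`, `code_page_for_lcid` and `decode_varchar` are hypothetical, and the table is a tiny illustrative subset rather than the JDBC driver's full mapping.

```python
# Illustrative subset only. A real table would need an entry for every
# LCID the server can send; these pairs are the usual Windows ANSI
# code pages for each locale.
LCID_TO_CODE_PAGE = {
    0x0409: 1252,  # en-US -> Western European
    0x0411: 932,   # ja-JP -> Shift-JIS
    0x0419: 1251,  # ru-RU -> Cyrillic
    0x0408: 1253,  # el-GR -> Greek
    0x041F: 1254,  # tr-TR -> Turkish
}


def code_page_for_lcid(lcid: int) -> int:
    """Return the code page for an LCID using only the static table."""
    try:
        return LCID_TO_CODE_PAGE[lcid]
    except KeyError:
        # Fail loudly rather than guessing an encoding.
        raise LookupError(f"no code page mapping for LCID 0x{lcid:04X}") from None


def decode_varchar(raw: bytes, lcid: int) -> str:
    """Decode varchar bytes using the code page mapped from the LCID."""
    return raw.decode(f"cp{code_page_for_lcid(lcid)}")
```

Because the lookup is a plain dictionary, it behaves identically whether or not culture data is available on the client, which is the property globalization invariant mode needs.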

Describe alternatives you've considered

We could continue to use CultureInfo.GetCultureInfo, but this would mean that globalization invariant mode would remain unusable.

Alternatively, we could use another library's mappings. The JDBC driver has the largest set of mappings out of the listed ones, but if a more complete set exists then we could use that instead.

Additional context

The JDBC driver's mappings seem to be complete. I've got a local branch which uses these mappings, runs in globalization invariant mode, and successfully reads and round-trips a sample string collated in every collation available to SQL 2022 and to SQL Azure instances.

I think GetCodePage's LCID mapping is the only critical point which prevents SqlClient from being used in globalization invariant mode. The other problems this causes are:

  • Passing a string value in an IEnumerable<SqlDataRecord> to a stored procedure implicitly uses the current client-side culture's LCID, which will no longer match. The LCIDs will need to be explicitly specified, or will need to default to the server's LCID instead.
  • Converting field names to ordinals in a SqlDataReader is case-insensitive. If a case-sensitive match can't be found, it tries to match using the case-insensitive comparison rules from the database's collation, and this won't be possible. It'll need to use the comparison rules from the invariant culture instead.
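
The second point can be sketched as a two-pass lookup. This is a rough Python illustration, not SqlClient's implementation: `get_ordinal` is a hypothetical name, and `str.casefold()` is only a stand-in for a culture-independent, case-insensitive comparison. It isn't identical to .NET's invariant culture rules.

```python
def get_ordinal(field_names: list[str], name: str) -> int:
    """Resolve a field name to its ordinal without any culture data."""
    # Pass 1: exact, case-sensitive match wins.
    for ordinal, field in enumerate(field_names):
        if field == name:
            return ordinal

    # Pass 2: culture-independent case-insensitive match.
    folded = name.casefold()
    for ordinal, field in enumerate(field_names):
        if field.casefold() == folded:
            return ordinal

    raise LookupError(f"no field named {name!r}")
```

With fields `["Id", "id", "Name"]`, looking up `"id"` resolves to ordinal 1 through the exact-match pass, while `"NAME"` only resolves through the case-insensitive pass.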

I think the first problem is the most important one - it's likely to prevent the use of user-defined table types which contain varchar fields if the client's current culture isn't compatible with the column's collation.
