Skip to content

Export/BACPAC fails when sql_variant contains nvarchar with UTF-8 collation (Latin1_General_100_CI_AS_SC_UTF8) #765

@htonheim

Description

@htonheim

DacFx export fails when sql_variant contains a string value with UTF-8 collation

Suggested title

Export/BACPAC fails when sql_variant contains nvarchar with UTF-8 collation (Latin1_General_100_CI_AS_SC_UTF8)

Summary

SqlPackage /Action:Export fails during the data phase when a sql_variant column contains a string value whose collation is Latin1_General_100_CI_AS_SC_UTF8.

/Action:Extract succeeds against the same database, so this appears to be a data-export serialization issue rather than a schema-model issue.

The same failure was also observed via the built-in Azure SQL Database export flow, which surfaces only the generic wrapper error:

The given key was not present in the dictionary. Could not export schema and data from database.

This appears distinct from microsoft/DacFx#370, which reports a Synapse extract failure with SQL73068: Unsupported collation 'Latin1_General_100_CI_AS_SC_UTF8'. In this case:

  • source platform is Azure SQL Database, not Synapse dedicated SQL pool
  • /Action:Extract succeeds
  • /Action:Export fails in the data phase inside SqlVariantSerializer

Minimal repro

IF OBJECT_ID(N'dbo.DacFxSqlVariantUtf8Repro', N'U') IS NOT NULL
BEGIN
    DROP TABLE dbo.DacFxSqlVariantUtf8Repro;
END
GO

CREATE TABLE dbo.DacFxSqlVariantUtf8Repro (
    Id int NOT NULL CONSTRAINT PK_DacFxSqlVariantUtf8Repro PRIMARY KEY,
    VariantValue sql_variant NULL
);
GO

INSERT INTO dbo.DacFxSqlVariantUtf8Repro (Id, VariantValue)
VALUES
    (1, CONVERT(sql_variant, 1)),
    (2, CONVERT(sql_variant, CAST(1 AS bit))),
    (3, CONVERT(sql_variant, CONVERT(nvarchar(100), N'repro') COLLATE Latin1_General_100_CI_AS_SC_UTF8));
GO

SELECT
    Id,
    CONVERT(varchar(128), SQL_VARIANT_PROPERTY(VariantValue, 'BaseType')) AS base_type,
    COALESCE(CONVERT(varchar(128), SQL_VARIANT_PROPERTY(VariantValue, 'Collation')), '<null>') AS collation_name
FROM dbo.DacFxSqlVariantUtf8Repro
ORDER BY Id;

Repro steps

  1. Create or pick an empty Azure SQL Database.
  2. Run the SQL in the Minimal repro section above.
  3. Verify the inserted rows:
SELECT
    Id,
    CONVERT(varchar(128), SQL_VARIANT_PROPERTY(VariantValue, 'BaseType')) AS base_type,
    COALESCE(CONVERT(varchar(128), SQL_VARIANT_PROPERTY(VariantValue, 'Collation')), '<null>') AS collation_name
FROM dbo.DacFxSqlVariantUtf8Repro
ORDER BY Id;

Expected verification result:

1 | int      | <null>
2 | bit      | <null>
3 | nvarchar | Latin1_General_100_CI_AS_SC_UTF8
  1. Run schema extract:
sqlpackage /Action:Extract \
  /SourceConnectionString:"Server=tcp:<server>.database.windows.net,1433;Initial Catalog:<database>;User ID=<user>;Password=<password>;Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;" \
  /TargetFile:"<database>.dacpac"
  1. Run BACPAC export:
sqlpackage /Action:Export \
  /SourceConnectionString:"Server=tcp:<server>.database.windows.net,1433;Initial Catalog:<database>;User ID=<user>;Password=<password>;Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;" \
  /TargetFile:"<database>.bacpac" \
  /Diagnostics:True \
  /DiagnosticsLevel:Verbose

Expected behavior

/Action:Export should succeed, just as /Action:Extract does.

If the collation is unsupported, DacFx should emit a targeted validation error instead of failing with an internal dictionary lookup exception.

Actual behavior

/Action:Export fails during data export with:

Could not export schema and data from database.
One or more errors occurred. (The given key 'Latin1_General_100_CI_AS_SC_UTF8' was not present in the dictionary.)

Relevant stack trace:

System.Collections.Generic.KeyNotFoundException: The given key 'Latin1_General_100_CI_AS_SC_UTF8' was not present in the dictionary.
   at Microsoft.Data.Tools.Schema.Sql.SqlClient.Bcp.CollationInfo.GetInfo(String collation)
   at Microsoft.Data.Tools.Schema.Sql.SqlClient.Bcp.SqlVariantSerializer.GetBytesNoLength(SqlDataReader source)
   at Microsoft.Data.Tools.Schema.Sql.SqlClient.Bcp.SqlVariantSerializer.GetBytes(SqlDataReader source)

This strongly suggests the failure is in DacFx's sql_variant string serialization path for UTF-8 collations.

Environment

  • Azure SQL Database
  • Observed server version: 12.0.2000.8
  • Observed service objective: ElasticPool
  • SqlPackage version: 170.3.93.6
  • Observed on: macOS 26.3.1, .NET 10.0.3
  • Azure SQL built-in export shows the same wrapped failure

Additional observations

  • Schema-only extract succeeds.
  • The failure occurs in the export data phase.
  • The original production database that exposed this had a sql_variant column with:
    • int
    • bit
    • nvarchar values carrying collation Latin1_General_100_CI_AS_SC_UTF8
  • The minimal repro above reduces that shape to a single table and one UTF-8-collated string value inside sql_variant.

Impact

Any Azure SQL Database that stores UTF-8-collated string values inside sql_variant may be impossible to export to BACPAC using either:

  • Azure SQL built-in export
  • SqlPackage /Action:Export

Workaround

Current workaround is to avoid exporting those sql_variant string values as-is, for example by:

  • exporting from a sanitized copy of the database
  • migrating the problematic sql_variant data to a non-sql_variant type before export
  • excluding or rewriting the affected table/data before taking a BACPAC

Request

Please add support for UTF-8 collations in the sql_variant export path, or emit a deterministic validation error that clearly identifies the unsupported data shape.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions