Cosmos: rework expressions around scalar/structural types #33999

@roji

The same expression (syntax-wise) in Cosmos can return both scalars and structural types. For example, x.Foo can return a scalar (1) or a structural type (a JSON object representing an entity type). This is different from relational, where an expression type generally returns either a scalar (e.g. ColumnExpression) or a structural type (e.g. TableExpression, which represents a set of structural types). An exception to this in relational is probably JSON column access.

Our general SQL expression tree design mirrors this: SqlExpression represents scalars (it has a TypeMapping), while non-SqlExpression types represent structural types. In Cosmos (but also in some places in relational) things are different: the same expression can typically return both a scalar and a structural type. For example, a JSON property access (x.Foo) can return a scalar or a structural type.

As a result, after #33998 the Cosmos query pipeline has an explosion of expression types which represent the same syntax, but return different things. For example, ScalarAccessExpression represents x.Foo where Foo is a scalar, ObjectAccessExpression represents the same where Foo is a structural type, and ObjectArrayAccessExpression represents the same where Foo is an array of structural types.
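To make the duplication concrete, here is a minimal sketch in Python (the real EF Core code is C#). The three class names are the ones mentioned above, but their fields and the `to_sql` method are hypothetical simplifications for illustration, not the actual EF Core API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalarAccessExpression:
    """x.Foo where Foo is a scalar (hypothetical simplification)."""
    container: str
    property_name: str

    def to_sql(self) -> str:
        return f"{self.container}.{self.property_name}"


@dataclass(frozen=True)
class ObjectAccessExpression:
    """x.Foo where Foo is a structural type (hypothetical simplification)."""
    container: str
    property_name: str
    structural_type: str  # assumed field: name of the target structural type

    def to_sql(self) -> str:
        return f"{self.container}.{self.property_name}"


@dataclass(frozen=True)
class ObjectArrayAccessExpression:
    """x.Foo where Foo is an array of structural types (hypothetical simplification)."""
    container: str
    property_name: str
    structural_type: str  # assumed field: element structural type

    def to_sql(self) -> str:
        return f"{self.container}.{self.property_name}"
```

In this sketch all three nodes render identical query text; they differ only in the client-side information they carry, which is the duplication described above.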

This is a bad state of affairs; I considered unifying by e.g. having a dummy type mapping for structural types (allowing SqlExpression to represent structural types as well), but the expression split goes into shaper generation as well. So for now I continued along the current path of duplicating expression types. A more modern shaper generation architecture (and, I think, one more aligned with relational) wouldn't require this separation at the expression level, and would instead recognize structural types via StructuralTypeShaperExpression; I went a bit in this direction but more work is needed.

Once our shaper no longer looks at the server/syntax expressions to determine structural type information (but uses StructuralTypeShaperExpression instead), we should be able to remove all structural type/navigation information from those syntax expressions, and unify them. This would make a much clearer separation between server (query) and client (shaper).
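A rough sketch of that end state, again in Python rather than C#: a single syntax node carries no structural type information, and a shaper node wraps it to record how the client should materialize the result. `PropertyAccessExpression` is a hypothetical name for the unified node, and the fields shown on `StructuralTypeShaperExpression` are assumptions for illustration, not the real EF Core members.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyAccessExpression:
    """Server side (hypothetical unified node): pure syntax for x.Foo."""
    container: str
    property_name: str

    def to_sql(self) -> str:
        return f"{self.container}.{self.property_name}"


@dataclass(frozen=True)
class StructuralTypeShaperExpression:
    """Client side: records that a value materializes as a structural type.

    The fields below are assumed for this sketch.
    """
    value: PropertyAccessExpression
    structural_type: str
    is_collection: bool = False


# The same syntax node serves a scalar projection and a structural one;
# only the shaper knows that x.Foo is an Order (or an array of Orders).
access = PropertyAccessExpression("x", "Foo")
single = StructuralTypeShaperExpression(access, "Order")
many = StructuralTypeShaperExpression(access, "Order", is_collection=True)
```

With this split, anything that only translates or prints the query can ignore the shaper entirely, and the shaper never has to inspect which kind of access expression it was given.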
