In Cosmos, the same expression (syntax-wise) can return either a scalar or a structural type. For example, x.Foo can return a scalar (1) or a structural type (a JSON object representing an entity type). This differs from relational, where a given expression type generally returns either a scalar (e.g. ColumnExpression) or a structural type (e.g. TableExpression, which represents a set of structural types). JSON column access is probably an exception to this in relational.
Our general SQL expression tree design mirrors the relational split: SqlExpression represents scalars (and has a TypeMapping), while non-SqlExpressions represent structural types. In Cosmos (and in some places in relational too) that split doesn't hold, since the same syntax, such as a JSON property access (x.Foo), can return either a scalar or a structural type.
As a result, after #33998 the Cosmos query pipeline has an explosion of expression types which represent the same syntax but return different things. For example, ScalarAccessExpression represents x.Foo where Foo is a scalar, ObjectAccessExpression represents the same where Foo is a structural type, and ObjectArrayAccessExpression represents the same where Foo is an array of structural types.
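To make the duplication concrete, here is a minimal C# sketch. The types are simplified, hypothetical stand-ins: ScalarAccessNode, ObjectAccessNode, ObjectArrayAccessNode and SqlPrinter are invented for illustration and don't reflect the actual EF Core classes or their constructors. The point is only that three node types print identical syntax, yet every consumer has to handle each of them.

```csharp
using System;

// Three nodes for the same syntax (x.Foo); they differ only in what Foo is in the model.
Node[] nodes =
{
    new ScalarAccessNode("x", "Foo"),
    new ObjectAccessNode("x", "Foo"),
    new ObjectArrayAccessNode("x", "Foo"),
};

foreach (var node in nodes)
{
    // Each node type prints the same text.
    Console.WriteLine($"{node.GetType().Name,-22} -> {SqlPrinter.Print(node)}");
}

// Hypothetical stand-ins for the three access expressions (not the real EF Core types).
abstract record Node(string Source, string Property);
sealed record ScalarAccessNode(string Source, string Property) : Node(Source, Property);
sealed record ObjectAccessNode(string Source, string Property) : Node(Source, Property);
sealed record ObjectArrayAccessNode(string Source, string Property) : Node(Source, Property);

static class SqlPrinter
{
    // Every visitor-like component needs one arm per node type,
    // even though all arms produce the same output.
    public static string Print(Node node) => node switch
    {
        ScalarAccessNode n => $"{n.Source}.{n.Property}",
        ObjectAccessNode n => $"{n.Source}.{n.Property}",
        ObjectArrayAccessNode n => $"{n.Source}.{n.Property}",
        _ => throw new NotSupportedException(node.GetType().Name),
    };
}
```

In this sketch the only thing distinguishing the three nodes is their CLR type, which is exactly the information that ends up being duplicated across the tree.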
This is a bad state of affairs. I considered unifying the types, e.g. by giving structural types a dummy type mapping (which would allow SqlExpression to represent structural types as well), but the expression split extends into shaper generation too, so for now I continued along the current path of duplicating expression types. A more modern shaper generation architecture (and, I think, one more aligned with relational) wouldn't require this separation at the expression level, but would instead recognize structural types via StructuralTypeShaperExpression; I went a bit in this direction, but more work is needed.
Once our shaper no longer looks at the server/syntax expressions to determine structural type information (but uses StructuralTypeShaperExpression instead), we should be able to remove all structural type/navigation information from those syntax expressions, and unify them. This would make a much clearer separation between server (query) and client (shaper).
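As a rough C# sketch of where this could end up, assuming a design that has not been implemented: a single syntax node represents x.Foo, and the scalar/structural distinction lives only on the shaper side. PropertyAccess, Shaper, ScalarShaper and StructuralShaper are hypothetical names for illustration; StructuralShaper only loosely stands in for StructuralTypeShaperExpression and does not mirror its actual API.

```csharp
using System;

// One syntax node for x.Foo, shared by both projections.
var access = new PropertyAccess("x", "Foo");

// The scalar/structural distinction is carried by the shaper (client side) only.
Shaper[] shapers =
{
    new ScalarShaper(access, typeof(int)),
    new StructuralShaper(access, "Foo"),
};

foreach (var shaper in shapers)
{
    Console.WriteLine($"{shaper.Value.ToSql()} shaped as {shaper.Describe()}");
}

// Server side: pure syntax, with no structural type or navigation information.
sealed record PropertyAccess(string Source, string Property)
{
    public string ToSql() => $"{Source}.{Property}";
}

// Client side: hypothetical shaper nodes that wrap the syntax node.
abstract record Shaper(PropertyAccess Value)
{
    public abstract string Describe();
}

sealed record ScalarShaper(PropertyAccess Value, Type ClrType) : Shaper(Value)
{
    public override string Describe() => $"scalar {ClrType.Name}";
}

sealed record StructuralShaper(PropertyAccess Value, string StructuralTypeName) : Shaper(Value)
{
    public override string Describe() => $"structural type {StructuralTypeName}";
}
```

With this shape, anything that only cares about the server query (printing, translation) handles a single node type, and only the shaper needs to know what Foo is.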