Skip to content

Open-API: Refactor TableRequirements#7710

Merged
Fokko merged 9 commits intoapache:mainfrom
Fokko:fd-fix-open-api-requirements
Oct 30, 2023
Merged

Open-API: Refactor TableRequirements#7710
Fokko merged 9 commits intoapache:mainfrom
Fokko:fd-fix-open-api-requirements

Conversation

@Fokko
Copy link
Copy Markdown
Contributor

@Fokko Fokko commented May 26, 2023

Noticed this when working on #6323

I believe the requirements are not defined in the right way. After generating code I got:

class TableRequirement(BaseModel):
    type: Literal[
        'assert-create',
        'assert-table-uuid',
        'assert-ref-snapshot-id',
        'assert-last-assigned-field-id',
        'assert-current-schema-id',
        'assert-last-assigned-partition-id',
        'assert-default-spec-id',
        'assert-default-sort-order-id',
    ]
    ref: Optional[str] = None
    uuid: Optional[str] = None
    snapshot_id: Optional[int] = Field(None, alias='snapshot-id')
    last_assigned_field_id: Optional[int] = Field(None, alias='last-assigned-field-id')
    current_schema_id: Optional[int] = Field(None, alias='current-schema-id')
    last_assigned_partition_id: Optional[int] = Field(
        None, alias='last-assigned-partition-id'
    )
    default_spec_id: Optional[int] = Field(None, alias='default-spec-id')
    default_sort_order_id: Optional[int] = Field(None, alias='default-sort-order-id')

Which encapsulates all the requirements. After the refactor in this PR, we'll end up with:

class TableRequirement(BaseModel):
    type: str


class AssertCreate(TableRequirement):
    """
    The table must not already exist; used for create transactions
    """

    type: Literal['assert-create']


class AssertTableUUID(TableRequirement):
    """
    The table UUID must match the requirement's `uuid`
    """

    type: Literal['assert-table-uuid']
    uuid: str


class AssertRefSnapshotId(TableRequirement):
    """
    The table branch or tag identified by the requirement's `ref` must reference the requirement's `snapshot-id`; if `snapshot-id` is `null` or missing, the ref must not already exist
    """

    type: Literal['assert-ref-snapshot-id']
    ref: str
    snapshot_id: int = Field(..., alias='snapshot-id')


class AssertLastAssignedFieldId(TableRequirement):
    """
    The table's last assigned column id must match the requirement's `last-assigned-field-id`
    """

    type: Literal['assert-last-assigned-field-id']
    last_assigned_field_id: int = Field(..., alias='last-assigned-field-id')


class AssertCurrentSchemaId(TableRequirement):
    """
    The table's current schema id must match the requirement's `current-schema-id`
    """

    type: Literal['assert-current-schema-id']
    current_schema_id: int = Field(..., alias='current-schema-id')


class AssertLastAssignedPartitionId(TableRequirement):
    """
    The table's last assigned partition id must match the requirement's `last-assigned-partition-id`
    """

    type: Literal['assert-last-assigned-partition-id']
    last_assigned_partition_id: int = Field(..., alias='last-assigned-partition-id')


class AssertDefaultSpecId(TableRequirement):
    """
    The table's default spec id must match the requirement's `default-spec-id`
    """

    type: Literal['assert-default-spec-id']
    default_spec_id: int = Field(..., alias='default-spec-id')


class AssertDefaultSortOrderId(TableRequirement):
    """
    The table's default sort order id must match the requirement's `default-sort-order-id`
    """

    type: Literal['assert-default-sort-order-id']
    default_sort_order_id: int = Field(..., alias='default-sort-order-id')

Which makes sense to me.

See below for the final version.

I believe the constraints are not defined in the right way.
After generating code I got:

```python
class TableRequirement(BaseModel):
    type: Literal[
        'assert-create',
        'assert-table-uuid',
        'assert-ref-snapshot-id',
        'assert-last-assigned-field-id',
        'assert-current-schema-id',
        'assert-last-assigned-partition-id',
        'assert-default-spec-id',
        'assert-default-sort-order-id',
    ]
    ref: Optional[str] = None
    uuid: Optional[str] = None
    snapshot_id: Optional[int] = Field(None, alias='snapshot-id')
    last_assigned_field_id: Optional[int] = Field(None, alias='last-assigned-field-id')
    current_schema_id: Optional[int] = Field(None, alias='current-schema-id')
    last_assigned_partition_id: Optional[int] = Field(
        None, alias='last-assigned-partition-id'
    )
    default_spec_id: Optional[int] = Field(None, alias='default-spec-id')
    default_sort_order_id: Optional[int] = Field(None, alias='default-sort-order-id')
```

Which encapulates all the requirements. After the refactor in this
PR, we'll end up with:

```python
class AssertCreate(BaseModel):
    type: Literal['assert-create']

class AssertTableUUID(BaseModel):
    type: Literal['assert-table-uuid']
    uuid: str

class AssertRefSnapshotId(BaseModel):
    type: Literal['assert-ref-snapshot-id']
    ref: str
    snapshot_id: int = Field(..., alias='snapshot-id')

class AssertLastAssignedFieldId(BaseModel):
    type: Literal['assert-last-assigned-field-id']
    last_assigned_partition_id: int = Field(..., alias='last-assigned-partition-id')

class AssertCurrentSchemaId(BaseModel):
    type: Literal['assert-current-schema-id']
    current_schema_id: int = Field(..., alias='current-schema-id')

class AssertLastAssignedPartitionId(BaseModel):
    type: Literal['assert-last-assigned-partition-id']
    last_assigned_partition_id: int = Field(..., alias='last-assigned-partition-id')

class AssertDefaultSpecId(BaseModel):
    type: Literal['assert-default-spec-id']
    default_spec_id: int = Field(..., alias='default-spec-id')

class AssertDefaultSortOrderId(BaseModel):
    type: Literal['assert-default-sort-order-id']
    default_sort_order_id: int = Field(..., alias='default-sort-order-id')

class TableRequirement(BaseModel):
    __root__: Union[
        AssertCreate,
        AssertTableUUID,
        AssertRefSnapshotId,
        AssertLastAssignedFieldId,
        AssertCurrentSchemaId,
        AssertLastAssignedPartitionId,
        AssertDefaultSpecId,
        AssertDefaultSortOrderId,
    ] = Field(..., discriminator='type')
```

Which makes sense to me.
Comment thread open-api/rest-catalog-open-api.yaml Outdated
Comment thread open-api/rest-catalog-open-api.yaml Outdated
Comment thread open-api/rest-catalog-open-api.yaml Outdated
Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>
@Fokko
Copy link
Copy Markdown
Contributor Author

Fokko commented Jun 4, 2023

Alright, now I have it right:

class TableRequirement(BaseModel):
    type: str


class AssertCreate(TableRequirement):
    """
    The table must not already exist; used for create transactions
    """

    type: Literal['assert-create']


class AssertTableUUID(TableRequirement):
    """
    The table UUID must match the requirement's `uuid`
    """

    type: Literal['assert-table-uuid']
    uuid: str


class AssertRefSnapshotId(TableRequirement):
    """
    The table branch or tag identified by the requirement's `ref` must reference the requirement's `snapshot-id`; if `snapshot-id` is `null` or missing, the ref must not already exist
    """

    type: Literal['assert-ref-snapshot-id']
    ref: str
    snapshot_id: int = Field(..., alias='snapshot-id')


class AssertLastAssignedFieldId(TableRequirement):
    """
    The table's last assigned column id must match the requirement's `last-assigned-field-id`
    """

    type: Literal['assert-last-assigned-field-id']
    last_assigned_field_id: int = Field(..., alias='last-assigned-field-id')


class AssertCurrentSchemaId(TableRequirement):
    """
    The table's current schema id must match the requirement's `current-schema-id`
    """

    type: Literal['assert-current-schema-id']
    current_schema_id: int = Field(..., alias='current-schema-id')


class AssertLastAssignedPartitionId(TableRequirement):
    """
    The table's last assigned partition id must match the requirement's `last-assigned-partition-id`
    """

    type: Literal['assert-last-assigned-partition-id']
    last_assigned_partition_id: int = Field(..., alias='last-assigned-partition-id')


class AssertDefaultSpecId(TableRequirement):
    """
    The table's default spec id must match the requirement's `default-spec-id`
    """

    type: Literal['assert-default-spec-id']
    default_spec_id: int = Field(..., alias='default-spec-id')


class AssertDefaultSortOrderId(TableRequirement):
    """
    The table's default sort order id must match the requirement's `default-sort-order-id`
    """

    type: Literal['assert-default-sort-order-id']
    default_sort_order_id: int = Field(..., alias='default-sort-order-id')

The inheritance tree now checks out.

Fokko added a commit to Fokko/iceberg-python that referenced this pull request Oct 25, 2023
It is not super clear from the current structure:

https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.py#L304-L329

This is because I had to deconstruct the object.
It would be great to get  apache/iceberg#7710 in
Fokko added a commit to Fokko/iceberg-python that referenced this pull request Oct 25, 2023
It is not super clear from the current structure:

https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.py#L304-L329

This is because I had to deconstruct the object.
It would be great to get  apache/iceberg#7710 in
Fokko added a commit to apache/iceberg-python that referenced this pull request Oct 25, 2023
It is not super clear from the current structure:

https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.py#L304-L329

This is because I had to deconstruct the object.
It would be great to get  apache/iceberg#7710 in
- $ref: '#/components/schemas/SetPropertiesUpdate'
- $ref: '#/components/schemas/RemovePropertiesUpdate'

TableRequirement:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Should TableRequirement be defined first?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I don't recall this isn't the case.

Copy link
Copy Markdown
Contributor

@rdblue rdblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. I like using the discriminator. We should probably refactor other areas to have that structure.

@rdblue
Copy link
Copy Markdown
Contributor

rdblue commented Oct 28, 2023

Sorry for the late review. This looks great. Thanks @Fokko!

I didn't merge this in case it conflicts with the other linked PR. Feel free to merge in the order that makes the most sense.

@Fokko
Copy link
Copy Markdown
Contributor Author

Fokko commented Oct 30, 2023

Thanks for the review @rdblue and @nastra for the review. I'll might revisitor other parts as well 👍

@Fokko Fokko merged commit a661663 into apache:main Oct 30, 2023
@Fokko Fokko deleted the fd-fix-open-api-requirements branch October 30, 2023 15:59
@romusz
Copy link
Copy Markdown

romusz commented Nov 14, 2023

Hi,

Have you tested this refactoring with code generated for Go?
What we get now is:

type TableRequirement struct {
	Type string `json:"type"`
}

and (for example)

type AssertTableUUID struct {
	TableRequirement
	Type string `json:"type"`
	Uuid string `json:"uuid"`
}

There are couple of problems with that:

  1. double storage of Type
  2. provided []TableRequirement{} slice and the fact that TableRequirement is a struct not an interface, it cannot be "type switched" into it's "subtypes" (e.g. AssertTableUUID).
  3. All previously-available, useful methods on TableRequirement have been nuked (e.g. GetUuidOk that allowed to extract TableUUID from assert-table-uuid requirement).

In summary, Go generated code does not produce equivalent to what it's shown for Python.
Please let me know if there is other way to traverse the requirements - so far it looks like a regression.

@romusz
Copy link
Copy Markdown

romusz commented Nov 14, 2023

@Fokko not sure if you'll see the above comment since the pr has been merged, thus an explicit call-out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants