Conversation

@MikeWallis42
Contributor

Fixes #3445

On top of this, I was considering whether it would be sensible to extend or copy the CircleCI migration test.

@MikeWallis42
Contributor Author

It looks like dbt-fabric published a new "patch" version that renamed a public function:
microsoft/dbt-fabric@v1.8.7...v1.8.8#diff-c1b0256b44c042e86c431c5151b6c0812c05e8c047c1d4f9b12b0b9538cbdb55R183
dbt-fabric is a requirement of dbt-sqlserver, which we install: https://github.com/dbt-msft/dbt-sqlserver/blob/master/setup.py#L69C1-L69C36

I'm attempting to exclude this patch version for the moment.

@MikeWallis42
Contributor Author

It looks like PySpark's pandas integration has an issue caused by the removal of distutils in Python 3.12. It will be fixed in Spark v4.

https://github.com/apache/spark/blob/v3.5.3/python/pyspark/sql/pandas/utils.py#L24
apache/spark#43192
https://issues.apache.org/jira/browse/SPARK-45390

Still working out how to handle this.
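
For context, a minimal sketch of how one might check whether a given interpreter is affected. The helper names are hypothetical and are not part of PySpark or SQLMesh; the only fact relied on is that distutils was removed from the standard library in Python 3.12.

```python
import importlib.util
import sys


def stdlib_ships_distutils(version_info=sys.version_info) -> bool:
    """Return True if this Python version still bundles distutils.

    distutils was removed from the standard library in Python 3.12.
    """
    return tuple(version_info[:2]) < (3, 12)


def distutils_importable() -> bool:
    """Return True if some distutils module can be located.

    This can be True even on 3.12+ when setuptools provides its own shim,
    so it is a weaker signal than the version check above.
    """
    return importlib.util.find_spec("distutils") is not None


if __name__ == "__main__":
    print("stdlib distutils expected:", stdlib_ships_distutils())
    print("distutils importable:", distutils_importable())
```

Checking the version rather than attempting the import keeps the result independent of whatever setuptools happens to be installed in the image.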

@MikeWallis42
Contributor Author

This started when the Bitnami image moved to Python 3.12:
bitnami/containers@ef4c745

It looks like we should be able to use this image until v4 is released: https://hub.docker.com/layers/bitnami/spark/3.5.3-debian-12-r0/images/sha256-0c045caae9ec3498fa2ddbe8d34ecd7d96178113502cac4feb2a58e837af2f6c?context=explore

@MikeWallis42
Contributor Author

I'm going to wait for feedback/suggestions on the PySpark failure.
Since it's still failing, the complication must be the image that CircleCI runs the tests with, and I don't want to simply wind that back to a version of Ubuntu that packages Python 3.11.
From a quick search, there appear to be a couple of ways to get an older version of Python: https://askubuntu.com/questions/1512005/python3-11-install-on-ubuntu-24-04

[
migrations,
major_minor(SQLGLOT_VERSION) != versions.minor_sqlglot_version,
major_minor(SQLMESH_VERSION) != versions.minor_sqlmesh_version,
Member

What is the motivation for this change? The sqlmesh minor version was omitted intentionally to avoid migration when it's not needed even if the minor version was bumped. Generally there are only 2 reasons to migrate: if there's a new migration script (the migrations collection) or if the sqlglot version changed.

Member

I read the issue carefully and now I understand the motivation better. However, migrate_rows controls not only whether the backups are made but also whether the migration is performed. I'd suggest decoupling the two and only doing backups on sqlmesh minor version changes, without affecting the migrate_rows flag.
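
A rough sketch of that decoupling, for illustration only. `StoredVersions`, `plan_migration`, and the `major_minor` body are hypothetical stand-ins written for this example, and the version strings in the usage below are made up; the actual SQLMesh state-sync code may be structured differently.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class StoredVersions:
    """Hypothetical record of the versions last written to state."""

    minor_sqlglot_version: str
    minor_sqlmesh_version: str


def major_minor(version: str) -> str:
    """Reduce 'X.Y.Z' to 'X.Y' (assumed behaviour of the real helper)."""
    return ".".join(version.split(".")[:2])


def plan_migration(
    stored: StoredVersions,
    migrations: Sequence[object],
    sqlglot_version: str,
    sqlmesh_version: str,
) -> Tuple[bool, bool]:
    """Return (migrate_rows, backup) as two independent decisions."""
    # Rows are migrated only for a new migration script or a sqlglot bump.
    migrate_rows = bool(migrations) or (
        major_minor(sqlglot_version) != stored.minor_sqlglot_version
    )
    # Backups are additionally taken on a sqlmesh minor version bump,
    # without touching the migrate_rows decision.
    backup = migrate_rows or (
        major_minor(sqlmesh_version) != stored.minor_sqlmesh_version
    )
    return migrate_rows, backup


if __name__ == "__main__":
    stored = StoredVersions(minor_sqlglot_version="25.0", minor_sqlmesh_version="0.130")
    # sqlmesh minor bump only: back up, but do not migrate rows.
    print(plan_migration(stored, [], "25.0.3", "0.131.0"))
```

Keeping the two flags separate means a sqlmesh minor bump still produces a rollback point without triggering the snapshot and environment migration.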

Contributor Author

After reading your comments and digging into the code a bit more, I believe I can see what you're talking about.

Hopefully I've addressed your concerns by only performing a table copy when there's a sqlmesh minor version change, without impacting the flag that drives the complex snapshot and environment migration process.

Member

Yes, this looks great, thank you! However, I'm not sure why the refactor was performed in the other method (def migrate). Is that change necessary? If not, can we please revert it?

@treysp
Contributor

treysp commented Dec 2, 2024

Hello, we've fixed the PySpark issue, so you should be good to revert the Dockerfile change and rebase.

Member

@izeigerman izeigerman left a comment

LGTM once green. Thanks for addressing comments!

@izeigerman izeigerman merged commit 8c78aae into TobikoData:main Dec 4, 2024
@MikeWallis42 MikeWallis42 deleted the rollback_for_sqlmesh branch December 5, 2024 08:42

Development

Successfully merging this pull request may close these issues.

Allow rollbacks for SQLMesh updates

3 participants