feat(k8s): Add pure migration option for api component#22750

Merged
laipz8200 merged 2 commits into langgenius:main from BorisPolonsky:migration-mode on Jul 23, 2025

@BorisPolonsky
Contributor

@BorisPolonsky BorisPolonsky commented Jul 22, 2025

Important

  1. Make sure you have read our contribution guidelines
  2. Ensure there is an associated issue and you have been assigned to it
  3. Use the correct syntax to link this PR: Fixes #<issue number>.

Summary

This PR eases maintenance in Kubernetes deployments by adding a pure migration mode that executes the data migration logic and exits normally without launching the web server. This is useful when upgrading the Dify version, as it enables the maintainer to launch a separate container that runs the flask upgrade-db logic via entrypoint.sh instead of tampering with the actual api or worker containers.

Without this mode, the migration logic is executed in both the api and worker components rather than as a standalone, single-replica step. This can cause a deadlock when there are multiple replicas of api or worker, or when the worker starts before api and tries to execute the migration logic, which ends up with issues like #19910. With this migration-only mode, it's possible to disable auto migration for api and worker and bring up a standalone container that does the migration beforehand (via a Job defined with a pre-install hook annotation that is recognized by Helm).
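
The control flow described above can be sketched as a dry run. This is only an illustrative sketch, not the PR's actual diff: the function name plan_entrypoint, the MIGRATION_ENABLED toggle, and the printed step labels are assumptions made here. Only MODE=migration, flask upgrade-db, and entrypoint.sh come from the description. The function prints the steps instead of running them, so it can be tried without Flask installed.

```shell
# Dry-run sketch: prints the steps an entrypoint might take, one per line,
# instead of actually running flask or a server.
# Usage: plan_entrypoint MODE MIGRATION_ENABLED
# (plan_entrypoint and MIGRATION_ENABLED are hypothetical names.)
plan_entrypoint() {
  mode="${1:-api}"
  migration_enabled="${2:-true}"

  if [ "${migration_enabled}" = "true" ]; then
    echo "flask upgrade-db"
    # Pure migration mode: finish after migrating, never launch a server.
    if [ "${mode}" = "migration" ]; then
      echo "exit 0"
      return 0
    fi
  fi

  # Every other mode continues to its long-running process.
  case "${mode}" in
    worker) echo "start worker" ;;
    *)      echo "start api server" ;;
  esac
}
```

Calling `plan_entrypoint migration true` prints the migration step followed by `exit 0`, while `plan_entrypoint api true` prints the migration step followed by the server start. In this sketch, MODE=migration with migrations disabled still falls through to the server branch, so a migration Job would need the toggle enabled.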

Screenshots

| Before | After |
| --- | --- |
| Server will launch after flask upgrade-db | Run flask upgrade-db only, without launching the server, when MODE=migration |

Checklist

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods

@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. 💪 enhancement New feature or request labels Jul 22, 2025
@BorisPolonsky BorisPolonsky changed the title Add pure migration mode for api component Add pure migration mode for api component for to ease k8s maintenance Jul 22, 2025
@BorisPolonsky BorisPolonsky changed the title Add pure migration mode for api component for to ease k8s maintenance Add pure migration mode for api component Jul 22, 2025
@BorisPolonsky BorisPolonsky changed the title Add pure migration mode for api component feat: Add pure migration mode for api component Jul 22, 2025
@BorisPolonsky BorisPolonsky changed the title feat: Add pure migration mode for api component feat: Add pure migration option for api component Jul 22, 2025
@BorisPolonsky BorisPolonsky changed the title feat: Add pure migration option for api component feat(k8s): Add pure migration option for api component Jul 22, 2025
@BorisPolonsky
Contributor Author

BorisPolonsky commented Jul 22, 2025

As an alternative solution, we could also change the else statement (at line 29) below to enforce an explicit MODE=api condition (i.e. elif [[ "${MODE}" == "api" ]]; then) for the same purpose.

else
  if [[ "${DEBUG}" == "true" ]]; then
    exec flask run --host=${DIFY_BIND_ADDRESS:-0.0.0.0} --port=${DIFY_PORT:-5001} --debug
  else
    exec gunicorn \
      --bind "${DIFY_BIND_ADDRESS:-0.0.0.0}:${DIFY_PORT:-5001}" \
      --workers ${SERVER_WORKER_AMOUNT:-1} \
      --worker-class ${SERVER_WORKER_CLASS:-gevent} \
      --worker-connections ${SERVER_WORKER_CONNECTIONS:-10} \
      --timeout ${GUNICORN_TIMEOUT:-200} \
      app:app
  fi
fi
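
A rough sketch of what such an explicit dispatch could look like follows. It is a dry run that prints the chosen action rather than exec-ing anything; the function name select_action, the action labels, and the decision to reject unknown values are assumptions for illustration, not code from the repository.

```shell
# Dry-run sketch of a stricter dispatch: each MODE is matched explicitly and
# anything else is rejected, rather than falling through to the web server.
# (select_action and the printed labels are hypothetical.)
select_action() {
  case "${1:-}" in
    api)       echo "start api server" ;;
    worker)    echo "start worker" ;;
    migration) echo "exit after migration" ;;
    *)
      # Unknown or empty MODE: report on stderr and fail.
      echo "unknown MODE: '${1:-}'" >&2
      return 1
      ;;
  esac
}
```

One consequence of this design is that an unset or misspelled MODE fails fast instead of silently starting the api server, so deployments that rely on the implicit default would have to set MODE=api explicitly.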

@crazywoola crazywoola requested a review from laipz8200 July 22, 2025 07:59
@crazywoola
Member

cc @laipz8200

Member

@laipz8200 laipz8200 left a comment

Although we've already added a lock,

lock = redis_client.lock(name="db_upgrade_lock", timeout=60)

to manage migration issues across multiple instances, I believe it would be beneficial to include an additional solution as well.

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jul 23, 2025
@laipz8200 laipz8200 merged commit e64e756 into langgenius:main Jul 23, 2025
6 checks passed
@BorisPolonsky BorisPolonsky deleted the migration-mode branch July 24, 2025 00:17
tutkun pushed a commit to tutkun/dify that referenced this pull request Aug 15, 2025