fix: added graceful shutdown and entrypoint wrapper #1319

Merged: lucaseduoli merged 3 commits into release-0.4.1 from fix/opensearch_process on Apr 2, 2026

Collaborator

@lucaseduoli lucaseduoli commented Apr 1, 2026

This pull request introduces improvements to the OpenSearch container orchestration and application shutdown process, focusing on more reliable startup, graceful shutdown, and better operational health management. The most significant changes include the addition of a custom OpenSearch entrypoint wrapper for graceful shutdown, enhancements to the Docker Compose setup for health checks and persistent data, and the implementation of a graceful OpenSearch client shutdown in the application.

OpenSearch Container Entrypoint and Lifecycle Management:

  • Added a new opensearch-entrypoint-wrapper.sh script that starts OpenSearch, applies security setup after a delay, and handles shutdown signals so that OpenSearch stops cleanly. (opensearch-entrypoint-wrapper.sh, Dockerfile)
  • Updated Docker Compose to use the new entrypoint, added a healthcheck and a persistent named volume (opensearch-data), and improved stop behavior with a grace period. (docker-compose.yml)
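
The wrapper itself is a shell script and is not reproduced in this conversation. As a rough Python analog of the stop sequence it is described as implementing (request a clean stop, wait for a grace period, force-kill only as a last resort), a minimal sketch might look like this. The function name `stop_with_grace` is hypothetical, and the 90-second default is borrowed from the wait loop mentioned later in this thread.

```python
import subprocess
import sys


def stop_with_grace(proc: subprocess.Popen, grace_seconds: float = 90.0) -> bool:
    """Ask a child process to stop; force-kill it only if the grace period expires.

    Returns True if the process had to be force-killed, False if it exited on its own.
    """
    if proc.poll() is not None:
        return False  # already exited, nothing to do
    proc.terminate()  # SIGTERM on POSIX: request a clean stop
    try:
        proc.wait(timeout=grace_seconds)
        return False
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX: last resort
        proc.wait()  # reap the child so it does not linger as a zombie
        return True


if __name__ == "__main__":
    # Stand-in child process; a real wrapper would launch OpenSearch here.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    forced = stop_with_grace(child, grace_seconds=5.0)
    print("force-killed" if forced else "stopped cleanly")
```

The grace period here should stay below the Compose `stop_grace_period`, otherwise Docker may kill the container before the wrapper's own fallback runs.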

Application Shutdown Improvements:

  • Implemented graceful_opensearch_shutdown in utils/opensearch_utils.py to ensure all operations complete and connections close properly on app shutdown. (src/utils/opensearch_utils.py)
  • Updated the main application shutdown event to call this new shutdown function and improved logging around shutdown. (src/main.py)

Code Quality and Minor Fixes:

  • Minor code cleanup and parameter fix in OpenSearch client usage. (src/api/settings.py)
  • Commented out the direct image usage for openrag-backend in favor of building from source. (docker-compose.yml)

Closes #1170

@lucaseduoli lucaseduoli requested a review from mpawlow April 1, 2026 18:40
@lucaseduoli lucaseduoli self-assigned this Apr 1, 2026
@github-actions github-actions Bot added backend 🔷 Issues related to backend services (OpenSearch, Langflow, APIs) docker labels Apr 1, 2026
@lucaseduoli lucaseduoli changed the title added graceful shutdown and entrypoint wrapper fix: added graceful shutdown and entrypoint wrapper Apr 1, 2026
@github-actions github-actions Bot added the bug 🔴 Something isn't working. label Apr 1, 2026
Collaborator

@mpawlow mpawlow left a comment

@lucaseduoli

Code Review 1

  • See PR comments: (a) to (e)

Comment thread src/main.py
Comment thread docker-compose.yml Outdated
      - opensearch-data:/usr/share/opensearch/data
    stop_grace_period: 2m
    healthcheck:
      test: ["CMD-SHELL", "curl -ks https://localhost:9200 >/dev/null 2>&1 || exit 1"]

(b) [Normal] Healthcheck Does Not Verify OpenSearch Authentication

Problem

  • The Docker Compose healthcheck uses an unauthenticated curl request:
    test: ["CMD-SHELL", "curl -ks https://localhost:9200 >/dev/null 2>&1 || exit 1"]
  • Without -f/--fail, curl exits 0 for any HTTP response, including 401 Unauthorized
  • After the security plugin is configured, the root endpoint returns 401 to unauthenticated requests
    • The healthcheck still reports the container as healthy in this case
  • Services using depends_on with condition: service_healthy would start even if security is not yet correctly configured
    • This can cause authentication failures at runtime

Solutions

  1. (Recommended) Include credentials and assert a successful response (the URL is quoted so that the shell does not treat & as a background operator):
    test: ["CMD-SHELL", "curl -ks -u admin:$$OPENSEARCH_PASSWORD 'https://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=5s' | grep -q '\"status\"'"]
  2. Use -w "%{http_code}" to assert the HTTP status code explicitly rather than relying on curl's exit code.
  3. Fall back to TCP-only checking (nc -z localhost 9200) but document the limitation that it does not validate auth.

Comment thread opensearch-entrypoint-wrapper.sh
Comment thread opensearch-entrypoint-wrapper.sh
Comment thread src/utils/opensearch_utils.py
Comment thread docker-compose.yml Outdated
Issue

- #1170

Summary

- Improved OpenSearch graceful shutdown reliability by flushing pending writes, adding a force-kill fallback in the entrypoint wrapper, and preventing double-close of the client connection.
- Hardened the Docker Compose healthcheck to verify authenticated cluster health status rather than bare connectivity.

OpenSearch Shutdown Improvements

- Replaced `cluster.health()` call with `indices.flush(index="_all", wait_if_ongoing=True)` in `graceful_opensearch_shutdown` to ensure pending write operations are persisted before the client closes.
- Set `clients.opensearch = None` after graceful shutdown in `src/main.py` to prevent a redundant double-close during `clients.cleanup()`.
- Added a 90-second wait loop with `kill -0` polling in `opensearch-entrypoint-wrapper.sh` before sending `SIGKILL`, so the process has time to stop cleanly and is forcibly terminated only if it is still running after the wait.
- Removed stale "Made with Bob" comment from `opensearch-entrypoint-wrapper.sh`.
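
The flush-then-close ordering and the double-close guard described above can be sketched as follows. This is an illustration under assumptions, not the code from `src/utils/opensearch_utils.py` or `src/main.py`: it assumes an async client exposing `indices.flush(...)` and `close()`, and the `Clients` holder, `FakeClient` stand-in, and `shutdown` function are hypothetical names used so the example runs without a cluster.

```python
import asyncio
from types import SimpleNamespace


async def graceful_opensearch_shutdown(client) -> None:
    """Flush pending writes to disk, then close the client's connections."""
    try:
        await client.indices.flush(index="_all", wait_if_ongoing=True)
    finally:
        await client.close()  # close even if the flush fails


class Clients:
    """Hypothetical holder for shared clients, mirroring `clients.opensearch`."""

    def __init__(self, opensearch):
        self.opensearch = opensearch

    async def cleanup(self) -> None:
        if self.opensearch is not None:  # skipped once shutdown has cleared it
            await self.opensearch.close()
            self.opensearch = None


class FakeClient:
    """Stand-in that records calls instead of talking to OpenSearch."""

    def __init__(self):
        self.calls = []
        self.indices = SimpleNamespace(flush=self._flush)

    async def _flush(self, **kwargs):
        self.calls.append("flush")

    async def close(self):
        self.calls.append("close")


async def shutdown(clients: Clients) -> None:
    if clients.opensearch is not None:
        await graceful_opensearch_shutdown(clients.opensearch)
        clients.opensearch = None  # prevents a second close in cleanup()
    await clients.cleanup()


fake = FakeClient()
asyncio.run(shutdown(Clients(fake)))
print(fake.calls)  # ['flush', 'close']
```

Clearing the reference after the graceful path is what keeps `cleanup()` from closing the same client a second time.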

Docker Compose Healthcheck

- Updated the OpenSearch healthcheck command to authenticate with `admin:$OPENSEARCH_PASSWORD` and query `/_cluster/health`, asserting that the cluster status is `green` or `yellow` rather than accepting any HTTP response as healthy.
@github-actions github-actions Bot added bug 🔴 Something isn't working. and removed bug 🔴 Something isn't working. labels Apr 2, 2026
Collaborator

@mpawlow mpawlow left a comment

Code Review 2

  • ✅ LGTM / Approved
  • Added a commit to address PR comments: (a) to (e)
  • Note: No functional review performed yet

@github-actions github-actions Bot added the lgtm label Apr 2, 2026
Comment thread docker-compose.yml
Comment thread docker-compose.yml
@github-actions github-actions Bot added bug 🔴 Something isn't working. and removed bug 🔴 Something isn't working. labels Apr 2, 2026
@lucaseduoli lucaseduoli merged commit cee594e into release-0.4.1 Apr 2, 2026
10 checks passed
@github-actions github-actions Bot deleted the fix/opensearch_process branch April 2, 2026 13:49

Labels

backend 🔷 Issues related to backend services (OpenSearch, Langflow, APIs) bug 🔴 Something isn't working. docker lgtm

Development

Successfully merging this pull request may close these issues.

[Bug]: TransportError(503,'search_phase_execution_exception') while uploading documents

2 participants