Stabilize AWQ runtime loading and reproducible image build baseline #4

Merged
parkcheolhong merged 7 commits into main from gpu-llm-server-awq-20260427 on Apr 26, 2026

@parkcheolhong
Owner

PR Title

Stabilize AWQ runtime loading and reproducible image build baseline

Summary

This PR packages the AWQ recovery work into a reviewable unit and includes reproducible runtime/image updates plus operational evidence documents.

Why

  • AWQ load path previously failed intermittently under generic model load flow.
  • Container builds were unstable due to package metadata/build path issues.
  • Operational closure needed explicit restart + health evidence and traceable release notes.

Scope Of Changes

  • Runtime
    • Added AWQ-first model loading path with safe fallback behavior.
  • Image/Dependency
    • Hardened dependency installation for gptqmodel and metadata-broken packages.
    • Added pypcre patch helper flow for deterministic builds.
  • Ops Documentation
    • Synchronized AWQ recovery checklist with post-restart verification evidence.
    • Added commit plan and release notes for traceability.
  • Repository Baseline
    • Added compose profiles, env presets, operation docs, monitoring/nginx configs, and lightweight web UI assets.
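The "AWQ-first model loading path with safe fallback" described under Runtime can be sketched roughly as follows. This is a minimal illustration, not the code from commit 5eb1b53: the function `load_with_fallback` and the two stub loaders are hypothetical names, and the stubs stand in for whatever AWQ-specific and generic load calls the runtime actually uses.

```python
import logging
from typing import Any, Callable, List, Sequence, Tuple

logger = logging.getLogger("model_loader")

# A loader takes a model identifier and returns a loaded model object.
Loader = Callable[[str], Any]


def load_with_fallback(
    model_id: str, loaders: Sequence[Tuple[str, Loader]]
) -> Tuple[str, Any]:
    """Try each (name, loader) pair in order and return the first success.

    Hypothetical helper: the preferred (AWQ) loader goes first, the
    generic loader last, so a failure in the AWQ path degrades safely.
    """
    errors: List[str] = []
    for name, loader in loaders:
        try:
            model = loader(model_id)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            logger.warning("loader %s failed for %s: %s", name, model_id, exc)
            errors.append(f"{name}: {exc}")
            continue
        logger.info("loader %s succeeded for %s", name, model_id)
        return name, model
    raise RuntimeError(f"all loaders failed for {model_id}: " + "; ".join(errors))


# Stub loaders (assumptions for illustration only, not real library calls).
def awq_stub(model_id: str) -> dict:
    if "awq" not in model_id.lower():
        raise ValueError("not an AWQ checkpoint")
    return {"id": model_id, "quant": "awq"}


def generic_stub(model_id: str) -> dict:
    return {"id": model_id, "quant": None}


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    order = [("awq", awq_stub), ("generic", generic_stub)]
    print(load_with_fallback("example/model-AWQ", order))
    print(load_with_fallback("example/model-fp16", order))
```

Logging which loader succeeded gives the kind of "AWQ loader success log" that the validation checklist refers to, and keeping the generic loader last in the list is what limits the impact on non-AWQ model variants.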

Commit Units

  • 5eb1b53 fix(runtime): add AWQ dedicated loader path and keep safe fallback
  • 896c7c1 fix(image): harden dependency install path for gptqmodel and broken package metadata
  • 62579fe docs(ops): sync AWQ recovery checklist with post-restart validation evidence
  • 3b35d42 chore(repo): add baseline configs, compose profiles, and operation docs
  • 30d3bac feat(observability): add monitoring stack and nginx gateway configs
  • 785a734 feat(web-ui): add lightweight model server dashboard
  • c314966 docs(release): add 2026-04-27 AWQ recovery release notes

Validation Evidence Checklist

  • AWQ loader success log confirmed in runtime startup flow.
  • Health endpoint success observed multiple times after startup stabilization.
  • Post-restart health verification reflected in ops checklist docs.
  • Tagged milestone commits created and pushed.
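One way to make "health endpoint success observed multiple times" reproducible is to require a run of consecutive successful probes after a restart. The sketch below is an assumption about how such a check could look, not the procedure recorded in the ops checklist; `wait_until_stable`, `http_probe`, and the URL `http://localhost:8000/health` are hypothetical.

```python
import http.client
import time
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # URLError and HTTPError are OSError subclasses; treat any of
        # these failures as "not healthy yet".
        return False


def wait_until_stable(
    probe: Callable[[], bool],
    required: int = 3,
    attempts: int = 30,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once `probe` succeeds `required` times in a row.

    A single failure resets the streak, so a flapping service does not pass.
    Gives up and returns False after `attempts` probes.
    """
    streak = 0
    for _ in range(attempts):
        if probe():
            streak += 1
            if streak >= required:
                return True
        else:
            streak = 0
        sleep(interval)
    return False


if __name__ == "__main__":
    # Hypothetical endpoint; substitute the server's real health URL.
    url = "http://localhost:8000/health"
    ok = wait_until_stable(lambda: http_probe(url))
    print("stable" if ok else "not stable")
```

Passing the probe and the sleep function as parameters keeps the streak logic testable without a running server.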

Tags Included

Risk And Rollback

  • Risk: Runtime behavior changes in the model load path can affect non-AWQ model variants.
  • Mitigation: The fallback path is retained; deploy on the branch and validate health before promoting.
  • Rollback: Revert at tag boundaries or roll back to the pre-5eb1b53 state.

Reviewer Focus

  • AWQ-first loader path and fallback branching correctness.
  • Dockerfile dependency hardening correctness/reproducibility.
  • Operational evidence consistency between checklist and release notes.

Copilot AI review requested due to automatic review settings April 26, 2026 15:51
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Sorry @parkcheolhong, your pull request is larger than the review limit of 150000 diff characters

@parkcheolhong parkcheolhong merged commit cd278cb into main Apr 26, 2026
5 checks passed
@parkcheolhong parkcheolhong review requested due to automatic review settings April 26, 2026 16:15
Owner Author

@parkcheolhong parkcheolhong left a comment

I have reviewed everything.

Owner Author

@parkcheolhong parkcheolhong left a comment

I have checked everything. If anything is wrong, please correct it.

@parkcheolhong parkcheolhong deleted the gpu-llm-server-awq-20260427 branch April 26, 2026 16:47
@parkcheolhong parkcheolhong restored the gpu-llm-server-awq-20260427 branch April 26, 2026 16:48
@parkcheolhong parkcheolhong deleted the gpu-llm-server-awq-20260427 branch April 26, 2026 16:49
Owner Author

@parkcheolhong parkcheolhong left a comment

As of 2026/04/27, the UI/UX browser has been changed.

Owner Author

@parkcheolhong parkcheolhong left a comment

Confirmed.
