fix(thread-safety): eliminate races in log mutex, watcher, and index threads#3
fix(thread-safety): eliminate races in log mutex, watcher, and index threads#3
Conversation
Review Summary by QodoFix thread-safety races in log mutex, watcher, and index threads
WalkthroughsDescription• Eliminate log mutex TOCTOU race by explicit initialization before threads • Protect watcher projects hash table with mutex from concurrent access • Add cbm_thread_detach() for POSIX and Windows platforms • Detach index job threads to prevent handle leaks Diagramflowchart LR
A["Log Mutex TOCTOU Race"] -->|Replace lazy-init| B["Explicit cbm_ui_log_init()"]
B -->|Called from main| C["Before thread creation"]
D["Watcher Hash Table Race"] -->|Add mutex protection| E["projects_lock"]
E -->|Wrap all operations| F["watch/unwatch/poll_once"]
G["Index Thread Handle Leak"] -->|Add detach function| H["cbm_thread_detach()"]
H -->|Call after creation| I["Free thread resources"]
File Changes1. src/foundation/compat_thread.c
|
Code Review by Qodo
1. Init flag precedes mutex
|
| /* Must be called once before any threads are created. */ | ||
| void cbm_ui_log_init(void) { | ||
| if (!atomic_exchange(&g_log_mutex_init, 1)) { | ||
| cbm_mutex_init(&g_log_mutex); | ||
| } | ||
| } |
There was a problem hiding this comment.
1. Init flag precedes mutex 🐞 Bug ≡ Correctness
cbm_ui_log_init() sets g_log_mutex_init=1 before calling cbm_mutex_init(), so another thread can observe init==1 and call cbm_mutex_lock() on an uninitialized mutex. This is undefined behavior and can crash under concurrent logging during initialization.
Agent Prompt
## Issue description
`cbm_ui_log_init()` currently sets `g_log_mutex_init` to 1 before `cbm_mutex_init(&g_log_mutex)` runs. Any concurrent `cbm_ui_log_append()` that sees the flag can lock an uninitialized mutex (UB/crash).
## Issue Context
Even if the intent is “call before any threads,” this function is now part of a thread-safety fix and should be correct under incidental concurrency.
## Fix Focus Areas
- Rework init sequencing so the mutex is fully initialized *before* any thread can observe `g_log_mutex_init == 1`.
- Prefer a robust one-time init pattern (e.g., 3-state init: 0=uninit, 1=initializing, 2=ready; wait/spin on 1), or store the flag only after init with release semantics and load with acquire semantics.
- src/ui/http_server.c[145-162]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| mg_http_reply(c, 500, g_cors_json, "{\"error\":\"thread creation failed\"}"); | ||
| return; | ||
| } | ||
| cbm_thread_detach(&tid); /* Don't leak thread handle */ | ||
|
|
There was a problem hiding this comment.
2. Index job slot reuse race 🐞 Bug ≡ Correctness
handle_index_start() reuses slots when status is 2/3, but index_thread_fn sets status=2/3 and then still reads job->root_path for its final log. With detached threads, a new request can overwrite job->root_path while the old thread is still running, causing a data race and corrupted logging/state.
Agent Prompt
## Issue description
A detached `index_thread_fn` writes `job->status` to 2/3 (making the slot reusable) and then reads `job->root_path` for the final `cbm_log_info`. `handle_index_start` can immediately reuse that slot and overwrite `job->root_path`, creating a race.
## Issue Context
Slots are considered free when `status` is 0/2/3. Once a thread stores 2/3, it must not access any `job` fields that the next request will overwrite.
## Fix Focus Areas
- Ensure that after `job->status` is set to a reusable value, the thread no longer reads/writes any `job` fields (especially `root_path` and `error_msg`).
- Simple options:
- Move the final `cbm_log_info("ui.index.done", ...)` to *before* storing status=2/3, or
- Copy `job->root_path` into a local buffer before storing status and log using the local copy, and do not touch `job` thereafter.
- Consider tightening slot reuse rules (e.g., only reuse when `status==0`, and add an explicit “reset to 0” API/state) if you want stronger guarantees.
- src/ui/http_server.c[725-733]
- src/ui/http_server.c[768-801]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| cbm_index_fn index_fn; | ||
| void *user_data; | ||
| CBMHashTable *projects; /* name → project_state_t* */ | ||
| cbm_mutex_t projects_lock; | ||
| atomic_int stopped; | ||
| }; |
There was a problem hiding this comment.
3. Watcher apis still racy 🐞 Bug ☼ Reliability
cbm_watcher_touch() and cbm_watcher_watch_count() access w->projects without projects_lock, so they can race with watch/unwatch/poll_once and potentially dereference freed project_state_t. The new lock only covers some access paths, leaving the hash table unsafely shared.
Agent Prompt
## Issue description
After introducing `projects_lock`, some public watcher APIs (`cbm_watcher_touch`, `cbm_watcher_watch_count`) still access `w->projects` without the mutex. This can race with `watch/unwatch/poll_once` and lead to use-after-free or corrupted hash table state.
## Issue Context
`watch`, `unwatch`, and `poll_once` now lock around hash table operations, so any remaining unlocked access becomes the weak link.
## Fix Focus Areas
- Wrap `cbm_ht_get`/state mutation in `cbm_watcher_touch()` with `cbm_mutex_lock/unlock(&w->projects_lock)`.
- Wrap `cbm_ht_count` in `cbm_watcher_watch_count()` similarly.
- src/watcher/watcher.c[260-318]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
There was a problem hiding this comment.
Pull request overview
This PR aims to remove several thread-safety hazards in the watcher/logging paths and to prevent leaking OS thread handles from background indexing operations.
Changes:
- Adds a mutex to synchronize access to the watcher’s
projectshash table across the main thread and watcher thread. - Reworks UI log ring-buffer mutex initialization to avoid a lazy-init race by introducing
cbm_ui_log_init()and calling it frommain(). - Introduces
cbm_thread_detach()and uses it for background index threads to avoid leaking thread handles.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| src/watcher/watcher.c | Adds projects_lock and wraps hash table operations/iteration with a mutex. |
| src/ui/http_server.h | Exposes cbm_ui_log_init() as an explicit init API for the UI log ring buffer. |
| src/ui/http_server.c | Implements cbm_ui_log_init(), updates cbm_ui_log_append(), and detaches index threads after creation. |
| src/main.c | Calls cbm_ui_log_init() during startup before launching watcher/UI threads. |
| src/foundation/compat_thread.h | Declares cbm_thread_detach() in the portable thread API. |
| src/foundation/compat_thread.c | Implements cbm_thread_detach() for Windows and POSIX. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| struct cbm_watcher { | ||
| cbm_store_t *store; | ||
| cbm_index_fn index_fn; | ||
| void *user_data; | ||
| CBMHashTable *projects; /* name → project_state_t* */ | ||
| cbm_mutex_t projects_lock; | ||
| atomic_int stopped; | ||
| }; |
There was a problem hiding this comment.
projects_lock is added to protect w->projects, but there are still unprotected accesses elsewhere in this file (e.g., cbm_watcher_touch() calls cbm_ht_get() and mutates s->next_poll_ns, and cbm_watcher_watch_count() calls cbm_ht_count() without taking the lock). With the watcher thread concurrently iterating the hash table in cbm_watcher_poll_once(), these remain data races. Please wrap those operations with projects_lock (or otherwise make them thread-safe) so all projects access is consistently synchronized.
| poll_ctx_t ctx = { | ||
| .w = w, | ||
| .now = now_ns(), | ||
| .reindexed = 0, | ||
| }; | ||
| cbm_mutex_lock(&w->projects_lock); | ||
| cbm_ht_foreach(w->projects, poll_project, &ctx); | ||
| cbm_mutex_unlock(&w->projects_lock); | ||
| return ctx.reindexed; |
There was a problem hiding this comment.
cbm_watcher_poll_once() holds projects_lock across cbm_ht_foreach(), and the callback can run git commands and invoke index_fn (which can take a long time). This effectively blocks watch/unwatch for the full duration of indexing, and can become a responsiveness bottleneck (and increases deadlock risk if index_fn ever calls back into watcher APIs). Consider iterating a snapshot (copy pointers/keys under the lock, then release it before expensive work) or using finer-grained locking.
| /* Create and start watcher in background thread */ | ||
| /* Initialize log mutex before any threads are created */ | ||
| cbm_ui_log_init(); | ||
|
|
There was a problem hiding this comment.
cbm_ui_log_init() is called here, but cbm_log_set_sink(cbm_ui_log_append) and early cbm_log_info("server.start", ...) happen earlier in main(). With the new cbm_ui_log_append() guard (!g_log_mutex_init → return), those startup logs will now be dropped. Also cbm_diag_start() may create a background thread before this call (when enabled), which conflicts with the “before any threads are created” requirement. Please move cbm_ui_log_init() earlier (before setting the log sink / first log / any optional thread creation) or make log initialization automatic inside the UI logging/server setup.
| /* Must be called once before any threads are created. */ | ||
| void cbm_ui_log_init(void) { | ||
| if (!atomic_exchange(&g_log_mutex_init, 1)) { | ||
| cbm_mutex_init(&g_log_mutex); | ||
| } | ||
| } | ||
|
|
||
| /* Called from a log hook — appends a line to the ring buffer (thread-safe) */ | ||
| void cbm_ui_log_append(const char *line) { | ||
| if (!line) | ||
| if (!line || !atomic_load(&g_log_mutex_init)) | ||
| return; | ||
| if (!atomic_load(&g_log_mutex_init)) { | ||
| cbm_mutex_init(&g_log_mutex); | ||
| atomic_store(&g_log_mutex_init, 1); | ||
| } | ||
| cbm_mutex_lock(&g_log_mutex); |
There was a problem hiding this comment.
cbm_ui_log_append() now becomes a no-op until cbm_ui_log_init() is called, but other code (e.g. handle_logs() in this file) locks g_log_mutex unconditionally. That means any future code path that starts the HTTP server without first calling cbm_ui_log_init() will hit undefined behavior by locking an uninitialized mutex. Consider initializing the log mutex inside cbm_http_server_new()/cbm_http_server_run() (before spawning threads), or adding a safe guard in the HTTP handlers to avoid locking until initialized.
| mg_http_reply(c, 500, g_cors_json, "{\"error\":\"thread creation failed\"}"); | ||
| return; | ||
| } | ||
| cbm_thread_detach(&tid); /* Don't leak thread handle */ |
There was a problem hiding this comment.
The return value of cbm_thread_detach() is ignored. On POSIX, pthread_detach() can fail (e.g., invalid thread ID), which would reintroduce the handle/resource leak this change is trying to fix. Please check the return code and log/handle failure (at minimum, emit a warning) so leaks don’t silently persist in error cases.
| cbm_thread_detach(&tid); /* Don't leak thread handle */ | |
| { | |
| int detach_rc = cbm_thread_detach(&tid); | |
| if (detach_rc != 0) { | |
| fprintf(stderr, | |
| "warning: cbm_thread_detach failed for index job slot %d (path: %s), rc=%d; " | |
| "thread handle may leak\n", | |
| slot, job->root_path, detach_rc); | |
| } | |
| } |
9c53fc0 to
deea49e
Compare
…threads - http_server: Replace lazy log mutex init with 3-state init (UNINIT → INITING → INITED) using atomic CAS. Concurrent callers spin until init completes, preventing use of uninitialized mutex. cbm_ui_log_append calls cbm_ui_log_init on first use so early startup logs are not dropped. - watcher: Add cbm_mutex_t to protect projects hash table. All accessors (watch, unwatch, touch, watch_count, poll_once) are guarded. poll_once snapshots project pointers under lock then polls without holding it, keeping the critical section small during git I/O and indexing. state_new OOM is handled with early return. - compat_thread: Add cbm_thread_detach() for POSIX and Windows. Both join() and detach() clear the handle on success across both platforms for consistent lifecycle tracking. - http_server: Detach index job threads to prevent handle leaks.
deea49e to
d6966f9
Compare
Summary
cbm_ui_log_appendused a non-atomic check-then-init pattern for the log ring buffer mutex. Two threads calling it simultaneously could both initialize the mutex, corrupting it. Replaced with explicitcbm_ui_log_init()called frommain()before any threads are created.projectshash table was written by the main thread (watch/unwatch) and iterated by the watcher thread (poll_once) with no synchronization. Addedcbm_mutex_t projects_lockto the watcher struct, wrapping all hash table operations.cbm_thread_createwere never joined or detached, leaking thread handles. Addedcbm_thread_detach()(POSIX + Windows) and called it after thread creation.Test plan
-fsanitize=thread(TSan) to verify no data races remain