Fix flaky integration tests with polling-based stderr wait#1746
Conversation
Replace fixed 1-second sleep with waitForStderr() helper that polls the stderr buffer for expected output with a 5-second deadline. This fixes race conditions where the gateway hadn't written to stderr before the test read and killed the process. Affected tests: - TestDIFCModeFilterViaEnv - TestDIFCModePropagateViaEnv - TestFullDIFCConfigFromJSON - TestDIFCWithFullConfig Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR aims to eliminate flakiness in DIFC integration tests by replacing fixed time.Sleep waits with a polling helper that waits for expected stderr log output before proceeding.
Changes:
- Added
waitForStderr()helper to poll stderr for expected substrings with a deadline. - Replaced several fixed
time.Sleep(1 * time.Second)calls withwaitForStderr()in DIFC/guards integration tests.
Comments suppressed due to low confidence (3)
test/integration/difc_config_test.go:274
- The return value of waitForStderr is ignored. If the substring never appears within the deadline, the test will continue and fail later with a less clear assertion. Please assert that waitForStderr returns true (and ideally include stderr output in the failure) so timeouts are explicit.
waitForStderr(&stderr, "Guards enforcement", 5*time.Second)
test/integration/difc_config_test.go:324
- The return value of waitForStderr is ignored. Please assert on it so the test fails immediately with a clear timeout message if the expected log output never appears.
waitForStderr(&stderr, "Guards enforcement", 5*time.Second)
test/integration/difc_config_test.go:388
- The return value of waitForStderr is ignored. Please assert on it (and include stderr) so a missing log line results in a clear timeout failure rather than proceeding and failing later.
waitForStderr(&stderr, "Guards enforcement", 5*time.Second)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // waitForStderr polls buf until it contains substr or the deadline expires. | ||
| // Returns true if the substring was found within the deadline. | ||
| func waitForStderr(buf *bytes.Buffer, substr string, deadline time.Duration) bool { | ||
| end := time.Now().Add(deadline) | ||
| for time.Now().Before(end) { | ||
| if strings.Contains(buf.String(), substr) { | ||
| return true | ||
| } | ||
| time.Sleep(100 * time.Millisecond) | ||
| } |
There was a problem hiding this comment.
waitForStderr reads buf.String() while the gateway subprocess is concurrently writing to the same bytes.Buffer via cmd.Stderr. bytes.Buffer is not safe for concurrent read/write, so this can introduce data races or corrupted reads (even if it reduces flakiness). Consider switching stderr capture to a concurrency-safe writer (e.g., wrap bytes.Buffer with a mutex and expose a locked String(), or read from cmd.StderrPipe() in a goroutine and synchronize access) and have waitForStderr operate on that safe abstraction instead of *bytes.Buffer.
There was a problem hiding this comment.
@copilot open a new pull request to apply changes based on this feedback
There was a problem hiding this comment.
Done in commit 0e95d8f. Introduced a syncBuffer type that wraps bytes.Buffer with a sync.RWMutex — Write takes the write lock and String takes the read lock. waitForStderr now accepts *syncBuffer instead of *bytes.Buffer, eliminating the data race between the subprocess writing and the polling goroutine reading. All four tests using waitForStderr now declare var stderr syncBuffer for their stderr capture. Also added require.Truef assertions on the waitForStderr return value in each call site so timeout failures produce actionable messages including the actual stderr content.
test/integration/difc_config_test.go
Outdated
| // Wait for startup — look for server name in logs | ||
| waitForStderr(&stderr, "playwright", 5*time.Second) |
There was a problem hiding this comment.
The PR description mentions updating TestDIFCWithFullConfig, but that test doesn’t appear in this file (and isn’t changed in this diff). Please either update the PR description to match the actual changes or include the intended test update so reviewers can validate the preventive fix claimed in the description.
There was a problem hiding this comment.
@copilot open a new pull request to apply changes based on this feedback
There was a problem hiding this comment.
TestDIFCWithFullConfig was a mistake in the PR description — that test doesn't exist. The actual test that was updated (preventively, with syncBuffer + waitForStderr) was TestDIFCConfigWithGuards. Updated the PR description to reflect the correct test names.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…in waitForStderr Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
|
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
Problem
Integration tests
TestDIFCModeFilterViaEnvandTestDIFCModePropagateViaEnvwere flaky — they used a fixed 1-secondtime.Sleepbefore reading stderr, but the gateway sometimes hadn't written the expected log output yet.Fix
Added a
waitForStderr()helper that polls a concurrency-safesyncBufferfor expected content with a 5-second deadline (100ms polling interval). This eliminates both the timing race condition and the data race caused by polling abytes.Bufferconcurrently written to by the subprocess.syncBufferwrapsbytes.Bufferwith async.RWMutex, providing a thread-safeWritemethod (used bycmd.Stderr) and a lockedStringmethod (used bywaitForStderrand assertions).All
waitForStderrcall sites now assert on the return value withrequire.Truef, including the actual stderr content in the failure message to make timeout failures diagnosable.Affected tests:
TestDIFCModeFilterViaEnvTestDIFCModePropagateViaEnvTestFullDIFCConfigFromJSONTestDIFCConfigWithGuardsVerification
make agent-finishedpasses — all unit + integration tests green.💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.