Skip to content

fix: limit the amount of logs per request#156

Merged
lucaspin merged 8 commits intomasterfrom
limit-log-lines-per-request
Jul 15, 2022
Merged

fix: limit the amount of logs per request#156
lucaspin merged 8 commits intomasterfrom
limit-log-lines-per-request

Conversation

@lucaspin
Copy link
Copy Markdown
Contributor

@lucaspin lucaspin commented Jul 14, 2022

  • Caps the number of log events sent in each log batch to avoid pushing too many logs at a time.
  • Stop trying to send logs if the API has indicated that no more space is available to avoid sending unnecessary requests that will be just rejected by the API.
  • Add some randomness to the delay we use between each batch. If the job is not finished, we use a wider range of delay between 1500ms and 3000ms. If the job is finished and we are trying to flush all the logs, we use a tighter range between 500ms and 1000ms. This is in order to avoid coordinating requests at the exact same interval.
  • When flushing logs after a job is finished, we use a default timeout of 60 seconds. If, after that period, the agent still couldn't flush everything, the agent gives up and leaves the logs incomplete.

Other unrelated changes

  • retry.RetryWithConstantWait() was refactored to accept a RetryOptions object. This was done to allow a new argument HideErrors to be added, to not log errors.

Example logs

Jul 14 13:52:32.367 : SYNC response (action: run-job, job: 633cfea0-f355-4134-89a1-94e752471f29)
Jul 14 13:52:32.868 : No executor specified - using shell executor
Jul 14 13:52:32.868 : Running job
Jul 14 13:52:32.868 : Logs will be pushed to https://semaphore.semaphoreci.com/api/v1/logs/633cfea0-f355-4134-89a1-94e752471f29
Jul 14 13:52:32.871 : Waiting 2.905s to push next batch of logs...
Jul 14 13:52:35.782 : Pushing next batch of logs with 216 log events...
Jul 14 13:52:36.009 : Waiting 2.121s to push next batch of logs...
Jul 14 13:52:37.369 : SYNC request (state: running-job, job: 633cfea0-f355-4134-89a1-94e752471f29)
Jul 14 13:52:37.505 : SYNC response (action: continue, job: )
Jul 14 13:52:38.132 : Pushing next batch of logs with 341 log events...
Jul 14 13:52:38.344 : Waiting 1.654s to push next batch of logs...
Jul 14 13:52:40.001 : Pushing next batch of logs with 3 log events...
Jul 14 13:52:40.134 : Waiting 2.396s to push next batch of logs...
Jul 14 13:52:42.507 : SYNC request (state: running-job, job: 633cfea0-f355-4134-89a1-94e752471f29)
Jul 14 13:52:42.532 : Pushing next batch of logs with 513 log events...
Jul 14 13:52:42.636 : SYNC response (action: continue, job: )
Jul 14 13:52:42.856 : Waiting 2.539s to push next batch of logs...
Jul 14 13:52:45.398 : No logs to push - skipping
Jul 14 13:52:45.398 : Waiting 1.534s to push next batch of logs...
Jul 14 13:52:46.936 : Pushing next batch of logs with 682 log events...
Jul 14 13:52:47.278 : Waiting 2.569s to push next batch of logs...
Jul 14 13:52:47.637 : SYNC request (state: running-job, job: 633cfea0-f355-4134-89a1-94e752471f29)
Jul 14 13:52:47.768 : SYNC response (action: continue, job: )
Jul 14 13:52:49.850 : Pushing next batch of logs with 2000 log events...
Jul 14 13:52:50.346 : Waiting 2.884s to push next batch of logs...
Jul 14 13:52:52.769 : SYNC request (state: running-job, job: 633cfea0-f355-4134-89a1-94e752471f29)
Jul 14 13:52:52.908 : SYNC response (action: continue, job: )
Jul 14 13:52:53.238 : Pushing next batch of logs with 2000 log events...
Jul 14 13:52:53.748 : Waiting 1.933s to push next batch of logs...
Jul 14 13:52:55.689 : Pushing next batch of logs with 2000 log events...
Jul 14 13:52:56.186 : Waiting 2.303s to push next batch of logs...
Jul 14 13:52:57.746 : Regular commands finished successfully
Jul 14 13:52:57.754 : Starting epilogue always commands
Jul 14 13:52:57.754 : Starting epilogue on pass commands
Jul 14 13:52:57.754 : Sending finished callback: {"result": "passed"}
Jul 14 13:52:57.910 : SYNC request (state: running-job, job: 633cfea0-f355-4134-89a1-94e752471f29)
Jul 14 13:52:57.979 : Waiting for all logs to be flushed...
Jul 14 13:52:58.047 : SYNC response (action: continue, job: )
Jul 14 13:52:58.499 : Pushing next batch of logs with 2000 log events...
Jul 14 13:52:59.024 : Waiting 765ms to push next batch of logs...
Jul 14 13:52:59.797 : Pushing next batch of logs with 2000 log events...
Jul 14 13:53:00.280 : Waiting 749ms to push next batch of logs...
Jul 14 13:53:01.038 : Pushing next batch of logs with 1032 log events...
Jul 14 13:53:01.335 : Waiting 698ms to push next batch of logs...
Jul 14 13:53:02.042 : No more logs to flush - stopping
Jul 14 13:53:02.042 : Stopped pushing logs.
Jul 14 13:53:02.983 : Sending teardown finished callback
Jul 14 13:53:03.048 : SYNC request (state: running-job, job: 633cfea0-f355-4134-89a1-94e752471f29)
Jul 14 13:53:03.144 : Job teardown finished
Jul 14 13:53:03.181 : SYNC response (action: continue, job: )
Jul 14 13:53:08.186 : SYNC request (state: finished-job, job: 633cfea0-f355-4134-89a1-94e752471f29)
Jul 14 13:53:08.322 : SYNC response (action: wait-for-jobs, job: )

@lucaspin lucaspin requested a review from shiroyasha July 14, 2022 17:25
@lucaspin lucaspin changed the title Limit log lines per request fix: limit the amount of logs per request Jul 14, 2022
@lucaspin lucaspin merged commit 46b4475 into master Jul 15, 2022
@lucaspin lucaspin deleted the limit-log-lines-per-request branch July 15, 2022 11:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant