[agent] debug logs for session, node events on dispatcher, heartbeats #2486
    if backoff > maxSessionFailureBackoff {
        backoff = maxSessionFailureBackoff
    }
    log.G(ctx).WithError(err).Errorf("agent: session failed. Backoff period: %d", backoff)
Use a field here: log.G(ctx).WithError(err).WithField("backoff", backoff).Errorf("agent: session failed").
Note this is the maximum backoff range for a random backoff. Actual backoff is rand([0, backoff)). ;)
    }

    period, err := d.nodes.Heartbeat(nodeInfo.NodeID, r.SessionID)
    log.G(ctx).WithField("dispatcher", "heartbeat").Infof("agent heartbeat period %v", period)
Might be better to use the module here, rather than a field.
Codecov Report
@@ Coverage Diff @@
## master #2486 +/- ##
==========================================
+ Coverage 61.23% 61.42% +0.19%
==========================================
Files 49 129 +80
Lines 6890 21313 +14423
==========================================
+ Hits 4219 13092 +8873
- Misses 2241 6812 +4571
- Partials 430 1409 +979
    // connection.
    func (b *Broker) SelectRemote(dialOpts ...grpc.DialOption) (*Conn, error) {
        peer, err := b.remotes.Select()
        log.G(context.Background()).Infof("Manager selected by agent for session: %v", peer)
nit: all messages start with lowercase.
@anshulpundir CI is failing with a bunch of data races.
Will check it out tomorrow @nishanttotla
    cancel()
    if err != nil {
        if grpc.Code(err) == codes.NotFound {
            log.G(ctx).WithFields(fields).WithError(err).Errorf("heartbeat failed")
Do you want to add the manager details here again, to make it easier to identify which manager the heartbeat failed against?
    period, err := d.nodes.Heartbeat(nodeInfo.NodeID, r.SessionID)

    log.G(ctx).WithField("method", "(*Dispatcher).Heartbeat").Infof("received heartbeat from worker %v, expect next heartbeat in %v", nodeInfo, period)
It's every 5 seconds; I thought that's not too frequent for logging?
It's per node in the cluster, so it can easily be a hundred every 5 seconds :D
Ahh yes, I mistook this for the log on the agent.
    cancel()
    if err != nil {
        if grpc.Code(err) == codes.NotFound {
            log.G(ctx).WithFields(fields).WithError(err).Errorf("heartbeat to manager %v failed", s.conn.Peer())
Do you want to move this one line up, so it's printed regardless of the gRPC code?
Signed-off-by: Anshul Pundir <anshul.pundir@docker.com>
Added info logs to the agent to track the manager it's connecting to, timeouts, and heartbeats from the dispatcher.
Signed-off-by: Anshul Pundir anshul.pundir@docker.com