Skip to content

[agent] debug logs for session, node events on dispatcher, heartbeats#2486

Merged
anshulpundir merged 1 commit into
moby:masterfrom
anshulpundir:worker_logs
Jan 24, 2018
Merged

[agent] debug logs for session, node events on dispatcher, heartbeats#2486
anshulpundir merged 1 commit into
moby:masterfrom
anshulpundir:worker_logs

Conversation

@anshulpundir
Copy link
Copy Markdown
Contributor

Added info logs to the agent to track the manager its connecting to, timeouts, heartbeat from the dispatcher.

Signed-off-by: Anshul Pundir anshul.pundir@docker.com

Comment thread agent/agent.go Outdated
if backoff > maxSessionFailureBackoff {
backoff = maxSessionFailureBackoff
}
log.G(ctx).WithError(err).Errorf("agent: session failed. Backoff period: %d", backoff)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a field here: log.G(ctx).WithError(err).WithField("backoff", backoff).Errorf("agent: session failed").

Note this is the maximum backoff range for a random backoff. Actual backoff is rand([0, backoff)). ;)

Comment thread manager/dispatcher/dispatcher.go Outdated
}

period, err := d.nodes.Heartbeat(nodeInfo.NodeID, r.SessionID)
log.G(ctx).WithField("dispatcher", "heartbeat").Infof("agent heartbeat period %v", period)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to use the module here, rather than a field.

@codecov
Copy link
Copy Markdown

codecov Bot commented Jan 17, 2018

Codecov Report

Merging #2486 into master will increase coverage by 0.19%.
The diff coverage is 72.41%.

@@            Coverage Diff             @@
##           master    #2486      +/-   ##
==========================================
+ Coverage   61.23%   61.42%   +0.19%     
==========================================
  Files          49      129      +80     
  Lines        6890    21313   +14423     
==========================================
+ Hits         4219    13092    +8873     
- Misses       2241     6812    +4571     
- Partials      430     1409     +979

Comment thread connectionbroker/broker.go Outdated
// connection.
func (b *Broker) SelectRemote(dialOpts ...grpc.DialOption) (*Conn, error) {
peer, err := b.remotes.Select()
log.G(context.Background()).Infof("Manager selected by agent for session: %v", peer)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: all messages start with lowercase.

@anshulpundir anshulpundir force-pushed the worker_logs branch 2 times, most recently from 5bf93bc to 8f1bb1b Compare January 17, 2018 22:17
@anshulpundir anshulpundir changed the title [agent] Debug logs [agent] debug logs for session, node events on dispatcher, heartbeats Jan 17, 2018
@anshulpundir anshulpundir force-pushed the worker_logs branch 3 times, most recently from e0e280f to f822e10 Compare January 23, 2018 00:58
@anshulpundir
Copy link
Copy Markdown
Contributor Author

ping @stevvooe @dperny for review

@nishanttotla
Copy link
Copy Markdown
Contributor

@anshulpundir CI is failing with a bunch of data races.

@anshulpundir
Copy link
Copy Markdown
Contributor Author

Will check it out tomorrow @nishanttotla

Comment thread agent/session.go Outdated
cancel()
if err != nil {
if grpc.Code(err) == codes.NotFound {
log.G(ctx).WithFields(fields).WithError(err).Errorf("heartbeat failed")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to add here again the manager details to make it easier to identify it?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do!

Comment thread manager/dispatcher/dispatcher.go Outdated

period, err := d.nodes.Heartbeat(nodeInfo.NodeID, r.SessionID)

log.G(ctx).WithField("method", "(*Dispatcher).Heartbeat").Infof("received heartbeat from worker %v, expect next heartbeat in %v", nodeInfo, period)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this one debug?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its every 5 seconds, I thought thats not too frequent for logging ?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's per node in the cluster, so can easily be hundred every 5 sec :D

Copy link
Copy Markdown
Contributor Author

@anshulpundir anshulpundir Jan 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh yes, I mistook this for the log on the agent.

@anshulpundir anshulpundir force-pushed the worker_logs branch 3 times, most recently from f8af5f4 to fc5a878 Compare January 23, 2018 23:39
Comment thread agent/session.go Outdated
cancel()
if err != nil {
if grpc.Code(err) == codes.NotFound {
log.G(ctx).WithFields(fields).WithError(err).Errorf("heartbeat to manager %v failed", s.conn.Peer())
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to move this one line above, so it's printed no matter from the grpc code?

Signed-off-by: Anshul Pundir <anshul.pundir@docker.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants