Skip to content

Conversation

@jmrodri
Copy link
Member

@jmrodri jmrodri commented Mar 14, 2019

With this feature we can use --zap-level 10 and this will allow
log.V(10).Info("hello world") to be seen in the logs. Prior to this
change debug (log.V(1)) was the lowest value attainable.

We also disable sampling when using an integer less than -1
(i.e. beyond debug). The zap sampling code assumes the lowest number for
sampling is DebugLevel (-1) which isn't the case in this scenario.
Disabling sampling makes sense at this point because sampling is used
for better performing logs which isn't required for trace level type
logs.

We also handle negative values being passed in.

@openshift-ci-robot openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 14, 2019
@jmrodri
Copy link
Member Author

jmrodri commented Mar 14, 2019

I updated proxy.go to have 6 levels of log to test this PR out. Here is the output:

[jesusr@transam foo-operator{master}]$ operator-sdk up local --zap-level 10
INFO[0000] Running the operator locally.                
INFO[0000] Using namespace myproject.                   
{"level":"info","ts":1552588792.6807225,"logger":"cmd","msg":"Go Version: go1.11.5"}
{"level":"info","ts":1552588792.6807659,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1552588792.6807742,"logger":"cmd","msg":"Version of operator-sdk: v0.5.0+git"}
{"level":"info","ts":1552588792.680802,"logger":"cmd","msg":"Watching namespace.","Namespace":"myproject"}
{"level":"info","ts":1552588792.7752655,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1552588792.7754438,"logger":"leader","msg":"Skipping leader election; not running in a cluster."}
{"level":"info","ts":1552588792.7771642,"logger":"proxy","msg":"Starting to serve","Address":"127.0.0.1:8888"}
{"level":"debug","ts":1552588792.7773304,"logger":"proxy","msg":"Starting to serve level 1","Address":"127.0.0.1:8888"}
{"level":"Level(-2)","ts":1552588792.7773762,"logger":"proxy","msg":"Starting to serve level 2","Address":"127.0.0.1:8888"}
{"level":"Level(-3)","ts":1552588792.7774038,"logger":"proxy","msg":"Starting to serve level 3","Address":"127.0.0.1:8888"}
{"level":"Level(-4)","ts":1552588792.7774181,"logger":"proxy","msg":"Starting to serve level 4","Address":"127.0.0.1:8888"}
{"level":"Level(-5)","ts":1552588792.7774332,"logger":"proxy","msg":"Starting to serve level 5","Address":"127.0.0.1:8888"}
{"level":"Level(-6)","ts":1552588792.7774472,"logger":"proxy","msg":"Starting to serve level 6","Address":"127.0.0.1:8888"}
{"level":"error","ts":1552588792.7774608,"logger":"proxy","msg":"Starting to serve error level","Address":"127.0.0.1:8888","stacktrace":"github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\tsrc/github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible/proxy.Run.func2\n\tsrc/github.com/operator-framework/operator-sdk/pkg/ansible/proxy/proxy.go:403"}
{"level":"info","ts":1552588792.7777417,"logger":"manager","msg":"Using default value for workers 1"}
{"level":"info","ts":1552588792.777771,"logger":"ansible-controller","msg":"Watching resource","Options.Group":"cache.example.com","Options.Version":"v1alpha1","Options.Kind":"FooService"}
{"level":"info","ts":1552588792.7781951,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"fooservice-controller","source":"kind source: cache.example.com/v1alpha1, Kind=FooService"}
{"level":"info","ts":1552588792.8786395,"logger":"kubebuilder.controller","msg":"Starting Controller","controller":"fooservice-controller"}
{"level":"info","ts":1552588792.9789739,"logger":"kubebuilder.controller","msg":"Starting workers","controller":"fooservice-controller","worker count":1}

@jmrodri
Copy link
Member Author

jmrodri commented Mar 14, 2019

Using log level 3 notice that 4, 5, & 6 are no longer visible.

[jesusr@transam foo-operator{master}]$ operator-sdk up local --zap-level 3
INFO[0000] Running the operator locally.                
INFO[0000] Using namespace myproject.                   
{"level":"info","ts":1552588862.6171176,"logger":"cmd","msg":"Go Version: go1.11.5"}
{"level":"info","ts":1552588862.6171565,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1552588862.6171675,"logger":"cmd","msg":"Version of operator-sdk: v0.5.0+git"}
{"level":"info","ts":1552588862.6171973,"logger":"cmd","msg":"Watching namespace.","Namespace":"myproject"}
{"level":"info","ts":1552588862.710273,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1552588862.7104235,"logger":"leader","msg":"Skipping leader election; not running in a cluster."}
{"level":"info","ts":1552588862.712774,"logger":"proxy","msg":"Starting to serve","Address":"127.0.0.1:8888"}
{"level":"debug","ts":1552588862.7128866,"logger":"proxy","msg":"Starting to serve level 1","Address":"127.0.0.1:8888"}
{"level":"Level(-2)","ts":1552588862.7129002,"logger":"proxy","msg":"Starting to serve level 2","Address":"127.0.0.1:8888"}
{"level":"Level(-3)","ts":1552588862.7129188,"logger":"proxy","msg":"Starting to serve level 3","Address":"127.0.0.1:8888"}
{"level":"error","ts":1552588862.7129462,"logger":"proxy","msg":"Starting to serve error level","Address":"127.0.0.1:8888","stacktrace":"github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\tsrc/github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible/proxy.Run.func2\n\tsrc/github.com/operator-framework/operator-sdk/pkg/ansible/proxy/proxy.go:403"}
{"level":"info","ts":1552588862.7132838,"logger":"manager","msg":"Using default value for workers 1"}
{"level":"info","ts":1552588862.7133071,"logger":"ansible-controller","msg":"Watching resource","Options.Group":"cache.example.com","Options.Version":"v1alpha1","Options.Kind":"FooService"}
{"level":"info","ts":1552588862.7137141,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"fooservice-controller","source":"kind source: cache.example.com/v1alpha1, Kind=FooService"}
{"level":"info","ts":1552588862.8141239,"logger":"kubebuilder.controller","msg":"Starting Controller","controller":"fooservice-controller"}
{"level":"info","ts":1552588862.9144537,"logger":"kubebuilder.controller","msg":"Starting workers","controller":"fooservice-controller","worker count":1}

@jmrodri jmrodri requested a review from shawn-hurley March 14, 2019 18:49
@shawn-hurley shawn-hurley added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 14, 2019
@shawn-hurley shawn-hurley added the language/ansible Issue is related to an Ansible operator project label Mar 14, 2019
@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 14, 2019
Copy link
Member

@shawn-hurley shawn-hurley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

// cause index out of bounds errors in the sampling code.
if levelVal.level < -1 {
sampleVal.set = false
c.sample = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we think it's okay to silently disable sampling when --zap-sample=true?

Instead, we could log.Fatal() or change the API to allow us to return an error. Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I do.

We should document that this is what we do, but TBH how many users are purposefully turning on sampling (which is for log performance in production) AND setting the log level to trace level logs. I would assume this is a mistake and we should probably just make the right decision.

Copy link
Member

@joelanford joelanford Mar 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this is more of a nit, but I'm a little nervous about this check happening in the if levelVal.set { ... } block. I know we have unit tests now, but if we ever change our default log level to debug, this check would get skipped.

It would probably be better to use levelVal.level to set the default in line 69 and then do another check in the if sampleVal.set { ... } block to make sure we [EDIT: don't] turn sampling back on when debug logging is enabled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmrodri I think we should use c.level to determine this on line 89.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the if block to just before we return the config.

func getConfig() config {                                                             
    var c config                                                                      
                                                                                      
    // Set the defaults depending on the log mode (development vs. production)        
    if development {                                                                  
        c.encoder = consoleEncoder()                                                  
        c.level = zap.NewAtomicLevelAt(zap.DebugLevel)                                
        c.opts = append(c.opts, zap.Development(), zap.AddStacktrace(zap.ErrorLevel)) 
        c.sample = false                                                              
    } else {                                                                          
        c.encoder = jsonEncoder()                                                     
        c.level = zap.NewAtomicLevelAt(zap.InfoLevel)                                 
        c.opts = append(c.opts, zap.AddStacktrace(zap.WarnLevel))                     
        c.sample = true                                                               
    }                                                                                 
                                                                                      
    // Override the defaults if the flags were set explicitly on the command line     
    if encoderVal.set {                                                               
        c.encoder = encoderVal.encoder                                                
    }                                                                                 
    if levelVal.set {                                                                 
        c.level = zap.NewAtomicLevelAt(levelVal.level)                                
                                                                                      
    }                                                                                 
    if sampleVal.set {                                                                
        c.sample = sampleVal.sample                                                   
    }                                                                                 
                                                                                      
    // Disable sampling when we are in debug mode. Otherwise, this will               
    // cause index out of bounds errors in the sampling code.                         
    if c.level.Level() < -1 {                                                         
        c.sample = false                                                              
    }                                                                                 
    return c                                                                          
}                                                                                     

@jmrodri
Copy link
Member Author

jmrodri commented Mar 15, 2019

I now create a logger in the test to ensure it doesn't cause a panic. With the unaltered code:

    ...
     // Disable sampling when we are in debug mode. Otherwise, this will 
     // cause index out of bounds errors in the sampling code.           
     if c.level.Level() < -1 {                                           
         c.sample = false                                                
     }                                                                   
     return c                                                            
 }                                                                       

When I run the test, the log works just fine.

[jesusr@transam zap{log-levels}]$ go test
{"level":"Level(-10)","ts":1552667568.3941715,"msg":"this should not panic"}                                        
{"level":"Level(-10)","ts":1552667568.3942807,"msg":"this should not panic"}                                        
PASS
ok      github.com/operator-framework/operator-sdk/pkg/log/zap  0.009s  

Now the part I really liked which is why @joelanford suggested it, if I comment out the code that disables sampling, we detect the panic.

...
    // Disable sampling when we are in debug mode. Otherwise, this will 
    // cause index out of bounds errors in the sampling code.           
    // if c.level.Level() < -1 {                                        
    //     c.sample = false                                             
    // }                                                                
    return c                                                            
}      

Using the above, we can see the test will panic indicating that clearly we broke something :)

        [jesusr@transam zap{log-levels}]$ go test
panic: runtime error: index out of range [recovered]
        panic: runtime error: index out of range

goroutine 42 [running]:
testing.tRunner.func1(0xc000115900)
        /usr/lib/golang/src/testing/testing.go:792 +0x387
panic(0x8ba720, 0xd49470)
        /usr/lib/golang/src/runtime/panic.go:513 +0x1b9
github.com/operator-framework/operator-sdk/vendor/go.uber.org/zap/zapcore.(*counters).get(0xc000306000, 0x95bcf6, 0x9
3d902, 0x15, 0xc000115400)
        /home/devel/jesusr/dev/src/github.com/operator-framework/operator-sdk/vendor/go.uber.org/zap/zapcore/sampler.
go:48 +0x79
github.com/operator-framework/operator-sdk/vendor/go.uber.org/zap/zapcore.(*sampler).Check(0xc000181e60, 0xf6, 0xbf1b
12d51e5e730a, 0x62477e, 0xd57520, 0x0, 0x0, 0x93d902, 0x15, 0x0, ...)                                               
        /home/devel/jesusr/dev/src/github.com/operator-framework/operator-sdk/vendor/go.uber.org/zap/zapcore/sampler.
go:128 +0x9d
github.com/operator-framework/operator-sdk/vendor/go.uber.org/zap.(*Logger).check(0xc0000d31a0, 0xf6, 0x93d902, 0x15,
 0xc0000cd89c)
        /home/devel/jesusr/dev/src/github.com/operator-framework/operator-sdk/vendor/go.uber.org/zap/logger.go:269 +0
x137
github.com/operator-framework/operator-sdk/vendor/go.uber.org/zap.(*Logger).Check(0xc0000d31a0, 0xf6, 0x93d902, 0x15,
 0xc00019e420)
        /home/devel/jesusr/dev/src/github.com/operator-framework/operator-sdk/vendor/go.uber.org/zap/logger.go:172 +0
x48
github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr.(*infoLogger).Info(0xc00019e420, 0x93d902,
0x15, 0x0, 0x0, 0x0)
        /home/devel/jesusr/dev/src/github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr/zapr.go:
69 +0x4e
github.com/operator-framework/operator-sdk/pkg/log/zap.TestGetConfig.func1(0xc000115900)                            
        /home/devel/jesusr/dev/src/github.com/operator-framework/operator-sdk/pkg/log/zap/logger_test.go:183 +0x4d5 
testing.tRunner(0xc000115900, 0xc00019e360)
        /usr/lib/golang/src/testing/testing.go:827 +0xbf
created by testing.(*T).Run
        /usr/lib/golang/src/testing/testing.go:878 +0x35c
exit status 2
FAIL    github.com/operator-framework/operator-sdk/pkg/log/zap  0.013s   

Thanks for the suggestion @joelanford

@jmrodri jmrodri force-pushed the log-levels branch 2 times, most recently from 9056828 to e55dad6 Compare March 15, 2019 17:59
Copy link
Member

@joelanford joelanford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

jmrodri added 4 commits March 15, 2019 14:51
With this feature we can use --zap-level 10 and this will allow
log.V(10).Info("hello world") to be seen in the logs. Prior to this
change debug (log.V(1)) was the lowest value attainable.

We also disable sampling when using an integer less than -1
(i.e. beyond debug). The zap sampling code assumes the lowest number for
sampling is DebugLevel (-1) which isn't the case in this scenario.
Disabling sampling makes sense at this point because sampling is used
for better performing logs which isn't required for trace level type
logs.

We also handle negative values being passed in.
* add logger test
* do not allow negative log levels at the CLI
* disable sampling later in the call based on review
* add note about disabling sampling to help
* update sampling doc, fix log level help
@jmrodri jmrodri merged commit f5d68db into operator-framework:master Mar 15, 2019
@jmrodri jmrodri deleted the log-levels branch March 20, 2019 14:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/feature Categorizes issue or PR as related to a new feature. language/ansible Issue is related to an Ansible operator project size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants