[py] Log complete python errors + tracebacks when check import fails #259
In some cases the Python module of a check can't be imported because of an error in its Python code; in that case, logging the traceback is useful. Use a common function that retrieves and formats the Python interpreter error and clears the error flag in the interpreter. Any Python C API call that returns an error should be followed by a call to that common function to get a formatted string of the error.
// getPythonError returns string-formatted info about a Python interpreter error that occurred,
// and clears the error flag in the Python interpreter
func getPythonError() (string, error) {
	gstate := NewStickyLock()
I can't remember off the top of my head if this was safe. As the reviewer I should make sure, but until I do, this could be a deadlock situation here.
This is fine on the Python side, but there could be collateral damage. The `gstate` (`python.PyGILState`) can be acquired with `python.PyGILState_Ensure()` multiple times, as per the Python docs (https://docs.python.org/2/c-api/init.html#c.PyGILState_Ensure). However, as we just discussed offline, there's an underlying problem:
When we make nested calls to StickyLock(), we "lock" the goroutine to a thread, and doing this twice should be no problem. But when the nested function unlocks, it calls runtime.UnlockOSThread(); at that point the goroutine is no longer pinned to the current thread and could be moved to a different one (with the known ugly side effects).
The StickyLock is not meant to be called more than once so it shouldn't be nested.
truthbk
left a comment
Let's remove the nested lock for now, until we find an elegant way of dealing with this. We should also create an issue to look into this carefully.
Using a nested lock unlocks the goroutine from the thread when the child lock is unlocked (whereas we'd want that to happen when the parent lock is unlocked). Until we can figure out a good way to deal with this situation, don't use nested locks, and add a warning to `getPythonError`.
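The nested-lock problem described above can be illustrated with a small model. This is only a sketch: `threadPin`, `stickyLock`, `newStickyLock`, and `pinnedAfterChildUnlock` are hypothetical names, and the boolean flag stands in for the non-counting thread lock the reviewers describe rather than the real Go runtime or the real `StickyLock`. Whether an actual runtime behaves this way depends on its version.

```go
package main

import "fmt"

// threadPin models a NON-counting "goroutine is locked to its OS thread" flag.
// Hypothetical stand-in for the behavior discussed in the review, not the Go runtime.
type threadPin struct{ pinned bool }

func (p *threadPin) lock()   { p.pinned = true }
func (p *threadPin) unlock() { p.pinned = false }

// stickyLock is a hypothetical, simplified version of the lock used in the PR:
// creating it pins the goroutine, unlocking it releases the pin.
type stickyLock struct{ pin *threadPin }

func newStickyLock(p *threadPin) *stickyLock {
	p.lock()
	return &stickyLock{pin: p}
}

func (s *stickyLock) unlock() { s.pin.unlock() }

// pinnedAfterChildUnlock takes a parent lock, then a nested child lock,
// releases only the child, and reports whether the goroutine is still pinned
// while the parent believes it holds the lock.
func pinnedAfterChildUnlock() bool {
	pin := &threadPin{}
	parent := newStickyLock(pin)
	child := newStickyLock(pin)

	child.unlock()
	stillPinned := pin.pinned // the parent's critical section would run here

	parent.unlock()
	return stillPinned
}

func main() {
	fmt.Println("pinned while parent still holds the lock:", pinnedAfterChildUnlock())
}
```

In this model the child's unlock clears the pin while the parent is still inside its critical section, which is why the review asks for the nested lock to be removed.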
Thanks @truthbk for pointing this out, I've updated the code to not use nested locks.
What does this PR do?
In some cases the Python module of a check can't be imported because
of an error in its Python code; in that case, logging the traceback is useful. This PR
does just that.
Motivation
Without this change it's impossible to debug an issue with the import of a Python
check module based on the logs alone.
Additional Notes
Use a common function that retrieves and formats the Python
interpreter error and clears the error flag in the interpreter.
Any Python C API call that returns an error should be followed by
a call to that common function to get a formatted string of the error.
The signature of that function could be changed at some point to return
something more meaningful than just a string, but I didn't feel the need for that
right now.
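The "retrieve, format, clear" pattern from the notes above can be sketched without a real interpreter. Everything here is an assumption made for illustration: `fakeInterp` is a hypothetical stand-in for the interpreter's error indicator, and this `getPythonError` takes it as a parameter, whereas the function in the PR takes no arguments and talks to the Python C API.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// fakeInterp is a hypothetical stand-in for the interpreter's pending-error state.
type fakeInterp struct {
	errSet    bool
	errType   string
	errValue  string
	traceback []string
}

// getPythonError formats the pending error (type, value, traceback frames)
// and clears the error flag. It returns an error if no error is pending.
// Illustrative only: not the implementation from the PR.
func getPythonError(in *fakeInterp) (string, error) {
	if !in.errSet {
		return "", errors.New("no error is set in the interpreter")
	}

	var b strings.Builder
	fmt.Fprintf(&b, "%s: %s", in.errType, in.errValue)
	for _, frame := range in.traceback {
		b.WriteString("\n  " + frame)
	}

	// Clear the error flag so later calls don't see a stale error.
	*in = fakeInterp{}
	return b.String(), nil
}

func main() {
	in := &fakeInterp{
		errSet:    true,
		errType:   "ImportError",
		errValue:  "No module named foo",
		traceback: []string{`File "mycheck.py", line 3, in <module>`},
	}

	// A failed import would be followed by a call like this one.
	if msg, err := getPythonError(in); err == nil {
		fmt.Println("check import failed:", msg)
	}

	// The flag has been cleared, so a second call reports that nothing is pending.
	_, err := getPythonError(in)
	fmt.Println("second call:", err)
}
```

Returning a plain string keeps call sites simple, which matches the note that a richer return type was not needed yet.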