Skip to content

[py] Improve handling of py interpreter errors#290

Merged
olivielpeau merged 5 commits intomasterfrom
olivielpeau/fix-py-error
Jun 14, 2017
Merged

[py] Improve handling of py interpreter errors#290
olivielpeau merged 5 commits intomasterfrom
olivielpeau/fix-py-error

Conversation

@olivielpeau
Copy link
Copy Markdown
Member

What does this PR do?

2 things:

  1. fix a regression introduced in [py/loader] Unlock stickyLock before calling check.Configure #266 by using the same stickyLock when calling the interpreter and fetching the error (see commit description of 77a91da)
  2. Various minor improvements of getPythonError() (see commit description of 392287c)

Motivation

Specifically, changing locks between a python call that sets the error indicator
on the interpreter and a fetch of that error can result in the
error indicator not being there anymore when the fetch is attempted. When that happens
we "lose" the error and have no idea what the py interpreter error was.

Generally, this change makes the py loader and runner more reliable, which is pretty
critical when troubleshooting issues on the agent.

Additional Notes

Generally, I'm a bit concerned about how the goroutine thread scheduling made by the go runtime can affect the python interpreter state. Touching anything in the py package is pretty error-prone... Having "system" tests on the bindings will definitely help. We could also add a big warning in the README of the py package :)

@masci
Copy link
Copy Markdown
Contributor

masci commented Jun 12, 2017

Generally, I'm a bit concerned about how the goroutine thread scheduling made by the go runtime can affect the python interpreter state.

In this specific case we could investigate who did reset the error state but not sure it's worth it, the point is by the moment we release the lock we should assume anything can happen.

Touching anything in the py package is pretty error-prone... Having "system" tests on the bindings will definitely help.

It will, but working with the embedded interpreter in an async fashion like we do will be always error prone since bugs are only exposed under particular circumstances.

We could also add a big warning in the README of the py package :)

there's a disclaimer already but feel free to make it more dramatic 😄

If it helps to avoid the confusion with this specific issue, we could make getPythonError a method of the StickyLock struct, stressing the fact that if you release the lock you might lose the error.

@olivielpeau
Copy link
Copy Markdown
Member Author

Thanks @masci for the review, I made getPythonError a method of stickyLock as you suggested, it should help making less mistakes when using it. Let me know what you think of the new changes! :)

Comment thread pkg/collector/py/utils.go Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it's worth it but we could make this thread safe with an atomic write.
UnlockOSThread is a noop if called again, and PyGILState_Release is safe to call passing an UNLOCKED state, see https://github.com/python/cpython/blob/master/Python/pystate.c#L951, if we add an atomic write we can safely state unlock is thread safe and operates as a noop when called on an unlocked stickylock

@masci
Copy link
Copy Markdown
Contributor

masci commented Jun 14, 2017

One small comment but I confirm my 👍

Changing locks between a python call that sets the error indicator
on the interpreter and a fetch of that error can result in the
error indicator not being there anymore when the fetch is attempted.

I'm not sure exactly why this happens. The fact that it happens
"randomly" makes me think that it's related to how the goroutines are
scheduled on OS threads by the go runtime, and to how this can affect
the state of the python interpreter.
* Use `getPythonError` in py's `check.Run` to fetch errors from the
python interpreter. We should use it wherever possible.
* Improvements on `getPythonError`:
  * check that an error occurred before trying to fetch it (improves
  quality of error that's returned by the function)
  * normalize the python error (recommended by the python C API docs)
  * use the string representation of `pvalue` (improves error message)
`getPythonError` is now a method of the `stickyLock` struct for clarity
@olivielpeau olivielpeau force-pushed the olivielpeau/fix-py-error branch from b4a5f0c to 96778b0 Compare June 14, 2017 17:34
@masci
Copy link
Copy Markdown
Contributor

masci commented Jun 14, 2017

:shipit:

@olivielpeau olivielpeau merged commit 29cf9fc into master Jun 14, 2017
@olivielpeau olivielpeau deleted the olivielpeau/fix-py-error branch June 14, 2017 19:43
s-alad pushed a commit that referenced this pull request Jan 6, 2026
Bumps [github.com/aws/aws-sdk-go-v2/service/secretsmanager](https://github.com/aws/aws-sdk-go-v2) from 1.40.3 to 1.40.5.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/main/changelog-template.json)
- [Commits](aws/aws-sdk-go-v2@service/efs/v1.40.3...service/sfn/v1.40.5)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/service/secretsmanager
  dependency-version: 1.40.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants