flaky tests when importing module  #3262

@shaoner

Bug Description

In some scenarios, when running cargo test, objects from another thread seem to get popped from the release pool.

Here is an example:

use pyo3::{
    prelude::{pyclass, pymethods, *},
    types::{PyDict, PyModule},
};

#[cfg(test)]
mod tests {
    use super::*;

    #[pyclass]
    struct MyClass {
        pub num: u8,
    }

    #[pymethods]
    impl MyClass {
        fn get_num(&self) -> u8 {
            self.num
        }
    }

    fn run(num: u8, timeout: u8) -> PyResult<()> {
        pyo3::Python::with_gil(|py| {
            let api = PyModule::new(py, "api").unwrap();
            let obj = PyCell::new(py, MyClass { num }).unwrap();
            api.add("obj", obj).unwrap();

            let top_level = PyDict::new(py);
            top_level.set_item("api", api).unwrap();
            py.run(
                "import sys; sys.modules['api'] = api",
                Some(top_level),
                None,
            )
            .unwrap();
            let code = format!(
                r#"
import api
import time

time.sleep({timeout})
assert api.obj.get_num() == {num}
"#
            );
            py.run(&code, None, None)
        })
    }

    #[test]
    fn test_race1() -> PyResult<()> {
        run(1, 0)
    }

    #[test]
    fn test_race2() -> PyResult<()> {
        run(2, 1)
    }
}
  • In test_race1, we register a new module api that holds an object whose get_num method returns 1.
  • In test_race2, we register a new module api that holds an object whose get_num method returns 2.

When the embedded Python code above runs, we don't always get the expected object, and the assertion sometimes fails.
In test_race2, api.obj sometimes refers to the object created in test_race1 and returns 1 instead of 2.

You'll notice a timeout (the time.sleep call). It makes the bug easier to trigger by letting one test finish before the other.
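The interleaving described above can be replayed without Python at all. The sketch below is only an analogy and not a confirmed diagnosis: it assumes that the two tests run on separate threads of one process (cargo test's default) and that they share interpreter-wide state such as sys.modules. API_SLOT, register, lookup and replay_interleaving are hypothetical names that do not exist in PyO3.

```rust
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for the interpreter-wide `sys.modules["api"]` entry.
static API_SLOT: Mutex<Option<u8>> = Mutex::new(None);

/// Mimics `sys.modules['api'] = api`: overwrites whatever was registered before.
fn register(num: u8) {
    *API_SLOT.lock().unwrap() = Some(num);
}

/// Mimics `api.obj.get_num()` being resolved through the shared slot.
fn lookup() -> u8 {
    let slot = *API_SLOT.lock().unwrap();
    slot.expect("nothing registered")
}

/// Replays the order of events from the report: the slow "test" registers 2,
/// the fast "test" registers 1 and finishes, then the slow one reads back.
fn replay_interleaving() -> u8 {
    // test_race2 installs its module first.
    register(2);

    // test_race1 runs to completion on another thread while test_race2 "sleeps".
    thread::spawn(|| {
        register(1);
        assert_eq!(lookup(), 1);
    })
    .join()
    .unwrap();

    // test_race2 resumes and reads the shared slot.
    lookup()
}
```

Calling replay_interleaving() returns 1 rather than 2, which matches the symptom in test_race2. Whether the same mechanism explains the PyO3 behaviour is left open here.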

I tried to find similar issues. It seems that some tests previously relied on a mutex (see dfbe22b), and that this was later fixed.
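For illustration, a test-level mutex of the kind mentioned above could look roughly like the following. This is a std-only sketch and is not taken from commit dfbe22b: TEST_LOCK, API_SLOT, serialized, run and run_both are hypothetical names, and API_SLOT merely stands in for whatever state the two tests share.

```rust
use std::sync::{Mutex, MutexGuard};
use std::thread;
use std::time::Duration;

/// Hypothetical lock that every test touching shared state takes first.
static TEST_LOCK: Mutex<()> = Mutex::new(());

/// Hypothetical stand-in for the state shared between the two tests.
static API_SLOT: Mutex<Option<u8>> = Mutex::new(None);

/// Acquires the test lock. A panicking test poisons the mutex, so recover
/// the guard instead of failing every later test as well.
fn serialized() -> MutexGuard<'static, ()> {
    TEST_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Mirrors the shape of `run(num, timeout)`: write, wait, read back.
fn run(num: u8, sleep_ms: u64) -> u8 {
    // Held until the end of the function, so the other "test" cannot interleave.
    let _guard = serialized();
    *API_SLOT.lock().unwrap() = Some(num);
    thread::sleep(Duration::from_millis(sleep_ms));
    let seen = *API_SLOT.lock().unwrap();
    seen.expect("nothing registered")
}

/// Runs both "tests" on separate threads, as cargo test would by default.
fn run_both() -> (u8, u8) {
    let slow = thread::spawn(|| run(2, 50));
    let fast = thread::spawn(|| run(1, 0));
    (fast.join().unwrap(), slow.join().unwrap())
}
```

With the lock in place, run_both() yields (1, 2) whichever thread starts first. The cost is that the affected tests no longer run in parallel.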

Finally, changing the locals and globals passed to py.run seems to solve this, but it triggers other issues (such as missing classes). I left those out to keep this example simple.

Steps to Reproduce

Run the example from the description in a loop until it fails (in my case it takes at most 2 runs):

while true; do cargo test || break; done

Backtrace

No response

Your operating system and version

linux

Your Python version (python --version)

python 3.11.3

Your Rust version (rustc --version)

rustc 1.68.2

Your PyO3 version

0.19

How did you install python? Did you use a virtualenv?

pyenv

Additional Info

No response
