[prism] PrismRunnerTest::test_windowing - session windowing failing. #32085

@lostluck

Description

PrismRunnerTest::test_windowing is failing with

apache_beam.testing.util.BeamAssertException: Failed assert: [('k', [1, 2]), ('k', [100, 101, 102])] == [('k', [1, 2, 100, 101, 102])], unexpected elements [('k', [1, 2, 100, 101, 102])], missing elements [('k', [1, 2]), ('k', [100, 101, 102])] [while running 'assert_that/Match']

This happens because test_windowing validates session windows.

Examining this from prism's side, the issue is twofold: (1) the session merging logic is wrong, which leads to duplicated data, and (2) Python isn't encoding the timestamps properly.

For 1, the fix is to actually delete the old data references from the window map after extracting them, and to ensure the final merged data is put back into the map afterwards.
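To illustrate the shape of that fix, here is a minimal Python sketch of in-place session merging. It is not prism's code (prism is written in Go); the function name `merge_sessions_in_place`, the `(start, end)` tuple representation of half-open windows, and the dict-based window map are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: a half-open interval [start, end).
Window = Tuple[int, int]


def merge_sessions_in_place(window_map: Dict[Window, List[int]]) -> None:
    """Merge overlapping session windows inside window_map.

    Every source window is removed from the map as its data is extracted,
    and each merged window is written back with the combined data, so no
    element stays reachable through a stale window key.
    """
    current: Optional[Window] = None
    merged_data: List[int] = []
    # sorted() returns a snapshot list, so mutating the dict below is safe.
    for win in sorted(window_map):
        data = window_map.pop(win)  # extract AND delete the old reference
        if current is not None and win[0] < current[1]:
            # Overlaps the session being built: extend it.
            current = (current[0], max(current[1], win[1]))
            merged_data.extend(data)
        else:
            # Gap found: store the finished session, start a new one.
            if current is not None:
                window_map[current] = merged_data
            current, merged_data = win, list(data)
    if current is not None:
        window_map[current] = merged_data  # put the final session back


# Each element at timestamp t gets a proto-window [t, t + 10),
# assuming a session gap of 10.
window_map = {(t, t + 10): [t] for t in [1, 2, 100, 101, 102]}
merge_sessions_in_place(window_map)
print(window_map)
```

Skipping the `pop` would leave the original per-element windows in the map next to the merged ones, which is one way the duplicated data described above could arise.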

For 2, I can override the existing test just for prism for now while we figure out the type problems. The other runner suites don't seem to override it, though. This might be a Python version issue I'm not familiar with.

I can also then enhance the test so there's a "middle" grouping, which will be a better test of the merging logic anyway.
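As a rough sketch of what the enhanced expectation might look like, the helper below computes session groupings for an input that includes a middle group. The extra values `50` and `51`, the session gap of 10, and the helper name `expected_sessions` are assumptions for illustration, not taken from the actual test.

```python
from typing import List


def expected_sessions(timestamps: List[int], gap: int) -> List[List[int]]:
    """Group timestamps into sessions.

    An element at time t occupies [t, t + gap), so it joins the previous
    session only if it arrives less than `gap` after that session's
    latest element.
    """
    groups: List[List[int]] = []
    for t in sorted(timestamps):
        if groups and t - groups[-1][-1] < gap:
            groups[-1].append(t)
        else:
            groups.append([t])
    return groups


# Hypothetical enhanced input: 50 and 51 form the "middle" session.
values = [1, 2, 50, 51, 100, 101, 102]
expected = [("k", group) for group in expected_sessions(values, gap=10)]
print(expected)
```

With three sessions, a merge that mishandles the map between the first and last group has more chances to show up than with only two.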

It's not clear whether this would succeed in an unbounded context rather than a batch context, though. Sessions are similar to triggers in that respect.
