Skip to content

[Bug]: Prism early window pane is not processed immediately downstream #36128

@shunping

Description

@shunping

What happened?

Running the following code, we can see that the logging of the early pane and and on-time pane appear on the terminal at the same time.

The expected behavior should be for each window, the early pane is logged first and then after about 8 seconds (i.e. 10-2) the on-time pane is logged.

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms.window import FixedWindows
from apache_beam.transforms import trigger
from apache_beam.transforms.periodicsequence import PeriodicImpulse, RebaseMode
from apache_beam.utils.timestamp import Timestamp

# prism runner option
options = PipelineOptions([
    "--streaming",
    "--environment_type=LOOPBACK",
    "--runner=PrismRunner",
    "--experiments=prism_enable_rtc",
    "--prism_log_level=info",
     "--allow_unsafe_triggers",
])

with beam.Pipeline(options=options) as p:
  _ = (
      p | PeriodicImpulse(
          start_timestamp=Timestamp.now(),
          stop_timestamp=Timestamp.now() + 40,
          fire_interval=1,
          rebase=RebaseMode.REBASE_ALL)
      | 'window' >> beam.WindowInto(
          FixedWindows(10),
          trigger=trigger.AfterCount(2),
          accumulation_mode=trigger.AccumulationMode.DISCARDING)
      | beam.GroupBy()
      | beam.LogElements(
          with_timestamp=True, with_window=True, with_pane_info=True))

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions