[Bug]: apache beam python SDK hangs and crashes with segmentation fault errors with orjson 3.9.4 #28318

@dankuchler

What happened?

A bug introduced in the orjson dependency (ijl/orjson#415) might cause Beam Python pipelines to crash with a segmentation fault or get stuck. Beam uses orjson in BigQuery IO, so users of this IO might be affected.

Mitigation

Until Beam 2.51.0 is released, consider a workaround. The original report below found that reverting orjson from 3.9.4 to 3.9.2 appeared to resolve the problem.

Original report

In the latest deployment of our Apache Beam pipeline, orjson (a dependency of the Apache Beam Python SDK) was upgraded from 3.9.2 to 3.9.4.

The Apache Beam SDK declares a dependency on orjson < 4.0 here:

https://github.com/apache/beam/blob/master/sdks/python/setup.py#L233

With this upgrade of orjson from 3.9.2 to 3.9.4, we periodically see the Apache Beam SDK hang or the workers crash with segmentation faults, which we believe are related to this issue in the orjson project:

ijl/orjson#415

When reverting from orjson 3.9.4 to 3.9.2 it seems that the issues are resolved.

The Apache Beam Python SDK may want to limit orjson to 3.9.2 or below until orjson issue 415 is resolved.
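
For anyone who wants to check whether an environment picked up a newer orjson than the one that worked for us, here is a minimal sketch. It assumes that 3.9.2 is the last known-good version, which is only what this report observed and not a confirmed list of affected releases. The helper names (`parse_version`, `newer_than_known_good`, `installed_orjson_version`) are made up for illustration and are not part of Beam or orjson.

```python
from importlib import metadata

# Assumption: 3.9.2 is the last version this report observed working.
KNOWN_GOOD_MAX = (3, 9, 2)


def parse_version(text):
    """Turn '3.9.4' into (3, 9, 4); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def newer_than_known_good(version_text):
    """True if the version is newer than the last known-good release."""
    return parse_version(version_text) > KNOWN_GOOD_MAX


def installed_orjson_version():
    """Return the installed orjson version string, or None if absent."""
    try:
        return metadata.version("orjson")
    except metadata.PackageNotFoundError:
        return None
```

Calling `newer_than_known_good(installed_orjson_version())` on a worker (after checking the version is not `None`) would flag an environment that drifted past 3.9.2. Pinning with a requirement specifier such as `orjson<=3.9.2` is one way to keep it from drifting.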

Issue Priority

Priority: 2

Issue Components

  • Component: Python SDK

Metadata

Assignees

No one assigned

Labels

P2, bug, done & done (issue has been reviewed after it was closed for verification, followups, etc.), python
