Key-based batching and hashingScheme parameter for Python client #83

@matejsaravanja

Short intro
We're running a platform that is effectively a set of microservices written in Python, Java and Go, which communicate through Kafka. We're currently considering moving from Kafka to Pulsar.
However, one of the main features we need is key-based message routing for solving concurrency problems. Pulsar has that option with KeyShared subscription mode.

The problem
In Python, a KeyShared subscription works only if the producer disables batching. I ran some tests in my local environment, and disabling batching causes a large drop in throughput (~25k/s with batching vs ~9k/s without).

Features that would solve the problem

  • Key-based batching for Python client
  • A HashingScheme parameter when creating a producer, so that all of our services (Java, Python and Go) can use the same hashing scheme
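
To make the first request concrete, here is a minimal pure-Python sketch of what "key-based batching" means: messages are buffered per key, so every batch contains a single key and can be routed as a unit to one KeyShared consumer. This is only an illustration of the idea, not the client's implementation; `KeyBasedBatcher`, `add`, `flush` and `max_batch_size` are hypothetical names that do not exist in the Pulsar Python client.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A batch is (partition key, list of payloads that share that key).
Batch = Tuple[str, List[bytes]]


class KeyBasedBatcher:
    """Hypothetical illustration: each emitted batch holds exactly one key."""

    def __init__(self, max_batch_size: int = 3) -> None:
        self.max_batch_size = max_batch_size
        # One pending buffer per key instead of one shared buffer.
        self._pending: Dict[str, List[bytes]] = defaultdict(list)

    def add(self, key: str, payload: bytes) -> List[Batch]:
        """Buffer a message; return the key's batch if it just became full."""
        bucket = self._pending[key]
        bucket.append(payload)
        if len(bucket) >= self.max_batch_size:
            del self._pending[key]
            return [(key, bucket)]
        return []

    def flush(self) -> List[Batch]:
        """Emit every partially filled batch, still one key per batch."""
        batches = [(key, msgs) for key, msgs in self._pending.items() if msgs]
        self._pending.clear()
        return batches
```

With a single shared buffer, one batch could mix several keys, and a batch cannot be split across consumers, which is presumably why batching has to be disabled for KeyShared today. Grouping by key keeps the throughput benefit of batching while preserving per-key routing.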
