Skip to content

KAFKA-4432: Added support to supply custom message payloads to perf-producer script.#2158

Closed
SandeshKarkera wants to merge 4 commits intoapache:trunkfrom
SandeshKarkera:PerfProducerChanges
Closed

KAFKA-4432: Added support to supply custom message payloads to perf-producer script.#2158
SandeshKarkera wants to merge 4 commits intoapache:trunkfrom
SandeshKarkera:PerfProducerChanges

Conversation

@SandeshKarkera
Copy link
Copy Markdown
Contributor

Current implementation of ProducerPerformance creates static payload. This is not very useful in testing compression or when you want to test with production/custom payloads. So, we decided to add support for providing payload file as an input to producer perf test script.

We made the following changes:

  1. Added support to provide a payload file which can have the list of payloads that you actually want to send.
  2. Moved payload generation inside the send loop for cases when payload file is provided.

Following are the changes to how the producer-performance is evoked:

  1. You must provide "--record-size" or "--payload-file" but not both. This is because, record size cannot be guaranteed when you are using custom events.
    e.g. ./kafka-producer-perf-test.sh --topic test_topic --num-records 100000 --producer-props bootstrap.servers=127.0.0.1:9092 acks=0 buffer.memory=33554432 compression.type=gzip batch.size=10240 linger.ms=10 --throughput -1 --payload-file ./test_payloads --payload-delimiter ,
  2. Earlier "--record-size" was a required config, now you must provide exactly one of "--record-size" or "--payload-file". Providing both will result in an error.
  3. Support for an additional parameter "--payload-delimiter" has been added which defaults to "\n"

@SandeshKarkera
Copy link
Copy Markdown
Contributor Author

@ijuma can you please review?

@SandeshKarkera
Copy link
Copy Markdown
Contributor Author

@guozhangwang @dguy @mjsax @norwood @enothereska can you please take a look?

throw new ArgumentParserException("Either --producer-props or --producer.config must be specified.", parser);
}

if ((recordSize == null && payloadFilePath == null) || (recordSize != null && payloadFilePath != null)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can validate this in the ArgumentParser by using parser.addMutuallyExclusiveGroup() and adding --record-size and --payload-file to the returned group and making the group .required()

@SandeshKarkera
Copy link
Copy Markdown
Contributor Author

@norwood Thanks for the review. I've removed the validation for record-size and payload-file and added them to a MutuallyExclusiveGroup as suggested.

@SandeshKarkera
Copy link
Copy Markdown
Contributor Author

SandeshKarkera commented Nov 29, 2016

Jenkins build job is failing due to time out issue.

@SandeshKarkera
Copy link
Copy Markdown
Contributor Author

@norwood, can you please review and let me know if this can be merged?

throw new IllegalArgumentException("File does not exist or empty file provided.");
}

String[] payloadList = new String(Files.readAllBytes(path), "UTF-8").split(payloadDelimiter);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems like it could be dangerous depending on how large your input file is. it would probably be best to use BufferedReader, but if you have to support custom delimiters this could get complicated.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@norwood We have to store all the messages in memory, otherwise, picking a new random message to send will be slow and complicated.
Given that all messages are stored in memory, I think its OK to use Files.readAllBytes()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@norwood Do let me know your thoughts on the point mentioned above.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry. yeah, this makes sense. thx.

you will still need to get a commiter to 👍 this, but lgtm

@SandeshKarkera
Copy link
Copy Markdown
Contributor Author

@junrao @ijuma @guozhangwang Can you please review and merge?

@asfbot
Copy link
Copy Markdown

asfbot commented Dec 13, 2016

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk8-scala2.12/92/
Test PASSed (JDK 8 and Scala 2.12).

@asfbot
Copy link
Copy Markdown

asfbot commented Dec 13, 2016

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/93/
Test PASSed (JDK 8 and Scala 2.11).

@asfbot
Copy link
Copy Markdown

asfbot commented Dec 13, 2016

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk7-scala2.10/91/
Test PASSed (JDK 7 and Scala 2.10).

@SandeshKarkera
Copy link
Copy Markdown
Contributor Author

@junrao @jkreps @guozhangwang @ijuma @mumrah @sriramsub @d3fmacro : Guys, this PR has been open for a long time. It has been reviewed by @norwood . I'd really appreciate if one of you can take a look and merge this.

@junrao
Copy link
Copy Markdown
Contributor

junrao commented Jan 20, 2017

@SandeshKarkera : Thanks for the patch and sorry for the delay. LGTM

asfgit pushed a commit that referenced this pull request Jan 20, 2017
…roducer script.

Current implementation of ProducerPerformance creates static payload. This is not very useful in testing compression or when you want to test with production/custom payloads. So, we decided to add support for providing payload file as an input to producer perf test script.

We made the following changes:
1. Added support to provide a payload file which can have the list of payloads that you actually want to send.
2. Moved payload generation inside the send loop for cases when payload file is provided.

Following are the changes to how the producer-performance is evoked:
1. You must provide "--record-size" or "--payload-file" but not both. This is because, record size cannot be guaranteed when you are using custom events.
  e.g. ./kafka-producer-perf-test.sh --topic test_topic --num-records 100000 --producer-props bootstrap.servers=127.0.0.1:9092 acks=0 buffer.memory=33554432 compression.type=gzip batch.size=10240 linger.ms=10 --throughput -1 --payload-file ./test_payloads --payload-delimiter ,
2. Earlier "--record-size" was a required config, now you must provide exactly one of "--record-size" or "--payload-file". Providing both will result in an error.
3. Support for an additional parameter "--payload-delimiter" has been added which defaults to "\n"

Author: Sandesh K <sandesh.karkera@flipkart.com>

Reviewers: dan norwood <norwood@confluent.io>, Jun Rao <junrao@gmail.com>

Closes #2158 from SandeshKarkera/PerfProducerChanges

(cherry picked from commit a37bf5f)
Signed-off-by: Jun Rao <junrao@gmail.com>
@asfgit asfgit closed this in a37bf5f Jan 20, 2017
@asfbot
Copy link
Copy Markdown

asfbot commented Jan 20, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk8-scala2.12/1055/
Test PASSed (JDK 8 and Scala 2.12).

@asfbot
Copy link
Copy Markdown

asfbot commented Jan 20, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/1057/
Test FAILed (JDK 8 and Scala 2.11).

@asfbot
Copy link
Copy Markdown

asfbot commented Jan 20, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk7-scala2.10/1055/
Test PASSed (JDK 7 and Scala 2.10).

soenkeliebau pushed a commit to soenkeliebau/kafka that referenced this pull request Feb 7, 2017
…roducer script.

Current implementation of ProducerPerformance creates static payload. This is not very useful in testing compression or when you want to test with production/custom payloads. So, we decided to add support for providing payload file as an input to producer perf test script.

We made the following changes:
1. Added support to provide a payload file which can have the list of payloads that you actually want to send.
2. Moved payload generation inside the send loop for cases when payload file is provided.

Following are the changes to how the producer-performance is evoked:
1. You must provide "--record-size" or "--payload-file" but not both. This is because, record size cannot be guaranteed when you are using custom events.
  e.g. ./kafka-producer-perf-test.sh --topic test_topic --num-records 100000 --producer-props bootstrap.servers=127.0.0.1:9092 acks=0 buffer.memory=33554432 compression.type=gzip batch.size=10240 linger.ms=10 --throughput -1 --payload-file ./test_payloads --payload-delimiter ,
2. Earlier "--record-size" was a required config, now you must provide exactly one of "--record-size" or "--payload-file". Providing both will result in an error.
3. Support for an additional parameter "--payload-delimiter" has been added which defaults to "\n"

Author: Sandesh K <sandesh.karkera@flipkart.com>

Reviewers: dan norwood <norwood@confluent.io>, Jun Rao <junrao@gmail.com>

Closes apache#2158 from SandeshKarkera/PerfProducerChanges
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants