14 changes: 9 additions & 5 deletions docs/ops.html
@@ -680,24 +680,28 @@ <h4><a id="prodconfig" href="#prodconfig">A Production Server Config</a></h4>

<h3><a id="java" href="#java">6.4 Java Version</a></h3>

From a security perspective, we recommend you use the latest released version of JDK 1.8 as older freely available versions have disclosed security vulnerabilities.
Java 8 and Java 11 are supported. Java 11 performs significantly better if TLS is enabled, so it is highly recommended (it also includes a number of other
performance improvements: G1GC, CRC32C, Compact Strings, Thread-Local Handshakes and more).

From a security perspective, we recommend the latest released patch version as older freely available versions have disclosed security vulnerabilities.

At the time this is written, LinkedIn is running JDK 1.8 u5 (looking to upgrade to a newer version) with the G1 collector. LinkedIn's tuning looks like this:
Typical arguments for running Kafka with OpenJDK-based Java implementations (including Oracle JDK) are:

<pre class="brush: text;">
-Xmx6g -Xms6g -XX:MetaspaceSize=96m -XX:+UseG1GC
-XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M
-XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80
-XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80 -XX:+ExplicitGCInvokesConcurrent
</pre>

For reference, here are the stats on one of LinkedIn's busiest clusters (at peak):
For reference, here are the stats for one of LinkedIn's busiest clusters (at peak) that uses said Java arguments:
<ul>
<li>60 brokers</li>
<li>50k partitions (replication factor 2)</li>
<li>800k messages/sec in</li>
<li>300 MB/sec inbound, 1 GB/sec+ outbound</li>
</ul>

The tuning looks fairly aggressive, but all of the brokers in that cluster have a 90% GC pause time of about 21ms, and they're doing less than 1 young GC per second.
All of the brokers in that cluster have a 90% GC pause time of about 21ms with less than 1 young GC per second.
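
As a rough illustration of how the arguments above might be applied, the sketch below splits them across `KAFKA_HEAP_OPTS` and `KAFKA_JVM_PERFORMANCE_OPTS`, the environment variables read by Kafka's `bin/kafka-run-class.sh` launcher script. The split between the two variables and the commented-out start command (with its relative paths) are assumptions for illustration, not part of this change; check the launcher script in your Kafka release before relying on it.

```shell
#!/bin/sh
# Sketch only: pass the documented JVM arguments through the environment
# variables that Kafka's launcher scripts consult when they are already set.

# Heap sizing (same values as the documented example).
export KAFKA_HEAP_OPTS="-Xmx6g -Xms6g"

# GC and metaspace tuning (same flags as the documented example).
# Setting this variable replaces the launcher's default performance flags.
export KAFKA_JVM_PERFORMANCE_OPTS="-XX:MetaspaceSize=96m -XX:+UseG1GC \
-XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 \
-XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 \
-XX:MaxMetaspaceFreeRatio=80 -XX:+ExplicitGCInvokesConcurrent"

# Print the values so they can be checked before starting a broker.
echo "KAFKA_HEAP_OPTS=$KAFKA_HEAP_OPTS"
echo "KAFKA_JVM_PERFORMANCE_OPTS=$KAFKA_JVM_PERFORMANCE_OPTS"

# Assumed invocation from the root of a Kafka installation (hypothetical paths):
# bin/kafka-server-start.sh config/server.properties
```

Keeping heap sizing and GC tuning in separate variables means either can be adjusted without retyping the other.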

<h3><a id="hwandos" href="#hwandos">6.5 Hardware and OS</a></h3>
We are using dual quad-core Intel Xeon machines with 24GB of memory.