KAFKA-13874: Avoid synchronization in SocketServer metrics (#13285)
chia7712 wants to merge 10 commits into apache:trunk
Conversation
Hello @chia7712. The purpose of this lock on the entire SocketServer object is to ensure that the processors are not mutated while metrics are being computed from them. If we remove this lock, a metric read can race with a concurrent mutation of the processors. Perhaps a better way to solve this is to 1) reduce the granularity of the lock, since we don't want a lock on the entire SocketServer object but just on the processors, and 2) use shared and exclusive locks, which allow concurrent reads to happen while ensuring that writes happen in isolation.
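To illustrate the shared/exclusive-lock idea mentioned above, here is a minimal, self-contained Java sketch (class and method names are illustrative, not from the PR): readers take the shared lock so metric reads can proceed concurrently, while writers take the exclusive lock so mutations happen in isolation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SharedExclusiveDemo {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> processors = new ArrayList<>();

    // Writers acquire the exclusive (write) lock: mutations run in isolation.
    void addProcessor(String id) {
        lock.writeLock().lock();
        try {
            processors.add(id);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers acquire the shared (read) lock: many metric reads may run at once,
    // but never concurrently with a writer.
    int processorCount() {
        lock.readLock().lock();
        try {
            return processors.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        SharedExclusiveDemo server = new SharedExclusiveDemo();
        server.addProcessor("p0");
        server.addProcessor("p1");
        System.out.println(server.processorCount()); // prints 2
    }
}
```

The tradeoff relative to a lock-free structure is that a writer still blocks all readers for the duration of the mutation.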
@divijvaidya thanks for the feedback!
Pardon me; the mutable ArrayBuffer won't be modified after it is created, so it seems to me that its mutability won't hurt us here.
All we need to update metrics is the …
Hey @chia7712,
I have a different view of this. Please correct me if I am wrong. Consider the following scenario, which would go wrong with the changes in this PR:
@divijvaidya thanks for the nice explanation. You are right. I will adopt solution #1 (reduce the granularity of the lock). PTAL, and thanks!
I am afraid the latest approach will also not work, because there is still a possibility of a race. When I mentioned option 1, fine-grained locking, I actually implied locking on the processors object instead of locking on the entire SocketServer object. If we go down this path, we will have to change other places in the file to acquire this processor lock during mutation, and we also have to ensure that a deadlock doesn't occur when trying to acquire the SocketServer lock and the processors lock together. Hence, my suggestion would be to opt for a lock-free, concurrently accessible data structure for storing processors. Here are our requirements for such a data structure:
Based on the above, we can choose either a ConcurrentHashMap or a CopyOnWriteArrayList for storing processors.
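A minimal Java sketch of why CopyOnWriteArrayList fits this access pattern (names here are illustrative): its iterator works over an immutable snapshot of the backing array, so a metric read can iterate without any lock even while another thread adds a processor, and it never throws ConcurrentModificationException.

```java
import java.util.concurrent.CopyOnWriteArrayList;

public class SnapshotIterationDemo {
    public static void main(String[] args) {
        CopyOnWriteArrayList<String> processors = new CopyOnWriteArrayList<>();
        processors.add("processor-0");
        processors.add("processor-1");

        int seen = 0;
        // The for-each iterator binds to a snapshot taken at creation time,
        // so the adds below (simulating a concurrent writer) are not visible
        // to this loop and no ConcurrentModificationException is thrown.
        for (String p : processors) {
            processors.add("processor-added-during-iteration");
            seen++;
        }
        System.out.println(seen);              // 2: only the snapshot was iterated
        System.out.println(processors.size()); // 4: both concurrent adds took effect
    }
}
```

The write path copies the whole array on every mutation, which is cheap here because processors change only during startup, reconfiguration, and shutdown, while reads (metrics) dominate.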
@divijvaidya thanks for the great feedback!
That is interesting; the mutation of processors is locked by …
I will use CopyOnWriteArrayList.
  }
- private[network] val processors = new ArrayBuffer[Processor]()
+ private[network] val processors = new CopyOnWriteArrayList[Processor]()
It would be nice if you could add a comment here on why we chose this data structure. Folks who look at this code in the future will then have a clear explanation of the choices and tradeoffs we made.
- newGauge(s"${DataPlaneAcceptor.MetricPrefix}ExpiredConnectionsKilledCount", () => SocketServer.this.synchronized {
-   val dataPlaneProcessors = dataPlaneAcceptors.asScala.values.flatMap(a => a.processors)
+ newGauge(s"${DataPlaneAcceptor.MetricPrefix}ExpiredConnectionsKilledCount", () => {
+   val dataPlaneProcessors = dataPlaneAcceptors.values.asScala.flatMap(a => a.processors.asScala)
We cannot convert processors to Scala since it transforms it into a mutable ArrayBuffer.
We probably don't need Scala transformations here. Could you please try something like this:
dataPlaneAcceptors.values.stream.flatMap(a => a.processors.stream)
(same comment for other places such as line 119)
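A self-contained Java sketch of the stream-based flattening suggested above (the Acceptor stand-in and listener names are hypothetical, not from the PR): Java streams flatten the nested thread-safe collections directly, without ever materializing a mutable Scala ArrayBuffer.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.stream.Collectors;

public class StreamFlattenDemo {
    // Hypothetical stand-in for an Acceptor that owns a thread-safe list of processor ids.
    static class Acceptor {
        final CopyOnWriteArrayList<String> processors = new CopyOnWriteArrayList<>();
        Acceptor(String... ids) { for (String id : ids) processors.add(id); }
    }

    public static void main(String[] args) {
        Map<String, Acceptor> dataPlaneAcceptors = new ConcurrentHashMap<>();
        dataPlaneAcceptors.put("listener-a", new Acceptor("p0", "p1"));
        dataPlaneAcceptors.put("listener-b", new Acceptor("p2"));

        // Flatten all processors across acceptors with Java streams only:
        // dataPlaneAcceptors.values.stream.flatMap(a => a.processors.stream)
        // (sorted here just to make the demo output deterministic, since
        // ConcurrentHashMap does not guarantee iteration order).
        List<String> all = dataPlaneAcceptors.values().stream()
                .flatMap(a -> a.processors.stream())
                .sorted()
                .collect(Collectors.toList());
        System.out.println(all); // prints [p0, p1, p2]
    }
}
```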
> We cannot convert processors to Scala since it transforms it into a mutable ArrayBuffer.

Pardon me, why can't we use a "mutable ArrayBuffer" here?
    metrics.metricName("io-wait-ratio", MetricsGroup, p.metricTags)
- }
+ }.toArray
  if (dataPlaneProcessors.isEmpty) {
This leads to object creation on every call to this metric (which is going to happen frequently). Do we really want this?
If I understand correctly, your motivation is to guard against scenarios where the number of processors changes between the time we calculate ioWaitRatioMetricNames and the time we calculate dataPlaneProcessors.size. To mitigate it, we could store the size in an int beforehand; then we don't have to convert ioWaitRatioMetricNames to an array.
> To mitigate it, we can store the size in an int before this and then, we don't have to convert ioWaitRatioMetricNames to an array.

ioWaitRatioMetricNames might reflect modifications to processors, so the actual size of processors may change while we calculate the average. For example, we might cache size = 5, but the processors could grow to 6 by the time we summarize all processors.
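To make that size-vs-contents race concrete, here is a hedged Java sketch (variable names and values are illustrative, not from the PR): taking a single snapshot via toArray guarantees that the element count and the summed elements refer to the same set of processors, which a separately cached int would not.

```java
import java.util.concurrent.CopyOnWriteArrayList;

public class ConsistentAverageDemo {
    public static void main(String[] args) {
        CopyOnWriteArrayList<Double> ioWaitRatios = new CopyOnWriteArrayList<>();
        ioWaitRatios.add(0.25);
        ioWaitRatios.add(0.75);

        // Snapshot once: length and contents are guaranteed to be consistent
        // with each other, even if a writer mutates the list right afterwards.
        Object[] snapshot = ioWaitRatios.toArray();
        ioWaitRatios.add(0.90); // simulated concurrent mutation by another thread

        double sum = 0.0;
        for (Object v : snapshot) sum += (Double) v;
        // Divide by snapshot.length, never by the (now stale) live size,
        // so the average always matches the elements that were summed.
        System.out.println(sum / snapshot.length); // prints 0.5
    }
}
```

This is the consistency property the `.toArray` in the diff buys, at the cost of an allocation per metric read.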
divijvaidya left a comment:
Thank you for making the changes @chia7712. The code looks good to me!
This PR is being marked as stale since it has not had any activity in 90 days. If you would like to keep this PR alive, please ask a committer for review. If the PR has merge conflicts, please update it with the latest from trunk (or the appropriate release branch). If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.
The collections used for generating metrics are already thread-safe, so the synchronization is unnecessary.