Skip to content

KAFKA-4453: Separating controller connections and requests from the data plane (KIP-291)#5783

Closed
gitlw wants to merge 2 commits intoapache:trunkfrom
gitlw:kip_291
Closed

KAFKA-4453: Separating controller connections and requests from the data plane (KIP-291)#5783
gitlw wants to merge 2 commits intoapache:trunkfrom
gitlw:kip_291

Conversation

@gitlw
Copy link
Copy Markdown
Contributor

@gitlw gitlw commented Oct 11, 2018

Separating controller connections and requests from the data plane (KIP-291)

  • Tested that there are no dedicated control plane endpoints or threads when control.plane.listener.name is not set.
  • Tested that the controller requests are handled by the control plane threads when
    control.plane.listener.name is set

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@gitlw
Copy link
Copy Markdown
Contributor Author

gitlw commented Oct 11, 2018

retest this please

@gitlw
Copy link
Copy Markdown
Contributor Author

gitlw commented Oct 16, 2018

@junrao @jjkoshy @MayureshGharat Can you please take a look at this PR? Thanks

@MayureshGharat
Copy link
Copy Markdown
Contributor

@gitlw Thanks for the patch. On it now.

val requestChannel = new RequestChannel(maxQueuedRequests)
private val processors = new ConcurrentHashMap[Int, Processor]()
val dataRequestChannel = new RequestChannel(maxQueuedRequests, RequestChannel.RequestQueueSizeMetric)
var controlPlaneRequestChannel: RequestChannel = null
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var controlPlaneRequestChannel: RequestChannel = null
val controlPlaneRequestChannel: RequestChannel = if (config.controlPlaneListenerName.isDefined) {
controlPlaneRequestChannel = new RequestChannel(20, RequestChannel.ControlPlaneRequestQueueSizeMetric)
} else {
null
}

Copy link
Copy Markdown
Contributor

@MayureshGharat MayureshGharat Nov 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually how about :

Suggested change
var controlPlaneRequestChannel: RequestChannel = null
private var controlPlaneRequestChannelOpt: Option[RequestChannel] = None

We can then create the controlPlaneRequestChannel lazily when we call startup

}
private val dataProcessors = new ConcurrentHashMap[Int, Processor]()
// there should be only one controller processor, however we use a map to store it so that we can reuse the logic for data processors
private[network] val controlPlaneProcessors = new ConcurrentHashMap[Int, Processor]()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having a map, which might indicate that we might have multiple processors for the controlPlane, do you think we can do :

Suggested change
private[network] val controlPlaneProcessors = new ConcurrentHashMap[Int, Processor]()
private[network] var controlPlaneProcessorOpt : Option[Processor] = None

We can lazily create controlPlaneProcessor when we call startup()

this.synchronized {
connectionQuotas = new ConnectionQuotas(config.maxConnectionsPerIp, config.maxConnectionsPerIpOverrides)
createAcceptorAndProcessors(config.numNetworkThreads, config.listeners)
createAcceptorAndProcessors(config.numNetworkThreads, config.dataListeners, dataRequestChannel, dataProcessors, false)
Copy link
Copy Markdown
Contributor

@MayureshGharat MayureshGharat Nov 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was thinking if we should have 2 explicit functions :

  • createDataPlaneAcceptorsAndProcessors(processorsPerListener : Int, endpoints: Seq[EndPoint])

  • createControlPlaneAcceptorAndProcessor (controlEndPointOpt : Option[EndPoint])

We can probably do something like :

Suggested change
createAcceptorAndProcessors(config.numNetworkThreads, config.dataListeners, dataRequestChannel, dataProcessors, false)
createDataPlaneAcceptorsAndProcessors(config.numNetworkThreads, config.dataListeners)
createControlPlaneAcceptorAndProcessor(config.controlPlaneListener)

The two functions can look something like this :

private def createDataPlaneAcceptorsAndProcessors(processorsPerListener: Int,
endpoints: Seq[EndPoint]) : Unit = synchronized {
endpoints.foreach { endpoint =>
val acceptor = createAcceptorWithProcessors()
addDataProcessors(acceptor, endpoint, processorsPerListener, dataRequestChannel, dataProcessors)
KafkaThread.nonDaemon(s"data-plane-kafka-socket-acceptor-${endpoint.listenerName}-${endpoint.securityProtocol}-${endpoint.port}", acceptor).start()
acceptor.awaitStartup()
acceptors.put(endpoint, acceptor)
}
}

private def createControlPlaneAcceptorAndProcessor (controlEndPointOpt: Option[EndPoint]) : Unit = synchronized {
if (controlEndPointOpt.isDefined) {
val controlPlaneEndPoint = controlEndPointOpt.get
controlPlaneRequestChannelOpt = Some(new RequestChannel(20, RequestChannel.ControlPlaneRequestQueueSizeMetric))
val controlPlaneAcceptor = createAcceptor(controlPlaneEndPoint)
val controlPlaneProcessor = newProcessor(nextProcessorId, controlPlaneRequestChannelOpt.get, connectionQuotas, controlPlaneEndPoint.listenerName, controlPlaneEndPoint.securityProtocol, memoryPool, isControlPlane = true)
controlPlaneRequestChannelOpt.get.addProcessor(controlPlaneProcessor)
val listenerProcessors = new ArrayBufferProcessor
listenerProcessors += controlPlaneProceesor
controlPlaneAcceptor.addProcessors(listenerProcessors)
KafkaThread.nonDaemon(s"control-plane-kafka-socket-acceptor-$controlPlaneListenerName-$controlPlaneSecurityProtocol-${controlPlaneEndPoint.port}", controlPlaneAcceptor).start()
controlPlaneAcceptor.awaitStartup()
acceptors.put(controlPlaneEndPoint, controlPlaneAcceptor)
}
}

private def createAcceptor(endpoint: EndPoint): Acceptor = synchronized {
val sendBufferSize = config.socketSendBufferBytes
val recvBufferSize = config.socketReceiveBufferBytes
val brokerId = config.brokerId
new Acceptor(endpoint, sendBufferSize, recvBufferSize, brokerId, connectionQuotas)
}

private def addDataProcessors(acceptor: Acceptor, endpoint: EndPoint,
newProcessorsPerListener: Int): Unit = synchronized {
val listenerName = endpoint.listenerName
val securityProtocol = endpoint.securityProtocol
val listenerProcessors = new ArrayBufferProcessor

for (_ <- 0 until newProcessorsPerListener) {
  val processor = newProcessor(nextProcessorId, dataRequestChannel, connectionQuotas, listenerName, securityProtocol, memoryPool, isControlPlane = false)
  listenerProcessors += processor
  requestChannel.addProcessor(processor)
  nextProcessorId += 1
}
listenerProcessors.foreach(p => dataProcessors.put(p.id, p))
acceptor.addProcessors(listenerProcessors)

}

processors.asScala.values.foreach(_.shutdown())
requestChannel.clear()
dataProcessors.asScala.values.foreach(_.shutdown())
controlPlaneProcessors.asScala.values.foreach(_.shutdown())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with the above suggestion, this would be :

Suggested change
controlPlaneProcessors.asScala.values.foreach(_.shutdown())
controlPlaneProcessorOpt.map(_.shutdown())

acceptors.asScala.foreach { case (endpoint, acceptor) =>
addProcessors(acceptor, endpoint, newNumNetworkThreads - oldNumNetworkThreads)
dataAcceptors.foreach { case (endpoint, acceptor) =>
addProcessors(acceptor, endpoint, newNumNetworkThreads - oldNumNetworkThreads, dataRequestChannel, dataProcessors, false)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the above suggestion, this would be :

Suggested change
addProcessors(acceptor, endpoint, newNumNetworkThreads - oldNumNetworkThreads, dataRequestChannel, dataProcessors, false)
addDataProcessors(acceptor, endpoint, newNumNetworkThreads - oldNumNetworkThreads)

controlPlaneProcessors.asScala.values.foreach(_.shutdown())
dataRequestChannel.clear()
if (controlPlaneRequestChannel != null)
controlPlaneRequestChannel.clear()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would become :

Suggested change
controlPlaneRequestChannel.clear()
controlPlaneRequestChannelOpt.map(_.clear())

requestChannel.shutdown()
dataRequestChannel.shutdown()
if (controlPlaneRequestChannel != null )
controlPlaneRequestChannel.shutdown()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with the above suggestion :

Suggested change
controlPlaneRequestChannel.shutdown()
controlPlaneRequestChannelOpt.map(_.shutdown())

if (dataPlaneListenersAdded.nonEmpty)
createAcceptorAndProcessors(config.numNetworkThreads, dataPlaneListenersAdded, dataRequestChannel, dataProcessors, false)
if (controlPlaneListenersAdded.nonEmpty)
createAcceptorAndProcessors(1, controlPlaneListenersAdded, controlPlaneRequestChannel, controlPlaneProcessors, true)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we have a check that a control plane listener is already defined, since there should be only one controlPlaneListener, no?

var authorizer: Option[Authorizer] = None
var socketServer: SocketServer = null
var requestHandlerPool: KafkaRequestHandlerPool = null
var dataRequestHandlerPool: KafkaRequestHandlerPool = null
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we rename this to "dataPlaneRequestHandlerPool" to be consistent with "controlPlaneRequestHandlerPool" ?

@lindong28 lindong28 self-assigned this Nov 12, 2018
Copy link
Copy Markdown
Member

@lindong28 lindong28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch @gitlw. Left some initial comments.

private val memoryPool = if (config.queuedMaxBytes > 0) new SimpleMemoryPool(config.queuedMaxBytes, config.socketRequestMaxBytes, false, memoryPoolSensor) else MemoryPool.NONE
val requestChannel = new RequestChannel(maxQueuedRequests)
private val processors = new ConcurrentHashMap[Int, Processor]()
val dataRequestChannel = new RequestChannel(maxQueuedRequests, RequestChannel.RequestQueueSizeMetric)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits: to be consistent with variable name controlPlaneRequestChannel, would we name this parameter dataPlaneRequestChannel?

private val requestLogger = Logger("kafka.request.logger")

val RequestQueueSizeMetric = "RequestQueueSize"
val ControlPlaneRequestQueueSizeMetric = "ControlPlaneRequestQueueSize"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits: could we also rename variable RequestQueueSizeMetric to DataPlaneRequestQueueSizeMetric to be consistent with the variable name ControlPlaneRequestQueueSizeMetric? This could also reduce confusion for the variable requestQueueSizeMetric used in class RequestChannel(val queueSize: Int, val requestQueueSizeMetric: String).

val requestChannel = new RequestChannel(maxQueuedRequests)
private val processors = new ConcurrentHashMap[Int, Processor]()
val dataRequestChannel = new RequestChannel(maxQueuedRequests, RequestChannel.RequestQueueSizeMetric)
var controlPlaneRequestChannel: RequestChannel = null
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general in scala we prefer to use val and use None to indicate null. Can we do the following:

var controlPlaneRequestChannel: Option[RequestChannel] = config.controlPlaneListenerName.map(_ => new RequestChannel(20, RequestChannel.ControlPlaneRequestQueueSizeMetric))

if (config.controlPlaneListenerName.isDefined) {
controlPlaneRequestChannel = new RequestChannel(20, RequestChannel.ControlPlaneRequestQueueSizeMetric)
}
private val dataProcessors = new ConcurrentHashMap[Int, Processor]()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits: can we name it dataPlaneProcessors

createAcceptorAndProcessors(config.numNetworkThreads, config.listeners)
createAcceptorAndProcessors(config.numNetworkThreads, config.dataListeners, dataRequestChannel, dataProcessors, false)
if (config.controlPlaneListener.isDefined)
createAcceptorAndProcessors(1, Seq(config.controlPlaneListener.get), controlPlaneRequestChannel, controlPlaneProcessors, true)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we simplify the code here to createAcceptorAndProcessors(1, config.controlPlaneListener.toSeq, controlPlaneRequestChannel, controlPlaneProcessors, true) without checking config.controlPlaneListener.isDefined? createAcceptorAndProcessors will do nothing if endpoints is an empty sequence.

info(s"Resizing network thread pool size for each listener from $oldNumNetworkThreads to $newNumNetworkThreads")
val dataAcceptors = config.controlPlaneListenerName match {
case Some(controlPlaneListenerName) =>
acceptors.asScala.filter{ case (endpoint, _) => !endpoint.listenerName.value().equals(controlPlaneListenerName.value())}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general it is preferred to identify type based on e.g. enum, or use separate variables (e.g. dataPlaneAccetor vs. controlPlaneAcceptor), rather than by matching the string value. Could you see if there is a way to improve it.

time: Time) extends Runnable with Logging {
this.logIdent = "[Kafka Request Handler " + id + " on Broker " + brokerId + "], "
// the requestsHandled counter is defined only for testing
var requestsHandled: Long = 0L
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally don't want to add state/variable specifically for unit test, particularly if there is no existing code which does this.

Usually we can do unit test using mockit to verify that something is triggered. Can you see if this is doable? Also, maybe we can just read the newly defined metrics value to verify that the newly defined processor/acceptor has processed some requests.

}

// the receivesProcessed counter is defined only for testing
private[network] var receivesProcessed: Long = 0L
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally don't want to define variable and state just for unit tests. Can you find a way to test the code without adding such variables?

time: Time,
numThreads: Int) extends Logging with KafkaMetricsGroup {
numThreads: Int,
requestHandlerAvgIdleMetric: String,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits: requestHandlerAvgIdleMetric => requestHandlerAvgIdleMetricName seems better.


var apis: KafkaApis = null
var dataPlaneApis: KafkaApis = null
var controlPlaneApis: KafkaApis = null
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KafkaApis contains the logic to handle request and itself should contain no state (other than those state in its constructor parameters). So it seems weird to have two KafkaApis. Currently KafkaApis only uses requestChannel with together with RequestChannel.Request. One reasonable approach is to put the requestChannel in RequestChannel.Request similar to the existing RequestChannel.Request.processor. So that we no longer need two KafkaApis instance. I am not sure this is the best alternative approach though.

@gitlw
Copy link
Copy Markdown
Contributor Author

gitlw commented Nov 16, 2018

Closing this PR as @MayureshGharat is taking over the ownership in #5921

@gitlw gitlw closed this Nov 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants