Activation Persister Service#4632

Closed
chetanmeh wants to merge 59 commits into apache:master from chetanmeh:persister

chetanmeh (Member) commented Sep 18, 2019

This PR introduces a new microservice, the Activation Persister Service (APS). It is based on the discussions here and here.

Description

This service fulfills the following objectives:

  1. Move the load of persisting activation results out of the Invoker and into a new service
  2. Enable more controlled writes to the backend db when activations are produced faster than the db can persist them within its given capacity

With this service, activation results would be written to a Kafka topic instead, then picked up by this service and saved to the db. The approach taken here varies with the following parameters:

  1. Is the activation for a blocking or a non-blocking call
  2. Is the activation result being persisted from the Controller (for sequences/compositions) or from the Invoker
  3. Does the setup store logs as part of the activation or store them separately

Components

The implementation consists of the following high-level components:

  1. Activation Persister Service (APS) - A new service implemented in the core/persister module. It is based on Alpakka Kafka, reads activation records from Kafka, and saves them to the db
  2. KafkaActivationStore - An extension of ArtifactActivationStore which also routes to Kafka via the MessageProducer connector
  3. InvokerReactive - The send logic would be modified (details below)
  4. New Kafka topic
    1. completed-others - This topic is meant for setups using LogDriverLogStore, i.e. where logs are not stored in the activation result. Note that the topic name conforms to the pattern completed.* so that the Kafka consumer used within ActivationPersisterService can pick up this topic along with the other completed topics created per controller
      1. Non-blocking activations - Currently, non-blocking activations are not sent over Kafka, so this topic would be used for them. We did not use the existing completed<controllerId> topics, as they are strictly used for the blocking case
      2. Activations originating from the controller - All activations from the controller, such as those for triggers, conductors, and sequences, would be sent via this topic. Again, these are currently stored directly in the db and not sent on any other Kafka topic
    2. activations - This topic name would be used when the setup stores logs within the activation result. Note that if APS is used in this case, one needs to ensure that Kafka can carry messages of size up to 10MB (or the configured log limits)

Some key aspects which impact the implementation are:

  1. Blocking/Non-blocking - Depending on the invocation type, the nature of the activation result sent via the Kafka completed topic varies
    a. Blocking - The ResultMessage contains the activation result without logs
    b. Non-blocking - Only a CompletionMessage is sent

Activation Persister Service

This service implements a streaming flow based on Alpakka Kafka. It runs in one of two modes.

ActivationResult without Logs

If the setup does not store logs with the activation (i.e. it uses some LogDriverLogStore), the persister service would listen for ResultMessage (and CombinedCompletionAndResultMessage after #4624) on all completed.* topics. This would also pick up records from the completed-others topic.

Further, how ActivationStore gets used would change:

  1. Within Controller - Store all activations to completed-others via KafkaActivationStore
  2. Within Invoker - The store function would be initialized to a noop function. We did not go for a NoopActivationStore, as that would cause issues when configuring this setup with StandaloneOpenWhisk for tests, since such a setup can only have a single ActivationStore impl

The InvokerReactive send flow would send all WhiskActivation results for blocking calls to the existing completed<instanceId> topic. In addition, non-blocking results would also get routed via Kafka to completed-others

ActivationResult with Logs

In this mode APS would listen for messages on the activations topic. KafkaActivationStore would route all activations to this topic
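
The routing described for the two modes above can be summarised in a small sketch. This is only an illustration and not code from the PR: `Origin`, `FromController`, `FromInvoker`, and `persistenceTopic` are hypothetical names, and the numeric suffix of `completed<N>` is assumed to be the id of the controller instance waiting for the blocking result.

```scala
// Hypothetical model of the topic an activation record travels on before APS persists it.
object ActivationRouting {
  sealed trait Origin
  case object FromController extends Origin
  // controllerId: assumed id of the controller awaiting the blocking result
  final case class FromInvoker(controllerId: Int) extends Origin

  def persistenceTopic(origin: Origin, blocking: Boolean, logsInActivation: Boolean): String =
    if (logsInActivation) "activations" // with-logs mode: everything goes to a single topic
    else
      origin match {
        case FromController              => "completed-others" // triggers, sequences, conductors
        case FromInvoker(id) if blocking => s"completed$id"    // existing per-controller topic
        case FromInvoker(_)              => "completed-others" // non-blocking results
      }
}
```

In the without-logs mode every topic returned here starts with `completed`, which is what lets a single pattern subscription cover all of them.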

Related issue and scope

  • I opened an issue to propose and discuss this change (#????)

My changes affect the following components

  • API
  • Controller
  • Message Bus (e.g., Kafka)
  • Loadbalancer
  • Invoker
  • Intrinsic actions (e.g., sequences, conductors)
  • Data stores (e.g., CouchDB)
  • Tests
  • Deployment
  • CLI
  • General tooling
  • Documentation

Types of changes

  • Bug fix (generally a non-breaking change which closes an issue).
  • Enhancement or new feature (adds new functionality).
  • Breaking change (a bug fix or enhancement which changes existing behavior).

Checklist:

  • I signed an Apache CLA.
  • I reviewed the style guides and followed the recommendations (Travis CI will check :).
  • I added tests to cover my changes.
  • My changes require further changes to the documentation.
  • I updated the documentation where necessary.

codecov-io commented Sep 20, 2019

Codecov Report

Merging #4632 into master will increase coverage by 0.68%.
The diff coverage is 15%.

@@            Coverage Diff            @@
##           master   #4632      +/-   ##
=========================================
+ Coverage   77.92%   78.6%   +0.68%     
=========================================
  Files         198     199       +1     
  Lines        8841    8879      +38     
  Branches      614     624      +10     
=========================================
+ Hits         6889    6979      +90     
+ Misses       1952    1900      -52
Impacted Files Coverage Δ
...openwhisk/core/database/KafkaActivationStore.scala 0% <0%> (ø)
...ain/scala/org/apache/openwhisk/spi/SpiLoader.scala 80% <0%> (-20%) ⬇️
...ache/openwhisk/core/database/ActivationStore.scala 33.33% <0%> (-33.34%) ⬇️
...la/org/apache/openwhisk/http/BasicRasService.scala 100% <100%> (+100%) ⬆️
.../scala/org/apache/openwhisk/core/WhiskConfig.scala 94.85% <100%> (+2.31%) ⬆️
...pache/openwhisk/core/invoker/InvokerReactive.scala 78.81% <60%> (+77.93%) ⬆️
...core/database/cosmosdb/RxObservableImplicits.scala 0% <0%> (-100%) ⬇️
...e/database/cosmosdb/cache/ChangeFeedConsumer.scala 0% <0%> (-100%) ⬇️
...ore/database/cosmosdb/cache/CacheInvalidator.scala 0% <0%> (-100%) ⬇️
...core/database/cosmosdb/CosmosDBArtifactStore.scala 0% <0%> (-96.23%) ⬇️
... and 66 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c674757...2460cdc.

# Maximum request size. By default it uses the MAX_ACTIVATION_LIMIT as computed from
# `whisk.activation.payload`. Bump it to higher value if the activation result also
# includes the logs
# max-request-size = 2 MB
Member

is it possible to have conflicting size limits (what if this is lower than the activation payload)?

Member Author

Yes, that's possible. I can probably add a programmatic check in KafkaActivationStore that picks the max of the two and logs a warning. The purpose here is to ensure that no activation is rejected due to a max payload size error, since the fallback route of adding to the db is disabled
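
A minimal sketch of the check proposed in this reply, assuming sizes are handled as plain byte counts (the PR's config uses `Option[ByteSize]`). `RequestSizeCheck` and `effectiveMaxRequestSize` are hypothetical names, and the warning text is illustrative only.

```scala
object RequestSizeCheck {
  // Returns the limit to use for the Kafka producer plus an optional warning to log.
  // configured:      the optional max-request-size setting, in bytes
  // activationLimit: the limit derived from the activation payload setting, in bytes
  def effectiveMaxRequestSize(configured: Option[Long], activationLimit: Long): (Long, Option[String]) =
    configured match {
      case Some(c) if c < activationLimit =>
        // A lower configured value would reject valid activations, so fall back to the larger one.
        (activationLimit, Some(s"max-request-size ($c B) is below the activation limit ($activationLimit B); using the activation limit"))
      case Some(c) => (c, None)
      case None    => (activationLimit, None)
    }
}
```

Picking the larger value rather than failing at startup matches the stated goal that no activation should be rejected once the direct db route is disabled.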

logging.info(
this,
s"posted activation to Kafka topic ${config.activationsTopic} - ${activation.activationId}")
}
Member

log failures?

activations {
# Enable this if ActivationStore is to be disabled and all activation results are
# to be sent to Kafka topic for further processing by ActivationPersisterService
activation-store-enabled = true
Member

this is strangely named: you enable this property to send to kafka instead of the "activation store" (which i'd think is the existing store)...

Member Author

Description here was wrong. Instead it would be

# Disable this if ActivationStore should not be used and all activation results are
# to be sent to Kafka topic for further processing by ActivationPersisterService

activationStore.storeAfterCheck(activation, context)(tid, notifier = None)
private val store = if (activationStorageConfig.activationStoreEnabled) {
(tid: TransactionId, activation: WhiskActivation, context: UserContext) =>
{
Member

need the {} here?

Member Author

Technically not, as it's a single statement, but it helps in clarifying the boundaries

Contributor

I'm getting a little confused. If we set activationStoreEnabled to false, the store will do nothing here (the code will not jump to KafkaActivationStore.store either), so how can activations be sent to Kafka? I see that only blocking activation results will be sent to the completedN topic, but non-blocking activations will be missing

Member Author

Support for non-blocking activations is currently pending.

Contributor

Is there any update on whether this will support non-blocking activations?


case class KafkaActivationStoreConfig(activationsTopic: String, db: Boolean, maxRequestSize: Option[ByteSize])

class KafkaActivationStore(producer: MessageProducer,
Member

is this loaded/activated via SPI?

Member Author

Yes via ActivationStore SPI

implicit transid: TransactionId,
notifier: Option[CacheChangeNotification]): Future[DocInfo] = {
if (config.db) {
sendToKafka(activation).flatMap(_ => super.store(activation, context))
Member

should there be a series of stores that are configured at top level, all of which receive the activation - to keep the db setting from trickling down to subtypes?

Member Author

With the recent addition of ActivationStoreWrapper this is now handled, and KafkaActivationStore now delegates to the primary store if one is configured

ConsumerSettings(system, new StringDeserializer, new ByteArrayDeserializer)
.withGroupId(config.groupId)
.withBootstrapServers(config.kafkaHosts)
.withProperty(ConsumerConfig.CLIENT_ID_CONFIG, config.clientId)
Member

Would it be worth making this blank, or using a unique ID for each instance?
We may want to scale out the microservice at some point.
Then each instance will form one consumer group.

Member Author

Can be done. Currently client.id is used to access the lag stats from the JMX MBean, hence I made it deterministic via config.

From the docs for client.id I was not sure of its significance in multi-consumer-group settings. Currently, if you do not specify the id, it gets something like consumer-0 based on a static counter. So 2 consumers in the same group would still end up getting the same id.

An id string to pass to the server when making requests. The purpose of this is to be able to track the source of requests beyond just ip/port by allowing a logical application name to be included in server-side request logging.

Member

Oh yes I got confused.

I thought the client id should be unique.
But actually a consumer id should be unique.
And it is generated with random UUID.
https://github.com/apache/kafka/blob/e24d0e22abb0fb3e4cb3974284a3dad126544584/core/src/main/scala/kafka/coordinator/group/GroupMetadata.scala#L373

TOPIC                          PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG        CONSUMER-ID                                       HOST                           CLIENT-ID
completed0                     0          347324          347324          0          consumer-1-0da778fb-61f2-414c-8c36-82b60e60fdc4   /10.64.74.90                   consumer-1
-                              -          -               -               -          dominic-7fc8c30b-8b99-46b0-84e9-33024f466d15      /10.64.74.90                   dominic
-                              -          -               -               -          consumer-1-8e518874-1789-42cd-b0f4-914b8e610ba5   /10.64.71.195                  consumer-1
-                              -          -               -               -          dominic-397e49df-5cd7-4ef8-a0cd-7cc75ad198c8      /10.64.74.90                   dominic
-                              -          -               -               -          consumer-1-61d8894f-0d53-43c8-b2d4-cb33bb47be96   /10.64.74.90                   consumer-1

# - earliest : automatically reset the offset to the earliest offset
# - latest : automatically reset the offset to the latest offset
# - none : throw exception to the consumer if no previous offset is found for the consumer's group
auto.offset.reset = "earliest"
Member

This implies the multiple processing of the same activation.
Some activation stores would just override existing activations but others may not.

Member Author

Yes, the semantics being implemented here are at-least-once processing, so the same activation may be processed multiple times

// Recover from the conflict case: after an unclean shutdown the persister may need to
// process the same activation again.
// Check the cause chain, as the exception may get wrapped.
case t if Throwables.getRootCause(t).isInstanceOf[DocumentConflictException] => Future.successful(Done)
Member

Oh, here it would take care of the situation where the same activation is processed multiple times.
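
The at-least-once handling discussed in this thread can be sketched as follows. This is a simplified stand-in and not the PR's code: `ConflictException` is a hypothetical replacement for the store's document-conflict exception, `rootCause` is a hand-written helper (the quoted snippet uses a library utility), and `storeOnce` is an illustrative name.

```scala
import scala.annotation.tailrec
import scala.concurrent.{ExecutionContext, Future}

object IdempotentStore {
  // Hypothetical stand-in for the store's document-conflict exception.
  final class ConflictException(msg: String) extends RuntimeException(msg)

  // Walk the cause chain to the innermost exception, since failures may be wrapped.
  // Assumes the chain has no cycles other than a self-reference.
  @tailrec
  def rootCause(t: Throwable): Throwable = {
    val cause = t.getCause
    if (cause == null || (cause eq t)) t else rootCause(cause)
  }

  // Treat a conflict as success: the record was already written by an earlier attempt,
  // so replaying it after an unclean shutdown is harmless. Other failures still propagate.
  def storeOnce(write: () => Future[Unit])(implicit ec: ExecutionContext): Future[Unit] =
    write().recover {
      case t if rootCause(t).isInstanceOf[ConflictException] => ()
    }
}
```

This only makes replays safe for stores that signal duplicates with a conflict; a store that silently overwrites needs no recovery, and one that does neither would need its own deduplication.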

actorSystem: ActorSystem,
actorMaterializer: ActorMaterializer,
logging: Logging)
extends ArtifactActivationStore(actorSystem, actorMaterializer, logging) {
Member

Is there any possibility that KafkaActivationStore can be used with ActivationStore implementations other than ArtifactActivationStore?

Contributor

I think it's hard for KafkaActivationStore to be compatible with other ActivationStore implementations, as different ActivationStores may define methods such as store, get, delete, and countActivationsInNamespace differently, and we cannot define any of these methods except store in KafkaActivationStore

Member Author

The current ActivationPersisterService also uses the ActivationStore for storage, so in theory we can support others.

I can introduce a config to capture the base activation store and then have KafkaActivationStore act as a decorator for it. I will give it a try

Member Author

This is implemented now. The primary store can be configured via primary-store-provider, which defaults to ArtifactActivationStoreProvider

whisk {
  kafka-activation-store {
    # Name of the Kafka topic for sending WhiskActivations
    # Set to
    # - `completed-others` - When LogDriverLogStore is used
    # - `activations` - When other LogStore is used where logs are stored as part of activation result
    activations-topic = "completed-others"

    # Primary activation store used for eventual storage and querying of activation records
    primary-store-provider = org.apache.openwhisk.core.database.ArtifactActivationStoreProvider

    # Also store directly to primary ActivationStore. This mode can be used for initial
    # trial runs where data is sent to both primary store and also via Kafka to ActivationPersisterService
    store-in-primary = false

    # Maximum request size. By default it uses the MAX_ACTIVATION_LIMIT as computed from
    # `whisk.activation.payload`. Bump it to higher value if the activation result also
    # includes the logs
    # max-request-size = 2 MB
  }
}

So one can configure any ActivationStore impl, and KafkaActivationStore would wrap it and handle the store flow. The rest of the flow is directed back to the primary store
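
The decorator arrangement described here can be illustrated with a heavily reduced sketch. It is not the PR's implementation: `SimpleStore` is a hypothetical two-method stand-in for the much larger ActivationStore interface, `KafkaDecorator` and `sendToKafka` are illustrative names, and `storeInPrimary` mirrors the `store-in-primary` setting shown in the config.

```scala
import scala.concurrent.{ExecutionContext, Future}

object KafkaDecoratorSketch {
  // Hypothetical, reduced store interface; the real ActivationStore has more methods.
  trait SimpleStore {
    def store(id: String, doc: String): Future[Unit]
    def get(id: String): Future[Option[String]]
  }

  // Wraps a primary store: writes go to Kafka (and optionally also to the primary),
  // while reads are delegated to the primary unchanged.
  final class KafkaDecorator(primary: SimpleStore,
                             sendToKafka: (String, String) => Future[Unit],
                             storeInPrimary: Boolean)(implicit ec: ExecutionContext)
      extends SimpleStore {

    def store(id: String, doc: String): Future[Unit] =
      if (storeInPrimary) sendToKafka(id, doc).flatMap(_ => primary.store(id, doc))
      else sendToKafka(id, doc)

    def get(id: String): Future[Option[String]] = primary.get(id)
  }
}
```

With `storeInPrimary = false` the record reaches the primary store only through the persister service, so a read issued immediately after a write may not find it yet.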

# Set to
# - `completed-others` - When LogDriverLogStore is used
# - `activations` - When other LogStore is used where logs are stored as part of activation result
activations-topic = "completed-others"
Member

I feel there may be a more intuitive name for this topic (completed-others).
At first sight, it makes me envision activations sent to some miscellaneous controllers or the like.

I am not quite sure it's worth doing, and I know this is in order to use the topic pattern, but how about using a more intuitive name, such as activations-with-logs and just activations?

Opinions from others would be appreciated.

Member Author

I also do not like this name :).

For the LogDriverLogStore case I need a name which conforms to the pattern completed.*, as the consumer listens to all topics matching that pattern. So any name conforming to it can be used

Contributor

I was also thinking about the name. Would completed-async work? At the very least, it's much more descriptive than others
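
To make the naming constraint in this thread concrete, here is a small check of candidate topic names against the `completed.*` pattern quoted above. `TopicNameCheck` and `picksUp` are hypothetical names, and the sketch assumes the pattern is applied as a regular expression against the whole topic name.

```scala
object TopicNameCheck {
  // The pattern quoted in the discussion, compiled as a regular expression.
  private val completedTopics = "completed.*".r.pattern

  // True if a consumer subscribed with this pattern would receive the given topic.
  def picksUp(topic: String): Boolean = completedTopics.matcher(topic).matches()
}
```

Under that assumption, completed-others and completed-async both qualify alongside the per-controller completed topics, while a name such as activations-with-logs would need a separate subscription.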

style95 added the "stale" label (old issue which needs to validate) May 20, 2023

style95 (Member) commented May 20, 2023

@chetanmeh
I marked this PR as stale to close soon due to long inactivity.
Please let me know if you would work on it again.

style95 closed this May 31, 2023