Move batching to CouchDbRestStore#2835
Conversation
```scala
response.convertTo[Seq[BulkEntityResult]].map { singleResult =>
  singleResult.error
    .map {
      case "conflict" => Left(DocumentConflictException("conflict on 'bulk_put'"))
```
If there is one conflicting entity in the bulk request, do all the entities fail? If so, is it possible to remove that conflict and retry?
Nope, only the specific entity fails. The behavior is then just like with any single request today.
No, all other entries will have no errors. So there is no need to retry if only one document has a conflict.
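To make the per-entity behavior concrete, here is a minimal, self-contained sketch of the error mapping shown in the diff above. `BulkEntityResult`, `DocInfo`, and the exception types are simplified stand-ins for the classes in the PR; only the mapping logic is illustrated.

```scala
// Simplified stand-ins for the PR's types (assumptions, not the real classes).
case class DocInfo(id: String, rev: String)
case class BulkEntityResult(id: String, rev: Option[String], error: Option[String])

sealed abstract class ArtifactStoreException(msg: String) extends Exception(msg)
case class DocumentConflictException(msg: String) extends ArtifactStoreException(msg)
case class PutException(msg: String) extends ArtifactStoreException(msg)

// Each entry in the bulk response is mapped independently: a conflict on one
// document does not fail the others, so there is nothing to retry for them.
def toResults(response: Seq[BulkEntityResult]): Seq[Either[ArtifactStoreException, DocInfo]] =
  response.map { singleResult =>
    singleResult.error
      .map {
        case "conflict" => Left(DocumentConflictException("conflict on 'bulk_put'"))
        case e          => Left(PutException(s"'bulk_put' failed: $e"))
      }
      .getOrElse(Right(DocInfo(singleResult.id, singleResult.rev.getOrElse(""))))
  }

val response = Seq(
  BulkEntityResult("doc1", Some("1-abc"), None),
  BulkEntityResult("doc2", None, Some("conflict")),
  BulkEntityResult("doc3", Some("1-def"), None))

// Only doc2 yields a Left; doc1 and doc3 succeed independently.
val results = toResults(response)
```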
```scala
// This is the amount of allowed parallel requests for each entity, before batching starts. If there are already maxOpenDbRequests
// open and more documents need to be stored, then all arriving documents will be put into batches (if enabled) to avoid a long queue.
private val maxOpenDbRequests = system.settings.config.getInt("akka.http.host-connection-pool.max-open-requests") / 2
```
Why is maxOpenDbRequests half of max-open-requests?
To prevent other puts/reads from contending with activation puts.
Due to the current design, we have a datastore (and a batcher) for each entity. That means in the controller we have 5 batchers. All of them share the 128 connections that are configured in akka (max-open-requests).
If the controller has to write a lot of activations now, only 64 of these connections will be used, because maxOpenDbRequests is set to half of max-open-requests.
So if someone else comes and wants to invoke an action that is not in the cache, they do not have to wait until all the activations are written into the database.
In the invoker this also means that writing activations will not affect getting actions from the database.
In addition, @markusthoemmes and I did some performance measurements, and there was no big difference between using 64 or 128 requests in parallel to write away activations.
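The connection budget described above can be sketched in a few lines. The value 128 is the akka-http default for `max-open-requests` and is an assumption here; the actual value comes from the deployment's `akka.http.host-connection-pool.max-open-requests` setting.

```scala
// Assumed akka-http default; the real value is read from
// akka.http.host-connection-pool.max-open-requests.
val maxOpenRequests = 128

// Half of the pool is reserved for (possibly batched) activation writes.
val maxOpenDbRequests = maxOpenRequests / 2

// The other half stays available for reads and single puts, e.g. fetching an
// uncached action while activations are being flushed.
val remainingForOtherRequests = maxOpenRequests - maxOpenDbRequests
```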
```scala
private val maxOpenDbRequests = system.settings.config.getInt("akka.http.host-connection-pool.max-open-requests") / 2

private val batcher: Batcher[JsObject, Either[ArtifactStoreException, DocInfo]] =
  new Batcher(500, maxOpenDbRequests)(put(_)(TransactionId.unknown))
```
Seems like the batch size should be configurable instead of hardcoded.
Why do you think it should be configurable? Do you have a scenario in mind where you would want to configure this size?
Btw: it is only the maximum allowed batch size.
When I was chatting with Cloudant developers a while ago, we were talking about such bulk requests. The scenario was a bit different but very similar (writing documents as fast as possible into a database with _bulk). This person suggested a batch size of 500.
Was thinking the value might change per deployment. Should at least add a comment stating why it's set to 500 here.
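Since 500 is only an upper bound, the effect of the limit is easy to illustrate. The real `Batcher` in the PR is stream-based; this is a simplified, self-contained sketch in which plain grouping stands in for the queue, just to show how queued documents are split.

```scala
// Upper bound on batch size; smaller batches are flushed as-is.
// (Simplified stand-in for the Batcher's maximum batch size of 500.)
val maxBatchSize = 500

// Split whatever has queued up into batches of at most maxBatchSize.
def toBatches[T](queued: Seq[T]): Seq[Seq[T]] =
  queued.grouped(maxBatchSize).toSeq

val docs = (1 to 1201).toSeq
val batches = toBatches(docs)
// 1201 queued documents become three batches: 500, 500 and 201 documents.
```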
```diff
-  def datastore(config: WhiskConfig)(implicit system: ActorSystem, logging: Logging) =
+  def datastore(config: WhiskConfig)(implicit system: ActorSystem, logging: Logging, materializer: ActorMaterializer) =
     SpiLoader.get[ArtifactStoreProvider].makeStore[WhiskActivation](config, _.dbActivations)
```
Why not batch WhiskEntityStore as well?
PG2 2248 is all good.
Move the mechanism of batching activations into the CouchDbRestStore, so that it is also in place in the Controller.
Currently batching is only enabled for activations.
Depends on #2812.