- Request Flow
- Request Methods
- Request Payloads
- Component Contracts
- Request Management
- State Management
ML Graph adds two specific operations: Route and Join. Users can provide custom implementations for these.
An example graph:
One prediction request could flow through this graph as follows:
Another prediction request could take the following route:
For each node in the graph, the request flow (and possible Knative integration) is shown below:
Each node will define at least one of:
- Join: ensemble incoming requests
- Predict/Transform: call a prediction server or transformation service
- Route: route to some subset of connected nodes
- Predict: Get a prediction from the graph
  - Example: `http://endpoint/predict`
- Reward: Send a reward for a previous prediction
  - Example: `http://endpoint/reward`
- Explain: Explain a particular request
  - Example: `http://endpoint/explain`
- Outlier: Check for outliers
  - Example: `http://endpoint/outlier`
- Skew: Check for skew/concept drift
  - Example: `http://endpoint/skew`
- Bias: Check for bias
  - Example: `http://endpoint/bias`
- Request: TensorFlow or Seldon request payloads until we unify on a single MLGraph payload.
- Response: TensorFlow or Seldon response payloads until we unify on a single MLGraph payload.
- Request: A payload providing a reward for a previous prediction.
- Response: Success notification
A new request payload will need to be created as there is no standard for this. An example, sending a reward of 1 for a previous prediction with prediction UID 1234:

```json
{
  "puid": 1234,
  "reward": 1
}
```

- The reward needs to hit the graph nodes that the given request `1234` travelled through.
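A reward call could then be issued as a plain HTTP POST of this payload. A minimal sketch, assuming the JSON body above and the `/reward` endpoint; the helper names here are illustrative, not part of any defined MLGraph API:

```python
import json
import urllib.request


def make_reward_payload(puid, reward):
    """Build the reward payload for a previous prediction."""
    return {"puid": puid, "reward": reward}


def send_reward(endpoint, puid, reward):
    """POST a reward payload to the graph's reward endpoint."""
    body = json.dumps(make_reward_payload(puid, reward)).encode("utf-8")
    req = urllib.request.Request(
        endpoint + "/reward",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)


# Example: send a reward of 1 for prediction UID 1234
# send_reward("http://endpoint", 1234, 1)
```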
- Request: TensorFlow or Seldon request payloads until we unify on a single MLGraph payload.
- Response: Explanation-method-specific response payload: TBD
- Request: TensorFlow or Seldon request payloads until we unify on a single MLGraph payload.
- Response: Outlier response: TBD
- Request: empty request
- Response: Current skew estimate: TBD
- Request: empty request
- Response: Current bias estimate: TBD
A user should provide a server that:
For predict calls:
- Finds the child nodes from a provided environment variable `MLGRAPH_CHILDREN`
- Returns a response, with a possibly modified payload, with the header `mlgraph/route` set to the list of child node names the request should be routed to.
For reward calls:
- Handles the reward for the given `puid` and returns an empty reply with success/failure
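The routing contract above can be sketched with a couple of functions. This is a minimal sketch, assuming `MLGRAPH_CHILDREN` is a comma-separated list; the random single-child policy and function names are illustrative, a real router would apply custom logic (e.g. a multi-armed bandit):

```python
import os
import random


def routable_children():
    """Read the child node names from the MLGRAPH_CHILDREN
    environment variable (assumed comma-separated)."""
    return os.environ.get("MLGRAPH_CHILDREN", "").split(",")


def route(children, payload):
    """Pick the subset of children the request should go to and
    return the mlgraph/route header plus the (possibly modified)
    payload. Here: one child chosen at random."""
    chosen = [random.choice(children)]
    headers = {"mlgraph/route": ",".join(chosen)}
    return headers, payload


os.environ["MLGRAPH_CHILDREN"] = "model-a,model-b"  # illustrative
headers, _ = route(routable_children(), {"data": [1.0, 2.0]})
```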
A user should provide a server that:
For predict calls:
- Receives requests and eventually returns an aggregated request.
- Upon receiving a request, if the merge should not be carried out then an empty reply should be returned.
- Each request will contain a header `mlgraph/pending-requests`, which provides an upper bound on the number of requests that may still arrive for this transaction. It is only an upper bound because other requests in this transaction may not have passed through all routing elements in the graph, so it is uncertain whether all routes to this node will actually be taken.
For reward calls:
- Handles the reward for the given `puid` and returns an empty reply with success/failure
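The join contract can be sketched as follows. A minimal sketch: payloads are accumulated per prediction UID and merged once the `mlgraph/pending-requests` bound is reached, with `None` standing in for the empty reply. The class name and element-wise averaging are illustrative; since `mlgraph/pending-requests` is only an upper bound, a real joiner would also merge after a timeout:

```python
class Joiner:
    """Accumulate requests per prediction UID and merge them once the
    mlgraph/pending-requests upper bound is reached."""

    def __init__(self):
        self.pending = {}  # puid -> list of payloads received so far

    def receive(self, puid, payload, pending_requests):
        batch = self.pending.setdefault(puid, [])
        batch.append(payload)
        if len(batch) < pending_requests:
            return None  # empty reply: join not carried out yet
        del self.pending[puid]
        # Illustrative ensemble: average model outputs element-wise.
        return [sum(vals) / len(vals) for vals in zip(*batch)]


joiner = Joiner()
first = joiner.receive("1234", [1.0, 3.0], pending_requests=2)
merged = joiner.receive("1234", [3.0, 5.0], pending_requests=2)
```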
Current protocols such as Seldon's use the same payload for requests and responses, so components can always be connected into a graph. For other protocols such as TensorFlow's, the prediction request payload differs from the prediction response, so whether nodes in a graph can be connected would need verification.
For predict calls:
- Receive a predict request and send an explanation response to a reporting endpoint
For explain calls:
- Receive a predict payload and generate a synchronous explanation response
For predict calls:
- Receive a predict request and send an outlier response to a reporting endpoint
For outlier calls:
- Receive a predict payload and generate a synchronous outlier response
TBD
The MLGraph data plane will need components to add the required headers for routing and joining, in particular:

- `mlgraph/routable-nodes`: added to allow custom routing to know which child nodes can be routed to
- `mlgraph/pending-requests`: added to allow a custom joiner to know when it can carry out a join on all payloads
  - After a routing operation, routes NOT taken decrease the pending-request count for future join operations
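The header bookkeeping after a routing operation can be sketched as a single function. This is an illustrative sketch of the accounting, assuming each route not taken removes exactly one expected request from the downstream join's upper bound; the function name is hypothetical:

```python
def pending_after_route(pending_requests, routable_nodes, routed_nodes):
    """Lower the pending-requests upper bound after a routing step:
    each routable child NOT routed to is one request that will never
    reach the downstream join."""
    not_taken = len(routable_nodes) - len(routed_nodes)
    return pending_requests - not_taken


# A router with three routable children that forwards to only one
# lowers the downstream pending-requests bound from 3 to 1.
new_bound = pending_after_route(3, ["a", "b", "c"], ["a"])
```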
Options:
- Add sidecars to routing and join nodes to pre/postprocess the request/responses.
- Add separate KNative services to pre/postprocess request/responses
Either method will need to add headers to the request so that future components can calculate the `mlgraph/pending-requests` value for each node.
State management is needed for several reasons:
- Stateful routers/models, e.g. running online models such as multi-armed bandit solvers that need to keep current state to decide which route a new request will take
- Reward API calls that need to be sent to nodes a previous request travelled through
- Scaling up and down of above stateful models means new replicas need:
- warm start to ensure state does not need to be recreated
- state sharing in multi replicas scenarios to ensure (if needed) that all models are in sync
The calls that might be needed by components are:
- Get (previous) request payload with this prediction ID at this node.
- Get state for my node
- Push my state to reconcile with global state for my node (and return new reconciled global state)
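The three calls above can be sketched as a small state interface. A minimal sketch backed by in-memory dicts (a real deployment might use Redis); the class and method names are illustrative, not a defined MLGraph API, and the merge-based reconciliation is a placeholder for model-specific logic (e.g. summing bandit pull counts):

```python
class NodeState:
    """Sketch of the state calls a graph node might need."""

    def __init__(self):
        self.requests = {}  # puid -> request payload seen at this node
        self.state = {}     # node name -> node state

    def get_request(self, puid):
        """Get the (previous) request payload with this prediction ID."""
        return self.requests.get(puid)

    def get_state(self, node):
        """Get the current state for a node."""
        return self.state.get(node, {})

    def reconcile_state(self, node, local):
        """Push local state and return the new reconciled global state.
        Here reconciliation is a plain dict merge; real models would
        combine counts or parameters appropriately."""
        merged = {**self.state.get(node, {}), **local}
        self.state[node] = merged
        return merged


store = NodeState()
store.requests["1234"] = {"data": [1.0]}
reconciled = store.reconcile_state("router", {"arm_a_pulls": 10})
```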
One option is to provide a surrounding management layer to handle state for the nodes:
A simpler option would be to provide a default state service, such as Redis, and supply the details for users to connect to it via environment variables. Users would then be responsible for managing their own state.
Stateful serverless has been discussed recently:
- Akka proposal
- See here for a critique of using simpler CRUD.




