From 3c6c25c7e236f42bb016b100c2176d761e3a4028 Mon Sep 17 00:00:00 2001
From: Akihiro Suda
Date: Mon, 13 Nov 2017 09:49:15 +0000
Subject: [PATCH] docs: add design, roadmap, bof notes

Signed-off-by: Akihiro Suda
---
 README.md                            |  8 +++--
 docs/misc/design-distributed-mode.md | 28 ++++++++++++++++
 docs/roadmap.md                      | 49 ++++++++++++++++++++++++++++
 3 files changed, 83 insertions(+), 2 deletions(-)
 create mode 100644 docs/misc/design-distributed-mode.md
 create mode 100644 docs/roadmap.md

diff --git a/README.md b/README.md
index bfb13355d624..43d9ed1a2934 100644
--- a/README.md
+++ b/README.md
@@ -25,8 +25,6 @@ Key features:
 - Pluggable architecture
 
 
-Read the proposal from https://github.com/moby/moby/issues/32925
-
 #### Quick start
 
 BuildKit daemon can be built in two different versions: one that uses [containerd](https://github.com/containerd/containerd) for execution and distribution, and a standalone version that doesn't have other dependencies apart from [runc](https://github.com/opencontainers/runc). We are open for adding more backends. `buildd` is a CLI utility for serving the gRPC API.
@@ -162,3 +160,9 @@ Validating your updates before submission:
 ```bash
 make validate-all
 ```
+
+#### Documents
+- https://github.com/moby/moby/issues/32925: Original proposal
+- ['docs/roadmap.md'](./docs/roadmap.md): roadmap (tentative)
+- ['docs/misc'](./docs/misc): miscellaneous unformatted documents
+- https://drive.google.com/drive/folders/1KpNIqmiAGU1UAddscscZFMqlpvLqcQYM?usp=sharing : BoF notes
diff --git a/docs/misc/design-distributed-mode.md b/docs/misc/design-distributed-mode.md
new file mode 100644
index 000000000000..3fe29bf86542
--- /dev/null
+++ b/docs/misc/design-distributed-mode.md
@@ -0,0 +1,28 @@
+# Design: distributed mode (work in progress)
+
+## master
+- Stateless.
+-- No etcd.
+-- The orchestrator (Kubernetes) is expected to restart the master container on failure.
+
+## worker
+- Workers could be started by passing the master host information.
+-- A worker connects to the master and tells it its workerID.
+-- TODO: how to connect to the master with multiple container replicas?
+
+## scheduling
+- The master asks the workers: "are you able to reproduce the cachekey for this operation?", and the master becomes aware of `map[op][]workerID`.
+-- This map does not need to be 100% accurate, so we have many opportunities for optimization.
+-- The master could ask for cpu/mem/io stats as well and use such information for better scheduling.
+- The master schedules a vertex job to the worker that is likely to hold caches for the largest number of the vertices it depends on.
+-- If no worker has any of them, choose one at random.
+
+## scalability
+- We may be able to use gossip (or something like that) to improve the scalability of the cache map.
+
+## credential
+- Because the worker can need credentials at any time and has no open session with the client, it needs to ask the master, which does have an open session with the client.
+
+## local file
+- We need to find the same worker that already received the content from the client the first time.
+- OR always sync to the manager: the manager can do the work of a worker (with the caveat of constraints)
diff --git a/docs/roadmap.md b/docs/roadmap.md
new file mode 100644
index 000000000000..167301efe699
--- /dev/null
+++ b/docs/roadmap.md
@@ -0,0 +1,49 @@
+# BuildKit Roadmap (tentative)
+
+This document roughly describes the roadmap of the BuildKit project.
+We will be using GitHub Projects and GitHub Milestones for a more detailed roadmap.
+
+## Task 1 (2018Q1)
+- Implement all features needed for replacing the legacy builder backend of moby-engine
+
+- Test, test, and test (help needed!)
+
+- Integrate BuildKit into moby-engine as an experimental builder backend. (`moby-engine --experimental --build-driver buildkit`)
+-- Probably for Linux only
+
+## Task 2 (2018Q2-Q3)
+
+- Promote BuildKit to the default moby-engine build backend
+-- At this time, the BuildKit API and LLB spec do not need to be stabilized
+
+## Task 3 (2018Q1-Q2)
+
+- Implement basic distributed mode
+
+## Task 4 (2018-2019)
+
+- Stabilize the BuildKit API and LLB spec
+
+## Task 5 (2018-2019)
+- Optimize distributed mode, especially the scalability of the distributed cache map (gossip protocol or something like that?)
+
+## Task 6+
+- Stabilize distributed mode
+- Add more features
+
+- - -
+
+TODO: DAG-ify these roadmap tasks in a prettier format
+
+
+
+
+```
+1 ----->2
+        |
+        v
+3 ----->4---+
+ \          |
+  \         v
+   +--->5-->6..
+```