Skip to content

hpcclab/OaaS-Tutorial

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

OaaS Tutorial

Access to code

We created the tutorial so that the audience can follow it step-by-step. If something go wrong, you can check the complete version using git checkout.

git checkout steps-4
# OR
git checkout steps-5

Prerequisites

Required tools

  • Git

  • Docker

  • ocli

  • Python 3

  • Oparaca Cluster (optional for the last section)

Clone the tutorial project

git clone https://github.com/hpcclab/OaaS-Tutorial.git
cd oaas-tutorial

1. Basic ocli operations

In this section, we will guide you on how to use ocli for dev mode.

ocli -h
ocli dev -h
Note
Use -h to check the command reference.

1.1 Create pkg.yml

First, we create pkg.yml file that define the class demo.

pkg.yaml
name: tutorial # (1)
classes:
  - name: demo # (2)
    parents: ['builtin.base'] # (3)
  1. Set the name of this package to tutorial

  2. Set the class name to demo

  3. Each class in Oparaca can inherit the behavior from the parent class. In this case, we make demo inherit builtin.base, which is the base class for any class that should be inherited.

1.2 Apply package to ocli in dev mode

To use pkg.yml, we need to use the below command to apply the package. ocli will persist this package definition to the local state ($HOME/.oprc/local).

ocli dev pa pkg.yml

We can also check what packages already persist in the local state.

ocli dev cl

1.3 Create object

With the current package definition, even though it has no custom code, we can still use some basic operations such as new and get

ocli dev oc tutorial.demo -s
# with '-s' option, ocli will save the object ID for the later command.
ocli dev i

ocli dev oc tutorial.demo -s -d '{"hello": "world"}'
# we can create object with some JSON data as well

When creating the object, ocli will save the object data into a local file with the NDJSON format (Newline Delimited JSON).

ls $HOME/.oprc/local/
cat $HOME/.oprc/local/tutorial.demo.ndjson

1.4 Add function to class

Let’s try to add more built-in functions. We can add builtin.update to our demo class.

pkg.yaml
name: tutorial
classes:
  - name: demo
    parents: ['builtin.base']
    #### START
    functions:
      - name: update
        function: builtin.update
    #### END

Reapply the package definition to make it effective.

ocli dev pa pkg.yml

Try invoke update function

ocli dev i update -d '{"foo": "bar"}'

You may realize that hello=world is gone. This is expected because the update function will completely replace the data. To merge with the original data, we can add option merge=true to update function.

pkg.yaml
name: tutorial
classes:
  - name: demo
    parents: ['builtin.base']
    functions:
      - name: update
        function: builtin.update
        #### START
        override:
          merge: true
        #### END
Note
Don’t forget to reapply!!!

Now the data should be merged properly.

ocli dev oc tutorial.demo -s -d '{"hello": "world"}'
ocli dev i update -d '{"foo": "bar"}'

1.5 Add key to unstructured state

Oparaca can work with unstructured state (BLOB). However, it is required to bep redefined in class definition.

pkg.yaml
name: tutorial
classes:
  - name: demo
    parents: ['builtin.base']
    functions:
      - name: update
        function: builtin.update
        override:
          merge: true
    #### START
    stateSpec:
      keySpecs:
        - name: image
    #### END
Note
Don’t forget to reapply!!!

Because ocli don’t emulate the object storage, we have to create by ourselves. We have to run the below docker command to create minio for object storage.

docker run -d -p 9000:9000 -p 9001:9001 -e MINIO_ROOT_USER=admin -e MINIO_ROOT_PASSWORD=changethis -e MINIO_DEFAULT_BUCKETS=oaas-bkt -e MINIO_API_CORS_ALLOW_ORIGIN=* --name="minio" bitnami/minio

#### to clean up
# docker stop minio
# docker rm minio

Now, we should be able to create the object with the file. We can use -f <key>=<path-to-file> to upload the file.

ocli dev oc tutorial.demo -s -f image=images/sol.png

Try load image back

ocli dev of image out.png

2. Add the custom function

The example image is big. Let’s try to resize it with a custom function. We already have the image-resizing on our main OaaS repository. So, we can use it here.

To simplify the process, we can use the docker to run the function container.

docker run -d --network host --name="img-resize-fn-py" ghcr.io/hpcclab/oaas/img-resize-fn-py:latest

#### to clean up
# docker stop img-resize-fn-py && docker rm img-resize-fn-py
Note
--network host is important. It allows the container to access minio container with localhost.

We also need to update our package definition.

pkg.yaml
name: tutorial
classes:
  - name: image # CHANGE THIS (1)
    parents: ['builtin.base']
    functions:
      - name: update
        function: builtin.update
        override:
          merge: true
      - name: resize # (2)
        function: .resize # (3)
    stateSpec:
      keySpecs:
        - name: image
functions:
  - name: resize # (4)
  1. We change the class name to image to be more meaningful.

  2. Adding resize function to our class.

  3. Link the class function to the actual resize function. Prefix . will be substituted with package name (tutorial.resize).

  4. It is the new function. We need to add to the function section too. Since we are in dev mode, other configuration parameters are not needed.

Note
Don’t forget to reapply!!!

Now, we can try to use this function.

ocli dev oc tutorial.image -s -f image=images/sol.png
ocli dev i resize --args ratio=0.5
ocli dev of image out.png

Now, you can see that the size of out.png is reduced by half.

3. Implement the custom function

3.1 Clone template project

Clone the template project

git clone --depth 1 https://github.com/pawissanutt/oprc-func-py-template.git bg-remover
cd bg-remover
rm -rf .git

3.2 Setup virtual environment

Note
You may skip this step if your IDE does it for you.

Create a virtual environment.

python -m venv venv

Activate a virtual environment.

# For powershell
./venv/Scripts/activate
# For bash
source venv/Scripts/activate

+ NOTE: If you did it correctly, you should see (venv) at the beginning of your terminal.

Open bg-remover/requirements.txt to add rembg[cpu] (The Python library for removing background from image). We then have to install the dependencies.

pip install -r requirements.txt

3.3 Implement the logic

main.py
import logging
import os
from io import BytesIO

import aiohttp
import oaas_sdk_py as oaas
import uvicorn
from PIL import Image
from fastapi import Request, FastAPI, HTTPException
from oaas_sdk_py import OaasInvocationCtx
from rembg import remove

LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
level = logging.getLevelName(LOG_LEVEL)
logging.basicConfig(level=level)

IMAGE_KEY = "image"

class RemoveBackgroundHandler(oaas.Handler):  #(1)
    async def handle(self, ctx: OaasInvocationCtx):
        async with aiohttp.ClientSession() as session:
            async with await ctx.load_main_file(session,  IMAGE_KEY) as resp: #(2)
                image_bytes = await resp.read()  #(3)
                with Image.open(BytesIO(image_bytes)) as img:
                    output_image = remove(img) #(4)
                    byte_io = BytesIO()
                    output_image.save(byte_io, format=img.format)
                    resized_image_bytes = byte_io.getvalue()
                    await ctx.upload_byte_data(session, IMAGE_KEY, resized_image_bytes) #(5)


app = FastAPI()
router = oaas.Router()
router.register(RemoveBackgroundHandler())
  1. Create a new handler class

  2. Load image content from object storage

  3. Read image content into byte array

  4. Use remove function from rembg library to remove image background.

  5. Upload the image content back to object storage

Because we already use port 8080, we have to change port for this function.

uvicorn.run(app, host="0.0.0.0", port=8081)

We also need to update package definition as well.

pkg.yaml
name: tutorial
classes:
  - name: image
    parents: ['builtin.base']
    functions:
      - name: update
        function: builtin.update
        override:
          merge: true
      - name: resize
        function: .resize
      - name: bg-remove # (1)
        function: .bg-remove
    stateSpec:
      keySpecs:
        - name: image
functions:
  - name: resize
    config:
      staticUrl: http://localhost:8080 # (2)
  - name: bg-remove
    config:
      staticUrl: http://localhost:8081
  1. Add bg-remove function

  2. Set the URL of function server

Note
Don’t forget to reapply!!!

Then, open another terminal to run this function.

python main.py
Note
Don’t forget to activate venv if needed.

Now, we can try to use this function via ocli

ocli dev oc tutorial.image -s -f image=images/sol.png
ocli dev i bg-remove
ocli dev of image out.png

3.4 Immutable invocation

Sometime, we may want to keep the old image. So, invoking the function should not modify the old image. Oparaca provide support to this requirement, but the function code need to be awareness of this as well. In this case, you modify some code.

main.py
class RemoveBackgroundHandler(oaas.Handler):
    async def handle(self, ctx: OaasInvocationCtx):
        inplace = ctx.task.output_obj is None or ctx.task.output_obj.id is None # (1)
        async with aiohttp.ClientSession() as session:
            async with await ctx.load_main_file(session, IMAGE_KEY) as resp:
                image_bytes = await resp.read()
                with Image.open(BytesIO(image_bytes)) as img:
                    output_image = remove(img)
                    byte_io = BytesIO()
                    output_image.save(byte_io, format=img.format)
                    resized_image_bytes = byte_io.getvalue()
                    if inplace: # (2)
                        await ctx.upload_main_byte_data(session, IMAGE_KEY, resized_image_bytes)
                    else:
                        await ctx.upload_byte_data(session, IMAGE_KEY, resized_image_bytes)
  1. Check if Oparaca generate the output ID or not. If it does, mean it imply immutable invocation.

  2. Update image content to the output object.

pkg.yaml
classes:
  - name: image
    functions:
      #### START
      - name: resize
        function: .resize
        outputCls: .image
        immutable: true
      - name: resize-inplace
        function: .resize
      - name: bg-remove
        function: .bg-remove
        outputCls: .image
        immutable: true
      - name: bg-remove-inplace
        function: .bg-remove
      #### END
Note
Don’t forget to reapply!!!

We add 2 functions with prefix -inplace to make the function update the main object directly. For 2 old functions, we modify them to make them become immutable functions.

Now, we can try them.

ocli dev oc tutorial.image -s -f image=images/sol.png
ocli dev i bg-remove
ocli dev of image out.png

Now, out.png is the same as the original image. To get the output image, we need to add -s to invoke the command to save the output ID.

ocli dev oc tutorial.image -s -f image=images/sol.png
ocli dev i -s bg-remove
ocli dev of image out.png

4. Dataflow

Oparaca has support building the workflow in the form of dataflow. The feature enables us to run multiple functions as one function. For example, in this tutorial, we want to run both resize and bg-remove functions as one function.

4.1 Basic MACRO function

pkg.yaml
name: tutorial
classes:
  - name: image
    parents: ['builtin.base']
    functions:
      - {...}
      - nam: transform #(1)
        function: .transform
    stateSpec:
      keySpecs:
        - name: image

functions:
  - {...}
  - name: transform # (2)
    type: MACRO
    macro:
      steps: (3)
        # var out1 = self.resize(ratio=$args.ratio)
        - target: '@'
          as: out1
          function: resize
          args:
            ratio: ${@|args|ratio}
        # var out2 = out1.bg-remove()
        - target: out1
          as: out2
          function: bg-remove
      # return out2
      output: out2 (4)
  1. Create a transform function binding to image class and link to function definition below.

  2. Create a transform function with MACRO type.

  3. Create 2 steps (resize and bg-remove) for this function.

  4. Specify the return object for this function

Note
Don’t forget to reapply!!!
ocli dev oc tutorial.image -s -f image=images/sol.png
ocli dev i -s transform
ocli dev of image out.png

We can see that the output image is not only resized but also has removed the background as well.

Nested MACRO function

The MACRO function in Oparaca is still a function. We can creae another haso function to invoke this function. In this tutorial, we want to have one function that creates multiple images with different sizes and backgrounds removed.

pkg.yaml
name: tutorial
classes:
  - name: image
    parents: ['builtin.base']
    functions:
      - {...}
      - name: split
        function: .split-transform
    stateSpec:
      keySpecs:
        - name: image

functions:
  - {...}
  - name: split-transform
    type: MACRO
    macro:
      steps:
        # var small = self.transform(ratio=0.1)
        - target: '@'
          as: small
          function: transform
          args:
            ratio: 0.1
        # var medium = self.transform(ratio=0.3)
        - target: '@'
          as: medium
          function: transform
          args:
            ratio: 0.3
        # var big = self.transform(ratio=0.5)
        - target: '@'
          as: big
          function: transform
          args:
            ratio: 0.5
Note
Don’t forget to reapply!!!
ocli dev oc tutorial.image -s -f image=images/sol.png
ocli dev i split
ocli dev of -m <id> out.png

When invoking split function, ocli calls transform function 3 times. Each of them also calls function resize and bg-remove. The total execution time is lower than calling these functions one by one because Oparaca try to execute them concurrently.

5. Cluster deployment

5.1 Prerequisite

To begin with step 5, we need an Oparaca cluster. We can create one in a local PC with this guide

5.2 Deploy image class

To deploy this sample application to the cluster environment, we need to build the image and push it to the container registry. To simplify this process, we will use the pre-built image.

First, we need to update pkg.yml by adding the container image to each function.

pkg.yml
functions:
  - name: resize
    config:
      staticUrl: http://localhost:8080
    provision:
      knative:
        image: ghcr.io/pawissanutt/oaas/img-resize-fn-py:latest
  - name: bg-remove
    config:
      staticUrl: http://localhost:8081
    provision:
      knative:
        image: ghcr.io/pawissanutt/oaas/img-rembg-fn-py:latest

Then, try applying pkg.yml to the cluster. This time, we will not need to use dev mode anymore.

ocli p a pkg.yml

We can now check on Kubernetes.

kubectl get pod -n oaas -l cr-id

We can see three pods are created. Two of them are functions resize and bg-remove that are powered by Knative. When there is no request for a certain period of time, these pods will be removed.

Now, we can try on function invocation by running the below commands. They should work in the same as dev mode.

ocli o c -s tutorial.image -f image=images/sol.png
ocli i -s transform --args ratio=0.1
ocli o f image out.png

ocli o c -s tutorial.image -f image=images/sol.png
ocli i split

Now, we realize that bg-remove function require a lot of CPU resource. It is not good if multiple requests come to the same pod. To prevent this, we can add concurrency=1.

pkg.yml
functions:
  - name: bg-remove
    config:
      staticUrl: http://localhost:8081
    provision:
      knative:
        concurrency: 1
        image: ghcr.io/pawissanutt/oaas/img-rembg-fn-py:latest

To see how it works, we have to monitor the pod by using kubectl in another terminal session.

kubectl get pod -n oaas -l cr-id

Then, try to invoke the split workflow again

ocli p a pkg.yml
# wait for some seconds
ocli o c -s tutorial.image -f image=images/sol.png
ocli i split

Now you can see that Knative creates multiple new pods to handle the request concurrently.

6. Video recording of tutorial

Video recording of this tutorial is available on YouTube: https://youtu.be/vXqO50jsCjM

7. Publication and citation

If you use the materials of this tutorial, please cite its related publication as follows:

Pawissanutt Lertpongrujikorn and Mohsen Amini Salehi, "Tutorial: Object as a Service (OaaS) Serverless Cloud Computing Paradigm", in Proceedings of the 44th IEEE International Conference on Distributed Computing Systems Workshops (ICDCSW '24), Jersey City, USA, Jul. 2024

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors