
Async techniques in Python

Notes on the course from TalkPython Training.

General

  • A process is an independent instance of a program execution with its own memory and resources. Processes contain at least one thread and are isolated from each other. Creating a process is heavyweight
  • A thread is a lightweight execution unit within a process. Threads share the process's memory and resources
  • Use processes for CPU-intensive work (do things faster) and threads or asyncio for IO-intensive work (do more at once)
  • Use asyncio as a lightweight approach to async development when you work with libraries supporting it -> Use async when you can, threads when you must!
  • From Python 3.14 onwards, the GIL can be disabled to enable free threading, so threads can be used both to do things faster and to do more at once
  • If optimizing for performance, always estimate the upper bound of the possible improvement before implementing logic asynchronously
  • Consider alternative event loop implementations (e.g. uvloop) if dealing with lots of events
  • As more recent Python versions have added features (e.g. task groups) to asyncio, we rarely need external libraries such as trio
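The "upper bound of improvement" bullet can be made concrete with Amdahl's law: if only a fraction p of the runtime can run concurrently, n workers can never yield a speedup greater than 1 / ((1 - p) + p / n). A small sketch (max_speedup is a made-up helper name, not from the course):

```python
def max_speedup(p: float, n: int) -> float:
    """Amdahl's law: upper bound on speedup if fraction p of the
    runtime is parallelized across n workers."""
    return 1 / ((1 - p) + p / n)

if __name__ == "__main__":
    # If only 60% of the work is parallelizable, even a huge number
    # of workers cannot beat 1 / 0.4 = 2.5x overall speedup.
    print(max_speedup(0.6, 4))
    print(max_speedup(0.6, 10_000))  # approaches the 2.5x ceiling
```

This is why it pays to check the serial fraction of a program before investing in threads or processes.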

AsyncIO

  • The event loop orchestrates tasks
  • Prefixing a function with async makes it an async or coroutine function
  • Calling an async function returns a coroutine object (~ coroutine)
  • Coroutines are built on generators, which can be regarded as restartable functions
  • Tasks are coroutines tied to an event loop
  • We create tasks via asyncio.create_task(<FUNC()>)
  • Consider using task groups if you don't want to manually create/track tasks (alternatively look into gathering tasks)
  • We run coroutines via asyncio.run(<FUNC()>)
  • Note: Awaiting a bare coroutine does not hand control back to the event loop. Wrapping the coroutine in a task first and then awaiting it does
  • We can run blocking code in a dedicated thread using asyncio.to_thread(<FUNC_NAME>)
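The bullets above can be sketched in one small program (greet and main are illustrative names; the delays are arbitrary):

```python
import asyncio

async def greet(name: str, delay: float) -> str:
    # asyncio.sleep stands in for real async I/O; awaiting it
    # suspends this coroutine and hands control to the event loop
    await asyncio.sleep(delay)
    return f"hello {name}"

async def main() -> list:
    # create_task schedules the coroutines concurrently on the loop
    t1 = asyncio.create_task(greet("alice", 0.02))
    t2 = asyncio.create_task(greet("bob", 0.01))
    # blocking code can be pushed off the loop into a thread
    upper = await asyncio.to_thread(str.upper, "done")
    return [await t1, await t2, upper]

results = asyncio.run(main())
print(results)
```

On Python 3.11+ the two create_task calls could instead live inside an `async with asyncio.TaskGroup()` block, which tracks and awaits the tasks for you.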

Web Requests

  • Use async http clients like aiohttp or httpx to make requests
  • Consider using aiofiles to handle local files in async applications
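The typical fan-out pattern looks like the sketch below. To keep it self-contained, fetch() simulates the network round trip with asyncio.sleep; in a real application its body would be a call to an async HTTP client such as httpx.AsyncClient or aiohttp.ClientSession:

```python
import asyncio

async def fetch(url: str) -> str:
    # placeholder for: async with httpx.AsyncClient() as client: ...
    await asyncio.sleep(0.01)  # simulated HTTP round trip
    return f"body of {url}"

async def fetch_all(urls: list) -> list:
    # gather runs all requests concurrently instead of sequentially
    return await asyncio.gather(*(fetch(u) for u in urls))

if __name__ == "__main__":
    print(asyncio.run(fetch_all(["https://a.example", "https://b.example"])))
```

Because the requests overlap, total wall time is roughly one round trip rather than the sum of all of them.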

Threads

  • Threads work with the fork/join pattern
  • We can create threads via t = threading.Thread(target=<NAME_OF_FUNCTION>, args=(), kwargs={}) and start them via t.start()
  • If daemon=True, the thread runs in the background and is shut down immediately (without any clean-up) when the main process exits
  • If we want to wait for all threads to finish, we need to join them via t.join()
  • We may use a dedicated cancellation thread to watch for a signal to cancel all active threads: while any worker thread is alive, join the workers with a small timeout, then check whether the cancellation thread is still alive
  • An alternative to using the threading module directly is to use ThreadPoolExecutor instead
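The fork/join pattern and the ThreadPoolExecutor alternative, side by side (worker, results, and squares are illustrative names):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

results = []
results_lock = threading.Lock()

def worker(n: int) -> None:
    with results_lock:  # guard the shared list
        results.append(n * n)

# fork: create and start the threads
threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
# join: wait for every thread to finish
for t in threads:
    t.join()

# the same work via ThreadPoolExecutor, which handles start/join for us
with ThreadPoolExecutor() as pool:
    squares = list(pool.map(lambda n: n * n, range(4)))
```

Note that `results` may fill in any order (the threads race), whereas `pool.map` returns results in input order.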

Thread Safety

  • Thread Safety is about avoiding temporary, invalid states in programs using threading
  • Use Lock per default
  • Consider a reentrant lock (RLock) only if the same thread needs to re-acquire the lock it already holds before releasing it
  • Using fine-grained locks often adds complexity and only yields minimal performance improvements
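A classic example of the "temporary, invalid state" problem: `counter += 1` is a read-modify-write, so concurrent threads can lose updates. A single Lock around the critical section fixes it (increment and counter are illustrative names):

```python
import threading

counter = 0
counter_lock = threading.Lock()

def increment(times: int) -> None:
    global counter
    for _ in range(times):
        # without the lock, two threads can read the same old value
        # and overwrite each other's increment
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock the final count is exactly 4 x 10,000; without it, runs on a free-threaded or GIL-switching interpreter can come up short.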

Multiprocessing

  • In order to use multiprocessing, first create a pool via multiprocessing.Pool()
  • Then create tasks via task = pool.apply_async(func=<FUNC_NAME>, args=(<ARGS>))
  • Finally, close the pool via pool.close() and join it via pool.join()
  • If we need results from tasks, retrieve them via task.get()
  • An alternative to using the multiprocessing module directly is to use ProcessPoolExecutor instead
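The pool workflow above, end to end. square and run_pool are illustrative names; the "fork" start method is pinned here only so the sketch behaves the same regardless of the platform default (fork is unavailable on Windows, where you would drop the get_context call):

```python
import multiprocessing

def square(x: int) -> int:
    # must be a top-level function so worker processes can import it
    return x * x

def run_pool(values: list) -> list:
    ctx = multiprocessing.get_context("fork")  # assumption: POSIX platform
    pool = ctx.Pool()
    # submit one task per value
    tasks = [pool.apply_async(func=square, args=(v,)) for v in values]
    pool.close()  # no further tasks will be submitted
    pool.join()   # wait for all workers to finish
    # retrieve the results from the AsyncResult handles
    return [t.get() for t in tasks]

if __name__ == "__main__":
    print(run_pool([1, 2, 3, 4]))
```

The `if __name__ == "__main__"` guard matters with the spawn/forkserver start methods, where child processes re-import the main module.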

Cython

  • Consider using Cython to speed up Python without actually writing C
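A minimal sketch of what that looks like (fib.pyx is a hypothetical module; it requires the cython package and a compile step via cythonize before it can be imported):

```cython
# fib.pyx -- compile with cythonize before importing
# The cdef static types let Cython generate a plain C loop,
# skipping Python object overhead in the hot path.
def fib(int n):
    cdef int a = 0, b = 1, i
    for i in range(n):
        a, b = b, a + b
    return a
```

The function body is still Python-like syntax; only the type annotations are Cython-specific.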