Notes on the course from TalkPython Training.
- A process is an independent program execution instance with its own memory and resources. Processes have at least one thread and are isolated from each other. Creating a process is heavyweight
- A thread is a lightweight execution unit within a process. Threads share the process's memory and resources
- Use processes for CPU-intensive work (do things faster) and threads or asyncio for IO-intensive work (do more at once)
- Use asyncio as a lightweight approach to async development when you work with libraries supporting it -> Use async when you can, threads when you must!
- From Python 3.14 onwards, the GIL can be disabled to enable free threading, so threads can be used to do things faster and more at once
- If optimizing for performance, always think about the upper bound of improvement before implementing logic asynchronously
- Consider alternative event loop implementations (e.g. uvloop) if dealing with lots of events
- As more recent Python versions added features (task groups) to asyncio, we rarely need external libraries such as trio
- The event loop orchestrates tasks
- Prefixing a function with `async` makes it an async (coroutine) function
- Calling an async function does not run it; it returns a coroutine object (~ coroutine)
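A minimal sketch of these two bullets (the function name `greet` is mine, not from the course): calling the `async def` function only builds a coroutine object; the event loop runs it.

```python
import asyncio

async def greet(name: str) -> str:
    # `async def` makes this a coroutine function
    return f"Hello, {name}!"

coro = greet("world")       # nothing runs yet; this is just a coroutine object
result = asyncio.run(coro)  # the event loop actually executes it
```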
- Coroutines are built on generators, which can be regarded as restartable functions
- Tasks are coroutines tied to an event loop
- We create tasks via `asyncio.create_task(<FUNC()>)`
- Consider using task groups if you don't want to manually create/track tasks (alternatively, look into gathering tasks)
- We run coroutines via `asyncio.run(<FUNC()>)`
- Note: Awaiting a coroutine does not hand control back to the event loop. Wrapping a coroutine in a task first, then awaiting it, does
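The coroutine-vs-task distinction can be made visible with a small experiment (names like `worker` are mine): a task scheduled before a directly-awaited coroutine still runs *after* it, because awaiting the bare coroutine never yields to the loop.

```python
import asyncio

order = []

async def worker(label: str) -> None:
    order.append(label)

async def main() -> None:
    task = asyncio.create_task(worker("task"))  # scheduled, but not started yet
    await worker("direct")  # runs inline; control never returns to the loop
    await task              # suspends main, so the loop can finally run the task

asyncio.run(main())
# order is now ['direct', 'task']
```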
- We can run blocking code in a dedicated thread using `asyncio.to_thread(<FUNC_NAME>)`
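A sketch of `asyncio.to_thread`, using `time.sleep` as a stand-in for any blocking call (disk, network, a C library):

```python
import asyncio
import time

def blocking_io(n: int) -> int:
    time.sleep(0.1)  # blocking; would freeze the event loop if called directly
    return n * n

async def main() -> list[int]:
    # Each call runs in a worker thread, so the event loop stays responsive
    return await asyncio.gather(*(asyncio.to_thread(blocking_io, i) for i in range(3)))

squares = asyncio.run(main())  # [0, 1, 4]
```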
- Use async http clients like aiohttp or httpx to make requests
- Consider using aiofiles to handle local files in async applications
- Threads work with the fork/join pattern
- We can create threads via `t = threading.Thread(target=<NAME_OF_FUNCTION>, args=(), kwargs={})` and start them via `t.start()`
- If `daemon=True`, the thread runs in the background and is shut down immediately (without any cleanup) if the main process shuts down
- If we want to wait for all threads to finish, we need to join them via `t.join()`
- We may use a dedicated cancellation thread to check for signals to cancel all active threads: first check if any worker thread is alive, join on them with a small timeout, and then check if the cancellation thread is alive
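The create/start/join lifecycle above can be sketched as follows (the `work` function and result dict are my own illustration):

```python
import threading

results = {}

def work(name: str, n: int) -> None:
    results[name] = n * n  # writes to distinct keys, so no lock needed here

threads = [threading.Thread(target=work, args=(f"t{i}", i)) for i in range(3)]
for t in threads:
    t.start()  # fork: threads run concurrently
for t in threads:
    t.join()   # join: block until every thread has finished
```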
- An alternative to using the threading module directly is to use ThreadPoolExecutor instead
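The same fork/join shape with `ThreadPoolExecutor`, which manages thread creation and joining for us (the `square` function is my example):

```python
from concurrent.futures import ThreadPoolExecutor

def square(n: int) -> int:
    return n * n

# The context manager waits for all submitted work before exiting
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(square, i) for i in range(5)]
    squares = [f.result() for f in futures]  # result() blocks until each is done
```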
- Thread Safety is about avoiding temporary, invalid states in programs using threading
- Use `Lock` per default
- Consider using a reentrant lock (`RLock`) only if the same thread needs to re-acquire the same lock before releasing it
- Using fine-grained locks often adds complexity and only yields minimal performance improvements
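A sketch of the default, coarse-grained `Lock` protecting a shared counter (the increment function is my own example):

```python
import threading

counter = 0
lock = threading.Lock()

def increment(times: int) -> None:
    global counter
    for _ in range(times):
        with lock:  # makes the read-modify-write on counter atomic
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter is exactly 40_000; without the lock, updates could be lost
```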
- In order to use multiprocessing, first create a pool via `multiprocessing.Pool()`
- Then create tasks via `task = pool.apply_async(func=<FUNC_NAME>, args=(<ARGS>))`
- Finally, close the pool via `pool.close()` and join it via `pool.join()`
- If we need results from tasks, retrieve them via `task.get()`
- An alternative to using the multiprocessing module directly is to use ProcessPoolExecutor instead
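The pool/apply_async/close/join/get sequence above, sketched end to end (the `cube` function is my example; the `__main__` guard is needed because child processes may re-import this module):

```python
import multiprocessing

def cube(n: int) -> int:
    return n ** 3  # must be defined at module level so workers can import it

def run() -> list[int]:
    pool = multiprocessing.Pool()
    tasks = [pool.apply_async(func=cube, args=(i,)) for i in range(4)]
    pool.close()  # no more tasks may be submitted
    pool.join()   # wait for all worker processes to finish
    return [t.get() for t in tasks]  # get() returns each task's result

if __name__ == "__main__":
    print(run())  # [0, 1, 8, 27]
```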
- Consider using Cython to speed up Python without actually writing C