[Proposal] Router: Replace httpx with aiohttp in vllm_router for Enhanced High-Concurrency Performance #569

@ikaadil

Summary

This proposal suggests switching the vllm_router from using httpx to aiohttp for handling asynchronous HTTP requests. The goal is to improve responsiveness and reliability, especially when the system is under heavy load.

Motivation

Right now, vllm_router uses httpx to send asynchronous HTTP requests.

However, the benchmark linked below reports that aiohttp outperforms httpx under high concurrency. For example, with 1000 parallel requests:

  • httpx: around 10.22 seconds
  • aiohttp: around 3.79 seconds
    Source
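A comparison like the one above can be reproduced with a small harness that fires N requests concurrently and times the batch. The sketch below is only an illustration: `fake_request` and `run_parallel` are hypothetical names, and the stand-in coroutine sleeps instead of making a real HTTP call, so the timings it prints say nothing about httpx or aiohttp until a real client call is plugged in.

```python
import asyncio
import time
from typing import Awaitable, Callable, List, Tuple


async def fake_request() -> int:
    """Hypothetical stand-in for one HTTP call; swap in a real client call."""
    await asyncio.sleep(0.01)
    return 200


async def run_parallel(
    request: Callable[[], Awaitable[int]], n: int
) -> Tuple[float, List[int]]:
    """Start n requests at once and return (elapsed seconds, status codes)."""
    start = time.perf_counter()
    statuses = await asyncio.gather(*(request() for _ in range(n)))
    return time.perf_counter() - start, list(statuses)


if __name__ == "__main__":
    elapsed, statuses = asyncio.run(run_parallel(fake_request, 1000))
    print(f"{len(statuses)} requests in {elapsed:.2f}s")
```

To compare the two clients fairly, each would be passed in as `request` with one shared client or session per run, against the same backend.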

Goals

  • Speed up request handling in vllm_router when under load.
  • Make routing more reliable and reduce timeouts or failures.
  • Keep all existing features working the same as before.

Non-Goals

  • No changes will be made to the routing logic or algorithms.
  • The public interfaces and APIs exposed by vllm_router will remain unchanged.

Proposed Changes

Implementation Details

  • Replace every use of httpx.AsyncClient with aiohttp.ClientSession.
  • Ensure request behavior (timeouts, retries, headers, etc.) stays the same.
  • Re-test core features like health checks and request forwarding to ensure nothing breaks.
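One place where behavior can drift during the swap is timeouts: httpx applies its timeouts per phase (connect, read, write, pool), while aiohttp's `total` caps the whole request. The sketch below shows one possible mapping and a forwarding helper on a shared session. It is not the router's actual code: `make_timeout`, `forward`, and the URL in `main` are hypothetical, and mapping httpx's pool timeout onto aiohttp's `connect` is only an approximation, since that field also covers establishing a new connection.

```python
import asyncio
from typing import Dict, Optional, Tuple

import aiohttp


def make_timeout(
    connect: Optional[float], read: Optional[float], pool: Optional[float]
) -> aiohttp.ClientTimeout:
    """Map httpx-style per-phase timeouts onto aiohttp.ClientTimeout.

    `total` is left unset so that long responses are not cut off by a
    whole-request deadline that httpx would not have applied.
    """
    return aiohttp.ClientTimeout(
        total=None,
        connect=pool,          # acquiring a connection (approximate match)
        sock_connect=connect,  # establishing a new TCP connection
        sock_read=read,        # waiting for the next portion of data
    )


async def forward(
    session: aiohttp.ClientSession,
    method: str,
    url: str,
    headers: Dict[str, str],
    body: Optional[bytes] = None,
) -> Tuple[int, bytes]:
    """Forward one request on a shared session and buffer the response."""
    async with session.request(method, url, headers=headers, data=body) as resp:
        return resp.status, await resp.read()


async def main() -> None:
    # One long-lived session lets requests reuse pooled connections.
    timeout = make_timeout(connect=5.0, read=30.0, pool=5.0)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        # Hypothetical backend URL, for illustration only.
        status, _ = await forward(session, "GET", "http://localhost:8000/health", {})
        print(status)
```

Running `asyncio.run(main())` needs a backend listening at the assumed URL. For streamed responses, the body would be read incrementally from `resp.content` rather than buffered with `resp.read()`.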

Architecture / Components

  • Affected Component: The vllm_router module in the src directory.

Interface Changes

  • No changes to any public-facing API or configuration.

Performance Considerations

  • Expected: lower response latency and higher throughput under high concurrency. This should be confirmed by benchmarking vllm_router before and after the change.

Resource Constraints

  • CPU and memory usage may drop slightly due to aiohttp’s efficient I/O model, but this is an expectation rather than a measured result.

Test Plans

Unit Tests

  • Update unit tests to support aiohttp.
  • Add any new tests needed to cover changes in request handling.
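One way to exercise aiohttp-based request code in unit tests without a real backend is to start an in-process server with aiohttp's own test utilities. The sketch below assumes a `/health` endpoint and a `check_health` helper; both are hypothetical and stand in for whatever the router actually calls.

```python
import asyncio

from aiohttp import ClientSession, web
from aiohttp.test_utils import TestServer


async def health(request: web.Request) -> web.Response:
    """Fake backend handler that always reports healthy."""
    return web.json_response({"status": "ok"})


async def check_health(session: ClientSession, url) -> bool:
    """Hypothetical router helper: True if the backend answers 200."""
    async with session.get(url) as resp:
        return resp.status == 200


async def run_check() -> bool:
    app = web.Application()
    app.router.add_get("/health", health)
    # TestServer binds an unused local port and shuts down on exit.
    async with TestServer(app) as server:
        async with ClientSession() as session:
            return await check_health(session, server.make_url("/health"))
```

In a pytest suite the same pattern is usually written with an async test plugin; `asyncio.run(run_check())` is enough to try it standalone.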

Integration/E2E Tests

  • Run full integration tests to confirm vllm_router behaves correctly with the new client.

Negative Tests

  • Simulate network issues and timeouts to ensure proper error handling with aiohttp.
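Negative tests need to know which exceptions the new client raises. With aiohttp, timeouts surface as `asyncio.TimeoutError`, and other client-side failures derive from `aiohttp.ClientError`. The sketch below groups them into coarse categories; `classify_error` and its category names are hypothetical, not part of vllm_router.

```python
import asyncio

import aiohttp


def classify_error(exc: BaseException) -> str:
    """Group client failures into coarse categories for handling and tests."""
    # Checked first: aiohttp's timeout errors also derive from TimeoutError.
    if isinstance(exc, asyncio.TimeoutError):
        return "timeout"
    # Connection could not be established or was dropped.
    if isinstance(exc, aiohttp.ClientConnectionError):
        return "connection"
    # Any other aiohttp client failure (bad payload, bad response, ...).
    if isinstance(exc, aiohttp.ClientError):
        return "client_error"
    return "unexpected"
```

A negative test can then assert on the category instead of on a library-specific exception type, which keeps the test stable if the client changes again.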

Drawbacks

  • API Differences: aiohttp has a different style and error model compared to httpx, so code will need to be adjusted carefully.
  • Maintenance Overhead: If the upstream project does not accept this change, we may need to manage and test it separately.

Alternatives

  • Stick With httpx: Keep using httpx and try tuning its configuration for better performance.
  • Use aiohttp As Transport for httpx: A more balanced option is to keep httpx but replace its transport layer with aiohttp through a custom AiohttpTransport. This gives some of the performance benefits while preserving the httpx API.
    More details: link

Implementation Timeline / Phases

  1. Week 1: Start implementation and testing of the aiohttp integration in vllm_router.
