Jina Embedding & Reranker Server

OpenAI-compatible API server for Jina embeddings and reranking models.

Features

Embedding: jina-embeddings-v5-text-small (with LoRA task adapters)
Reranking: jina-reranker-v3
OpenAI-compatible API: Works with existing OpenAI client libraries
Task-adaptive embeddings: Switch LoRA adapter per request (retrieval, text-matching, clustering, classification)

Installation

uv sync

Fetch_Models.bat  #This is for windows, it contains two 'hf download' commands, adapt for linux accordingly

Usage

Start the server:

uv run jina_server.py

Or activate the virtual environment:

uv venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
python jina_server.py

API Endpoints

`POST /v1/embeddings`

Create embeddings for input text. Supports task-adaptive LoRA adapters.

Parameters

Parameter	Type	Default	Description
`input`	`string \| string[]`	(required)	Text(s) to embed
`task`	`string`	`"retrieval"`	Task adapter: `retrieval`, `text-matching`, `clustering`, `classification`
`prompt_name`	`string`	`null`	Required when `task="retrieval"`: `"query"` or `"document"`
`model`	`string`	`"jina-embeddings-v5-text-small"`	Model name
`batch_size`	`int`	`32`	Processing batch size (1–128)

Examples

Retrieval (query side):

curl http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input": "What is machine learning?", "task": "retrieval", "prompt_name": "query"}'

Retrieval (document side):

curl http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input": "Machine learning is a field of AI...", "task": "retrieval", "prompt_name": "document"}'

Text matching:

curl http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input": ["Hello", "Hi"], "task": "text-matching"}'

Classification:

curl http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input": "This product is great!", "task": "classification"}'

Clustering:

curl http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input": ["neural networks", "monetary policy"], "task": "clustering"}'

`POST /v1/rerank`

Rerank documents based on query relevance.

curl http://localhost:8000/v1/rerank \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is machine learning?",
    "documents": ["Doc 1", "Doc 2"],
    "top_n": 3
  }'

`GET /v1/models`

List available models.

`GET /`

Health check endpoint.

Testing

Run the test suite:

uv run test_server.py

WIP:

I'm telling my opencode to implement avx512 for my 9900x rig, brb

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
.gitignore		.gitignore
Fetch_Models.bat		Fetch_Models.bat
README.md		README.md
jina_server.py		jina_server.py
pyproject.toml		pyproject.toml
test_server.py		test_server.py
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Jina Embedding & Reranker Server

Features

Installation

Usage

API Endpoints

`POST /v1/embeddings`

Parameters

Examples

`POST /v1/rerank`

`GET /v1/models`

`GET /`

Testing

WIP:

License

About

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Jina Embedding & Reranker Server

Features

Installation

Usage

API Endpoints

POST /v1/embeddings

Parameters

Examples

POST /v1/rerank

GET /v1/models

GET /

Testing

WIP:

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Uh oh!

Contributors

Uh oh!

Languages

`POST /v1/embeddings`

`POST /v1/rerank`

`GET /v1/models`

`GET /`