qdrant-client: `set_model` attempts network connection despite `HF_HUB_OFFLINE=1` and local cache


## Current Behavior
When the `HF_HUB_OFFLINE=1` environment variable is set, `QdrantClient.set_model()` still tries to download the embedding model from Hugging Face. In an offline environment the download fails, **even when the model is already present in the local cache**, and the client cannot initialize.

The logs show `fastembed` reporting "offline mode is enabled" as the reason the download failed. The flag is therefore recognized, but the download path is still entered and retried instead of being skipped.

## Steps to Reproduce
1.  Set up an environment with no internet access (e.g., a firewalled server or a Docker container without outbound network access).

2.  Set the environment variable: `export HF_HUB_OFFLINE=1`.

3.  Pre-download the embedding model into the specified cache directory (`/app/.cache/fastembed`).

4.  Confirm the model files are present in the cache by checking the directory structure and size:
    ```bash
    $ du -h -d 3 /app/.cache/fastembed/models--qdrant--paraphrase-multilingual-MiniLM-L12-v2-onnx-Q/
    
    241M    /app/.cache/fastembed/models--qdrant--paraphrase-multilingual-MiniLM-L12-v2-onnx-Q/blobs
    4.0K    /app/.cache/fastembed/models--qdrant--paraphrase-multilingual-MiniLM-L12-v2-onnx-Q/refs
    20K     /app/.cache/fastembed/models--qdrant--paraphrase-multilingual-MiniLM-L12-v2-onnx-Q/snapshots/faf4aa4225822f3bc6376869cb1164e8e3feedd0
    20K     /app/.cache/fastembed/models--qdrant--paraphrase-multilingual-MiniLM-L12-v2-onnx-Q/snapshots
    241M    /app/.cache/fastembed/models--qdrant--paraphrase-multilingual-MiniLM-L12-v2-onnx-Q/
    ```

5.  Run the following code:
    ```python
    import os
    from qdrant_client import QdrantClient
    
    QDRANT_HOST = "localhost"
    QDRANT_PORT = 6333
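    # Cached in step 4 under models--qdrant--paraphrase-multilingual-MiniLM-L12-v2-onnx-Q
    EMBEDDING_MODEL = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
    CACHE_DIR = "/app/.cache/fastembed"  # Example cache path

    os.environ["FASTEMBED_CACHE_PATH"] = CACHE_DIR

    # HF_HUB_OFFLINE=1 is already exported in the shell (step 2).
    # This step fails due to network connection attempts despite the model being cached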
    client = QdrantClient(host=QDRANT_HOST, port=QDRANT_PORT)
    client.set_model(EMBEDDING_MODEL, cache_dir=CACHE_DIR)
    
    print("Client initialized successfully.") # This line is never reached
    ```

6.  Observe the error logs: `fastembed` repeatedly retries the download from `huggingface.co` and then gives up.

**Relevant Log Output:**
```log
2025-10-28 13:31:18.048 | ERROR    | fastembed.common.model_management:download_model:430 - Could not download model from HuggingFace: Cannot reach https://huggingface.co/...: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable. Falling back to other sources.
2025-10-28 13:31:18.048 | ERROR    | fastembed.common.model_management:download_model:452 - Could not download model from either source, sleeping for 3.0 seconds, 2 retries left.
```
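
The cache check from step 4 can also be scripted, for example as a container start-up check. The sketch below assumes the standard Hugging Face cache layout visible in the `du` output (`refs/`, `snapshots/`, and `blobs/` under a `models--{org}--{name}` folder, with `refs/main` holding the commit hash of the snapshot). `find_cached_snapshot` is a hypothetical helper written for this report, not part of `fastembed` or `qdrant-client`.

```python
from pathlib import Path
from typing import Optional


def find_cached_snapshot(cache_dir: str, repo_id: str, revision: str = "main") -> Optional[Path]:
    """Return the snapshot directory for repo_id if the cache looks complete, else None.

    Hypothetical helper: it only inspects the on-disk layout and never touches the network.
    """
    # Hugging Face cache folders are named models--{org}--{name}.
    repo_dir = Path(cache_dir) / ("models--" + repo_id.replace("/", "--"))

    # refs/<revision> is a small text file containing the commit hash.
    ref_file = repo_dir / "refs" / revision
    if not ref_file.is_file():
        return None
    commit = ref_file.read_text().strip()

    snapshot = repo_dir / "snapshots" / commit
    if not snapshot.is_dir():
        return None

    # Snapshot entries are usually symlinks into blobs/; exists() is False for a
    # dangling link, which would mean the blob is missing from the image.
    files = [p for p in snapshot.rglob("*") if not p.is_dir()]
    if not files or any(not p.exists() for p in files):
        return None
    return snapshot
```

For the setup in this report, the call would be `find_cached_snapshot("/app/.cache/fastembed", "qdrant/paraphrase-multilingual-MiniLM-L12-v2-onnx-Q")`, where the repo id is inferred from the directory name in step 4. A `None` result would point to an incomplete cache rather than to the bug described here.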

## Expected Behavior
When `HF_HUB_OFFLINE=1` is set, `qdrant-client` should first check the specified `cache_dir` for the model. If the model files exist locally, as confirmed in step 4, it should load them directly without initiating any network connections, so that initialization succeeds in an air-gapped environment.

## Possible Solution
The issue appears to originate in the `fastembed` dependency. Its model management logic should **check the cache for a local copy of the model** before entering any download logic. When `HF_HUB_OFFLINE=1` is set, the network download path should be bypassed entirely, and the client should rely solely on the cached files.
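
A minimal sketch of the proposed control flow, to make the suggestion concrete. It is not `fastembed`'s actual code: `is_offline`, `cached_snapshot`, `resolve_model_dir`, and the injected `download` callable are all hypothetical names, the cache layout is assumed to be the standard Hugging Face one (`refs/main` pointing at a directory under `snapshots/`), and treating `1`, `ON`, `YES`, and `TRUE` as truthy values for `HF_HUB_OFFLINE` is an assumption of this sketch.

```python
import os
from pathlib import Path
from typing import Callable, Optional


def is_offline() -> bool:
    """True when HF_HUB_OFFLINE is set to a truthy value (assumed: 1/ON/YES/TRUE)."""
    return os.environ.get("HF_HUB_OFFLINE", "").strip().upper() in {"1", "ON", "YES", "TRUE"}


def cached_snapshot(cache_dir: str, repo_id: str, revision: str = "main") -> Optional[Path]:
    """Return the cached snapshot directory for repo_id, or None if it is absent."""
    repo_dir = Path(cache_dir) / ("models--" + repo_id.replace("/", "--"))
    ref = repo_dir / "refs" / revision
    if not ref.is_file():
        return None
    snapshot = repo_dir / "snapshots" / ref.read_text().strip()
    if snapshot.is_dir() and any(snapshot.iterdir()):
        return snapshot
    return None


def resolve_model_dir(
    cache_dir: str,
    repo_id: str,
    download: Callable[[str, str], Path],
) -> Path:
    """Cache first; download only when online and the cache has no copy."""
    local = cached_snapshot(cache_dir, repo_id)
    if local is not None:
        return local  # cache hit: no network access at all
    if is_offline():
        # Offline and not cached: fail immediately instead of retrying downloads.
        raise FileNotFoundError(
            f"{repo_id} not found in {cache_dir} and HF_HUB_OFFLINE is set"
        )
    return download(repo_id, cache_dir)
```

With this ordering, the download callable is never invoked when a cached snapshot exists, and in offline mode a missing model produces one clear error rather than a retry loop.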

## Context (Environment)
We are deploying an application using `qdrant-client` in a secured, air-gapped production environment. All dependencies and models are pre-packaged into a container image. This bug is a blocker for our deployment, as the application fails to start due to its inability to operate in a true offline mode.

-   **Python Version**: 3.12.12
-   **Operating System**: Linux (Docker)
-   **Key Environment Variables**:
    -   `HF_HUB_OFFLINE=1`
    -   `FASTEMBED_CACHE_PATH=/app/.cache/fastembed`

qdrant-client: `set_model` attempts network connection despite `HF_HUB_OFFLINE=1` and local cache #565