
output normalize embedding in '/v1/embeddings'#5956

Merged
ggerganov merged 3 commits into ggml-org:master from redlion0929:normalize-embedding
Mar 9, 2024

Conversation

@redlion0929 redlion0929 commented Mar 9, 2024

Description

Configure server.cpp to output normalized embeddings from the "/v1/embeddings" endpoint.
This PR is related to issue #5954.
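What "normalized" means here can be sketched as plain L2 normalization: each component is divided by the vector's Euclidean norm, so the result has unit length. A minimal illustration (not the server code itself):

```python
import math

def normalize(embedding):
    # Divide every component by the L2 norm so that the
    # sum of squares of the result is ~1.0.
    norm = math.sqrt(sum(x * x for x in embedding))
    return [x / norm for x in embedding]

vec = [3.0, 4.0]           # norm = 5.0
unit = normalize(vec)
print(unit)                # [0.6, 0.8]
print(sum(x * x for x in unit))
```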

Test Method

I used the code below to check that the returned embeddings are normalized.

def check_normalize(embedding):
    # The squared L2 norm of a unit-normalized embedding should be ~1.0
    s = 0
    for i in embedding:
        s += i * i
    print(f"sum: {s}")


response = client.embeddings.create(input=input_texts, model="test")

print(response)

check_normalize(response.data[0].embedding)

Before

CreateEmbeddingResponse(data=[Embedding(embedding=[0.779549777507782, -1.7770930528640747, -0.5943143963813782, ... ,], index=0, object='embedding')], model='test', object='list', usage=Usage(prompt_tokens=0, total_tokens=0))

sum: 85301.80714120051

After

CreateEmbeddingResponse(data=[Embedding(embedding=[0.0026690999511629343, -0.006084587424993515, -0.0020348725374788046, ..., ], index=0, object='embedding')], model='test', object='list', usage=Usage(prompt_tokens=0, total_tokens=0))

sum: 1.0000004424428977


ngxson commented Mar 9, 2024

Quick note: this may be a breaking change. Can we add an option to skip normalization, something like --embeddings-skip-norm?
By default, OpenAI normalizes the vector, so I prefer to have normalization enabled by default in llama.cpp.
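One reason normalized output is the convenient default: for unit vectors, cosine similarity reduces to a plain dot product, which is what most downstream code computes. A small sketch of that equivalence (illustrative only):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Cosine similarity of arbitrary (non-zero) vectors.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

a, b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
na, nb = normalize(a), normalize(b)

# For unit-normalized vectors, dot product == cosine similarity.
print(math.isclose(dot(na, nb), cosine(a, b)))  # True
```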

@ggerganov
Member

The flag would be better as a request option than a CLI option.


@ggerganov ggerganov left a comment


I avoided an extra copy of the data and moved the normalization into common so it can be reused

We can add the request option for disabling the normalization in a separate PR
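The "extra copy" point can be sketched as normalizing the buffer in place instead of allocating a second vector. A hypothetical Python analogue of the idea (not the actual common implementation in llama.cpp):

```python
import math

def normalize_in_place(embedding):
    # Write normalized values back into the same buffer rather than
    # building a new list, avoiding one copy of the data.
    norm = math.sqrt(sum(x * x for x in embedding))
    if norm == 0.0:
        return  # leave a zero vector untouched
    for i in range(len(embedding)):
        embedding[i] /= norm

v = [3.0, 4.0]
normalize_in_place(v)
print(v)  # [0.6, 0.8]
```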

@ggerganov ggerganov merged commit fb215c3 into ggml-org:master Mar 9, 2024
hazelnutcloud pushed a commit to hazelnutcloud/llama.cpp that referenced this pull request Mar 10, 2024
* output normalize embedding in '/v1/embeddings'

* common : reuse llama_embd_normalize

* common : better normalize impl

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
NeoZhangJianyu pushed a commit to NeoZhangJianyu/llama.cpp that referenced this pull request Mar 12, 2024
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request Apr 28, 2026