
output normalize embedding in '/v1/embeddings'#5956

Merged
ggerganov merged 3 commits into ggml-org:master from redlion0929:normalize-embedding
Mar 9, 2024

Conversation

@redlion0929 redlion0929 commented Mar 9, 2024

Description

Configure server.cpp to output normalized embeddings from the "/v1/embeddings" endpoint.
This PR is related to issue #5954.
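What "normalized" means here can be sketched as plain L2 normalization: each component is divided by the vector's Euclidean norm, so the result has unit length. A minimal illustration (not the server code itself):

```python
import math

def normalize(embedding):
    # Divide every component by the L2 norm so that the
    # sum of squares of the result is ~1.0.
    norm = math.sqrt(sum(x * x for x in embedding))
    return [x / norm for x in embedding]

vec = [3.0, 4.0]           # norm = 5.0
unit = normalize(vec)
print(unit)                # [0.6, 0.8]
print(sum(x * x for x in unit))
```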

Test Method

I used the code below to check that the returned embeddings are normalized.

def check_normalize(embedding):
    # The squared L2 norm of a unit-normalized embedding should be ~1.0
    s = 0
    for i in embedding:
        s += i * i
    print(f"sum: {s}")


response = client.embeddings.create(input=input_texts, model="test")

print(response)

check_normalize(response.data[0].embedding)

Before

CreateEmbeddingResponse(data=[Embedding(embedding=[0.779549777507782, -1.7770930528640747, -0.5943143963813782, ... ,], index=0, object='embedding')], model='test', object='list', usage=Usage(prompt_tokens=0, total_tokens=0))

sum: 85301.80714120051

After

CreateEmbeddingResponse(data=[Embedding(embedding=[0.0026690999511629343, -0.006084587424993515, -0.0020348725374788046, ..., ], index=0, object='embedding')], model='test', object='list', usage=Usage(prompt_tokens=0, total_tokens=0))

sum: 1.0000004424428977


ngxson commented Mar 9, 2024

Quick note: this may be a breaking change. Can we add an option to skip normalization, something like --embeddings-skip-norm?
By default, OpenAI normalizes the vector, so I prefer to have normalization enabled by default in llama.cpp.
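One reason normalized output is the convenient default: for unit vectors, cosine similarity reduces to a plain dot product, which is what most downstream code computes. A small sketch of that equivalence (illustrative only):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Cosine similarity of arbitrary (non-zero) vectors.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

a, b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
na, nb = normalize(a), normalize(b)

# For unit-normalized vectors, dot product == cosine similarity.
print(math.isclose(dot(na, nb), cosine(a, b)))  # True
```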

@ggerganov
Member

The flag would be better as a request option than a CLI option.


@ggerganov ggerganov left a comment


I avoided an extra copy of the data and moved the normalization into common so it can be reused

We can add the request option for disabling the normalization in a separate PR
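The "extra copy" point can be sketched as normalizing the buffer in place instead of allocating a second vector. A hypothetical Python analogue of the idea (not the actual common implementation in llama.cpp):

```python
import math

def normalize_in_place(embedding):
    # Write normalized values back into the same buffer rather than
    # building a new list, avoiding one copy of the data.
    norm = math.sqrt(sum(x * x for x in embedding))
    if norm == 0.0:
        return  # leave a zero vector untouched
    for i in range(len(embedding)):
        embedding[i] /= norm

v = [3.0, 4.0]
normalize_in_place(v)
print(v)  # [0.6, 0.8]
```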

@ggerganov ggerganov merged commit fb215c3 into ggml-org:master Mar 9, 2024
hazelnutcloud pushed a commit to hazelnutcloud/llama.cpp that referenced this pull request Mar 10, 2024
* output normalize embedding in '/v1/embeddings'

* common : reuse llama_embd_normalize

* common : better normalize impl

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
NeoZhangJianyu pushed a commit to NeoZhangJianyu/llama.cpp that referenced this pull request Mar 12, 2024
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request Apr 28, 2026