-
Notifications
You must be signed in to change notification settings - Fork 690
Feature:Add support for Pooling Model Embedding and provide an OpenAI-compatible API. #4344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Jiang-Jia-Jun
merged 10 commits into
PaddlePaddle:develop
from
sunlei1024:feat/pooling_embedding
Oct 15, 2025
Merged
Feature:Add support for Pooling Model Embedding and provide an OpenAI-compatible API. #4344
Jiang-Jia-Jun
merged 10 commits into
PaddlePaddle:develop
from
sunlei1024:feat/pooling_embedding
Oct 15, 2025
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
|
Thanks for your contribution! |
Collaborator
|
上面的使用demo再加一个 |
71a40a7 to
abfc268
Compare
76c5782 to
3f9d216
Compare
yuanlehome
approved these changes
Oct 15, 2025
lizexu123
approved these changes
Oct 15, 2025
EmmonsCurse
approved these changes
Oct 15, 2025
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary
本次 PR 新增了对 Pooling Embedding 模型 的支持,并提供 与 OpenAI
/v1/embeddings完全兼容的接口实现。该功能旨在满足用户对高性能语义嵌入(Sentence Embedding)的需求,为搜索、聚类、推荐等下游任务提供更优质的嵌入表示。
主要更新内容
1. 新增:Pooling Model Embedding 支持
pooling模型 的底层接口支持。该功能允许服务将输入序列(如文本的词向量)通过聚合操作(如平均池化)转换为固定维度的语义嵌入向量。
2. 新增:OpenAI 兼容接口实现
New Feature: 实现了 与 OpenAI
/v1/embeddings标准完全兼容的 API 接口。兼容请求格式: 接口支持两种主流的请求类型,确保与现有 OpenAI 客户端无缝对接:
EmbeddingCompletionRequest— 接收input字符串或字符串列表。EmbeddingChatRequest— 接收messages列表,用于聊天类上下文嵌入。测试方式 (cURL 示例)
A. EmbeddingCompletionRequest 示例(标准文本输入)
B. EmbeddingChatRequest 示例(消息序列输入)
响应参数说明
以下为标准的接口响应格式,兼容 OpenAI 的
/v1/embeddings输出规范,同时支持多样化的 embedding 数据结构:{ "id": "embed-550e8400-e29b-41d4-a716-446655440000", "object": "list", "created": 1693645123, "model": "text-embedding-chat-model", "data": [ { // 示例 1:单层 embedding 向量 "index": 0, "object": "embedding", "embedding": [0.0123, -0.0456, 0.0789, 0.1011, -0.2022] }, { // 示例 2:多层嵌套 embedding(如 token 级输出) "index": 1, "object": "embedding", "embedding": [ [0.001, 0.002, 0.003], [0.004, 0.005, 0.006] ] } ], "usage": { "prompt_tokens": 42, "total_tokens": 42 } }字段说明:
id:请求唯一标识(带前缀pool-)object:响应对象类型,固定为"list"created:请求创建时间(Unix 时间戳)model:使用的嵌入模型名称data:嵌入结果数组,包含一个或多个 embedding 对象index:输入序列对应的索引embedding:嵌入向量(支持一维或二维结构)usage:请求的 Token 使用统计