Summary
Currently all shipped text embedding models (ALL_MINILM_L6_V2, ALL_MPNET_BASE_V2, MULTI_QA_MINILM_L6_COS_V1, MULTI_QA_MPNET_BASE_DOT_V1) are English-only.
For apps that support multiple languages, this limits the usefulness of semantic search and similarity features. In our case, we use useTextEmbeddings to match user-typed label names to icons — but it only works well when the user types in English.
Request
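Ship a pre-exported multilingual text embeddings model, such as:
- paraphrase-multilingual-MiniLM-L12-v2: 50+ languages, 384 dimensions, ~470MB (could be quantized)
- distiluse-base-multilingual-cased-v2: 50+ languages, 512 dimensions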
This would allow useTextEmbeddings to work with non-English input out of the box, similar to how the English models work today:
import { useTextEmbeddings, MULTILINGUAL_MINILM_L12_V2 } from 'react-native-executorch';
const embeddings = useTextEmbeddings({ model: MULTILINGUAL_MINILM_L12_V2 });
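For illustration, here is a minimal sketch of how label-to-icon matching could work once embedding vectors are available. It assumes the icon embeddings were precomputed with the same model as the query. `IconEntry`, `cosineSimilarity`, and `findBestIcon` are hypothetical names for this example and are not part of react-native-executorch.

```typescript
// Hypothetical helpers for illustration; not part of react-native-executorch.
type Embedding = ArrayLike<number>;

interface IconEntry {
  name: string;
  // Assumed to be precomputed with the same model used for the query.
  embedding: Embedding;
}

// Cosine similarity between two vectors of equal length.
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: Embedding, b: Embedding): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Returns the icon whose embedding is most similar to the query,
// or null when the icon list is empty.
function findBestIcon(query: Embedding, icons: IconEntry[]): IconEntry | null {
  let best: IconEntry | null = null;
  let bestScore = -Infinity;
  for (const icon of icons) {
    const score = cosineSimilarity(query, icon.embedding);
    if (score > bestScore) {
      bestScore = score;
      best = icon;
    }
  }
  return best;
}
```

With a multilingual model, a query such as "casa" or "shtëpi" would ideally land close to the embedding of a "home" icon label, which is what this kind of nearest-neighbor lookup relies on.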
Context
- The current MULTI_QA_MINILM_L6_COS_V1 works great for English
- Non-English queries produce poor embeddings since the model was only trained on English data
- Many React Native apps are multilingual by nature (we support English, Italian, and Albanian)
- The useTextEmbeddings API wouldn't need to change, just a new model constant
Thanks for the great library!