Commit 03e580e

chore: switch to AllMiniLML6V2 for faster embeddings
1 parent c9b2f95 commit 03e580e

2 files changed: +15 -11 lines changed

README.md

+10 -9

@@ -21,7 +21,7 @@
 </div>
 
 ## 🔎 The Project
-RepoQuery is an early-beta project, that uses recursive [OpenAI function calling](https://platform.openai.com/docs/api-reference/chat/create#chat/create-functions) paired with semantic search using [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) to index and answer user queries about public GitHub repositories.
+RepoQuery is an early-beta project, that uses recursive [OpenAI function calling](https://platform.openai.com/docs/api-reference/chat/create#chat/create-functions) paired with semantic search using [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) to index and answer user queries about public GitHub repositories.
 
 ## 📬 Service Endpoints
 
@@ -173,15 +173,16 @@ make up
 
 ## Attributions
 
-[baai.ac.cn](https://www.baai.ac.cn/english.html) for [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5).
+[sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2).
 ```
-@misc{bge_embedding,
-title={C-Pack: Packaged Resources To Advance General Chinese Embedding},
-author={Shitao Xiao and Zheng Liu and Peitian Zhang and Niklas Muennighoff},
-year={2023},
-eprint={2309.07597},
-archivePrefix={arXiv},
-primaryClass={cs.CL}
+@inproceedings{reimers-2019-sentence-bert,
+title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+author = "Reimers, Nils and Gurevych, Iryna",
+booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+month = "11",
+year = "2019",
+publisher = "Association for Computational Linguistics",
+url = "https://arxiv.org/abs/1908.10084",
 }
 ```
 

src/embeddings/fastembed.rs

+5 -2

@@ -1,5 +1,5 @@
 use crate::prelude::*;
-use fastembed::{EmbeddingBase, FlagEmbedding};
+use fastembed::{EmbeddingBase, FlagEmbedding, InitOptions};
 
 use super::{Embeddings, EmbeddingsModel};
 
@@ -9,7 +9,10 @@ pub struct Fastembed {
 
 impl Fastembed {
     pub fn try_new() -> Result<Self> {
-        let model = FlagEmbedding::try_new(Default::default())?;
+        let model = FlagEmbedding::try_new(InitOptions {
+            model_name: fastembed::EmbeddingModel::AllMiniLML6V2,
+            ..Default::default()
+        })?;
         Ok(Self { model })
     }
 }
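
The change to `try_new` relies on Rust's struct update syntax: only `model_name` is set explicitly, and every remaining field is filled in from `Default::default()`. The sketch below shows that pattern in isolation. `ModelName`, `Options`, `mini_lm_options`, and the default field values are hypothetical stand-ins invented for illustration; they are not fastembed's actual types or defaults.

```rust
// Hypothetical stand-ins for illustration only; NOT fastembed's real types.
#[derive(Debug, Clone, PartialEq)]
enum ModelName {
    DefaultModel,
    AllMiniLML6V2,
}

#[derive(Debug, Clone, PartialEq)]
struct Options {
    model_name: ModelName,
    max_length: usize,   // assumed field, value made up for this sketch
    show_progress: bool, // assumed field, value made up for this sketch
}

impl Default for Options {
    fn default() -> Self {
        Self {
            model_name: ModelName::DefaultModel,
            max_length: 512,
            show_progress: true,
        }
    }
}

// Override a single field and take the rest from `Default::default()`.
fn mini_lm_options() -> Options {
    Options {
        model_name: ModelName::AllMiniLML6V2,
        ..Default::default()
    }
}

fn main() {
    let opts = mini_lm_options();
    // Only `model_name` differs from the defaults.
    assert_eq!(opts.model_name, ModelName::AllMiniLML6V2);
    assert_eq!(opts.max_length, Options::default().max_length);
    assert_eq!(opts.show_progress, Options::default().show_progress);
    println!("{opts:?}");
}
```

Writing the override this way keeps the call site short and means fields added to the options struct later pick up their defaults without further edits.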
