# Embedding Model Configuration
DeepSearcher supports various embedding models to convert text into vector representations for semantic search.
## 📝 Basic Configuration
```python
# Replace the placeholders with a provider class name and a dict of arguments for it.
config.set_provider_config("embedding", "(EmbeddingModelName)", "(Arguments dict)")
```
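As a rough illustration of the call shape, the sketch below forwards a provider class name and an arguments dict to `set_provider_config`. Both `RecordingConfig` and `apply_embedding_config` are hypothetical names invented for this example so that it runs without DeepSearcher installed; in a real project `config` would be the DeepSearcher configuration object.

```python
from typing import Any, Dict


class RecordingConfig:
    """Stand-in for the real configuration object (an assumption for this sketch).

    It only records what was passed to set_provider_config.
    """

    def __init__(self) -> None:
        self.settings: Dict[str, Dict[str, Any]] = {}

    def set_provider_config(self, role: str, provider: str, provider_configs: Dict[str, Any]) -> None:
        self.settings[role] = {"provider": provider, "config": dict(provider_configs)}


def apply_embedding_config(config: Any, provider: str, arguments: Dict[str, Any]) -> None:
    """Hypothetical helper: check the argument types, then forward the call."""
    if not isinstance(provider, str) or not provider:
        raise TypeError("provider must be a non-empty class name such as 'OpenAIEmbedding'")
    if not isinstance(arguments, dict):
        raise TypeError("arguments must be a dict such as {'model': '...'}")
    config.set_provider_config("embedding", provider, arguments)


config = RecordingConfig()
apply_embedding_config(config, "OpenAIEmbedding", {"model": "text-embedding-3-small"})
print(config.settings["embedding"])
```

The type check is there only to catch the common mistake of passing the placeholder string instead of a real dict.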
## 📋 Available Embedding Providers
| Provider | Description | Key Features |
|----------|-------------|--------------|
| **OpenAIEmbedding** | OpenAI's text embedding models | High quality, production-ready |
| **MilvusEmbedding** | Built-in embedding models via PyMilvus | Multiple model options |
| **VoyageEmbedding** | VoyageAI embedding models | Specialized for search |
| **BedrockEmbedding** | Amazon Bedrock embedding | AWS integration |
| **GeminiEmbedding** | Google's Gemini embedding | High performance |
| **GLMEmbedding** | ChatGLM embeddings | Chinese language support |
| **OllamaEmbedding** | Local embedding with Ollama | Self-hosted option |
| **PPIOEmbedding** | PPIO cloud embedding | Scalable solution |
| **SiliconflowEmbedding** | Siliconflow's models | Enterprise support |
| **VolcengineEmbedding** | Volcengine embedding | High throughput |
| **NovitaEmbedding** | Novita AI embedding | Cost-effective |
| **SentenceTransformerEmbedding** | Sentence Transformers embedding models | Self-hosted option |
| **WatsonXEmbedding** | IBM watsonx.ai embedding models | IBM's enterprise AI platform |
## 🔍 Provider Examples
### OpenAI Embedding
```python
config.set_provider_config("embedding", "OpenAIEmbedding", {"model": "text-embedding-3-small"})
```
*Requires `OPENAI_API_KEY` environment variable*
### Milvus Built-in Embedding
```python
config.set_provider_config("embedding", "MilvusEmbedding", {"model": "BAAI/bge-base-en-v1.5"})
```
```python
config.set_provider_config("embedding", "MilvusEmbedding", {"model": "jina-embeddings-v3"})
```
*For Jina's embedding model, requires `JINAAI_API_KEY` environment variable*
### VoyageAI Embedding
```python
config.set_provider_config("embedding", "VoyageEmbedding", {"model": "voyage-3"})
```
*Requires `VOYAGE_API_KEY` environment variable and `pip install voyageai`*
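Since each of these providers depends on an API key in the environment, it can help to check for the key before configuring the provider. The following is a minimal sketch, not part of DeepSearcher: `REQUIRED_ENV` and `missing_env_vars` are hypothetical names, the mapping covers only the providers shown above, and treating any model whose name starts with `jina-` as a Jina model is an assumption made for this example.

```python
import os
from typing import Any, Dict, List, Mapping, Optional

# Environment variables named in the notes above (hypothetical mapping).
REQUIRED_ENV: Dict[str, List[str]] = {
    "OpenAIEmbedding": ["OPENAI_API_KEY"],
    "VoyageEmbedding": ["VOYAGE_API_KEY"],
}


def missing_env_vars(
    provider: str,
    arguments: Dict[str, Any],
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    required = list(REQUIRED_ENV.get(provider, []))
    # Assumption: Jina models used through MilvusEmbedding have names starting with "jina-".
    if provider == "MilvusEmbedding" and str(arguments.get("model", "")).startswith("jina-"):
        required.append("JINAAI_API_KEY")
    return [name for name in required if not env.get(name)]


missing = missing_env_vars("MilvusEmbedding", {"model": "jina-embeddings-v3"})
if missing:
    print("Set these variables first:", ", ".join(missing))
```

Passing `environ` explicitly makes the check easy to exercise without touching the real process environment.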
## 📚 Additional Providers
??? example "Amazon Bedrock"
    ```python
    config.set_provider_config("embedding", "BedrockEmbedding", {"model": "amazon.titan-embed-text-v2:0"})
    ```
    *Requires `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables and `pip install boto3`*
??? example "Novita AI"
    ```python
    config.set_provider_config("embedding", "NovitaEmbedding", {"model": "baai/bge-m3"})
    ```
    *Requires `NOVITA_API_KEY` environment variable*
??? example "Siliconflow"
    ```python
    config.set_provider_config("embedding", "SiliconflowEmbedding", {"model": "BAAI/bge-m3"})
    ```
    *Requires `SILICONFLOW_API_KEY` environment variable*
??? example "Volcengine"
    ```python
    config.set_provider_config("embedding", "VolcengineEmbedding", {"model": "doubao-embedding-text-240515"})
    ```
    *Requires `VOLCENGINE_API_KEY` environment variable*
??? example "GLM"
    ```python
    config.set_provider_config("embedding", "GLMEmbedding", {"model": "embedding-3"})
    ```
    *Requires `GLM_API_KEY` environment variable and `pip install zhipuai`*
??? example "Google Gemini"
    ```python
    config.set_provider_config("embedding", "GeminiEmbedding", {"model": "text-embedding-004"})
    ```
    *Requires `GEMINI_API_KEY` environment variable and `pip install google-genai`*
??? example "Ollama"
    ```python
    config.set_provider_config("embedding", "OllamaEmbedding", {"model": "bge-m3"})
    ```
    *Requires local Ollama installation and `pip install ollama`*
??? example "PPIO"
    ```python
    config.set_provider_config("embedding", "PPIOEmbedding", {"model": "baai/bge-m3"})
    ```
    *Requires `PPIO_API_KEY` environment variable*
??? example "SentenceTransformer"
    ```python
    config.set_provider_config("embedding", "SentenceTransformerEmbedding", {"model": "BAAI/bge-large-zh-v1.5"})
    ```
    *Requires `pip install sentence-transformers`*
??? example "IBM WatsonX"
    ```python
    config.set_provider_config("embedding", "WatsonXEmbedding", {"model": "ibm/slate-125m-english-rtrvr-v2"})
    ```
    *Requires `pip install ibm-watsonx-ai`*
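
Several providers on this page need an extra package installed with `pip`. One way to surface a missing package early is to look the module up before configuring the provider. The sketch below is an illustration only: `PIP_PACKAGES`, `is_importable`, and `install_hint` are hypothetical names, and the import names paired with each pip package are the ones those packages are commonly imported under rather than anything stated on this page.

```python
import importlib.util
from typing import Callable, Dict, Optional, Tuple

# provider -> (pip package from the notes above, assumed import name)
PIP_PACKAGES: Dict[str, Tuple[str, str]] = {
    "VoyageEmbedding": ("voyageai", "voyageai"),
    "BedrockEmbedding": ("boto3", "boto3"),
    "GLMEmbedding": ("zhipuai", "zhipuai"),
    "GeminiEmbedding": ("google-genai", "google.genai"),
    "OllamaEmbedding": ("ollama", "ollama"),
    "SentenceTransformerEmbedding": ("sentence-transformers", "sentence_transformers"),
    "WatsonXEmbedding": ("ibm-watsonx-ai", "ibm_watsonx_ai"),
}


def is_importable(module_name: str) -> bool:
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. "google") raises instead of returning None.
        return False


def install_hint(
    provider: str,
    importable: Callable[[str], bool] = is_importable,
) -> Optional[str]:
    """Return a pip command if the provider's package looks missing, else None."""
    entry = PIP_PACKAGES.get(provider)
    if entry is None:
        return None  # no extra package listed for this provider
    pip_name, module_name = entry
    return None if importable(module_name) else f"pip install {pip_name}"


hint = install_hint("SentenceTransformerEmbedding")
if hint:
    print("Missing dependency, try:", hint)
```

The `importable` parameter exists so the lookup can be swapped out when the check is run somewhere the packages are not installed.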