# Web Crawler Configuration

DeepSearcher supports various web crawlers to collect data from websites for processing and indexing.

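## 📝 Basic Configuration

Select a web crawler by calling `set_provider_config`. Replace `(WebCrawlerName)` with one of the crawler names listed below, and `(Arguments dict)` with a Python dictionary of crawler-specific arguments (an empty `{}` in most of the examples on this page):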

```python
config.set_provider_config("web_crawler", "(WebCrawlerName)", "(Arguments dict)")
```

## 📋 Available Web Crawlers

| Crawler | Description | Key Feature |
|---------|-------------|-------------|
| **FireCrawlCrawler** | Cloud-based web crawling service | Simple API, managed service |
| **Crawl4AICrawler** | Browser automation crawler | Full JavaScript support |
| **JinaCrawler** | Content extraction service | High accuracy parsing |
| **DoclingCrawler** | Document processing with crawling | Multiple format support |
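
The crawler names in the table are passed as plain strings, so a typo only surfaces later. The following is a minimal sketch of a guard against that, not part of DeepSearcher itself: `select_web_crawler` and `DEFAULT_CRAWLER_ARGS` are hypothetical names, and `config` is assumed to be any object exposing the `set_provider_config` method shown above.

```python
import copy
from typing import Any, Dict, Optional

# Crawler names taken from the table above; the argument dicts mirror the
# per-crawler snippets in this document.
DEFAULT_CRAWLER_ARGS: Dict[str, Dict[str, Any]] = {
    "FireCrawlCrawler": {},
    "Crawl4AICrawler": {"browser_config": {"headless": True, "verbose": True}},
    "JinaCrawler": {},
    "DoclingCrawler": {},
}


def select_web_crawler(
    config: Any, name: str, arguments: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Hypothetical helper: validate the crawler name, then configure it."""
    if name not in DEFAULT_CRAWLER_ARGS:
        known = ", ".join(sorted(DEFAULT_CRAWLER_ARGS))
        raise ValueError(f"Unknown web crawler {name!r}; expected one of: {known}")
    # Deep-copy the defaults so callers never mutate the shared table.
    if arguments is None:
        resolved = copy.deepcopy(DEFAULT_CRAWLER_ARGS[name])
    else:
        resolved = dict(arguments)
    config.set_provider_config("web_crawler", name, resolved)
    return resolved
```

Calling `select_web_crawler(config, "JinaCrawler")` would then behave like the direct `set_provider_config` call, while a misspelled name fails immediately with a list of the accepted values.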

## 🔍 Web Crawler Options

### FireCrawl

[FireCrawl](https://docs.firecrawl.dev/introduction) is a cloud-based web crawling service designed for AI applications.

**Key features:**
- Simple API
- Managed service
- Advanced parsing

```python
config.set_provider_config("web_crawler", "FireCrawlCrawler", {})
```

??? tip "Setup Instructions"

    1. Sign up for FireCrawl and get an API key
    2. Set the API key as an environment variable:
       ```bash
       export FIRECRAWL_API_KEY="your_api_key"
       ```
    3. For more information, see the [FireCrawl documentation](https://docs.firecrawl.dev/introduction)
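
Because the API key comes from the environment, a missing `FIRECRAWL_API_KEY` may only show up once a crawl is attempted. A small pre-flight check can fail earlier with a clearer message. This is a sketch under that assumption; `require_env` is a hypothetical helper, not a DeepSearcher or FireCrawl function.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Treat empty or whitespace-only values as missing.
        raise RuntimeError(
            f"{name} is not set; export it before configuring the crawler"
        )
    return value
```

A typical use would be to call `require_env("FIRECRAWL_API_KEY")` just before `config.set_provider_config("web_crawler", "FireCrawlCrawler", {})`.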

### Crawl4AI

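[Crawl4AI](https://docs.crawl4ai.com/) is a Python package for web crawling with browser automation capabilities. Browser options are passed under the `browser_config` key; the example below runs the browser headless with verbose logging enabled.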

```python
config.set_provider_config("web_crawler", "Crawl4AICrawler", {"browser_config": {"headless": True, "verbose": True}})
```

??? tip "Setup Instructions"

    1. Install Crawl4AI:
       ```bash
       pip install crawl4ai
       ```
    2. Run the setup command:
       ```bash
       crawl4ai-setup
       ```
    3. For more information, see the [Crawl4AI documentation](https://docs.crawl4ai.com/)
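
When switching between a visible browser for debugging and a headless one for batch runs, it can help to build the arguments dictionary in one place. The sketch below covers only the two options that appear in this document (`headless` and `verbose`); `crawl4ai_arguments` is a hypothetical helper, and how other `browser_config` keys are handled is not covered here.

```python
from typing import Any, Dict


def crawl4ai_arguments(headless: bool = True, verbose: bool = False) -> Dict[str, Any]:
    """Build the arguments dict for Crawl4AICrawler (hypothetical helper)."""
    return {"browser_config": {"headless": headless, "verbose": verbose}}


# Quiet, headless settings for unattended runs.
batch_args = crawl4ai_arguments()

# Visible browser with verbose logging while debugging a page.
debug_args = crawl4ai_arguments(headless=False, verbose=True)
```

Either dictionary can then be passed as the third argument, for example `config.set_provider_config("web_crawler", "Crawl4AICrawler", debug_args)`.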

### Jina Reader

[Jina Reader](https://jina.ai/reader/) is a service for extracting content from web pages with high accuracy.

```python
config.set_provider_config("web_crawler", "JinaCrawler", {})
```

??? tip "Setup Instructions"

    1. Get a Jina API key
    2. Set the API key in one of the following environment variables:
       ```bash
       export JINA_API_TOKEN="your_api_key"
       # or
       export JINAAI_API_KEY="your_api_key"
       ```
    3. For more information, see the [Jina Reader documentation](https://jina.ai/reader/)
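
Since two variable names are accepted, it can be useful to report which one is actually set. The following sketch does that; `find_jina_key_var` is a hypothetical helper, and the order in which it checks the names is an assumption for illustration, not a statement about which variable takes precedence inside DeepSearcher.

```python
import os
from typing import Mapping, Optional, Sequence

# Variable names from the setup instructions above.
JINA_KEY_VARS = ("JINA_API_TOKEN", "JINAAI_API_KEY")


def find_jina_key_var(
    environ: Mapping[str, str] = os.environ,
    names: Sequence[str] = JINA_KEY_VARS,
) -> Optional[str]:
    """Return the first variable name with a non-empty value, or None."""
    for name in names:
        if environ.get(name, "").strip():
            return name
    return None
```

If `find_jina_key_var()` returns `None`, neither variable is set and the key should be exported before configuring `JinaCrawler`.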

### Docling Crawler

[Docling](https://docling-project.github.io/docling/) provides web crawling capabilities alongside its document processing features.

```python
config.set_provider_config("web_crawler", "DoclingCrawler", {})
```

??? tip "Setup Instructions"

    1. Install Docling:
       ```bash
       pip install docling
       ```
    2. For information on supported formats, see the [Docling documentation](https://docling-project.github.io/docling/usage/supported_formats/#supported-output-formats)
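
Crawlers that depend on a locally installed package fail at import time if the package is missing. A quick check with the standard library can confirm the installation beforehand. This is a minimal sketch; `is_installed` is a hypothetical helper, and it assumes the import names `docling` and `crawl4ai` match the pip package names used above.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Report whether a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report on the packages that the locally installed crawlers rely on.
for module in ("docling", "crawl4ai"):
    status = "installed" if is_installed(module) else "missing"
    print(f"{module}: {status}")
```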