# Web Crawler Configuration DeepSearcher supports various web crawlers to collect data from websites for processing and indexing. ## 📝 Basic Configuration ```python config.set_provider_config("web_crawler", "(WebCrawlerName)", "(Arguments dict)") ``` ## 📋 Available Web Crawlers | Crawler | Description | Key Feature | |---------|-------------|-------------| | **FireCrawlCrawler** | Cloud-based web crawling service | Simple API, managed service | | **Crawl4AICrawler** | Browser automation crawler | Full JavaScript support | | **JinaCrawler** | Content extraction service | High accuracy parsing | | **DoclingCrawler** | Doc processing with crawling | Multiple format support | ## 🔍 Web Crawler Options ### FireCrawl [FireCrawl](https://docs.firecrawl.dev/introduction) is a cloud-based web crawling service designed for AI applications. **Key features:** - Simple API - Managed Service - Advanced Parsing ```python config.set_provider_config("web_crawler", "FireCrawlCrawler", {}) ``` ??? tip "Setup Instructions" 1. Sign up for FireCrawl and get an API key 2. Set the API key as an environment variable: ```bash export FIRECRAWL_API_KEY="your_api_key" ``` 3. For more information, see the [FireCrawl documentation](https://docs.firecrawl.dev/introduction) ### Crawl4AI [Crawl4AI](https://docs.crawl4ai.com/) is a Python package for web crawling with browser automation capabilities. ```python config.set_provider_config("web_crawler", "Crawl4AICrawler", {"browser_config": {"headless": True, "verbose": True}}) ``` ??? tip "Setup Instructions" 1. Install Crawl4AI: ```bash pip install crawl4ai ``` 2. Run the setup command: ```bash crawl4ai-setup ``` 3. For more information, see the [Crawl4AI documentation](https://docs.crawl4ai.com/) ### Jina Reader [Jina Reader](https://jina.ai/reader/) is a service for extracting content from web pages with high accuracy. ```python config.set_provider_config("web_crawler", "JinaCrawler", {}) ``` ??? tip "Setup Instructions" 1. Get a Jina API key 2. Set the API key as an environment variable: ```bash export JINA_API_TOKEN="your_api_key" # or export JINAAI_API_KEY="your_api_key" ``` 3. For more information, see the [Jina Reader documentation](https://jina.ai/reader/) ### Docling Crawler [Docling](https://docling-project.github.io/docling/) provides web crawling capabilities alongside its document processing features. ```python config.set_provider_config("web_crawler", "DoclingCrawler", {}) ``` ??? tip "Setup Instructions" 1. Install Docling: ```bash pip install docling ``` 2. For information on supported formats, see the [Docling documentation](https://docling-project.github.io/docling/usage/supported_formats/#supported-output-formats)