
Deep Researcher

A deep research agent with a plugin-based system for crawlers and LLM models. It works with multiple model providers and search engines, and provides out-of-the-box analytics. A detailed report is generated at the end of each research run.

  • Author: Sakthi Santhosh Anumand
  • Created on: 15/03/2025

Objective

Deep Researcher implements an autonomous research agent designed to conduct comprehensive, iterative web-based research on complex topics. The system employs a depth-first exploration strategy that generates targeted search queries, extracts actionable insights from search results, and recursively explores follow-up questions to build a complete understanding of the research domain. The agent operates through configurable hyperparameters controlling query refinement, learning generation, and search depth, enabling researchers to balance thoroughness against API costs and execution time. At completion, the system produces both granular learnings from each search iteration and a synthesized research report that distills findings into actionable intelligence. The modular architecture ensures extensibility across LLM providers and search backends while maintaining provider-agnostic interfaces throughout the research workflow.

Setup and Usage Guide

The system requires Python 3.12 or higher. Install dependencies using uv sync or your preferred package manager, then configure API credentials:

```
OPENAI_API_KEY=your_openai_key_here
GEMINI_API_KEY=your_gemini_key_here
FIRECRAWL_API_KEY=your_firecrawl_key_here
```

This enables integration with OpenAI models and search, Google Gemini models with native search, and optionally Firecrawl for traditional web crawling. Initialize the researcher by selecting a crawler backend and LLM model, then configuring research parameters:

```python
from os import getenv

from dotenv import load_dotenv
from google.genai import Client
from openai import OpenAI

from lib.analytics import LLMAnalytics
from lib.constants import LLMIdentifier
from lib.crawlers import GeminiSearchCrawler, OpenAISearchCrawler
from lib.llm import OpenAICompatibleLLMModel
from lib.models.llm import DeepResearchHyperParameters
from lib.researcher import DeepResearcher


def main():
    load_dotenv(".env.local")

    openai_client = OpenAI()
    researcher = DeepResearcher(
        crawler=GeminiSearchCrawler(
            llm_identifier=LLMIdentifier.GEMINI_2_0_FLASH,
            llm_instance=Client(api_key=getenv("GEMINI_API_KEY")),
        ),
        llm_model=OpenAICompatibleLLMModel(
            llm_identifier=LLMIdentifier.GPT_4O, llm_instance=openai_client
        ),
        analytics_instance=LLMAnalytics(),
        research_parameters=DeepResearchHyperParameters(
            num_learnings=5,
            num_refinement_questions=3,
            learning_depth=5,
            learning_width=3,
        ),
    )
```

This configuration combines a Gemini-based search crawler with a GPT-4o reasoning model. The DeepResearchHyperParameters control research breadth and depth, while LLMAnalytics tracks token usage and costs across all API calls.

Execute the research workflow by loading your query and invoking the researcher instance:

```python
    with open("./assets/query.md", "r", encoding="utf-8") as file_handle:
        user_query = file_handle.read().strip()

    learnings, report = researcher(
        user_query=user_query,
        auto_query_refinement=True,
    )

    with open("./assets/learnings.md", "w", encoding="utf-8") as f:
        f.write("\n\n".join(learnings))

    with open("./assets/report.md", "w", encoding="utf-8") as f:
        f.write(report)


if __name__ == "__main__":
    main()
```

The researcher returns two outputs: a list of granular learnings from each search iteration, and a comprehensive synthesized report. Set auto_query_refinement=True for automatic query expansion or False to interactively answer clarification questions.

Swap crawler backends and LLM models to customize behavior:

```python
    researcher = DeepResearcher(
        crawler=OpenAISearchCrawler(
            llm_identifier=LLMIdentifier.GPT_4O_MINI,
            llm_instance=openai_client,
            search_context_size="medium",
        ),
        llm_model=OpenAICompatibleLLMModel(
            llm_identifier=LLMIdentifier.GPT_4O, llm_instance=openai_client
        ),
    )
```

This alternative uses OpenAI's integrated web search with medium context size. Available crawlers include FirecrawlCrawler for traditional scraping, OpenAISearchCrawler with configurable context (low, medium, high), and GeminiSearchCrawler for Google's native search. LLM models span OpenAI's O3-Mini, O1, GPT-4o, GPT-4o-Mini and Google's Gemini 2.0 Flash variants, each with distinct performance and pricing characteristics.

Technical Details / Architecture

```mermaid
graph TB
    subgraph "Research Orchestration Layer"
        DR[DeepResearcher]
        HP[DeepResearchHyperParameters]
        AN[LLMAnalytics]
    end

    subgraph "Provider Abstraction Layer"
        LLM[LLMModel Abstract Base]
        CR[Crawler Abstract Base]
        LLMCR[LLMCrawler Abstract Base]

        LLM --> OAILLM[OpenAICompatibleLLMModel]
        LLM --> GEMLLM[GeminiLLMModel]

        CR --> FCCR[FirecrawlCrawler]
        LLMCR --> OAICR[OpenAISearchCrawler]
        LLMCR --> GEMCR[GeminiSearchCrawler]
    end

    subgraph "External Integration Layer"
        OAIAPI[OpenAI API]
        GEMAPI[Google Gemini API]
        FCAPI[Firecrawl API]
    end

    subgraph "Data Models"
        SERP[SERPQuery Model]
        LEARN[Learning Model]
        REFINE[UserQueryRefinementQuestions Model]
        RESULTS[SERPQuerySearchResults Model]
    end

    DR -->|Uses| LLM
    DR -->|Uses| CR
    DR -->|Uses| LLMCR
    DR -->|Configures| HP
    DR -->|Tracks Via| AN

    OAILLM -->|Calls| OAIAPI
    GEMLLM -->|Calls| GEMAPI
    FCCR -->|Calls| FCAPI
    OAICR -->|Calls| OAIAPI
    GEMCR -->|Calls| GEMAPI

    DR -->|Generates| SERP
    DR -->|Extracts| LEARN
    DR -->|Produces| REFINE
    CR -->|Returns| RESULTS
    LLMCR -->|Returns| RESULTS

    AN -->|Monitors| OAIAPI
    AN -->|Monitors| GEMAPI

    style DR fill:#4a90e2
    style LLM fill:#50c878
    style CR fill:#50c878
    style LLMCR fill:#50c878
    style OAIAPI fill:#ff6b6b
    style GEMAPI fill:#ff6b6b
    style FCAPI fill:#ff6b6b
```

The architecture implements a three-layer abstraction separating research orchestration, provider interfaces, and external integrations. At the core, the DeepResearcher class manages the iterative research workflow through recursive depth-first exploration, maintaining state across search iterations and accumulating learnings that inform subsequent queries. The system generates SERP queries with explicit research goals, executes searches through the configured crawler, extracts structured learnings using LLM-powered analysis, and recursively explores follow-up questions until reaching the configured depth limit or width constraints.
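The recursive depth-first loop can be sketched as follows. All names and data shapes here are illustrative stand-ins, not the project's actual API; the real implementation delegates search and extraction to the configured crawler and LLM:

```python
from dataclasses import dataclass, field


@dataclass
class Learning:
    # An extracted insight plus follow-up questions that seed deeper searches.
    insight: str
    follow_up_questions: list[str] = field(default_factory=list)


def search_and_extract(query: str) -> Learning:
    # Placeholder for crawler search + LLM analysis; returns a canned learning
    # with two follow-up questions so the recursion has something to explore.
    return Learning(
        insight=f"insight for {query!r}",
        follow_up_questions=[f"{query} / follow-up {i}" for i in range(2)],
    )


def research(query: str, depth: int, width: int, learnings: list[Learning]) -> None:
    """Depth-first exploration: extract learnings, then recurse on follow-ups."""
    learning = search_and_extract(query)
    learnings.append(learning)
    if depth == 0:
        return  # depth limit reached for this branch
    # Explore at most `width` follow-up questions, halving the width at each
    # level (mirroring the hyperparameter behaviour described in this README).
    for follow_up in learning.follow_up_questions[:width]:
        research(follow_up, depth - 1, max(1, width // 2), learnings)


learnings: list[Learning] = []
research("initial query", depth=2, width=2, learnings=learnings)
print(len(learnings))  # 1 root + 2 children + 2 grandchildren = 5
```

Because the width shrinks with depth, the number of branches stays bounded instead of growing exponentially with each level.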

```mermaid
flowchart TD
    UQ[User Query Input]
    QR{Query Refinement Mode}
    AQR[Automatic Query Expansion]
    MQR[Interactive Clarification Questions]

    UQ --> QR
    QR -->|Auto| AQR
    QR -->|Manual| MQR

    SG0[Generate SERP Queries<br/>Depth 0 Width 3]
    AQR --> SG0
    MQR --> SG0

    SQ1[SERP Query 1]
    SQ2[SERP Query 2]
    SQ3[SERP Query 3]

    SG0 --> SQ1
    SG0 --> SQ2
    SG0 --> SQ3

    CR1[Crawler Search 1]
    LLM1[LLM Analysis 1]
    L1[Extract Learnings 1]

    SQ1 --> CR1
    CR1 --> LLM1
    LLM1 --> L1

    SG1[Generate SERP Queries<br/>Depth 1 Width 2]
    L1 --> SG1

    SQ1A[SERP Query 1A]
    SQ1B[SERP Query 1B]

    SG1 --> SQ1A
    SG1 --> SQ1B

    CR1A[Crawler Search 1A]
    LLM1A[LLM Analysis 1A]
    L1A[Extract Learnings 1A<br/>Max Depth Reached]

    SQ1A --> CR1A
    CR1A --> LLM1A
    LLM1A --> L1A

    CR1B[Crawler Search 1B]
    LLM1B[LLM Analysis 1B]
    L1B[Extract Learnings 1B<br/>Max Depth Reached]

    SQ1B --> CR1B
    CR1B --> LLM1B
    LLM1B --> L1B

    DOTQ2[Similar Process<br/>For Query 2]
    DOTQ3[Similar Process<br/>For Query 3]

    SQ2 -.-> DOTQ2
    SQ3 -.-> DOTQ3

    AGG[Accumulate All Learnings<br/>From All Depths]

    L1 --> AGG
    L1A --> AGG
    L1B --> AGG
    DOTQ2 -.-> AGG
    DOTQ3 -.-> AGG

    REP[Generate Final Report<br/>Using LLM]
    OUT[Output Learnings And Report]

    AGG --> REP
    REP --> OUT

    style UQ fill:#4a90e2,stroke:#333,stroke-width:2px
    style QR fill:#f39c12,stroke:#333,stroke-width:2px
    style SG0 fill:#9b59b6,stroke:#333,stroke-width:2px,color:#fff
    style SG1 fill:#9b59b6,stroke:#333,stroke-width:2px,color:#fff
    style AGG fill:#27ae60,stroke:#333,stroke-width:2px,color:#fff
    style REP fill:#27ae60,stroke:#333,stroke-width:2px,color:#fff
    style OUT fill:#27ae60,stroke:#333,stroke-width:2px,color:#fff
    style L1A fill:#e74c3c,stroke:#333,stroke-width:2px,color:#fff
    style L1B fill:#e74c3c,stroke:#333,stroke-width:2px,color:#fff
    style DOTQ2 fill:#95a5a6,stroke:#333,stroke-width:1px
    style DOTQ3 fill:#95a5a6,stroke:#333,stroke-width:1px
```

Provider abstraction occurs through abstract base classes that define uniform interfaces while accommodating provider-specific behaviors. The LLMModel hierarchy includes OpenAICompatibleLLMModel supporting OpenAI's chat completions API with structured output parsing, and GeminiLLMModel implementing Google's generation API with native JSON schema support. Both handle timeout scenarios, usage metadata extraction, and optional prompt caching for cost optimization. The Crawler and LLMCrawler hierarchies similarly abstract search operations, with FirecrawlCrawler making REST API calls to Firecrawl's search endpoint for markdown-formatted content, OpenAISearchCrawler leveraging OpenAI's responses API with web search tools, and GeminiSearchCrawler utilizing Google's native search grounding capabilities.
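The abstract-base-class pattern can be illustrated with a minimal sketch. The class and method names below are simplified for illustration and are not the project's exact interfaces:

```python
from abc import ABC, abstractmethod


class Crawler(ABC):
    """Uniform search interface; concrete subclasses wrap a specific backend."""

    @abstractmethod
    def search(self, query: str) -> list[str]:
        """Return raw result documents for a query."""


class EchoCrawler(Crawler):
    # Trivial stand-in for a real backend such as FirecrawlCrawler or
    # OpenAISearchCrawler; it returns a canned result instead of calling an API.
    def search(self, query: str) -> list[str]:
        return [f"result for {query}"]


# The orchestrator only depends on the abstract type, so backends are swappable.
crawler: Crawler = EchoCrawler()
results = crawler.search("solid state batteries")
```

Because `Crawler` declares `search` as abstract, it cannot be instantiated directly; any new backend must implement the same method signature, which is what keeps the research workflow provider-agnostic.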

Structured data flows through Pydantic models that enforce type safety and validation across the research pipeline. The SERPQuery model captures both the search string and associated research goals with guidance for future exploration. The Learning model combines extracted insights with follow-up questions that drive recursive research. Query refinement produces UserQueryRefinementQuestions for interactive clarification or automatic expansion. All LLM interactions support both free-form text generation and schema-constrained structured outputs, with the system automatically selecting the appropriate API methods based on the presence of response format specifications.
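The rough shape of these models can be sketched with standard-library dataclasses for portability; the project itself uses Pydantic, which layers runtime validation on top of the same structure, and the field names here are illustrative rather than exact:

```python
from dataclasses import dataclass, field


@dataclass
class SERPQuery:
    # A search string paired with the research goal it is meant to advance.
    query: str
    research_goal: str


@dataclass
class Learning:
    # An extracted insight plus the follow-up questions that drive recursion.
    insight: str
    follow_up_questions: list[str] = field(default_factory=list)


q = SERPQuery(
    query="solid-state battery energy density 2025",
    research_goal="Quantify current energy-density benchmarks",
)
```

Keeping both the search string and its goal in one object lets the learning-extraction prompt judge results against the intent of the query, not just its keywords.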

The analytics subsystem tracks comprehensive usage metrics across all API interactions, differentiating between search operations, structured completions, and standard text generation. Token accounting separates cached versus non-cached input tokens, completion tokens, and search-specific costs, enabling real-time cost projections based on model-specific pricing captured in the ModelParameters dataclass. The system calculates per-million-token costs for inputs (cached and non-cached) and outputs, plus per-thousand-query costs for search operations where applicable. This granular tracking provides visibility into cost drivers and supports budget-aware research parameter tuning.
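The per-million-token accounting reduces to simple arithmetic. The sketch below shows the calculation; the field names and rates are placeholders, not the project's real ModelParameters fields or actual pricing:

```python
from dataclasses import dataclass


@dataclass
class ModelParameters:
    # Illustrative USD rates per million tokens; real values differ per model.
    cost_per_m_input: float
    cost_per_m_cached_input: float
    cost_per_m_output: float


def estimate_cost(params: ModelParameters, cached_in: int,
                  uncached_in: int, out: int) -> float:
    """Cost in USD for one call, separating cached and non-cached input tokens."""
    return (
        cached_in * params.cost_per_m_cached_input
        + uncached_in * params.cost_per_m_input
        + out * params.cost_per_m_output
    ) / 1_000_000


pricing = ModelParameters(2.50, 1.25, 10.00)  # hypothetical rates
cost = estimate_cost(pricing, cached_in=50_000, uncached_in=10_000, out=2_000)
print(f"${cost:.4f}")  # $0.1075
```

With cached input billed at half the non-cached rate, as in this example, prompt caching on the large repeated context of an iterative research run directly halves a large share of the input cost.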

Configuration management through DeepResearchHyperParameters implements intelligent defaults and validation logic that prevents invalid parameter combinations. The width calculation automatically halves at each depth level to maintain computational feasibility, with maximum depth constraints derived from initial width settings. The system logs parameter adjustments and caps depth values that would generate impractical query loads. This design balances research thoroughness against practical resource constraints while maintaining transparency about system behavior through comprehensive logging at multiple verbosity levels.
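The halving behaviour implies a simple width schedule. This sketch mirrors the description above, though the exact capping and validation logic inside DeepResearchHyperParameters may differ:

```python
def width_schedule(initial_width: int, max_depth: int) -> list[int]:
    """Width at each depth level, halving per level with a floor of 1."""
    widths = []
    width = initial_width
    for _ in range(max_depth + 1):
        widths.append(width)
        width = max(1, width // 2)
    return widths


# With an initial width of 4 and depth 3, the per-level widths are 4, 2, 1, 1,
# so total query count stays bounded rather than growing exponentially.
print(width_schedule(4, 3))
```

Without the halving, a width-4 tree at depth 3 would issue 4 + 16 + 64 + 256 searches; the schedule above keeps each level no wider than the one before it.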

Contribution Guidelines

Contributions are welcome through pull requests that fix bugs, add LLM providers via abstract class inheritance, implement new crawler backends, or improve prompt engineering. Maintain type hints, follow the existing abstractions, preserve backward compatibility, and include usage examples for new providers. Test thoroughly across provider combinations before submitting.