
Web Search by Exa

An agent skill for Dev Tools scenarios. Original description: Neural web search, content extraction, company and people research, code search, and deep research via the Exa MCP server. Use when you need to: (1) search t...


SKILL.md

name: web-search-exa
description: "Neural web search, content extraction, company and people research, code search, and deep research via the Exa MCP server. Use when you need to: (1) search the web with semantic understanding — not just keywords, (2) find research papers, news, tweets, companies, or people, (3) extract clean content from URLs, (4) find semantically similar pages to a known URL, (5) get code examples and documentation, (6) run deep multi-step research with a report, (7) get a quick synthesized answer with citations. NOT for: local file operations, non-web tasks, or anything that doesn't involve web search or content retrieval."

Exa — Neural Web Search & Research

Exa is a neural search engine. Unlike keyword-based search, it understands meaning — you describe the page you're looking for and it finds it. Returns clean, LLM-ready content with no scraping needed.

MCP server: https://mcp.exa.ai/mcp
Free tier: generous rate limits, no key needed for basic tools
API key: dashboard.exa.ai/api-keys — unlocks higher limits + all tools
Docs: exa.ai/docs
GitHub: github.com/exa-labs/exa-mcp-server

Setup

Add the MCP server to your agent config:

# OpenClaw
openclaw mcp add exa --url "https://mcp.exa.ai/mcp"

Or in any MCP config JSON:

{
  "mcpServers": {
    "exa": {
      "url": "https://mcp.exa.ai/mcp"
    }
  }
}

To unlock all tools and raise the rate limits, append your API key as a query parameter:

https://mcp.exa.ai/mcp?exaApiKey=YOUR_EXA_KEY

To enable specific optional tools:

https://mcp.exa.ai/mcp?exaApiKey=YOUR_KEY&tools=web_search_exa,web_search_advanced_exa,people_search_exa,crawling_exa,company_research_exa,get_code_context_exa,deep_researcher_start,deep_researcher_check,deep_search_exa
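The two URL forms above can also be assembled in code, which avoids typos in long tool lists. This is a minimal sketch: the helper name `build_exa_mcp_url` is hypothetical, and only the `exaApiKey` and `tools` query parameters shown above are assumed to exist.

```python
from urllib.parse import urlencode

EXA_MCP_BASE = "https://mcp.exa.ai/mcp"


def build_exa_mcp_url(api_key=None, tools=None):
    """Return the MCP server URL, optionally with an API key and a tool list.

    Hypothetical helper: it only reproduces the URL shapes shown above.
    """
    params = {}
    if api_key:
        params["exaApiKey"] = api_key
    if tools:
        # The tools parameter is a comma-separated list of tool names.
        params["tools"] = ",".join(tools)
    if not params:
        return EXA_MCP_BASE
    # safe="," keeps the commas readable instead of percent-encoding them.
    return f"{EXA_MCP_BASE}?{urlencode(params, safe=',')}"


print(build_exa_mcp_url("YOUR_KEY", ["web_search_exa", "crawling_exa"]))
```

Keeping the key out of source control (for example, reading it from an environment variable before calling the helper) is a sensible default, since the key ends up in the URL.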

Tool Reference

Default tools (available without API key)

| Tool | What it does |
|------|-------------|
| web_search_exa | General-purpose web search — clean content, fast |
| get_code_context_exa | Code examples + docs from GitHub, Stack Overflow, official docs |
| company_research_exa | Company overview, news, funding, competitors |

Optional tools (enable via the `tools` param; some need an API key)

| Tool | What it does |
|------|-------------|
| web_search_advanced_exa | Full-control search: domain filters, date ranges, categories, content modes |
| crawling_exa | Extract full page content from a known URL — handles JS, PDFs, complex layouts |
| people_search_exa | Find LinkedIn profiles, professional backgrounds, experts |
| deep_researcher_start | Kick off an async multi-step research agent → detailed report |
| deep_researcher_check | Poll status / retrieve results from deep research |
| deep_search_exa | Single-call deep search with synthesized answer + citations (needs API key) |

web_search_exa

Fast general search. Describe what you're looking for in natural language.

Parameters:

query (string, required) — describe the page you want to find
numResults (int) — number of results, default 10
type — auto (best quality), fast (lower latency), deep (multi-step reasoning)
livecrawl — fallback (default) or preferred (always fetch fresh)
contextMaxCharacters (int) — cap the returned content size

web_search_exa {
  "query": "blog posts about using vector databases for recommendation systems",
  "numResults": 8
}

web_search_exa {
  "query": "latest OpenAI announcements March 2026",
  "numResults": 5,
  "type": "fast"
}
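If you call the server from your own code rather than through an agent, each example above becomes the `arguments` of an MCP `tools/call` request. The sketch below only builds and validates that JSON-RPC payload. The function name is hypothetical, the allowed values come from the parameter list above, and the transport (session initialization, HTTP headers) is deliberately left out.

```python
import json

ALLOWED_TYPES = {"auto", "fast", "deep"}
ALLOWED_LIVECRAWL = {"fallback", "preferred"}


def web_search_call(query, num_results=10, search_type=None,
                    livecrawl=None, context_max_characters=None,
                    request_id=1):
    """Build a JSON-RPC `tools/call` payload for web_search_exa.

    Hypothetical helper: it validates against the values listed above
    and does not send anything over the network.
    """
    if not query:
        raise ValueError("query is required")
    if search_type is not None and search_type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {sorted(ALLOWED_TYPES)}")
    if livecrawl is not None and livecrawl not in ALLOWED_LIVECRAWL:
        raise ValueError(f"livecrawl must be one of {sorted(ALLOWED_LIVECRAWL)}")

    arguments = {"query": query, "numResults": num_results}
    # Optional parameters are only sent when the caller sets them.
    if search_type is not None:
        arguments["type"] = search_type
    if livecrawl is not None:
        arguments["livecrawl"] = livecrawl
    if context_max_characters is not None:
        arguments["contextMaxCharacters"] = context_max_characters

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "web_search_exa", "arguments": arguments},
    }


payload = web_search_call("latest OpenAI announcements March 2026",
                          num_results=5, search_type="fast")
print(json.dumps(payload, indent=2))
```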

web_search_advanced_exa

The power-user tool. Everything web_search_exa does, plus domain filters, date filters, category targeting, and content extraction modes.

Extra parameters beyond basic search:

| Parameter | Type | What it does |
|-----------|------|-------------|
| includeDomains | string[] | Only return results from these domains (max 1200) |
| excludeDomains | string[] | Block results from these domains |
| category | string | Target content type — see table below |
| startPublishedDate | string | ISO date, results published after this |
| endPublishedDate | string | ISO date, results published before this |
| maxAgeHours | int | Content freshness — 0 = always livecrawl, -1 = cache only, 24 = cache if <24h |
| contents.highlights | object | Extractive snippets relevant to query. Set maxCharacters to control size |
| contents.text | object | Full page as clean markdown. Set maxCharacters to cap |
| contents.summary | object | LLM-generated summary. Supports query and JSON schema for structured extraction |

Categories:

| Category | Best for |
|----------|---------|
| company | Company pages, LinkedIn company profiles |
| people | LinkedIn profiles, professional bios, personal sites |
| research paper | arXiv, academic papers, peer-reviewed research |
| news | Current events, journalism |
| tweet | Posts from X/Twitter |
| personal site | Blogs, personal pages |
| financial report | SEC filings, earnings reports |

Examples

Research papers:

web_search_advanced_exa {
  "query": "transformer architecture improvements for long-context windows",
  "category": "research paper",
  "numResults": 15,
  "contents": { "highlights": { "maxCharacters": 3000 } }
}

Company list building with structured extraction:

web_search_advanced_exa {
  "query": "Series A B2B SaaS companies in climate tech founded after 2022",
  "category": "company",
  "numResults": 25,
  "contents": {
    "summary": {
      "query": "company name, what they do, funding stage, location",
      "schema": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "description": { "type": "string" },
          "funding": { "type": "string" },
          "location": { "type": "string" }
        }
      }
    }
  }
}

People search — find candidates with specific profiles:

web_search_advanced_exa {
  "query": "machine learning engineers at fintech startups in NYC with experience in fraud detection",
  "category": "people",
  "numResults": 20,
  "contents": { "highlights": { "maxCharacters": 2000 } }
}

Finding pages similar to a known URL:
Use the URL itself as the query — Exa will find semantically similar pages:

web_search_advanced_exa {
  "query": "https://linkedin.com/in/some-candidate-profile",
  "numResults": 15,
  "contents": { "highlights": { "maxCharacters": 2000 } }
}

Recent news with freshness control:

web_search_advanced_exa {
  "query": "AI regulation policy updates",
  "category": "news",
  "maxAgeHours": 72,
  "numResults": 10,
  "contents": { "highlights": { "maxCharacters": 4000 } }
}

Scoped domain search:

web_search_advanced_exa {
  "query": "authentication best practices",
  "includeDomains": ["owasp.org", "auth0.com", "docs.github.com"],
  "numResults": 10,
  "contents": { "text": { "maxCharacters": 5000 } }
}
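The examples above share one shape: a query plus optional category, domain, and date filters. A small builder can catch mistakes such as a misspelled category before the request is sent. This is a sketch under assumptions: the function name and the `days_back` convenience argument are hypothetical, and the exact timestamp format accepted for `startPublishedDate` is assumed to be a full ISO 8601 UTC string.

```python
from datetime import datetime, timedelta, timezone

# Categories taken from the table above.
CATEGORIES = {
    "company", "people", "research paper", "news",
    "tweet", "personal site", "financial report",
}
MAX_INCLUDE_DOMAINS = 1200  # limit stated in the parameter table


def advanced_search_args(query, category=None, days_back=None,
                         include_domains=None, num_results=10, now=None):
    """Build the arguments object for web_search_advanced_exa (hypothetical helper)."""
    if category is not None and category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    args = {"query": query, "numResults": num_results}
    if category is not None:
        args["category"] = category
    if include_domains:
        if len(include_domains) > MAX_INCLUDE_DOMAINS:
            raise ValueError("too many includeDomains entries")
        args["includeDomains"] = list(include_domains)
    if days_back is not None:
        # `now` is injectable so the result is reproducible in tests.
        now = now or datetime.now(timezone.utc)
        start = now - timedelta(days=days_back)
        # Assumed format: full ISO 8601 timestamp in UTC.
        args["startPublishedDate"] = start.strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return args


print(advanced_search_args("AI regulation policy updates",
                           category="news", days_back=7))
```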

company_research_exa

One-call company research. Returns business overview, recent news, funding, and competitive landscape.

company_research_exa { "query": "Stripe payments company overview and recent news" }

company_research_exa { "query": "what does Anduril Industries do and who are their competitors" }

people_search_exa

Find professionals by role, company, location, expertise. Returns LinkedIn profiles and bios.

people_search_exa { "query": "VP of Engineering at healthcare startups in San Francisco" }

people_search_exa { "query": "AI researchers specializing in multimodal models" }

get_code_context_exa

Search GitHub repos, Stack Overflow, and documentation for code examples and API usage patterns.

get_code_context_exa { "query": "how to implement rate limiting in Express.js with Redis" }

get_code_context_exa { "query": "Python asyncio connection pooling example with aiohttp" }

crawling_exa

Extract clean content from a specific URL. Handles JavaScript-rendered pages, PDFs, and complex layouts. Returns markdown.

crawling_exa { "url": "https://arxiv.org/abs/2301.07041" }

Use this when you already have the URL and just want to read the page.

deep_researcher_start + deep_researcher_check

Long-running async research. Exa's research agent searches, reads, and compiles a detailed report.

Start a research task:

deep_researcher_start {
  "query": "competitive landscape of AI code generation tools in 2026 — key players, pricing, technical approaches, market share"
}

Check status (use the researchId from the start response):

deep_researcher_check { "researchId": "abc123..." }

Poll deep_researcher_check until status is completed. The final response includes the full report.
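The polling step can be sketched as a bounded loop. The code below assumes the check response is a dict with a `status` key that eventually equals `"completed"`, as described above. The `check` callable stands in for however your client invokes `deep_researcher_check`, and the function name, interval, and attempt limit are all hypothetical choices.

```python
import time


def wait_for_research(check, research_id, interval_seconds=5.0,
                      max_attempts=120, sleep=time.sleep):
    """Poll `check(research_id)` until the status is "completed".

    `check` is a caller-supplied function wrapping deep_researcher_check;
    the response shape (a dict with a "status" key) is an assumption.
    """
    for attempt in range(max_attempts):
        response = check(research_id)
        if response.get("status") == "completed":
            return response  # the final response carries the full report
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(
        f"research {research_id} not completed after {max_attempts} checks"
    )
```

Bounding the number of attempts keeps an agent from waiting forever on a task that never finishes. Injecting `sleep` makes the loop easy to test without real delays.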

deep_search_exa

Single-call deep search: expands your query across multiple angles, searches, reads results, and returns a synthesized answer with grounded citations. Requires API key.

deep_search_exa { "query": "what are the leading approaches to multimodal RAG in production systems" }

Supports structured output via outputSchema:

deep_search_exa {
  "query": "top 10 aerospace companies by revenue",
  "type": "deep",
  "outputSchema": {
    "type": "object",
    "required": ["companies"],
    "properties": {
      "companies": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "name": { "type": "string" },
            "revenue": { "type": "string" },
            "hq": { "type": "string" }
          }
        }
      }
    }
  }
}
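Schemas like the one above are repetitive to write by hand. As a rough sketch, a helper can generate the same "object containing a list of records" shape from a list of field names. The function name is hypothetical and every field is assumed to be a string, as in the example.

```python
import json


def list_of_records_schema(list_name, fields):
    """Build a JSON Schema for {list_name: [{field: string, ...}, ...]}.

    Hypothetical helper mirroring the outputSchema example above.
    """
    return {
        "type": "object",
        "required": [list_name],
        "properties": {
            list_name: {
                "type": "array",
                "items": {
                    "type": "object",
                    # Every field is typed as a string, as in the example.
                    "properties": {name: {"type": "string"} for name in fields},
                },
            }
        },
    }


schema = list_of_records_schema("companies", ["name", "revenue", "hq"])
print(json.dumps(schema, indent=2))
```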

Query Craft

Exa is neural — it matches on meaning, not keywords. Write queries like you'd describe the ideal page to a colleague.

Do: "blog post about using embeddings for product recommendations at scale"
Don't: "embeddings product recommendations"

Do: "Stripe payments company San Francisco fintech"
Don't: "Stripe" (too ambiguous)

Use category when you know the content type — it makes a big difference.
For broader coverage, run 2-3 query variations in parallel and deduplicate results.
For agentic workflows, use highlights instead of full text — it's 10x more token-efficient while keeping the relevant parts.
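The "run variations in parallel and deduplicate" tip can be sketched as follows. The `search` callable is a stand-in for whatever function issues a query in your client, and each result is assumed to be a dict with a `url` key. Both are assumptions, as is the URL normalization rule.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit


def normalize_url(url):
    """Reduce a URL to host + path so near-identical links compare equal.

    Drops scheme, "www.", trailing slash, and query string. This is a
    deliberately coarse rule that may merge pages differing only by query.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def search_variations(search, queries):
    """Run `search(query)` for each query in parallel and merge unique results."""
    with ThreadPoolExecutor(max_workers=max(1, len(queries))) as pool:
        # pool.map preserves the order of `queries`.
        batches = list(pool.map(search, queries))
    seen, merged = set(), []
    for batch in batches:
        for result in batch:
            key = normalize_url(result["url"])
            if key not in seen:
                seen.add(key)
                merged.append(result)
    return merged
```

Results from earlier queries win ties, so put the most specific phrasing first.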

Token Efficiency

| Content mode | When to use |
|-------------|------------|
| highlights | Agent workflows, factual lookups, multi-step pipelines — most token-efficient |
| text | Deep analysis, when you need full page context |
| summary | Quick overviews, structured extraction with JSON schema |

Set maxCharacters on any content mode to control output size.
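To make the size cap hard to forget, the `contents` object can be produced by a helper that requires it. This is a minimal sketch with a hypothetical function name. It covers only the three modes in the table and the `maxCharacters` option.

```python
CONTENT_MODES = {"highlights", "text", "summary"}


def contents_for(mode, max_characters):
    """Build a `contents` object for one content mode with a size cap."""
    if mode not in CONTENT_MODES:
        raise ValueError(f"mode must be one of {sorted(CONTENT_MODES)}")
    if max_characters <= 0:
        raise ValueError("max_characters must be positive")
    return {mode: {"maxCharacters": max_characters}}


print(contents_for("highlights", 3000))
```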

When to Reach for Which Tool

| I need to... | Use |
|-------------|-----|
| Quick web lookup | web_search_exa |
| Research papers, academic search | web_search_advanced_exa + category: "research paper" |
| Company intel, competitive analysis | company_research_exa or advanced + category: "company" |
| Find people, candidates, experts | people_search_exa or advanced + category: "people" |
| Code examples, API docs | get_code_context_exa |
| Read a specific URL | crawling_exa |
| Find pages similar to a URL | web_search_advanced_exa with URL as query |
| Recent news / tweets | Advanced + category: "news" or "tweet" + maxAgeHours |
| Detailed research report | deep_researcher_start → deep_researcher_check |
| Quick answer with citations | deep_search_exa |
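An agent that routes requests programmatically could encode part of the table above as a lookup. The task keys below are hypothetical labels invented for this sketch. The tool names and category values come from the table.

```python
# Hypothetical task labels mapped to (tool name, extra arguments).
TOOL_FOR_TASK = {
    "quick_lookup": ("web_search_exa", {}),
    "research_papers": ("web_search_advanced_exa", {"category": "research paper"}),
    "company_intel": ("company_research_exa", {}),
    "find_people": ("people_search_exa", {}),
    "code_examples": ("get_code_context_exa", {}),
    "read_url": ("crawling_exa", {}),
    "recent_news": ("web_search_advanced_exa", {"category": "news"}),
    "research_report": ("deep_researcher_start", {}),
    "cited_answer": ("deep_search_exa", {}),
}


def pick_tool(task):
    """Return (tool_name, extra_arguments) for a task label."""
    try:
        tool, extra = TOOL_FOR_TASK[task]
    except KeyError:
        raise ValueError(f"unknown task: {task!r}") from None
    # Copy so callers can add a query without mutating the table.
    return tool, dict(extra)


print(pick_tool("research_papers"))
```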

Docs: exa.ai/docs — Dashboard: dashboard.exa.ai — Support: support@exa.ai

Category

Dev Tools

Risk tags

may need API key, network access, file access

Files

2