
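AnnData MCP Data Retrieval Tool

An agent skill for Data & APIs scenarios. Original description: AnnData MCP is a tool that retrieves information about AnnData objects over the MCP protocol, intended for biomedical data analysis.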

SKILL.md



name: AnnData MCP Data Retrieval Tool
description: AnnData MCP is a tool that retrieves information about AnnData objects over the MCP protocol, intended for biomedical data analysis.
version: 1.0.0




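⚠️ Mandatory: API Key

This Skill requires an API key to be configured before use.

  • On first use, if .env does not contain XBY_APIKEY, you must use the AskUserQuestion tool to ask the user for the API key
  • Once the user provides the key, call scripts.config.set_api_key(api_key) to save it, then continue processing
  • Get an API key: https://xiaobenyang.com
  • Never search for or fabricate data yourself when the API key is missing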

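Workflow (mandatory)

You (the LLM) are the routing layer: you interpret the user's intent, select the tool, and extract the parameters. The code only calls the API.

User input → you select a tool → extract the parameters that tool needs → call the function in scripts.tools → return the result to the user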

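Steps

  1. Check the API key: if scripts.config.settings.api_key is empty, ask the user with AskUserQuestion, then call scripts.config.set_api_key(key) to save it
  2. Select a tool: pick the tool function from the tool list below that matches the user's intent
  3. Extract parameters: extract the parameters required by the selected tool
  4. Call the tool: call the function in scripts.tools with keyword arguments, for example scripts.tools.get_summary(path='/data/sample.h5ad') (the path is illustrative)
  5. Return the result: organize the raw data returned by the tool and present it to the user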
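
The steps above can be sketched as follows. This is only an illustration under stated assumptions, not the skill's actual code: `fake_get_summary`, the `TOOLS` registry, `run_tool`, and the `ask_user` callback are hypothetical stand-ins for scripts.tools and AskUserQuestion, so that the snippet runs on its own without an API key.

```python
from typing import Any, Callable, Dict, Optional


def fake_get_summary(path: str) -> Dict[str, Any]:
    """Hypothetical stand-in for scripts.tools.get_summary (no real API call)."""
    return {"success": True, "message": "ok", "raw": {"path": path}}


# Hypothetical registry; the real skill exposes these functions in scripts.tools.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {"get_summary": fake_get_summary}


def run_tool(
    tool_name: str,
    params: Dict[str, Any],
    api_key: Optional[str],
    ask_user: Callable[[str], str],
) -> Dict[str, Any]:
    """Follow steps 1-5: make sure a key exists, then call the chosen tool."""
    if not api_key:
        # Step 1: never proceed without a key; ask the user instead.
        api_key = ask_user("Please provide your API key")
        # The real skill would now call scripts.config.set_api_key(api_key).
    if not api_key:
        return {"success": False, "message": "API key is required", "raw": None}
    # Steps 2-4: the tool and its parameters were already chosen by the model.
    result = TOOLS[tool_name](**params)
    # Step 5: hand the result back so its "raw" payload can be shown to the user.
    return result
```

Passing the key and the question callback in as arguments keeps the sketch free of global state; the real skill keeps the key in scripts.config.settings instead.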

Tool Selection Rules

Select the tool function that matches the user's intent:

| User intent | Tool function |
|---------|---------|
| View the raw data of an AnnData object. | scripts.tools.view_raw_data |
| Get a summary of an AnnData object from a file or URL. | scripts.tools.get_summary |
| Provide basic descriptive statistics (e.g., count, mean, std, min, max, etc. or value counts) for an attribute or attribute value of an optionally filtered AnnData object. | scripts.tools.get_descriptive_stats |

If any parameters are missing, use AskUserQuestion to ask the user for them.
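
One way to decide which parameters still need to be asked for is sketched below. The required-parameter lists are copied from the parameter tables in this document; the `REQUIRED_PARAMS` dict and the `missing_params` helper are hypothetical names introduced for illustration and are not part of scripts.tools.

```python
from typing import Any, Dict, List

# Required parameters, copied from the parameter tables in this document.
REQUIRED_PARAMS: Dict[str, List[str]] = {
    "view_raw_data": ["path", "attribute"],
    "get_summary": ["path"],
    "get_descriptive_stats": ["path", "attribute"],
}


def missing_params(tool_name: str, params: Dict[str, Any]) -> List[str]:
    """Return the required parameter names that are absent or empty."""
    return [
        name
        for name in REQUIRED_PARAMS[tool_name]
        if params.get(name) in (None, "")
    ]
```

For example, `missing_params("view_raw_data", {"path": "/data/sample.h5ad"})` reports that `attribute` is still needed, so only that one question has to be put to the user.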


Tool Function Reference


scripts.tools.view_raw_data

Description: View the raw data of an AnnData object.

Parameters

|Parameter|Type|Required|Default|Description|
|------|-------|------|-----|----|
|path|string|true| |Absolute path or URL to the AnnData file|
|attribute|string|true| |The attribute to view|
|key|null|false| |The key of the attribute value to view. Can be a single string or a list of strings for nested key retrieval (e.g., ['key1', 'key2'] to access attr_obj['key1']['key2']).|
|columns_or_genes|null|false| |Column names or gene names to select. For pandas.DataFrame attributes (e.g., obs, var), these are column names. For 'X' or 'layers' attributes, these are gene names (from var_names) and are used instead of col_start_index/col_stop_index. If None, the entire attribute is considered or col_start_index/col_stop_index is used. Also accepts glob-like patterns as input, e.g. ['RE', 'CD4'].|
|row_start_index|integer|false|0|The start index for the row slice. Only applied to attributes or attribute values with a suitable type.|
|row_stop_index|integer|false|5|The stop index for the row slice. Only applied to attributes or attribute values with a suitable type.|
|col_start_index|integer|false|0|The start index for the column slice. Only applied to attributes or attribute values with a suitable type.|
|col_stop_index|integer|false|5|The stop index for the column slice. Only applied to attributes or attribute values with a suitable type.|
|filter_column|null|false| |The column name of the dataframe to filter by. Only applicable when the selected attribute (or attribute value) is a dataframe. Must be provided TOGETHER with filter_operator and filter_value.|
|filter_operator|null|false| |The operator to use for the dataframe filter.|
|filter_value|null|false| |The value(s) to filter the dataframe by.|
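
The nested key retrieval described for `key` can be illustrated with a small sketch. How the server resolves keys internally is not shown in this document; the `get_nested` helper and the sample `uns` data are assumptions made up for the example.

```python
from functools import reduce
from typing import Any, List, Mapping, Union


def get_nested(attr_obj: Mapping[str, Any], key: Union[str, List[str]]) -> Any:
    """Resolve a single key or a list of keys against a nested mapping."""
    keys = [key] if isinstance(key, str) else list(key)
    # ['key1', 'key2'] walks attr_obj['key1']['key2'], one level per key.
    return reduce(lambda obj, k: obj[k], keys, attr_obj)


# Made-up data shaped like a nested AnnData attribute such as `uns`.
uns = {"pca": {"params": {"zero_center": True}}}
print(get_nested(uns, ["pca", "params"]))  # {'zero_center': True}
print(get_nested(uns, "pca"))              # {'params': {'zero_center': True}}
```

A single string and a one-element list behave the same way, which matches the description that `key` accepts either form.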


scripts.tools.get_summary

Description: Get a summary of an AnnData object from a file or URL.

Parameters

|Parameter|Type|Required|Default|Description|
|------|-------|------|-----|----|
|path|string|true| |Absolute path or URL to the AnnData file (.h5ad or .zarr)|


scripts.tools.get_descriptive_stats

Description: Provide basic descriptive statistics (e.g., count, mean, std, min, max, etc. or value counts) for an attribute or attribute value of an optionally filtered AnnData object.

Parameters

|Parameter|Type|Required|Default|Description|
|------|-------|------|-----|----|
|path|string|true| |Absolute path or URL to the AnnData file (.h5ad or .zarr)|
|attribute|string|true| |The attribute to describe|
|key|null|false| |The key of the attribute value to explore. Can be a single string or a list of strings for nested key retrieval (e.g., ['key1', 'key2'] to access attr_obj['key1']['key2']). Should be None for attributes X, obs, and var.|
|columns_or_genes|null|false| |The columns or genes to describe. For pandas.DataFrame attributes (e.g., obs, var), these are column names. For 'X' or 'layers' attributes, these are gene names (from var_names). If None, the entire dataset is considered. Also accepts glob-like patterns as input, e.g. ['RE', 'CD4'].|
|return_value_counts_for_categorical|boolean|false|false|Whether to return the value counts for categorical columns.|
|filter_attribute|string|false| |The attribute to filter by. One of 'obs' or 'var' or None for no filtering. Has to be provided TOGETHER with filter_column, filter_operator, and filter_value.|
|filter_column|null|false| |The column name of the obs or var dataframe to filter by.|
|filter_operator|null|false| |The operator to use for the filter.|
|filter_value|null|false| |The value(s) to filter by.|
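
Because the four filter parameters of this tool have to be provided together, it can help to check the group before calling the tool. The sketch below assumes the underscore spelling of the parameter names (as in `filter_column`); `FILTER_PARAMS` and `check_filter_group` are hypothetical helpers, not part of scripts.tools.

```python
from typing import Any, Dict, List

# The four filter parameters of get_descriptive_stats that must travel together.
FILTER_PARAMS = ("filter_attribute", "filter_column", "filter_operator", "filter_value")


def check_filter_group(params: Dict[str, Any]) -> List[str]:
    """Return the filter parameters still missing from a partially filled group.

    An empty list means the group is consistent: either all four are set
    or none of them is.
    """
    provided = [name for name in FILTER_PARAMS if params.get(name) is not None]
    if not provided or len(provided) == len(FILTER_PARAMS):
        return []
    return [name for name in FILTER_PARAMS if name not in provided]
```

A non-empty result lists exactly the parameters to ask the user for, so a half-specified filter is completed rather than silently dropped.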



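Handling Return Values

Tool functions return a dict:

  • result["raw"] - raw data returned by the API (JSON); organize this data and present it to the user
  • result["success"] - whether the call succeeded (True/False)
  • result["message"] - status message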
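
A minimal sketch of handling such a result dict is shown below. The three keys come from the list above; the `render_result` helper and the wording of the failure text are assumptions for illustration.

```python
import json
from typing import Any, Dict


def render_result(result: Dict[str, Any]) -> str:
    """Turn a tool result dict into text for the user."""
    if not result.get("success"):
        # Surface the status message instead of inventing data.
        return f"Request failed: {result.get('message', 'unknown error')}"
    # Pretty-print the raw payload; ensure_ascii=False keeps non-ASCII text readable.
    return json.dumps(result.get("raw"), indent=2, ensure_ascii=False)
```

Checking `success` first keeps a failed call from being presented as if it had returned data.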

Project Structure

xiaobenyang_gaokao_skill/
├── scripts/
│   ├── __init__.py
│   ├── config.py       # Configuration management + set_api_key()
│   ├── call_api.py      # API client + call_api()
│   └── tools.py         # Tool functions (called directly)
├── requirements.txt
└── SKILL.md

Notes

  1. An API key is required; if none is set, you must ask the user for it via AskUserQuestion
  2. Never search for or fabricate data yourself when the API key is missing