---
name: AnnData MCP Data Retrieval Tool
description: AnnData MCP is a tool for retrieving information about AnnData objects over MCP, intended for biomedical data analysis.
version: 1.0.0
---

# AnnData MCP Data Retrieval Tool

AnnData MCP is a tool for retrieving information about AnnData objects over MCP, intended for biomedical data analysis.

---

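## ⚠️ Mandatory: API Key

**This Skill cannot be used until an API key is configured.**

- On first use, if `.env` does not contain `XBY_APIKEY`, **you must ask the user for the API key with the AskUserQuestion tool**
- Once the user provides the key, call `scripts.config.set_api_key(api_key)` to save it, then continue processing
- Get an API key at https://xiaobenyang.com
- **Do not** search for or fabricate data yourself when the API key is missing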
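
The key check above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `resolve_api_key`, `ask_user`, and `save_key` are hypothetical names. In the skill itself the prompt would go through AskUserQuestion and the save through `scripts.config.set_api_key`. The sketch also assumes the contents of `.env` are available as a plain mapping.

```python
from typing import Callable, Mapping


def resolve_api_key(
    env: Mapping[str, str],
    ask_user: Callable[[], str],
    save_key: Callable[[str], None],
) -> str:
    """Return the configured API key, prompting the user only when none exists."""
    key = (env.get("XBY_APIKEY") or "").strip()
    if key:
        return key
    # No key configured: ask the user instead of guessing or fabricating data.
    key = ask_user().strip()
    if not key:
        raise ValueError("An API key is required; refusing to continue without one.")
    save_key(key)  # hypothetical stand-in for scripts.config.set_api_key
    return key


saved_keys = []
demo_key = resolve_api_key({}, lambda: "demo-key", saved_keys.append)
```

Passing the prompt and save steps in as callables keeps the sketch runnable without the skill installed and makes the "ask once, then persist" behavior easy to check.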

---

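## Workflow (Mandatory)

You (the LLM) are the routing layer: you interpret the user's intent, select a tool, and extract its parameters. The code only calls the API.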

```
User input → you select a tool → extract the parameters that tool needs → call the function in scripts.tools → return the result to the user
```

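### Steps

1. **Check the API key**: if `scripts.config.settings.api_key` is empty, ask the user with AskUserQuestion, then call `scripts.config.set_api_key(key)` to save the key
2. **Select a tool**: pick the tool function from the tool list below that matches the user's intent
3. **Extract parameters**: extract the parameters the selected tool requires
4. **Call the tool**: call the function in `scripts.tools` with **keyword arguments**, for example `scripts.tools.get_summary(path='/data/pbmc.h5ad')` (the path is illustrative)
5. **Return the result**: tidy up the `raw` data returned by the tool and present it to the user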
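
Steps 2 to 4 amount to a name-based dispatch with keyword arguments. The sketch below shows one way this could look. `run_tool` and `fake_get_summary` are hypothetical: the fake stands in for `scripts.tools.get_summary` so the example runs without the skill installed, and the file path is made up.

```python
from typing import Any, Callable, Dict

ToolFn = Callable[..., Dict[str, Any]]


def run_tool(tools: Dict[str, ToolFn], tool_name: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Call the selected tool with keyword arguments and return its result dict."""
    if tool_name not in tools:
        raise KeyError(f"Unknown tool: {tool_name}")
    # Omit parameters that were not extracted so the tool's own defaults apply.
    kwargs = {name: value for name, value in params.items() if value is not None}
    return tools[tool_name](**kwargs)


def fake_get_summary(path: str) -> Dict[str, Any]:
    """Stand-in for scripts.tools.get_summary; the real function calls the API."""
    return {"success": True, "message": "ok", "raw": {"path": path}}


result = run_tool({"get_summary": fake_get_summary}, "get_summary", {"path": "/data/pbmc.h5ad"})
```

With the skill installed, the registry would presumably map the same names to the real functions in `scripts.tools`.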

---

## Tool Selection Rules

Select the tool function that matches the user's intent:

| User intent | Tool function |
|---------|---------|
| View the raw data of an AnnData object. | `scripts.tools.view_raw_data` |
| Get a summary of an AnnData object from a file or URL. | `scripts.tools.get_summary` |
| Provide basic descriptive statistics (e.g., count, mean, std, min, max, or value counts) for an attribute or attribute value of an optionally filtered AnnData object. | `scripts.tools.get_descriptive_stats` |

**If any parameters are missing, ask the user for them with AskUserQuestion.**
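
One way to decide what to ask for is to compare the extracted parameters against each tool's required parameters. The sketch below takes the required names from the parameter tables in this document. `REQUIRED_PARAMS` and `missing_params` are hypothetical helpers, not part of `scripts.tools`.

```python
from typing import Any, Dict, List

# Required parameters per tool, as listed in the parameter tables of this document.
REQUIRED_PARAMS: Dict[str, List[str]] = {
    "view_raw_data": ["path", "attribute"],
    "get_summary": ["path"],
    "get_descriptive_stats": ["path", "attribute"],
}


def missing_params(tool_name: str, params: Dict[str, Any]) -> List[str]:
    """List the required parameters that are absent or empty for the given tool."""
    return [
        name
        for name in REQUIRED_PARAMS[tool_name]
        if params.get(name) in (None, "")
    ]


to_ask = missing_params("view_raw_data", {"path": "/data/pbmc.h5ad"})
```

Here `to_ask` names the parameters that still need to be requested from the user before the tool is called.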

---

## Tool Function Reference

---

## scripts.tools.view_raw_data
Description: View the raw data of an AnnData object.
### Parameters
|Parameter|Type|Required|Default|Description|
|------|-------|------|-----|----|
|path|string|true| |Absolute path or URL to the AnnData file|
|attribute|string|true| |The attribute to view|
|key|null|false| |The key of the attribute value to view. Can be a single string or a list of strings for nested key retrieval (e.g., ['key1', 'key2'] to access attr_obj['key1']['key2']).|
|columns_or_genes|null|false| |Column names or gene names to select. For pandas.DataFrame attributes (e.g., obs, var), these are column names. For 'X' or 'layers' attributes, these are gene names (from var_names) and are used instead of col_start_index/col_stop_index. If None, the entire attribute is considered or col_start_index/col_stop_index is used. Also accepts glob-like patterns as input, e.g. ['RE*', 'CD4*'].|
|row_start_index|integer|false|0|The start index for the row slice. Only applied to attributes or attribute values with a suitable type.|
|row_stop_index|integer|false|5|The stop index for the row slice. Only applied to attributes or attribute values with a suitable type.|
|col_start_index|integer|false|0|The start index for the column slice. Only applied to attributes or attribute values with a suitable type.|
|col_stop_index|integer|false|5|The stop index for the column slice. Only applied to attributes or attribute values with a suitable type.|
|filter_column|null|false| |The column name of the dataframe to filter by. Only applicable when the selected attribute (or attribute value) is a dataframe. Must be provided TOGETHER with filter_operator and filter_value.|
|filter_operator|null|false| |The operator to use for the dataframe filter.|
|filter_value|null|false| |The value(s) to filter the dataframe by.|
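
The `key` and `columns_or_genes` descriptions can be illustrated in plain Python. This is only a sketch of the described semantics and may differ from what the server does internally: `resolve_key` and `expand_patterns` are hypothetical helpers, the sample data is made up, and using `fnmatch` for the glob-like patterns is an assumption.

```python
from fnmatch import fnmatchcase
from typing import Any, List, Sequence, Union


def resolve_key(attr_obj: Any, key: Union[str, Sequence[str], None]) -> Any:
    """Follow a single key, or a list of keys, into a nested mapping."""
    if key is None:
        return attr_obj
    keys = [key] if isinstance(key, str) else list(key)
    for k in keys:
        attr_obj = attr_obj[k]  # ['key1', 'key2'] reads attr_obj['key1']['key2']
    return attr_obj


def expand_patterns(patterns: Sequence[str], names: Sequence[str]) -> List[str]:
    """Return the names that match any glob-like pattern, in their original order."""
    return [n for n in names if any(fnmatchcase(n, p) for p in patterns)]


uns = {"neighbors": {"params": {"n_neighbors": 15}}}
n_neighbors = resolve_key(uns, ["neighbors", "params", "n_neighbors"])
selected = expand_patterns(["RE*", "CD4*"], ["CD4", "CD40LG", "CD8A", "REL", "MS4A1"])
```

In this example `selected` contains `CD4`, `CD40LG`, and `REL`, because `*` also matches an empty suffix.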

---

## scripts.tools.get_summary
Description: Get a summary of an AnnData object from a file or URL.
### Parameters
|Parameter|Type|Required|Default|Description|
|------|-------|------|-----|----|
|path|string|true| |Absolute path or URL to the AnnData file (.h5ad or .zarr)|

---

## scripts.tools.get_descriptive_stats
Description: Provide basic descriptive statistics (e.g., count, mean, std, min, max, or value counts) for an attribute or attribute value of an optionally filtered AnnData object.
### Parameters
|Parameter|Type|Required|Default|Description|
|------|-------|------|-----|----|
|path|string|true| |Absolute path or URL to the AnnData file (.h5ad or .zarr)|
|attribute|string|true| |The attribute to describe|
|key|null|false| |The key of the attribute value to explore. Can be a single string or a list of strings for nested key retrieval (e.g., ['key1', 'key2'] to access attr_obj['key1']['key2']). Should be None for attributes X, obs, and var.|
|columns_or_genes|null|false| |The columns or genes to describe. For pandas.DataFrame attributes (e.g., obs, var), these are column names. For 'X' or 'layers' attributes, these are gene names (from var_names). If None, the entire dataset is considered. Also accepts glob-like patterns as input, e.g. ['RE*', 'CD4*'].|
|return_value_counts_for_categorical|boolean|false|false|Whether to return the value counts for categorical columns.|
|filter_attribute|string|false| |The attribute to filter by. One of 'obs' or 'var' or None for no filtering. Has to be provided TOGETHER with filter_column, filter_operator, and filter_value.|
|filter_column|null|false| |The column name of the obs or var dataframe to filter by.|
|filter_operator|null|false| |The operator to use for the filter.|
|filter_value|null|false| |The value(s) to filter by.|
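
The filter parameters are all-or-nothing: `filter_attribute` has to be provided together with `filter_column`, `filter_operator`, and `filter_value`. Below is a minimal sketch of checking that rule before calling the tool. `build_stats_kwargs` is a hypothetical helper, and the document does not list the accepted operator strings, so none are validated here.

```python
from typing import Any, Dict, Optional


def build_stats_kwargs(
    path: str,
    attribute: str,
    filter_attribute: Optional[str] = None,
    filter_column: Optional[str] = None,
    filter_operator: Optional[str] = None,
    filter_value: Any = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for get_descriptive_stats, enforcing the filter rule."""
    filters = {
        "filter_attribute": filter_attribute,
        "filter_column": filter_column,
        "filter_operator": filter_operator,
        "filter_value": filter_value,
    }
    given = {name: value for name, value in filters.items() if value is not None}
    if given and len(given) != len(filters):
        missing = sorted(set(filters) - set(given))
        raise ValueError(f"Filter parameters must be provided together; missing: {missing}")
    if filter_attribute is not None and filter_attribute not in ("obs", "var"):
        raise ValueError("filter_attribute must be 'obs' or 'var'")
    kwargs: Dict[str, Any] = {"path": path, "attribute": attribute}
    kwargs.update(given)
    return kwargs


unfiltered = build_stats_kwargs("/data/pbmc.h5ad", "obs")
```

The resulting dict can be unpacked into the call, for example `scripts.tools.get_descriptive_stats(**unfiltered)`.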

---

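## Handling Return Values

Tool functions return a `dict`:
- `result["raw"]` - the raw data returned by the API (JSON). **Tidy this data up and present it to the user directly**
- `result["success"]` - whether the call succeeded (True/False)
- `result["message"]` - status message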
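
A minimal sketch of handling this dict might look like the following. `present_result` is a hypothetical helper. Showing `message` on failure and pretty-printing `raw` as JSON on success are assumptions about what counts as a tidy presentation.

```python
import json
from typing import Any, Dict


def present_result(result: Dict[str, Any]) -> str:
    """Turn a tool result dict into text suitable for showing to the user."""
    if not result.get("success"):
        # Surface the status message rather than inventing data.
        return f"Request failed: {result.get('message', 'unknown error')}"
    return json.dumps(result["raw"], ensure_ascii=False, indent=2)


failure_text = present_result({"success": False, "message": "bad path", "raw": None})
```

On success the function returns the `raw` payload as indented JSON, which can then be summarized for the user.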

---

## Project Structure

```
xiaobenyang_gaokao_skill/
├── scripts/
│   ├── __init__.py
│   ├── config.py       # Configuration management + set_api_key()
│   ├── call_api.py      # API client + call_api()
│   └── tools.py         # Tool functions (called directly)
├── requirements.txt
└── SKILL.md
```

---

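## Notes

1. **An API key is required.** If none is configured, you must ask the user for it with AskUserQuestion
2. **Do not** search for or fabricate data yourself when the API key is missing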