
# Scientific Figure Analysis Pipeline

A complete scientific paper figure analysis workflow: **PDF extraction (PyMuPDF) → Kimi K2.6 multimodal vision → scientific reasoning → reverse engineering**.

## Overview

```
PDF Paper
   │
   ├─ Stage 1: PyMuPDF Extraction
   │   ├─ All embedded images (.jpeg/.png)
   │   ├─ Full text (captions + body)
   │   └─ Image metadata (dimensions, page number)
   │
   ├─ Stage 2: Kimi K2.6 Vision Analysis ★ Core
   │   ├─ base64 encoded images
   │   ├─ OpenAI SDK → api.moonshot.cn
   │   └─ Extract all visible values/labels/trends
   │
   ├─ Stage 3: Scientific Understanding
   │   ├─ Figure type recognition (18 types)
   │   ├─ Multimodal reasoning (vision + text)
   │   └─ Conservative confidence labeling
   │
   └─ Stage 4: Reverse Engineering
       ├─ Software inference (Prism/ggplot2/PyMOL...)
       └─ Pipeline reconstruction
```
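
Stage 1 collects captions along with the images. As a minimal sketch of how caption lines might be pulled out of page text (for example, the string returned by PyMuPDF's `page.get_text()`), the snippet below uses a regular expression. The pattern and the `find_captions` helper are illustrative assumptions, not part of the skill itself, and real papers will need a more tolerant pattern.

```python
import re

# Assumed caption convention: a line that starts with "Figure 3" or "Fig. 3",
# followed by ".", ":" or "|" and then the caption text.
CAPTION_RE = re.compile(
    r"^(?:Figure|Fig\.?)\s*(\d+)\s*[.:|]\s*(.+)$",
    re.IGNORECASE | re.MULTILINE,
)


def find_captions(page_text: str) -> dict[int, str]:
    """Map figure number -> first caption line found in the page text."""
    captions: dict[int, str] = {}
    for match in CAPTION_RE.finditer(page_text):
        number = int(match.group(1))
        # Keep the first hit per figure; later hits are usually continuations.
        captions.setdefault(number, match.group(2).strip())
    return captions
```

In-text mentions such as "as shown in Figure 1, ..." are skipped because the pattern only matches at the start of a line and requires a delimiter after the number.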

## Features

- **Multi-panel figure detection**: splits a/b/c/d sub-panels while preserving their labels and legends
- **Kimi K2.6 vision AI**: reads values that appear only in the figure and not in the text (Kd, RMSD, fold change, OD values)
- **18 scientific figure types**: microscopy, fluorescence, heatmaps, volcano plots, PCA, phylogeny, Western Blot, pathways, and more
- **Software fingerprinting**: GraphPad Prism, ggplot2, matplotlib, ImageJ, BioRender, Illustrator
- **Pipeline reconstruction**: infers normalization, clustering, differential expression, and sequencing preprocessing workflows
- **Confidence-aware**: designed to avoid hallucination; uncertain findings are flagged as speculative
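
One way to picture the confidence labeling is a small record type that carries a confidence tier with every claim. The sketch below is an assumption for illustration only: the tier names, the `Finding` class, and the rendering format are hypothetical and are not taken from the skill.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    # Hypothetical tiers; the skill's actual labels may differ.
    OBSERVED = "observed"        # read directly from the figure
    INFERRED = "inferred"        # supported by the caption or body text
    SPECULATIVE = "speculative"  # a guess that must be flagged in the report


@dataclass
class Finding:
    claim: str
    confidence: Confidence

    def render(self) -> str:
        """Return the claim, tagging anything that was not directly observed."""
        if self.confidence is Confidence.OBSERVED:
            return self.claim
        return f"{self.claim} [{self.confidence.value}]"
```

With this shape, a software guess such as `Finding("Plotted with ggplot2", Confidence.SPECULATIVE).render()` carries its `[speculative]` tag into the report instead of reading as fact.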

## Installation

```bash
pip install pymupdf opencv-python pdfplumber openai
```

Optional (OCR):
```bash
sudo apt install tesseract-ocr
```

## Quick Start

```python
import base64
import os

import fitz  # PyMuPDF
from openai import OpenAI

# 1. Extract figures from PDF
doc = fitz.open("paper.pdf")
os.makedirs("figures_output", exist_ok=True)

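for pn in range(doc.page_count):
    # get_images(full=True) yields one tuple per embedded image; item 0 is its xref
    for i, img in enumerate(doc[pn].get_images(full=True)):
        base = doc.extract_image(img[0])  # dict holding the raw bytes and file extension
        fname = f"figures_output/page{pn+1:02d}_img{i:02d}.{base['ext']}"
        with open(fname, "wb") as f:
            f.write(base["image"])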

# 2. Vision analysis with Kimi K2.6
client = OpenAI(
    api_key=os.environ.get("MOONSHOT_API_KEY"),
    base_url="https://api.moonshot.cn/v1",
)

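# Pick one extracted figure; adjust the path to a file that step 1 actually wrote.
image_path = "figures_output/page03_img00.jpeg"
with open(image_path, "rb") as f:
    image_data = f.read()

# Take the image subtype from the file extension ("jpeg", "png", ...).
ext = os.path.splitext(image_path)[1].lstrip(".").lower()
image_url = f"data:image/{ext};base64,{base64.b64encode(image_data).decode('utf-8')}"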

completion = client.chat.completions.create(
    model="kimi-k2.6",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": image_url}},
            {"type": "text", "text": "Describe this scientific figure, including all visible values, labels, and trends."},
        ],
    }],
    extra_body={"thinking": {"type": "disabled"}},
    max_tokens=4096,
)

print(completion.choices[0].message.content)
```
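
When many figures are processed, the data-URL step can be wrapped in a small helper. The sketch below is an assumption rather than part of the skill: `image_to_data_url` is a hypothetical name, and it relies on the standard library's `mimetypes` module to infer the MIME type from the file extension.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path: str) -> str:
    """Encode an image file as a base64 data URL, inferring the MIME type from its extension."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognised image file: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"
```

The result can be passed as the `url` value of the `image_url` message part, for example `image_to_data_url("figures_output/page03_img00.jpeg")`.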

Set your API key:
```bash
export MOONSHOT_API_KEY="sk-..."   # from https://platform.kimi.com
```
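
Because the Quick Start reads the key with `os.environ.get`, a missing variable only surfaces once the API call fails. A minimal sketch of a fail-fast check follows; the `require_api_key` helper is a hypothetical name introduced for illustration.

```python
import os


def require_api_key(name: str = "MOONSHOT_API_KEY") -> str:
    """Return the key stored in the environment variable, or raise if it is unset or blank."""
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the pipeline")
    return key
```

The returned value can then be passed as `api_key=require_api_key()` when constructing the client.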

## Figure Types Supported

| Type | Features |
|------|----------|
| Microscopy / Fluorescence | Grayscale/fluorescence channels, scale bars |
| Heatmap | Color scale matrix, row/column clustering |
| Volcano Plot | -log10(p) vs log2FC, threshold lines |
| PCA | Scatter + ellipse, variance explained |
| UMAP / t-SNE | Cluster distribution, perplexity annotation |
| Phylogenetic Tree | Branch topology, bootstrap values |
| Western Blot | Bands, molecular weight markers |
| Pathway Diagram | Arrows, nodes, enzyme labels |
| Workflow Chart | Step arrows, method boxes |
| Bar/Box/Violin Plot | Error bars, density contours |
| Sequencing QC | FastQC curves, duplication rates |
| Genome Browser | Tracks, read pileups |
| Spatial Transcriptomics | Tissue section + expression overlay |
| Single-cell Clustering | UMAP + marker genes |
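
A caption often hints at the figure type before any vision call is made. The sketch below shows one possible text-only pre-check using keyword matching. It is an assumption for illustration and not how the skill classifies figures: the keyword lists and the `guess_figure_types` helper are hypothetical, and only a few rows of the table are covered.

```python
# Hypothetical keyword hints; the keys reuse row names from the table above.
FIGURE_TYPE_HINTS = {
    "Volcano Plot": ("volcano", "log2fc", "log2 fold change"),
    "Heatmap": ("heatmap", "heat map"),
    "PCA": ("pca", "principal component"),
    "UMAP / t-SNE": ("umap", "t-sne", "tsne"),
    "Western Blot": ("western blot", "immunoblot"),
    "Phylogenetic Tree": ("phylogen", "bootstrap"),
}


def guess_figure_types(caption: str) -> list[str]:
    """Return every figure type whose keywords appear in the caption (case-insensitive)."""
    text = caption.lower()
    return [
        name
        for name, words in FIGURE_TYPE_HINTS.items()
        if any(word in text for word in words)
    ]
```

Substring matching is deliberately crude: it can return several types for a multi-panel caption, and an empty list simply means the caption gave no hint.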

## File Structure

```
scientific-figure-analysis/
├── SKILL.md                                    # Main skill documentation
└── references/
    ├── kimi-k2.6-vision-api.md                 # Kimi K2.6 API reference
    └── busr-paper-case-study.md                # Full case study (text-only fallback)
```

## License

MIT

## Contact & Citation

Designed and published by Huang Kai (huangjk8023@yeah.net).