Skill Vetter
An agent skill for security scenarios. Original description: Security-first skill vetting for AI agents. Use before installing any skill from ClawdHub, GitHub, or other sources. Checks for red flags, permission scope, and suspicious patterns.
name: ppx-parse
version: 0.2.0
title: PPX Parse
description: >
  Parse PDFs and images into Markdown/JSON using the ppx CLI.
  Use when the user asks to OCR scanned PDFs or screenshots, extract tables from PDFs,
  convert PDF/image to Markdown, preserve document layout, or inspect parsing output.
  Also triggers on: 解析PDF、图片转文字、扫描件识别、扫描件转文字、提取表格、
  PDF转Markdown、文档解析、OCR识别、识别图片文字、解析图片、提取文档内容。
metadata:
  openclaw:
    requires:
      bins:
homepage: https://github.com/memect/memect-ppx
Use the local ppx CLI to parse PDFs and images into structured Markdown and JSON.
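Because everything here depends on a locally installed ppx executable, a pre-flight check can be useful before parsing. The sketch below is an illustration only: `check_environment` and `MIN_PYTHON` are hypothetical names, it is not the content of scripts/check_ppx_env.sh, and it assumes that ppx is discoverable on PATH and that Python 3.12 is the minimum version.

```python
import shutil
import sys

# Assumption: minimum interpreter version, taken from the ">= 3.12" requirement.
MIN_PYTHON = (3, 12)


def check_environment(version_info=None, which=shutil.which):
    """Return a list of problems; an empty list means the basics look fine.

    Hypothetical helper. `version_info` and `which` are injectable so the
    check can be exercised without changing the real environment.
    """
    if version_info is None:
        version_info = sys.version_info

    problems = []

    # Compare only (major, minor) against the minimum.
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {version_info[0]}.{version_info[1]} is older than "
            f"{MIN_PYTHON[0]}.{MIN_PYTHON[1]}"
        )

    # shutil.which returns None when the executable is not on PATH.
    if which("ppx") is None:
        problems.append("ppx not found on PATH; install it in a virtual environment")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("problem:", problem)
```

An empty result only says that the interpreter is new enough and that a ppx executable was found. It does not verify that runtime dependencies load.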
Requires Python >= 3.12.
If ppx is missing, read references/troubleshooting.md and create a virtual environment before installing dependencies.
Keep version synchronized from the repository pyproject.toml with scripts/sync_version.py.

Check the environment with scripts/check_ppx_env.sh.
If ppx is missing, create or use a virtual environment and install PPX there.

Use --ocr auto by default.
Use --ocr yes for scanned PDFs or screenshots.
Use --ocr no for native PDFs when OCR causes noise.
Use --table auto by default.
Use --table llm only when the user needs highest table accuracy and an LLM backend is configured.

Run ppx parse <input> -o <output>. Output files:
doc.md
doc.json
pages/
images/ when figures are extracted

Reference material lives in references/.

ppx parse report.pdf -o output/
ppx parse scan.pdf --ocr yes -o output/
ppx parse figure.png -o output/
ppx parse report.pdf --pages "1-5,10" -o output/
ppx parse report.pdf --table llm --backend deepseek -o output/
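The flag guidance and the example commands can be folded into a small command builder. This is a minimal sketch, not part of PPX: `build_ppx_command` and its keyword arguments are hypothetical, and it uses only the flags that appear in this document (`--ocr`, `--table`, `--backend`, `--pages`, `-o`). It assumes that omitting `--ocr` and `--table` yields the `auto` behavior, as the plain examples suggest.

```python
import shlex
from typing import List, Optional


def build_ppx_command(
    input_path: str,
    output_dir: str,
    *,
    scanned: bool = False,
    ocr_noise: bool = False,
    high_accuracy_tables: bool = False,
    backend: Optional[str] = None,
    pages: Optional[str] = None,
) -> List[str]:
    """Return an argv list for `ppx parse` (hypothetical helper)."""
    cmd = ["ppx", "parse", input_path]

    # OCR: force it for scans and screenshots, disable it for native PDFs
    # where OCR adds noise, otherwise add no flag (assumed to mean auto).
    if scanned:
        cmd += ["--ocr", "yes"]
    elif ocr_noise:
        cmd += ["--ocr", "no"]

    # Optional page selection, e.g. "1-5,10".
    if pages:
        cmd += ["--pages", pages]

    # LLM table mode only makes sense with a configured backend.
    if high_accuracy_tables:
        if backend is None:
            raise ValueError("--table llm requires a configured LLM backend")
        cmd += ["--table", "llm", "--backend", backend]

    cmd += ["-o", output_dir]
    return cmd


if __name__ == "__main__":
    argv = build_ppx_command("scan.pdf", "output/", scanned=True)
    print(shlex.join(argv))  # ppx parse scan.pdf --ocr yes -o output/
```

Returning an argv list rather than a string lets the result go straight to `subprocess.run` without shell quoting concerns. `shlex.join` is used only for display.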
Inspect the output in doc.md, doc.json, or page-level files.
Read references/cli-options.md when choosing parse flags.
Read references/backend-config.md when using DeepSeek, Paddle, or GLM backends.
Read references/troubleshooting.md when PPX is missing, Python is too old, or runtime dependencies fail.
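After a parse run, it can help to confirm which of the output artifacts named in this document were produced. The following is a rough sketch: `summarize_output` is a hypothetical helper, and it assumes that doc.md, doc.json, pages/, and images/ sit directly under the directory passed to `-o`. It checks for existence only and makes no assumption about the JSON schema.

```python
from pathlib import Path


def summarize_output(output_dir):
    """Report which expected artifacts exist in a ppx output directory.

    Hypothetical helper; the flat layout under `output_dir` is an assumption.
    """
    out = Path(output_dir)
    return {
        "doc.md": (out / "doc.md").is_file(),
        "doc.json": (out / "doc.json").is_file(),
        "pages/": (out / "pages").is_dir(),
        # images/ is expected only when figures were extracted.
        "images/": (out / "images").is_dir(),
    }


if __name__ == "__main__":
    for name, present in summarize_output("output/").items():
        print(f"{name}: {'found' if present else 'missing'}")
```

A missing images/ entry is not necessarily a failure, since that directory appears only when figures are extracted.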