AI AGENT SKILLS

Visla AI Video Creation

An agent skill for research scenarios. Original description: Creates AI-generated videos from text scripts, URLs, or PPT/PDF documents using Visla. Use when the user asks to generate a video, turn a webpage into a vide...


SKILL.md

name: visla
description: Creates AI-generated videos from text scripts, URLs, or PPT/PDF documents using Visla. Use when the user asks to generate a video, turn a webpage into a video, or convert a PPT/PDF into a video, or when the user asks to check Visla account credits/balance.
argument-hint: <script|url|doc|account|avatar|voice> [script|URL|file]
metadata:
clawdbot:
emoji: ""
requires:
env: [ "VISLA_API_KEY", "VISLA_API_SECRET" ]
primaryEnv: "VISLA_API_KEY"
files: [ "scripts/*" ]

Visla Video Generation

Version: 260501-1423

Create AI-generated videos from text scripts, web URLs, or documents (PPT/PDF) using Visla's OpenAPI.

Before You Start

Credentials:

IMPORTANT: Never output API keys/secrets in responses.

Check if ~/.config/visla/.credentials exists (do NOT read it yet).
If the file exists, use a choice-based confirmation to ask the user:

"Found saved credentials. Allow reading ~/.config/visla/.credentials?"
Options: Allow / No

If the user selects Allow: proceed with the command.
If the user selects No, or the file does not exist:

Ask the user to provide credentials via one of:

Environment variables (VISLA_API_KEY, VISLA_API_SECRET)
CLI arguments (--key, --secret)
Direct input of API key and secret

If provided credentials fail with VISLA_CLI_ERROR_CODE=missing_credentials or VISLA_CLI_ERROR_CODE=auth_failed, ask the user to re-enter valid credentials.

Only process local files (scripts/docs) explicitly provided by the user, and remind users to avoid uploading sensitive
data.

Tell the user: this is a one-time setup (once configured, they won't need to do this again)
Tell the user: get API Key and Secret from https://www.visla.us/visla-api
Do not repeat the secrets back in the response.

Credential validity check (practical):

If credentials exist but running account fails with VISLA_CLI_ERROR_CODE=missing_credentials or VISLA_CLI_ERROR_CODE=auth_failed, treat credentials as invalid and ask the user to provide real ones.
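The validity check above can be sketched in Python. This is a minimal illustration, not the CLI's own logic: it assumes the error code shows up in the CLI output as a literal VISLA_CLI_ERROR_CODE=<code> token, and the helper names extract_error_code and needs_new_credentials are hypothetical.

```python
import re
from typing import Optional

# The two credential-related codes named in this document.
# The CLI may emit other codes that are not listed here.
AUTH_ERROR_CODES = {"missing_credentials", "auth_failed"}

# Assumption: the code is printed as "VISLA_CLI_ERROR_CODE=<code>".
_ERROR_RE = re.compile(r"VISLA_CLI_ERROR_CODE=([A-Za-z0-9_]+)")


def extract_error_code(cli_output: str) -> Optional[str]:
    """Return the first error code found in the CLI output, or None."""
    match = _ERROR_RE.search(cli_output)
    return match.group(1) if match else None


def needs_new_credentials(cli_output: str) -> bool:
    """True when the output reports a credential-related error code."""
    return extract_error_code(cli_output) in AUTH_ERROR_CODES
```

When needs_new_credentials returns True, the agent would ask the user for fresh credentials rather than retrying with the saved ones.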

File format (bash/zsh):

export VISLA_API_KEY="your_key"
export VISLA_API_SECRET="your_secret"

For PowerShell (temporary session):

$env:VISLA_API_KEY = "your_key"
$env:VISLA_API_SECRET = "your_secret"
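For illustration, a file in the bash/zsh format above could be read without sourcing it in a shell. The sketch below is an assumption about how such parsing might look, not a description of how scripts/visla_cli.py actually reads the file; parse_credentials and masked are hypothetical helpers.

```python
import re
from typing import Dict

# Matches lines such as: export VISLA_API_KEY="your_key"
_EXPORT_RE = re.compile(r"^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$")
_WANTED = ("VISLA_API_KEY", "VISLA_API_SECRET")


def parse_credentials(text: str) -> Dict[str, str]:
    """Extract the two Visla variables from export-style lines."""
    creds: Dict[str, str] = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        match = _EXPORT_RE.match(stripped)
        if not match:
            continue
        name, value = match.group(1), match.group(2).strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        if name in _WANTED:
            creds[name] = value
    return creds


def masked(value: str) -> str:
    """Fixed-width mask so a secret is never echoed back to the user."""
    return "********" if value else "(empty)"
```

Any status message shown to the user should use masked(...) rather than the raw value, in line with the rule against repeating secrets.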

Scripts: scripts/visla_cli.py (Python), scripts/visla_cli.sh (Bash)

Platform Execution

Default strategy:

Prefer Bash on macOS when dependencies are available (the Bash CLI avoids Python SSL-stack issues on some macOS setups).

Prefer Python when you're already using a well-configured Python (or when Bash dependencies are missing).

Bash (recommended on macOS; also works on Linux-like environments):

# With user consent, you may source ~/.config/visla/.credentials
export VISLA_API_KEY="your_key"
export VISLA_API_SECRET="your_secret"
./scripts/visla_cli.sh <command>

Python (cross-platform):

python3 scripts/visla_cli.py --key "your_key" --secret "your_secret" <command>
# Or, credentials are auto-detected from ~/.config/visla/.credentials (with user consent):
python3 scripts/visla_cli.py <command>

Windows native (PowerShell/CMD without Bash; Python):

# PowerShell
$env:VISLA_API_KEY = "your_key"
$env:VISLA_API_SECRET = "your_secret"
python scripts/visla_cli.py <command>

Windows note:

The agent should prefer running the Python CLI on Windows unless it has verified a Bash environment (WSL/Git Bash) is available.

For simple scripts, pass directly: python scripts/visla_cli.py script "Scene 1: ..."
For multi-line or complex scripts, use stdin with - (recommended, no temp files):

  @"
  Scene 1: ...
  Scene 2: ...
  "@ | python scripts/visla_cli.py script -

If you have Python Launcher installed, py -3 scripts/visla_cli.py <command> may work better than python.
Credentials:
The Python CLI auto-detects ~/.config/visla/.credentials when present.
On Windows the default path is typically: %USERPROFILE%\.config\visla\.credentials.

Note: do not print credentials. Prefer environment variables or auto-detected credentials with explicit user consent.
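The platform strategy in this section can be condensed into a small decision function. This is a sketch under stated assumptions: choose_cli is a hypothetical helper, the caller decides what "Bash dependencies available" means (the document does not list them), and on Windows the sketch always picks Python even though a verified WSL/Git Bash environment would also be acceptable.

```python
from typing import List


def choose_cli(system: str, bash_deps_available: bool) -> List[str]:
    """Pick the command prefix for the Visla CLI.

    `system` is the value returned by platform.system():
    "Darwin" for macOS, "Windows", "Linux", and so on.
    """
    if system == "Darwin" and bash_deps_available:
        # Bash is preferred on macOS when its dependencies are present.
        return ["./scripts/visla_cli.sh"]
    if system == "Windows":
        # Python is the default on native Windows.
        return ["python", "scripts/visla_cli.py"]
    # Everywhere else, fall back to the cross-platform Python CLI.
    return ["python3", "scripts/visla_cli.py"]
```

A caller might use choose_cli(platform.system(), shutil.which("bash") is not None) + ["account"], keeping in mind that the presence of bash alone is only a stand-in for a real dependency check.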

Commands

| Command | Description |
|-----------------------------------|---------------------------------------------------|
| /visla script <script-or-@file> | Create video from a script (text or a local file) |
| /visla url <URL> | Create video from web page URL |
| /visla doc <file> | Create video from document (PPT/PDF) |
| /visla idea <text-or-@file> | Create video from an idea |
| /visla visual <file> [file ...] | Create video from visual resources (images/videos), supports multiple files |
| /visla speech <file> [file ...] | Create video from speech (audio/video file), supports multiple files |
| /visla account | Show account info and credit balance |
| /visla avatar | List available AI avatars |
| /visla voice | List available AI voices |

Important: For avatar and voice commands:

Run the full CLI command (./visla_cli.sh avatar or ./visla_cli.sh voice).
You may filter the output before presenting to the user:
For avatar: remove Thumbnail: lines
For voice: remove URL: lines
Categorize and format avatar results as follows:
Group avatars by gender category (Female, Male, Neutral, Dynamic)
List each avatar name with (n) where n = number of looks
For each look, show: Look Name (lookUuid)
Format: - AvatarName (n): Look1 (uuid), Look2 (uuid), ...
Example:

    **Female (16):**
    - Emma (5): Blue Dress (1000145), Patterned Dress (1000146), Black Blazer (1000147), Light Gray Blazer (1000148), Emerald Green Pantsuit (1000149)

Categorize voice results by language/region (e.g., System, US English, Chinese, Japanese, French, etc.)
You must NOT omit any items from the list. The user must see all available avatars/voices, even if the list is long.
Agents must use the exact ID from the listing when configuring videos.
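The avatar formatting rules above can be sketched as follows. The input structure (a list of dicts with name, gender, and looks keys) is hypothetical, since the raw output of the avatar command is not shown here, and the number in each category header is assumed to be the count of avatars in that category.

```python
from typing import Dict, List

# Category order as listed in the formatting rules.
CATEGORY_ORDER = ["Female", "Male", "Neutral", "Dynamic"]


def format_avatars(avatars: List[dict]) -> str:
    """Group avatars by gender and list every look with its ID."""
    grouped: Dict[str, List[dict]] = {name: [] for name in CATEGORY_ORDER}
    for avatar in avatars:
        # Unknown categories are kept too, so no avatar is ever omitted.
        grouped.setdefault(avatar["gender"], []).append(avatar)

    lines: List[str] = []
    for category, members in grouped.items():
        if not members:
            continue
        lines.append(f"**{category} ({len(members)}):**")
        for avatar in members:
            looks = ", ".join(
                f"{look['name']} ({look['uuid']})" for look in avatar["looks"]
            )
            lines.append(f"- {avatar['name']} ({len(avatar['looks'])}): {looks}")
        lines.append("")  # blank line between categories
    return "\n".join(lines).rstrip()
```

The function never truncates: every avatar and every look passed in appears in the result, which matches the requirement that no items be omitted.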

Optional Parameters

| Parameter | Description |
|-----------------------|-----------------------------------------------------------------|
| -c, --config <file> | Path to JSON config file with video options |
| --avatar <id> | Avatar ID to use for the video (get list from avatar command) |
| --voice <id> | Voice ID to use for the video (get list from voice command) |

visual command specific

| Parameter | Description |
|-----------------------|-----------------------------------------------------------------|
| --script, -s <text> | Script or description text (or @filename) |
| --style <style> | Video style: montage, storytelling (default), explainer |

speech command specific

| Parameter | Description |
|-----------------------|-----------------------------------------------------------------|
| --function <func> | Speech to video function: SPEECH_TO_VIDEO_SUMMARY or SPEECH_TO_VIDEO_FULL_LENGTH |

All other options (aspect_ratio, pace, burn_subtitles, footage_options, bgm_options, etc.) can be set in the config
file.

Cleanup: After video creation completes, delete the config file unless it's intended for reuse.

Config File Format (JSON)

All video options can be stored in a JSON config file (nested structure matches API request body):

{
  "video_title": "My Video",
  "video_description": "Video description",
  "project_function": "SPEECH_TO_VIDEO_SUMMARY",
  "script_text_mode": "ai_rewrite",
  "doc_usage": "page_by_page_walkthrough",
  "speaker_notes_verbatim": false,
  "target_video": {
    "aspect_ratio": "16:9",
    "video_pace": "fast",
    "burn_subtitles": false,
    "video_duration_in_seconds": 60
  },
  "avatar_options": {
    "use_avatar": false,
    "look_id": 12345,
    "avatar_layout": "smart_composition",
    "enable_auto_wallpaper": true,
    "enable_in_preview": true
  },
  "voice_options": {
    "use_voice": false,
    "voice_id": 1
  },
  "footage_options": {
    "enable_footage": true,
    "use_free_stocks": true,
    "use_premium_stocks": true,
    "use_premium_stocks_getty": true,
    "use_private_stocks": true,
    "private_stock_ids": 123456
  },
  "bgm_options": {
    "enable_bgm": true,
    "use_free_stocks": true,
    "use_premium_stocks": true
  }
}

Note: avatar_options.avatar_layout accepts only: host_only, host_pip, smart_composition.

CLI arguments (avatar, voice) override config file values.

Source of truth for the exact CLI surface: run scripts/visla_cli.sh --help or python3 scripts/visla_cli.py --help.
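To illustrate the override rule and the avatar_layout restriction, here is a minimal sketch. It assumes that --avatar maps to avatar_options.look_id and --voice to voice_options.voice_id; that mapping is a guess based on the config keys shown above, not confirmed CLI behavior, and both function names are hypothetical.

```python
import copy
from typing import Optional

# The only layouts the document says avatar_layout accepts.
ALLOWED_LAYOUTS = {"host_only", "host_pip", "smart_composition"}


def apply_cli_overrides(
    config: dict,
    avatar: Optional[int] = None,
    voice: Optional[int] = None,
) -> dict:
    """Return a copy of config in which CLI values win over file values."""
    merged = copy.deepcopy(config)  # leave the caller's dict untouched
    if avatar is not None:
        # Assumed mapping: --avatar -> avatar_options.look_id
        merged.setdefault("avatar_options", {})["look_id"] = avatar
    if voice is not None:
        # Assumed mapping: --voice -> voice_options.voice_id
        merged.setdefault("voice_options", {})["voice_id"] = voice
    return merged


def is_valid_layout(config: dict) -> bool:
    """True if avatar_layout is absent or one of the accepted values."""
    layout = config.get("avatar_options", {}).get("avatar_layout")
    return layout is None or layout in ALLOWED_LAYOUTS
```

Checking the layout before calling the CLI gives the user an early, local error instead of a failed API request.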

Script Format

**Scene 1** (0-10 sec):
**Visual:** A futuristic calendar flipping to 2025 with digital patterns.
**Narrator:** "AI is evolving rapidly! Here are 3 game-changing AI trends."

**Scene 2** (10-25 sec):
**Visual:** Text: "Trend #1: Generative AI Everywhere." Show tools like ChatGPT.
**Narrator:** "Generative AI is dominating industries—creating content and images."
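A script in the format above can be assembled from structured scene data. The sketch below is illustrative only: the Scene dataclass and render_script are hypothetical names, and the output simply mirrors the layout of the sample scenes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Scene:
    start: int       # start time in seconds
    end: int         # end time in seconds
    visual: str      # description of what appears on screen
    narration: str   # narrator line, without surrounding quotes


def render_script(scenes: List[Scene]) -> str:
    """Render scenes in the Scene / Visual / Narrator layout."""
    blocks = []
    for number, scene in enumerate(scenes, start=1):
        blocks.append(
            f"**Scene {number}** ({scene.start}-{scene.end} sec):\n"
            f"**Visual:** {scene.visual}\n"
            f'**Narrator:** "{scene.narration}"'
        )
    # Scenes are separated by a blank line, as in the sample.
    return "\n\n".join(blocks)
```

The resulting text can be piped to the script command through stdin, which avoids writing a temporary file.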

Workflow

The script, url, doc, idea, visual, and speech commands execute the complete flow automatically:

Create project
Poll until generation completes (may take a few minutes)
Auto-export and return download link

Execution Instructions:

Inform user that video generation takes some time
Report progress status periodically during polling

Timeout Guidance

This workflow typically takes 3-10 minutes, but can take up to ~30 minutes in the worst case. Set the task/command timeout to >= 30 minutes (Windows defaults are often ~10 minutes and need to be increased). If you cannot change the timeout, warn the user up front and, on timeout, ask whether to continue or switch to a step-by-step run.

If timeout occurs, the CLI returns project_uuid in the output. Inform the user they can manually check project status and continue later using the Visla web interface or API.
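The CLI performs the polling itself, but the pattern it follows (poll, report progress, give up after a deadline) can be sketched generically. Everything here is an assumption for illustration: poll_until_done is a hypothetical helper, the status strings "completed" and "failed" are placeholders rather than documented Visla values, and the 1800-second default reflects the 30-minute ceiling from the timeout guidance.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    check_status: Callable[[], str],
    report: Callable[[str], None],
    interval_s: float = 15.0,
    timeout_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Poll until a terminal status is seen; return None on timeout."""
    deadline = clock() + timeout_s
    while True:
        status = check_status()
        report(status)  # surface progress to the user on every poll
        if status in ("completed", "failed"):  # placeholder status names
            return status
        if clock() + interval_s > deadline:
            # The next poll would land past the deadline: stop here.
            return None
        sleep(interval_s)
```

A None result corresponds to the timeout case described above, where the user is told the project can still be checked later through the Visla web interface or API. The sleep and clock parameters exist so the loop can be exercised without real waiting.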

Examples

/visla script @myscript.txt
/visla script "Scene 1: ..."
/visla url https://blog.example.com/article
/visla doc presentation.pptx
/visla idea "Create a video about machine learning"
/visla idea @my_idea.txt
/visla visual image.jpg
/visla visual photo1.jpg photo2.jpg photo3.jpg
/visla visual image.jpg --script "Description of the images..."
/visla visual image.jpg --style montage
/visla speech interview.m4a
/visla speech podcast.mp3 audio1.mp3 audio2.mp3
/visla speech podcast.mp3 --function SPEECH_TO_VIDEO_SUMMARY
/visla account
/visla avatar
/visla voice

# With config file
/visla script "Scene 1: Hello" -c config.json

# With avatar/voice (CLI overrides config)
/visla script "Scene 1: Hello" --avatar avatar_123 --voice voice_456

Supported Document Formats

PowerPoint: .pptx, .ppt
PDF: .pdf

Supported Media Formats

Visual Resources (visual command)

Images: .jpg, .jpeg, .png, .gif, .webp
Videos: .mp4, .mov, .avi, .mkv

Audio/Speech (speech command)

Audio: .mp3, .wav, .m4a, .aac, .flac
Videos: .mp4, .mov, .avi, .mkv

Output Format

Start: Display "Visla Skill v260501-1423" when skill begins
End: Display "Visla Skill v260501-1423 completed" when skill finishes

Security

The CLI scripts enforce the following safety measures to prevent unauthorized file access:

Path traversal: Paths containing .. are rejected.
System directories: Access to /etc/, /proc/, /sys/, /dev/, /run/, /var/log/ (and Windows equivalents) is denied.
Text file extension restriction: The @file syntax in script, idea, and visual --script commands only accepts .txt, .md, .srt, .vtt, .csv files.
Document/media file validation: The doc, visual, and speech commands validate file extensions against supported formats before upload.
Credentials: The Python CLI auto-detects ~/.config/visla/.credentials only. No arbitrary credential file paths are accepted.
User consent: The agent must ask for user consent before accessing local files, as specified in the "Before You Start" section.
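The file checks listed above can be approximated with a few pure functions. This is a sketch of the described rules, not the code in the CLI scripts: is_path_allowed is a hypothetical helper, Windows system directories are left out because the document does not name them, and the path is checked as written without resolving symlinks.

```python
import os
from typing import Set

# Extension sets taken from this document.
TEXT_EXTS = {".txt", ".md", ".srt", ".vtt", ".csv"}   # @file syntax
DOC_EXTS = {".pptx", ".ppt", ".pdf"}                  # doc command

# POSIX system directories named in the Security section.
BLOCKED_PREFIXES = ("/etc/", "/proc/", "/sys/", "/dev/", "/run/", "/var/log/")


def is_path_allowed(path: str, allowed_exts: Set[str]) -> bool:
    """Apply the traversal, system-directory, and extension checks."""
    normalized = path.replace("\\", "/")
    if ".." in normalized:
        return False  # any path containing ".." is rejected
    if normalized.startswith(BLOCKED_PREFIXES):
        return False  # system directories are off limits
    extension = os.path.splitext(normalized)[1].lower()
    return extension in allowed_exts
```

For example, is_path_allowed("notes/script.md", TEXT_EXTS) passes, while "../secret.txt" and "/etc/hosts.txt" are both refused before any upload is attempted.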

Use case / Category: Research

Risk level / Risk tags: may need API key, network access, file access

Dependencies / Installation difficulty: VISLA_API_KEY, VISLA_API_SECRET

Files: 3

Related skills (Research)

Multi Search Engine

An agent skill for research scenarios. Original description: Multi search engine integration with 16 engines (7 CN + 9 Global). Supports advanced search operators, time filters, site search, privacy engines, and Wolfra...

Research, low risk

Polymarket

An agent skill for research scenarios. Original description: Query Polymarket prediction markets. Check odds, find trending markets, search events, track price movements.

Research, low risk

Baidu web search

An agent skill for research scenarios. Original description: Search the web using Baidu AI Search Engine (BDSE). Use for live information, documentation, or research topics.

Research, low risk

Clawdbot Documentation Expert

An agent skill for research scenarios. Original description: Clawdbot documentation expert with decision tree navigation, search scripts, doc fetching, version tracking, and config snippets for all Clawdbot features

Research, low risk

Find Skills Skill

An agent skill for research scenarios. Original description: Search and discover OpenClaw skills from various sources. Use when: user wants to find available skills, search for specific functionality, or discover new s...

Research, low risk

Memory Setup

An agent skill for research scenarios. Original description: Enable and configure Moltbot/Clawdbot memory search for persistent context. Use when setting up memory, fixing "goldfish brain," or helping users configure memorySearch in their config. Covers MEMORY.md, daily logs, and vector search setup.

