Vision Recognition OCR OpenClaw Skill

Vehicle/animal/plant recognition plus OCR for screenshots, photos, invoices, and tables. Use when users ask to identify a car model, recognize what is in an image, extract text, or run OCR. Supports local path, URL, and...

v1.0.1

Installation

clawhub install vision-recognition-ocr

Requires the clawhub CLI: npm i -g clawhub

588 downloads, 1 star, 6 current installs (6 all-time), 2 versions

Vision Recognition + OCR

Cross-platform Python: on Windows prefer py -3.11; on Linux/macOS prefer python3; if plain python already points to Python 3, it also works.

Recognize vehicles, animals, and plants, or extract text from screenshots, photos, invoices, and tables via Baidu vision APIs.
This skill combines lightweight classification and OCR workflows in one place.

Why install this

Use this skill when you want to:

identify a car, animal, or plant from an image
extract text from screenshots, invoices, handwriting, or tables
pass a local path, a public URL, or a base64 image to the same family of tools

Common use cases

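Identify a car model, or recognize an animal or plant from an image
Extract text from screenshots, receipts, and tables
Switch between category recognition and OCR extraction on the same image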

Quick Start

Run from the installed skill directory:

py -3.11 scripts/ocr_general_basic.py '{"url":"https://baidu-ai.bj.bcebos.com/ocr/general.png"}'

py -3.11 scripts/car_recognize.py '{"image_path":"/path/to/car.jpg"}'
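Quoting a JSON argument by hand differs between shells, so it can be convenient to build the command from Python instead. The following is a minimal sketch, assuming only what this page states: each script takes a single JSON string argument, and the launcher is py -3.11 on Windows and python3 elsewhere. The helper names interpreter_prefix and build_command are hypothetical and not part of the skill.

```python
import json
import sys


def interpreter_prefix(platform: str = sys.platform) -> list[str]:
    """Return the launcher this page recommends for the given platform."""
    # sys.platform is "win32" on Windows; everything else falls back to python3.
    return ["py", "-3.11"] if platform.startswith("win") else ["python3"]


def build_command(script: str, payload: dict, platform: str = sys.platform) -> list[str]:
    """Build an argv list: launcher, script path, then the JSON payload as one argument."""
    # ensure_ascii=False keeps non-ASCII paths readable in the serialized JSON.
    return [*interpreter_prefix(platform), script, json.dumps(payload, ensure_ascii=False)]


# Example (not executed here): pass the list to subprocess.run from the skill directory.
# import subprocess
# subprocess.run(build_command("scripts/car_recognize.py",
#                              {"image_path": "/path/to/car.jpg"}), check=True)
```

Passing the arguments as a list avoids a shell entirely, so the JSON reaches the script unchanged regardless of platform.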

Not the best fit

Use a different skill when you need:

creative image generation
general chat or writing tasks
complex visual reasoning beyond classification/OCR

Common Input JSON

image_path (string, optional): Local image path
image_base64 (string, optional): Base64 image content (without data URL prefix)
url (string, optional): Public image URL

At least one of image_path / image_base64 / url is required.
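The input rules above can be sketched as a small payload builder. This is an illustration only: build_image_payload and encode_image_bytes are hypothetical helper names, and stripping a data URL prefix is one way to satisfy the "without data URL prefix" requirement, not necessarily what the skill's scripts do.

```python
import base64


def encode_image_bytes(data: bytes) -> str:
    """Base64-encode raw image bytes with no data URL prefix."""
    return base64.b64encode(data).decode("ascii")


def build_image_payload(image_path=None, image_base64=None, url=None) -> dict:
    """Collect the image source fields and enforce the at-least-one rule."""
    payload = {}
    if image_path is not None:
        payload["image_path"] = str(image_path)
    if image_base64 is not None:
        # Drop a leading "data:image/...;base64," prefix if the caller included one.
        if image_base64.startswith("data:") and "," in image_base64:
            image_base64 = image_base64.split(",", 1)[1]
        payload["image_base64"] = image_base64
    if url is not None:
        payload["url"] = url
    if not payload:
        raise ValueError("At least one of image_path / image_base64 / url is required")
    return payload
```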

Classification parameters

top_num (int, optional): number of candidates to return (1-20)
baike_num (int, optional): include Baidu Baike (encyclopedia) information (0 = no, 1 = yes)
output_brand (bool, optional, car only): include the brand in the result
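The constraints listed for the classification parameters can be checked before a script is called. The sketch below assumes the ranges exactly as documented on this page; the function name classification_options and the kind labels ("car", "animal", "plant") are hypothetical.

```python
def classification_options(kind: str, top_num=None, baike_num=None, output_brand=None) -> dict:
    """Validate optional classification parameters and return only those that were set."""
    options = {}
    if top_num is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(top_num, bool) or not isinstance(top_num, int) or not 1 <= top_num <= 20:
            raise ValueError("top_num must be an integer from 1 to 20")
        options["top_num"] = top_num
    if baike_num is not None:
        if baike_num not in (0, 1):
            raise ValueError("baike_num must be 0 or 1")
        options["baike_num"] = int(baike_num)
    if output_brand is not None:
        # This page documents output_brand for car recognition only.
        if kind != "car":
            raise ValueError("output_brand applies to car recognition only")
        options["output_brand"] = bool(output_brand)
    return options
```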

OCR parameters

Standard (`general_basic`)

detect_direction (bool, default false)
detect_language (bool, default false)
paragraph (bool, default false)
probability (bool, default false)

High-accuracy (`accurate_basic`)

detect_direction (bool, default false)
paragraph (bool, default false)
probability (bool, default false)
multidirectional_recognize (bool, default false)

Handwriting (`handwriting`)

eng_granularity (string, default word; letter is also accepted)
detect_direction (bool, default false)
probability (bool, default false)
detect_alteration (bool, default false)

Table (`table`)

cell_contents (bool, default false)
return_excel (bool, default false)
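The four OCR modes and their defaults can be written down as a lookup table, which makes it easy to reject a parameter that does not belong to a mode. This is a sketch built only from the lists above; OCR_DEFAULTS and ocr_options are hypothetical names, and the scripts may validate differently or not at all.

```python
# Defaults per OCR mode, copied from the parameter lists on this page.
OCR_DEFAULTS = {
    "general_basic": {"detect_direction": False, "detect_language": False,
                      "paragraph": False, "probability": False},
    "accurate_basic": {"detect_direction": False, "paragraph": False,
                       "probability": False, "multidirectional_recognize": False},
    "handwriting": {"eng_granularity": "word", "detect_direction": False,
                    "probability": False, "detect_alteration": False},
    "table": {"cell_contents": False, "return_excel": False},
}


def ocr_options(mode: str, **overrides) -> dict:
    """Merge caller overrides onto the documented defaults for one OCR mode."""
    if mode not in OCR_DEFAULTS:
        raise ValueError(f"unknown OCR mode: {mode!r}")
    defaults = OCR_DEFAULTS[mode]
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        raise ValueError(f"{mode} does not accept: {', '.join(unknown)}")
    options = {**defaults, **overrides}
    # Only the handwriting mode has eng_granularity; other modes skip this check.
    if options.get("eng_granularity", "word") not in ("word", "letter"):
        raise ValueError("eng_granularity must be 'word' or 'letter'")
    return options
```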

Environment variables

Auth priority:

BAIDU_BCE_BEARER_TOKEN / BAIDU_BCE_BEARER (or BAIDU_API_KEY when its value starts with bce-v3/)
OAuth fallback: BAIDU_VISION_API_KEY + BAIDU_VISION_SECRET_KEY
OAuth fallback: BAIDU_API_KEY + BAIDU_SECRET_KEY
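The priority order above can be read as a simple lookup over environment variables. The sketch below is one interpretation: the order within the first tier (BAIDU_BCE_BEARER_TOKEN, then BAIDU_BCE_BEARER, then a bce-v3/ value in BAIDU_API_KEY) is an assumption taken from the listing order, and resolve_auth is a hypothetical name rather than a function of the skill.

```python
import os

# OAuth key/secret pairs in documented fallback order.
OAUTH_PAIRS = (
    ("BAIDU_VISION_API_KEY", "BAIDU_VISION_SECRET_KEY"),
    ("BAIDU_API_KEY", "BAIDU_SECRET_KEY"),
)


def resolve_auth(env=None):
    """Return the first usable credential set, or None if nothing is configured."""
    env = os.environ if env is None else env
    # Tier 1: an explicit bearer token variable.
    for name in ("BAIDU_BCE_BEARER_TOKEN", "BAIDU_BCE_BEARER"):
        token = env.get(name)
        if token:
            return {"mode": "bearer", "token": token, "source": name}
    # Tier 1 (continued): BAIDU_API_KEY counts as a bearer token only with the bce-v3/ prefix.
    api_key = env.get("BAIDU_API_KEY") or ""
    if api_key.startswith("bce-v3/"):
        return {"mode": "bearer", "token": api_key, "source": "BAIDU_API_KEY"}
    # Tiers 2 and 3: OAuth needs both the key and the secret of a pair.
    for key_name, secret_name in OAUTH_PAIRS:
        key, secret = env.get(key_name), env.get(secret_name)
        if key and secret:
            return {"mode": "oauth", "api_key": key, "secret_key": secret, "source": key_name}
    return None
```

Accepting an env mapping instead of reading os.environ directly keeps the function easy to test with plain dictionaries.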

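How to get an API key (Baidu)

Prepare credentials in the following order:

Bearer Token (preferred)

Enable the image recognition / OCR services in Baidu AI Cloud.
Obtain a Bearer Token of the form bce-v3/... from the console.
Set BAIDU_BCE_BEARER_TOKEN (or store the token in BAIDU_API_KEY).

API Key + Secret Key (OAuth)

Create an application in Baidu AI Cloud to obtain an API Key and a Secret Key.
Set BAIDU_VISION_API_KEY + BAIDU_VISION_SECRET_KEY (or BAIDU_API_KEY + BAIDU_SECRET_KEY).

Quick self-check:

py -3.11 scripts/ocr_general_basic.py '{"url":"https://baidu-ai.bj.bcebos.com/ocr/general.png"}'

If the command returns a recognition result or a standard error code (anything other than an authentication error), the configuration works.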
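The self-check rule (a recognition result, or any error code that is not an authentication error, means the credentials work) can be expressed as a small predicate. This sketch makes several assumptions that this page does not confirm: that the script prints Baidu's JSON response as a single object, that failures carry an error_code field, and that codes 110 and 111 (invalid or expired access token in Baidu's OAuth flow) are the relevant authentication errors. Bearer-token failures may surface differently, so treat the code set as a parameter to adjust.

```python
import json

# Assumption: Baidu access-token errors; extend or replace for your auth mode.
ASSUMED_AUTH_ERROR_CODES = frozenset({110, 111})


def self_check_passed(raw_output: str, auth_error_codes=ASSUMED_AUTH_ERROR_CODES) -> bool:
    """Return True when the output is a result or a non-authentication error."""
    response = json.loads(raw_output)  # raises ValueError if the output is not JSON
    code = response.get("error_code")
    if code is None:
        # No error code at all: treat it as a successful recognition result.
        return True
    return int(code) not in auth_error_codes
```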

OCR examples

py -3.11 scripts/ocr_general_basic.py '{
  "url": "https://baidu-ai.bj.bcebos.com/ocr/general.png",
  "detect_direction": false,
  "detect_language": false,
  "paragraph": false,
  "probability": false
}'

py -3.11 scripts/ocr_accurate_basic.py '{
  "url": "https://baidu-ai.bj.bcebos.com/ocr/general.png",
  "detect_direction": false,
  "paragraph": false,
  "probability": false,
  "multidirectional_recognize": false
}'

py -3.11 scripts/ocr_handwriting.py '{
  "url": "https://baidu-ai.bj.bcebos.com/ocr/handwriting.jpeg",
  "eng_granularity": "letter",
  "detect_direction": false,
  "probability": false,
  "detect_alteration": false
}'

py -3.11 scripts/ocr_table.py '{
  "url": "https://b0.bdstatic.com/ugc/CVzjffcaizcBDqTK_zwMEQbbd344224206285ae3b5015e2e17f62c.jpg",
  "cell_contents": false,
  "return_excel": false
}'

Statistics

Comments 0

Created Mar 16, 2026

Updated Apr 4, 2026

Author

wangziiiiii (@wangziiiiii)

Latest Changes

v1.0.1 · Mar 18, 2026

Sync latest local fixes and docs
