universal-pdf-vision-parser OpenClaw Skill
Extract multilingual document content and language learning notes (French, German, Japanese, Spanish, etc.) from PDFs using multimodal vision (Qwen-VL-Max)....
Installation
clawhub install universal-pdf-vision-parse
Requires npm i -g clawhub
333
Downloads
0
Stars
6
current installs
7 all-time
1
Versions
Universal PDF Vision Parser Skill
Version: 0.1
This skill is a high-end multilingual document digitizer. It uses multimodal vision to 'look' at each PDF page, making it perfect for language learning notes, bilingual documents, and complex layouts that standard OCR fails to capture.
Prerequisites
- DashScope API Key: A valid key from Alibaba Cloud Bailian with
qwen-vl-maxaccess. - Environment:
pip install pymupdf dashscope
Usage
Basic Command
python scripts/vision_parse.py --pdf <path_to_pdf> --out <path_to_output.md> --api-key <YOUR_API_KEY> --max-pages 2
--max-pages: (Optional) Max pages to process. Defaults to2. Set to-1for all pages.
Agentic Workflow
- Visual Scanning: Converts PDF pages to 300 DPI PNGs.
- Expert Transcription: Qwen-VL-Max identifies the language and transcribes terms, translations, and explanations.
- Markdown Structuring: Automatically formats content with bold keywords, italicized meanings, and clean tables.
Examples
User: "Convert this German-Chinese note to markdown: notes.pdf"
Agent Action:
python scripts/vision_parse.py --pdf notes.pdf --out notes.md
Statistics
Author
M Z
@mingensiie
Latest Changes
v1.0.0 · Mar 3, 2026
Universal PDF Vision Parser Skill 1.0.0 - Initial release of a high-end, multilingual PDF digitizer for language learning documents. - Uses multimodal vision (Qwen-VL-Max) to extract and structure content from complex layouts into Markdown. - Supports multiple languages including French, German, Japanese, and Spanish. - Converts PDF pages to high-resolution images for accurate text parsing and formatting. - Perfect for extracting language notes, bilingual documents, and hard-to-capture formats.
Quick Install
clawhub install universal-pdf-vision-parse Related Skills
Other popular skills you might find useful.
Chat with 100+ AI Models in one App.
Use Claude, ChatGPT, Gemini alongside with EU-Hosted Models like Deepseek, GLM-5, Kimi K2.5 and many more.