Office Toolkit OpenClaw Skill

处理 Office 文档(Word/Excel/PPT/PDF)的技能。当用户要求读取、创建、编辑 Word 文档(.docx)、Excel 表格(.xlsx/.csv)、PPT(.pptx)或 PDF 时使用。基于 python-docx、openpyxl、python-pptx、pypdf 库。Require...

v1.0.0 Recently Updated Updated Today

Installation

clawhub install openclaw-office-toolkit

Requires npm i -g clawhub

0

Downloads

0

Stars

0

current installs

0 all-time

1

Versions

office-toolkit

处理 Office 文档:Word(.docx)、Excel(.xlsx/.csv)、PPT(.pptx)、PDF。

环境要求

pip install --break-system-packages python-docx openpyxl python-pptx pypdf
sudo apt install libreoffice-writer libreoffice-calc libreoffice-impress pandoc

快速参考

任务 库/命令
读 Word python-docxpandoc -t markdown
创建/编辑 Word python-docx
读 Excel openpyxlpandas
创建/编辑 Excel openpyxl
读 PPT python-pptx
创建/编辑 PPT python-pptx
读 PDF pypdfpandoc
PDF 格式验证 LibreOffice soffice
PDF 转图片 pdftoppm (poppler-utils)

Word (.docx)

读取

from docx import Document
doc = Document('file.docx')
for para in doc.paragraphs:
    print(para.text)

# 带格式提取
import subprocess
result = subprocess.run(['pandoc', '--track-changes=all', 'file.docx', '-t', 'markdown'], 
    capture_output=True, text=True)
print(result.stdout)

创建

from docx import Document
from docx.shared import Pt, Inches

doc = Document()
# 标题
doc.add_heading('文档标题', 0)

# 段落
p = doc.add_paragraph('正文内容')
p.runs[0].bold = True  # 加粗
p.runs[0].font.size = Pt(12)
p.runs[0].font.name = 'Arial'

# 引用
doc.add_paragraph('引用内容', style='Intense Quote')

# 表格
table = doc.add_table(rows=2, cols=3)
table.style = 'Light Grid Accent 1'
table.rows[0].cells[0].text = '表头1'
table.rows[0].cells[1].text = '表头2'

doc.save('output.docx')

编辑现有文档

  1. 解压 → 修改 XML → 重新打包(推荐用 python-docx 直接修改)
  2. 复杂格式建议用 LibreOffice 打开编辑

Excel (.xlsx)

读取

import openpyxl
wb = openpyxl.load_workbook('file.xlsx')
ws = wb.active
for row in ws.iter_rows(values_only=True):
    print(row)

创建/编辑

import openpyxl
from openpyxl.styles import Font, PatternFill, Alignment

wb = openpyxl.Workbook()
ws = wb.active
ws.title = '数据'

# 写入
ws['A1'] = '姓名'
ws['B1'] = '年龄'
ws['A2'] = '张三'
ws['B2'] = 25

# 格式化
header_fill = PatternFill(start_color='4472C4', end_color='4472C4', fill_type='solid')
ws['A1'].fill = header_fill
ws['A1'].font = Font(color='FFFFFF', bold=True)
ws['A1'].alignment = Alignment(horizontal='center')

# 保存
wb.save('output.xlsx')

格式化规则(财务场景)

  • 蓝色字体:硬编码输入值
  • 黑色字体:公式/计算
  • 绿色字体:同文件内链接
  • 红色字体:外部链接
  • 黄色背景:需要关注的假设

PowerPoint (.pptx)

读取

from pptx import Presentation
prs = Presentation('file.pptx')
for slide in prs.slides:
    for shape in slide.shapes:
        if hasattr(shape, 'text'):
            print(shape.text)

创建

from pptx import Presentation
from pptx.util import Inches, Pt

prs = Presentation()
prs.slide_width = Inches(10)
prs.slide_height = Inches(7.5)

# 使用空白布局
slide = prs.slides.add_slide(prs.slide_layouts[6])
title = slide.shapes.title
subtitle = slide.placeholders[1]
title.text = '演示标题'
subtitle.text = '副标题'

prs.save('output.pptx')

设计原则

  • 颜色方案:选一个大胆的配色,主色占60-70%,1-2个辅助色,1个尖锐强调色
  • 不要默认蓝色:根据主题选配色
  • 深浅对比:标题页用深色背景,结论页用浅色背景("三明治"结构)
  • 排版留白:不要堆满,留呼吸空间

PDF

读取

from pypdf import PdfReader
reader = PdfReader('file.pdf')
print(f'页数: {len(reader.pages)}')
for page in reader.pages:
    print(page.extract_text())

合并

from pypdf import PdfWriter, PdfReader
writer = PdfWriter()
for pdf_file in ['doc1.pdf', 'doc2.pdf']:
    reader = PdfReader(pdf_file)
    for page in reader.pages:
        writer.add_page(page)
with open('merged.pdf', 'wb') as f:
    writer.write(f)

分割

reader = PdfReader('input.pdf')
for i, page in enumerate(reader.pages):
    writer = PdfWriter()
    writer.add_page(page)
    with open(f'page_{i+1}.pdf', 'wb') as f:
        writer.write(f)

旋转

page = reader.pages[0]
page.rotate(90)  # 顺时针90度

依赖安装(当前环境状态)

依赖 状态
python-docx ✅ 已安装
openpyxl ✅ 已安装
python-pptx ✅ 已安装
pypdf ✅ 已安装
pandoc ✅ 已安装
LibreOffice ✅ 已安装

Agent Rules

  • 创建文件前先确保目录可写
  • 复杂文档先尝试 python-docx 等库,库无法处理再用 LibreOffice
  • PDF 读取优先用 pypdf,文字提取效果差时用 pandoc
  • Excel 格式化参照财务规范(蓝/黑/绿/红字体 + 黄色背景)
  • LibreOffice 路径:sofficelibreoffice(已安装)
  • 收到文件路径时,先检查文件是否存在:pathlib.Path(path).exists()

Statistics

Downloads 0
Stars 0
Current installs 0
All-time installs 0
Versions 1
Comments 0
Created Apr 1, 2026
Updated Apr 1, 2026

Latest Changes

v1.0.0 · Apr 1, 2026

- Initial release of office-toolkit, providing tools for reading, creating, and editing Word (.docx), Excel (.xlsx/.csv), PowerPoint (.pptx), and PDF documents. - Utilizes python-docx, openpyxl, python-pptx, and pypdf libraries for core functionality. - Supports document conversion and additional features via pandoc and LibreOffice. - Includes code examples and formatting guidelines for typical Office file operations. - Specifies dependency installation steps and agent rules for robust document handling.

Quick Install

clawhub install openclaw-office-toolkit
EU Made in Europe

Chat with 100+ AI Models in one App.

Use Claude, ChatGPT, Gemini alongside with EU-Hosted Models like Deepseek, GLM-5, Kimi K2.5 and many more.