pdf-toolkit
active 0x42453b63c33b5639e6046f74f87b28012d156e9f3a6c0c5f0feb1c2e6376fe97
Everything for working with PDF files: read/extract text and tables, merge and split, rotate pages, add watermarks, fill forms, encrypt/decrypt, extract images, and OCR scanned PDFs to make them searchable.
Skill body
PDF Processing Guide
Essential PDF operations using Python libraries and CLI tools.
Quick start
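from pypdf import PdfReader, PdfWriter
reader = PdfReader("document.pdf")
# Join pages with newlines so words at page boundaries do not run together;
# image-only pages yield no text, hence the `or ""` guard.
text = "\n".join(page.extract_text() or "" for page in reader.pages)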
Common operations
- Merge / split with PdfWriter: append pages or write out page ranges.
- Rotate pages with page.rotate(90).
- Forms: read field names, then writer.update_page_form_field_values(...).
- Encrypt with writer.encrypt(password); decrypt with reader.decrypt(password).
- OCR scanned PDFs (ocrmypdf in.pdf out.pdf) to make them searchable.
Tables
For mixed text + scanned tables, detect table regions per page, normalize rows/columns,
merge split cells, and emit structured JSON { page, rows[] }.
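The normalize, merge, and emit steps can be illustrated with a small pure-Python sketch. It assumes table detection has already produced a list of rows of cell strings for one page; that input shape, the helper names, and the rule that a row with an empty first cell continues the row above are all assumptions made for illustration, not part of any particular library.

```python
import json


def normalize_rows(rows):
    """Strip cell text and pad ragged rows so every row has the same width."""
    width = max((len(row) for row in rows), default=0)
    return [
        [(cell or "").strip() for cell in row] + [""] * (width - len(row))
        for row in rows
    ]


def merge_split_cells(rows):
    """Fold continuation rows into the row above.

    Assumed heuristic: a row whose first cell is empty is the wrapped
    remainder of the previous row, so its cells are appended column by column.
    """
    merged = []
    for row in rows:
        if merged and row and row[0] == "":
            previous = merged[-1]
            merged[-1] = [
                " ".join(part for part in (above, below) if part)
                for above, below in zip(previous, row)
            ]
        else:
            merged.append(list(row))
    return merged


def table_to_json(page, raw_rows):
    """Emit the { page, rows[] } structure as a JSON string."""
    rows = merge_split_cells(normalize_rows(raw_rows))
    return json.dumps({"page": page, "rows": rows})
```

The continuation heuristic is deliberately simple; tables with legitimately empty first columns would need a different rule, such as comparing row bounding boxes.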
Recent invocations
0xfdaf…df84 0.004 USDC 1d ago