
Quick Start

“I’ll be honest with you, I love his music. I do. I’m a Michael Bolton fan.”

Let’s get you extracting documents faster than you can say “TPS report cover sheet.”

  1. Point at a document

    Extract text from /path/to/quarterly-report.docx

  2. Get the content

    {
      "text": "Q4 2024 Financial Summary\n\nRevenue increased by 15%...",
      "metadata": {
        "format": "Word Document (DOCX)",
        "extraction_method": "python-docx",
        "extraction_time": 0.042
      }
    }

  3. That’s it.
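
The response in step 2 is ordinary JSON, so any JSON parser can read it. Here is a minimal sketch, assuming the tool result reaches you as a JSON string shaped like the sample above (your MCP client may already hand you a parsed object instead):

```python
import json

# Sample response copied from step 2 (the text is truncated in the original).
raw = '''
{
  "text": "Q4 2024 Financial Summary\\n\\nRevenue increased by 15%...",
  "metadata": {
    "format": "Word Document (DOCX)",
    "extraction_method": "python-docx",
    "extraction_time": 0.042
  }
}
'''

result = json.loads(raw)

# The first line of the extracted text is usually a reasonable title guess.
title = result["text"].splitlines()[0]
meta = result["metadata"]

print(title)
print(meta["format"], meta["extraction_method"], meta["extraction_time"])
```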

# Works with .docx, .doc, .xlsx, .xls, .pptx, .ppt, .csv
result = await extract_text("document.docx")
print(result["text"])

result = await convert_to_markdown("report.docx")
print(result["markdown"])

result = await extract_word_tables(
    "contract.docx",
    output_format="markdown"
)
# Returns tables as markdown tables

result = await analyze_excel_data(
    "sales-data.xlsx",
    include_statistics=True,
    check_data_quality=True
)
# Returns column types, missing values, outliers, statistics

# Index once
result = await index_document("novel.docx")
# Returns: {"doc_id": "abc123", "resources": {...}}

# Fetch chapters on demand via MCP resources
# chapter://abc123/1 → Chapter 1
# chapter://abc123/1.txt → Plain text
# chapters://abc123/1-5 → Multiple chapters
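
The chapter resource URIs follow a regular pattern, so they can be built from the `doc_id` that indexing returns. A small sketch: the helper names `chapter_uri` and `chapter_range_uri` are hypothetical and not part of mcwaddams; only the URI shapes come from the examples above.

```python
def chapter_uri(doc_id: str, chapter: int, plain_text: bool = False) -> str:
    """Build a single-chapter URI, e.g. chapter://abc123/1 or chapter://abc123/1.txt."""
    suffix = ".txt" if plain_text else ""
    return f"chapter://{doc_id}/{chapter}{suffix}"


def chapter_range_uri(doc_id: str, first: int, last: int) -> str:
    """Build a multi-chapter URI, e.g. chapters://abc123/1-5."""
    if first > last:
        raise ValueError("first chapter must not come after last chapter")
    return f"chapters://{doc_id}/{first}-{last}"


# Uses the doc_id from the indexing example above.
doc_id = "abc123"
print(chapter_uri(doc_id, 1))
print(chapter_uri(doc_id, 1, plain_text=True))
print(chapter_range_uri(doc_id, 1, 5))
```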

mcwaddams can fetch documents directly from URLs:

result = await extract_text("https://example.com/report.docx")

Files are cached for 1 hour by default.
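
To make the one-hour default concrete, here is a generic time-to-live cache sketch. It illustrates the idea only and is not mcwaddams' actual cache: the `TTLCache` class and its methods are hypothetical, and the real implementation may store files on disk or key them differently.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Keep downloaded bytes per URL and forget them after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def put(self, url: str, data: bytes) -> None:
        self._entries[url] = (self._clock(), data)

    def get(self, url: str) -> Optional[bytes]:
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, data = entry
        if self._clock() - stored_at >= self._ttl:
            # Entry is too old: drop it so the caller downloads a fresh copy.
            del self._entries[url]
            return None
        return data


# Demonstrate expiry with a fake clock instead of sleeping for an hour.
now = [0.0]
cache = TTLCache(ttl_seconds=3600.0, clock=lambda: now[0])
cache.put("https://example.com/report.docx", b"fake document bytes")

now[0] = 1800.0  # 30 minutes later: still cached
print(cache.get("https://example.com/report.docx") is not None)

now[0] = 3600.0  # 1 hour later: expired
print(cache.get("https://example.com/report.docx") is None)
```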

Not sure what you’re dealing with?

result = await detect_office_format("mystery-file.doc")
# Returns: format, version, encryption status, document category
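
File extensions can lie, which is why detection is useful. One common way to tell legacy Office files from modern ones is to check the leading bytes: legacy .doc/.xls/.ppt files are OLE2 compound files, while .docx/.xlsx/.pptx are ZIP archives. The sketch below shows that technique only; `sniff_container` and `sniff_file` are hypothetical helpers, and nothing here is claimed to match how `detect_office_format` works internally.

```python
# Well-known file signatures (magic bytes).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP local file header


def sniff_container(header: bytes) -> str:
    """Classify a file by its first bytes: 'ole2', 'zip', or 'unknown'."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"  # legacy .doc / .xls / .ppt container
    if header.startswith(ZIP_MAGIC):
        return "zip"   # OOXML container (but also any other ZIP archive)
    return "unknown"   # e.g. .csv, which has no signature


def sniff_file(path: str) -> str:
    """Read only the first 8 bytes of a file and classify it."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(8))


print(sniff_container(OLE2_MAGIC + b"rest of file"))
print(sniff_container(ZIP_MAGIC + b"rest of file"))
print(sniff_container(b"name,amount\n"))
```

A ZIP signature alone does not prove the file is an Office document, so a full detector would still need to look inside the archive.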

mcwaddams never silently fails. You’ll get either:

  1. Content: the extracted text/data
  2. Clear error: an explanation of exactly what went wrong

result = await extract_text("encrypted.docx")
# Returns: {"error": "Document is password-protected", "hint": "..."}
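
Because failures come back as data, calling code has to check for the `error` key before using the result. A minimal sketch of that check, assuming results are dictionaries shaped like the examples on this page; `ExtractionError` and `text_or_raise` are hypothetical names, and the sample hint text is invented for illustration.

```python
class ExtractionError(Exception):
    """Raised when a result dictionary carries an 'error' key."""


def text_or_raise(result: dict) -> str:
    """Return the extracted text, or raise with the error and its hint."""
    if "error" in result:
        hint = result.get("hint")
        message = result["error"] if not hint else f'{result["error"]} ({hint})'
        raise ExtractionError(message)
    return result["text"]


# Stand-in results; real ones would come from extract_text(...).
ok_result = {"text": "Q4 2024 Financial Summary"}
bad_result = {
    "error": "Document is password-protected",
    "hint": "illustrative hint text",  # placeholder, not real mcwaddams output
}

print(text_or_raise(ok_result))
try:
    text_or_raise(bad_result)
except ExtractionError as exc:
    print(f"Extraction failed: {exc}")
```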


“Looks like someone has a case of the Mondays.”


Not anymore. Your documents are handled.