
All Tools Reference

“I’m going to need you to go ahead and read the documentation…”

mcwaddams provides 20 tools organized into three categories. Each tool follows the same pattern: pass a file path (local or URL), get structured data back.


These tools work across all supported Office formats (.docx, .doc, .xlsx, .xls, .pptx, .ppt, .csv).

Extract text content from any Office document with automatic format detection.

result = await extract_text(
    file_path="/path/to/document.docx",
    method="auto",              # auto | primary | fallback
    include_metadata=True,      # Include document metadata
    preserve_formatting=False   # Preserve structure (slower)
)

Returns:

{
    "text": "The extracted content...",
    "metadata": {
        "format": "Word Document (DOCX)",
        "extraction_method": "python-docx",
        "extraction_time": 0.042
    }
}

Extract embedded images from Office documents with filtering options.

result = await extract_images(
    file_path="/path/to/report.docx",
    output_format="png",    # png | jpg | jpeg
    min_width=100,          # Minimum width in pixels
    min_height=100,         # Minimum height in pixels
    include_context=True    # Include surrounding text
)

Returns:

{
    "images": [
        {
            "index": 0,
            "format": "png",
            "dimensions": {"width": 800, "height": 600},
            "context": "Figure 1: Sales performance...",
            "data_uri": "data:image/png;base64,..."
        }
    ],
    "total_found": 5,
    "extracted": 3
}
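
Each image arrives as a base64 data URI, so a caller usually has to decode it before writing it to disk. The following is a minimal sketch of that step, assuming the `data_uri` field always has the `data:<mime>;base64,<payload>` shape shown above. The `decode_data_uri` helper and the sample entry are illustrative and not part of mcwaddams.

```python
import base64


def decode_data_uri(data_uri: str) -> tuple[str, bytes]:
    """Split a base64 data URI into (mime_type, raw_bytes)."""
    header, _, payload = data_uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URI")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


# Stand-in for one entry of result["images"]. The payload is four
# arbitrary bytes used for illustration, not a real image.
sample = {
    "index": 0,
    "format": "png",
    "data_uri": "data:image/png;base64,"
    + base64.b64encode(b"\x89PNG").decode("ascii"),
}

mime_type, raw = decode_data_uri(sample["data_uri"])
```

With a real result, the decoded bytes could then be written out with something like `pathlib.Path(f"image_{entry['index']}.{entry['format']}").write_bytes(raw)`.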

Get comprehensive document metadata including author, dates, and custom properties.

result = await extract_metadata(
    file_path="/path/to/contract.docx"
)

Returns:

{
    "title": "Service Agreement",
    "author": "Legal Team",
    "created": "2024-01-15T10:30:00Z",
    "modified": "2024-03-20T14:22:00Z",
    "word_count": 4521,
    "page_count": 12,
    "custom_properties": {
        "Client": "Acme Corp",
        "Version": "3.0"
    }
}
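
The `created` and `modified` values in this example are ISO 8601 strings with a trailing `Z`. Below is a small sketch of turning them into timezone-aware `datetime` objects, assuming the timestamps always use that form. The `parse_timestamp` helper is hypothetical, not something the tool provides.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as '2024-01-15T10:30:00Z'."""
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# The two date fields copied from the example response above.
metadata = {
    "created": "2024-01-15T10:30:00Z",
    "modified": "2024-03-20T14:22:00Z",
}

created = parse_timestamp(metadata["created"])
modified = parse_timestamp(metadata["modified"])
age = modified - created
print(age.days)  # 65
```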

Identify document format, version, and encryption status.

result = await detect_office_format(
    file_path="/path/to/mystery-file.doc"
)

Returns:

{
    "format": "Word 97-2003 Document",
    "extension": ".doc",
    "mime_type": "application/msword",
    "is_encrypted": false,
    "is_legacy": true,
    "ole_metadata": {
        "created_by": "Microsoft Word 10.0"
    }
}

Comprehensive integrity check with actionable recommendations.

result = await analyze_document_health(
    file_path="/path/to/old-report.docx"
)

Returns:

{
    "status": "healthy",
    "issues": [],
    "warnings": [
        "Document contains 15 embedded fonts (may increase file size)"
    ],
    "recommendations": [
        "Consider running through a document optimizer"
    ],
    "file_size": 2458624,
    "structure_valid": true
}
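
A report like this is easier to scan once it is flattened into a few lines. Here is one possible sketch, assuming the field names shown above. The `summarize_health` helper is illustrative only, and it treats a missing `structure_valid` field as valid, which is an assumption.

```python
def summarize_health(report: dict) -> list[str]:
    """Flatten a health report into printable lines."""
    lines = [f"status: {report.get('status', 'unknown')}"]
    for label in ("issues", "warnings", "recommendations"):
        for entry in report.get(label, []):
            # Drop the plural "s" so each line reads "warning: ...".
            lines.append(f"{label[:-1]}: {entry}")
    if not report.get("structure_valid", True):
        lines.append("issue: document structure failed validation")
    return lines


# The example response from above, as a Python dict.
report = {
    "status": "healthy",
    "issues": [],
    "warnings": [
        "Document contains 15 embedded fonts (may increase file size)"
    ],
    "recommendations": ["Consider running through a document optimizer"],
    "file_size": 2458624,
    "structure_valid": True,
}

summary = summarize_health(report)
```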

Create an index for on-demand fetching via MCP resources.

result = await index_document(
    file_path="/path/to/novel.docx",
    include_images=True,
    include_chapters=True,
    include_sheets=True,
    include_slides=True
)

Returns:

{
    "doc_id": "abc123def456",
    "resources": {
        "chapters": ["chapter://abc123def456/1", "chapter://abc123def456/2"],
        "images": ["image://abc123def456/0", "image://abc123def456/1"]
    },
    "stats": {
        "total_chapters": 12,
        "total_images": 45,
        "estimated_tokens": 125000
    }
}
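
The resource URIs in this response appear to follow a `<kind>://<doc_id>/<number>` pattern. The sketch below splits one into its parts, assuming that pattern holds for every resource and that the trailing segment is always an integer. `ResourceRef` and `parse_resource_uri` are hypothetical names, not part of the server.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class ResourceRef(NamedTuple):
    kind: str     # URI scheme, e.g. "chapter" or "image"
    doc_id: str   # identifier returned by index_document
    index: int    # trailing number in the URI path


def parse_resource_uri(uri: str) -> ResourceRef:
    """Split a URI like 'chapter://abc123def456/1' into its parts."""
    parts = urlsplit(uri)
    return ResourceRef(parts.scheme, parts.netloc, int(parts.path.lstrip("/")))


# URIs copied from the example response above.
resources = {
    "chapters": ["chapter://abc123def456/1", "chapter://abc123def456/2"],
    "images": ["image://abc123def456/0", "image://abc123def456/1"],
}

refs = [parse_resource_uri(uri) for uris in resources.values() for uri in uris]
```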

List all supported file formats and their capabilities.

result = await get_supported_formats()

Specialized tools for .docx and .doc files.

Convert Word documents to Markdown with intelligent formatting.

result = await convert_to_markdown(
    file_path="/path/to/report.docx",
    preserve_structure=True,    # Keep headings, lists, tables
    include_images=True,        # Extract images to files
    page_range="",              # e.g., "1-5" or "3"
    summary_only=False          # Just metadata for large docs
)

Pagination: Documents over 25k tokens automatically paginate. Use cursor_id for next pages.
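
Cursor-based pagination is usually consumed in a loop that keeps requesting pages until no cursor comes back. The sketch below shows that loop in isolation. The page shape (`markdown` and `cursor_id` keys), the way the cursor is passed back, and the `fetch_page` callable are all assumptions made for illustration, since the reference does not specify them. A stub stands in for the real tool call.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

# Hypothetical page shape: {"markdown": str, "cursor_id": str or None}.
Page = dict[str, Any]
FetchPage = Callable[[Optional[str]], Awaitable[Page]]


async def collect_pages(fetch_page: FetchPage, max_pages: int = 100) -> str:
    """Follow cursor_id values until none is returned, joining the text."""
    chunks: list[str] = []
    cursor: Optional[str] = None
    for _ in range(max_pages):  # hard cap guards against a cursor loop
        page = await fetch_page(cursor)
        chunks.append(page["markdown"])
        cursor = page.get("cursor_id")
        if not cursor:
            break
    return "".join(chunks)


# Stub standing in for a paginated convert_to_markdown call.
_FAKE_PAGES: dict[Optional[str], Page] = {
    None: {"markdown": "# Part 1\n", "cursor_id": "c1"},
    "c1": {"markdown": "# Part 2\n", "cursor_id": None},
}


async def fake_fetch(cursor_id: Optional[str]) -> Page:
    return _FAKE_PAGES[cursor_id]


full_text = asyncio.run(collect_pages(fake_fetch))
```

Passing the fetch step in as a callable keeps the loop independent of how the real tool expects the cursor to be supplied.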


Extract tables with structure preservation.

result = await extract_word_tables(
    file_path="/path/to/contract.docx",
    output_format="markdown",   # structured | csv | json | markdown
    include_headers=True,
    preserve_merged_cells=True
)

Analyze document structure, headings, and hierarchy.

result = await analyze_word_structure(
    file_path="/path/to/thesis.docx",
    extract_outline=True,
    analyze_styles=True,
    include_page_info=True
)

Detect formatting inconsistencies and style issues.

result = await check_style_consistency(
    file_path="/path/to/manuscript.docx"
)

Get a clean heading hierarchy (Table of Contents view).

result = await get_document_outline(
    file_path="/path/to/book.docx",
    include_word_counts=True,
    detect_chapters=True
)

Extract opening sentences from each chapter.

result = await get_chapter_summaries(
    file_path="/path/to/novel.docx",
    sentences_per_chapter=3,
    include_word_counts=True
)

Full-text search with context and location.

result = await search_document(
    file_path="/path/to/legal.docx",
    query="indemnification",
    max_results=20,
    context_chars=100
)

Extract named entities (people, places, organizations).

result = await extract_entities(
    file_path="/path/to/novel.docx",
    entity_types="all",     # all | people | places | organizations
    min_occurrences=1,
    include_context=True
)

save_reading_progress / get_reading_progress


Bookmark your position in a document.

# Save
await save_reading_progress(
    file_path="/path/to/book.docx",
    chapter_number=5,
    paragraph_index=12,
    notes="Left off at the climax"
)

# Retrieve
progress = await get_reading_progress(
    file_path="/path/to/book.docx"
)

Specialized tools for .xlsx, .xls, and .csv files.

Comprehensive statistical analysis.

result = await analyze_excel_data(
    file_path="/path/to/sales.xlsx",
    sheet_names=[],             # Empty = all sheets
    include_statistics=True,    # Mean, median, std, etc.
    detect_data_types=True,
    check_data_quality=True     # Missing values, duplicates
)

Extract and analyze formulas with dependencies.

result = await extract_excel_formulas(
    file_path="/path/to/budget.xlsx",
    sheet_names=[],
    include_values=True,        # Show calculated values
    analyze_dependencies=True   # Formula reference chains
)

Generate chart configurations for visualization libraries.

result = await create_excel_chart_data(
    file_path="/path/to/data.xlsx",
    chart_type="auto",          # auto | bar | line | pie | scatter
    output_format="chartjs",    # chartjs | plotly | matplotlib
    x_column="",                # Empty = auto-detect
    y_columns=[]                # Empty = auto-detect
)

All tools accept HTTP/HTTPS URLs. Files are downloaded and cached for 1 hour.

result = await extract_text(
    "https://example.com/quarterly-report.docx"
)

All tools return structured errors:

{
    "error": "Document is password-protected",
    "hint": "Remove password protection or provide an unencrypted version",
    "file_path": "/path/to/encrypted.docx"
}

Common error types:

  • File not found — Check the path exists
  • Unsupported format — Check format support with get_supported_formats
  • Password protected — We detect but can’t extract encrypted files
  • Corrupted file — Try analyze_document_health for diagnostics
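
Because errors come back as structured data rather than exceptions, calling code may want to convert them. A minimal sketch follows, assuming an error result is a dict containing an `error` key and that successful results never include that key. `ToolError` and `raise_for_error` are hypothetical helpers, not part of mcwaddams.

```python
from typing import Optional


class ToolError(Exception):
    """Raised when a tool result carries a structured error."""

    def __init__(
        self,
        message: str,
        hint: Optional[str] = None,
        file_path: Optional[str] = None,
    ) -> None:
        super().__init__(message)
        self.hint = hint
        self.file_path = file_path


def raise_for_error(result: dict) -> dict:
    """Return the result unchanged, or raise ToolError if it is an error."""
    if "error" in result:
        raise ToolError(
            result["error"],
            hint=result.get("hint"),
            file_path=result.get("file_path"),
        )
    return result


# The example error from above, as a Python dict.
error_result = {
    "error": "Document is password-protected",
    "hint": "Remove password protection or provide an unencrypted version",
    "file_path": "/path/to/encrypted.docx",
}

try:
    raise_for_error(error_result)
except ToolError as exc:
    caught = exc
    print(f"{exc} ({exc.hint})")
```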

“Have you seen my documentation?”


— Milton, probably