“I was told there would be no installation…”
Don’t want to install anything? Connect to our hosted mcwaddams server via HTTP.
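With Claude Code:

```bash
claude mcp add mcwaddams-hosted --transport http "https://mcwaddams.l.supported.systems/mcp"
```

In a JSON MCP client configuration:

```json
{
  "mcpServers": {
    "mcwaddams": {
      "transport": {
        "type": "streamable-http",
        "url": "https://mcwaddams.l.supported.systems/mcp"
      }
    }
  }
}
```

From Python, using the MCP SDK:

```python
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamable_http_client


async def main() -> None:
    # The transport yields read/write streams; *_ absorbs any extra values
    # (some SDK versions also yield a session-ID callback).
    async with streamable_http_client("https://mcwaddams.l.supported.systems/mcp") as (read, write, *_):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print(f"Connected! {len(tools.tools)} tools available")


asyncio.run(main())
```

All 20+ mcwaddams tools are available via the hosted server: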
- Document Extraction: Extract text, images, and metadata from Word, Excel, and PowerPoint
- Legacy Support: Process .doc, .xls, and .ppt files from the 90s
- MCP Resources: Index documents and fetch chapters or sheets on demand
- Smart Fallbacks: Multiple extraction methods with automatic fallback
| Feature | Hosted | Local (uvx) |
|---|---|---|
| Installation | None | uvx mcwaddams |
| File Access | URL + base64 upload | Local + URL |
| Speed | Network latency | Instant |
| Privacy | Files processed on server | Files stay local |
| Availability | Requires internet | Works offline |
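
The File Access row has the most practical consequence: a hosted call identifies a document either by URL or by uploading its bytes as base64. As a rough sketch, a small helper could pick between the two. The name `build_tool_arguments` is hypothetical and not part of mcwaddams; `file_path` and `file_content` are the argument names the tools on this page accept.

```python
import base64
from pathlib import Path
from urllib.parse import urlparse


def build_tool_arguments(source: str) -> dict[str, str]:
    """Build hosted-call arguments from a URL or a local path.

    Hypothetical helper for illustration; not part of mcwaddams.
    """
    if urlparse(source).scheme in ("http", "https"):
        # Remote document: pass the URL as file_path.
        return {"file_path": source}

    # Local document: upload the bytes as base64. The file name is still
    # sent as file_path so the extension can be used for type detection.
    path = Path(source)
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return {"file_path": path.name, "file_content": encoded}
```

The returned dictionary can be passed as the second argument of a tool call, for example `session.call_tool("extract_text", build_tool_arguments("my-report.docx"))`.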
The hosted server cannot access files on your local machine. Instead, you have two options:
If your document is already hosted online:
```python
# Extract text from a public document
result = await session.call_tool("extract_text", {
    "file_path": "https://example.com/report.docx"
})
```

For local files, encode them as base64 and pass via `file_content`:
```python
import base64
from pathlib import Path

# Read and encode your document
doc_path = Path("my-report.docx")
file_content = base64.b64encode(doc_path.read_bytes()).decode("utf-8")

# Call the tool with file_content
result = await session.call_tool("extract_text", {
    "file_path": "my-report.docx",  # Used for extension detection
    "file_content": file_content
})
```

The same upload in JavaScript (Node.js):

```javascript
import { readFileSync } from 'fs';

// Read and encode your document
const docBuffer = readFileSync('my-report.docx');
const fileContent = docBuffer.toString('base64');

// Call the tool with file_content
// (`session` is an already-connected MCP client; match the call shape to your SDK)
const result = await session.callTool("extract_text", {
  file_path: "my-report.docx",  // Used for extension detection
  file_content: fileContent
});
```

Or from the shell with `curl`:

```bash
# Encode document to base64 (-w 0 disables line wrapping in GNU base64)
FILE_CONTENT=$(base64 -w 0 my-report.docx)

# Call via HTTP (simplified example)
curl -X POST https://mcwaddams.l.supported.systems/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d "{
    \"jsonrpc\": \"2.0\",
    \"id\": 1,
    \"method\": \"tools/call\",
    \"params\": {
      \"name\": \"extract_text\",
      \"arguments\": {
        \"file_path\": \"my-report.docx\",
        \"file_content\": \"$FILE_CONTENT\"
      }
    }
  }"
```

Every tool that accepts `file_path` also accepts `file_content`:
- `extract_text`: Extract text with base64 upload
- `extract_images`: Extract images with base64 upload
- `convert_to_markdown`: Convert Word docs to Markdown
- `analyze_excel_data`: Analyze uploaded spreadsheets

Want the convenience of HTTP without using our server? Self-host your own:
```yaml
services:
  mcwaddams:
    image: ghcr.io/ryanmalloy/mcwaddams:latest
    environment:
      - MCP_TRANSPORT=streamable-http
      - MCP_HOST=0.0.0.0
      - MCP_PORT=8000
      # Enable local file access (disabled by default for security)
      # - MCP_ALLOW_LOCAL_FILES=true
    ports:
      - "8000:8000"
```

Clone the repo and use our docker-compose with caddy-docker-proxy labels:
```bash
git clone https://github.com/ryanmalloy/mcwaddams.git
cd mcwaddams
cp .env.example .env
# Edit .env to set your hostname
make docker-up
```

Your server will be available at `https://your-domain.com/mcp`.
Run HTTP mode locally without Docker:
```bash
# Install
pip install mcwaddams

# Run with HTTP transport
MCP_TRANSPORT=streamable-http MCP_HOST=127.0.0.1 MCP_PORT=8000 python -m mcwaddams.server
```

Or with uv:
```bash
MCP_TRANSPORT=streamable-http MCP_HOST=127.0.0.1 MCP_PORT=8000 uvx mcwaddams
```

Error: Connection refused

The server may be temporarily down. Try running it locally instead:

```bash
uvx mcwaddams
```

Large documents may take longer to process. The hosted server has a 60-second timeout per request.
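
Because requests are cut off at 60 seconds, a client may prefer to give up on its own terms a little earlier. The following is a minimal sketch using `asyncio.wait_for`. The helper name `call_with_deadline` and the 5-second safety margin are assumptions for illustration, not part of mcwaddams.

```python
import asyncio

SERVER_TIMEOUT_SECONDS = 60  # hosted per-request limit stated above


async def call_with_deadline(make_call, deadline=SERVER_TIMEOUT_SECONDS - 5):
    """Await make_call(), giving up slightly before the server would.

    make_call is any zero-argument function returning an awaitable, e.g.
    lambda: session.call_tool("extract_text", arguments).
    Returns the call's result, or None if the deadline passes first.
    """
    try:
        return await asyncio.wait_for(make_call(), timeout=deadline)
    except asyncio.TimeoutError:
        # The pending call is cancelled by wait_for before we get here.
        return None
```

A `None` result is a signal to retry with a smaller document or switch to a local server, rather than waiting on a request the hosted server would drop anyway.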
For very large documents, consider:
Ensure you’re using `https://`, not `http://`. The hosted server requires TLS.
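
A client can catch this mistake before any request is sent. The sketch below checks the scheme with the standard library; the function name `require_https` is hypothetical and not part of mcwaddams or the MCP SDK.

```python
from urllib.parse import urlparse


def require_https(url: str) -> str:
    """Return url unchanged if it uses https, otherwise raise ValueError."""
    scheme = urlparse(url).scheme.lower()
    if scheme != "https":
        raise ValueError(
            f"expected an https:// URL, got {scheme or 'no scheme'}: {url}"
        )
    return url
```

Wrapping the endpoint as `require_https("https://mcwaddams.l.supported.systems/mcp")` before connecting turns a confusing transport failure into an immediate, readable error.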
“So if you could just go ahead and connect… that would be great.”