Test Dashboard

“Yeah, I’m gonna need you to go ahead and come in on Saturday…”

The Numbers

Tests Passing: Unit tests covering all 20 tools

99.3% Torture Test Success: 299/301 random Office documents

20% Code Coverage: Focus on critical paths

1.39s Test Suite Runtime: Fast feedback loop

Torture Test Results

We grabbed 301 random Office documents from a real filesystem and threw them at mcwaddams.

By Format

Format	Tested	Passed	Failed	Success Rate
`.docx`	142	142	0	100%
`.xlsx`	89	89	0	100%
`.pptx`	34	34	0	100%
`.doc`	18	17	1*	94.4%
`.xls`	12	12	0	100%
`.ppt`	4	4	0	100%
`.csv`	2	1	1*	50%

*The failed files were empty (0 bytes) or corrupt; they are not extraction failures.
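For anyone who wants to double-check the arithmetic, the per-format counts in the table above can be re-added in a few lines. This is only a sanity-check sketch: the `results` dictionary and the `success_rate` helper are names made up for this example, and the numbers are copied straight from the table.

```python
# Per-format counts copied from the table above: (tested, passed).
results = {
    ".docx": (142, 142),
    ".xlsx": (89, 89),
    ".pptx": (34, 34),
    ".doc": (18, 17),
    ".xls": (12, 12),
    ".ppt": (4, 4),
    ".csv": (2, 1),
}


def success_rate(tested: int, passed: int) -> float:
    """Return the pass percentage, rounded to one decimal place."""
    return round(100 * passed / tested, 1)


# Sum both columns to get the overall figures.
total_tested = sum(tested for tested, _ in results.values())
total_passed = sum(passed for _, passed in results.values())

for ext, (tested, passed) in results.items():
    print(f"{ext:6} {passed}/{tested} = {success_rate(tested, passed)}%")
print(f"total  {total_passed}/{total_tested} = {success_rate(total_tested, total_passed)}%")
```

The totals come out to 299 of 301, which matches the 99.3% headline figure.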

What We Found

1,293 resources indexed across all documents
Zero crashes — Every file handled gracefully
Clear error messages for the 2 failures
Average extraction time: 0.045s per document
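A run like the one described above could be driven by a small harness along these lines. This is a minimal sketch, not the project's actual torture-test script: `torture_test` and `fake_extract` are hypothetical names, and `fake_extract` is a stand-in that only rejects empty files, where a real run would call the real extraction tools.

```python
import tempfile
import time
from collections import Counter
from pathlib import Path
from typing import Callable


def torture_test(root: Path, extract: Callable[[Path], object]) -> dict:
    """Run `extract` on every file under `root` and tally outcomes by extension."""
    tested, passed = Counter(), Counter()
    failures = []
    started = time.perf_counter()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        ext = path.suffix.lower()
        tested[ext] += 1
        try:
            extract(path)
            passed[ext] += 1
        except Exception as exc:  # record the failure instead of crashing the run
            failures.append((path.name, str(exc)))
    elapsed = time.perf_counter() - started
    total = sum(tested.values())
    return {
        "tested": dict(tested),
        "passed": dict(passed),
        "failures": failures,
        "avg_seconds": elapsed / total if total else 0.0,
    }


def fake_extract(path: Path) -> str:
    """Hypothetical stand-in extractor: rejects empty files, reads the rest."""
    if path.stat().st_size == 0:
        raise ValueError(f"{path.name} is empty (0 bytes)")
    return path.read_bytes().decode("utf-8", errors="replace")


# Demo on a throwaway directory with one good file and one empty file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "memo.csv").write_text("a,b\n1,2\n", encoding="utf-8")
    (root / "empty.csv").write_bytes(b"")
    report = torture_test(root, fake_extract)

print(report["tested"], report["passed"], report["failures"])
```

Catching every exception per file is what lets the run report a failure with a readable message while still reaching the remaining documents.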

Test Categories

Unit Tests (53)

Category	Count	Description
Universal Tools	12	`extract_text`, `extract_images`, `detect_format`, etc.
Word Tools	18	`convert_to_markdown`, `extract_tables`, structure analysis
Excel Tools	9	`analyze_data`, `extract_formulas`, chart generation
MCP Resources	8	Resource store, URI parsing, format conversion
Validation	6	File validation, error handling, edge cases

What We Test

Happy Path — Normal documents extract correctly
Legacy Formats — .doc, .xls, .ppt from the basement
Large Documents — Pagination triggers at 25k tokens
Malformed Files — Graceful errors, no crashes
Edge Cases — Empty files, Unicode, special characters
URL Processing — HTTP downloads, caching
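To give a feel for the happy-path, edge-case, and malformed-file categories above, here is a rough sketch of what such tests can look like. Everything in it is an assumption for illustration: `extract_text_stub` and `ExtractionError` are hypothetical stand-ins that handle plain UTF-8 text only, not the project's real tools or error types. The `test_*` functions use plain `assert`, so pytest can collect them as-is.

```python
import tempfile
from pathlib import Path


class ExtractionError(Exception):
    """Hypothetical error type carrying a human-readable message."""


def extract_text_stub(path: Path) -> str:
    """Hypothetical stand-in extractor; handles plain UTF-8 text only."""
    if not path.exists():
        raise ExtractionError(f"File not found: {path.name}")
    data = path.read_bytes()
    if not data:
        raise ExtractionError(f"File is empty (0 bytes): {path.name}")
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ExtractionError(f"Could not decode {path.name}: {exc.reason}") from exc


def expect_error(path: Path, fragment: str) -> None:
    """Assert that extraction fails with a message containing `fragment`."""
    try:
        extract_text_stub(path)
    except ExtractionError as exc:
        assert fragment in str(exc)
    else:
        raise AssertionError("expected ExtractionError")


def test_happy_path_preserves_unicode():
    with tempfile.TemporaryDirectory() as tmp:
        doc = Path(tmp) / "memo.txt"
        doc.write_text("TPS report: café ☕", encoding="utf-8")
        assert extract_text_stub(doc) == "TPS report: café ☕"


def test_empty_file_gives_clear_error():
    with tempfile.TemporaryDirectory() as tmp:
        doc = Path(tmp) / "empty.txt"
        doc.write_bytes(b"")
        expect_error(doc, "empty")


def test_malformed_bytes_do_not_crash():
    with tempfile.TemporaryDirectory() as tmp:
        doc = Path(tmp) / "broken.txt"
        doc.write_bytes(b"\xff\xfe\x00bad")  # not valid UTF-8
        expect_error(doc, "decode")
```

The point of the pattern is that bad input produces one predictable exception type with a message a human can act on, rather than an arbitrary traceback from deep inside a parser.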

Run Tests Yourself

# Clone the repo
git clone https://github.com/ryanmalloy/mcwaddams.git
cd mcwaddams

# Install dev dependencies
uv sync --dev

# Run tests
uv run pytest

# With coverage
uv run pytest --cov=mcwaddams

Makefile Shortcuts

make test            # Run tests + generate HTML dashboard
make test-pytest     # Just pytest, no dashboard
make view-dashboard  # Open the HTML report

The HTML Dashboard

We built a visual test dashboard because staring at pytest output gets old.

Features:

Pass/fail stats at a glance
Expandable test details
MS Office-inspired theme (Word blue, Excel green, PowerPoint orange)
Detailed I/O for debugging
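How the dashboard collects its numbers is not described here, but one plausible source for pass/fail stats is the JUnit-style XML report that pytest writes with `--junitxml`. The sketch below assumes that input; the `summarize` function, the sample XML, and the test names in it are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the JUnit XML shape pytest emits with --junitxml.
SAMPLE_JUNIT = """<testsuites>
  <testsuite name="pytest" tests="3" failures="1" errors="0" skipped="0">
    <testcase classname="tests.test_word" name="test_tables" time="0.012"/>
    <testcase classname="tests.test_excel" name="test_formulas" time="0.020">
      <failure message="assert 1 == 2">traceback text</failure>
    </testcase>
    <testcase classname="tests.test_universal" name="test_detect" time="0.003"/>
  </testsuite>
</testsuites>"""


def summarize(junit_xml: str) -> dict:
    """Count passed, failed, and skipped test cases in a JUnit XML string."""
    root = ET.fromstring(junit_xml)
    summary = {"passed": 0, "failed": 0, "skipped": 0, "details": []}
    for case in root.iter("testcase"):
        # A testcase with a <failure> or <error> child did not pass.
        if case.find("failure") is not None or case.find("error") is not None:
            status = "failed"
        elif case.find("skipped") is not None:
            status = "skipped"
        else:
            status = "passed"
        summary[status] += 1
        test_id = f'{case.get("classname")}::{case.get("name")}'
        summary["details"].append((test_id, status))
    return summary


summary = summarize(SAMPLE_JUNIT)
print(summary["passed"], summary["failed"], summary["skipped"])
```

The `details` list is the kind of per-test data an expandable section could be rendered from.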

Coverage Philosophy

Our coverage is 20% — and that’s intentional.

We focus on:

Critical extraction paths — The code that touches your documents
Error handling — Making sure failures are graceful
Edge cases — The weird stuff that breaks other tools

We don’t test:

Boilerplate and configuration
Third-party library internals
UI/formatting code

CI/CD

Every push triggers:

Lint — ruff check
Format — black --check
Type Check — mypy
Tests — pytest with coverage
Build — Verify package builds
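The same sequence can be approximated locally before pushing. The script below is a sketch under several assumptions: the exact command arguments (the `.` targets, `uv build` for the build step) are guesses based on the step names above, and stopping at the first failure is a design choice of this example, not a documented property of the real pipeline. `run_pipeline` and `run_command` are hypothetical names.

```python
import subprocess
import sys
from typing import Callable, Sequence

# Commands are assumptions inferred from the step names; adjust to the real config.
STEPS = [
    ("Lint", ["ruff", "check", "."]),
    ("Format", ["black", "--check", "."]),
    ("Type Check", ["mypy", "."]),
    ("Tests", ["pytest", "--cov=mcwaddams"]),
    ("Build", ["uv", "build"]),  # assumed build command
]


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def run_pipeline(
    steps: Sequence[tuple] = STEPS,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> list:
    """Run steps in order, stopping at the first non-zero exit code."""
    results = []
    for name, cmd in steps:
        code = runner(cmd)
        results.append((name, code))
        if code != 0:
            break
    return results


if __name__ == "__main__":
    outcome = run_pipeline()
    for name, code in outcome:
        print(f"{name}: {'ok' if code == 0 else f'failed ({code})'}")
    # Exit non-zero if the last step that ran failed.
    sys.exit(outcome[-1][1] if outcome else 0)
```

Passing the runner in as a parameter keeps the ordering logic testable without having any of the tools installed.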

“I could set the building on fire…”

But we’d rather just run the tests.
