Batch-converting documents to markdown with Microsoft's markitdown

Published: (June 18, 2026 at 01:01 PM EDT)
1 min read
Source: Dev.to

Here’s a quick tool that landed in my queue recently: microsoft/markitdown.

It’s a Python CLI that converts PDFs, Word docs, PowerPoint, and Excel files to Markdown. Not groundbreaking, but if you’ve ever had to process a folder of legacy documentation for a static site, you know the value of not doing it manually.

Two things I found useful:

Batch conversion with piping

    markitdown document.docx -o converted/document.md

To process a whole directory in one shot, combine it with standard Unix tools (the ./md directory must already exist):

    find ./legacy-docs -name '*.docx' -print0 | xargs -0 -I{} sh -c 'markitdown "$1" -o "./md/$(basename "$1" .docx).md"' _ {}
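For anyone who would rather drive the batch from Python than from find and xargs, here is a minimal sketch of the same idea. It assumes the `markitdown` executable is on PATH and prints Markdown to stdout when given a file path. The helper names `output_path_for` and `convert_tree` are hypothetical, made up for this illustration and not part of markitdown.

```python
import subprocess
from pathlib import Path


def output_path_for(src: Path, src_root: Path, out_root: Path) -> Path:
    """Map an input file to a .md path under out_root, keeping subfolders."""
    return (out_root / src.relative_to(src_root)).with_suffix(".md")


def convert_tree(src_root: Path, out_root: Path, pattern: str = "*.docx") -> list[Path]:
    """Convert every file matching pattern under src_root; return the written paths."""
    written = []
    for src in sorted(src_root.rglob(pattern)):
        dest = output_path_for(src, src_root, out_root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        # Assumption: `markitdown <file>` writes the converted Markdown to stdout.
        result = subprocess.run(
            ["markitdown", str(src)],
            capture_output=True,
            encoding="utf-8",
            check=True,  # raise if the CLI exits with an error
        )
        dest.write_text(result.stdout, encoding="utf-8")
        written.append(dest)
    return written
```

Calling `convert_tree(Path("legacy-docs"), Path("md"))` would mirror the folder layout under `md/`, which the flat shell one-liner does not do. Whether that is worth the extra lines depends on how nested your legacy docs are.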

stdout output for scripting

    markitdown document.pdf

This dumps the Markdown to stdout, which makes it easy to pipe into other text processing or redirect to specific filenames based on the input.

It’s on PyPI (pip install markitdown), so it’ll drop into a CI pipeline without much friction. If you’ve got a documentation migration on your plate and you’re tired of manual conversions, it’s worth a look.

https://github.com/microsoft/markitdown
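As a rough illustration of the stdout piping idea, here is a small filter that could sit after markitdown in a pipeline. It is only a sketch: the script name `tidy_md.py` and the function `tidy_markdown` are hypothetical, and the cleanup rules are deliberately simplistic. They strip trailing whitespace, which also removes Markdown's two-space hard line breaks, and they do not treat code blocks specially.

```python
import re
import sys


def tidy_markdown(text: str) -> str:
    """Strip trailing whitespace and collapse runs of blank lines to one."""
    lines = [line.rstrip() for line in text.splitlines()]
    # Three or more newlines in a row means two or more blank lines; keep one.
    collapsed = re.sub(r"\n{3,}", "\n\n", "\n".join(lines))
    # Normalize to exactly one trailing newline.
    return collapsed.strip("\n") + "\n"


# Only act as a filter when input is actually piped in.
if __name__ == "__main__" and not sys.stdin.isatty():
    sys.stdout.write(tidy_markdown(sys.stdin.read()))
```

Saved as `tidy_md.py`, it would be used as `markitdown document.pdf | python tidy_md.py > document.md`.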
