Why is DOM-based parsing better than regex for HTML to Markdown?

DOM parsing hands the job to the browser's own HTML parser, which handles nested tags, quoted attributes, and optional closing tags correctly — producing more accurate Markdown than regex-based approaches.

What Markdown flavour does the output use?

GitHub Flavored Markdown (GFM) with pipe tables, fenced code blocks with language tags, and 2-space nested list indentation. Supported by GitHub, GitLab, Notion, Obsidian, and most modern editors.

Can I convert a full webpage to Markdown?

Yes. Paste the full page HTML and enable Clean Article mode to remove nav, header, footer, and sidebar elements automatically before parsing. The converter extracts the main body content.

Is my HTML content processed on a server?

No. All conversion happens in your browser using the browser's DOMParser API and the marked.js library loaded from a CDN. Your content is never sent to any server.

What happens to CSS classes and inline styles?

CSS classes, inline styles, data attributes, and other non-semantic attributes are stripped. Only semantic meaning — headings, emphasis, links, and lists — is preserved in the Markdown output.

How does the Strip images option work?

When Strip images is checked, all img elements are ignored during conversion and no image Markdown is included in the output.

What does Clean Article mode do?

Clean Article removes nav, header, footer, aside, script, and style elements before parsing, leaving only the main editorial content for conversion.

HTML to Markdown Converter — Bidirectional

Paste HTML and get clean Markdown using accurate DOM-based parsing — handles nested lists, tables, code blocks, and blockquotes. Switch tabs to also convert Markdown to HTML. No server, no signup.

How to Use the HTML to Markdown Converter

Select the HTML → Markdown tab (active by default).
Paste your HTML into the left input panel.
The right panel outputs clean Markdown instantly as you type.
Enable Strip images or Strip links to clean up the output, or enable Clean Article to remove nav, header, and footer elements from a full page paste.
Click Copy Markdown or Download .md to save your output.
Switch to the Markdown → HTML tab to convert in the opposite direction, with a live rendered preview below the output.

Key Features

DOM-based HTML parsing — accurate for nested lists, complex tables, and inline elements inside block elements
GitHub Flavored Markdown output: pipe tables, fenced code blocks with language tags, 2-space nested list indentation
Strip images option to remove all image Markdown from output
Strip links option to convert links to plain text
Clean Article mode — removes nav, header, footer, aside, script, and style before parsing
Bidirectional: Markdown → HTML via marked.js with live preview
Live word count on Markdown output
Copy Markdown and Download .md buttons
Fully browser-based — your HTML is never sent to any server

Use Cases

Convert full webpage HTML to Markdown for documentation

Paste the full HTML source of a web page and enable Clean Article mode to strip navigation, headers, and footers automatically. The converter extracts and converts the main editorial content to clean Markdown, ready for use in a documentation repository or knowledge base.

Extract article content as Markdown from blog HTML

When migrating blog content from one CMS to another, you often need Markdown instead of HTML. Paste the article HTML here to get a clean Markdown file you can commit to a static site generator like Hugo, Jekyll, or Eleventy.

Convert HTML tables to Markdown pipe tables

HTML tables convert cleanly to GFM pipe table format with proper header separators. This is useful when moving tabular data from a web page or exported report into a README, documentation page, or Notion database.
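
As a rough illustration of the pipe table format mentioned above, the sketch below builds a GFM table from rows of cell text. The function name `toPipeTable` and the array-of-rows input are assumptions made for this example, not the converter's actual code; in a browser, the rows could be read from a table element's `rows` and each row's `cells`.

```javascript
// Hypothetical helper: the first row is treated as the header row.
function toPipeTable(rows) {
  // Pipes inside a cell must be escaped, and line breaks collapsed,
  // because each table row has to stay on a single line.
  const cleanCell = (value) =>
    String(value).trim().replace(/\|/g, "\\|").replace(/\s*\n\s*/g, " ");
  const toLine = (cells) => "| " + cells.map(cleanCell).join(" | ") + " |";

  const [header, ...body] = rows;
  // The separator row is what marks the first row as a header in GFM.
  const separator = "| " + header.map(() => "---").join(" | ") + " |";
  return [toLine(header), separator, ...body.map(toLine)].join("\n");
}

console.log(toPipeTable([
  ["Name", "Type"],
  ["id", "int"],
  ["a|b", "text"],
]));
```

Escaping the pipe character matters because an unescaped `|` inside a cell would be read as a column boundary.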

Paste HTML email content and get clean Markdown

HTML emails are notoriously messy. Paste the email HTML here with Strip links and Strip images enabled to drop image tags and inline styles and reduce links, including tracking links, to their plain text. The result is a clean Markdown representation of the email body text.

Preview GitHub README formatting from Markdown input

Switch to the Markdown → HTML tab and paste your README content to see a live rendered preview. This lets you verify heading hierarchy, code block formatting, table alignment, and task list rendering before pushing to GitHub.

Supported HTML Elements

Headings: h1–h6 → # through ######
Paragraphs: p → text with surrounding blank lines
Bold / Italic: strong, b → **text**; em, i → *text*
Links: a → [text](href)
Images: img → ![alt](src)
Lists: ul/li → - item, ol/li → 1. item (2-space indent for nesting)
Blockquotes: blockquote → > text
Inline code: code → `text`
Code blocks: pre → ```language\ncode\n```
Tables: table → GFM pipe table
Horizontal rule: hr → ---
Wrappers: div, span, section, article, main → content only
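
The mapping above can be sketched as a small recursive walk over the parsed tree. This is a minimal sketch covering only a few of the listed elements, not the converter's actual implementation. The helpers `text` and `el` are hypothetical stand-ins for DOM nodes so the example also runs outside a browser; in a browser, `toMarkdown` could be given the `body` of a document returned by `DOMParser`.

```javascript
// Hypothetical stand-ins that mimic the few DOM properties the walk needs.
const text = (value) => ({ nodeType: 3, textContent: value, childNodes: [] });
const el = (name, attrs, ...children) => ({
  nodeType: 1,
  nodeName: name.toUpperCase(),
  childNodes: children,
  getAttribute: (key) => (key in attrs ? attrs[key] : null),
});

function toMarkdown(node) {
  if (node.nodeType === 3) return node.textContent; // text node
  if (node.nodeType !== 1) return "";               // comments and the like
  const inner = Array.from(node.childNodes).map(toMarkdown).join("");
  const name = node.nodeName;
  if (/^H[1-6]$/.test(name)) {
    return "#".repeat(Number(name[1])) + " " + inner.trim() + "\n\n";
  }
  switch (name) {
    case "P": return inner.trim() + "\n\n";
    case "STRONG": case "B": return `**${inner}**`;
    case "EM": case "I": return `*${inner}*`;
    case "CODE": return "`" + inner + "`";
    case "A": return `[${inner}](${node.getAttribute("href") ?? ""})`;
    case "IMG":
      return `![${node.getAttribute("alt") ?? ""}](${node.getAttribute("src") ?? ""})`;
    case "HR": return "---\n\n";
    default: return inner; // div, span, section, article, main: content only
  }
}

const sample = el("div", {},
  el("h2", {}, text("Title")),
  el("p", {}, text("See "), el("a", { href: "/x" }, el("strong", {}, text("docs"))), text(".")));
console.log(toMarkdown(sample).trim());
```

Because each element converts its children first, inline elements nested inside block elements (a bold link inside a paragraph, for instance) come out correctly without any special casing.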

FAQs

Why is DOM-based parsing better than regex for HTML to Markdown?

Regular expressions cannot reliably parse nested HTML: they fail on quoted attributes, deeply nested tags, and optional closing tags. DOM parsing hands the job to the browser's own HTML parser, which resolves these cases the same way the browser does when rendering a page and produces more accurate Markdown for complex documents.
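
One way to see the difference is an attribute value that contains a `>` character. The sketch below uses a deliberately naive pattern (`naiveLinkPattern`, invented for this example) and compares it with the browser's `DOMParser`. The DOM branch only runs where `DOMParser` exists, so the snippet is safe to run outside a browser too.

```javascript
// A naive pattern that assumes ">" never appears inside an attribute value.
const naiveLinkPattern = /<a\s+[^>]*href="([^"]*)"[^>]*>(.*?)<\/a>/;

const simpleHtml = '<a href="/docs">Docs</a>';
const trickyHtml = '<a title="a > b" href="/docs">Docs</a>';

const simpleResult = simpleHtml.match(naiveLinkPattern);
const trickyResult = trickyHtml.match(naiveLinkPattern);

console.log(simpleResult && simpleResult[1]); // the href is found
console.log(trickyResult);                    // null: the quoted ">" breaks the pattern

// Browser only: the HTML parser treats the quoted ">" as part of the title.
if (typeof DOMParser !== "undefined") {
  const doc = new DOMParser().parseFromString(trickyHtml, "text/html");
  const link = doc.querySelector("a");
  console.log(`[${link.textContent}](${link.getAttribute("href")})`);
}
```

The pattern could be patched for this one case, but each patch tends to expose another, which is why the parsed tree is the more dependable starting point.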

What Markdown flavour does the output use?

The output uses GitHub Flavored Markdown (GFM), widely supported by GitHub, GitLab, Notion, Obsidian, VS Code, and most modern Markdown editors. Tables use the GFM pipe table format; code blocks use triple-backtick syntax with language tags.

Can I convert a full webpage to Markdown?

Yes. Paste the full page HTML source and enable Clean Article mode to automatically remove navigation, header, footer, and sidebar elements before parsing. The converter extracts and converts the main body content. Results vary based on how the page is structured.

Is my HTML content processed on a server?

No. All conversion happens entirely in your browser, using the browser's DOMParser API for HTML→Markdown and the marked.js library (loaded from a CDN) for Markdown→HTML. Your content is never sent to any server. The tool works offline once the page and marked.js have loaded.

What happens to CSS classes and inline styles?

CSS classes, inline styles, data attributes, and other non-semantic HTML attributes are stripped during conversion. Markdown has no concept of styling, so only semantic meaning (headings, emphasis, links, lists) is preserved, producing clean, portable Markdown.

How does the Strip images option work?

When Strip images is checked, all img elements are ignored during conversion. No image Markdown (![alt](src)) is included in the output. This is useful when you want the text content of a page without embedded image references.
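
A minimal sketch of how such options might be applied during the tree walk is shown below. The option names `stripImages` and `stripLinks`, the function `inlineToMarkdown`, and the `text` and `el` stand-in helpers are all assumptions for illustration; the tool's real option handling is not shown on this page.

```javascript
// Hypothetical stand-ins for DOM nodes so the sketch runs outside a browser.
const text = (value) => ({ nodeType: 3, textContent: value, childNodes: [] });
const el = (name, attrs, ...children) => ({
  nodeType: 1,
  nodeName: name.toUpperCase(),
  childNodes: children,
  getAttribute: (key) => (key in attrs ? attrs[key] : null),
});

function inlineToMarkdown(node, options = {}) {
  const { stripImages = false, stripLinks = false } = options;
  if (node.nodeType === 3) return node.textContent;
  if (node.nodeType !== 1) return "";
  if (node.nodeName === "IMG") {
    if (stripImages) return ""; // ignore the element entirely
    return `![${node.getAttribute("alt") ?? ""}](${node.getAttribute("src") ?? ""})`;
  }
  const inner = Array.from(node.childNodes)
    .map((child) => inlineToMarkdown(child, options))
    .join("");
  if (node.nodeName === "A" && !stripLinks) {
    return `[${inner}](${node.getAttribute("href") ?? ""})`;
  }
  return inner; // stripped links and wrappers keep only their text
}

const paragraph = el("p", {},
  text("Read "),
  el("a", { href: "/t?utm=1" }, text("this")),
  text(" "),
  el("img", { alt: "x", src: "p.png" }));

console.log(inlineToMarkdown(paragraph));
console.log(inlineToMarkdown(paragraph, { stripImages: true, stripLinks: true }));
```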

What does Clean Article mode do?

Clean Article removes nav, header, footer, aside, script, and style elements from the DOM before parsing begins. This strips site chrome and leaves only the main content area. It is ideal for extracting readable article text from a full page paste.
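
The removal step can be sketched in two forms. The first, `withoutChrome`, is a hypothetical helper that works on plain objects shaped like `{ nodeName, childNodes }` so it runs anywhere. The second is a browser-only equivalent using standard DOM calls. Neither is taken from the tool's source; both only illustrate the element list named above.

```javascript
// Element names that Clean Article mode removes, per the description above.
const CHROME_TAGS = new Set(["NAV", "HEADER", "FOOTER", "ASIDE", "SCRIPT", "STYLE"]);

// Returns a pruned copy of a plain-object tree; the input is left untouched.
function withoutChrome(node) {
  const children = (node.childNodes ?? [])
    .filter((child) => !CHROME_TAGS.has(child.nodeName))
    .map(withoutChrome);
  return { ...node, childNodes: children };
}

// Browser equivalent on a real parsed document (skipped outside a browser).
if (typeof DOMParser !== "undefined") {
  const doc = new DOMParser().parseFromString(
    "<header>Site</header><main><p>Story</p></main><footer>About</footer>",
    "text/html"
  );
  doc.querySelectorAll("nav, header, footer, aside, script, style")
    .forEach((node) => node.remove());
  console.log(doc.body.innerHTML); // the header and footer are gone
}
```

Removing these elements before conversion, rather than filtering the Markdown afterwards, means their text never reaches the output in the first place.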

How are nested lists handled?

Nested lists are indented by 2 spaces per level in the Markdown output. A ul inside a li becomes a nested list whose items are prefixed with two extra spaces, which lines them up under the text of the parent bullet item so GFM renderers keep the hierarchy.
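
The indentation rule can be illustrated with a short recursive function. This is a sketch under stated assumptions: `listToMarkdown` and the `text` and `el` stand-in helpers are invented for the example, and item labels are reduced to plain text for brevity.

```javascript
// Hypothetical stand-ins for DOM nodes so the sketch runs outside a browser.
const text = (value) => ({ nodeType: 3, nodeName: "#text", textContent: value, childNodes: [] });
const el = (name, ...children) => ({
  nodeType: 1,
  nodeName: name.toUpperCase(),
  childNodes: children,
  get textContent() { return children.map((c) => c.textContent).join(""); },
});

function listToMarkdown(list, depth = 0) {
  const ordered = list.nodeName === "OL";
  const lines = [];
  let index = 1;
  for (const item of list.childNodes) {
    if (item.nodeName !== "LI") continue; // skip whitespace between items
    const marker = ordered ? `${index++}.` : "-";
    let label = "";
    const nested = [];
    for (const child of item.childNodes) {
      if (child.nodeName === "UL" || child.nodeName === "OL") {
        nested.push(listToMarkdown(child, depth + 1)); // one level deeper
      } else {
        label += child.textContent;
      }
    }
    // Two spaces of indentation per nesting level.
    lines.push("  ".repeat(depth) + marker + " " + label.trim());
    lines.push(...nested);
  }
  return lines.join("\n");
}

const shopping = el("ul",
  el("li", text("Fruit"), el("ul", el("li", text("Apple")), el("li", text("Pear")))),
  el("li", text("Veg")));
console.log(listToMarkdown(shopping));
```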