HTML Entity Encoder / Decoder

Encode special characters to HTML entities and decode entities back to plain text.

How It Works

HTML entities are special codes that represent characters which either have special meaning in HTML markup or cannot be typed directly. The most critical ones are &amp; for the ampersand (&), &lt; for less-than (<), and &gt; for greater-than (>). Failing to encode these when inserting user-supplied data into HTML is a primary cause of Cross-Site Scripting (XSS) attacks.
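
As a minimal sketch of that rule, the following Python snippet escapes the three critical characters before text is placed into HTML. It uses the standard library's html.escape; the helper name escape_text and the sample comment are illustrative assumptions, not part of this tool.

```python
import html


def escape_text(untrusted: str) -> str:
    """Escape &, <, and > so the string is safe as HTML text content."""
    # quote=False leaves " and ' untouched; they only matter inside attribute values.
    return html.escape(untrusted, quote=False)


comment = '<script>alert("hi")</script> & more'
print(escape_text(comment))
# &lt;script&gt;alert("hi")&lt;/script&gt; &amp; more
```

The ampersand is replaced first, so the entities produced for < and > are not themselves re-encoded.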

There are three entity formats. Named entities like &copy; are human-readable and widely supported. Decimal entities like &#169; use the character's Unicode code point in base-10. Hexadecimal entities like &#xA9; use base-16. All three formats decode to the same character, so choose whichever fits your context.
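
A quick way to check this equivalence is to decode all three forms and compare the results. The sketch below uses Python's html.unescape purely as an illustration; the variable names are assumptions.

```python
import html

# The same character written as a named, decimal, and hexadecimal entity.
forms = ["&copy;", "&#169;", "&#xA9;"]

decoded = [html.unescape(form) for form in forms]
print(decoded)
# ['©', '©', '©']
```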

The Encode all characters option converts every character, including plain ASCII letters and digits, to its entity form. This can be used for basic email address obfuscation, which hides addresses from simple scrapers but not from any scraper that decodes entities, or for contexts that only accept entity form.
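
One possible way to implement an encode-everything mode is sketched below in Python. It assumes numeric entities are used for every character; the function name encode_all and its hexadecimal flag are hypothetical and may not match how this tool does it.

```python
import html


def encode_all(text: str, hexadecimal: bool = False) -> str:
    """Convert every character, including ASCII letters, to a numeric entity."""
    template = "&#x{:X};" if hexadecimal else "&#{};"
    # ord() returns the Unicode code point, so any character gets exactly one entity.
    return "".join(template.format(ord(ch)) for ch in text)


encoded = encode_all("a@b.io")
print(encoded)
# &#97;&#64;&#98;&#46;&#105;&#111;

# Decoding restores the original text, which is why this only deters naive scrapers.
print(html.unescape(encoded))
# a@b.io
```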

Decoding uses the browser's built-in DOMParser API, which handles named, decimal, and hexadecimal entities, including edge cases like &nbsp; (non-breaking space) and non-ASCII Unicode characters.
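
The same edge cases can be explored outside the browser. The sketch below uses Python's html.unescape as a stand-in; it is not the DOMParser code path this tool uses, and the helper name decode_entities is an assumption.

```python
import html


def decode_entities(encoded: str) -> str:
    """Decode named, decimal, and hexadecimal entities to plain text."""
    return html.unescape(encoded)


# &nbsp; becomes U+00A0 (non-breaking space), not an ordinary space.
print(decode_entities("a&nbsp;b") == "a\u00a0b")
# True

# A hexadecimal entity above U+FFFF decodes to a single character.
print(decode_entities("&#x1F600;") == "\U0001F600")
# True

# Markup characters come back as literal text.
print(decode_entities("&lt;p&gt;"))
# <p>
```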

Frequently Asked Questions

What are HTML entities and why do I need them?

HTML entities represent characters that have special meaning in HTML markup (like < > &) or that are hard to type directly (like © or €). Without encoding, < in text would be interpreted as the start of an HTML tag, causing display errors or security vulnerabilities (XSS).

How do I encode < and > in HTML?

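Use &lt; for < and &gt; for >. Always encode & as &amp; too, and inside attribute values encode " as &quot; and ' as &#39; (or &apos;, which HTML5 and XML define but HTML 4 does not). These five cover HTML text content and quoted attribute values; other contexts such as unquoted attributes, JavaScript, CSS, and URLs need their own escaping.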
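
For attribute values, the quote characters have to be escaped as well. A minimal Python sketch follows; render_link_title and the /profile URL are hypothetical names used only for illustration. Python's html.escape writes the single quote as &#x27;, the hexadecimal equivalent of &#39;.

```python
import html


def render_link_title(untrusted_title: str) -> str:
    """Place untrusted text inside a double-quoted attribute value."""
    # quote=True (the default) also escapes " as &quot; and ' as &#x27;.
    safe = html.escape(untrusted_title, quote=True)
    return f'<a href="/profile" title="{safe}">profile</a>'


print(render_link_title('" onmouseover="alert(1)'))
# <a href="/profile" title="&quot; onmouseover=&quot;alert(1)">profile</a>
```

Because the injected quotes are encoded, the attacker's text stays inside the title attribute instead of starting a new attribute.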

What is the difference between named, decimal, and hex entities?

Named: &copy; (human-readable, works in HTML). Decimal: &#169; (Unicode code point in base 10). Hexadecimal: &#xA9; (Unicode code point in base 16). All three produce the same character (©). Named entities are only defined for a subset of Unicode; decimal/hex work for any Unicode character.
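
To see all three forms for a given character, a small Python helper can build them from the code point. This is a sketch under stated assumptions: entity_forms is a hypothetical name, and the lookup table html.entities.codepoint2name only covers the HTML 4.01 entity names, so characters named only in HTML5 would also report no named form here.

```python
from html.entities import codepoint2name
from typing import Dict, Optional


def entity_forms(ch: str) -> Dict[str, Optional[str]]:
    """Return the named, decimal, and hexadecimal entity for one character."""
    code_point = ord(ch)  # expects exactly one character
    name = codepoint2name.get(code_point)
    return {
        "named": f"&{name};" if name else None,
        "decimal": f"&#{code_point};",
        "hex": f"&#x{code_point:X};",
    }


print(entity_forms("©"))
# {'named': '&copy;', 'decimal': '&#169;', 'hex': '&#xA9;'}

# An emoji has no named entity, but the numeric forms still work.
print(entity_forms("\U0001F600"))
# {'named': None, 'decimal': '&#128512;', 'hex': '&#x1F600;'}
```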

Does encoding HTML entities prevent XSS attacks?

Encoding < > & " ' prevents most injection attacks when inserting user content into HTML text or quoted attribute values. However, context matters: HTML attributes, JavaScript strings, CSS, and URL parameters each need different escaping rules. Prefer the escaping functions or auto-escaping templates your framework provides over hand-written encoding.
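
To illustrate why context matters, the Python sketch below applies two different encodings to the same value: percent-encoding for the URL parameter, then HTML escaping for the attribute and the visible text. The function search_link and the /search?q= URL are hypothetical, and this covers only two of the contexts listed above.

```python
import html
from urllib.parse import quote


def search_link(term: str) -> str:
    """Build a link whose query parameter and label both contain untrusted text."""
    # URL context: percent-encode the value so & and = cannot add parameters.
    url = "/search?q=" + quote(term, safe="")
    # HTML context: entity-encode both the attribute value and the text content.
    return f'<a href="{html.escape(url)}">{html.escape(term)}</a>'


print(search_link("a&b=<c>"))
# <a href="/search?q=a%26b%3D%3Cc%3E">a&amp;b=&lt;c&gt;</a>
```

The same input ends up as %26 in the URL and &amp; in the text, which is why a single encoding step cannot cover every context.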