01 August 2010

How to use ampersands in HTML: to encode or not to encode?

In HTML, the ampersand character (“&”) marks the beginning of an entity reference (a special character). If you want one to appear in text on a web page, you should use the encoded named entity “&amp;” (more technical mumbo-jumbo at w3c.org). While most web browsers will let you get away without encoding them, stuff can get dicey in weird edge cases and fail completely in XML.

This seems like a simple rule, but what about URLs in HTML, JavaScript files, JavaScript in HTML, etc.? Here’s a little guide to help clear up that ampersand HTML confusion:

Text in HTML:

<p>Jack &amp; Jill ran up the hill.</p>

A link in HTML (or any HTML attribute value):

<a href="http://lmgtfy.com/?l=1&amp;q=rick+roll">tired meme</a>

Note: Matt describes in the comments below the difference between escaping an & to split up query string parameters and percent escaping one to be in the value of a query string parameter.
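The two layers Matt describes can be sketched in JavaScript. This is only a minimal sketch: `buildUrl` is a hypothetical helper name, not a library function, and `encodeURIComponent` is the only built-in doing real work here.

```javascript
// Hypothetical helper: percent-encode every key and value, then join the
// parameters with bare ampersands (the URL-syntax ampersands).
function buildUrl(base, params) {
  const query = Object.entries(params)
    .map(([key, value]) => encodeURIComponent(key) + '=' + encodeURIComponent(value))
    .join('&');
  return base + '?' + query;
}

// Layer 1: the ampersand inside the value becomes %26.
const url = buildUrl('http://lmgtfy.com/', { l: 1, q: 'rock&roll' });
// url is 'http://lmgtfy.com/?l=1&q=rock%26roll'

// Layer 2: the remaining ampersand is URL syntax, so it gets HTML-escaped
// when the URL is written into an attribute value.
const link = '<a href="' + url.replace(/&/g, '&amp;') + '">awesome music</a>';
// link is '<a href="http://lmgtfy.com/?l=1&amp;q=rock%26roll">awesome music</a>'
```

The order matters: percent-encode the values first, then HTML-escape the finished URL once, right where it goes into the markup.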

A link in JavaScript:

window.location = 'http://lmgtfy.com/?l=1&q=rick+roll';

If you’re using a web framework that escapes variables for you and you pass a URL as a variable into JavaScript, then you’ll have to make sure it doesn’t HTML-encode the ampersands. In Django, you would write something like this: window.location = '{{ url|escapejs }}';
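Why does that work? As far as I know, Django's `escapejs` writes an ampersand as the JavaScript escape sequence `\u0026`, and the JavaScript parser turns that back into a plain `&`. Here is a minimal sketch; the rendered string is hand-written and is an assumption about the template output (the real filter escapes more characters than this).

```javascript
// Assumed rendering of '{{ url|escapejs }}' for the lmgtfy URL, written
// by hand; the filter's exact output may escape additional characters.
const rendered = 'http://lmgtfy.com/?l=1\u0026q=rick+roll';

// \u0026 is just another way to spell "&" inside a string literal, so the
// browser ends up with the plain URL and no HTML entity in sight.
const sameAsPlain = rendered === 'http://lmgtfy.com/?l=1&q=rick+roll';
console.log(sameAsPlain); // true
```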

Also, if this is inline JavaScript (in an HTML document, not a separate .js file), then you still shouldn’t escape the ampersand, which means the document will not validate as XHTML. Either throw it into a separate .js file or stop worrying so much about validating your code.

Inside an onclick in HTML:

<a href="#" onclick="window.location='?l=1&amp;q=rick+roll';return false">tired meme</a>

This is redundant with the second example, but worth pointing out since it’s JavaScript inside an attribute of an HTML tag.
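The reason the escaped form works inside onclick is that the HTML parser decodes the attribute value before the JavaScript engine ever sees it. A rough sketch of that step, where `decodeAmp` is a hypothetical stand-in for the browser's decoding and only handles `&amp;`:

```javascript
// Hypothetical stand-in for the browser's attribute decoding. A real HTML
// parser handles every character reference; this only handles &amp;.
function decodeAmp(attributeValue) {
  return attributeValue.replace(/&amp;/g, '&');
}

// The attribute value exactly as it is written in the HTML source.
const onclickAttr = "window.location='?l=1&amp;q=rick+roll';return false";

// What the JavaScript engine receives after HTML decoding.
const scriptSeenByJs = decodeAmp(onclickAttr);
// scriptSeenByJs is "window.location='?l=1&q=rick+roll';return false"
```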

Dynamically in JavaScript (example using jQuery):

$('#result').text('Jack & Jill'); // .text() escapes the text for you
$('#result').html('Jack &amp; Jill'); // .html() sets the HTML directly
document.getElementById('result').innerHTML = 'Jack &amp; Jill';

In a Tweet:

When to use & and when to use &amp;amp; http://bit.ly/dtiumF

Twitter auto-converts encoded ampersands…

Some extra notes:

  • If you want to use an ampersand as a value inside the query string of a URL (and not as a delimiter for separating arguments), then you should use the URL-encoded value: %26
  • Quotes should be encoded too (&quot;), but I prefer to use UTF-8 curly quotes
  • The other main characters to remember to encode are < (&lt;) and > (&gt;), since you don’t want to confuse your browser about where HTML tags start and end
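If you end up escaping text by hand, the characters from these notes can be handled in one small function. This is a minimal sketch, and `escapeHtml` is a hypothetical helper name rather than anything built into the browser:

```javascript
// Hypothetical helper, not a built-in: escape the four characters the
// notes above mention. The ampersand must go first, otherwise the "&"
// in "&lt;" and friends would get escaped a second time.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

const escaped = escapeHtml('Jack & Jill say "1 < 2"');
// escaped is 'Jack &amp; Jill say &quot;1 &lt; 2&quot;'
```

This is for HTML text and attribute values only. An ampersand that belongs inside a query string value still needs %26, as in the first note.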

Comments (2)

1. Matt Giuca wrote:
Nice summary. I think it's important to point out when you say that you need to encode '&' characters in a HTML link attribute (<a href="">) that you are referring only to the ampersands of the URI syntax, not those found in the link (which must be percent-encoded). Your example is absolutely correct, but I thought I'd clarify it with another.

<a href="http://lmgtfy.com/?l=1&amp;q=rock%26roll">awesome music</a>

This example has two ampersands in it. Firstly, the query string term "rock&roll" needs to be percent-encoded before it can be included in the URL. So its ampersand is percent-encoded as %26 and this component becomes "rock%26roll". Any ampersand in the text itself, such as "rock&roll", should always be percent-encoded, not HTML-escaped.

This is then used to construct the full URL, "http://lmgtfy.com/?l=1&q=rock%26roll". Thus the second ampersand appears. This ampersand is part of the URI syntax and so when you put the URI into the HTML link, it is still a bare ampersand which needs to be escaped. Therefore it is encoded as "&amp;". Only the URI-syntax ampersands should be HTML-escaped.

Posted on 2 March 2011 at 6:03 PM

2. peter wrote:
Great point Matt! If anyone wants more detail, check out his post on the subject: https://unspecified.wordpress.com/2008/05/24/uri-encoding/

Posted on 2 March 2011 at 8:03 PM


Peter Coles is a software engineer who lives in NYC, works at Ringly, and blogs here.