22 February 2010

Mod rewrite index.html and www

Duplicate content is bad for SEO. A common mistake is serving the same page both as index.html and as its directory, or serving the same content with and without www. For example:

http://mrcoles.com/
http://mrcoles.com/index.html
http://www.mrcoles.com/
http://www.mrcoles.com/index.html

This can dilute your link juice across multiple URLs and confuse search engines, which must choose the URL they think is the original.

Fortunately, this is easy to solve with a mod rewrite (aka mod_rewrite) script that redirects those pages to one canonical URL. If you’re running Apache, add this to either your .htaccess or httpd.conf file:

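RewriteEngine on
RewriteCond %{HTTP_HOST} ^www\.mrcoles\.com [OR]
RewriteCond %{REQUEST_URI} /index\.html$
# ^/? lets the pattern work in both httpd.conf (path has a leading slash)
# and a document-root .htaccess (leading slash already stripped).
RewriteRule ^/?(.*?)(index\.html)?$ http://mrcoles.com/$1 [NE,R=301,L]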

What you get out of this script:

  • It redirects any URL whose host starts with “www” or whose path ends with “index.html” to the no-www, non-index.html version, e.g. http://mrcoles.com/.
  • It uses a 301 redirect, which tells search engines that the page has moved permanently to the new address.
  • Because the checks for www and index.html are combined into one rule, the script only ever does one redirect, instead of potentially two if you were to separate them.
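
To make the single-redirect point concrete, here is a rough Python sketch that imitates the redirect logic outside of Apache. It is only a simulation under simplifying assumptions (query strings, ports and HTTPS are ignored), and the function names combined_rule, separate_rules and count_redirects are made up for this illustration. separate_rules models a hypothetical setup where www and index.html are handled by two independent rules.

```python
import re
from urllib.parse import urlsplit

CANONICAL = "http://mrcoles.com"  # preferred scheme and host from the post


def combined_rule(url):
    """One rule for both cases: return the redirect target, or None."""
    parts = urlsplit(url)
    path = parts.path or "/"
    has_www = re.match(r"www\.mrcoles\.com", parts.netloc) is not None
    has_index = re.search(r"/index\.html$", path) is not None
    if not (has_www or has_index):
        return None  # already canonical, no redirect
    # Keep everything before an optional trailing "index.html".
    kept = re.match(r"/?(.*?)(index\.html)?$", path).group(1)
    return CANONICAL + "/" + kept


def separate_rules(url):
    """Hypothetical alternative: www and index.html handled by two rules."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if re.match(r"www\.mrcoles\.com", parts.netloc):
        return CANONICAL + path  # first rule only drops the www
    if path.endswith("/index.html"):
        return CANONICAL + path[: -len("index.html")]  # second rule
    return None


def count_redirects(rule, url):
    """Follow the redirects produced by `rule` until the URL is stable."""
    hops = 0
    while (target := rule(url)) is not None:
        url, hops = target, hops + 1
    return url, hops


if __name__ == "__main__":
    for rule in (combined_rule, separate_rules):
        print(rule.__name__, count_redirects(rule, "http://www.mrcoles.com/index.html"))
```

In this sketch, http://www.mrcoles.com/index.html reaches http://mrcoles.com/ after one redirect with the combined rule and after two with the separated rules, while http://mrcoles.com/ is left alone.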

Furthermore, you can easily modify this to prefer www over no-www, or to redirect something like index.php instead of index.html:

RewriteCond %{HTTP_HOST} ^mrcoles\.com [OR]
RewriteCond %{REQUEST_URI} /index\.php$
# ^/? lets the same rule work in httpd.conf and in a document-root .htaccess.
RewriteRule ^/?(.*?)(index\.php)?$ http://www.mrcoles.com/$1 [NE,R=301,L]

If you find yourself a little confused by mod_rewrite, check out my simple way to understand mod rewrite.

Comments (3)

1. George wrote:

Thank you Peter.

Posted on 24 February 2010 at 11:02 PM

2. Alex wrote:

Will this method work using IIS Mod-Rewrite? http://www.micronovae.com/ModRewrite/ModRewrite.html

Posted on 26 April 2010 at 11:04 AM

3. peter wrote:

I don’t know. Let me know if you give it a try and it does (or doesn’t).

Posted on 26 April 2010 at 2:04 PM

Peter Coles is a software engineer who lives in NYC, works at Hunch/eBayNYC, and blogs here.