404 Error Pages and SEO
404 error pages have recently gained some attention in the SEO community, thanks to Google’s announcement that Webmaster Tools now shows where your 404 errors come from. Google’s celebrity blogger Matt Cutts pointed out that webmasters should start using this new tool to find the sources of 404 errors and either:
- Get in touch with the originating site and ask them to change the link to point to the correct page or;
- 301 redirect users to the correct URL or an equivalent resource (well, he doesn’t quite say that, but almost)
This new report within Google Webmaster Tools makes debugging 404 errors easier for webmasters, but this is not the first time such data has been available. Savvy webmasters have been dealing with such errors in the ways listed above for years, finding this information by analysing their log files or using analytics tools. You can even view this information from within Google Analytics, if you set it up correctly.
However, detecting and redirecting 404 errors is only one small part of the problem with invalid, expired or moved content. There are several major pitfalls, and sites need to set up a strategy to deal with these issues effectively. My suggestion would be the following steps:
Check your 404 pages are configured correctly
404 pages range from the downright ugly “404 Page Not Found” type errors that we’ve all seen, to the more friendly and helpful customised error pages that you see on sites like Twitter.com. A vital function of these pages is to tell both users and spiders that the page does not exist. To tell spiders, the server should return a 404 status code in the HTTP response for that page.
Use an HTTP header analyzer like the one at SEO consultants to check the response you get back from the server for an invalid URL. These tools will return something like the block below, which was the response for the non-existent URL http://www.jamiedigi.com/dsfadsff:
#1 Server Response: http://www.jamiedigi.com/dsfadsff
HTTP Status Code: HTTP/1.1 404
Date: Sun, 02 Nov 2008 20:12:35 GMT
Server: Apache
Cache-Control: no-cache, must-revalidate, max-age=0
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Pragma: no-cache
X-Pingback: http://www.jamiedigi.com/xmlrpc.php
X-Powered-By: PHP/4.4.9
Last-Modified: Sun, 02 Nov 2008 20:12:36 GMT
Connection: close
Content-Type: text/html; charset=UTF-8
Look for the line “HTTP Status Code: HTTP/1.1 404”, which indicates a 404 error response. If you find this line then your server is configured correctly. If the status code is 301, 302, 200 or something completely different then you have a problem, and you need to get in touch with your technical resource to correct the issue. All invalid URLs should return a 404 error.
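If you would rather script this check than use an online tool, the sketch below shows one way to do it with Python's standard library. The function names `fetch_status` and `diagnose` are made up for this example, and the wording of the messages is my own. It uses `http.client` because that module does not follow redirects, so a 301 or 302 shows up as itself rather than as the status of the page it points to.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Return the raw HTTP status code for url, without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def diagnose(status):
    """Describe what a status code means when the requested URL is invalid."""
    if status == 404:
        return "OK: invalid URL returns 404"
    if status in (301, 302):
        return "Problem: invalid URL is redirected instead of returning 404"
    if status == 200:
        return "Problem: invalid URL returns 200, so the error page looks like real content"
    return "Problem: unexpected status %d" % status
```

To try it, call `diagnose(fetch_status(url))` with a URL on your own site that you know does not exist.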
Define content expiry rules
Many websites have content that expires. An example might be a retailer who stocks products that reach end of life after a certain length of time. In this case, the retailer has three options:
- Remove the products from the site: This is a wasted opportunity, as it means that any links to that product from around the web will return a 404 error, and any rankings that the page has in the search engines will disappear. Also if affiliates are promoting the product then their links will no longer work.
- Keep the products on the site: If a product has reached end of life then the retailer could keep it listed on the site, with a prominent message saying that it is no longer available and you can get x instead, where x is a newer edition or an equivalent product. This means that the page will stay ranked in the search engines, and existing links from third parties will still work, while traffic arriving at this page will be referred to a newer product, still giving the user a chance to buy.
- 301 redirect to another page: Redirecting the user to another page on the site using a 301 redirect means that external links still work but point to the new page, that search engine link equity from the old page is passed to the new page, and that the user doesn’t get taken to a “page not found” error.
Which of the methods above you use will depend a lot on the sort of site you run and how relevant it is to redirect expired content to another page, category or product. Some of the more advanced, modern CMS systems will allow you to define a “cascade-down” system, whereby administrators can define an alternative product when an item comes to end of life, and the CMS will 301 redirect requests from the old product to the new product. If there is no alternative product defined, then the CMS will 301 redirect requests from the old product to the nearest product category. If the category no longer exists then the CMS will return a 404 error page, or 301 redirect the request to the site home page.
Define rules for moving content
Moving content can cause severe damage to your natural search rankings. If you change a significant number of URLs on your site (because of a site restructure, or a new CMS system) then you will see natural traffic drop dramatically. To minimise this impact, either:
- Work with your CMS system to keep the URLs the same (the best choice by far) or;
- 301 redirect requests from all old URLs to the newest equivalent
If you choose the 301 route then it is vital that you redirect each old URL to its nearest equivalent URL in the new structure, not just to your home page or another category page. There should ideally be a one-to-one mapping between each old page and each new page on the site.
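Both the “cascade-down” rule for expired products and the one-to-one mapping for moved URLs come down to the same question: given an old URL, what is its nearest live equivalent? The sketch below is one way that lookup might work, not how any particular CMS implements it. The function name `resolve`, the dictionaries and every path in the example data are hypothetical. Where the nearest category is also gone, this version returns a 404 rather than redirecting to the home page.

```python
def resolve(path, moved, replacement, category, live):
    """Return (status, location) for a requested path.

    moved:       old path -> new path after a restructure (one-to-one map)
    replacement: expired product path -> alternative product path
    category:    product path -> its nearest category path
    live:        set of paths that currently exist on the site
    """
    if path in live:
        return 200, path
    # Try the candidates in order of preference; skip any that are missing
    # or that no longer exist themselves.
    for candidate in (moved.get(path), replacement.get(path), category.get(path)):
        if candidate in live:
            return 301, candidate
    return 404, None


# Hypothetical example data.
live = {"/products/widget-v2", "/category/widgets", "/about-us"}
moved = {"/about.html": "/about-us"}
replacement = {"/products/widget-v1": "/products/widget-v2"}
category = {
    "/products/widget-v1": "/category/widgets",
    "/products/gadget-v1": "/category/widgets",
    "/products/gizmo-v1": "/category/gizmos",
}

print(resolve("/about.html", moved, replacement, category, live))
print(resolve("/products/widget-v1", moved, replacement, category, live))
print(resolve("/products/gadget-v1", moved, replacement, category, live))
print(resolve("/products/gizmo-v1", moved, replacement, category, live))
```

In this data the first three requests resolve to a 301 (moved page, alternative product, nearest category), and the last falls through to a 404 because its category no longer exists.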
404 error monitoring
Whether you use the 404 error monitoring tool in Google Webmaster Tools or your log files or analytics package, you need to set up a regular report showing what 404 errors occur on your site, how often people are visiting those URLs and where those visitors are coming from. If a non-existent page is getting more than the occasional visitor, you need to work out a plan to deal with that error: either point the visitor in the right direction with a 301 redirect, or ask the originating site to change its link. If it is an internal link that is incorrect then this needs to be detected and fixed quickly.
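If you go the log file route, a report like this can be produced with a short script. The sketch below assumes your server writes the common “combined” access log format, with the request, status code, response size and referrer in that order. The sample lines, addresses and the function name `report_404s` are invented for illustration, so adjust the pattern if your log format differs.

```python
import re
from collections import Counter

# Matches: "METHOD /path PROTOCOL" STATUS SIZE "REFERRER"
LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)


def report_404s(lines):
    """Count 404 responses per (path, referrer), most frequent first."""
    hits = Counter()
    for line in lines:
        match = LINE.search(line)
        if match and match.group("status") == "404":
            hits[(match.group("path"), match.group("referrer"))] += 1
    return hits.most_common()


# Made-up sample lines in combined log format.
SAMPLE_LOG = [
    '203.0.113.5 - - [02/Nov/2008:20:12:35 +0000] "GET /old-page HTTP/1.1" 404 512 "http://example.org/links.html" "Mozilla/5.0"',
    '203.0.113.9 - - [02/Nov/2008:20:13:01 +0000] "GET /old-page HTTP/1.1" 404 512 "http://example.org/links.html" "Mozilla/5.0"',
    '203.0.113.7 - - [02/Nov/2008:20:14:10 +0000] "GET /typo HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [02/Nov/2008:20:14:12 +0000] "GET / HTTP/1.1" 200 8042 "-" "Mozilla/5.0"',
]

for (path, referrer), count in report_404s(SAMPLE_LOG):
    print(count, path, referrer)
```

A referrer of "-" means the log recorded no originating page, which usually points to a typed or bookmarked URL rather than a broken link you can ask someone to fix.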
In addition to monitoring 404 errors on your own site, it is often worth running a spider like Xenu’s Link Sleuth across your site to detect invalid links pointing from your site to other sites. Excessively pointing to invalid content on other sites can cause your site to be penalised.
Conclusion: Strategise!
By defining a way to regularly work through the steps above you can be sure that you are making the most of the content that you put on your site, not missing any opportunities for free links, and not making the job of ranking in the natural search engines any harder for yourself.

