As an SEO copywriter, I sometimes find the concepts governing the altogether more complicated world of technical SEO abstract and strange. Often, as was the case with the canonical tag, they involve things I haven’t encountered before when surfing the web.
For my latest stab at technical SEO, however, I’ve decided to take a look at something that I see ridiculously often online: 404 pages.
To the common man, the 404 page is a minor irritant, a sign that maybe we’ve misspelt a URL or that a page we once loved no longer exists. It’s a pain, but we move on.
In SEO, however, the appearance of a 404 status code can have more dire consequences. A page that returns a 404 status code won’t be indexed by search engines and doesn’t pass any link equity. That’s good in a sense, if you don’t want a search engine to index an old page.
However, if a 404 is being returned when it shouldn’t be, or someone has linked to a page with a slight misspelling, you could potentially be losing out on a lot of linking power. And that, as we all know, isn’t good.
404s are also important when it comes to user experience. They can be helpful in the sense that a custom 404 page can point a user towards an actual live page on the site, but too many 404s can make for a frustrating experience.
What Problems Can 404s Present?
404s can cause a lot of issues, especially if the page shouldn’t be returning a 404. A site with too many 404s is generally thought of as being lower quality, and SEO AJ Kohn suggests that a high ratio of 404s to actual pages could well be a negative ranking factor.
If a page that previously had a lot of link equity returns a 404, then that link power is lost completely. Should that page have hosted a time-specific piece of viral content that’s now irrelevant but drew a lot of links at the time, that’s a lot of work to be throwing down the pan for the sake of a status code.
Strangely, 404s can also present a problem if they don’t actually return a 404 status code. Say you take a page down and replace it with your custom 404 graphic, but the page still returns a 200 status code (what Google calls a ‘soft 404’). A crawler like Googlebot will see that page as still being active and potentially penalise you for thin or duplicate content (if other pages intended to be 404s are also returning the 200 status code).
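If you want to test for this yourself, one rough approach is to request a URL that can’t possibly exist and see what comes back. The Python sketch below does just that, assuming nothing about your site beyond its base URL; the function names (`fetch_status`, `probe_url`, `has_soft_404`) and the random probe path are my own inventions for illustration, not part of any SEO tool.

```python
import urllib.error
import urllib.request
import uuid


def fetch_status(url):
    """Return the HTTP status code for url, including error codes such as 404."""
    request = urllib.request.Request(url, headers={"User-Agent": "soft-404-check"})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is still what we want.
        return error.code


def probe_url(base_url):
    """Build a URL that should not exist on the site (random, hypothetical path)."""
    return base_url.rstrip("/") + "/" + uuid.uuid4().hex


def has_soft_404(base_url, fetch=fetch_status):
    """True if a page that cannot exist is answered with a 200 instead of a 404."""
    return fetch(probe_url(base_url)) == 200
```

Calling `has_soft_404("https://example.com")` (swap in your own domain) returns `True` when the made-up page comes back as a 200. Because `urlopen` follows redirects, a site that bounces every missing URL to its homepage will also be flagged, which seems reasonable given that a crawler sees live content there too.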
As previously alluded to, too many 404s can also make for a poor user experience, particularly if the page returning the 404 is still linked to on the main site.
How to Find 404 Pages
Finding out which of your pages return a 404 status code is relatively easy thanks to Google’s ever-trusty Webmaster Tools. Webmaster Tools lists the URLs on which Googlebot has run into a 404, while ‘Fetch as Googlebot’ shows you how Googlebot sees a given URL.
Beyond Webmaster Tools, you could also use browser plug-ins like HTTPFox and Live HTTP Headers. As well as 404s, these plug-ins show you the status code returned by every request your browser makes as you click around the site.
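If you already have a list of URLs to check (from a sitemap or a crawl export, say), a short script can do the same job in bulk. This is only a sketch using Python’s standard library; `fetch_status` and `group_by_status` are names I’ve made up, and the URLs you feed it are up to you.

```python
from collections import defaultdict
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the status code for url, using a HEAD request to skip the body."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            # urlopen follows redirects, so a redirected URL is reported
            # with the status of its final destination.
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def group_by_status(urls, fetch=fetch_status):
    """Group URLs by the status code they return, e.g. {404: [...], 200: [...]}."""
    groups = defaultdict(list)
    for url in urls:
        groups[fetch(url)].append(url)
    return dict(groups)
```

Something like `group_by_status(my_urls).get(404, [])` then gives you the list of 404s to work through. Bear in mind that some servers treat HEAD requests differently from normal page loads, so treat the results as a first pass.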
How to Solve Problematic 404s
The first step in solving problematic 404s is to determine whether the page should actually be returning a 404 status code. It sounds horribly obvious, but it can save you from wasting time on unnecessary 301 redirects. If the page holds no link value, doesn’t have a lot of traffic and the URL can’t be redirected to a similar page, then returning a 404 would be entirely justifiable.
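To make that decision a little more concrete, here’s how I might sketch it in Python. Everything in it is an assumption for illustration: the `DeadPage` fields, the `triage` function and especially the thresholds, which are arbitrary and should be set to whatever counts as ‘a lot’ for your site.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeadPage:
    url: str
    inbound_links: int                      # external links pointing at the URL
    monthly_visits: int                     # traffic still arriving at the URL
    similar_live_url: Optional[str] = None  # closest live equivalent, if any


def triage(page, min_links=1, min_visits=100):
    """Suggest '404', '301' or 'review' for a dead page (thresholds are arbitrary)."""
    has_value = (page.inbound_links >= min_links
                 or page.monthly_visits >= min_visits)
    if not has_value:
        # No links and little traffic: a 404 is entirely justifiable.
        return ("404", None)
    if page.similar_live_url:
        # There is value to keep and a sensible place to send it.
        return ("301", page.similar_live_url)
    # Valuable but nowhere obvious to redirect: needs a human decision.
    return ("review", None)
```

The ‘review’ outcome is deliberate: a page with links but no natural successor is exactly the case where a blanket rule tends to go wrong.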
For instances where a URL previously returned content similar to that of an existing page, use a 301 redirect to send users to the current page and pass the previous link value to the existing live page. This can also be applied to misspellings of URLs, although the best practice is to only apply redirects to common spelling errors (e.g. frequently missed letters).
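How you set up a 301 depends entirely on your server or CMS, but the underlying idea is usually just a lookup table of old paths and their new homes. Here’s a minimal Python sketch of that idea; the paths in `REDIRECTS` and the `resolve` function are hypothetical examples rather than anything from a real site.

```python
# Old or commonly misspelt paths mapped to the live page that replaces them.
# These entries are made-up examples.
REDIRECTS = {
    "/old-guide": "/guide",
    "/giude": "/guide",  # common misspelling of /guide
}


def resolve(path, redirects=REDIRECTS):
    """Return a 301 status line and Location header for a known old path, else None."""
    # Treat '/old-guide/' and '/old-guide' as the same path.
    key = path.rstrip("/") or "/"
    target = redirects.get(key)
    if target is None:
        return None
    return ("301 Moved Permanently", [("Location", target)])
```

A request for `/giude` would be answered with a 301 pointing at `/guide`, while anything not in the table returns `None` and is left for the site to handle normally. Keeping the table short and explicit fits the advice above about only redirecting common errors.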
One method commonly pursued by SEOs is to redirect all 404 pages to the homepage. In one way, this is actually quite a good idea, redirecting traffic and links to old pages back to a page that will always exist so long as the website does. However, it can prove a frustrating and confusing experience for the user, with unrelated URLs booting them back to the homepage.
In this instance, I would suggest only redirecting pages for which the lost value in terms of traffic and links is significant and leaving other pages as 404 errors, with links (that won’t pass value) back to various pages on the site. This might seem counter-intuitive in SEO terms, but it gives the user a choice of where to navigate to next; it’s surprising how endearing users find this freedom of choice when forming an initial opinion of your site!
The same goes for custom 404 pages that return 200 status codes: it looks bad and it’ll eat up your ‘crawl quota’, so make sure all pages that appear to be 404 pages return the 404 status code. There is one exception that I can think of, however: if you happen to have an impressive custom 404 page, leave one page as a 200. This page could potentially draw links and go viral, so it’s worth making the most of that link equity!
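For the curious, this is roughly what ‘a custom 404 page that really returns a 404’ looks like under the bonnet. It’s a bare-bones Python WSGI app written purely for illustration: `LIVE_PAGES`, `NOT_FOUND_BODY` and `app` are made-up names, and a real site would handle this in its server or CMS settings instead.

```python
# Hypothetical live pages; anything else is treated as missing.
LIVE_PAGES = {
    "/": b"<h1>Home</h1>",
    "/blog": b"<h1>Blog</h1>",
}

# A friendly custom 404 body that gives the user somewhere to go next.
NOT_FOUND_BODY = (
    b"<h1>Sorry, that page has gone</h1>"
    b'<p>Try the <a href="/">home page</a> or the <a href="/blog">blog</a>.</p>'
)


def app(environ, start_response):
    """Serve live pages with a 200 and everything else with a genuine 404."""
    path = environ.get("PATH_INFO", "/")
    body = LIVE_PAGES.get(path)
    if body is None:
        # The important part: the friendly page goes out with a 404 status,
        # not a 200, so crawlers know the URL is really gone.
        status, body = "404 Not Found", NOT_FOUND_BODY
    else:
        status = "200 OK"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

To try it locally you could run it with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` from Python’s standard library and watch the status codes in one of the plug-ins mentioned earlier.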
And that’s 404 status codes – relatively simple but potentially devastating if used incorrectly. What are your thoughts on 404 pages? What 404 tricks have you used to your advantage? Let me know in the comments!