Screaming Frog and Its Uses

At some point in an SEO’s life there will be a need for a comprehensive site crawler that can also be customised. Introducing Screaming Frog, a desktop program that spiders websites for links, images, CSS and much more.

Chances are you have already heard of it and know that it can check for broken links, page titles and meta descriptions, which it does very well. But did you know that Screaming Frog can be used in a range of other situations, not just for SEO but for general website maintenance and web development too?

Below is a series of tasks that you can perform with Screaming Frog, each with a quick step-by-step guide.

Check Google Analytics code is installed on all pages of your site

  • Open up Screaming Frog and go to Configuration > Custom and you will be met with the following popup:

  • Here you can instruct Screaming Frog to search the code of internal web pages and tell you whether a specific element exists or not. So grab your analytics ‘UA’ code (UA-xxxxxx-x) and paste it into both ‘Filter 1’ and ‘Filter 2’.
  • Now in the drop-down box for ‘Filter 2’ select ‘Does Not Contain’. Essentially you are now asking Screaming Frog to tell you which pages do not have Google Analytics installed via ‘Filter 2’ and which pages do have it installed via ‘Filter 1’.
  • Click ‘OK’, then in the main interface enter your URL and click ‘Start’.
  • Screaming Frog will now start flying through all your pages doing its normal checks and also checking whether Analytics is installed or not. To see the result of this click on the ‘Custom’ tab

  • Remember you need to toggle between Filter 1 and Filter 2 using the drop-down box, and Screaming Frog will give you the URLs that do and don’t have Google Analytics installed.
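
The two filters above boil down to a plain text search of each page’s source. Here is a minimal Python sketch of the same idea, assuming you already have each page’s HTML to hand. The function name `split_by_tracking_code`, the sample pages and the ID `UA-000000-0` are all made up for illustration; this is not how Screaming Frog works internally.

```python
def split_by_tracking_code(pages, tracking_id):
    """Split URLs into those whose HTML contains tracking_id and those that don't.

    pages: dict mapping URL -> HTML source (hypothetical input; in practice
    this would come from a crawl).
    """
    contains = []
    does_not_contain = []
    for url, html in pages.items():
        if tracking_id in html:
            contains.append(url)
        else:
            does_not_contain.append(url)
    return sorted(contains), sorted(does_not_contain)


# Made-up sample pages for illustration only.
sample_pages = {
    "http://www.site.co.uk/": "<script>_gaq.push(['_setAccount', 'UA-000000-0']);</script>",
    "http://www.site.co.uk/contact": "<html><body>No tracking here</body></html>",
}

tagged, untagged = split_by_tracking_code(sample_pages, "UA-000000-0")
print("Contains:", tagged)
print("Does not contain:", untagged)
```

The first list plays the role of ‘Filter 1’ (Contains) and the second the role of ‘Filter 2’ (Does Not Contain).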

Replacing an existing website

Imagine you have just rebuilt a site and the URLs have all changed. Well, this is where Screaming Frog is a godsend, not only for ensuring you pick up on all the URLs that need redirecting but also to ensure they are redirecting as they should be.

  • First of all, set Screaming Frog to spider your existing site. You may want to change the configuration and stop it from checking images, CSS, JavaScript, SWF files and external links (to do this go to Configuration > Spider and deselect the appropriate boxes).
  • Once complete, export the results into a CSV file using the ‘Export’ button and store this file somewhere safe as you will need it later. Of course, this file will also help you get your redirect rules in place, whether they are done ‘en masse’ or individually.
  • The next step can be done either once the new site is live or, even better, while the new site is on a test URL. At this point I am going to assume the site is on a test URL and you have all your redirect rules in place.
  • Go back to your CSV file and copy and paste all the URLs into a notepad file – they should look like this (http:// has to be present):

  • As above, we have assumed your soon-to-be-launched site is on a test server, so you will need to use find and replace in Notepad (shortcut: Ctrl+H) to swap the live domain for the test domain.

  • Save this file and head over to Screaming Frog. Now we need to give Screaming Frog this file to spider those URLs and ensure our redirects are in place. Go to Mode > List and you should see the following:

  • Hit ‘Select File’ and navigate to the text file you saved earlier. You will be presented with the following popup:

  • This confirms that Screaming Frog can read your URLs and recognises them. Hit ‘OK’ and then on the main interface click ‘Start’. You will see the crawl happen under the Response Codes tab.
  • Screaming Frog provides the header response codes for all the URLs you provided in the text file. This allows you to check that your redirects are in good working order and that you haven’t missed anything. Best of all, you are able to do this before putting the new site live and cocking up previous SEO work!
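
If the URL list is a long one, the find-and-replace step and the final check of response codes can also be scripted. The sketch below is only an illustration in Python: the test host `test.site.co.uk`, the function names and the sample response codes are all assumptions, and the codes themselves would really come from the Response Codes tab.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def swap_host(url, test_host):
    """Return url with its host replaced by test_host (path and query kept)."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, test_host, parts.path, parts.query, parts.fragment))


def group_by_status(results):
    """Group (url, status_code) pairs into a dict of status_code -> list of URLs."""
    grouped = defaultdict(list)
    for url, status in results:
        grouped[status].append(url)
    return dict(grouped)


# Old live URLs, as exported from the first crawl (made-up examples).
live_urls = [
    "http://www.site.co.uk/products",
    "http://www.site.co.uk/about?ref=1",
]

# Hypothetical test host; the output is one URL per line, ready for List mode.
test_urls = [swap_host(u, "test.site.co.uk") for u in live_urls]
print("\n".join(test_urls))

# Made-up response codes standing in for the crawl results.
crawl_results = [(test_urls[0], 301), (test_urls[1], 404)]
for status, urls in sorted(group_by_status(crawl_results).items()):
    print(status, urls)
```

Grouping by response code makes it easy to spot anything that is not redirecting as you expected.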

Blogger outreach

Note: You will need access to Followerwonk here

Blogger outreach is an increasingly popular method of link building for SEOs these days and, strangely enough, Screaming Frog can help. Essentially it is possible to use Screaming Frog to trawl a number of website URLs and check whether they accept guest posts.

In this scenario we are looking for food bloggers. To find these people we use the Followerwonk tool recently acquired by SEOmoz.

  • Do a simple search for ‘food blog’ on Followerwonk and export the results into a CSV file. In this case there are a lot of results (just over 6,000), some of which may be a little messy, but that’s fine for this example. The point is that whilst we have found over 6,000 food bloggers, it would be nice to know which of these would accept guest posts, if any.

Of course, a number of sites do not openly advertise that they accept such posts, so this method may be eliminating a lot of those. We can’t do everything for you!

  • The CSV file will have Twitter profiles with a whole host of information including URLs for websites (hopefully food blogs). Now it’s time to use some Excel skills and filter those URLs out of the spreadsheet and into a notepad file (out of the 6,000 URLs I am left with just under 5,000). You will have something that looks like this:

  • Save that file and head over to Screaming Frog. Go to Mode > List and select your Notepad file and as above Screaming Frog should confirm in a popup that it is able to read the URLs.
  • Now you need to set up some rules for Screaming Frog to adhere to when spidering each of these domains so head over to Configuration > Custom where you will be met with a popup.
  • The screenshot below shows how you could treat the different fields in that popup box.

  • Essentially you are asking Screaming Frog to check the URLs in the Notepad file for any instances of the above keywords. You can chop and change these as you please.
  • Hit ‘OK’ and in the main interface click ‘Start’. Depending on how many URLs you have this may take some time. Once complete, head over to the ‘Custom’ tab in Screaming Frog, where you will be able to see the results of the crawl by filter. In this example, out of the 5,000 URLs, I have (potentially):
    • 4 URLs that have ‘write for us’ on their site
    • 103 URLs that have ‘guest post’ on their site
    • 121 URLs that have ‘contribute’ on their site
    • 3 URLs that have ‘guest author’ on their site
    • 78 URLs that have ‘contributors’ on their site

Now you have a list of sites that could potentially accept guest posts. Of course, there is still a lot of leg work to do in terms of determining the quality of those sites, but for quickly finding out which sites might accept guest posts this is a great technique.
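
To make the logic of those five filters concrete, here is a rough Python sketch that runs the same keywords over some page source and lists the matching URLs per keyword. The `match_keywords` function and the sample pages are invented for illustration, and the matching here is case-insensitive, which is my assumption rather than a statement about how Screaming Frog matches.

```python
# The five phrases used as filters in the example above.
KEYWORDS = ["write for us", "guest post", "contribute", "guest author", "contributors"]


def match_keywords(pages, keywords):
    """Map each keyword to the sorted URLs whose HTML mentions it (case-insensitive)."""
    hits = {kw: [] for kw in keywords}
    for url, html in pages.items():
        text = html.lower()
        for kw in keywords:
            if kw in text:
                hits[kw].append(url)
    return {kw: sorted(urls) for kw, urls in hits.items()}


# Made-up pages standing in for crawled blogs.
sample_pages = {
    "http://blog-a.example/": "<a href='/write-for-us'>Write for us</a>",
    "http://blog-b.example/": "<p>We welcome a guest post from any contributors.</p>",
    "http://blog-c.example/": "<p>Recipes only.</p>",
}

results = match_keywords(sample_pages, KEYWORDS)
for keyword, urls in results.items():
    print(f"{len(urls)} URLs that have '{keyword}' on their site")
```

Bear in mind this is a crude text match, so a hit only tells you the phrase appears somewhere in the page, not that the site actually takes guest posts.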

Backlink Checking / Broken Linkbuilding

Every SEO has a list of links they have built for their client’s site or their own site. Screaming Frog makes it a doddle to go and check these links still exist.

  • Take the list of links and as above place them in a Notepad file ensuring each link starts with http://
  • In the custom configuration box, enter your domain (for example test.co.uk) under ‘Filter 1’, then do the same in ‘Filter 2’ but change the drop-down to ‘Does Not Contain’. Then set Screaming Frog to spider those URLs.
  • In the ‘Custom’ tab you will now be able to see which external sites still have your link on them and which don’t. This is a quick and easy check that enables you to go back and regain those links that have been lost.
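
The filter above is a text search for your domain, so it will also match a page that mentions the domain without linking to it. If you want something stricter, the Python sketch below is a variation on the same check that only counts real `<a href>` links. It is my own illustration rather than anything Screaming Frog does, and the class name, function name and sample pages are all made up.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_domain(html, domain):
    """Return True if any <a href> in html points at domain or a subdomain of it."""
    parser = LinkCollector()
    parser.feed(html)
    for href in parser.hrefs:
        host = urlsplit(href).hostname or ""
        if host == domain or host.endswith("." + domain):
            return True
    return False


# Made-up pages: one still links to us, one only mentions the domain in text.
sample_pages = {
    "http://blog-a.example/post": '<p>See <a href="http://www.test.co.uk/">this site</a></p>',
    "http://blog-b.example/post": "<p>We used to mention test.co.uk here.</p>",
}

still_linking = sorted(u for u, h in sample_pages.items() if links_to_domain(h, "test.co.uk"))
lost = sorted(u for u, h in sample_pages.items() if not links_to_domain(h, "test.co.uk"))
print("Still linking:", still_linking)
print("Link lost:", lost)
```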

URL issues

Screaming Frog can help identify problem URLs, spider traps and potential duplicate content.

  • Set Screaming Frog to crawl your site and export the results into a CSV file (you may want to configure it so that images, CSS, scripts are ignored)
  • Take this CSV file and use Excel filters to sort the URLs from A to Z
  • Now manually scan through the URLs and check for trailing slash issues, canonical URLs and infinite URLs. Examples of each are shown below:
  • Trailing slash
    • http://www.site.co.uk/products
    • http://www.site.co.uk/products/
  • Canonical
    • http://www.site.co.uk/index.php
    • http://site.co.uk/index.php
    • http://site.co.uk/
  • Infinite URLs
    • http://www.site.co.uk/www.site.co.uk/1

The above are all duplicate content issues. As long as these URLs are linked to in the forms shown above, Screaming Frog will find them, allowing you to identify and rectify the issues easily.
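
Scanning a sorted list by eye works fine for a small site, but on a big export you may want a script to do the first pass. Here is a minimal Python sketch that groups URLs which only differ by ‘www.’, a trailing slash or index.php, and flags URLs whose path repeats the host name. The normalisation rules and function names are my own assumptions for this example, so adjust them to suit your site.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalise(url):
    """Reduce a URL to a comparison key: drop 'www.', a trailing slash and index.php."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path
    if path.endswith("/index.php"):
        path = path[: -len("index.php")]
    path = path.rstrip("/")
    return host + path


def find_duplicates(urls):
    """Return only the groups where more than one URL shares a comparison key."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalise(url)].append(url)
    return {key: sorted(members) for key, members in groups.items() if len(members) > 1}


def has_repeated_host(url):
    """Flag URLs whose path contains the host name again (a possible infinite URL)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return bool(host) and host in parts.path.lower()


# The example URLs from the list above.
urls = [
    "http://www.site.co.uk/products",
    "http://www.site.co.uk/products/",
    "http://www.site.co.uk/index.php",
    "http://site.co.uk/index.php",
    "http://site.co.uk/",
    "http://www.site.co.uk/www.site.co.uk/1",
]

duplicates = find_duplicates(urls)
for key, members in duplicates.items():
    print(key, "->", members)

infinite = [u for u in urls if has_repeated_host(u)]
print("Possible infinite URLs:", infinite)
```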

There are a number of other uses for Screaming Frog and a lot of them can be done using methods similar to those above. So why not get Screaming Frog, try out the above tasks yourself and then put it into practice with some of the scenarios below:

  • Google Webmaster Tools provides a number of crawl errors which are sometimes out of date / incorrect. Use the software to get up to date statuses of these errors and then tackle them.
  • Use Screaming Frog to create XML sitemaps for your site. Remember the spider configuration can be very useful here.
  • Check your existing sitemaps for errors
  • Use Screaming Frog to check the migration of an existing site.
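
For the two sitemap tasks, it helps to know what a sitemap looks like underneath. The Python sketch below builds a bare-bones XML sitemap from a list of URLs and pulls the URLs back out of one, so you could paste them into a text file and crawl them in List mode. The function names are made up for this example, and it only covers the `<loc>` element of the sitemaps.org format.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def sitemap_locs(xml_text):
    """Extract every <loc> value from a sitemap XML string."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter("{%s}loc" % SITEMAP_NS) if loc.text]


# Made-up URLs standing in for a crawl export.
crawled = ["http://www.site.co.uk/", "http://www.site.co.uk/products"]

sitemap_xml = build_sitemap(crawled)
print(sitemap_xml)

# One URL per line, ready to save as a text file for List mode.
print("\n".join(sitemap_locs(sitemap_xml)))
```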

Can you share any other uses with us?

(By the way, since writing this it seems there’s a new update available for Screaming Frog; more info can be found here.)

Mithul Mistry
  • Written by Mithul Mistry on 8th February 2013 at 10:43
  • “Mithul Mistry is the head of digital marketing at Fluid Creativity.”