
Wednesday, 22 November 2017

How often should I check for broken links?

How often you should check for broken links depends on the percentage of broken links that is acceptable to you.

The lower the percentage of broken links you allow, the more frequently you should check and repair broken links.

With the tool Maintenance Frequency at a Glance you can find out how often you should check for broken links.

This tool is based on research into the half-life of links in a copy of the former Yahoo! directory.

Some findings are:
  • When 3% broken links is acceptable, you should check your site every month.
  • When you check your site every 3 months, you might expect 8% broken links.
  • When you check your site every 6 months, you might expect 16% broken links.
  • When you check your site every year, you might expect 29% broken links.
  • When you check your site every 2 years, you might expect 50% broken links.
I think that you should check for broken links at least every 3 months, although I often advise checking your site every month.
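
The percentages above are consistent with exponential decay and a half-life of about two years. Here is a minimal sketch in Python, assuming a half-life of exactly 24 months. The function name and the default value are my own illustration, not code from the tool.

```python
def expected_broken_fraction(months_between_checks, half_life_months=24.0):
    """Fraction of links expected to be broken after the given number of months."""
    # After one half-life, half of the links still work.
    surviving = 0.5 ** (months_between_checks / half_life_months)
    return 1.0 - surviving


for months in (1, 3, 6, 12, 24):
    pct = 100 * expected_broken_fraction(months)
    print(f"check every {months:2d} months -> about {pct:.0f}% broken links")
```

Rounded to whole percentages, this gives the same figures as the list above.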

For relatively small sites I advise Broken Links at a Glance. For larger sites I advise Xenu's Link Sleuth.

Happy broken link hunting,
Hans

--
ZOMDir.com is a dynamic directory and a wiki
Everyone is able to add a link in 10 seconds
To learn more view this Slideshare presentation

Saturday, 14 October 2017

Dead Link City - A comparison of 8 Free Online Link Checkers

The site DeadLinkCity.com is a test site used by DeadLinkChecker.com.

I used this site to improve the link checker Broken Links at a Glance. Now it is as good as DeadLinkChecker or, in my humble opinion, even better.

Take a look at these results.

1. Broken Links at a Glance
Tests 86 unique links, finds 75 broken links

2. DeadLinkChecker
Tests 95 links, finds 74 broken links

The difference between these link checkers is a link to the page http://www.deadlinkcity.com/disallowed/disallowed.html

This page should be blocked by robots.txt but isn't due to malformed code.

Instead of:
User-agent: *
Disallow: /disallowed/disallowed.html

this robots.txt file contains the code:
User-agent: *
Disallow: disallowed/disallowed.html

The difference of one "/" is the difference between blocked and not blocked. See for yourself what a difference a "/" makes with this Robots.txt Testing Tool.

You might also use this Robots.txt Test Tool to see live whether this "disallowed" page is really disallowed.

So DeadLinkChecker interprets the robots.txt file of DeadLinkCity incorrectly, at the time of writing this blog post.
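
If you want to try the effect of the missing "/" yourself, here is a small sketch with the robots.txt parser from the Python standard library. This is not the robots tester used by Broken Links at a Glance. The helper name is_allowed is my own.

```python
from urllib.robotparser import RobotFileParser

URL = "http://www.deadlinkcity.com/disallowed/disallowed.html"


def is_allowed(robots_txt, url, user_agent="*"):
    """Parse robots.txt content given as a string and test one URL against it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


well_formed = "User-agent: *\nDisallow: /disallowed/disallowed.html\n"
malformed = "User-agent: *\nDisallow: disallowed/disallowed.html\n"

print(is_allowed(well_formed, URL))  # rule starts with "/"
print(is_allowed(malformed, URL))    # rule lacks the leading "/"
```

This parser compares each rule as a plain prefix of the URL path, and a URL path always starts with "/". So only the well-formed rule blocks the page.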

3. Online Domain Tools
Tests 85 links, finds 68 broken links

This link checker misses the following broken links:
  • disallowed/disallowed.html
  • error-page.asp?e=401
  • images/missing-command-icon.jpg
  • images/missing-video-poster.jpg
  • missing-button-formaction.asp
  • missing-head-profile.txt
  • missing-html-manifest.txt

4. W3C Link checker
Tests 79 links, finds 68 broken links

This link checker partly misses the same broken links. The missed broken links are:
  • disallowed/disallowed.html
  • images/missing-input-src.jpg
  • missing-button-formaction.asp
  • missing-form-action.asp
  • missing-input-formaction.html
  • missing-object-classid.html 
  • missing-object-codebase.html 

5. Dr. LinkCheck
Tests 51 links, finds 46 broken links

6. BrokenLinkCheck
Tests 49 links, finds 43 broken links

7. Internet Marketing Ninjas
Tests 50 links, finds 39 broken links

8. WebToolHub
Tests 54 links, finds 9 broken links

Good luck with your links, choose your link checker tool wisely and don't forget to check your links on a regular basis.

Hans

Thursday, 12 October 2017

The half-life of a link is two years

The half-life of a link is two years. Better said, the half-life of an external link is two years.

That is, when you create a website today with 100 working external links and check it after two years with a broken link checker, you will discover that roughly 50 links are broken.


How do you know?

I can almost hear you thinking "How do you know?". Well, I will explain below.

In the past I copied as much data as possible from the Yahoo! directory. I did this because Yahoo! closed its directory, I had created a directory myself, and I wanted to analyse the links and structure of this famous directory.

On January 4, 2016 I analysed the data I had and concluded that 77% (or more exactly 76.8387682%) of the links were fine.

Recently (October 9, 2017) I analysed the data again. Now 42% (42.0219319%) of the links are fine.

Based on this data I concluded that on an average day 0.093670021% of external links break. That does not seem much. However, it adds up to a link rot percentage of 2.81% per month.
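
The daily percentage can be reproduced from the two measurements. The sketch below assumes that every day the same fraction of the remaining working links breaks. The variable names are my own.

```python
from datetime import date
import math

first_check = date(2016, 1, 4)
second_check = date(2017, 10, 9)
working_first = 0.768387682   # share of working links at the first check
working_second = 0.420219319  # share of working links at the second check

days = (second_check - first_check).days

# Share of the links that worked at the first check and still work at the second.
survival = working_second / working_first

# Assumption: a constant fraction of the remaining links breaks every day.
daily_rate = 1 - survival ** (1 / days)
half_life_days = math.log(0.5) / math.log(1 - daily_rate)

print(f"days between checks: {days}")
print(f"daily link rot: {daily_rate:.6%}")
print(f"half-life: {half_life_days / 365.25:.1f} years")
```

Under this assumption the daily rate matches the percentage mentioned above and the half-life comes out at roughly two years.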


Consequences

After half a year about one sixth of the links are broken.
After a year about 30% of the links are broken.
After two years 50% of the links are broken. Hence the half-life of a link is two years.

See also the graph below.



So when you think 3% broken links is acceptable, then you should check for broken links every month.

When 5% is acceptable, check every two months and when you think 10% is acceptable, check every 4 months for broken links.
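
The same calculation can be turned around: given the percentage of broken links you accept, how long can you wait between checks? Here is a minimal sketch, again assuming exponential decay with a half-life of 24 months. The function name is my own.

```python
import math


def months_between_checks(acceptable_broken_fraction, half_life_months=24.0):
    """Longest check interval in months that keeps link rot under the limit."""
    if not 0 < acceptable_broken_fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    surviving = 1 - acceptable_broken_fraction
    return half_life_months * math.log(surviving) / math.log(0.5)


for limit in (0.03, 0.05, 0.10):
    interval = months_between_checks(limit)
    print(f"{limit:.0%} acceptable -> check about every {interval:.1f} months")
```

The results are close to the rounded intervals of one, two and four months mentioned above.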

Be wise, and check and repair your links on a regular basis,
Hans

Update: After writing this blog post I discovered that the document "A longitudinal study of Web pages continued: a consideration of document persistence" states that the half-life of a random web page is about 2.0 years. Great, that's exactly what I concluded.

Monday, 9 October 2017

How DeadLinkCity improved "Broken Links at a Glance"

DeadLinkCity is a test site of www.deadlinkchecker.com.

The site contains several types of broken links, similar to httpstat.us/

At DeadLinkCity it is stated that:
(...) There are 74 known bad links in DeadLinkCity.com, and one additional link which should not be reported if the tool obeys robots.txt directives. (...) The perfect score is 74 - the closer the number of reported errors is to 74, the more accurate the tool is. (...)

When I tested DeadLinkCity with "Broken Links at a Glance", the link checker discovered 56 broken links. That is far from perfect.

I quickly discovered that "Broken Links at a Glance" only checked links mentioned in the href and src attributes.

So I decided to modify "Broken Links at a Glance". Now it also checks links which might be mentioned in the attributes:
  • action
  • archive
  • background
  • cite
  • classid
  • codebase
  • data
  • formaction
  • icon
  • longdesc
  • manifest
  • poster
  • profile
  • srcset
  • usemap

This list is based on these overviews of HTML 4 attributes and HTML 5 attributes.
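
To give an idea of what checking these attributes involves, here is a sketch with the HTML parser from the Python standard library. It is not the code of "Broken Links at a Glance". The class and function names are my own, and the file names in the usage example are only illustrations.

```python
from html.parser import HTMLParser

# href and src plus the additional attributes listed above.
URL_ATTRIBUTES = {
    "href", "src", "action", "archive", "background", "cite", "classid",
    "codebase", "data", "formaction", "icon", "longdesc", "manifest",
    "poster", "profile", "srcset", "usemap",
}


class LinkCollector(HTMLParser):
    """Collects the values of all attributes that may contain a link."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name not in URL_ATTRIBUTES or not value:
                continue
            if name == "srcset":
                # srcset holds comma-separated candidates like "a.jpg 1x, b.jpg 2x".
                for candidate in value.split(","):
                    parts = candidate.split()
                    if parts:
                        self.links.append(parts[0])
            else:
                self.links.append(value)


def collect_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


page = (
    '<form action="missing-form-action.asp">'
    '<button formaction="missing-button-formaction.asp">Go</button></form>'
    '<video poster="images/missing-video-poster.jpg"></video>'
    '<img srcset="a.jpg 1x, b.jpg 2x">'
)
print(collect_links(page))
```

A real link checker still has to resolve these values against the address of the page and request each of them.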

Retesting DeadLinkCity with "Broken Links at a Glance" gave the almost perfect score of 72 broken links found.

Nice, but it still missed the 3 URLs mentioned in the CSS code. So I updated "Broken Links at a Glance" so that it also checks for links in the CSS code. The tool now finds 75 broken links.
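
Links in CSS code are written as url(...). Here is a simple sketch of how they could be extracted with a regular expression. It is an illustration with names of my own, not the code of the tool, and a full CSS parser would handle more edge cases.

```python
import re

# Matches url(...) with optional single or double quotes around the address.
CSS_URL = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)


def css_links(css):
    """Return the addresses found in url(...) expressions, skipping inline data."""
    found = [match.group(2).strip() for match in CSS_URL.finditer(css)]
    return [link for link in found if link and not link.startswith("data:")]


# The file names below are made up for the example.
css = """
body  { background: url("images/missing-bg.png"); }
.logo { background-image: url( images/logo.gif ) }
@import url('print.css');
"""
print(css_links(css))
```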

That's one too many.

According to DeadLinkCity, disallowed/disallowed.html shouldn't be tested, because it is mentioned in their robots.txt file.

However, the robots tester I use is very strict. It does allow this link, because the path in the robots.txt rule doesn't start with a "/".

Hopefully "Broken Links at a Glance" will be added to the Comparison Table of DeadLinkCity soon (and their robots.txt file will be corrected).

Hans

Monday, 18 September 2017

How to fix broken links

When I detect a redirected or broken link with Broken Links at a Glance, I generally follow these steps:

  • Follow the link to analyse it 
  • Try another URL of the same site 
  • Try to contact the website owner
  • Search for an alternative webpage 
  • Remove the link


1. Follow the link to analyse it

Always follow the link to see what happens. 

When you get a 404 Page not found, go to the next step.

When you link to a website that doesn't exist anymore, you might get the error "Server not found".

If that's the case and the HTTP error code wasn't 503, assume this link is broken and go to step 4 or, depending on the situation, to step 5.

2. Try another URL of the same site

When you thought you linked to the homepage of a website and you got a 404 error, it is often easy to navigate to the new homepage of that website.

Update your webpage by replacing the old link with the new address of the homepage.

When you linked to a specific page, the information might still be available at another location.

So search that site for the same information.

When found, update your webpage by replacing the old link with the new address.

When not found, take the next step.

3. Try to contact the website owner

As you have probably experienced, websites aren't as static as you might want. However, that doesn't always mean that the information is gone.

Due to a website reorganisation the information you want to link to might be at another address. 

When you are not able to find it yourself, you might contact the website owner. Almost every website has a contact information page. 

If you can't find an e-mail address you might try the e-mail address info@websitename.com.

Make clear that you linked to a webpage with information regarding ... and ask what the new address is for this information because the webpage you linked to has vanished.  

4. Search for an alternative webpage

Linking out is good practice, so I prefer and advise you to keep linking. 

When all previous steps have failed, search for another webpage with the information you want to link to.

Simply use your favorite search engine and hunt for the information you want to link to.

Often you will find an alternative. 

When found, update your webpage by replacing the old link with the new address of the alternative webpage found. 

Otherwise, you should remove the link as described in the next step.

5. Remove the link

Too bad, the webpage you linked to doesn't exist anymore, and you can't find an alternative webpage to link to.

When that's the case you should update the webpage where you linked from and remove the link completely. 

Keep in mind that you might then have to rewrite your text.



That said, the process of fixing broken links is relatively straightforward.

For the best results, check your links on a regular basis, for example every month.

Finally, I would like to tell you something about redirected links and broken links.


Why check redirected links?

Redirected links are often indicated by the HTTP status code 301.

In general, an HTTP status code of the form 3XX indicates a redirected link.

Often people think, incorrectly, that a redirected link isn't a problem. Okay, sometimes it isn't a problem, but sometimes it is.

To find out you have to click these links to see where they redirect to.


From http to https

When the redirect is logical, for example from http:// to https://, it is advisable to update your webpage by replacing the old location (http://...) with the new location (https://...).

By doing this, the next time you check your website for broken links, you have to check fewer redirected links.
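
Whether a redirect is such a logical http to https change can also be tested automatically. Here is a small sketch. The function name and the example addresses are my own, and the rule "same host, same path, only the scheme differs" is my assumption of what "logical" means here.

```python
from urllib.parse import urlsplit


def is_https_upgrade(old_url, new_url):
    """True when the redirect only changes http:// into https://."""
    old, new = urlsplit(old_url), urlsplit(new_url)
    return (
        old.scheme == "http"
        and new.scheme == "https"
        and old.netloc.lower() == new.netloc.lower()
        and (old.path or "/") == (new.path or "/")
        and old.query == new.query
    )


print(is_https_upgrade("http://example.com/page", "https://example.com/page"))
print(is_https_upgrade("http://example.com/page", "https://shop.example.net/"))
```

All other redirects still need a look by yourself, as described in the sections that follow.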


Another system

It might happen that the website you link to now uses another content management system, with the side effect that old pages are redirected. When the redirect is logical, update your webpage and replace the old location with the new location. However, when the redirect isn't logical, consider this a broken link.


For sale or sold

It might happen that the website you link to is gone and a domain name speculator redirects your link to a "domain for sale" landing page. You should consider this a broken link.

It might happen that the website you link to is now owned by someone else who works in a completely different business. In that case you should also consider this a broken link.


Hijacked

It also might happen that the website you link to is hijacked and is now selling shoes instead of ... whatever you were linking to. In this case, too, you should consider this a broken link. You might consider warning the original website owner by sending a mail to info@websitename.com. Describe what you have discovered and ask politely to be informed when the website is restored, so that you can restore your link.



What are broken links?

Broken links are often indicated by the HTTP status code 404.

However, other status codes might also indicate a broken link. When the status code has the format 4XX or 5XX, the link is probably broken.

I have experienced that websites which respond with a 408 or a 500 HTTP status code might still work, although they may be a little slow.

When that's the case, you have to decide for yourself whether you consider this a broken link or not.

When you link to a small website which probably doesn't get many visitors, it might occur that the first time someone visits that website (the broken link checker) the response is slow, while at a second visit (you checking the links marked as broken) the response is reasonable.

A website which responds with a 503 HTTP status code is temporarily unavailable, for example because it is in maintenance mode. You might ignore this broken link for the moment, because you may assume that the website will work again in a few days.
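
The rules of thumb above can be summarised in a few lines of code. This is only a sketch of my advice, with a function name and wording of my own. It is not how any particular link checker classifies links.

```python
def classify_status(status_code):
    """Map an HTTP status code to the action suggested in this post."""
    if 200 <= status_code < 300:
        return "ok"
    if 300 <= status_code < 400:
        return "redirected: follow it and see where it leads"
    if status_code == 503:
        return "temporarily unavailable: check again in a few days"
    if status_code in (408, 500):
        return "possibly slow: visit the page yourself before deciding"
    if 400 <= status_code < 600:
        return "probably broken"
    return "unknown"


for code in (200, 301, 404, 408, 500, 503):
    print(code, "->", classify_status(code))
```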

Hope this helps,
Hans

Testing free broken link checkers for Windows (or how to separate 1 man from many boys)

After testing 8 free web-based broken link checkers, I thought it was time to test free broken link checkers for Windows.

Roughly the same requirements apply as in the earlier test. A broken link checker should:

  • scan as many links as possible
  • scan as fast as possible
  • give correct results
  • respect robots.txt
I selected 8 Windows-based broken link checkers, including SEO checking tools. 4 of them are very limited in the free edition, for example to checking 500 links. Of the remaining 4 Windows-based broken link checkers, only one is fast enough for me.

Xenu's Link Sleuth

Xenu's Link Sleuth is the only link checker which checks links at a reasonable speed. Of course this depends on your hardware, the configuration of Xenu's Link Sleuth and the site you check. In my test environment Xenu's Link Sleuth checks 9 links per second.

All other Windows-based link checkers are much, much slower. Number two checked 1 link per second. For other tools I recorded speeds like 1 link per 6 seconds and 1 link per 20 seconds.

If you have a relatively small site (less than 1,000 links to test) I recommend the free online link checker Broken Links at a Glance.

Hope this helps,
Hans

Thursday, 14 September 2017

8 free online broken link checkers compared (only 2 could be advised)

Broken links

Broken links on your website are very annoying and lower the user experience dramatically. Hence they probably lower your ranking in search engines as well.

It is good practice to check your site for broken links on a regular basis. Let's say every month.

You should then check and re-check your site until all links are fine.

Also check and update the redirected links. A redirected link is often no problem, but sometimes it redirects to a phishing or malware site.

Broken links and redirected links can be found by a broken link checker. A broken link checker analyses your site but isn't able to correct a broken link. That is a manual action which you have to do yourself.


Types of broken link checkers

There are a lot of broken link checkers available. These link checkers differ in characteristics like:
  • Platform (e.g. Windows, web-based, WordPress plug-in, Chrome plug-in, etcetera)
  • Price (e.g. free, initially free with a paid upgrade possible, paid)
  • Capabilities (e.g. link checker, SEO tool, link checker plus another check, maximum number of links that will be checked)
For this article I tested the 8 free generic online broken link checkers listed below.


Two online broken link checkers are fine

The two link checkers that can be recommended are Broken Links at a Glance and Dead Link Checker.

Here is why.


Test method

Before I tested the broken link checkers I made a list of characteristics of a fine broken link checker. A broken link checker should:

  • scan as many links as possible
  • scan as fast as possible
  • give correct results
  • respect robots.txt
  • be mobile friendly


Given this list I started to test the broken link checkers with some large news sites.

That was a good start, because it made clear that speed is more important than I had thought. Based on these tests 4 of the 8 link checkers failed.

The second test was about correct results. I tested this with https://httpstat.us/. This time 2 of the 4 remaining link checkers failed.

The two remaining link checkers both respect robots.txt.

Only Broken Links at a Glance is optimized for mobile, so in the end the results are:


1. Broken Links at a Glance

Broken Links at a Glance is intended for websites up to 1,000 unique links and checks 5 links per second on average. That's respectable, although there are other broken link checkers with better numbers.

However, Broken Links at a Glance respects the directives in the robots.txt file, is mobile friendly and is ad free. Broken Links at a Glance checks all links coded in a webpage, not only the visible and clickable links.


2. Dead Link Checker

Dead Link Checker checks the first 2,000 links on any website. That's second best. On average it also checks 5 links per second.

Like Broken Links at a Glance, Dead Link Checker is ad free, respects robots.txt and checks all links coded in a webpage.

Dead Link Checker isn't optimized for mobile. Besides that, you have to enter a code before you are able to start the test.


3. Dr. Link Check

The third place in this test is for Dr. Link Check. The free link checks of Dr. Link Check are limited to 1,000 links per website. On average it checks 10 links per second. That is the highest speed of all tested broken link checkers.

At this moment (September 2017) Dr. Link Check checks only "normal" page links. In version 2.0 which is coming soon they will not only check "normal" page links, but also verify links to images, style sheets, scripts and other resources required to properly display your website.

Dr. Link Check doesn't respect robots.txt and isn't optimized for mobile. Dr. Link Check is the only checker which doesn't allow you to check the same website again within 10 minutes.

When I tested the site https://httpstat.us/ with Dr. Link Check, it wasn't able to check all links.

I was in doubt whether I should recommend Dr. Link Check. This last test made me decide not to recommend this link checker.


4. Online Broken Link Checker

Of the tested link checkers Online Broken Link Checker has the highest limit of pages to crawl. There is a 3,000-page limit, however there are no limits on number of hyperlinks within those webpages. That's great however there are some downsides.

Online Broken Link Checker: 

  • is slower than the top 3 link checkers. On average it checks 2.5 links per second;
  • doesn't crawl subfolders / URLs with slashes;
  • has problems with https://httpstat.us/;
  • doesn't respect robots.txt;
  • isn't optimized for mobile.


5. Internet Ninja's Find Broken Links, Redirects & Site Crawl Tool

Internet Ninja's Find Broken Links, Redirects & Site Crawl Tool is the best of the rest. This tool crawls up to 1,000 pages of your website. However, it's slow. On average it needs more than a second to check a link.


6. W3C Link Checker

The W3C Link Checker has the option "Check linked documents recursively". When this option is checked, the recursion depth is in theory unlimited by default. In practice, however, fewer than 500 pages will be checked. The W3C Link Checker is slow. On average it needs more than two seconds to check a link.


7. Online Website Link Checker

The Online Website Link Checker crawls at most 500 pages for free. It has a nasty pop-up which can be closed after a few seconds, and it is slow.


8. Broken Link Checker

Broken Link Checker by WebToolHub doesn't mention a maximum number of pages or links that will be checked. That isn't very relevant, because this checker is slow. On average it needs more than a second to check a link.


In summary

Of the 8 tested free online broken link checkers, Broken Links at a Glance and Dead Link Checker are the best link checkers available.

As an alternative you might try Dr. Link Check or Online Broken Link Checker.

I hope this advice is useful for you.
Hans
