Monday, 30 July 2018

Linking out isn't about risk. It's about uncertainty.

Recently I wrote that you should check redirected links.

Robert Ludlum's The Ambler Warning explains exactly why this is important.

From the previous post, you might get the idea that it is an acceptable risk that one percent of the removed links point to hacked sites.

However, Caston, my favorite character in The Ambler Warning, explains the difference between risk and uncertainty. Let me quote Caston:

'This isn't about risk. That's what people like you never understand. It's about uncertainty. You think you can assign a probability metric to future events like this. For technical reasons we do that all the time. But it's bullshit - nothing more than a convention, an accounting conceit. Risk suggests measurable probability. Uncertainty is when likelihood of future events is simply incalculable. Uncertainty is when you don't even know what you don't know. Uncertainty is humility in the presence of ignorance.'

I don't know how sites get hacked or how it will be done in the future. (Will the hackers use artificial intelligence? Are hacking instructions for sale?) So I realise that the figure of 1% in the previous post is indeed bullshit.

However, the advice to check redirected links is still valid, although it was based on risk. I think the advice should be based on uncertainty.

Again, good luck with checking and repairing your broken and redirected links.

Monday, 23 July 2018

Why you should check redirected links (and 5 facts regarding broken links)

Everyone understands that broken external links harm the user experience.

Not everyone knows that the percentage of broken links on your website tells search engines how often you check for broken links.

A lot of broken links? Poor maintenance!

The half-life of an external link is two years. Hence the share of broken links indicates when you last checked and repaired them.

Repairing broken links on a regular basis is the difference between a neglected site and a smooth website.
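
To make that reasoning concrete, here is a minimal sketch. It assumes that links break at a constant rate (exponential decay), which the two-year half-life suggests but which is not measured here. The function names are made up for this illustration.

```python
import math

HALF_LIFE_YEARS = 2.0  # the half-life of an external link mentioned above


def surviving_fraction(years_since_check, half_life=HALF_LIFE_YEARS):
    """Share of links still working, assuming exponential decay (an assumption)."""
    return 0.5 ** (years_since_check / half_life)


def years_since_last_check(broken_fraction, half_life=HALF_LIFE_YEARS):
    """Invert the decay model: estimate the elapsed time from the broken share."""
    if not 0.0 <= broken_fraction < 1.0:
        raise ValueError("broken_fraction must be in [0, 1)")
    return -half_life * math.log2(1.0 - broken_fraction)
```

Under this assumption, a site where half of the external links are broken was last checked about two years ago, and a site with a quarter of its links broken roughly ten months ago.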

Redirected links are dangerous

Almost everyone thinks that redirected links won't do any serious harm. Believe me, that is a huge mistake. Redirected links are dangerous.

Recently I analyzed why links are automatically removed from ZOMDir.com (you will find the numbers below).

One of the main findings is that one percent of the removed links point to hacked sites.

How to find links to hacked sites?

Almost the only way to discover links to hacked sites is to follow the redirected link. When you see an online shoe store where you expected information about a fitness center, you know the site is hacked.

Of course there are a lot of redirected links. A lot of websites switch from http to https these days. Make life easier for yourself and link to the redirected site instead of the original site. Otherwise you have to check every redirected site again and again when you check and repair broken links.
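
A small sketch can help to sort the harmless redirects from the ones that deserve a look. It assumes you already know the original URL and the final URL after following the redirect. The function names and the labels 'same-site' and 'review' are invented for this example. It only makes a shortlist: a different host is no proof of a hack, and a page on the same host can be hacked too.

```python
from urllib.parse import urlsplit


def _bare_host(url):
    """Lower-case host name without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify_redirect(original_url, final_url):
    """Return 'same-site' when the host is unchanged (for example http to https),
    and 'review' when the redirect lands on a different host."""
    if _bare_host(original_url) == _bare_host(final_url):
        return "same-site"
    return "review"
```

For a 'same-site' result you can usually just update the link to the new address. A 'review' result is the case where you open the page and see whether the fitness center has turned into a shoe store.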

Other findings

1. Half of the broken and redirected links could be fixed. That is, the target website still works fine, although it has been rebuilt. Depending on the quality of the rebuilding team, you are redirected, get a neat 404 error, or get a blunt server error;

2. Ten percent of the broken links are due to programming errors. These vary from very slow loading webpages, to pages without any content, to incorrect database connections. So always check your entire site with a broken link checker when someone has changed something on your site;

3. Another ten percent of the broken links are due to the classic 404 page not found error. Most of the time this error indicates that the website was rebuilt and the rebuilding team didn't redirect the old webpage;

4. When site owners give up a domain name, around 25% of the old domains are parked by a domain broker;

5. 30% of the removed links are removed due to redirections. Nearly all redirected webpages redirect to a secure site.

Raw research results

Here are the raw results of my research.

Error codes of broken links

ZOMDir has a built-in link checker (similar to "Broken Links at a Glance") and ZOMDir keeps track of when and why a link is removed.

I analyzed the last 5000 removed links. These links were removed between July 30, 2016 and July 16, 2018.

On average, 7 links are removed from ZOMDir.com every day.
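
The daily average follows from the two dates and the number of links above. A quick check (the variable names are only for illustration):

```python
from datetime import date

removed_links = 5000
first_day = date(2016, 7, 30)
last_day = date(2018, 7, 16)

# Number of days between the first and the last removal in the sample
days = (last_day - first_day).days
average_per_day = removed_links / days

print(days, round(average_per_day, 2))  # prints: 716 6.98
```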

Manually removed links

These links were removed by hand, probably because they were added to an inappropriate category.

Manual        408     8%

Server side errors

Error 500    2361    47%
Error 503     169     3%
Error 502      10     0%
Error 504       1     0%

Redirections

Error 301    1027    21%
Error 302     374     7%
Error 303      25     1%

Client side errors

Error 404     479    10%
Error 400      30     1%
Error 401      30     1%
Error 402      20     0%
Error 403      18     0%
Error 410      16     0%

Security errors

When ZOMDir checks for broken links, it also checks whether the linked page is still safe to visit. When that's not the case, you will get the 604 error.

Error 604      32     1%
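
If you want to see the totals per group, the counts above can be added up with a few lines of code. This is just a sketch: the group names are my own labels for this example, and the numbers are copied from the tables above.

```python
# Removal counts per error code, copied from the tables above
counts = {
    "manual": 408,
    500: 2361, 503: 169, 502: 10, 504: 1,
    301: 1027, 302: 374, 303: 25,
    404: 479, 400: 30, 401: 30, 402: 20, 403: 18, 410: 16,
    604: 32,
}


def group_of(code):
    """Map a code to a group; 604 is ZOMDir's own security code."""
    if code == "manual":
        return "manual"
    if code == 604:
        return "security"
    return {3: "redirect", 4: "client", 5: "server"}[code // 100]


totals = {}
for code, n in counts.items():
    group = group_of(code)
    totals[group] = totals.get(group, 0) + n

grand_total = sum(counts.values())
shares = {g: round(100 * n / grand_total, 1) for g, n in totals.items()}
```

The counts add up to the 5000 analyzed links. Server side errors account for 50.8%, redirections for 28.5% (the "30%" in finding 5), client side errors for 11.9%, manual removals for 8.2% and security errors for 0.6%.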

Manually checked links

The latest 564 removed links were checked by hand. The formal error codes aren't that relevant here. I wanted to know in depth why a link was removed. Here are the findings:

Server not found          149  26%
Temporary error           110  20%
Redirected                101  18%
Page not found             76  13%
Domain parking             45   8%
Programming error          32   6%
Slow website (no response) 19   3%
End of free hosting        15   3%
Site in maintenance mode   10   2%
Hacked                      7   1%

Thanks for your attention, happy linking, and please check and repair your links on a regular basis.

