Posted by Kurtis
Hi, my name's Kurtis and I'm relatively new here at Moz. My official title is "Captain of Special Projects," which means I spend a lot of time browsing strange parts of the web, assembling metrics and inputting data in Google Docs/Excel. If you walk past my desk in the Mozplex, be warned, investigating webspam is on my task list, hence you may come away slightly traumatized by what you see. I ward off the demons by taking care of two cats and fondly remembering my days as a semi-professional scoundrel in Minnesota.
Let's move on to my first public project, which came about after Google deindexed several directories a few weeks ago. This event left us wondering if there was a rhyme to their reason. So we decided to do some intensive data collection of our own and try to figure out what was really going on.
We gathered a total of 2,678 directories from lists like Val Web Design, SEOTIPSY.com, SEOmoz's own directory list (just the web directories were used), and a few others, and then the search for clues began. Out of the 2,678 directories, only 94 were banned (not too shabby). However, there were 417 additional directories that had avoided being banned but had been penalized.
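To put those counts in proportion, here is a quick sketch of the arithmetic. The three numbers come straight from the paragraph above; the helper name `share` is just something made up for this illustration.

```python
def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


TOTAL = 2678      # directories gathered for the study
BANNED = 94       # directories reported as banned
PENALIZED = 417   # additional directories reported as penalized

if __name__ == "__main__":
    print(f"banned:    {share(BANNED, TOTAL)}%")
    print(f"penalized: {share(PENALIZED, TOTAL)}%")
    print(f"affected:  {share(BANNED + PENALIZED, TOTAL)}%")
```

That works out to roughly 3.5% banned and 15.6% penalized, or about 19.1% of the sample affected in some way.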
We defined "banned" as having no results in Google when a site:domain.com search is performed:
We defined "penalized" as meaning that highly obvious queries, including the directory's title tag or brand name, failed to bring up the directory near the top and returned it only deep in the results, if at all (and that this could be repeated for any internal pages on the site as well):
As you can see above, the directory itself is nowhere to be found despite the exact title query, yet it's clearly still indexed (as you can see below by performing a domain name match query):
At first, the data for the banned directories had one common trait: none of them had a visible toolbar PageRank. For the most part, this initial observation was fairly accurate. As we pressed on, the results became less consistent. This leads me to believe that it may have been a manual update rather than an algorithmic one, or at least that no clear pattern in public metrics distinguishes the directories that were penalized or banned.
That is not to say the ones left unharmed are safe from a future algorithmic update. In fact, I suspect this update was intended to serve as a warning; Google will be cracking down on directories. Why? In my own humble opinion, most of the classic, "built-for-SEO-and-links" directories do not provide any benefit to users, falling under the category of non-content spam.
Some directories and link resource lists are likely going to be valuable and useful long term (e.g. CSS Beauty's collection of great designs, the Craft Site Directory or Public Legal's list of legal resources). These are obviously not in the same world as those "SEO directories" and thus probably don't deserve the same classification despite the nomenclature overlap.
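To make the two definitions used in this analysis concrete (banned versus penalized), here is a rough sketch of the decision rule. It assumes the two observations were recorded by hand for each directory: how many results a site: query returned, and where the directory ranked for its own title or brand query. The function name, the parameter names and the `deep_rank_cutoff` of 10 are all hypothetical. We never fixed a numeric threshold for "deep in the results," and this sketch ignores the repeat check on internal pages.

```python
from typing import Optional


def classify(site_query_results: int,
             title_query_rank: Optional[int],
             deep_rank_cutoff: int = 10) -> str:
    """Label a directory as 'banned', 'penalized' or 'ok'.

    site_query_results: number of results for a site:domain.com search.
    title_query_rank:   position of the directory for its own title/brand
                        query, or None if it was not found at all.
    deep_rank_cutoff:   hypothetical threshold for "deep in the results".
    """
    # Banned: nothing from the domain is in the index.
    if site_query_results == 0:
        return "banned"
    # Penalized: still indexed, but missing or buried for an obvious query.
    if title_query_rank is None or title_query_rank > deep_rank_cutoff:
        return "penalized"
    return "ok"


if __name__ == "__main__":
    print(classify(0, None))   # nothing indexed
    print(classify(120, 57))   # indexed but buried
    print(classify(120, 1))    # ranks for its own name
```

The order of the checks matters: a banned directory will also fail the title query, so the site: check has to come first.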
Updated Directory List!
In the midst of the panic, a concerned individual brought to my attention that “half of our directories were deindexed” and wanted to know when we would be updating our list. If by half he meant 4 of the 228 we listed were banned and an additional 4 just penalized, then I’d have to agree. 😉 In any case, our list is now updated. Thanks for being patient!
Let's look at the data
We've set up two spreadsheets that show which directories were banned and/or penalized, plus a bit of data about each one. Please feel free to check them out for yourself.
Additional Data Analysis
Given the size and scope of the data available, we're hoping that lots of you can jump in and perform your own analysis on these directories, and possibly find some other interesting correlations. Because checking for bans and penalties is tedious and cumbersome, we likely won't be doing an analysis on this scale again in the very near future. But we may revisit it in 6-12 months to see if things have changed: whether Google is cracking down more, lifting some of the penalties/bans, or making any other notable moves.
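If you want a starting point for that kind of analysis, one simple approach is to compare the median of a metric across the banned and penalized groups. The sketch below assumes you have exported the spreadsheets to rows with a `status` column and a metric column such as `DA`. Both column names are assumptions rather than the actual spreadsheet headers, and the sample rows are made-up placeholders, not real data.

```python
from collections import defaultdict
from statistics import median


def median_by_status(rows, metric):
    """Return {status: median of metric} for rows that have a value."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(metric)
        if value in (None, ""):
            continue  # skip directories with no value for this metric
        groups[row["status"]].append(float(value))
    return {status: median(values) for status, values in groups.items()}


# Placeholder rows for illustration only (NOT taken from the spreadsheets).
sample = [
    {"status": "banned", "DA": "20"},
    {"status": "banned", "DA": "30"},
    {"status": "penalized", "DA": "40"},
    {"status": "penalized", "DA": ""},
]

if __name__ == "__main__":
    print(median_by_status(sample, "DA"))
```

Medians are used here instead of means so that a handful of very strong or very weak directories doesn't dominate the comparison.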
Changes were made to the list on Friday, June 1, 2012.
I look forward to your feedback and suggestions in the comments!
p.s. The Mozscape metrics (PA, DA, mozRank, etc) are from index 51, which rolled out at the start of May. Our new index, which was just released earlier today, will have more updated and possibly more useful/interesting data. If I have the chance, I'll try to update the public spreadsheets using those numbers.