You’ve probably heard about the “broken link building strategy” from the 166,000 SEO blogs out there, but no one really talks about how to find these broken links and how to leverage them to scale up your website’s search presence.
The most common answer you’ll find around the web is: use Ahrefs and find all “Broken links” for a specific domain, say Wikipedia. That absolutely gives you some data, but not 100% of it, since the Ahrefs index is small compared to the universe of domains. (Without any doubt, though, Ahrefs is the best tool available in the known or unknown universe if you want to analyze the backlinks of any website.)
To try out this method, I chose https://wordpress.org/themes/ as my seed domain because:
1. Free WordPress themes have “Credit or Powered By” links in their footer, and imagine how many people use free WordPress themes without knowing how to remove the footer credits.
2. Anyone using such a free theme is by default linking to the theme developer’s domain from every single page of their own blog. If you don’t know the value of footer links, I recommend you read this article by Glen Allsopp from Detailed.com, where he explains how 16 companies are dominating SEARCH using sitewide/footer links.
3. Every single landing page of a WordPress theme has a “Theme Homepage” link, which makes it easy to find domains that might have thousands of referring domains, since free themes have a crazy number of active installations.
Here is an example: https://wordpress.org/themes/twentynineteen/
To begin with this experiment, I took the obvious step:
I used Ahrefs and tried to get all of the domains that have an external link from any page whose URL contains https://wordpress.org/themes
Surprisingly, Ahrefs gave me data for only 1,876 unique URLs, when in fact Google tells me there are more than 709,000 theme pages on the WordPress Themes node.
So I decided to take matters into my own hands and crawl the whole WordPress Themes directory to find all external links from each individual WP theme page.
Now let me take a breath and introduce our Tony Stark of SEO, Subhash, who once told me that if I wanted the moon, he could bring it to me with his tech skills. I figured I couldn’t do much with the moon, so I asked him to crawl the whole WP themes directory instead. Within a few clicks, he found the index of all WordPress theme pages. (PS: He didn’t complete his engineering degree, but he is smart with reverse-engineering stuff lol)
Within an hour of running this crawler, we found exactly 18,198 WordPress theme pages. (I later noticed that WordPress has language-specific theme pages, so there was a huge overlap in external linking domains.) Still, 18,198 is better than the 1,876 that Ahrefs gave me.
Examples of pages that have the same external links:
https://br.wordpress.org/themes/twentynineteen/
https://wordpress.org/themes/twentynineteen/
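If you want to collapse those language-specific duplicates before counting anything, a minimal sketch could look like the one below. It assumes that every localized directory sits on a `<locale>.wordpress.org` subdomain with the same `/themes/<slug>/` path, which is what the two example URLs suggest; the function names are mine, not part of our crawler.

```python
from urllib.parse import urlparse


def canonical_theme_url(url: str) -> str:
    """Map a localized theme page to its wordpress.org equivalent."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Assumption: localized directories use <locale>.wordpress.org
    # and keep the same /themes/<slug>/ path.
    if host.endswith(".wordpress.org") and parsed.path.startswith("/themes/"):
        host = "wordpress.org"
    path = parsed.path if parsed.path.endswith("/") else parsed.path + "/"
    return f"https://{host}{path}"


def dedupe_theme_pages(urls):
    """Keep one canonical URL per theme, preserving first-seen order."""
    seen = set()
    unique = []
    for url in urls:
        key = canonical_theme_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


pages = [
    "https://br.wordpress.org/themes/twentynineteen/",
    "https://wordpress.org/themes/twentynineteen/",
]
print(dedupe_theme_pages(pages))
```

Both example pages end up as the single wordpress.org URL, so the theme is only counted once.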
But wait, how did you find the theme homepage links from a directory of WordPress themes?
Okay, so: in the WordPress SVN (the code repo of all themes in the WordPress theme directory), each theme has multiple versions, and each version contains a style.css file.
The style.css file has the author’s info, including their domain name. The theme author links to this domain from the theme footer.
The custom crawler goes through each of the themes listed in that repo, scans through the versions and finds the style.css file.
A one-line regex extracted the author URL from these style files and exported it to a .csv file.
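I can’t paste our crawler here, but the extraction step is easy to sketch. The snippet below is not our exact regex: it assumes the author’s domain lives in the standard “Author URI” header field of style.css, and the sample stylesheet, the function names and the CSV column names are all made up for illustration.

```python
import csv
import io
import re

# Assumption: the author URL sits in the "Author URI" header field of style.css.
# [ \t]* (instead of \s*) keeps an empty field from swallowing the next line.
AUTHOR_URI_RE = re.compile(
    r"^[ \t/*#@]*Author URI:[ \t]*(\S+)", re.IGNORECASE | re.MULTILINE
)

# Made-up stylesheet header, for illustration only.
SAMPLE_CSS = """/*
Theme Name: Example Theme
Theme URI: https://example.com/themes/example
Author: Example Author
Author URI: https://example.com/
*/
body { margin: 0; }
"""


def extract_author_url(style_css: str):
    """Return the Author URI value, or None if the header has none."""
    match = AUTHOR_URI_RE.search(style_css)
    return match.group(1) if match else None


def write_author_urls(rows, out_file):
    """Write (theme, author_url) rows as CSV, skipping themes without a URL."""
    writer = csv.writer(out_file)
    writer.writerow(["theme", "author_url"])
    for theme, css in rows:
        url = extract_author_url(css)
        if url:
            writer.writerow([theme, url])


buffer = io.StringIO()
write_author_urls([("example-theme", SAMPLE_CSS)], buffer)
print(buffer.getvalue())
```

In a real run you would pass an open file instead of the in-memory buffer and feed it the style.css of every theme the crawler fetched.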
After removing duplicate domains (authors with multiple themes, or multiple themes linking to GitHub repos), I still had 6,000+ domains left.
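The de-duplication itself is simple enough to sketch. This version is an assumption about how you might do it, not a copy of our tool: it reduces every URL to its hostname, drops a leading “www.”, and keeps the first occurrence. It does not merge subdomains (blog.example.com and example.com stay separate), which would need a public suffix list.

```python
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the lowercased hostname without a leading 'www.'."""
    if "://" not in url:
        url = "http://" + url  # urlparse needs a scheme to find the host
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def unique_domains(urls):
    """Collapse author URLs to unique domains, preserving first-seen order."""
    seen = set()
    result = []
    for url in urls:
        domain = domain_of(url.strip())
        if domain and domain not in seen:
            seen.add(domain)
            result.append(domain)
    return result


# Made-up author URLs: one author with two themes, two themes on GitHub.
author_urls = [
    "https://www.example.com/",
    "http://example.com/themes",
    "https://github.com/a/theme-one",
    "https://github.com/b/theme-two",
]
print(unique_domains(author_urls))
```

The two example.com URLs and the two GitHub repos each collapse to a single domain.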
Crawling the content of a website might be against its policy, so check its terms and conditions before you crawl any data. We do not recommend/promote this practice.
How do I check the quality of 6000 domains in one shot?
Well, we have custom-built tools that let us process the backlink data of 100k+ domains in a matter of minutes. After 1 minute and 27 seconds, I had a file with all the metrics (such as DR, Ahrefs Rank, Availability) for these 6,000+ domains, and I found exactly 7 domains that had more than 125 referring domains and a decent DR (maybe these 7 theme developers don’t know the value of links, or maybe they are too lazy to renew their domains).
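Our tools are private, so take this as a rough sketch of the final filter only, assuming you already have the metrics in hand. The `DomainMetrics` fields and the sample rows are invented for illustration; the only number taken from this case study is the 125 referring domains cut-off, and since I never put a number on “decent DR”, `min_dr` is left for you to choose.

```python
from dataclasses import dataclass


@dataclass
class DomainMetrics:
    domain: str
    referring_domains: int
    dr: float
    available: bool  # True if the domain can be registered right now


def shortlist(metrics, min_referring_domains=125, min_dr=0.0):
    """Keep available domains above both thresholds, strongest first."""
    picks = [
        m for m in metrics
        if m.available
        and m.referring_domains > min_referring_domains
        and m.dr >= min_dr
    ]
    return sorted(picks, key=lambda m: m.referring_domains, reverse=True)


# Invented sample rows, not real data from the crawl.
rows = [
    DomainMetrics("expired-theme-shop.example", 310, 41.0, True),
    DomainMetrics("still-registered.example", 900, 60.0, False),
    DomainMetrics("tiny-profile.example", 12, 5.0, True),
]
for pick in shortlist(rows):
    print(pick.domain, pick.referring_domains, pick.dr)
```

Only the first row survives here: the second is still registered and the third has too few referring domains.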
It took me another 2 minutes to buy all these domains.
Well, it’s been just 3 days since I got these domains in my account, and I’ll perform some experiments (maybe I’ll set up a PBN, or maybe I’ll redirect them to my money site). I’ll keep you posted about the results of this case study in this article itself.
But Hey! Wait!
This story is not over yet, since those 764 words I wrote above are just a HACK. Trust me, hacks are never scalable, and this made me ask myself:
Can I leverage the power of sitewide links from WordPress (or any other platform) to build something better?
I can’t give you an exact answer to this question, but here is my opinion on the topic. Go to Google and search for “best {category} WordPress themes”, for instance “free fitness WordPress themes”, and you’ll land on https://colorlib.com
Now it’s time for me to do a postmortem on their link profile vs. their performance using Ahrefs and SimilarWeb Pro.
Well, Ahrefs says colorlib.com has DR 92, with 63.8K unique domains pointing to them 59.9M times, and all you can say is “f**k, this Colorlib has nailed it”. But you’ll be surprised to see that most of their inbound links are “credit/footer” links with their brand keyword “colorlib”, coming from everyone who is using their themes.
With 1.5M organic visitors per month and $2.1M worth of search traffic, no doubt Colorlib has achieved things that normal companies can’t even think of.
FREE sells, and leveraging the power of FREE WORDPRESS THEMES seems to be working absolutely fine for Colorlib.
On top of that, I’ve realized that the themes Colorlib designs and develops are pretty good, so I don’t have any issue with “credit or footer” links in return for the good UI/UX.
However, I’m still asking Google this: if a webmaster had no intent of linking back to the theme developer, why does Google still count the value of links that were forcefully placed within the content of a domain?
Glen already covered this in his post “How 16 Companies are Dominating the World’s Google Search Results” almost two years back, and I can see the same strategy still working for a lot of big brands.
Here is one more example, which you’ve (probably) never heard of: Indiamart.com
If you don’t know Indiamart, it is an Indian B2B marketplace that started as a website design company back in 1999 and apparently survived the dot-com bubble (since they turned the website creation business into a lead gen business). They get approximately 10M organic visitors per month, and the company makes over 55M in annual revenue and is planning to file an IPO.
Indiamart has around 8.5M keywords, with 21.5K RD and over 17.8M backlinks (source: Ahrefs). This looks highly unreal and unnatural, as almost 95% of the websites pointing to their root domain were created by Indiamart itself.
Now if you want to backtrack how these 21.5K referring domains were created, you should check this page. And if you are wondering wtf a Mini Dynamic Catalog is, in simple words it’s a PRIVATE BLOG NETWORK. They themselves create a website for any manufacturer who gets listed on their platform and add their own link in the footer. So don’t you think this is something similar to legal robbery? Pretty much similar to a PBN?
I’m sure that a company making 50M+ in revenue, for whom SEARCH is one of the biggest user acquisition channels, has an SEO manager who knows the word NOFOLLOW! Don’t you think so?
I’m not sure why Google is not taking any action against this strategy when they already know everything, but I do agree with the comment from @themadentrepreneur on this reddit post:
The SERPs have become a caste system where the wealth is controlled by the top 1%, and everyone else is scrubs and peasants getting whipped and lashed by Google Policeman and thrown in Google Prison because they can’t compete with footer links on a private network of 10 mega authority sites. Honestly, I’ve never seen such a sickening and more atrocious website study in the history of my career as an SEO.
What does big G say about such link schemes?
Assuming that all of these big players are following the white-hat SEO practices listed by Google on this page, let’s see what the current scenario is.
One of their policy points says: Additionally, creating links that weren’t editorially placed or vouched for by the site’s owner on a page, otherwise known as unnatural links, can be considered a violation of our guidelines. Here are a few common examples of unnatural links that may violate our guidelines: Widely distributed links in the footers or templates of various sites
Now I’m not sure whether Google’s crawlers are unable to detect this “widely” factor or are just ignoring the case. Whatever it is, I’ve always found it bizarre when brands like Indiamart prove that “rules are meant to be broken”, and at the same time I remember a quote: “Everyone has a plan until they get punched in the mouth.”