In this post we’re going to look at a case study in improper canonicalization of URLs. The point of this article is to give you a real, in-action example of what not to do with the rel="canonical" tag, and to show you what happens when you don’t follow Google’s guidelines.
As Google’s article on URL canonicalization points out, the reason for canonicalization is to clarify a situation where you have multiple URLs that all point to a page that says the same thing. If you have this situation, you place a small bit of code in the head element of the lower-priority pages, pointing at the page you prefer. The code looks something like this:
<link rel="canonical" href="http://www.example.com/a-page.html"/>
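If you want to confirm which canonical URL a page actually declares, a minimal Python sketch along these lines can pull it out of the HTML. It uses only the standard library's html.parser; the CanonicalFinder class and find_canonicals function are names made up for this illustration, and the sample markup simply reuses the example URL above.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here.
        if tag != "link":
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html_text):
    """Return a list of canonical URLs declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonicals


# Hypothetical sample page built around the example tag above.
sample = """<html><head>
<title>A duplicate page</title>
<link rel="canonical" href="http://www.example.com/a-page.html"/>
</head><body></body></html>"""

print(find_canonicals(sample))
```

A page with no canonical tag returns an empty list, and a page that returns more than one entry is worth a second look.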
Reason for the Case Study
I found in a webposition rankings report that we had two pages ranking in the top 100 results for the search term “seo newsletter”. The higher ranking page was our newsletter feedback page (not the page we wanted ranking highest). Our target newsletter landing page was ranking a few results lower. We also had a third page with a similar title to these other two pages, though it wasn’t ranking in the top 100.
Rather than using our robots.txt to block Google (and other engines) from the two other pages that were competing with our target page, I figured I’d try a little experiment, one that actually goes against Google’s webmaster guidelines for the rel="canonical" tag: I added canonical tags to both competing pages, pointing them at the target page. There were two possible outcomes: 1) I get away with it and our target page goes up in rankings, or 2) it backfires and I have a great case study illustrating what not to do.
Facts About the Case Study
Target Search Term: SEO Newsletter
URL of Page Canonicalized #1: /help-improve-our-seo-newsletter – Title: Help Improve our SEO Newsletter
- On August 9th this page ranked #22 in Google for the target term.
- Page Rank 2
URL of Page Canonicalized #2: /newsletter-signup
- Not ranked in the SERP for the target term.
- Page Rank 1
Target Page URL: /contact-us/why-you-should-subscribe-to-our-seo-newsletter – Title: Sign Up for Our SEO Newsletter
- On August 9th this page ranked #24 in Google for the target term.
- Page Rank 1
The /newsletter-signup page was crawled first and was quickly de-indexed. This was expected, because we were telling Google to ignore this page and assign all its authority to the target page on the grounds that the two were copies of each other (which they weren’t). What happened next caught me off guard: the target page dropped to #55 in the SERP.
When the /help-improve-our-seo-newsletter page was crawled, it was also de-indexed and the target page took another fall all the way down to #77.
This means that Google followed my suggestion far enough to de-index the pages I had marked as duplicate content, but it didn’t heed my suggestion to give their authority to the target page. In fact, it penalized the target page because of the incorrect usage.
Below is the webposition ranking report graph showing the changes by date.
Another interesting point I noticed was the difference between how Google treated these changes compared to Bing and Yahoo. Take a look at the ranking timeline for them.
Basically, Google penalized me for using the canonical tag in a way it didn’t find accurate, but Bing and Yahoo did not. If you read up on the issue, you’ll find that these search engines have different policies regarding canonicalization, and the example above illustrates that clearly.
I wanted to see if I could take multiple pages with the same basic theme and point them at a desired landing page for the targeted term in order to boost the rankings of that page. The result was even worse than I expected. If you’re going to use canonical tags I suggest reading up on proper usage.
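One way to sanity-check yourself before adding the tag is to measure how alike the pages really are. The sketch below is only an illustration of that idea: the similarity and safe_to_canonicalize functions are hypothetical names, the 0.9 threshold is an arbitrary cut-off rather than anything a search engine publishes, and it compares just the two page titles from this case study, where a real check would compare the full page text.

```python
from difflib import SequenceMatcher


def similarity(text_a, text_b):
    """Return a 0.0-1.0 ratio of how alike two texts are, word by word."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    return SequenceMatcher(None, words_a, words_b).ratio()


def safe_to_canonicalize(text_a, text_b, threshold=0.9):
    """Treat two pages as duplicates only if they are nearly identical.

    The threshold is an assumed, illustrative value.
    """
    return similarity(text_a, text_b) >= threshold


# Titles of the canonicalized page and the target page from this case study.
feedback_title = "Help Improve our SEO Newsletter"
target_title = "Sign Up for Our SEO Newsletter"

print(round(similarity(feedback_title, target_title), 2))
print(safe_to_canonicalize(feedback_title, target_title))
```

The two titles share a theme but are far from copies of each other, so this check comes back False, which is exactly the situation the canonical tag is not meant for.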
As you might imagine, we removed the rel="canonical" tag from the head elements of those non-canonical pages. Now the burning question is, will the canonical page rebound? And if so, how long will it take? You can read the follow-up to this post to find out – Canonical Case Study Part 2: Did Google Forgive Us?