Duplicate content confuses search engines and has unwanted effects on search result rankings. At the 2009 Search Marketing Expo, Google introduced a new solution.
Let's look at what duplicate content is and why it is a problem. Duplicate content occurs when the same web page can be reached through different URLs. A simple example is a home page, which can often be viewed through the URLs site.com, www.site.com, and www.site.com/index.html. A search engine treats these as multiple copies of the same content, even though there is only one copy of index.html on the server. You won't be penalized for duplicate content, but it does cause trouble: internal link popularity is split among the duplicates, crawlers spend time fetching every duplicate, and your server carries extra load (think of the same page duplicated under thousands of different session IDs).
To eliminate duplicate content, you can do the following:
- Review your content management system and make sure that each piece of content is reachable through one consistent URL.
- Use 301 (permanent) redirects to send the other URLs to the consistent URL.
- Specify a consistent URL for each page in your Sitemap file.
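The first two steps above come down to mapping every variant of a URL to one consistent form and redirecting when a request uses any other form. Here is a minimal Python sketch of that idea. The preferred host, the default document, and the session parameter name `sessionid` are assumptions made for illustration; a real site would plug in its own values and issue the redirect from its web server or CMS.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

PREFERRED_HOST = "www.site.com"   # assumption: the www form is the preferred one
DEFAULT_DOCUMENT = "/index.html"  # assumption: matches the home page example
SESSION_PARAMS = {"sessionid"}    # hypothetical session parameter name


def consistent_url(url: str) -> str:
    """Map the known variants of a URL to one consistent form."""
    parts = urlsplit(url)

    # Treat the bare domain and the www form as the same host.
    host = parts.netloc.lower()
    if host == "site.com":
        host = PREFERRED_HOST

    # An empty path or "/" serves the default document.
    path = parts.path if parts.path not in ("", "/") else DEFAULT_DOCUMENT

    # Drop session parameters; keep every other query parameter in order.
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in SESSION_PARAMS]

    return urlunsplit((parts.scheme, host, path, urlencode(kept), ""))


def redirect_for(url: str):
    """Return (301, target) if the URL is not in consistent form, else None."""
    target = consistent_url(url)
    return None if target == url else (301, target)
```

With these assumptions, `redirect_for("http://site.com/")` asks for a 301 redirect to `http://www.site.com/index.html`, while a request that already uses the consistent URL needs no redirect.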
Unfortunately, the above methods cannot solve every case. This is where the Canonical Link Element comes in. Go back to the home page example. If you place a Canonical Link Element, for example
<link rel="canonical" href="http://www.site.com/index.html"/>
in the head section (between <head> and </head>) of index.html, the search engine will understand that you prefer the URL "http://www.site.com/index.html" to be displayed as your home page and that the other URLs are duplicates of it. Note that the href value includes the "http://" scheme; without it, the value would be read as a relative path.
The Canonical Link Element itself is quite simple, but there are some rules you need to follow.
- The domain name of the page being viewed and the domain name specified in the element must be the same.
- The page at the URL specified in the element should not itself specify a different URL in its own Canonical Link Element; chains of canonical references are not recommended.
- The page visitors see and the page specified in the element must be similar, but they do not have to match word for word. For example, a page that sorts products by price from lowest to highest and a page that sorts them from highest to lowest can be considered the same.
- The use of an absolute link is preferred over the use of a relative link.
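Two of these rules, the absolute URL and the matching domain, are easy to check mechanically. The Python sketch below is one possible check, not an official validator. The function name `canonical_warnings` is hypothetical, and comparing hosts after stripping a leading "www." is a simplifying assumption standing in for a proper same-domain test.

```python
from urllib.parse import urlsplit


def _bare_host(host: str) -> str:
    # Simplifying assumption: site.com and www.site.com count as one domain.
    return host[4:] if host.startswith("www.") else host


def canonical_warnings(page_url: str, canonical_href: str) -> list:
    """Return a list of rule violations for a canonical href (empty if none)."""
    target = urlsplit(canonical_href)

    # An absolute URL has both a scheme and a host. A value such as
    # "www.site.com/index.html" has neither and parses as a relative path.
    if not (target.scheme and target.netloc):
        return ["canonical URL is not absolute"]

    page_host = urlsplit(page_url).hostname or ""
    target_host = target.hostname or ""
    if _bare_host(page_host) != _bare_host(target_host):
        return ["canonical URL is on a different domain"]

    return []
```

For example, `canonical_warnings("http://site.com/", "http://www.site.com/index.html")` returns an empty list, while passing `"www.site.com/index.html"` as the href reports that the URL is not absolute.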
The Canonical Link Element was Google's idea, but Yahoo! and Microsoft have also announced that they will support it, so there is no need to worry about compatibility. One line of code will work on all major search engines.
© September, 2009