Saturday, April 4, 2009

Solution for content duplication

The content your site publishes is often reachable through multiple URLs. It happened to me while using Horde_Routes (which is a really nice way to organize your URLs with Seagull), and there were no easy workarounds (an HTTP redirect is not a very nice solution).

This can cause problems with search engines, because the "score" of that particular content is spread across multiple URLs. Your content gets lower relevance in search results, which is definitely not what you want.

Now there is a nice solution for this issue: the canonical URL link.

Basically you just have to add
<link rel="canonical" href="http://someurl/comes/here" />

to the HEAD section and you are done. It tells the search engines which URL to associate with the content. All major search engines support canonical URL links.

With Seagull you have to modify www/yourtheme/default/header.html and add a line like

<link flexy:if="canonicalUrl" rel="canonical" href="{canonicalUrl}" />

which means that if you set $output->canonicalUrl in your controller, it will find its way to the search engines.
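
For illustration, here is a rough sketch of how the value could be built before it is assigned. Nothing in it is Seagull API: buildCanonicalUrl() is a hypothetical helper, the example.com URL is a placeholder, and a plain stdClass stands in for the $output object your controller receives. It also assumes the query string does not identify the content, so it gets stripped; adjust that if your URLs work differently.

```php
<?php
// Hypothetical helper (not part of Seagull): builds one preferred URL for a
// piece of content, no matter which route the visitor arrived through.
function buildCanonicalUrl(string $baseUrl, string $path): string
{
    // Assumption: query string and fragment do not identify the content,
    // so everything from the first "?" or "#" onwards is dropped.
    $path = preg_replace('/[?#].*$/s', '', $path);

    // Join base and path with exactly one slash and no trailing slash.
    return rtrim($baseUrl, '/') . '/' . trim($path, '/');
}

// Stand-in for the $output object that is handed to your controller.
$output = new stdClass();

// Placeholder base URL and path, used here only for the example.
$output->canonicalUrl = buildCanonicalUrl(
    'http://www.example.com/',
    '/articles/view/42/?page=1'
);

echo $output->canonicalUrl, PHP_EOL;
```

Whatever route the request came in through, the same string ends up in {canonicalUrl}, and when you leave the property unset the flexy:if attribute keeps the link tag out of the page.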