Showing posts with label seagull. Show all posts

Saturday, April 4, 2009

Solution for content duplication

The content your site publishes is often reachable through multiple URLs. It happened to me while using Horde_Routes (which is a really nice way to organize your URLs with Seagull), and there were no easy workarounds (an HTTP redirect is not a particularly nice solution).

This can cause problems with search engines, because the "score" of that particular content is spread across multiple URLs. Your content gets lower relevance in search results, which is definitely not what you want.

Now there is a nice solution for this issue: the canonical URL link.

Basically you just have to add
<link rel="canonical" href="http://someurl/comes/here" />

to the HEAD section and you are done. It tells the search engines which URL to associate with the content. All major search engines support canonical URL links.

With Seagull you have to modify www/yourtheme/default/header.html and add a line like

<link flexy:if="canonicalUrl" rel="canonical" href="{canonicalUrl}" />

which means that if you set $output->canonicalUrl in your controller, it will find its way to the search engines. If the variable is not set, the flexy:if attribute leaves the tag out of the page.
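To check that the tag really ends up in the rendered page, a small script can pull the canonical URL out of the HTML. The sketch below is in Python rather than PHP and is not part of Seagull. The names CanonicalFinder and find_canonical are made up for this illustration, and the page content is passed in as a plain string.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<link ... />) also arrive here by default.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the page, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    page = (
        "<html><head>"
        '<link rel="canonical" href="http://someurl/comes/here" />'
        "</head><body></body></html>"
    )
    print(find_canonical(page))
```

Fetching the same content through each of its duplicate URLs and comparing the results should give the same canonical URL every time.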

Thursday, October 23, 2008

Remove index.php from urls in Seagull with mod_rewrite

To make Seagull's URLs even more SEO friendly, you can "remove" index.php from them. This can be done relatively easily with Apache's mod_rewrite module.

Example of functionality:

http://www.example.com/index.php/default/maintenance/
=>
http://www.example.com/default/maintenance/

Apache configuration

You need to enable the mod_rewrite module first. Then you have to decide where to keep your rewrite rules. The options are: the global httpd.conf, a virtualhost file, or .htaccess files. I don't suggest .htaccess files, as Apache has to check for them on every request, and under higher load you don't want that. I keep the rules in virtualhost files. Let's see an example:

/etc/apache2/sites-available/www.example.com

<VirtualHost *>
    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/example.com/www
    ServerName www.example.com
    <Directory /var/www/example.com/www>
        RewriteEngine On
        RewriteBase /
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-l
        RewriteRule ^(.*)$ index.php/$1 [L]
    </Directory>
    # RewriteLog /var/log/apache2/example.com-rewrite

    ErrorLog /var/log/apache2/example.com-error_log
    CustomLog /var/log/apache2/example.com-access_log common
</VirtualHost>

The rewrite rules are applied to the directory that serves as the document root.

For each request:
- Check whether the requested URL maps to an existing file or symbolic link:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-l

- If it is neither, prepend index.php:
RewriteRule ^(.*)$ index.php/$1 [L]
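
The decision made by these rules can be sketched outside Apache. The Python function below is only an approximation for illustration: the name rewrite is made up, and it ignores details of real mod_rewrite processing such as the second pass over the rewritten URL. It checks the document root for a file or symlink and otherwise prepends index.php.

```python
import os
import tempfile


def rewrite(request_path, document_root):
    """Approximate the RewriteCond/RewriteRule pair for one request path."""
    relative = request_path.lstrip("/")
    target = os.path.join(document_root, relative)
    # RewriteCond %{REQUEST_FILENAME} !-f and !-l:
    # existing files and symlinks are served as they are.
    if os.path.isfile(target) or os.path.islink(target):
        return "/" + relative
    # RewriteRule ^(.*)$ index.php/$1 [L]
    return "/index.php/" + relative


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # A static file that should bypass the front controller.
        with open(os.path.join(root, "style.css"), "w") as handle:
            handle.write("body {}")
        print(rewrite("/style.css", root))
        print(rewrite("/default/maintenance/", root))
```

Static assets keep their own URLs, while everything else is handed to index.php with the original path appended.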

Save the config file and reload Apache so that the newly set up rules take effect.

You can test whether the rewrite rules work properly by simply typing a few URLs into your browser without index.php.

Seagull configuration

The next step is to change how Seagull generates links, because we don't want index.php in our links any more. Edit the Seagull config file located in <install_dir>/var/www.example.com.conf.php and set frontScriptName to an empty string:

$conf['site']['frontScriptName'] = '';
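
To picture the effect of this setting on generated links, here is a toy Python sketch. It is not Seagull's actual URL builder: make_url and its parameters are hypothetical and only show how an empty front script name drops the index.php segment.

```python
def make_url(base_url, front_script_name, *segments):
    """Join the base URL, an optional front script name, and path segments."""
    parts = [base_url.rstrip("/")]
    if front_script_name:
        # Only include the front script when it is a non-empty string.
        parts.append(front_script_name)
    parts.extend(segments)
    return "/".join(parts) + "/"


if __name__ == "__main__":
    base = "http://www.example.com"
    print(make_url(base, "index.php", "default", "maintenance"))
    print(make_url(base, "", "default", "maintenance"))
```

With the empty string, the generated link has the same form as the URLs that the rewrite rules accept.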


You can find out more about this at:
Seagull Wiki