18.08.2004, 10:16
|
#1 (permalink)
| | Junior Mamber
Join Date: Jun 2004
Posts: 38
| SEF urls, the right way (I think)
I've seen a few sketchy little SEF implementations floating around that still leave a lot of crap in the URL. I've been thinking about how to get past that for a while now, and finally came up with this.
I developed this against 4.5.1beta4, but it actually works with 4.5 1.0.7 (and I'd assume .8 and .9) as well, though I haven't tested extensively.
This method uses a database, and effectively just caches the various urls and references them with a string. There are a few modifications to sef.php, a new table, a slight change to index.php, and a change to the mod_rewrite rules (.htaccess). I was trying not to touch the core stuff as much as possible, but it was a bit unavoidable.
You can see an example of this working on my temporary site: https://www.gregmaclellan.com. 4.5 install is at https://www.mwater.ca.
The 4.5 one has a url conflict, so you can see how it's handled. "Residental > Treatment Techniques" and "Commercial > Treatment Techniques" both link to the same content category, but carry around different Itemids to keep the pathway linked properly. You can see that under the commercial side it added the suffix "_1" to the url, since the title (and therefore the URL) is a duplicate. It'll actually keep incrementing that number, no matter how many duplicate titles you have.
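To make the duplicate handling concrete, here is a minimal stand-alone sketch of the idea. It uses a plain PHP array in place of the database table, and the `demo*` function names and the second url (with Itemid 45) are made up for illustration; the real code is the `sefGetLocation()` function in the sef.php posted in this thread.

```php
<?php
// Hypothetical in-memory stand-in for the mos_sef table: location => url.
$sefTable = array();

// Same normalisation as titleToLocation() in the posted sef.php:
// drop apostrophes, turn runs of other characters into "_", trim "_" at the ends.
function demoTitleToLocation($title) {
    return preg_replace(
        array("/'/", "/[^a-zA-Z0-9]+/", "/(^_|_$)/"),
        array("", "_", ""),
        $title
    );
}

// Return the location for $url, trying "_1", "_2", ... while the
// candidate location is already taken by a different url.
function demoGetLocation(&$table, $url, $title) {
    $base = demoTitleToLocation($title);
    $iteration = 0;
    while (true) {
        $candidate = $base . ($iteration == 0 ? "" : "_" . $iteration);
        if (!isset($table[$candidate])) {
            $table[$candidate] = $url;   // free slot: claim it
            return $candidate;
        }
        if ($table[$candidate] == $url) {
            return $candidate;           // already cached for this url
        }
        $iteration++;                    // taken by another url: try the next suffix
    }
}

// Two menu entries with the same title but different Itemids (assumed values).
$residential = demoGetLocation($sefTable, "content/category/2/7/31/", "Treatment Techniques");
$commercial  = demoGetLocation($sefTable, "content/category/2/7/45/", "Treatment Techniques");
echo $residential . "\n";   // Treatment_Techniques
echo $commercial . "\n";    // Treatment_Techniques_1
```

Whichever url is seen first keeps the plain location, and asking again for a url that is already cached returns the same location instead of creating a new one.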
Table:
Code:
#
# Table structure for table `mos_sef`
#
CREATE TABLE `mos_sef` (
`id` int(10) unsigned NOT NULL auto_increment,
`title` varchar(100) NOT NULL default '',
`location` varchar(255) NOT NULL default '',
`url` varchar(255) NOT NULL default '',
PRIMARY KEY (`id`),
UNIQUE KEY `location` (`location`)
) TYPE=MyISAM;
(continued, because my post is too long apparently!) |
| |
18.08.2004, 10:17
|
#2 (permalink)
| | Junior Mamber
Join Date: Jun 2004
Posts: 38
| Re: SEF urls, the right way (I think)
sef.php:
Code:
<?php
/**
* @version $Id: sef.php,v 1.6 2004/08/02 15:46:54 saka Exp $
* @package Mambo_4.5
* @copyright (C) 2000 - 2004 Miro International Pty Ltd
* @license http://www.gnu.org/copyleft/gpl.html GNU/GPL
* Mambo is Free Software
*/
/** ensure this file is being included by a parent file */
defined( '_VALID_MOS' ) or die( 'Direct Access to this location is not allowed.' );
function titleToLocation($title) {
return preg_replace(array("/'/","/[^a-zA-Z0-9]+/","/(^_|_$)/"),array("","_",""),$title);
}
function sefGetLocation($url, $title) {
GLOBAL $database, $mosConfig_sefSuffix;
$location = titleToLocation($title);
$iteration = 0;
$realloc = false;
do {
// temploc is $location, unless we're on a second or greater iteration,
// then its $location_$iteration
$temploc = $location.(($iteration == 0) ? "" : "_".$iteration);
if (isset($mosConfig_sefSuffix)) {
$temploc .= ".".$mosConfig_sefSuffix;
}
// see if we have a result for this location
$database->setQuery("SELECT url FROM #__sef WHERE location = '".$temploc."'");
if ($dburl = $database->loadResult()) {
if ($dburl == $url) {
// we found the matching object
$realloc = $temploc;
}
// else, didn't find it, let us increment and try again
} else {
// not found, insert new entry
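// escape the values: titles often contain apostrophes, which would break the INSERT
$database->setQuery("INSERT INTO #__sef (title, location, url) ".
"VALUES ('".mysql_escape_string($title)."', '".mysql_escape_string($temploc)."', '".mysql_escape_string($url)."')");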
$database->query();
$realloc = $temploc;
}
$iteration++;
} while (!$realloc);
return $realloc;
}
if ($mosConfig_sef) {
$url_array = explode("/", $_SERVER['REQUEST_URI']);
// first we see if we can grab a sef database string
// (REQUEST_URI is user input, so escape it before it goes into the query)
$database->setQuery("SELECT url FROM #__sef WHERE location = '".mysql_escape_string($url_array[1])."'");
if ($url = $database->loadResult()) {
// found one.. now fake like we got it this way
$_SERVER["REQUEST_URI"] = $url;
$url_array = explode("/", $_SERVER['REQUEST_URI']);
}
/**
* Content
* http://www.domain.com/$option/$task/$sectionid/$id/$Itemid/$limit/$limitstart
*/
if (in_array("content", $url_array)) {
$uri = explode("content/", $_SERVER['REQUEST_URI']);
$option = "com_content";
$_GET['option'] = $option;
$_REQUEST['option'] = $option;
$pos = array_search ("content", $url_array);
// $option/$task/$sectionid/$id/$Itemid/$limit/$limitstart
if (isset($url_array[$pos+6]) && $url_array[$pos+6]!="") {
$task = $url_array[$pos+1];
$sectionid = $url_array[$pos+2];
$id = $url_array[$pos+3];
$Itemid = $url_array[$pos+4];
$limit = $url_array[$pos+5];
$limitstart = $url_array[$pos+6];
$_GET['task'] = $task;
$_REQUEST['task'] = $task;
$_GET['sectionid'] = $sectionid;
$_REQUEST['sectionid'] = $sectionid;
$_GET['id'] = $id;
$_REQUEST['id'] = $id;
$_GET['Itemid'] = $Itemid;
$_REQUEST['Itemid'] = $Itemid;
$_GET['limit'] = $limit;
$_REQUEST['limit'] = $limit;
$_GET['limitstart'] = $limitstart;
$_REQUEST['limitstart'] = $limitstart;
$QUERY_STRING = "option=com_content&task=$task&sectionid=$sectionid&id=$id&Itemid=$Itemid&limit=$limit&limitstart=$limitstart";
// $option/$task/$id/$Itemid/$limit/$limitstart
} else if (isset($url_array[$pos+5]) && $url_array[$pos+5]!="") {
$task = $url_array[$pos+1];
$id = $url_array[$pos+2];
$Itemid = $url_array[$pos+3];
$limit = $url_array[$pos+4];
$limitstart = $url_array[$pos+5];
$_GET['task'] = $task;
$_REQUEST['task'] = $task;
$_GET['id'] = $id;
$_REQUEST['id'] = $id;
$_GET['Itemid'] = $Itemid;
$_REQUEST['Itemid'] = $Itemid;
$_GET['limit'] = $limit;
$_REQUEST['limit'] = $limit;
$_GET['limitstart'] = $limitstart;
$_REQUEST['limitstart'] = $limitstart;
$QUERY_STRING = "option=com_content&task=$task&id=$id&Itemid=$Itemid&limit=$limit&limitstart=$limitstart";
// $option/$task/$sectionid/$id/$Itemid
} else if (!(isset($url_array[$pos+5]) && $url_array[$pos+5]!="") && isset($url_array[$pos+4]) && $url_array[$pos+4]!="") {
$task = $url_array[$pos+1];
$sectionid = $url_array[$pos+2];
$id = $url_array[$pos+3];
$Itemid = $url_array[$pos+4];
$_GET['task'] = $task;
$_REQUEST['task'] = $task;
$_GET['sectionid'] = $sectionid;
$_REQUEST['sectionid'] = $sectionid;
$_GET['id'] = $id;
$_REQUEST['id'] = $id;
$_GET['Itemid'] = $Itemid;
$_REQUEST['Itemid'] = $Itemid;
$QUERY_STRING = "option=com_content&task=$task&sectionid=$sectionid&id=$id&Itemid=$Itemid";
// $option/$task/$id/$Itemid
} else if (!(isset($url_array[$pos+4]) && $url_array[$pos+4]!="") && (isset($url_array[$pos+3]) && $url_array[$pos+3]!="")) {
$task = $url_array[$pos+1];
$id = $url_array[$pos+2];
$Itemid = $url_array[$pos+3];
$_GET['task'] = $task;
$_REQUEST['task'] = $task;
$_GET['id'] = $id;
$_REQUEST['id'] = $id;
$_GET['Itemid'] = $Itemid;
$_REQUEST['Itemid'] = $Itemid;
$QUERY_STRING = "option=com_content&task=$task&id=$id&Itemid=$Itemid";
// $option/$task/$id
} else if (!(isset($url_array[$pos+3]) && $url_array[$pos+3]!="") && (isset($url_array[$pos+2]) && $url_array[$pos+2]!="")) {
$task = $url_array[$pos+1];
$id = $url_array[$pos+2];
$_GET['task'] = $task;
$_REQUEST['task'] = $task;
$_GET['id'] = $id;
$_REQUEST['id'] = $id;
$QUERY_STRING = "option=com_content&task=$task&id=$id";
// $option/$task
} else if (!(isset($url_array[$pos+2]) && $url_array[$pos+2]!="") && (isset($url_array[$pos+1]) && $url_array[$pos+1]!="")) {
$task = $url_array[$pos+1];
$_GET['task'] = $task;
$_REQUEST['task'] = $task;
$QUERY_STRING = "option=com_content&task=$task";
}
$_SERVER['QUERY_STRING'] = $QUERY_STRING;
$REQUEST_URI = $uri[0]."index.php?".$QUERY_STRING;
$_SERVER['REQUEST_URI'] = $REQUEST_URI;
}
/*
Components
http://www.domain.com/component/$name,$value
*/
else if (in_array("component", $url_array)) {
$uri = explode("component/", $_SERVER['REQUEST_URI']);
$uri_array = explode("/", $uri[1]);
$QUERY_STRING = "";
foreach($uri_array as $value) {
$temp = explode(",", $value);
if (isset($temp[0]) && $temp[0]!="" && isset($temp[1]) && $temp[1]!="") {
$_GET[$temp[0]] = $temp[1];
$_REQUEST[$temp[0]] = $temp[1];
$QUERY_STRING .= $temp[0]=="option" ? "$temp[0]=$temp[1]" : "&$temp[0]=$temp[1]";
}
}
$_SERVER['QUERY_STRING'] = $QUERY_STRING;
$REQUEST_URI = $uri[0]."index.php?".$QUERY_STRING;
$_SERVER['REQUEST_URI'] = $REQUEST_URI;
}
// Extract to globals
while(list($key,$value)=each($_GET)) $GLOBALS[$key]=$value;
// Don't allow config vars to be passed as global
include( "configuration.php" );
}
function sefRelToAbs( $string ) {
GLOBAL $mosConfig_live_site, $mosConfig_sef;
GLOBAL $database;
if ($mosConfig_sef && strcasecmp(substr($string,0,4),"http") && strcasecmp(substr($string,0,1),"/") && strcasecmp(substr($string,0,7),"mailto:")) {
// Replace all &amp; with & (the second pass also catches double-encoded &amp;amp;)
$string = str_replace( '&amp;', '&', $string );
$string = str_replace( '&amp;', '&', $string );
/*
Home
index.php
*/
if ($string=="index.php") {
$string="";
}
$sefstring = "";
/*
Content
index.php?option=com_content&task=$task&sectionid=$sectionid&id=$id&Itemid=$Itemid&limit=$limit&limitstart=$limitstart
*/
if ( (eregi("option=com_content",$string) || eregi("option=content",$string) ) && !eregi("task=new",$string) && !eregi("task=edit",$string) ) {
$sefstring .= "content/";
if (eregi("&task=",$string)) {
$temp = split("&task=", $string);
$temp = split("&", $temp[1]);
$task = $temp[0];
$sefstring .= $temp[0]."/";
}
if (eregi("&sectionid=",$string)) {
$temp = split("&sectionid=", $string);
$temp = split("&", $temp[1]);
$sefstring .= $temp[0]."/";
}
if (eregi("&id=",$string)) {
$temp = split("&id=", $string);
$temp = split("&", $temp[1]);
$id = $temp[0];
$sefstring .= $temp[0]."/";
}
if (eregi("&Itemid=",$string)) {
$temp = split("&Itemid=", $string);
$temp = split("&", $temp[1]);
$sefstring .= $temp[0]."/";
}
if (eregi("&limit=",$string)) {
$temp = split("&limit=", $string);
$temp = split("&", $temp[1]);
$sefstring .= $temp[0]."/";
}
if (eregi("&limitstart=",$string)) {
$temp = split("&limitstart=", $string);
$temp = split("&", $temp[1]);
$sefstring .= $temp[0]."/";
}
$string = $sefstring;
// at this point, see if we can get the content title
// $string = url
// also have: $id, $task
$title = false;
switch ($task) {
case "view":
$database->setQuery("SELECT title FROM #__content WHERE id = '".$id."'");
$title = $database->loadResult();
break;
case "category":
case "blogcategory":
case "archivecategory":
$database->setQuery("SELECT name FROM #__categories WHERE id = '".$id."'");
$title = $database->loadResult();
break;
case "section":
case "blogsection":
case "archivesection":
$database->setQuery("SELECT name FROM #__sections WHERE id = '".$id."'");
$title = $database->loadResult();
break;
default:
$string = "default";
break;
}
if ($title) {
// find/create entry for this url/title in #__sef table, and return it
$string = sefGetLocation($string, $title);
}
}
(continued, sorry about doing that mid-file) |
| |
18.08.2004, 10:17
|
#3 (permalink)
| | Junior Mamber
Join Date: Jun 2004
Posts: 38
| Re: SEF urls, the right way (I think)
sef.php, continued:
Code:
/*
Components
index.php?option=com_xxxx&...
*/
if (eregi("option=com_",$string) && !eregi("option=com_registration",$string) && !eregi("task=new",$string) && !eregi("task=edit",$string)) {
$sefstring = "component/";
$temp = split("\?", $string);
$temp = split("&", $temp[1]);
foreach($temp as $key => $value) {
$sefstring .= $value."/";
}
$string = str_replace( '=', ',', $sefstring );
if (preg_match("/Itemid,(\d+)/",$string,$matches)) {
$Itemid = $matches[1];
// see if we can lookup a title for this object
$database->setQuery("SELECT name FROM #__menu WHERE id = '".$Itemid."'");
if ($title = $database->loadResult()) {
// find/create entry for this url/title in #__sef table, and return it
$string = sefGetLocation($string, $title);
}
}
}
//echo $mosConfig_live_site."/".$string;
return $mosConfig_live_site."/".$string;
} else {
return $string;
}
}
?>
index.php needs a slight modification:
the line that initializes the database, starting with "$database = ...", needs to be moved to just before "include("includes/sef.php");".
This is needed because the sef code checks the database to see if the given url matches a stored location.
Finally, .htaccess:
Replace:
Code:
RewriteRule ^content(.*) index.php
RewriteRule ^component/(.*) index.php
with:
Code:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*) /index.php/$1 [L]
(basically, this just rewrites everything to /index.php/<whatever>, unless it exists as a directory or file)
A couple notes:
- The title isn't stored for any particular reason. It helps a bit to see what's going on; initially I was going to use it, then didn't, and I left it in because it might be useful in the future.
- If the _sef table is cleared, the urls on the site may change. If you have duplicate titles, the ones that get loaded first get no suffix. For example, at mwater.ca I went to the residential side first. If I cleared that table, then went to the commercial side first and the residential side second, the links in the residential area would have the "_1" at the end of them, the opposite of what is there now. Is this a big deal? Probably not. You probably don't really need to clear that table, though it might get excessively big on a site that sees a lot of editing.
- There is one configuration variable added, $mosConfig_sefSuffix, which lets you add a file extension. My 4.5.1 site has it set to "html" so you can see what it does.
- The sef table will actually override any other url. If there is a sef entry whose location matches a real url and maps it to something else, then you won't be able to access that real url at all. This is not a big deal, since the code won't create such an entry on its own anyway.
But that brings me to another possibility: url aliases. It would make a decent component to be able to enter a new entry like (location,url) = ('special','content/view/5/1'), so that you can verbally tell people to go to "https://www.mwater.ca/jobs" to see your job listings, even though the url from the site is normally "https://www.mwater.ca/Employment_Opportunities". The only restriction is that the alias has to point to "content/category/2/7/31/" (which is the real url for that page) and not to "Employment_Opportunities": the url isn't resolved recursively, the lookup is done just once when the page loads.
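A rough sketch of how such an alias would behave with the single, non-recursive lookup. The array stands in for the mos_sef table, the `demoResolve()` helper and the "careers" row are hypothetical, and the SQL in the comment assumes the default `mos_` table prefix.

```php
<?php
// An alias could be added by hand with something like:
//   INSERT INTO mos_sef (title, location, url)
//   VALUES ('Jobs', 'jobs', 'content/category/2/7/31/');

// Hypothetical rows of the sef table: location => url.
$sefTable = array(
    "Employment_Opportunities" => "content/category/2/7/31/", // created automatically
    "jobs"                     => "content/category/2/7/31/", // hand-made alias, works
    "careers"                  => "Employment_Opportunities", // alias of an alias, does NOT work
);

// Mimics the lookup at the top of sef.php: take the first path segment,
// look it up once, and fall back to the original request if nothing matches.
function demoResolve($table, $requestUri) {
    $parts = explode("/", $requestUri);
    $location = isset($parts[1]) ? $parts[1] : "";
    if ($location != "" && isset($table[$location])) {
        return $table[$location];   // single lookup, no recursion
    }
    return $requestUri;
}

echo demoResolve($sefTable, "/jobs") . "\n";     // content/category/2/7/31/
echo demoResolve($sefTable, "/careers") . "\n";  // Employment_Opportunities
```

The second call shows the restriction: "careers" resolves to "Employment_Opportunities", which is only a location and not a real url, so the page would not load.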
Hopefully some people will find this useful. It would definitely be a nice feature to see in 4.5.1, as it's really not very difficult. |
| |
23.08.2004, 18:41
|
#4 (permalink)
| | Baby Mamber
Join Date: Aug 2004
Posts: 2
| Re: SEF urls, the right way (I think)
gregmac,
Thanks for this hack. I just implemented it on my site this weekend and it works fine.
Regards,
Steve |
| |
23.08.2004, 20:37
|
#5 (permalink)
| | Mamber
Join Date: Apr 2004 Location: Belize
Posts: 86
| Advantages of SEF urls
Advantages of making the url look like a flat html site???
do search engines pick them up faster/easier/rank them higher ?....
any other reason...
__________________
The Truth shall set you free, unless you done it |
| |
23.08.2004, 22:16
|
#6 (permalink)
| | Junior Mamber
Join Date: Apr 2004
Posts: 29
| Re: Advantages of SEF urls
Quote:
Originally Posted by BzBeauty
do search engines pick them up faster/easier/rank them higher ?....
Yes and yes.
Greg, this looks excellent. I'll give it a try soon. Thanks very much for sharing that piece of code.
Kind regards,
Zorro |
| |
24.08.2004, 15:51
|
#7 (permalink)
| | Baby Mamber
Join Date: Jun 2004
Posts: 10
| Re: SEF urls, the right way (I think)
Looks nice and useful.
What will the behaviour be for non-standard pages, e.g. installed components like simpleboard or docman?
Personally I like to see the categories as directories, the same for the final page, e.g.:
".../news/latest/news_article" (instead of just "news_article.html")
Can this be an option?
I like the alias option, can I do that without the future component you talked about?
How would you do something like that with a standard Mambo installation? |
| |
25.08.2004, 02:57
|
#8 (permalink)
| | Baby Mamber
Join Date: Aug 2004
Posts: 7
| abolutely right | ideas: google and users
this looks great! i agree the other solutions are a bit of a mess.
how about all lower case, with hyphens instead of underscores. would make it great for google. offering the option (perhaps in code which is commented out) of including sections and/or categories in the URL would be ace.
i personally wouldn't want to use both as it puts the file too far from root but might like to use one or the other depending on how a site is organized.
google sometimes prefers sites organised into sections, although files in root (top-level) is great too. don't worry about this extra idea if it will break the component or introduce mind-boggling complexity to it.
more importantly (if the section/category will resolve without the article title), it gives the user a super friendly way to run around the site and to bookmark it.
great work! |
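For anyone who wants to experiment with that suggestion, here is a small sketch of a lowercase, hyphenated variant of the title conversion, plus an optional section/category prefix. The `demoTitleToSlug()` and `demoBuildPath()` names are made up; they are not part of the posted sef.php.

```php
<?php
// Lowercase + hyphens instead of the underscores used by titleToLocation().
function demoTitleToSlug($title) {
    $slug = strtolower($title);
    $slug = str_replace("'", "", $slug);               // drop apostrophes
    $slug = preg_replace("/[^a-z0-9]+/", "-", $slug);  // other runs become one hyphen
    return trim($slug, "-");                           // no leading/trailing hyphen
}

// Join optional section/category titles and the page title into a path.
function demoBuildPath($segments) {
    $slugs = array();
    foreach ($segments as $segment) {
        $slug = demoTitleToSlug($segment);
        if ($slug != "") {
            $slugs[] = $slug;                          // skip empty segments
        }
    }
    return implode("/", $slugs);
}

echo demoTitleToSlug("Greg's Treatment Techniques") . "\n";          // gregs-treatment-techniques
echo demoBuildPath(array("News", "Latest", "News Article")) . "\n";  // news/latest/news-article
```

Note that the posted sef.php only looks up the first path segment, so multi-segment locations like the second example would presumably also need the lookup changed to match on the whole path.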
| |
26.08.2004, 21:07
|
#9 (permalink)
| | Elite Mamber
Join Date: Apr 2004 Location: /dev/peru/lima
Posts: 1,008
| Re: abolutely right | ideas: google and users
Great job! i love it. |
| |
26.08.2004, 23:57
|
#10 (permalink)
| | Mamber
Join Date: Jul 2004
Posts: 51
| Re: abolutely right | ideas: google and users
Excellent idea. I'm going to try this on my 1.0.9 this week... |
All times are GMT +2.