Short URLs - URL Mapping

Current situation

The URLs we use for common tasks are currently far too long. For example, to read a Blog entry we have to type something like index.php?gadget=Blog&action=View&id=123. The same problem affects the photos in Phoo and the pages in StaticPages, and there's a high probability that it will show up again in future gadgets.

So when a user wants to link from one Blog entry to another, they have to write out that whole long URL instead of a short one like http://www.foobarme.com/Blog/123 .

Solutions

This is where URL mapping comes in.

URL mapping is a ‘map’ of long URLs that are associated with short identifiers called IIDs. Each gadget that wants to join the URL mapping game needs to create a URL map when it's installed.

A URL map is built around a base URL. Once the map is created, the base URL (base_url) is added to the url_maps DB table for future access and management.

For example, suppose we are hacking on the Blog Model. The InstallGadget method of the Model (BlogModel::InstallGadget) would contain a line that creates the map:

$this->AddNewURLMap ("index.php?gadget=Blog&action=View&id={IID}", "Blog/{IID}", "BLOG_ENTRY");

The call has three parts:

  • URL: The long URL. Calling the method creates a new record in the url_maps DB table with the URL we give it. It should contain {IID}, so that when we create a ‘reference’ to it we know where to put the id (IID).
  • Short URL: The URL the user will type in the browser. A request for Blog/123 gets translated to index.php?gadget=Blog&action=View&id=123.
  • Identifier: A short key for the URL map. Instead of writing the long URL again just to refer to the map, we use the BLOG_ENTRY string.
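
To make the three parts concrete, here is a minimal sketch of what such a call could record. It keeps the maps in a plain PHP array instead of the url_maps table, and the function names (add_new_url_map, build_short_url) are hypothetical stand-ins, not existing Jaws methods.

```php
<?php
// Hypothetical in-memory stand-in for the url_maps table, keyed by identifier.
$urlMaps = [];

// Register a map: long URL template, short URL template, identifier.
function add_new_url_map(array &$maps, string $baseUrl, string $shortUrl, string $key): void
{
    // Both templates need the {IID} placeholder so an id can be substituted later.
    if (strpos($baseUrl, '{IID}') === false || strpos($shortUrl, '{IID}') === false) {
        throw new InvalidArgumentException('Both URLs must contain {IID}');
    }
    $maps[$key] = ['baseurl' => $baseUrl, 'shorturl' => $shortUrl];
}

// Build the short URL for one item, e.g. when linking from one entry to another.
function build_short_url(array $maps, string $key, int $iid): string
{
    return str_replace('{IID}', (string) $iid, $maps[$key]['shorturl']);
}

add_new_url_map($urlMaps, 'index.php?gadget=Blog&action=View&id={IID}', 'Blog/{IID}', 'BLOG_ENTRY');
echo build_short_url($urlMaps, 'BLOG_ENTRY', 123), "\n"; // prints Blog/123
```

Rejecting templates without {IID} at registration time is a design choice of this sketch; it keeps later lookups from silently producing URLs with no id in them.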

A good idea would be to add a new section to our JawsInfo files called URL_MAPS or something like that. The installer would then look up these URL_MAPS, and the Model would also know about them at run time (instead of keeping them in a private variable).

Now for the part about ‘maintaining’ the url_maps. Every time we create or delete an entry we should call the AddMapIID or DeleteMapIID method. These methods add or remove the IID in a second table (let's call it iid_url_maps). So when we call the NewEntry method of BlogModel (BlogModel::NewEntry), it will call:

$this->AddMapIID ("BLOG_ENTRY", $id);

or, when an entry is deleted:

$this->DeleteMapIID ("BLOG_ENTRY", $id);

This gives us the following DB schema:

CREATE TABLE url_maps (
  id INT(10) NOT NULL AUTO_INCREMENT,
  baseurl VARCHAR (200) NOT NULL,
  shorturl VARCHAR (200) NOT NULL,
  PRIMARY KEY (id)
);
 
CREATE TABLE iid_url_maps (
  iid INT(10) NOT NULL DEFAULT '0',
  base_id INT(10) NOT NULL DEFAULT '0'
);

So, when we registered the BLOG_ENTRY map (key), this row was added to url_maps:

id  baseurl                                     shorturl
1   index.php?gadget=Blog&action=View&id={IID}  Blog/{IID}


And when we added (AddMapIID) a new entry (123) that references BLOG_ENTRY, this row was added to iid_url_maps:

iid  base_id
123  1


Every time we call DeleteMapIID, we delete that IID's record (123) from iid_url_maps.
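
The bookkeeping on the two tables can be sketched with plain arrays. This is only an illustration: the function names are hypothetical, and since the schema has no column for the identifier, the sketch assumes a separate identifier-to-id lookup ($mapIds).

```php
<?php
// In-memory model of url_maps, keyed by its id column.
$urlMaps = [
    1 => ['baseurl' => 'index.php?gadget=Blog&action=View&id={IID}', 'shorturl' => 'Blog/{IID}'],
];
// Assumption: some lookup from identifier to url_maps.id exists.
$mapIds = ['BLOG_ENTRY' => 1];
// Rows of iid_url_maps.
$iidUrlMaps = [];

// Add one (iid, base_id) row unless it is already there.
function add_map_iid(array &$rows, array $mapIds, string $key, int $iid): void
{
    $row = ['iid' => $iid, 'base_id' => $mapIds[$key]];
    if (!in_array($row, $rows, true)) {
        $rows[] = $row;
    }
}

// Remove the (iid, base_id) row for this identifier, if present.
function delete_map_iid(array &$rows, array $mapIds, string $key, int $iid): void
{
    $baseId = $mapIds[$key];
    $rows = array_values(array_filter(
        $rows,
        fn(array $r): bool => !($r['iid'] === $iid && $r['base_id'] === $baseId)
    ));
}

add_map_iid($iidUrlMaps, $mapIds, 'BLOG_ENTRY', 123);    // entry 123 created
add_map_iid($iidUrlMaps, $mapIds, 'BLOG_ENTRY', 124);    // entry 124 created
delete_map_iid($iidUrlMaps, $mapIds, 'BLOG_ENTRY', 124); // entry 124 deleted
print_r($iidUrlMaps);
```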

We should have a JawsURLMapping class that will help us with this job.

What's next?

We now have a way to relate long URLs to short URLs. But that raises a question: how are index.php and the other wrappers going to handle it?

We should add a new line to these wrappers that looks at the request URL. If it matches the ‘url_maps’ format (a regexp can tell us), we proceed to load it, checking each time that it exists. For example:

JawsURLMapping::Parse ($_SERVER["QUERY_STRING"]);
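
What Parse might do internally can be sketched as a plain function. This is a guess at the behaviour, not the actual JawsURLMapping code: the function name and the array shapes are assumptions, and the arrays stand in for the url_maps and iid_url_maps tables.

```php
<?php
// Resolve a short URL such as "Blog/123" to its long URL, or return null.
function parse_short_url(string $request, array $urlMaps, array $iidRows): ?string
{
    $request = trim($request, '/');
    foreach ($urlMaps as $baseId => $map) {
        // Quote the literal parts of "Blog/{IID}" and join them with a digit group.
        $parts = array_map(
            fn(string $part): string => preg_quote($part, '#'),
            explode('{IID}', $map['shorturl'])
        );
        $pattern = '#^' . implode('(\d+)', $parts) . '$#';
        if (!preg_match($pattern, $request, $m) || !isset($m[1])) {
            continue; // not this map's format
        }
        $iid = (int) $m[1];
        // Only resolve ids that were registered for this map.
        foreach ($iidRows as $row) {
            if ($row['iid'] === $iid && $row['base_id'] === $baseId) {
                return str_replace('{IID}', (string) $iid, $map['baseurl']);
            }
        }
    }
    return null;
}

$urlMaps = [
    1 => ['baseurl' => 'index.php?gadget=Blog&action=View&id={IID}', 'shorturl' => 'Blog/{IID}'],
];
$iidRows = [['iid' => 123, 'base_id' => 1]];

var_dump(parse_short_url('Blog/123', $urlMaps, $iidRows)); // the long URL for entry 123
var_dump(parse_short_url('Blog/999', $urlMaps, $iidRows)); // NULL, id 999 was never registered
```

The regexp is built from the stored short URL, so gadget developers still only write the template and never a pattern by hand.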

Then, we have two options to load/execute these urls once they are parsed:

  • Take the URL and overwrite $_SERVER["SCRIPT_URL"] (or whatever the right variable is), so that when index.php looks for $_REQUEST["gadget"] and $_REQUEST["id"] it finds the ‘clean’ gadget name, its action and everything else. The URL that the user typed stays in the browser; there is no redirect.
  • Prepare a redirect to the real URL. In this case the browser ends up showing the long URL.
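
Both options can be sketched in a few lines. The helper names below are hypothetical. The first decodes the long URL into request variables, the second only builds the header that a wrapper would pass to header().

```php
<?php
// Option 1: decode the long URL's query string into request variables.
function long_url_to_vars(string $longUrl): array
{
    $pos = strpos($longUrl, '?');
    $query = ($pos === false) ? '' : substr($longUrl, $pos + 1);
    parse_str($query, $vars); // fills $vars with gadget, action, id, ...
    return $vars;
}

// Option 2: build the redirect header for the long URL.
function redirect_header_for(string $longUrl): string
{
    return 'Location: ' . $longUrl;
}

$longUrl = 'index.php?gadget=Blog&action=View&id=123';

// Option 1 in a wrapper: merge into the superglobals, the typed URL stays as it is.
$vars = long_url_to_vars($longUrl);
// $_GET = array_merge($_GET, $vars);
// $_REQUEST = array_merge($_REQUEST, $vars);

// Option 2 in a wrapper: send the header and stop.
// header(redirect_header_for($longUrl)); exit;
print_r($vars);
```

Note that parse_str returns every value as a string, so the id arrives as "123" just as it would from a normal query string.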

Other solutions?

Yes, there’s another one: using .htaccess and rewrite rules. So why not go with that option?

  • Not all users can run a .htaccess
  • Not all servers are running mod_rewrite
  • Using url_mapping is easier for developers. No regexps!

RFC - Request for comments and commits

Note: Sorry for the bad english but I’m tired, tomorrow I’ll take a look and correct it. :-P

I would suggest replacing the $_REQUEST / $_GET vars but not overwriting the $_SERVER['SCRIPT_URL'] entry. This way the original URL is still around in case you need it, but you have your vars.

(EB:) I am not sure. What I originally had in mind was similar, but it contained a global map. That is, the name of the gadget would be encoded within the id-to-url map (along with the specific id data for each blog). In fact, one could even open an entry with:

http://www.foobar.com/jaws/?123 (or via fast-links with [IID 123])

and the entry 123 in the global map would be expanded to /Blog/…&gadget=…&id=17 (or so). The first part would be expanded by Apache (IIRC it defaults to index… if the script name is omitted), and Blog, the Blog-ID and the other stuff would come from the global-map table.

And for the example above, the fast-links plugin would generate an href link with a short title, e.g. <a href="http://www.foobar.com/verylongurl">Debian-Blog from April/2005</a>. The title would also come from the global map, where it has been set manually or auto-generated by some heuristic.
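
A rough sketch of the global-map idea, under assumptions: the array layout, the function names and the target URL stored for entry 123 are all made up for illustration. Only the [IID 123] marker and the title come from the comment above.

```php
<?php
// Hypothetical global map: one IID per item, across all gadgets.
$globalMap = [
    123 => [
        'url'   => 'index.php?gadget=Blog&action=View&id=17', // made-up target
        'title' => 'Debian-Blog from April/2005',
    ],
];

// Resolve a request like "?123" (query string "123") to the stored URL.
function expand_global_iid(array $map, string $queryString): ?string
{
    if (!preg_match('/^\d+$/', $queryString)) {
        return null;
    }
    return $map[(int) $queryString]['url'] ?? null;
}

// Replace every [IID n] marker in a text with a titled link.
function render_fast_links(array $map, string $text): string
{
    return preg_replace_callback('/\[IID (\d+)\]/', function (array $m) use ($map): string {
        $iid = (int) $m[1];
        if (!isset($map[$iid])) {
            return $m[0]; // unknown id: leave the marker untouched
        }
        return sprintf(
            '<a href="%s">%s</a>',
            htmlspecialchars($map[$iid]['url']),
            htmlspecialchars($map[$iid]['title'])
        );
    }, $text);
}

echo expand_global_iid($globalMap, '123'), "\n";
echo render_fast_links($globalMap, 'See [IID 123] for details.'), "\n";
```

Compared with the per-gadget maps, this variant needs only one lookup table, at the cost of every gadget sharing a single id space.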

Author Information

Author: Pablo Fischer (pablo@pablo.com.mx)
Last Update: 20/04/2005
Inspired by the bug report of Eduard Bloch.

 
  /var/www/wiki/htdocs/data/jaws/proposals/url_mapping.txt · Last modified: 2007/11/02 16:27