stoimen.com/blog

web developing

Sorting Algorithms

Last Friday I wrote a quick post showing the recursive version of quicksort on both PHP and JavaScript, but this time let me begin with some breve history. Actually sorting and searching algorithms are widely used in the computer science and are developed and improved many times in the last 60 years or so. As the world goes so wired, all the information around is stored in data bases and the need of fast search methods is critical. But as we all know from our experience, it is far more easy to search into a sorted data structure. Such are all the dictionaries we use. Can you, actually, imagine to search for a word in a dictionary where all the words don’t follow the alphabetical order and are dispersed chaotically instead?

That’s why from the early years of computer science both sorting and searching algorithms go together. I’ll cover more on search algorithms in the future, but as I started with quicksort – one of the many sorting algorithms, let me mention that this algorithm is not anonymous! What I mean is that his author is well known and he’s recognized in the computer science community. His name is …

C. A. R. Hoare

Sir C. A. R. Hoare
Sir Charles Antony Richard Hoare is born on the 11 of January 1934 in Ceylon, now Sri Lanka, and after finishing his studies in the University of Oxford and being in the Royal Navy for two years, he went to … Moscow where at the age of 26 (in 1960) he developed the famous quicksort algorithm. That’s not everything! Beside the sorting algorithm he’s well recognized with the Hoare logic and the invention of the formal language CSP (Communicating Sequential Processes).

Here’s what Sir C. A. R. Hoare wrote in his lecture “The Emperor’s Old Clothes” published by Communications of the ACM in 1981:

My first task was to implement for the new Elliot 803 computer, a library subroutine for a new fast method of internal sorting just invented by Shell. I greatly enjoyed the challenge of maximizing efficiency in the simple decimal-addressed machine code of those days. My boss and tutor, Pat Shackleton, was very pleased with my completed program. I then said timidly that I thought I had invented a sorting method that would usually run faster than SHELLSORT, without taking much extra store. He bet me sixpence that I had not. Although my method was very difficult to explain, he finally agreed that I had won my bet.

The quicksort was born!

Recursive vs. Iterative

There are two main directions in solving such problems as the sorting algorithms. Recursive way and the iterative way. Every developer knows what is recursive and what’s iterative. We like to say that recursive is more “elegant” solution, than the iterative, but it’s interesting to say that both are completely interchangeable. Yeah, every recursion can be modeled with a stack (as the recursion is always converted into a stack on the machine level), and every loop can be converted to a recursion. It’s interesting to say that the second direction is easier, while the switch from recursion to stack it’s not always obvious and easy!

Here’s a code snippet from the recursive version of the quicksort, I’ve posted last Friday:

return quicksort(left).concat(pivot, quicksort(right));

What actually happens here is that the second call of quicksort() is always delayed for a later execution. Until the first quicksort() doesn’t finish his job, the second is never used! Why than we need a recursion? Why don’t we put the right part of the array in a stack for later execution? In fact this is the solution! Every right part will be put into a stack. Thus while the stack has some elements we pop an element from it and than split it again on two – left and right: the right part enters the stack and the left is yet again sorted.

In that scenario we end with an element split when the left part is with only one element. Here’s important to say that this might not be the perfect solution and there might be some quite good optimizations!

Assuming we’ve the following array – [8,4,6,2,1,9,5], here’s what happens, note that in JavaScript array.pop() pops the rightmost element, in this case 5:

stack = [[8,4,6,2,1,9,5]]
sorted = [];
 
1. step
stack = [5,8,6,9][4,2,1]
sorted = []
 
2. step
stack = [5,8,6,9][4][2,1]
sorted = []
 
3. step
stack = [5,8,6,9][4][2][1]
sorted = []
 
4. step
stack = [8,6,9][5]
sorted = [1,2,4]
 
5. step
stack [8,9][6]
sorted = [1,2,4,5]
...

Code

Here’s the implementation of the iterative version, again on both PHP and JavaScript:

JavaScript:

var a = [8,4,6,2,1,9,5,5,4,3,4,3];
 
function qsort(arr)
{
    var stack = [arr];
    var sorted = [];
 
    while (stack.length) {
 
        var temp = stack.pop(), tl = temp.length;
 
        if (tl == 1) {
            sorted.push(temp[0]);
            continue;
        }
        var pivot = temp[0];
        var left = [], right = [];
 
        for (var i = 1; i < tl; i++) {
            if (temp[i] < pivot) {
                left.push(temp[i]);
            } else {
                right.push(temp[i]);
            }
        }
 
        left.push(pivot);
 
        if (right.length)
            stack.push(right);
        if (left.length)
            stack.push(left);
 
    }
 
    console.log(sorted);
}
qsort(a);

PHP:

$unsorted = array(8,4,6,2,1,9,5,5,4,3,4,3);
 
function qsort($array)
{
    $stack = array($array);
    $sorted = array();
 
    while (count($stack) > 0) {
 
        $temp = array_pop($stack);
 
        if (count($temp) == 1) {
            $sorted[] = $temp[0];
            continue;
        }
 
        $pivot = $temp[0];
        $left = $right = array();
 
        for ($i = 1; $i < count($temp); $i++) {
            if ($pivot > $temp[$i]) {
                $left[] = $temp[$i];
            } else {
                $right[] = $temp[$i];
            }
        }
 
        $left[] = $pivot;
 
        if (count($right))
            array_push($stack, $right);
        if (count($left))
            array_push($stack, $left);
    }
 
    return $sorted;
}
 
$sorted = qsort($unsorted);
print_r($sorted);

First of all this is more code than the recursive solution, and besides that it’s perhaps not the optimal solution – I’ll cover in the future more on how to optimize it and where it can be useful.

Till next Friday!

This is nothing useful, but let me show the beauty of jQuery. This snippet just adds more and more nodes on a simple UL based tree.

Here’s the HTML:

<ul>
    <li><a href="#">Node 1</a></li>
    <li><a href="#">Node 2</a></li>
    <li><a href="#">Node 3</a></li>
    <li><a href="#">Node 4</a></li>
</ul>

And that’s the jQuery:

var index = 5;
$('li > a').live('click', function() {
    $(this).parent().append('<ul><li><a href="#">Node ' + index + '</a></li></ul>');
 
    index++;
 
    return false;
});

And here’s the demo.

PHP Style

As you may know in PHP you can access everything in the request uri by accessing the global $_GET array. If there is something like that in the browser’s address field: www.example.com/index.php?controller=index&action=test, you can simply get the values by that:

echo $_GET['controller']; // this will print "index"
echo $_GET['action'];     // this will print "test"

Zend Framework, Uri & Request Params

If you code with Zend Framework, you should know already, and that’s perhaps the first thing you’ve learned about Zend, that $_GET params can be accessed by calling the requrest’s getParam() method. But first of all the request uri will be different. The Zend’s convetion is to place after the domain name first the module name (which is omited when there’s only one module), than the controller’s name followed by the action and the key value pairs of all parameters. In that scheme the request uri above will look like that:

http://www.example.com/index/test

Here the keywords “controller” and “action” are omitted. This is cool – it’s more user friendly and it definitely helps the SEO.

Get the $_GET

Once the uri is setup like so – /index/test you can access it via the Zend way:

echo $this->getRequest()->getParam('controller'); // this will print "index"

The cool thing is that in the first case you don’t have any prevention of a missing value, while in the second case there is a second parameter or the getParam() method that does this job. What if the uri is www.example.com/index.php?controller=&action=test than by printing the $_GET['controller'] you’ll get nothing. In other hand even this:

echo $this->getRequest()->getParam('action');

won’t return “test” if the uri is http://www.example.com/index/

Note: actually here the default “index” action will be referenced!

That’s where the power of the framework comes. In the first case the solution is:

echo (empty($_GET['controller']) ? 'default': $_GET['controller']);

while in Zend there’s more elegant solution:

echo $this->getRequest()->getParam('action', 'test');

Thus when the action param is missing the “test” value is considered as default! Very useful!

There are some really good reasons to use Zend_Translate for your multilingual site instead of some other technique. Actually you can definitely simulate something like this Zend component with and array or ini file, but here are some good things to mention.

1. Write a human readable source file

In fact you can do this even without Zend_Translate. Image you’ve an ini file with some pairs like:

lblMyLabel = "My Label"

Nobody says this cannot be “more” human readable – like so:

My Label = "My Label"

Than you can support several files i.e. en.ini, de.ini, etc. where you can translate this label. The problem is that once you read the ini file you’ve to deal with those white spaces into the labels, while with Zend_Translate you can simply call them with:

$translate->_("My Label");

2. Default values for missing labels

Zend_Translate forces you to define a source file. Which is really great, because you always have a default value. Assuming that you don’t have a specific label translated in one of the ini files, you’ll get the default value from the source file, and defining source and other files is as easy as:

// en.ini
My Label = "My Label"
 
// de.ini
...
 
// while the locale is de_DE you'll have 
// the default value returned
$translate->_("My Label"); // will return My Label from the en.ini

3. Locales

It’s really great that you can directly setup a locale for this module. Thus you have more flexibility when switching between languages and you can benefit from the power of the framework when reading some browser specific locales. Here’s some code:

$translate = new Zend_Translate('ini', 'en.ini', 'en');
$translate->addTranslation('de.ini', 'de');
 
// than when you setup the translation locale
$translate->setLocale('en');

4. Cache

One of the greatest things about Zend_Translate is the native support of cache. It’s so cool! Actually the labels of a large web system are pretty static and don’t change every day. In other hand in large scale systems there are lots of labels, and the load time will be faster with cache.

$frontendOptions = array('lifetime' => 600, 'automatic_serialization' => true);
$backendOptions  = array('cache_dir' => 'path_to_cache_dir');
 
$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);
 
Zend_Translate::setCache($cache);

It’s a great advantage to use Zend_Translate after all this. Actually I have finished projects without any use of it, but I really can’t imagine how I’ll work now without it!

Let’s assume there’s a little, simple JavaScript array:

var a = [3,2,4,5,2,4,5,8];

If you pop() an element from it – what element will be taken? The answer is a bit frustrating, but logical – the last one:

var b = a.pop();
alert(b); // this will result in 8