## Overview

Insertion sort is intuitive and easy to implement, but it needs many exchanges to move each “light” (small) element into its place. Light elements near the end of the list can therefore slow insertion sort down considerably. That is why, in 1959, Donald Shell proposed an algorithm that tries to overcome this problem by comparing items of the list that lie far apart.

On the other hand, when we compare only items that lie far apart, the list obviously can’t be sorted in a single pass the way insertion sort does it. That is why each pass uses a fixed gap between the compared items, and the gap is decreased on every subsequent pass until it reaches 1.
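To make the idea of passes with a decreasing gap concrete, here is a minimal sketch. The helper name `gap_pass` and the gap list 4, 2, 1 are assumptions chosen for illustration only; they are not a recommended sequence.

```php
<?php
// Hypothetical helper: one insertion-sort pass that only compares
// elements lying $gap positions apart.
function gap_pass(array $arr, int $gap): array
{
    $len = count($arr);
    for ($i = $gap; $i < $len; $i++) {
        $temp = $arr[$i];
        $j = $i;
        // shift larger elements of the same gap-subsequence to the right
        while ($j >= $gap && $arr[$j - $gap] > $temp) {
            $arr[$j] = $arr[$j - $gap];
            $j -= $gap;
        }
        $arr[$j] = $temp;
    }
    return $arr;
}

$list = array(6, 5, 3, 1, 8, 7, 2, 4);
foreach (array(4, 2, 1) as $gap) {
    $list = gap_pass($list, $gap);
    echo "gap $gap: " . implode(', ', $list) . "\n";
}
// gap 4: 6, 5, 2, 1, 8, 7, 3, 4
// gap 2: 2, 1, 3, 4, 6, 5, 8, 7
// gap 1: 1, 2, 3, 4, 5, 6, 7, 8
```

Only the last pass, with gap 1, is guaranteed to leave the list fully sorted; the earlier passes just move elements closer to their final positions.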

However it is intuitively clear that Shell sort may need even more comparisons than insertion sort. Then why should we use it?

The thing is that insertion sort is not an efficient sorting algorithm in general, but it can be quite useful when the list is almost sorted. That is the answer to the question above. In Shell sort, the passes with large gaps move elements close to their final positions cheaply, and a list that has been sorted for gap = i stays sorted for gap = i after it is later sorted for any smaller gap = j. The final pass with gap = 1 is therefore an insertion sort over an almost sorted list, and this is its main advantage.

Figure: Shell sort can make fewer exchanges than insertion sort.
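The relation between gaps can be checked with a short sketch. It assumes two made-up helpers, `gap_pass` (one gapped insertion pass) and `is_gap_sorted`. On the sample input it shows that a list sorted for gap 4 is not automatically sorted for gap 2, but that it stays sorted for gap 4 after the gap 2 pass.

```php
<?php
// Hypothetical helper: one gapped insertion pass.
function gap_pass(array $arr, int $gap): array
{
    $len = count($arr);
    for ($i = $gap; $i < $len; $i++) {
        $temp = $arr[$i];
        $j = $i;
        while ($j >= $gap && $arr[$j - $gap] > $temp) {
            $arr[$j] = $arr[$j - $gap];
            $j -= $gap;
        }
        $arr[$j] = $temp;
    }
    return $arr;
}

// Hypothetical helper: true if every element is >= the one $gap places before it.
function is_gap_sorted(array $arr, int $gap): bool
{
    $len = count($arr);
    for ($i = $gap; $i < $len; $i++) {
        if ($arr[$i - $gap] > $arr[$i]) {
            return false;
        }
    }
    return true;
}

$after4 = gap_pass(array(6, 5, 3, 1, 8, 7, 2, 4), 4);
$after2 = gap_pass($after4, 2);

var_dump(is_gap_sorted($after4, 4)); // bool(true)
var_dump(is_gap_sorted($after4, 2)); // bool(false)
var_dump(is_gap_sorted($after2, 2)); // bool(true)
var_dump(is_gap_sorted($after2, 4)); // bool(true): the gap 4 order survived
```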

### How to choose gap size

The downside of Shell sort is that we have to choose a good gap sequence for our list. This is not an easy task, because the best choice depends a lot on the input data. The good news is that some gap sequences are known to work well in the general case.

### Shell Sequence

Donald Shell proposed the sequence given by the formula FLOOR(N/2^{k}). For N = 1000 this gives: [500, 250, 125, 62, 31, 15, 7, 3, 1]
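As a small sketch, the sequence can be generated by halving repeatedly, which for positive integers gives the same values as FLOOR(N/2^{k}). The function name `shell_gaps` is an assumption made for this example.

```php
<?php
// Hypothetical helper: Shell's gaps for a list of $n items, largest first.
function shell_gaps(int $n): array
{
    $gaps = array();
    // intdiv() keeps the gap an integer: 1000 -> 500 -> 250 -> ... -> 1
    for ($gap = intdiv($n, 2); $gap > 0; $gap = intdiv($gap, 2)) {
        $gaps[] = $gap;
    }
    return $gaps;
}

echo implode(', ', shell_gaps(1000)) . "\n";
// 500, 250, 125, 62, 31, 15, 7, 3, 1
```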

### Pratt Sequence

Pratt proposed another sequence that grows more slowly than Shell’s sequence: the successive numbers of the form 2^{p}3^{q}, or [1, 2, 3, 4, 6, 8, 9, 12, …].
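One possible way to list these numbers is to multiply powers of 2 by powers of 3 and sort the result. The name `pratt_gaps` and the cutoff (only gaps smaller than the list length) are assumptions of this sketch.

```php
<?php
// Hypothetical helper: all numbers 2^p * 3^q smaller than $n, ascending.
function pratt_gaps(int $n): array
{
    $gaps = array();
    for ($p = 1; $p < $n; $p *= 2) {        // 1, 2, 4, 8, ...
        for ($q = $p; $q < $n; $q *= 3) {   // $p, 3 * $p, 9 * $p, ...
            $gaps[] = $q;
        }
    }
    sort($gaps);
    return $gaps;
}

echo implode(', ', pratt_gaps(13)) . "\n";
// 1, 2, 3, 4, 6, 8, 9, 12
```

For a list of 8 items this sketch returns 1, 2, 3, 4, 6.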

### Knuth Sequence

Knuth, on the other hand, proposed his own sequence following the formula (3^{k} - 1) / 2, or [1, 4, 13, 40, 121, …]
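The values of (3^{k} - 1) / 2 can also be produced with the recurrence h = 3h + 1, starting from 1. The sketch below assumes a made-up function name, `knuth_gaps`, and simply stops at the list length; other cutoffs are possible.

```php
<?php
// Hypothetical helper: Knuth's gaps smaller than $n, ascending.
function knuth_gaps(int $n): array
{
    $gaps = array();
    // 1, 4, 13, 40, 121, ... each value is 3 * previous + 1
    for ($gap = 1; $gap < $n; $gap = 3 * $gap + 1) {
        $gaps[] = $gap;
    }
    return $gaps;
}

echo implode(', ', knuth_gaps(200)) . "\n";
// 1, 4, 13, 40, 121
```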

Of course, there are many other gap sequences, proposed by various developers and researchers, but the effectiveness of each one strongly depends on the input data. Before looking at the complexity of Shell sort, let’s first see its implementation.

## Implementation

Here’s a Shell sort implementation in PHP using the Pratt gap sequence. Note that for this data set other gap sequences may turn out to be a better choice.

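```php
$input = array(6, 5, 3, 1, 8, 7, 2, 4);

function shell_sort($arr)
{
    // Pratt gaps smaller than the list length, in ascending order;
    // array_pop() takes them from the end, so the largest gap comes first
    $gaps = array(1, 2, 3, 4, 6);
    $gap = array_pop($gaps);
    $len = count($arr);

    // array_pop() returns null once $gaps is empty, which ends the loop
    while ($gap > 0) {
        // insertion sort over elements that lie $gap positions apart
        for ($i = $gap; $i < $len; $i++) {
            $temp = $arr[$i];
            $j = $i;
            // shift larger elements $gap positions to the right
            while ($j >= $gap && $arr[$j - $gap] > $temp) {
                $arr[$j] = $arr[$j - $gap];
                $j -= $gap;
            }
            $arr[$j] = $temp;
        }
        $gap = array_pop($gaps);
    }

    return $arr;
}

// 1, 2, 3, 4, 5, 6, 7, 8
shell_sort($input);
```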

It’s easy to change this code to work with the Shell sequence.

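```php
$input = array(6, 5, 3, 1, 8, 7, 2, 4);

function shell_sort($arr)
{
    $len = count($arr);
    // intdiv() keeps $gap an integer (floor() would return a float)
    $gap = intdiv($len, 2);

    while ($gap > 0) {
        // insertion sort over elements that lie $gap positions apart
        for ($i = $gap; $i < $len; $i++) {
            $temp = $arr[$i];
            $j = $i;
            // shift larger elements $gap positions to the right
            while ($j >= $gap && $arr[$j - $gap] > $temp) {
                $arr[$j] = $arr[$j - $gap];
                $j -= $gap;
            }
            $arr[$j] = $temp;
        }
        // halve the gap: 4, 2, 1, then 0 ends the loop
        $gap = intdiv($gap, 2);
    }

    return $arr;
}

// 1, 2, 3, 4, 5, 6, 7, 8
shell_sort($input);
```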

## Complexity

Yet again, we can’t state one exact complexity for this algorithm, because it depends on the gap sequence. However, we can give the worst-case complexity of Shell sort for the sequences of Knuth, Pratt and Donald Shell. For Shell’s sequence the complexity is O(n^{2}), for the Knuth sequence it is O(n^{3/2}), and for Pratt’s sequence it is O(n*log^{2}(n)). Asymptotically Pratt’s sequence is the best of the three, but it contains many more gaps and so makes many more passes over the list; the Knuth sequence is often used in practice. You can see the difference on the diagram below.

## Application

Like insertion sort and bubble sort, Shell sort is not very efficient compared to quicksort or merge sort. The good thing is that it is quite easy to implement (though not easier than insertion sort), but in general it should be avoided for large data sets. Perhaps the main advantage of Shell sort is that the list is first sorted for gaps greater than 1, which leads to fewer exchanges than insertion sort.
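The claim about fewer exchanges can be illustrated by counting element shifts. The sketch below uses a made-up function, `count_moves`, and a reversed list of 16 items as a deliberately bad case for insertion sort; passing the single gap 1 turns it into plain insertion sort. The numbers apply to this input only.

```php
<?php
// Hypothetical helper: runs gapped insertion passes for each gap in $gaps
// and returns how many times an element was shifted to the right.
function count_moves(array $arr, array $gaps): int
{
    $moves = 0;
    $len = count($arr);
    foreach ($gaps as $gap) {
        for ($i = $gap; $i < $len; $i++) {
            $temp = $arr[$i];
            $j = $i;
            while ($j >= $gap && $arr[$j - $gap] > $temp) {
                $arr[$j] = $arr[$j - $gap];
                $j -= $gap;
                $moves++;
            }
            $arr[$j] = $temp;
        }
    }
    return $moves;
}

$reversed = range(16, 1); // 16, 15, ..., 1

echo count_moves($reversed, array(1)) . "\n";          // 120 (insertion sort)
echo count_moves($reversed, array(8, 4, 2, 1)) . "\n"; // 32  (Shell's gaps)
```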

The images give the impression that in Shell sort only every i’th number is compared and then swapped if necessary.

The image presents the algorithm the following way:

If the gap is 3:

For all elements in the array, compare only every third number,

whereas the Shell algorithm says:

If the gap is 3:

for every x in the array:

compare x with y (which lies 3 items ahead)

This video shows how the algorithm works:

http://www.youtube.com/watch?v=qzXAVXddcPU

Can I say that Shell sort just reduces the # of swaps, but the # of compares is higher than in insertion sort?

Because I think that even if the input is sorted, the algorithm needs to compare the whole sequence once when gap = 1, and the passes for the other gaps can only add to the # of compares.
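One way to check this is to count the comparisons directly. The sketch below assumes a made-up function, `count_comparisons`, and uses 8 items with the gaps 4, 2, 1. On sorted input Shell sort does make more comparisons than insertion sort, while on the reversed input it makes fewer. The figures hold for these inputs only.

```php
<?php
// Hypothetical helper: runs gapped insertion passes for each gap in $gaps
// and returns how many element-to-element comparisons were made.
function count_comparisons(array $arr, array $gaps): int
{
    $comparisons = 0;
    $len = count($arr);
    foreach ($gaps as $gap) {
        for ($i = $gap; $i < $len; $i++) {
            $temp = $arr[$i];
            $j = $i;
            while ($j >= $gap) {
                $comparisons++;
                if ($arr[$j - $gap] <= $temp) {
                    break; // element is already in place for this gap
                }
                $arr[$j] = $arr[$j - $gap];
                $j -= $gap;
            }
            $arr[$j] = $temp;
        }
    }
    return $comparisons;
}

$sorted = range(1, 8);
$reversed = range(8, 1);

echo count_comparisons($sorted, array(1)) . "\n";         // 7
echo count_comparisons($sorted, array(4, 2, 1)) . "\n";   // 17
echo count_comparisons($reversed, array(1)) . "\n";       // 28
echo count_comparisons($reversed, array(4, 2, 1)) . "\n"; // 22
```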