Find it on GitHub
Perhaps right after the addition at school we’ve learned how to multiply two numbers. This algorithm isn’t as easy as addition, but besides that we’re so familiar with it and that we even don’t recognize it as an “algorithm”. We just know it by heart.
However, as I already said, multiplication is a bit more difficult than addition. This algorithm is interesting because for several reasons. First of all, let’s compare multiplication in binary and decimal.
So, let’s see how to multiply two numbers.
Multiplying and adding is practically the same thing, so where’s the difference?
It’s clear that a product of two numbers (each with N digits) can be represented as N sums, as we can see on the next picture.
Now, for larger integers each time we shift left the next intermediate sum.
We know how to add two integers using a perfectly simple and useful algorithm learned from school or even earlier. This is perhaps one of the very first techniques we learn in mathematics. However we need to answer few questions. First of all do computers use the same technique, since they use binary representation of numbers? Is there a faster algorithm used by computers? What about boundaries and large integers?
Let’s start by explaining how we humans add two numbers. An important fact is that by adding two single-digit numbers we get at most two digit number. This can be proven by simply realizing that 9+9 = 18. This fact lays down in the way we add integers. Here’s how.
We just line-up the integers on their right-most digit and we start adding them in a column. In case we got a sum greater than 9 (let’s say 14) we keep only the right-most digit (the 4) and the 1 is added to the next sum.
Thus we get to the simple fact that by adding two n-digit integers we can have either an n-digit integer or a n+1 digit integer. As an example we see that by adding 53 + 35 (two 2-digit integers) we get 88, which is again 2-digit integer, but 53 + 54 result in 107, which is 3 digit integer.
That fact is practically true, as I mentioned above, for each pair of n-digit integers.
What about binary numbers?
In fact binaries can be added by using the exact same algorithm. At the example below we add two integers represented as binary numbers.
As a matter of fact this algorithm is absolutely wonderful, because it works not only on decimals and binaries but in any base B.
Of course computers tend to perform better when adding integers that “fit” the machine word. However as we can see later this isn’t always the case and sometimes we need to add larger numbers that exceed the type boundaries.
What about big integers?
Since we know how to add “small” integers, it couldn’t be so hard to apply the same algorithm on big integers. The only problem is that the addition will be slower and sometimes (done by humans) can be error prone.
So practically the algorithm is the same, but we can’t just put a 1 billion integer into a standard computer type INT, right? That means that the tricky part here is the way we represent integers in our application. A common solution is to store the “big” integer into an array, thus each digit will be a separate array item. Then the operation of addition will be simple enough to be applied.
When we talk about an algorithm that is so well known by every human being (or almost every) a common question is “is there anything faster” or “do computers use a different algorithm”. The answer may be surprising to someone, but unfortunately that is the fastest (optimal) algorithm for number addition.
Practically there’s nothing to optimize here. We just read the two n-digit numbers (O(n)), we apply “simple” addition to each digit and we carry over the 1 from the sums greater than 9 to the next “simple” addition. We don’t have loops or any complex operation in order to search for an optimization niche.
It’s strange how often this algorithm is asked on coding interviews. Perhaps the catch is whether the interviewed person will start to look for a faster approach?! Thus is cool to know that this algorithm is optimal.
Sometimes we may ask ourselves why we humans use decimals. It’s considered because we have 10 fingers on our hands and this is perhaps true.
An interesting fact though, is that the Mayas (who barely predicted the end of the world a couple of weeks ago) used a system of a base 20. That is logical, since we have not 10, but total of 20 fingers considering our legs.
Finally, this algorithm may seem to easy to be explained but it lays down in more complex algorithms.
What’s the fastest way to sort the following sequence [9, 3, 0, 5, 4, 1, 2, 6, 8, 7]? Well, the question is a bit tricky since the input is somehow “predefined”. First of all we have only integers, and fortunately they are all different. That’s great and we know that in practice it’s almost impossible to count on such lucky coincidence. However here we can sort the sequence very quickly.
First of all we can pass through all these integers and by using an auxiliary array we can just put them at their corresponding index. We know in advance that that is going to work really well, because they are all different.
There is only one major problem in this solution. That’s because we assume all the integers are different. If not – we can just put all them in one single corresponding index.
That is why we can use bucket sort.
Bucket sort it’s the perfect sorting algorithm for the sequence above. We must know in advance that the integers are fairly well distributed over an interval (i, j). Then we can divide this interval in N equal sub-intervals (or buckets). We’ll put each number in its corresponding bucket. Finally for every bucket that contains more than one number we’ll use some linear sorting algorithm.
The thing is that we know that the integers are well distributed, thus we expect that there won’t be many buckets with more than one number inside.
That is why the sequence [1, 2, 3, 2, 1, 2, 3, 1] won’t be sorted faster than [4, 3, 1, 2, 9, 5, 4, 8].
1. Let n be the length of the input list L; 2. For each element i from L 2.1. If B[i] is not empty 2.1.1. Put A[i] into B[i] using insertion sort; 2.1.2. Else B[i] := A[i] 3. Concatenate B[i .. n] into one sorted list;
The complexity of bucket sort isn’t constant depending on the input. However in the average case the complexity of the algorithm is O(n + k) where n is the length of the input sequence, while k is the number of buckets.
The problem is that its worst-case performance is O(n^2) which makes it as slow as bubble sort.
As the other two linear time sorting algorithms (radix sort and counting sort) bucket sort depends so much on the input. The main thing we should be aware of is the way the input data is dispersed over an interval.
Another crucial thing is the number of buckets that can dramatically improve or worse the performance of the algorithm.
This makes bucket sort ideal in cases we know in advance that the input is well dispersed.