One of the two main algorithms in finding the minimum spanning tree algorithms is the algorithm of Kruskal. Before getting into the details, let’s get back to the principles of the minimum spanning tree.
We have a weighted graph and of all spanning trees we’d like to find the one with minimal weight. As an example on the picture above you see a spanning tree (T) on the graph (G), but that isn’t the minimum weight spanning tree!
The binary search tree is a very useful data structure, where searching can be significantly faster than searching into a linked list. However in some cases searching into a binary tree can be as slow as searching into a linked list and this mainly depends on the input sequence. Indeed in case the input is sorted the binary tree will seem much like a linked list and the search will be slow.
To overcome this we must change a bit the data structure in order to stay well balanced. It’s intuitively clear that the searching process will be better if the tree is well branched. This is when finding an item will become faster with minimal effort.
Since we know how to construct a binary search tree the only thing left is to keep it balanced. Obviously we will need to re-balance the tree on each insert and delete, which will make this data structure more difficult to maintain compared to non-balanced search trees, but searching into it will be significantly faster. Continue reading Computer Algorithms: Balancing a Binary Search Tree→
Constructing a linked list is a fairly simple task. Linked lists are a linear structure and the items are located one after another, each pointing to its predecessor and its successor. Almost every operation is easy to code in few lines and doesn’t require advanced skills. Operations like insert, delete, etc. over linked lists are performed in a linear time. Of course on small data sets this works fine, but as the data grows these operations, especially the search operation becomes too slow.
Indeed searching in a linked list has a linear complexity and in the worst case we must go through the entire list in order to find the desired element. The worst case is when the item doesn’t belong to the list and we must check every single item of the list even the last one without success. This approach seems much like the sequential search over arrays. Of course this is bad when we talk about large data sets.
Every developer knows that computer algorithms are tightly related to data structures. Indeed many of the algorithms depend on a data structures and can be very effective for some data structures and ineffective for others. A typical example of this is the heapsort algorithm, which depends on a data structure called “heap”. In this case although the stack and the queue are data structures instead of pure algorithms it’s imporant to understand their structure and the way they operate over data.
However, before we continue with the concrete realization of the stack and the queue, let’s first take a look on the definition of this term. A data structure is a logical abstraction that “models” the real world and presents (stores) our data in a specific format. The access to this data structure is often predefined thus we can access directly every item containing data. This help us to perform a different kind of tasks and operations over different kind of data structures – insert, delete, search, etc.. A typical data structures are the stack, the queue, the linked list and the tree.
All these structures help us perform specific operations effectively. For instance searching in a balanced tree is faster than searching in a linked list.
It is also very important to note that data structures can be represented in many different ways. We can model them using arrays or pointers, as shown in this post. In fact the most important thing is to represent the logical structure of the data structure you’re modeling. Thus the stack is a structure that follows the LIFO (Last In First Out) principle and it doesn’t matter how it is represented in our program (whether it will be coded with an array or with pointers). The important thing into a stack representation is to follow the LIFO principle correctly. In this case if the stack is an array only its top should be accessible and the only operation must be inserting new top of the stack. Continue reading Computer Algorithms: Stack and Queue→
Run-length encoding is a data compression algorithm that helps us encode large runs of repeating items by only sending one item from the run and a counter showing how many times this item is repeated. Unfortunately this technique is useless when trying to compress natural language texts, because they don’t have long runs of repeating elements. In the other hand RLE is useful when it comes to image compression, because images happen to have long runs pixels with identical color.
As you can see on the following picture we can compress consecutive pixels by only replacing each run with one pixel from it and a counter showing how many items it contains.
In this case we can save only counters for pixels that are repeated more than once. Such the input stream “aaaabbaba” will be compressed as “ababa”.
Actually there are several ways run-length encoding can be used for image compression. A possible way of compressing a picture can be either row by row or column by column, as it is shown on the picture below.