<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>stoimen&#039;s web log</title>
	<atom:link href="http://www.stoimen.com/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.stoimen.com/blog</link>
	<description>about web development</description>
	<lastBuildDate>Mon, 30 Jan 2012 18:27:26 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Computer Algorithms: Data Compression with Relative Encoding</title>
		<link>http://www.stoimen.com/blog/2012/01/30/computer-algorithms-data-compression-with-relative-encoding/</link>
		<comments>http://www.stoimen.com/blog/2012/01/30/computer-algorithms-data-compression-with-relative-encoding/#comments</comments>
		<pubDate>Mon, 30 Jan 2012 18:27:26 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[Africa]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Algorithmic efficiency]]></category>
		<category><![CDATA[Application This algorithm]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[Data compression]]></category>
		<category><![CDATA[data compression algorithm]]></category>
		<category><![CDATA[Google Inc.]]></category>
		<category><![CDATA[JSON]]></category>
		<category><![CDATA[Lossless data compression]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Run-length encoding]]></category>
		<category><![CDATA[San Francisco]]></category>
		<category><![CDATA[Technology/Internet]]></category>
		<category><![CDATA[web server]]></category>
		<category><![CDATA[west coast]]></category>
		<category><![CDATA[Yahoo! Communications Europe Ltd.]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=2658</guid>
		<description><![CDATA[Overview Relative encoding is another data compression algorithm. While run-length encoding, bitmap encoding and diagram and pattern substitution were trying to reduce repeating data, with relative encoding the goal is a bit different. Indeed run-length encoding was searching for long &#8230; <a href="http://www.stoimen.com/blog/2012/01/30/computer-algorithms-data-compression-with-relative-encoding/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/' rel='bookmark' title='Computer Algorithms: Data Compression with Run-length Encoding'>Computer Algorithms: Data Compression with Run-length Encoding</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/16/computer-algorithms-data-compression-with-bitmaps/' rel='bookmark' title='Computer Algorithms: Data Compression with Bitmaps'>Computer Algorithms: Data Compression with Bitmaps</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/23/computer-algorithms-data-compression-with-diagram-encoding-and-pattern-substitution/' rel='bookmark' title='Computer Algorithms: Data Compression with Diagram Encoding and Pattern Substitution'>Computer Algorithms: Data Compression with Diagram Encoding and Pattern Substitution</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>Overview</h2>
<p>Relative encoding is another data compression algorithm. While <a href="http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/" title="Computer Algorithms: Data Compression with Run-length Encoding">run-length encoding</a>, <a href="http://www.stoimen.com/blog/2012/01/16/computer-algorithms-data-compression-with-bitmaps/" title="Computer Algorithms: Data Compression with Bitmaps">bitmap encoding</a> and <a href="http://www.stoimen.com/blog/2012/01/23/computer-algorithms-data-compression-with-diagram-encoding-and-pattern-substitution/" title="Computer Algorithms: Data Compression with Diagram Encoding and Pattern Substitution">diagram and pattern substitution</a> were trying to reduce repeating data, with relative encoding the goal is a bit different. Indeed run-length encoding was searching for long runs of repeating elements, while pattern substitution and bitmap encoding were trying to “map” where the repetitions happen to occur. </p>
<p>The only problem with these algorithms is that the input stream of data is not always constructed out of repeating elements. It is clear that if the input stream contains many repeating elements there must be some way of reducing them. However, that doesn’t mean that we cannot compress data if there are no repetitions. It all depends on the data. Let’s say we have the following stream to compress.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">3</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">4</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">5</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">6</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">7</span></pre></div></div>

<p>We can hardly imagine how this stream of data can be compressed. The same problem may occur when trying to compress the alphabet. Indeed, the letters of the alphabet are the very base of words, the minimal parts of word construction, so it&#8217;s hard to compress them.</p>
<p>Fortunately this isn’t always true. An algorithm that tries to deal with non-repeating data is relative encoding. Let’s see the following input stream &#8211; years from a given decade (the 1990s).</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1999</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1998</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1993</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1992</span></pre></div></div>

<p>Here we have 39 characters and we can reduce them. A natural approach is to remove the leading “19” as we humans often do.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #cc66cc;">91</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">91</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">99</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">98</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">91</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">93</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">92</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">92</span></pre></div></div>

<p>Now we have a shorter string, but we can go even further by keeping only the first year. All other years will be stored relative to this year.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #cc66cc;">91</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">8</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">7</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span></pre></div></div>
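<p>The two steps above can be sketched in a few lines of JavaScript. This is only an illustration: the helper name <code>relativeToFirst</code> is hypothetical, and the character counts assume the values are joined with commas and no spaces, as in the listings above.</p>

```javascript
// Hypothetical helper: keep the first value and store every other value
// as its difference from that first value.
function relativeToFirst(values) {
  const base = values[0];
  return values.map((value, index) => (index === 0 ? base : value - base));
}

const years = [1991, 1991, 1999, 1998, 1991, 1993, 1992, 1992];
const shortYears = years.map((year) => year - 1900); // drop the leading "19"
const encoded = relativeToFirst(shortYears);

console.log(years.join(',').length);      // 39
console.log(shortYears.join(',').length); // 23
console.log(encoded.join(','));           // 91,0,8,7,0,2,1,1
console.log(encoded.join(',').length);    // 16
```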

<p>Now the volume of transferred data is reduced a lot (from 39 to 16 characters &#8211; more than 50%). However, there are some questions we need to answer first, because the stream won&#8217;t always be formatted in such a pretty way. How about the next character stream?</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #cc66cc;">91</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">94</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">95</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">95</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">98</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">100</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">101</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">102</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">105</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">110</span></pre></div></div>

<p>We see that the value 100 lies somewhere in the middle of the interval, so it is handy to use it as a base value for the relative encoding. Thus the stream above will become:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #339933;">-</span><span style="color: #cc66cc;">9</span><span style="color: #339933;">,-</span><span style="color: #cc66cc;">6</span><span style="color: #339933;">,-</span><span style="color: #cc66cc;">5</span><span style="color: #339933;">,-</span><span style="color: #cc66cc;">5</span><span style="color: #339933;">,-</span><span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">100</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">5</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">10</span></pre></div></div>
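<p>One way to pick such a &#8220;middle&#8221; base automatically is to take the median of the stream. The sketch below is an assumption, not part of the original format: it stores the base separately instead of leaving it inline, so the element equal to the base becomes 0. The name <code>encodeAroundBase</code> is hypothetical.</p>

```javascript
// Hypothetical helper: use the (upper) median as the base value and
// store every element as its offset from that base.
function encodeAroundBase(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const base = sorted[Math.floor(sorted.length / 2)];
  return { base, offsets: values.map((value) => value - base) };
}

const stream = [91, 94, 95, 95, 98, 100, 101, 102, 105, 110];
const result = encodeAroundBase(stream);

console.log(result.base);              // 100
console.log(result.offsets.join(',')); // -9,-6,-5,-5,-2,0,1,2,5,10
```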

<p>The problem is that we can’t decide which value will be the <strong>base value</strong> so easily. What if the data was dispersed in a different way?</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #cc66cc;">96</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">97</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">98</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">99</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">100</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">101</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">102</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">103</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">999</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1000</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1001</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1002</span></pre></div></div>

<p>Now the value of “100” isn’t useful, because compressing the stream will get something like this:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #339933;">-</span><span style="color: #cc66cc;">4</span><span style="color: #339933;">,-</span><span style="color: #cc66cc;">3</span><span style="color: #339933;">,-</span><span style="color: #cc66cc;">2</span><span style="color: #339933;">,-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">100</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">899</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">900</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">901</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">902</span></pre></div></div>

<p>Grouping the relative values around several base values is far handier.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #009900;">&#40;</span><span style="color: #339933;">-</span><span style="color: #cc66cc;">4</span><span style="color: #339933;">,-</span><span style="color: #cc66cc;">3</span><span style="color: #339933;">,-</span><span style="color: #cc66cc;">2</span><span style="color: #339933;">,-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">100</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#40;</span><span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1000</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">&#41;</span></pre></div></div>
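<p>A possible way to form such groups is sketched below. It makes two assumptions that are not in the listing above: a new group starts whenever the jump from the previous value exceeds a chosen threshold (<code>maxGap</code>), and each group uses its first value as the base rather than a round number such as 100 or 1000. The function name is hypothetical.</p>

```javascript
// Hypothetical helper: start a new group whenever the jump from the
// previous value exceeds maxGap; each group uses its first value as base.
function groupedRelativeEncoding(values, maxGap) {
  const groups = [];
  let current = null;
  values.forEach((value, index) => {
    if (current === null || Math.abs(value - values[index - 1]) > maxGap) {
      current = { base: value, offsets: [] };
      groups.push(current);
    } else {
      current.offsets.push(value - current.base);
    }
  });
  return groups;
}

const scattered = [96, 97, 98, 99, 100, 101, 102, 103, 999, 1000, 1001, 1002];
const groups = groupedRelativeEncoding(scattered, 50);

console.log(JSON.stringify(groups));
// [{"base":96,"offsets":[1,2,3,4,5,6,7]},{"base":999,"offsets":[1,2,3]}]
```

<p>The threshold has to be tuned to the data: too small and almost every value becomes its own group, too large and the offsets grow long again.</p>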

<p>However, deciding which value will be the base value isn’t that easy. The encoding format is not trivial either. On the other hand, this type of encoding can be useful in some specific cases, as we can see below.<br />
<span id="more-2658"></span></p>
<h2>Implementation</h2>
<p>The implementation of this algorithm depends on the specific task and the format of the data stream. Assuming that we have to transfer the stream of years in JSON from a web server to a browser, here’s a short PHP snippet.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">// JSON: [1991,1991,1999,1998,1999,1998,1995,1997,1994,1993]</span>
<span style="color: #000088;">$years</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1999</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1998</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1999</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1998</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1995</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1997</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1994</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1993</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">function</span> relative_encoding<span style="color: #009900;">&#40;</span><span style="color: #000088;">$input</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$output</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$inputLength</span> <span style="color: #339933;">=</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$input</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #000088;">$base</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$input</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #000088;">$output</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$base</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$inputLength</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$output</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$input</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$base</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #000088;">$output</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// JSON: [1991,0,8,7,8,7,4,6,3,2]</span>
<span style="color: #b1b100;">echo</span> <span style="color: #990000;">json_encode</span><span style="color: #009900;">&#40;</span>relative_encoding<span style="color: #009900;">&#40;</span><span style="color: #000088;">$years</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>
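<p>On the browser side the stream has to be restored. A minimal sketch of a decoder in JavaScript might look like the following; it assumes the layout produced by the PHP snippet above (the first element is the base, every other element is an offset from it), and the name <code>relativeDecoding</code> is hypothetical.</p>

```javascript
// Hypothetical counterpart of relative_encoding(): the first element is
// the base, every following element is an offset from that base.
function relativeDecoding(encoded) {
  const base = encoded[0];
  return encoded.map((value, index) => (index === 0 ? base : base + value));
}

// JSON as produced by the PHP snippet.
const payload = '[1991,0,8,7,8,7,4,6,3,2]';
const decodedYears = relativeDecoding(JSON.parse(payload));

console.log(decodedYears.join(','));
// 1991,1991,1999,1998,1999,1998,1995,1997,1994,1993
```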

<h2>Application</h2>
<p>This algorithm may be useful in many cases; here’s one of them. There are plenty of map applications around the web. Some products, such as <a href="http://maps.google.com/" title="Google Maps" target="_blank">Google Maps</a>, <a href="http://maps.yahoo.com/" title="Yahoo! Maps" target="_blank">Yahoo! Maps</a> and <a href="http://www.bing.com/maps/" title="Bing Maps" target="_blank">Bing Maps</a>, are quite famous, and there are also very useful open source projects such as <a href="http://www.openstreetmap.org/" title="OpenStreetMap" target="_blank">OpenStreetMap</a>. Thousands of web sites use these apps. </p>
<p>A typical use case is to transfer lots of geo coordinates from a web server to a browser using JSON. Any geo point on Earth is expressed relative to the point (0,0), which is located near the west coast of Africa. However, at large zoom levels, when there are tons of markers, we can transfer the information with relative encoding.</p>
<p>For instance, the following diagram shows San Francisco with some markers on it. Their coordinates are relative to the point (0,0) on Earth.</p>
<div id="attachment_2682" class="wp-caption alignnone" style="width: 829px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/FullLatLononSanFrancisco.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/FullLatLononSanFrancisco.png" alt="San Francisco map with full lat and lon markers" title="FullLatLononSanFrancisco" width="819" height="456" class="size-full wp-image-2682" /></a><p class="wp-caption-text">Map markers can be relative to the (0, 0) point on Earth, which can be sometimes useless.</p></div>
<p>It may be far more useful to encode those markers relative to the center of the city, which saves some space.</p>
<div id="attachment_2681" class="wp-caption alignnone" style="width: 829px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/SanFranciscoMap.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/SanFranciscoMap.png" alt="San Francisco map with relative encoded markers" title="SanFranciscoMap" width="819" height="456" class="size-full wp-image-2681" /></a><p class="wp-caption-text">Relative encoding can be useful for map markers on large zoom level!</p></div>
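<p>As a rough sketch of the idea, the markers can be sent as small integer offsets from a city center. Everything specific here is an assumption made for illustration: the coordinates are approximate, the marker positions are made up, the function names are hypothetical, and the offsets are scaled to five decimal places (roughly a metre) so that they stay integers and avoid floating-point noise.</p>

```javascript
// Assumption: five decimal places of a degree are precise enough.
const SCALE = 1e5;

// Hypothetical helper: store each marker as an integer offset from center.
function encodeMarkers(center, markers) {
  return {
    center,
    offsets: markers.map(([lat, lon]) => [
      Math.round((lat - center[0]) * SCALE),
      Math.round((lon - center[1]) * SCALE),
    ]),
  };
}

// Hypothetical helper: restore the absolute coordinates in the browser.
function decodeMarkers({ center, offsets }) {
  return offsets.map(([dLat, dLon]) => [
    center[0] + dLat / SCALE,
    center[1] + dLon / SCALE,
  ]);
}

const sanFrancisco = [37.7749, -122.4194]; // approximate city center
const markers = [[37.7793, -122.4193], [37.7694, -122.4862]]; // illustrative
const encodedMarkers = encodeMarkers(sanFrancisco, markers);

console.log(JSON.stringify(markers));                // [[37.7793,-122.4193],[37.7694,-122.4862]]
console.log(JSON.stringify(encodedMarkers.offsets)); // [[440,10],[-550,-6680]]
```

<p>The saving per marker is small, but it adds up when hundreds of markers are loaded at once. The center itself has to be sent once with the payload.</p>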
<p>However, this type of compression can be tricky, for example when dragging the map and updating the marker array. We must also group markers if we have to load more than one city. That’s why we must be careful when implementing it. On the other hand, it can be very useful &#8211; for instance, on the initial load of the map we can reduce the data and speed up the load time. </p>
<p>The thing is that with relative encoding we can save only the changes to a base value (data) &#8211; something like version control systems do &#8211; thus reducing data transfer and load. Here&#8217;s a graphical example. In the first case, on the diagram below, we can see that each item is stored on its own. It doesn&#8217;t depend on the adjacent items and is completely independent of them.</p>
<div id="attachment_2694" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/chart_11.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/chart_11.png" alt="Non-relative encoding" title="Non-relative encoding" width="600" height="371" class="size-full wp-image-2694" /></a><p class="wp-caption-text"> </p></div>
<p>However, we can keep full info only for the first item and store every other item relative to it, as in the diagram below.</p>
<div id="attachment_2695" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/chart_21.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/chart_21.png" alt="Relative encoding" title="Relative encoding" width="600" height="371" class="size-full wp-image-2695" /></a><p class="wp-caption-text"> </p></div>
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/' rel='bookmark' title='Computer Algorithms: Data Compression with Run-length Encoding'>Computer Algorithms: Data Compression with Run-length Encoding</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/16/computer-algorithms-data-compression-with-bitmaps/' rel='bookmark' title='Computer Algorithms: Data Compression with Bitmaps'>Computer Algorithms: Data Compression with Bitmaps</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/23/computer-algorithms-data-compression-with-diagram-encoding-and-pattern-substitution/' rel='bookmark' title='Computer Algorithms: Data Compression with Diagram Encoding and Pattern Substitution'>Computer Algorithms: Data Compression with Diagram Encoding and Pattern Substitution</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/01/30/computer-algorithms-data-compression-with-relative-encoding/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>JavaScript Performance: for vs. while</title>
		<link>http://www.stoimen.com/blog/2012/01/24/javascript-performance-for-vs-while/</link>
		<comments>http://www.stoimen.com/blog/2012/01/24/javascript-performance-for-vs-while/#comments</comments>
		<pubDate>Tue, 24 Jan 2012 14:20:05 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[javascript]]></category>
		<category><![CDATA[Computer programming]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[Control flow]]></category>
		<category><![CDATA[firebug]]></category>
		<category><![CDATA[Increment]]></category>
		<category><![CDATA[JavaScript programming language]]></category>
		<category><![CDATA[Software engineering]]></category>
		<category><![CDATA[software/hardware]]></category>
		<category><![CDATA[Technology/Internet]]></category>
		<category><![CDATA[While loop]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=2635</guid>
		<description><![CDATA[JavaScript Loops If you have read some performance tests on JavaScript loops, you may have heard that &#8220;while&#8221; is faster than &#8220;for&#8221;. However the question is how much faster is &#8220;while&#8221;? Here are some results, but first let&#8217;s take a look &#8230; <a href="http://www.stoimen.com/blog/2012/01/24/javascript-performance-for-vs-while/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2010/02/02/firebugs-console-time-accuracy/' rel='bookmark' title='Firebug&#8217;s console.time() accuracy'>Firebug&#8217;s console.time() accuracy</a></li>
<li><a href='http://www.stoimen.com/blog/2010/02/02/profiling-javascript-with-firebug-console-profile-console-time/' rel='bookmark' title='Profiling JavaScript with Firebug. console.profile() &amp; console.time()!'>Profiling JavaScript with Firebug. console.profile() &#038; console.time()!</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/05/php-performance-bitwise-division/' rel='bookmark' title='PHP Performance: Bitwise Division'>PHP Performance: Bitwise Division</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>JavaScript Loops</h2>
<p>If you have read some performance tests on JavaScript loops, you may have heard that &#8220;while&#8221; is faster than &#8220;for&#8221;. However, the question is how much faster &#8220;while&#8221; is. Here are some results, but first let&#8217;s take a look at the JavaScript code. </p>
<h4>The <strong>for</strong> experiment</h4>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;">console.<span style="color: #660066;">time</span><span style="color: #009900;">&#40;</span><span style="color: #3366CC;">'for'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000066; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #003366; font-weight: bold;">var</span> i <span style="color: #339933;">=</span> <span style="color: #CC0000;">0</span><span style="color: #339933;">;</span> i <span style="color: #339933;">&lt;</span> <span style="color: #CC0000;">10000000</span><span style="color: #339933;">;</span> i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	i <span style="color: #339933;">/</span> <span style="color: #CC0000;">2</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
console.<span style="color: #660066;">timeEnd</span><span style="color: #009900;">&#40;</span><span style="color: #3366CC;">'for'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<h4>The <strong>while</strong> experiment</h4>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;">console.<span style="color: #660066;">time</span><span style="color: #009900;">&#40;</span><span style="color: #3366CC;">'while'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #003366; font-weight: bold;">var</span> i <span style="color: #339933;">=</span> <span style="color: #CC0000;">0</span><span style="color: #339933;">;</span>
<span style="color: #000066; font-weight: bold;">while</span> <span style="color: #009900;">&#40;</span>i<span style="color: #339933;">++</span> <span style="color: #339933;">&lt;</span> <span style="color: #CC0000;">10000000</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	i <span style="color: #339933;">/</span> <span style="color: #CC0000;">2</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
console.<span style="color: #660066;">timeEnd</span><span style="color: #009900;">&#40;</span><span style="color: #3366CC;">'while'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Note &#8211; these tests are performed and measured with Firebug on Firefox.</p>
<h2>Results</h2>
<p>You&#8217;ll get different results every time you run this snippet. They also depend on the environment and the software/hardware specs. That is why I ran each test 10 times and took the average value. Here are the values of my performance tests. Note that both <strong>for</strong> and <strong>while</strong> perform 10,000,000 iterations.</p>
<p><iframe width='576' height='263' frameborder='0' src='https://docs.google.com/spreadsheet/pub?hl=en_US&#038;hl=en_US&#038;key=0Avxdu4aY4-UGdDktZEJHYUh5RWFLd1prOG52dDdTdUE&#038;single=true&#038;gid=0&#038;output=html&#038;widget=true'></iframe></p>
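<p>The averaging procedure described above could be automated with a small helper like the one below. This is a sketch, not the harness used for the table: it measures with <code>Date.now()</code> instead of Firebug&#8217;s <code>console.time()</code>, the name <code>averageTime</code> is hypothetical, and modern engines may optimize the empty loop bodies differently, so the numbers will not match the ones reported here.</p>

```javascript
// Hypothetical helper: run fn `runs` times and return the mean duration in ms.
function averageTime(fn, runs) {
  let total = 0;
  for (let run = 0; run < runs; run++) {
    const start = Date.now();
    fn();
    total += Date.now() - start;
  }
  return total / runs;
}

// The "for" experiment from above, wrapped in a function.
function forLoop() {
  for (var i = 0; i < 10000000; i++) {
    i / 2;
  }
}

// The "while" experiment from above, wrapped in a function.
function whileLoop() {
  var i = 0;
  while (i++ < 10000000) {
    i / 2;
  }
}

console.log('for:', averageTime(forLoop, 10), 'ms');
console.log('while:', averageTime(whileLoop, 10), 'ms');
```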
<h2>And the Winner Is</h2>
<p><strong>While</strong> is the winner with an average result of 83.5 milliseconds, while the &#8220;for&#8221; result averages 88 milliseconds.</p>
<p>As the diagram below shows, <strong>the while loop is slightly faster</strong>. However, we should be aware that these performance gains become significant only for a large number of iterations!</p>
<div id="attachment_2668" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/chart_1-1.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/chart_1-1.png" alt="JavaScript Performance: for vs. while" title="JavaScript Performance: for vs. while" width="600" height="371" class="size-full wp-image-2668" /></a><p class="wp-caption-text">JavaScript Performance: for vs. while</p></div>
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2010/02/02/firebugs-console-time-accuracy/' rel='bookmark' title='Firebug&#8217;s console.time() accuracy'>Firebug&#8217;s console.time() accuracy</a></li>
<li><a href='http://www.stoimen.com/blog/2010/02/02/profiling-javascript-with-firebug-console-profile-console-time/' rel='bookmark' title='Profiling JavaScript with Firebug. console.profile() &amp; console.time()!'>Profiling JavaScript with Firebug. console.profile() &#038; console.time()!</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/05/php-performance-bitwise-division/' rel='bookmark' title='PHP Performance: Bitwise Division'>PHP Performance: Bitwise Division</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/01/24/javascript-performance-for-vs-while/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Computer Algorithms: Data Compression with Diagram Encoding and Pattern Substitution</title>
		<link>http://www.stoimen.com/blog/2012/01/23/computer-algorithms-data-compression-with-diagram-encoding-and-pattern-substitution/</link>
		<comments>http://www.stoimen.com/blog/2012/01/23/computer-algorithms-data-compression-with-diagram-encoding-and-pattern-substitution/#comments</comments>
		<pubDate>Mon, 23 Jan 2012 14:58:48 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Coding theory]]></category>
		<category><![CDATA[Computer file formats]]></category>
		<category><![CDATA[Computer science]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[conventional compressing tool]]></category>
		<category><![CDATA[Data compression]]></category>
		<category><![CDATA[html]]></category>
		<category><![CDATA[Information theory]]></category>
		<category><![CDATA[Lossy compression]]></category>
		<category><![CDATA[pattern substitution algorithm]]></category>
		<category><![CDATA[pattern substitution algorithms]]></category>
		<category><![CDATA[Pattern Substitution The pattern substitution algorithm]]></category>
		<category><![CDATA[Run-length encoding]]></category>
		<category><![CDATA[web hosting]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=2623</guid>
		<description><![CDATA[Overview Two variants of run-length encoding are the diagram encoding and the pattern substitution algorithms. The diagram encoding is actually a very simple algorithm. Unlike run-length encoding, where the input stream must consist of many repeating elements, such as “aaaaaaaa” for &#8230; <a href="http://www.stoimen.com/blog/2012/01/23/computer-algorithms-data-compression-with-diagram-encoding-and-pattern-substitution/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/01/30/computer-algorithms-data-compression-with-relative-encoding/' rel='bookmark' title='Computer Algorithms: Data Compression with Relative Encoding'>Computer Algorithms: Data Compression with Relative Encoding</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/' rel='bookmark' title='Computer Algorithms: Data Compression with Run-length Encoding'>Computer Algorithms: Data Compression with Run-length Encoding</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/16/computer-algorithms-data-compression-with-bitmaps/' rel='bookmark' title='Computer Algorithms: Data Compression with Bitmaps'>Computer Algorithms: Data Compression with Bitmaps</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>Overview</h2>
<p>Two variants of <a href="http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/" title="Computer Algorithms: Data Compression with Run-length Encoding">run-length encoding</a> are the diagram encoding and the pattern substitution algorithms. Diagram encoding (often spelled “digram encoding” elsewhere) is actually a very simple algorithm. Run-length encoding needs an input stream with many repeating elements, such as <strong>“aaaaaaaa”</strong>, and such runs are very rare in a natural language. By contrast, there are many so-called “diagrams” in almost any natural language. In plain English there are diagrams such as <strong>“the”</strong>, <strong>“and”</strong>, <strong>“ing”</strong> (in the word “waiting” for example), <strong>“ a”</strong>, <strong>“ t”</strong>, <strong>“ e”</strong> and many doubled letters. We can even extend those diagrams by adding the surrounding spaces. Thus we can encode not only “the”, but “ the ”, which is 5 characters (2 spaces and 3 letters), with something shorter. On the other hand, as I said, plain English has many doubled letters, and a run of only two characters unfortunately gives run-length encoding nothing to gain, so the compression ratio will be small. Even worse, the encoded text may turn out longer than the input message. Let’s see some examples.</p>
<p>Let’s say we have to encode the message “successfully accomplished”, which contains four doubled letters. Run-length encoding replaces each two-letter run with a two-character code, so 8 characters are replaced by another 8 characters, which doesn’t help us at all.</p>
<pre>
// 8 chars replaced by 8 chars!?
input: 	"successfully accomplished"
output:	"su2ce2sfu2ly a2complished"
</pre>
<p>The problem is that if the input text contains numbers, “2” in particular, we have to choose an escape symbol (“@” for example) to mark where an encoded run begins. Thus if the input message is “2 successfully accomplished tasks”, it will be encoded as “2 su@2ce@2sfu@2ly a@2complished tasks”. Now the output message is longer than the input string: 37 characters instead of 33.</p>
<pre>
// the compressed message is longer!!!
input:	"2 successfully accomplished tasks"
output:	"2 su@2ce@2sfu@2ly a@2complished tasks"
</pre>
<p>Again, if the input stream contains the escape symbol, we have to find another one, and it is often difficult to find a short escape symbol that doesn’t appear in the input text without a full scan of the text.<span id="more-2623"></span></p>
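<p>The escape-symbol scheme above can be sketched roughly as follows. This is only an illustration under my own assumptions: the function name rle_escape is hypothetical, every run of two or more identical characters is marked, and the input is assumed not to contain the escape symbol itself. It only encodes; a decoder would also have to cope with digits that directly follow a marker.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// Hypothetical helper: replaces every run of two or more identical
// characters with the escape symbol, the run length and the character.
function rle_escape($input, $escape = '@')
{
	return preg_replace_callback(
		'/(.)\1+/s',
		function ($match) use ($escape) {
			return $escape . strlen($match[0]) . $match[1];
		},
		$input
	);
}

$input  = '2 successfully accomplished tasks';
$output = rle_escape($input);

echo $output, "\n";                            // 2 su@2ce@2sfu@2ly a@2complished tasks
echo strlen($input), ' to ', strlen($output);  // 33 to 37</pre></div></div>

<p>Every doubled letter costs three characters instead of two, which is exactly why the encoded message grows.</p>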
<p>That is why run-length encoding isn’t a good solution for compressing plain text, where long runs rarely appear. Of course, there are exceptions. One such exception is lossy text compression with run-length encoding. It is intuitively clear that compressing text with loss is rarely useful, especially when you have to decompress exactly the same text. However, there are some cases where lossy compression may be useful. One such case is removing spaces. Indeed the text <strong>“successfully      accomplished”</strong> carries exactly the same information as <strong>“successfully accomplished”</strong>, so we can simply remove the extra spaces. We could instead use a marker to indicate the long run of spaces, like <strong>“successfully@6 accomplished”</strong>, in order to restore the input string with absolutely no loss, but we can also throw those symbols away. This decision depends on the goal. With the same goal in mind we can remove new lines and tabs, but only if we’re sure that the sense of the text is preserved. Yet again, the problem is that such long runs rarely occur in ordinary texts. That is why it’s better to use diagram encoding for plain text compression instead of run-length encoding.</p>
<h2>A Few Questions</h2>
<p>Now that we understand the principles of diagram encoding, let’s see some examples. In the example above it is better to replace the doubled letters with something shorter. Let’s say # for “cc”, @ for “ss” and % for “ll”. Thus the input text will be compressed as “su#e@fu%y a#omplished”, which is shorter: 21 characters instead of 25. But yet again, what will happen if the input message contains one of the substitution symbols? Also, we can’t be sure that there are many doubled letters, or enough reasonable substitutions for them. A better approach is to replace patterns.</p>
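<p>The substitution above, together with the question of what happens when the input already contains a substitution symbol, can be sketched like this. The function name encode_digrams and the decision to simply refuse such input are my assumptions for illustration, not part of any standard algorithm.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// Hypothetical helper: replaces each pattern with its one-character symbol,
// but refuses to encode when a symbol already occurs in the input,
// because the result could not be decoded unambiguously.
function encode_digrams($input, array $dict)
{
	foreach ($dict as $symbol) {
		if (strpos($input, $symbol) !== false) {
			return false;
		}
	}

	return strtr($input, $dict);
}

$dict = array('cc' =&gt; '#', 'ss' =&gt; '@', 'll' =&gt; '%');

var_dump(encode_digrams('successfully accomplished', $dict)); // string(21) "su#e@fu%y a#omplished"
var_dump(encode_digrams('50% successfully', $dict));          // bool(false)</pre></div></div>

<p>Checking for the symbols requires a full scan of the input, which is the same cost we met when looking for an escape symbol.</p>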
<div id="attachment_2640" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/DiagramEncodingonTexts.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/DiagramEncodingonTexts.png" alt="Compressing texts with diagram encoding" title="Compressing texts with diagram encoding" width="620" class="size-full wp-image-2640" /></a><p class="wp-caption-text">Run-length encoding isn&#039;t a good approach for text compression, because long runs rarely appear in a natural language.</p></div>
<h2>Pattern Substitution</h2>
<p>The pattern substitution algorithm is a variant of diagram encoding. As I said above, a very commonly used pattern in plain English is “ the ”, which is five characters long. We can replace it with something like “$%” for example. In this case the message <strong>“I send the message”</strong> will become <strong>“I send$%message”</strong>. However, there are some obstacles to overcome.</p>
<p>The first problem is that we need to know the language and somehow define its commonly used patterns in a dictionary. What would happen with a message written in a language we know nothing about? Let’s say Latin, like the example below.</p>
<blockquote><p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras venenatis, sapien eget suscipit placerat, justo quam blandit mauris, quis tempor ante sapien sodales augue. Praesent ut mauris quam. Phasellus scelerisque, ante quis consequat tristique, metus turpis consectetur leo, vitae facilisis sapien mi eu sapien. Praesent vitae ligula elit, et faucibus augue. Sed rhoncus sodales dolor ut gravida. In quis augue ac nulla auctor mattis sed sed libero. Donec eget purus eget enim tempor porta vitae eget diam. Mauris aliquet malesuada ipsum, non pulvinar urna vestibulum ac. Donec feugiat velit vitae nunc cursus imperdiet. Donec accumsan faucibus dictum. Phasellus sed mauris sapien. Maecenas mi metus, tincidunt sed rhoncus nec, sodales non sapien.</p></blockquote>
<p>Clearly, without knowing Latin it isn’t easy to tell which patterns are commonly used. The point is that pattern substitution works best if you know the set of words and characters in advance.</p>
<p>The second problem is related to decompression. We obviously need to define a dictionary, and the same dictionary must be used when decoding the message. It also helps if we find many patterns longer than three characters; if not, the compression ratio will be low. Unfortunately, such patterns aren’t very common in any natural language.</p>
<div id="attachment_2643" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/PatternSubstitutiononTexts.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/PatternSubstitutiononTexts.png" alt="Text compression with diagram encoding and pattern substitution" title="Text compression with diagram encoding and pattern substitution" width="620" class="size-full wp-image-2643" /></a><p class="wp-caption-text">Diagram encoding and pattern substitution are far more suitable for text compression than run-length encoding. In fact, pattern substitution is very effective at compressing programming languages.</p></div>
<h2>Application</h2>
<p>An interesting question is how to use diagram encoding or pattern substitution to compress text in a natural language, especially when we don’t know the language in detail. The answer hides in the question: we won’t compress natural languages, but machine languages. Machine (programming) languages are limited to a smaller set of words and symbols. Isn’t that true for any programming language? Take PHP, where words like <strong>“function”</strong>, <strong>“while”</strong>, <strong>“for”</strong>, <strong>“break”</strong>, <strong>“switch”</strong>, <strong>“foreach”</strong> are in frequent use, or HTML with its defined set of tags. Perhaps the best example is CSS, where only the values of the properties can vary. CSS files also tend to have many new lines, tabs and spaces, which only humans read.</p>
<p>The question here is why we should compress those file types. It’s clear that after compression they will be completely useless in that form, both for humans and machines. That is true, but what if we have to store versions of those files in a database, as a kind of backup? Imagine you’re working for a web hosting company that has to store daily versions of the sites it hosts. The volume of stored information can be enormous, even for a small company hosting only a few sites. Compressing those files with a conventional compression tool isn’t a good fit here, because we would still save a copy of the entire site every day, even though the difference between daily versions of a site can be small. A version control system is another solution, but then you have to store the plain text of the files.</p>
<p>Perhaps a better approach is to compress the text using pattern substitution and then save only the differences &#8211; a kind of version control, which can be done with “relative encoding”.</p>
<p>Using the above method we can save lots of disk space and at the same time compress and decompress easily. Another good thing is that you can save only the changes to the initial files, as in version control, and those changes can also be compressed.</p>
<h2>Implementation</h2>
<p>The implementation of this algorithm is again in PHP and tries only to illustrate the main principles of the compression. In this case I compressed a CSS file using the approach above. Although this example is quite primitive, we can see some interesting facts. First of all, you only need encoding and decoding dictionaries. The encoding and decoding processes are practically the same, so you don’t need to implement two different functions. This example uses a native PHP function &#8211; str_replace, because its purpose is not to describe string replacement techniques, but pattern substitution itself. It assumes that today’s programming languages provide string manipulation functions for this task.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$str</span> <span style="color: #339933;">=</span> <span style="color: #990000;">file_get_contents</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'large_style_file.css'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$encoding_dict</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span>
	<span style="color: #0000ff;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span> 		<span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'$0'</span><span style="color: #339933;">,</span>
	<span style="color: #0000ff;">'text'</span> 		<span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'$1'</span><span style="color: #339933;">,</span>
	<span style="color: #0000ff;">'color'</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'$2'</span><span style="color: #339933;">,</span>
	<span style="color: #0000ff;">'display'</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'$3'</span><span style="color: #339933;">,</span>
	<span style="color: #0000ff;">'font'</span> 		<span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'$4'</span><span style="color: #339933;">,</span>
	<span style="color: #0000ff;">'width'</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'$5'</span><span style="color: #339933;">,</span>
	<span style="color: #0000ff;">'height'</span>	<span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'$6'</span><span style="color: #339933;">,</span>	
	<span style="color: #0000ff;">' '</span>		<span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">''</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">function</span> replace_patterns<span style="color: #009900;">&#40;</span><span style="color: #000088;">$input</span><span style="color: #339933;">,</span> <span style="color: #000088;">$dict</span><span style="color: #009900;">&#41;</span> 
<span style="color: #009900;">&#123;</span>
	<span style="color: #b1b100;">foreach</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$dict</span> <span style="color: #b1b100;">as</span> <span style="color: #000088;">$pattern</span> <span style="color: #339933;">=&gt;</span> <span style="color: #000088;">$replace</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$input</span> <span style="color: #339933;">=</span> <span style="color: #990000;">str_replace</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #339933;">,</span> <span style="color: #000088;">$replace</span><span style="color: #339933;">,</span> <span style="color: #000088;">$input</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #000088;">$input</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000088;">$result</span> <span style="color: #339933;">=</span> replace_patterns<span style="color: #009900;">&#40;</span><span style="color: #000088;">$str</span><span style="color: #339933;">,</span> <span style="color: #000088;">$encoding_dict</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>
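<p>To illustrate the point that decoding reuses the same function, here is a minimal sketch with a hypothetical sample string instead of a real CSS file. It assumes the input never contains a “$” followed by a digit. The space-stripping entry is left out on purpose, since removed spaces cannot be restored, which makes that part of the compression lossy.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// Same function as above, repeated so the sketch is self-contained.
function replace_patterns($input, $dict)
{
	foreach ($dict as $pattern =&gt; $replace) {
		$input = str_replace($pattern, $replace, $input);
	}

	return $input;
}

// Hypothetical sample input, not the file measured in this post.
$css = "a{color:red;width:10px}\nb{font:bold;height:2px}";

$encoding_dict = array(
	"\n"     =&gt; '$0',
	'color'  =&gt; '$2',
	'font'   =&gt; '$4',
	'width'  =&gt; '$5',
	'height' =&gt; '$6',
);

// The decoding dictionary is the encoding dictionary with keys and values swapped.
$decoding_dict = array_flip($encoding_dict);

$encoded = replace_patterns($css, $encoding_dict);
$decoded = replace_patterns($encoded, $decoding_dict);

echo $encoded, "\n";          // a{$2:red;$5:10px}$0b{$4:bold;$6:2px}
var_dump($decoded === $css);  // bool(true)</pre></div></div>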

<p>By replacing only a few CSS properties I reduced the file size by about 35% (as the diagram below shows). The initial file is 202 KB, while compressed it&#8217;s only 131 KB. Of course, it all depends on the CSS file, but how about replacing all property names with shorter ones? Perhaps then the compression would be even better.</p>
<div id="attachment_2647" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/chart_1.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/chart_1.png" alt="CSS compression with pattern substitution" title="CSS compression with pattern substitution" width="600" height="371" class="size-full wp-image-2647" /></a><p class="wp-caption-text"> </p></div>
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/01/30/computer-algorithms-data-compression-with-relative-encoding/' rel='bookmark' title='Computer Algorithms: Data Compression with Relative Encoding'>Computer Algorithms: Data Compression with Relative Encoding</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/' rel='bookmark' title='Computer Algorithms: Data Compression with Run-length Encoding'>Computer Algorithms: Data Compression with Run-length Encoding</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/16/computer-algorithms-data-compression-with-bitmaps/' rel='bookmark' title='Computer Algorithms: Data Compression with Bitmaps'>Computer Algorithms: Data Compression with Bitmaps</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/01/23/computer-algorithms-data-compression-with-diagram-encoding-and-pattern-substitution/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Computer Algorithms: Data Compression with Bitmaps</title>
		<link>http://www.stoimen.com/blog/2012/01/16/computer-algorithms-data-compression-with-bitmaps/</link>
		<comments>http://www.stoimen.com/blog/2012/01/16/computer-algorithms-data-compression-with-bitmaps/#comments</comments>
		<pubDate>Mon, 16 Jan 2012 09:35:24 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[bitmap]]></category>
		<category><![CDATA[bitmap compressing algorithm]]></category>
		<category><![CDATA[Bzip2]]></category>
		<category><![CDATA[Coding theory]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[Data compression]]></category>
		<category><![CDATA[gif]]></category>
		<category><![CDATA[Graphics file formats]]></category>
		<category><![CDATA[image compression]]></category>
		<category><![CDATA[Information theory]]></category>
		<category><![CDATA[JSON]]></category>
		<category><![CDATA[Lossless data compression]]></category>
		<category><![CDATA[Run-length encoding]]></category>
		<category><![CDATA[Technology/Internet]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=2604</guid>
		<description><![CDATA[Overview In my previous post we saw how to compress data consisting of very long runs of repeating elements. This type of compression is known as &#8220;run-length encoding&#8221; and can be very handy when transferring data with no loss. The &#8230; <a href="http://www.stoimen.com/blog/2012/01/16/computer-algorithms-data-compression-with-bitmaps/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/' rel='bookmark' title='Computer Algorithms: Data Compression with Run-length Encoding'>Computer Algorithms: Data Compression with Run-length Encoding</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/30/computer-algorithms-data-compression-with-relative-encoding/' rel='bookmark' title='Computer Algorithms: Data Compression with Relative Encoding'>Computer Algorithms: Data Compression with Relative Encoding</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/23/computer-algorithms-data-compression-with-diagram-encoding-and-pattern-substitution/' rel='bookmark' title='Computer Algorithms: Data Compression with Diagram Encoding and Pattern Substitution'>Computer Algorithms: Data Compression with Diagram Encoding and Pattern Substitution</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>Overview</h2>
<p>In <a href="http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/" title="Computer Algorithms: Data Compression with Run-length Encoding">my previous post</a> we saw how to compress data consisting of very long runs of repeating elements. This type of compression is known as &#8220;run-length encoding&#8221; and can be very handy when transferring data with no loss. The problem is that the data must follow a specific format. For example, the string <strong>“aaaaaaaabbbbbbbb”</strong> can be compressed as <strong>“a8b8”</strong>. Now a string of length 16 is compressed to a string of length 4, which is 25% of its initial length, without losing any information. There will be a problem if the characters (elements) are dispersed in a different way. What happens if the characters are the same, but they don’t form long runs? What if the string is <strong>“abababababababab”</strong>? The same length, the same characters, but we cannot use run-length encoding! Indeed, using this algorithm we’ll get at best the same string.</p>
<p>In this case, however, we can notice another fact. The string consists of many repeating elements, although they are not arranged one after another. We can compress this string with a bitmap. This means that we can save the positions of the occurrences of a given element as a sequence of bits, which can be easily converted into a decimal value. In the example above the string <strong>“abababababababab”</strong> can be compressed as <strong>“1010101010101010”</strong>, which is <strong>43690</strong> in decimal, and even better <strong>AAAA</strong> in hexadecimal. Thus the long string can be compressed. When decompressing (decoding) the message we convert back from decimal/hexadecimal into binary and match the occurrences of the characters. The example above is too simple, so let’s say only one of the characters repeats and the rest of the string consists of different characters, like this: <strong>“abacadaeafagahai”</strong>. Then we can use a bitmap only for the character “a” &#8211; <strong>“1010101010101010”</strong> and compress the string as <strong>“AAAA bcdefghi”</strong>. As you can see, all the example strings are exactly 16 characters long, and that is a limitation. Using bitmaps with variable-length data is a bit tricky, and it is not always easy (if possible at all) to decompress it.</p>
<p><div id="attachment_2624" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/Run-lengthvs.BitmapCompression.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/Run-lengthvs.BitmapCompression.png" alt="Bitmap Compression" title="Bitmap Compression" width="620" class="size-full wp-image-2624" /></a><p class="wp-caption-text">Basically bitmap compression saves the positions of an element that is repeated very often in the message!</p></div><br />
<span id="more-2604"></span><br />
On the other hand, bitmap compression is not only applicable to strings. We can also compress arrays, objects or any kind of data. The example from my previous post is very suitable. There we had to transfer a large array from a server to the client (browser) using <a href="http://www.stoimen.com/blog/tag/json/" title="JSON on stoimen.com/blog">JSON</a>. That data was very suitable for “run-length encoding”. Now let’s assume we have the same data &#8211; a set of different years, which this time are dispersed in a different way.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$data</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span>
	<span style="color: #cc66cc;">0</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">1</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">2</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1993</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">3</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1994</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">4</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">5</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">6</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1993</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">7</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">8</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">9</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">10</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">11</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">12</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">13</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">14</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">15</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span>	
	<span style="color: #339933;">...</span>
<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>The JSON-encoded message will be the following (a simple yet very large JavaScript array).</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1993</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1994</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1993</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span> <span style="color: #339933;">...</span><span style="color: #009900;">&#93;</span></pre></div></div>

<p>However if we use bitmap compression we’ll get a &#8220;shorter&#8221; array.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$data</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span>
	<span style="color: #cc66cc;">0</span> <span style="color: #339933;">=&gt;</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'1000100011100110'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">1</span> <span style="color: #339933;">=&gt;</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'0100010100011001'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">2</span> <span style="color: #339933;">=&gt;</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1993</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'0010001000000000'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">3</span> <span style="color: #339933;">=&gt;</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1994</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'0001000000000000'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Now the JSON is:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span><span style="color: #0000ff;">&quot;1000100011100110&quot;</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span><span style="color: #0000ff;">&quot;0100010100011001&quot;</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1993</span><span style="color: #339933;">,</span><span style="color: #0000ff;">&quot;0010001000000000&quot;</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1994</span><span style="color: #339933;">,</span><span style="color: #0000ff;">&quot;0001000000000000&quot;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#93;</span></pre></div></div>
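<p>As a rough sketch of how such per-value bitmaps could be produced and reversed (the helper names bitmap_array and unbitmap_array are hypothetical, and only the 16 years shown above are used):</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// Hypothetical helper: builds one bit string per distinct value,
// with a '1' at every position where that value occurs.
function bitmap_array(array $data)
{
	$data   = array_values($data);
	$length = count($data);
	$maps   = array();

	foreach ($data as $position =&gt; $value) {
		$bits = isset($maps[$value]) ? $maps[$value] : str_repeat('0', $length);
		$bits[$position] = '1';
		$maps[$value] = $bits;
	}

	$result = array();
	foreach ($maps as $value =&gt; $bits) {
		$result[] = array($value, $bits);
	}

	return $result;
}

// Hypothetical helper: restores the original array from the bitmaps.
function unbitmap_array(array $maps)
{
	$data = array();

	foreach ($maps as $pair) {
		list($value, $bits) = $pair;
		for ($i = 0, $n = strlen($bits); $i &lt; $n; $i++) {
			if ($bits[$i] === '1') {
				$data[$i] = $value;
			}
		}
	}

	ksort($data);

	return $data;
}

$years = array(1991, 1992, 1993, 1994, 1991, 1992, 1993, 1992,
               1991, 1991, 1991, 1992, 1992, 1991, 1991, 1992);

$maps = bitmap_array($years);

echo json_encode($maps), "\n";
var_dump(unbitmap_array($maps) === $years); // bool(true)</pre></div></div>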

<p>The compression ratio gets better and better as the uncompressed data grows. Most of us know bitmap compression from images, where it is widely used. It is easy to imagine how well it works on black and white images, since black and white can be represented as 0s and 1s. It is also used with more than two colors (256, for instance), and the level of compression can still be very high.</p>
<h2>Implementation</h2>
<p>The following implementation in <a href="http://www.stoimen.com/blog/category/php/" title="PHP on stoimen.com">PHP</a> aims only to illustrate the bitmap compression algorithm. The same idea can be applied to other kinds of data structures.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">// too many repeating &quot;a&quot; characters</span>
<span style="color: #000088;">$msg</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">'aazahalavaatalawacamaahakafaaaqaaaiauaacaaxaauaxaaaaaapaayatagaaoafaawayazavaaaazaaabararaaaaakakaaqaarazacajaazavanazaaaeanaaoajauaaaaaxalaraaapabataaavaaab'</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">function</span> bitmap<span style="color: #009900;">&#40;</span><span style="color: #000088;">$message</span><span style="color: #009900;">&#41;</span> 
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$bits</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$rest</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">''</span><span style="color: #339933;">;</span>
&nbsp;
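	<span style="color: #666666; font-style: italic;">// isset() stops cleanly at the end of the string and does not stop early on a &quot;0&quot; character</span>
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #990000;">isset</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$message</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$v</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$message</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>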
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$v</span> <span style="color: #339933;">==</span> <span style="color: #0000ff;">'a'</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$bits</span> <span style="color: #339933;">.=</span> <span style="color: #0000ff;">'1'</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$bits</span> <span style="color: #339933;">.=</span> <span style="color: #0000ff;">'0'</span><span style="color: #339933;">;</span>
			<span style="color: #000088;">$rest</span> <span style="color: #339933;">.=</span> <span style="color: #000088;">$v</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #000088;">$i</span><span style="color: #339933;">++;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// note: bindec() returns a float for bit strings this long, so the decimal</span>
	<span style="color: #666666; font-style: italic;">// number is only approximate and the bitmap cannot be restored exactly from it</span>
	<span style="color: #b1b100;">return</span> <span style="color: #990000;">number_format</span><span style="color: #009900;">&#40;</span><span style="color: #990000;">bindec</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$bits</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'.'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">''</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">.</span> <span style="color: #000088;">$rest</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #b1b100;">echo</span> bitmap<span style="color: #009900;">&#40;</span><span style="color: #000088;">$msg</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// uncompressed: </span>
aazahalavaatalawacamaahakafaaaqaaaiauaacaaxaauaxaaaaaapaayatagaaoafaawayazavaaaazaaabararaaaaakakaaqaarazacajaazavanazaaaeanaaoajauaaaaaxalaraaapabataaavaaab
<span style="color: #666666; font-style: italic;">// compressed:</span>
152299251941730035874325065523548237677352452096zhlvtlwcmhkfqiucxuxpytgofwyzvzbrrkkqrzcjzvnzenojuxlrpbtvb</pre></div></div>
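
<p>Since <code>bindec()</code> returns a float once the bit string is longer than a native integer, a long bitmap cannot be recovered exactly from its decimal form. The following is only a minimal sketch of a reversible variant, assuming we keep the bitmap as a plain string of 0s and 1s. The names <code>bitmap_encode()</code> and <code>bitmap_decode()</code> are made up for this illustration, and a real implementation would probably pack eight bits into each byte.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// illustrative sketch: split the message into a bitmap and the remaining characters
function bitmap_encode($message, $frequent = 'a')
{
	$bits = $rest = '';
	$length = strlen($message);

	for ($i = 0; $i &lt; $length; $i++) {
		if ($message[$i] === $frequent) {
			$bits .= '1';
		} else {
			$bits .= '0';
			$rest .= $message[$i];
		}
	}

	// the bitmap stays a string, so no precision is lost
	return array($bits, $rest);
}

// illustrative sketch: rebuild the message from the bitmap and the remaining characters
function bitmap_decode($bits, $rest, $frequent = 'a')
{
	$message = '';
	$j = 0; // current position in $rest
	$length = strlen($bits);

	for ($i = 0; $i &lt; $length; $i++) {
		$message .= ($bits[$i] === '1') ? $frequent : $rest[$j++];
	}

	return $message;
}

list($bits, $rest) = bitmap_encode('aazaha');

// $bits is '110101' and $rest is 'zh'
// aazaha
echo bitmap_decode($bits, $rest);</pre></div></div>
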

<h2>Application</h2>
<p>This algorithm is very useful when one element of our data repeats very often, so you need to investigate the nature of the data you want to compress. Images with a small palette, as in <a href="http://en.wikipedia.org/wiki/Portable_Network_Graphics" title="Portable Network Graphics" target="_blank">PNG8</a> or <a href="http://en.wikipedia.org/wiki/Graphics_Interchange_Format" title="Graphics Interchange Format" target="_blank">GIF</a>, are typical data of this kind, although those formats use their own compression schemes (DEFLATE and LZW respectively).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/01/16/computer-algorithms-data-compression-with-bitmaps/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Computer Algorithms: Data Compression with Run-length Encoding</title>
		<link>http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/</link>
		<comments>http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/#comments</comments>
		<pubDate>Mon, 09 Jan 2012 09:08:06 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[ajax]]></category>
		<category><![CDATA[Algorithmic efficiency]]></category>
		<category><![CDATA[binary search]]></category>
		<category><![CDATA[Bzip2]]></category>
		<category><![CDATA[Data compression]]></category>
		<category><![CDATA[data compression algorithm]]></category>
		<category><![CDATA[data compression algorithms]]></category>
		<category><![CDATA[faster services]]></category>
		<category><![CDATA[Google Inc.]]></category>
		<category><![CDATA[JSON]]></category>
		<category><![CDATA[Lossless data compression]]></category>
		<category><![CDATA[lossless data compression algorithm]]></category>
		<category><![CDATA[Lossy compression]]></category>
		<category><![CDATA[programmer]]></category>
		<category><![CDATA[run-length algorithm]]></category>
		<category><![CDATA[Run-length encoding]]></category>
		<category><![CDATA[search algorithms]]></category>
		<category><![CDATA[Technology/Internet]]></category>
		<category><![CDATA[This algorithm]]></category>
		<category><![CDATA[virtual machine]]></category>
		<category><![CDATA[web server]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=2594</guid>
		<description><![CDATA[Introduction No matter how fast today&#8217;s computers and networks are, the users will constantly need faster and faster services. To reduce the volume of the transferred data we usually use some sort of compression. That is why this computer sciences &#8230; <a href="http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/">Continue reading <span class="meta-nav">&#8594;</span></a>
]]></description>
			<content:encoded><![CDATA[<h2>Introduction</h2>
<p>No matter how fast today&#8217;s computers and networks are, users will constantly need faster and faster services. To reduce the volume of transferred data we usually use some sort of compression. That is why this area of computer science will always be interesting to research and develop.</p>
<p>There are many data compression algorithms, some of them lossless, others lossy, but their main goal will always be to save storage space and traffic. These algorithms are very useful for data transfer between two distant places. Perhaps the best example is the transfer between a web server and a browser.</p>
<p>In the last few years a lot of research has been done on compressing the files that are sent to the client side, such as JavaScript, CSS, HTML and images. Servers and clients already have techniques to compress data, like <a href="http://www.gzip.org/" title="The gzip home page" target="_blank">GZIP</a> for instance, that can dramatically decrease the amount of transferred data. On the other hand, there are lots of tools and tricks that decrease the size of the data itself.</p>
<p>When a file is executed by the client&#8217;s virtual machine, it doesn&#8217;t matter how &#8220;beautifully&#8221; it is formatted from a programmer&#8217;s point of view. Spaces, tabs and new lines don&#8217;t carry any significant information for the environment. That is why compression tools like <a href="http://developer.yahoo.com/yui/compressor/" title="YUI Compressor" target="_blank">YUI Compressor</a>, <a href="http://code.google.com/closure/compiler/" title="Closure Compiler - Google Code" target="_blank">Google Closure Compiler</a>, etc. remove those symbols. They can do even more to improve the compression rate. I won&#8217;t cover that in this post, but it shows how important data compression algorithms are.</p>
<p>It would be great if we could compress any data well with a single tool. Unfortunately this is not the case: the compression rate usually depends on the data itself. The choice of a data compression algorithm therefore depends mainly on the data, so first of all we must explore the data.</p>
<p>Here I&#8217;ll cover one very simple lossless data compression algorithm called &#8220;run-length encoding&#8221; that can be very useful in some cases.</p>
<div id="attachment_2618" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/Run-lengthEncoding1.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/Run-lengthEncoding1.png" alt="Run-length Encoding" title="Run-length Encoding" width="620" class="size-full wp-image-2618" /></a><p class="wp-caption-text"> </p></div>
<h2>Overview</h2>
<p>This algorithm replaces long sequences of repeating data with a single item of this data followed by a counter showing how many times the item is repeated. To make this clearer, let&#8217;s look at a string example.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">aaaaaaaaaabbbaxxxxyyyzyx</pre></div></div>

<p>This string&#8217;s length is <strong>24</strong> and as we can see there are lots of repetitions. Using the run-length algorithm, we replace each run with a single character followed by a counter.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">a10b3a1x4y3z1y1x1</pre></div></div>

<p>The length of this string is <strong>17</strong>, which is approximately <strong>71%</strong> of the initial length. <span id="more-2594"></span>Obviously this is not the optimal way to compress the given string. Writing a counter after every character can even increase the length of the initial string when most characters are not repeated, which is exactly the opposite of what we need. For instance, we don&#8217;t need the digit &#8220;1&#8221; when a character occurs only once. If we drop it, we get the string below.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">a10b3ax4y3zyx</pre></div></div>

<p>Now the length of the resulting string is <strong>13</strong>, which is <strong>54%</strong> of the initial length! A variation of the example above is to store, instead of a counter, the position at which each run starts. The initial string is then encoded as follows.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">a0b10a13x14y18z21y22x23</pre></div></div>

<p>Which of these two approaches you&#8217;ll use depends on the goal. In the second case the start positions are sorted, so we can use <a href="http://www.stoimen.com/blog/2011/12/26/computer-algorithms-binary-search/" title="Computer Algorithms: Binary Search">binary search</a> to quickly find the character at a given position.</p>
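
<p>To illustrate the position-based variant, here is a rough sketch of such a lookup. It assumes the runs are kept as an array of (character, start position) pairs rather than as one string, and <code>char_at()</code> is a hypothetical helper name, not part of the code in this post.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// runs of 'aaaaaaaaaabbbaxxxxyyyzyx' as (character, start position) pairs
$runs = array(
	array('a', 0), array('b', 10), array('a', 13), array('x', 14),
	array('y', 18), array('z', 21), array('y', 22), array('x', 23),
);

// hypothetical helper: returns the character at $index in the original string
// (it does not check that $index is inside the original string)
function char_at(array $runs, $index)
{
	$low = 0;
	$high = count($runs) - 1;

	// binary search for the last run that starts at or before $index
	while ($low &lt; $high) {
		$mid = (int) ceil(($low + $high) / 2);
		if ($runs[$mid][1] &lt;= $index) {
			$low = $mid;
		} else {
			$high = $mid - 1;
		}
	}

	return $runs[$low][0];
}

// b
echo char_at($runs, 12);</pre></div></div>

<p>Instead of walking the whole string, the lookup needs only a logarithmic number of steps in the number of runs.</p>
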
<p>This algorithm is clearly not only applicable to strings. We can achieve very good results on arrays as well. A typical example is the transfer of <a href="http://www.json.org/" title="JSON" target="_blank">JSON</a> from a server to a client: if there are large sequences of repeating data, we can achieve great results.</p>
<h2>Implementation</h2>
<p>The implementation below assumes that we&#8217;re compressing a string, and it&#8217;s written in PHP. However, the nature of this algorithm doesn&#8217;t restrict us to strings. As I said before, with slight modifications we can use it with other data structures. The important thing to understand is that the run-length algorithm is most useful on large sequences of repeating elements, whether they are characters or array items.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$message</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">'aaaaaaaaaabbbaxxxxyyyzyx'</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">function</span> run_length_encode<span style="color: #009900;">&#40;</span><span style="color: #000088;">$msg</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$j</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$prev</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">''</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$output</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">''</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #990000;">isset</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$msg</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$msg</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">!=</span> <span style="color: #000088;">$prev</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
&nbsp;
			<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#41;</span> 
				<span style="color: #000088;">$output</span> <span style="color: #339933;">.=</span> <span style="color: #000088;">$j</span><span style="color: #339933;">;</span>
&nbsp;
			<span style="color: #000088;">$output</span> <span style="color: #339933;">.=</span> <span style="color: #000088;">$msg</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
			<span style="color: #000088;">$prev</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$msg</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
			<span style="color: #000088;">$j</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #000088;">$j</span><span style="color: #339933;">++;</span>
		<span style="color: #000088;">$i</span><span style="color: #339933;">++;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #000088;">$output</span> <span style="color: #339933;">.=</span> <span style="color: #000088;">$j</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #000088;">$output</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// a10b3a1x4y3z1y1x1</span>
<span style="color: #b1b100;">echo</span> run_length_encode<span style="color: #009900;">&#40;</span><span style="color: #000088;">$message</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>And slightly optimized.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$message</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">'aaaaaaaaaabbbaxxxxyyyzyx'</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">function</span> run_length_encode<span style="color: #009900;">&#40;</span><span style="color: #000088;">$msg</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$j</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$prev</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">''</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$output</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">''</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #990000;">isset</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$msg</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$msg</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">!=</span> <span style="color: #000088;">$prev</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
&nbsp;
			<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$j</span> <span style="color: #339933;">&gt;</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span> 
				<span style="color: #000088;">$output</span> <span style="color: #339933;">.=</span> <span style="color: #000088;">$j</span><span style="color: #339933;">;</span>
&nbsp;
			<span style="color: #000088;">$output</span> <span style="color: #339933;">.=</span> <span style="color: #000088;">$msg</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
			<span style="color: #000088;">$prev</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$msg</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
			<span style="color: #000088;">$j</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #000088;">$j</span><span style="color: #339933;">++;</span>
		<span style="color: #000088;">$i</span><span style="color: #339933;">++;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$j</span> <span style="color: #339933;">&gt;</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span>
		<span style="color: #000088;">$output</span> <span style="color: #339933;">.=</span> <span style="color: #000088;">$j</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #000088;">$output</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// a10b3ax4y3zyx</span>
<span style="color: #b1b100;">echo</span> run_length_encode<span style="color: #009900;">&#40;</span><span style="color: #000088;">$message</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Finally a small change &#8211; now we store the position of the character.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$message</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">'aaaaaaaaaabbbaxxxxyyyzyx'</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">function</span> run_length_encode<span style="color: #009900;">&#40;</span><span style="color: #000088;">$msg</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$prev</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">''</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$output</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">''</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #990000;">isset</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$msg</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$msg</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">!=</span> <span style="color: #000088;">$prev</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
&nbsp;
			<span style="color: #000088;">$output</span> <span style="color: #339933;">.=</span> <span style="color: #000088;">$msg</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">.</span> <span style="color: #000088;">$i</span><span style="color: #339933;">;</span>
&nbsp;
			<span style="color: #000088;">$prev</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$msg</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #009900;">&#125;</span>
&nbsp;
		<span style="color: #000088;">$i</span><span style="color: #339933;">++;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #000088;">$output</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// a0b10a13x14y18z21y22x23</span>
<span style="color: #b1b100;">echo</span> run_length_encode<span style="color: #009900;">&#40;</span><span style="color: #000088;">$message</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>
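
<p>Compression is only half of the story, so here is a possible decoder for the counter-based output (with or without the explicit &#8220;1&#8221;). It is just a sketch: <code>run_length_decode()</code> is a name I made up for it, and it assumes that the original message contains no digits, otherwise counters and data cannot be told apart.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// hypothetical counterpart to run_length_encode()
function run_length_decode($encoded)
{
	$output = '';

	// each match is one non-digit character followed by an optional counter
	preg_match_all('/(\D)(\d*)/', $encoded, $matches, PREG_SET_ORDER);

	foreach ($matches as $match) {
		// a missing counter means the character occurs once
		$count = ($match[2] === '') ? 1 : (int) $match[2];
		$output .= str_repeat($match[1], $count);
	}

	return $output;
}

// aaaaaaaaaabbbaxxxxyyyzyx
echo run_length_decode('a10b3ax4y3zyx');</pre></div></div>
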

<h2>Complexity and Data Compression</h2>
<p>We&#8217;re used to talking about the complexity of an algorithm in terms of time, and we usually try to find the fastest implementation, as with search algorithms. Here it is less important to compress data quickly than to compress it as much as possible, so that the output is as small as possible without losing data. A great feature of run-length encoding is that it is easy to implement and needs only a single pass over the data.</p>
<h2>Application</h2>
<p>We can use run-length encoding in many cases. It is commonly used to compress images and is very successful when we deal only with black and white images. Here I&#8217;ll cover another use case that I only mentioned above. Let&#8217;s say we have to transfer a very large array of data to our AJAX-powered application using JSON. Let&#8217;s also say that the data are years, for instance the years in which movies premiered. Many movies premiere in the same year, so even when the data is sorted, the plain array still repeats the same value over and over. What matters is that we have long runs of equal values, and this is where run-length encoding helps.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$data</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span>
	<span style="color: #cc66cc;">0</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">1</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span>
	<span style="color: #339933;">...</span>
	<span style="color: #cc66cc;">2223</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">2224</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span>
	<span style="color: #339933;">...</span>
	<span style="color: #cc66cc;">19298</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1995</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">19299</span> 	<span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1996</span><span style="color: #339933;">,</span>
	<span style="color: #339933;">...</span>
<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>As you can see, transferring the whole array can be a nightmare, especially on slow networks, so it is better to compress it first. Encoded as is (e.g. with PHP&#8217;s <a href="http://php.net/manual/en/function.json-encode.php" title="PHP: json_encode" target="_blank">json_encode</a>), the array looks like this.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">// {&quot;0&quot;:1991,&quot;1&quot;:1991, ..., &quot;2223&quot;:1991,&quot;2224&quot;:1992, ..., &quot;19298&quot;:1995,&quot;19299&quot;:1996, ...}</span>
<span style="color: #b1b100;">echo</span> <span style="color: #990000;">json_encode</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$data</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>After running run-length encoding we get something like the following array (note that this is only sample data and it&#8217;s up to you to decide which format is best for storing the data).</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$data</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span>
	<span style="color: #cc66cc;">0</span> <span style="color: #339933;">=&gt;</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1991</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">2224</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">1</span> <span style="color: #339933;">=&gt;</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1992</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">3948</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">2</span> <span style="color: #339933;">=&gt;</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1995</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">2398</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">3</span> <span style="color: #339933;">=&gt;</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1996</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">3489</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>And the JSON output.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">// [[1991,2224],[1992,3948],[1995,2398],[1996,3489]]</span>
<span style="color: #b1b100;">echo</span> <span style="color: #990000;">json_encode</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$data</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Note that sorted data compresses especially well with this approach. It can be used for images, graphics or map coordinates.</p>
<p>This is only one example of how data compression can be useful in our daily work. Although the communication between the server and the client can already be optimized and compressed, we can&#8217;t always be sure that the other side supports compression, so compressing the data ourselves can still help.</p>
<p>It&#8217;s true that the client then has to decompress the data, which can also be slow. In the first case, without compression, we only have the transfer time, as in the diagram below.</p>
<div id="attachment_2609" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/DataTransferWithoutCompression.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/DataTransferWithoutCompression.png" alt="Data Transfer Without Compression" title="Data Transfer Without Compression" width="620" class="size-full wp-image-2609" /></a><p class="wp-caption-text">Time to transfer data without compression!</p></div>
<p>In the second case, we should sum the time for compression, transfer and decompression.</p>
<div id="attachment_2610" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/DataTransferwithCompression.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/DataTransferwithCompression.png" alt="Data Transfer with Compression" title="Data Transfer with Compression" width="620" class="size-full wp-image-2610" /></a><p class="wp-caption-text">Time to send data with compression!</p></div>
<p>All this is important, but in general data compression can be handy in many cases in our daily work. </p>
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/01/30/computer-algorithms-data-compression-with-relative-encoding/' rel='bookmark' title='Computer Algorithms: Data Compression with Relative Encoding'>Computer Algorithms: Data Compression with Relative Encoding</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/16/computer-algorithms-data-compression-with-bitmaps/' rel='bookmark' title='Computer Algorithms: Data Compression with Bitmaps'>Computer Algorithms: Data Compression with Bitmaps</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/23/computer-algorithms-data-compression-with-diagram-encoding-and-pattern-substitution/' rel='bookmark' title='Computer Algorithms: Data Compression with Diagram Encoding and Pattern Substitution'>Computer Algorithms: Data Compression with Diagram Encoding and Pattern Substitution</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>PHP Performance: Bitwise Division</title>
		<link>http://www.stoimen.com/blog/2012/01/05/php-performance-bitwise-division/</link>
		<comments>http://www.stoimen.com/blog/2012/01/05/php-performance-bitwise-division/#comments</comments>
		<pubDate>Thu, 05 Jan 2012 15:38:13 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[web development]]></category>
		<category><![CDATA[bitwise operation]]></category>
		<category><![CDATA[bitwise operators]]></category>
		<category><![CDATA[division]]></category>
		<category><![CDATA[performance tests]]></category>
		<category><![CDATA[php performance]]></category>
		<category><![CDATA[research]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=2577</guid>
		<description><![CDATA[Recently I wrote about binary search and then I said that in some languages, like PHP, bitwise division by two is not faster than the typical “/” operator. However I decided to make some experiments and here are the results. &#8230; <a href="http://www.stoimen.com/blog/2012/01/05/php-performance-bitwise-division/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/01/24/javascript-performance-for-vs-while/' rel='bookmark' title='JavaScript Performance: for vs. while'>JavaScript Performance: for vs. while</a></li>
<li><a href='http://www.stoimen.com/blog/2010/02/02/firebugs-console-time-accuracy/' rel='bookmark' title='Firebug&#8217;s console.time() accuracy'>Firebug&#8217;s console.time() accuracy</a></li>
<li><a href='http://www.stoimen.com/blog/2009/12/30/jquery-live-vs-bind-performance/' rel='bookmark' title='jQuery live() vs bind() performance'>jQuery live() vs bind() performance</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>Recently I wrote about <a href="http://www.stoimen.com/blog/2011/12/26/computer-algorithms-binary-search/" title="Computer Algorithms: Binary Search">binary search</a>, and I said there that in some languages, like PHP, bitwise division by two is not faster than the typical “/” operator. However, I decided to run some experiments, and here are the results.</p>
<h2>Important Note</h2>
<p>It’s very important to say that the following results depend on the machine and the environment!</p>
<h2>Source Code</h2>
<p>Here&#8217;s the PHP source code.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> divide<span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span> 
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$a</span> <span style="color: #339933;">=</span> <span style="color: #990000;">microtime</span><span style="color: #009900;">&#40;</span><span style="color: #009900; font-weight: bold;">true</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$n</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #cc66cc;">300</span><span style="color: #339933;">/</span><span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
	<span style="color: #b1b100;">echo</span> <span style="color: #990000;">microtime</span><span style="color: #009900;">&#40;</span><span style="color: #009900; font-weight: bold;">true</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$a</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
divide<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">100</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">//divide(1000);</span>
<span style="color: #666666; font-style: italic;">//divide(10000);</span>
<span style="color: #666666; font-style: italic;">//divide(100000);</span>
<span style="color: #666666; font-style: italic;">//divide(1000000);</span>
<span style="color: #666666; font-style: italic;">//divide(10000000);</span></pre></div></div>

<p><span id="more-2577"></span><br />
And here is the bitwise version.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> bitwise<span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$a</span> <span style="color: #339933;">=</span> <span style="color: #990000;">microtime</span><span style="color: #009900;">&#40;</span><span style="color: #009900; font-weight: bold;">true</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$n</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #cc66cc;">300</span> <span style="color: #339933;">&gt;&gt;</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
	<span style="color: #b1b100;">echo</span> <span style="color: #990000;">microtime</span><span style="color: #009900;">&#40;</span><span style="color: #009900; font-weight: bold;">true</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$a</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
bitwise<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">100</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">//bitwise(1000);</span>
<span style="color: #666666; font-style: italic;">//bitwise(10000);</span>
<span style="color: #666666; font-style: italic;">//bitwise(100000);</span>
<span style="color: #666666; font-style: italic;">//bitwise(1000000);</span>
<span style="color: #666666; font-style: italic;">//bitwise(10000000);</span></pre></div></div>

<p>Note that each function was called 6 times with the same parameter. For example, divide(100) was called 6 times, and the value reported is the average of those six runs.</p>
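<p>The averaging described above can be sketched roughly as follows. This is only an illustration, not the script used for the measurements: the helper name <code>average_time()</code> and the closures are assumptions made for this example, and it needs PHP 7 or newer because of the type declarations.</p>

```php
<?php
// Hypothetical helper: call $fn $runs times and return the mean
// wall-clock time of one call, in seconds.
function average_time(callable $fn, int $runs = 6): float
{
    $total = 0.0;
    for ($run = 0; $run < $runs; $run++) {
        $start = microtime(true);
        $fn();
        $total += microtime(true) - $start;
    }
    return $total / $runs;
}

$n = 100000;

// Division by two with the "/" operator.
$divide = function () use ($n) {
    $x = 300;
    for ($i = 0; $i < $n; $i++) {
        $y = $x / 2;
    }
};

// Division by two with a right shift.
$shift = function () use ($n) {
    $x = 300;
    for ($i = 0; $i < $n; $i++) {
        $y = $x >> 1;
    }
};

printf("/  : %.6f s\n", average_time($divide));
printf(">> : %.6f s\n", average_time($shift));
```

<p>The operand is kept in a variable so that the engine cannot simply precompute a constant expression. The absolute numbers will differ from machine to machine.</p>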
<h2>Results</h2>
<p>Back in my binary search post I said that in PHP the bitwise operator &#8220;&gt;&gt; 1&#8221; is not faster than ordinary division with the &#8220;/&#8221; operator. However, the results tell us that bitwise division is slightly faster, as you can see in the table (times in seconds) and the chart below.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">n		<span style="color: #0000ff;">&quot;&gt;&gt;&quot;</span>			<span style="color: #0000ff;">&quot;/&quot;</span>
<span style="color: #cc66cc;">100</span>		<span style="color:#800080;">0.0002334912618</span>		<span style="color:#800080;">0.000311803817749</span>
<span style="color: #cc66cc;">1000</span>		<span style="color:#800080;">0.001911004384359</span>	<span style="color:#800080;">0.007335503896078</span>
<span style="color: #cc66cc;">10000</span>		<span style="color:#800080;">0.013423800468445</span>	<span style="color:#800080;">0.039460102717081</span>
<span style="color: #cc66cc;">100000</span>		<span style="color:#800080;">0.14417803287506</span>	<span style="color:#800080;">0.21413381894429</span>
<span style="color: #cc66cc;">1000000</span>		<span style="color:#800080;">1.15839115778605</span>	<span style="color:#800080;">1.17152162392935</span>
<span style="color: #cc66cc;">10000000</span>	<span style="color:#800080;">10.556711634</span>		<span style="color:#800080;">11.0911625623705</span></pre></div></div>

<div id="attachment_2596" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/bitwise-divide-by-two.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/bitwise-divide-by-two.png" alt="PHP Performance: Bitwise division is slightly faster!" title="bitwise-divide-by-two" width="600" height="371" class="size-full wp-image-2596" /></a><p class="wp-caption-text">Bitwise division is slightly faster!</p></div>
<h2>Conclusion</h2>
<p>Although bitwise division is a bit faster, the difference is so small that you would need to work with very large data to gain any noticeable performance.</p>
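<p>One more thing to keep in mind before swapping the operators: they are not always interchangeable. The short sketch below shows where the results differ. It uses <code>intdiv()</code>, which is available from PHP 7.</p>

```php
<?php
// "/" returns a float when the result is not a whole number,
// while ">>" always returns an integer.
var_dump(7 / 2);         // float(3.5)
var_dump(7 >> 1);        // int(3)
var_dump(intdiv(7, 2));  // int(3)

// For negative numbers the shift rounds toward negative infinity,
// while intdiv() truncates toward zero.
var_dump(-7 >> 1);        // int(-4)
var_dump(intdiv(-7, 2));  // int(-3)
```

<p>So the shift is a safe replacement only for non-negative integers, which is the case for array indexes in binary search.</p>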
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/01/24/javascript-performance-for-vs-while/' rel='bookmark' title='JavaScript Performance: for vs. while'>JavaScript Performance: for vs. while</a></li>
<li><a href='http://www.stoimen.com/blog/2010/02/02/firebugs-console-time-accuracy/' rel='bookmark' title='Firebug&#8217;s console.time() accuracy'>Firebug&#8217;s console.time() accuracy</a></li>
<li><a href='http://www.stoimen.com/blog/2009/12/30/jquery-live-vs-bind-performance/' rel='bookmark' title='jQuery live() vs bind() performance'>jQuery live() vs bind() performance</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/01/05/php-performance-bitwise-division/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Computer Algorithms: Interpolation Search</title>
		<link>http://www.stoimen.com/blog/2012/01/02/computer-algorithms-interpolation-search/</link>
		<comments>http://www.stoimen.com/blog/2012/01/02/computer-algorithms-interpolation-search/#comments</comments>
		<pubDate>Mon, 02 Jan 2012 18:31:42 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[binary search]]></category>
		<category><![CDATA[Binary search algorithm]]></category>
		<category><![CDATA[Binary search tree]]></category>
		<category><![CDATA[even binary search]]></category>
		<category><![CDATA[Interpolation]]></category>
		<category><![CDATA[Interpolation search]]></category>
		<category><![CDATA[interpolation search algorithm]]></category>
		<category><![CDATA[Jump search]]></category>
		<category><![CDATA[Logarithm]]></category>
		<category><![CDATA[search algorithm]]></category>
		<category><![CDATA[search algorithms]]></category>
		<category><![CDATA[searching algorithms]]></category>
		<category><![CDATA[Selection algorithm]]></category>
		<category><![CDATA[Technology/Internet]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=2560</guid>
		<description><![CDATA[Overview I wrote about binary search in my previous post, which is indeed one very fast searching algorithm, but in some cases we can achieve even faster results. Such an algorithm is the “interpolation search” &#8211; perhaps the most interesting &#8230; <a href="http://www.stoimen.com/blog/2012/01/02/computer-algorithms-interpolation-search/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2011/12/26/computer-algorithms-binary-search/' rel='bookmark' title='Computer Algorithms: Binary Search'>Computer Algorithms: Binary Search</a></li>
<li><a href='http://www.stoimen.com/blog/2011/12/12/computer-algorithms-jump-search/' rel='bookmark' title='Computer Algorithms: Jump Search'>Computer Algorithms: Jump Search</a></li>
<li><a href='http://www.stoimen.com/blog/2011/11/24/computer-algorithms-sequential-search/' rel='bookmark' title='Computer Algorithms: Sequential Search'>Computer Algorithms: Sequential Search</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>Overview</h2>
<p>I wrote about <a title="Computer Algorithms: Binary Search" href="http://www.stoimen.com/blog/2011/12/26/computer-algorithms-binary-search/">binary search</a> in my previous post, which is indeed a very fast searching algorithm, but in some cases we can achieve even faster results. One such algorithm is “interpolation search” &#8211; perhaps the most interesting of all searching algorithms. However, we shouldn’t forget that the data must satisfy some conditions. First, the array must be sorted. We must also know the bounds of the interval.</p>
<p>Why is that? Well, this algorithm tries to follow the way we search for a name in a phone book, or a word in a dictionary. We humans know in advance that if the name we’re searching for starts with a &#8220;B&#8221;, like &#8220;Bond&#8221; for instance, we should start searching near the beginning of the phone book. Likewise, if we&#8217;re searching for the word “algorithm” in a dictionary, we know that it should be somewhere near the beginning. This is because we know the order of the letters, we know the interval (a-z), and we intuitively assume that the words are evenly distributed. These facts are enough to realize that binary search can be a bad choice. Indeed, the binary search algorithm divides the list into two equal sub-lists, which is wasteful if we know in advance that the searched item is somewhere near the beginning or the end of the list. Yes, we can also use <a href="http://www.stoimen.com/blog/2011/12/12/computer-algorithms-jump-search/" title="Computer Algorithms: Jump Search">jump search</a> if the item is near the beginning, but jump search is not so effective if the item is near the end.</p>
<p>So interpolation search is based on some simple facts. Binary search divides the interval into two equal sub-lists, as shown in the image below.</p>
<div id="attachment_2580" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/InterpolationSearchfig.1.png"><img class="size-full wp-image-2580" title="Interpolation Search fig. 1" src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/InterpolationSearchfig.1.png" alt="Binary search basic approach" width="620" /></a><p class="wp-caption-text">The binary search algorithm divides the list in two equal sub-lists!</p></div>
<p>What happens if we don&#8217;t use the constant ½, but another, more accurate constant &#8220;C&#8221; that can lead us closer to the searched item?</p>
<div id="attachment_2579" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/InterpolationSearchfig.2.png"><img class="size-full wp-image-2579" title="Interpolation Search fig. 2" src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/InterpolationSearchfig.2.png" alt="Interpolation search" width="620" /></a><p class="wp-caption-text">The interpolation search algorithm tries to improve the binary search!</p></div>
<p><span id="more-2560"></span></p>
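<p>The question is how to find this value. Well, we know the bounds of the interval, and looking closely at the image above we can define the following formula, where x is the searched value and L and R are the values at the left and right ends of the interval.</p>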

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">C <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>x<span style="color: #339933;">-</span>L<span style="color: #009900;">&#41;</span><span style="color: #339933;">/</span><span style="color: #009900;">&#40;</span>R<span style="color: #339933;">-</span>L<span style="color: #009900;">&#41;</span></pre></div></div>

<p>When the values are spread fairly evenly, this estimate usually lands closer to the searched value than the midpoint does.</p>
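<p>To make the formula concrete, here is a small worked example. It borrows the sample list and the searched value from the implementation in this post; the variable names <code>$c</code> and <code>$probe</code> are only used for this illustration.</p>

```php
<?php
// Sample data taken from the implementation section of this post.
$list = array(201, 209, 232, 233, 332, 399, 400);
$x = 332;

$l = 0;
$r = count($list) - 1;

// C = (x - L) / (R - L), where L and R are the values at the two ends.
$c = ($x - $list[$l]) / ($list[$r] - $list[$l]);

// Scale C to an index between $l and $r.
$probe = (int) round($l + $c * ($r - $l));

printf("C = %.3f, probe index = %d, value = %d\n", $c, $probe, $list[$probe]);
// prints: C = 0.658, probe index = 4, value = 332
```

<p>For this input the first probe already hits the searched value, while binary search would first look at the middle element (index 3, value 233).</p>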
<h2>Implementation</h2>
<p>Here&#8217;s an implementation of interpolation search in PHP.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$list</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">201</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">209</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">232</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">233</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">332</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">399</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">400</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$x</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">332</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">function</span> interpolation_search<span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #339933;">,</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$l</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$r</span> <span style="color: #339933;">=</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$l</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$r</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$l</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">==</span> <span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$r</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$l</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">==</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #b1b100;">return</span> <span style="color: #000088;">$l</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #666666; font-style: italic;">// not found</span>
				<span style="color: #b1b100;">return</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span>
		<span style="color: #009900;">&#125;</span>
&nbsp;
		<span style="color: #000088;">$k</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$l</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">/</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$r</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$l</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #666666; font-style: italic;">// not found</span>
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$k</span> <span style="color: #339933;">&lt;</span> <span style="color: #cc66cc;">0</span> <span style="color: #339933;">||</span> <span style="color: #000088;">$k</span> <span style="color: #339933;">&gt;</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #b1b100;">return</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
&nbsp;
		<span style="color: #000088;">$mid</span> <span style="color: #339933;">=</span> <span style="color: #990000;">round</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$l</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$k</span><span style="color: #339933;">*</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$r</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$l</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$mid</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$r</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$mid</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span> <span style="color: #339933;">&gt;</span> <span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$mid</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$l</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$mid</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #666666; font-style: italic;">// success!</span>
			<span style="color: #b1b100;">return</span> <span style="color: #000088;">$mid</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// not found: the loop ended without a match</span>
	<span style="color: #b1b100;">return</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #b1b100;">echo</span> interpolation_search<span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #339933;">,</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<h2>Complexity</h2>
<p>When the values are evenly distributed, the average complexity of this algorithm is log<sub>2</sub>(log<sub>2</sub>(n)) + 1; in the worst case it degrades to O(n). While I won’t cover the proof, I’ll say that this is a very slowly growing function, as you can see in the following chart.</p>
<p><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/01/logntologlogn.png"><img class="alignnone size-full wp-image-2578" title="log(n) compared to log(log(n))" src="http://www.stoimen.com/blog/wp-content/uploads/2012/01/logntologlogn.png" alt="log(n) compared to log(log(n))" width="600" height="371" /></a></p>
<p>Indeed, when the values are evenly distributed over the interval this search algorithm can be extremely useful &#8211; way faster than binary search. As you can see, log<sub>2</sub>(log<sub>2</sub>(100 M)) ≈ 4.73!</p>
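<p>If you want to check these numbers yourself, a few lines of PHP are enough. The sketch below simply evaluates both functions for a few values of n that I picked for illustration.</p>

```php
<?php
// Compare log2(n) with log2(log2(n)) for growing n.
foreach (array(1000, 1000000, 100000000) as $n) {
    $binary = log($n, 2);
    $interpolation = log($binary, 2);
    printf("n = %11d  log2(n) = %6.2f  log2(log2(n)) = %5.2f\n", $n, $binary, $interpolation);
}
```

<p>For n = 100,000,000 the last column shows 4.73, against roughly 26.58 for log<sub>2</sub>(n).</p>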
<h2>Application</h2>
<p>As I said already, this algorithm is extremely interesting and appropriate in many use cases. Here’s an example where interpolation search can be used. Let’s say there’s an array of user data, sorted by year of birth. We know in advance that all users were born between 1980 and 1990. In this case sequential or even binary search can be slower than interpolation search.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$list</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span>
	<span style="color: #cc66cc;">0</span> <span style="color: #339933;">=&gt;</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'year'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1980</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'name'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'John Smith'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'username'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'John'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #cc66cc;">1</span> <span style="color: #339933;">=&gt;</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'year'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1980</span><span style="color: #339933;">,</span> <span style="color: #339933;">...</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #339933;">...</span>
	<span style="color: #cc66cc;">10394</span> <span style="color: #339933;">=&gt;</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'year'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1981</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'name'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'Tomas M.'</span><span style="color: #339933;">,</span> <span style="color: #339933;">...</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #339933;">...</span>
	<span style="color: #cc66cc;">348489</span> <span style="color: #339933;">=&gt;</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'year'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1985</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'name'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'James Bond'</span><span style="color: #339933;">,</span> <span style="color: #339933;">...</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #339933;">...</span>
	<span style="color: #cc66cc;">2808008</span> <span style="color: #339933;">=&gt;</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'year'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1990</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'name'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'W.A. Mozart'</span><span style="color: #339933;">,</span> <span style="color: #339933;">...</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Now, if we search for somebody born in 1981, interpolation search is a good approach, because the records are sorted by year.</p>
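<p>As a rough illustration, the search by year could be sketched as below. The function name <code>interpolation_search_by_year</code> and the small <code>$people</code> array are made up for this example, and the sketch assumes the records are sorted by an integer <code>year</code> key.</p>

```php
<?php
// Hypothetical helper: interpolation search over records sorted by 'year'.
// Returns the index of one matching record, or -1 when no record matches.
function interpolation_search_by_year($year, $records)
{
	$left = 0;
	$right = count($records) - 1;

	while ($left <= $right
		&& $year >= $records[$left]['year']
		&& $year <= $records[$right]['year']) {

		$low = $records[$left]['year'];
		$high = $records[$right]['year'];

		// all remaining years are equal, and the loop condition
		// guarantees that they equal $year
		if ($high == $low) {
			return $left;
		}

		// estimate the position, assuming the years are spread evenly
		$pos = $left + intdiv(($year - $low) * ($right - $left), $high - $low);

		if ($records[$pos]['year'] == $year) {
			return $pos;
		} elseif ($records[$pos]['year'] < $year) {
			$left = $pos + 1;
		} else {
			$right = $pos - 1;
		}
	}

	return -1;
}

// made-up sample data, much smaller than the list in the article
$people = array(
	array('year' => 1980, 'name' => 'John Smith'),
	array('year' => 1980, 'name' => 'Jane Doe'),
	array('year' => 1981, 'name' => 'Tomas M.'),
	array('year' => 1985, 'name' => 'James Bond'),
	array('year' => 1990, 'name' => 'W.A. Mozart'),
);

$index = interpolation_search_by_year(1981, $people);
echo $index === -1 ? 'not found' : $people[$index]['name'];
```

<p>When several people share the same year, this sketch returns only one of them. Collecting all matches would need an extra scan to the left and right of the returned index.</p>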
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2011/12/26/computer-algorithms-binary-search/' rel='bookmark' title='Computer Algorithms: Binary Search'>Computer Algorithms: Binary Search</a></li>
<li><a href='http://www.stoimen.com/blog/2011/12/12/computer-algorithms-jump-search/' rel='bookmark' title='Computer Algorithms: Jump Search'>Computer Algorithms: Jump Search</a></li>
<li><a href='http://www.stoimen.com/blog/2011/11/24/computer-algorithms-sequential-search/' rel='bookmark' title='Computer Algorithms: Sequential Search'>Computer Algorithms: Sequential Search</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/01/02/computer-algorithms-interpolation-search/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Computer Algorithms: Binary Search</title>
		<link>http://www.stoimen.com/blog/2011/12/26/computer-algorithms-binary-search/</link>
		<comments>http://www.stoimen.com/blog/2011/12/26/computer-algorithms-binary-search/#comments</comments>
		<pubDate>Mon, 26 Dec 2011 13:14:25 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[binary search]]></category>
		<category><![CDATA[Binary search algorithm]]></category>
		<category><![CDATA[Control flow]]></category>
		<category><![CDATA[famous and best suitable search algorithm]]></category>
		<category><![CDATA[Fibonacci number]]></category>
		<category><![CDATA[Fibonacci search algorithm]]></category>
		<category><![CDATA[Fibonacci search technique]]></category>
		<category><![CDATA[Golden section search]]></category>
		<category><![CDATA[golden section search algorithm]]></category>
		<category><![CDATA[Jump search]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Recursion]]></category>
		<category><![CDATA[Recursion theory]]></category>
		<category><![CDATA[recursive and iterative solution]]></category>
		<category><![CDATA[search algorithm]]></category>
		<category><![CDATA[search algorithms]]></category>
		<category><![CDATA[sequential search]]></category>
		<category><![CDATA[suitable search algorithm]]></category>
		<category><![CDATA[Theoretical computer science]]></category>
		<category><![CDATA[two algorithms]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=2538</guid>
		<description><![CDATA[Overview The binary search is perhaps the most famous and best suitable search algorithm for sorted arrays. Indeed when the array is sorted it is useless to check every single item against the desired value. Of course a better approach &#8230; <a href="http://www.stoimen.com/blog/2011/12/26/computer-algorithms-binary-search/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/01/02/computer-algorithms-interpolation-search/' rel='bookmark' title='Computer Algorithms: Interpolation Search'>Computer Algorithms: Interpolation Search</a></li>
<li><a href='http://www.stoimen.com/blog/2011/12/12/computer-algorithms-jump-search/' rel='bookmark' title='Computer Algorithms: Jump Search'>Computer Algorithms: Jump Search</a></li>
<li><a href='http://www.stoimen.com/blog/2011/11/24/computer-algorithms-sequential-search/' rel='bookmark' title='Computer Algorithms: Sequential Search'>Computer Algorithms: Sequential Search</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>Overview</h2>
<p>Binary search is perhaps the best-known search algorithm for sorted arrays, and one of the best suited to them. When the array is sorted, there is no need to check every single item against the desired value. A better approach is to jump straight to the middle item of the array: if that item’s value is greater than the desired one, the desired value can only be in the left half, so we jump back to the middle of that half. Thus the new interval is half the size of the initial one.</p>
<div id="attachment_2561" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2011/12/BinarySearchfig.1.png"><img class="size-full wp-image-2561" title="Binary Search fig.1" src="http://www.stoimen.com/blog/wp-content/uploads/2011/12/BinarySearchfig.1.png" alt="Binary search basic implementation" width="620" /></a><p class="wp-caption-text">Basic implementation of binary search</p></div>
<p>If the searched value is greater than the one at the middle of the sorted array, we jump forward instead. Again, on each step the list under consideration is half as long as on the previous step, as shown in the image below.</p>
<div id="attachment_2564" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2011/12/BinarySearchfig.2.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2011/12/BinarySearchfig.2.png" alt="Binary search - basic implementation" title="Binary Search fig.2" width="620" class="size-full wp-image-2564" /></a><p class="wp-caption-text">Binary search - basic implementation</p></div>
<h2>Implementation</h2>
<p>Here’s a sample implementation of this algorithm in <a href="http://www.stoimen.com/blog/category/php/" title="PHP on stoimen.com">PHP</a>. The nature of this approach suggests a recursive implementation, but as we know, recursion can sometimes be dangerous. That&#8217;s why both the recursive and the iterative solutions are shown here.<span id="more-2538"></span></p>
<h3>Recursive Binary Search</h3>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$list</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">3</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">5</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">8</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">13</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">21</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">34</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">55</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">89</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">144</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$x</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">55</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">function</span> binary_search<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #000088;">$list</span><span style="color: #339933;">,</span> <span style="color: #000088;">$left</span><span style="color: #339933;">,</span> <span style="color: #000088;">$right</span><span style="color: #009900;">&#41;</span> 
<span style="color: #009900;">&#123;</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$left</span> <span style="color: #339933;">&gt;</span> <span style="color: #000088;">$right</span><span style="color: #009900;">&#41;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #000088;">$mid</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$left</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$right</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&gt;&gt;</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$mid</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">==</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #000088;">$mid</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">elseif</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$mid</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">&gt;</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">return</span> binary_search<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #000088;">$list</span><span style="color: #339933;">,</span> <span style="color: #000088;">$left</span><span style="color: #339933;">,</span> <span style="color: #000088;">$mid</span><span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">elseif</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$mid</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">return</span> binary_search<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #000088;">$list</span><span style="color: #339933;">,</span> <span style="color: #000088;">$mid</span><span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span> <span style="color: #000088;">$right</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #b1b100;">echo</span> binary_search<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #000088;">$list</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<h3>Iterative Binary Search</h3>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$list</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">3</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">5</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">8</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">13</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">21</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">34</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">55</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">89</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">144</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$x</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">55</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">function</span> iterative_binary_search<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #000088;">$list</span><span style="color: #009900;">&#41;</span> 
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$left</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$right</span> <span style="color: #339933;">=</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$left</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$right</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$mid</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$left</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$right</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&gt;&gt;</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$mid</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">==</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #b1b100;">return</span> <span style="color: #000088;">$mid</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">elseif</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$mid</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">&gt;</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$right</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$mid</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">elseif</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$mid</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$left</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$mid</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #b1b100;">echo</span> iterative_binary_search<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #000088;">$list</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<h2>Caution: Optimization</h2>
<p>Many optimization tips found online recommend replacing the supposedly expensive division by 2 with its bitwise equivalent, since (n >> 1) equals n/2 rounded down for non-negative integers. This is not always faster, and the gain depends heavily on the programming language. In PHP, for instance, the two operations perform fairly similarly, as PHP itself is written in C. You have to be aware of language-specific features when optimizing code.</p>
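<p>One PHP-specific detail is worth a short sketch: the two operations do not always return the same value. The snippet below uses arbitrary values and only illustrates the result types; it says nothing about speed.</p>

```php
<?php
// Compare three ways of halving a non-negative integer in PHP.
$n = 7;

$shifted = $n >> 1;        // bitwise shift keeps an integer: int(3)
$divided = $n / 2;         // "/" yields a float for odd operands: float(3.5)
$floored = intdiv($n, 2);  // integer division (PHP 7+): int(3)

var_dump($shifted, $divided, $floored);

// An array index should be an integer, so the midpoint of an interval
// is computed with the shift or with intdiv(), not with plain "/".
$left = 4;
$right = 9;
$mid = intdiv($left + $right, 2); // int(6)
```

<p>This is why the implementations above use the shift: the midpoint has to be usable as an array index.</p>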
<h2>Fibonacci Search</h2>
<p>Every developer has heard of Fibonacci and his sequence. The Fibonacci search algorithm is essentially a variation of binary search. The only difference is that binary search divides the list into two equal parts, while Fibonacci search divides it into two unequal parts. Sometimes searching is faster with such unequal sub-lists. However, the lengths of the sub-lists are not arbitrary: they are consecutive Fibonacci numbers.</p>
<p>The ratio of two consecutive Fibonacci numbers approaches the golden ratio as the numbers grow. This leads to another variation of Fibonacci and binary search &#8211; the golden section search. The only difference is that you have to divide the list into two parts exactly by the golden ratio.</p>
<div id="attachment_2563" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2011/12/GoldenRatioSearch.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2011/12/GoldenRatioSearch.png" alt="Golden Section Search" title="Golden Section Search" width="620" class="size-full wp-image-2563" /></a><p class="wp-caption-text">The golden section search doesn&#039;t divide the array into two equal sub-lists!</p></div>
<p>The complexity of both the Fibonacci and the golden section search algorithms is identical to that of binary search. However, these two algorithms are rarely used in practice: they are harder to implement than binary search, and their advantage depends on how the data is distributed.</p>
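<p>For readers who want to see the idea in code, here is a minimal sketch of one common formulation of Fibonacci search. The function name <code>fibonacci_search</code> and its variable names are my own choices for this illustration, and it follows the same conventions as the binary search examples above (sorted list, -1 when nothing is found).</p>

```php
<?php
$list = array(0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144);
$x = 55;

function fibonacci_search($x, $list)
{
	$n = count($list);

	// find the smallest Fibonacci number $fib that is >= $n,
	// keeping its two predecessors in $fib1 and $fib2
	$fib2 = 0;
	$fib1 = 1;
	$fib = $fib2 + $fib1;
	while ($fib < $n) {
		$fib2 = $fib1;
		$fib1 = $fib;
		$fib = $fib2 + $fib1;
	}

	// everything up to and including $offset is already ruled out
	$offset = -1;

	while ($fib > 1) {
		// probe at a Fibonacci distance, never past the end of the list
		$i = min($offset + $fib2, $n - 1);

		if ($list[$i] < $x) {
			// continue in the right part: move one Fibonacci number down
			$fib = $fib1;
			$fib1 = $fib2;
			$fib2 = $fib - $fib1;
			$offset = $i;
		} elseif ($list[$i] > $x) {
			// continue in the left part: move two Fibonacci numbers down
			$fib = $fib2;
			$fib1 = $fib1 - $fib2;
			$fib2 = $fib - $fib1;
		} else {
			return $i;
		}
	}

	// a single candidate may be left right after $offset
	if ($fib1 && $offset + 1 < $n && $list[$offset + 1] == $x) {
		return $offset + 1;
	}

	return -1;
}

echo fibonacci_search($x, $list);
```

<p>Note that the probe position is computed only with additions and subtractions, with no division or shift.</p>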
<h2>Complexity</h2>
<p>The complexity of the binary search algorithm is O(log(n)), because every step halves the interval that is left to search. This makes it far more efficient than the sequential search, which is O(n).</p>
<div id="attachment_2562" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2011/12/chart_1.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2011/12/chart_1.png" alt="log(n)" title="log(n)" width="600" height="371" class="size-full wp-image-2562" /></a><p class="wp-caption-text">f(n) = log(n) compared to f(n) = n</p></div>
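<p>To make the difference tangible, the sketch below counts how many items each approach inspects before it finds the last element of a sorted list. The two counting functions are hypothetical helpers written only for this comparison, and the list size is arbitrary.</p>

```php
<?php
// Count the items a sequential search inspects before it stops.
function count_sequential_steps($x, $list)
{
	$steps = 0;
	foreach ($list as $value) {
		$steps++;
		if ($value == $x) {
			break;
		}
	}
	return $steps;
}

// Count the items a binary search inspects before it stops.
function count_binary_steps($x, $list)
{
	$left = 0;
	$right = count($list) - 1;
	$steps = 0;

	while ($left <= $right) {
		$steps++;
		$mid = ($left + $right) >> 1;

		if ($list[$mid] == $x) {
			break;
		} elseif ($list[$mid] > $x) {
			$right = $mid - 1;
		} else {
			$left = $mid + 1;
		}
	}
	return $steps;
}

// a sorted list of one million items; we look for the last one
$list = range(0, 999999);
$x = 999999;

echo count_sequential_steps($x, $list), "\n"; // every item is inspected
echo count_binary_steps($x, $list), "\n";     // about log2(n) items are inspected
```

<p>For a million items the sequential search inspects all of them in this worst case, while the binary search needs no more than 20 steps.</p>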
<h2>Application</h2>
<p>Binary search is used so widely that listing examples is hardly necessary. The algorithm is easy to implement and at the same time very fast. It only works on sorted lists, however, and this is a limitation. Also, compared to the jump search, here we have more than one jump back in most cases, which can sometimes be more expensive than jumping forward. But is this the fastest search algorithm? I’ll try to answer this question in my next article.</p>
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/01/02/computer-algorithms-interpolation-search/' rel='bookmark' title='Computer Algorithms: Interpolation Search'>Computer Algorithms: Interpolation Search</a></li>
<li><a href='http://www.stoimen.com/blog/2011/12/12/computer-algorithms-jump-search/' rel='bookmark' title='Computer Algorithms: Jump Search'>Computer Algorithms: Jump Search</a></li>
<li><a href='http://www.stoimen.com/blog/2011/11/24/computer-algorithms-sequential-search/' rel='bookmark' title='Computer Algorithms: Sequential Search'>Computer Algorithms: Sequential Search</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2011/12/26/computer-algorithms-binary-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Computer Algorithms: Jump Search</title>
		<link>http://www.stoimen.com/blog/2011/12/12/computer-algorithms-jump-search/</link>
		<comments>http://www.stoimen.com/blog/2011/12/12/computer-algorithms-jump-search/#comments</comments>
		<pubDate>Mon, 12 Dec 2011 09:15:38 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Analysis of algorithms]]></category>
		<category><![CDATA[binary search]]></category>
		<category><![CDATA[Binary search algorithm]]></category>
		<category><![CDATA[Jump search]]></category>
		<category><![CDATA[jump search algorithm]]></category>
		<category><![CDATA[jumping forward]]></category>
		<category><![CDATA[Linear search]]></category>
		<category><![CDATA[primitive jump search]]></category>
		<category><![CDATA[search algorithms]]></category>
		<category><![CDATA[Selection algorithm]]></category>
		<category><![CDATA[sequential search]]></category>
		<category><![CDATA[sequential search algorithm]]></category>
		<category><![CDATA[sorting algorithm]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=2521</guid>
		<description><![CDATA[Overview In my previous article I discussed how the sequential (linear) search can be used on ordered lists, but then we were limited by the specific features of the given task. Obviously the sequential search on an ordered list &#8230; <a href="http://www.stoimen.com/blog/2011/12/12/computer-algorithms-jump-search/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2011/12/26/computer-algorithms-binary-search/' rel='bookmark' title='Computer Algorithms: Binary Search'>Computer Algorithms: Binary Search</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/02/computer-algorithms-interpolation-search/' rel='bookmark' title='Computer Algorithms: Interpolation Search'>Computer Algorithms: Interpolation Search</a></li>
<li><a href='http://www.stoimen.com/blog/2011/11/24/computer-algorithms-sequential-search/' rel='bookmark' title='Computer Algorithms: Sequential Search'>Computer Algorithms: Sequential Search</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>Overview</h2>
<p>In <a title="Computer Algorithms: Linear Search in Sorted Lists" href="http://www.stoimen.com/blog/2011/12/02/computer-algorithms-linear-search-in-sorted-lists/">my previous article</a> I discussed how the sequential (linear) search can be used on ordered lists, but there we were limited by the specific features of the given task. Obviously the <a href="http://www.stoimen.com/blog/2011/11/24/computer-algorithms-sequential-search/" title="Computer Algorithms: Sequential Search">sequential search</a> on an ordered list is inefficient, because we consecutively check every one of its elements. Is there any way to optimize this approach? Well, because we know that the list is sorted, we can check only some of its items. When a checked item is less than the desired value, we skip some of the following items by jumping ahead and then check again. If the checked element is greater than or equal to the desired value, we can be sure that the desired value, if present, lies between the previously checked element and the currently checked one. If not, we jump ahead again. A good approach is to use a fixed step. Let’s say the list length is n and the step’s length is k. Basically we check list(0), then list(k-1), list(2k-1) etc. Once we find the interval where the value might be (list(m*k-1) &lt; x &lt;= list((m+1)*k-1)), we can perform a sequential search between the last two checked positions. This approach avoids many of the weaknesses of the sequential search algorithm, because many of its comparisons are eliminated.</p>
<h2>How to choose the step&#8217;s length</h2>
<p>We know that it is good practice to use a fixed-size step. When the step is 1, the algorithm is just the traditional sequential search. The question is what the length of the step should be, and whether there is any relation between the length of the list (n) and the length of the step (k). Indeed there is such a relation, and many sources state directly that the best step length is k = √n. Why is that?</p>
<p>Well, in the worst case we do n/k jumps, and if the last checked value is greater than the desired one, we do at most k-1 more comparisons. This means n/k + k - 1 comparisons in total. Now the question is for which value of k this function reaches its minimum. For those of you who remember maths classes, the minimum is found by setting the derivative with respect to k to zero: -n/(k^2) + 1 = 0. Solving gives k^2 = n, so the minimum is reached for k = √n.</p>
<p>Of course you don’t need to prove this every time you use this algorithm. Instead you can directly assign √n to be the step length. However it is good to be familiar with this approach when trying to optimize an algorithm.</p>
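<p>If you prefer a numeric check over calculus, the worst-case formula n/k + k - 1 can simply be evaluated for every possible step. The helper name <code>worst_case_comparisons</code> and the list length of 10000 are assumptions made for this sketch.</p>

```php
<?php
// Worst-case number of comparisons of jump search
// for a list of length $n and a step of length $k.
function worst_case_comparisons($n, $k)
{
	return $n / $k + $k - 1;
}

$n = 10000;

// try every step length and remember the cheapest one
$bestK = 1;
$bestCost = worst_case_comparisons($n, 1);

for ($k = 2; $k <= $n; $k++) {
	$cost = worst_case_comparisons($n, $k);
	if ($cost < $bestCost) {
		$bestCost = $cost;
		$bestK = $k;
	}
}

echo $bestK, "\n";    // 100, which is sqrt(10000)
echo $bestCost, "\n"; // 199
```

<p>The brute-force result agrees with the derivation: the cheapest step for 10000 items is 100.</p>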
<p>Let’s consider the following list: (0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610). Its length is 16, so the step is √16 = 4. Jump search will find the value 55 with the following steps.</p>
<div id="attachment_2539" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2011/12/jump-search-fig-1.png"><img class="size-full wp-image-2539" title="jump-search-fig-1" src="http://www.stoimen.com/blog/wp-content/uploads/2011/12/jump-search-fig-1.png" alt="Jump search basic implementation" width="620" /></a><p class="wp-caption-text">Jump search skips some of the items of the list in order to improve performance!</p></div>
<h2>Implementation</h2>
<p>Let’s see an example of jump search, written in <a title="PHP on stoimen.com" href="http://www.stoimen.com/blog/category/php/">PHP</a>.<span id="more-2521"></span></p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$list</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #cc66cc;">1000</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$i</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// now we have a sorted list: (0, 1, 2, 3, ..., 999)</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">function</span> jump_search<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #000088;">$list</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #666666; font-style: italic;">// calculate the step</span>
	<span style="color: #000088;">$len</span> <span style="color: #339933;">=</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$step</span> <span style="color: #339933;">=</span> <span style="color: #990000;">floor</span><span style="color: #009900;">&#40;</span><span style="color: #990000;">sqrt</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$len</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$prev</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$step</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$len</span> ? <span style="color: #000088;">$step</span> <span style="color: #339933;">:</span> <span style="color: #000088;">$len</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$prev</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$step</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$step</span> <span style="color: #339933;">+=</span> <span style="color: #990000;">floor</span><span style="color: #009900;">&#40;</span><span style="color: #990000;">sqrt</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$len</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$prev</span> <span style="color: #339933;">&gt;=</span> <span style="color: #000088;">$len</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">FALSE</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$prev</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$prev</span><span style="color: #339933;">++;</span>
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$prev</span> <span style="color: #339933;">==</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$step</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$len</span> ? <span style="color: #000088;">$step</span> <span style="color: #339933;">:</span> <span style="color: #000088;">$len</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">FALSE</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$list</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$prev</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">==</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #000088;">$prev</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">FALSE</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #b1b100;">echo</span> <span style="color: #009900;">&#40;</span>int<span style="color: #009900;">&#41;</span>jump_search<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">674</span><span style="color: #339933;">,</span> <span style="color: #000088;">$list</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Here we have a sorted list with 1000 elements that looks like this: (0, 1, 2, &#8230;, 999). Sequential search finds the value 674 only after about 674 iterations, while jump search reaches it on the 44th iteration. This shows the advantage of jump search over sequential search on ordered lists.</p>
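<p>To make the comparison concrete, here is a minimal sketch that counts element comparisons for both approaches on the same list. The helper names <code>count_sequential</code> and <code>count_jump</code> are made up for this illustration, and the exact totals differ slightly from the figures above depending on whether you count loop iterations or comparisons; the order of magnitude is the point.</p>

```php
<?php
// Illustrative helpers (not part of the article's code): each returns the
// number of element comparisons made before the value is located.

function count_sequential(array $list, $x)
{
    $comparisons = 0;
    foreach ($list as $item) {
        $comparisons++;
        if ($item == $x) {
            break;
        }
    }
    return $comparisons;
}

function count_jump(array $list, $x)
{
    $len = count($list);
    $step = (int) floor(sqrt($len));
    $comparisons = 0;
    $prev = 0;
    $next = $step;

    // jump phase: hop forward one block at a time
    while ($next < $len) {
        $comparisons++;
        if ($list[$next] >= $x) {
            break;
        }
        $prev = $next;
        $next += $step;
    }

    // linear phase: scan forward inside the block that may hold $x
    $end = min($next, $len - 1);
    for ($i = $prev; $i <= $end; $i++) {
        $comparisons++;
        if ($list[$i] == $x) {
            break;
        }
    }
    return $comparisons;
}

$list = range(0, 999);
echo "sequential: ", count_sequential($list, 674), " comparisons\n";
echo "jump:       ", count_jump($list, 674), " comparisons\n";
```

<p>On this list the sequential count is in the hundreds while the jump count is a few dozen.</p>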
<h2>Further Optimization</h2>
<p>Although all the examples here deal with small lists, in practice lists are often much larger. The step itself can then be a very large number, so once you know the interval where the desired value could be, you can perform jump search again inside that interval.</p>
<p>We saw that the best size of the step is √n. It is not a good idea, though, to make the first check at the first element of the list, and the example above did not do that either. A better option is to begin from the k-th item (in the example above, k is the step). Now we can improve the above solution.</p>
<div id="attachment_2542" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2011/12/jump-search-fig-2.png"><img class="size-full wp-image-2542" title="jump-search-fig-2" src="http://www.stoimen.com/blog/wp-content/uploads/2011/12/jump-search-fig-2.png" alt="Basic jump search can be slightly optimized!" width="620" /></a><p class="wp-caption-text">The basic implementation of jump search can be slightly optimized!</p></div>
<h2>Complexity</h2>
<p>The complexity of the basic algorithm is O(√n), but once we know the interval where the value lies we can improve on it by applying jump search again. Say the list length is 1,000,000. The jump step should be √1000000 = 1000. Inside the interval found this way we can use jump search again with a new step of √1000 ≈ 31. Every time we find the desired interval we can apply the algorithm once more with a smaller step, until finally the step is 1. This shortens the scan inside the interval considerably, although the first pass still needs up to √n jumps; only with more levels and shorter passes at each level does the cost approach a logarithmic one. The problem is that this approach is considered harder to implement than binary search, whose complexity is already O(log(n)).</p>
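<p>The idea of re-applying jump search inside the interval can be sketched as follows. This is only one possible way to do it, under the assumption that each level uses the square root of the current interval length as its step; the function name <code>multilevel_jump_search</code> is hypothetical.</p>

```php
<?php
// Hypothetical helper: jump search re-applied inside the block found by the
// previous pass. Returns the index of $x in the sorted $list, or false.
function multilevel_jump_search(array $list, $x)
{
    $lo = 0;
    $hi = count($list) - 1; // inclusive bounds of the current interval

    while ($lo <= $hi) {
        $size = $hi - $lo + 1;
        $step = (int) floor(sqrt($size));

        if ($step <= 1) {
            // the step has shrunk to 1: finish with a plain sequential scan
            for ($i = $lo; $i <= $hi; $i++) {
                if ($list[$i] == $x) {
                    return $i;
                }
                if ($list[$i] > $x) {
                    return false;
                }
            }
            return false;
        }

        // jump forward until the last item of the block is >= $x
        $blockStart = $lo;
        $blockEnd = min($lo + $step - 1, $hi);
        while ($list[$blockEnd] < $x) {
            if ($blockEnd == $hi) {
                return false; // $x is larger than everything in the interval
            }
            $blockStart = $blockEnd + 1;
            $blockEnd = min($blockStart + $step - 1, $hi);
        }

        // narrow the interval to that block and repeat with a smaller step
        $lo = $blockStart;
        $hi = $blockEnd;
    }

    return false;
}

$list = range(0, 999);
var_dump(multilevel_jump_search($list, 674));
var_dump(multilevel_jump_search($list, 5000));
```

<p>Each pass works on a block that is strictly shorter than the previous interval, so the loop always terminates with the sequential scan.</p>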
<h2>Application</h2>
<p>Like almost every algorithm, jump search is very convenient for a certain kind of task. Yes, binary search is easy to implement and its complexity is O(log(n)), but for a very large list the direct jump to the middle can be a bad idea: if the searched value is near the beginning of the list, we then have to make a large step back.</p>
<p>Perhaps every one of us has performed some sort of primitive jump search without even knowing it. Do you remember cassette recorders? We used the &#8220;fast forward&#8221; key and periodically checked whether the tape had reached our favorite song. Once we stopped in the middle of the song, we used the &#8220;rewind&#8221; button to find exactly where it began.</p>
<p>This clumsy example shows where jump search can be better than binary search. The advantage of jump search is that you need to jump back only once (in the basic implementation).</p>
<div id="attachment_2544" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2011/12/jump-search-fig-3.png"><img class="size-full wp-image-2544" title="jump-search-fig-3" src="http://www.stoimen.com/blog/wp-content/uploads/2011/12/jump-search-fig-3.png" alt="Jump search is very useful when jumping back is significantly slower than jumping forward!" width="620" /></a><p class="wp-caption-text">Jump search is very useful when jumping back is significantly slower than jumping forward!</p></div>
<p>If jumping back takes you significantly more time than jumping forward then you should use this algorithm.</p>
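<p>A rough way to see the difference is to count how often each algorithm has to move backward. The sketch below is only an illustration with made-up helper names; it treats every probe at a lower index than the previous one as a &#8220;jump back&#8221;.</p>

```php
<?php
// Illustrative only: counts the probes that land at a lower index than the
// previous probe, i.e. the "rewinds" each algorithm needs.

function binary_search_back_moves(array $list, $x)
{
    $lo = 0;
    $hi = count($list) - 1;
    $last = 0;
    $back = 0;

    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        if ($mid < $last) {
            $back++; // this probe is behind the previous one
        }
        $last = $mid;

        if ($list[$mid] == $x) {
            break;
        }
        if ($list[$mid] < $x) {
            $lo = $mid + 1;
        } else {
            $hi = $mid - 1;
        }
    }
    return $back;
}

function jump_search_back_moves(array $list, $x)
{
    $len = count($list);
    if ($len === 0) {
        return 0;
    }
    $step = max(1, (int) floor(sqrt($len)));
    $pos = min($step, $len - 1);

    // forward jumps only
    while ($list[$pos] < $x && $pos < $len - 1) {
        $pos = min($pos + $step, $len - 1);
    }

    // overshooting $x costs one move back to the block start;
    // the scan that follows runs forward again
    return $list[$pos] > $x ? 1 : 0;
}

$list = range(0, 999);
echo "binary search: ", binary_search_back_moves($list, 5), " moves back\n";
echo "jump search:   ", jump_search_back_moves($list, 5), " move back\n";
```

<p>For a value near the beginning of the list, binary search rewinds several times, while the basic jump search rewinds at most once.</p>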
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2011/12/26/computer-algorithms-binary-search/' rel='bookmark' title='Computer Algorithms: Binary Search'>Computer Algorithms: Binary Search</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/02/computer-algorithms-interpolation-search/' rel='bookmark' title='Computer Algorithms: Interpolation Search'>Computer Algorithms: Interpolation Search</a></li>
<li><a href='http://www.stoimen.com/blog/2011/11/24/computer-algorithms-sequential-search/' rel='bookmark' title='Computer Algorithms: Sequential Search'>Computer Algorithms: Sequential Search</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2011/12/12/computer-algorithms-jump-search/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Computer Algorithms: Linear Search in Sorted Lists</title>
		<link>http://www.stoimen.com/blog/2011/12/02/computer-algorithms-linear-search-in-sorted-lists/</link>
		<comments>http://www.stoimen.com/blog/2011/12/02/computer-algorithms-linear-search-in-sorted-lists/#comments</comments>
		<pubDate>Fri, 02 Dec 2011 14:20:07 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[binary search]]></category>
		<category><![CDATA[Binary search algorithm]]></category>
		<category><![CDATA[cellular telephone]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[faster algorithm]]></category>
		<category><![CDATA[Index]]></category>
		<category><![CDATA[Linear search]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[search algorithms]]></category>
		<category><![CDATA[sequential search]]></category>
		<category><![CDATA[sequential search using sentinel]]></category>
		<category><![CDATA[sorting algorithm]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=2492</guid>
		<description><![CDATA[Overview The expression &#8220;linear search in sorted lists&#8221; itself sounds strange. Why should we use this algorithm for sorted lists when there are lots of other algorithms that are far more effective? As I mentioned in my previous post the &#8230; <a href="http://www.stoimen.com/blog/2011/12/02/computer-algorithms-linear-search-in-sorted-lists/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2011/11/24/computer-algorithms-sequential-search/' rel='bookmark' title='Computer Algorithms: Sequential Search'>Computer Algorithms: Sequential Search</a></li>
<li><a href='http://www.stoimen.com/blog/2011/12/12/computer-algorithms-jump-search/' rel='bookmark' title='Computer Algorithms: Jump Search'>Computer Algorithms: Jump Search</a></li>
<li><a href='http://www.stoimen.com/blog/2011/12/26/computer-algorithms-binary-search/' rel='bookmark' title='Computer Algorithms: Binary Search'>Computer Algorithms: Binary Search</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>Overview</h2>
<p>The expression &#8220;linear search in sorted lists&#8221; itself sounds strange. Why should we use this algorithm for sorted lists when there are lots of other algorithms that are far more efficient? As I mentioned in <a href="http://www.stoimen.com/blog/2011/11/24/computer-algorithms-sequential-search/" title="Computer Algorithms: Sequential Search">my previous post</a>, sequential search is very inefficient in most cases and is primarily used for unordered lists. Indeed, sometimes it is more useful to sort the data first and then use a faster algorithm like binary search. On the other hand, the analysis shows that for lists with fewer than ten items linear search is much faster than binary search.</p>
<p>Although binary search, for instance, is more efficient on sorted lists, sequential search with minor changes can be a better solution in some specific cases. The problem is that when developers hear the expression &#8220;sorted list&#8221; they immediately choose an algorithm other than linear search. Perhaps the problem lies in the way we understand what an ordered list is?</p>
<h3>What is a sorted list?</h3>
<p>We are used to thinking that the list <strong>(1, 1, 2, 3, 5, 8, 13)</strong> is sorted. We think so because it is &#8230; sorted, but the list <strong>(3, 13, 1, 3, 3.14, 1.5, -1)</strong> is also sorted, except that we don’t know by what criterion. Thus we can think of any array as sorted, although it is not always obvious how.</p>
<p>There are basically two cases when sequential search can be very useful: first, when the list is very short, and second, when we know in advance that some values are searched for very frequently.<span id="more-2492"></span></p>
<p>Let&#8217;s say we have a very large list, with hundreds of thousands of items, but most of the searches in that list look for the same ten values. This additional information tells us that binary search would do more work than necessary in this case. A possible approach is to place those values at the front of the list and to perform a sequential search. </p>
<div id="attachment_2523" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2011/12/search.jpg"><img src="http://www.stoimen.com/blog/wp-content/uploads/2011/12/search.jpg" alt="You should choose a search algorithm by carefully examining the data you search." title="magnifying glass" width="620" height="270" class="size-full wp-image-2523" /></a><p class="wp-caption-text">You should choose a search algorithm by carefully examining the data you search.</p></div>
<p>Unfortunately the search will be slow when we look for a value that is not among those at the front of the list, but in that case we can fall back to another algorithm for sorted lists. </p>
<p>Still, there is one question that should be answered. We do know that in most cases we search for the same values, but we do not know exactly which values they are. So the question is how to put the most frequently accessed values at the front of the list when we don&#8217;t know them. Here we need some sort of automatic adjustment of the list.</p>
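<p>One way to picture this combination is a short list of frequently requested values that is scanned sequentially first, with binary search on the full sorted list as the fallback. The sketch below assumes exactly that layout; the names <code>hot_first_search</code> and <code>binary_search_index</code> and the sample values are invented for the illustration.</p>

```php
<?php
// Hypothetical layout: $hot is a short list of frequently requested values
// (acting like a small cache), $sorted is the full sorted list.

function binary_search_index(array $sorted, $x)
{
    $lo = 0;
    $hi = count($sorted) - 1;
    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        if ($sorted[$mid] == $x) {
            return $mid;
        }
        if ($sorted[$mid] < $x) {
            $lo = $mid + 1;
        } else {
            $hi = $mid - 1;
        }
    }
    return false;
}

function hot_first_search(array $hot, array $sorted, $x)
{
    // sequential search over the few frequently requested values
    foreach ($hot as $i => $item) {
        if ($item == $x) {
            return array('hot', $i);
        }
    }

    // fall back to binary search for everything else
    $i = binary_search_index($sorted, $x);
    return $i === false ? false : array('sorted', $i);
}

$hot = array(42, 7, 900);
$sorted = range(0, 999);

var_dump(hot_first_search($hot, $sorted, 7));    // served from the hot list
var_dump(hot_first_search($hot, $sorted, 500));  // served by binary search
var_dump(hot_first_search($hot, $sorted, 5000)); // not present at all
```

<p>The hot list stays short, so a miss there costs only a handful of comparisons before the fallback takes over.</p>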
<h2>Self-Organization</h2>
<p>Self-organization practically means that every time we search and find the desired value, we somehow change the list so the next search will be far more effective. </p>
<p>There are basically two approaches to do that.</p>
<ol>
<li>Move the item one position toward the front of the list;</li>
<li>Move the item directly to the front of the list.</li>
</ol>
<p>Of course which approach you choose depends on your case, but the second option, moving the item directly to the front of the list, is generally assumed to be the better one.</p>
<p>Indeed, if we choose the first option and the list is (&#8230;, 24, 31), then after repeatedly searching for those two values in turn the array will change from (&#8230;, 24, 31) to (&#8230;, 31, 24) and back to (&#8230;, 24, 31), and so on, without either value ever getting closer to the front.</p>
<p>Thus a better solution is to move the desired item directly to the front of the list. Now if we look for the value &#8220;5&#8221; in the list <strong>(1, 2, 4, &#8230;, 5, &#8230;, 398)</strong> it will become <strong>(5, 1, 2, 4, &#8230;, 398)</strong> after the value is found.</p>
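<p>The back-and-forth swapping is easy to reproduce. Here is a minimal sketch of the first option (often called the transpose heuristic), with a hypothetical function name and a short sample list standing in for the (&#8230;, 24, 31) example:</p>

```php
<?php
// Illustrative "transpose" heuristic: a found item swaps places with its
// neighbour that is one position closer to the front.
// Returns the number of comparisons made, or false if the value is missing.
function transpose_search(array &$arr, $value)
{
    $n = count($arr);
    for ($i = 0; $i < $n; $i++) {
        if ($arr[$i] == $value) {
            if ($i > 0) {
                $tmp = $arr[$i - 1];
                $arr[$i - 1] = $arr[$i];
                $arr[$i] = $tmp;
            }
            return $i + 1;
        }
    }
    return false;
}

$arr = array(1, 2, 3, 24, 31);
$costs = array();

// alternate between the two values at the end of the list
foreach (array(31, 24, 31, 24) as $wanted) {
    $costs[] = transpose_search($arr, $wanted);
}

echo implode(', ', $costs), "\n"; // cost of each search
echo implode(', ', $arr), "\n";   // the list after all four searches
```

<p>Every search costs the full length of the list, and after the four searches the list is back in its starting order.</p>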
<p>By choosing this approach we can be sure that as the number of searches increases, the most frequently searched values are placed at the front of the list. Now the sequential search is quite a good solution!</p>
<p>Here&#8217;s an example of sequential search from my previous article. The only change is that after we find the desired value we need to move it to the front of the list.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #009933; font-style: italic;">/**
 * Performs a sequential search using sentinel
 * and changes the array after the value is found
 *
 * @param array $arr
 * @param mixed $value
 */</span>
<span style="color: #000000; font-weight: bold;">function</span> sequential_search<span style="color: #009900;">&#40;</span><span style="color: #339933;">&amp;</span><span style="color: #000088;">$arr</span><span style="color: #339933;">,</span> <span style="color: #000088;">$value</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$arr</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$value</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$index</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$index</span><span style="color: #339933;">++</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">!=</span> <span style="color: #000088;">$value</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$index</span> <span style="color: #339933;">&lt;</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
&nbsp;
		<span style="color: #666666; font-style: italic;">// put the item at the front of the list</span>
		<span style="color: #990000;">array_unshift</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #339933;">,</span> <span style="color: #000088;">$arr</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$index</span><span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #666666; font-style: italic;">// remove the value from its previous position</span>
		<span style="color: #990000;">unset</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$index</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #666666; font-style: italic;">// unset the sentinel (now the last element) and reindex the array</span>
		<span style="color: #990000;">unset</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #009900;">&#91;</span><span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$arr</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array_values</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">true</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
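	<span style="color: #666666; font-style: italic;">// not found: drop the sentinel again before giving up</span>
	<span style="color: #990000;">array_pop</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">false</span><span style="color: #339933;">;</span>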
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// the list</span>
<span style="color: #000088;">$arr</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">3</span><span style="color: #339933;">,</span> <span style="color:#800080;">3.14</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">5</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">4</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">6</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">9</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">8</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// the value</span>
<span style="color: #000088;">$x</span> <span style="color: #339933;">=</span> <span style="color:#800080;">3.14</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span>sequential_search<span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #339933;">,</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #666666; font-style: italic;">// now the array is changed to</span>
	<span style="color: #666666; font-style: italic;">// (3.14, 1, 2, 3, 5, 4, 6, 9, 8)</span>
	<span style="color: #b1b100;">echo</span> <span style="color: #0000ff;">&quot;The value <span style="color: #006699; font-weight: bold;">$x</span> is found!&quot;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #b1b100;">echo</span> <span style="color: #0000ff;">&quot;The value <span style="color: #006699; font-weight: bold;">$x</span> doesn't appear to be in the list!&quot;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<h2>Application</h2>
<p>Using sequential search in sorted lists can be very useful and fast; the only requirement is that we know in advance that some values are searched for frequently. A typical example is the contact list on your phone. Perhaps you have lots of names in there, but most of the time you search it to find the phone numbers of your best friends and family. That is why most cell phone manufacturers let you predefine shortcut keys for the most frequently dialed numbers.</p>
<p>Here&#8217;s another use case. Let&#8217;s say that we have the same scenario as in my previous <a href="http://www.stoimen.com/blog/2011/11/24/computer-algorithms-sequential-search/" title="Computer Algorithms: Sequential Search">post</a>, where username/name pairs are stored in a CSV file. We can load those values into a PHP array.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">// list with username/name pairs</span>
<span style="color: #000088;">$arr</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span>
	<span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'name'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'James Bond'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'username'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'jamesbond007'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'name'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'John Smith'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'username'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'jsmith'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'name'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'John Silver'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'username'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'hohoho'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'name'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'Yoda'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'username'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'masteryoda900'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'name'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'Darth Vader'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'username'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'vader'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Every time a user enters the site we look up their name by username and display a welcome message. We know that some users enter the site very frequently while others do so once a month, so we can not only perform a sequential search but also use self-organization for the array and write the new order back to the CSV file at the end.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #009933; font-style: italic;">/**
 * Performs a sequential search using a sentinel.
 * Returns the number of comparisons made (the 1-based position
 * of the item) if the item was found and FALSE otherwise.
 * 
 * @param array $arr
 * @param mixed $value
 */</span>
<span style="color: #000000; font-weight: bold;">function</span> sequential_search<span style="color: #009900;">&#40;</span><span style="color: #339933;">&amp;</span><span style="color: #000088;">$arr</span><span style="color: #339933;">,</span> <span style="color: #000088;">$value</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>	
	<span style="color: #666666; font-style: italic;">// using a sentinel</span>
	<span style="color: #000088;">$arr</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'username'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #000088;">$value</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'name'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">''</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$index</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$index</span><span style="color: #339933;">++</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#91;</span><span style="color: #0000ff;">'username'</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">!=</span> <span style="color: #000088;">$value</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// if the desired element is in the array</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$index</span> <span style="color: #339933;">&lt;</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
&nbsp;
		<span style="color: #666666; font-style: italic;">// push the element at the front of the list</span>
		<span style="color: #990000;">array_unshift</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #339933;">,</span> <span style="color: #000088;">$arr</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$index</span><span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #666666; font-style: italic;">// remove the item from its previous place</span>
		<span style="color: #990000;">unset</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$index</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #666666; font-style: italic;">// remove the sentinel</span>
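		<span style="color: #990000;">unset</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #009900;">&#91;</span><span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #666666; font-style: italic;">// reindex so that the next search sees no gaps in the keys</span>
		<span style="color: #000088;">$arr</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array_values</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>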
&nbsp;
		<span style="color: #666666; font-style: italic;">// return the number of comparisons it took to find the value</span>
		<span style="color: #b1b100;">return</span> <span style="color: #000088;">$index</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
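	<span style="color: #666666; font-style: italic;">// the element has not been found: drop the sentinel again</span>
	<span style="color: #990000;">array_pop</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">FALSE</span><span style="color: #339933;">;</span>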
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #009900; font-weight: bold;">FALSE</span> <span style="color: #339933;">!==</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$iterations</span> <span style="color: #339933;">=</span> sequential_search<span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'vader'</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #b1b100;">echo</span> <span style="color: #0000ff;">&quot;Hello, <span style="color: #006699; font-weight: bold;">{$arr[0]['name']}</span>&quot;</span><span style="color: #339933;">;</span>
	<span style="color: #b1b100;">echo</span> <span style="color: #0000ff;">&quot;Found after <span style="color: #006699; font-weight: bold;">$iterations</span> iterations!&quot;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #b1b100;">echo</span> <span style="color: #0000ff;">&quot;Hi, guest!&quot;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #009900; font-weight: bold;">FALSE</span> <span style="color: #339933;">!==</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$iterations</span> <span style="color: #339933;">=</span> sequential_search<span style="color: #009900;">&#40;</span><span style="color: #000088;">$arr</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'vader'</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #b1b100;">echo</span> <span style="color: #0000ff;">&quot;Hello, <span style="color: #006699; font-weight: bold;">{$arr[0]['name']}</span>&quot;</span><span style="color: #339933;">;</span>
	<span style="color: #b1b100;">echo</span> <span style="color: #0000ff;">&quot;Found after <span style="color: #006699; font-weight: bold;">$iterations</span> iterations!&quot;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #b1b100;">echo</span> <span style="color: #0000ff;">&quot;Hi, guest!&quot;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>The result is:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">Hello<span style="color: #339933;">,</span> Darth Vader 
Found after <span style="color: #cc66cc;">5</span> iterations<span style="color: #339933;">!</span>
Hello<span style="color: #339933;">,</span> Darth Vader
Found after <span style="color: #cc66cc;">1</span> iterations<span style="color: #339933;">!</span></pre></div></div>

<p>Now every time Darth Vader signs in, he won&#8217;t have to wait long. Granted, I bet nobody stores such information in CSV files; this is only an example.</p>
<p><strong><em>Note:</em></strong> All the examples in this article are written in <a href="http://www.stoimen.com/blog/category/php/" title="PHP on stoimen.com">PHP</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2011/12/02/computer-algorithms-linear-search-in-sorted-lists/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

