<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>stoimen&#039;s web log</title>
	<atom:link href="http://www.stoimen.com/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.stoimen.com/blog</link>
	<description>about web development</description>
	<lastBuildDate>Wed, 16 May 2012 06:45:58 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Computer Algorithms: Karatsuba Fast Multiplication</title>
		<link>http://www.stoimen.com/blog/2012/05/15/computer-algorithms-karatsuba-fast-multiplication/</link>
		<comments>http://www.stoimen.com/blog/2012/05/15/computer-algorithms-karatsuba-fast-multiplication/#comments</comments>
		<pubDate>Tue, 15 May 2012 19:52:59 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[Anatolii Alexeevitch Karatsuba]]></category>
		<category><![CDATA[Andrey Kolmogorov]]></category>
		<category><![CDATA[Cohen-Sutherland]]></category>
		<category><![CDATA[Divide and conquer algorithm]]></category>
		<category><![CDATA[Karatsuba algorithm]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Multiplication]]></category>
		<category><![CDATA[Multiplication algorithm]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[structured algorithm]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=3121</guid>
		<description><![CDATA[Introduction Typically, multiplying two n-digit numbers requires n<sup>2</sup> multiplications. That is how we humans multiply numbers. Let’s look at an example in which we have to multiply two 2-digit numbers. 12 x 15 = ? OK, we know &#8230; <a href="http://www.stoimen.com/blog/2012/05/15/computer-algorithms-karatsuba-fast-multiplication/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/05/08/computer-algorithms-determine-if-a-number-is-prime/' rel='bookmark' title='Computer Algorithms: Determine if a Number is Prime'>Computer Algorithms: Determine if a Number is Prime</a></li>
<li><a href='http://www.stoimen.com/blog/2011/12/02/computer-algorithms-linear-search-in-sorted-lists/' rel='bookmark' title='Computer Algorithms: Linear Search in Sorted Lists'>Computer Algorithms: Linear Search in Sorted Lists</a></li>
<li><a href='http://www.stoimen.com/blog/2011/12/26/computer-algorithms-binary-search/' rel='bookmark' title='Computer Algorithms: Binary Search'>Computer Algorithms: Binary Search</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>Introduction</h2>
<p>Typically, multiplying two n-digit numbers requires n<sup>2</sup> multiplications. That is how we humans multiply numbers. Let’s look at an example in which we have to multiply two 2-digit numbers.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #cc66cc;">12</span> x <span style="color: #cc66cc;">15</span> <span style="color: #339933;">=</span> ?</pre></div></div>

<p>OK, we know that the answer is 180, and there are lots of intuitive methods that help us get the right answer. Indeed, 12 x 15 is just a bit more difficult to calculate than 10 x 15, because multiplying by 10 is really easy: we just add one 0 at the end of the number. Thus 15 x 10 equals 150. Back to 12 x 15: we know that this equals 10 x 15 (which is 150) plus 2 x 15, which is also very easy to calculate and is 30. The result of 12 x 15 is therefore 150 + 30, which fortunately isn’t difficult to compute and equals 180.</p>
<p>That was easy, but in some cases the calculations are a bit more difficult and we need a structured algorithm to get the right answer. What about 65 x 97? That is not as easy as 12 x 15, right?</p>
<p>The algorithm we know from primary school, shown in the diagram below, is well structured and helps us multiply two numbers.</p>
<div class="mceTemp">
<dl id="attachment_3139" class="wp-caption alignnone" style="width: 494px;">
<dt class="wp-caption-dt"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/1.-Typical-Multiplication.png"><img class="size-full wp-image-3139" title="Typical Multiplication" src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/1.-Typical-Multiplication.png" alt="Typical Multiplication" width="484" height="518" /></a></dt>
<dd class="wp-caption-dd"></dd>
</dl>
</div>
<p>We see that even for two-digit numbers this takes some work: we need 4 multiplications and some additions.<span id="more-3121"></span></p>
<div id="attachment_3143" class="wp-caption alignnone" style="width: 494px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/2.-Number-of-Multiplications.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/2.-Number-of-Multiplications.png" alt="Number of Multiplications" title="Number of Multiplications" width="484" height="518" class="size-full wp-image-3143" /></a><p class="wp-caption-text">We need 4 multiplications in order to calculate the product of two 2-digit numbers!</p></div>
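<p>The primary-school method can be sketched in a few lines of PHP to make the count of multiplications explicit. This is only an illustration: the function name <code>schoolbook_multiply</code> and the choice to return the product together with a counter are assumptions made for this example, and the numbers are passed as arrays of decimal digits, most significant digit first.</p>

```php
<?php
// Schoolbook multiplication of two numbers given as arrays of decimal digits
// (most significant digit first). Returns the product as a string together
// with the number of single-digit multiplications that were performed.
function schoolbook_multiply(array $x, array $y): array
{
    $n = count($x);
    $m = count($y);
    $result = array_fill(0, $n + $m, 0);
    $multiplications = 0;

    // multiply every digit of x by every digit of y, starting from the right
    for ($i = $n - 1; $i >= 0; $i--) {
        $carry = 0;
        for ($j = $m - 1; $j >= 0; $j--) {
            $current = $x[$i] * $y[$j] + $result[$i + $j + 1] + $carry;
            $multiplications++;
            $result[$i + $j + 1] = $current % 10;
            $carry = intdiv($current, 10);
        }
        $result[$i] += $carry;
    }

    // drop leading zeros, but keep a single "0" for a zero product
    $product = ltrim(implode('', $result), '0');

    return [$product === '' ? '0' : $product, $multiplications];
}

[$product, $count] = schoolbook_multiply([6, 5], [9, 7]);
echo "$product using $count digit multiplications\n"; // 6305 using 4 digit multiplications
```

<p>The two nested loops are where the n<sup>2</sup> comes from: every digit of one number meets every digit of the other.</p>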
<p>So we know how to multiply numbers; the only problem is that the task becomes very difficult as the numbers grow. If multiplying 65 by 97 was still fairly easy, what about</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #cc66cc;">374773294776321</span>
x
<span style="color: #cc66cc;">222384759707982</span></pre></div></div>

<p>It seems almost impossible.</p>
<h3>History</h3>
<p><a title="Andrey Kolmogorov" href="http://en.wikipedia.org/wiki/Andrey_Kolmogorov" target="_blank">Andrey Kolmogorov</a> was one of the brightest Russian mathematicians of the 20th century. In 1960, during a seminar, Kolmogorov conjectured that two n-digit numbers can’t be multiplied with fewer than about n<sup>2</sup> multiplications!<br />
Only a week later a 23-year-old student named <a title="Anatolii Alexeevitch Karatsuba" href="http://en.wikipedia.org/wiki/Anatolii_Alexeevitch_Karatsuba" target="_blank">Anatolii Alexeevitch Karatsuba</a> showed that two n-digit numbers can be multiplied with about n<sup>lg 3</sup> multiplications (lg 3 = log<sub>2</sub>3 ≈ 1.585) using an ingenious divide and conquer approach.</p>
<h2>Overview</h2>
<p>Basically, Karatsuba showed that two n-digit numbers x and y can be multiplied with the following operations, where B is the base in which the numbers are written and m is a positive integer with m &lt; n.</p>
<p>First, both numbers x and y are split into a high part and a low part, x1, x2 and y1, y2, so that the low parts x2 and y2 are smaller than B<sup>m</sup>:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">x <span style="color: #339933;">=</span> x1 <span style="color: #339933;">*</span> B^m <span style="color: #339933;">+</span> x2 
y <span style="color: #339933;">=</span> y1 <span style="color: #339933;">*</span> B^m <span style="color: #339933;">+</span> y2</pre></div></div>

<p>Now the product xy can be expanded as follows.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">xy <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>x1 <span style="color: #339933;">*</span> B^m <span style="color: #339933;">+</span> x2<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#40;</span>y1 <span style="color: #339933;">*</span> B^m <span style="color: #339933;">+</span> y2<span style="color: #009900;">&#41;</span> <span style="color: #339933;">=&gt;</span>
&nbsp;
a <span style="color: #339933;">=</span> x1 <span style="color: #339933;">*</span> y1
b <span style="color: #339933;">=</span> x1 <span style="color: #339933;">*</span> y2 <span style="color: #339933;">+</span> x2 <span style="color: #339933;">*</span> y1
c <span style="color: #339933;">=</span> x2 <span style="color: #339933;">*</span> y2</pre></div></div>

<p>Finally xy will become:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">xy <span style="color: #339933;">=</span> a <span style="color: #339933;">*</span> B^<span style="color: #009900;">&#40;</span>2m<span style="color: #009900;">&#41;</span> <span style="color: #339933;">+</span> b <span style="color: #339933;">*</span> B^m <span style="color: #339933;">+</span> c</pre></div></div>

<p>However, computing a, b and c this way takes four multiplications, which is no improvement over the standard method. That is why Karatsuba came up with the brilliant idea to calculate b with the following formula:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">b <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>x1 <span style="color: #339933;">+</span> x2<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#40;</span>y1 <span style="color: #339933;">+</span> y2<span style="color: #009900;">&#41;</span> <span style="color: #339933;">-</span> a <span style="color: #339933;">-</span> c</pre></div></div>

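<p>This works because (x1 + x2)(y1 + y2) = x1 * y1 + x1 * y2 + x2 * y1 + x2 * y2 = a + b + c, so subtracting a and c leaves exactly b. Since a and c are computed anyway, only three multiplications are needed to get xy.</p>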
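<p>A single step of the formulas above can be checked with native PHP integers. This is a minimal sketch that assumes base B = 10 and non-negative operands small enough not to overflow; <code>karatsuba_step</code> is a hypothetical name chosen for this example. The multiplications by powers of 10 are only shifts and are not counted.</p>

```php
<?php
// One Karatsuba step in base 10: split x and y at m digits from the right
// and combine three products. No recursion yet, just the formula itself.
function karatsuba_step(int $x, int $y, int $m): int
{
    $power = 10 ** $m;          // B^m

    $x1 = intdiv($x, $power);   // high part of x
    $x2 = $x % $power;          // low part of x
    $y1 = intdiv($y, $power);   // high part of y
    $y2 = $y % $power;          // low part of y

    $a = $x1 * $y1;                             // multiplication 1
    $c = $x2 * $y2;                             // multiplication 2
    $b = ($x1 + $x2) * ($y1 + $y2) - $a - $c;   // multiplication 3

    return $a * $power * $power + $b * $power + $c;
}

var_dump(karatsuba_step(1234, 5678, 2) === 1234 * 5678); // bool(true)
```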
<p>Let’s see this formula in action with an example.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #cc66cc;">47</span> x <span style="color: #cc66cc;">78</span>
&nbsp;
x <span style="color: #339933;">=</span> <span style="color: #cc66cc;">47</span>
x <span style="color: #339933;">=</span> <span style="color: #cc66cc;">4</span> <span style="color: #339933;">*</span> <span style="color: #cc66cc;">10</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">7</span>
&nbsp;
x1 <span style="color: #339933;">=</span> <span style="color: #cc66cc;">4</span>
x2 <span style="color: #339933;">=</span> <span style="color: #cc66cc;">7</span>
&nbsp;
y <span style="color: #339933;">=</span> <span style="color: #cc66cc;">78</span>
y <span style="color: #339933;">=</span> <span style="color: #cc66cc;">7</span> <span style="color: #339933;">*</span> <span style="color: #cc66cc;">10</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">8</span>
&nbsp;
y1 <span style="color: #339933;">=</span> <span style="color: #cc66cc;">7</span>
y2 <span style="color: #339933;">=</span> <span style="color: #cc66cc;">8</span>
&nbsp;
a <span style="color: #339933;">=</span> x1 <span style="color: #339933;">*</span> y1 <span style="color: #339933;">=</span> <span style="color: #cc66cc;">4</span> <span style="color: #339933;">*</span> <span style="color: #cc66cc;">7</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">28</span>
c <span style="color: #339933;">=</span> x2 <span style="color: #339933;">*</span> y2 <span style="color: #339933;">=</span> <span style="color: #cc66cc;">7</span> <span style="color: #339933;">*</span> <span style="color: #cc66cc;">8</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">56</span>
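b <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>x1 <span style="color: #339933;">+</span> x2<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#40;</span>y1 <span style="color: #339933;">+</span> y2<span style="color: #009900;">&#41;</span> <span style="color: #339933;">-</span> a <span style="color: #339933;">-</span> c <span style="color: #339933;">=</span> <span style="color: #cc66cc;">11</span> <span style="color: #339933;">*</span> <span style="color: #cc66cc;">15</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">28</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">56</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">81</span>
&nbsp;
xy <span style="color: #339933;">=</span> a <span style="color: #339933;">*</span> <span style="color: #cc66cc;">10</span>^<span style="color: #cc66cc;">2</span> <span style="color: #339933;">+</span> b <span style="color: #339933;">*</span> <span style="color: #cc66cc;">10</span> <span style="color: #339933;">+</span> c <span style="color: #339933;">=</span> <span style="color: #cc66cc;">2800</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">810</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">56</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">3666</span></pre></div></div>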

<p>Note that 11 * 15 is again a multiplication of two 2-digit numbers, but fortunately we can apply the same rule to them. This makes Karatsuba’s algorithm a perfect example of a “divide and conquer” algorithm.</p>
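<p>The recursive idea can be sketched directly on native PHP integers before moving to digit arrays. The sketch below is an illustration under a few assumptions: base 10, non-negative operands whose product fits into a PHP integer, and a split position taken from the shorter operand. The name <code>karatsuba_int</code> is hypothetical.</p>

```php
<?php
// Recursive Karatsuba multiplication on non-negative native integers.
function karatsuba_int(int $x, int $y): int
{
    // bottom of the recursion: at least one operand is a single digit
    if ($x < 10 || $y < 10) {
        return $x * $y;
    }

    // split both numbers at half the digit count of the shorter one
    $m = intdiv(min(strlen((string)$x), strlen((string)$y)), 2);
    $power = 10 ** $m;

    $x1 = intdiv($x, $power);   // high part of x
    $x2 = $x % $power;          // low part of x
    $y1 = intdiv($y, $power);   // high part of y
    $y2 = $y % $power;          // low part of y

    // three recursive multiplications instead of four
    $a = karatsuba_int($x1, $y1);
    $c = karatsuba_int($x2, $y2);
    $b = karatsuba_int($x1 + $x2, $y1 + $y2) - $a - $c;

    return $a * $power * $power + $b * $power + $c;
}

echo karatsuba_int(47, 78), "\n"; // 3666
```

<p>For 47 x 78 the third recursive call is exactly the 11 * 15 from the example, which is split again into 1, 1 and 1, 5.</p>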
<h2>Implementation</h2>
<h3>Standard Multiplication</h3>
<p>The standard way of multiplying two n-digit numbers requires n<sup>2</sup> multiplications, as you can see from the following <a href="http://www.stoimen.com/blog/category/php/" title="PHP on stoimen.com">PHP</a> implementation, which splits both numbers in half and performs four recursive multiplications per level.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$x</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$y</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">5</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">6</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">7</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">8</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">function</span> multiply<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #000088;">$y</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>	
	<span style="color: #000088;">$len_x</span> <span style="color: #339933;">=</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$len_y</span> <span style="color: #339933;">=</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$y</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$half_x</span> <span style="color: #339933;">=</span> <span style="color: #990000;">ceil</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$len_x</span> <span style="color: #339933;">/</span> <span style="color: #cc66cc;">2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$half_y</span> <span style="color: #339933;">=</span> <span style="color: #990000;">ceil</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$len_y</span> <span style="color: #339933;">/</span> <span style="color: #cc66cc;">2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$base</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">10</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// bottom of the recursion: at least one operand is a single digit</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$len_x</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">1</span> <span style="color: #339933;">||</span> <span style="color: #000088;">$len_y</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #009900;">&#40;</span>int<span style="color: #009900;">&#41;</span><span style="color: #990000;">implode</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">''</span><span style="color: #339933;">,</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">*</span> <span style="color: #009900;">&#40;</span>int<span style="color: #009900;">&#41;</span><span style="color: #990000;">implode</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">''</span><span style="color: #339933;">,</span> <span style="color: #000088;">$y</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #000088;">$x_chunks</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array_chunk</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #000088;">$half_x</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$y_chunks</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array_chunk</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$y</span><span style="color: #339933;">,</span> <span style="color: #000088;">$half_y</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// predefine aliases</span>
	<span style="color: #000088;">$x1</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$x_chunks</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$x2</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$x_chunks</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$y1</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$y_chunks</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$y2</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$y_chunks</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// number of digits in the low parts $x2 and $y2</span>
	<span style="color: #000088;">$low_x</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$len_x</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$half_x</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$low_y</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$len_y</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$half_y</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">return</span>  multiply<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x1</span><span style="color: #339933;">,</span> <span style="color: #000088;">$y1</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">*</span> <span style="color: #990000;">pow</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$base</span><span style="color: #339933;">,</span> <span style="color: #000088;">$low_x</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$low_y</span><span style="color: #009900;">&#41;</span> 					<span style="color: #666666; font-style: italic;">// a</span>
		 	<span style="color: #339933;">+</span> <span style="color: #009900;">&#40;</span>multiply<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x1</span><span style="color: #339933;">,</span> <span style="color: #000088;">$y2</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">*</span> <span style="color: #990000;">pow</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$base</span><span style="color: #339933;">,</span> <span style="color: #000088;">$low_x</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">+</span> multiply<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x2</span><span style="color: #339933;">,</span> <span style="color: #000088;">$y1</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">*</span> <span style="color: #990000;">pow</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$base</span><span style="color: #339933;">,</span> <span style="color: #000088;">$low_y</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> 	<span style="color: #666666; font-style: italic;">// b</span>
		 	<span style="color: #339933;">+</span> multiply<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x2</span><span style="color: #339933;">,</span> <span style="color: #000088;">$y2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>							<span style="color: #666666; font-style: italic;">// c</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// 7 006 652</span>
<span style="color: #b1b100;">echo</span> multiply<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #000088;">$y</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>
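<p>The function above takes its operands as arrays of decimal digits, most significant digit first. A minimal sketch of how a number written as a string could be converted to that form and back is shown below; <code>to_digits</code> and <code>from_digits</code> are hypothetical helper names, and the input is assumed to be a non-empty string of decimal digits.</p>

```php
<?php
// Converts a string of decimal digits into an array of integer digits,
// most significant digit first.
function to_digits(string $number): array
{
    return array_map('intval', str_split($number));
}

// Converts an array of digits back into a string without leading zeros.
function from_digits(array $digits): string
{
    $number = ltrim(implode('', $digits), '0');

    return $number === '' ? '0' : $number;
}

print_r(to_digits('1234'));          // the digits 1, 2, 3, 4
echo from_digits([0, 0, 8]), "\n";   // 8
```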

<h3>Karatsuba Multiplication</h3>
<p>Karatsuba replaces the two multiplications in x1 * y2 + x2 * y1 with a single one, (x1 + x2)(y1 + y2), and this makes the algorithm faster.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$x</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$y</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">5</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">6</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">7</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">8</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
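&nbsp;
<span style="color: #666666; font-style: italic;">// adds two numbers stored as arrays of decimal digits</span>
<span style="color: #000000; font-weight: bold;">function</span> sum<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #000088;">$y</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$result</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$carry</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// take digits from the right until both numbers and the carry are used up</span>
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span> <span style="color: #339933;">||</span> <span style="color: #000088;">$y</span> <span style="color: #339933;">||</span> <span style="color: #000088;">$carry</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$digit</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>int<span style="color: #009900;">&#41;</span><span style="color: #990000;">array_pop</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">+</span> <span style="color: #009900;">&#40;</span>int<span style="color: #009900;">&#41;</span><span style="color: #990000;">array_pop</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$y</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$carry</span><span style="color: #339933;">;</span>
		<span style="color: #990000;">array_unshift</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$result</span><span style="color: #339933;">,</span> <span style="color: #000088;">$digit</span> <span style="color: #339933;">%</span> <span style="color: #cc66cc;">10</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$carry</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>int<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$digit</span> <span style="color: #339933;">/</span> <span style="color: #cc66cc;">10</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #000088;">$result</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;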
<span style="color: #000000; font-weight: bold;">function</span> karatsuba<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #000088;">$y</span><span style="color: #009900;">&#41;</span> 
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$len_x</span> <span style="color: #339933;">=</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$len_y</span> <span style="color: #339933;">=</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$y</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// bottom of the recursion</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$len_x</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">1</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$len_y</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">*</span> <span style="color: #000088;">$y</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span> 
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$len_x</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">1</span> <span style="color: #339933;">||</span> <span style="color: #000088;">$len_y</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$t1</span> <span style="color: #339933;">=</span> <span style="color: #990000;">implode</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">''</span><span style="color: #339933;">,</span> <span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000088;">$t2</span> <span style="color: #339933;">=</span> <span style="color: #990000;">implode</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">''</span><span style="color: #339933;">,</span> <span style="color: #000088;">$y</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #009900;">&#40;</span>int<span style="color: #009900;">&#41;</span><span style="color: #000088;">$t1</span> <span style="color: #339933;">*</span> <span style="color: #000088;">$t2</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// split both numbers at the same position, counted from the right</span>
	<span style="color: #000088;">$deg</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>int<span style="color: #009900;">&#41;</span><span style="color: #990000;">floor</span><span style="color: #009900;">&#40;</span><span style="color: #990000;">min</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$len_x</span><span style="color: #339933;">,</span> <span style="color: #000088;">$len_y</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">/</span> <span style="color: #cc66cc;">2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #000088;">$x1</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array_slice</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span> <span style="color: #339933;">-</span><span style="color: #000088;">$deg</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>	<span style="color: #666666; font-style: italic;">// high part of x</span>
	<span style="color: #000088;">$x2</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array_slice</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #339933;">-</span><span style="color: #000088;">$deg</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>	<span style="color: #666666; font-style: italic;">// low part of x</span>
	<span style="color: #000088;">$y1</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array_slice</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$y</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span> <span style="color: #339933;">-</span><span style="color: #000088;">$deg</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>	<span style="color: #666666; font-style: italic;">// high part of y</span>
	<span style="color: #000088;">$y2</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array_slice</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$y</span><span style="color: #339933;">,</span> <span style="color: #339933;">-</span><span style="color: #000088;">$deg</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>	<span style="color: #666666; font-style: italic;">// low part of y</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// three recursive multiplications instead of four</span>
	<span style="color: #000088;">$a</span> <span style="color: #339933;">=</span> karatsuba<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x1</span><span style="color: #339933;">,</span> <span style="color: #000088;">$y1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$c</span> <span style="color: #339933;">=</span> karatsuba<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x2</span><span style="color: #339933;">,</span> <span style="color: #000088;">$y2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$b</span> <span style="color: #339933;">=</span> karatsuba<span style="color: #009900;">&#40;</span>sum<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x1</span><span style="color: #339933;">,</span> <span style="color: #000088;">$x2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> sum<span style="color: #009900;">&#40;</span><span style="color: #000088;">$y1</span><span style="color: #339933;">,</span> <span style="color: #000088;">$y2</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$a</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$c</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #000088;">$a</span> <span style="color: #339933;">*</span> <span style="color: #990000;">pow</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">10</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">2</span> <span style="color: #339933;">*</span> <span style="color: #000088;">$deg</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$b</span> <span style="color: #339933;">*</span> <span style="color: #990000;">pow</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">10</span><span style="color: #339933;">,</span> <span style="color: #000088;">$deg</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$c</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// 7 006 652</span>
<span style="color: #b1b100;">echo</span> karatsuba<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #339933;">,</span> <span style="color: #000088;">$y</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<h2>Complexity</h2>
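<p>Replacing two of the multiplications with only one makes the program faster. The question is how much faster. Each call now makes three recursive calls on numbers of half the length, plus additions and shifts that take O(n) time, which gives the recurrence T(n) = 3T(n/2) + O(n). Its solution is O(n<sup>lg 3</sup>), where lg 3 = log<sub>2</sub>3 ≈ 1.585. Karatsuba thus improves on the initial complexity of O(n<sup>2</sup>), and as you can see in the diagram below the difference is substantial for big n.</p>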
<div id="attachment_3141" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/Karatsuba-Complexity.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/Karatsuba-Complexity.png" alt="Karatsuba Complexity" title="Karatsuba Complexity" width="600" height="371" class="size-full wp-image-3141" /></a><p class="wp-caption-text">O(n^2) grows much faster than O(n^lg3)</p></div>
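<p>The gap between the two growth rates can also be made concrete by counting single-digit multiplications. The sketch below is an idealized count: it assumes n is a power of two and ignores that the sums x1 + x2 and y1 + y2 may be one digit longer than the halves. The function name <code>count_multiplications</code> and its <code>$branches</code> parameter are made up for this example.</p>

```php
<?php
// Number of single-digit multiplications for two n-digit operands when every
// level of the recursion makes $branches recursive calls on halves:
// 4 branches for the standard method, 3 for Karatsuba.
function count_multiplications(int $n, int $branches): int
{
    if ($n <= 1) {
        return 1;
    }

    return $branches * count_multiplications(intdiv($n, 2), $branches);
}

foreach ([2, 16, 128, 1024] as $n) {
    printf(
        "n = %4d: standard %8d, Karatsuba %6d\n",
        $n,
        count_multiplications($n, 4),
        count_multiplications($n, 3)
    );
}
```

<p>For n = 1024 this counts 1048576 multiplications for the standard method against 59049 for Karatsuba.</p>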
<h2>Application</h2>
<p>It&#8217;s easy to see where the Karatsuba algorithm can be used. It is very efficient for multiplying large integers, but that isn&#8217;t its only use. It is also often used for polynomial multiplication.</p>
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/05/08/computer-algorithms-determine-if-a-number-is-prime/' rel='bookmark' title='Computer Algorithms: Determine if a Number is Prime'>Computer Algorithms: Determine if a Number is Prime</a></li>
<li><a href='http://www.stoimen.com/blog/2011/12/02/computer-algorithms-linear-search-in-sorted-lists/' rel='bookmark' title='Computer Algorithms: Linear Search in Sorted Lists'>Computer Algorithms: Linear Search in Sorted Lists</a></li>
<li><a href='http://www.stoimen.com/blog/2011/12/26/computer-algorithms-binary-search/' rel='bookmark' title='Computer Algorithms: Binary Search'>Computer Algorithms: Binary Search</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/05/15/computer-algorithms-karatsuba-fast-multiplication/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>You think you know algorithms. Quiz results!</title>
		<link>http://www.stoimen.com/blog/2012/05/09/you-think-you-know-algorithms-quiz-results-2/</link>
		<comments>http://www.stoimen.com/blog/2012/05/09/you-think-you-know-algorithms-quiz-results-2/#comments</comments>
		<pubDate>Wed, 09 May 2012 14:14:50 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[quiz]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Bubble sort]]></category>
		<category><![CDATA[Divide and conquer algorithm]]></category>
		<category><![CDATA[Merge sort]]></category>
		<category><![CDATA[Quicksort]]></category>
		<category><![CDATA[Radix sort]]></category>
		<category><![CDATA[Sort]]></category>
		<category><![CDATA[Sorting algorithms]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=3115</guid>
		<description><![CDATA[Finally the results from &#8220;You think you know algorithms&#8221; are out. This time only 3 of you have answered correctly to all the questions. 1. Which string searching algorithm is faster? Morris-Pratt correct answer (ref) Brute force Rabin-Karp 2. Can &#8230; <a href="http://www.stoimen.com/blog/2012/05/09/you-think-you-know-algorithms-quiz-results-2/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/02/29/you-think-you-know-algorithms-quiz-results/' rel='bookmark' title='You think you know algorithms. Quiz results!'>You think you know algorithms. Quiz results!</a></li>
<li><a href='http://www.stoimen.com/blog/2012/03/16/you-think-you-know-php-quiz-results/' rel='bookmark' title='You think you know PHP. Quiz Results!'>You think you know PHP. Quiz Results!</a></li>
<li><a href='http://www.stoimen.com/blog/2012/03/07/you-think-you-know-javascript-quiz-results/' rel='bookmark' title='You think you know javascript. Quiz results!'>You think you know javascript. Quiz results!</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>Finally the results from <a href="http://www.stoimen.com/blog/2012/04/11/you-think-you-know-algorithms/" title="You think you know algorithms" target="_blank">&#8220;You think you know algorithms&#8221;</a> are out. This time only <strong>3</strong> of you answered all the questions correctly.</p>
<h3>1. Which string searching algorithm is faster?</h3>
<ul>
<li>Morris-Pratt <span style="color: #339966;">correct answer</span> (<a href="http://www.stoimen.com/blog/2012/04/09/computer-algorithms-morris-pratt-string-searching/" title="Computer Algorithms: Morris-Pratt String Searching" target="_blank">ref</a>)</li>
<li>Brute force</li>
<li>Rabin-Karp</li>
</ul>
<p><div id="attachment_3123" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/Answers1.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/Answers1.png" alt="Quiz results for &quot;Which string searching algorithm is faster?&quot;" title="Quiz results for &quot;Which string searching algorithm is faster?&quot;" width="600" height="371" class="size-full wp-image-3123" /></a><p class="wp-caption-text">  </p></div><br />
<span id="more-3115"></span></p>
<h3>2. Can you use radix sort for sorting floats?</h3>
<ul>
<li>Yes</li>
<li>No <span style="color: #339966;">correct answer</span> (<a href="http://www.stoimen.com/blog/2012/03/19/computer-algorithms-radix-sort/" title="Computer Algorithms: Radix Sort" target="_blank">ref</a>)</li>
</ul>
<div id="attachment_3124" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/Answers2.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/Answers2.png" alt="Quiz results for &quot;Can you use radix sort for sorting floats?&quot;" title="Quiz results for &quot;Can you use radix sort for sorting floats?&quot;" width="600" height="371" class="size-full wp-image-3124" /></a><p class="wp-caption-text"> </p></div>
<h3>3. Quicksort needs additional memory space?</h3>
<ul>
<li>Yes</li>
<li>No</li>
<li>Only in iterative implementation <span style="color: #339966;">correct answer</span> (<a href="http://www.stoimen.com/blog/2012/03/13/computer-algorithms-quicksort/" title="Computer Algorithms: Quicksort" target="_blank">ref</a>)</li>
<li>Only in recursive implementation</li>
</ul>
<div id="attachment_3125" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/Answers3.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/Answers3.png" alt="Quiz results for &quot;Quicksort needs additional memory space?&quot;" title="Quiz results for &quot;Quicksort needs additional memory space?&quot;" width="600" height="371" class="size-full wp-image-3125" /></a><p class="wp-caption-text"> </p></div>
<h3>4. In the worst case scenario which is slower?</h3>
<ul>
<li>Quicksort</li>
<li>Bubble sort</li>
<li>They are equally slow <span style="color: #339966;">correct answer</span> (<a href="http://www.stoimen.com/blog/2012/03/13/computer-algorithms-quicksort/" title="Computer Algorithms: Quicksort" target="_blank">ref</a>)</li>
</ul>
<div id="attachment_3126" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/Answers4.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/Answers4.png" alt="Quiz results for &quot;In the worst case scenario which is slower?&quot;" title="Quiz results for &quot;In the worst case scenario which is slower?&quot;" width="600" height="371" class="size-full wp-image-3126" /></a><p class="wp-caption-text"> </p></div>
<h3>5. Is merge sort faster than quicksort in general?</h3>
<ul>
<li>Yes, its complexity is O(n.log(n)) always!</li>
<li>No, in practice quicksort is often faster than merge sort <span style="color: #339966;">correct answer</span> (<a href="http://www.stoimen.com/blog/2012/03/13/computer-algorithms-quicksort/" title="Computer Algorithms: Quicksort" target="_blank">ref</a>)</li>
</ul>
<div id="attachment_3127" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/Answers5.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/Answers5.png" alt="Quiz results for &quot;Is merge sort faster than quicksort in general?&quot;" title="Quiz results for &quot;Is merge sort faster than quicksort in general?&quot;" width="600" height="371" class="size-full wp-image-3127" /></a><p class="wp-caption-text"> </p></div>
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/02/29/you-think-you-know-algorithms-quiz-results/' rel='bookmark' title='You think you know algorithms. Quiz results!'>You think you know algorithms. Quiz results!</a></li>
<li><a href='http://www.stoimen.com/blog/2012/03/16/you-think-you-know-php-quiz-results/' rel='bookmark' title='You think you know PHP. Quiz Results!'>You think you know PHP. Quiz Results!</a></li>
<li><a href='http://www.stoimen.com/blog/2012/03/07/you-think-you-know-javascript-quiz-results/' rel='bookmark' title='You think you know javascript. Quiz results!'>You think you know javascript. Quiz results!</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/05/09/you-think-you-know-algorithms-quiz-results-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Computer Algorithms: Determine if a Number is Prime</title>
		<link>http://www.stoimen.com/blog/2012/05/08/computer-algorithms-determine-if-a-number-is-prime/</link>
		<comments>http://www.stoimen.com/blog/2012/05/08/computer-algorithms-determine-if-a-number-is-prime/#comments</comments>
		<pubDate>Tue, 08 May 2012 20:42:40 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[Cryptography]]></category>
		<category><![CDATA[Eratosthenes]]></category>
		<category><![CDATA[ineffective algorithm]]></category>
		<category><![CDATA[Integer factorization algorithms]]></category>
		<category><![CDATA[Number]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Politics]]></category>
		<category><![CDATA[Primality tests]]></category>
		<category><![CDATA[Prime number]]></category>
		<category><![CDATA[Quadratic sieve]]></category>
		<category><![CDATA[Sieve of Atkin]]></category>
		<category><![CDATA[Sieve of Eratosthenes]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=3100</guid>
		<description><![CDATA[Introduction Each natural number that is divisible only by 1 and itself is prime. Prime numbers appear to be more interesting to humans than other numbers. Why is that and why prime numbers are more important than the numbers that &#8230; <a href="http://www.stoimen.com/blog/2012/05/08/computer-algorithms-determine-if-a-number-is-prime/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/04/24/computer-algorithms-how-to-determine-the-day-of-the-week/' rel='bookmark' title='Computer Algorithms: How to Determine the Day of the Week'>Computer Algorithms: How to Determine the Day of the Week</a></li>
<li><a href='http://www.stoimen.com/blog/2012/05/15/computer-algorithms-karatsuba-fast-multiplication/' rel='bookmark' title='Computer Algorithms: Karatsuba Fast Multiplication'>Computer Algorithms: Karatsuba Fast Multiplication</a></li>
<li><a href='http://www.stoimen.com/blog/2012/04/02/computer-algorithms-rabin-karp-string-searching/' rel='bookmark' title='Computer Algorithms: Rabin-Karp String Searching'>Computer Algorithms: Rabin-Karp String Searching</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>Introduction</h2>
<p>Each natural number greater than 1 that is divisible only by 1 and itself is prime. Prime numbers appear to be more interesting to humans than other numbers. Why is that, and why are prime numbers more important than, for instance, the numbers that are divisible by 2?</p>
<p>Perhaps the answer is that prime numbers are widely used in cryptography, although they were already interesting to the ancient Egyptians and Greeks (Euclid proved that there are infinitely many prime numbers circa 300 BC). The problem is that there is no simple formula that can tell us which is the next prime number, although there are algorithms that check whether a given natural number is prime. It&#8217;s very important for these algorithms to be efficient, especially for big numbers.</p>
<h2>Overview</h2>
<p>As I said, each natural number greater than 1 that is divisible only by 1 and itself is prime. That means that 2 is the first prime number and 1 is not considered prime. It’s easy to say that 2, 3, 5 and 7 are prime numbers, but what about 983? Well, yes, 983 is prime, but how do we check that?</p>
<p>If we want to know whether <strong>n</strong> is prime, the very basic approach is to check whether it is divisible by any number between 2 and n-1. It’s a kind of brute force.<br />
<span id="more-3100"></span></p>
<h2>Implementation</h2>
<p>The basic implementation in <a href="http://www.stoimen.com/blog/category/php/" title="PHP on stoimen.com" target="_blank">PHP</a> for the very basic (brute force) approach is as follows.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> isPrime<span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// 0 and 1 are not prime; 2 skips the loop below and returns true</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span> <span style="color: #339933;">&lt;</span> <span style="color: #cc66cc;">2</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">false</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$n</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span> <span style="color: #339933;">%</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">false</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #000088;">$i</span><span style="color: #339933;">++;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">true</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// prints the prime numbers between 2 and 100</span>
<span style="color: #666666; font-style: italic;">// 2, 3, 5, 7, ..., 97</span>
<span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #cc66cc;">100</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span>isPrime<span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">echo</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">.</span> <span style="color: #0000ff;">' '</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>Unfortunately this is a very inefficient algorithm. We don’t have to check every single number between 2 and n; it’s enough to check only the numbers between 2 and n/2, because no divisor of <strong>n</strong> other than <strong>n</strong> itself can be greater than n/2. If we find a divisor in that range, that is enough to say that <strong>n</strong> isn’t prime.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> isPrime<span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// 0 and 1 are not prime; 2 and 3 skip the loop below and return true</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span> <span style="color: #339933;">&lt;</span> <span style="color: #cc66cc;">2</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">false</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$n</span><span style="color: #339933;">/</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span> <span style="color: #339933;">%</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">false</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #000088;">$i</span><span style="color: #339933;">++;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">true</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>Although the code above improves considerably on our first prime checker, it’s clear that for large numbers it won&#8217;t be very efficient.</p>
<p>Indeed, checking against the interval [2, n/2] isn’t the optimal solution. A better approach is to check against [2, sqrt(n)]. This is correct because if <strong>n</strong> isn’t prime it can be represented as p*q = n, with p and q both greater than 1. If both p and q were greater than sqrt(n), their product would be greater than n, so at least one of them must be at most sqrt(n), and that is the divisor the loop will find.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> isPrime<span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// 0 and 1 are not prime; 2 and 3 skip the loop below and return true</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span> <span style="color: #339933;">&lt;</span> <span style="color: #cc66cc;">2</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">false</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #000088;">$sqrtN</span> <span style="color: #339933;">=</span> <span style="color: #990000;">sqrt</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$sqrtN</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$n</span> <span style="color: #339933;">%</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">false</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #000088;">$i</span><span style="color: #339933;">++;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #009900; font-weight: bold;">true</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>Besides showing how we can test whether a number is prime, these implementations are a very good example of how an algorithm can be optimized considerably with some small changes.</p>
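<p>To put a number on that, the following sketch counts how many trial divisions each of the three bounds needs for the prime 983 mentioned earlier. The helper <strong>countChecks</strong> is a made-up name for this illustration; it only mirrors the loops above with a configurable upper limit.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// Hypothetical helper: counts how many trial divisions are made
// when testing $n against the candidates 2..$limit.
function countChecks($n, $limit)
{
	$checks = 0;

	for ($i = 2; $i &lt;= $limit; $i++) {
		$checks++;
		if ($n % $i == 0) {
			break; // divisor found, $n is not prime
		}
	}

	return $checks;
}

$n = 983; // prime, so no candidate divides it and every check is made

echo countChecks($n, $n - 1) . "\n";          // 981
echo countChecks($n, floor($n / 2)) . "\n";   // 490
echo countChecks($n, floor(sqrt($n))) . "\n"; // 30</pre></div></div>

<p>The same answer comes out after 30 divisions instead of 981, just by changing the upper bound of the loop.</p>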
<h3>Sieve of Eratosthenes</h3>
<p>Although the sieve of Eratosthenes takes a different approach (it doesn’t check whether a single number is prime), it can give us a list of prime numbers quite easily. To remove the numbers that aren’t prime, we start with 2 and remove every other item from the list that is divisible by 2. Then we move on to the next number that is still in the list, 3, remove its multiples, and so on for the remaining items, as shown in the picture below.</p>
<div id="attachment_3117" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/SieveofEratosthenes.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/SieveofEratosthenes.png" alt="Sieve of Eratosthenes" title="Sieve of Eratosthenes" width="620" height="566" class="size-full wp-image-3117" /></a><p class="wp-caption-text">2, 3, 5, 7, ...</p></div>
<p>The PHP implementation of the Eratosthenes sieve isn&#8217;t difficult.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> eratosthenes_sieve<span style="color: #009900;">&#40;</span><span style="color: #339933;">&amp;</span><span style="color: #000088;">$sieve</span><span style="color: #339933;">,</span> <span style="color: #000088;">$n</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$n</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$sieve</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #b1b100;">echo</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">.</span> <span style="color: #0000ff;">' '</span><span style="color: #339933;">;</span>
&nbsp;
			<span style="color: #000088;">$j</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$i</span><span style="color: #339933;">;</span>
			<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$j</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$n</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #000088;">$sieve</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$j</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
				<span style="color: #000088;">$j</span> <span style="color: #339933;">+=</span> <span style="color: #000088;">$i</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #000088;">$i</span><span style="color: #339933;">++;</span>
	<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000088;">$n</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">100</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">// indexes 0..$n, so that $sieve[$n] exists as well</span>
<span style="color: #000088;">$sieve</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array_fill</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span> <span style="color: #000088;">$n</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// 2, 3, 5, 7, ..., 97</span>
eratosthenes_sieve<span style="color: #009900;">&#40;</span><span style="color: #000088;">$sieve</span><span style="color: #339933;">,</span> <span style="color: #000088;">$n</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<h2>Application</h2>
<p>As I said, prime numbers are widely used in cryptography, so they are always of great interest in computer science. In fact every natural number greater than 1 can be represented as a product of prime numbers, and cryptography makes use of numbers that are the product of two large primes. That&#8217;s because even if we know such a number, which is usually very, very big, it is still very difficult to find its prime factors.</p>
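<p>To see why going back from a number to its prime factors is the hard direction, here is a naive sketch that factors a number by trial division. The function <strong>primeFactors</strong> is a made-up helper for this illustration, and it assumes that the number fits into a PHP integer.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// Hypothetical helper: factors $n by trial division.
function primeFactors($n)
{
	$factors = array();

	// a composite number always has a prime factor not greater than its square root
	for ($p = 2; $p * $p &lt;= $n; $p++) {
		while ($n % $p == 0) {
			$factors[] = $p;
			$n = $n / $p;
		}
	}

	// whatever is left is itself prime
	if ($n > 1) {
		$factors[] = $n;
	}

	return $factors;
}

echo implode(' * ', primeFactors(8633)); // 89 * 97</pre></div></div>

<p>Multiplying 89 by 97 takes one step, while finding them again takes close to a hundred divisions. For numbers with hundreds of digits this kind of search is no longer feasible.</p>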
<p>Unfortunately the algorithms in this article are very basic and can be handy only if we work with small numbers or if our machines are tremendously powerful. Fortunately in practice there are more complex algorithms for finding prime numbers. Such are the sieves of Euler, Atkin and Sundaram. </p>
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/04/24/computer-algorithms-how-to-determine-the-day-of-the-week/' rel='bookmark' title='Computer Algorithms: How to Determine the Day of the Week'>Computer Algorithms: How to Determine the Day of the Week</a></li>
<li><a href='http://www.stoimen.com/blog/2012/05/15/computer-algorithms-karatsuba-fast-multiplication/' rel='bookmark' title='Computer Algorithms: Karatsuba Fast Multiplication'>Computer Algorithms: Karatsuba Fast Multiplication</a></li>
<li><a href='http://www.stoimen.com/blog/2012/04/02/computer-algorithms-rabin-karp-string-searching/' rel='bookmark' title='Computer Algorithms: Rabin-Karp String Searching'>Computer Algorithms: Rabin-Karp String Searching</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/05/08/computer-algorithms-determine-if-a-number-is-prime/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Computer Algorithms: Lossy Image Compression with Run-Length Encoding</title>
		<link>http://www.stoimen.com/blog/2012/05/03/computer-algorithms-lossy-image-compression-with-run-length-encoding/</link>
		<comments>http://www.stoimen.com/blog/2012/05/03/computer-algorithms-lossy-image-compression-with-run-length-encoding/#comments</comments>
		<pubDate>Thu, 03 May 2012 20:27:06 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[Coding theory]]></category>
		<category><![CDATA[Data compression]]></category>
		<category><![CDATA[data compression algorithm]]></category>
		<category><![CDATA[Graphics file formats]]></category>
		<category><![CDATA[image compression]]></category>
		<category><![CDATA[Image processing]]></category>
		<category><![CDATA[Information theory]]></category>
		<category><![CDATA[jpeg]]></category>
		<category><![CDATA[Lossless data compression]]></category>
		<category><![CDATA[lossy algorithm]]></category>
		<category><![CDATA[Lossy compression]]></category>
		<category><![CDATA[PCX]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Pixel]]></category>
		<category><![CDATA[Run-length encoding]]></category>
		<category><![CDATA[suitable algorithm]]></category>
		<category><![CDATA[Technology/Internet]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=3078</guid>
		<description><![CDATA[Introduction Run-length encoding is a data compression algorithm that helps us encode large runs of repeating items by only sending one item from the run and a counter showing how many times this item is repeated. Unfortunately this technique is &#8230; <a href="http://www.stoimen.com/blog/2012/05/03/computer-algorithms-lossy-image-compression-with-run-length-encoding/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/' rel='bookmark' title='Computer Algorithms: Data Compression with Run-length Encoding'>Computer Algorithms: Data Compression with Run-length Encoding</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/16/computer-algorithms-data-compression-with-bitmaps/' rel='bookmark' title='Computer Algorithms: Data Compression with Bitmaps'>Computer Algorithms: Data Compression with Bitmaps</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/23/computer-algorithms-data-compression-with-diagram-encoding-and-pattern-substitution/' rel='bookmark' title='Computer Algorithms: Data Compression with Diagram Encoding and Pattern Substitution'>Computer Algorithms: Data Compression with Diagram Encoding and Pattern Substitution</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>Introduction</h2>
<p><a href="http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/" title="Computer Algorithms: Data Compression with Run-length Encoding">Run-length encoding</a> is a data compression algorithm that helps us encode large runs of repeating items by only sending one item from the run and a counter showing how many times this item is repeated. Unfortunately this technique is useless when trying to compress natural language texts, because they don’t have long runs of repeating elements. On the other hand RLE is useful when it comes to image compression, because images often have long runs of pixels with identical color.</p>
<p>As you can see in the following picture, we can compress consecutive pixels by replacing each run with one pixel from it and a counter showing how many items it contains.</p>
<div id="attachment_3101" class="wp-caption alignnone" style="width: 629px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/1.LosslessRLEforImages.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/1.LosslessRLEforImages.png" alt="Lossless RLE for Images" title="Lossless RLE for Images" width="619" height="216" class="size-full wp-image-3101" /></a><p class="wp-caption-text">Although lossless RLE can be quite effective for image compression, it is still not the best approach!</p></div>
<p>In this case we can save counters only for pixels that are repeated more than once. Thus the input stream “aaaabbaba” will be compressed as “[4]a[2]baba”.</p>
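<p>A minimal sketch of this variant could look like the code below. It works on a string rather than on real pixels, and <strong>rleEncode</strong> is a made-up name for this illustration.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// Hypothetical helper: run-length encodes a string, writing a counter
// only for runs that are longer than one item.
function rleEncode($input)
{
	$output = '';
	$length = strlen($input);
	$i = 0;

	while ($i &lt; $length) {
		// measure the run that starts at position $i
		$run = 1;
		while ($i + $run &lt; $length &amp;&amp; $input[$i + $run] == $input[$i]) {
			$run++;
		}

		$output .= ($run > 1 ? '[' . $run . ']' : '') . $input[$i];
		$i += $run;
	}

	return $output;
}

echo rleEncode('aaaabbaba'); // [4]a[2]baba</pre></div></div>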
<p>Actually there are several ways run-length encoding can be used for image compression. A possible way of compressing a picture can be either row by row or column by column, as it is shown on the picture below.</p>
<p><div id="attachment_3102" class="wp-caption alignnone" style="width: 631px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/2.RowbyRowandColbyCol.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/2.RowbyRowandColbyCol.png" alt="Row by row or column by column compression" title="Row by row or column by column compression" width="621" height="300" class="size-full wp-image-3102" /></a><p class="wp-caption-text">Row by row or column by column compression.</p></div><span id="more-3078"></span></p>
<p>The problem in practice is that sometimes compressing row by row may be effective, while in other cases the same approach is very ineffective. This is illustrated by the image below.</p>
<div id="attachment_3103" class="wp-caption alignnone" style="width: 631px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/3.EffectiveandIneffectiveCompression.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/3.EffectiveandIneffectiveCompression.png" alt="Effective and Ineffective Compression" title="Effective and Ineffective Compression" width="621" height="328" class="size-full wp-image-3103" /></a><p class="wp-caption-text">Sometimes image compression may be done only after some preprocessing that can help us understand the best compression approach!</p></div>
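<p>One possible preprocessing step is simply to count how many runs each scan direction produces and pick the smaller one. The sketch below assumes a rectangular, non-empty image stored as an array of rows; <code>countRuns()</code> and <code>countImageRuns()</code> are hypothetical helper names, and each row or column is treated as a separate stream.</p>

```php
<?php
/**
 * Counts the runs in a flat, 0-indexed list of values.
 * NOTE: countRuns() and countImageRuns() are hypothetical helpers for illustration.
 */
function countRuns(array $values)
{
    $runs = 0;
    foreach ($values as $k => $value) {
        // a new run starts at the first item and at every change of value
        if ($k === 0 || $value !== $values[$k - 1]) {
            $runs++;
        }
    }
    return $runs;
}

/**
 * Returns the number of runs when the image is scanned row by row
 * and when it is scanned column by column.
 */
function countImageRuns(array $image)
{
    $byRows = 0;
    foreach ($image as $row) {
        $byRows += countRuns(array_values($row));
    }

    $byCols = 0;
    $width  = count($image[0]);
    for ($x = 0; $x < $width; $x++) {
        $byCols += countRuns(array_column($image, $x));
    }

    return array('rows' => $byRows, 'columns' => $byCols);
}

// vertical stripes: every row alternates, every column is uniform
$image = array(
    array('r', 'b', 'r', 'b'),
    array('r', 'b', 'r', 'b'),
    array('r', 'b', 'r', 'b'),
);

print_r(countImageRuns($image)); // rows => 12, columns => 4
```

<p>For this striped image the column by column scan needs 4 runs instead of 12, so it would be the better choice.</p>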
<p>Obviously run-length encoding is a good approach for compressing images. However, when we talk about big images with millions of pixels, it’s natural to turn to lossy compression.</p>
<h2>Overview</h2>
<p>Lossy RLE is well suited to images, because large images usually contain big areas of identical pixel colors, e.g. when half of the picture is blue sky. By using lossy compression we can skip very short runs.</p>
<p>First we have to decide how long the shortest run that we keep in the compressed output will be. For instance, if 3 is the shortest run, then runs of 1 or 2 consecutive elements will be skipped.</p>
<div id="attachment_3104" class="wp-caption alignnone" style="width: 631px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/4.LossessImageRow.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/4.LossessImageRow.png" alt="Lossless Pixel Row" title="Lossless Pixel Row" width="621" height="173" class="size-full wp-image-3104" /></a><p class="wp-caption-text">Lossless compression of a pixel row can in some cases be very ineffective!</p></div>
<p>Of course, if we set the shortest run to be only one element long, the compression becomes completely lossless, which isn’t very effective. However, when we talk about millions of pixels even runs of three or more elements are very short, so it’s up to the developer to decide how long the shortest run should be.</p>
<h3>Some Examples</h3>
<p>Let&#8217;s first define the shortest run that we will keep untouched to be three elements long.</p>
<div id="attachment_3105" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/5.LossyImageRow.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/5.LossyImageRow.png" alt="Lossy Pixel Row" title="Lossy Pixel Row" width="620" height="273" class="size-full wp-image-3105" /></a><p class="wp-caption-text">We can lose some information that is invisible to the eye.</p></div>
<p>The above image is compressed more effectively than the lossless pixel row from the previous picture.</p>
<p>The question is how to merge short runs. For instance, the following three runs have to be blended into a single color run.</p>
<div id="attachment_3106" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/6.BlendingShortRuns.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/6.BlendingShortRuns.png" alt="Blending Short Runs" title="Blending Short Runs" width="620" height="199" class="size-full wp-image-3106" /></a><p class="wp-caption-text">We must choose how to blend short runs!</p></div>
<p>We can choose the middle color (option #1) or another one, but the right choice always depends on the picture: it will be effective in some cases and ineffective in others.</p>
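<p>The “middle color” option could be sketched roughly as follows. This is only one possible reading of the picture: the helper name <code>blendShortRuns()</code>, the <code>'value'</code> key and the color names are assumptions made for illustration.</p>

```php
<?php
/**
 * Blends consecutive short runs into one run that takes the value
 * of the middle run and the total length of all blended runs.
 * NOTE: blendShortRuns() and the 'value' key are assumptions of this sketch.
 */
function blendShortRuns(array $runs)
{
    $total = 0;
    foreach ($runs as $run) {
        $total += $run['count'];
    }

    // pick the run in the middle of the list
    $middle = $runs[(int) floor(count($runs) / 2)];

    return array('count' => $total, 'value' => $middle['value']);
}

$shortRuns = array(
    array('count' => 2, 'value' => 'light blue'),
    array('count' => 1, 'value' => 'blue'),
    array('count' => 2, 'value' => 'dark blue'),
);

print_r(blendShortRuns($shortRuns)); // count => 5, value => blue
```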
<h2>Implementation</h2>
<p>Implementing run-length encoding is easy in general. Here’s a simple <a href="http://www.stoimen.com/blog/category/php/" title="PHP on stoimen.com">PHP</a> snippet that implements lossy RLE.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #009933; font-style: italic;">/**
 * Compresses an input list of objects by losing some data using
 * run-length encoding
 * 
 * @param mixed $objectList
 * @param int $minLength
 */</span>
<span style="color: #000000; font-weight: bold;">function</span> lossyRLE<span style="color: #009900;">&#40;</span><span style="color: #000088;">$objectList</span><span style="color: #339933;">,</span> <span style="color: #000088;">$minLength</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$len</span> 		<span style="color: #339933;">=</span> <span style="color: #990000;">is_string</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$objectList</span><span style="color: #009900;">&#41;</span> 
			? <span style="color: #990000;">strlen</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$objectList</span><span style="color: #009900;">&#41;</span>		<span style="color: #666666; font-style: italic;">// string as an input stream</span>
			<span style="color: #339933;">:</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$objectList</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>		<span style="color: #666666; font-style: italic;">// array as an input stream</span>
	<span style="color: #000088;">$j</span> 		<span style="color: #339933;">=</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$compressed</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>				<span style="color: #666666; font-style: italic;">// compressed output</span>
&nbsp;
	<span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$len</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #990000;">isset</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$objectList</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$objectList</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">==</span> <span style="color: #000088;">$objectList</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$j</span><span style="color: #339933;">++;</span>
		<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$l</span> <span style="color: #339933;">=</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$compressed</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
			<span style="color: #666666; font-style: italic;">// This is where RLE is converted to a lossy algorithm!</span>
			<span style="color: #666666; font-style: italic;">// In case the run is shorter than a predefined length the</span>
			<span style="color: #666666; font-style: italic;">// algorithm will skip these elements and will stretch the last</span>
			<span style="color: #666666; font-style: italic;">// saved run.</span>
			<span style="color: #666666; font-style: italic;">// NOTE: this logic can be changed in order to take other</span>
			<span style="color: #666666; font-style: italic;">// decisions depending on the goals.</span>
			<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$j</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$minLength</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$l</span> <span style="color: #339933;">&gt;</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #000088;">$compressed</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$l</span><span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#91;</span><span style="color: #0000ff;">'count'</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">+=</span> <span style="color: #000088;">$j</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #000088;">$compressed</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'count'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #000088;">$j</span><span style="color: #339933;">,</span> <span style="color: #000088;">$objectList</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span>
			<span style="color: #000088;">$j</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #000088;">$compressed</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000088;">$input</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">'aaaabbaabbbbba'</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// aaaaaaaabbbbbb</span>
lossyRLE<span style="color: #009900;">&#40;</span><span style="color: #000088;">$input</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">3</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>
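<p>To check the result it helps to expand the runs back into a string. The decoder below is a minimal sketch: <code>rleDecode()</code> is a hypothetical helper, and it assumes string items stored at index 0 of each run, which is the layout <code>lossyRLE()</code> uses.</p>

```php
<?php
/**
 * Expands a list of runs back into a string.
 * NOTE: rleDecode() is a hypothetical helper; it expects every run to be
 * array('count' => n, 0 => item) with a string item.
 */
function rleDecode(array $compressed)
{
    $output = '';
    foreach ($compressed as $run) {
        $output .= str_repeat($run[0], $run['count']);
    }
    return $output;
}

// the two runs that correspond to the "aaaaaaaabbbbbb" comment above
$runs = array(
    array('count' => 8, 'a'),
    array('count' => 6, 'b'),
);

echo rleDecode($runs), PHP_EOL; // aaaaaaaabbbbbb
```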

<p>The code above can be modified to work with more complex data. Let’s say we have a “pixel” abstraction, as in the pictures above.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #009933; font-style: italic;">/**
 * Pixel abstraction
 */</span>
<span style="color: #000000; font-weight: bold;">class</span> Pixel
<span style="color: #009900;">&#123;</span>
	<span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000088;">$_color</span> <span style="color: #339933;">=</span> <span style="color: #009900; font-weight: bold;">null</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> __construct<span style="color: #009900;">&#40;</span><span style="color: #000088;">$color</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">''</span><span style="color: #009900;">&#41;</span>
	<span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span>_color <span style="color: #339933;">=</span> <span style="color: #000088;">$color</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> getColor<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> 
	<span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span>_color<span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/**
 * Inits the pixels array
 * 
 * @param array $pixels
 */</span>
<span style="color: #000000; font-weight: bold;">function</span> init<span style="color: #009900;">&#40;</span><span style="color: #990000;">array</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$pixels</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$colors</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'red'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'green'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'blue'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #cc66cc;">100</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$pixels</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Pixel<span style="color: #009900;">&#40;</span><span style="color: #000088;">$colors</span><span style="color: #009900;">&#91;</span><span style="color: #990000;">mt_rand</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">2</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/**
 * Compresses an input list of objects by losing some data using
 * run-length encoding
 * 
 * @param mixed $objectList
 * @param int $minLength
 */</span>
<span style="color: #000000; font-weight: bold;">function</span> lossyRLE<span style="color: #009900;">&#40;</span><span style="color: #000088;">$objectList</span><span style="color: #339933;">,</span> <span style="color: #000088;">$minLength</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$len</span> 		<span style="color: #339933;">=</span> <span style="color: #990000;">is_string</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$objectList</span><span style="color: #009900;">&#41;</span> 
			? <span style="color: #990000;">strlen</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$objectList</span><span style="color: #009900;">&#41;</span>		<span style="color: #666666; font-style: italic;">// string as an input stream</span>
			<span style="color: #339933;">:</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$objectList</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>		<span style="color: #666666; font-style: italic;">// array as an input stream</span>
	<span style="color: #000088;">$j</span> 		<span style="color: #339933;">=</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$compressed</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>				<span style="color: #666666; font-style: italic;">// compressed output</span>
&nbsp;
	<span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$len</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #990000;">isset</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$objectList</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$objectList</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">==</span> <span style="color: #000088;">$objectList</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$j</span><span style="color: #339933;">++;</span>
		<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$l</span> <span style="color: #339933;">=</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$compressed</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
			<span style="color: #666666; font-style: italic;">// This is where RLE is converted to a lossy algorithm!</span>
			<span style="color: #666666; font-style: italic;">// In case the run is shorter than a predefined length the</span>
			<span style="color: #666666; font-style: italic;">// algorithm will skip these elements and will stretch the last</span>
			<span style="color: #666666; font-style: italic;">// saved run.</span>
			<span style="color: #666666; font-style: italic;">// NOTE: this logic can be changed in order to take other</span>
			<span style="color: #666666; font-style: italic;">// decisions depending on the goals.</span>
			<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$j</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$minLength</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$l</span> <span style="color: #339933;">&gt;</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #000088;">$compressed</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$l</span><span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#91;</span><span style="color: #0000ff;">'count'</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">+=</span> <span style="color: #000088;">$j</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #000088;">$compressed</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'count'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #000088;">$j</span><span style="color: #339933;">,</span> <span style="color: #000088;">$objectList</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span>
			<span style="color: #000088;">$j</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #000088;">$compressed</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000088;">$pixels</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// initializes the pixels array</span>
init<span style="color: #009900;">&#40;</span><span style="color: #000088;">$pixels</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$compressed</span> <span style="color: #339933;">=</span> lossyRLE<span style="color: #009900;">&#40;</span><span style="color: #000088;">$pixels</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">3</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #990000;">print_r</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$compressed</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<h2>Complexity</h2>
<p>In general the complexity of lossless RLE is linear &#8211; O(n), where n is the number of items in the input stream. Even with the small modification above the complexity remains linear. However, we can modify the compression in a slightly different manner (in order to get the middle value from consecutive short runs). This adds some bookkeeping per run, but as long as each item is visited a constant number of times the complexity stays linear.</p>
<h2>Application</h2>
<p>Run-length encoding isn’t a very effective option for compressing texts, but for images, where long runs of identical pixels tend to occur, it is quite useful.</p>
<div id="attachment_3107" class="wp-caption alignnone" style="width: 632px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/05/7.LossyRLE.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/05/7.LossyRLE.png" alt="Lossy RLE in Practice" title="Lossy RLE in Practice" width="622" height="483" class="size-full wp-image-3107" /></a><p class="wp-caption-text"> </p></div>
<p>Moreover, RLE is easy to convert into a lossy algorithm, which makes it very suitable for image compression.</p>
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/01/09/computer-algorithms-data-compression-with-run-length-encoding/' rel='bookmark' title='Computer Algorithms: Data Compression with Run-length Encoding'>Computer Algorithms: Data Compression with Run-length Encoding</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/16/computer-algorithms-data-compression-with-bitmaps/' rel='bookmark' title='Computer Algorithms: Data Compression with Bitmaps'>Computer Algorithms: Data Compression with Bitmaps</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/23/computer-algorithms-data-compression-with-diagram-encoding-and-pattern-substitution/' rel='bookmark' title='Computer Algorithms: Data Compression with Diagram Encoding and Pattern Substitution'>Computer Algorithms: Data Compression with Diagram Encoding and Pattern Substitution</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/05/03/computer-algorithms-lossy-image-compression-with-run-length-encoding/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>PHP Strings Don&#8217;t Need Quotes</title>
		<link>http://www.stoimen.com/blog/2012/04/26/php-strings-dont-need-quotes/</link>
		<comments>http://www.stoimen.com/blog/2012/04/26/php-strings-dont-need-quotes/#comments</comments>
		<pubDate>Thu, 26 Apr 2012 13:52:53 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[C]]></category>
		<category><![CDATA[Computer programming]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[Cross-platform software]]></category>
		<category><![CDATA[Curly bracket programming languages]]></category>
		<category><![CDATA[PHP interpreter]]></category>
		<category><![CDATA[PHP programming language]]></category>
		<category><![CDATA[Procedural programming languages]]></category>
		<category><![CDATA[Programming language]]></category>
		<category><![CDATA[Scripting languages]]></category>
		<category><![CDATA[Social Issues]]></category>
		<category><![CDATA[Software engineering]]></category>
		<category><![CDATA[String]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=3017</guid>
		<description><![CDATA[I bet you didn&#8217;t know that PHP strings don&#8217;t need quotes! Indeed PHP developers work with strings with either single or double quotes, but actually in some cases you don&#8217;t need them. PHP by Book Here&#8217;s how PHP developer declare &#8230; <a href="http://www.stoimen.com/blog/2012/04/26/php-strings-dont-need-quotes/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2011/07/12/a-javascript-trick-you-should-know/' rel='bookmark' title='A JavaScript Trick You Should Know'>A JavaScript Trick You Should Know</a></li>
<li><a href='http://www.stoimen.com/blog/2010/03/10/php-if-else-endif-statements/' rel='bookmark' title='PHP if-else-endif Statements'>PHP if-else-endif Statements</a></li>
<li><a href='http://www.stoimen.com/blog/2011/08/18/powerful-php-less-known-string-manipulation/' rel='bookmark' title='Powerful PHP: Less Known String Manipulation'>Powerful PHP: Less Known String Manipulation</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>I bet you didn&#8217;t know that PHP strings don&#8217;t need quotes! Indeed PHP developers work with strings with either single or double quotes, but actually in some cases you don&#8217;t need them.</p>
<h2>PHP by Book</h2>
<p>Here&#8217;s how a PHP developer declares a string, which is something very common in any programming language.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$my_var</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">'hello world'</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">// or</span>
<span style="color: #000088;">$my_var</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">&quot;hello world&quot;</span><span style="color: #339933;">;</span></pre></div></div>

<h2>PHP Tricks</h2>
<p>What if you do the following:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #b1b100;">echo</span> hello<span style="color: #339933;">;</span></pre></div></div>

<p>That appears to be correct &#8230; Well, it&#8217;s not absolutely correct. You&#8217;ll be &#8220;noticed&#8221;.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">// Notice: Use of undefined constant hello</span>
<span style="color: #b1b100;">echo</span> hello<span style="color: #339933;">;</span></pre></div></div>

<p>However, if you disable error reporting, the notice is no longer shown and the code runs without complaints.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #990000;">error_reporting</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// no problem now</span>
<span style="color: #b1b100;">echo</span> hello<span style="color: #339933;">;</span></pre></div></div>

<h2>Variations</h2>
<p>It follows that you can use strings without quotes:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">// hello</span>
<span style="color: #b1b100;">echo</span> hello<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// hello world (concatenated)</span>
<span style="color: #b1b100;">echo</span> hello <span style="color: #339933;">.</span> <span style="color: #0000ff;">' world'</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// helloworld</span>
<span style="color: #b1b100;">echo</span> hello <span style="color: #339933;">.</span> world<span style="color: #339933;">;</span></pre></div></div>

<p>However, an unquoted string can&#8217;t contain spaces or most of the &#8220;special&#8221; symbols.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">// syntax error</span>
<span style="color: #b1b100;">echo</span> hello world<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// syntax error</span>
<span style="color: #b1b100;">echo</span> hello<span style="color: #339933;">!;</span></pre></div></div>

<h2>Final Words</h2>
<p>Although you can do this in PHP, you shouldn&#8217;t. First, the code becomes more difficult to read and understand. Second, if you forget the $ sign in front of a variable name, the PHP interpreter will assume it is a string and carry on. So disabling error reporting isn&#8217;t always such a great idea.</p>
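<p>A minimal sketch of the safer habit, assuming you are free to change the error reporting level of your script: keep reporting switched on and always quote string literals. The variable names are only illustrative. Note that newer PHP versions are stricter about this: the fallback to a string was deprecated in PHP 7.2, and in PHP 8 an undefined constant throws an Error.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// report everything while developing, so a missing $ or quote is visible
error_reporting(E_ALL);

$hello = 'hello';

// quoted literal: always the string "hello world"
$greeting = 'hello' . ' world';

// variable with its $ sign: uses the value stored in $hello
$fromVariable = $hello . ' world';

echo $greeting;</pre></div></div>
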
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2011/07/12/a-javascript-trick-you-should-know/' rel='bookmark' title='A JavaScript Trick You Should Know'>A JavaScript Trick You Should Know</a></li>
<li><a href='http://www.stoimen.com/blog/2010/03/10/php-if-else-endif-statements/' rel='bookmark' title='PHP if-else-endif Statements'>PHP if-else-endif Statements</a></li>
<li><a href='http://www.stoimen.com/blog/2011/08/18/powerful-php-less-known-string-manipulation/' rel='bookmark' title='Powerful PHP: Less Known String Manipulation'>Powerful PHP: Less Known String Manipulation</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/04/26/php-strings-dont-need-quotes/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Computer Algorithms: How to Determine the Day of the Week</title>
		<link>http://www.stoimen.com/blog/2012/04/24/computer-algorithms-how-to-determine-the-day-of-the-week/</link>
		<comments>http://www.stoimen.com/blog/2012/04/24/computer-algorithms-how-to-determine-the-day-of-the-week/#comments</comments>
		<pubDate>Tue, 24 Apr 2012 19:31:03 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Calculating the day of the week]]></category>
		<category><![CDATA[Calendars]]></category>
		<category><![CDATA[Chronology]]></category>
		<category><![CDATA[computer]]></category>
		<category><![CDATA[Computer science]]></category>
		<category><![CDATA[Doomsday rule]]></category>
		<category><![CDATA[February]]></category>
		<category><![CDATA[Gregorian calendar]]></category>
		<category><![CDATA[informatics]]></category>
		<category><![CDATA[Julian calendar]]></category>
		<category><![CDATA[Leap year]]></category>
		<category><![CDATA[month]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Units of time]]></category>
		<category><![CDATA[USD]]></category>
		<category><![CDATA[Year zero]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=3058</guid>
		<description><![CDATA[Introduction Do you know on which day of the week you were born? Monday or maybe Saturday? Well, perhaps you know that. Everybody knows the day they were born on, but do you know what day was the 31st &#8230; <a href="http://www.stoimen.com/blog/2012/04/24/computer-algorithms-how-to-determine-the-day-of-the-week/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/05/08/computer-algorithms-determine-if-a-number-is-prime/' rel='bookmark' title='Computer Algorithms: Determine if a Number is Prime'>Computer Algorithms: Determine if a Number is Prime</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/02/computer-algorithms-interpolation-search/' rel='bookmark' title='Computer Algorithms: Interpolation Search'>Computer Algorithms: Interpolation Search</a></li>
<li><a href='http://www.stoimen.com/blog/2012/02/06/computer-algorithms-data-compression-with-prefix-encoding/' rel='bookmark' title='Computer Algorithms: Data Compression with Prefix Encoding'>Computer Algorithms: Data Compression with Prefix Encoding</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>Introduction</h2>
<p>Do you know on which day of the week you were born? Monday or maybe Saturday? Well, perhaps you know that. Everybody knows the day they were born on, but do you know what day the 31st of January 1883 was? No? Well, there must be some method to determine the day of the week for any date in any century.</p>
<p>We know that 2012 started on a Sunday. Knowing that, it’s easy to determine what day the 2nd of January is: it must be Monday. But things get more complex if we try to guess a date far from January the 1st. Indeed, the 1st of January was a Sunday, but what day is the 9th of May of the same year? This is far more difficult to say. Of course we can go with a brute force approach and count from 1 January to 9 May, but that is quite slow and error-prone.</p>
<div id="attachment_3079" class="wp-caption alignnone" style="width: 629px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/FollowingDays.png"><img class="size-full wp-image-3079" title="Following Days" src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/FollowingDays.png" alt="Following Days" width="619" height="315" /></a><p class="wp-caption-text">If the 1st of January is a Sunday, the 2nd of January must be a Monday</p></div>
<p>So what do we do if we have to code a program that answers this question? The easiest way is to use a library. Almost every major library has built-in functions that can tell what day of the week falls on a given date, such as date() in PHP or getDay() in JavaScript. But the question remains: how do these library functions know the answer, and how can we code such a function if our library doesn’t support this functionality?</p>
<p>There must be some algorithm to help us.<span id="more-3058"></span></p>
<h2>Overview</h2>
<p>Because months have different numbers of days, and most of them aren’t divisible by 7 without a remainder, months begin on different days. Thus if January begins on Sunday, February of the same year will begin on Wednesday, because 31 mod 7 = 3. In common years February has 28 days, which fortunately is divisible by 7, so February and March both begin on the same day. That is great, but it isn’t true for leap years.</p>
<h3>What Do We Know About the Calendar</h3>
<p>The first thing to know is that each week has exactly 7 days. We also know that a common year has 365 days, while a leap year has one day more &#8211; 366. Most of the months have 30 or 31 days, but February has only 28 days in common years and 29 in leap years.</p>
<p>Because 365 mod 7 = 1, the year after a common year begins exactly one weekday later than the common year did. Thus if 2011 started on Saturday, 2012 starts on Sunday. Again, that is because 2011 is not a leap year. After a leap year the shift is two days, since 366 mod 7 = 2.</p>
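<p>As a quick check of this arithmetic, here is a minimal sketch in PHP. The variable names are only illustrative, and the numbering of the days (Sunday = 0 up to Saturday = 6) is an assumption borrowed from the day table used later in this post.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// number of weekdays by which the first day of the next year shifts
$commonShift = 365 % 7; // 1
$leapShift = 366 % 7;   // 2

// 2011 started on Saturday (6) and is a common year
$firstDayOf2012 = (6 + $commonShift) % 7; // 0, i.e. Sunday

echo $firstDayOf2012;</pre></div></div>
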
<div id="attachment_3081" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/SomeStatistics.png"><img class="size-full wp-image-3081" title="Some Statistics" src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/SomeStatistics.png" alt="Some Statistics" width="620" height="254" /></a><p class="wp-caption-text">A week always has 7 days, while the number of days in a year depends on whether it&#39;s a leap year!</p></div>
<p>What else do we know? Because a week has exactly seven days, only February (with its 28 days in a common year) has a number of days divisible by 7 (28 mod 7 = 0) and contains exactly four weeks. Thus in a common year February and March start on the same day. Unfortunately that is not true for the other months.</p>
<p>All these things we know about the calendar let us draw some conclusions. Although eleven of the months have either 30 or 31 days, consecutive months don’t start on the same day. Some months do start on the same day, though, just because the number of days between them is divisible by 7 without a remainder.</p>
<p>Let’s take a look at some examples. September and November have 30 days each, while October, which is between them, has 31 days. Thus 30 + 31 + 30 makes 91, and fortunately 91 mod 7 = 0. So in every year September and December start on the same day (as they come after February, they don’t depend on leap years). The same thing holds for April and July, and the good news is that in leap years even January starts on the same day as April and July.</p>
<div id="attachment_3082" class="wp-caption alignnone" style="width: 633px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/PeriodsofDaysDivisibleby7.png"><img class="size-full wp-image-3082" title="Periods of Days Divisible by 7" src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/PeriodsofDaysDivisibleby7.png" alt="Periods of Days Divisible by 7" width="623" height="508" /></a><p class="wp-caption-text">Not only the number of days in February is divisible by 7. The sum of days of April, May and June is also divisible by 7!</p></div>
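<p>The sums mentioned above are easy to verify. The following is only a small sketch with illustrative variable names; each array holds the lengths of three consecutive months.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// September, October, November
$autumn = array(30, 31, 30);
// April, May, June
$spring = array(30, 31, 30);
// January, February, March in a leap year
$leapWinter = array(31, 29, 31);

// a remainder of 0 means the month after the period starts
// on the same day of the week as the first month of the period
echo array_sum($autumn) % 7;     // 0
echo array_sum($spring) % 7;     // 0
echo array_sum($leapWinter) % 7; // 0</pre></div></div>
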
<p>Now we know that there are some relations between months. Thus if we know somehow that the 13th of April is a Monday, we can be sure that the 13th of July is also a Monday. Let’s now see a summary of these observations.</p>
<div id="attachment_3083" class="wp-caption alignnone" style="width: 631px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/CorrespondingMonthsinaCommonYear.png"><img class="size-full wp-image-3083" title="Corresponding Months in a Common Year" src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/CorrespondingMonthsinaCommonYear.png" alt="Corresponding Months in a Common Year" width="621" height="516" /></a><p class="wp-caption-text">In a common year some months correspond!</p></div>
<p>We can also refer to the following diagram.</p>
<div id="attachment_3086" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/TableofCorrespondingMonthsinaCommonYear.png"><img class="size-full wp-image-3086" title="Table of Corresponding Months in a Common Year" src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/TableofCorrespondingMonthsinaCommonYear.png" alt="Table of Corresponding Months in a Common Year" width="620" height="326" /></a><p class="wp-caption-text">It&#39;s clearer to see the corresponding months in a table view!</p></div>
<p>For leap years there are other corresponding months. Let’s take a look at the following image.</p>
<div id="attachment_3087" class="wp-caption alignnone" style="width: 631px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/CorrespondingMonthsinaLeapYear.png"><img class="size-full wp-image-3087" title="Corresponding Months in a Leap Year" src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/CorrespondingMonthsinaLeapYear.png" alt="Corresponding Months in a Leap Year" width="621" height="516" /></a><p class="wp-caption-text">Corresponding months in a leap year differ from corresponding months in a common year!</p></div>
<p>Another way to get the same information is the following table.</p>
<div id="attachment_3088" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/TableofCorrespondingMonthsinaLeapYear.png"><img class="size-full wp-image-3088" title="Table of Corresponding Months in a Leap Year" src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/TableofCorrespondingMonthsinaLeapYear.png" alt="Table of Corresponding Months in a Leap Year" width="620" height="326" /></a><p class="wp-caption-text">Table view is easier to remember!</p></div>
<p>We also know that a leap year occurs once every four years (in the Gregorian calendar, century years are leap only if they are divisible by 400). Now take a common year like 2001: which is the next common year that starts on the same day of the week and therefore has exactly the same calendar? A year can start on any of the seven days of the week and can be either leap or common. This means there are just 14 combinations.</p>
<p>Following these observations we can refer to the following table of centuries.</p>
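<p>The question about 2001 can be answered with a short loop. This is only a sketch: the helper functions is_leap_year() and next_year_with_same_calendar() are hypothetical names introduced for this illustration, and the Gregorian leap year rule is assumed.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// Gregorian rule: every 4th year, except century years not divisible by 400
function is_leap_year($year)
{
	return ($year % 4 == 0 &amp;&amp; $year % 100 != 0) || $year % 400 == 0;
}

// Find the next year that starts on the same weekday and has the same type
function next_year_with_same_calendar($year)
{
	$shift = 0;
	$candidate = $year;

	do {
		// a common year moves the next 1st of January by 1 day, a leap year by 2
		$shift = ($shift + (is_leap_year($candidate) ? 2 : 1)) % 7;
		$candidate++;
	} while ($shift != 0 || is_leap_year($candidate) != is_leap_year($year));

	return $candidate;
}

// 2007
echo next_year_with_same_calendar(2001);</pre></div></div>

<p>The loop stops as soon as the accumulated shift is a multiple of 7 and the candidate year is of the same type (leap or common) as the starting year.</p>
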

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #cc66cc;">1700</span>–<span style="color: #cc66cc;">1799</span>     <span style="color: #cc66cc;">4</span>
<span style="color: #cc66cc;">1800</span>–<span style="color: #cc66cc;">1899</span>     <span style="color: #cc66cc;">2</span>
<span style="color: #cc66cc;">1900</span>–<span style="color: #cc66cc;">1999</span>     <span style="color: #cc66cc;">0</span>
<span style="color: #cc66cc;">2000</span>–<span style="color: #cc66cc;">2099</span>     <span style="color: #cc66cc;">6</span>
<span style="color: #cc66cc;">2100</span>–<span style="color: #cc66cc;">2199</span>     <span style="color: #cc66cc;">4</span>
<span style="color: #cc66cc;">2200</span>–<span style="color: #cc66cc;">2299</span>     <span style="color: #cc66cc;">2</span>
<span style="color: #cc66cc;">2300</span>–<span style="color: #cc66cc;">2399</span>     <span style="color: #cc66cc;">0</span>
<span style="color: #cc66cc;">2400</span>–<span style="color: #cc66cc;">2499</span>     <span style="color: #cc66cc;">6</span>
<span style="color: #cc66cc;">2500</span>–<span style="color: #cc66cc;">2599</span>     <span style="color: #cc66cc;">4</span>
<span style="color: #cc66cc;">2600</span>–<span style="color: #cc66cc;">2699</span>     <span style="color: #cc66cc;">2</span></pre></div></div>

<p>You can clearly see the repeating pattern “6 4 2 0”. It repeats because the Gregorian calendar repeats itself every 400 years.</p>
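<p>Because the pattern is so regular, the rows of the table can also be reproduced by a small formula instead of being stored. The following is just a sketch: the function century_code_from_year() is a hypothetical name, and the formula is only meant to match the table for Gregorian years.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">function century_code_from_year($year)
{
	// the first two digits of the year, e.g. 18 for 1883
	$century = (int) floor($year / 100);

	// 17 gives 4, 18 gives 2, 19 gives 0, 20 gives 6, then the pattern repeats
	return 2 * (3 - $century % 4);
}

// 2
echo century_code_from_year(1883);</pre></div></div>
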
<p>Here’s the month table.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">Month		Common  	Leap
January 	<span style="color: #cc66cc;">0</span>  		<span style="color: #cc66cc;">6</span>
February	<span style="color: #cc66cc;">3</span> 		<span style="color: #cc66cc;">2</span>
March		<span style="color: #cc66cc;">3</span>		<span style="color: #cc66cc;">3</span>
April		<span style="color: #cc66cc;">6</span>		<span style="color: #cc66cc;">6</span>
May		<span style="color: #cc66cc;">1</span>		<span style="color: #cc66cc;">1</span>
June		<span style="color: #cc66cc;">4</span>		<span style="color: #cc66cc;">4</span>
July		<span style="color: #cc66cc;">6</span>		<span style="color: #cc66cc;">6</span>
August  	<span style="color: #cc66cc;">2</span>		<span style="color: #cc66cc;">2</span>
September	<span style="color: #cc66cc;">5</span>		<span style="color: #cc66cc;">5</span>
October 	<span style="color: #cc66cc;">0</span>		<span style="color: #cc66cc;">0</span>
November	<span style="color: #cc66cc;">3</span>		<span style="color: #cc66cc;">3</span>
December	<span style="color: #cc66cc;">5</span>		<span style="color: #cc66cc;">5</span></pre></div></div>

<p>Columns 2 and 3 differ only for January and February.</p>
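<p>The numbers in the month table are not arbitrary. Each one is the number of days that precede the month in a common year, taken mod 7. The sketch below (with illustrative variable names) rebuilds both columns. For the leap column it assumes that January and February simply move back by one day, which compensates for the leap day that step 3 of the algorithm below already counts for the whole year.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// days in each month of a common year, January first
$monthLengths = array(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);

$commonCodes = array();
$daysBefore = 0;
foreach ($monthLengths as $length) {
	// weekday offset of the 1st of this month relative to the 1st of January
	$commonCodes[] = $daysBefore % 7;
	$daysBefore += $length;
}

// leap column: only January and February change, each moving back one day
$leapCodes = $commonCodes;
$leapCodes[0] = ($commonCodes[0] + 6) % 7;
$leapCodes[1] = ($commonCodes[1] + 6) % 7;

echo implode(' ', $commonCodes); // 0 3 3 6 1 4 6 2 5 0 3 5
echo "\n";
echo implode(' ', $leapCodes);   // 6 2 3 6 1 4 6 2 5 0 3 5</pre></div></div>
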
<p>Clearly the day table is as follows.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">Sunday  	<span style="color: #cc66cc;">0</span>
Monday  	<span style="color: #cc66cc;">1</span>
Tuesday 	<span style="color: #cc66cc;">2</span>
Wednesday	<span style="color: #cc66cc;">3</span>
Thursday	<span style="color: #cc66cc;">4</span>
Friday  	<span style="color: #cc66cc;">5</span>
Saturday	<span style="color: #cc66cc;">6</span></pre></div></div>

<p>Now let’s go back to the algorithm.</p>
<p>Using these tables and applying a simple formula we can calculate what day of the week a given date falls on. Here are the steps of this algorithm.</p>
<ol>
<li>Get the number for the corresponding century from the centuries table;</li>
<li>Get the last two digits of the year;</li>
<li>Divide the number from step 2 by 4 and drop the remainder;</li>
<li>Get the month number from the month table (use the leap column for January and February of a leap year);</li>
<li>Sum the numbers from steps 1 to 4 and add the day of the month;</li>
<li>Divide it by 7 and take the remainder;</li>
<li>Find the result of step 6 in the days table.</li>
</ol>
<h2>Implementation</h2>
<p>First let&#8217;s take a look at a simple practical example of the algorithm above and then at the code. Let’s answer the question from the first paragraph of this post.</p>
<p>What day was January 31st, 1883?</p>
<ol>
<li>Take a look at the centuries table: for 1800 &#8211; 1899 this is 2.</li>
<li>Get the last two digits of the year: 83.</li>
<li>Divide 83 by 4 and drop the remainder: 83/4 = 20.</li>
<li>Get the month number from the month table: 1883 is a common year, so Jan = 0.</li>
<li>Sum the numbers from steps 1 to 4 and add the day of the month: 2 + 83 + 20 + 0 + 31 = 136.</li>
<li>Divide it by 7 and take the remainder: 136 mod 7 = 3.</li>
<li>Find the result of step 6 in the days table: Wednesday = 3.</li>
</ol>
<p>The following PHP code implements the algorithm above.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> get_century_code<span style="color: #009900;">&#40;</span><span style="color: #000088;">$century</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #666666; font-style: italic;">// XVIII</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1700</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">1799</span><span style="color: #009900;">&#41;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #cc66cc;">4</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// XIX</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1800</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">1899</span><span style="color: #009900;">&#41;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// XX</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1900</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">1999</span><span style="color: #009900;">&#41;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// XXI</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">2000</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">2099</span><span style="color: #009900;">&#41;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #cc66cc;">6</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// XXII</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">2100</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">2199</span><span style="color: #009900;">&#41;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #cc66cc;">4</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// XXIII</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">2200</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">2299</span><span style="color: #009900;">&#41;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// XXIV</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">2300</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">2399</span><span style="color: #009900;">&#41;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// XXV</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">2400</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">2499</span><span style="color: #009900;">&#41;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #cc66cc;">6</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// XXVI</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">2500</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">2599</span><span style="color: #009900;">&#41;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #cc66cc;">4</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// XXVII</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">2600</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$century</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">2699</span><span style="color: #009900;">&#41;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
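<span style="color: #009933; font-style: italic;">/**
 * Get the day of the week for a given date
 * 
 * @param string $date date in 'd-m-Y' format, e.g. '31-1-1883'
 * @return string name of the day of the week
 */</span>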
<span style="color: #000000; font-weight: bold;">function</span> get_day_from_date<span style="color: #009900;">&#40;</span><span style="color: #000088;">$date</span><span style="color: #009900;">&#41;</span> 
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$months</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span>
		<span style="color: #cc66cc;">1</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span>		<span style="color: #666666; font-style: italic;">// January</span>
		<span style="color: #cc66cc;">2</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">3</span><span style="color: #339933;">,</span>		<span style="color: #666666; font-style: italic;">// February</span>
		<span style="color: #cc66cc;">3</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">3</span><span style="color: #339933;">,</span>		<span style="color: #666666; font-style: italic;">// March</span>
		<span style="color: #cc66cc;">4</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">6</span><span style="color: #339933;">,</span>		<span style="color: #666666; font-style: italic;">// April</span>
		<span style="color: #cc66cc;">5</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span>		<span style="color: #666666; font-style: italic;">// May</span>
		<span style="color: #cc66cc;">6</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">4</span><span style="color: #339933;">,</span>		<span style="color: #666666; font-style: italic;">// June</span>
		<span style="color: #cc66cc;">7</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">6</span><span style="color: #339933;">,</span>		<span style="color: #666666; font-style: italic;">// July</span>
		<span style="color: #cc66cc;">8</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span>		<span style="color: #666666; font-style: italic;">// August</span>
		<span style="color: #cc66cc;">9</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">5</span><span style="color: #339933;">,</span>		<span style="color: #666666; font-style: italic;">// September</span>
		<span style="color: #cc66cc;">10</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span>	<span style="color: #666666; font-style: italic;">// October</span>
		<span style="color: #cc66cc;">11</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">3</span><span style="color: #339933;">,</span>	<span style="color: #666666; font-style: italic;">// November</span>
		<span style="color: #cc66cc;">12</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">5</span><span style="color: #339933;">,</span>	<span style="color: #666666; font-style: italic;">// December</span>
	<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #000088;">$days</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span>
		<span style="color: #cc66cc;">0</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'Sunday'</span><span style="color: #339933;">,</span>
		<span style="color: #cc66cc;">1</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'Monday'</span><span style="color: #339933;">,</span>
		<span style="color: #cc66cc;">2</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'Tuesday'</span><span style="color: #339933;">,</span>
		<span style="color: #cc66cc;">3</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'Wednesday'</span><span style="color: #339933;">,</span>
		<span style="color: #cc66cc;">4</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'Thursday'</span><span style="color: #339933;">,</span>
		<span style="color: #cc66cc;">5</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'Friday'</span><span style="color: #339933;">,</span>
		<span style="color: #cc66cc;">6</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">'Saturday'</span><span style="color: #339933;">,</span>
	<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// split the 'd-m-Y' date into day, month and full year</span>
	<span style="color: #000088;">$dateParts</span> <span style="color: #339933;">=</span> <span style="color: #990000;">explode</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'-'</span><span style="color: #339933;">,</span> <span style="color: #000088;">$date</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$day</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>int<span style="color: #009900;">&#41;</span> <span style="color: #000088;">$dateParts</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$month</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>int<span style="color: #009900;">&#41;</span> <span style="color: #000088;">$dateParts</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$fullYear</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>int<span style="color: #009900;">&#41;</span> <span style="color: #000088;">$dateParts</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$year</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$fullYear</span> <span style="color: #339933;">%</span> <span style="color: #cc66cc;">100</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// Gregorian leap year rule</span>
	<span style="color: #000088;">$isLeap</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$fullYear</span> <span style="color: #339933;">%</span> <span style="color: #cc66cc;">4</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">0</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$fullYear</span> <span style="color: #339933;">%</span> <span style="color: #cc66cc;">100</span> <span style="color: #339933;">!=</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">||</span> <span style="color: #000088;">$fullYear</span> <span style="color: #339933;">%</span> <span style="color: #cc66cc;">400</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// 1. Get the number for the corresponding century from the centuries table</span>
	<span style="color: #000088;">$a</span> <span style="color: #339933;">=</span> get_century_code<span style="color: #009900;">&#40;</span><span style="color: #000088;">$fullYear</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// 2. Get the last two digits from the year</span>
	<span style="color: #000088;">$b</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$year</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// 3. Divide the number from step 2 by 4 and get it without the remainder</span>
	<span style="color: #000088;">$c</span> <span style="color: #339933;">=</span> <span style="color: #990000;">floor</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$year</span> <span style="color: #339933;">/</span> <span style="color: #cc66cc;">4</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// 4. Get the month number from the month table (leap column for January and February of a leap year)</span>
	<span style="color: #000088;">$d</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$months</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$month</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$isLeap</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$month</span> <span style="color: #339933;">&lt;=</span> <span style="color: #cc66cc;">2</span><span style="color: #009900;">&#41;</span>
		<span style="color: #000088;">$d</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$d</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">6</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">%</span> <span style="color: #cc66cc;">7</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// 5. Sum the numbers from steps 1 to 4 and add the day of the month</span>
	<span style="color: #000088;">$e</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$a</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$b</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$c</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$d</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$day</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// 6. Divide it by 7 and take the remainder</span>
	<span style="color: #000088;">$f</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$e</span> <span style="color: #339933;">%</span> <span style="color: #cc66cc;">7</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// 7. Find the result of step 6 in the days table</span>
	<span style="color: #b1b100;">return</span> <span style="color: #000088;">$days</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$f</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// Wednesday</span>
<span style="color: #b1b100;">echo</span> get_day_from_date<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'31-1-1883'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<h2>Application</h2>
<p>This algorithm can be applied in many different cases, although most libraries have built-in functions that do the same. Another drawback is that there are much more efficient algorithms that don&#8217;t need additional space (tables) of data. However, this algorithm isn&#8217;t difficult to implement and it gives good insight into some facts about the calendar.</p>
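<p>As an illustration of such a built-in, here is a minimal sketch that uses PHP&#8217;s <code>DateTime</code> class. The format string <code>'j-n-Y'</code> is an assumption that matches the day-month-year strings used above, and the sample date is chosen only for the example.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
&nbsp;
// Parse a day-month-year string (assumed format) and ask PHP for the weekday name.
$date = DateTime::createFromFormat('j-n-Y', '15-5-2012');
&nbsp;
// 'l' is the format character for the full English day name.
echo $date-&gt;format('l'); // Tuesday</pre></div></div>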
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/05/08/computer-algorithms-determine-if-a-number-is-prime/' rel='bookmark' title='Computer Algorithms: Determine if a Number is Prime'>Computer Algorithms: Determine if a Number is Prime</a></li>
<li><a href='http://www.stoimen.com/blog/2012/01/02/computer-algorithms-interpolation-search/' rel='bookmark' title='Computer Algorithms: Interpolation Search'>Computer Algorithms: Interpolation Search</a></li>
<li><a href='http://www.stoimen.com/blog/2012/02/06/computer-algorithms-data-compression-with-prefix-encoding/' rel='bookmark' title='Computer Algorithms: Data Compression with Prefix Encoding'>Computer Algorithms: Data Compression with Prefix Encoding</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/04/24/computer-algorithms-how-to-determine-the-day-of-the-week/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Computer Algorithms: Boyer-Moore String Searching</title>
		<link>http://www.stoimen.com/blog/2012/04/17/computer-algorithms-boyer-moore-string-searching/</link>
		<comments>http://www.stoimen.com/blog/2012/04/17/computer-algorithms-boyer-moore-string-searching/#comments</comments>
		<pubDate>Tue, 17 Apr 2012 08:24:46 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[Boyer–Moore string search algorithm]]></category>
		<category><![CDATA[Boyer–Moore–Horspool algorithm]]></category>
		<category><![CDATA[Computer programming]]></category>
		<category><![CDATA[Computer science]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[J Strother Moore]]></category>
		<category><![CDATA[Morris-Pratt algorithm]]></category>
		<category><![CDATA[natural language search]]></category>
		<category><![CDATA[pattern forward]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Rabin-Karp algorithm]]></category>
		<category><![CDATA[Rabin-Karp string search algorithm]]></category>
		<category><![CDATA[Robert S. Boyer]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[search algorithms]]></category>
		<category><![CDATA[String searching algorithm]]></category>
		<category><![CDATA[string searching algorithms]]></category>
		<category><![CDATA[Strlen]]></category>
		<category><![CDATA[Substring]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=3049</guid>
		<description><![CDATA[Introduction Have you ever asked yourself which is the algorithm used to find a word after clicking Ctrl+F and typing something? Well I guess you know the answer from the title, but in this article you’ll find out how exactly &#8230; <a href="http://www.stoimen.com/blog/2012/04/17/computer-algorithms-boyer-moore-string-searching/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/04/09/computer-algorithms-morris-pratt-string-searching/' rel='bookmark' title='Computer Algorithms: Morris-Pratt String Searching'>Computer Algorithms: Morris-Pratt String Searching</a></li>
<li><a href='http://www.stoimen.com/blog/2012/04/02/computer-algorithms-rabin-karp-string-searching/' rel='bookmark' title='Computer Algorithms: Rabin-Karp String Searching'>Computer Algorithms: Rabin-Karp String Searching</a></li>
<li><a href='http://www.stoimen.com/blog/2012/03/27/computer-algorithms-brute-force-string-matching/' rel='bookmark' title='Computer Algorithms: Brute Force String Matching'>Computer Algorithms: Brute Force String Matching</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>Introduction</h2>
<p>Have you ever asked yourself which algorithm is used to find a word after pressing Ctrl+F and typing something? Well, I guess you know the answer from the title, but in this article you’ll find out exactly how it is done.</p>
<p>As we saw with <a href="http://www.stoimen.com/blog/2012/04/09/computer-algorithms-morris-pratt-string-searching/" title="Computer Algorithms: Morris-Pratt String Searching">Morris-Pratt string searching</a>, we don’t need to compare the text and the pattern character by character at every position. Some comparisons can be skipped in order to improve the performance of the string searching. Indeed, <a href="http://www.stoimen.com/blog/2012/03/27/computer-algorithms-brute-force-string-matching/" title="Computer Algorithms: Brute Force String Matching">brute force string searching</a> and the <a href="http://www.stoimen.com/blog/2012/04/02/computer-algorithms-rabin-karp-string-searching/" title="Computer Algorithms: Rabin-Karp String Searching">Rabin-Karp algorithm</a> are quite slow precisely because they compare the pattern and the text character by character.</p>
<p>On the other hand, the Morris-Pratt algorithm is a very good improvement over brute force string searching, but the question remains: is there an algorithm that is faster than Morris-Pratt? Is there a way to skip even more comparisons and move the pattern faster?</p>
<p>It’s clear that if we have to find whether a single character is contained in a text we need &#8220;n&#8221; steps in the worst case, where n is the length of the text. Once we have to find whether a pattern of length &#8220;m&#8221; is contained in a text of length &#8220;n&#8221;, the case gets a little more complex.</p>
<p>The answer is that there is such an algorithm, one that is faster and more suitable than Morris-Pratt. This is Boyer-Moore string searching.</p>
<h2>Overview</h2>
<p>Boyer-Moore is an algorithm that improves the performance of pattern searching in a text by exploiting a few observations about the pattern. It was published in 1977 by <a href="http://en.wikipedia.org/wiki/Robert_S._Boyer" title="Robert S. Boyer" target="_blank">Robert S. Boyer</a> and <a href="http://en.wikipedia.org/wiki/J_Strother_Moore" title="J Strother Moore" target="_blank">J Strother Moore</a> and it has some specific features.</p>
<p>First of all, the algorithm starts by aligning the pattern with the leftmost part of the text and then moves it to the right, as in the picture below.</p>
<p><div id="attachment_3059" class="wp-caption alignnone" style="width: 628px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreShiftingDirection.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreShiftingDirection.png" alt="Boyer-Moore Shifting Direction" title="Boyer-Moore Shifting Direction" width="618" height="153" class="size-full wp-image-3059" /></a><p class="wp-caption-text">In Boyer-Moore the pattern is shifted from left to right!</p></div><span id="more-3049"></span></p>
<p>Unlike other string searching algorithms though, Boyer-Moore compares the pattern against a possible match from right to left as shown below.</p>
<div id="attachment_3066" class="wp-caption alignnone" style="width: 632px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreComparisonModel.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreComparisonModel.png" alt="Boyer-Moore Comparison Model" title="Boyer-Moore Comparison Model" width="622" height="169" class="size-full wp-image-3066" /></a><p class="wp-caption-text">Unlike other algorithms the letters of the pattern are compared from right to left!</p></div>
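<p>The right-to-left comparison of a single alignment can be sketched as follows. The helper name <code>mismatch_index</code> and its return convention (-1 for a full match) are assumptions made for this example, and the caller is assumed to keep the alignment inside the text.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
&nbsp;
// Hypothetical helper: compares $pattern with $text at alignment $j,
// scanning from the last pattern character towards the first.
// Returns the pattern index of the first mismatch, or -1 on a full match.
function mismatch_index($pattern, $text, $j)
{
   $i = strlen($pattern) - 1;
&nbsp;
   while ($i &gt;= 0 &amp;&amp; $pattern[$i] == $text[$i + $j]) {
      $i--;
   }
&nbsp;
   return $i;
}
&nbsp;
echo mismatch_index('gloria', 'Sic transit gloria mundi', 12), "\n"; // -1
echo mismatch_index('gloria', 'Sic transit gloria mundi', 0), "\n";  // 5</pre></div></div>
<p>At alignment 0 the very first comparison, &#8220;a&#8221; against &#8220;r&#8221;, already fails, so a single comparison is enough to rule out that position.</p>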
<p>Boyer-Moore improves performance by relying on some observations about the pattern. In the terminology of this algorithm they are called good-suffix and bad-character shifts. Let’s see what they stand for in the following examples.</p>
<h3>Good-suffix Shifts</h3>
<p>Just like in the Morris-Pratt algorithm, we start by comparing the pattern against some portion of the text where a possible match may occur. In Boyer-Moore, as I said, this is done starting from the rightmost letter of the pattern. After some characters have matched we find a mismatch.</p>
<div id="attachment_3065" class="wp-caption alignnone" style="width: 632px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreAMismatch.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreAMismatch.png" alt="Boyer-Moore a Mismatch" title="Boyer-Moore a Mismatch" width="622" height="214" class="size-full wp-image-3065" /></a><p class="wp-caption-text"> </p></div>
<p>So how can we move the pattern to the right in order to skip unnecessary comparisons? To answer this question we need to explore the pattern. Let’s say there is a portion of the pattern that is repeated inside the pattern itself, as shown in the picture below.</p>
<div id="attachment_3062" class="wp-caption alignnone" style="width: 632px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreGood-suffixShift1.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreGood-suffixShift1.png" alt="Boyer-Moore Good-suffix Shift 1" title="Boyer-Moore Good-suffix Shift 1" width="622" height="195" class="size-full wp-image-3062" /></a><p class="wp-caption-text">The pattern may consist of repeating portions of characters!</p></div>
<p>In this case we shift the pattern so that the earlier occurrence of the repeated portion lines up with the text that the suffix has just matched.</p>
<p>A variation of this case is when the matched portion A overlaps with another portion of the pattern that consists of the same characters.</p>
<div id="attachment_3061" class="wp-caption alignnone" style="width: 632px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreGoodSuffixShift2.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreGoodSuffixShift2.png" alt="Boyer-Moore Good Suffix Shift 2" title="Boyer-Moore Good Suffix Shift 2" width="622" height="195" class="size-full wp-image-3061" /></a><p class="wp-caption-text">Sometimes these portions may overlap!</p></div>
<p>Yet again, the shift must align the earlier occurrence with the text that has already matched.</p>
<p>Finally, it can happen that only a part of A, let’s say &#8220;B&#8221;, occurs at the very beginning of the pattern, as in the diagram below.</p>
<div id="attachment_3060" class="wp-caption alignnone" style="width: 632px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreGoodSuffixShift3.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreGoodSuffixShift3.png" alt="Boyer-Moore Good Suffix Shift 3" title="Boyer-Moore Good Suffix Shift 3" width="622" height="195" class="size-full wp-image-3060" /></a><p class="wp-caption-text">Only a sub-string of the pattern may re-occur at its front!</p></div>
<p>Now we must align the beginning of the pattern with the occurrence of &#8220;B&#8221; at the right end of the matched portion.</p>
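<p>The three cases above can be captured by one brute-force rule: take the smallest shift for which every pattern character that lands under the already matched suffix agrees with it. The sketch below is only an illustration of that rule. The function name <code>naive_good_suffix_shift</code> is made up, it is not the table-based routine used in the Implementation section, and it skips the usual refinement that the character before the re-occurrence should differ from the mismatched one.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
&nbsp;
// Hypothetical helper: $i is the pattern index of the mismatch, so
// $pattern[$i + 1 .. $m - 1] is the suffix that has already matched.
function naive_good_suffix_shift($pattern, $i)
{
   $m = strlen($pattern);
&nbsp;
   // Try every shift, smallest first.
   for ($s = 1; $s &lt; $m; $s++) {
      $fits = true;
      for ($k = $i + 1; $k &lt; $m; $k++) {
         // After the shift, $pattern[$k - $s] lands under matched position $k.
         // Positions that fall left of the pattern impose no constraint.
         if ($k - $s &gt;= 0 &amp;&amp; $pattern[$k - $s] != $pattern[$k]) {
            $fits = false;
            break;
         }
      }
      if ($fits) {
         return $s;
      }
   }
&nbsp;
   // Nothing re-occurs: move the whole pattern past the current window.
   return $m;
}
&nbsp;
echo naive_good_suffix_shift('ANPANMAN', 5), "\n"; // 3
echo naive_good_suffix_shift('ANPANMAN', 4), "\n"; // 6
echo naive_good_suffix_shift('gloria', 2), "\n";   // 6</pre></div></div>
<p>In the first call the matched suffix &#8220;AN&#8221; re-occurs inside the pattern, so the shift is 3. In the second call only the prefix &#8220;AN&#8221; matches the end of the matched suffix &#8220;MAN&#8221;, so the shift is 6. In &#8220;gloria&#8221; nothing repeats, so the whole pattern is moved.</p>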
<h3>Bad Character Shifts</h3>
<p>Besides the good-suffix shifts, the Boyer-Moore algorithm makes use of the so-called bad-character shifts. After a mismatch we can skip comparisons if the mismatched character of the text doesn’t appear in the pattern, or appears only further to the left. To make this clearer, let’s see the following examples.</p>
<div id="attachment_3064" class="wp-caption alignnone" style="width: 633px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreBadCharacter1.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreBadCharacter1.png" alt="Boyer-Moore Bad Character 1" title="Boyer-Moore Bad Character 1" width="623" height="237" class="size-full wp-image-3064" /></a><p class="wp-caption-text">If the mismatched letter of the text appears in the pattern only in its front we can align it easily!</p></div>
<p>In the picture above we see that the mismatched character &#8220;B&#8221; from the text appears only at the beginning of the pattern. Thus we can simply shift the pattern to the right and align both B characters, skipping comparisons. An even better case is described by the following diagram, where the mismatched letter isn’t contained in the pattern at all. Then we can shift the whole pattern forward.</p>
<div id="attachment_3063" class="wp-caption alignnone" style="width: 633px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreBadCharacter2.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreBadCharacter2.png" alt="Boyer-Moore Bad Character 2" title="Boyer-Moore Bad Character 2" width="623" height="237" class="size-full wp-image-3063" /></a><p class="wp-caption-text">In case the mismatched letter isn&#039;t contained into the pattern we move forward the pattern!</p></div>
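<p>A minimal sketch of the bad-character rule, assuming we simply look up the rightmost occurrence of the mismatched text character in the pattern. The function name <code>bad_character_shift</code> is made up for this example, and a real implementation would read the value from a precomputed table instead of calling <code>strrpos</code> every time.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
&nbsp;
// Hypothetical helper: how far to shift when text character $c
// mismatches the pattern character at index $i.
function bad_character_shift($pattern, $i, $c)
{
   // Rightmost occurrence of $c in the pattern, or false if there is none.
   $last = strrpos($pattern, $c);
&nbsp;
   if ($last === false) {
      // $c is not in the pattern at all: jump completely past it.
      return $i + 1;
   }
&nbsp;
   // Align the two characters, but always move at least one position.
   return max(1, $i - $last);
}
&nbsp;
echo bad_character_shift('gloria', 5, 'g'), "\n"; // 5
echo bad_character_shift('gloria', 5, 'z'), "\n"; // 6
echo bad_character_shift('gloria', 2, 'a'), "\n"; // 1</pre></div></div>
<p>The last call shows the weak spot of this rule: when the character occurs only to the right of the mismatch, it can suggest no more than a shift of one.</p>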
<h3>Maximum of Good-suffix and Bad-Character shifts</h3>
<p>Boyer-Moore needs both good-suffix and bad-character shifts in order to speed up searching. After a mismatch both shifts are computed and the pattern is moved to the right by the larger of the two.</p>
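<p>To see how the two rules interact, here is a small, unoptimized sketch of the search loop. All names (<code>sketch_gs</code>, <code>sketch_bc</code>, <code>sketch_search</code>) are made up for this example, and both shifts are recomputed on every mismatch instead of being read from precomputed tables, so it illustrates the rule rather than the speed.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
&nbsp;
// Hypothetical helper: smallest shift that keeps the matched suffix
// $pattern[$i + 1 .. $m - 1] consistent with the shifted pattern.
function sketch_gs($pattern, $i)
{
   $m = strlen($pattern);
   for ($s = 1; $s &lt; $m; $s++) {
      $ok = true;
      for ($k = $i + 1; $k &lt; $m &amp;&amp; $ok; $k++) {
         $ok = ($k - $s &lt; 0) || ($pattern[$k - $s] == $pattern[$k]);
      }
      if ($ok) {
         return $s;
      }
   }
   return $m;
}
&nbsp;
// Hypothetical helper: shift that aligns the rightmost occurrence of $c
// in the pattern with the mismatched text character (can be zero or negative).
function sketch_bc($pattern, $i, $c)
{
   $last = strrpos($pattern, $c);
   return $i - ($last === false ? -1 : $last);
}
&nbsp;
// Returns the start positions of all occurrences of $pattern in $text.
function sketch_search($pattern, $text)
{
   $n = strlen($text);
   $m = strlen($pattern);
   $found = array();
&nbsp;
   $j = 0;
   while ($j &lt;= $n - $m) {
      // Compare from right to left.
      $i = $m - 1;
      while ($i &gt;= 0 &amp;&amp; $pattern[$i] == $text[$i + $j]) {
         $i--;
      }
&nbsp;
      if ($i &lt; 0) {
         $found[] = $j;
         $j += 1; // conservative shift after a full match
      } else {
         // The good-suffix shift is always at least 1, so the maximum is too.
         $j += max(sketch_gs($pattern, $i), sketch_bc($pattern, $i, $text[$i + $j]));
      }
   }
&nbsp;
   return $found;
}
&nbsp;
echo implode(', ', sketch_search('gloria', 'Sic transit gloria mundi, non transit gloria Gundi!')); // 12, 38</pre></div></div>
<p>Taking the maximum is safe because each rule on its own never skips a possible match, so the larger of the two doesn&#8217;t either.</p>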
<h2>Complexity</h2>
<p>Boyer-Moore is usually faster than Morris-Pratt in practice, although its worst-case complexity, O(n+m), is no better. Its real strength is natural language search, where Boyer-Moore does pretty well.</p>
<div id="attachment_3073" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-Moore-Complexity.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-Moore-Complexity.png" alt="Boyer-Moore Complexity" title="Boyer-Moore Complexity" width="600" height="371" class="size-full wp-image-3073" /></a><p class="wp-caption-text">Worst-case scenario of Boyer-Moore - O(m+n)</p></div>
<h2>Implementation</h2>
<p>Finally, let’s see the implementation in <a href="http://www.stoimen.com/blog/category/php/" title="PHP on stoimen.com">PHP</a>, which can be easily &#8220;transcribed&#8221; into any other programming language. The only things we need are the structures for the bad-character and good-suffix shifts.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">&lt;?php</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/**
 * Pattern we're searching for
 *
 * @var string
 */</span>
<span style="color: #000088;">$pattern</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">'gloria'</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/**
 * The text we're searching in
 *
 * @var string
 */</span>
<span style="color: #000088;">$text</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">'Sic transit gloria mundi, non transit gloria Gundi!'</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/**
 * Calculates the suffixes for a given pattern
 *
 * @param string $pattern
 * @param array  $suffixes
 */</span>
<span style="color: #000000; font-weight: bold;">function</span> suffixes<span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$suffixes</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
   <span style="color: #000088;">$m</span> <span style="color: #339933;">=</span> <span style="color: #990000;">strlen</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
   <span style="color: #000088;">$suffixes</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$m</span><span style="color: #339933;">;</span>
   <span style="color: #000088;">$g</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
&nbsp;
   <span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&gt;=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #339933;">--</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&gt;</span> <span style="color: #000088;">$g</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$suffixes</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$f</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$g</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
         <span style="color: #000088;">$suffixes</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$suffixes</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$f</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> <span style="color: #009900;">&#123;</span>
         <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$g</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000088;">$g</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$i</span><span style="color: #339933;">;</span>
         <span style="color: #009900;">&#125;</span>
         <span style="color: #000088;">$f</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$i</span><span style="color: #339933;">;</span>
&nbsp;
         <span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$g</span> <span style="color: #339933;">&gt;=</span> <span style="color: #cc66cc;">0</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$pattern</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$g</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">==</span> <span style="color: #000088;">$pattern</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$g</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$f</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000088;">$g</span><span style="color: #339933;">--;</span>
         <span style="color: #009900;">&#125;</span>
         <span style="color: #000088;">$suffixes</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$f</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$g</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
   <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/**
 * Fills in the array of bad characters.
 *
 * @param string $pattern
 * @param array  $badChars
 */</span>
<span style="color: #000000; font-weight: bold;">function</span> badCharacters<span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$badChars</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
   <span style="color: #000088;">$m</span> <span style="color: #339933;">=</span> <span style="color: #990000;">strlen</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
   <span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span> <span style="color: #339933;">++</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000088;">$badChars</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$pattern</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
   <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/**
 * Fills in the array of good suffixes
 *
 * @param string $pattern
 * @param array  $goodSuffixes
 */</span>
<span style="color: #000000; font-weight: bold;">function</span> goodSuffixes<span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$goodSuffixes</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
   <span style="color: #000088;">$m</span> 		<span style="color: #339933;">=</span> <span style="color: #990000;">strlen</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
   <span style="color: #000088;">$suff</span> 	<span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
   suffixes<span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #339933;">,</span> <span style="color: #000088;">$suff</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
   <span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$m</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000088;">$goodSuffixes</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$m</span><span style="color: #339933;">;</span>
   <span style="color: #009900;">&#125;</span>
&nbsp;
   <span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&gt;=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">--</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$suff</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">==</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
         <span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$j</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$j</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span> <span style="color: #000088;">$j</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$goodSuffixes</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$j</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">==</span> <span style="color: #000088;">$m</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
               <span style="color: #000088;">$goodSuffixes</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$j</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span>
         <span style="color: #009900;">&#125;</span>
      <span style="color: #009900;">&#125;</span>
   <span style="color: #009900;">&#125;</span>
&nbsp;
   <span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000088;">$goodSuffixes</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$suff</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
   <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/**
 * Performs a search for the pattern in a given text
 *
 * @param string $pattern
 * @param string $text
 */</span>
<span style="color: #000000; font-weight: bold;">function</span> boyer_moore<span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #339933;">,</span> <span style="color: #000088;">$text</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
   <span style="color: #000088;">$n</span> <span style="color: #339933;">=</span> <span style="color: #990000;">strlen</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$text</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
   <span style="color: #000088;">$m</span> <span style="color: #339933;">=</span> <span style="color: #990000;">strlen</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
   <span style="color: #000088;">$goodSuffixes</span> 	<span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
   <span style="color: #000088;">$badCharacters</span> 	<span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
   goodSuffixes<span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$goodSuffixes</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
   badCharacters<span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$badCharacters</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
   <span style="color: #000088;">$j</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
   <span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$j</span> <span style="color: #339933;">&lt;=</span> <span style="color: #000088;">$n</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$m</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$m</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&gt;=</span> <span style="color: #cc66cc;">0</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$pattern</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">==</span> <span style="color: #000088;">$text</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$j</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">--</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
         <span style="color: #666666; font-style: italic;">// note that if the substring occurs more</span>
         <span style="color: #666666; font-style: italic;">// than once in the text, the algorithm will</span>
         <span style="color: #666666; font-style: italic;">// print out each position of the substring</span>
         <span style="color: #b1b100;">echo</span> <span style="color: #000088;">$j</span><span style="color: #339933;">;</span>
         <span style="color: #000088;">$j</span> <span style="color: #339933;">+=</span> <span style="color: #000088;">$goodSuffixes</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">else</span> <span style="color: #009900;">&#123;</span>
         <span style="color: #000088;">$j</span> <span style="color: #339933;">+=</span> <span style="color: #990000;">max</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$goodSuffixes</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span> <span style="color: #000088;">$badCharacters</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$text</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$j</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$m</span> <span style="color: #339933;">+</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
   <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// search using Boyer-Moore</span>
<span style="color: #666666; font-style: italic;">// will print 12 and 38</span>
boyer_moore<span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #339933;">,</span> <span style="color: #000088;">$text</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<h2>Application</h2>
<p>Boyer-Moore is one of the most widely used string searching algorithms in practice. It is intuitively clear where it can be useful, so I’ll only say that it is considered the most commonly used algorithm for search and replace operations in text editors.</p>
<div id="attachment_3068" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreApplication.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Boyer-MooreApplication.png" alt="Boyer-Moore Application" title="Boyer-Moore Application" width="620" height="399" class="size-full wp-image-3068" /></a><p class="wp-caption-text"> </p></div>
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/04/09/computer-algorithms-morris-pratt-string-searching/' rel='bookmark' title='Computer Algorithms: Morris-Pratt String Searching'>Computer Algorithms: Morris-Pratt String Searching</a></li>
<li><a href='http://www.stoimen.com/blog/2012/04/02/computer-algorithms-rabin-karp-string-searching/' rel='bookmark' title='Computer Algorithms: Rabin-Karp String Searching'>Computer Algorithms: Rabin-Karp String Searching</a></li>
<li><a href='http://www.stoimen.com/blog/2012/03/27/computer-algorithms-brute-force-string-matching/' rel='bookmark' title='Computer Algorithms: Brute Force String Matching'>Computer Algorithms: Brute Force String Matching</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/04/17/computer-algorithms-boyer-moore-string-searching/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>You think you know &#8230; algorithms?</title>
		<link>http://www.stoimen.com/blog/2012/04/11/you-think-you-know-algorithms/</link>
		<comments>http://www.stoimen.com/blog/2012/04/11/you-think-you-know-algorithms/#comments</comments>
		<pubDate>Wed, 11 Apr 2012 08:04:34 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[quiz]]></category>
		<category><![CDATA[algorithms]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[Super Quiz]]></category>
		<category><![CDATA[weekly quiz]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=3021</guid>
		<description><![CDATA[Note!– By reading this blog you can get the right answers! Loading&#8230; The answers will be published a week after the quiz has started! Related posts: You think you know &#8230; javascript? You think you know &#8230; PHP? You think &#8230; <a href="http://www.stoimen.com/blog/2012/04/11/you-think-you-know-algorithms/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/02/29/you-think-you-know-javascript/' rel='bookmark' title='You think you know &#8230; javascript?'>You think you know &#8230; javascript?</a></li>
<li><a href='http://www.stoimen.com/blog/2012/03/09/you-think-you-know-php/' rel='bookmark' title='You think you know &#8230; PHP?'>You think you know &#8230; PHP?</a></li>
<li><a href='http://www.stoimen.com/blog/2012/03/07/you-think-you-know-javascript-quiz-results/' rel='bookmark' title='You think you know javascript. Quiz results!'>You think you know javascript. Quiz results!</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>Note!– By reading this blog you can get the right answers!</p>
<p><iframe src="https://docs.google.com/spreadsheet/embeddedform?formkey=dHh3NkJiOTJYYkVNem9RVTNVWVh0V0E6MQ" width="640" height="1000" frameborder="0" marginheight="0" marginwidth="0">Loading&#8230;</iframe></p>
<p>The answers will be published a week after the quiz has started!</p>
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/02/29/you-think-you-know-javascript/' rel='bookmark' title='You think you know &#8230; javascript?'>You think you know &#8230; javascript?</a></li>
<li><a href='http://www.stoimen.com/blog/2012/03/09/you-think-you-know-php/' rel='bookmark' title='You think you know &#8230; PHP?'>You think you know &#8230; PHP?</a></li>
<li><a href='http://www.stoimen.com/blog/2012/03/07/you-think-you-know-javascript-quiz-results/' rel='bookmark' title='You think you know javascript. Quiz results!'>You think you know javascript. Quiz results!</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/04/11/you-think-you-know-algorithms/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Computer Algorithms: Morris-Pratt String Searching</title>
		<link>http://www.stoimen.com/blog/2012/04/09/computer-algorithms-morris-pratt-string-searching/</link>
		<comments>http://www.stoimen.com/blog/2012/04/09/computer-algorithms-morris-pratt-string-searching/#comments</comments>
		<pubDate>Mon, 09 Apr 2012 19:41:05 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[Applied mathematics]]></category>
		<category><![CDATA[Boyer–Moore string search algorithm]]></category>
		<category><![CDATA[Brute-force search]]></category>
		<category><![CDATA[Complexity This algorithm]]></category>
		<category><![CDATA[Computer programming]]></category>
		<category><![CDATA[Computer science]]></category>
		<category><![CDATA[faster string searching algorithms]]></category>
		<category><![CDATA[James H. Morris]]></category>
		<category><![CDATA[Knuth–Morris–Pratt algorithm]]></category>
		<category><![CDATA[Lorem ipsum]]></category>
		<category><![CDATA[Morris-Pratt algorithm]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Rabin]]></category>
		<category><![CDATA[Rabin-Karp algorithm]]></category>
		<category><![CDATA[Rabin-Karp string search algorithm]]></category>
		<category><![CDATA[search algorithms]]></category>
		<category><![CDATA[String searching algorithm]]></category>
		<category><![CDATA[Vaughan Pratt]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=3019</guid>
		<description><![CDATA[Introduction We saw that neither brute force string searching nor Rabin-Karp string searching is effective. However, in order to improve an algorithm, we first need to understand its principles in detail. We already know that brute force string matching is &#8230; <a href="http://www.stoimen.com/blog/2012/04/09/computer-algorithms-morris-pratt-string-searching/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/04/17/computer-algorithms-boyer-moore-string-searching/' rel='bookmark' title='Computer Algorithms: Boyer-Moore String Searching'>Computer Algorithms: Boyer-Moore String Searching</a></li>
<li><a href='http://www.stoimen.com/blog/2012/04/02/computer-algorithms-rabin-karp-string-searching/' rel='bookmark' title='Computer Algorithms: Rabin-Karp String Searching'>Computer Algorithms: Rabin-Karp String Searching</a></li>
<li><a href='http://www.stoimen.com/blog/2012/03/27/computer-algorithms-brute-force-string-matching/' rel='bookmark' title='Computer Algorithms: Brute Force String Matching'>Computer Algorithms: Brute Force String Matching</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>Introduction</h2>
<p>We saw that neither <a href="http://www.stoimen.com/blog/2012/03/27/computer-algorithms-brute-force-string-matching/" title="Computer Algorithms: Brute Force String Matching">brute force string searching</a> nor <a href="http://www.stoimen.com/blog/2012/04/02/computer-algorithms-rabin-karp-string-searching/" title="Computer Algorithms: Rabin-Karp String Searching">Rabin-Karp string searching</a> is effective. However, in order to improve an algorithm, we first need to understand its principles in detail. We already know that brute force string matching is slow, and we tried to improve it by using a hash function in the Rabin-Karp algorithm. The problem is that in the worst case Rabin-Karp has the same complexity as brute force string matching, which is O(mn).</p>
<p>Obviously we need a different approach, but to come up with one let’s first see what’s wrong with brute force string searching. Taking a closer look at its principles answers the question. </p>
<p>In brute force matching we compared each character of the text with the first character of the pattern. In case of a match we moved on to compare the second character of the pattern with the next character of the text. The problem is that in case of a mismatch we must go several positions back in the text. Within brute force itself this step can’t be optimized away. </p>
<p><div id="attachment_3025" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Morris-Pratt-brute-force-string-matching.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Morris-Pratt-brute-force-string-matching.png" alt="Morris-Pratt brute force string matching" title="Morris-Pratt brute force string matching" width="620" height="360" class="size-full wp-image-3025" /></a><p class="wp-caption-text">In brute force string matching, in case of a mismatch we go back and compare characters that have already been compared!</p></div><span id="more-3019"></span></p>
<p>As you can see in the picture above, the problem is that once there is a mismatch we must roll back and start comparing from a position in the text that has already been explored. In our case we have checked the first, second, third and fourth letters, where there is a mismatch between the pattern and the text, and then &#8230; we go back and start comparing from the second letter of the text.</p>
<p>This is completely useless, because we already know that the pattern begins with the letter “a” and no such letter happens to be between positions 1 and 3. So how can we remove this redundancy?</p>
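<p>To put a number on that redundancy, here is a minimal sketch of brute force matching that also counts character comparisons. The function name bruteForceCount and the sample strings are assumptions made only for this illustration; they aren’t taken from the brute force post.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">/**
 * Brute force search that also counts character comparisons
 * (hypothetical helper, for illustration only)
 *
 * @param string $text
 * @param string $pattern
 * @return array position of the first match (or -1) and the number of comparisons
 */
function bruteForceCount($text, $pattern)
{
	$n = strlen($text);
	$m = strlen($pattern);
	$comparisons = 0;
&nbsp;
	// try every alignment of the pattern against the text
	for ($i = 0; $i &lt;= $n - $m; $i++) {
		$j = 0;
		while ($j &lt; $m) {
			$comparisons++;
			if ($text[$i + $j] != $pattern[$j]) {
				// mismatch: the next alignment re-reads text characters
				break;
			}
			$j++;
		}
		if ($j == $m) {
			return array($i, $comparisons);
		}
	}
	return array(-1, $comparisons);
}
&nbsp;
list($position, $comparisons) = bruteForceCount('aaaaaaab', 'aaab');
echo $position, ' ', $comparisons;</pre></div></div>

<p>For this input the sketch prints 4 20: the match is found at position 4, but only after 20 comparisons on a text of 8 characters, because every failed alignment reads the same letters again.</p>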
<h2>Overview</h2>
<p>The answer to this question came from <a href="http://en.wikipedia.org/wiki/James_H._Morris" title="James H. Morris" target="_blank">James H. Morris</a> and <a href="http://en.wikipedia.org/wiki/Vaughan_Pratt" title="Vaughan Pratt" target="_blank">Vaughan Pratt</a>, who described their algorithm in a 1970 technical report (the Knuth-Morris-Pratt refinement followed in print in 1977). By skipping lots of useless comparisons it is more efficient than brute force string matching. Let’s see it in detail. The key idea is to reuse the information gathered while comparing the pattern against a possible match, as in the picture below.</p>
<div id="attachment_3029" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Morris-Pratt-basic-principles.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Morris-Pratt-basic-principles.png" alt="Morris-Pratt basic principles" title="Morris-Pratt basic principles" width="620" height="483" class="size-full wp-image-3029" /></a><p class="wp-caption-text">Morris-Pratt skips some comparisons by moving ahead to the next possible position of a match!</p></div>
<p>To do that we first have to preprocess the pattern in order to get the possible positions for the next match. Then, when a mismatch occurs in the middle of a possible match, we’ll know exactly where to jump in order to skip useless comparisons.</p>
<h3>Generating the Table of Next Positions</h3>
<p>This is the tricky part in Morris-Pratt and that is how this algorithm overcomes the disadvantages of brute force string searching. Let&#8217;s see some pictures.</p>
<div id="attachment_3044" class="wp-caption alignnone" style="width: 633px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Morris-Pratt-with-no-repeating-letters-in-the-pattern.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Morris-Pratt-with-no-repeating-letters-in-the-pattern.png" alt="Morris-Pratt with no repeating letters in the pattern" title="Morris-Pratt with no repeating letters in the pattern" width="623" height="361" class="size-full wp-image-3044" /></a><p class="wp-caption-text">It is clear that if the pattern consists only of different letters in case of a mismatch we should start comparing the next character of the text with the first character of the pattern!</p></div>
<p>However, if a character repeats in the pattern and the mismatch comes after it, a possible match must begin from this repeating character, as in the picture below.</p>
<div id="attachment_3045" class="wp-caption alignnone" style="width: 632px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Morris-Pratt-with-one-repeating-letter-in-the-pattern.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Morris-Pratt-with-one-repeating-letter-in-the-pattern.png" alt="Morris-Pratt with one repeating letter in the pattern" title="Morris-Pratt with one repeating letter in the pattern" width="622" height="404" class="size-full wp-image-3045" /></a><p class="wp-caption-text">The next table is slightly different if the pattern has a repeating character! </p></div>
<p>Finally, if there is more than one repeating character in the pattern, the &#8220;next&#8221; table will show their positions.</p>
<div id="attachment_3046" class="wp-caption alignnone" style="width: 632px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Morris-Pratt-more-than-one-repeating-letter-in-the-pattern.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Morris-Pratt-more-than-one-repeating-letter-in-the-pattern.png" alt="Morris-Pratt more than one repeating letter in the pattern" title="Morris-Pratt more than one repeating letter in the pattern" width="622" height="461" class="size-full wp-image-3046" /></a><p class="wp-caption-text">The next table contains the positions of repeating letters!</p></div>
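<p>As a rough illustration of the three cases above, the sketch below builds the table for three sample patterns. The helper name buildNextTable and the patterns are made up for this example and are not the ones from the pictures. It follows the same convention as the implementation further down in this post: the first entry is -1 and the table has one entry more than the pattern has letters, so the exact numbers may differ from the ones drawn in the pictures.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">/**
 * Build the "next" table for a pattern
 * (hypothetical helper, for illustration only)
 *
 * @param string $pattern
 * @return array
 */
function buildNextTable($pattern)
{
	$m = strlen($pattern);
	$next = array(-1);
	$i = 0;
	$j = -1;
&nbsp;
	while ($i &lt; $m) {
		// fall back until the letters agree or there is nothing left to try
		while ($j &gt; -1 &amp;&amp; $pattern[$i] != $pattern[$j]) {
			$j = $next[$j];
		}
		$next[++$i] = ++$j;
	}
	return $next;
}
&nbsp;
foreach (array('abcd', 'abca', 'abab') as $pattern) {
	echo $pattern, ': ', implode(' ', buildNextTable($pattern)), "\n";
}
&nbsp;
// abcd: -1 0 0 0 0
// abca: -1 0 0 0 1
// abab: -1 0 0 1 2</pre></div></div>

<p>With no repeating letters every entry after the first is 0, which means "start again from the first letter of the pattern". Once letters repeat, the entries grow to tell us how much of the pattern is already known to match.</p>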
<p>After we have this table of possible “next” positions we can start exploring the text for our pattern.</p>
<div id="attachment_3027" class="wp-caption alignnone" style="width: 632px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Morris-Pratt.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Morris-Pratt.png" alt="Morris-Pratt" title="Morris-Pratt" width="622" height="550" class="size-full wp-image-3027" /></a><p class="wp-caption-text"> </p></div>
<h2>Implementation</h2>
<p>Implementing Morris-Pratt isn’t difficult. First we have to preprocess the pattern and then perform the search. The following <a href="http://www.stoimen.com/blog/category/php/" title="PHP on stoimen.com">PHP</a> code shows you how to do that.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #009933; font-style: italic;">/**
 * Pattern
 * 
 * @var string
 */</span>
<span style="color: #000088;">$pattern</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">'mollis'</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/**
 * Text to search
 * 
 * @var string
 */</span>
<span style="color: #000088;">$text</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque eleifend nisi viverra ipsum elementum porttitor quis at justo. Aliquam ligula felis, dignissim sit amet lobortis eget, lacinia ac augue. Quisque nec est elit, nec ultricies magna. Ut mi libero, dictum sit amet mollis non, aliquam et augue'</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/**
 * Preprocess the pattern and fill in the &quot;next&quot; table
 * 
 * @param string $pattern
 * @param array $nextTable passed by reference and filled in place
 */</span>
<span style="color: #000000; font-weight: bold;">function</span> preprocessMorrisPratt<span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span><span style="color: #000088;">$nextTable</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$j</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$nextTable</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$len</span> <span style="color: #339933;">=</span> <span style="color: #990000;">strlen</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$len</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$j</span> <span style="color: #339933;">&gt;</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$pattern</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">!=</span> <span style="color: #000088;">$pattern</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$j</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$j</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$nextTable</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$j</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
&nbsp;
		<span style="color: #000088;">$nextTable</span><span style="color: #009900;">&#91;</span><span style="color: #339933;">++</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #339933;">++</span><span style="color: #000088;">$j</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/**
 * Performs a string search with the Morris-Pratt algorithm
 * 
 * @param string $text
 * @param string $pattern
 * @return int position of the first match, or -1 if there is none
 */</span>
<span style="color: #000000; font-weight: bold;">function</span> MorrisPratt<span style="color: #009900;">&#40;</span><span style="color: #000088;">$text</span><span style="color: #339933;">,</span> <span style="color: #000088;">$pattern</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #666666; font-style: italic;">// get the text and pattern lengths</span>
	<span style="color: #000088;">$n</span> <span style="color: #339933;">=</span> <span style="color: #990000;">strlen</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$text</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$m</span> <span style="color: #339933;">=</span> <span style="color: #990000;">strlen</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$nextTable</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #666666; font-style: italic;">// calculate the next table</span>
	preprocessMorrisPratt<span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #339933;">,</span> <span style="color: #000088;">$nextTable</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$j</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
	<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$j</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$n</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&gt;</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span> <span style="color: #339933;">&amp;&amp;</span> <span style="color: #000088;">$pattern</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">!=</span> <span style="color: #000088;">$text</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$j</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$nextTable</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #000088;">$i</span><span style="color: #339933;">++;</span>
		<span style="color: #000088;">$j</span><span style="color: #339933;">++;</span>
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">&gt;=</span> <span style="color: #000088;">$m</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #b1b100;">return</span> <span style="color: #000088;">$j</span> <span style="color: #339933;">-</span> <span style="color: #000088;">$i</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
	<span style="color: #b1b100;">return</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// prints 275, the position of the first match</span>
<span style="color: #b1b100;">echo</span> MorrisPratt<span style="color: #009900;">&#40;</span><span style="color: #000088;">$text</span><span style="color: #339933;">,</span> <span style="color: #000088;">$pattern</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>
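
<p>The function above stops at the first match. If every occurrence is needed, one possible variation (a sketch, not part of the code above) is to record the position and fall back through the table instead of returning. The name morrisPrattAll is made up, and the table is rebuilt inline only so that the snippet runs on its own.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">/**
 * Collect the positions of all matches, overlapping ones included
 * (hypothetical variation, for illustration only)
 *
 * @param string $text
 * @param string $pattern
 * @return array
 */
function morrisPrattAll($text, $pattern)
{
	$n = strlen($text);
	$m = strlen($pattern);
	$positions = array();
	if ($m == 0) {
		return $positions;
	}
&nbsp;
	// build the "next" table
	$next = array(-1);
	$i = 0;
	$j = -1;
	while ($i &lt; $m) {
		while ($j &gt; -1 &amp;&amp; $pattern[$i] != $pattern[$j]) {
			$j = $next[$j];
		}
		$next[++$i] = ++$j;
	}
&nbsp;
	// search, but keep going after each match
	$i = $j = 0;
	while ($j &lt; $n) {
		while ($i &gt; -1 &amp;&amp; $pattern[$i] != $text[$j]) {
			$i = $next[$i];
		}
		$i++;
		$j++;
		if ($i &gt;= $m) {
			$positions[] = $j - $i;
			// the last table entry tells how much of the match can be reused
			$i = $next[$i];
		}
	}
	return $positions;
}
&nbsp;
// 0, 2, 4
echo implode(', ', morrisPrattAll('abababa', 'aba'));</pre></div></div>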

<h2>Complexity</h2>
<p>This algorithm needs some time and space for preprocessing. The pattern can be preprocessed in O(m), where m is the length of the pattern, and the search itself then takes O(n), where n is the length of the text, which gives O(m+n) overall. The good news is that you preprocess a pattern only once and can then search for it in as many texts as you wish!</p>
<p>The following chart compares the complexity O(n+m) with O(nm) for 5-letter patterns.</p>
<div id="attachment_3026" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Morris-Pratt-complexity.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Morris-Pratt-complexity.png" alt="Morris-Pratt complexity" title="Morris-Pratt complexity" width="600" height="371" class="size-full wp-image-3026" /></a><p class="wp-caption-text">After O(m) pre-processing the search takes O(n), so the total is O(n+m). You can see on the chart how effective Morris-Pratt string searching is compared to brute force string searching!</p></div>
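<p>To get a feel for the difference on a concrete input, the sketch below counts the character comparisons made during a Morris-Pratt search. The name morrisPrattCount is hypothetical, and the input is a hand-picked case that is bad for brute force, not a benchmark.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">/**
 * Morris-Pratt search that also counts text/pattern comparisons
 * (hypothetical helper, for illustration only)
 *
 * @param string $text
 * @param string $pattern
 * @return array position of the first match (or -1) and the number of comparisons
 */
function morrisPrattCount($text, $pattern)
{
	$n = strlen($text);
	$m = strlen($pattern);
&nbsp;
	// build the "next" table
	$next = array(-1);
	$i = 0;
	$j = -1;
	while ($i &lt; $m) {
		while ($j &gt; -1 &amp;&amp; $pattern[$i] != $pattern[$j]) {
			$j = $next[$j];
		}
		$next[++$i] = ++$j;
	}
&nbsp;
	// search and count every comparison against the text
	$comparisons = 0;
	$i = $j = 0;
	while ($j &lt; $n) {
		while ($i &gt; -1) {
			$comparisons++;
			if ($pattern[$i] == $text[$j]) {
				break;
			}
			$i = $next[$i];
		}
		$i++;
		$j++;
		if ($i &gt;= $m) {
			return array($j - $i, $comparisons);
		}
	}
	return array(-1, $comparisons);
}
&nbsp;
list($position, $comparisons) = morrisPrattCount('aaaaaaab', 'aaab');
echo $position, ' ', $comparisons;</pre></div></div>

<p>For the text 'aaaaaaab' and the pattern 'aaab' this prints 4 12. Brute force makes 4 comparisons at each of the 5 alignments on the same input, that is 20 in total, and the gap widens as the text and the pattern get longer.</p>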
<h2>Application</h2>
<h3>Why it&#8217;s cool</h3>
<ol>
<li>Its searching complexity is O(m+n), which beats the O(mn) worst case of brute force and Rabin-Karp</li>
<li>It’s fairly easy to implement</li>
</ol>
<h3>Why it isn’t cool</h3>
<ol>
<li>It needs additional space and time &#8211; O(m) for pre-processing</li>
<li>It can be optimized a bit (Knuth-Morris-Pratt)</li>
</ol>
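<p>To give an idea of what that optimization looks like, here is a sketch of the table as it is usually built for Knuth-Morris-Pratt. The name buildKmpNextTable is made up for this example. The only difference from the Morris-Pratt preprocessing is the extra check: if the letter at the fallback position is the same one that has just failed, the fallback is skipped in advance.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">/**
 * Build the "next" table with the Knuth-Morris-Pratt refinement
 * (hypothetical helper, for illustration only)
 *
 * @param string $pattern
 * @return array
 */
function buildKmpNextTable($pattern)
{
	$m = strlen($pattern);
	$next = array(-1);
	$i = 0;
	$j = -1;
&nbsp;
	while ($i &lt; $m) {
		while ($j &gt; -1 &amp;&amp; $pattern[$i] != $pattern[$j]) {
			$j = $next[$j];
		}
		$i++;
		$j++;
		if ($i &lt; $m &amp;&amp; $pattern[$i] == $pattern[$j]) {
			// same letter at the fallback position: it would fail again, so skip it
			$next[$i] = $next[$j];
		} else {
			$next[$i] = $j;
		}
	}
	return $next;
}
&nbsp;
// -1 0 -1 0 2
echo implode(' ', buildKmpNextTable('abab'));</pre></div></div>

<p>For the pattern 'abab' the plain Morris-Pratt table is -1 0 0 1 2, so two entries change. As far as I can tell the search loop itself can stay as it is, because only the table differs.</p>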
<h2>Final Words</h2>
<p>Obviously this algorithm is quite useful because it improves brute force matching in a very elegant manner. On the other hand you must know that there are faster string searching algorithms, like the Boyer-Moore algorithm. However the Morris-Pratt algorithm can be quite useful in many cases, so understanding its principles can be very handy.</p>
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/04/17/computer-algorithms-boyer-moore-string-searching/' rel='bookmark' title='Computer Algorithms: Boyer-Moore String Searching'>Computer Algorithms: Boyer-Moore String Searching</a></li>
<li><a href='http://www.stoimen.com/blog/2012/04/02/computer-algorithms-rabin-karp-string-searching/' rel='bookmark' title='Computer Algorithms: Rabin-Karp String Searching'>Computer Algorithms: Rabin-Karp String Searching</a></li>
<li><a href='http://www.stoimen.com/blog/2012/03/27/computer-algorithms-brute-force-string-matching/' rel='bookmark' title='Computer Algorithms: Brute Force String Matching'>Computer Algorithms: Brute Force String Matching</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/04/09/computer-algorithms-morris-pratt-string-searching/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Computer Algorithms: Rabin-Karp String Searching</title>
		<link>http://www.stoimen.com/blog/2012/04/02/computer-algorithms-rabin-karp-string-searching/</link>
		<comments>http://www.stoimen.com/blog/2012/04/02/computer-algorithms-rabin-karp-string-searching/#comments</comments>
		<pubDate>Mon, 02 Apr 2012 19:48:15 +0000</pubDate>
		<dc:creator>Stoimen</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[ASCII]]></category>
		<category><![CDATA[basic sub-string matching algorithm]]></category>
		<category><![CDATA[Boyer–Moore string search algorithm]]></category>
		<category><![CDATA[Complexity The Rabin-Karp algorithm]]></category>
		<category><![CDATA[Cryptographic hash function]]></category>
		<category><![CDATA[Cryptography]]></category>
		<category><![CDATA[Hash function]]></category>
		<category><![CDATA[Hash table]]></category>
		<category><![CDATA[Hashing]]></category>
		<category><![CDATA[Michael O. Rabin]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Rabin-Karp algorithm]]></category>
		<category><![CDATA[Rabin-Karp string search algorithm]]></category>
		<category><![CDATA[Richard M. Karp]]></category>
		<category><![CDATA[Rolling hash]]></category>
		<category><![CDATA[search algorithms]]></category>
		<category><![CDATA[string matching algorithms]]></category>
		<category><![CDATA[String searching algorithm]]></category>
		<category><![CDATA[string searching algorithms]]></category>
		<category><![CDATA[sub-string matching algorithms]]></category>
		<category><![CDATA[This algorithm]]></category>

		<guid isPermaLink="false">http://www.stoimen.com/blog/?p=2991</guid>
		<description><![CDATA[Introduction Brute force string matching is a very basic sub-string matching algorithm, but it has some advantages. For example it doesn’t require preprocessing of the text or the pattern. The problem is that it’s very slow. That is &#8230; <a href="http://www.stoimen.com/blog/2012/04/02/computer-algorithms-rabin-karp-string-searching/">Continue reading <span class="meta-nav">&#8594;</span></a>
Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/04/09/computer-algorithms-morris-pratt-string-searching/' rel='bookmark' title='Computer Algorithms: Morris-Pratt String Searching'>Computer Algorithms: Morris-Pratt String Searching</a></li>
<li><a href='http://www.stoimen.com/blog/2012/04/17/computer-algorithms-boyer-moore-string-searching/' rel='bookmark' title='Computer Algorithms: Boyer-Moore String Searching'>Computer Algorithms: Boyer-Moore String Searching</a></li>
<li><a href='http://www.stoimen.com/blog/2012/03/27/computer-algorithms-brute-force-string-matching/' rel='bookmark' title='Computer Algorithms: Brute Force String Matching'>Computer Algorithms: Brute Force String Matching</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<h2>Introduction</h2>
<p><a href="http://www.stoimen.com/blog/2012/03/27/computer-algorithms-brute-force-string-matching/" title="Computer Algorithms: Brute Force String Searching">Brute force string matching</a> is a very basic sub-string matching algorithm, but it has some advantages. For example, it doesn’t require preprocessing of the text or the pattern. The problem is that it’s very slow, which is why in many cases brute force matching isn’t very useful. For pattern matching we need something faster, but to understand other sub-string matching algorithms let’s take another look at brute force matching.</p>
<p>In brute force sub-string matching we compare every single character of the text with the first character of the pattern. Once we have a match, we move on to compare the second character of the pattern with the next character of the text, and so on, as shown in the picture below.</p>
<p><div id="attachment_3002" class="wp-caption alignnone" style="width: 628px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Rabin-Karp-Brute-Froce-Principles.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Rabin-Karp-Brute-Froce-Principles.png" alt="Brute Force Principles" title="Brute Force Principles" width="618" height="242" class="size-full wp-image-3002" /></a><p class="wp-caption-text">Brute force string matching is slow because it compares every single character from the pattern and the text!</p></div><span id="more-2991"></span></p>
<p>This algorithm is slow for two main reasons. First, we have to check every single character of the text. Second, even when we find a match between a text character and the first character of the pattern, we still have to check every remaining symbol of the pattern, character by character, to find out whether it is in the text. So is there another way to find out whether the text contains the pattern?</p>
<p>In fact there is a “faster” approach. To avoid comparing the pattern and the text character by character, we’ll try to compare them at once, so we need a good hash function. With its help we can hash the pattern and check it against hashed sub-strings of the text. The hash function must return “small” hash codes even for larger sub-strings. One difficulty is that for longer patterns we can’t expect the hashes to stay short. Apart from that, the approach should be quite effective compared to brute force string matching.</p>
<p>That approach is known as the Rabin-Karp algorithm.</p>
<h2>Overview</h2>
<p><a href="http://en.wikipedia.org/wiki/Michael_O._Rabin" title="Michael O. Rabin" target="_blank">Michael O. Rabin</a> and <a href="http://en.wikipedia.org/wiki/Richard_M._Karp" title="Richard M. Karp" target="_blank">Richard M. Karp</a> came up with the idea of hashing the pattern and checking it against a hashed sub-string of the text in 1987. The idea seems quite simple; the only requirement is a hash function that gives different hashes for different sub-strings. Such a hash function may, for instance, use the ASCII codes of the characters, but we must be careful about multi-lingual support.</p>
<div id="attachment_3003" class="wp-caption alignnone" style="width: 631px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Rabin-Karp-Basic-Principles.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Rabin-Karp-Basic-Principles.png" alt="Rabin-Karp Basic Principles" title="Rabin-Karp Basic Principles" width="621" height="299" class="size-full wp-image-3003" /></a><p class="wp-caption-text">Rabin-Karp hashes the pattern and the sub-string in order to compare them quickly!</p></div>
<p>The hash function may vary depending on many things: it may convert ASCII characters to numbers, but it can also be anything else. All we need is to convert a string (the pattern) into a hash that is faster to compare. Let’s say the text is “hello world” and the pattern is “he”, and let’s assume that hash(‘he’) = 12. The first sub-string of the text with the pattern’s length is also “he”, so its hash is 12 as well. Assuming that different sub-strings give different hashes, we can say that the pattern “he” is contained in the text “hello world”. Thus on every step we take from the text a sub-string of length m, where m is the pattern length, hash it, and compare it directly to the hashed pattern, as in the picture above.</p>
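<p>The ASCII-based hashing mentioned above could be sketched roughly as follows. This is only an illustration: the function name <code>ascii_hash</code>, the base 256 and the modulus 1000003 are assumptions made for this example, not values prescribed by the algorithm.</p>

```php
<?php
// Hypothetical helper: a polynomial hash over the byte (ASCII) values of a string.
// $base and $mod are illustrative choices, not part of the original post.
function ascii_hash($str, $base = 256, $mod = 1000003)
{
	$hash = 0;
	$len = strlen($str);

	for ($i = 0; $i < $len; $i++) {
		// shift the running hash by one "digit" and add the next byte
		$hash = ($hash * $base + ord($str[$i])) % $mod;
	}

	return $hash;
}

$text = 'hello world';
$pattern = 'he';
$m = strlen($pattern);

// hash every sub-string of length $m and compare it to the hashed pattern
for ($i = 0; $i <= strlen($text) - $m; $i++) {
	if (ascii_hash(substr($text, $i, $m)) === ascii_hash($pattern)) {
		echo "candidate match at offset $i\n";
	}
}
```

<p>Because <code>ord()</code> works on bytes, this sketch treats multi-byte characters as several separate values, so the text and the pattern are assumed to share the same encoding.</p>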
<h2>Implementation</h2>
<p>So far we saw some diagrams explaining the Rabin-Karp algorithm, so let’s take a look at its implementation. In this very basic example a simple hash table is used to convert the characters into integers. The code is PHP and it&#8217;s used only to illustrate the principles of this algorithm.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> hash_string<span style="color: #009900;">&#40;</span><span style="color: #000088;">$str</span><span style="color: #339933;">,</span> <span style="color: #000088;">$len</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$hash</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">''</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #000088;">$hash_table</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span>
		<span style="color: #0000ff;">'h'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span>
		<span style="color: #0000ff;">'e'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span>
		<span style="color: #0000ff;">'l'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">3</span><span style="color: #339933;">,</span>
		<span style="color: #0000ff;">'o'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">4</span><span style="color: #339933;">,</span>
		<span style="color: #0000ff;">'w'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">5</span><span style="color: #339933;">,</span>
		<span style="color: #0000ff;">'r'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">6</span><span style="color: #339933;">,</span>
		<span style="color: #0000ff;">'d'</span> <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">7</span><span style="color: #339933;">,</span>
	<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$len</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000088;">$hash</span> <span style="color: #339933;">.=</span> <span style="color: #000088;">$hash_table</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$str</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #009900;">&#40;</span>int<span style="color: #009900;">&#41;</span><span style="color: #000088;">$hash</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">function</span> rabin_karp<span style="color: #009900;">&#40;</span><span style="color: #000088;">$text</span><span style="color: #339933;">,</span> <span style="color: #000088;">$pattern</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #000088;">$n</span> <span style="color: #339933;">=</span> <span style="color: #990000;">strlen</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$text</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000088;">$m</span> <span style="color: #339933;">=</span> <span style="color: #990000;">strlen</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #000088;">$pattern_hash</span> <span style="color: #339933;">=</span> hash_string<span style="color: #009900;">&#40;</span><span style="color: #000088;">$pattern</span><span style="color: #339933;">,</span> <span style="color: #000088;">$m</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #000088;">$n</span><span style="color: #339933;">-</span><span style="color: #000088;">$m</span><span style="color: #339933;">+</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #666666; font-style: italic;">// hash the sub-string of length $m that starts at offset $i</span>
		<span style="color: #000088;">$text_hash</span> <span style="color: #339933;">=</span> hash_string<span style="color: #009900;">&#40;</span><span style="color: #990000;">substr</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$text</span><span style="color: #339933;">,</span> <span style="color: #000088;">$i</span><span style="color: #339933;">,</span> <span style="color: #000088;">$m</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #000088;">$m</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #666666; font-style: italic;">// equal hashes mean the pattern starts at offset $i</span>
		<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$text_hash</span> <span style="color: #339933;">==</span> <span style="color: #000088;">$pattern_hash</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #b1b100;">return</span> <span style="color: #000088;">$i</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #b1b100;">return</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// 1</span>
<span style="color: #b1b100;">echo</span> rabin_karp<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'hello world'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'ello'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<h3>Multiple Pattern Match</h3>
<p>The Rabin-Karp algorithm is great for multiple pattern matching. By its nature it supports such functionality: the hash of each sub-string of the text can be checked against the hashes of many patterns at once, which is its advantage in comparison to other string searching algorithms.</p>
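<p>A minimal sketch of multiple pattern matching might look like the code below. It assumes that all patterns have the same length, and the names <code>window_hash</code> and <code>rabin_karp_multi</code>, as well as the base and the modulus, are hypothetical choices for this example only. For brevity each sub-string is hashed from scratch.</p>

```php
<?php
// Hypothetical helper: polynomial hash over byte values (illustrative base and modulus).
function window_hash($str, $base = 256, $mod = 1000003)
{
	$hash = 0;
	for ($i = 0, $len = strlen($str); $i < $len; $i++) {
		$hash = ($hash * $base + ord($str[$i])) % $mod;
	}
	return $hash;
}

// Sketch: report every (offset, pattern) pair found in the text.
// Assumption: all patterns have the same length.
function rabin_karp_multi($text, array $patterns)
{
	$matches = array();
	$patterns = array_values($patterns);
	if (count($patterns) === 0) {
		return $matches;
	}

	$n = strlen($text);
	$m = strlen($patterns[0]);

	// group the patterns by their hash, so one lookup covers all of them
	$by_hash = array();
	foreach ($patterns as $pattern) {
		$by_hash[window_hash($pattern)][] = $pattern;
	}

	for ($i = 0; $i + $m <= $n; $i++) {
		$window = substr($text, $i, $m);
		$hash = window_hash($window);

		if (isset($by_hash[$hash])) {
			// equal hashes only suggest a match, so compare the strings too
			foreach ($by_hash[$hash] as $pattern) {
				if ($window === $pattern) {
					$matches[] = array($i, $pattern);
				}
			}
		}
	}

	return $matches;
}

print_r(rabin_karp_multi('hello world', array('ello', 'worl', 'abcd')));
```

<p>The text is scanned only once, no matter how many patterns are given, which is what makes this approach attractive for tasks such as plagiarism detection.</p>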
<h2>Complexity</h2>
<p>The Rabin-Karp algorithm has a worst-case complexity of O(nm), where <strong>n</strong> is the length of the text and <strong>m</strong> is the length of the pattern. So how does it compare to brute-force matching? Well, brute force matching complexity is also O(nm), so it seems there’s not much gain in performance. However, Rabin-Karp’s complexity is considered to be O(n+m) in practice, when the hash of each next sub-string is derived from the previous one in constant time and hash collisions are rare, and that makes it a bit faster, as shown in the chart below.</p>
<div id="attachment_3001" class="wp-caption alignnone" style="width: 610px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Rabin-Karp-Complexity.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Rabin-Karp-Complexity.png" alt="Rabin-Karp Complexity" title="Rabin-Karp Complexity" width="600" height="371" class="size-full wp-image-3001" /></a><p class="wp-caption-text">Rabin-Karp&#039;s complexity is O(nm), but in practice it&#039;s O(n+m)!</p></div>
<p>Note that the Rabin-Karp algorithm also needs O(m) preprocessing time.</p>
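<p>The O(n+m) behaviour in practice usually relies on a rolling hash, where the hash of the next sub-string is derived from the previous one instead of being recomputed. The following is a sketch under stated assumptions: the function name <code>rabin_karp_rolling</code>, the base 256 and the modulus 1000003 are illustrative choices, and an empty pattern is treated as not found.</p>

```php
<?php
// Sketch of Rabin-Karp with a rolling hash (illustrative base and modulus).
function rabin_karp_rolling($text, $pattern, $base = 256, $mod = 1000003)
{
	$n = strlen($text);
	$m = strlen($pattern);

	if ($m === 0 || $m > $n) {
		return -1;
	}

	// $high = $base^($m - 1) mod $mod, the weight of the leading byte of a window
	$high = 1;
	for ($i = 0; $i < $m - 1; $i++) {
		$high = ($high * $base) % $mod;
	}

	// O(m) preprocessing: hash the pattern and the first window of the text
	$pattern_hash = 0;
	$text_hash = 0;
	for ($i = 0; $i < $m; $i++) {
		$pattern_hash = ($pattern_hash * $base + ord($pattern[$i])) % $mod;
		$text_hash = ($text_hash * $base + ord($text[$i])) % $mod;
	}

	for ($i = 0; $i <= $n - $m; $i++) {
		// compare the strings only when the hashes agree, to rule out collisions
		if ($text_hash === $pattern_hash && substr($text, $i, $m) === $pattern) {
			return $i;
		}

		if ($i < $n - $m) {
			// drop the leading byte of the window...
			$text_hash = ($text_hash - ord($text[$i]) * $high) % $mod;
			if ($text_hash < 0) {
				$text_hash += $mod;
			}
			// ...then shift and append the byte that enters the window
			$text_hash = ($text_hash * $base + ord($text[$i + $m])) % $mod;
		}
	}

	return -1;
}

echo rabin_karp_rolling('hello world', 'world');
```

<p>Each step of the main loop updates the hash in constant time, so the string comparison is the only part that costs O(m), and it runs only when the hashes agree.</p>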
<h2>Application</h2>
<p>As we saw, Rabin-Karp is not much faster than brute force matching. So where should we use it?</p>
<h3>3 Reasons Why Rabin-Karp is Cool</h3>
<p>1. Good for plagiarism detection, because it can deal with multiple pattern matching!<br />
<div id="attachment_3000" class="wp-caption alignnone" style="width: 630px"><a href="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Application-of-Rabin-Karp.png"><img src="http://www.stoimen.com/blog/wp-content/uploads/2012/04/Application-of-Rabin-Karp.png" alt="Application of Rabin-Karp" title="Application of Rabin-Karp" width="620" height="399" class="size-full wp-image-3000" /></a><p class="wp-caption-text">Rabin-Karp can detect plagiarism efficiently!</p></div></p>
<p>2. Not faster than brute force matching in theory, but in practice its complexity is O(n+m)!<br />
3. With a good hashing function it can be quite effective and it&#8217;s easy to implement!</p>
<h3>2 Reasons Why Rabin-Karp is Not Cool</h3>
<p>1. There are lots of string matching algorithms whose worst case is better than O(nm)<br />
2. In the worst case it’s as slow as brute force matching and it requires additional space</p>
<h2>Final Words</h2>
<p>Rabin-Karp is a great algorithm for one simple reason &#8211; it can be used to match against multiple patterns. This makes it perfect for detecting plagiarism, even for larger phrases.</p>
<p>Related posts:<ol>
<li><a href='http://www.stoimen.com/blog/2012/04/09/computer-algorithms-morris-pratt-string-searching/' rel='bookmark' title='Computer Algorithms: Morris-Pratt String Searching'>Computer Algorithms: Morris-Pratt String Searching</a></li>
<li><a href='http://www.stoimen.com/blog/2012/04/17/computer-algorithms-boyer-moore-string-searching/' rel='bookmark' title='Computer Algorithms: Boyer-Moore String Searching'>Computer Algorithms: Boyer-Moore String Searching</a></li>
<li><a href='http://www.stoimen.com/blog/2012/03/27/computer-algorithms-brute-force-string-matching/' rel='bookmark' title='Computer Algorithms: Brute Force String Matching'>Computer Algorithms: Brute Force String Matching</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.stoimen.com/blog/2012/04/02/computer-algorithms-rabin-karp-string-searching/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

