Developers

A Voyage to the Kernel, Day 7

January 1, 2009

5348

We are about to enter the core part of this segment—algorithms. An algorithm could be termed a sequence of computational steps that can transform an input into the output. Here, it should be emphasised that any such sequence cannot be called an algorithm since a wrong methodology will give an incorrect output.

Almost all the code used in this segment will be in a pseudo-code format that is akin to C code. Sometimes the algorithms are given in C itself. We have to stick to this format, as our primary intention is to meddle with the kernel.
Those who wish to have an overview of the importance of bringing in innovative algorithms and building simulations (based on them) can look up the GNU Hurd project.

As I promised earlier, this voyage will not neglect the novices. So, let’s start with a few simple things. Let’s suppose we have a set of values ranging from x1, x2 x3…………..xn. If you are asked to arrange them in ascending order, you need to write an algorithm so as to get the output as xa, xb xc…….., which satisfies the condition xa < xb < xc…….

This is a simple case of sorting, which is a common algorithm that we employ in our programs.

Some fields that rely on algorithms
Writing algorithms is not just the headache of programmers alone. There are various other fields in which we need to rely on the algorithmic approach: The Human Genome Project has the goal of identifying all the 100,000 genes in the human DNA. It has to determine about three billion sequences of the chemical base pairs! This is virtually impossible unless we employ effective algorithms for pattern recognition and identification. Today we depend a lot on electronic commerce for the purchase of any commodity. A good system should have the ability to keep information (such as credit card numbers, passwords and bank statements) secure and encrypted. Algorithms are used for these cryptographic processes. Other branches like physical science (see the box on ‘Simulation building in physical science) require the use of algorithms for solving complex problems. Even a manufacturing industry (or any other commercial setting) needs algorithms for allocating scarce resources.

Some fields that rely on algorithms

Writing algorithms is not just the headache of programmers alone. There are various other fields in which we need to rely on the algorithmic approach:

The Human Genome Project has the goal of identifying all the 100,000 genes in the human DNA. It has to determine about three billion sequences of the chemical base pairs! This is virtually impossible unless we employ effective algorithms for pattern recognition and identification.
Today we depend a lot on electronic commerce for the purchase of any commodity. A good system should have the ability to keep information (such as credit card numbers, passwords and bank statements) secure and encrypted. Algorithms are used for these cryptographic processes.
Other branches like physical science (see the box on ‘Simulation building in physical science) require the use of algorithms for solving complex problems.
Even a manufacturing industry (or any other commercial setting) needs algorithms for allocating scarce resources.

We just looked at an instance of the sorting problem. This may become more complex depending on the number of items to be sorted, the extent to which the items are to be sorted, the current state (sorting) of the elements, possible restrictions on the items, and even on the kind of storage device used.

Hence, while dealing with algorithms, we need to consider the data structures employed with which we can manipulate the way to store and organise data in order to facilitate access and modifications as per our requirement.

Figure 1: Five cards of clubs arranged in an order

Finding the shortest route is a kind of sorting algorithm. Consider a trucking company with a central warehouse. Each day, it loads up a truck at the central warehouse and then sends it around to several locations to deliver the products. At the end of each day, the truck should return to the central warehouse so that it is ready to be loaded for the next day. To find out the lowest operating cost, the company needs an algorithm that will indicate a specific order of delivery stops such that the truck travels the lowest overall distance.

If you have enough data in your hands, you can write down an algorithm. Now, let’s look at how to write such an algorithm. For the sake of simplicity, let’s replace our problem with a simple one.

Consider five cards of clubs—2, 4, 5, 7 and 10—which are placed randomly (just like we had random stops). If you were to play the game, you would have arranged the cards as shown in Figure 1.

But what if a computer had to play your role? How would it arrange the cards? The answer is quite simple: by employing the sorting algorithm. The following code elucidates the algorithm:

INSERTION-SORT(X)
for j ← 2 to length[X]
do key-select ← X[j]
Insert X[j] into  series X[1,.. [j-1]]

i ← j - 1
while i > 0 and X[i] > key-select
do X[i + 1] ← X[i]
i ← i - 1

X[i + 1] ← key-select

The pseudo code carries elements that are quite akin to those in C, and the alphabets used for each iteration are conventional ones. The character ‘j’ indicates the ‘current card’ that is picked up and the ‘▹’ sign symbolises that the remainder of the line is a comment. The numbers that we wish to sort are represented as the key-select. By looking at the algorithm, you can see that the parameters are passed to a procedure by value.

Figure 2: A set of cards from different families

As the algorithm is quite simple, it is self-explanatory. Now try to expand the same algorithm to another problem shown in Figure 2. If you have understood the first one, this will be quite simple, except that you need to bring in some additional rules as there are cards from different families and two cards here have the same value (of hearts and diamonds).

Analysing an algorithm has come to mean predicting the resources that the algorithm requires so as to get the desired output. It considers aspects like memory allocation, communication bandwidth, computer hardware, etc. We also need to take into account parameters like computational time when it comes to the practical side. These parameters may, in turn, depend on input size and the running time.

Methodology: the divide-and-conquer approach

This is an effective methodology that we can adopt when we design algorithms. It involves:

Divide: Divide the given sequence (with n elements) into two sub-sequences (of n/2 elements each)
Conquer: Sort the two new sub-sequences recursively using the merge sort algorithm.
Combine: Merge the two sorted sub-sequences to produce the desired result.

The merge mode is illustrated below:

MERGE(A, p, q, r)

MERGE-SORT ( A, p, r)
if p < r                              To check for base case
then q < ( p + r)/2                For dividing
MERGE-SORT ( A, p, q)      Conquering
MERGE-SORT ( A, q + 1, r)  Conquering
MERGE ( A, p, q, r)          Combining

The above pseudo code may not enlighten novice programmers, who can look at the expanded code given below and then use the above example to assimilate the core idea:

MERGE ( A, p, q, r)
s1 ← q − p + 1
s2 ← r − q
make arrays L[1 .... s1 + 1] and R[1 ..... s2 + 1]
for i ← 1 to s1
do L[i] ← A[ p + i − 1]
for j ← 1 to s2
do R[ j ] ← A[q + j ]
L[s1 + 1] ← ∞
R[s2 + 1] ← ∞
i ←1
j ←1
for k ← p to r
do if L[i] ≤ R[ j ]
then A[k] ← L[i]
i ←i +1
else A[k] ← R[ j ]
j ← j +1

Let’s look at an example to comprehend the idea better. I found the following problem in a book that deals with problem solving. A part of the problem involves arranging an array of values [5, 2, 4, 7, 1, 3, 2, 6]. This will be our initial (given) array. Now, we can use our methodology to solve the problem. Figure 3 shows the consecutive division and merging processes.

Figure 3: Sorting and arranging an array of values in numbers

We can imagine the process in many different ways. If the set is random, then we could have another array with the same elements in a different order. The solution is shown in Figures 4, 5 and 6. Here, Figures 4 and 5 represent the initial steps and Figure 6 represents the final output. It should be noted that the intermediate stages are not shown here, as they are similar steps that can be envisaged in one’s mind. The step-by-step procedure will also throw ample light on the subs L and R that are created.

Figure 4: Initial steps of sorting the array

Figure 5: Initial steps of sorting the array

Once you are done with simple algorithms, you can apply the same ‘black box representation’ idea (that you employ while writing programs) and use the simple algorithms as sub-routine calls in your main algorithm (depending on the model you opt for).

Now we can look at another simple algorithm that can handle errors. The following code shows the multiplication of matrices:

MATRIX-MULTIPLICATION(A, B)
if columns[A] ≠ rows[B]
then error “This operation is not allowed”
else for i ← 1 to rows[A]
do for j ← 1 to columns[B]
do C[i, j] ← 0
for p ← 1 to columns[A]
do C[i, j] ← C[i, j] + A[i, k] · B[k, j]
return C

Before we move on to the complex stuff, you can test yourself by considering the problem in the following table. You are required to find out the maximum value of n that corresponds to each of the stipulated times. The data provided along with it is that you have f(n) in milliseconds. Try solving it!

	1 Sec	1 Min	1 Hr	1 Day	1 Month	1 Year
n!
n2
ln (n)
log (n)
n x ln (n)
2n

We can see that the style remains the same when we try to write algorithms in C. If we need to compare functions for a pointer to an integer, we can have the following:

#include “compare.h”
int int_is_equal(void *vplace1, void *vplace2)
{
     int *place1;
     int *place2;

     place1 = (int *) vplace1;
     place2 = (int *) vplace2;

     return *place1 == *place2;
}

int int_is_compare(void *vplace1, void *vplace2)
{
     int *place1;
     int *place2;

     place1 = (int *) vplace1;
     place2 = (int *) vplace2;

     if (*place1 < *place2) {
          return -1;
     } else if (*place1 > *place2) {
          return 1;
     } else {
          return 0;
     }
}

You may find a reference to a header file in the program. So you need compare.h along with it to get the desired result:

#ifndef ALGORITHM_COMPARE_INT_H
#define ALGORITHM_COMPARE_INT_H

#ifdef __cplusplus
extern “C” {
#endif

int int_is_equal(void *place1, void *place2);

int int_is_compare(void *place1, void *place2);

#ifdef __cplusplus
}
#endif

#endif

There could be problems that belong to the NP-complete category. These problems may not have an exact solution. We will be discussing these aspects once we are done with topics like asymptotic notations and complex algorithms.

Simulation building in physical science

Solid-state physics largely employs simulation techniques for modelling. By taking data from experiments, crystal lattice structures can be constructed easily.

Figure 7 shows one such three-dimensional array of lattice points. The properties of the crystal structure can be inferred by building them. Simulations will give us a clear picture of the crystal properties by considering its primitives.

Figure 8: Jupiter is accompanied by two groups of asteroids that precede and follow it at an angular distance of π/3

So is the case with complex bodies. It is found that Jupiter is accompanied, in its orbit, by two groups of asteroids that precede it and follow it at an angular distance of π/3 (see Figure 8). By building simulations we can show that these are positions of stable equilibrium. A Runge-Kutta procedure with automatic step control can be used for analysing the data from Table 2.

Data for analysing a Runge-Kutta procedure with automatic step control
	Sun	Jupiter	Trojan 1	Trojan 2
Mass	1	0.001	0	0
x	0	0	-4.50333	4.50333
y	0	5.2	2.6	2.6
z	0	0	0	0
Vx	0	-2.75674	-1.37837	-1.37837
Vy	0	0	-2.38741	2.38741
Vz	0	0	0	0

Figure 9: Thus prediction can be made using computational procedures

Using the computational procedure we can get a prediction as shown in Figure 9.

You can also try solving simple problems of the following form:

The Runge-Kutta procedure is quite sufficient to handle these types of problems, provided you have enough data.

7 COMMENTS

flide November 30, 2009 At 9:22 PM

hey, how can I get hold of all the ” A Voyage to the Kernel” articles, I am not able to find the articles before day 7 and after day 11. Please help

Vikas March 15, 2010 At 10:17 PM

Hi,

Awesome Tutorial… but how can i get all the articles from day 1… unable to find here.

Tej Vohra March 20, 2010 At 4:44 PM

I have a similar comment. I would like to get hold of all kernel programming articles. From day 1. Could you please help?
Thanking you in anticipation

Pronita May 21, 2010 At 3:10 PM

Please can you let me know how to get the whole articles starting from 1 till end? Its not searchable or seen on the site.
If you can point to the date when it started and continued, will also do. I really want to read the whole article.
Please hellp…

Atul June 28, 2010 At 4:12 PM

Hi,
I have the similar comments. I would like read complete series.
Please let us know the links for those.

cslim July 15, 2010 At 3:23 PM

I wanted to get the whole Votage to the Kernel series.

SS September 6, 2013 At 8:35 PM

How exactly is this voyage to the kernel?

Methodology: the divide-and-conquer approach

Simulation building in physical science

7 COMMENTS

LEAVE A REPLY Cancel reply

Thought Leaders

HOW TOs

MOST POPULAR

Open Journey

EDITOR PICKS

POPULAR POSTS

POPULAR CATEGORY