The first programming language that you learn leaves an indelible mark on you. If you are used to coding in a particular language and then decide to take up a new one, the latter may throw up a few surprises. Some of these can be fun but they could also be frustrating.
I learned C more than 20 years ago and, as a C programmer at heart, I found it difficult to adjust to other languages. Nevertheless, I have had to learn or use many others over the years, including C++, Java, x86 assembly language and Python. I am not an expert in all of these languages. If I can order food, ask for water and call a taxi in some (natural) language, I assume proficiency in that language. Often, the same standards apply while claiming proficiency in a programming language. Recently, while learning Haskell, a functional programming language, I came across a feature called lazy evaluation of expressions (discussed later in this article), and I was genuinely surprised. This has happened to me often while learning new programming languages. While learning Python, for instance, it was a surprise to find that you don’t need to explicitly declare the type of a variable before storing data in it.
This article discusses some of the distinctive features of different programming languages that might surprise programmers well versed in some other language. Let us first try to find out why such surprises occur. Major surprises tend to occur when a programmer learns a language built on a different paradigm. For example, a Java (an object-oriented programming language) programmer learning Haskell (a functional programming language) might come across a large number of features that are surprising and a few that might even look counterintuitive. Similarly, languages that are usually compiled and languages that are usually interpreted often differ a lot in their features. An example of this is a C++ (a compiled language) programmer learning Python (an interpreted language).
Similar surprises might await a person switching from a general-purpose programming language to one designed for a narrower domain. An example is a C (a general-purpose programming language) programmer learning JavaScript (originally designed as a scripting language for web pages). The programming experience might also depend on the underlying architecture and operating system. A programmer used to x86 assembly language (from Intel) learning MIPS assembly language (from MIPS Technologies), or a PowerShell (from Microsoft) user learning Linux shell scripting, is bound to encounter a few surprises. Of course, there could be many other reasons for this, but the above-mentioned ones seem to be the most obvious.
For those learning their first programming language, it’s most probable that none of the features are surprising. Now let us look at a few features of programming languages that might surprise someone who learns these as their second programming language.
Complicated pointers in C/C++
Most undergraduate programmes in computer science include a course on C programming and, for many students, the most difficult section is the one on pointers. To add to the misery, you can declare pointers of arbitrary complexity. As an example, consider the C program named pointer.c given below. This and all the other programs discussed in this article can be downloaded from opensourceforu.com/article_source_code/June20surprisinglanguage.zip.
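#include <stdio.h>

int main()
{
    /* pointer to a function returning a pointer to an array */
    int *(*(*ptr1)(int *))[2];
    /* seven levels of indirection */
    int *******ptr2;
    /* fifty levels of indirection */
    int **************************************************ptr3;
    return 0;
}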
The above C program compiles without any errors. The pointer variable ptr1 is a pointer to a function that accepts a pointer to an integer as an argument and returns a pointer to an array of two pointers to integers. The pointer variable ptr2 is a pointer to a pointer to a pointer to a pointer to a pointer to a pointer to a pointer to an integer (I hope I have counted correctly!). There are 50 asterisk symbols before the name of the pointer variable ptr3. But is there a limit to this? I tried up to 1000 asterisk symbols and the program still compiled fine. As far as I know, the C standard sets no upper limit; its translation limits only require a conforming compiler to accept at least 12 pointer, array and function declarators modifying a type in a declaration. There may be a point at which a particular C compiler fails to handle this, but I don’t know where that point is. No programmer is likely to use these sorts of pointers in a real program. Nevertheless, these features are available for us to use. A similar C++ program will also compile without any errors. Now back to our business. Imagine the horror of a Java, Python or Haskell programmer (none of these languages exposes C-style pointers to the programmer) who comes across such monstrosities.
Confusing octal numbers in C/C++/Java
This may not be as big a surprise as the previous one. But once, a long time ago, this feature of C/C++/Java gave me quite a headache, so I believe it deserves a discussion. Consider the C++ program octal.cc given below. Similar C and Java programs (octal.c and Octal.java) are also available for download. What is the output of the program octal.cc?
#include <iostream>
using namespace std;

int main()
{
    int num[] = {001, 005, 025, 125, 625};
    for (int i = 0; i < 5; i++)
    {
        cout << num[i] << endl;
    }
    return 0;
}
The array num is meant to contain the first five numbers in the geometric sequence starting at 1 with common ratio 5. Figure 1 shows the output of the program octal.cc. But why is the number 21 printed instead of 25? In C, C++ and Java, an integer literal with a leading zero, such as 0123, is treated as an octal number, and one with the prefix 0x, such as 0x123, is treated as a hexadecimal number. For this reason, the numbers 001, 005 and 025 in the program octal.cc are treated as octal numbers. The numbers 001 and 005 have the same value in the decimal and octal number systems, whereas 025 in octal is 2 × 8 + 5 = 21 in decimal. Thus, the sequence printed is 1, 5, 21, 125, 625. Octal literals are convenient on some occasions, but they can lead to bugs if one is not careful. For example, a Python programmer who is familiar with the notation 0o25 for octal numbers might read 025 as the number 25 with a leading zero (Python 3 itself rejects the literal 025 with a syntax error).
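The prefix rule described above can be sketched in a few lines of Python. The helper name parse_c_literal is hypothetical and it covers only plain non-negative integer literals without suffixes; it illustrates the rule and is not meant to show how any real compiler is implemented.

```python
def parse_c_literal(text: str) -> int:
    """Interpret an integer literal the way C, C++ and Java read it.

    Hypothetical helper for illustration: handles only plain
    non-negative literals without suffixes such as U or L.
    """
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered[2:], 16)   # 0x prefix: hexadecimal
    if len(lowered) > 1 and lowered.startswith("0"):
        return int(lowered, 8)        # leading zero: octal
    return int(lowered, 10)           # anything else: decimal


literals = ["001", "005", "025", "125", "625"]
values = [parse_c_literal(s) for s in literals]
print(values)  # [1, 5, 21, 125, 625]
```

Only the third literal changes value, which is why the bug is easy to miss when leading zeros are added purely for alignment.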
The for loop with an else in Python
As mentioned earlier, a C/C++/Java programmer newly learning Python will be surprised that explicit type declaration is not required in Python. Since this dynamic typing feature of Python is well known to many programmers, I will instead discuss a simple yet surprising feature of Python: the for loop with an else part. Consider the Python script loop.py shown below.
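for i in range(5):
    print(i)
else:
    # runs because the loop finished without a break
    print("Normal Exit from for loop")

for i in range(10):
    print(i)
    if i == 4:
        break
else:
    # never runs: the loop above is left through break
    print("Break from for loop")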
Figure 2 shows the output of the Python script loop.py. Notice that the else part of a for loop is executed only when the loop runs to completion, and not when the loop is left through a break statement. That is why the message in the else part of the second loop is never printed. Though not essential, for-else is a convenient feature that is absent from most other programming languages.
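A common use of for-else is a search loop, where the else part handles the "nothing found" case without a separate flag variable. The following is a minimal sketch; the function name first_divisor is made up for this illustration.

```python
def first_divisor(n: int):
    """Return the smallest divisor of n in the range [2, n), or None."""
    for candidate in range(2, n):
        if n % candidate == 0:
            found = candidate
            break            # leaving through break skips the else part
    else:
        found = None         # loop ran to completion: no divisor exists
    return found


print(first_divisor(15))  # 3
print(first_divisor(13))  # None
```

In a language without for-else, the same logic usually needs a Boolean flag that is set before the break and tested after the loop.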
Fork bombing with a Linux shell script
What is the smallest program (in terms of source code size), in any language, that will crash your system when executed? A strong candidate is the following shell script.
x( ) { x | x & }; x
It defines a function called x whose body calls x itself and pipes the result to another call to x, with the whole pipeline running in the background. The fourth x in the script is a call to the function x to begin the bombing. Every call starts two more, so the number of processes grows exponentially, and soon the system has no process slots or CPU time left for anything else. Just 11 non-white space characters in the script and your system is down, so imagine the surprise of those programmers whose favourite programming language takes hundreds of characters just to print the message ‘Hello World’ on the screen.
Warning! If you execute this code in a Linux terminal and do not close the terminal immediately, your system is likely to hang unless a per-user process limit (for example, one set with ulimit -u) stops the bomb. If it does hang, you will have to force a restart of your system.
Case-insensitive function names in PHP
Imagine copying the directories named ‘SONGS’, ‘Songs’ and ‘songs’ from your Linux machine to your friend’s Windows machine. Windows file systems are, by default, not case-sensitive, so you will be forced to rename two of these directories before copying them, because all three directory names (SONGS, Songs and songs) are the same when treated in a case-insensitive manner. Some programming languages behave like this. A classic example is PHP. Consider the PHP script named case.php given below:
<!DOCTYPE html>
<html>
<body>
<?php
function PRINTMSG()
{
    echo "I AM PHP <br>";
}
printmsg();
PrintMsg();
PRINTMSG();
pRiNtMsG();
PrInTmSg();
?>
</body>
</html>
Function names in PHP are case-insensitive, so the calls printmsg(), PrintMsg(), PRINTMSG(), pRiNtMsG() and PrInTmSg() all invoke the same function, PRINTMSG(). Figure 3 shows the output of the PHP script case.php. Note, however, that PHP variable names are case-sensitive, as in most other programming languages.
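One way to picture case-insensitive function names is a lookup table that folds every name to lower case, both when a function is defined and when it is called. The Python sketch below is only a toy model of that idea; the class FunctionTable is hypothetical and is not a description of PHP internals.

```python
class FunctionTable:
    """Toy function table that ignores case in names (hypothetical model)."""

    def __init__(self):
        self._functions = {}

    def define(self, name, func):
        self._functions[name.lower()] = func          # store under the folded name

    def call(self, name, *args):
        return self._functions[name.lower()](*args)   # fold again on lookup


table = FunctionTable()
table.define("PRINTMSG", lambda: "I AM PHP")
results = [table.call(n) for n in ("printmsg", "PrintMsg", "pRiNtMsG")]
print(results)  # ['I AM PHP', 'I AM PHP', 'I AM PHP']
```

A table for variables would skip the folding step, which corresponds to variable names staying case-sensitive.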
Implicit type conversion in JavaScript
Imagine the case of a Java programmer learning JavaScript. Due to the similarity in names, we might think that this will be an easy task. Even though the syntax is somewhat similar, the transition from Java to JavaScript is not that easy. Java is a strongly typed language in which type checking is very rigorous, whereas JavaScript is a weakly typed language with extensive implicit type conversion. Relying on implicit conversion is best avoided where possible. To understand its pitfalls, let us go through the JavaScript script named type.js given below. What is the output of the script type.js?
<!DOCTYPE html>
<html>
<body>
<script>
a = '2' + 1
b = '2' - 1
document.write(a + "<br>");
document.write(b + "<br>");
c = '1' + 2 + 3
d = 1 + 2 + '3'
document.write(c + "<br>");
document.write(d);
</script>
</body>
</html>
A seasoned Java programmer will expect a number of errors in the above code. But everything is fine with JavaScript. Figure 4 shows the output of the script type.js. Why does the variable a have the value 21 and the variable b have the value 1? This is due to the implicit type conversion in JavaScript. The operator ‘-’ performs just one function, mathematical subtraction. When a string and a number are the operands of the ‘-’ operator, JavaScript converts the string to a number. So in the line of code b = '2' - 1, the string '2' is converted to the number 2, and the number 1 is subtracted from it to obtain 1. Notice that, here, the variable b contains a number, 1. Now let us look at what happens with the operator ‘+’.
The operator ‘+’ performs two functions, mathematical addition and string concatenation. When a string and a number are the operands of the ‘+’ operator, instead of converting the string to a number, JavaScript converts the number to a string. So in the line of code a = '2' + 1, the number 1 is converted to the string '1', and the strings '2' and '1' are concatenated to give '21'. Notice that here, the variable a contains a string, '21'. With that knowledge, can you explain why the variable c contains the string '123' and the variable d contains the string '33'? Here is a hint: the associativity of the operator ‘+’ is what matters. The operator ‘+’ is left-associative. Hence in the line of code c = '1' + 2 + 3, the operation '1' + 2 is carried out first, resulting in the string '12' because '1' is a string. Then the expression becomes '12' + 3, resulting in the final string '123' stored in the variable c. But in the line of code d = 1 + 2 + '3', due to the left-associativity of the operator ‘+’, the first operation performed is 1 + 2, resulting in the number 3, because both 1 and 2 are numbers. Then the expression becomes 3 + '3', resulting in the final string '33' stored in the variable d. Imagine the horror of a programmer used to Java’s strict type checking on seeing these results!
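The two conversion rules can be modelled in Python to check the reasoning step by step. This is a rough sketch under simplifying assumptions: the helper names js_add and js_sub are made up, and the model covers only integers and strings of digits, not the full conversion rules of JavaScript.

```python
def js_add(a, b):
    """Model of JavaScript '+' restricted to strings and integers."""
    if isinstance(a, str) or isinstance(b, str):
        return str(a) + str(b)   # any string operand forces concatenation
    return a + b                 # two numbers: ordinary addition


def js_sub(a, b):
    """Model of JavaScript '-': both operands are converted to numbers."""
    return int(a) - int(b)


a = js_add('2', 1)               # '21'
b = js_sub('2', 1)               # 1
c = js_add(js_add('1', 2), 3)    # ('1' + 2) + 3 gives '123'
d = js_add(js_add(1, 2), '3')    # (1 + 2) + '3' gives '33'
print(a, b, c, d)
```

The nested calls make the left-to-right grouping explicit: the inner call is evaluated first, and its result type decides what the outer call does.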
Lazy evaluation in Haskell
Haskell is a general-purpose, purely functional programming language with a feature called lazy evaluation. Because of lazy evaluation, an expression is not evaluated at the moment it is bound to a variable. Instead, the evaluation is carried out only when its result is needed by some other operation. For this reason, lazy evaluation is often called ‘call-by-need’. Lazy evaluation makes it possible to define infinite data structures, such as infinite lists, in Haskell. Figure 5 shows the processing of one such infinite list in the GHCi interactive environment. The line of code let x = [1..] binds x to the infinite list of natural numbers. Notice that x is not evaluated at this point. The line of code take 10 x gives the list [1,2,3,4,5,6,7,8,9,10] as output (the first ten elements of the list x), and only those ten elements are computed. But the third line of code, print x, demands the complete evaluation of the list x, so natural numbers are printed without end until you forcefully interrupt the evaluation. Lazy evaluation is a very powerful feature of Haskell, but it has also received some criticism from experts.
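Readers who do not have GHCi at hand can get a feel for the idea from Python iterators, which also produce values only on demand. The sketch below is an analogy rather than an equivalent: itertools.count and itertools.islice are standard library functions, but a Python iterator is consumed as it is read, whereas a Haskell list keeps the elements it has already computed.

```python
from itertools import count, islice

naturals = count(1)                      # roughly: let x = [1..]
first_ten = list(islice(naturals, 10))   # roughly: take 10 x
print(first_ten)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Calling list(naturals) here would play the role of print x: it would try to materialise every natural number and never finish.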
Multiple main methods in Java
When I first learned Java programming, I did a very poor job because I tried to learn some advanced features (Swing and AWT) without mastering the basics. Until recently, I was under the false impression that a Java program can contain only one main method. But then I got a rude shock: it is possible to have more than one main method in a Java program. For example, consider the Java program AAA.java shown below with two different classes, AAA and BBB, each having its own main method.
class AAA
{
    void disp()
    {
        System.out.println("Hello From AAA...");
    }

    public static void main(String[] args)
    {
        AAA a = new AAA();
        a.disp();
    }
}

class BBB
{
    public static void disp()
    {
        System.out.println("Hello from BBB...");
    }

    public static void main(String[] args)
    {
        BBB b = new BBB();
        b.disp();
    }
}
Figure 6 shows the output of the program AAA.java. From the figure, it is clear that the name of the class you pass to the java command when starting the Java Virtual Machine (JVM) decides which main method is called. You can see that the output is different when you execute the commands java AAA (the main method of AAA is called) and java BBB (the main method of BBB is called). Though surprising, it reminds us that main is an ordinary static method; it becomes the entry point only because of the class name given on the command line.
Matrix indexing in MATLAB/Scilab
Imagine students learning their first programming language, be it Python, Java, C or C++. The first time they come across arrays, I am sure they will have some difficulty adjusting to indexing that starts at 0 (the famous computer scientist Edsger W. Dijkstra has convincingly argued that this should be the case, and his paper on the subject is well worth reading). But after a while, the students master the language and become comfortable with array indices starting at 0. Years later, array indices starting at 0 become second nature to them. And then one day they have to learn and use MATLAB/Scilab for some mathematical computations.
All of a sudden, they realise that indexing starts at 1 in MATLAB/Scilab. You can imagine the number of off-by-one errors they are going to make within a short span of time. At least, that is what happened to me and it was very frustrating. Figure 7 shows an example of array indexing in the Scilab console. I think the figure clearly illustrates the matter. In addition to MATLAB and Scilab, there are other programming languages, such as Fortran, Julia and Mathematica, in which arrays start at index 1 by default.
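The difference between the two conventions comes down to a shift by one, as the following Python sketch shows. The helper at_one_based is a made-up name used only to mimic 1-based access on a Python list; it is not part of any library.

```python
def at_one_based(items, position):
    """Return the element at a 1-based position, as MATLAB/Scilab would."""
    if not 1 <= position <= len(items):
        raise IndexError("1-based position out of range")
    return items[position - 1]


data = [10, 20, 30]
print(data[0])                # 10: first element with 0-based indexing
print(at_one_based(data, 1))  # 10: first element with 1-based indexing
print(at_one_based(data, 3))  # 30: last element; data[3] would raise IndexError
```

The typical mistake after switching languages is to apply the old convention at the boundaries: position 0 does not exist under 1-based indexing, and the position equal to the length does not exist under 0-based indexing.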
A lot more could have been added to this discussion, but since I don’t know whether these so-called surprises are indeed surprising to others, let us stop here for the time being. The selection of surprising features is based on my personal experiences and preferences, so it is quite possible that some widely surprising feature Y of programming language X is missing from this list. But the most important takeaway from this discussion is that if you are really surprised by a particular feature while learning a new programming language, then that feature might be the very reason you are learning the language. All the features already present in your old programming language will make learning the new one easy, but the surprising new features will test you and might even give you a career boost. So, the next time you come across a surprising feature in a programming language, be ready; mastering it might be the break you need.