*In this months column, we discuss the topic of automatic relation extraction from text using natural language processing techniques.*

In last months column, I had described certain techniques for information extraction from unstructured texts using natural language processing. In this months column, we take a break from this and discuss a few interview questions in general computer science and data science.

1. Given an arbitrary integer N, how many unique binary trees can be created containing N nodes? Can you come up with an algorithm which can programmatically list all possible binary trees consisting of N nodes?

2. You are given two unsorted lists of integers. You are asked to create a sorted list consisting of all the elements in the two lists. Can you write an algorithm for this? What is the running time of your code, in the worst case?

3. You are given a list of N nodes in a tree, with each node represented by an integer. For each node, you are also given the details of its parent node. You are not given the edges of the tree explicitly, but you are told that it is a binary tree. You are asked to write code to perform an in-order traversal of the tree using this information. How would you solve this problem: (a) If you are told that you can perform pre-processing on the information and can use any amount of additional storage; (b) You are told that you can use only a constant amount of additional storage.

4. You are given an arbitrary tree T (not necessarily binary), with each tree node containing a pair of integers (X and Y). Each tree node can have an arbitrary number of children nodes, at each level of the tree. You are given two numbers, A and B, and are told to find out if the nodes N1 and N2 containing the integers A and B are located at the same level of the tree. Note that N1 and N2 can be two separate nodes or be the same node. It is only required that N1 and N2 should be located at the same level of the tree, and they need not be siblings. Write an algorithm to determine whether the tree T contains A and B at the same level. What is the execution time complexity of your code?

5. You are given an array of size N. Each element of the array can take only one of the three values (0, 1 or -1). You are told to find the longest contiguous sequence of 1s in the array. Write a C function to find the start and end of the contiguous sequence of 1s in the array. Now you are told that you can flip K contiguous sequence of (-1) to 1 where K is a dynamically provided input to the function. In that case, how would you determine the longest contiguous sequences of 1s in the array after flipping K elements which were -1 to 1? What is the time and space complexity of your function?

6. Given an array of N integers, you are asked to find out whether the given array is an arithmetic series (a sequence of numbers where there is a constant difference between any two consecutive numbers of the series). What is the complexity of your function? Now, you are given a modified problem where your task is to convert a given array into an arithmetic sequence by dropping off those array elements which prevent it from being an arithmetic sequence. How would you solve this problem? What is the time complexity of your function?

7. Given an arbitrarily long character string, you are asked to find out the most frequently occurring character as well as the least frequently occurring character. Write a C function to solve this problem, given a character string of size N. Now you are told that instead of being a fixed size character string, you are given a stream of characters which come dynamically as input. At any point in time, you should be able to print the most frequent and least frequent characters in the stream (note that if two or more characters have the same highest frequency, you can choose to print any of them as the most frequent characters). What is the time and space complexity of your algorithm?

8. You are given an array of N integers and told that the array is mostly sorted, with only k elements not being in their correct sorted position in the array, and with k being very small compared to N. What is the sorting algorithm you would use to completely sort the array?

9. You are given an array of N integers. You are asked to find the maximum and minimum elements in each of the sub-sequences of length K of the array, where K is an arbitrary integer. (a) Given an array of N integers and a value K, how many sub-sequences of length K are possible? (b) What is the complexity of computing the maximum and minimum for each of the sub-sequences of length K? How would you solve this problem in polynomial time, for any arbitrary value of K?

10. You are given a list of N integers, where the numbers can only be positive. You are now being asked to write code to create two lists of size N/2 such that the sum of the elements of the two sub-lists are equal, if at all it is possible to create such sub-lists. Your function should return the sum of the sub-list elements if it is possible to split the original list into two such sub-lists, or return 0 if that is not possible. What is the time complexity of your function? (Note that you can move the elements of the list.) You are now told that the original list can either contain positive or negative integers, and are told to solve the above problem. What are the changes you would make to your earlier solution?

11. Consider the problem given in (10). Now, instead of splitting the list into two sub-lists such that the elements of both are equal, you are asked to split the list at any arbitrary index such that the difference of the sum of the two sub-lists is the minimum among all possible splits of the original list. Unlike problem (10), you cannot move around the elements of the list. What is the time and space complexity of your function?

12. You are given two sentences S1 and S2, each containing N and M words, respectively. You are now asked to convert S1 into S2 by either deleting a word or adding a new word. Modifying word1 to word2 can be considered as word deletion, followed by word addition. Each insert operation has a cost of +2 and each delete operation has a cost of +1. Given two arbitrary sentences S1 and S2, the edit distance of (S1, S2) is the minimum total cost of operations needed to convert S1 into S2. You are asked to write a function to compute the edit distance. What is the time and space complexity of your function?

13. You are given a long piece of text T and three sub-strings S1, S2 and S3. You are asked to find the shortest sub-string X of T such that X contains S1, S2 and S3 as sub-strings. What is the time and space complexity of your solution?

14. You are given an array of N integers (where N can be of the order of millions) which has been sorted. Given an arbitrary value K, you are asked to find out how many times it repeats in the array? Can you write a C function to do this in O (log N) time complexity?

15. Consider the following code snippet:

int main(int argc, char *argv[]) { void *my_mem = 0; int alloc_blocks = 0; while(1) { my_mem = (void *) malloc(1024 * 1024); if (!my_mem) break; memset(my_mem,1, (1024*1024)); printf(allocated memory blocks is %d MB\n,++alloc_blocks); } exit(0); }

Can you explain what would happen when you compile and run the program?

If you have any favourite programming questions/software topics that you would like to discuss on this forum, please send them to me, along with your solutions and feedback, at sandyasm_AT_yahoo_DOT_com. Till we meet again next month, happy programming!