In this month's column, we discuss a medley of topics, including solving cognitive intelligence puzzles and how Python is being widely used in AI and Natural Language Processing.
For the past few months, we have been discussing information retrieval and Natural Language Processing (NLP), along with the algorithms associated with them. In this month's column, we take a break from that discussion to look at some of the questions I have been getting from our readers recently. Given that we generally discuss coding problems, some readers have asked me whether I can discuss some puzzles. So instead of discussing math puzzles, I thought it may be interesting to try out something that tests our cognitive intelligence capabilities. Here is the first puzzle, which initially appears quite simple, but many people, including college graduates, seem to get it wrong on their first try.
There are three people: Andrew, Anne and Bobby. Andrew is looking at Anne. Anne is looking at Bobby. Andrew is married but Bobby is not married. Now the question to you is: Is a married person looking at an unmarried person? Choose the correct answer from the following options:
(A) Yes (B) No (C) Cannot be determined
Most people tend to choose the answer (C). Do you think the answer is correct? If not, why not? First take a couple of minutes to think through your reply. Well, the correct answer is (A). The key point to note is that each person can be in either a married or unmarried state. We know that Andrew is married and that Bobby is not. The reason most folks choose (C) is that there is no information regarding the marital state of Anne. But remember that the question is not whether Anne is married or not. The actual question is whether a married person is looking at an unmarried person. Anne can either be married or unmarried. If she is married, then the answer is (A), since Anne, who is married, is looking at Bobby, who is unmarried. If Anne is unmarried, then too, (A) emerges as the correct answer, since Andrew, who is married, is looking at Anne, who is unmarried. Hence the answer is (A) and not (C). Most people choose (C) because they skip disjunctive reasoning, which requires all the different possibilities or outcomes associated with a given situation to be taken into account before arriving at a conclusion.
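The case analysis above can also be checked mechanically. The following is a minimal sketch that simply tries both possible states for Anne; the names looking_at and married_looks_at_unmarried are made up for this illustration.

```python
# Who is looking at whom, as stated in the puzzle.
looking_at = [("Andrew", "Anne"), ("Anne", "Bobby")]


def married_looks_at_unmarried(anne_is_married):
    """Return True if some married person is looking at an unmarried person."""
    married = {"Andrew": True, "Anne": anne_is_married, "Bobby": False}
    return any(married[a] and not married[b] for a, b in looking_at)


# Disjunctive reasoning: try every possible state of the unknown.
results = {state: married_looks_at_unmarried(state) for state in (True, False)}
print(results)
```

Both entries come out True, so the answer does not depend on Anne's marital state.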
Here is another puzzle to exercise your brain. This being a simple puzzle, I would like you to answer this question very quickly, in your mind and without using pen and paper.
A bat and a ball cost $1.10. The bat costs one dollar more than the ball. How much does the ball cost?
The immediate answer that most people give is 10 cents. However, if the ball costs 10 cents, the bat must cost one dollar and 10 cents, since the bat costs one dollar more than the ball. The total cost of both bat and ball would then be $1.20, which is incorrect since the total given in the puzzle is $1.10. With a little careful thought, you will find that the cost of the ball is five cents and that of the bat is one dollar and five cents, which makes the total $1.10. This problem is very simple as far as mathematics is concerned. But when asked to answer immediately, most folks blurt out ten cents instead of five cents. The reason for this cognitive slip is that our quick-thinking brain is deceived into coming up with an obvious answer. However, a little more careful thinking prevents this error of reasoning.
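The arithmetic can be written out in a few lines. This is only a sketch: it works in whole cents to avoid floating-point rounding, and the helper name satisfies_puzzle is invented for the illustration. If the ball costs x cents, the bat costs x + 100 cents, so 2x + 100 = 110.

```python
TOTAL_CENTS = 110       # the bat and the ball together cost $1.10
DIFFERENCE_CENTS = 100  # the bat costs one dollar more than the ball

# ball + (ball + difference) = total  =>  ball = (total - difference) / 2
ball = (TOTAL_CENTS - DIFFERENCE_CENTS) // 2
bat = ball + DIFFERENCE_CENTS


def satisfies_puzzle(ball_cents):
    """Check a guess for the ball's price against both conditions."""
    bat_cents = ball_cents + DIFFERENCE_CENTS
    return ball_cents + bat_cents == TOTAL_CENTS


print(ball, bat)             # prices in cents
print(satisfies_puzzle(10))  # the intuitive answer
print(satisfies_puzzle(5))   # the careful answer
```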
Here’s the third puzzle.
You are given four cards, with each card having a digit on one side and a letter of the alphabet on the other. The hypothesis you need to validate is that if a card has a vowel on one side, then it has an even number on the other side. The four cards are laid out so that Card 1 shows A, Card 2 shows D, Card 3 shows 4 and Card 4 shows 7. You need to specify which of these four cards must be turned over to check whether this rule is true or not.
Most people give the answer as the cards showing A and 4. But this is incorrect. Can you figure out why? The correct answer is the cards showing A and 7. It is obvious why the A card needs to be checked. If the A card has an odd number on the other side, obviously the rule is incorrect. The D card is irrelevant, since the rule says nothing about consonants. Why is it that the 4 card need not be checked? The reason is that if the 4 card has a consonant on the other side, it still does not violate the rule. (If you disagree, think carefully about the rule that we are trying to verify here.) On the other hand, if the 7 card has a vowel on its other side, then the rule is disproved. This is another example of how the short-cut heuristics employed by our brain to quickly jump to a solution can lead us to the wrong answer.
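One way to convince yourself is to brute-force the puzzle: for each visible face, try every possible hidden face and see whether any of them could break the rule. The sketch below does exactly that; the function names violates_rule and must_turn are made up for this illustration.

```python
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS = "0123456789"
VOWELS = set("AEIOU")


def violates_rule(letter, digit):
    """The rule is broken only by a vowel paired with an odd number."""
    return letter in VOWELS and int(digit) % 2 == 1


def must_turn(visible):
    """A card is worth turning only if some hidden face could break the rule."""
    if visible in LETTERS:
        return any(violates_rule(visible, hidden) for hidden in DIGITS)
    return any(violates_rule(hidden, visible) for hidden in LETTERS)


cards = ["A", "D", "4", "7"]
to_check = [card for card in cards if must_turn(card)]
print(to_check)
```

Only the A and 7 cards survive the filter, because no hidden face behind D or 4 can produce a vowel paired with an odd number.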
You may have been wondering why we are discussing cognitive intelligence puzzles in a column devoted to computer science. Software projects are littered with cases where a poor choice or decision in the planning, requirements, implementation or testing phase has led to a costly debacle. Hence, it is all the more important for computer scientists to be able to reason effectively and be aware of the short-cut heuristics and illusions the human brain can come up with. I would like to suggest that our readers look up the book Thinking, Fast and Slow by Daniel Kahneman, a Nobel Prize laureate. The book is an effective guide to the various heuristics and biases the human mind is prone to, and to how one can become an effective decision maker.
Vivekanandan, one of our student readers, asked me about libraries for AI and NLP algorithms. Given that we also had a few other readers asking about how to ramp up on NLTK (Natural Language Toolkit), I felt it would be good to have a short discussion on the Python libraries available for AI and NLP and how one can quickly ramp up on the same. This month's article is being co-authored by one of our student readers, Ankith Subramanya from The International School, Bengaluru, as he shares his experience of ramping up on Python libraries for AI and NLP.
As computer science develops, it is exploring problems that arise in the real world. Artificial Intelligence (AI) and Natural Language Processing (NLP) are two fields that have great practical applications. AI is the science of making our machines intelligent, so that they can be utilised for a number of practical problems. AI has vast potential and can have a great impact on every aspect of our lives, from the household (robotic vacuum cleaners) to national defence (missiles). As AI is a vast topic, research in it involves several core areas such as reasoning, knowledge, planning and learning. Another of these areas is NLP, which is probably the most human-centric field in AI, in the sense that it involves the interaction between natural human languages and the computer. This can be broadly understood as human-computer interaction. The goal of research in this field is to make communication between humans and computers as natural and effortless as possible, as if the computer were a person.
Python is regarded by many as a simple yet powerful language. Its simplicity and speed of development can be attributed to features such as built-in high-level data types and dynamic typing. Python is increasingly preferred for AI, NLP and data mining. Apart from the characteristic benefits that it offers to developers, the main reason for Python's increasing popularity is the vast number of tools and libraries available for it. These can be broadly classified into general AI, machine learning, natural language and text processing, and neural networks. aima-python is a library that includes a number of Python implementations of the algorithms from Russell and Norvig's Artificial Intelligence: A Modern Approach (AIMA).
One fun library that you can look at is easyAI, which is a simple Python engine for two-player games involving AI. With this framework, you can create and play various games such as Tic-Tac-Toe and Connect4. The steps for setting up and installing easyAI, along with a user manual with examples, are provided at zulko.github.io/easyAI/. To create a two-player game with easyAI, you simply need to create a sub-class of the TwoPlayersGame class (from the easyAI package). Then, you must define a set of methods that specify the rules of your game. To start the game, create an object of your game class and then call its play method.
One very effective natural language and text processing library available for Python is NLTK, which is a platform for building Python programs that process human language data. It basically provides the tools and libraries required to work with NLP. To make use of NLTK, you must first install Python and then install the NLTK library. Optional installs such as NumPy are recommended, but not necessary. The steps and links for installation can be found at http://www.nltk.org/install.html. With the NLTK libraries, you can do many delightful things to a piece of text, such as analyse the sentiments (positive or negative), tokenise (split the text into parts such as paragraphs, sentences or words), stem (remove and replace word suffixes to arrive at the common root form of the word), tag and chunk (recognise different words as nouns, verbs, etc.), and a lot more.
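To give a feel for two of these operations, here is a small sketch of tokenising and stemming. It assumes NLTK is already installed, and it uses wordpunct_tokenize and PorterStemmer because, to the best of our knowledge, neither needs any additional NLTK data to be downloaded; the sample sentence is made up.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

text = "the readers were running python programs"

# Tokenise: split the text into words (and punctuation marks, if any).
tokens = wordpunct_tokenize(text)

# Stem: strip suffixes so related word forms share a common root.
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]

print(tokens)
print(stems)
```

For example, the stemmer reduces "running" to "run". Tagging and chunking follow a similar pattern, but they rely on trained models that have to be downloaded separately.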
You can make your own simple sentiment analysis software. The following trivial code example will give readers a brief idea of how to make it work.
    import sys
    from nltk.tokenize import wordpunct_tokenize  # NLTK's regex-based word tokeniser

    sentence = sys.stdin.read()  # to get input from stdin
    tokens = wordpunct_tokenize(sentence)
    scorer = 0  # keep a score of how positive or negative the statement is

    # In the following if statements, we give our own condition to assign a
    # score depending on how negative or positive a word is, from a selected
    # set of words. You can add more words or choose different words.
    for word in tokens:
        if word == "good":
            scorer = scorer + 1
        if word == "great":
            scorer = scorer + 2
        if word == "bad":
            scorer = scorer - 1
        if word == "terrible":
            scorer = scorer - 2

    if scorer == 0:
        print("neutral")
    if scorer > 0:
        print("positive")
    if scorer < 0:
        print("negative")

The above example is very basic. To build software that can actually handle the complexities of real-world text, more rigorous algorithms and larger sets of data are used.
If you have any favourite programming questions/software topics that you would like to discuss on this forum, please send them to me, along with your solutions and feedback, at sandyasm_AT_yahoo_DOT_com. Till we meet again next month, happy programming.