Tuesday, September 26, 2006

More on the Turing Test: Loebner Prize, TTS, Hindi and other languages

Is conversational behavior shown by a naïve human being a good example of intelligent behavior? If a machine converses well enough to make you think that it is human, then many people will agree it is intelligent.

There is a lot of lore about the Turing Test, named after Alan Turing who defined it. The website http://www.cs.vu.nl/~jdruiter/c/index.html gives a lot of information on this.

The Loebner Prize

You may also wish to read http://www.loebner.net/Prizef/loebner-prize.html describing the $100,000 Loebner Prize for the first person to win the Turing Test by creating a suitable computer program. There is also an annual prize to be won by the best entry that year. The website mentioned above gives considerable information on the annual prizes awarded so far. Current technology has not yet demonstrated successful artificial intelligence at the level of winning the $100,000 Prize.

There is a very interesting discussion between Shieber [Lessons from a Restricted Turing Test] and Loebner [ http://www.loebner.net/Prizef/In-response.html ]. Shieber has argued that if there had been a prize for a flying machine before the relevant science had been understood, people would have spent time trying to use springs to make flying machines. Loebner answers this by referring to Mozart’s backside!
To quote Loebner:
“When Mozart rode to Vienna in 1781 he wrote complaining of the pain the mail coach inflicted on his backside. (Mozart in Vienna, V.Braunbehrens, trans T. Bell Grove Weidenfield, NY 1990, p 17). This was a result, I must suppose, in part from poor suspension of the coach. The study of elasticity, stress, and strain did not result in a swift and straight arrival at understanding. Suppose a concerted effort had been made, early on, to fly using springs. Perhaps the concepts of stress and strain would have been invented sooner, along with advances in spring technology that would have been a boon to humanity, and Mozart’s buttocks”.
Three cheers to Loebner! My project suggestion in this posting deals with an idea related in spirit. Many practical and valuable things can be achieved using the experience of working with programs designed to take the Turing Test. To continue with the analogy, instead of making a flying machine, one of these efforts could lead to an understanding of how to make properly sprung mail coaches.


Chatterbots

Before I start off with my proposal, let me mention Chatterbots. The Internet era has created “bots”, programs that behave like robots, including the “Chatterbots” that carry out conversations with humans. The website http://www.a-i.com/ gives you access to “Alan”, an interesting chatterbot.

Good, now we are ready to discuss my project suggestion. The proposal below involves extending the test idea to a spoken language context, not to win the Loebner prize, but to create a system that could serve a variety of people in a limited way. I will argue that there are many attractive reasons to use a speech interface in the context of the Turing Test. I do not believe that this is a magical solution to the challenge of making a machine show intelligent behavior, but I do believe that many students of artificial intelligence (AI) and computer science would consider this a promising way to advance the tools available to those who work on AI. The idea of using a speech interface is not new. You can find examples among winning programs from the annual Loebner prize contests that offer you a text-to-speech interface.

As I have mentioned earlier, there have been developments in search over the last ten years. Search engines dig out relevant information from millions of pages of text. I am going to deviate from the definition of the Turing Test, and will introduce, this time, the Ramani Test :=) This test will require a machine to respond to keyboarded questions in interesting and appropriate ways with spoken responses to impress listeners. It is deliberately a broad definition. Here is a relatively detailed proposal:


Marrying TTS capabilities in Hindi with the Turing Test

The idea is to have humans communicate to a machine through a keyboard in one direction and through a text-to-speech (TTS) system in the other direction. The focus would be on the machine being queried in one language, say English, and responding in another language, say Hindi. The computer could do one of the following two things, and the human participant would not be told which mode the computer uses at a given time.

1. The computer could simulate the conversational behavior of a person without significant school education – let us think of him/her as an adult coming to an adult literacy class in a village. The responses come through the TTS in the Hindi language (as an example). Questions can be asked in English and the simulated “person” who has only a very limited understanding of English would speak back in Hindi. An English version of what the simulated “he” says would be displayed on the screen so that the English-speaking human participant can understand what the answer is. The Hindi answer would be communicated through a speaker or a pair of earphones, to impress and amuse those who understand Hindi.
2. In the second mode of operation, the computer would display the question to a team of three, working somewhere out of sight of the delegate (the person asking questions). The first member is a student, A, who reads the question and translates it into an equivalent question in spoken Hindi. The second is an actual member of an adult literacy class, who answers these questions in Hindi. But the delegate would not hear the spoken answers directly. Student A would type the answer in Hindi using a suitable keyboard or a transliteration scheme, and the answer would go through the TTS to the delegate. (The simple transliteration scheme named ITRANS enables the use of a common Roman Script keyboard to input Hindi text). Simultaneously, the third member, another student, B, would convert the spoken Hindi answer to a typed-in English answer to be displayed to the delegate.

You may ask why we use this elaborate method of going from the Hindi answer spoken by a human to its TTS “equivalent”. The idea is not to give away the nature of the “person” answering the question through the quality of the voice. This model ensures that the answers in both cases come only through the TTS.

3. Some delegates would be watching the three-person team at one end while others watch the delegate asking questions at another location. This is to demonstrate that we are not cheating!

4. The game is for the visitor to guess whether the answers came from the machine as in mode (1) or from the real human being as in mode (2). The system would randomly choose for each discussion session whether the respondent would be a real person, or the simulated person.
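
As a rough illustration, the session protocol in points 1 to 4 could be sketched in Python as follows. This is only a sketch under assumptions: every name in it (Reply, simulated_person, human_team, run_session) is hypothetical, the two responders are placeholders that return fixed strings, and the TTS and the screen are represented by plain callables.

```python
import random
from dataclasses import dataclass

# All names below are hypothetical and only illustrate the protocol in the text.

@dataclass
class Reply:
    hindi_text: str    # goes to the TTS, so the voice never reveals the mode
    english_text: str  # displayed on screen for the English-speaking questioner


def simulated_person(question: str) -> Reply:
    """Mode 1 placeholder: a program imitating the adult learner."""
    return Reply("(simulated Hindi answer)", "(simulated English version)")


def human_team(question: str) -> Reply:
    """Mode 2 placeholder: student A types the Hindi answer, student B the English one."""
    return Reply("(typed Hindi answer)", "(typed English version)")


def run_session(questions, rng, speak=print, display=print):
    """Run one discussion session and return the hidden mode ("machine" or "human")."""
    # The mode is chosen once per session and is never shown to the questioner.
    mode = rng.choice(["machine", "human"])
    responder = simulated_person if mode == "machine" else human_team
    for question in questions:
        reply = responder(question)
        speak(reply.hindi_text)      # both modes answer only through the TTS
        display(reply.english_text)  # on-screen English version
    return mode
```

A session could then be run as `run_session(["What is your name?"], random.Random())`, with the returned mode kept aside until the visitor has made a guess.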

The big question is whether the human participant would be able to distinguish the “simulated person” from the real one.
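
One way to put a number on this question is to compare the visitors' guesses with what pure guessing would achieve. The sketch below is my own illustration, not part of the proposal as stated: it assumes the two modes are chosen with equal probability, and the function names are hypothetical. It computes the fraction of correct guesses and the probability of doing at least that well by coin-flipping.

```python
from math import comb


def fraction_correct(guesses, truths):
    """Fraction of sessions in which the visitor's guess matched the real mode."""
    pairs = list(zip(guesses, truths))
    if not pairs:
        raise ValueError("need at least one session")
    return sum(g == t for g, t in pairs) / len(pairs)


def chance_probability(correct, sessions):
    """Probability of at least `correct` right guesses in `sessions` fair coin flips."""
    favourable = sum(comb(sessions, k) for k in range(correct, sessions + 1))
    return favourable / 2 ** sessions
```

For example, `chance_probability(14, 20)` is about 0.058, so 14 correct guesses out of 20 sessions would still be fairly easy to reach by guessing alone.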

A socially valuable outcome of this work could very well be the wider understanding of two capabilities of uneducated adults: knowledge and communication ability.


Srinivasan Ramani Sept 26, 06


------------------