Sunday, January 21, 2007

Semantics, AI and Dyslexia

There has been a lot of theorizing about how semantic information is used to recognize the structure and meaning of sentences. I recently came across a very interesting approach to this problem from a scholar who started out as a medical doctor, Parag C. Prasad.
He did his PhD at the School of Biosciences and Bioengineering, Indian Institute of Technology, Bombay, completing it last year (2006). Professor Arunkumar was his thesis advisor.

The thesis offers an interesting hypothesis as to how the human mind uses a sequence of words read (call this the immediate context) to predict the word it expects to read next. Prasad has developed a simulation model in the form of a neural net and has carried out experiments on it. The attraction of the hypothesis is its simplicity. It ignores the structure of the “immediate context”, and uses the set of words in it to represent relevant information. In other words, the hypothesis is that a small bag of words you have read helps you predict the next word you expect to encounter. The bag is the Short Term Memory (STM). The hypothesis is that the Long Term Memory (LTM) associates different bags with the individual words that they predict. For example, the bag (I, like, eat) could be associated with “mango” in your LTM.
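
To make the bag-of-words idea concrete, here is a minimal sketch in Python. It is not Prasad's neural net: it swaps in a plain lookup table for the LTM, and the function names, the window size of three words and the sample sentences are all my own assumptions for illustration.

```python
from collections import Counter, defaultdict

def train(sentences, window=3):
    """Build a toy LTM: map each bag of preceding words to counts of the next word.

    Assumption: the STM holds at most `window` words, stored as an unordered set.
    """
    ltm = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(1, len(words)):
            bag = frozenset(words[max(0, i - window):i])  # word order is discarded
            ltm[bag][words[i]] += 1
    return ltm

def predict(ltm, context, window=3):
    """Return the word most often seen after this bag, or None if the bag is unknown."""
    bag = frozenset(w.lower() for w in context[-window:])
    if bag not in ltm:
        return None
    return ltm[bag].most_common(1)[0][0]

# Hypothetical training sentences, chosen only to illustrate the idea.
ltm = train([
    "I like to eat mango",
    "I like to eat mango",
    "They like to eat rice",
])

print(predict(ltm, ["like", "to", "eat"]))
print(predict(ltm, ["eat", "like", "to"]))  # same bag, different order
```

Both calls give the same prediction because the context is stored as a set, which is exactly the simplification the hypothesis makes: the structure of the immediate context is ignored.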

Prasad discusses results from the medical field on how far reading ability degrades under disease conditions such as lesions, and demonstrates that his neural net model exhibits similar behavior.
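
The kind of experiment described above can be sketched roughly as follows. This is only a toy under loud assumptions: the LTM is a small hand-written table rather than a neural net, a "lesion" is modelled as deleting a random fraction of stored associations, and the names `lesion` and `accuracy` and the sample data are hypothetical. I do not know how the thesis actually damages its network.

```python
import random

def lesion(ltm, fraction, seed=0):
    """Return a copy of the toy LTM with a random fraction of associations removed."""
    rng = random.Random(seed)
    keys = sorted(ltm, key=sorted)  # fixed order so the result is reproducible
    n_remove = round(fraction * len(keys))
    removed = set(rng.sample(keys, n_remove))
    return {bag: word for bag, word in ltm.items() if bag not in removed}

def accuracy(ltm, test_cases):
    """Fraction of (bag, expected word) pairs the LTM still predicts correctly."""
    hits = sum(1 for bag, word in test_cases if ltm.get(bag) == word)
    return hits / len(test_cases)

# Hypothetical intact memory: each bag of context words predicts one next word.
intact = {
    frozenset({"i", "like", "eat"}): "mango",
    frozenset({"she", "reads", "a"}): "book",
    frozenset({"they", "drink", "hot"}): "tea",
    frozenset({"we", "play", "the"}): "piano",
}
cases = list(intact.items())

for fraction in (0.0, 0.5, 1.0):
    damaged = lesion(intact, fraction)
    print(fraction, accuracy(damaged, cases))
```

The point of the sketch is only that prediction accuracy falls as more of the stored associations are lost, which is the qualitative pattern one would compare against the clinical data.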

The hypothesis could no doubt be extended to account for a variety of other phenomena. For instance, if you have read that John likes to eat mango, you might find it easy to recognize “banana” when it appears in the same immediate context as “mango”. How does this happen?

I can also think of a number of research projects that would use this hypothesis to improve machine behavior. Can you think of one? Would you like to list it here, or to carry out such a project? Post a comment here.

