Key Highlights:
- Predictive language models were originally designed to predict the next word in a string of text, a task used in search engines and texting apps
- These models are built on deep neural networks, whose function resembles that of the language-processing centers in the human brain
- The best-performing models showed activity patterns similar to those of the human brain and also best predicted human behavioral responses
Forward One-way Predictive Transformer
MIT neuroscientists are researching AI-powered predictive language models of the kind that help search engines and texting apps predict the next word in a string of text. The latest research by Joshua Tenenbaum (professor of computational cognitive science at MIT and a member of CBMM and MIT's Computer Science and Artificial Intelligence Laboratory) and Evelina Fedorenko (associate professor of neuroscience and a member of the McGovern Institute) suggests that the underlying function of these models resembles that of the language-processing centers in the human brain. Although trained to predict the next word, these models can also perform tasks such as answering questions, summarizing documents, and completing stories.
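To make the prediction task concrete, here is a minimal sketch of next-word prediction in Python. GPT-3's weights are not publicly downloadable, so this example uses the smaller, openly available GPT-2 model through the Hugging Face transformers library as a stand-in; the prompt and the top-5 readout are illustrative choices, not details from the study.

```python
# Minimal sketch of next-word prediction. GPT-3's weights are not public,
# so the smaller, openly available GPT-2 is used here as a stand-in.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The children walked to the"   # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits     # shape: (batch, seq_len, vocab_size)

# The model's guess for the next word is the distribution at the final position.
next_token_logits = logits[0, -1]
top5 = torch.topk(next_token_logits, k=5).indices
print([tokenizer.decode(int(t)) for t in top5])
```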
Neural Language Processing Model
According to the researchers, these high-performing next-word prediction models belong to a class of models called deep neural networks. These networks contain computation nodes that form connections of varying strength, organized into layers that pass information to one another. For years, scientists have used deep neural networks to create vision models that can recognize objects as well as the primate brain does. In this study, the team compared the language-processing centers in the human brain with language-processing models and found that GPT-3 (Generative Pre-trained Transformer 3), which can generate text much like what a human would produce, matched the brain most closely.
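As a toy illustration of that structure (layers of computation nodes joined by connections of varying, learned strength), the following PyTorch sketch builds a small feed-forward network. The layer sizes are arbitrary and bear no relation to the architectures the researchers actually tested.

```python
# Toy feed-forward network: layers of computation nodes connected by
# learned weights ("connections of varying strength"), each layer passing
# its output to the next. Sizes here are arbitrary illustrative choices.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(8, 16),   # 8 input nodes fully connected to 16 hidden nodes
    nn.ReLU(),          # nonlinearity applied at each hidden node
    nn.Linear(16, 16),  # a second hidden layer
    nn.ReLU(),
    nn.Linear(16, 4),   # 4 output nodes
)

x = torch.randn(1, 8)   # one random input vector
print(net(x).shape)     # torch.Size([1, 4])
```

Stacking more such layers is what makes a network "deep"; training adjusts the connection strengths so the network's outputs improve at its task.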
GPT-3 Predictive Model
In further experiments, the researchers presented each model with a string of words and measured the activity of the nodes that make up its network. These activity patterns were then compared with activity in the human brain, recorded while subjects performed three language tasks: listening to stories, reading sentences one at a time, and reading sentences in which one word is revealed at a time. The human datasets included functional magnetic resonance imaging (fMRI) data and intracranial electrocorticographic (ECoG) measurements. The researchers found that the best-performing next-word prediction models had activity patterns most closely resembling those of the human brain. One of the key computational features of the GPT-3 model is an element called a 'forward one-way predictive transformer', which makes predictions based only on the words that came before.
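The sketch below illustrates one plausible version of this comparison workflow, under loudly labeled assumptions: GPT-2 stands in for the models in the study, the stimulus sentences and brain_data array are random placeholders rather than real recordings, and ridge regression with a held-out correlation score is a generic stand-in for the study's regression-based model-to-brain comparison.

```python
# Hedged sketch of the model-to-brain comparison: record a model's internal
# activations for each stimulus sentence, then test how well those
# activations predict brain responses. The sentences and brain_data below
# are random placeholders; the study used real stories/sentences with fMRI
# and ECoG recordings.
import numpy as np
import torch
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split
from transformers import GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

# Placeholder stimuli; the real experiments used stories and sentence sets.
sentences = [f"This is placeholder stimulus sentence number {i}." for i in range(40)]

features = []
for s in sentences:
    inputs = tokenizer(s, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (1, seq_len, hidden_dim)
    features.append(hidden.mean(dim=1).squeeze(0).numpy())  # one vector per sentence
X = np.stack(features)

# Placeholder "brain responses": 50 random voxels per sentence.
rng = np.random.default_rng(0)
brain_data = rng.standard_normal((len(sentences), 50))

# Fit a ridge regression from model activations to brain responses,
# then score held-out predictions with per-voxel correlation.
X_tr, X_te, y_tr, y_te = train_test_split(X, brain_data, test_size=0.25, random_state=0)
reg = RidgeCV(alphas=[0.1, 1.0, 10.0]).fit(X_tr, y_tr)
pred = reg.predict(X_te)
scores = [np.corrcoef(pred[:, v], y_te[:, v])[0, 1] for v in range(y_te.shape[1])]
print("mean correlation:", float(np.nanmean(scores)))
```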