Rewind: Chatting with Artificial Intelligence
Microsoft-backed ChatGPT, Google’s Bard and Baidu’s Ernie Bot take AI search chatbots to a whole new level
Published: 12:45, Sunday, 12 February 2023
Illustrator: GURU G.
By Allamraju Aparajitha
Hyderabad: The release of ChatGPT has led to a sharp rise in public interest in artificial intelligence (AI) and related fields. Although research in Natural Language Processing (NLP) has been going on for several decades, with some major breakthroughs in the last decade, it was ChatGPT that stole the limelight. Whether it is because it can answer questions like a human, role-play based on a prompt, or simply because of its user-friendly interface, ChatGPT gets attention like no other.
The field is advancing at such a pace that before we can understand one model, there is already another ready to come out. Earlier this week, Google announced Bard, a conversational service powered by LaMDA (Language Model for Dialogue Applications). Both ChatGPT and LaMDA are language models whose job is to predict the next word. Before we look at what’s behind the scenes and the future from here, let’s look at how we got here, while also understanding the foundation for it all.
Intelligence and Language
The Cambridge Dictionary defines intelligence as the ability to learn, understand and make judgments or opinions based on reason. As a rudimentary extension, we can call AI a machine’s ability to learn, understand and make judgments or have opinions.
Language is typically defined as a communication system consisting of sounds, words and grammar. The ability of humans to communicate abstract thoughts is what makes human language different from animal ‘language’. Language is not necessarily a measure of intelligence.
The earliest work known in the field of linguistics can be attributed to Pāṇini, who in his work, Aṣṭādhyāyī, sets down the rules of Sanskrit grammar. His treatise explains the syntactic and semantic structures in Sanskrit. Understanding the structure of language is one of two main approaches deployed in the late 1950s and 1960s to make machines understand natural language.
Noam Chomsky and many other linguists led the work in this formal language theory. The other paradigm that rose around the same time was that of AI, with researchers such as John McCarthy, Marvin Minsky and Claude Shannon focusing on it. Others worked on statistical algorithms, and some on reasoning and logic.
History of AI
Early work on AI began after World War II, and the name AI itself was coined in 1956. AI can be broadly categorized into four classes – think human, think rational, act human and act rational. The Turing test, proposed by Alan Turing in 1950, set the initial direction for research. A machine that can communicate, store information, use that information and adapt it to other situations can be said to pass the Turing test. A machine that can also perceive and manipulate objects, or move around, can be said to pass the total Turing test.
History of Chatbots
One of the earliest question-answering systems was Baseball, created in 1961. Baseball was a program that could answer questions posed in plain English. It was considered a first step towards computers that answer questions directly from stored data. As the name suggests, the program could answer questions such as “Which teams won 10 games in July?” in the domain of the sport of baseball. The answers were not grammatical sentences; they were either lists or yes/no answers.
Weizenbaum’s ELIZA, built in 1966, was an early chatbot system capable of limited dialogue, mimicking the responses of a Rogerian psychotherapist. It was a very simple program that used pattern matching to process the input and generate a suitable output. ELIZA is sometimes said to have passed the Turing test, as many users of the system believed that they were indeed interacting with a human.
ALICE (Artificial Linguistic Internet Computer Entity), created by Richard Wallace in the 1990s, is a chatbot written in AIML (Artificial Intelligence Markup Language), which can be used to create bot personalities with minimalistic responses. The following years saw several paradigm shifts with the rise of statistical methods, machine learning and the re-emergence of neural networks, all of which led NLP to what it is today.
Breakthroughs in the last decade
Some major advances in the past decade have brought us to where we are today. The first is the computing power that can be packed into smaller, cheaper chips: there has been a tremendous reduction in the size and cost of chips, along with an increase in their ability to do faster calculations.
Although the basis of neural networks has existed for several decades, access to GPUs (graphics processing units), which enable much faster computation, has brought them back to the fore. Deep learning was initially used to solve problems involving images. In the language world, the introduction of the transformer architecture by Google in 2017, built on the attention mechanism introduced in 2014, can be said to be a turning point.
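The attention mechanism at the heart of the transformer can be illustrated in a few lines of code. The sketch below is a minimal, illustrative implementation of scaled dot-product attention in plain Python – real systems use optimized tensor libraries and many attention heads – and the toy vectors at the end are invented for the example.

```python
import math

def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    """Scaled dot-product attention: each query is scored against every
    key, and the output is the score-weighted average of the values."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [dot(q, k) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ])
    return outputs

# One query and two key/value pairs; the query matches the first key,
# so the output leans towards the first value vector.
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[10.0, 0.0], [0.0, 10.0]]
out = attention(q, k, v)
```

Because the weights always sum to 1, the output stays a blend of the value vectors – here it sits closer to the first value, since the query aligns with the first key.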
ChatGPT refers to itself as a language model and can even explain what a language model is, as well as the transformer architecture. Language modeling is the NLP task of predicting the next word given the previous words. It can be thought of as a ‘fill in the blanks’ exercise where the blank is always at the end of the given words. Language modeling, like other NLP tasks, has shifted from statistical methods to deep neural networks. The availability of big data and computation has enabled companies to create what are known as large language models or pre-trained language models.
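The ‘fill in the blanks’ idea can be made concrete with a toy statistical language model. The sketch below is a minimal bigram model in Python – it merely counts which word follows which, nothing like the deep neural networks behind ChatGPT – but the task is the same: predict the next word. The tiny corpus is invented for the example.

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count, for each word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(model, word):
    """Return the most frequent continuation of `word`, or None."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = [
    "the cat sat on the mat",
    "the cat sat on the rug",
    "the cat chased a mouse",
]
model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # "cat" – it follows "the" most often
print(predict_next(model, "sat"))  # "on"
```

A large language model does the same job, except that instead of raw counts it uses billions of learned parameters to score every possible next word given the entire preceding context.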
Founded in 2015, OpenAI introduced GPT (Generative Pre-Training), one such model, in 2018. Its successor GPT-3, launched in 2020, forms the basis for GPT-3.5, which in turn is the basis for ChatGPT. That’s why ChatGPT calls itself a language model.
Although ChatGPT, as seen in the many examples online, has the ability to answer complex questions, it fails at common-sense reasoning. This is because ChatGPT does not really understand the language; all it can do is generate the next word, and thereby sentences and paragraphs. Looking back at the classes of AI, ChatGPT successfully acts like a human with its responses, but it does not think like a human. When a user prompts ChatGPT to pose as a screenwriter and write a script, it generates one because it has seen enough scripts in its training data to know what a script looks like, not because it suddenly became a screenwriter.
To put it in simple words, ChatGPT, although the exact training data has not been released, was trained on about 500 billion tokens (which can be thought of as words, for simplicity). For perspective, imagine reading every piece of text published on the Internet over a period of three years and using it to generate scripts, papers, code and everything else imaginable, without simply copying the original text.
How else does ChatGPT learn?
Reinforcement learning with human feedback
Reinforcement learning (RL), unlike supervised and unsupervised learning, is a system of learning through rewards, and the goal of RL is to maximize the reward. Usually the whole process is automated, with no human involvement. In RLHF, humans provide feedback on the generated responses, which is then used to calculate the reward.
Let us imagine how a child learns a language. Suppose the child says “Me want eat apple”. The parent can correct the child by saying “I want to eat an apple”. This kind of exchange can happen repeatedly, and the child’s language comes closer to the natural way of speaking. At different points in the exchange, the child can also be rewarded: if the child gets the sentence right after the first correction, the whole apple can be given, while one or more slices can be given at other stages of the learning process. Here the apple is the reward. In the case of ChatGPT, there is no child; it is the model that learns, while human annotators provide feedback that is used to calculate the reward as a score.
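The apple-and-child analogy can be sketched as a toy reward loop. The code below is a heavily simplified, bandit-style illustration in Python – not the actual RLHF algorithm behind ChatGPT, which trains a separate reward model and optimizes the language model with policy-gradient methods – and the canned responses and reward scores are invented for the example.

```python
import random

# The "model" picks among canned responses; a human-style reward signal
# scores each pick, and preferences shift towards higher-reward responses.
responses = ["Me want eat apple", "I want apple", "I want to eat an apple"]
human_reward = dict(zip(responses, [0.0, 0.5, 1.0]))  # stand-in for human feedback

prefs = {r: 0.0 for r in responses}  # learned preference per response
lr = 0.1                             # how strongly each reward nudges the preference
random.seed(0)

def pick(prefs, eps=0.3):
    """Epsilon-greedy choice: mostly exploit the best-known response,
    but sometimes explore a random one."""
    if random.random() < eps:
        return random.choice(list(prefs))
    return max(prefs, key=prefs.get)

for _ in range(200):
    choice = pick(prefs)
    reward = human_reward[choice]                 # feedback becomes a score
    prefs[choice] += lr * (reward - prefs[choice])  # move preference towards it

best = max(prefs, key=prefs.get)
```

After a couple of hundred exchanges, the grammatical sentence ends up with the highest preference – the “whole apple” keeps being given for it, so the model keeps producing it.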
Bluff with confidence
Besides the already-mentioned inability to reason, ChatGPT has a few more limitations. The biggest problem is factual accuracy. Since the model does not really understand meaning, it also lacks the ability to verify facts. Although the model may respond with an ‘I don’t know’ to a particular prompt, slightly rephrasing the prompt may force it to generate a response anyway. In other words, it has the ability to bluff with confidence. This is technically known as hallucination.
ChatGPT itself has answered this in response to several questions – it is not connected to the Internet. Although the model looks like a replacement for a search engine, it is far from one. Besides not being connected to the Internet, the model currently cannot provide a source for its answers. Unlike ChatGPT, Bard may be able to circumvent the problem of factuality, as LaMDA was introduced with a mechanism to connect to verified external knowledge sources.
As seen in GPT-3 and other models, the models reflect the training data. Inadvertently, the training data reflects the bias in society. Although there are some checks in place for ChatGPT, people have found ways to break them.
Moreover, cost is not the only concern in training and hosting these models; their environmental impact must also be considered. The energy used to train large models is quite high, and making model training sustainable is a research area in itself. Finally, looking at the copyright lawsuits against image generation models, we cannot yet be certain how copyright will work for language models.
Making us unemployed?
A growing concern is also about AI taking away jobs. Although it may not happen immediately, the possibility cannot be ruled out. If we look at our own history, the Industrial Revolution led to a reduction in jobs in some sectors, but an increase in jobs in others. The only difference between then and now is that this may be the first time in history that we see white-collar jobs being taken away. Jobs like those of programmers, copywriters and graphic designers could be replaced by people who are good at writing prompts for the generative models.
Currently, OpenAI allows the model to be used for free. With the integration of ChatGPT into the Bing search engine, this may not be the case for long. Let’s enjoy our conversations with ChatGPT as we sign up for the new Bing and await the announcement of GPT-4 and the public release of Bard.
(The author is pursuing a PhD in NLP and is a former Senior Software Engineer at Apple, USA. Opinions are personal)