A New Artificial Intelligence (AI) Approach Called PromptPG Learns to Select In-Context Examples From a Small Amount of Training Data via Policy Gradient When Interacting With the GPT-3 API

Source: https://arxiv.org/pdf/2209.14610.pdf

The latest advances in the field of Natural Language Processing have enabled us to build intelligent systems with a better and more articulate understanding of language than ever before. Major models such as ChatGPT, PaLM, and DALL-E are constantly improving and showing rapid growth in their performance. These models mimic humans and help perform tasks such as generating content, summarizing long passages of text, answering questions, completing code, and so on. LLMs are trained on large amounts of data and have shown excellent results in almost every domain, including mathematics. They have made progress on mathematical tasks such as mathematical reasoning and Math Word Problems (MWPs).

Although currently available LLMs can solve textual math problems, they still struggle with tabular mathematical data, which requires multi-step reasoning over heterogeneous information. Researchers from the University of California, Los Angeles, the Georgia Institute of Technology, and the Allen Institute for AI have introduced a new approach called PromptPG that handles grade-level mathematical reasoning problems involving both tabular and textual data. The method is based on the policy gradient, an approach to solving reinforcement learning problems.

Policy gradient methods mainly involve three steps: sampling actions, observing rewards, and adjusting the policy. PromptPG applies this idea to select in-context examples from a small amount of training data and then build the prompt for each test problem, learning directly from its interactions with the GPT-3 API. To train and evaluate the model, the researchers behind PromptPG built a new dataset, Tabular Math Word Problems (TabMWP), consisting of 38,431 open-domain, grade-level problems that require reasoning over both textual and tabular data. The dataset contains 28,876 different questions, 6,153 different answers, and 35,442 different solutions. Each problem's tabular context is presented in three formats: an image, semi-structured text, and a structured table. The questions range from free-text to multiple-choice.
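
To make the idea concrete, here is a minimal sketch, in PyTorch, of how a policy-gradient (REINFORCE-style) in-context example selector could work. This is not the authors' implementation: the pre-computed embeddings, the linear scoring network, and the placeholder reward function standing in for a real GPT-3 call are all simplifying assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ExampleSelectionPolicy(nn.Module):
    """Scores candidate in-context examples for a given test problem."""

    def __init__(self, embed_dim: int):
        super().__init__()
        self.scorer = nn.Linear(embed_dim * 2, 1)

    def forward(self, problem_emb, candidate_embs):
        # Pair the test-problem embedding with each candidate embedding.
        paired = torch.cat(
            [problem_emb.expand(candidate_embs.size(0), -1), candidate_embs], dim=-1
        )
        return self.scorer(paired).squeeze(-1)  # one logit per candidate


def train_step(policy, optimizer, problem_emb, candidate_embs, reward_fn, k=2):
    """One REINFORCE update: sample k examples, observe a reward, adjust the policy."""
    logits = policy(problem_emb, candidate_embs)
    dist = torch.distributions.Categorical(F.softmax(logits, dim=-1))
    chosen = dist.sample((k,))            # sample k examples (with replacement, for simplicity)
    log_prob = dist.log_prob(chosen).sum()
    reward = reward_fn(chosen.tolist())   # e.g. +1 if the LLM answers the problem correctly
    loss = -log_prob * reward             # policy-gradient objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward


if __name__ == "__main__":
    # Toy usage: random embeddings stand in for encoded TabMWP problems,
    # and the reward function is a stub instead of an actual GPT-3 API call.
    torch.manual_seed(0)
    policy = ExampleSelectionPolicy(embed_dim=16)
    optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
    problem = torch.randn(1, 16)
    candidates = torch.randn(20, 16)
    reward_fn = lambda idx: 1.0 if set(idx) & {3, 7} else 0.0  # pretend 3 and 7 are helpful examples
    for _ in range(100):
        train_step(policy, optimizer, problem, candidates, reward_fn)
```

In the real setting, the reward would come from whether GPT-3, prompted with the selected examples, produces the correct answer; the policy therefore learns which kinds of examples tend to elicit correct answers without any hand-crafted selection heuristics.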

On the TabMWP dataset, PromptPG achieves a state-of-the-art average accuracy of 68.23%, a 5.31% gain over random example selection. Several pre-trained models were evaluated on TabMWP, including few-shot GPT-3, which performs unstably because it is sensitive to the choice of in-context examples. By learning to select those examples, PromptPG reduces this variance and improves performance without relying on manually designed heuristics.

The PromptPG project also provides a user-friendly dataset explorer with simple filters. The user can choose the type of question to browse, whether free text or multiple choice, and then an answer type from options such as integer, boolean, or decimal. The user can also specify the grade level, the number of rows and columns, and the table title.
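
As a rough illustration of that kind of filtering, here is a small Python snippet over TabMWP-style records. The field names (ques_type, ans_type, grade, and so on) and the example records are assumptions made for illustration, not the dataset's confirmed schema.

```python
# Hypothetical TabMWP-style problem records; the schema below is assumed, not official.
problems = [
    {"question": "What is the total cost?", "ques_type": "free_text",
     "ans_type": "integer_number", "grade": 5, "row_num": 4, "column_num": 2,
     "table_title": "Price list"},
    {"question": "Which stem has the most leaves?", "ques_type": "multi_choice",
     "ans_type": "extractive_text", "grade": 3, "row_num": 6, "column_num": 2,
     "table_title": "Stem-and-leaf plot"},
]


def filter_problems(records, ques_type=None, ans_type=None, grade=None):
    """Return the problems matching every filter that is not None."""
    matches = []
    for rec in records:
        if ques_type and rec["ques_type"] != ques_type:
            continue
        if ans_type and rec["ans_type"] != ans_type:
            continue
        if grade and rec["grade"] != grade:
            continue
        matches.append(rec)
    return matches


# Example: keep only free-text questions with integer answers.
print(filter_problems(problems, ques_type="free_text", ans_type="integer_number"))
```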

PromptPG is a major advance given current LLMs' limitations in solving complex mathematical problems that require reasoning. This approach can boost the performance of the GPT-3 model on such tasks and is undoubtedly a cutting-edge solution.

Check out the Paper, GitHub and Project Page. All credit for this research goes to the researchers on this project.

Tanya Malhotra is a final-year undergraduate at the University of Petroleum and Energy Studies, Dehradun, pursuing a BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with strong analytical and critical thinking skills, along with a keen interest in learning new skills, leading groups, and managing work in an organized manner.
