Artificial intelligence (AI) is advancing rapidly, especially in the form of large language models (LLMs) – systems that are remarkably good at making sense of, and generating, human-sounding text. But how good are they as reasoners? A collaboration between Amazon and UCLA sheds light on the reasoning strengths and weaknesses of such LLMs, and provides a fascinating, if mixed, portrait of how they perform.
Underpinning this work is a new framework for characterising the reasoning powers of LLMs, built from a set of rigorous experiments designed to probe how these models handle inductive and deductive reasoning. The verdict? LLMs are indeed good at inductive reasoning, but deducing conclusions from logical premises is their weak point.
Inductive reasoning – drawing general conclusions from a particular set of observations – is where LLMs excel. It is like joining up the dots and discerning a larger picture. But the reverse process of deductive reasoning – applying a general premise to reach a precise logical conclusion, as when decoding a message once a Morse-code-style cipher has been specified – is where LLMs fall short.
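To make the distinction concrete, here is a minimal sketch of how the two framings differ when posed to a model. The toy doubling task and the prompt wording are illustrative assumptions, not material from the study: the inductive prompt supplies observations and asks for the rule, while the deductive prompt supplies the rule and asks for one specific conclusion.

```python
# Toy illustration of the two framings; the example task and wording are
# assumptions for illustration, not prompts used in the Amazon/UCLA study.

observations = [("2", "4"), ("5", "10"), ("7", "14")]

# Inductive framing: from particular cases, infer the general rule.
inductive_prompt = (
    "Here are input-output pairs:\n"
    + "\n".join(f"{x} -> {y}" for x, y in observations)
    + "\nWhat general rule maps inputs to outputs?"
)

# Deductive framing: from a stated general rule, derive one specific answer.
deductive_prompt = (
    "Rule: the output is always double the input.\n"
    "Input: 9\n"
    "What is the output? Answer with the number only."
)

print(inductive_prompt)
print()
print(deductive_prompt)
```

According to the study, a model is far more likely to recover the doubling rule from the first prompt than to apply a stated rule flawlessly in prompts of the second kind.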
Another problem in getting the most out of these LLMs is their sensitivity to the precise details of prompt engineering. Feed in a prompt containing irrelevant or biased information, and the model may well echo those errors back, favouring confirmation over correction. All of this brings into sharp focus how much influence the underlying architecture has, especially the attention mechanisms built into transformers, the building blocks of LLMs.
The core of the problem, according to the Amazon researchers, lies in the attention mechanisms that make transformers so good at predicting the next token from the context immediately before it. This strategy works well in many cases, but it can send an LLM off the rails by boosting the salience of recurring tokens and skewing its outputs towards inaccuracy. The tendency of transformers to amplify repeated entities can suffuse logical reasoning with ‘noise’.
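One way to see why repetition can skew a model is with a small numerical toy. The sketch below uses made-up four-dimensional embeddings and plain dot-product attention – it is not the researchers’ analysis – to show how a repeated entity soaks up most of the attention mass in a short context.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Hypothetical toy embeddings; real models learn these vectors.
embed = {
    "alice": np.array([1.0, 0.2, 0.0, 0.1]),
    "bob":   np.array([0.1, 1.0, 0.3, 0.0]),
    "owes":  np.array([0.0, 0.1, 1.0, 0.2]),
    "money": np.array([0.2, 0.0, 0.1, 1.0]),
}

def attention_over_context(query_token, context_tokens):
    """Share of attention each context token receives from the query,
    using plain dot-product attention over the toy embeddings."""
    q = embed[query_token]
    scores = np.array([q @ embed[t] for t in context_tokens])
    return softmax(scores)

# A context in which "alice" is repeated three times.
context = ["alice", "owes", "bob", "alice", "money", "alice"]
weights = attention_over_context("alice", context)

# Summing per entity shows the repeated token accumulating most of the
# attention mass, crowding out the rest of the premise.
per_entity = {}
for token, w in zip(context, weights):
    per_entity[token] = per_entity.get(token, 0.0) + w
print(per_entity)
```

In this toy setup the repeated entity ends up with roughly 70 per cent of the attention, which is the kind of skew the researchers describe as ‘noise’ in a logical argument.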
To address these challenges, the team of researchers at Amazon and UCLA developed a novel variant of zero-shot prompting, built on the recognition that LLMs easily lose focus on the task at hand; the technique reframes the original prompt to narrow the model’s attention. Applied correctly, forcing this narrower perspective might provide one route to overcoming some of the deductive limitations of LLMs.
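As a rough illustration of what narrowing a prompt can look like, the sketch below reframes a deductive task so the model sees only the governing rule and the single case to decide. The function name, template wording and cipher example are hypothetical, not the team’s published technique.

```python
# A minimal sketch of prompt narrowing; illustrative assumptions throughout.

def narrow_deductive_prompt(rule: str, query: str) -> str:
    """Reframe a deductive task so the model sees only the governing rule
    and the single case to decide, with an explicit instruction to ignore
    anything not stated in the rule."""
    return (
        "Apply the following rule exactly as written. "
        "Do not use outside knowledge or examples.\n"
        f"Rule: {rule}\n"
        f"Case: {query}\n"
        "Answer with the conclusion only."
    )

prompt = narrow_deductive_prompt(
    rule="Every message is encoded by shifting each letter forward by 2.",
    query="Decode the message 'jgnnq'.",
)
print(prompt)
# The narrowed prompt would then be sent to whichever LLM is being tested.
```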
These findings have significant implications for the future of AI. By deepening our understanding of how LLMs reason, the research provides a pathway for improving their deductive capabilities, while paving the way for AI systems that can excel at both forms of reasoning. The results highlight the wide range of skills still required to build these systems, and illustrate the benefits of tailored training and careful prompt engineering. It will take a nuanced approach to develop the cognitive faculties of these models. And with AI integrating itself ever more deeply into human life, that refinement will become a progressively more important question to answer.
Amazon’s first AI lab throws a spotlight on the company’s drive for technological innovation. It is competing with Google and others to recruit academics and stay at the cutting edge of AI. With the many thousands of jobs it has created and the colossal resources it commands, Amazon is helping to shape the future of all sorts of commerce, while also making a substantial contribution to our understanding of how artificial intelligence works. Beyond testing the limitations of current AI, work such as this study of LLM reasoning points towards new generations of more nuanced, more reliable and more capable AI. That vision of future human-technology collaboration runs through much of the field’s imagination. At its most idealistic, AI-enabled augmentation of human interaction is supposed to be invisible, second nature and, most crucially, sensitively responsive to the nuances of human thought.
Overall, this research by Amazon and UCLA reveals the cognitive strengths and weaknesses of large language models in rich and nuanced detail. As we enter an exciting era of AI, such insights will be essential to building a future in which AI thinks, reasons and comprehends as intelligently as a human being does.