Posts tagged: large language models

As LLMs grow bigger, they're more likely to give wrong answers than admit ignorance
Performance of a selection of GPT and LLaMA models with increasing difficulty. Credit: Nature (2024). DOI: 10.1038/s41586-024-07930-y

A team of AI researchers at the Universitat Politècnica de València in Spain has found that as popular large language models (LLMs) grow larger and more sophisticated, they become less likely to admit to a user that they do not know an answer.

In their study, published in the journal Nature, the group tested the latest versions of three of the most popular AI chatbots, examining their responses, their accuracy, and how good users are at spotting wrong answers.

As LLMs have become mainstream, users have become accustomed to using them for tasks such as writing papers, poems, or songs and solving math problems, and the issue of accuracy has become a bigger...

Language agents help large language models 'think' better and cheaper
An example of the agent producing task-specific instructions (highlighted) for the IMDB classification dataset. The agent only runs once to produce the instructions. Then, the instructions are used for all our models during reasoning. Credit: arXiv (2023). DOI: 10.48550/arxiv.2310.03710

The LLMs that have increasingly taken over the tech world are not “cheap” in many ways. The most prominent LLMs, such as GPT-4, cost some $100 million to build, counting the legal costs of accessing training data, the computational cost of training what could be billions or trillions of parameters, the energy and water needed to fuel computation, and the many coders developing the training algorithms that must run cycle after cycle so the machine will “learn.”

But, if a researcher needs to do a specializ...

SMU researchers to present new tool for enhancing AI transparency and accuracy at IEEE conference
Clark and Buongiorno’s research explores GAME-KG’s potential across two demonstrations. The first uses the video game Dark Shadows. Credit: SMU

While large language models (LLMs) have demonstrated remarkable capabilities in extracting data and generating connected responses, there are real questions about how these artificial intelligence (AI) models reach their answers. At stake are the potential for unwanted bias and the generation of nonsensical or inaccurate “hallucinations,” both of which can lead to false data.

That’s why SMU researchers Corey Clark and Steph Buongiorno are presenting a paper at the upcoming IEEE Conference on Games, scheduled for August 5-8 in Milan, Italy...

AI study reveals dramatic reasoning breakdown in large language models
Strong fluctuations across AIW problem variations. Even for higher performers, e.g., GPT-4o, GPT-4, and Claude 3 Opus, correct response rates vary strongly from close to 1 to close to 0, despite only slight changes introduced in the AIW variations (one color per variation 1–4). This clearly shows a lack of model robustness, hinting at basic reasoning deficits. Credit: arXiv (2024). DOI: 10.48550/arxiv.2406.02061

Even the best AI large language models (LLMs) fail dramatically when it comes to simple logical questions. This is the conclusion of researchers from the Jülich Supercomputing Center (JSC), the School of Electrical and Electronic Engineering at the University of Bristol and the LAION AI laboratory.

In their paper posted to the arXiv preprint server, titled “Alice in Wonderland: S...
