Man vs. Machine: New Study Reveals AI Surpasses Average Human Creativity

A groundbreaking study by Université de Montréal reveals that GPT-4 and other AI models now surpass average human creativity. However, the world’s most creative individuals still outperform the best AI.

Are generative artificial intelligence (AI) systems truly creative? This question has sparked intense debate since the rise of tools like ChatGPT. To find a definitive answer, a research team led by Professor Karim Jerbi from the Department of Psychology at the Université de Montréal has published the largest comparative study ever conducted on the creativity of large language models (LLMs) versus humans.

The study, which included AI pioneer Yoshua Bengio, marks a historic turning point in our understanding of machine intelligence. Published in Scientific Reports (Nature Portfolio), the findings show that while AI has reached a major milestone, the “creative crown” still rests on human heads—for now.

AI Reaches a New Milestone: Surpassing the “Average”

The research team compared the creative output of several high-profile models, including GPT-4, Claude, and Gemini, against a massive database of 100,000 human participants. The results were both surprising and, for some, unsettling.

For the first time, researchers confirmed that some AI models now exceed the average creative performance of humans in tasks involving divergent linguistic creativity. This means that in standard brainstorming or word-association tasks, a model like GPT-4 is likely to provide more creative responses than the average person on the street.

“Our study shows that some AI systems… can now outperform average human creativity on well-defined tasks,” Professor Jerbi explained. However, he was quick to point out that this does not mean the machines have “won” the creative race entirely.

The Human Advantage: Why the Best Still Lead

While the “average” human may find themselves outmatched by an LLM, the most creative individuals still hold a significant lead. The study’s co-first authors, Antoine Bellemare-Pépin and François Lespinasse, revealed that the highest levels of creativity remain a distinctly human trait.

The data highlights a clear hierarchy:

AI vs. Average Humans: Some AI models (specifically GPT-4) outperform the average human participant.

The Creative Half: The average performance of the most creative 50% of human participants still exceeds every AI model tested.

The Elite Creators: The top 10% of the most creative humans maintain a wide and distinct gap between themselves and the best AI systems.

Rigorous Testing: The Framework

To ensure the comparison was fair, the team collaborated with Jay Olson from the University of Toronto to develop a rigorous framework. This allowed the researchers to evaluate both humans and AI using the exact same linguistic tools and metrics.

By testing such a vast number of participants, the study provides a robust benchmark that moves beyond anecdotal evidence. It shows that while AI can match and even enhance the “typical” creative process, it has yet to reproduce the spark of genius found in the world’s most innovative minds.
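As an illustration of what administering the same task to a machine might look like in practice, the sketch below poses a divergent word-association prompt to a language model through the OpenAI Python SDK. This is a minimal sketch, not the study’s published protocol: the prompt wording, model identifier, and sampling temperature are all assumptions made for illustration.

```python
# A minimal sketch, not the study's actual protocol: giving an LLM the same
# kind of word-association prompt that human participants receive, via the
# OpenAI Python SDK (pip install openai). Prompt wording, model name, and
# temperature are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

PROMPT = (
    "Please list 10 single-word nouns that are as different from each "
    "other as possible, in all meanings and uses of the words."
)

response = client.chat.completions.create(
    model="gpt-4",  # one of the models compared in the study
    messages=[{"role": "user", "content": PROMPT}],
    temperature=1.0,
)

print(response.choices[0].message.content)
# The model's ten words would then be scored with the same metric
# applied to the human responses.
```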

How Do We Measure Creativity? The Science Behind Benchmarking Humans vs. AI

As artificial intelligence systems like ChatGPT and Claude become more integrated into our daily lives, scientists are increasingly asking: are these machines truly creative?

By using a combination of psychological tests and complex writing tasks, researchers can now quantify how AI compares to a database of over 100,000 human participants.

The Divergent Association Task (DAT): A Window into the Creative Mind

The primary tool used by researchers to benchmark creativity is the Divergent Association Task (DAT). Developed by study co-author Jay Olson, this tool is specifically designed to measure divergent creativity—the psychological ability to generate a wide variety of original ideas from a single starting point.

How the 10-Word Test Works

The DAT is a deceptively simple test. Participants (whether human or AI) are asked to produce ten words that are as semantically different from one another as possible.

• A Low-Creativity Example: Choosing words that are closely related, such as “cat, dog, bird, fish, hamster.”

• A High-Creativity Example: Selecting words from entirely different categories, such as “galaxy, fork, freedom, algae, harmonica, quantum, nostalgia, velvet, hurricane, photosynthesis.”
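The scoring logic behind these examples can be sketched in a few lines: each word is mapped to a vector in a semantic embedding space, and the creativity score is the average distance between all pairs of vectors. The tiny hand-made vectors below are placeholders for illustration only; published DAT scoring reportedly uses pretrained word embeddings such as GloVe with cosine distance.

```python
# A minimal sketch of the idea behind DAT scoring, not the study's exact
# implementation: words are embedded as vectors, and the score is the
# average pairwise semantic distance, scaled by 100. The toy vectors here
# stand in for real pretrained embeddings such as GloVe.
from itertools import combinations

import numpy as np

TOY_EMBEDDINGS = {
    # closely related pet words cluster together in this toy space
    "cat":     np.array([0.90, 0.10, 0.00]),
    "dog":     np.array([0.80, 0.20, 0.10]),
    "hamster": np.array([0.85, 0.15, 0.05]),
    # semantically distant words point in very different directions
    "galaxy":  np.array([0.00, 0.90, 0.10]),
    "fork":    np.array([0.10, 0.00, 0.90]),
    "freedom": np.array([0.50, 0.50, 0.70]),
}

def cosine_distance(u, v):
    """1 minus cosine similarity: 0 for identical directions, larger when far apart."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def dat_score(words, embeddings):
    """Average pairwise semantic distance across all word pairs, scaled by 100."""
    vectors = [embeddings[w] for w in words]
    distances = [cosine_distance(u, v) for u, v in combinations(vectors, 2)]
    return 100 * sum(distances) / len(distances)

print(dat_score(["cat", "dog", "hamster"], TOY_EMBEDDINGS))     # low score
print(dat_score(["galaxy", "fork", "freedom"], TOY_EMBEDDINGS))  # higher score
```

On these toy vectors, the clustered pet words score far lower than the semantically distant set, mirroring the low- and high-creativity examples above.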

Furthermore, the DAT is highly efficient, usually taking only two to four minutes to complete. Because it is accessible online, it has allowed researchers to gather a massive human dataset for comparison.

Why Words Reflect Complex Thinking

While the DAT is language-based, it is not merely a test of vocabulary. Psychologists have found that a participant’s performance on this 10-word task correlates strongly with performance on other established creativity tests, including measures of creative problem solving, professional writing, and idea generation.

Consequently, the test serves as a proxy for the general cognitive mechanisms of creative thinking. It measures the brain’s ability to “jump” between distant concepts—a hallmark of human innovation.

Testing Real-World Creativity: From Haikus to Movie Plots

To see if high scores on the DAT translated into real-world skill, researchers pushed the AI models into more complex, “natural” creative activities. They directly compared humans and AI across three specific formats:

1. Haiku Composition: Testing the ability to work within strict poetic structures.

2. Movie Plot Summaries: Measuring narrative originality and structural thinking.

3. Short Stories: Assessing the ability to build complex worlds and characters.

Interestingly, while some AI models (like GPT-4) can now outperform the average human on these tasks, the data shows that the most skilled human creators still retain a clear advantage. Machine intelligence can mimic the “average” creative output, but it struggles to match the outlier levels of originality found in top-tier human writers.

Q&A: Deep Dive into the Study

How was “creativity” measured in this study? The researchers focused on divergent linguistic creativity. This involves the ability to generate diverse and original ideas or solutions using language. By using the same tasks for both 100,000 humans and several AI models, they established a direct performance comparison.

Which AI models were the most creative? The study specifically mentioned models such as GPT-4, Claude, and Gemini. Among these, GPT-4 was highlighted as one of the systems capable of exceeding average human performance.

Does this mean AI is “thinking” like a human? Not necessarily. While its output can be more creative than that of an average human, the study frames this advantage as limited to well-defined tasks. AI excels at linguistic patterns, but the “highest levels” of creativity, which likely involve deeper intuition or complex synthesis, remain uniquely human.

FAQ: Frequently Asked Questions

Can AI replace creative professionals? According to the study, AI still falls short of the most creative humans. While it can assist with “average” creative tasks, high-level innovation and the top 10% of creative work still require human input.

Who led this research? The study was led by Professor Karim Jerbi of the Université de Montréal and included contributions from AI pioneer Yoshua Bengio, as well as researchers from Concordia University and the University of Toronto.

What is the “Divergent Creativity” threshold? This is the point at which an AI can generate a wider and more original range of linguistic associations than the average human. The study confirms that some AI models have now crossed this threshold.

Why is this study considered a “turning point”? It is the largest study of its kind, using data from 100,000 participants. It provides scientific evidence that AI is no longer just a tool but a genuine competitor to average human creative performance.
