As AI systems began acing traditional benchmarks, researchers concluded those tests were no longer hard enough. In response, nearly 1,000 subject-matter experts created Humanity's Last Exam, a 2,500-question challenge spanning highly specialized topics across many fields. Questions that current AI models could already answer were filtered out during construction. Early results show even the most advanced systems still struggle, revealing a surprisingly large gap between AI performance and true expert-level knowledge.
A massive new study comparing more than 100,000 people with today's most advanced AI systems delivers a surprising result: generative AI can now beat the average human on certain creativity tests. Models like GPT-4 performed strongly on tasks designed to measure original thinking and idea generation, sometimes outscoring typical human responses. But there is a clear ceiling: the most creative humans, especially the top 10%, still leave AI well behind, particularly on richer creative work such as poetry and storytelling.
New findings challenge the widespread belief that AI is an environmental villain. Analyzing U.S. economic data and AI usage across industries, researchers found that AI's energy consumption, while significant locally, barely registers at national or global scales. Even more surprising, AI could accelerate the development of green technologies rather than hinder it.
Chimpanzees may revise their beliefs in surprisingly human-like ways. In experiments, they switched choices when presented with stronger evidence, demonstrating flexible reasoning. Computational modeling confirmed these decisions weren't merely instinctive. The findings could influence how we think about learning in both children and AI.