AbsenceBench: Language Models Can’t Tell What’s Missing
https://simonwillison.net/2025/Jun/20/absencebench/#atom-everything
Long context models have been getting increasingly good at passing “Needle in a Haystack” tests recently, but what about a problem in the opposite direction?