doi: 10.5281/zenodo.20528142
A blind evaluation of LLM behaviour against CGT predictions, reported as an honest negative result.