
Naturally, a model's "confidence" is encoded in activations in layers close to the output, so it may be able to draw on it. Research ([0], [1], [2], [3]) shows that when LLMs are prompted to express their confidence, the stated confidence correlates with their accuracy. The models tend to be overconfident, but in my anecdotal experience the latest models are passably good at judging their own confidence.

[0] https://ieeexplore.ieee.org/abstract/document/10832237

[1] https://arxiv.org/abs/2412.14737

[2] https://arxiv.org/abs/2509.25532

[3] https://arxiv.org/abs/2510.10913
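To make the "correlates but overconfident" claim concrete, here is a minimal sketch of how one might measure calibration of verbalized confidence. It assumes you have already prompted a model to answer questions and self-rate its confidence in [0, 1], yielding (confidence, was_correct) pairs; the sample data below is made up for illustration, and the function is a standard expected calibration error (ECE), not anything specific to the cited papers.

```python
def expected_calibration_error(results, n_bins=10):
    """Bin predictions by stated confidence, then compare each bin's
    average confidence to its empirical accuracy. A perfectly
    calibrated model scores 0; an overconfident one scores higher."""
    bins = [[] for _ in range(n_bins)]
    for conf, correct in results:
        idx = min(int(conf * n_bins), n_bins - 1)  # clamp conf == 1.0
        bins[idx].append((conf, correct))
    total = len(results)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        ece += (len(b) / total) * abs(avg_conf - accuracy)
    return ece

# Hypothetical data: the model states high confidence but is right
# only about half the time there, so its ECE is well above zero.
results = [(0.95, True), (0.95, False), (0.90, True), (0.90, False),
           (0.60, True), (0.60, False), (0.30, False), (0.20, False)]
print(round(expected_calibration_error(results), 3))  # → 0.3
```

The same pairs could instead be fed to a rank correlation (e.g. between confidence and correctness) to test the correlation claim; ECE additionally captures the overconfidence direction.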


