It's interesting how we humans criticize ourselves during our own chains of thought. When I'm writing, I question every word, asking whether it's the right one and running a kind of self-check. Sometimes these self-checks consume far more effort (CPU-time, or rather brain-time) than the statement itself, only to conclude that it is true and can be self-proven.

For current LLMs, that's not quite the case. Of course, you can now watch thoughts refreshing in real time while a chatbot answers, which looks like "thinking", but it isn't. They don't criticize themselves unless you point out that they are wrong. So how do we achieve self-critique?

Basically, it's the same question as: how do we get good, reliable results from chatbots? The short answer: you don't. The longer answer, I believe, lies somewhere between efficient and fully qualified validation of results.
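To make "self-critique" a bit more concrete, here is a minimal sketch of the loop I have in mind: generate an answer, ask the model to criticize it, and revise until the critique passes or the budget runs out. The `call_model` function is a hypothetical stand-in for whatever chatbot API you use, and the whole thing is an assumption about how such a loop could be wired, not a description of how any particular model actually thinks.

```python
from typing import Callable

# Hypothetical stand-in for a chatbot API: takes a prompt, returns the reply text.
ModelFn = Callable[[str], str]


def answer_with_self_critique(call_model: ModelFn, question: str, max_rounds: int = 3) -> str:
    """Generate an answer, then repeatedly critique and revise it."""
    answer = call_model(f"Answer the question:\n{question}")

    for _ in range(max_rounds):
        # Ask the model to act as its own critic.
        critique = call_model(
            "You are reviewing an answer for factual and logical errors.\n"
            f"Question: {question}\nAnswer: {answer}\n"
            "Reply with 'OK' if the answer is correct, otherwise list the problems."
        )
        if critique.strip().upper().startswith("OK"):
            break  # the self-check passed, stop spending "brain-time"

        # Revise the answer, feeding the critique back in.
        answer = call_model(
            f"Question: {question}\nPrevious answer: {answer}\n"
            f"Critique: {critique}\nWrite an improved answer."
        )

    return answer
```

The trade-off from the previous paragraph lives in `max_rounds`: every extra critique pass buys a bit more validation at the cost of more tokens and time, which is exactly the tension between efficient and fully qualified validation.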