a foundational approach to artificial hallucinations in reinforcement learning

José Manuel Rodríguez Caballero josephcmac at gmail.com
Tue Mar 7 06:05:41 EST 2023


Dennis wrote (regarding the Turing test of ChatGPT, in which there were
absurd references to the literature):

> I suggest that the Turing Test has not been passed, unless it is about
> being credible but not defensible, unless the impersonation is of a fraud.


In the realm of mathematics, the phenomenon of artificial hallucinations
has been observed and documented [1, 2, 3], yet its cause remains an open
problem. My hypothesis is that the phenomenon may arise from
self-organized criticality in complex systems. Self-organized criticality
in neural networks has been investigated in the literature [7, 8, 9], but I
am not sure to what extent it has been linked to artificial hallucinations.

To investigate this, we may look for an analogue of the Gutenberg–Richter
law (which originated in seismology). The analogue would state that the
logarithm of the number of hallucinations of magnitude greater than or
equal to M is approximately equal to b - a*M, where a and b are positive
real numbers. (In seismology the law is usually written log10 N = a - b*M,
so the roles of a and b are swapped here.) It has been proposed that
similar phenomena occur in the human brain [4].
While it is true that there are differences between earthquakes and
artificial hallucinations, these differences are not necessarily
significant enough to invalidate the analogy. In fact, one of the strengths
of complex systems theory is its ability to identify similarities and
patterns across seemingly disparate phenomena.
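As a rough illustration of how the proposed law could be checked, the
following minimal sketch fits a and b by least squares on log10 N(M). It
assumes that a numerical "magnitude" has already been assigned to each
hallucination, which the argument above leaves open; the synthetic sample,
the thresholds, and all function names are hypothetical.

```python
import math
import random


def cumulative_counts(magnitudes, thresholds):
    """Return N(M): how many events have magnitude >= M, for each threshold M."""
    return [sum(1 for m in magnitudes if m >= t) for t in thresholds]


def fit_gutenberg_richter(magnitudes, thresholds):
    """Least-squares fit of log10 N(M) = b - a*M; returns (a, b).

    Thresholds with N(M) = 0 are skipped because log10(0) is undefined.
    """
    counts = cumulative_counts(magnitudes, thresholds)
    points = [(t, math.log10(n)) for t, n in zip(thresholds, counts) if n > 0]
    if len(points) < 2:
        raise ValueError("need at least two thresholds with non-zero counts")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept  # a is minus the slope, b is the intercept


# Synthetic magnitudes (an assumption, not real data): exponentially
# distributed, so log10 N(M) is linear in M by construction.
rng = random.Random(0)
true_a = 1.0
sample = [rng.expovariate(true_a * math.log(10)) for _ in range(20000)]
thresholds = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5]
a_hat, b_hat = fit_gutenberg_richter(sample, thresholds)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

On real data, a clear departure of log10 N(M) from a straight line would
count against the analogy, so the fit is only a first sanity check.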

In the foundations of mathematics, self-organized criticality can be
pictured as follows. A mathematician attempts to develop a formal system
encompassing all of mathematics, as Gottlob Frege once did. The
mathematician then discovers an inconsistency, such as the paradox found
by Bertrand Russell. Unlike Frege, this hypothetical mathematician may
work paraconsistently and would not abandon his pursuit because of such a
contradiction. Rather, he would continue to develop the formal system while
striving to avoid the contradiction, as is done in naive set theory.
Positive and negative feedback would come from an engineer friend, who
would report whether the mathematics being developed is practical and
functional, as in the method of reinforcement learning [5] used in machine
learning.

However, should the engineer no longer be present to provide feedback, the
mathematician may continue to develop the mathematics without external
guidance. Initially, the mathematician may continue to make progress, but
eventually, he may encounter contradictions. To resolve these
contradictions, the mathematician may arbitrarily select one side of the
contradiction as true and the other as false and continue to develop the
mathematics. Over time, the structure of the formal system becomes
increasingly fragile, with contradictions occurring more frequently. At
this point, the system is said to have spontaneously organized itself into
a critical state. Finally, once criticality is reached, the mathematics
generated may appear to be the result of a random number generator. Using
the language of statistical physics, this system is produced by
reinforcement learning and will, if left alone, end up in a state of
maximum entropy, or what S. Wolfram refers to as computational
irreducibility. Recently, Wolfram has linked this concept with the second
law of thermodynamics [6].

With regard to the notion of a formal system spontaneously organizing
itself into a critical state, I agree that this is a somewhat controversial
idea. However, it is worth noting that self-organized criticality has been
proposed for many natural and social systems, such as forest fires and
financial crashes. The idea that formal systems used in the foundations of
mathematics may exhibit similar behavior is not necessarily far-fetched,
and further research in this area could be fruitful.
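For readers unfamiliar with self-organized criticality, here is a minimal
sketch of the Bak-Tang-Wiesenfeld sandpile, the standard toy model of the
phenomenon. It is not taken from the discussion above and says nothing
about formal systems; it only shows how a purely local rule, driven slowly
and without tuning, produces avalanches of widely varying sizes. The grid
size, grain count, and function names are arbitrary choices.

```python
import random


def drop_grain(grid, row, col):
    """Add one grain at (row, col), relax the pile, return the avalanche size.

    A site holding 4 or more grains topples: it loses 4 grains and each of
    its neighbours gains one; grains pushed past the boundary are lost.
    The avalanche size is the total number of topplings.
    """
    size = len(grid)
    grid[row][col] += 1
    unstable = [(row, col)]
    topplings = 0
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < 4:
            continue  # already relaxed by an earlier toppling
        grid[r][c] -= 4
        topplings += 1
        if grid[r][c] >= 4:
            unstable.append((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size:
                grid[nr][nc] += 1
                if grid[nr][nc] >= 4:
                    unstable.append((nr, nc))
    return topplings


def simulate(size=20, grains=5000, seed=0):
    """Drop grains at random sites; return the final grid and avalanche sizes."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    avalanches = []
    for _ in range(grains):
        row, col = rng.randrange(size), rng.randrange(size)
        avalanches.append(drop_grain(grid, row, col))
    return grid, avalanches


grid, avalanches = simulate()
print("largest avalanche:", max(avalanches))
print("grains that caused no toppling:", avalanches.count(0))
```

Whether the accumulation of contradictions in a formal system behaves like
such a pile is exactly the open question raised in this message.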

Thus, the questions we must ask ourselves are: how can the foundations of
mathematics prevent the degeneration of the structure of pure mathematics
when there is no external feedback? And can the methods from the
foundations of mathematics be applied to the field of reinforcement
learning to avoid artificial hallucinations?

Finally, I would like to address what Patrick wrote:

> The question of proof is indeed far from trivial and mathematics will
> never allow languages like Isabelle/HOL to be like a supersupervisor or
> "oberschwester" for best practice in mathematical proof mechanization.


Perhaps using Isabelle/HOL as a supervisor or "oberschwester" of a system
trained by reinforcement learning could prevent some real disasters, for
example if the artificial intelligence controls a nuclear power plant and
suffers from an artificial hallucination.
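The supervisor idea can be sketched as a gate between the learned policy
and the plant: an action is executed only if a checker certifies it, and a
known-safe fallback is used otherwise. The sketch below is an assumption
about how such a gate might look, not an Isabelle/HOL interface; the
checker is an ordinary Python predicate standing in for an invariant that
would have to be proved in the proof assistant, and the plant model and all
numbers are made up.

```python
from typing import Callable


def supervised_step(
    state: float,
    propose: Callable[[float], float],
    is_certified_safe: Callable[[float, float], bool],
    fallback: float,
) -> float:
    """Return the proposed action if the checker certifies it, else the fallback."""
    action = propose(state)
    return action if is_certified_safe(state, action) else fallback


# Hypothetical toy plant: the state is a core temperature and the action is
# a change in control-rod withdrawal. The numbers are purely illustrative.
MAX_TEMPERATURE = 600.0


def toy_checker(temperature: float, action: float) -> bool:
    """Stand-in for a verified invariant: the predicted temperature stays bounded."""
    predicted = temperature + 50.0 * action
    return predicted <= MAX_TEMPERATURE


def hallucinating_policy(temperature: float) -> float:
    """A policy that proposes a large withdrawal regardless of the state."""
    return 5.0


print(supervised_step(300.0, hallucinating_policy, toy_checker, fallback=-1.0))
print(supervised_step(500.0, hallucinating_policy, toy_checker, fallback=-1.0))
```

The gate is only as trustworthy as the checker and the plant model behind
it, which is where a proof assistant would have to do the real work.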

Kind regards,
Jose M.

References
[1] Ji, Ziwei; Lee, Nayeon; Frieske, Rita; Yu, Tiezheng; Su, Dan; Xu, Yan;
Ishii, Etsuko; Bang, Yejin; Dai, Wenliang; Madotto, Andrea; Fung, Pascale
(November 2022). "Survey of Hallucination in Natural Language Generation"
(PDF). ACM Computing Surveys. Association for Computing Machinery.
doi:10.1145/3571730. S2CID 246652372. Retrieved 15 January 2023.

[2] Nie, Feng; Yao, Jin-Ge; Wang, Jinpeng; Pan, Rong; Lin, Chin-Yew (July
2019). "A Simple Recipe towards Reducing Hallucination in Neural Surface
Realisation" (PDF). Proceedings of the 57th Annual Meeting of the
Association for Computational Linguistics. Association for Computational
Linguistics. doi:10.18653/v1/P19-1256. Retrieved 15 January 2023.

[3] Dziri, Nouha; Milton, Sivan; Yu, Mo; Zaiane, Osmar; Reddy, Siva (July
2022). "On the Origin of Hallucinations in Conversational Models: Is it the
Datasets or the Models?" (PDF). Proceedings of the 2022 Conference of the
North American Chapter of the Association for Computational Linguistics:
Human Language Technologies. Association for Computational Linguistics.
doi:10.18653/v1/2022.naacl-main.38. Retrieved 15 January 2023.

[4] Osorio, I., Frei, M. G., Sornette, D., Milton, J., & Lai, Y. C. (2010).
Epileptic seizures: quakes of the brain? Physical Review E, 82(2),
021919.

[5] Sutton, Richard S., and Andrew G. Barto. Reinforcement learning: An
introduction. MIT Press, 2018.

[6] Wolfram, Stephen, "Computational Foundations for the Second Law of
Thermodynamics", Stephen Wolfram Writings. URL:
https://writings.stephenwolfram.com/2023/02/computational-foundations-for-the-second-law-of-thermodynamics/

[7] Katsnelson, Mikhail I., Vitaly Vanchurin, and Tom Westerhout.
"Self-organized criticality in neural networks." arXiv preprint
arXiv:2107.03402 (2021).

[8] Droste, Felix, Anne-Ly Do, and Thilo Gross. "Analytical investigation
of self-organized criticality in neural networks." Journal of The Royal
Society Interface 10.78 (2013): 20120558.

[9] Levina, Anna. A mathematical approach to self-organized criticality in
neural networks. Dissertation, University of Göttingen, 2008.