Log-normal distribution of number of theorems in the chapters of an ideal textbook
José Manuel Rodríguez Caballero
josephcmac at gmail.com
Sat Apr 9 07:58:12 EDT 2022
Dear FOM members,
Despite our illusion of free will when doing mathematics, some of the
statistical properties of big data from formal systems seem to be quite
regular. For example, Figures 2 and 3 (on dependencies inside formal
theories) from the article
Alama, Jesse, Lionel Mamane, and Josef Urban. "Dependencies in formal
mathematics: Applications and extraction for Coq and Mizar." *International
Conference on Intelligent Computer Mathematics*. Springer, Berlin,
Heidelberg, 2012. https://arxiv.org/pdf/1109.3687v1.pdf
look like approximations of smooth curves. These regularities may be
related to the fact that humans tend to optimize their reasoning so as to
spend less computational effort. If this is the case, keeping the most
useful lemmas at hand seems to be the right heuristic. The following
article proposes several criteria for the notion of usefulness of lemmas:
Kaliszyk, Cezary, and Josef Urban. "Learning-assisted theorem proving with
millions of lemmas." *Journal of symbolic computation* 69 (2015): 109-128.
Let us consider a textbook (a set) containing precisely n true mathematical
statements together with their proofs (for simplicity of presentation, I
will count lemmas and corollaries as theorems). We partition this book into
chapters (disjoint nonempty subsets). The number of such partitions is the
n-th Bell number, and if we ignore the distinction among the elements, we
get the n-th value of the integer partition function.
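As a quick sanity check, both counting functions are easy to compute; here
is a minimal Python sketch (the function names are mine), using the Bell
triangle for Bell numbers and dynamic programming for the partition
function:

```python
def bell(n):
    # Bell triangle: each row starts with the last entry of the previous
    # row; each next entry adds the entry above. B(n) is the first entry
    # of row n.
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
    return row[0]

def partitions(n):
    # p[k] = number of integer partitions of k, built up one allowed
    # part size at a time (classic coin-change style DP).
    p = [1] + [0] * n
    for part in range(1, n + 1):
        for k in range(part, n + 1):
            p[k] += p[k - part]
    return p[n]
```

For instance, a textbook with 5 theorems admits bell(5) = 52 divisions
into chapters, but only partitions(5) = 7 distinct Young diagrams.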
In this framework, it is natural to ask which Young diagram would be
optimal from a complexity-theoretic point of view. More precisely, which
distribution of chapter sizes in the textbook would allow a mathematician,
or an automatic theorem prover, to prove new theorems most efficiently by
decreasing the computational complexity of the search for the most useful
relevant theorems?
Conjecture: The optimal Young diagram, viewed as a histogram, converges to
a log-normal distribution as the number of theorems goes to infinity.
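To make the statement concrete, here is a toy illustration (the parameters
mu, sigma, and the number of chapters are arbitrary choices of mine, not
fitted to anything): sample chapter sizes from a discretized log-normal
and sort them decreasingly, which yields a partition of n, i.e., the row
lengths of a Young diagram.

```python
import random

random.seed(0)  # reproducibility

# Arbitrary illustrative parameters, not fitted to any data.
mu, sigma = 3.0, 0.8
num_chapters = 12

# Discretize each log-normal draw to a chapter size of at least 1 theorem,
# then sort decreasingly to obtain a Young diagram.
chapters = sorted(
    (max(1, round(random.lognormvariate(mu, sigma)))
     for _ in range(num_chapters)),
    reverse=True,
)
n = sum(chapters)  # total number of theorems in the toy textbook
print(f"Young diagram (partition of n = {n}): {chapters}")
```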
I am well aware that this conjecture needs to be formalized, because in its
current form it may be ambiguous; e.g., it is not clear to me which notion
of convergence would be the best one to use.
My deduction of this conjecture is rather empirical and heuristic.
According to the following paper (I disagree with the authors' hypothesis
that this is a human-specific phenomenon; I think it is a
complexity-theoretic one),
Gros, Claudius, Gregor Kaczor, and D. Marković. "Neuropsychological
constraints to human data production on a global scale." *The European
Physical Journal B* 85.1 (2012): 1-5.
> the files size distributions follow a power law for data without a time
> component, like images, and a log-normal distribution for multimedia files,
> for which time is a defining qualia.
Mathematics is more like multimedia than a still image, since the process
of proving a theorem introduces a sequence of events (the steps of the
proof), which is a discrete version of time. Furthermore, by the
Curry-Howard correspondence, we should expect similar statistical
properties in mathematical proofs and in computer programs. I checked the
distribution of the sizes of the source files (extensions .c and .h) of
the Linux kernel (a huge codebase), and I found a log-normal distribution,
as expected.
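For reference, a check of this kind can be reproduced with a short script;
the kernel path below is a placeholder, and the fit is a plain
maximum-likelihood fit of a log-normal (mean and standard deviation of the
log sizes):

```python
import math
import os

def lognormal_fit(sizes):
    """MLE fit of a log-normal: mean and std of log(size)."""
    logs = [math.log(s) for s in sizes if s > 0]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    return mu, sigma

def source_file_sizes(root):
    """Yield the sizes in bytes of all .c and .h files under root."""
    for dirpath, _, names in os.walk(root):
        for name in names:
            if name.endswith((".c", ".h")):
                yield os.path.getsize(os.path.join(dirpath, name))

# Example usage (the path is a placeholder for a kernel source tree):
# mu, sigma = lognormal_fit(list(source_file_sizes("/path/to/linux")))
```

One could then compare a histogram of the log sizes against the fitted
normal density, e.g., with a Q-Q plot or a goodness-of-fit test.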
Considering that the programmers of Linux are experts, the way they
organize their codebase should be statistically similar to the optimal way
a mathematical textbook should be organized into chapters. Of course, in
practice, mathematical textbooks are short in comparison with this
codebase, but in theory, as the number of theorems goes to infinity, both
statistical behaviors should be rather similar.