# Shannon's information theory and foundations of mathematics

Fri Jun 24 23:47:32 EDT 2022

```
According to Shannon, at each moment of time, we are uncertain about the state of the system, and this uncertainty can be described by the probabilities p_i of the different states. The corresponding uncertainty can be described by a single number – the average number of binary questions needed to determine the actual state. This average number is what Shannon called entropy: S = - p1*log_2(p1) - p2*log_2(p2) - …

If we gain additional knowledge about the system, our probabilities change and, as a result, the entropy changes. Usually, the entropy decreases – and it continues decreasing until we reach the situation of full knowledge at which one state has probability 1 and all other states have probability 0, so that the entropy is 0.

Shannon defined the amount of information received by a person (e.g., in a message) as the difference between the old and the new values of the entropy.

Disinformation has the opposite effect: it increases entropy. So, by Shannon's definition, disinformation corresponds to a negative amount of information.

For example, suppose that you have two hypotheses and you are almost convinced that the first hypothesis is true, so its probability is 0.9 and the probability of the second hypothesis is 0.1. In this case, Shannon’s entropy is about 0.47 bits, well below 1 bit. Then someone provides you with disinformation that supposedly provides a strong argument in favor of the second hypothesis. Now, your probabilities change, e.g., to p1 = p2 = 0.5, in which case the new entropy is exactly 1 bit. Your entropy increased – so the difference between the old and the new values of entropy is negative. In other words, disinformation brought you a negative number of bits.

From: FOM [mailto:fom-bounces at cs.nyu.edu] On Behalf Of Vaughan Pratt
Sent: Thursday, June 23, 2022 11:30 PM
To: fom at cs.nyu.edu
Subject: Re: Shannon's information theory and foundations of mathematics

Recently Rohit Parikh suggested to me that disinformation was not information. As I've always considered disinformation about any given proposition to be less likely than the conventional wisdom about it, it seemed to me that with Shannon's information theory, a less likely message contains more information than a more likely one. Hence in particular disinformation should convey more information than the conventional wisdom.

Is there a foundational way of approaching these seemingly conflicting notions of information that isn't too wildly ad hoc?

Vaughan Pratt
```
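
The two-hypothesis example in the reply above (probabilities 0.9 and 0.1 shifting to 0.5 and 0.5) can be checked numerically. The following is a minimal sketch, not part of the original exchange: the function names `entropy_bits`, `information_gained`, and `surprisal_bits` are hypothetical, chosen only for illustration. It also computes the per-message quantity -log_2(p), which is one possible reading of the "less likely message contains more information" notion in the quoted question.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits: S = -sum(p_i * log_2(p_i)).

    Terms with p_i == 0 are skipped, following the usual convention
    that 0 * log(0) is taken to be 0.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)


def information_gained(old_probs, new_probs):
    """Old entropy minus new entropy; negative when entropy increased."""
    return entropy_bits(old_probs) - entropy_bits(new_probs)


def surprisal_bits(p):
    """Information content of a single message of probability p: -log_2(p)."""
    return -math.log2(p)


if __name__ == "__main__":
    before = [0.9, 0.1]  # almost convinced of the first hypothesis
    after = [0.5, 0.5]   # after the disinformation

    print(entropy_bits(before))               # about 0.469 bits
    print(entropy_bits(after))                # 1.0 bit
    print(information_gained(before, after))  # about -0.531 bits

    # The less likely message carries the larger per-message value.
    print(surprisal_bits(0.1), surprisal_bits(0.9))
```

Under these assumptions the two notions measure different things: `information_gained` compares the receiver's uncertainty before and after the message, while `surprisal_bits` depends only on how improbable the message itself was.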