[FOM] Short or very short Gôdel codes, anyone?

Paul Tarau paul.tarau at gmail.com
Thu Jul 12 15:10:35 EDT 2012

Apparently the general solution to this problem is finding
a bijective Goedel numbering scheme for term algebras (that can also
fit, with minor adaptations, languages like predicate or lambda calculus).

I have just finished a Scala package that does it, see:


The algorithms, using a generalized Cantor bijection between N and N^k 
(known to be polynomial in size of the representations) ensure that:

- only syntactically valid terms are encoded / decoded

- a unique syntactically valid term is associated to each natural number

- a unique natural number is associated to each syntactically valid term

- either way, the bitsize of the representation of the output is proportional 
  (up to a small constant) to the bitsize of the representation of the output

The trickiest steps for getting everything run in low polynomial time is
finding a fast inverse to the generalized Cantor bijection and using an
efficient ranking / unranking function for balanced parentheses languages.

Here is an outline of the algorithm:

First, it extracts a "Catalan Skeleton",
basically the sequence of parentheses
one can see in the string representation if
a term.

Next, it extracts symbol and variable labels
from a term.

Then it encodes the Catalan Skeleton as a natural number
and it encodes the symbol and variable labels
using the generalized Cantor bijection.

Finally it pairs together the two encodings
using the Cantor pairing function into
a natural number uniquely associated to the term.

The decoding from a natural number to a term
proceeds by inverting each of these steps.

For example, terms like:


are uniquely associated to Goedel numbers like

1166589096937670191 and

of comparable representation size.

The usage of the package is explained at:


Paul Tarau

On Jul 12, 2012, at 6:42 AM, Hendrik Boom wrote:

> On Wed, Jul 11, 2012 at 11:14:30PM +0200, Frode Bjørdal wrote:
>> 2012/7/9 <joeshipman at aol.com>
>>> It's easy to have short Godel codings if your arithmetic has
>>> exponentiation as well as addition and multiplication,
>> What is the role of exponentiation?
> It's traditionally used to make pairs in building godel numbers for 
> formulae, the same way cons is used for building s-expressions in Lisp.   
> A pair (a, b) can be encoded like 3^a * 5^b.
> But Cantor pairing could be used instead, which would use multiplication 
> instead.
>>> but your inequality doesn't work because m^n is the number of strings with
>>> exactly n symbols instead of <=n symbols, so some distinct strings would
>>> have to have the same Godel code number.
>> I do not quite understand. Why cannot m^n, n>0, be the number of strings
>> with n or less than n symbols?
> m^n isn't defined as "the number of strings ...".  
> m^n is defined as exponentiation, and as it happens it *is* thhe number 
> of strings with n symbols.
> The number of strings with n or fewer symbols is bounded by (m+1)^n, 
> though.
> -- hendrik
> _______________________________________________
> FOM mailing list
> FOM at cs.nyu.edu
> http://www.cs.nyu.edu/mailman/listinfo/fom

More information about the FOM mailing list