[FOM] Simple Turing machines, Universality, Encodings, etc.

Wed Oct 31 09:49:44 EDT 2007

(I'm replying to a proposed counterexample to a definition of universality I use in my proof; a specific explanation as to how the counterexample is based on inconsistent assumptions is given as part of this message, but it's mostly general discussion in an attempt to pin down how the problem arose in the first place and how the definition of 'universality' (and 'Turing machine', for that matter) isn't very clear at all.)

The real problem here seems to be that the definition of universality has somehow got confused with the definition of things like 'Turing machine' and 'linear bounded automaton' to some extent.

For example, consider the following definition of a system:

"The system has a tape with elements each of which can contain a finite number of possible 'colours'; one of those elements is 'active' at any given moment, and the system can be in one of a finite number of states. There is a fixed rule that is a function that maps the present situation of the system (the colour of each tape element, which tape element is active, and what state the system is in) to the 'next' situation of the system, and the system continues applying this rule forever, iterating constantly from one situation to the next. The mapping function is limited in that it never changes the active tape element by more than one element along the tape, it doesn't change any tape elements but the previously active element, and non-previously-active elements don't affect the output of the function with respect to the colour other elements, which element is active, and which state the system is in."

Is there at least one mapping function that makes this system universal?

One possible answer to this is that there is no way that the system can halt according to the definition I've given, so it cannot be universal. For me, this is an unsatisfactory definition of 'universal', because it excludes a large number of systems that seem as they ought to be universal, including some declared as universal in the past (the rule 110 cellular automaton, for instance); and it's trivially possible to come up with a 'halt-equivalent' situation (a tight infinite loop, for instance).

One possible answer to this is that the system itself hasn't been fully specified, so there is insufficient information to answer the question. After all, it hasn't been stated whether the tape is finite or infinite, what sort of information or pattern could originally be on the tape, or how input, if any, is provided to the system. It is this sort of answer that leads to the current results of "Turing machines can be universal, linear bounded automata can't be"; the answer considers the rules about what can be on the tape to be part of the system in the first place. So a Turing machine has an infinite tape, and a linear bounded automaton has a tape with a length proportional to that of the input. On the other hand, there is still a lot of underspecification here; what sort of pattern is allowed on the tape if the tape is infinite, for instance. Must it be blank (i.e. solid-colour)? This seems to preclude any input being given. It seems intuitive to me (but I think that this point may be disputed) that a system can be universal even if it accepts no input; it could, for instance, emulate all possible cyclic tag systems in parallel, so that given enough time it will have computed the evolution of any cyclic tag system to any desired number of steps, which to me really seems as though it should be considered 'universal'. Should the input be encoded into a finite amount of space and then surrounded by solid colour? Can the input be repeated endlessly? What about if the input is preprocessed by some non-universal system? The problem here is trying to find the boundary between which of these options are part of the definition of the system, and which are part of the definition of universality, and this is directly relevant to whether my proof is proving universality of the 2,3 Turing machine or whether it is proving something else. I deliberately designed the definition of the system so that it included both Turing machines and linear bounded automata; these are considered different classes of systems (I think most people here, including me, agree that Turing machines can be universal but linear bounded automata can't be), so this demonstrates that the finiteness or infiniteness of the tape must be part of the definition of the system with current thinking, rather than the definition of universality.

So this brings me on to a direct reply to Vaughan Pratt, and in particular, to this:

> In the case of an LBA and a one-counter machine, essentially your setup,
> I claim that when the one-counter machine does nothing but keep
> incrementing its counter and at the n-th step put a marker on the tape n
> places from the end of the input to delimit the end of the tape, this
> combination is already sufficient for universality."

This setup has the counter continuously 'feeding tape' to the linear bounded automaton, so that the amount of tape that the linear bounded automaton has is essentially infinite. In other words, this is a demonstration that a linear bounded automaton can act in a universal manner if given an unlimited amount of tape, and I don't dispute this fact. On the other hand, I've established that with the current definitions, as I understand them to be, a linear bounded automaton cannot be given an infinite amount of input; the input is on the tape to start with, and the tape length is proportional to the input and finite. (The tape cannot be infinite, because it's already established that the only difference between a linear bounded automaton and a Turing machine is the finiteness of the tape; any proof that linear bounded automata cannot be universal must assume that the input, and therefore the tape, is finite, because otherwise counterexamples would be trivial to find, and this assumption must come from somewhere; stated as a limitation on the proof, as part of the definition of a linear bounded automaton, or as part of the definition of universality used. I'm not sure where to find a copy of the original proof that linear bounded automata cannot be universal; presumably someone on this list will know, and as part of this discussion it will definitely be worth checking whether the initial proof mentioned this limitation on finiteness in the definition of universality, in the definition of a linear bounded automaton, separately, or is fallacious because it contains it as an unwarranted hidden assumption (although I doubt the last of these possibilities is true).

This discussion shows that when writing a universality proof, it's important to separate the definition of universality from the definition of the system. In the case of the 2,3 Turing machine prize, the problem stated the rules of the Turing machine and not much else as far as the system itself was concerned, and (I think deliberately) left the definition of universality somewhat vague. I used a definition of universality that allowed any input that could be produced in a non-universal manner to count as legitimate; the rules of the prize stated that whether a definition was accepted was at the discretion of the judges, and presumably they accepted my definition in this case because I was announced as having won the prize.

Whether this definition is reasonable or too general is the point in question; the counterexamples suggested in the message I originally replied to would suggest not, but they are actually demonstrating the problems caused by blurring of the borders between the definition of universality and the system. The definition of 'linear bounded automaton' that is presumably being used is that of a system that accepts only a finite amount of input; the definition of 'universal' that's being shown to conflict with this is one that allows an infinite amount of input to be given to a system, and the counterexample uses the situation in which an infinite amount of input is actually being used. Two statements are being shown to conflict here ('a linear bounded automaton cannot be universal' and 'a non-universal system when given a non-universal preprocessor is non-universal'); however, it's not surprising that they conflict, as they are based on contradictory assumptions. (The first statement is only true if linear bounded automata, or its definition of universal, are restricted to finite input; the counterexample passes an infinite amount of data from the preprocessor to the main system, and therefore assumes that an infinite amount of input is allowed.) I think that the counterexample is therefore invalid because it contains contradictory assumptions.

Back to my original question: Is there at least one mapping function that makes the system I defined near the start of this message universal? I think that there is a third possible answer to this question that hasn't been discussed yet, which is "yes with some definitions of 'universal' and no with others". In this situation, the system is well-defined, and what sort of tape length and input can be used with the system is instead part of the definition of universality; how to deal with the lack of a halt state is another question that should be addressed here. In the below, a 'halting process' is a process that maps input to output and can be proven in advance to always generate a finite amount of output, followed by some signal that it's 'finished' generating output, in a finite amount of time. With this view (which seems conceptually neater and clearer to me, even though it's likely to be controversial), the question can be answered:
- no, if the system needs to be able to actually stop running, rather than just do something equivalent, as a necessary condition to be considered universal (the system can't halt!)
- no, with a finite-length solid-colour tape and the definition of what is considered universal behaviour is the same as in the first 'yes' point (this is pretty intuitive, as a universal system has to be able to support an infinite amount of data)
- no, if the tape consists of nothing but a finite-length encoding of the input produced by a halting process which has a length proportional to that of the input (i.e. 'linear bounded automata are non-universal'; this needs defining slightly more precisely because 'length of the input' is vague, but it could be made rigourous by specifying bits or other similar units of data; I'm being silent on the precise definition of universal behaviour needed for this point to be correct because I don't know how general it can be and still leave the answer as 'no')
- yes, if the tape is an infinite solid-colour tape, and the system is considered universal if the evolution of any Turing machine to any given number of steps can be deduced from the sequence of situations of the system by a process that can be proven in advance to halt (this shows that a system can be universal without input; I haven't proven this yet, but based on my programming experience it seems obvious to me that this is true; is it obvious to others? Is my intuition on this wrong?)
- yes, if the tape consists of a finite-length encoding of the input produced by a halting process which has a length proportional to that of the input surrounded by an infinite number of tape elements that are all the same colour, and universal behaviour is considered to be that for which the system is considered to 'halt' when it enters a situation that's the same as a situation it's previously been in and the system 'halts' for those inputs that represent with some suitable encoding tag systems which eventually halt, and doesn't for inputs that represent tag systems that don't (the original universal Turing machine is an example for this definition of 'universal'; again, due to imperfect knowledge, I haven't defined 'suitable encoding')
- yes, if the tape consists of an infinite number of repetitions of a sequence of elements A, followed by a sequence of elements B, followed by an infinite number of repetitions of a sequence of elements C, each of the sequences A, B, and C is deduced by a halting process, and universal behaviour is considered to be that for which the system is considered to 'halt' when it enters a situation that's the same as a situation it's previously been in, except for a possible translation of all elements and the active element, and the system 'halts' for those inputs that represent with some suitable encoding cyclic tag systems which eventually halt, and doesn't for inputs that represent cyclic tag systems that don't (yes because this is a generalisation of the previous definition because tag systems can be converted to cyclic tag systems by a halting process; I mention this definition because when applied to cellular automata rather than Turing-machine-like-systems, it's the definition that allows rule 110 to be universal whereas I suspect none of the previous definitions do)
- yes, if the tape consists of an infinite sequence of elements all of which are generated from a finite amount of input by a non-universal rule, and universal behaviour is considered to be that where for some definition of 'halting' that can be calculated by a non-universal process from the sequence of situtations the system goes through, the system 'halts' for those inputs that represent with some suitable encoding cyclic tag systems which eventually halt, and doesn't for inputs that represent cyclic tag systems that don't (this definition allows the 2,3 Turing machine in question; one potential problem with it is recursive and shows no obvious way to bottom out, but luckily non-universality of a system with respect to any reasonable definition can often be proved pretty conclusively, like in the proof I gave for the 2,3 Turing machine)
- yes, if the tape consists of an infinite sequence of elements all of which are generated from a finite amount of input that represents with some suitable encoding a cyclic tag system by a non-universal rule, and system is considered universal if the evolution of that cyclic tag system to any given number of steps can be deduced from the sequence of situations of the system by a process that can be proven in advance to halt (this definition also allows the 2,3 Turing machine in question)

It's interesting to see how some other systems fare under similar definitions to these; tag systems and cyclic tag systems, for instance, are capable of 'generating' more tape as they run, unlike the Turing-machine-like systems I've been discussing for most of this, so they are universal under all definitions except possibly the first one (which is only partially defined). Does this mean that they're 'more universal' than Turing-machine-like systems? Again, that's a matter of definition and a matter for debate. If they aren't considered more universal, then either the 2,3 Turing machine isn't any 'less universal' than a cyclic tag system, or there's a cutoff point which has to be taken somewhere in the list of definitions above, and the selection of such a cutoff point seems hard to me to do in a non-arbitrary manner. (There are many possible definitions available 'between' in inclusiveness the ones I've given here, and I suspect that inclusiveness in universality definitions is weakly rather than stricly ordered, both of which make pinning down a cutoff point harder and looking like a less sensible thing to do.) The rule 110 cellular automaton has not yet been proven to meet any definitions of universality before the sixth one I've given here; it may or may not be universal with the fifth and isn't with the fourth. S/K combinator calculus doesn't take input at all, so is universal with the second and fourth definitions, maybe with the third, but not with the fifth, sixth, seventh, or eighth definitions here. In other words, none of the definitions here admit both the rule 110 cellular automaton or S/K combinator calculus as being universal - the flaw shown here is that some of the definitions assume that giving input is needed, and others disallow it. Here's a ninth definition that I believe allows both (and also the 2,3 Turing machine, the original universal Turing machine, etc.):

- yes, if the tape consists of a (fixed and non input-dependent) infinite sequence of elements calculated by a non-universal process, and the system is considered universal if the evolution of any Turing machine to any given number of steps can be deduced from the sequence of situations of the system by a process that can be proven in advance to halt

The question here is, is this definition too general? I find it hard to think of a more general definition of universality that doesn't end up admitting pretty much every system, even the 'obviously non-universal' systems such as finite state automata (which can only move one way along the tape), as universal; on the other hand, 'calculated by a non-universal process' will get more restrictive as the definition of universality is opened up. (There is also a problem here with how the recursive definition gets started in the first place, which may reduce the rigour of the definition.) Also, this approach arguably has taken too much information out of the definition of the system; where is that boundary drawn? This is really the fundamental problem that I think is worth looking at with respect to the definition of universality. Here are some pairs of systems:

Turing machines with repetitive input tapes and Turing machines with non-universally-constructed input tapes
Turing machines with blank input tapes and Turing machines with repetitive input tapes
Turing machines with a tape that's all colour 0 and Turing machines with a tape that's all colour 1
Turing machines where input is originally placed on the tape and Turing machines where no input is given
Turing machines that accept an infinite amount of input and linear bounded automata that accept an infinite amount of input
Linear bounded automata where input is originally placed on the tape and linear bounded automata where no input is given
Turing machines and linear bounded automata
Turing machines and push-down automata
Turing machines and finite-state machines

The question is, for which of these pairs is it the same system operating under different assumptions about what counts as legitimate input for the system for a universality proof, for which pairs is it two different systems, and which of these pairs contain contradictory assumptions that make them meaningless? I've deliberately put some examples in above to try to blur the boundary, and I'd be interested where various people draw it.

The overall point I'm making is that universality is a hard concept to pin down, and I suspect that there may never be a definition that fits everyone's intuition as to whether a system should be considered universal or not. Likewise with defining what information should be included as part of a system, or group of systems, and what is part of its initial condition.

--
Alex Smith

  _____

From: fom-bounces at cs.nyu.edu on behalf of Vaughan Pratt
Sent: Tue 30/10/2007 22:56
To: Foundations of Mathematics
Subject: Re: [FOM] Simple Turing machines, Universality, Encodings, etc.

The following was originally in private reply to a message Alex Smith
sent me cc'ing Stephen Wolfram and Todd Rowland, whom I cc'd likewise in
my reply.  Now that Alex has in the meantime subscribed to FOM and
posted his message there, I'll follow suit (i.e. both Alex and I have
sent our messages twice, once to each other and once to FOM).  --Vaughan

=========

To: Alexander Smith <AIS523 at bham.ac.uk>
CC: s.wolfram at wolfram.com,  rowland at wolfram.com

Dear Alex,

Pleased to make your acquaintance, if only electronically.  I appreciate
the opportunity to go over your proof with you directly rather than the
more complicated approach of going through intermediaries.

If your argument is correct then it is a very impressive piece of work
and you deserve both the prize and the fame that goes with it.  In
either case it is clear that you enjoy the challenge of difficult
problems and I hope that whatever the outcome you will be encouraged to
attack other problems with equal enthusiasm.  Many who've been rebuffed
by one problem have bounced back to successfully solve others and make
their reputation in due course, the occasional failure notwithstanding.
(And some have perfect track records perhaps, but those are very rare
indeed---certainly not me!)

I should also add, please don't take my question about how your proof
"got through the filter" as directed at you.  It is the judges'
responsibility to make the determination of correctness.  Ideally your
proof will turn out to be correct, but if as I argue below it is not
then you will have made an error but this often happens.  This
particular error while elementary is sufficiently subtle as to often
trip people up the first time they encounter it.  It is a much bigger
fault when the judges of such a competition fail to catch this sort of
well-known fallacy, and this was the basis for my opening question in my
post.

In the case of two PDAs, which was the first thought I had when I saw
your argument, I haven't considered whether the requirement that the
communication between the two machines be one-way and off-line (i.e.
performed at the start) is sufficient to inhibit universality, though it
seems plausible to me that this is the case (but then it would need to
be proved).

> For instance, a universal machine can search for counterexamples to
> the Riemann hypothesis; the linear-bounded-automata+preprocessor
> system cannot,

Oh but it can.  You underestimate the capabilities of a linear bounded
automaton.

In the case of an LBA and a one-counter machine, essentially your setup,
I claim that when the one-counter machine does nothing but keep
incrementing its counter and at the n-th step put a marker on the tape n
places from the end of the input to delimit the end of the tape, this
combination is already sufficient for universality.

Let S be the common tape alphabet of all the following machines, and let
d be a symbol in S.  Let T be any Turing machine.  Construct T' which
simulates T but which diverges (permanently enters a non-halting state)
the first time the head arrives at a cell containing d.  Construct T''
from T' by prefixing the program of T' with a short routine that first
goes to the end of the tape, writes d one cell beyond the end to serve
as a delimiter, returns to the beginning of the tape, and proceeds to
simulate T'.

Now T' accepts the same set of words on S\{d} (S less symbol d) as T,
whence if T never writes d, then if T is universal on S\{d} so is T'.
T'' on the other hand uses no more tape than the length of the input,
and is therefore a linear bounded automaton.

The system you run for finitely many steps corresponds to T' in this
argument---you have a machine that an encoder can communicate an initial
condition to and it then runs for n steps.  The team formed by T' and
the encoder at each stage then amounts to T'' each time except that at
each stage T'' is given a different input, one cell longer than before
(in my abstraction of your setup where the encoder keeps moving the end
out one more cell at a time).  Although T'' itself is only an LBA,
repeatedly running it on these increasingly longer inputs allows it to
simulate any computation of T' (and hence T) that does not need more
than that many tape cells.  If T eventually halts, so will T'' on a
sufficiently long input.

Now this all seems to support your argument, since in order to create
T'' having the property that it is universal we had to start with the
assumption that T and hence T' is universal.

However we only started with a universal T in order to prove that there
could exist a T'' with the property that for each input w accepted by T
(and hence by T') there exists a sufficiently large initial condition
allowing T'' to accept w, and conversely that if T (and hence T') does
not accept w then no initial condition suffices.

The problem in your proof is your assumption that if a machine T'' has
universal behaviour in this sense it must have arisen originally as a
universal T.

But what if the 2,3 machine we're trying to prove universal implements
not a universal T but actually T'' itself?   Your setup allows T'' to
behave universally as we saw above, even though T'' is not universal,
because you keep changing the input for T'' until it is long enough to
allow it to perform a universal computation performed by T (or T').

If the 2,3 machine is universal in your sense because its behaviour is
that of T'', then it is an LBA.  I claim that your proof does not rule
out this possibility.

One might argue that the construction of T'' is artificial whereas this
2,3 machine arose "in nature" and nature would not dream of limiting its
universal machines in such an obviously counterproductive way, or that
nature would be very unlikely to do so, or that the manner of selection
of this machine out of three million others must have eliminated that
possibility.  But such arguments are wishful thinking because this 2,3
machine is a Turing machine of unknown capacity and we can exhibit
Turing machines with the behaviour of T'' having limited capacity.  You
have to show that this 2,3 machine is not such a T'' machine.

> There seems to be a lemma in question, which is that if a sequence of
>  systems, each of which reads input and produces output, are set up
> so that each one reads the output of the previous one, and none of
> them are universal, then the resulting system cannot be universal. In
> the case of the prize problem, this lemma was basically incorporated
> into the definition of universality.

This is not how I read the technical details of the problem, which seem
quite unambiguous: "For the purposes of this prize, the treatment of
universality in any particular submission must be considered acceptable
by the Prize Committee."  In other words you're entitled to any
definition of universality that satisfies you as long as it also
satisfies the committee.

I would regard any definition of universality that can classify an LBA
as universal as being too generous.  However this is not for me to
determine but the committee.

I would however expect that if this 2,3 machine really is universal then
it should be possible to exhibit that universality without having to run
it on an increasing set of initial conditions, which without some sort
of further modification can as I show above turn a non-universal machine
into a universal one.

Best regards,
Vaughan Pratt
_______________________________________________
FOM mailing list
FOM at cs.nyu.edu
http://www.cs.nyu.edu/mailman/listinfo/fom