Re: Predictions & duplications

From: Juergen Schmidhuber <>
Date: Mon, 29 Oct 2001 13:09:32 +0100

> > > From: Russell Standish <>
> > > The only reason for not accepting the simplest thing is if it can be
> > > shown to be logically inconsistent. This far, you have shown no such
> > > thing, but rather demonstrated an enormous confusion between measure
> > > and probability distribution.
> >
> > Russell, this is really too much; please read any intro to measure
> > theory,
> > e.g., the one by Halmos. Measures in general are _not_ probability
> > distributions. They are more than that; you can view them as _sets_ of
> > probability distributions on sets that are related to each other in a
> > specific way. Probability density functions define measures and
> > therefore
> > in general aren't probability distributions either. The uniform PDF on
> > [0..1] yields the perfect trivial example; that's exactly why I used it
> > before. In the computability context, compare LI/Vitanyi 1997, chapters
> > 4.3 (discrete sample space with (semi)distribution m(x)) vs chapter 4.5
> > (continuous sample space with (semi)measure M(x)).
> This is an excellent reference, thank you. Example 4.2.1 gives the
> construction of the uniform measure over the set of strings, precisely
> what you claim didn't exist when we started this discussion.

You just want to annoy me, don't you? For the last time: There is a
uniform *measure* on the strings, but no uniform probability

> > > 1) Halting theorem implies the set of halting programs is of measure
> > > zero within the set of all programs
> >
> > You mean, given a uniform measure? But why should you need the halting
> > theorem for this trivial statement? In fact, what exactly do you mean
> > by
> > halting theorem? (Schoenfield's book provides a good intro to
> > mathematical
> > logic and computability theory.)
> >
> The halting theorem is that there is no algorithm for predicting
> whether any program will halt. Of course, the result I was referring
> to is that the set of compressible strings is of measure zero. The set
> of halting programs actually has finite measure, since \Omega >
> 0. However, there is still finite probability of an input string
> being nonhalting, and indeed the probability seems to be greater than
> 0.7 (ie \Omega <0.217643 - see pg 218 Li and Vitanyi), so this still
> presents a problem for computationalism.

This is mixing up everything, in particular, different measures.
1) Infinite strings with finite program have *nonzero* measure, given
the *universal* measure (Omega is a number derived from the universal
measure). 2) They have measure zero, given the *uniform* measure (Omega
has nothing to do with it). 3) To see 2) we do not need the halting
theorem - we just compare the finite programs (countably many) to all
strings (uncountably many) and we are done. 4) The halting theorem is
no problem whatsoever when it comes to computable universes. Why care
whether some computable universe stops or not? If it does, it does,
otherwise it does not. In both cases we can compute it in the limit.

> > > The GP program (aka Universal Dovetailer) is not the simplest
> > > thing. The simplest describable thing is the set of all descriptions,
> > > that simply exist without being generated. That has precisely zero
> > > information or complexity, whereas the UD has some complexity of the
> > > order of a few 100 bits. The simplest prior is the uniform one, ie
> > > there is no bias whatsoever in the selection.
> >
> > Exist without being generated? Precisely zero information or
> > complexity? But with respect to what?
> With respect to the observer. With the respect to anything in
> fact. The set of all possible descriptions has precisely zero
> information, by definition, regardless of what observer or context you
> employ.

This is a vague informal statement, not a definition. Which are the
possible descriptions? What's the formal difference between a
and a non-description? What is your "anything" if there is no device
for describing it formally? If "anything" does not convey any bit of
info then what exactly is it that conveys one bit? Or two?

> Any traditional definition of
> > information/simplicity requires the specification of a formal
> > description
> > method, such as a Turing machine. You don't need one? Then how do you
> > define information or complexity? How exactly do you define "simple" ?
> Actually, all it needs is an equivalencing procedure. See my "On
> Complexity and Emergence" paper. A Turing machine will do the job for
> you, but so will a neural network or an "expert system" following
> heuristic rules.

So you have to write down formally this equivalencing procedure, right?
Why should this be different in principle from writing down the formal
rules for a Turing machine? Why is a simplicity measure that depends on
your equivalencing procedure superior to a simplicity measure depending
on a Turing machine? (The TM procedure includes yours - because it
includes all procedures - but not the other way round.)
Received on Mon Oct 29 2001 - 04:10:27 PST

This archive was generated by hypermail 2.3.0 : Fri Feb 16 2018 - 13:20:07 PST