
From: Russell Standish <R.Standish.domain.name.hidden>

Date: Mon, 29 Oct 2001 10:12:25 +1100 (EST)

Juergen Schmidhuber wrote:

> > From: Russell Standish <R.Standish.domain.name.hidden>
> >
> > The only reason for not accepting the simplest thing is if it can be
> > shown to be logically inconsistent. This far, you have shown no such
> > thing, but rather demonstrated an enormous confusion between measure
> > and probability distribution.
>
> Russell, this is really too much; please read any intro to measure
> theory, e.g., the one by Halmos. Measures in general are _not_
> probability distributions. They are more than that; you can view them
> as _sets_ of probability distributions on sets that are related to
> each other in a specific way. Probability density functions define
> measures and therefore in general aren't probability distributions
> either. The uniform PDF on [0..1] yields the perfect trivial example;
> that's exactly why I used it before. In the computability context,
> compare Li/Vitanyi 1997, chapter 4.3 (discrete sample space with
> (semi)distribution m(x)) vs chapter 4.5 (continuous sample space with
> (semi)measure M(x)).

This is an excellent reference, thank you. Example 4.2.1 gives the
construction of the uniform measure over the set of strings, precisely
what you claimed did not exist when we started this discussion.
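To make the distinction concrete, here is a minimal sketch (mine, not from Li/Vitanyi) of the uniform measure over binary strings as in the cited example: each cylinder set, i.e. the set of all infinite sequences extending a finite prefix x, gets measure 2^-len(x). Function names are illustrative.

```python
def uniform_measure(prefix: str) -> float:
    """Measure of the cylinder set of all infinite sequences extending `prefix`."""
    assert set(prefix) <= {"0", "1"}
    return 2.0 ** -len(prefix)

# The measure is additive over the two one-bit extensions of any prefix:
# measure(x) = measure(x0) + measure(x1).
x = "101"
assert uniform_measure(x) == uniform_measure(x + "0") + uniform_measure(x + "1")

# The whole space (empty prefix) has measure 1, while any single infinite
# sequence (the limit of ever-longer prefixes) has measure 0 -- so this is a
# measure on sets of sequences, not a probability distribution assigning
# nonzero weight to individual strings.
print(uniform_measure(""))        # 1.0
print(uniform_measure("0" * 50))  # vanishingly small
```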

>
> > 1) Halting theorem implies the set of halting programs is of measure
> > zero within the set of all programs
>
> You mean, given a uniform measure? But why should you need the halting
> theorem for this trivial statement? In fact, what exactly do you mean
> by halting theorem? (Shoenfield's book provides a good intro to
> mathematical logic and computability theory.)

The halting theorem states that there is no algorithm for deciding
whether an arbitrary program will halt. The result I was actually
referring to is that the set of compressible strings is of measure
zero. The set of halting programs in fact has finite measure, since
\Omega > 0. However, there is still a finite probability of an input
string being non-halting, and indeed that probability seems to be
greater than 0.7 (i.e. \Omega < 0.217643; see p. 218 of Li and
Vitanyi), so this still presents a problem for computationalism.

> > The GP program (aka Universal Dovetailer) is not the simplest
> > thing. The simplest describable thing is the set of all descriptions,
> > that simply exist without being generated. That has precisely zero
> > information or complexity, whereas the UD has some complexity of the
> > order of a few hundred bits. The simplest prior is the uniform one,
> > i.e. there is no bias whatsoever in the selection.
>
> Exist without being generated? Precisely zero information or
> complexity? But with respect to what?

With respect to the observer. With respect to anything, in fact. The
set of all possible descriptions has precisely zero information, by
definition, regardless of what observer or context you employ.

> Any traditional definition of information/simplicity requires the
> specification of a formal description method, such as a Turing
> machine. You don't need one? Then how do you define information or
> complexity? How exactly do you define "simple"?

Actually, all it needs is an equivalencing procedure. See my "On

Complexity and Emergence" paper. A Turing machine will do the job for

you, but so will a neural network or an "expert system" following

heuristic rules.

Perhaps the simplest formal approach is to say that the equivalencing
procedure is a function from the set of descriptions onto the integers
(labelling the equivalence classes). If that function is partial
recursive, then the equivalencing mechanism can be replaced by an
appropriate Turing machine. Computationalism is the creed that this
function must be partial recursive, but that is not necessary for the
definition of information or complexity.

I can see some subtle problems arising if this equivalencing procedure
is stochastic in nature; however, by summing over the appropriate
distributions one should still end up with meaningful results,
provided the variance is not too large.
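A toy version of the equivalencing idea can be sketched in a few lines (all names and the particular classifier are my own hypothetical choices, not from the paper): a total function labels each description with an integer class, and the complexity of a class is taken as -log2 of the uniform measure of its preimage.

```python
import math
from itertools import product

def equivalence_class(description: str) -> int:
    """Label a binary description by its count of 1-bits (a toy classifier
    standing in for any equivalencing procedure)."""
    return description.count("1")

def class_complexity(label: int, n: int) -> float:
    """-log2 of the uniform measure of length-n descriptions in the class."""
    members = sum(
        1 for bits in product("01", repeat=n)
        if equivalence_class("".join(bits)) == label
    )
    return -math.log2(members / 2 ** n)

# Rare classes carry high complexity; populous classes are "simple".
# Nothing here requires the classifier to be realised by a Turing machine.
print(class_complexity(0, 8))  # only one member (00000000): 8.0 bits
print(class_complexity(4, 8))  # C(8,4) = 70 members: under 2 bits
```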

Cheers

----------------------------------------------------------------------------

Dr. Russell Standish Director

High Performance Computing Support Unit, Phone 9385 6967, 8308 3119 (mobile)

UNSW SYDNEY 2052 Fax 9385 6965, 0425 253119 (")

Australia R.Standish.domain.name.hidden

Room 2075, Red Centre http://parallel.hpc.unsw.edu.au/rks

International prefix +612, Interstate prefix 02

----------------------------------------------------------------------------

Received on Sun Oct 28 2001 - 15:31:33 PST


This archive was generated by hypermail 2.3.0 : Fri Feb 16 2018 - 13:20:07 PST