
From: Russell Standish <R.Standish.domain.name.hidden>

Date: Fri, 12 Oct 2001 11:43:22 +1000 (EST)

juergen.domain.name.hidden wrote:

> Huh? A PDF? You mean a probability density function? On a continuous set?

Probability Distribution Function. And PDFs are defined on all

measurable sets, not just continuous ones.

> No! I am talking about probability distributions on describable objects.
> On things you can program.

Sure...

> Anyway, you write "...observer will expect to see regularities even with
> the uniform prior" but that clearly cannot be true.

This is where I beg to differ.

> > Since you've obviously barked up the wrong tree here, it's a little
> > hard to know where to start. Once you understand that each observer
> > must equivalence an infinite number of descriptions due to the
> > boundedness of its resources, it becomes fairly obvious that the
> > smaller, simpler descriptions correspond to larger equivalence classes
> > (hence higher probability).
>
> Maybe you should write down formally what you mean? Which resource bounds?
> On which machine? What exactly do you mean by "simple"? Are you just
> referring to the traditional Solomonoff-Levin measure and the associated
> old Occam's razor theorems, or do you mean something else?

It is formally written in "Why Occam's Razor", and the extension to
non-computational observers is a fairly obvious corollary of the
contents of "On Complexity and Emergence".

To summarise the results for this list, the basic idea is that an
observer acts as a filter, or category machine. On being fed a
description, it places the description into a category, or equivalence
class of descriptions. What we can say is that real observers have
only a finite number of such equivalence classes into which they can
categorise descriptions (be they basins of attraction in a neural
network, or output states in the form of books, etc). This is
important, because if we consider the case of UTMs being the
classification device, and outputs from halting programs as being the
categories, then we end up with the S-L distribution over the set of
halting programs. Unfortunately, the set of halting programs has
measure zero within the set of all programs (descriptions), so we end
up with the White Rabbit paradox, namely that purely random
descriptions have probability one.
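The counting behind the S-L weights can be sketched in a few lines of
Python. The "machine" below is a hypothetical toy, not a UTM: a
hand-picked prefix-free set of programs, each emitting one category.
A description is classified by the program it begins with, so a
program of length |p| captures a fraction 2^-|p| of all descriptions -
shorter programs, larger equivalence classes.

```python
from itertools import product

# Hypothetical toy prefix machine: a prefix-free set of "programs"
# (no program is a prefix of another), each outputting one category.
programs = {"0": "A", "10": "B", "110": "C", "111": "D"}

n = 12                 # enumerate every description of length n
total = 2 ** n
counts = {cat: 0 for cat in programs.values()}

for bits in product("01", repeat=n):
    s = "".join(bits)
    for p, cat in programs.items():
        if s.startswith(p):        # the machine "halts" after reading p
            counts[cat] += 1
            break

# Each program p captures the fraction 2^-|p| of all descriptions.
for p, cat in programs.items():
    print(cat, counts[cat] / total, 2.0 ** -len(p))
```

Because the program set is prefix-free and exhaustive, every
description lands in exactly one class, and the class sizes reproduce
the 2^-|p| weighting of the S-L measure on this toy scale.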

One solution is to ban the "Wabbit" universes (which you do by means
of the Speed Prior), and recover the nicely behaved S-L measure, which
is essentially your solution (i.e. S-L convolved with the Speed
Prior).

The other way is to realise that the only example of a conscious
observer we are aware of is actually quite resilient to noise - only a
small number of bits in a random string will be considered
significant. We are really very good at detecting patterns in all
sorts of random noise. Hence every random string will be equivalenced
to some regular description. One then ends up with an _observer
dependent_ S-L-like probability distribution over the set of
categories made by that observer. The corresponding Occam's Razor
theorem follows in the usual way.
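A minimal sketch of this noise-equivalencing (the templates and
tolerance are invented for illustration; this is not the formal
construction of "Why Occam's Razor"): an observer recognises a few
regularities up to t flipped "noise" bits, and dumps everything else
into a catch-all category. Counting class sizes gives the induced,
observer-dependent distribution.

```python
from itertools import product

# Hypothetical noise-tolerant observer: a handful of "regular"
# templates plus a noise tolerance t.  Any string within Hamming
# distance t of a template is equivalenced to that regularity;
# everything else falls into one catch-all category.
n, t = 10, 2
templates = {"zeros": "0" * n, "ones": "1" * n, "alt": "01" * (n // 2)}

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

counts = dict.fromkeys(templates, 0)
counts["other"] = 0

for bits in product("01", repeat=n):
    s = "".join(bits)
    for name, tpl in templates.items():
        if hamming(s, tpl) <= t:   # up to t noise bits are ignored
            counts[name] += 1
            break
    else:
        counts["other"] += 1

total = 2 ** n
for name, c in counts.items():
    print(name, c / total)         # the observer-dependent distribution
```

Each regularity collects a whole Hamming ball of noisy variants (here
1 + 10 + 45 = 56 strings apiece) rather than a single string, which is
the mechanism by which regular categories acquire the larger measure.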

I hope this is clear enough for you to follow...

> You are talking falsifiability. I am talking verifiability. Sure, you
> cannot prove randomness. But that's not the point of any inductive
> science. The point is to find regularities if there are any. Occam's
> razor encourages us to search for regularity, even when we do not know
> whether there is any. Maybe some PhD student tomorrow will discover a
> simple PRG of the kind I mentioned, and get famous.
>
> It is important to see that Popper's popular and frequently cited and
> overrated concept of falsifiability does not really help much to explain
> what inductive science such as physics is all about. E.g., physicists
> accept Everett's ideas although most of his postulated parallel universes
> will remain inaccessible forever, and therefore are _not_ falsifiable.
> Clearly, what's convincing about the multiverse theory is its simplicity,
> not its falsifiability, in line with Solomonoff's theory of inductive
> inference and Occam's razor, which is not just a wishy-washy philosophical
> framework like Popper's.
>
> Similarly, today's string physicists accept theories for their simplicity,
> not their falsifiability. Just like nobody is able to test whether
> gravity is the same on Sirius, but believing it makes things simpler.
>
> Again: the essential question is: which prior is plausible? Which
> represents the correct notion of simplicity? Solomonoff's traditional
> prior, which does not care for temporal complexity at all? Even more
> general priors computable in the limit, such as those discussed in
> the algorithmic TOE paper? Or the Speed Prior, which embodies a more
> restricted concept of simplicity that differs from Kolmogorov complexity
> because it takes runtime into account, in an optimal fashion?
>
> Juergen Schmidhuber
>
> http://www.idsia.ch/~juergen/
> http://www.idsia.ch/~juergen/everything/html.html
> http://www.idsia.ch/~juergen/toesv2/

The problem is that the uniform prior is simpler than the speed
prior. It is very much the null hypothesis against which we should
test the speed prior hypothesis. That is why falsifiability, and more
importantly verifiability, matter. It is entirely possible that we
exist within some kind of stupendous, but nevertheless
resource-limited, virtual reality. However, it is surely not too much
to ask for possible evidence of this.

My main point is that your objection to the uniform prior (essentially
the so-called "White Rabbit" problem) is a non-problem.

Cheers

----------------------------------------------------------------------------

Dr. Russell Standish Director

High Performance Computing Support Unit, Phone 9385 6967, 8308 3119 (mobile)

UNSW SYDNEY 2052 Fax 9385 6965, 0425 253119 (")

Australia R.Standish.domain.name.hidden

Room 2075, Red Centre http://parallel.hpc.unsw.edu.au/rks

International prefix +612, Interstate prefix 02

----------------------------------------------------------------------------

Received on Thu Oct 11 2001 - 18:59:06 PDT


This archive was generated by hypermail 2.3.0 : Fri Feb 16 2018 - 13:20:07 PST