
From: Juergen Schmidhuber <juergen.domain.name.hidden>

Date: Wed, 03 Apr 2002 10:59:27 +0200

The theory of inductive inference is Bayesian, of course.

But Bayes' rule by itself does not yield Occam's razor.

Suppose x represents the history of our universe up until now.

What is its most likely continuation y? Let us write xy for

the entire history - the concatenation of x and y. Bayes just

says: P(xy | x) = P(x | xy) P(xy) / N(x), where N(x) is

a normalizing constant. Since x is a prefix of xy, P(x | xy) = 1,

so our conditional probability is simply proportional to the

prior probability P(xy).

Hence, according to Bayes, what you put in is what you get

out. If your prior P(z) were high for simple z then you'd

get Occam's razor: simple explanations preferred.
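To make the point concrete, here is a small sketch (not from the original post; the prior values are made up for illustration): a prior over all binary histories of length 3 that deliberately favors the "simple" constant strings, conditioned on an observed prefix. The posterior over continuations ranks them exactly by their prior mass, as the Bayes argument above says.

```python
from fractions import Fraction
from itertools import product

def posterior(prior, x):
    """Bayes: P(z | x) = P(z) / N(x) for histories z consistent with prefix x."""
    consistent = {z: p for z, p in prior.items() if z.startswith(x)}
    n = sum(consistent.values())  # the normalizing constant N(x)
    return {z: p / n for z, p in consistent.items()}

# Hypothetical "simplicity" prior over length-3 binary histories:
# constant strings get extra mass, everything else splits the rest.
prior = {}
for bits in product("01", repeat=3):
    z = "".join(bits)
    prior[z] = Fraction(3, 10) if z in ("000", "111") else Fraction(1, 15)

post = posterior(prior, "0")          # observed history x = "0"
best = max(post, key=post.get)        # most likely full history given x
```

Because the prior put its mass on simple strings, the preferred continuation is the constant one; with a uniform prior, Bayes alone would express no such preference.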

But why should P favor simple z? Where does Occam's razor

really come from? The essential work on this subject has

been done in statistical learning theory, not in physics.

Some have restricted P by making convenient Gaussian

assumptions. Such restrictions yield specific variants

of Occam's razor.

But the most compelling approach is much broader than that.

It just assumes that P is computable. That you can formally

write it down. That there is a program that takes as input

past observations and possible future observations,

and computes conditional probabilities of the latter

(Gaussian assumptions are a very special case thereof).

The computability assumption seems weak but is strong enough

to yield a very general form of Occam's razor. It naturally

leads to what is known as the universal prior, which dominates

Gaussian and other computable priors. And Hutter's

recent loss bounds show that it does not hurt much to predict

according to the universal prior instead of the true but

unknown distribution, as long as the latter is computable.
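As a toy illustration of the universal prior (an assumption-laden sketch, not a real universal Turing machine): take "programs" to be bit strings, let each program output its own infinite repetition, and weight each program by roughly 2^-|p|. Strings with short generating programs then accumulate far more weight than irregular strings, which is Occam's razor in miniature.

```python
from fractions import Fraction

def run(program, n):
    # Toy semantics (hypothetical): a program is a nonempty bit string
    # whose output is that string repeated, truncated to n bits.
    return (program * n)[:n]

def universal_weight(x, max_len=12):
    # M(x) ~ sum of 2^-(penalized length) over programs whose output
    # starts with x.  Real Solomonoff induction sums 2^-|p| over
    # prefix-free programs on a universal machine; the 2*length
    # penalty here is just a crude stand-in for prefix-freeness.
    total = Fraction(0)
    for length in range(1, max_len + 1):
        for code in range(2 ** length):
            p = format(code, f"0{length}b")
            if run(p, len(x)).startswith(x):
                total += Fraction(1, 2 ** (2 * length))
    return total

w_simple = universal_weight("00000000")  # generated by the 1-bit program "0"
w_random = universal_weight("01101001")  # no short generator in this toy language
```

The regular string inherits weight from its very short program, while the irregular one only gets weight from programs essentially as long as itself, so w_simple comes out much larger.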

I believe physicists and other inductive scientists really

should become aware of this. It is essential to what they are

doing. And much more formal and concrete than Popper's

frequently cited but non-quantitative ideas on falsifiability.

Juergen Schmidhuber http://www.idsia.ch/~juergen/

Received on Wed Apr 03 2002 - 01:03:01 PST

This archive was generated by hypermail 2.3.0 : Fri Feb 16 2018 - 13:20:07 PST