From: Osher Doctorow <osher.domain.name.hidden>

Date: Fri, 6 Sep 2002 12:20:28 -0700

From: Osher Doctorow osher.domain.name.hidden, Fri. Sept. 6, 2002 11:45AM

I have read about half of J. Schmidhuber's *A computer scientist's view of

life, the universe, and everything,* (1997), and he has interesting ideas

and clarity of presentation, but I have to disagree with him in a number of

places where he uses conditional probability, including his section

Generalization and Learning. I hasten to add that I do not view

alternative theories as *wrong* but as competing and that they should almost

all survive for competition, motivation, and also because many of them turn

out to have useful contributions long after they have been regarded as

*discredited*.

Schmidhuber (S for short) concludes that generalization is impossible in

general by using a proof based on conditional probability, and similarly he

concludes that the learner's life in general is limited, also by a

conditional probability proof. Most readers will undoubtedly stare at this

statement in bewilderment, since as far as they know nothing is wrong with

conditional probability.

They are partly correct and partly wrong. Nothing is wrong with

conditional probability, which is the main tool of the Bayesian school (or

as I abbreviate it, the BCP or Bayesian Conditional Probability-Statistics

school), for Fairly Frequent Events. For Rare Events, something very

strange happens. This was how my wife Marleen and I began our exploration

of Rare Events in 1980. Conditional probability divides two probabilities

and regards that as an indication of the probability of one event *given*

another event, where *given* is used in the sense of *freezing the other

event in place*. Some real analysis experts will argue that this is all

justified by the Radon-Nikodym derivative of the Lebesgue-Radon-Nikodym theorem(s),

not quite realizing that those theorems determine the derivative only up to

equivalence classes outside sets of measure ZERO. But events of

probability zero are the Rarest Events. Moreover, division of

probabilities blows up even in small (one-sided) neighborhoods of

probability 0 since division by 0 is impossible. Thus, not only can

conditional probability not model events of probability 0, but it cannot

even model events of probability close to 0 (Rare Events).
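
The sensitivity described above can be sketched numerically. This is a minimal illustration, not the method of the post: the function names and the error size `eps` are assumptions introduced here. It shows that the ratio P(A and B) / P(B) is undefined at P(B) = 0, and that a fixed absolute error in the numerator shifts the ratio by eps / P(B), which grows without bound as P(B) approaches 0.

```python
def conditional_probability(p_joint: float, p_given: float) -> float:
    """Return P(A | B) = P(A and B) / P(B); undefined when P(B) is 0."""
    if p_given == 0.0:
        raise ZeroDivisionError("P(B) = 0: the ratio is undefined")
    return p_joint / p_given


def shift_from_error(p_joint: float, p_given: float, eps: float) -> float:
    """Change in P(A | B) caused by an absolute error eps in P(A and B)."""
    return (conditional_probability(p_joint + eps, p_given)
            - conditional_probability(p_joint, p_given))


if __name__ == "__main__":
    eps = 1e-7  # hypothetical estimation error, chosen only for illustration
    for p_given in (0.5, 1e-3, 1e-6):
        p_joint = p_given / 2  # keeps P(A | B) = 0.5 in every row
        print(p_given, shift_from_error(p_joint, p_given, eps))
```

The same error that is negligible for a fairly frequent conditioning event moves the ratio substantially once the conditioning event is rare.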

Is there a simple solution? Yes! Product/Goguen fuzzy multivalued

logical implication x-->y is defined as y/x for x not 0. So it corresponds

to conditional probability where x and y are carefully chosen probabilities

in the probability-statistics analog. Łukasiewicz and Rational Pavelka

fuzzy multivalued logical implications (Rational Pavelka is the predicate

logic generalization of Łukasiewicz propositional logic) are x-->y = 1 + y -

x for y <= x (the non-trivial case). The latter does not involve

division by 0 and does not blow up in any (one-sided) neighborhood of zero.

Logic-Based Probability (LBP) uses precisely the same definition of 1 + y -

x in place of y/x for exactly the same probabilities x, y which BCP uses.

My wife and I introduced LBP in 1980. It may be remarked here that the Gödel

fuzzy multivalued logic, which we showed applies to Very Frequent (Very

Common) Events, uses x-->y = y and refers in the probability-statistics

analog to INDEPENDENT events, and since in general events are not

independent unless that can be established in special cases, LBP is the

correct result to use.
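
The three implications named above can be written as truth functions on [0, 1]. The sketch below uses the standard textbook definitions, which add the trivial case (value 1 whenever x <= y) that the text leaves implicit. The function names are assumptions made here, and the choice of which probabilities play the roles of x and y in BCP or LBP is not specified in this post, so the code treats them as plain numbers.

```python
def goguen_implies(x: float, y: float) -> float:
    """Product (Goguen) implication: 1 if x <= y, otherwise y / x."""
    # When x > y >= 0 we have x > 0, so the division is always defined here.
    return 1.0 if x <= y else y / x


def lukasiewicz_implies(x: float, y: float) -> float:
    """Lukasiewicz implication: min(1, 1 - x + y), i.e. 1 + y - x for y <= x."""
    return min(1.0, 1.0 - x + y)


def godel_implies(x: float, y: float) -> float:
    """Godel implication: 1 if x <= y, otherwise y."""
    return 1.0 if x <= y else y


if __name__ == "__main__":
    # Shrink x toward 0 while holding y at half of x.
    for x in (0.5, 1e-3, 1e-9):
        y = x / 2
        print(x, goguen_implies(x, y), lukasiewicz_implies(x, y), godel_implies(x, y))
```

With y held at half of x, the Goguen value depends only on the ratio, the Łukasiewicz value depends only on the difference x - y, and the Gödel value simply returns y.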

So when S claims that generalization is impossible in general and that the

learner's life is limited in general, he has to be referring to Fairly

Frequent Events, not Rare Events or even Very Frequent Events (which use the

Gödel analog).

But surely that leaves much room for S to maneuver in? In a way, yes, and

in a way, no. S is very interested in the Great Programmer or even a

decreasing sequence of Great Programmers each delegating authority to the

other in different universes and so on. The Great Programmer thinks on the

level of the Universe or All Universes or the particular Universe in the

sequence. So we have to ask: which type of fuzzy multivalued logic or its

probability-statistics analog (or proximity function - geometry - topology

analog, which we developed as exact analogs of the above) most influences

the Universe(s)?

The answer turns out to be very simple, namely Łukasiewicz/Rational Pavelka

(Rare Event) or its probability-statistics analog LBP. This is because in

our universe it is generally agreed that a Rare Event called a Big Bang

occurred (I have proven that even if it did not, as in Steinhardt-Turok and

Gott-Li cyclic or backward time loop cosmological theories, LBP is the key

influence probability), and that very rare events such as inflation, the

transition from the radiation-dominated to the matter-dominated era, and the

fairly recent transition from a non-accelerating to an accelerating universe

all played critical roles in the

development of the Universe.

I should also mention that Shannon Information-Entropy and its Kolmogorov

generalizations blow up near zero because the logarithm does, and that the

only *influence* type of Shannon Information-Entropy is based on conditional

probability, which of course also blows up at zero. Rare Event

Information-Entropy does not use logarithms but (positive or negative)

exponentials, and of course does not divide probabilities so it does not

blow up at or near zero denominator.
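
The behaviour of the logarithm near zero can be checked directly. This is a small sketch with function names assumed here. It shows that the Shannon self-information -log2(p) grows without bound as p approaches 0, while the weighted term -p log2(p) that enters the entropy sum stays bounded under the usual convention 0 log 0 = 0. The exponential-based Rare Event measure is not defined in this post, so it is not reproduced.

```python
import math


def surprisal(p: float) -> float:
    """Shannon self-information -log2(p) in bits; infinite at p = 0."""
    if p <= 0.0:
        return math.inf
    return -math.log2(p)


def entropy_term(p: float) -> float:
    """Contribution -p * log2(p) to Shannon entropy, with 0 * log 0 taken as 0."""
    return 0.0 if p == 0.0 else -p * math.log2(p)


if __name__ == "__main__":
    for p in (0.5, 1e-3, 1e-9, 0.0):
        print(p, surprisal(p), entropy_term(p))
```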

Quantum-field-theory-oriented physicists may be slightly disturbed at this

point, since QFT totally eliminates probabilities except in the *formal*

location of Schrödinger's equation, which is regarded as a *deterministic*

equation (another anomaly that I will be glad to argue about at another time

or place). Happily or unhappily, they have no choice in the matter of the

above results, since they hold across about 10 different branches of

mathematics and almost an equal number of branches of physics. Curiously

enough, Quantum Mechanics theorists manage to get probability back into the

picture, including their much-used CONDITIONAL probability, while

simultaneously disavowing the stochastic (probability) school and claiming

allegiance to the Statistics School (apparently unaware that there is no

statistics without probability), which plays only a *formal* role in

supporting the *deterministic Schrödinger and Heisenberg* equations.

Osher Doctorow

Received on Fri Sep 06 2002 - 12:33:31 PDT
