information content of measures (fwd) from Russell Standish on 1999-01-10 (everything)

From: Russell Standish <R.Standish.domain.name.hidden>
Date: Mon, 11 Jan 1999 09:53:44 +1100 (EST)

Forwarded message:
> From everything-list-request.domain.name.hidden Sat Jan 9 21:11 EST 1999
> Resent-Date: Sat, 9 Jan 1999 01:59:34 -0800
> Message-ID: <19990109015930.A20634.domain.name.hidden>
> Date: Sat, 9 Jan 1999 01:59:30 -0800
> From: Wei Dai <weidai.domain.name.hidden>
> To: everything-list.domain.name.hidden
> Subject: information content of measures
> Mime-Version: 1.0
> X-Mailer: Mutt 0.91.1i
> Resent-Message-ID: <"UnIqu1.0.wM2.6Yobs".domain.name.hidden>
> Resent-From: everything-list.domain.name.hidden
> X-Mailing-List: <everything-list.domain.name.hidden> archive/latest/227
> X-Loop: everything-list.domain.name.hidden
> Precedence: list
> Resent-Sender: everything-list-request.domain.name.hidden
> Content-Type: text/plain; charset=us-ascii
> Content-Length: 1633
>
> Is there an established definition of the information content of a measure
> on finite binary strings (as opposed to the information content of a
> finite binary string)? I searched through Li and Vitanyi's _An
> Introduction to Kolmogorov Complexity and Its Applications_ but wasn't
> able to find it.
>
> If there isn't an established definition, let me propose the following as
> a natural one:
>
> Let B = {0,1} and U : B* -> N be a universal prefix machine. Define U' :
> (B*->R) -> (N->R) as:
Wai,
It would certainly help to define some terms - what do you
mean by B* (powerset of B? set of all binary strings?) What exactly is
a universal prefix machine? Can it be any map from B* to N?

>
> U'(u)(x) = \sum_{U(p)=x}u(p)

This definition doesn't even parse. U' is defined over the set of
universal prefix machines, not (something) x N. What is u? And what is
the index that the sum is performed over - {p:U(p)=x}, or is it
{U:U(p)=x}? In any case the independent variable ought to appear on
the RHS.

I think I'd better bail at this point, pending clarification.

Cheers

>
> U' is interpreted as a function from measures on infinite binary strings
> (see 4.2 of Li and Vitanyi for the notation) to measures on finite binary
> strings. If U'(u)=v, then v is the distribution on the output of U when
> the input of U has distribution u. Note that U' is many-to-one and onto.
> Now define the algorithmic information content of a measure on finite
> binary strings v : N -> R as:
>
> C_U(v)=min {c:U'(u)=v and c=\sum u(p)(log u(p)+l(p))}
>
> which can be thought of as the negative entropy of the maximum entropy
> distribution on inputs that produce v as the distribution on outputs. (BTW
> I originally had c=limit as n goes to infinity of
> n+\sum_{l(p)<=n}u(p)log(u(p)/2^(n-l(p))), which simplies to the above.
>
> With this definition, the only measure with zero algorithmic information
> content is the universal a priori distribution (4.3.2). Also, for any
> measure that assigns 1 to some particular string x, the algorithmic
> information content of that measure agrees with the algorithmic
> information content of the string x, as defined by -log Q_U(x), where Q_U
> is the universal a priori distribution.
>
> Any comments?
>
>

----------------------------------------------------------------------------
Dr. Russell Standish Director
High Performance Computing Support Unit,
University of NSW Phone 9385 6967
Sydney 2052 Fax 9385 7123
Australia R.Standish.domain.name.hidden
Room 2075, Red Centre http://parallel.hpc.unsw.edu.au/rks
----------------------------------------------------------------------------
Received on Sun Jan 10 1999 - 14:51:00 PST

This archive was generated by hypermail 2.3.0 : Fri Feb 16 2018 - 13:20:06 PST