# information content of measures

From: Wei Dai <weidai.domain.name.hidden>
Date: Sat, 9 Jan 1999 01:59:30 -0800

Is there an established definition of the information content of a measure
on finite binary strings (as opposed to the information content of a
finite binary string)? I searched through Li and Vitanyi's _An
Introduction to Kolmogorov Complexity and Its Applications_ but wasn't
able to find it.

If there isn't an established definition, let me propose the following as
a natural one:

Let B = {0,1} and U : B* -> N be a universal prefix machine. Define U' :
(B*->R) -> (N->R) as:

U'(u)(x) = \sum_{U(p)=x}u(p)

U' is interpreted as a function from measures on infinite binary strings
(see 4.2 of Li and Vitanyi for the notation) to measures on finite binary
strings. If U'(u)=v, then v is the distribution on the output of U when
the input of U has distribution u. Note that U' is many-to-one and onto.
Now define the algorithmic information content of a measure on finite
binary strings v : N -> R as:

C_U(v)=min {c:U'(u)=v and c=\sum u(p)(log u(p)+l(p))}

which can be thought of as the negative entropy of the maximum entropy
distribution on inputs that produce v as the distribution on outputs. (BTW
I originally had c=limit as n goes to infinity of
n+\sum_{l(p)<=n}u(p)log(u(p)/2^(n-l(p))), which simplies to the above.

With this definition, the only measure with zero algorithmic information
content is the universal a priori distribution (4.3.2). Also, for any
measure that assigns 1 to some particular string x, the algorithmic
information content of that measure agrees with the algorithmic
information content of the string x, as defined by -log Q_U(x), where Q_U
is the universal a priori distribution.