Optimal measures and Markov transition kernels

Article / Preprint, English, Open Access
Belavkin, Roman V. (2012)
  • Publisher: Springer
  • DOI: 10.1007/s10898-012-9851-1
  • Subject: Mathematics - Optimization and Control | Mathematical Physics | Statistics - Machine Learning | Computer Science - Computational Complexity | Computer Science - Information Theory | Mathematics - Functional Analysis

We study optimal solutions to an abstract optimization problem for measures, which is a generalization of classical variational problems in information theory and statistical physics. In the classical problems, information and relative entropy are defined using the Kullback-Leibler divergence, and for this reason optimal measures belong to a one-parameter exponential family. Measures within such a family have the property of mutual absolute continuity. Here we show that this property characterizes other families of optimal positive measures if the functional representing information has a strictly convex dual. Mutual absolute continuity of optimal probability measures allows us to strictly separate deterministic from non-deterministic Markov transition kernels, which play an important role in theories of decision-making, estimation, control, communication and computation. We show that deterministic transitions are strictly sub-optimal, unless the information resource with a strictly convex dual is unconstrained. For illustration, we construct an example in which, unlike non-deterministic kernels, any deterministic kernel either has negatively infinite expected utility (unbounded expected error) or communicates infinite information.
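
The classical case described in the abstract admits a short numerical illustration. The sketch below is my own, not code from the paper: it solves the textbook variational problem of maximizing E_p[u] subject to D_KL(p||q) <= lambda on a finite set, whose optimal measures form the one-parameter exponential (Gibbs) family p_beta(x) proportional to q(x) exp(beta u(x)). The names `gibbs_measure` and `kl`, and the toy data `q` and `u`, are assumptions made for the demo.

```python
import numpy as np

def gibbs_measure(q, u, beta):
    """Optimal measure p_beta(x) proportional to q(x) * exp(beta * u(x)).

    Solution of the classical problem: maximize E_p[u] subject to
    D_KL(p || q) <= lambda, with beta >= 0 the Lagrange multiplier
    of the information constraint.  (Illustrative sketch only.)
    """
    w = q * np.exp(beta * u)
    return w / w.sum()

def kl(p, q):
    """Kullback-Leibler divergence D_KL(p || q) in nats."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Toy reference measure q and utility u on a four-point space (assumed data).
q = np.array([0.4, 0.3, 0.2, 0.1])
u = np.array([0.0, 1.0, 2.0, 3.0])

for beta in [0.0, 1.0, 5.0, 50.0]:
    p = gibbs_measure(q, u, beta)
    print(f"beta={beta:5.1f}  p={np.round(p, 4)}  "
          f"E[u]={p @ u:.3f}  D_KL(p||q)={kl(p, q):.3f}")

# Every p_beta with finite beta is strictly positive wherever q is:
# the whole family is mutually absolutely continuous with q.  The
# deterministic (one-hot) kernel concentrated on argmax u is reached
# only in the limit beta -> infinity, where D_KL(p||q) -> -ln q[3].
```

In this toy setting the limiting divergence -ln q(x*) stays finite because the space is finite, but it grows without bound as q(x*) tends to zero; this mirrors, in a much simpler setting, the paper's point that deterministic transitions are strictly sub-optimal unless the information resource is unconstrained.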
