Some Order Preserving Inequalities for Cross Entropy and Kullback–Leibler Divergence

Cross entropy and Kullback–Leibler (K-L) divergence are fundamental quantities of information theory, and they are widely used in many fields. Since cross entropy is the negated logarithm of likelihood, minimizing cross entropy is equivalent to maximizing likelihood, and thus, cross entropy is applied for optimization in machine learning. K-L divergence also stands independently as a commonly used metric for measuring the difference between two distributions. In this paper, we introduce new inequalities regarding cross entropy and K-L divergence by using the fact that cross entropy is the negated logarithm of the weighted geometric mean. We first apply the well-known rearrangement inequality, followed by a recent theorem on weighted Kolmogorov means, and, finally, we introduce a new theorem that directly applies to inequalities between K-L divergences. To illustrate our results, we show numerical examples of distributions ​
This document is licensed under a Creative Commons:Attribution (by) Creative Commons by4.0