PRINCIPLES OF UNCERTAINTY IN SYSTEMS SCIENCE
George J. Klir
There are three inseparable principles: the principle of minimum uncertainty, the principle of maximum uncertainty, and the principle of uncertainty invariance.
These principles may also be viewed as principles of uncertainty-based information. Their common thrust is that they serve as sound information safeguards in dealing with systems problems: they guarantee that when we deal with any systems problem, we use all the information available, we do not unwittingly use information that is not available, and we do not lose more information than is inevitable. The three principles apply to nondeterministic systems, in which the relevant uncertainty (predictive, retrodictive, prescriptive, diagnostic, etc.) is formalized within a mathematical theory suitable for each application (probability theory, possibility theory, evidence theory, etc.). The principles can be made operational only if a well-justified measure of uncertainty is available in the theory employed. Since types and measures of uncertainty differ substantially across uncertainty theories, the principles result in considerably different mathematical problems as we move from one theory to another. When uncertainty is reduced by taking an action (performing a relevant experiment and observing its outcome, searching through an archive and finding a relevant document, etc.), the amount of information obtained by the action can be measured by the amount of uncertainty reduced: the difference between the a priori uncertainty and the a posteriori uncertainty. Due to this connection between uncertainty and information, the three principles of uncertainty may also be viewed as principles of information. Information of this kind is usually called uncertainty-based information [Klir & Wierman, 1999].
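To make this connection concrete, here is a minimal sketch (in Python, with a made-up diagnostic scenario and made-up numbers) that measures the information obtained by an action as the reduction of probabilistic uncertainty, using Shannon entropy as the uncertainty measure:

```python
from math import log2

def shannon_entropy(probs):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# A priori: four possible faults in a system, judged equally likely.
prior = [0.25, 0.25, 0.25, 0.25]

# A posteriori: after performing a relevant experiment, two faults are
# ruled out and one of the remaining two is considered more likely.
posterior = [0.8, 0.2, 0.0, 0.0]

information_gained = shannon_entropy(prior) - shannon_entropy(posterior)
print(f"a priori uncertainty:     {shannon_entropy(prior):.3f} bits")
print(f"a posteriori uncertainty: {shannon_entropy(posterior):.3f} bits")
print(f"uncertainty-based information gained: {information_gained:.3f} bits")
```

Any other well-justified uncertainty measure could be substituted for Shannon entropy here; the amount of information obtained is always the a priori uncertainty minus the a posteriori uncertainty.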
6.1 Principle of Minimum Uncertainty

The principle of minimum uncertainty is an arbitration principle. It facilitates the selection of meaningful solutions from a solution set obtained by solving any problem in which some of the initial information is inevitably lost. By this principle, we should accept only those solutions for which the amount of lost information is minimal. This is equivalent to accepting solutions with the minimum relevant uncertainty (predictive, prescriptive, etc.).

A major class of problems for which the principle of minimum uncertainty is applicable is the class of simplification problems. When a system is simplified, it is usually unavoidable to lose some of the information contained in the system. The amount of information lost in this process results in an increase of an equal amount of relevant uncertainty. Examples of relevant uncertainties are predictive, retrodictive, and prescriptive uncertainty. A sound simplification of a given system should minimize the loss of relevant information (or the increase in relevant uncertainty) while achieving the required reduction of complexity. That is, we should accept only those simplifications of a given system, at any desirable level of complexity, for which the loss of relevant information (or the increase in relevant uncertainty) is minimal. When properly applied, the principle of minimum uncertainty guarantees that no information is wasted in the process of simplification.

Given a system formulated within a particular experimental frame, there are many distinct ways of simplifying it. Three main strategies of simplification can readily be recognized:
- simplifications made by eliminating some entities from the system (variables, subsystems, etc.);
- simplifications made by aggregating some entities of the system (variables, states, etc.);
- simplifications made by breaking the overall system into appropriate subsystems.
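As a small illustration of the first strategy, the following sketch uses a hypothetical joint distribution over two binary inputs X1, X2 and a binary output Y, measures predictive uncertainty (for this example only) by conditional Shannon entropy, and compares two candidate simplifications, eliminating X1 versus eliminating X2, selecting the one with the smaller increase in predictive uncertainty:

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Hypothetical joint distribution p(x1, x2, y); axes: X1, X2, Y (all binary).
p = np.array([[[0.20, 0.05], [0.05, 0.10]],
              [[0.10, 0.10], [0.05, 0.35]]])

def conditional_entropy_of_y(joint, given_axes):
    """H(Y | variables on `given_axes`), with Y on the last axis."""
    sum_axes = tuple(a for a in range(joint.ndim - 1) if a not in given_axes)
    marg = joint.sum(axis=sum_axes) if sum_axes else joint   # p(given, y)
    given = marg.sum(axis=-1)                                # p(given)
    return entropy(marg.ravel()) - entropy(given.ravel())    # chain rule

full    = conditional_entropy_of_y(p, given_axes=(0, 1))  # keep both inputs
drop_x2 = conditional_entropy_of_y(p, given_axes=(0,))    # simplify: eliminate X2
drop_x1 = conditional_entropy_of_y(p, given_axes=(1,))    # simplify: eliminate X1

print(f"H(Y|X1,X2) = {full:.3f} bits")
print(f"H(Y|X1)    = {drop_x2:.3f} bits  (increase {drop_x2 - full:.3f})")
print(f"H(Y|X2)    = {drop_x1:.3f} bits  (increase {drop_x1 - full:.3f})")
best = "eliminate X2" if drop_x2 <= drop_x1 else "eliminate X1"
print("minimum-uncertainty choice:", best)
```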
Regardless of the strategy employed, the principle of minimum uncertainty is utilized in the same way: it is an arbiter that decides which simplifications to choose at any given level of complexity.

Another application of the principle of minimum uncertainty is the area of conflict-resolution problems. For example, when we integrate several overlapping partial models into one larger model, the models may be locally inconsistent. It is reasonable then to require that each of the models be appropriately adjusted in such a way that the overall model becomes consistent. To guarantee that no fictitious (biasing) information is introduced, the adjustments must not decrease the uncertainty of any of the partial models involved, although they may increase it. That is, to achieve local consistency of the overall model, we are likely to lose some information contained in the partial models. This is not desirable; hence, we should minimize this loss of information. That is, we should accept only those adjustments for which the total loss of information (or total increase of uncertainty) is minimal. The total loss of information may be expressed, for example, by the sum of all individual losses, or by a weighted sum if the partial models are valued differently.
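A minimal sketch of such an adjustment problem follows, under assumptions made purely for illustration: two hypothetical partial models share a single binary variable Y but assign it different marginal distributions; an adjustment is taken to mean replacing each model's Y-marginal by a common candidate marginal while keeping its conditional structure; uncertainty is measured by Shannon entropy; and the total loss of information is the unweighted sum of the two individual increases. A brute-force grid search then applies the principle directly: among the adjustments that do not decrease the uncertainty of either partial model, it picks the one with the minimal total increase.

```python
import numpy as np

def H(p):
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Two hypothetical partial models sharing the binary variable Y:
#   model A: p_A(x, y), rows X, columns Y;  model B: p_B(y, z), rows Y, columns Z.
pA = np.array([[0.18, 0.35],
               [0.12, 0.35]])
pB = np.array([[0.30, 0.30],
               [0.18, 0.22]])

yA, yB = pA.sum(axis=0), pB.sum(axis=1)   # the Y-marginals disagree: local inconsistency
condA = pA / yA                            # p_A(x | y)
condB = pB / yB[:, None]                   # p_B(z | y)

best = None
for m0 in np.linspace(0.01, 0.99, 99):     # candidate common marginal m(Y)
    m = np.array([m0, 1.0 - m0])
    qA = condA * m                         # adjusted model A: p_A(x | y) * m(y)
    qB = condB * m[:, None]                # adjusted model B: p_B(z | y) * m(y)
    dA, dB = H(qA) - H(pA), H(qB) - H(pB)  # individual increases of uncertainty
    if dA < -1e-9 or dB < -1e-9:           # an adjustment must not decrease the
        continue                           # uncertainty of any partial model
    if best is None or dA + dB < best[0]:
        best = (dA + dB, m, dA, dB)

total, m, dA, dB = best
print(f"inconsistent marginals: A gives {yA}, B gives {yB}")
print(f"chosen common marginal m(Y) = {m}")
print(f"uncertainty increases: {dA:.4f} + {dB:.4f} = {total:.4f} bits")
```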
6.2 Principle of Maximum Uncertainty

The second principle, the principle of maximum uncertainty, is essential for any problem that involves ampliative reasoning. This is reasoning in which conclusions are not entailed in the given premises. Using common sense, the principle may be expressed by the following requirement: in any ampliative inference, use all the information available, but make sure that no additional information is unwittingly added. That is, the principle requires that conclusions resulting from any ampliative inference maximize the relevant uncertainty within the constraints representing the premises. The principle guarantees that our ignorance is fully recognized when we try to enlarge our claims beyond the given premises and, at the same time, that all information contained in the premises is fully utilized. In other words, it guarantees that our conclusions are maximally noncommittal with regard to information not contained in the premises.

Ampliative reasoning is indispensable to science and engineering in a variety of ways. For example, whenever we utilize a scientific model for predictions, we employ ampliative reasoning. Similarly, when we want to estimate microstates from the knowledge of relevant macrostates and partial information regarding the microstates (as in image processing and many other problems), we must resort to ampliative reasoning. The problem of identifying an overall system from some of its subsystems is another example that involves ampliative reasoning.

The principle of maximum uncertainty is well developed and tested within classical information theory based upon the Shannon entropy, where it is called the maximum entropy principle. This principle was presumably founded by Jaynes [1983]. Perhaps the greatest skill in using it across a broad spectrum of applications, often in combination with the complementary minimum entropy principle, has been demonstrated by Christensen [1985-1986]. The literature concerned with the principle is extensive; an excellent overview is the book by Kapur [1989], which contains a comprehensive bibliography.

A general formulation of the principle of maximum entropy is: determine the probability distribution that maximizes the Shannon entropy subject to given constraints, which express partial information about the unknown probabilities, together with the general constraints (axioms) of probability theory. The most typical constraints employed in practical applications are the mean (expected) values of random variables under investigation, various marginal distributions of an unknown joint distribution, or upper and lower estimates of probabilities.
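As an illustration of the first kind of constraint, the following sketch computes the maximum-entropy distribution over the faces of a hypothetical six-sided die when the only information available is its mean value (taken here to be 4.5). It relies on the standard result that the solution has the exponential (Gibbs) form p_i proportional to exp(lambda * x_i), and finds lambda by bisection:

```python
import numpy as np

def maxent_given_mean(values, target_mean, iters=200):
    """Maximum-entropy distribution over `values` subject to a prescribed mean.
    The solution has the Gibbs form p_i proportional to exp(lam * x_i); since the
    mean of that family is increasing in lam, lam can be found by bisection."""
    x = np.asarray(values, dtype=float)
    lo, hi = -50.0, 50.0
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        z = lam * x
        w = np.exp(z - z.max())        # subtract the max exponent for numerical stability
        p = w / w.sum()
        if (p * x).sum() < target_mean:
            lo = lam                   # mean too small: increase lam
        else:
            hi = lam
    return p

faces = np.arange(1, 7)                          # outcomes of a hypothetical die
p = maxent_given_mean(faces, target_mean=4.5)    # the mean is the only constraint
entropy = -(p * np.log2(p)).sum()
print("maximum-entropy distribution:", np.round(p, 4))
print(f"mean = {(p * faces).sum():.4f}, entropy = {entropy:.4f} bits")
```

Setting the target mean to 3.5 recovers the uniform distribution, the maximally noncommittal conclusion when nothing beyond the axioms of probability theory is known.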
6.3 Principle of Uncertainty Invariance

The third principle, the principle of uncertainty invariance, facilitates connections among representations of uncertainty and information in alternative mathematical theories. The principle requires that the amount of uncertainty (and information) be preserved when a representation of uncertainty in one mathematical theory is transformed into its counterpart in another theory. That is, the principle guarantees that no information is unwittingly added or eliminated solely by changing the mathematical framework by which a particular phenomenon is formalized. As a rule, uncertainty-invariant transformations are not unique; to make them unique, appropriate additional requirements must be imposed.

In comparison with the principles of minimum and maximum uncertainty, which have been investigated and applied within probability theory for at least 40 years, the principle of uncertainty invariance was introduced only in the early 1990s [Klir, 1990]. It is based upon the following epistemological and methodological position: every real-world decision or problem situation involving uncertainty can be formalized in any of the theories of uncertainty, and each formalization is a mathematical model of the situation. When we commit ourselves to a particular mathematical theory, our modeling becomes necessarily limited by the constraints of that theory. For example, probability theory can model decision situations only in terms of conflicting degrees of belief in mutually exclusive alternatives, derived in some way from the evidence on hand. Possibility theory, on the other hand, can model a decision situation only in terms of degrees of belief allocated to consonant (nested) subsets of alternatives; these are almost conflict-free, but involve large nonspecificity. Clearly, a more general theory is capable of capturing the uncertainties of some decision situations more faithfully than its less general competitors. Nevertheless, every uncertainty theory, even the least general one, is capable of characterizing (or approximating, if you like) the uncertainty of every situation. Due to the constraints of the theory, this characterization may not be as natural as its counterparts in other, more adequate theories, but such a characterization always exists. If the theory is not capable of capturing some type of uncertainty directly, it may capture it indirectly in some fashion, through whatever other type of uncertainty is available.

To transform the representation of a problem-solving situation in one theory, T1, into an equivalent representation in another theory, T2, we should require that: (i) the amount of uncertainty associated with the situation be preserved when we move from T1 to T2; and (ii) the degrees of belief in T1 be converted to their counterparts in T2 by an appropriate scale, at least ordinal. These two requirements express the principle of uncertainty invariance.

Requirement (i) guarantees that no uncertainty is unwittingly added or eliminated solely by changing the mathematical theory by which a particular phenomenon is formalized. If the amount of uncertainty were not preserved, then either some information not supported by the evidence would unwittingly be added by the transformation (information bias) or some useful information contained in the evidence would unwittingly be eliminated (information waste). In either case, the model obtained by the transformation could hardly be viewed as equivalent to the original. Requirement (ii) guarantees that certain properties considered essential in a given context (such as the ordering or proportionality of relevant values) are preserved under the transformation. Transformations under which certain properties of a numerical variable remain invariant are known in the theory of measurement as scales.

Due to the unique connection between uncertainty and information, the principle of uncertainty invariance can also be conceived as a principle of information invariance or information preservation. Indeed, each model of a problem-solving situation, formalized in some mathematical theory, contains information of some type and some amount; the amount is expressed by the difference between the maximum possible uncertainty associated with the set of alternatives postulated in the situation and the actual uncertainty of the model. When we approximate one model with another one, formalized in terms of a different mathematical theory, we essentially want to replace one type of information with an equal amount of information of another type; that is, we want to convert information from one type to another while preserving its amount. This expresses the spirit of the principle of information invariance or preservation: no information should be added or eliminated solely by converting one type of information to another. It seems reasonable to compare this principle, in a metaphoric way, with the principle of energy conservation in physics.

Examples of generic applications of the principle include problems that involve transformations from probabilities to possibilities and vice versa, approximations of fuzzy sets by crisp sets (defuzzification), and approximations of bodies of evidence in evidence theory by their probabilistic or possibilistic counterparts.
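To make the first of these applications concrete, the following sketch implements one simple uncertainty-invariant probability-to-possibility transformation in the spirit of Klir [1990]. The choices made here are assumptions for illustration only: probabilistic uncertainty is measured by Shannon entropy, possibilistic uncertainty by nonspecificity (the U-uncertainty), and the conversion uses the ratio-type scaling pi_i = (p_i / p_1)^alpha on probabilities sorted in decreasing order; alpha is found numerically so that the two amounts of uncertainty coincide, and the scaling preserves the ordering of the degrees of belief, as requirement (ii) demands at minimum.

```python
import numpy as np

def shannon(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def nonspecificity(pi):
    """U-uncertainty of a possibility distribution given in decreasing order
    (pi[0] = 1); a trailing zero is appended so the usual sum applies."""
    pi = np.append(np.asarray(pi, dtype=float), 0.0)
    n = len(pi) - 1
    return float(sum((pi[i - 1] - pi[i]) * np.log2(i) for i in range(1, n + 1)))

def prob_to_poss(p, iters=200):
    """Illustrative uncertainty-invariant transformation pi_i = (p_i / p_1) ** alpha;
    alpha is found by bisection so that the nonspecificity of pi equals the
    Shannon entropy of p (the chosen pair of uncertainty measures is an assumption)."""
    p = np.sort(np.asarray(p, dtype=float))[::-1]
    target = shannon(p)
    ratios = p / p[0]
    lo, hi = 1e-6, 60.0
    for _ in range(iters):
        alpha = 0.5 * (lo + hi)
        pi = ratios ** alpha
        if nonspecificity(pi) > target:
            lo = alpha        # still too nonspecific: make pi sharper
        else:
            hi = alpha
    return pi, alpha

p = [0.40, 0.25, 0.20, 0.10, 0.05]      # hypothetical probability distribution
pi, alpha = prob_to_poss(p)
print("probabilities:  ", p)
print("possibilities:  ", np.round(pi, 4))
print(f"alpha = {alpha:.4f}")
print(f"H(p) = {shannon(p):.4f} bits,  N(pi) = {nonspecificity(pi):.4f} bits")
```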
REFERENCES

Christensen, R. [1985], “Entropy minimax multivariate statistical modeling - I: Theory.” Intern. J. of General Systems, 11(3), pp. 231-277.

Christensen, R. [1986], “Entropy minimax multivariate statistical modeling - II: Applications.” Intern. J. of General Systems, 12(3), pp. 227-305.

Jaynes, E. T. [1983], Papers on Probability, Statistics and Statistical Physics (R. D. Rosenkrantz, ed.). Reidel, Dordrecht.

Kapur, J. N. [1989], Maximum Entropy Models in Science and Engineering. John Wiley, New York.

Klir, G. J. [1990], “A principle of uncertainty and information invariance.” Intern. J. of General Systems, 17(2-3), pp. 249-275.

Klir, G. J. and Wierman, M. J. [1999], Uncertainty-Based Information: Elements of Generalized Information Theory. Physica-Verlag/Springer-Verlag, Heidelberg and New York.