Statistics & Probability Letters 16 (1993) 297-300
North-Holland, 16 March 1993

Predicting the value of an integer-valued random variable

Daniel R. Jeske
AT&T Bell Laboratories, Holmdel, NJ, USA

Received September 1991
Revised July 1992

Abstract: Predicting the value of a random variable Y, based on the observed value of another random variable X, is a common objective of data analysis. It is well known that the minimum mean-squared error predictor of Y is the mean of the conditional distribution of Y, given X. In cases where Y is necessarily integer-valued, the conditional mean is not always a feasible value for Y and, therefore, is an unsatisfactory predicted value. In this paper, it is shown how minimum mean-squared error integer-valued predictors can be obtained.

Keywords: Mean-squared error, integer-valued distributions, unbiased prediction.

Correspondence to: Daniel R. Jeske, AT&T Bell Laboratories, Holmdel, NJ 07733, USA.

1. Introduction

It is well known that the minimum mean-squared error predictor of a random variable Y, based on the value of another random variable X, is the conditional expected value E(Y | X) (e.g., Rao, 1973). In applications where Y is a continuous random variable, E(Y | X) is at least a feasible solution to the prediction problem, and in many cases a satisfactory solution as well. When Y is integer-valued, however, E(Y | X) will not generally be a possible value of Y and, consequently, will not be a reasonable predicted value (e.g., Kempthorne, 1989).

In this paper, the problem of predicting an integer-valued random variable with an integer-valued predictor is considered. The criterion used for evaluating alternative predictors is mean-squared error (MSE) of prediction. It is shown that there is either a unique minimum MSE predictor, or there are an infinite number of predictors with the same minimal value of MSE. Unbiased prediction is also considered. In general, an unbiased predictor will have a randomization step associated with it: the predictor is defined by specifying a probability distribution on the integers, and predictions are made by simulating values from this distribution.

In Section 2, the problem to be considered is illustrated by an example; relevant notation and terminology is also introduced. Results are given in Section 3 for the 'No Data Problem', that is, contexts in which there is no external information in the form of X on which to base the prediction of Y. Obtaining predictors for contexts where X is available (Section 4) is facilitated by having results for the 'No Data Problem' at hand. The proofs of Theorems 1 and 2 (Section 3) are given in the Appendix.

2. Example and notation

Let Y have a probability distribution defined on the integer values 0, 1 and 2 with weights 0.2, 0.3 and 0.5, respectively. Notationally, this will be expressed as

    Y ~ (  0    1    2  )
         ( 0.2  0.3  0.5 )

Suppose it is desired to predict a future value of Y, and that no external information related to Y (i.e., X) is available. The minimum MSE predictor of Y is Y_min = 1.3 (the mean value of Y). Clearly, Y_min is not a feasible value for Y, and thus is an unsatisfactory predictor of Y.
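The arithmetic behind this example is easy to verify directly. The following is a minimal Python sketch (the distribution is the one above; the helper name `mse` is mine, not from the paper) comparing the MSE of candidate constant predictors:

```python
# Distribution of Y from the example: values 0, 1, 2 with weights 0.2, 0.3, 0.5.
support = [0, 1, 2]
probs = [0.2, 0.3, 0.5]

def mse(c):
    """Mean-squared error E[(Y - c)^2] of the constant predictor c."""
    return sum(p * (y - c) ** 2 for y, p in zip(support, probs))

mean = sum(p * y for y, p in zip(support, probs))   # 1.3 -- not an integer
best_integer = min(support, key=mse)                # the integer with smallest MSE
```

Here mse(1.3) = 0.61 while mse(1) = 0.7: the unconstrained minimum is attained at the mean, but among the integers the rounded value 1 does best, as Theorem 1 asserts.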
An alternative predictor is obtained by rounding Y_min to the nearest integer, giving the predictor Y_rnd = 1. It follows from Theorem 1 in Section 3 that Y_rnd is the unique minimum MSE integer-valued predictor of Y. In cases where Y_min is of the form k + 0.5, for some integer k, a unique minimum MSE integer-valued predictor does not exist. It is shown that in this case there are an infinite number of integer-valued predictors that all have the same minimal value of MSE.

An arbitrary predictor Y* is an unbiased predictor of Y if E(Y*) = E(Y). Typically, unbiased integer-valued predictors will require a randomization step. For this example, it follows from Theorem 2 in Section 3 that the unique minimum MSE unbiased integer-valued predictor of Y is Y_unb, defined according to the following probability distribution:

    Y_unb ~ ( 0   1    2  )
            ( 0  0.7  0.3 )

Values for Y_unb are obtained by simulating the values 0, 1 and 2 according to this probability distribution. In general, Y_unb will be the randomized predictor whose associated probability distribution proportionally divides its mass on the two integer values that are adjacent to the mean of Y.

3. No data problem

Consider the case where no external information about Y, in the form of an observable random variable X, is available. Suppose the mean value, u, of Y is known. If u is an integer, it is the integer-valued predictor with the smallest MSE. The following theorem identifies the minimum MSE predictor(s) for the case where u is not an integer.

Theorem 1. Let Y be an integer-valued random variable with a non-integer mean u. Define k to be the largest integer less than u. The unique minimum MSE integer-valued predictor of Y is Y_rnd = Round(u) (the integer nearest to u), unless u = k + 0.5, in which case all randomized predictors of the form

    Y* ~ ( k   k+1 )
         ( p   1-p )

(for arbitrary 0 <= p <= 1) are minimum MSE integer-valued predictors.

Proof. See Appendix.

When u is not an integer, there are no non-randomized unbiased predictors of Y. To predict Y unbiasedly, Y* must be a randomized predictor defined by a probability distribution on the integer values. Theorem 2 shows that there is a unique minimum MSE unbiased integer-valued predictor, namely the randomized predictor that proportionally divides its probability mass on the integer values that are adjacent to u.

Theorem 2. Let Y be an integer-valued random variable with a non-integer mean u. Let k be the integer such that k < u < k + 1. The unique minimum MSE integer-valued unbiased predictor of Y is the randomized predictor defined by

    Y_unb ~ (    k       k+1 )
            ( 1-u+k      u-k )

Proof. See Appendix.

4. Prediction when data are available

When an observable random variable X is available and is correlated with Y, we can improve the prediction of Y by making use of the conditional distribution of Y, given X. The MSE of an arbitrary predictor g(X) can be expressed as

    E[Y - g(X)]^2 = E( E[ (Y - g(X))^2 | X ] ).

Finding the integer-valued predictor g(X) with the smallest MSE amounts to finding, for each X, an integer g(X) that minimizes the conditional MSE, E[(Y - g(X))^2 | X]. Each of the conditional minimization problems is equivalent to the 'No Data Problem' considered in Section 3, with the marginal distribution of Y replaced by the conditional distribution of Y, given X.
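The recipe of Section 4 reduces to applying Theorem 1 separately to each conditional distribution. A small Python sketch follows, using a made-up joint pmf purely for illustration (the joint table and the function names are mine, not from the paper):

```python
# Hypothetical joint pmf of (X, Y), used only to illustrate the recipe.
joint = {
    (0, 0): 0.15, (0, 1): 0.20, (0, 2): 0.05,
    (1, 0): 0.05, (1, 1): 0.15, (1, 2): 0.40,
}

def conditional_pmf(x):
    """Conditional pmf of Y given X = x."""
    px = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / px for (xi, y), p in joint.items() if xi == x}

def g(x):
    """Integer-valued minimum-MSE predictor: round the conditional mean
    (Theorem 1).  In the tie case mu = k + 0.5 there are infinitely many
    optimal predictors, and Python's round() simply selects one of them."""
    pmf = conditional_pmf(x)
    mu = sum(y * p for y, p in pmf.items())
    return round(mu)
```

For x = 0 the conditional mean is 0.75, so g(0) = 1; for x = 1 it is about 1.58, so g(1) = 2.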
Appendix

Lemma 1. Suppose u is a non-integer real number.
(i) If k < u < k + 0.5 for some integer k, then (j - k)^2 >= 2(u - k)(j - k) for any integer j, with equality if and only if j = k.
(ii) If k + 0.5 < u < k + 1 for some integer k, then [j - (k+1)]^2 >= 2[u - (k+1)][j - (k+1)] for any integer j, with equality if and only if j = k + 1.

Proof. The hypothesis in (i) implies 0 < 2(u - k) < 1. If j > k, then j - k >= 1 > 2(u - k), implying the result. If j < k, then j - k < 0 < 2(u - k), implying the result. Equality in (i) clearly attains if j = k. The proof of (ii) follows similarly by noting that -1 < 2[u - (k+1)] < 0 for this case.

Proof of Theorem 1. The theorem will be proved by considering the three cases: (1) k < u < k + 0.5; (2) k + 0.5 < u < k + 1; and (3) u = k + 0.5. In what follows, Y* will denote an arbitrary integer-valued predictor of Y. Each Y* can be thought of as a randomized predictor; in the case where Y* is a constant, the associated probability distribution is degenerate. The fact that Y* is statistically independent of Y (a result of the randomization step) is used throughout the proof.

Case (1): It follows from (i) of Lemma 1 that, for each possible value y of Y*,

    (y - k)^2 >= 2(u - k)(y - k),

with equality if and only if y = k = Round(u). Thus,

    E(Y* - k)^2 >= 2(u - k)[E(Y*) - k],

with equality if and only if Y* = Round(u). Since Y* is independent of Y, expanding both squares shows that E(Y - Y*)^2 = E(Y - k)^2 + E(Y* - k)^2 - 2(u - k)[E(Y*) - k], and hence

    E(Y - Y*)^2 >= E(Y - k)^2,

with equality if and only if Y* = Round(u), proving the theorem for Case (1).

Case (2): By using (ii) of Lemma 1, the proof of Case (2) is similar to the proof of Case (1).

Case (3): For each possible value y of Y*, (y - k)^2 >= y - k, with equality if and only if y = k + 1 or y = k. Thus, E(Y* - k)^2 >= E(Y* - k), with equality if and only if the probability distribution of Y* is concentrated exclusively on k and k + 1. It follows that E(Y - Y*)^2 >= E(Y - k)^2, with equality if and only if Y* is a randomized predictor with a probability distribution exclusively on k and k + 1. Thus, any Y* of the form

    Y* ~ ( k   k+1 )
         ( p   1-p )

(0 <= p <= 1) will be a minimum MSE integer-valued predictor, proving the theorem for Case (3).

Lemma 2. Let Y* be an arbitrary integer-valued random variable with mean u, and suppose k < u < k + 1 for some integer k. Then,

    Var(Y*) >= (u - k)[1 - (u - k)],

with equality if and only if

    Y* ~ (    k       k+1 )
         ( 1-u+k      u-k )

Proof. (Y* - k)[Y* - (k+1)] >= 0, with equality if and only if Y* puts all of its mass on k and k + 1. It follows that E(Y*^2) >= ku + (k+1)(u - k), with equality if and only if Y* puts all of its mass on k and k + 1. Thus, Var(Y*) >= (u - k)[1 - (u - k)], with equality if and only if Y* puts all of its mass on k and k + 1. Since E(Y*) = u, equality follows if and only if Y* has the distribution given in the statement of the lemma.

Proof of Theorem 2. Let Y* be an arbitrary integer-valued unbiased predictor of Y. Then, the MSE of Y* is

    E(Y - Y*)^2 = Var(Y) + Var(Y*).

Since E(Y*) = u, it follows from Lemma 2 that

    Var(Y*) >= (u - k)[1 - (u - k)],

with equality if and only if Y* = Y_unb, given in the statement of the theorem. Thus,

    E(Y - Y*)^2 >= Var(Y) + (u - k)[1 - (u - k)],

with equality if and only if Y* = Y_unb, proving the theorem.

References

Kempthorne, O. (1989), The fate worse than death and other curiosities and stupidities, Amer. Statist. 43, 133-134.
Rao, C.R. (1973), Linear Statistical Inference and its Applications, 2nd ed. (Wiley, New York).
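As a closing numerical check, Theorem 2's randomized predictor and the variance bound of Lemma 2 can both be verified by simulation. A minimal Python sketch (the function name is mine; u = 1.3 as in the Section 2 example):

```python
import math
import random

def unbiased_predictor(mu, rng):
    """Theorem 2's randomized predictor: returns k + 1 with probability
    mu - k and k with probability 1 - (mu - k), where k = floor(mu)
    (mu is assumed non-integer)."""
    k = math.floor(mu)
    return k + 1 if rng.random() < mu - k else k

rng = random.Random(12345)
mu = 1.3
draws = [unbiased_predictor(mu, rng) for _ in range(200_000)]

sample_mean = sum(draws) / len(draws)   # close to mu, confirming unbiasedness
frac = mu - math.floor(mu)
var_bound = frac * (1 - frac)           # Lemma 2's minimal variance: 0.3 * 0.7 = 0.21
```

The empirical mean of the draws should be near 1.3 and their empirical variance near the bound 0.21, which Lemma 2 says no unbiased integer-valued predictor can beat.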