Statistics & Probability Letters 16 (1993) 297-300 16 March 1993
North-Holland
Predicting the value of an integer-valued
random variable
Daniel R. Jeske
AT&T Bell Laboratories, Holmdel, NJ, USA
Received September 1991
Revised July 1992
Abstract: Predicting the value of a random variable Y, based on the observed value of another random variable X, is a common
objective of data analysis. It is well-known that the minimum mean-squared error predictor of Y is the mean of the conditional
distribution of Y, given X. In cases where Y is necessarily integer-valued, the conditional mean is not always a feasible value for Y
and, therefore, is an unsatisfactory predicted value. In this paper, it is shown how minimum mean-squared error integer-valued
predictors can be obtained.
Keywords: Mean-squared error, integer-valued distributions, unbiased prediction.
1. Introduction

It is well-known that the minimum mean-squared error predictor of a random variable Y, based on the value of another random variable X, is the conditional expected value, E(Y | X) (e.g., Rao, 1973). In applications where Y is a continuous random variable, E(Y | X) is at least a feasible solution to the prediction problem and in many cases a satisfactory solution as well. When Y is integer-valued, however, E(Y | X) will not generally be a possible value for Y, and consequently, will not be a reasonable predicted value (e.g., Kempthorne, 1989).

In this paper, the problem of predicting an integer-valued random variable with an integer-valued predictor is considered. The criterion used for evaluating alternative predictors is mean-squared error (MSE) of prediction. It is shown that there is either a unique minimum MSE predictor, or there are an infinite number of predictors with the same minimal value of MSE. Unbiased prediction is also considered. In general, an unbiased predictor will have a randomization step associated with it. The predictor is defined by specifying a probability distribution on the integers, and predictions are made by simulating values from this distribution.

In Section 2, the problem to be considered is illustrated by an example. Relevant notation and terminology are also introduced. Results are given in Section 3 for the 'No Data Problem', or contexts for which there is no external information in the form of X to base the prediction of Y upon. Obtaining predictors for contexts where X is available (Section 4) is facilitated by having results for the 'No Data Problem' at hand. The proofs of Theorems 1 and 2 (Section 3) are given in the Appendix.

Correspondence to: Daniel R. Jeske, AT&T Bell Laboratories, Holmdel, NJ 07733, USA.

0167-7152/93/$06.00 © 1993 - Elsevier Science Publishers B.V. All rights reserved

2. Example and notation

Let Y have a probability distribution defined on the integer values 0, 1 and 2 with weights 0.2, 0.3
and 0.5, respectively. Notationally, this will be expressed as

    Y ~ ( 0    1    2
          0.2  0.3  0.5 ),

where the top row lists the possible values and the bottom row their probabilities.

Suppose it is desired to predict a future value of Y, and that no external information related to Y (i.e., X) is available. The minimum MSE predictor of Y is Ŷ_min = 1.3 (the mean value of Y). Clearly, Ŷ_min is not a feasible value for Y, and thus is an unsatisfactory predictor of Y. An alternative predictor is obtained by rounding Ŷ_min to the nearest integer to give the predictor Ŷ_mse = 1. It follows from Theorem 1 in Section 3 that Ŷ_mse is the unique minimum MSE integer-valued predictor of Y. In cases where Ŷ_min is of the form k + 0.5, for some integer k, a unique minimum MSE integer-valued predictor does not exist. It is shown that for this case there are an infinite number of integer-valued predictors that all have the same minimal value of MSE.

An arbitrary predictor Ŷ is an unbiased predictor of Y if E(Ŷ) = E(Y). Typically, unbiased integer-valued predictors will require a randomization step. For this example, it follows from Theorem 2 in Section 3 that the unique minimum MSE unbiased integer-valued predictor of Y is Ŷ_mseu, defined according to the following probability distribution:

    Ŷ_mseu ~ ( 0  1    2
               0  0.7  0.3 ).

Values for Ŷ_mseu are obtained by simulating the values 0, 1 and 2 according to this probability distribution. In general, Ŷ_mseu will be the randomized predictor whose associated probability distribution proportionally divides its mass on the two integer values that are adjacent to the mean of Y.

3. No data problem

Consider the case where no external information about Y, in the form of an observable random variable X, is available. Suppose the mean value, μ, of Y is known. If μ is an integer, it is the integer-valued predictor with the smallest MSE. The following theorem identifies the minimum MSE predictor(s) for the case where μ is not an integer.

Theorem 1. Let Y be an integer-valued random variable with a non-integer mean, μ. Define k to be the largest integer less than μ. The unique minimum MSE integer-valued predictor of Y is Ŷ_mse = Round(μ) (the integer nearest to μ), unless μ = k + 0.5, in which case all randomized predictors of the form

    Ŷ_mse ~ ( k  k+1
              p  1-p )

(for arbitrary 0 ≤ p ≤ 1) are minimum MSE integer-valued predictors.

Proof. See Appendix. □

When μ is not an integer, there are no non-randomized unbiased predictors of Y. To predict Y unbiasedly, Ŷ must be a randomized predictor defined by a probability distribution on the integer values. Theorem 2 shows that there is a unique minimum MSE unbiased integer-valued predictor and that it is the randomized predictor that proportionally divides its probability mass on the integer values that are adjacent to μ.

Theorem 2. Let Y be an integer-valued random variable with a non-integer mean μ. Let k be the integer such that k < μ < k + 1. The unique minimum MSE integer-valued unbiased predictor of Y is the randomized predictor defined by

    Ŷ_mseu ~ ( k       k+1
               1-μ+k   μ-k ).

Proof. See Appendix. □

4. Predictor data available

When an observable random variable, X, is available and is correlated with Y, we can improve the prediction of Y by making use of the conditional distribution of Y, given X.
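Before making this precise, the 'No Data' results above can be checked numerically on the example of Section 2. The sketch below (plain Python; the function and variable names are illustrative, not from the paper) computes the MSE of every constant integer predictor and of the randomized unbiased predictor, confirming that Round(1.3) = 1 minimizes the MSE and that unbiasedness costs the extra variance of the randomization:

```python
# Example from Section 2: Y takes values 0, 1, 2 with weights 0.2, 0.3, 0.5.
pmf_y = {0: 0.2, 1: 0.3, 2: 0.5}
mu = sum(y * p for y, p in pmf_y.items())        # E(Y) = 1.3

def mse_constant(c):
    """MSE of the constant predictor c: E(Y - c)^2."""
    return sum(p * (y - c) ** 2 for y, p in pmf_y.items())

# Theorem 1: Round(mu) = 1 is the unique minimum MSE integer predictor.
best = min(range(-10, 11), key=mse_constant)     # best == 1, MSE = 0.7

# Theorem 2: the minimum MSE unbiased predictor randomizes over k = 1 and
# k + 1 = 2 with weights 1 - (mu - k) = 0.7 and mu - k = 0.3.
k = 1
pmf_pred = {k: 1 - (mu - k), k + 1: mu - k}

def mse_randomized(qmf):
    """MSE of a randomized predictor drawn independently of Y:
    E(Y - Yhat)^2 = E(Y^2) - 2 E(Y) E(Yhat) + E(Yhat^2)."""
    ey2 = sum(p * y * y for y, p in pmf_y.items())
    eh = sum(q * j for j, q in qmf.items())
    eh2 = sum(q * j * j for j, q in qmf.items())
    return ey2 - 2 * mu * eh + eh2
```

For this example, `mse_randomized(pmf_pred)` equals Var(Y) + Var(Ŷ_mseu) = 0.61 + 0.21 = 0.82, strictly larger than the unconstrained minimum 0.7 attained by the rounded mean, as the theory predicts.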
The MSE of an arbitrary predictor, g(X), can be expressed as

    E[Y - g(X)]² = E(E[(Y - g(X))² | X]).

Finding the integer-valued predictor g(X) with the smallest MSE amounts to finding, for each X, an integer g(X) that minimizes the conditional MSE, E[(Y - g(X))² | X]. Each of the conditional minimization problems is equivalent to the 'No Data Problem' considered in Section 3, with the marginal distribution of Y replaced by the conditional distribution of Y, given X.

Appendix

Lemma 1. Suppose μ is a non-integer real number.
(i) If k < μ < k + 0.5 for some integer k, then (j - k)² ≥ 2(μ - k)(j - k) for any integer j, with equality if and only if j = k.
(ii) If k + 0.5 < μ < k + 1 for some integer k, then [j - (k+1)]² ≥ 2[μ - (k+1)][j - (k+1)] for any integer j, with equality if and only if j = k + 1.

Proof. The hypothesis in (i) implies 0 < 2(μ - k) < 1. If j > k, then j - k ≥ 1 > 2(μ - k), implying the result. If j < k, then j - k < 0 < 2(μ - k), implying the result. Equality in (i) clearly attains if j = k. The proof of (ii) follows similarly by noting that -1 < 2[μ - (k + 1)] < 0 for this case. □

Proof of Theorem 1. The theorem will be proved by considering the three cases: (1) k < μ < k + 0.5; (2) k + 0.5 < μ < k + 1; and (3) μ = k + 0.5. In what follows, Ŷ will denote an arbitrary integer-valued predictor of Y. Each Ŷ can be thought of as a randomized predictor. In the case where Ŷ is a constant, the associated probability distribution is degenerate. The fact that Ŷ is statistically independent of Y (a result of the randomization step) is used throughout the proof.

Case (1): It follows from (i) of Lemma 1 that, for each possible value, y, of Ŷ,

    (y - k)² ≥ 2(μ - k)(y - k),

with equality if and only if y = k = Round(μ). Thus,

    E(Ŷ - k)² ≥ 2(μ - k)[E(Ŷ) - k],

with equality if and only if Ŷ ≡ Round(μ). It follows that

    E(Y - Ŷ)² ≥ E(Y - k)²,

with equality if and only if Ŷ ≡ Round(μ), proving the theorem for Case (1).

Case (2): By using (ii) of Lemma 1, the proof of Case (2) is similar to the proof of Case (1).

Case (3): For each possible value, y, of Ŷ, (y - k)² ≥ y - k, with equality if and only if y = k + 1 or y = k. Thus, E(Ŷ - k)² ≥ E(Ŷ - k), with equality if and only if the probability distribution of Ŷ is concentrated exclusively on k and k + 1. It follows that E(Y - Ŷ)² ≥ E(Y - k)², with equality if and only if Ŷ is a randomized predictor with a probability distribution exclusively on k and k + 1. Thus, any Ŷ of the form

    Ŷ ~ ( k  k+1
          p  1-p )

(0 ≤ p ≤ 1) will be a minimum MSE integer-valued predictor, proving the theorem for Case (3). □

Lemma 2. Let Ŷ be an arbitrary integer-valued random variable with mean μ, and suppose k < μ < k + 1 for some integer k. Then,

    Var(Ŷ) ≥ (μ - k)[1 - (μ - k)],

with equality if and only if

    Ŷ ~ ( k       k+1
          1-μ+k   μ-k ).

Proof. (Ŷ - k)[Ŷ - (k + 1)] ≥ 0, with equality if and only if Ŷ puts all of its mass on k and k + 1. It follows that E(Ŷ²) ≥ kμ + (k + 1)(μ - k), with equality if and only if Ŷ puts all of its mass on k and k + 1. Thus, Var(Ŷ) ≥ (μ - k)[1 - (μ - k)], with equality if and only if Ŷ puts all of its mass on k and k + 1. Since E(Ŷ) = μ, equality follows if and only if Ŷ has the distribution given in the statement of the lemma. □

Proof of Theorem 2. Let Ŷ be an arbitrary integer-valued unbiased predictor of Y. Then, the MSE of Ŷ is

    E(Y - Ŷ)² = Var(Y) + Var(Ŷ).
Since E(Ŷ) = μ, it follows from Lemma 2 that

    Var(Ŷ) ≥ (μ - k)[1 - (μ - k)],

with equality if and only if Ŷ ≡ Ŷ_mseu, given in the statement of the theorem. Thus,

    E(Y - Ŷ)² ≥ Var(Y) + (μ - k)[1 - (μ - k)],

with equality if and only if Ŷ ≡ Ŷ_mseu, proving the theorem. □

References

Kempthorne, O. (1989), The fate worse than death and other curiosities and stupidities, Amer. Statist. 43, 133-134.
Rao, C.R. (1973), Linear Statistical Inference and its Applications (Wiley, New York, 2nd ed.).
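The variance inequality of Lemma 2, which drives Theorem 2, holds for every integer-valued distribution with a non-integer mean, and it can be probed numerically. The sketch below (Python; the random-sampling scheme and names are illustrative assumptions, not from the paper) samples arbitrary probability vectors on a small integer support and checks the bound, then verifies that the two-point distribution of Theorem 2 attains it:

```python
import math
import random

random.seed(0)

def random_pmf(support):
    """A random probability vector on the given integer support."""
    w = [random.random() for _ in support]
    s = sum(w)
    return {v: x / s for v, x in zip(support, w)}

# Lemma 2: if Y has non-integer mean m and k = floor(m), then
# Var(Y) >= (m - k) * (1 - (m - k)).
violations = 0
for _ in range(2000):
    pmf = random_pmf(range(0, 6))
    m = sum(v * p for v, p in pmf.items())
    var = sum(p * (v - m) ** 2 for v, p in pmf.items())
    k = math.floor(m)
    if m != k:  # the lemma applies only to non-integer means
        if var < (m - k) * (1 - (m - k)) - 1e-12:
            violations += 1

# The two-point distribution of Theorem 2 attains the bound exactly:
# mass 1 - (mu - k) on k and mu - k on k + 1, here with mu = 1.3, k = 1.
mu, k = 1.3, 1
two_point = {k: 1 - (mu - k), k + 1: mu - k}
m2 = sum(v * p for v, p in two_point.items())
var2 = sum(p * (v - m2) ** 2 for v, p in two_point.items())
# var2 equals (mu - k) * (1 - (mu - k)) = 0.21, up to rounding
```

No sampled distribution violates the bound, and the two-point distribution achieves it with equality, consistent with the uniqueness claim of Theorem 2.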