Home

Computing Poisson probabilities

1. Inputs Poisson parameter error tolerance e underflow threshold T overflow threshold Q Outputs left truncation point L right truncation point R weights w L w R total weight W flag F Subroutine required FINDER A e 7 Q L R w m F see section 3 Comments w i p i W where p i Poisson probability For A 400 mass left of L 2 mass right of R lt 2 For 0 lt lt 400 mass left of L s e 2 note L 0 for lt 25 mass right of R lt 2 6 X 10 r Q see section 3 Section 3 explains where the 400 comes from F true if no underflow can occur while comput ing the weights F false indicates that weights are not computed due to potential underflow w i W may underflow even when F true Initialize Set m LAJ mode Get L R w m and F from FINDER if F false exit Down Set j m While j gt L execute w j 1 j Nw joj 1 Up If lt 400 go to Special Else set j m While j lt R execute w j 1 A G Dwl fjcjti Compute W Comment We want to compute W lt w L w R Comment To attenuate roundoff we add small terms first W0 s lt L te R While s lt t execute If w s lt w t then W lt W v s se st l else 442 Communications of the ACM W W wit t lt t 1 W lt W wfs Exit Special If R gt 600 set F false and then exit Comment Underflow is possible but n
2. 25 we set L 0 and R Rago The latter is justified since the mass in the right tail decreases with A Weigh ter checks that Raoo 600 for e corresponding to k 7 Raoo 600 It then assures that properly scaled proba bilities do not underflow resetting R if necessary The error bound is then e 2 10 Raoo R t Q S 2 6 x 10 77 Q The second term is negligible when e gt 10727 9 which holds for e 107 and the computers considered in section 3 Let Cm 1 V2am exp m 1 12m According to Feller 3 page 54 the following bound supplements Stirling s formula n lt V2rn n e e1 12 It readily follows that pm Cm Section 6 proves PROPOSITION 5 For i gt 0 p m i py m exp i i 1 2A cmexp i 1 2d Communications of the ACM PROPOSITION 6 For 0 lt i lt 4 2 O pim pme E ongin e i ie gt a es ei Cm Xp Dr 5 ii ForO lt ism pai i en COROLLARY 3 Let k k V2 3 2V Then for k gt 0 pr lm k V2A 3 21 p Lm VXI gt cmexp k 1 2 COROLLARY 4 Let k 3 2 i Foro lt lt VX 2 prim kVA 3 21 palm EVAN gt Cmexp k 2 R J3 VN ii For k lt vm 1 m prim kVA 3 21 p m Km 11 k y m 1 Cm 1 vm 1 iii For k lt vm 1 m pim kVA 3 21 p 0 e gt We suggest using i when applicable the bound is then at least c e
3. 440 RESEARCH CONTRIBUTIONS Management Science and Operations Research Harvey J Greenberg Editor Computing Poisson Probabilities BENNETT L FOX and PETER W GLYNN ABSTRACT We propose an algorithm to compute the set of individual nonnegligible Poisson probabilities rigorously bound truncation error and guarantee no overflow or underflow Work and space requirements are modest both proportional to the square root of the Poisson parameter Our algorithm appears numerically stable We know no other algorithm with all these good features Our algorithm speeds generation of truncated Poisson variates and the computation of expected terminal reward in continuous time uniformizable Markov chains More generally our algorithm can be used to evaluate formulas involving Poisson probabilities 1 INTRODUCTION We give an algorithm to compute the set of individual nonnegligible Poisson probabilities needed for exam ple in the applications pointed out in section 1 1 To get finite termination we clearly have to truncate the Poisson distribution Unlike previous contributions we bound the mass in the truncated tails from above and the rernaining individual probabilities from below Given any reasonable e see section 3 we find a left truncation point L and a right truncation point R such Bennett L Fox s research was supported by a grant from the Natural Sciences and Engineering Research Council of Canada Pet
4. T m i lt b 8 3 2 VN We reparameterize these bounds with the substitu tions i 3 2 V2 kVA and i 3 2 kVX respec Communications of the ACM 443 Research Contributions 444 tively This gives Qm kV2X 3 21 lt a d k NB K Tr lm kVA 3 21 lt b B K where d k X 1 1 exp 2 9 k V2X 3 2 and T j 0 for j lt 0 For A 25 and k 3 we get d k A 1 007 From Abramowitz and Stegun 1 page 932 we get PROPOSITION 4 If x gt 0 then x x x with error less than x x Apply proposition 4 to Glynn s reparameterized bounds to get COROLLARY 1 If 2 and 1 2V2 lt k lt VA 2V2 then Q lm kV2X 3 21 lt and k Ne 2 k V2 COROLLARY 2 If 2 and k 1 V2 then Ty Lm kVA 3 23 bye k 2x Corollary 1 does not contradict the fact that for large enough truncation points the mass in the right Poisson tail is an order of magnitude greater than the mass in the corresponding normal tail In corollary 1 the trun cation point is at most m A 2 3 21 5 BOUNDING POISSON PROBABILITIES We bound the Poisson probabilities p i from below to guarantee that properly scaled they do not underflow for L i R By the monotonicity of p i to the left and to the right of m LAJ it suffices to check only pa L and p R The programs Finder and Weighter use corollaries 3 and 4 below only for 25 For0 lt A lt
5. assures that W lt 0 107 The factor 107 typically prevents overflow when the weights are sub sequently used as in proposition 1 for example To check for underflow we scale the lower bounds in cor ollaries 3 and 4 by 2 10 R L before comparing them to 7 Based on the discussion below writing a subroutine FINDER A e 7 Q L R w m F as required by WEIGHTER is straightforward 1 If 0 then set L R 0 and F false 2 If0 lt lt 25 then L 0 If e lt 7 set F false and exit If 0 lt A lt 400 find R using corollary 1 of section 4 with A 400 Increase k through the posi tive integers greater than 3 until the upper bound is less than 2 Set F true 3 If A 400 use corollary 1 with the actual A and proceed as above to find R Evaluate the lower bound in corollary 3 of section 5 multiplied by 0 10 R L at the k corresponding to R If the result is less than 7 set F false and exit If 25 then find L using corollary 2 of section 4 with the actual A Evaluate the lower bound in cor ollary 4 of section 5 multiplied by Q 10 R L at the k corresponding to L If the result is greater than 7 set F true else set F false From inspection of corollaries 1 and 2 we see that the k s found above are o log 1 e growing very slowly as e decreases In corollaries 1 and 2 the factor exp k 2 dominates when 25 say At k 7 thi
6. e the set of probabil ities p L p R He also gives a numerically stable method to estimate the masses in each tail with pre scribed relative error These bounds can be transformed into corresponding bounds on absolute error Whether such transformed bounds are looser or tighter than ours is an open question Our bounds take O 1 time to com pute independently of all other parameters The time and space complexities to compute Kntisel s bounds are unspecified and unclear Our bounds on tail masses as estimates of them probably would have high relative error In view of proposition 1 this seems irrelevant to the applications cited above in other applications how ever small relative error may be important As a check for programming errors our w m W should be approxi mately Kniisel s estimate of p m 1 3 Overview In section 2 we give an algorithm to find the weights which requires a subroutine to find L R and w m and to check that underflow will not occur We sketch that subroutine in section 3 based on bounds in sections 4 and 5 Section 6 justifies the bounds in section 5 In section 7 we outline generalizations to other discrete distributions 2 COMPUTING WEIGHTS In this section we present an easily programmed algo rithm to compute the weights followed by an analysis of roundoff error Algorithm WEIGHTER A e 7 Q L R w L w R W F Communications of the ACM 441 Research Contributions
7. er W Glynn s research was supported by the National Science Foundation under Grant ECS 8404809 and the U S Army under Contract Number DAAG29 80 C 0041 1988 ACM 0001 0782 88 0400 0440 1 50 Communications of the ACM that for a Poisson random variable N with parameter A we have P L lt N lt R 1 e and R L O v Starting from a mode m LAJ we recursively compute weights w m 1 w m 2 w L and w m 1 w m 2 w R such that for some constant a we have w i ap i for L lt i R where p i P N i a W and W w L w R Our new lower bounds let us scale the weights via the choice of w m to guarantee without repeated checking that no under flows or overflows occur Our new upper bounds let us avoid estimating tail masses by summing estimated probabilities in the complementary region a computa tion likely to be significantly contaminated by roundoff error As far as we know no other published algorithms for computing Poisson probabilities use rigorous practi cal bounds such as ours and so these algorithms are unsatisfactory Such bounds do not seem readily acces sible elsewhere The overall work and space complexities are both O VA which lets us handle all practical s Our algo rithm does not restrict A Today s desktop computers can easily handle A in the range 0 to 10 say We give a crude analysis of roundoff error that suggests that our method is numerica
8. ft trun cation it would take O A time For the terminal reward problem considered by Gross and Miller 9 the required f j s can be computed in O V time using successive matrix squaring up to L as Fox 4 details On the other hand consider using the weights to com pute Erlang s loss formula e g see Gross and Harris with c servers as w c w L w c for L c R If L 0 the error is zero ignoring roundoff and assum ing no underflow occurs but computing the weights then takes O A time For L gt 0 we do not attempt a formal error analysis here which would require lower as well as upper bounds on the left tail Our bounds may indicate a reasonable choice of R For another example suppose that we want to generate truncated Poisson variates by efficiently implemented inversion or by the alias method Simply scale standard uniform variates as u lt uW and then use the w j s directly Our algorithm reduces the setup time from O A for a naive method with no error bounds to O VA with bounds on truncation error Fox 5 bound the loss in coverage probability for confi dence intervals constructed from truncated variates with any distribution Bratley Fox and Schrage 2 ar gue that truncation avoids statistical anomalies and al lows variate generation by efficiently implemented in version in O 1 marginal time independently of the truncation point Inversion is compatible with common random number
9. lly stable Possibly w i W may underflow for some i but this is ordinarily irrelevant if the computations are properly arranged For example we can compute a weighted sum w L f L w R f R and only then divide the sum by W Like wise the overall error in a computation is ordinarily small even though the individual estimated probabili ties are all a bit off To illustrate we note April 1988 Volume 31 Number 4 PROPOSITION 1 Let f be a real valued function with Kft sup f j j 0 1 2 pandP ILSNsR 1 e In exact arithmetic R ry 1 W wf E PFO ze IfI I j Proof It suffices to show that j2o 4 j p j 2e where q t w i W p i 8 with 8 p L p R Now R 2 l4 pi S 2 4li pli e j 1 gt ptifs a es2e o i L 8 For large A our L is A O VA and our R is OVN This verifies that our overall work and space complexities are both O VA In contrast the 1982 IMSL routine asks the user to specify a parameter k not necessarily related to A and outputs estimates of p 0 p p k This amounts to setting L 0 The k specified by any reasonable user will be O A Thus in practice the IMSL routine requires O A work 1 1 Motivation Gross and Miller 9 and Fox 4 consider problems of this form in which large values of typically occur Given the f j s the first summation in proposition 1 can be computed in O Vd time without the le
10. ode or the mean 2 find appropriate truncation points L and R from up In response to membership requests CURRICULA RECOMMENDATIONS FOR COMPUTING Volume I Volume II Volume IIL Curricula Recommendations for Computer Science Curricula Recommendations for Information Systems Curricula Recommendations for Related Computer Science Programs in Vocational Technical Schools Community and Junior Colleges and Health Computing Information available from the ACM Order Dept 1 800 342 6626 in Maryland Alaska or Canada call 301 528 4261 April 1988 Volume 31 Number 4 Communications of the ACM 445
11. ories and Subject Descriptors F 2 1 Numerical Algorithms and Problems G 3 Probability and Statistics General Terms Algorithms Bounds Additional Key Words and Phrases Overflow Poisson probabilities truncation underflow variate generation Received 5 86 revised 1 87 accepted 4 87 Authors Present Address Bennett L Fox Department of Computer Sci ence University of Montreal C P 6128 Station A Montreal Quebec H3C 3J7 Canada Peter W Glynn Department of Operations Research tions U S Dept of Commerce National Bureau of Standards Appl D 5 ed Springer Verlag New York 1987 3 Fox B L Numerical methods for transient Markov chains Technical cal report Cornell University 1988 Pe pelea i 1 6 P 7 Glynn P W Upper bounds on Poisson tail probabilities Operations New York 1985 Operations Research 32 2 March April 1984 343 361 7 CONCLUDING REMARKS practical purposes There are similar tradeoffs between metric distributions Finding good tradeoffs for such individual nonnegligible probabilities from these distri Stanford University Stanford CA 94305 4022 ee 1 Abramowitz M and Stegun I E Handbook of Mathematical Func Math Series 55 1972 2 Bratley P Fox B L and Schrage L E A Guide to Simulation 2nd Feller W An Introduction to Probability Theory and Its Applications 1 Wiley New York 1968 ps 4 ifi 1 _ iG 1 2i 1 report Cornell Univer
12. ot certain section 3 explains where the 600 comes from Else set j m While j lt R execute q4 X j 1 If w j gt 7 4 then set w j 1 lt qw j andj lt j 1 else set Rj go to Compute W Go to Compute W Computing each new weight takes two floating point operations accounting for the 2 in the exponents be low We define the relative roundoff error to be the maximum of the ratio of the computed result to the true result and the reciprocal of that ratio With multi plication and division bounds on relative roundoff er rors multiply Let u be the unit roundoff Here u 2 where b is the number of bits in the mantissas of the computer s floating point numbers The relative round off error accumulated in Down and Up is at worst O 1 uj and O 1 u respectively likely gross overestimates For example 1 1077 1 00010 The implicit proportionality factor associated with estimates of roundoff error depends on whether floating point arithmetic chops or rounds It is less than 1 01 in all cases if we replace u by 10u Since Compute W involves only positive numbers it follows from Gill Murray and Wright 6 pages 11 12 for example that the associated relative roundoff error is at worst O 1 u Consider using double preci sion to attenuate it The actual roundoff error is prob ably much less than the bound because we add small weights first Pushing the principle of adding small term
13. s factor equals 6 2 x 1071 approximately and we get R L lt 20V Suppose that e 10 One routinely checks that with k 7 the bounds are less than 107 for 25 If programming in a language like Fortran where storage for the weights must be assigned in ad vance a generous rule of thumb is to allow max T20V21 600 cells When A 400 then corollary 1 applies for R lt 600 which corresponds to k 7 This explains the 600 above If R is not reset in special then the mass to its right is at most 2 otherwise the mass to the right of R is at most 2 600 x 10 7 Q because of our choice for w m While computing bounds use the upper bounds on a ba and d k A given in section 4 and the lower bound on cm given in section 5 The remaining factor in any particular bound is an exponential say exp 9 k Multiply the bound on a d k A by Or Cm2 10 R L by exp g k Lg k 1 We assume reasonably that there is no underflow up to this point To avoid subse quent underflow multiply the result by e as long as April 1988 Volume 31 Number 4 Research Contributions the current product is greater than ze or until Lg k J multiplications occur whichever happens first For cor ollaries 1 and 2 if underflow would occur with the next hypothetical multiplication then the bound is less than 7 hence acceptable providing e gt 27 as seems reasonable For corollaries 3 and 4 however if under flow
14. s and antithetic variates but rejection methods typically are not April 1988 Volume 31 Number 4 Research Contributions 1 2 Numerical Stability The w j s are slightly inaccurate due to roundoff error They are all positive so no cancellation errors occur when adding them to get W Thus the additional round off error induced by adding the w j s is potentially troublesome only when there are many summands large A If we were simply to add from left to right say then for example w I m R 21 W W in finite precision floating point add arithmetic for large enough Here W w L w f m R 21 1 Thus we will get W W though w I m R 21 w R is not negligible The cure is to add small weights first This widely applicable principle to attenuate round off error is also apparently used in the 1982 IMSL rou tine to compute Poisson probabilities Recursive computation of weights starting from the mode is by itself not a new idea What is new is its coupling with a priori determination of truncation points and a priori underflow checks The obvious choice for w m is one Given the computer s underflow and overflow thresholds we typically choose w m far larger to avoid underflow worries Knisel 10 pages 1028 1029 gives a numerically stable method to compute p i for any fixed i though he gives no corresponding error bounds It would be ineffi cient to use that method to comput
15. s first to the limit we could put all initial sum mands in a heap remove the smallest two insert their sum in the heap and so on until the heap empties In view of Glynn s 7 corollary 1 4 showing roughly that the mass in the tails does not overwhelm the next sum mand such elaborate measures seem unnecessary and the method in WEIGHTER looks good enough Even with that corollary simply terminating when the current weight falls below a heuristic threshold would leave us with no rigorous error bound on the corresponding truncated tail masses The 1982 IMSL routine outputs estimates of p 0 p 1 p k 1 with k specified by the user It does not check whether the user specified k is reasonable for example for large picking k V1 or k is unreasonable If k is at least of order A then the IMSL routine spends most of its time computing estimates of negligible probabilities it may have to rescale often to avoid underflow April 1988 Volume 31 Number 4 The roundoff error associated with L and R does not seem severe but analyzing it would require specifica tion of how the machine computes exponentials As a hedge against roundoff divide the nominal e by 10 say If the nominal e is already small we will see later that this hedge does not have a major impact on L and R 3 FINDING TRUNCATION POINTS We now give a recipe to find L and R given e as required by WEIGHTER The heuristic choice w m 0 107 R L
16. sity 1987 62 5 5 Fox B L and Glynn P W Conditional confidence intervals Techni Gill P E Murray W and Wright M H Practical Optimization Aca demic Press London 1981 Research Letters 6 1 March 1987 9 14 z i 8 Gross D and Harris C M Fundamentals of Queueing Theory Wiley Cm _ m i1 9 Gross D and Miller D R The randomization technique as a model i ing tool and solution procedure for transient Markov processes i ap 5 10 Kniisel L Computation of the chi square and Poisson distribution m i1 SIAM J Sci Stat Comput 7 3 July 1986 1022 1036 Our bounds probably can be tightened by more intri cate analysis As they stand they seem good enough for complexity and tightness of error bounds for other dis crete distributions such as the binomial and hypergeo distributions is a subject for future research Given ap propriate bounds a good strategy to compute the set of butions are similar to our strategy for the Poisson distri bution Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commer cial advantage the ACM copyright notice and the title of the publication and its date appear and notice is given that copying is by permission of the Association for Computing Machinery To copy otherwise or to republish requires a fee and or specific permission 1 choose an appropriate weight for the m
17. would occur then e is too small Again looking at what happens for k 7 in corollary 3 the dominant factor is exp 1 2 2 4 x 10714 for 25 in corollary 4 the dominant factor is exp k2 2 R 3 V 2 1 x 107 for A 196 The discussion following corollary 4 indicates that for k 7 we have to deal with bounds no smaller than 107 for A lt 196 Corresponding tok 7 we get R L s max f20VA1 100 In any case underflow occurs only if a lower bound is less than 10 R L r Q Consider 7 Q for typical computers According to the 1982 VAX 11 Fortran reference manual p 2 7 7 2 107 According to the 1982 CDC Fortran 5 ref erence manual p 1 5 7 2 107 According to the 1981 Itel iAPX 86 88 User s Manual p S 6 for the 8087 Numerical Data Processor 7 Q 10 for single preci sion and 7 Q 107 for double precision Probably our checks for underflow and overflow are superfluous in practice but they are inexpensive to do and guaran tee that when passed no problems can occur 4 BOUNDING POISSON TAILS Let pali e aifil i20 aD ali Tai pal j 0 oe e Van P x i p t dt ay 1 1 A e1 8V2 by 1 1 Aje For 25 we get a S 1 57 and b 1 05 Glynn 7 proves PROPOSITION 2 Suppose A 2 and 2 Si A 3 2 Then Q m i Sa 1 exp 21 9 i 3 2 V2 PROPOSITION 3 Suppose A gt 2 andi 2 Then
18. xp 2k 3 If only ii and iii apply compute both bounds and use the maximum Since for m large _f2 oe k ime vm 1 computing the left side is numerically stable For exam ple with m 63 and k 7 i does not apply and 7 56 m z 2 6X10 see ii e79 5 2X 107 see e 44Xx107 see iii Convergence in is glacial For 25 we get Cm 2 1 5eV2am 0 02935 Vm 6 PROOFS OF PROPOSITIONS FIVE AND SIX We use the following known facts for x 0 for 0 lt x lt 1 2 log x Sx log 1 x x x Right of mode p m i perex 2 log m u n April 1988 Volume 31 Number 4 Research Contributions i per bounds on the tails possibly using a counter pirex gt log 1 Km part to special ie 3 find lower bounds on the individual probabilities i and check for underflow at L and at R p injoxo log 1 k j compute weights recursively outwards to L and to R as compute the total weight by adding smallest terms m pomexo 5 Kr first i e inwards i In this paper we focused on the Poisson distribution because it is most relevant to our interests see 1 1 among the discrete distributions for which the set of nonnegligible individual probabilities is nontrivial to compute p mjexp i i 1 2A Left of mode p m i pemexo log m k A p mexp log 1 Ks p m exp 3 oe Sa IV p m i V CR Categ

Computing Poisson probabilities

Contents

Download Pdf Manuals

Related Search

Related Contents