 SQL Server HYPGEOM.DIST function

HYPGEOM_DIST

Updated: 31 July 2015

Use HYPGEOM_DIST to calculate the probability mass function or the cumulative distributive function of the hypergeometric distribution. The hypergeometric distribution is generally described using colored balls in an urn. Given an urn containing N balls of which K are white (and N-K are black), the hypergeometric distribution calculates the probability of drawing k white balls from a sample of n without replacement (meaning that once a ball is removed from the urn it is not put back).
Syntax
SELECT [wct].[HYPGEOM_DIST](
<@Sample_s, float,>
,<@Number_sample, float,>
,<@Population_s, float,>
,<@Number_population, float,>
,<@Cumulative, bit,>)
Arguments
@Sample_s
the number of successes in the sample. @Sample_s must be of a type float or of type that intrinsically converts to float.
@Number_sample
the size of the sample. @Number_sample must be of a type float or of type that intrinsically converts to float.
@Population_s
the number of successes in the population. @Population_s must be of a type float or of type that intrinsically converts to float.
@Number_population
the size of the population. @Number_population must be of a type float or of a type that intrinsically converts to float.
@Cumulative
a bit value identifying whether the probability mass function ('False') or the cumulative distribution function ('True') is to be returned. @Cumulative is of a type bit or a type that implicitly converts to bit.
Return Type
float
Remarks
·         @Sample_s, @Number_sample, @Population_s, and @Number_population are truncated; only the integer part is used.
·         0 = @Sample_s
·         0 = @Population_s
·         0 < @Number_sample
·         0 < @Number_population
·         @Population_s = @Number_population.
Examples
In an urn with 1000 balls of which 300 are white we want to calculate the probability of having exactly 40 white balls when we randomly draw 150 from the urn.
SELECT
wct.HYPGEOM_DIST(
40    --@Sample_s
,150   --@Number_sample
,300   --@Population_s
,1000 --@Number_population
,'False'
) as pmf

This produces the following result

If we wanted to calculate the probability of having up to 40 balls we would use the following SQL.
SELECT
wct.HYPGEOM_DIST(
40    --@Sample_s
,150   --@Number_sample
,300   --@Population_s
,1000 --@Number_population
,'True'
) as cdf

This produces the following result.

One way to look at the cumulative distribution function is as the sum of probability mass functions, as shown in this example.
SELECT
*
,SUM(hgeom) OVER (ORDER BY sample_s) as cumulative
FROM (
SELECT
seq-1 as sample_s,
wct.HYPGEOM_DIST(SeriesValue,150,300,1000,'False') hgeom
FROM
wct.SeriesInt(0,40,NULL,NULL,NULL)
)n

This produces the following result.

Support  Copyright 2008-2019 Westclintech LLC         Privacy Policy        Terms of Service