Updated: 27 June 2012

*Note: This documentation is for the SQL2008 (and later) version of this XLeratorDB function, it is not compatible with SQL Server 2005.*

Click here for the SQL2005 version of the KSTEST function

Use the aggregate function KSTEST to compare a sample with a reference probability distribution. The Kolmogorov-Smirnov statistic qualifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference sample.

the sample values. *@x* is an expression of type **float** or of a type that can be implicitly converted to **float**.

the values of the cumulative distribution function. *@cdf* is an expression of type **float** or of a type that can be implicitly converted to **float**.

identifies the return value as either the Kolmogorov-Smirnov statistic (k) or the p-value (p).

float

· If *@x* is null then *@x* and the corresponding *@cdf* are not included in the aggregate.

· If *@cdf* is null then *@cdf* and the corresponding *@x* are not included in the aggregate.

· *@statistic* must be invariant for the GROUP.

· KSTEST is an aggregate function and follows the same conventions as all other aggregate functions in SQL Server.

· To evaluate two samples, use KSTEST2

In this example, we compare the sample (-2,-1, 0, 1, 2, 3, 4) to the standard normal distribution.

SELECT wct.KSTEST(x, wct.NORMSDIST(x),'k') as k_observed

,wct.KSTEST(x, wct.NORMSDIST(x),'p') as p_value

FROM (VALUES (-2),(-1),(0),(1),(2),(3),(4)) n(x)

This produces the following result.

k_observed p_value

---------------------- ----------------------

0.412773317497114 0.135890127950776

In this example, we compare the sample x, consisting of 1,000 randomly generated numbers in the standard normal distribution, to the standard normal distribution.

SELECT wct.KSTEST(x,wct.NORMSDIST(x),'P') as p_value

,wct.KSTEST(x,wct.NORMSDIST(x),'k') as k_observed

FROM (

SELECT k.seriesvalue as x

FROM wctMath.wct.SeriesFloat(0,1,NULL,1000,'N') k

) n(x)

This produces the following result. Your results will be different.

p_value k_observed

---------------------- ----------------------

0.248195490618504 0.0321196162767327

In this example, we compare the uniformaly random numbers to the gamma distribution with shape 1 and scale 1.

SELECT wct.KSTEST(x,wct.GAMMADIST(x,1,1,'True'),'P') as p_value

,wct.KSTEST(x,wct.GAMMADIST(x,1,1,'True'),'k') as k_observed

FROM (

SELECT k.seriesvalue as x

FROM wctMath.wct.SeriesFloat(0,1,NULL,100,'R') k

) n(x)

This produces the following result. Your results will be different.

p_value k_observed

---------------------- ----------------------

1.09546692948201E-12 0.368492604785191