KSTEST2
Updated: 27 June 2012
Note: This documentation is for the SQL2008 (and later) version of this XLeratorDB function, it is not compatible with SQL Server 2005.
Click here for the SQL2005 version of the KSTEST2 function
Use the aggregate function KSTEST2 to perform the two-sample Kolmogorov-Smirnov test to compare the distributions of the values in two samples. KSTEST2 produces the test statistic (k) and the p-value (p) which can be used to ascertain if the two samples are from the same continuous distribution.
Syntax
Arguments
@sid
the sample identifier. @sid indentifies which of the two samples @ x belongs to.
@x
the sample value. @x is an expression of type float or of a type that can be implicitly converted to float.
@statistic
identifies the return value as either the Kolmogorov-Smirnov statistic (k) or the p-value (p).
Return Types
float
Remarks
· If @x is null then @x is not included in the aggregate.
· There must exactly 2 @sid within a GROUP.
· @statistic must be invariant for the GROUP.
· KSTEST2 is an aggregate function and follows the same conventions as all other aggregate functions in SQL Server.
· To evaluate one sample against a continuous distribution function, use KSTEST
Examples
In this example, we compare a sample of test scores for men and women
SELECT wct.KSTEST2(s,x,'p') as p_value
,wct.KSTEST2(s,x,'k') as k_observed
FROM (
SELECT 'Men',117 UNION ALL
SELECT 'Men',106 UNION ALL
SELECT 'Men',105 UNION ALL
SELECT 'Men',112 UNION ALL
SELECT 'Men',108 UNION ALL
SELECT 'Men',115 UNION ALL
SELECT 'Men',97 UNION ALL
SELECT 'Men',112 UNION ALL
SELECT 'Men',110 UNION ALL
SELECT 'Women',131 UNION ALL
SELECT 'Women',107 UNION ALL
SELECT 'Women',110 UNION ALL
SELECT 'Women',108 UNION ALL
SELECT 'Women',100 UNION ALL
SELECT 'Women',74 UNION ALL
SELECT 'Women',89 UNION ALL
SELECT 'Women',115 UNION ALL
SELECT 'Women',85 UNION ALL
SELECT 'Women',94 UNION ALL
SELECT 'Women',110 UNION ALL
SELECT 'Women',80 UNION ALL
SELECT 'Women',103 UNION ALL
SELECT 'Women',122
) n(s,x)
This produces the following result
p_value k_observed
---------------------- ----------------------
0.297578744560361 0.388888888888889