Login     Register

        Contact Us     Search

XLeratorDB/statistics Documentation

History for KSTEST - 2008 (history as of 8/7/2014 10:34:44 AM)

KSTEST

Updated: 27 June 2012


Use the aggregate function KSTEST to compare a sample with a reference probability distribution. The Kolmogorov-Smirnov statistic qualifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference sample.
Syntax
syntax for sql server analytic function KSTEST
Arguments
@x
the sample values. @x is an expression of type float or of a type that can be implicitly converted to float.
@cdf
the values of the cumulative distribution function. @cdf is an expression of type float or of a type that can be implicitly converted to float.
@statistic
identifies the return value as either the Kolmogorov-Smirnov statistic (k) or the p-value (p).
Return Types
float
Remarks
·         If @x is null then @x and the corresponding @cdf are not included in the aggregate.
·         If @cdf is null then @cdf and the corresponding @x are not included in the aggregate.
·         @statistic must be invariant for the GROUP.
·         KSTEST is an aggregate function and follows the same conventions as all other aggregate functions in SQL Server.
·         To evaluate two samples, use KSTEST2
Examples
In this example, we compare the sample (-2,-1, 0, 1, 2, 3, 4) to the standard normal distribution.
SELECT wct.KSTEST(x, wct.NORMSDIST(x),'k') as k_observed
,wct.KSTEST(x, wct.NORMSDIST(x),'p') as p_value
FROM (VALUES (-2),(-1),(0),(1),(2),(3),(4)) n(x)
This produces the following result.
            k_observed                p_value
---------------------- ----------------------
     0.412773317497114      0.135890127950776
 
In this example, we compare the sample x, consisting of 1,000 randomly generated numbers in the standard normal distribution, to the standard normal distribution.
SELECT wct.KSTEST(x,wct.NORMSDIST(x),'P') as p_value
,wct.KSTEST(x,wct.NORMSDIST(x),'k') as k_observed
FROM (
      SELECT k.seriesvalue as x
      FROM wctMath.wct.SeriesFloat(0,1,NULL,1000,'N') k
      ) n(x)
 
This produces the following result. Your results will be different.
               p_value             k_observed
---------------------- ----------------------
     0.248195490618504     0.0321196162767327
 
In this example, we compare the uniformaly random numbers to the gamma distribution with shape 1 and scale 1.
SELECT wct.KSTEST(x,wct.GAMMADIST(x,1,1,'True'),'P') as p_value
,wct.KSTEST(x,wct.GAMMADIST(x,1,1,'True'),'k') as k_observed
FROM (
      SELECT k.seriesvalue as x
      FROM wctMath.wct.SeriesFloat(0,1,NULL,100,'R') k
      ) n(x)
This produces the following result. Your results will be different.
               p_value             k_observed
---------------------- ----------------------
 1.09546692948201E-12      0.368492604785191
 
  

|<< Back |    

Copyright 2008-2024 Westclintech LLC         Privacy Policy        Terms of Service