# SQL Server sample covariance function

COVARIANCE_S

Updated: 24 May 2013

Note: This documentation is for the SQL2008 (and later) version of this XLeratorDB function, it is not compatible with SQL Server 2005.
Click here for the SQL2005 version of the COVARIANCE_S function

Use the aggregate function COVARIANCE_S to calculate the sample covariance for a set of x- and y-values. The formula for the population covariance is

Syntax

@Known_y
the y-values to be used in the calculation. @Known_y is an expression of type float or of a type that can be implicitly converted to float.
@Known_x
the x-values to be used in the calculation. @Known_x is an expression of type float or of a type that can be implicitly converted to float.
Return Types
float
Remarks
·         COVARIANCE_S is an AGGREGATE function and follows the same conventions as all other AGGREGATE functions in SQL Server.
·         Use COVARIANCE_P to calculate the population covariance.
Examples
In this example, we calculate the sample covariance for a single set of x- and y-values
SELECT wct.COVARIANCE_S(y, x) as COVARIANCE_S
FROM (
SELECT 0.75, 1 UNION ALL
SELECT 2.5, 2 UNION ALL
SELECT 6.75, 3 UNION ALL
SELECT 10, 4
) n(x,y)

This produces the following result
COVARIANCE_S
----------------------
5.33333333333333

In this example, we will populate some temporary table with some historical financial information and then calculate the sample covariance. First, create the table and put some data in it:
CREATE TABLE #c(
SYM         NVARCHAR(5),
YE          BIGINT,
REV         FLOAT,
GPROF       FLOAT,
OPINC       FLOAT,
NETINC      FLOAT
)

INSERT INTO #c VALUES('YHOO',2009,6460.32,3588.57,386.69,597.99)
INSERT INTO #c VALUES('YHOO',20,72.5,4185.14,12.96,418.92)
INSERT INTO #c VALUES('YHOO',2007,6969.27,4130.52,695.41,639.16)
INSERT INTO #c VALUES('YHOO',2006,6425.68,3749.96,940.97,751.39)
INSERT INTO #c VALUES('YHOO',2005,5257.67,3161.47,1107.73,1896.23)
INSERT INTO #c VALUES('GOOG',2009,23650.56,14806.45,8312.19,6520.45)
INSERT INTO #c VALUES('GOOG',20,21795.55,13174.04,5537.21,4226.86)
INSERT INTO #c VALUES('GOOG',2007,16593.99,9944.9,54.44,4203.72)
INSERT INTO #c VALUES('GOOG',2006,10604.92,6379.89,3550,3077.45)
INSERT INTO #c VALUES('GOOG',2005,6138.56,3561.47,2017.28,1465.4)
INSERT INTO #c VALUES('MSFT',2010,62484,509,24167,18760)
INSERT INTO #c VALUES('MSFT',2009,58437,46282,21225,14569)
INSERT INTO #c VALUES('MSFT',20,60420,48822,22271,17681)
INSERT INTO #c VALUES('MSFT',2007,51122,40429,18438,14065)
INSERT INTO #c VALUES('MSFT',2006,44282,36632,16064,12599)
INSERT INTO #c VALUES('ORCL',2010,26820,21056,9062,6135)
INSERT INTO #c VALUES('ORCL',2009,23252,18458,8321,5593)
INSERT INTO #c VALUES('ORCL',20,22430,17449,7844,5521)
INSERT INTO #c VALUES('ORCL',2007,17996,13805,5974,4274)
INSERT INTO #c VALUES('ORCL',2006,14380,11145,4736,3381)
INSERT INTO #c VALUES('SAP',2009,10672,6980,2588,1748)
INSERT INTO #c VALUES('SAP',20,11575,7370,2701,1847)
INSERT INTO #c VALUES('SAP',2007,10256,6631,2698,1906)
INSERT INTO #c VALUES('SAP',2006,9393,6064,2578,1871)
INSERT INTO #c VALUES('SAP',2005,8509,5460,2337,1496)

Now, calculate the population covariance of the revenue (REV) againt the year (YE) for each company (SYM)
SELECT #c.SYM
,wct.COVARIANCE_S(REV,YE) as COVARIANCE_S
FROM #c
GROUP BY SYM
This produces the following result.
SYM             COVARIANCE_S
----- ----------------------
GOOG           -2989746.8405
MSFT             -2509357.25
ORCL               -715302.8
SAP                  -740891
YHOO             2466413.266

Let’s say we wanted to perform the same analysis as above, but we only want to return the results where the covariance is positive.
SELECT #c.SYM
,wct.COVARIANCE_S(REV,YE) as COVARIANCE_S
FROM #c
GROUP BY SYM
HAVING wct.COVARIANCE_S(REV,YE) > 0

This produces the following result.
SYM             COVARIANCE_S
----- ----------------------
YHOO             2466413.266

In this example, we will calculate the correlation of the operating income (OPINC) against the revenue (REV)
SELECT #c.SYM
,wct.COVARIANCE_S(OPINC,REV) as COVARIANCE_S
FROM #c
GROUP BY SYM

This produces the following result.
SYM             COVARIANCE_S
----- ----------------------
GOOG         15351903.708095
MSFT                23798218
ORCL               8568824.7
SAP                   147381
YHOO            845158.97983