WestClinTech - SQL Server Functions - Blog - Calculating a Correlation Matrix in SQL Server

Calculating a Correlation Matrix in SQL Server

Mar 25

Written by: Charles Flock
3/25/2015 10:24 AM

Diversification is the basis for any sound investment strategy and the heart of diversification is finding uncorrelated risk in different asset classes. In this article we show you how to do that using the XLeratorDB table-valued function CORRM.

Correlation is measured on a scale from -1, meaning perfectly negatively correlated, to 1, meaning perfectly positively correlated. What the values between -1 and 1 mean is subject to interpretation depending on the context. However, when constructing a correlation matrix we are more concerned with comparing the correlation coefficients with one another than with drawing any inference from the actual value of a single coefficient. The matrix lets us compare the degree of correlation across pairs.
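For reference, the coefficient usually meant by "correlation" in this setting is the Pearson sample correlation. The formula below is the standard definition; that CORRM computes exactly this quantity on the values supplied is an assumption here, not something stated in this article. The symbols x_i and y_i stand for the closing prices of two tickers on day i, and n is the number of days.

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when the two series tend to be above or below their means together and negative when they move in opposite directions, which is why the result always falls between -1 and 1.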

We have used the closing stock prices over the last 90 or so business days as input for these examples. We created a table, PRICES, which stores the date and closing price for each stock symbol (ticker).

--Create a table in 3rd Normal form to store price history

CREATE TABLE PRICES(

tdate date,

ticker char(5),

price money,

PRIMARY KEY (tdate, ticker)

)

We will populate this table with data for the symbols AAPL, BIDU, FB, GOOG, LNKD, MSFT, TWTR, and YHOO.

You can get the data here.

Once you have inserted the data into your database, you can select the first 8 rows with the following SQL to make sure it agrees with the data in this example.

--Select the first few rows to make sure the data are correct

SELECT TOP 8 *

FROM

PRICES

ORDER BY

tdate, ticker

This produces the following result.

We use the table-valued function CORRM to calculate the correlation matrix.

--Calculate the Correlation Matrix using CORRM

SELECT *

FROM

wct.CORRM('SELECT * FROM PRICES ORDER BY ticker, tdate','True')

This produces the following result.

RowNum      ColNum              ItemValue
----------- ----------- ----------------------
          0           0                      1
          0           1     -0.704131756634434
          0           2      0.294977980772373
          0           3      0.766488229299065
          0           4      0.885316063394826
          0           5      -0.52542522824834
          0           6      0.906691613918283
          0           7     -0.748829430763266
          1           0     -0.704131756634434
          1           1                      1
          1           2    -0.0202674590499283
          1           3     -0.373439551984946
          1           4     -0.733398764001072
          1           5      0.720432614263467
          1           6     -0.759229099007427
          1           7      0.878317238206277
          2           0      0.294977980772373
          2           1    -0.0202674590499283
          2           2                      1
          2           3      0.426046268062164
          2           4      0.349484626628378
          2           5      0.210783096878673
          2           6      0.191987743502255
          2           7      0.107244050842218
          3           0      0.766488229299065
          3           1     -0.373439551984946
          3           2      0.426046268062164
          3           3                      1
          3           4      0.688666854341631
          3           5     -0.246696052678455
          3           6      0.661326491615496
          3           7     -0.434904481638053
          4           0      0.885316063394826
          4           1     -0.733398764001072
          4           2      0.349484626628378
          4           3      0.688666854341631
          4           4                      1
          4           5     -0.519277595579834
          4           6       0.93434203672097
          4           7     -0.739344490441844
          5           0      -0.52542522824834
          5           1      0.720432614263467
          5           2      0.210783096878673
          5           3     -0.246696052678455
          5           4     -0.519277595579834
          5           5                      1
          5           6     -0.500444165114568
          5           7      0.874811278901919
          6           0      0.906691613918283
          6           1     -0.759229099007427
          6           2      0.191987743502255
          6           3      0.661326491615496
          6           4       0.93434203672097
          6           5     -0.500444165114568
          6           6                      1
          6           7     -0.771346874535524
          7           0     -0.748829430763266
          7           1      0.878317238206277
          7           2      0.107244050842218
          7           3     -0.434904481638053
          7           4     -0.739344490441844
          7           5      0.874811278901919
          7           6     -0.771346874535524
          7           7                      1

As you can see, the results are returned in third-normal form, using row and column indices to identify the correlations. To be useful, the row and column indices need to be turned into stock symbols. Here's one way to do that.

--Put the tickers into a table so that we can label the rows and columns

SELECT

ROW_NUMBER()OVER (ORDER BY ticker) - 1 as rn,ticker

INTO

#vlookup

FROM(SELECT DISTINCT ticker FROM PRICES)n

--Create the output using the tickers rather than the RowNum, ColNum

SELECT

y.ticker,x.ticker,k.ItemValue

FROM

wct.CORRM('SELECT * FROM PRICES ORDER BY ticker, tdate','True')k

CROSS APPLY(SELECT TOP(1) ticker FROM #vlookup WHERE colnum <= rn ORDER BY rn)x

CROSS APPLY(SELECT TOP(1) ticker FROM #vlookup WHERE rownum <= rn ORDER BY rn)y

This produces the following result.

ticker ticker              ItemValue
------ ------ ----------------------
AAPL   AAPL                        1
AAPL   BIDU       -0.704131756634434
AAPL   FB          0.294977980772373
AAPL   GOOG        0.766488229299065
AAPL   LNKD        0.885316063394826
AAPL   MSFT        -0.52542522824834
AAPL   TWTR        0.906691613918283
AAPL   YHOO       -0.748829430763266
BIDU   AAPL       -0.704131756634434
BIDU   BIDU                        1
BIDU   FB        -0.0202674590499283
BIDU   GOOG       -0.373439551984946
BIDU   LNKD       -0.733398764001072
BIDU   MSFT        0.720432614263467
BIDU   TWTR       -0.759229099007427
BIDU   YHOO        0.878317238206277
FB     AAPL        0.294977980772373
FB     BIDU      -0.0202674590499283
FB     FB                          1
FB     GOOG        0.426046268062164
FB     LNKD        0.349484626628378
FB     MSFT        0.210783096878673
FB     TWTR        0.191987743502255
FB     YHOO        0.107244050842218
GOOG   AAPL        0.766488229299065
GOOG   BIDU       -0.373439551984946
GOOG   FB          0.426046268062164
GOOG   GOOG                        1
GOOG   LNKD        0.688666854341631
GOOG   MSFT       -0.246696052678455
GOOG   TWTR        0.661326491615496
GOOG   YHOO       -0.434904481638053
LNKD   AAPL        0.885316063394826
LNKD   BIDU       -0.733398764001072
LNKD   FB          0.349484626628378
LNKD   GOOG        0.688666854341631
LNKD   LNKD                        1
LNKD   MSFT       -0.519277595579834
LNKD   TWTR         0.93434203672097
LNKD   YHOO       -0.739344490441844
MSFT   AAPL        -0.52542522824834
MSFT   BIDU        0.720432614263467
MSFT   FB          0.210783096878673
MSFT   GOOG       -0.246696052678455
MSFT   LNKD       -0.519277595579834
MSFT   MSFT                        1
MSFT   TWTR       -0.500444165114568
MSFT   YHOO        0.874811278901919
TWTR   AAPL        0.906691613918283
TWTR   BIDU       -0.759229099007427
TWTR   FB          0.191987743502255
TWTR   GOOG        0.661326491615496
TWTR   LNKD         0.93434203672097
TWTR   MSFT       -0.500444165114568
TWTR   TWTR                        1
TWTR   YHOO       -0.771346874535524
YHOO   AAPL       -0.748829430763266
YHOO   BIDU        0.878317238206277
YHOO   FB          0.107244050842218
YHOO   GOOG       -0.434904481638053
YHOO   LNKD       -0.739344490441844
YHOO   MSFT        0.874811278901919
YHOO   TWTR       -0.771346874535524
YHOO   YHOO                        1
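As a sanity check on any single cell, the coefficient for one pair can be recomputed with built-in aggregates only. This is a minimal sketch, assuming CORRM returns the Pearson correlation of the price levels as supplied and that both tickers have prices on the same dates; the aliases a, b, and p are illustrative names, not part of the original example.

```sql
-- Cross-check one pair (AAPL vs BIDU) without XLeratorDB.
-- Assumption: CORRM computes the Pearson correlation of the prices themselves.
-- Pearson = population covariance / (population std dev of x * population std dev of y)
SELECT
    (AVG(a.p * b.p) - AVG(a.p) * AVG(b.p))
    / (STDEVP(a.p) * STDEVP(b.p)) AS correlation
FROM
    (SELECT tdate, CAST(price AS float) AS p FROM PRICES WHERE ticker = 'AAPL') a
    JOIN
    (SELECT tdate, CAST(price AS float) AS p FROM PRICES WHERE ticker = 'BIDU') b
        ON a.tdate = b.tdate
```

If the assumption holds, the value should agree with the AAPL/BIDU entry above up to floating-point rounding. The join keeps only dates present for both tickers, so a gap in either series would change the sample used.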

If we wanted to see the results returned in a more traditional 'matrix' format, we can simply use the SQL Server PIVOT operator.

--Pivot the resultant table using tickers alphabetically

SELECT ticker2,AAPL,BIDU,FB,GOOG,LNKD,MSFT,TWTR,YHOO

FROM (

SELECT

y.ticker2, x.ticker1,k.ItemValue

FROM

wct.CORRM('SELECT * FROM PRICES ORDER BY ticker, tdate','True')k

CROSS APPLY(SELECT TOP(1) ticker FROM #vlookup WHERE colnum <= rn ORDER BY rn)x(ticker1)

CROSS APPLY(SELECT TOP(1) ticker FROM #vlookup WHERE rownum <= rn ORDER BY rn)y(ticker2)

)d

PIVOT(SUM(ItemValue) FOR ticker1 IN(AAPL,BIDU,FB,GOOG,LNKD,MSFT,TWTR,YHOO))pvt

ORDER BY

ticker2

This produces the following result.

ticker2                   AAPL                   BIDU                     FB                   GOOG                   LNKD                   MSFT                   TWTR                   YHOO
------- ---------------------- ---------------------- ---------------------- ---------------------- ---------------------- ---------------------- ---------------------- ----------------------
AAPL                         1     -0.704131756634434      0.294977980772373      0.766488229299065      0.885316063394826      -0.52542522824834      0.906691613918283     -0.748829430763266
BIDU        -0.704131756634434                      1    -0.0202674590499283     -0.373439551984946     -0.733398764001072      0.720432614263467     -0.759229099007427      0.878317238206277
FB           0.294977980772373    -0.0202674590499283                      1      0.426046268062164      0.349484626628378      0.210783096878673      0.191987743502255      0.107244050842218
GOOG         0.766488229299065     -0.373439551984946      0.426046268062164                      1      0.688666854341631     -0.246696052678455      0.661326491615496     -0.434904481638053
LNKD         0.885316063394826     -0.733398764001072      0.349484626628378      0.688666854341631                      1     -0.519277595579834       0.93434203672097     -0.739344490441844
MSFT         -0.52542522824834      0.720432614263467      0.210783096878673     -0.246696052678455     -0.519277595579834                      1     -0.500444165114568      0.874811278901919
TWTR         0.906691613918283     -0.759229099007427      0.191987743502255      0.661326491615496       0.93434203672097     -0.500444165114568                      1     -0.771346874535524
YHOO        -0.748829430763266      0.878317238206277      0.107244050842218     -0.434904481638053     -0.739344490441844      0.874811278901919     -0.771346874535524                      1

But we are not really interested in looking at the correlation among 8 companies using the last 90 days' worth of data. What could be interesting, as an example, is creating a correlation matrix with every company traded on the NYSE. Let's look at how that might work.

First, you should eliminate the data in the PRICES table.

TRUNCATE TABLE PRICES

There are approximately 2,800 companies that trade on the NYSE. We will randomly create 2,800 stock symbols and create a year's worth of price data, which we will skew ever so slightly to be correlated. We will only create prices for non-weekend days. This should create 2,800 * 262 = 733,600 rows of data.

SELECT

n2.ticker

,n2.mean

,ROUND(mean / wct.RANDBETWEEN(2,10),2) as sigma

INTO

#p1

FROM (

SELECT TOP 2800

ticker,

wct.RANDBETWEEN(50,150) as mean

FROM (

SELECT

wct.RANDBETWEEN(1,17576) as num

,CHAR(k1.SeriesValue) + CHAR(k2.SeriesValue) + CHAR(k3.SeriesValue) as ticker

FROM

wct.SeriesInt(UNICODE('A'), UNICODE('Z'),NULL,NULL,NULL)k1

CROSS APPLY

wct.SeriesInt(UNICODE('A'), UNICODE('Z'),NULL,NULL,NULL)k2

CROSS APPLY

wct.SeriesInt(UNICODE('A'), UNICODE('Z'),NULL,NULL,NULL)k3

)n1

ORDER BY

num ASC

)n2

SELECT

SeriesValue as tdate

,wct.RANDNORM(0,1) as normsinv

INTO

#d1

FROM

wct.SeriesDate(wct.EDATE(cast(GETDATE() as date),-12),cast(GETDATE() as date),1,NULL,NULL)k4

WHERE

--Assumes the default DATEFIRST setting (7), where 1 = Sunday and 7 = Saturday

DATEPART(dw, k4.SeriesValue) != 1 AND DATEPART(dw, k4.SeriesValue) != 7

INSERT INTO

PRICES (ticker, tdate, price)

SELECT

p.ticker

,d.tdate

,ROUND(p.mean *(1+wct.RAND()) + p.sigma*d.normsinv,2) as price

FROM

#p1 p, #d1 d

DROP TABLE #p1

DROP TABLE #d1
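Before moving on, a quick count can confirm that the load has the expected shape. This is a minimal check using only built-in aggregates; the column aliases are illustrative, and the exact number of weekdays depends on the date the script is run, so the figures may differ slightly from 2,800 * 262.

```sql
-- Sanity check on the generated data: expect about 2,800 tickers,
-- about 262 weekdays, and one row per ticker per day.
SELECT
    COUNT(*)               AS row_count,
    COUNT(DISTINCT ticker) AS ticker_count,
    COUNT(DISTINCT tdate)  AS day_count
FROM
    PRICES
```

If row_count equals ticker_count multiplied by day_count, every ticker has a price on every generated date.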

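Then we can run the following query, which will create a table (#corr) with all the correlations between all the tickers. Because #vlookup still holds the original eight symbols, drop it and re-run the SELECT ... INTO #vlookup statement shown earlier so that it contains the new tickers before running this query.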

SELECT

v1.ticker as ticker1

,v2.ticker as ticker2

,k.ItemValue as correlation

INTO

#corr

FROM

wct.CORRM('SELECT tdate, ticker, price FROM PRICES ORDER BY ticker, tdate','True')k, #vlookup v1, #vlookup v2

WHERE

k.rownum < k.ColNum

and k.RowNum = v1.rn

and k.ColNum = v2.rn

On my laptop computer running SQL Server 2012 this takes about 90 seconds and produces a table with 3,918,600 rows, which is the number of distinct ticker pairs.
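The row count follows from the filter k.rownum < k.ColNum, which keeps only the cells above the diagonal of the 2,800 by 2,800 matrix, so each unordered pair of tickers appears once and the diagonal is excluded:

```latex
\binom{2800}{2} = \frac{2800 \times 2799}{2} = 3{,}918{,}600
```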

Now, it's just a matter of selecting what's of interest. For example, we could select the top 10 pairs with the highest correlation.

SELECT TOP 10 *

FROM

#corr

ORDER BY

correlation DESC

This produces the following result (your results will be different).

Or we could select the 10 pairs with the lowest correlation.

SELECT TOP 10 *

FROM

#corr

ORDER BY

correlation ASC

This produces the following result (your results will be different).
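Since the goal stated at the outset is to find uncorrelated assets, it may be more useful to rank pairs by how close the coefficient is to zero rather than by its sign. A minimal sketch against the #corr table built above, using only the built-in ABS function:

```sql
-- Pairs whose correlation is closest to zero (least linearly related).
SELECT TOP 10
    ticker1, ticker2, correlation
FROM
    #corr
ORDER BY
    ABS(correlation) ASC
```

As with the two queries above, the rows returned will differ from run to run because the prices are randomly generated.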

We think that using the CORRM table-valued function in SQL Server is probably the best way to do analyses of this type. Using CORRM makes looking for correlations in large datasets simple and efficient, allowing you to gain insights into your data that might otherwise remain hidden. Let us know what you think.

If you think that CORRM might be useful, you can get the free 15-day trial by following this link.

If you want to find out more about our math library for SQL Server click here. If there is something you would like to see added to the almost 800 functions already included in XLeratorDB, just send us an e-mail to support@westclintech.com.
