XLeratorDB/statistics Documentation

SQL Server ROC function

ROCTable

Updated: 08 May 2015

Use the table-valued function ROCTable to show the calculation of the area under the ROC curve. The function accepts either raw or grouped data as input.

The function has a single input parameter: a SQL SELECT statement, passed as a string, which returns a table whose first column contains the predicted probabilities. For raw data, the SELECT returns one additional column of zeroes and ones indicating the absence (0) or presence (1) of the characteristic of interest. You may also think of these as indicating the failure (0) or success (1) of the observation.

For grouped data, the SELECT returns two additional columns containing the count of failures and the count of successes for each predicted probability.

The function returns a table (described below), sorted by descending predicted probability as the example output shows, which calculates the True Positive Rate, the False Positive Rate, and the area under the ROC curve (AUROC). The final cumulative AUROC value is the same value returned by the LOGIT and LOGITSUM functions.

Syntax

SELECT * FROM [wct].[ROCTable](
   <@Matrix_RangeQuery, nvarchar(max),>)

@Matrix_RangeQuery

the SELECT statement, as a string, which, when executed, creates the resultant table of predicted probabilities and Y values.

Return Types

RETURNS TABLE (
   [idx] [int] NULL,
   [ppred] [float] NULL,
   [failure] [int] NULL,
   [success] [int] NULL,
   [cumfailure] [int] NULL,
   [cumsuccess] [int] NULL,
   [FalsePositiveRate] [float] NULL,
   [TruePositiveRate] [float] NULL,
   [AUROC] [float] NULL,
   [cumAUROC] [float] NULL
)

Table Description

Column	Description
idx	a unique identifier for the row identifying its position in the resultant table
ppred	predicted probability
failure	for raw data, the count of the number of rows for the predicted probability having a value of 0. For grouped data, the sum of the second column passed into the function grouped by predicted probability
success	for raw data, the sum of the second column grouped by predicted probability. For grouped data, the sum of the third column passed into the function grouped by the predicted probability
cumfailure	the sum of failure for the current row and all preceding rows
cumsuccess	the sum of success for the current row and all preceding rows
FalsePositiveRate	cumfailure(idx) /cumfailure(idx_max)
TruePositiveRate	cumsuccess(idx) /cumsuccess(idx_max)
AUROC	[FalsePositiveRate(idx+1) – FalsePositiveRate(idx)] * TruePositiveRate(idx)
cumAUROC	the sum of AUROC for the current row and all preceding rows
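
The column definitions above can be traced by hand with plain T-SQL. The following is only a sketch of the arithmetic, not the function's actual implementation. The six-row data set and the CTE names are made up for illustration, the rows are ordered by descending predicted probability to match the example output in this topic, and the query assumes SQL Server 2012 or later, since it uses LEAD and ordered window aggregates.

```sql
-- Hypothetical raw input: predicted probability and a 0/1 outcome
WITH obs AS (
    SELECT ppred, y
    FROM (VALUES (0.9,1),(0.8,1),(0.8,0),(0.6,1),(0.4,0),(0.2,0)) v(ppred, y)
),
grp AS (
    -- one row per distinct predicted probability
    SELECT ppred,
           SUM(1 - y) AS failure,
           SUM(y)     AS success
    FROM obs
    GROUP BY ppred
),
cum AS (
    -- running totals, highest predicted probability first
    SELECT ROW_NUMBER() OVER (ORDER BY ppred DESC) - 1 AS idx,
           ppred, failure, success,
           SUM(failure) OVER (ORDER BY ppred DESC ROWS UNBOUNDED PRECEDING) AS cumfailure,
           SUM(success) OVER (ORDER BY ppred DESC ROWS UNBOUNDED PRECEDING) AS cumsuccess,
           SUM(failure) OVER () AS totfailure,
           SUM(success) OVER () AS totsuccess
    FROM grp
),
rates AS (
    SELECT idx, ppred, failure, success, cumfailure, cumsuccess,
           cumfailure / CAST(totfailure AS float) AS FalsePositiveRate,
           cumsuccess / CAST(totsuccess AS float) AS TruePositiveRate
    FROM cum
),
area AS (
    -- [FalsePositiveRate(idx+1) - FalsePositiveRate(idx)] * TruePositiveRate(idx);
    -- the last row has no idx+1, so LEAD defaults to the current value and the term is 0
    SELECT idx, ppred, failure, success, cumfailure, cumsuccess,
           FalsePositiveRate, TruePositiveRate,
           (LEAD(FalsePositiveRate, 1, FalsePositiveRate) OVER (ORDER BY idx)
              - FalsePositiveRate) * TruePositiveRate AS AUROC
    FROM rates
)
SELECT idx, ppred, failure, success, cumfailure, cumsuccess,
       FalsePositiveRate, TruePositiveRate, AUROC,
       SUM(AUROC) OVER (ORDER BY idx ROWS UNBOUNDED PRECEDING) AS cumAUROC
FROM area
ORDER BY idx;
```

The cumAUROC value on the last row is the total area. Grouped input would skip the grp step's 0/1 arithmetic and sum the failure and success counts directly.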

Remarks

· The first column returned by @Matrix_RangeQuery should contain the predicted probabilities where 0 <= predicted probability <= 1.

· The resultant table returned by @Matrix_RangeQuery should return either 2 columns or 3 columns.

· When the resultant table contains 2 columns the function assumes that the second column contains binary responses consisting of zero (0) or one (1); the use of other values will produce unreliable results.

· When the resultant table contains 3 columns the function assumes that the second column contains a count of the failures or absences and the third column contains a count of the successes or presence.
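
As a quick illustration of the two input shapes described in these remarks, the calls below pass a 2-column (raw) query and a 3-column (grouped) query. The temporary table names and column names are hypothetical; only the column order matters to the function.

```sql
-- Raw data: first column is the predicted probability, second is a 0/1 outcome.
-- #raw_scores, ppred and y are hypothetical names.
SELECT * FROM wct.ROCTable('SELECT ppred, y FROM #raw_scores');

-- Grouped data: predicted probability, then count of failures, then count of successes.
-- #grouped_scores, failures and successes are hypothetical names.
SELECT * FROM wct.ROCTable('SELECT ppred, failures, successes FROM #grouped_scores');
```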

Examples

Example #1

In this example we use the same data as was used in the LOGIT documentation, consisting of the Coronary Heart Disease data from Applied Logistic Regression, Third Edition by David W. Hosmer, Jr., Stanley Lemeshow, and Rodney X. Sturdivant. The data consist of a single independent variable (age) and an outcome (chd) which indicates the absence (0) or presence (1) of coronary heart disease. We will put the data into a temporary table, run the logistic regression, use the coefficients from the logistic regression to create the predicted probabilities, and then produce the ROC table. Note that the AUROC value is actually returned by the LOGIT function; this example simply explains the calculation.

--Put the Hosmer data into the #chd table
SELECT
   *
INTO
   #chd
FROM (VALUES
    (20,0),(23,0),(24,0),(25,0),(25,1),(26,0),(26,0),(28,0),(28,0),(29,0)
   ,(30,0),(30,0),(30,0),(30,0),(30,0),(30,1),(32,0),(32,0),(33,0),(33,0)
   ,(34,0),(34,0),(34,1),(34,0),(34,0),(35,0),(35,0),(36,0),(36,1),(36,0)
   ,(37,0),(37,1),(37,0),(38,0),(38,0),(39,0),(39,1),(40,0),(40,1),(41,0)
   ,(41,0),(42,0),(42,0),(42,0),(42,1),(43,0),(43,0),(43,1),(44,0),(44,0)
   ,(44,1),(44,1),(45,0),(45,1),(46,0),(46,1),(47,0),(47,0),(47,1),(48,0)
   ,(48,1),(48,1),(49,0),(49,0),(49,1),(50,0),(50,1),(51,0),(52,0),(52,1)
   ,(53,1),(53,1),(54,1),(55,0),(55,1),(55,1),(56,1),(56,1),(56,1),(57,0)
   ,(57,0),(57,1),(57,1),(57,1),(57,1),(58,0),(58,1),(58,1),(59,1),(59,1)
   ,(60,0),(60,1),(61,1),(62,1),(62,1),(63,1),(64,0),(64,1),(65,1),(69,1)
   )n(age,chd)

--Run LOGIT and store the results in #mylogit
SELECT
   *
INTO
   #mylogit
FROM
   wct.LOGIT('SELECT age,chd FROM #chd',2)

--Calculate the predicted probabilities for each row in #chd and store the
--predicted probability and the chd value in #t
SELECT
   wct.LOGITPRED('SELECT stat_val FROM #mylogit where stat_name = ''b'' ORDER BY idx',cast(age as varchar(max))) as [p predicted]
  ,chd as y
INTO
   #t
FROM
   #chd

--Run the ROCTable function
SELECT
   *
FROM
   wct.ROCTable('SELECT * FROM #t')

This produces the following result.

idx                  ppred     failure     success  cumfailure  cumsuccess      FalsePositiveRate       TruePositiveRate                  AUROC               cumAUROC
----------- ---------------------- ----------- ----------- ----------- ----------- ---------------------- ---------------------- ---------------------- ----------------------
          0      0.912464554564153           0           1           0           1                      0     0.0232558139534884                      0                      0
          1      0.869939152344419           0           1           0           2                      0     0.0465116279069767   0.000815993472052223   0.000815993472052223
          2      0.856865930676536           1           1           1           3     0.0175438596491228     0.0697674418604651                      0   0.000815993472052223
          3      0.842716220602683           0           1           1           4     0.0175438596491228     0.0930232558139535                      0   0.000815993472052223
          4      0.827449401763914           0           2           1           6     0.0175438596491228       0.13953488372093                      0   0.000815993472052223
          5      0.811032992880968           0           1           1           7     0.0175438596491228      0.162790697674419    0.00285597715218278    0.00367197062423501
          6      0.793444615655287           1           1           2           8     0.0350877192982456      0.186046511627907                      0    0.00367197062423501
          7      0.774673993551717           0           2           2          10     0.0350877192982456      0.232558139534884    0.00407996736026112    0.00775193798449612
          8      0.754724899971724           1           2           3          12     0.0526315789473684       0.27906976744186    0.00979192166462668     0.0175438596491228
          9      0.733616953220639           2           4           5          16      0.087719298245614      0.372093023255814                      0     0.0175438596491228
         10      0.711387142595015           0           3           5          19      0.087719298245614      0.441860465116279    0.00775193798449612     0.0252957976336189
         11      0.688090963392313           1           2           6          21      0.105263157894737      0.488372093023256                      0     0.0252957976336189
         12      0.663803041111905           0           1           6          22      0.105263157894737      0.511627906976744                      0     0.0252957976336189
         13      0.638617138505235           0           2           6          24      0.105263157894737      0.558139534883721    0.00979192166462668     0.0350877192982456
         14       0.61264546440856           1           1           7          25       0.12280701754386      0.581395348837209     0.0101999184006528     0.0452876376988984
         15      0.586017240033851           1           0           8          25      0.140350877192982      0.581395348837209     0.0101999184006528     0.0554875560995512
         16      0.558876524531328           1           1           9          26      0.157894736842105      0.604651162790698     0.0212158302733578      0.076703386372909
         17      0.531379353436951           2           1          11          27      0.192982456140351      0.627906976744186      0.011015911872705      0.087719298245614
         18      0.503690295993513           1           2          12          29      0.210526315789474      0.674418604651163     0.0236638106895145      0.111383108935129
         19      0.475978584473281           2           1          14          30      0.245614035087719      0.697674418604651     0.0122399020807834      0.123623011015912
         20      0.448414004860464           1           1          15          31      0.263157894736842      0.720930232558139     0.0126478988168095      0.136270909832721
         21      0.421162758975344           1           1          16          32      0.280701754385965      0.744186046511628     0.0261117911056712      0.162382700938392
         22      0.394383510626178           2           2          18          34      0.315789473684211      0.790697674418605     0.0277437780497756      0.190126478988168
         23      0.368223812328276           2           1          20          35      0.350877192982456      0.813953488372093     0.0428396572827417       0.23296613627091
         24      0.342817076642784           3           1          23          36      0.403508771929825      0.837209302325581     0.0293757649938801       0.26234190126479
         25      0.318280211425752           2           0          25          36       0.43859649122807      0.837209302325581       0.01468788249694       0.27702978376173
         26      0.294711986717842           1           1          26          37      0.456140350877193       0.86046511627907     0.0150958792329661      0.292125662994696
         27      0.272192148511754           1           1          27          38      0.473684210526316      0.883720930232558     0.0310077519379845      0.323133414932681
         28      0.250781246560969           2           0          29          38      0.508771929824561      0.883720930232558     0.0310077519379845      0.354141166870665
         29      0.230521103877386           2           1          31          39      0.543859649122807      0.906976744186046     0.0318237454100367      0.385964912280702
         30      0.211435827131904           2           1          33          40      0.578947368421053      0.930232558139535     0.0326397388820889      0.418604651162791
         31      0.193533240663126           2           0          35          40      0.614035087719298      0.930232558139535     0.0652794777641779      0.483884128926969
         32      0.176806621582586           4           1          39          41      0.684210526315789      0.953488372093023     0.0334557323541412       0.51733986128111
         33      0.161236617821071           2           0          41          41      0.719298245614035      0.953488372093023     0.0334557323541412      0.550795593635251
         34      0.146793242543317           2           0          43          41      0.754385964912281      0.953488372093023     0.0836393308853529      0.634434924520604
         35      0.121125053503268           5           1          48          42      0.842105263157895      0.976744186046512     0.0171358629130967      0.651570787433701
         36       0.10980443546362           1           0          49          42      0.859649122807018      0.976744186046512     0.0342717258261934      0.685842513259894
         37     0.0994221764013863           2           0          51          42      0.894736842105263      0.976744186046512     0.0342717258261934      0.720114239086087
         38     0.0812484736598618           2           0          53          42      0.929824561403509      0.976744186046512     0.0171358629130966      0.737250101999184
         39     0.0733437884100028           1           1          54          43      0.947368421052632                      1     0.0175438596491229      0.754793961648307
         40     0.0661527783012159           1           0          55          43      0.964912280701754                      1     0.0175438596491228       0.77233782129743
         41     0.0596214497281155           1           0          56          43      0.982456140350877                      1     0.0175438596491229      0.789881680946553
         42     0.0434787567488236           1           0          57          43                      1                      1                      0      0.789881680946553
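
As a quick arithmetic check of the AUROC column definition, the sketch below recomputes one row by hand. The three values are copied from rows idx = 1 and idx = 2 of the result above; the variable names are made up for illustration.

```sql
-- Values copied from rows idx = 1 and idx = 2 of the result table above
DECLARE @fpr_1 float = 0;                    -- FalsePositiveRate at idx = 1
DECLARE @fpr_2 float = 0.0175438596491228;   -- FalsePositiveRate at idx = 2
DECLARE @tpr_1 float = 0.0465116279069767;   -- TruePositiveRate at idx = 1

-- AUROC(idx) = [FalsePositiveRate(idx+1) - FalsePositiveRate(idx)] * TruePositiveRate(idx)
SELECT (@fpr_2 - @fpr_1) * @tpr_1 AS AUROC_idx_1;
```

The product is (1/57) * (2/43), which should agree with the 0.000815993472052223 shown in the AUROC column for idx = 1, up to rounding in the last digits.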

You can see from the table that the cumulative AUROC value is 0.789881680946553. This is the same as the value returned by LOGIT.

--Get the AUROC value from #mylogit
SELECT
   stat_val
FROM
   #mylogit
WHERE
   stat_name = 'AUROC'

This produces the following result.

However, ROCTable does return the False Positive Rate and the True Positive Rate, which can be graphed using SSRS, Excel, or any tool that you prefer. In this example, I have simply copied the FalsePositiveRate and TruePositiveRate from ROCTable, pasted them into Excel, and then produced the following graph.
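
If you only need the points to plot, a query along the following lines returns just the two rate columns in plotting order. The added (0,0) row is an assumption for charting convenience, since the table itself starts at the first predicted probability; the -1 idx value is simply a sort key for that extra row.

```sql
-- Origin point added for plotting only; it is not part of the ROCTable output
SELECT 0e0 AS FalsePositiveRate, 0e0 AS TruePositiveRate, -1 AS idx
UNION ALL
SELECT FalsePositiveRate, TruePositiveRate, idx
FROM wct.ROCTable('SELECT * FROM #t')
ORDER BY idx;
```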

It is worth noting that our input data consisted of 100 rows, yet ROCTable only returned 43 rows of data from the temporary table #t, even though we generated a predicted probability for all 100 rows. This is because there were not 100 unique predicted probabilities. We can get the number of unique predicted probabilities using the following SQL.

SELECT COUNT(DISTINCT [p predicted]) as [COUNT p predicted] FROM #t

This produces the following result, which matches what was returned by ROCTable.

As Hosmer points out in section 5.4.2 "let n₁ denote the number of subjects with y = 1 and n₀ denote the number of subjects with y = 0. We can then create n₁ x n₀ pairs; each subject with y = 1, paired with each subject with y = 0. Of these n₁ x n₀ pairs, we determine the proportion of the pairs where the subject with y = 1 had the higher of the two probabilities. This proportion may be shown to be equal to the area under the ROC Curve."

The technique that he is suggesting lends itself quite well to SQL and we can use it to check the AUROC calculation in both LOGIT and ROCTable. We would not recommend this calculation as a practical matter as it requires a Cartesian product; in this case 57 x 43 combinations.

--Calculate the area under the ROC Curve using a Cartesian product
;with mycte as (
   SELECT
      n1.y as y1,
      n1.[p predicted] as p1,
      n0.y as y0,
      n0.[p predicted] as p0
   FROM
      #t n1, #t n0
   WHERE
      n1.y = 1 AND n0.y = 0
)
SELECT
   COUNT(m1.y1)/cast(n.pairs as float) As AUROC
  ,n.pairs
FROM
   (SELECT SUM(y1) FROM mycte)n(pairs)
  ,mycte m1
WHERE
   --strictly higher: tied pairs are not counted, which matches the way
   --the AUROC column accumulates successes from preceding rows only
   m1.p1 > m1.p0
GROUP BY
   n.pairs

This produces the following result.
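
Tied probabilities deserve a second look when checking the area by counting pairs. Working from the column definitions, each AUROC row multiplies the failures in the next row by the successes accumulated so far, so the cumAUROC total counts the pairs in which the success has the strictly higher predicted probability; pairs with equal probabilities contribute nothing. The sketch below assumes the #t table built earlier in this example (columns [p predicted] and y) and reports the strict and tied proportions separately. The output column names are made up for illustration.

```sql
-- Assumes #t from this example: one row per observation, columns [p predicted] and y (0/1)
SELECT
    SUM(CASE WHEN s.[p predicted] > f.[p predicted] THEN 1 ELSE 0 END)
       / CAST(COUNT(*) AS float) AS strictly_higher
   ,SUM(CASE WHEN s.[p predicted] = f.[p predicted] THEN 1 ELSE 0 END)
       / CAST(COUNT(*) AS float) AS tied
   ,COUNT(*) AS pairs
FROM
   #t s
CROSS JOIN
   #t f
WHERE
   s.y = 1 AND f.y = 0;
```

The strictly_higher column is the one to compare with cumAUROC. A common alternative convention gives tied pairs half credit (strictly_higher + tied / 2), which yields a somewhat larger area whenever ties are present.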

Now, let's look at an example using grouped data. The data consist of 3 independent variables (x1, x2, and x3) and 2 additional columns: the number of successes for that combination of independent variables and the number of observations for that combination of independent variables.

--Put grouped data into a temporary table #x
SELECT
   *
INTO
   #x
FROM (VALUES
    (100,1,10,28,156)
   ,(150,1,10,33,144)
   ,(200,1,10,44,171)
   ,(250,1,10,56,196)
   ,(300,1,10,55,158)
   ,(350,1,10,44,100)
   ,(400,1,10,57,126)
   ,(450,1,10,77,166)
   ,(500,1,10,84,166)
   ,(100,2,10,23,153)
   ,(150,2,10,31,165)
   ,(200,2,10,40,179)
   ,(250,2,10,42,152)
   ,(300,2,10,55,181)
   ,(350,2,10,68,200)
   ,(400,2,10,59,148)
   ,(450,2,10,69,156)
   ,(500,2,10,75,157)
   ,(100,1,11,19,164)
   ,(150,1,11,23,147)
   ,(200,1,11,35,182)
   ,(250,1,11,46,196)
   ,(300,1,11,41,143)
   ,(350,1,11,60,189)
   ,(400,1,11,59,162)
   ,(450,1,11,75,187)
   ,(500,1,11,59,129)
   ,(100,2,11,9,105)
   ,(150,2,11,22,179)
   ,(200,2,11,30,182)
   ,(250,2,11,32,155)
   ,(300,2,11,41,164)
   ,(350,2,11,58,200)
   ,(400,2,11,60,181)
   ,(450,2,11,75,199)
   ,(500,2,11,59,141)
   )n(x1,x2,x3,success,N)

--Run LOGITSUM and store the results in #mylogit
--(if #mylogit still exists from the previous example, drop it first)
SELECT
   *
INTO
   #mylogit
FROM
   wct.LOGITSUM('SELECT x1,x2,x3,success,n-success from #x',4,5)

--Calculate the predicted probabilities using LOGITPROB for each row in #x and store the
--predicted probability and the group totals in #t
--(if #t still exists from the previous example, drop it first)
SELECT
   wct.LOGITPROB(n.x, m.stat_val) as [p predicted]
  ,n-Success as failure
  ,success as success
INTO
   #t
FROM
   #x
CROSS APPLY(VALUES (0,1),(1,x1),(2,x2),(3,x3))n(idx,x)
INNER JOIN
   #mylogit m
ON
   m.idx = n.idx
WHERE
   m.stat_name = 'b'
GROUP BY
   n-Success, success

--Run ROCTable function
SELECT
   *
FROM
   wct.ROCTable('SELECT * FROM #t')

This produces the following result.

idx                  ppred     failure     success  cumfailure  cumsuccess      FalsePositiveRate       TruePositiveRate                  AUROC               cumAUROC
----------- ---------------------- ----------- ----------- ----------- ----------- ---------------------- ---------------------- ---------------------- ----------------------
          0      0.540977960616935          82          84          82          84      0.019825918762089     0.0481927710843374   0.000955465964438023   0.000955465964438023
          1       0.49876262766249          82          75         164         159     0.0396518375241779     0.0912220309810671    0.00196294989296784    0.00291841585740586
          2      0.488565636437005          89          77         253         236     0.0611702127659574      0.135398737808376    0.00229156471145705    0.00520998056886291
          3      0.460157630044192          70          59         323         295     0.0780947775628627       0.16924842226047     0.0035601094624422    0.00877009003130511
          4      0.446462214823615          87          69         410         364     0.0991295938104449      0.208835341365462    0.00348395516301181     0.0122540451943169
          5      0.436403529774778          69          57         479         421      0.115812379110251      0.241537578886976    0.00478870441700485     0.0170427496113218
          6      0.418499071132844          82          59         561         480       0.13563829787234      0.275387263339071    0.00745729533219921      0.024500044943521
          7       0.40860533932183         112          75         673         555      0.162717601547389        0.3184165232358    0.00685180623017075     0.0313518511736917
          8        0.3953206788673          89          59         762         614      0.184235976789168      0.352266207687894    0.00476956180621907     0.0361214129799108
          9      0.385611537599448          56          44         818         658      0.197775628626692      0.377510040160643     0.0113179992698065     0.0474394122497173
         10      0.368428698069553         124          75         942         733      0.227756286266925      0.420539300057372     0.0104728113892431     0.0579122236389604
         11      0.358987917507059         103          59        1045         792      0.252659574468085      0.454388984509466     0.0145017761013659     0.0724139997403263
         12      0.346371616586007         132          68        1177         860      0.284574468085106      0.493402180149168     0.0122873366913357      0.084701336431662
         13      0.337194277421765         103          55        1280         915      0.309477756286267      0.524956970740103     0.0153577837184605      0.100059120150122
         14       0.32104154787904         121          60        1401         975      0.338733075435203      0.559380378657487     0.0174468251563868      0.117505945306509
         15      0.312214767445495         129          60        1530        1035      0.369922630560928      0.593803786574871     0.0180897671925613      0.135595712499071
         16      0.300471741826253         126          55        1656        1090      0.400386847195358      0.625358577165806     0.0211678435210863      0.156763556020157
         17      0.291967312452626         140          56        1796        1146      0.434235976789168      0.657487091222031      0.022573299553561      0.179336855573718
         18      0.277075424928281         142          58        1938        1204      0.468568665377176      0.690763052208835     0.0170352590244926       0.19637211459821
         19      0.268978570102441         102          41        2040        1245      0.493230174081238      0.714285714285714     0.0189969604863222      0.215369075084533
         20      0.258251135511103         110          42        2150        1287      0.519825918762089      0.738382099827883     0.0226727579009045      0.238041832985437
         21      0.250513751624832         127          44        2277        1331      0.550531914893617      0.763625932300631     0.0227093785476252      0.260751211533062
         22      0.237028393827286         123          41        2400        1372       0.58027079303675       0.78714859437751     0.0285474586935751      0.289298670226637
         23         0.229729927067         150          46        2550        1418      0.616537717601547      0.813539873780838     0.0273409193557873      0.316639589582425
         24      0.220096523923864         139          40        2689        1458      0.650145067698259      0.836488812392427     0.0224492887271662      0.339088878309591
         25      0.213173748215327         111          33        2800        1491      0.676982591876209      0.855421686746988     0.0254392813031624      0.364528159612753
         26      0.201158946047794         123          32        2923        1523      0.706721470019342      0.873780837636259     0.0310555568502249      0.395583716462978
         27      0.194683131790574         147          35        3070        1558      0.742263056092843        0.8938611589214     0.0289597183983239      0.424543434861302
         28      0.186164174671769         134          31        3204        1589      0.774661508704062      0.911646586345382     0.0282134340068203      0.452756868868122
         29      0.180062272857074         128          28        3332        1617      0.805609284332689      0.927710843373494     0.0340938220968982      0.486850690965021
         30      0.169511632771358         152          30        3484        1647      0.842359767891683      0.944922547332186     0.0283293993881023      0.515180090353123
         31      0.163845663830216         124          23        3608        1670      0.872340425531915       0.95811818703385     0.0301149333448744      0.545295023697997
         32      0.156414007523259         130          23        3738        1693      0.903771760154739      0.971313826735513     0.0368704716628325       0.58216549536083
         33      0.141958450405941         157          22        3895        1715      0.941731141199226      0.983935742971888     0.0344948459214033      0.616660341282233
         34      0.137061458881927         145          19        4040        1734       0.97678916827853      0.994836488812392     0.0230909823322025      0.639751323614436
         35      0.118246213552031          96           9        4136        1743                      1                      1                      0      0.639751323614436

You can see from the table that the cumulative AUROC value is 0.639751323614436. This is the same as the value returned by LOGITSUM.

--Get the AUROC value from #mylogit
SELECT
   stat_val
FROM
   #mylogit
WHERE
   stat_name = 'AUROC'

This produces the following result.

We can modify our SQL slightly from the previous example in order to verify the calculation of the area under the ROC curve using the Cartesian product.

--Calculate the area under the ROC Curve using a Cartesian product
SELECT
   SUM(n.y0*n.y1)/cast(p.pairs as float) as AUROC
  ,p.pairs
FROM (
   SELECT
      t1.success as y1,
      t1.[p predicted] as p1,
      t2.failure as y0,
      t2.[p predicted] as p0
   FROM
      #t t1, #t t2
   )n
  ,(SELECT SUM(success) * SUM(failure) FROM #t)p(pairs)
WHERE
   --strictly higher: successes and failures in the same group are tied
   --and are not counted, which matches the AUROC column definition
   p1 > p0
GROUP BY
   p.pairs

This produces the following result.

See Also

· LINEST - Linear regression

· LOGEST - Logarithmic regression

· LOGIT - Logit regression

· LOGITPRED - Calculate predicted values based on a logit regression

· LOGITPROB - Calculate predicted values based on a logit regression

· LOGITSUM - Logit regression using summary data

· VIF - Variance inflation factors
