XLeratorDB/statistics Documentation

SQL Server Weighted Least Squares function

WLS

Updated: 01 Oct 2017

Use the SQL Server table-valued function WLS to calculate the Ordinary Least Squares (OLS) solution for a series of x- and y-values and an associated column of weights. This technique is commonly referred to as Weighted Least Squares (WLS).

WLS returns the coefficients of regression, standard errors, Student’s T and associated p-value for each of the independent variables. It also returns summary statistics about the regression including the standard error of y, R², adjusted R², the F-statistic and its p-value, the regression sum of squares, the residual sum of squares and the quartiles of the residuals.

WLS is closely related to LINEST: the regression coefficients and their standard errors, T statistics, and p-values can also be calculated with LINEST, although LINEST will not produce the correct summary statistics for a weighted regression. See Example 1 to find out more.

Syntax

SELECT * FROM [wct].[WLS] (
   <@TableName, nvarchar(max),>
  ,<@ColumnNames, nvarchar(max),>
  ,<@GroupedColumnName, nvarchar(max),>
  ,<@GroupedColumnValue, sql_variant,>
  ,<@LConst, bit,>
  ,<@y_Column, nvarchar(4000),>
  ,<@w_Column, nvarchar(4000),>)

Arguments

Input Name	Definition
@TableName	The name, as text, of the table or view that contains the values used in the calculation.
@ColumnNames	The names, as text, of the columns in @TableName that contain the values used in the calculation. The column names include the independent variables (x0,x1…xn), the dependent variable (y) and the weight (w). Data returned from the @ColumnNames must be of a type float or of a type that intrinsically converts to float.
@GroupedColumnName	The name, as text, of the column in the table or view specified by @TableName which will be used for grouping the results.
@GroupedColumnValue	The column value to do the grouping on.
@LConst	A bit value specifying the calculation of a y-intercept (@LConst = 1) or regression through the origin (@LConst = 0).
@y_Column	The column name or column number containing the dependent (y) variable.
@w_Column	The column name or column number containing the weight (w) variable.
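
As a hedged illustration of the last two arguments, the sketch below calls WLS twice on the same data, once identifying y and w by column name and once by position within @ColumnNames. The temp table #demo is a hypothetical name introduced here; its rows are borrowed from Example 1. The positions are assumed to be 1-based (consistent with the bounds check described under Remarks) and are passed as strings because the parameters are nvarchar. The expectation that both calls return the same rows follows from the argument definitions and has not been verified here. Running it requires XLeratorDB/statistics to be installed.

```sql
-- Hypothetical demo table (#demo is not part of the original documentation).
SELECT *
INTO #demo
FROM (VALUES
     (103,   126.8, 62.3, 0.420928305104083)
    ,(127.2, 115.7, 98,   0.642347072957175)
    ,(118,   103.4, 92.2, 0.503672280805613)
    ,(121.8, 95.2,  74.2, 0.349193063289055)
    ,(106.1, 96,    78.9, 0.321793289794097)
    ,(124.6, 124.7, 96.1, 1.34249371606786)
    ) n(y, x1, x2, w);

-- y and w identified by column name.
SELECT * FROM wct.WLS('#demo', 'y,x1,x2,w', '', NULL, 1, 'y', 'w');

-- y and w identified by position in @ColumnNames (assumed 1-based: y is 1, w is 4).
SELECT * FROM wct.WLS('#demo', 'y,x1,x2,w', '', NULL, 1, '1', '4');

DROP TABLE #demo;
```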

Return Type

RETURNS TABLE (
   [stat_name] [nvarchar](20) NULL,
   [idx] [int] NULL,
   [stat_val] [float] NULL,
   [col_name] [nvarchar](128) NULL
)

Table Description

Column	Definition
stat_name	Identifies the statistic being returned: m – estimated coefficient; se – standard error of the estimated coefficient; tstat – Student’s T statistic; pval – p-value of the tstat; rsq – R²; rsqa – adjusted R²; rsqm – multiple R²; sey – standard error for the y estimate; F – F statistic; F_pval – p-value of F; df – residual degrees of freedom; ss_resid – weighted sum of squares; mss – modified sum-of-squares; w_resid_quart – weighted residual quartile.
idx	Uniquely identifies a return value for the stat_names where multiple values are returned: m, se, tstat, pval, and w_resid_quart. For m, se, tstat, and pval, idx identifies the subscript of the estimated coefficient. For example, the stat_name m with an idx of 0 specifies that the stat_val is for m0, or the y-intercept (which is b in y = mx + b). An idx of 1 for the same stat_name identifies m1. For w_resid_quart, idx identifies the quartile being returned. For all other stat_names, which return a single value, idx will be NULL.
stat_val	The return value.
col_name	The column name from the resultant table produced by the dynamic SQL for the m, se, tstat, and pval stat_names.

Remarks

If @y_Column is NULL then @y_Column is the left-most column in @ColumnNames.
If @w_Column is NULL then @w_Column is the right-most column in @ColumnNames.
If @y_Column = @w_Column then no rows are returned.
If @y_Column is numeric and less than 1 or greater than the number of columns in @ColumnNames then no rows are returned.
If @w_Column is numeric and less than 1 or greater than the number of columns in @ColumnNames then no rows are returned.
Weight values must be greater than zero.
The number of rows in the regression must be greater than or equal to the number of columns.
Available in XLeratorDB / statistics 2008 only
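
The two data requirements above (positive weights, and at least as many rows as columns) can be checked before calling WLS. The following is a minimal sketch using only built-in T-SQL; the rows are made up for illustration and the fourth row deliberately carries a zero weight. Treating a NULL weight as invalid is an assumption made here, not something the documentation states.

```sql
-- Hypothetical input rows; the fourth row has an invalid (zero) weight.
WITH t (y, x1, x2, w) AS (
    SELECT * FROM (VALUES
         (10.0, 1.0, 2.0, 0.5)
        ,(12.0, 2.0, 1.0, 1.0)
        ,(15.0, 3.0, 4.0, 2.0)
        ,(11.0, 1.5, 2.5, 0.0)
        ,(14.0, 2.5, 3.5, 1.5)
    ) n(y, x1, x2, w)
)
SELECT
    COUNT(*) AS n_rows,                                       -- 5 for the rows above
    SUM(CASE WHEN w > 0 THEN 0 ELSE 1 END) AS n_bad_weights   -- 1 (zero, negative or NULL weights)
FROM t;
```

Compare n_rows with the number of columns listed in @ColumnNames (4 in this sketch) and make sure n_bad_weights is 0 before running the regression.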

Examples

Example #1

This example explains the calculation of the regression coefficients and the summary statistics in WLS. Let’s set up an example in SQL Server and take a closer look at those calculations. We will put the sample data into a temp table, #t, and the WLS results into a temp table, #wls.

SELECT
    *
INTO
    #t
FROM (VALUES
     (103,126.8,62.3,0.420928305104083)
    ,(127.2,115.7,98,0.642347072957175)
    ,(118,103.4,92.2,0.503672280805613)
    ,(121.8,95.2,74.2,0.349193063289055)
    ,(106.1,96,78.9,0.321793289794097)
    ,(124.6,124.7,96.1,1.34249371606786)
    ,(116.9,122.2,94.1,0.401800920329203)
    ,(118.6,128.2,79.2,0.67140606947821)
    ,(125.2,116.9,79.6,0.336969408869812)
    ,(123.3,112.3,87.8,0.556210387357181)
    )n(y,x1,x2,w)

Use the WLS function to calculate the coefficients of regression and the associated statistics.

SELECT
    *
INTO
    #wls
FROM
    wct.WLS('#t','y,x1,x2,w','',NULL,1,'y','w')

SELECT
    *
FROM
    #wls

This produces the following result.

stat_name                    idx               stat_val col_name
-------------------- ----------- ---------------------- ---------
m                              0       76.2158913852846 Intercept
m                              1     0.0222877042102519 x1
m                              2       0.47373007655067 x2
se                             0       24.1127459683416 Intercept
se                             1      0.171731881726625 x1
se                             2      0.173375419548835 x2
tstat                          0       3.16081343391378 Intercept
tstat                          1      0.129781983322882 x1
tstat                          2       2.73239469460799 x2
pval                           0     0.0159102496734965 Intercept
pval                           1      0.900389539811493 x1
pval                           2     0.0292374089699169 x2
rsq                         NULL      0.520577705092377 NULL
sey                         NULL       4.29237110562461 NULL
F                           NULL       3.80045314366198 NULL
F_pval                      NULL     0.0762981122063386 NULL
df                          NULL                      7 NULL
mss                         NULL        140.04251562907 NULL
ss_resid                    NULL       128.971147958807 NULL
rsqm                        NULL      0.721510710310233 NULL
rsqa                        NULL      0.383599906547341 NULL
w_resid_quart                  0      -5.46438974663061 NULL
w_resid_quart                  1      -3.44808553096924 NULL
w_resid_quart                  2      0.839382652539098 NULL
w_resid_quart                  3       2.08237170553033 NULL
w_resid_quart                  4       5.03271582569117 NULL

There is nothing further that needs to be done in terms of getting the regression results. The rest of this example serves as an explanation of how the results are calculated.
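
Because WLS returns one row per statistic, it is often convenient to pivot the coefficient rows into one row per regressor. The sketch below does that with conditional aggregation. To keep it runnable on its own, the coefficient rows from the output above are inlined in a CTE named wls_out, which is a name introduced here; in practice you would select from #wls instead.

```sql
-- Coefficient rows copied from the WLS output above; replace wls_out with #wls in practice.
WITH wls_out (stat_name, idx, stat_val, col_name) AS (
    SELECT * FROM (VALUES
         ('m',     0, 76.2158913852846,   'Intercept')
        ,('m',     1, 0.0222877042102519, 'x1')
        ,('m',     2, 0.47373007655067,   'x2')
        ,('se',    0, 24.1127459683416,   'Intercept')
        ,('se',    1, 0.171731881726625,  'x1')
        ,('se',    2, 0.173375419548835,  'x2')
        ,('tstat', 0, 3.16081343391378,   'Intercept')
        ,('tstat', 1, 0.129781983322882,  'x1')
        ,('tstat', 2, 2.73239469460799,   'x2')
        ,('pval',  0, 0.0159102496734965, 'Intercept')
        ,('pval',  1, 0.900389539811493,  'x1')
        ,('pval',  2, 0.0292374089699169, 'x2')
    ) v(stat_name, idx, stat_val, col_name)
)
SELECT
    idx
   ,col_name
   ,MAX(CASE WHEN stat_name = 'm'     THEN stat_val END) AS m
   ,MAX(CASE WHEN stat_name = 'se'    THEN stat_val END) AS se
   ,MAX(CASE WHEN stat_name = 'tstat' THEN stat_val END) AS tstat
   ,MAX(CASE WHEN stat_name = 'pval'  THEN stat_val END) AS pval
FROM
    wls_out
GROUP BY
    idx, col_name
ORDER BY
    idx;
```

This returns three rows (Intercept, x1, x2), each with its coefficient, standard error, T statistic and p-value side by side.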

To calculate weighted least squares, the dependent variable and all the independent variables are multiplied by the square root of the weights. We can achieve that result in LINEST by setting @LConst = 0 and by manually creating the intercept column, sqrt(w), in the input columns. The following SQL does that and puts the results into a temp table, #ols.

SELECT
    stat_name
   ,idx-1 as idx
   ,stat_val
   ,col_name
INTO
    #ols
FROM
    wct.LINEST('#t','y*SQRT(w) as y, sqrt(w) as Intercept, x1*SQRT(w) as x1, x2*SQRT(w) as x2','',NULL,1,0)
WHERE
    idx > 0 OR idx IS NULL

SELECT
    *
FROM
    #ols

This produces the following result.

stat_name          idx               stat_val col_name
---------- ----------- ---------------------- ---------
m                    0       76.2158913852846 Intercept
m                    1     0.0222877042102519 x1
m                    2       0.47373007655067 x2
se                   0       24.1127459683416 Intercept
se                   1      0.171731881726625 x1
se                   2      0.173375419548835 x2
tstat                0       3.16081343391378 Intercept
tstat                1      0.129781983322882 x1
tstat                2       2.73239469460799 x2
pval                 0     0.0159102496734965 Intercept
pval                 1      0.900389539811493 x1
pval                 2     0.0292374089699169 x2
rsq               NULL      0.998391679632623 NULL
sey               NULL       4.29237110562461 NULL
F                 NULL       1448.45556461448 NULL
df                NULL                      7 NULL
ss_reg            NULL       80060.9901152796 NULL
ss_resid          NULL       128.971147958807 NULL
rsqm              NULL      0.999195516219235 NULL
rsqa              NULL      0.997702399475176 NULL
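
The reason the rescaled LINEST call reproduces the WLS coefficients can be written out in one line: moving each weight inside the square turns the weighted sum of squared residuals into an unweighted one on rescaled columns. The symbols b_0, b_1, b_2 and the row index i are notation introduced here for this example's two regressors; they do not appear in the original documentation.

```latex
\sum_{i} w_i \left( y_i - b_0 - b_1 x_{i1} - b_2 x_{i2} \right)^2
  = \sum_{i} \left( \sqrt{w_i}\, y_i
      - b_0 \sqrt{w_i}
      - b_1 \sqrt{w_i}\, x_{i1}
      - b_2 \sqrt{w_i}\, x_{i2} \right)^2
```

On the right-hand side the intercept b_0 multiplies the column sqrt(w) rather than a constant 1, which is why the LINEST call supplies sqrt(w) as an explicit Intercept column and fits through the origin.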

The following SQL produces a side-by-side comparison of the regression results.

SELECT
    w.stat_name
   ,w.idx
   ,w.stat_val as [wls]
   ,o.stat_val as [ols]
FROM
    #wls w
FULL OUTER JOIN
    #ols o
ON
    (w.stat_name = o.stat_name AND ISNULL(w.idx,0) = ISNULL(o.idx,0))
    OR (w.stat_name = 'mss' AND o.stat_name = 'ss_reg')

This produces the following result.

stat_name                    idx                    wls                    ols
-------------------- ----------- ---------------------- ----------------------
m                              0       76.2158913852846       76.2158913852846
m                              1     0.0222877042102519     0.0222877042102519
m                              2       0.47373007655067       0.47373007655067
se                             0       24.1127459683416       24.1127459683416
se                             1      0.171731881726625      0.171731881726625
se                             2      0.173375419548835      0.173375419548835
tstat                          0       3.16081343391378       3.16081343391378
tstat                          1      0.129781983322882      0.129781983322882
tstat                          2       2.73239469460799       2.73239469460799
pval                           0     0.0159102496734965     0.0159102496734965
pval                           1      0.900389539811493      0.900389539811493
pval                           2     0.0292374089699169     0.0292374089699169
rsq                         NULL      0.520577705092377      0.998391679632623
sey                         NULL       4.29237110562461       4.29237110562461
F                           NULL       3.80045314366198       1448.45556461448
df                          NULL                      7                      7
mss                         NULL        140.04251562907       80060.9901152796
ss_resid                    NULL       128.971147958807       128.971147958807
rsqm                        NULL      0.721510710310233      0.999195516219235
rsqa                        NULL      0.383599906547341      0.997702399475176
F_pval                      NULL     0.0762981122063386                   NULL
w_resid_quart                  0      -5.46438974663061                   NULL
w_resid_quart                  1      -3.44808553096924                   NULL
w_resid_quart                  2      0.839382652539098                   NULL
w_resid_quart                  3       2.08237170553033                   NULL
w_resid_quart                  4       5.03271582569117                   NULL

Notice that LINEST returns the same values as WLS for the m, se, tstat, pval, sey, df, and ss_resid statistics. LINEST does not calculate F_pval or the quartiles of the residuals. The real difference arises in the calculation of the sum of squares of regression (ss_reg in LINEST; mss in WLS), which is used in the calculation of rsq, F, rsqm and rsqa.

The ss_reg value in LINEST is calculated as the sum of ŷ (yhat) squared, where ŷ is the sum of the independent variables multiplied by their coefficients of regression. We can see that calculation in the following SQL.

SELECT SUM(SQUARE([m0]*n.Intercept+[m1]*x1+[m2]*x2)) as ss_reg
FROM (
    SELECT
        'm' + cast(idx as char(1)) as coef,
        stat_val
    FROM
        #ols
    WHERE
        stat_name = 'm'
    ) d
PIVOT (max(stat_val) FOR coef in (m0,m1,m2))pvt
CROSS JOIN
    (SELECT sqrt(w) as Intercept, x1*SQRT(w) as x1, x2*SQRT(w) as x2 FROM #t)n

The result should agree, up to floating-point rounding, with the ss_reg value returned by LINEST above (80060.9901152796).

For weighted least squares, this calculation needs to be adjusted for the weights. The following SQL shows how to make this adjustment.

DECLARE @wavg_y as float = (SELECT wct.WAVG(w,y) FROM #t)

DECLARE @mss as float = (
    SELECT SUM(POWER([m0]*n.Intercept+[m1]*x1+[m2]*x2-@wavg_y,2)*w) as mss
    FROM (
        SELECT
            'm' + cast(idx as char(1)) as coef,
            stat_val
        FROM
            #ols
        WHERE
            stat_name = 'm'
        ) d
    PIVOT (max(stat_val) FOR coef in (m0,m1,m2))pvt
    CROSS JOIN
        (SELECT 1 as Intercept, x1, x2, w FROM #t)n
)

SELECT @mss as mss

The result should agree, up to floating-point rounding, with the mss value returned by WLS above (140.04251562907).
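
In symbols, the two regression sums of squares computed above differ only in whether the fitted values are centered on the weighted mean of y. The notation below is introduced here for clarity, and it assumes that wct.WAVG returns the usual weighted average, which is consistent with the numbers in this example but is not stated explicitly in the documentation.

```latex
\hat{y}_i = m_0 + m_1 x_{i1} + m_2 x_{i2},
\qquad
\bar{y}_w = \frac{\sum_i w_i\, y_i}{\sum_i w_i}

\mathrm{ss\_reg} = \sum_i w_i\, \hat{y}_i^{\,2},
\qquad
\mathrm{mss} = \sum_i w_i \left( \hat{y}_i - \bar{y}_w \right)^2
```

Because ss_reg is not centered, it is dominated by the overall level of y, which is why the rsq and F values derived from it (0.998 and 1448 in the comparison table) are so much larger than the WLS values.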

Having obtained the regression modified sum-of-squares value, we can then use the same formulas as in ordinary least squares to calculate the remaining statistics.

DECLARE @ssresid as float = (SELECT stat_val from #wls WHERE stat_name = 'ss_resid')
DECLARE @df as float = (SELECT stat_val from #wls WHERE stat_name = 'df')
DECLARE @rsq as float = @mss/(@mss+@ssresid)
DECLARE @rsqm as float = SQRT(@rsq)
DECLARE @p as float = (SELECT COUNT(*)-1 FROM #t)
DECLARE @rsqa as float = 1 - (1 - @rsq) * @p/ @df
DECLARE @Fobs as float = @mss / ((@p - @df) * @ssresid / @df)
DECLARE @Fdist as float = wct.F_DIST_RT(@Fobs, @p - @df, @df)

SELECT
    stat_name,
    NULL as idx,
    stat_val
FROM (VALUES
     ('rsq',@rsq)
    ,('mss',@mss)
    ,('rsqm',@rsqm)
    ,('rsqa',@rsqa)
    ,('F',@Fobs)
    ,('F_pval',@Fdist)
    )x(stat_name,stat_val)

The results should agree, up to floating-point rounding, with the rsq, mss, rsqm, rsqa, F and F_pval rows returned by WLS above.
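
As a self-contained cross-check that needs neither XLeratorDB nor the temp tables, the same formulas can be applied to the mss, ss_resid and df values copied from the WLS output above. This is a sketch using only built-in T-SQL; the variable names are chosen here, and the sey line uses the standard formula sqrt(ss_resid / df), which the documentation does not spell out but which matches the value shown in the output.

```sql
-- Inputs copied from the WLS output above.
DECLARE @mss      float = 140.04251562907;
DECLARE @ss_resid float = 128.971147958807;
DECLARE @df       float = 7;    -- residual degrees of freedom
DECLARE @n        float = 10;   -- number of rows in #t

DECLARE @rsq float = @mss / (@mss + @ss_resid);

SELECT
    @rsq                                          AS rsq    -- approx. 0.5206
   ,SQRT(@rsq)                                    AS rsqm   -- approx. 0.7215
   ,1 - (1 - @rsq) * (@n - 1) / @df               AS rsqa   -- approx. 0.3836
   ,(@mss / (@n - 1 - @df)) / (@ss_resid / @df)   AS F      -- approx. 3.8005
   ,SQRT(@ss_resid / @df)                         AS sey;   -- approx. 4.2924
```

F_pval is left out because it needs an F-distribution function such as wct.F_DIST_RT, which is not part of built-in T-SQL.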

Finally, the w_resid_quart values are simply the quartiles of the weighted residuals.

SELECT
    'w_resid_quart' as stat_name
   ,x.k as idx
   ,wct.QUARTILE(y - ([m0]*n.Intercept+[m1]*x1+[m2]*x2),x.k) as stat_val
FROM (
    SELECT
        'm' + cast(idx as char(1)) as coef,
        stat_val
    FROM
        #ols
    WHERE
        stat_name = 'm'
    ) d
PIVOT (max(stat_val) FOR coef in (m0,m1,m2))pvt
CROSS JOIN
    (SELECT y*sqrt(w) as y,sqrt(w) as Intercept, x1 * sqrt(w) as x1, x2 * sqrt(w) as x2 FROM #t)n
CROSS APPLY
    (VALUES (0),(1),(2),(3),(4))x(k)
GROUP BY
    x.k

The results should agree, up to floating-point rounding, with the w_resid_quart rows returned by WLS above.
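
The weighted residual quartiles can also be reproduced without XLeratorDB. The sketch below inlines the Example 1 data and the three coefficients from the WLS output, computes each weighted residual as sqrt(w) * (y - ŷ), and uses the built-in PERCENTILE_CONT function, which requires SQL Server 2012 or later. It assumes that wct.QUARTILE uses the same linear interpolation between ranked values as PERCENTILE_CONT; the CTE names t and r are introduced here.

```sql
-- Example 1 data, inlined so the query runs on its own.
WITH t (y, x1, x2, w) AS (
    SELECT * FROM (VALUES
         (103,   126.8, 62.3, 0.420928305104083)
        ,(127.2, 115.7, 98,   0.642347072957175)
        ,(118,   103.4, 92.2, 0.503672280805613)
        ,(121.8, 95.2,  74.2, 0.349193063289055)
        ,(106.1, 96,    78.9, 0.321793289794097)
        ,(124.6, 124.7, 96.1, 1.34249371606786)
        ,(116.9, 122.2, 94.1, 0.401800920329203)
        ,(118.6, 128.2, 79.2, 0.67140606947821)
        ,(125.2, 116.9, 79.6, 0.336969408869812)
        ,(123.3, 112.3, 87.8, 0.556210387357181)
    ) n(y, x1, x2, w)
),
-- Weighted residuals, using the m0, m1, m2 values from the WLS output.
r AS (
    SELECT SQRT(w) * (y - (76.2158913852846
                           + 0.0222877042102519 * x1
                           + 0.47373007655067 * x2)) AS wr
    FROM t
)
SELECT DISTINCT
    MIN(wr) OVER ()                                          AS q0
   ,PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY wr) OVER () AS q1
   ,PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY wr) OVER () AS q2
   ,PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY wr) OVER () AS q3
   ,MAX(wr) OVER ()                                          AS q4
FROM r;
```

The single row returned should come out close to the w_resid_quart values above (about -5.464, -3.448, 0.839, 2.082 and 5.033).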
