Login     Register

        Contact Us     Search

XLeratorDB/statistics Documentation

SQL Server predicted values for multivariate linear regression


LMPRED
Updated: 02 February 2017
Use the SQL Server scalar function LMPRED to evaluate a dependent variable from the independent variables using the regression coefficients from the output of the LINEST or LINEST_q function. You can use the following equation to calculate the dependent value, y, from the independent variables, x.
The m values are included in the output from the XLeratorDB linear regression functions LINEST and LINEST_q and have a stat_name of 'm'. The subscripts for m are in the idx column of the regression output.
LMPRED matches the coefficients of regression (m) to the new independent variables in one of two ways. You can specify that they be matched using the names of the coefficients and the column names of the independent variables or by using the idx associated with m and the relative position of the new independent variables.
Syntax
SELECT [wct].[LMPRED](
  <@Coef, nvarchar(max),>
 ,<@New_x, nvarchar(max),>
 ,<@UseNames, bit,>)
Arguments

Input Name
Definition
@Coef
A string which contains a SELECT statement which returns a resultant table consisting of the idx, stat_val and col_name from the LINEST or LINEST_q table-valued functions.
@New_x
A string which contains either a SELECT statement which returns a single row consisting of the independent variables to be evaluated or a comma separated string of new independent variables. When @New_x is not a SELECT statement @UseNames is ignored and coefficients are match to independent variables based on their position in the string. 
@UseNames
A bit value indicating whether or not use the independent variable names to match up the coefficients with the independent variables. When True the coefficients are match to the independent variables using the names. Then False, the coefficients are match to the independent variables using the column position in the resultant table (or string) returned by @New_x.

Return Type
float
Remarks
·         There will be 1 more coefficient than independent variable
·         When @UseNames is 1, LMPRED matches the col_name from the regression to the column names in @New_x
·         LMPRED does not calculate the regression coefficients
·         Available in XLeratorDB / statistics 2008 only
Examples
Example #1
--Put data into a table
SELECT
   IDENTITY(int,1,1) as recno
   ,*
INTO
   #L
FROM (VALUES
    (2310,2,2,20,142000)
   ,(2333,2,2,12,144000)
   ,(2356,3,1.5,33,151000)
   ,(2379,3,2,43,150000)
   ,(2402,2,3,53,139000)
   ,(2425,4,2,23,169000)
   ,(2448,2,1.5,99,126000)
   ,(2471,2,2,34,142900)
   ,(2494,3,3,23,163000)
   ,(2517,4,4,55,169000)
   ,(2540,2,3,22,149000)
   )n([Floor Space],[Offices],[Entrances],[Age],[Assessed Value])
      
      
--Run the linear regression and store the results
SELECT
   *
INTO  
   #lm
FROM
   wct.LINEST('#L','[Floor Space],[Offices],[Entrances],[Age],[Assessed Value]','',NULL,5,'True')
The regression coefficients are stored in the temp table #lm and we can select them using the following SQL.
SELECT
   idx
   ,stat_val
   ,col_name
FROM
   #lm
WHERE
   stat_name = 'm'
This produces the following result.
As you can see from the resultant table, the regression coefficients can be referenced by the idx column or the col_name column. We can use the LMPRED function with the @UseNames = 1 and the input data to calculate a predicted Assessed Value. Note that we have passed all the column name from the input table into the @New_x variable not just the independent variable. By using @UseNames = 1, LMPRED matched the regression coefficient names to the column names returned by the @New_x SQL.
SELECT
    recno
   ,[Floor Space]
   ,[Offices]
   ,[Entrances]
   ,[Age]
   ,[Assessed Value]
   ,wct.LMPRED(
          'SELECT
              idx
              ,stat_val
              ,col_name
          FROM
              #lm
          WHERE
              stat_name = ''m'''         --@Coef
          ,'SELECT
              *
          FROM
              #l
          WHERE recno = ' + cast(recno as varchar(max))      --@New_x
          ,1                             --@UseNames
        ) as yhat
FROM
   #l
This produces the following result.
Example #2
Using the same regression coefficients as in the previous example, we set @UseNames = 0. This means that LMPRED will not use the regression coefficient names to match to the independent variables but will use the idx values instead which refer to the relative column position in the resultant table from the @New_x SQL. In other words, idx = 1 means the first column, idx = 2 means the second column, etc.
SELECT
    recno
   ,wct.LMPRED(
       'SELECT idx,stat_val FROM #lm WHERE stat_name = ''m'''
       ,'SELECT [Floor Space],[Offices],[Entrances],[Age] FROM #l WHERE recno = ' + cast(recno as varchar(max))
       ,0
       ) as Yhat
FROM
   #L
This produces the following result.
Example #3
Using the same regression coefficients as Example #1 this example shows how to calculate the predicted value when the new x-values are passed in as a comma separated string rather than as a SELECT statement. We use the built-in CONCAT function to build the string a derived table and then pass that value into LMPRED.
SELECT
   wct.LMPRED(
       'SELECT idx,stat_val FROM #lm WHERE stat_name = ''m'''
       ,newx
       ,0
       ) as Yhat
FROM (
   SELECT
       CONCAT([Floor Space],',',[Offices],',',[Entrances],',',[Age]) as newx
   FROM
       #L
   )n
This produces the following result.
See Also

 



Copyright 2008-2024 Westclintech LLC         Privacy Policy        Terms of Service