Site Tools


Linear Regression

RhinoScript

Version: Rhino 4.0
Summary: Demonstrates how to calculate linear regression using RhinoScript.

Overview

Linear regression is a method to best fit a linear equation, or a straight line, of the form y(x) = a + bx to a collection of n points (x, y), where b is the slope and a the intercept on the y-axis.

More Information

In the following implementation, adapted from source code provided by Paul Bourke, the result will be stated below without derivation, that requires minimization of the sum of the squared distance from the data points and the proposed line. This function is minimized by calculating the derivative with respect to a and b and setting these to zero.

This method assumes there is no known variance for the x and y values. There are solutions which can take this into account, this is particularly important if some values are known with less error than others. Also, this method requires that the slope is not infinite.

 ' Description:
 '   Linear Regression
 '   y(x) = a + bx, for n samples.
 ' Parameters:
 '   data - [in]  An array of (x,y) values.
 '   a    - [out] The slope.
 '   b    - [out] The y-axis intersect.
 '   c    - [out] The regression coefficient.
 ' Returns:
 '   True if successful, False otherwise.

 Function LinearRegression(ByVal data, ByRef a, ByRef b, ByRef r)
 
   ' Local variables
   Dim d, x, y, n
   Dim sumx, sumy, sumx2, sumy2, sumxy, sxx, syy, sxy
 
   ' Initialize variables
   sumx = 0 : sumy = 0 : sumx2 = 0 : sumy2 = 0 : sumxy = 0
   n = UBound(data) + 1
 
   ' Initialize output
   a = 0 : b = 0 : r = 0
 
   ' Default return value
   LinearRegression = False
 
   ' Must have at least two points
   If (n < 2) Then Exit Function
 
   ' Compute some things we need
   For Each d In data
     x = d(0)
     y = d(1)
     sumx  = sumx  + x
     sumy  = sumy  + y
     sumx2 = sumx2 + (x * x)
     sumy2 = sumy2 + (y * y)
     sumxy = sumxy + (x * y)
   Next
 
   sxx = sumx2 - (sumx * sumx / n)
   syy = sumy2 - (sumy * sumy / n)
   sxy = sumxy - (sumx * sumy / n)
 
   ' Infinite slope (b), non existant intercept (a)
   If (Abs(sxx) = 0) Then Exit Function
 
   ' Compute slope (b) and intercept (a)
   b = sxy / sxx
   a = sumy / n - b * sumx / n
 
   ' Compute regression coefficient 
   If (Abs(syy) = 0) Then
     r = 1
   Else
     r = sxy / Sqr(sxx * syy)
   End If
 
   LinearRegression = True
 
 End Function

Example

The following example shows the points and the best fit line as determined using the techniques demonstrated above.

 Sub Main()
 
   Dim data(9), a, b, r
 
   data(0) = Array(-0.20707, -0.319029)
   data(1) = Array(0.706672, 0.0931669)
   data(2) = Array(1.63739, 2.17286)
   data(3) = Array(2.03117, 2.76818)
   data(4) = Array(3.31874, 3.56743)
   data(5) = Array(5.38201, 4.11772)
   data(6) = Array(6.79971, 5.52709)
   data(7) = Array(6.31814, 7.46613)
   data(8) = Array(8.20829, 8.7654)
   data(9) = Array(8.53994, 9.58096)
 
   If (LinearRegression(data, a, b, r) = True) Then
     Call Rhino.Print("Slope (b) = " & FormatNumber(b, 3))
     Call Rhino.Print("Y Intercept (a) = " & FormatNumber(a, 3))
     Call Rhino.Print("Regression Coefficient = " & FormatNumber(r, 3))
   End If
 
 End Sub
developer/scriptsamples/linearregression.txt ยท Last modified: 2010/01/26 (external edit)