Does a set of points define a line? - [Solved]

General FreeBASIC programming questions.
cbruce
Posts: 136
Joined: Sep 12, 2007 19:13
Location: Dallas, Texas

Does a set of points define a line? - [Solved]

Postby cbruce » Sep 16, 2007 20:03

Does anyone know how to determine if a set of points maps relatively close to a straight line?

(I'm obviously not a math guy)
:)

Thanks
Last edited by cbruce on Sep 23, 2007 21:59, edited 1 time in total.
Plasma
Posts: 191
Joined: May 27, 2005 5:22
Location: Earth
Contact:

Postby Plasma » Sep 16, 2007 20:28

A linear regression will do that.
cbruce
Posts: 136
Joined: Sep 12, 2007 19:13
Location: Dallas, Texas

Postby cbruce » Sep 17, 2007 4:59

Thanks Plasma, but after a lot of reading, it does not appear that linear regression is the answer to my question.

I may be incorrect, but it seems that linear regression will only best fit a line to a set of points. What I need to know is if my set of points "is" a straight line (or relatively close).

I could be getting a set of points that consist of either:
- a straight (or close) line of points
- an arc of points
- a widely dispersed cloud of points

I need to be able to determine if the set of points is either a "line" or "something else".
Basic Coder
Posts: 180
Joined: Aug 02, 2006 23:37
Location: Australia

Postby Basic Coder » Sep 17, 2007 7:09

cbruce wrote:Thanks Plasma, but after a lot of reading, it does not appear that linear regression is the answer to my question.

I may be incorrect, but it seems that linear regression will only best fit a line to a set of points. What I need to know is if my set of points "is" a straight line (or relatively close).

I could be getting a set of points that consist of either:
- a straight (or close) line of points
- an arc of points
- a widely dispersed cloud of points

I need to be able to determine if the set of points is either a "line" or "something else".


It would help if you could make the context of the problem clearer.

* * * * * * * * *
* * * * * * * * *
* * * * * * * * *
* * * * * * * * *
* * * * * * * * *
* * * * * * * * *

Choose a point above, draw a line through the point and
see how it can be part of many lines (which line do you
want) and notice the ratio of the change in coordinates
will remain constant.

--
Basic Coder
Richard
Posts: 3036
Joined: Jan 15, 2007 20:44
Location: Australia

Postby Richard » Sep 17, 2007 9:08

Here is a simple linear regression that also computes the coefficient of determination, rr = r squared.
If rr = 1.0000 then it is a perfectly straight line. As the fit gets worse rr falls towards zero.

Code: Select all

 '=============================================================
' Linear Regression. y = ax+b, rr = coefficient of determination
' If rr = 1 then line is perfectly straight, 0 is not straight
' if rr is in between then points are sort of in line.
'=============================================================
Dim As Integer i, n = 5     ' number of data points
Dim As Double x(1 to n) , y(1 to n)
Dim As Double SumX=0, SumY=0, SumXY=0, SumXX=0, SumYY=0
Dim As Double a, b, rr

' Initialise the x,y data points
For i = 1 To n
    x(i) = i
    y(i) = 3*i + 2
Next i

' Calculate sigma x, y, xy, xx and yy
For i = 1 To n
    SumX += x(i)
    SumY += y(i)
    SumXY += x(i)*y(i)
    SumXX += x(i)*x(i)
    SumYY += y(i)*y(i)
Next i

' compute coefficients
b = (SumXY-(SumX*SumY/n))/(SumXX-(SumX*SumX/n))
a = (SumY/n)-b*(SumX/n)
rr = (SumXY-(SumX*SumY/n))^2/((SumXX-(SumX*SumX/n))*(SumYY-(SumY*SumY/n)))

Print "Y intercept,  a = "; a
Print "Slope dy/dx,  b = "; b
Print "Fit Quality, rr = "; rr

Sleep

This code is untested. Good luck.
cbruce
Posts: 136
Joined: Sep 12, 2007 19:13
Location: Dallas, Texas

Postby cbruce » Sep 17, 2007 16:58

Hello Basic Coder,

Using your example, it's more like I can receive one of the following sets of points:

Code: Select all

A.
*
    *
      *
          *
                *

B.
* * *
            * * *

C.
*
  *
    *
    *
  *
*

D.
* * * *   * *     
    * * * * * * *
* * * * * * * * *
* *     * * *   *
  * *   *   * *   
*   * *   * *   *


I would like to interpret A. and B. as straight lines. Induction would then allow me to determine that C. was an arc and D. was "something else", simply by the size of the data sets.

I hope that makes it clearer - And thank you for your time.

P.S. - It looks like Richard may have set me aright...
cbruce
Posts: 136
Joined: Sep 12, 2007 19:13
Location: Dallas, Texas

Postby cbruce » Sep 17, 2007 16:59

Hello Richard,

So that is what coefficient of determination was! - (I told you I wasn't a geometer)...

I'll try it out. Thank you very much...
KristopherWindsor
Posts: 2428
Joined: Jul 19, 2006 19:17
Location: Sunnyvale, CA
Contact:

Postby KristopherWindsor » Sep 17, 2007 19:27

One way to do it, is to define the line first, then make sure all points are near that line.
This works, but the column of selected points seems to have a variable with. The solution, although I don't know how to do it, is to use something like "< 40 * slope" instead of just "< 40."

Code: Select all


' Draw some points, color the ones in a line
' by Kristopher Windsor

Const screenx = 800, screeny = 600, upoint = 5000

Type apoint
  As Integer x, y
End Type

Screenres screenx, screeny, 32

Dim As apoint points(1 To upoint)
Dim As Integer pcolor, yshouldbe
Dim As Double x1, y1, x2, y2, slope

Randomize Timer
For a As Integer = 1 To upoint
  With points(a)
    .x = Rnd * screenx
    .y = Rnd * screeny
  End With
Next a

Do
  Getmouse points(1).x, points(1).y
 
  'trick to make sure the line is not vertical
  points(1).x += points(1).x = points(2).x
 
  'define the line by the first two points
  x1 = points(1).x
  y1 = points(1).y
  x2 = points(2).x
  y2 = points(2).y
  slope = (y1 - y2) / (x1 - x2)

  Screenlock
  Cls
  'draw red line connecting first two points
  Line (x1, y1) - (x2, y2), &HFFFF0000
 
  'display points as small circles
  'white > is on line; green > not on line
  For a As Integer = 1 To upoint
    With points(a)
      yshouldbe = y1 + (.x-x1) * slope
      If Abs(.y - yshouldbe) < 40 Then pcolor = &HFFFFFFFF Else pcolor = &HFF00FF00
      Circle (.x, .y), 2, pcolor,,,, f
    End With
  Next a
  Screenunlock
Loop Until Inkey = Chr(27)

System

notthecheatr
Posts: 1759
Joined: May 23, 2007 21:52
Location: Cut Bank, MT
Contact:

Postby notthecheatr » Sep 17, 2007 20:13

Linear Regression, if I remember correctly, does fit a line to the data - and it also returns a value of how close the line actually is to the actual data.

You want to find out if an arbitrary line is close to the data? Use linear regression to find the line nearest to the data, then figure out how close it is to the line you're testing - probably by comparing the slope and intercept of the two lines, and setting a certain threshold value to decide what "close enough" is. This isn't quite as accurate as KristopherWindsor's method, but it might be good enough and I think it would be faster.

Another method is to draw the line you want and the data points on the screen and eyeball it :)
KristopherWindsor
Posts: 2428
Joined: Jul 19, 2006 19:17
Location: Sunnyvale, CA
Contact:

Postby KristopherWindsor » Sep 17, 2007 23:09

Here is my attempt to find the "average line" for a set of points. It is better to use an average line, then to use the line connecting the first two points.
I don't think this code works quite right, since it is my own algorithm to find the average line, but I haven't found any big problems yet, so this should work. :-)

Code: Select all

' Draw some points, color the ones in a line v2.0
' by Kristopher Windsor

Const screenx = 800, screeny = 600, upoint = 60

Type apoint
  As Integer x, y
End Type

Screenres screenx, screeny, 32

Dim As apoint points(1 To upoint)
Dim As Integer pcolor, yshouldbe, mousepoint, mx, my
Dim As Double x1, y1, x2, y2, slope, lx, ly, slopesign, slopetemp

Randomize Timer
For a As Integer = 2 To upoint
  With points(a)
    .x = Rnd * screenx
    .y = Rnd * screeny
  End With
Next a

Do
  Getmouse mx, my
  If Abs(points(mousepoint).x - mx) > 2 Or Abs(points(mousepoint).y - my) > 2 Then
    mousepoint += 1
    If mousepoint > upoint Then mousepoint = 1
    With points(mousepoint)
      .x = mx: .y = my
    End With
  End If
 
  'define the line by the average of the points
  lx = 0: ly = 0: slope = 0
  For a As Integer = 1 To upoint
    With points(a)
      'create a line for every point to the following point
      x1 = points(a).x
      y1 = points(a).y
      x2 = points(Iif(a = upoint, 1, a + 1)).x
      y2 = points(Iif(a = upoint, 1, a + 1)).y
      'accumulate the total x and y coords, and the slope
      lx += .x
      ly += .y
      slopetemp = (y1 - y2) / (x1 - x2 + (x1 = x2)) 'trick to make sure the slope is always defined
      slope += Abs(slopetemp)
      slopesign += slopetemp
    End With
  Next a
  'convert totals into averages
  lx /= upoint
  ly /= upoint
  slope /= upoint
  slope *= Sgn(slopesign)
  'create the average line with the average coords and slope (only used for display)
  x1 = lx - 300
  y1 = ly - 300 * slope
  x2 = lx + 300
  y2 = ly + 300 * slope
 
  Screenlock
  Cls
  Print "Slope: ", slope
 
  'draw yellow midpoint
  Circle (lx, ly), 4, &HFFFFFF00,,,, f
 
  'draw red "average" line
  Line (x1, y1) - (x2, y2), &HFFFF0000
 
  'display points as small circles
  'white > is on line; green > not on line
  For a As Integer = 1 To upoint
    With points(a)
      yshouldbe = ly + (.x - lx) * slope
      If Abs(.y - yshouldbe) < 40 Then pcolor = &HFFFFFFFF Else pcolor = &HFF00FF00
      Circle (.x, .y), 2, pcolor,,,, f
    End With
  Next a
  Screenunlock
 
  Sleep 20
Loop Until Inkey = Chr(27)

System
Richard
Posts: 3036
Joined: Jan 15, 2007 20:44
Location: Australia

Postby Richard » Sep 18, 2007 1:27

Here is a major rewrite of my original. Instead of rr it gives you the RMS error measured perpendicular to the line of best fit. If RMS is zero then the fit is perfect. It is based on the original linear regression but RMS does a much better job than rr as an estimate of deviation from straightness. Have more fun!

Code: Select all

'===============================================================
' Best Line Fit. y = mx+c, RMS = Root Mean Square error. This
'  algorithm will handles diagonal, vertical and horizontal lines.
'===============================================================
Dim As Integer i, n = 5     ' number of data points

' Initialise the x,y data points
Dim As Double x(1 To n) , y(1 To n)
Print " i     x(i)      y(i) "
For i = 1 To n
    x(i) = i
    y(i) = 2 * i + 3    ' 2 * i * i + 3
    Print Using "###  ###.####  ###.#### "; i; x(i); y(i)
Next i
Print

' Calculate sigma x, y, xy, xx and yy
Dim As Double SumX = 0, SumY = 0, SumXY = 0, SumXX = 0, SumYY = 0
For i = 1 To n
    SumX += x(i)
    SumY += y(i)
    SumXY += x(i)*y(i)
    SumXX += x(i)*x(i)
Next i

' compute coefficients, double solves division by zero and infinity
Dim As Double c, m, dx, dy
dx = SumXX - SumX * SumX / n
dy = SumXY - SumX * SumY / n
m = dy / dx
c = (SumY / n) - (m * SumX / n)

' precompute conjugate of unity vector
Dim as double Ux, Uy, scale
scale = sqr(dx*dx+dy*dy)
Ux = dx / scale
Uy = -dy / scale

' sum of squares = perpendicular distance from best fit line
Dim As Double e, sos = 0, RMS
For i = 1 To n
    e = Ux * (y(i) - c) + Uy * x(i)   ' vector rotation by slope
    sos += e * e
Next i
RMS = sqr(sos / n) ' hence the term Root of the Mean Square

Print Using " Y intercept, c =####.######"; c
Print Using " Slope dy/dx, m =####.######"; m
Print Using " Sum of squares =######.####"; sos
Print Using "    RMS  Error  =######.####"; RMS

Sleep
duke4e
Posts: 717
Joined: Dec 04, 2005 0:16
Location: Varazdin, Croatia, Europe
Contact:

Postby duke4e » Sep 18, 2007 2:13

Will this be good for you? End result if pretty similar to KristopherWindsor 1st solution, but with a much different approach.

Code: Select all

Const radius = 30
Const numpoints = 5000

Const xres = 800
Const yres = 600
Screenres 800, 600, 32

Function randINT(first As Integer, last As Integer) As Integer
    Function = Cint(Rnd * (last - first) + first)
End Function

Type pt2d
    x As Integer
    y As Integer
    mycolor As Uinteger
End Type

Dim Shared As pt2d points(numpoints)

For i As Integer = 0 To numpoints - 1
    points(i).x = randINT(0, xres - 1)
    points(i).y = randINT(0, yres - 1)
Next


Sub MyLine(x1 As Integer, y1 As Integer, x2 As Integer, y2 As Integer)
    Dim As Integer dx2, a, dy2, b, dx1, dy1, i1, i2, i
   Dim As Integer x, y, d
   
   DX2 = 1
   A = X2 - X1
   If A < 0 Then
      A = -A
      DX2 = -1
    End If
   
   DY2 = 1
   B = Y2 - Y1
   If B < 0 Then
      B = -B
      DY2 = -1
    End If
   
   DX1 = DX2
   DY1 = 0
   
   If A < B Then
        Swap a, b
      DX1 = 0
      DY1 = DY2
    End If
   
   I1 = B * 2
   D = I1 - A
   I2 = D - A
   X = X1
   Y = Y1
   
   For I = 0 To A
        For j As Integer = 0 To numpoints - 1
            Dim As Single dx, dy, dist
            dx = x - points(j).x
            dy = y - points(j).y
            dist = dx * dx + dy * dy
            If dist < radius * radius Then points(j).mycolor = Rgb(0, 255, 0)
        Next
       
        Pset (x, y)
      If D < 0 Then
         X += DX1
         Y += DY1
         D += I1
        Else
         X += DX2
         Y += DY2
         D += I2
        End If
    Next
End Sub

Dim As Integer x, y, x1, y1, x2, y2, buttons
Do
    Getmouse x, y, , buttons
    If buttons And 1 Then
        x1 = x
        y1 = y
    End If
    If buttons And 2 Then
        x2 = x
        y2 = y
    End If
   
    Screenlock
    Cls
   
    MyLine x1, y1, x2, y2
    For i As Integer = 0 To numpoints - 1
        Pset(points(i).x, points(i).y), points(i).mycolor
        points(i).mycolor = Rgb(255, 0, 0)
    Next
   
    Screenunlock
    Sleep 50
Loop Until Inkey = Chr(27)


Left mouse click to select head of line, right click to select tail of line.
Change consts "radius" and "numpoints" to whatever suits you.

This is just a bresenhams line algorithm with addition of checking for close points for every pixel of line.
cbruce
Posts: 136
Joined: Sep 12, 2007 19:13
Location: Dallas, Texas

Does a set of points define a line? - [Solved]

Postby cbruce » Sep 23, 2007 21:35

Thank you folks so much!!!

duke4e's solution is the one that worked for every data set I've gotten so far. Really elegant (simple) once I figured out what Bresenham's algorithm was and how it worked. And the check to see if my data points simply fell in an area was way cool.

I still don't think I would have figured out what duke4e and Kristopher intended from reviewing their code. Seeing the graphics made all the difference in the world.

FreeBASIC people are great!
cbruce

Return to “General”

Who is online

Users browsing this forum: Baidu [Spider] and 8 guests