Two Plus Two Newer Archives  

  #1  
10-21-2007, 12:37 AM
CallMeIshmael
Senior Member
Join Date: Dec 2004
Location: Tis the season, imo
Posts: 7,849
I need a good algorithm! (Regression related)

I've got a bunch of data, in the form

[ a1 a2 a3 .. an a(n+1)]
[ b1 b2 b3 .. bn b(n+1)]
[ c1 c2 c3 .. cn c(n+1)]
...

I want to find the best set of n coefficients so that the sum of (coefficient*letter) for the first n terms predicts the last term. I'm not really sure what the name for this is, but it's some kind of regression.


Does anyone have any suggestion for places to look for an algorithm for this type of problem? Or some other point in the right direction?


To note: I'm looking at a lot of data here, so I'm thinking the slower stuff like MATLAB/R (despite the built-in functions for this) won't do it fast enough, and I'll be looking to do it in C.
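
For reference, the problem described above is usually called linear least squares (multiple linear regression without an intercept). A minimal sketch in plain Python, assuming each row is a list of n predictors followed by the target; the function names `fit_least_squares` and `solve` are made up for illustration, and the same loops port to C almost line for line:

```python
def fit_least_squares(rows):
    """Return w minimizing sum((dot(w, x) - y)**2) over rows [x1, ..., xn, y]."""
    n = len(rows[0]) - 1
    # Build the normal equations A w = b, where A = X^T X and b = X^T y.
    # Only the n-by-n accumulators grow with n; each row is visited once.
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for row in rows:
        x, y = row[:n], row[n]
        for i in range(n):
            b[i] += x[i] * y
            for j in range(n):
                A[i][j] += x[i] * x[j]
    return solve(A, b)


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]  # augmented matrix [A | b]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("columns are linearly dependent; no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution on the upper-triangular system.
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w
```

The normal equations can lose accuracy when columns are nearly dependent; a QR-based solver such as LAPACK's `dgels` is the more robust choice if that turns out to matter.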

  #2  
10-21-2007, 01:45 AM
theblackkeys
Senior Member
Join Date: Sep 2006
Location: DIDS minus 21 pounds of fatness
Posts: 1,260

I'm not a math major but it looks like a linear algebra problem. Which leaves me dumbfounded because I'd bet some money you've taken that before. If that's the case, I don't know how to help you, but if not, you should read up a bit on linear algebra.

  #3  
10-21-2007, 01:48 AM
Duke
Senior Member
Join Date: Sep 2002
Location: SW US
Posts: 5,853

I don't have a good algorithm, but I'd "brute force" it with a simple neural net, and then train the hell out of it. Like one node for each of the coefficients simple. I suppose that it would be pretty tough to avoid local maximums in this sort of thing, though, so that might be a crappy solution.
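
A note on the local-optimum worry: a single linear node trained on squared error is ordinary linear regression, and that error surface is convex, so any local minimum is also a global one. A rough sketch of the "train it" approach as plain gradient descent, assuming rows of n predictors followed by the target; `fit_by_gradient_descent`, `rate`, and `epochs` are illustrative names, and the step size has to be tuned to the scale of the data:

```python
def fit_by_gradient_descent(rows, rate=0.01, epochs=2000):
    """Minimize the mean squared error of a linear model by gradient descent."""
    n = len(rows[0]) - 1
    m = len(rows)
    w = [0.0] * n
    for _ in range(epochs):
        grad = [0.0] * n
        for row in rows:
            x, y = row[:n], row[n]
            # Prediction error for this row under the current weights.
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j in range(n):
                grad[j] += 2.0 * err * x[j] / m
        # Step against the gradient of the mean squared error.
        w = [wj - rate * gj for wj, gj in zip(w, grad)]
    return w
```

If `rate` is too large the iteration diverges, and if it is too small it crawls, which is the main practical reason to prefer a direct solve when n is modest.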

  #4  
10-21-2007, 09:30 AM
Drag
Senior Member
Join Date: Oct 2006
Location: France
Posts: 117

If n >= (number of rows) then it's quite a trivial linear algebra problem.

If n < (number of rows) it is much less trivial and requires formulating a hypothesis about your data. It also depends on the nature of your data, e.g. whether some data, column 4 for instance, can be completely uncorrelated with column (n+1).
Check factor analysis for specifics.
http://en.wikipedia.org/wiki/Factor_analysis

  #5  
10-21-2007, 09:36 AM
jogsxyz
Senior Member
Join Date: Mar 2005
Posts: 1,167

Looks like an analysis of variance problem: fit the data, then run an F-test.

  #6  
10-21-2007, 06:42 PM
CallMeIshmael
Senior Member
Join Date: Dec 2004
Location: Tis the season, imo
Posts: 7,849

[ QUOTE ]
If n >= (number of rows) then it's quite a trivial linear algebra problem.

If n < (number of rows) it is much less trivial and requires formulating a hypothesis about your data. It also depends on the nature of your data, e.g. whether some data, column 4 for instance, can be completely uncorrelated with column (n+1).
Check factor analysis for specifics.
http://en.wikipedia.org/wiki/Factor_analysis

[/ QUOTE ]


v nice. Thanks.


re bolded part: would that, in the end, matter? Obv it would be best to not have uncorrelated data for sake of computational time, but, I would assume that as long as there is a lot of data, the coefficient on uncorrelated terms --> 0, no?
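
A quick way to check that intuition is a small simulation. The sketch below is a toy setup, not the real data: the target depends only on the first column, the second column is independent noise, and the 2x2 least-squares system is solved in closed form. The names `fit_two_columns` and `simulate`, the noise level, and the seed are all assumptions for illustration.

```python
import random


def fit_two_columns(rows):
    """Least squares for rows (x1, x2, y) via the 2x2 normal equations."""
    s11 = s12 = s22 = t1 = t2 = 0.0
    for x1, x2, y in rows:
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        t1 += x1 * y
        t2 += x2 * y
    # Cramer's rule on [[s11, s12], [s12, s22]] w = [t1, t2].
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


def simulate(num_rows, seed=0):
    """Fit y = 2*x1 + noise with an extra column x2 that y does not depend on."""
    rng = random.Random(seed)
    rows = []
    for _ in range(num_rows):
        x1 = rng.uniform(-1.0, 1.0)
        x2 = rng.uniform(-1.0, 1.0)  # unrelated to y by construction
        y = 2.0 * x1 + rng.gauss(0.0, 0.5)
        rows.append((x1, x2, y))
    return fit_two_columns(rows)


if __name__ == "__main__":
    for num_rows in (100, 10000, 1000000):
        print(num_rows, simulate(num_rows))
```

The fitted coefficient on the unrelated column is not exactly zero for any finite sample; its typical size shrinks roughly like 1/sqrt(number of rows).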

  #7  
10-21-2007, 06:47 PM
CallMeIshmael
Senior Member
Join Date: Dec 2004
Location: Tis the season, imo
Posts: 7,849

[ QUOTE ]
I'm not a math major but it looks like a linear algebra problem. Which leaves me dumbfounded because I'd bet some money you've taken that before. If that's the case, I don't know how to help you, but if not, you should read up a bit on linear algebra.

[/ QUOTE ]

Just to clarify: I'm looking for ONE set of coefficients, such that the mean squared error of the approximation is minimized.

I'm assuming you thought I was looking for a coefficient set for each row. I only wish!
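
Written out, with notation that is assumed here rather than taken from the thread (m rows, x_{ij} the j-th of the first n entries in row i, y_i the last entry, w the single coefficient set), the objective and its stationarity condition look like this:

```latex
\begin{align}
  E(w) &= \sum_{i=1}^{m} \Bigl( y_i - \sum_{j=1}^{n} w_j x_{ij} \Bigr)^{2} \\
  \frac{\partial E}{\partial w_k}
       &= -2 \sum_{i=1}^{m} x_{ik} \Bigl( y_i - \sum_{j=1}^{n} w_j x_{ij} \Bigr) = 0,
       \qquad k = 1, \dots, n \\
  \Longrightarrow \quad X^{\mathsf{T}} X \, w &= X^{\mathsf{T}} y
\end{align}
```

Dividing E(w) by m gives the mean squared error and leaves the minimizing w unchanged. The last line is a system of n linear equations in n unknowns no matter how many rows there are.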

  #8  
10-21-2007, 06:49 PM
CallMeIshmael
Senior Member
Join Date: Dec 2004
Location: Tis the season, imo
Posts: 7,849

[ QUOTE ]
I don't have a good algorithm, but I'd "brute force" it with a simple neural net, and then train the hell out of it. Like one node for each of the coefficients simple. I suppose that it would be pretty tough to avoid local maximums in this sort of thing, though, so that might be a crappy solution.

[/ QUOTE ]


This exact thought process of "ohh, this might be great. Ohhh, wait, local/global problem. Crap." was something I went through yesterday.

  #9  
10-22-2007, 05:48 AM
Drag
Senior Member
Join Date: Oct 2006
Location: France
Posts: 117

[ QUOTE ]
[ QUOTE ]
If n >= (number of rows) then it's quite a trivial linear algebra problem.

If n < (number of rows) it is much less trivial and requires formulating a hypothesis about your data. It also depends on the nature of your data, e.g. whether some data, column 4 for instance, can be completely uncorrelated with column (n+1).
Check factor analysis for specifics.
http://en.wikipedia.org/wiki/Factor_analysis

[/ QUOTE ]


v nice. Thanks.


re bolded part: would that, in the end, matter? Obv it would be best to not have uncorrelated data for sake of computational time, but, I would assume that as long as there is a lot of data, the coefficient on uncorrelated terms --> 0, no?

[/ QUOTE ]

Yeah, in principle you'd get 0 for uncorrelated data; it can just increase the computational cost by a big factor if you try to 'brute force' it.

If I had to solve such a problem I would try to formulate a hypothesis, which would take the physics of the process into account.