Two Plus Two Newer Archives

CallMeIshmael 10-21-2007 12:37 AM

I need a good algorithm! (Regression related)
 
I've got a bunch of data, in the form

[ a1 a2 a3 .. an a(n+1)]
[ b1 b2 b3 .. bn b(n+1)]
[ c1 c2 c3 .. cn c(n+1)]
...

I want to find the best set of n coefficients so that the sum of (coefficient*letter) for the first n terms predicts the last term. I'm not really sure what the name for this is, but it's some kind of regression.


Does anyone have any suggestion for places to look for an algorithm for this type of problem? Or some other point in the right direction?


To note: I'm looking at a lot of data here, so I'm thinking the slower stuff like MATLAB/R (despite the built-in functions for this) won't do it fast enough, and I'll be looking to do it in C.
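
For reference, the problem as described is usually called ordinary least squares (multiple linear regression without an intercept term). A sketch of the formulation follows; the symbols X, y, beta and m (number of rows) are notation assumed here, not taken from the post.

```latex
% X is the m-by-n matrix of the first n entries of every row,
% y is the m-vector of last entries, beta holds the n coefficients.
\hat{\beta}
  = \arg\min_{\beta \in \mathbb{R}^{n}}
    \sum_{i=1}^{m} \Bigl( y_i - \sum_{j=1}^{n} \beta_j x_{ij} \Bigr)^{2}
  = \arg\min_{\beta} \lVert X\beta - y \rVert_2^{2}

% Setting the gradient with respect to beta to zero gives the normal equations:
X^{\top} X \, \hat{\beta} = X^{\top} y
```

When the columns of X are linearly independent, this n-by-n system has exactly one solution, so there is a single best coefficient set no matter how many rows there are.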

theblackkeys 10-21-2007 01:45 AM

Re: I need a good algorithm! (Regression related)
 
I'm not a math major but it looks like a linear algebra problem. Which leaves me dumbfounded because I'd bet some money you've taken that before. If that's the case, I don't know how to help you, but if not, you should read up a bit on linear algebra.

Duke 10-21-2007 01:48 AM

Re: I need a good algorithm! (Regression related)
 
I don't have a good algorithm, but I'd "brute force" it with a simple neural net, and then train the hell out of it. Like one node for each of the coefficients simple. I suppose that it would be pretty tough to avoid local maximums in this sort of thing, though, so that might be a crappy solution.

Drag 10-21-2007 09:30 AM

Re: I need a good algorithm! (Regression related)
 
If n>= (number of rows) then it's a quite trivial linear algebra problem.

If n< (number of rows) it is much less trivial and requires formulating a hypothesis about your data. It also depends on the nature of your data, i.e. can some data, column 4 for instance, be completely uncorrelated with column (n+1).
Check factor analysis for specifics.
http://en.wikipedia.org/wiki/Factor_analysis

jogsxyz 10-21-2007 09:36 AM

Re: I need a good algorithm! (Regression related)
 
Looks like an analysis of variance problem. Fitting the data and an F-test.

CallMeIshmael 10-21-2007 06:42 PM

Re: I need a good algorithm! (Regression related)
 
[ QUOTE ]
If n>= (number of rows) then it's a quite trivial linear algebra problem.

If n< (number of rows) it is much less trivial and requires formulating a hypothesis about your data. It also depends on the nature of your data, i.e. can some data, column 4 for instance, be completely uncorrelated with column (n+1).
Check factor analysis for specifics.
http://en.wikipedia.org/wiki/Factor_analysis

[/ QUOTE ]


v nice. Thanks.


re bolded part: would that, in the end, matter? Obv it would be best to not have uncorrelated data for sake of computational time, but, I would assume that as long as there is a lot of data, the coefficient on uncorrelated terms --> 0, no?

CallMeIshmael 10-21-2007 06:47 PM

Re: I need a good algorithm! (Regression related)
 
[ QUOTE ]
I'm not a math major but it looks like a linear algebra problem. Which leaves me dumbfounded because I'd bet some money you've taken that before. If that's the case, I don't know how to help you, but if not, you should read up a bit on linear algebra.

[/ QUOTE ]

Just to clarify: I'm looking for ONE set of coefficients, such that the mean squared error of the approximation is minimized.

I'm assuming you thought I was looking for a coefficient set for each row. I only wish!
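
A minimal sketch in C of one way to compute that single coefficient set, under some assumptions: the data sits in a flat row-major array of m rows with n+1 doubles each, the function name fit_least_squares is hypothetical, and the 1e-12 singularity threshold is an arbitrary choice. It accumulates the normal equations in one pass over the data and solves them by Gaussian elimination with partial pivoting.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdlib.h>

/* Least-squares fit: find beta minimizing
 *   sum_i (y_i - sum_j beta[j] * x_ij)^2
 * rows: m rows of n+1 doubles (row-major); the last entry of each row is y_i.
 * Returns 0 on success, -1 on allocation failure or a (near-)singular system. */
int fit_least_squares(const double *rows, size_t m, size_t n, double *beta)
{
    assert(rows != NULL && beta != NULL && n > 0 && m >= n);
    size_t w = n + 1;                      /* row width; also width of [A | b] */
    double *a = calloc(n * w, sizeof *a);  /* augmented matrix [X^T X | X^T y] */
    if (a == NULL)
        return -1;

    /* Single pass over the data: O(m * n^2) work, O(n^2) memory. */
    for (size_t i = 0; i < m; i++) {
        const double *row = rows + i * w;
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k <= n; k++)   /* k == n picks up the target y_i */
                a[j * w + k] += row[j] * row[k];
    }

    /* Forward elimination with partial pivoting. */
    for (size_t c = 0; c < n; c++) {
        size_t p = c;
        for (size_t r = c + 1; r < n; r++)
            if (fabs(a[r * w + c]) > fabs(a[p * w + c]))
                p = r;
        if (fabs(a[p * w + c]) < 1e-12) {     /* assumed threshold */
            free(a);
            return -1;
        }
        if (p != c) {
            for (size_t k = 0; k <= n; k++) {
                double t = a[c * w + k];
                a[c * w + k] = a[p * w + k];
                a[p * w + k] = t;
            }
        }
        for (size_t r = c + 1; r < n; r++) {
            double f = a[r * w + c] / a[c * w + c];
            for (size_t k = c; k <= n; k++)
                a[r * w + k] -= f * a[c * w + k];
        }
    }

    /* Back substitution. */
    for (size_t c = n; c-- > 0; ) {
        double s = a[c * w + n];
        for (size_t k = c + 1; k < n; k++)
            s -= a[c * w + k] * beta[k];
        beta[c] = s / a[c * w + c];
    }

    free(a);
    return 0;
}
```

Because only the n-by-(n+1) accumulator is kept in memory, the rows could just as well be streamed from disk. The normal equations square the condition number of the problem, so for strongly correlated columns a QR-based routine such as LAPACK's dgels is generally the safer choice.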

CallMeIshmael 10-21-2007 06:49 PM

Re: I need a good algorithm! (Regression related)
 
[ QUOTE ]
I don't have a good algorithm, but I'd "brute force" it with a simple neural net, and then train the hell out of it. Like one node for each of the coefficients simple. I suppose that it would be pretty tough to avoid local maximums in this sort of thing, though, so that might be a crappy solution.

[/ QUOTE ]


This exact thought process of "ohh, this might be great. Ohhh, wait, local/global problem. Crap." was something I went through yesterday.
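
On the local/global worry: a single linear node trained on squared error has a convex loss, so any local minimum is also the global one. The concern applies to nets with nonlinear hidden layers, not to this case. A rough sketch in C of batch gradient descent for the one-node version, assuming the same flat layout of m rows with n+1 doubles each; the function name and parameters are hypothetical, and the learning rate has to be small enough for the scale of the data or the iteration diverges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Batch gradient descent on the mean squared error of one linear node:
 *   prediction_i = sum_j beta[j] * x_ij
 * rows: m rows of n+1 doubles (row-major); the last entry of each row is y_i.
 * beta is initialised to zero and updated in place for `steps` iterations.
 * Returns 0 on success, -1 on allocation failure. */
int fit_gradient_descent(const double *rows, size_t m, size_t n,
                         double rate, size_t steps, double *beta)
{
    assert(rows != NULL && beta != NULL && m > 0 && n > 0);
    size_t w = n + 1;
    double *grad = malloc(n * sizeof *grad);
    if (grad == NULL)
        return -1;

    for (size_t j = 0; j < n; j++)
        beta[j] = 0.0;

    for (size_t s = 0; s < steps; s++) {
        for (size_t j = 0; j < n; j++)
            grad[j] = 0.0;

        /* Accumulate sum_i (prediction_i - y_i) * x_ij over all rows. */
        for (size_t i = 0; i < m; i++) {
            const double *row = rows + i * w;
            double err = -row[n];
            for (size_t j = 0; j < n; j++)
                err += beta[j] * row[j];
            for (size_t j = 0; j < n; j++)
                grad[j] += err * row[j];
        }

        /* Step against the averaged gradient. */
        for (size_t j = 0; j < n; j++)
            beta[j] -= rate * grad[j] / (double)m;
    }

    free(grad);
    return 0;
}
```

Each iteration costs O(m * n), so this only pays off over a direct solve when n is large; for modest n, solving the least-squares system directly gives the same answer without any tuning of the rate.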

Drag 10-22-2007 05:48 AM

Re: I need a good algorithm! (Regression related)
 
[ QUOTE ]
[ QUOTE ]
If n>= (number of rows) then it's a quite trivial linear algebra problem.

If n< (number of rows) it is much less trivial and requires formulating a hypothesis about your data. It also depends on the nature of your data, i.e. can some data, column 4 for instance, be completely uncorrelated with column (n+1).
Check factor analysis for specifics.
http://en.wikipedia.org/wiki/Factor_analysis

[/ QUOTE ]


v nice. Thanks.


re bolded part: would that, in the end, matter? Obv it would be best to not have uncorrelated data for sake of computational time, but, I would assume that as long as there is a lot of data, the coefficient on uncorrelated terms --> 0, no?

[/ QUOTE ]

Yeah, in principle you'd get 0 for uncorrelated data; it can just increase the computational cost by a big factor if you try to 'brute force' it.

If I had to solve such a problem I would try to formulate a hypothesis, which would take the physics of the process into account.


All times are GMT -4.