Two Plus Two Newer Archives - Market Model Thingy

CallMeIshmael

10-31-2007 11:45 PM

Market Model Thingy

As a preface, I'll note that I have very little knowledge of the market, or anything to do with finance. So some things I've implicitly assumed could be very silly in the eyes of some of you. I've done a lot of biological modelling, and thought I'd take a stab at a market model.

I pulled data from Yahoo for the NYSE over the past 7.5 years. The sample size ended up being 1747 stocks. Stocks with fewer than 2050 trading days, or ones with a dash in the name (they messed with my download script, and I figured leaving them out wasn't a big deal), weren't included.

I used a 41-term model of various stats regarding past price and volume for the stocks over the previous 50 trading days. The equation was designed to predict the ratio of the next day's stock price to its current price. The data was cut into pieces for training purposes, i.e. remove days 2000->1800, then regress for coefficients using the remaining data, then use those coefficients to predict market changes over days 2000->1800. Repeat until done.
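The hold-one-block-out procedure described above can be sketched roughly as follows. This is only an illustration in Python, not the code used for the results in this thread; the function name `holdout_blocks` and the 200-day block size (read off the "2000->1800" example) are assumptions.

```python
def holdout_blocks(n_days, block_size):
    """Yield (test_days, train_days) pairs, one per held-out block.

    Days are numbered 1..n_days. Each block of `block_size` consecutive
    days is removed in turn and used only for testing; the regression
    would be fitted on the remaining days.
    """
    all_days = list(range(1, n_days + 1))
    for start in range(1, n_days + 1, block_size):
        test = list(range(start, min(start + block_size, n_days + 1)))
        held_out = set(test)
        train = [d for d in all_days if d not in held_out]
        yield test, train


# Hypothetical setup: 2000 trading days cut into 200-day blocks.
folds = list(holdout_blocks(n_days=2000, block_size=200))
```

Each fold gives a list of days to predict and a disjoint list of days to fit on, so no test day is ever used to estimate the coefficients that predict it.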

I tested it in two ways. 1) Examine the results of the top 5 picks for each trading day 2) For each trading day, predict the 100 stocks with the highest percentage increase and then see how many match the actual top 100 performers.

1) Showed promising results. The arithmetic mean for the daily returns was 1.0060 (not sure if this is how it's said, but, on average, the stocks multiplied themselves by 1.0060), and the geometric mean of the daily returns was 1.0051. To compare: the arithmetic mean return for the entire data set was 1.0010. Calculating the geometric mean for the entire data set would be hard, but it's going to be a ways under 1.0051.

2) The test showed an average of 13.11 of the 100 picks landing among the actual top 100 performers. This is significantly greater than the random expectation of 5.72.
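For readers who want to check the two statistics, here is a small Python sketch. The function names and the sample ratios are made up for illustration. The random-overlap formula assumes picks are drawn uniformly from the whole universe, which reproduces the 5.72 figure for 1747 stocks (100 × 100 / 1747).

```python
import math


def arithmetic_mean(ratios):
    return sum(ratios) / len(ratios)


def geometric_mean(ratios):
    # Average the logs, then exponentiate; this is the per-day growth
    # factor that compounds to the same total as the actual series.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def expected_random_overlap(n_picks, n_top, n_stocks):
    # Each random pick lands in the top group with probability n_top / n_stocks.
    return n_picks * n_top / n_stocks


ratios = [1.02, 0.99, 1.01, 0.97, 1.04]  # made-up daily price ratios
print(arithmetic_mean(ratios), geometric_mean(ratios))
print(expected_random_overlap(100, 100, 1747))
```

The geometric mean is never above the arithmetic mean, which is why 1.0051 sits below 1.0060 for the same set of daily returns.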

For something that took only maybe 10 days to put together, and a model that took all of 10 minutes to think of, these results seem surprisingly good. Given the debate over the EMH, they seem too good to be true. But I've checked the code several times over, and if I made a mistake, I can't find it.

I haven't tested the obvious questions: how much would it make / could it beat a buy-and-hold strategy? The reason being, I don't know how to compute transaction costs. How much are they? Do they go up by # of stocks, or is it just a constant cost per trade? Do you pay when you buy and sell, or just when you buy? If someone could help me out with those, that would be appreciated. Also, at what level of investment is the assumption that the stocks would have behaved similarly enough to the way they did without that investment no longer valid?

Also, this has sort of sparked an interest in doing a project like this, but actually putting some time into the model. Can anyone recommend some good reading?

Jimbo

11-01-2007 12:01 AM

Re: Market Model Thingy

Go back 7 years and see that your results change radically.

Jimbo

CallMeIshmael

11-01-2007 04:15 AM

Re: Market Model Thingy

[ QUOTE ]
Go back 7 years and see that your results change radically.

Jimbo

[/ QUOTE ]

Hmm. I'm not sure if there is some hidden meaning here (i.e. referencing some market trend that I'm unaware of), so I'll take it at face value.

I did the calculations for stocks with > 4050 trading days, analyzing days 4000->2000 (ie. roughly spanning 7.5 to 15 years ago).

Although it wasn't quite as successful, it still outperformed the market by quite a bit (geometric mean of 1.0026).

Also, it got 15.97 right, on average, for the top 100 picks of the day, which is better than the 10.3 you'd expect at random.

Also, the model was trained on days 2000->1, meaning that in some cases the model was using training data from 15 years after the test day. Arguably, part of the difference could be the result of that.

hawk59

11-01-2007 09:24 AM

Re: Market Model Thingy

When you have a lot of data and you try to do things with it you will find lots of relationships, most of which have no predictive ability. It's called data mining. I think a lot of the time you just have to ask yourself if it intuitively makes sense, ie you can say a low price/book strategy makes sense to outperform or buying tax loss stocks at the end of the year will outperform in the next year. But when you have 41 variables all mashed up and you think you have something meaningful then that wouldn't make sense to me.

Phone Booth

11-01-2007 09:51 AM

Re: Market Model Thingy

What kind of assumptions are you using for execution prices? This is really the key for this type of analysis - a lot of great relationships aren't tradable.

I know people will give you a hard time here but all these qualitative insights and patterns people write about and trade on have a much flimsier basis (this stock always does this at time X; when fed cuts, this happens; these sectors are correlated in this way, etc, etc).

Jimbo

11-01-2007 11:33 AM

Re: Market Model Thingy

I find your analysis quite interesting but can't quite tell exactly what you mean by the bolded portion in the quote below. In other words, what are you calling "the market"?

[ QUOTE ]
Although it wasn't quite as successful, it still outperformed the market by quite a bit (geometric mean of 1.0026).

[/ QUOTE ]

Thanks in advance for the interesting topic,

Jimbo

CallMeIshmael

11-01-2007 01:46 PM

Re: Market Model Thingy

[ QUOTE ]
When you have a lot of data and you try to do things with it you will find lots of relationships, most of which have no predictive ability. It's called data mining. I think a lot of the time you just have to ask yourself if it intuitively makes sense, ie you can say a low price/book strategy makes sense to outperform or buying tax loss stocks at the end of the year will outperform in the next year. But when you have 41 variables all mashed up and you think you have something meaningful then that wouldn't make sense to me.

[/ QUOTE ]

Obviously predictive ability is the key here; I certainly agree with that. Regressing data means nothing if what is about to happen isn't related to what has already happened.

But the model produced guesses for the top 100 movers for each day over 15 years that were 19.96 and 26.30 standard deviations (one figure for each 2000-day period) above what you'd expect if the model had no predictive ability. (Technically speaking it's not binomial, so this isn't 100% correct, but it's close, and doing it right won't change the conclusion.)

Also, at least to someone without much knowledge of the market, a model like this does make intuitive sense. How a stock is doing today relative to its recent past prices, its max/min over the previous 2 months, and how much volume it's been trading over the past few days all seem like they should have a small correlation with how it will do today.

(just to note: 41 = 10 variables * 4 transformations (x,x^2,ln(x),log10(x)) + 1 constant)
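To make the term count concrete, here is a rough Python sketch of how one row of such a 41-column regression input could be built. The function name `design_row` and the placeholder inputs are assumptions; the actual 10 variables are not listed in the thread.

```python
import math


def design_row(values):
    """Expand raw predictor values into [1, x, x^2, ln x, log10 x, ...]."""
    row = [1.0]  # constant term
    for x in values:
        if x <= 0:
            raise ValueError("ln and log10 need strictly positive inputs")
        row.extend([x, x * x, math.log(x), math.log10(x)])
    return row


row = design_row([1.5] * 10)  # ten placeholder predictor values
print(len(row))
```

One thing worth checking in a setup like this: ln(x) is always ln(10) times log10(x), so those two columns are exact multiples of each other. A least-squares fit cannot tell them apart, which would leave 31 independent terms rather than 41.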

CallMeIshmael

11-01-2007 01:59 PM

Re: Market Model Thingy

Phone,

"What kind of assumptions are you using for execution prices?"

Sorry, what is an execution price?

Jimbo,

I ran three tests to see how the model compared to market fluctuation.

Imagine we have some amount of money, and opt to buy the top 5 picks of the model every day. Assume we use all of our money (i.e. just for the sake of testing, assume you can buy fractions of stocks), and always buy at open and sell at close. Compare this to:

1) (Amount of money we started with) * (Total Market Value of Stocks at End) / (Total Market Value of Stocks at Start)

(ie. How much the entire test market went up)

2) Randomly buy 5 stocks on day 1, and hold them for 2000 days. (Do this test 1 million times)

3) Use the strategy of buying/selling 5 each day, but do so at random. (Do this test 1 million times)

By "outperforming the market" I mean that the model did better than the market in test 1, and landed in a very high percentile of the 1 million results for tests 2 and 3. Basically, I want to make sure any increase seen isn't just the result of the average stock price going up.
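A rough Python sketch of benchmarks 1 and 2 and the percentile comparison, under simplifying assumptions: each stock is summarized by a single end/start price ratio, random portfolios are equal-weighted, and all names and numbers here are hypothetical rather than taken from the actual test.

```python
import random


def market_ratio(start_values, end_values):
    """Benchmark 1: how much the whole test market went up."""
    return sum(end_values) / sum(start_values)


def random_buy_and_hold(total_ratios, rng, n_stocks=5, n_trials=10_000):
    """Benchmark 2: final-wealth multiples of random equal-weight portfolios."""
    results = []
    for _ in range(n_trials):
        picks = rng.sample(total_ratios, n_stocks)
        results.append(sum(picks) / n_stocks)
    return results


def percentile_rank(value, sample):
    """Fraction of simulated results that the value matches or beats."""
    return sum(s <= value for s in sample) / len(sample)


rng = random.Random(0)
fake_ratios = [rng.uniform(0.5, 3.0) for _ in range(200)]  # placeholder data
baseline = random_buy_and_hold(fake_ratios, rng)
print(percentile_rank(2.5, baseline))
```

Benchmark 3 (random daily picks) would follow the same pattern, with the random draw repeated every trading day instead of once at the start.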

The model produced results that were WAY WAY above average, but given I didn't take transaction costs into account, the utility of the model is still unknown.

haakee

11-01-2007 02:07 PM

Re: Market Model Thingy

[ QUOTE ]
I find your analysis quite interesting but can't quite tell exactly what you mean by the bolded portion in the quote below. In other words, what are you calling "the market"?

[ QUOTE ]

Although it wasn't quite as successful, it still outperformed the market by quite a bit (geometric mean of 1.0026).

[/ QUOTE ]

[/ QUOTE ]

Does that really matter? Excluding commissions 1.0026/day is about 85%/year assuming 240 trading days.

You'd need pretty big individual positions if you're trading every day through a discount online broker. At $10/trade if you just picked one of your 100 per day you'd eat up close to $5000 in commissions in a year. If your positions were $50K each that would still be 10% of your annual return.
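The arithmetic behind these figures can be sketched in Python. The function names are made up, and the trade count is an assumption: 240 days × 2 commissions per day (one to buy, one to sell) × $10 = $4800, which is one way to arrive at "close to $5000".

```python
def annualized_return(daily_ratio, trading_days=240):
    # Compound the daily growth factor over a trading year.
    return daily_ratio ** trading_days - 1.0


def commission_share(position, daily_ratio, fee_per_trade, trades_per_year,
                     trading_days=240):
    """Commissions as a fraction of one year's gross profit on a position."""
    gross_profit = position * annualized_return(daily_ratio, trading_days)
    return fee_per_trade * trades_per_year / gross_profit


print(annualized_return(1.0026))                      # roughly 0.86
print(commission_share(50_000, 1.0026, 10.0, 480))    # roughly 0.11
```

Under these assumptions the commission drag on a $50K position comes out near the 10% mentioned above, and it scales inversely with position size.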

Phone Booth

11-01-2007 02:28 PM

Re: Market Model Thingy

[ QUOTE ]
Phone,

"What kind of assumptions are you using for execution prices?"

Sorry, what is an execution price?

[/ QUOTE ]

What prices are you buying and selling at?

Jimbo

11-01-2007 02:44 PM

Re: Market Model Thingy

[ QUOTE ]
(ie. How much the entire test market went up)

[/ QUOTE ]

OK, got you (I think). It might be more useful to compare it to, say, how much the S&P 500 or the DJIA went up over the same timeframe. After all, no one buys stocks randomly, at least I hope not.

Jimbo

In case any of you random buyers are following this thread please buy as much Garmin today as you are able to do so, at random intervals of course. :)

CallMeIshmael

11-01-2007 03:35 PM

Re: Market Model Thingy

[ QUOTE ]
[ QUOTE ]
Phone,

"What kind of assumptions are you using for execution prices?"

Sorry, what is an execution price?

[/ QUOTE ]

What prices are you buying and selling at?

[/ QUOTE ]

Well, I never really had a specific trading rule in mind for this model. I was more interested if these sorts of predictions were even possible. I'm almost certainly going to put some real effort into a much better model, using intraday data. There, I'll be very interested in maximizing return, and producing a solid trading rule.

Given that it seems haakee implied $10/trade is a reasonable transaction fee (and assuming a trade = buying AND selling, and we don't pay for both), I'll probably compute some returns using a strategy of buying at opening price and selling at closing price for each day.

How reasonable is the assumption you can buy/sell at open/close prices?

Yobz

11-01-2007 03:39 PM

Re: Market Model Thingy

CMI: I would also try your model by selling the stock at the opening price on the next day. Lots of companies report earnings overnight and you might be losing lots of EV this way.

Note: I'm also fairly uneducated about stocks, so I might be completely wrong. :-)

CallMeIshmael

11-01-2007 05:06 PM

Re: Market Model Thingy

[ QUOTE ]
CMI: I would also try your model by selling the stock at the opening price on the next day. Lots of companies report earnings overnight and you might be losing lots of EV this way.

[/ QUOTE ]

Ohh wow. I didn't even know they were different. I just assumed each day's closing price was the next day's opening price. So all calculations made were actually assuming you would sell your current day's stocks at close, and buy your next day's stocks also at close.

Anyway, given a $10/trade fee, I did the following simulation:

Start 10,000 accounts at $10,000 each. Get the top 15 picks for each day, then have each account randomly select 3 of those 15 picks for the day, spending 1/3 of their current worth on each of the stocks. Then, sell at close. Repeat for the 1999 days of testing. (subtract 30 dollars from the account each day)

In the end, there was about a 3.8% chance of being down, and about a 3.5% risk of ruin.
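A simulation along these lines could look roughly like the Python sketch below. It is not the code behind the numbers above: the data is random placeholder input, the function name is made up, and treating a balance at or below zero as ruin is an assumption.

```python
import random


def simulate_account(daily_top_returns, rng, start=10_000.0, daily_fee=30.0,
                     n_held=3):
    """Return the daily balance path for one account.

    daily_top_returns: one list per day holding the buy-to-sell price
    ratio of each of that day's top picks.
    """
    balance = start
    path = []
    for top_returns in daily_top_returns:
        held = rng.sample(top_returns, n_held)
        # Equal split across the held picks, then pay the day's commissions.
        balance = balance * sum(held) / n_held - daily_fee
        if balance <= 0:  # ruin: the account stops trading
            path.append(0.0)
            break
        path.append(balance)
    return path


rng = random.Random(0)
# Placeholder data: 250 days, 15 picks per day, ratios drawn near 1.0.
fake_days = [[rng.gauss(1.002, 0.02) for _ in range(15)] for _ in range(250)]
finals = [simulate_account(fake_days, rng)[-1] for _ in range(1000)]
share_down = sum(f < 10_000.0 for f in finals) / len(finals)
print(share_down)
```

Percentile curves like the ones graphed in this post would come from keeping every account's full path and sorting the balances day by day.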

These are graphs for the 25th/50th/75th percentiles for each day. I zoomed in on the first 600 days of the test in graph 2.

http://i73.photobucket.com/albums/i2...002/stock1.jpg

http://i73.photobucket.com/albums/i2...002/stock2.jpg

Obviously, these results are encouraging / seem too good to be true. I've looked through the code again, and can't find any big mistakes. I even ran the simulation using coefficients that were trained on the same days as the ones they were tested on, to make sure I didn't accidentally do that, and I got the expected blowup in results.

Also, I'd imagine the inherent assumption that you can invest large sums in a stock and still have it behave the same way it would have without that investment is pretty flawed. Even with that considered, the results still seem pretty encouraging.

eastbay

11-02-2007 01:44 AM

Re: Market Model Thingy

Lots of traps here. No guarantees you're not onto something, but not likely either. "If it was that easy everybody would do it", and all that.

Slippage and commission are important, and can turn smooth get-rich-quick strategies into certain losers. Looks like you're at least starting to account for commission. Don't forget slippage.

Inadvertent lookahead is an easy mistake to make. You can set up your code to help protect against it by only feeding data from a mock data source, rather than having it all in memory and just hoping you don't make a mistake in referencing the future.
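One way to set up that kind of guard, sketched in Python. The class name `DataFeed` and its methods are hypothetical; the point is only that the strategy code never holds the full price array, so a reference to the future raises an error instead of silently inflating results.

```python
class DataFeed:
    """Serves price history only up to the current simulated day."""

    def __init__(self, closes):
        self._closes = list(closes)
        self._today = 0  # index of the latest day the strategy may see

    def advance(self):
        """Move the simulated clock forward by one day."""
        if self._today + 1 >= len(self._closes):
            raise IndexError("no more data")
        self._today += 1

    def close(self, day):
        # Negative indices are rejected too, since they would wrap
        # around to the end of the list, i.e. the future.
        if day < 0 or day > self._today:
            raise LookupError(f"day {day} is not visible (today is {self._today})")
        return self._closes[day]

    def history(self, lookback):
        """Closes for the last `lookback` days, ending today."""
        start = max(0, self._today - lookback + 1)
        return self._closes[start:self._today + 1]


feed = DataFeed([10.0, 11.0, 12.0])
feed.advance()
print(feed.history(50))  # only days 0 and 1 are visible
```

The backtest loop would call `advance()` once per day and compute every feature from `history()`, so any lookahead bug shows up as an exception.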

You can't train and test on overlapping data. I couldn't quite tell if you were doing that from your description, but such tests are worthless for obvious implementation reasons.

If you're sure you're not doing any of the above, well, trade it by hand for a while and see how it goes.

eastbay

CallMeIshmael

11-02-2007 04:08 AM

Re: Market Model Thingy

eastbay,

Nice post. I would say I'm reasonably sure there wasn't any look-forward/overlap. I was pretty paranoid about this, and even reran the results with coefficients that I thought had overlap to observe that the results obtained then were a ton better (they were). Also, it was just about impossible for overlap to occur on the second test (the second 7-year period), since the testing data needs to go through a semi-longish transformation that I never did on that data set (I just trained using the more recent 7 years), and that showed moderately comparable results.

Evan and I talked about ask/bid spread, which appears to be the same as slippage, yes? I had no idea what this was until a few hours ago, and it appears to be something that would be very important to incorporate. Assuming that a stock has a listed closing price of X, is there a reasonable function to get an estimate of the price I would be expected to sell at, assuming I sell at close? I'm learning that it is slightly smaller than X, but I'm wondering if it's possible to estimate it quantitatively.

"If you're sure you're not doing any of the above, well, trade it by hand for a while and see how it goes."

Just to note: I'm more interested in WHETHER it's possible to use past information to predict the future, not necessarily (at least here) in trading.

Also, just as a general comment to all: I don't really know how to explain it, but you often see people on the forums with little poker knowledge arguing against the common beliefs, and you can't help but think that they just don't understand the game. I'm well aware that's what I'm doing here, and I'd like to stress that I'm skeptical of the results myself. Given the debate over the EMH, it's very odd that the model could predict the top 100 performers for the next day so well; it would stand to reason that if the model were accurate, the EMH ought to be discarded, yet it hasn't been. But, again, I just couldn't find the error. If anyone knows MATLAB well, I'd be happy to comment/clean up the code and have them audit it.

edtost

11-02-2007 08:28 AM

Re: Market Model Thingy

[ QUOTE ]

Evan and I talked about ask/bid spread, which appears to be the same as slippage, yes? I had no idea what this was until a few hours ago, and it appears to be something that would be very important to incorporate. Assuming that a stock has a listed closing price of X, is there a reasonable function to get an estimate of the price I would be expected to sell at, assuming I sell at close? I'm learning that it is slightly smaller than X, but I'm wondering if it's possible to estimate it quantitatively.

[/ QUOTE ]

slippage and bid/ask are totally different. bid/ask is the spread between where you can buy and sell a block of shares at any given point in time. slippage is how much the market moves against you while you are executing. for very small traders, fixed commissions tend to be their biggest concern. once your account gets bigger, bid/ask becomes the dominant 'cost' to your trading. for institutional-sized accounts, slippage winds up having the largest effect.

basically, until you wouldn't be comfortable sending your trade to the exchange as a single order because it probably won't get filled, slippage isn't really something you need to worry much about. at least not until something bad happens and market liquidity dries up.

i'm sure someone who knows more than i do about single-stock trading can quote you an average bid/ask for large caps.

[ QUOTE ]
Given the debate over the EMH, it's very odd that the model could predict the top 100 performers for the next day so well; it would stand to reason that if the model were accurate, the EMH ought to be discarded, yet it hasn't been. But, again, I just couldn't find the error. If anyone knows MATLAB well, I'd be happy to comment/clean up the code and have them audit it.

[/ QUOTE ]

EMH only needs to be discarded if your system is actually implementable in a way that would make money. trading at the same price (the close) you use to make a decision is a good way to make your system unrealistic. start by modifying your program to include a one-day lag for trading and a linear transaction cost for bid/ask; if your system is still profitable after those changes, then this becomes a much more interesting discussion.

edit: also, to nitpick, when you do your out of sample testing, you should train your coefficients on the older data set and look at the results on the newer set.

eastbay

11-02-2007 11:14 AM

Re: Market Model Thingy

[ QUOTE ]

Evan and I talked about ask/bid spread, which appears to be the same as slippage, yes?

[/ QUOTE ]

No, slippage is the difference between your data feed's quote and your fill. In general they will not be the same.

[ QUOTE ]

Just to note: I'm more interested in WHETHER it's possible to use past information to predict the future, not necessarily (at least here) in trading.

[/ QUOTE ]

Then put your pen down, because the answer is yes. It has been demonstrated many times. Google Jim Simons, for example.

The EMH is approximate, just like every nice neat academic theory of any complex system.

eastbay

CallMeIshmael

11-02-2007 02:03 PM

Re: Market Model Thingy

[ QUOTE ]
trading at the same price (the close) you use to make a decision is a good way to make your system unrealistic. start by modifying your program to include a one-day lag for trading and a linear transaction cost for bid/ask; if your system is still profitable after those changes, then this becomes a much more interesting discussion.

[/ QUOTE ]

OK. Any suggestions as to how much of a linear cost?

Also, what exactly do you mean by a 1-day lag? Is using the next day's opening price a more realistic sell price?

edtost

11-02-2007 06:48 PM

Re: Market Model Thingy

[ QUOTE ]
OK. Any suggestions as to how much of a linear cost?

Also, what exactly do you mean by a 1-day lag? Is using the next day's opening price a more realistic sell price?

[/ QUOTE ]

25 bps? 50 bps? can someone who knows anything step in with an average bid/ask spread for equities, instead of me almost randomly guessing?

there needs to be some sort of lag between calculating your positions and trading on them. calculating based on the close on day t and trading at the open of day t+1 would be a .5 day lag, which is better than nothing. a one day lag would mean trading at the closing price on day t+1, and would probably be more realistic for someone trading daily. best would probably be to average the open and close of day t+1, implicitly assuming that you used the entire day's liquidity to make your trade.
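The three lag conventions and the linear bid/ask cost can be written down as a short Python sketch. The function names, mode labels, and sample prices are made up for illustration, and the cost in basis points is a free parameter rather than an endorsed figure.

```python
def fill_price(opens, closes, t, mode):
    """Price assumed for a trade decided on the close of day t."""
    nxt = t + 1
    if mode == "half_day":     # trade at the next open
        return opens[nxt]
    if mode == "one_day":      # trade at the next close
        return closes[nxt]
    if mode == "day_average":  # spread the trade over the next session
        return (opens[nxt] + closes[nxt]) / 2.0
    raise ValueError(f"unknown mode: {mode}")


def after_spread(price, side, cost_bps):
    """Apply a linear cost: buyers pay more, sellers receive less."""
    adj = cost_bps / 10_000.0
    return price * (1.0 + adj) if side == "buy" else price * (1.0 - adj)


opens = [10.0, 11.0, 12.0]
closes = [10.5, 11.5, 12.5]
print(fill_price(opens, closes, 0, "day_average"))  # 11.25
print(after_spread(100.0, "sell", 50))              # 50 bps off a sale
```

A 50 bps cost on a sale is the same thing as multiplying the sale price by 0.995.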

CallMeIshmael

11-02-2007 08:43 PM

Re: Market Model Thingy

[ QUOTE ]
25 bps? 50 bps? can someone who knows anything step in with an average bid/ask spread for equities, instead of me almost randomly guessing?

there needs to be some sort of lag between calculating your positions and trading on them. calculating based on the close on day t and trading at the open of day t+1 would be a .5 day lag, which is better than nothing. a one day lag would mean trading at the closing price on day t+1, and would probably be more realistic for someone trading daily. best would probably be to average the open and close of day t+1, implicitly assuming that you used the entire day's liquidity to make your trade.

[/ QUOTE ]

I added to the model a discount on all sales by a factor of 0.995, and used the next day's closing price as the sell price (I didn't have immediate access to the open price, and it would require a bit of leg work, so I just went with the next day's close for now).

The results were still good, though obviously lower given the 0.995 discount.

Oddly, it appears that using the next day's close price actually benefits the model.

Phone Booth

11-03-2007 11:15 AM

Re: Market Model Thingy

[ QUOTE ]
I added to the model a discount on all sales by a factor of 0.995, and used the next day's closing price as the sell price (I didn't have immediate access to the open price, and it would require a bit of leg work, so I just went with the next day's close for now).

The results were still good, though obviously lower given the 0.995 discount.

Oddly, it appears that using the next day's close price actually benefits the model.

[/ QUOTE ]

For just sell? I think what he's saying is that if you're trading based on knowledge gathered during day 1 and day n, you need to compute returns between day n+1 and day n+2 instead of day n and day n+1 or even day n and day n+2. You should definitely not do the latter (n and n+2) if you're not normalizing the return.

edtost

11-03-2007 03:31 PM

Re: Market Model Thingy

[ QUOTE ]
For just sell? I think what he's saying is that if you're trading based on knowledge gathered during day 1 and day n, you need to compute returns between day n+1 and day n+2 instead of day n and day n+1 or even day n and day n+2. You should definitely not do the latter (n and n+2) if you're not normalizing the return.

[/ QUOTE ]

I'm not sure which of the above CMI is referring to, but I agree with this.

CallMeIshmael

11-03-2007 10:03 PM

Re: Market Model Thingy

[ QUOTE ]
[ QUOTE ]
For just sell? I think what he's saying is that if you're trading based on knowledge gathered during day 1 and day n, you need to compute returns between day n+1 and day n+2 instead of day n and day n+1 or even day n and day n+2. You should definitely not do the latter (n and n+2) if you're not normalizing the return.

[/ QUOTE ]

I'm not sure which of the above CMI is referring to, but I agree with this.

[/ QUOTE ]

Yeah, I did a return of (n+2)/n, which did seem a bit odd. I misunderstood what you were saying (obviously at this point it's pretty clear that just about any assumption regarding some background knowledge would be too much!). I redid it with (n+2)/(n+1). Again, it went down, but still with results that appear to be above the upward trend of the test market over the time period. (Though all of the additions I've made make me a lot less certain that there isn't some error happening along the way.)

Assuming that the test stat is now (n+2)/(n+1) instead of (n+1)/n, I *think* the data set could be retrained, with the objective stat changed from (n+1)/n to (n+2)/(n+1) (i.e. change the test from "given x,y,z on day n, what is an estimate for how the stock will change by close tomorrow" to "given this info, how will the stock change from close on n+1 to n+2"). Is this correct, or is it making an assumption I ought not to make?
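The three return definitions discussed here differ only in which closes they divide. A minimal Python sketch, with made-up function names and prices, and with the 0.995 sale discount from earlier in the thread as an optional parameter:

```python
def same_day_return(closes, n):
    """Original target: close of day n to close of day n+1."""
    return closes[n + 1] / closes[n]


def lagged_return(closes, n, sell_discount=1.0):
    """Lagged target: signal formed on day n, buy at close n+1, sell at close n+2."""
    return sell_discount * closes[n + 2] / closes[n + 1]


def two_day_return(closes, n):
    """Spans two sessions, so it is not comparable to a one-day return."""
    return closes[n + 2] / closes[n]


closes = [100.0, 110.0, 121.0]
print(same_day_return(closes, 0))       # day 0 -> day 1
print(lagged_return(closes, 0, 0.995))  # day 1 -> day 2, discounted sale
print(two_day_return(closes, 0))        # day 0 -> day 2
```

Retraining on the lagged target just means using `lagged_return` as the regression's dependent variable while keeping the day-n inputs unchanged.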

Also, semi off topic, but if this sort of thing were to be done using intraday minute by minute data, how would these additions work there?

For example, if you've decided to buy or sell at time n, is there a reasonable way to estimate a good real-life price that you would actually get? I'm assuming I'm asking for just about an impossible task, but any input would be gladly appreciated.

Perhaps something like "average of mean low and mean close for minutes n+1 to n+10" for selling, and "average of mean high and mean close for minutes n+1 to n+10" for buying.
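The proposed rule is easy to state in code. Below is a Python sketch that assumes the second formula (high and close) is the one meant for buying, and that minute bars are stored as dicts with 'high', 'low' and 'close' keys; both the layout and the function name are hypothetical.

```python
def est_fill(bars, n, side, window=10):
    """Pessimistic fill estimate for a decision made at minute n.

    Sells use the average of mean low and mean close over the next
    `window` minutes; buys use mean high and mean close.
    """
    future = bars[n + 1 : n + 1 + window]
    if len(future) < window:
        raise ValueError("not enough bars after minute n")
    mean_close = sum(b["close"] for b in future) / window
    key = "low" if side == "sell" else "high"
    mean_extreme = sum(b[key] for b in future) / window
    return (mean_extreme + mean_close) / 2.0


bars = [{"high": 11.0, "low": 9.0, "close": 10.0} for _ in range(12)]
print(est_fill(bars, 0, "sell"))  # 9.5
print(est_fill(bars, 0, "buy"))   # 10.5
```

Whether this is conservative enough is an open question; it is a heuristic, not a measurement of actual fills.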

edtost

11-04-2007 04:43 PM

Re: Market Model Thingy

[ QUOTE ]
[ QUOTE ]
[ QUOTE ]
For just sell? I think what he's saying is that if you're trading based on knowledge gathered during day 1 and day n, you need to compute returns between day n+1 and day n+2 instead of day n and day n+1 or even day n and day n+2. You should definitely not do the latter (n and n+2) if you're not normalizing the return.

[/ QUOTE ]

I'm not sure which of the above CMI is referring to, but I agree with this.

[/ QUOTE ]

Yeah, I did return of (n+2)/n, which did seem a bit odd. I misunderstood what you were saying (obv at this point its pretty clear that just about any assumption regarding some background knowledge would be too much!). I redid it with (n+2)/(n+1); again, it went down, but still with results that appear to be above the upward trend of the test market over the time period. (Though all of the additions Ive made make me a lot less certain that there isnt some error happening along the way.)

Assuming that the test stat is now (n+2)/(n+1) instead of (n+1)/n, I *think* the data set could be retrained, with the objective stat changed from (n+1)/n to (n+2)/(n+1). (ie. change the test from "given x,y,z on day n, what is an estimate for how the stock will change by close tomorrow" to "given this info, how will the stock change from close on n+1 to n+2") Is this correct, or is it making an assumption I ought not to make?

[/ QUOTE ]

that would definitely be a reasonable thing to do. one thing to look for (in terms of how believable the model is as something other than data mining) is how much the coefficients on the various explanatory variables change when you change the specification of the model in that way.

[ QUOTE ]
Also, semi off topic, but if this sort of thing were to be done using intraday minute by minute data, how would these additions work there?

For example, if you've decided to buy or sell at time n, is there a reasonable way to estimate a good real life price that you would actually get? Im assuming Im asking for just about an impossible task, but any input would be gladly appreciated.

Perhaps something like "average of mean low and mean close for minutes n+1 to n+10" for selling, and "average of mean high and mean close for minutes n+1 to n+10" for buying?

[/ QUOTE ]

in general, i think create position at t, trade at t+1, exit at t+2 is a fairly reasonable model for any timestep delta t, in that you wouldn't want to be creating positions more often than you could trade, though i guess this could break down when you get to very small intervals.
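A minimal Python sketch of the timing convention described above (decide at t, trade at t+1, exit at t+2), next to the same-bar version it replaces. The function name `forward_return`, its parameters and the closing prices are hypothetical and only there to make the indexing concrete; this is not the OP's code.

```python
def forward_return(prices, t, entry_lag=1, hold=1):
    """Price ratio earned by a signal formed at index t.

    The position is entered entry_lag steps after the signal and held
    for `hold` steps: prices[t + entry_lag + hold] / prices[t + entry_lag].
    """
    entry = t + entry_lag
    exit_ = entry + hold
    if t < 0 or exit_ >= len(prices):
        raise IndexError("not enough future prices for this signal")
    return prices[exit_] / prices[entry]


closes = [100.0, 110.0, 99.0, 108.9]  # made-up closing prices for days n .. n+3

# Same-bar version, (n+1)/n: assumes you can trade at the close you just observed.
same_bar = forward_return(closes, 0, entry_lag=0)

# Lagged version, (n+2)/(n+1): decide on day n, trade at the close of n+1.
lagged = forward_return(closes, 0, entry_lag=1)
```

With these numbers the same signal looks like a gain under (n+1)/n and a loss under (n+2)/(n+1), which is why the choice of test statistic matters so much for the backtest.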

crazy canuck

11-07-2007 08:46 AM

Re: Market Model Thingy

One easy way to check your code is to generate random data with 0 mean returns:

e.g. rand(m,n)-0.5 in matlab with m,n desired size

and see if your code generates profit (of course it shouldn't). This is not foolproof, but it's quick.
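The check can be sketched in Python along these lines. Everything here is an assumption made for illustration: the two `pick` functions stand in for whatever model is being tested, `backtest` is a hypothetical harness, and the uniform draws mirror the rand(m,n)-0.5 suggestion. The second picker contains a deliberate look-ahead bug to show what a failure looks like.

```python
import random


def zero_mean_returns(days, stocks, seed=0):
    """Fake daily returns, uniform on [-0.5, 0.5), so the true mean is 0."""
    rng = random.Random(seed)
    return [[rng.random() - 0.5 for _ in range(stocks)] for _ in range(days)]


def backtest(returns, pick, top_k=5):
    """Average return earned on day d+1 by the top_k stocks chosen on day d."""
    earned = []
    for d in range(len(returns) - 1):
        scores = pick(returns, d)
        ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
        earned.extend(returns[d + 1][s] for s in ranked[:top_k])
    return sum(earned) / len(earned)


def honest_pick(returns, d):
    # Placeholder model: rank by today's return, which is known on day d.
    return returns[d]


def leaky_pick(returns, d):
    # Deliberate bug: ranks by the return of the day being traded.
    return returns[d + 1]


data = zero_mean_returns(days=2000, stocks=50)
honest = backtest(data, honest_pick)  # should hover around 0
leaky = backtest(data, leaky_pick)    # large "profit" on pure noise
```

If the real pipeline behaves like `leaky_pick` on random data, some future information is reaching the model. A result near zero does not prove the code is right, only that this particular kind of leak is absent.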

Then, you could remove stocks that have low market caps or low daily volume. Transaction cost/slippage could be fairly high for small stocks so much of your excess return could come from these.

Also, small stocks can have sick drawdowns. So if you hold a portfolio of these, you'd have to assume that some of your money is in cash. This would reduce the returns.

Also, you might want to use pinv() instead of the regress() function...sometimes it makes a difference. Same thing, just more stable numerically.

DcifrThs

11-07-2007 10:17 AM

Re: Market Model Thingy

[ QUOTE ]
One easy way to check your code is to generate random data with 0 mean returns:

e.g. rand(m,n)-0.5 in matlab with m,n desired size

and see if your code generates profit (of course it shouldn't). This is not foolproof, but it's quick.

Then, you could remove stocks that have low market caps or low daily volume. Transaction cost/slippage could be fairly high for small stocks so much of your excess return could come from these.

Also, small stocks can have sick drawdowns. So if you hold a portfolio of these, you'd have to assume that some of your money is in cash. This would reduce the returns.

Also, you might want to use pinv() instead of the regress() function...sometimes it makes a difference. Same thing, just more stable numerically.

[/ QUOTE ]

interesting...i learn new things everyday :)

so why exactly is pinv(X'*X) more stable numerically than (X'*X)^-1? i read the "help pinv" on it and it looks like it just calculates the inverse:

[ QUOTE ]
X = PINV(A) produces a matrix X of the same dimensions
as A' so that A*X*A = A, X*A*X = X and A*X and X*A
are Hermitian.

[/ QUOTE ]

i tested it on a few things i've done and in no case got anything different down to 4 decimal places.

also, how do i change the view so that it shows n number of decimal places?

thanks,
Barron

crazy canuck

11-07-2007 11:39 AM

Re: Market Model Thingy

The function pinv calculates the pseudoinverse:

matlab link

Sometimes the matrix A'*A has a determinant that is close to 0, so it is close to non-invertible and inv(A'*A) is unreliable. In this case there could actually be a family of vectors that minimizes the least-squares distance.

Matlab points this out by warnings...something about the condition number too high. If matlab is fine with regress (no warnings), then regress is ok.

In financial applications this happens sometimes when you have inputs that are highly correlated...I guess it's ok if you have a lot of inputs, like in OP's case.
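A small pure-Python illustration of the "family of vectors" point, using two perfectly correlated inputs. The data and the helper names `gram_det` and `sse` are made up for this sketch; it does not call MATLAB's pinv or regress, it only shows the situation in which they can disagree.

```python
def gram_det(x1, x2):
    """Determinant of the 2x2 matrix X'X for a design matrix with columns x1, x2."""
    a = sum(v * v for v in x1)
    b = sum(u * v for u, v in zip(x1, x2))
    c = sum(v * v for v in x2)
    return a * c - b * b


def sse(x1, x2, y, b1, b2):
    """Sum of squared errors of the fit y ~ b1*x1 + b2*x2."""
    return sum((yi - b1 * u - b2 * v) ** 2 for u, v, yi in zip(x1, x2, y))


x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0 * v for v in x1]  # second input is an exact multiple of the first
y = [3.0 * v for v in x1]

det = gram_det(x1, x2)  # zero here: X'X cannot be inverted

# Any (b1, b2) with b1 + 2*b2 = 3 fits perfectly, so there is no unique answer.
# (0.6, 1.2) is the minimum-norm member, the one a pseudoinverse-based solve returns.
candidates = [(3.0, 0.0), (1.0, 1.0), (-1.0, 2.0), (0.6, 1.2)]
errors = [sse(x1, x2, y, b1, b2) for b1, b2 in candidates]
```

With real data the inputs are rarely exact multiples of each other, so the determinant is tiny rather than zero and the coefficients become very sensitive to noise instead of being undefined.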

crazy canuck

11-07-2007 11:45 AM

Re: Market Model Thingy

No idea about decimal places..if someone knows it please post it.

Up to now I just multiplied by 10 to the appropriate power.

DcifrThs

11-07-2007 01:36 PM

Re: Market Model Thingy

[ QUOTE ]
No idea about decimal places..if someone knows it please post it.

Up to now I just multiplied by 10 to the appropriate power.

[/ QUOTE ]
yea i might just search matlab index help for a while for a more permanent fix to the problem.

Barron

DcifrThs

11-07-2007 01:38 PM

Re: Market Model Thingy

[ QUOTE ]
The function pinv calculates the pseudoinverse:

matlab link

Sometimes the matrix A'*A has a determinant that is close to 0, so it is close to non-invertible and inv(A'*A) is unreliable. In this case there could actually be a family of vectors that minimizes the least-squares distance.

Matlab points this out by warnings...something about the condition number too high. If matlab is fine with regress (no warnings), then regress is ok.

In financial applications this happens sometimes when you have inputs that are highly correlated...I guess it's ok if you have a lot of inputs, like in OP's case.

[/ QUOTE ]

thanks. the stuff i have in matlab hasn't caused any errors like the one you mention.

for the future though that is very helpful.

do any other "fixes" come to your mind regarding data analysis in matlab? like something along the pinv type fix?

thanks again,
Barron

CallMeIshmael

11-07-2007 03:05 PM

Re: Market Model Thingy

CC,

I did a similar test, where I just had the program select random picks each day, and it showed the expected ~100% loss (expected after the ask/bid discount and fees)

I did have the concern that you raised, re: most picked stocks are babies, and the transaction fees are going to be killer. However, I looked through it, and the 25/50/75 percentiles were pretty reasonable (something like [11 16 24], but I cant remember exact details now).

If I were to rerun it, getting rid of all stocks < X, what is a good X?

Also, Ive used pinv since v early on, when my regular method produced a warning.

crazy canuck

11-08-2007 04:44 AM

Re: Market Model Thingy

[ QUOTE ]

do any other "fixes" come to your mind regarding data analysis in matlab? like something along the pinv type fix?

[/ QUOTE ]

Can't think of a major fix right now.

If there are highly correlated inputs one could also use principal components analysis (PCA)
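To make the PCA suggestion concrete, here is a minimal pure-Python sketch that finds the leading principal component of two highly correlated inputs by power iteration. The data and the function names `covariance` and `leading_component` are made up for illustration; a real analysis would use a library routine and keep more than one component.

```python
import math


def covariance(columns):
    """Sample covariance matrix of a list of equally long columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
            for ci in centered]


def leading_component(cov, iters=200):
    """Power iteration: largest eigenvalue of cov and its unit eigenvector."""
    k = len(cov)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(k)) for i in range(k))
    return lam, v


# Two made-up inputs where the second is roughly twice the first.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.1, 3.9, 6.2, 7.8, 10.0]

cov = covariance([a, b])
eigenvalue, direction = leading_component(cov)
explained = eigenvalue / (cov[0][0] + cov[1][1])  # share of total variance
```

Here a single component carries nearly all of the variance, so regressing on it instead of on both raw inputs avoids the near-singular problem discussed earlier in the thread.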

Here is the Matlab link:

matlab

There are lot of good tutorials on it online.

One interesting anecdote (this is not first hand - you might be able to verify or deny it) I read at wilmott was that DE Shaw used it as their first strategy to trade a basket of stocks, because some of the components were predictable. But since then, it has been eliminated.

Once I checked out the strategy on one sector and it seemed pretty useless.

So it is unlikely that OP's strategy would be very scalable....but maybe he is onto something.

crazy canuck

11-08-2007 05:33 AM

Re: Market Model Thingy

[ QUOTE ]

If I were to rerun it, getting rid of all stocks < X, what is a good X?

[/ QUOTE ]

I don't know exactly...I wish someone else who traded/trades small caps would reply. I'm actually very curious about it because I built systems that work on small stocks, but have no idea how useful they are...didn't get to trade it/pursue it further because I'm at school full time, and publishing is my first priority.

My guess is that you could start with a market cap of 300 million (disregard stocks below it), and work your way down from there....but this is just a rough guess.

To get a more definitive answer, look at the minimum daily volume, and bid-asks during the day...even yahoo has it. Then, you can get an approximate feel for the transaction cost, but even this is industry dependent.

For example this company has a market cap of over 100 million, stock price of $2:

[url=http://finance.yahoo.com/q/hp?s=ACPW]yahoo finance[/url]

and minimum daily volume of 35000. So you might be running into high transaction costs when you want to trade over 2000 shares (or even less), or $4,000. Again, I wish someone would give a more definite answer.
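The arithmetic behind that example, as a small Python sketch. The function name `order_footprint` is made up; the inputs are the figures quoted in the post ($2 price, 2000 shares, minimum daily volume of 35000), and the sketch says nothing about what fraction of volume is actually safe to trade.

```python
def order_footprint(shares, price, min_daily_volume):
    """Dollar size of an order and its share of the quietest day's volume."""
    return {
        "dollars": shares * price,
        "share_of_volume": shares / min_daily_volume,
    }


footprint = order_footprint(shares=2000, price=2.0, min_daily_volume=35000)
# footprint["dollars"] is 4000.0
# footprint["share_of_volume"] is 2000/35000, a bit under 6% of the day's volume
```

A $4,000 order that is already several percent of a day's volume gives a feel for why slippage could eat much of the apparent return in stocks like this.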

Therefore, the answer also depends on how much money you would trade.

It is possible that you can make 50% annual returns while your bankroll is below 100k (these numbers are just illustrative), but obviously you'd hit that mark pretty quickly. So the system would not be scalable in the end (not valuable for a hedge fund), but of course that'd be still pretty good.

DcifrThs

11-08-2007 11:27 AM

Re: Market Model Thingy

[ QUOTE ]
[ QUOTE ]

do any other "fixes" come to your mind regarding data analysis in matlab? like something along the pinv type fix?

[/ QUOTE ]

Can't think of a major fix right now.

If there are highly correlated inputs one could also use principal components analysis (PCA)

Here is the Matlab link:

matlab

There are lot of good tutorials on it online.

One interesting anecdote (this is not first hand - you might be able to verify or deny it) I read at wilmott was that DE Shaw used it as their first strategy to trade a basket of stocks, because some of the components were predictable. But since then, it has been eliminated.

Once I checked out the strategy on one sector and it seemed pretty useless.

So it is unlikely that OP's strategy would be very scalable....but maybe he is onto something.

[/ QUOTE ]

yea i'm familiar w/ PCA. very useful way to analyze large data sets. that is the methodology that really proved to me that eigenvalues/vectors are useful :)

interesting stuff about DEShaw using it initially. understandable that it can't be used anymore.

thanks for your inputs.
Barron

All times are GMT -4.