Ockham's Razor - Page 5

brandofo · #41 06-16-2007, 07:33 PM

Does the razor have four or five blades?

Philo · #42 06-16-2007, 08:19 PM

[ QUOTE ]
Does the razor have four or five blades?

[/ QUOTE ]

Just two actually. For an even more parsimonious shave try the new Ernst Mach III.

Metric · #43 06-16-2007, 08:26 PM

You're missing Boro's (correct) point. Here's the prototype example:

Suppose you have a computer producing a string of output characters -- you don't know the program (input), but you do so far have the first 183,000 output characters. It just so happens that these 183,000 characters happen to be the first 183,000 digits of pi.

Now for the central question -- if you had to predict the next character (number 183,001), is it MORE LIKELY to be the next digit of pi, or a random character? What would you bet on, and why?

Keep in mind, there are plenty of possible programs that say "calculate the first 183,000 digits of pi, and then do something completely different," and given only the first 183,000 output characters, you can't distinguish between any of these and the much simpler program "calculate pi."

The fields of inductive inference, algorithmic probability, etc. were set up to answer this sort of question. And the answer turns out, not surprisingly, to be that "it is more probable that the next output character will be the next digit of pi." As such, they represent something of a rigorous justification for Ockham's razor, and make Borodog's point -- that simpler explanations that fit the data equally well are typically more likely to be true.
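
As a rough illustration of why the plain "calculate pi" program dominates, here is a toy sketch in Python. It is not the actual algorithmic-probability construction: it assumes each program gets prior weight 2^-(length in bits), and that a "pi for m digits, then something else" program must spend extra bits spelling out m, costed here with an Elias gamma code. The function names and the coding choice are assumptions made for this example only.

```python
from fractions import Fraction


def elias_gamma_bits(m: int) -> int:
    """Bits needed to write the positive integer m in Elias gamma code."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 2 * (m.bit_length() - 1) + 1


def switch_weight_ratio(n_seen: int, extra_bits: int = 0) -> Fraction:
    """Prior weight of 'pi for n_seen digits, then something else'
    relative to plain 'calculate pi', under a 2**-length prior.

    extra_bits is the (hypothetical) cost of describing the new behaviour.
    """
    return Fraction(1, 2 ** (elias_gamma_bits(n_seen) + extra_bits))


ratio = switch_weight_ratio(183_000)
print(elias_gamma_bits(183_000), float(ratio))
```

Under these assumptions, the program that switches exactly after digit 183,000 needs 35 more bits than plain "calculate pi", so its weight is smaller by a factor of 2^35. Switch points with unusually short descriptions (a power of two, say) would be cheaper, which this toy ignores.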

Borodog · #44 06-17-2007, 02:06 AM

[ QUOTE ]
You're missing Boro's (correct) point. Here's the prototype example:

Suppose you have a computer producing a string of output characters -- you don't know the program (input), but you do so far have the first 183,000 output characters. It just so happens that these 183,000 characters happen to be the first 183,000 digits of pi.

Now for the central question -- if you had to predict the next character (number 183,001), is it MORE LIKELY to be the next digit of pi, or a random character? What would you bet on, and why?

Keep in mind, there are plenty of possible programs that say "calculate the first 183,000 digits of pi, and then do something completely different," and given only the first 183,000 output characters, you can't distinguish between any of these and the much simpler program "calculate pi."

The fields of inductive inference, algorithmic probability, etc. were set up to answer this sort of question. And the answer turns out, not surprisingly, to be that "it is more probable that the next output character will be the next digit of pi." As such, they represent something of a rigorous justification for Ockham's razor, and make Borodog's point -- that simpler explanations that fit the data equally well are typically more likely to be true.

[/ QUOTE ]

THANK YOU!

I cannot even understand how this point is in question.

Philo · #45 06-17-2007, 05:00 AM

[ QUOTE ]
You're missing Boro's (correct) point. Here's the prototype example:

Suppose you have a computer producing a string of output characters -- you don't know the program (input), but you do so far have the first 183,000 output characters. It just so happens that these 183,000 characters happen to be the first 183,000 digits of pi.

Now for the central question -- if you had to predict the next character (number 183,001), is it MORE LIKELY to be the next digit of pi, or a random character? What would you bet on, and why?

Keep in mind, there are plenty of possible programs that say "calculate the first 183,000 digits of pi, and then do something completely different," and given only the first 183,000 output characters, you can't distinguish between any of these and the much simpler program "calculate pi."

The fields of inductive inference, algorithmic probability, etc. were set up to answer this sort of question. And the answer turns out, not surprisingly, to be that "it is more probable that the next output character will be the next digit of pi." As such, they represent something of a rigorous justification for Ockham's razor, and make Borodog's point -- that simpler explanations that fit the data equally well are typically more likely to be true.

[/ QUOTE ]

There are two points of disagreement here. The first was, what is the correct interpretation of OR? Is it an empirical claim which says that theories that are more ontologically parsimonious are more likely to be true? Or is it a heuristic principle that says something like, given two or more theories all of which are on an equal par with respect to the evidence and with respect to their explanatory power, choose the simplest? The right answer here is, OR is a heuristic principle, not an empirical one. The empirical claim that I just mentioned is a much stronger claim than OR. That's why OR is an interesting topic in the Philosophy of Science. If it were simply an empirical claim we could just test it against our actual results and see if it was right. That's of no special interest to philosophers. You can read about OR yourself to find out that this is true.

The second disagreement grew out of the first. The second one was, given two or more theories all of which are on an equal par with respect to the evidence and with respect to their explanatory power, is the simplest one more likely to be true? I say it's an open question whether or not ontological parsimony makes a theory more likely to be true. String theory is alive and well, despite requiring at least 10 space-time dimensions.

I think the analogy with the computer generated string of output characters is a poor one. It's an open question whether or not nature itself conforms to the principle of parsimony, such that the simplest theory is indeed more likely to be true, but even if it does your analogy would be a poor reason for thinking so. A computer program is written by human beings who have knowledge of pi, and if the string of digits through the first 183,000 matched pi, that would be the reason one would believe that the next digit is likely to be the next digit of pi. This says nothing about whether or not natural phenomena conform to the empirical claim that the more parsimonious theory is more likely to be true.

Metric · #46 06-17-2007, 07:17 AM

[ QUOTE ]
I think the analogy with the computer generated string of output characters is a poor one.

[/ QUOTE ]
Of course you do. But that's only because it is, in fact, such a perfect example unpolluted with [censored] side issues. All you have are data, equally good (but differing in complexity) theories, and the ability to test them. In other words, it is a scenario where Ockham's razor is essentially the only relevant principle, and where it can be seen to be mathematically correct -- simple explanations will be more likely to be correct, all else being equal.

[ QUOTE ]
A computer program is written by human beings who have knowledge of pi, and if the string of digits through the first 183,000 matched pi, that would be the reason one would believe that the next digit is likely to be the next digit of pi.

[/ QUOTE ]
Sigh. Actually, the assumptions going into these arguments are that the input program is randomly generated -- i.e. specifically not written by a human with knowledge of pi. Inductive inference and algorithmic probability are not results in psychology -- they are general logical and probabilistic conclusions resting on general principles.

Next you will argue that the programming language matters, or the specific type of computer -- no, it doesn't. An invariance theorem exists that shows that these arguments have the same content on essentially any computer in any language.

After that, you will object that computers have little or nothing to do with the universe or the rest of science. But from a physics point of view it turns out that the laws of the universe can be thought of more or less completely in the language of computing if you want to (quantum computing, specifically).

After that, the thread will probably just die because you'll shift the argument to some subtly different question that no one really cares about, but one where you're not quite so obviously and dramatically wrong.

pzhon · #47 06-17-2007, 07:52 AM

[ QUOTE ]
If one theory is more likely to be true given the evidence, we don't need a heuristic principle like Occam's Razor in order to choose among theories. We can just go by the evidence in that case.

[/ QUOTE ]
No, simplicity is not only a tie-breaker. It's a significant indication of the merit of a theory. Ockham's Razor indicates that you should sometimes choose the simpler theory even when the evidence fits a more complicated theory better.

The best quadratic approximation to a function is better than the best linear approximation (except in degenerate situations). However, you should require a substantial increase in accuracy in order to accept the increase in complexity from 2 parameters to 3.
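
One common way to put a number on "a substantial increase in accuracy" is a penalized score such as the Bayesian information criterion (BIC). The post does not name BIC; it is swapped in here as an assumption, along with Gaussian errors and the made-up residual sums of squares below. A minimal sketch in Python:

```python
import math


def bic(rss: float, n: int, k: int) -> float:
    """BIC (up to an additive constant) for a least-squares fit with
    residual sum of squares rss, n data points and k parameters.
    Lower is better."""
    return n * math.log(rss / n) + k * math.log(n)


def max_rss_ratio(n: int, k_simple: int = 2, k_complex: int = 3) -> float:
    """Largest rss_complex / rss_simple at which BIC still prefers
    the more complex model."""
    return n ** (-(k_complex - k_simple) / n)


n = 20                      # hypothetical number of data points
linear = bic(10.0, n, 2)    # hypothetical RSS of the best line
slightly_better_quadratic = bic(9.5, n, 3)
much_better_quadratic = bic(5.0, n, 3)
print(linear, slightly_better_quadratic, much_better_quadratic)
print(max_rss_ratio(n))
```

With these made-up numbers, a quadratic that trims the squared error by 5% scores worse than the line, while one that halves it scores better. For 20 points the break-even is a reduction of roughly 14%.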

22/7 is a better approximation to pi than 102985/32768, even though the latter is more accurate. 22/7 is surprisingly accurate, relative to its complexity. |Pi-22/7|*7^2 is small. 102985/32768 is more accurate, but it's not even the right choice of numerator for that denominator. If 22/7 isn't accurate enough for you, 355/113 is off by less than a millionth, and it's much less complicated than 102985/32768.
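
The figures above are easy to check. The sketch below uses Python's standard fractions and math modules; the helper names are made up for this example, and the error is measured against the floating-point value of math.pi, which is precise enough at this scale.

```python
import math
from fractions import Fraction


def error(fr: Fraction) -> float:
    """Absolute error of the fraction as an approximation to pi."""
    return abs(math.pi - fr.numerator / fr.denominator)


def scaled_error(fr: Fraction) -> float:
    """Error multiplied by denominator squared: accuracy relative to complexity."""
    return error(fr) * fr.denominator ** 2


def best_numerator(denominator: int) -> int:
    """Numerator that gives the closest approximation to pi for this denominator."""
    return round(math.pi * denominator)


for fr in (Fraction(22, 7), Fraction(102985, 32768), Fraction(355, 113)):
    print(fr, error(fr), scaled_error(fr))

print(best_numerator(32768))
```

102985/32768 does beat 22/7 on raw error, but only in the sixth decimal place, and its scaled error is enormous by comparison. The closest numerator for the denominator 32768 is 102944, not 102985.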

There is a classic urban legend used to illustrate a sacrifice of accuracy to improve a model: When Copernicus proposed a model of the solar system centered about the Sun, his predictions were less accurate than the highly developed geocentric model accepted for thousands of years. However, he needed only about 30 circular motions rather than 80. (Epicycles were still needed because the true Newtonian motion is closer to an ellipse.) This story isn't literally true, but it spreads in part because we recognize that we should be willing to trade some accuracy for simplicity.

Piers · #48 06-17-2007, 07:57 AM

Ockham's Razor is a useful rule of thumb for simplifying decision-making.

But however useful Ockham’s razor might be in practice, it is instructive to observe that it is almost always wrong. There will always be a more complicated model that is more accurate, but which is likely lost amongst the uncountable number of plausible but incorrect more complicated models.

[ QUOTE ]
Suppose you have a computer producing a string of output characters -- you don't know the program (input), but you do so far have the first 183,000 output characters. It just so happens that these 183,000 characters happen to be the first 183,000 digits of pi.

Now for the central question -- if you had to predict the next character (number 183,001), is it MORE LIKELY to be the next digit of pi, or a random character? What would you bet on, and why?

Keep in mind, there are plenty of possible programs that say "calculate the first 183,000 digits of pi, and then do something completely different," and given only the first 183,000 output characters, you can't distinguish between any of these and the much simpler program "calculate pi."

[/ QUOTE ]

The model that a computer which has output the first n digits of pi will output the (n+1)th digit next is a good assumption to make in practice. However, it is also clearly false. At some point, with 100% certainty, the computer will not output the next digit of pi: because it is not programmed to, or because of a power cut, a bug in the program, the expanding sun finally engulfing the earth and the computer, some hardware anomaly, or someone just turning the computer off.

wazz · #49 06-17-2007, 08:03 AM

Or the simpler explanation that the computer is neither capable of displaying pi to infinity nor programmed to. I have to say, I don't find that example very convincing at all.

Metric · #50 06-17-2007, 08:24 AM

[ QUOTE ]
The model that a computer which has output the first n digits of pi will output the (n+1)th digit next is a good assumption to make in practice. However, it is also clearly false. At some point, with 100% certainty, the computer will not output the next digit of pi: because it is not programmed to, or because of a power cut, a bug in the program, the expanding sun finally engulfing the earth and the computer, some hardware anomaly, or someone just turning the computer off.

[/ QUOTE ]
None of which affects the result that simpler programs are the more likely explanation, given access to a finite amount of the output (and the assumption of truly random input -- i.e. no cheating allowed).

"Your computer will eventually burn up" is not really a good objection to rigorous results concerning algorithmic probability.