Two Plus Two Newer Archives  

Go Back   Two Plus Two Newer Archives > Other Topics > Science, Math, and Philosophy

Reply
 
Thread Tools Display Modes
  #1  
Old 11-29-2007, 11:00 PM
CallMeIshmael CallMeIshmael is offline
Senior Member
 
Join Date: Dec 2004
Location: Tis the season, imo
Posts: 7,849
Default Hypothetical CS type question...

Consider the set of all the computer programs that perform a meaningful* output computation. That is, consider only programs that start, perform some set of computations, then output data. At which point, you are done with the program, since its only purpose was the generate that output data.


What percentage of programs fall into each of these categories:

1) Less than one hour until data output
2) Between one hour and one year until output
3) More than one year until output



*For the sake of the post, we can give 'meaningful' the definition of 'anything you can reasonably see someone wishing to program' So, obviously a very wide range of possible programs here. Basically, just dont make up a program for the sole purpose of discussing it in this thread.


edit: assume we are looking at a new, fast home PC / good RAM, and using a fast language like c.
Reply With Quote
  #2  
Old 11-29-2007, 11:20 PM
madnak madnak is offline
Senior Member
 
Join Date: Aug 2005
Location: Brooklyn (Red Hook)
Posts: 5,271
Default Re: Hypothetical CS type question...

I'm thinking infinite on all counts. I don't think there's a finite number of meaningful computations. Even if there is, the number is probably extremely high.

I think the vast majority of such programs in actual use will fall under 1, with a tiny minority falling under 3 (and those would do better with a distributed model in general).
Reply With Quote
  #3  
Old 11-29-2007, 11:27 PM
gumpzilla gumpzilla is offline
Senior Member
 
Join Date: Feb 2005
Posts: 7,911
Default Re: Hypothetical CS type question...

[ QUOTE ]

*For the sake of the post, we can give 'meaningful' the definition of 'anything you can reasonably see someone wishing to program' So, obviously a very wide range of possible programs here. Basically, just dont make up a program for the sole purpose of discussing it in this thread.

[/ QUOTE ]

Tons of scientific computations can take a verrrry long time to compute. I could imagine many different types of quantum simulations, for example, that would blow up exponentially in time with addition of each new particle. In that sense, then, nearly all such programs will fall into the longest category. (EDIT: To be more explicit, let's say that the integer N characterizing the size of the problem has some threshold N0 where the problem ends up in category 3. ~100% of integers are bigger than N0, so to first order, all programs take longer.)

The one caveat is that the computer has finite space, so I can't just make these programs consider bigger and bigger problems without bound, otherwise 100% (in the measure theoretic sense) of the programs fall into the last category. But, in the real world, that's a concern. So I'd guess it's something like 95% of "interesting" programs as a lower bound fall into the last category.
Reply With Quote
  #4  
Old 11-29-2007, 11:45 PM
madnak madnak is offline
Senior Member
 
Join Date: Aug 2005
Location: Brooklyn (Red Hook)
Posts: 5,271
Default Re: Hypothetical CS type question...

[ QUOTE ]
The one caveat is that the computer has finite space

[/ QUOTE ]

I completely forgot about this! This changes things because it means the set of programs that can run on the computer is finite. But I'm also thinking that the distinction of "meaningful" is too imprecise. There is really no program at all that I can't see someone reasonably wanting to program under the right circumstances. Ah, barring redundancy and bugs.

By any standard I think the numbers are extremely large. But I have to agree that 3 will represent the biggest group. Generally the number of specific operations being performed (roughly speaking - different "classes" of operations will have different speeds, and some operations may be performed concurrently, but that doesn't change the basic fact) will determine the speed. As a result, the number of possible permutations of programs will tend to increase as the running time increases. And that will exceed 1 year quickly.

Also basically any process that describes a "simple" program will be a necessary subroutine in a large number of "complex" programs, so (given finite memory) there is a many-to-one relationship between the long programs and the short programs. There have got to be more long programs, and I'd say by orders of magnitude.
Reply With Quote
  #5  
Old 11-29-2007, 11:54 PM
CallMeIshmael CallMeIshmael is offline
Senior Member
 
Join Date: Dec 2004
Location: Tis the season, imo
Posts: 7,849
Default Re: Hypothetical CS type question...

mad/gump,

how does your answer change when the problem is changed to "fit the set of programs that people wished to / did program over the past 5 years (still only computation programs) into those 3 categories"


I guess you could say Im curious about the practical answer to what is very theoretical question. The problem w/ wording it this second way is that there is considerable skew away from groups 2/3 with actually written programs and probably some even with wished for programs.
Reply With Quote
  #6  
Old 11-30-2007, 12:10 AM
madnak madnak is offline
Senior Member
 
Join Date: Aug 2005
Location: Brooklyn (Red Hook)
Posts: 5,271
Default Re: Hypothetical CS type question...

More explicitly, each program can be viewed as a sequence of instructions. According to this view, it's easy to see that the number of permutations increases as the number of instructions in the sequence increases. As the number of instructions becomes higher than a certain constant (the number of instructions that would take one year to execute, for example), the proportion of possible programs with an instruction count above that constant geometrically increases relative to the number of possible programs with an instruction count below the constant. (We can view the program in terms of "ticks" or time units instead of instructions - an instruction that takes 4 time units is equivalent to any two instructions that take 2 time units each. But the trend doesn't change - the number of permutations still increases with the number of allowed time units.)

The number of instructions must be finite, because the program does eventually have to spit something out. But even without infinite loops, it's trivial to see that a simple program can execute an arbitrarily large number of instructions. In other words, not only can a given program execute a given number of specific instructions in a given sequence, it can also repeat instructions, skip instructions, and otherwise mess up the math. Still, every instruction must be executed at least once due to the constraints of the problem. Therefore, a sufficiently large program must execute a sufficiently large number of instructions, and even a smaller program can execute such a number of instructions.

Actually, given that the program merely produces a given output, all programs can be seen as specific sequences of instructions but with compression algorithms potentially applied. So the average number of instructions will be way higher than what can be fit into a static program!

The time it takes to execute n instructions will always be greater than or equal to the amount of time it would take to execute the fastest instruction n times.

Therefore, given a sufficiently large hard drive (and let's ignore that the memory management necessarily in such a large program will necessarily increase the running time), most programs will totally run longer than 1 year straight. I can't prove that a 200GB hard disk is big enough, but we all know it is LDO.
Reply With Quote
  #7  
Old 11-30-2007, 12:17 AM
madnak madnak is offline
Senior Member
 
Join Date: Aug 2005
Location: Brooklyn (Red Hook)
Posts: 5,271
Default Re: Hypothetical CS type question...

[ QUOTE ]
mad/gump,

how does your answer change when the problem is changed to "fit the set of programs that people wished to / did program over the past 5 years (still only computation programs) into those 3 categories"

[/ QUOTE ]

Number 1 definitely wins. People just don't want to run a program on their computer that takes over a year to spit out a given piece of information. I'm not sure how distributed programming fits into your model, since it really doesn't meet the constraints (it has to "phone home" after all), but even if we include those programs there aren't that many of them.

The development of a 1 year+ program isn't viable, so even though those will represent the majority of the programs in theory, they'll represent a minority in practice.

In terms of group 2, I'm not sure. I can't think of much other than 3d graphics software that takes so long just to give a certain output. 1 definitely wins.

There might be some long simulations brewing, but probably most are written for supercomputers in research environments. And 1 year is still a long time. I mean, even most servers won't even see 1 full year of uninterrupted uptime.
Reply With Quote
  #8  
Old 11-30-2007, 12:24 AM
madnak madnak is offline
Senior Member
 
Join Date: Aug 2005
Location: Brooklyn (Red Hook)
Posts: 5,271
Default Re: Hypothetical CS type question...

[ QUOTE ]
mad/gump,

how does your answer change when the problem is changed to "fit the set of programs that people wished to / did program over the past 5 years (still only computation programs) into those 3 categories"


I guess you could say Im curious about the practical answer to what is very theoretical question. The problem w/ wording it this second way is that there is considerable skew away from groups 2/3 with actually written programs and probably some even with wished for programs.

[/ QUOTE ]

Oh, what do you mean by "wished for," exactly? I still think it would be 1, but...

Would a program that successfully outputs the correct answer to the question "how can we achieve cold fusion" count as one of the "wished for" programs?
Reply With Quote
  #9  
Old 11-30-2007, 12:38 AM
CallMeIshmael CallMeIshmael is offline
Senior Member
 
Join Date: Dec 2004
Location: Tis the season, imo
Posts: 7,849
Default Re: Hypothetical CS type question...

[ QUOTE ]
[ QUOTE ]
mad/gump,

how does your answer change when the problem is changed to "fit the set of programs that people wished to / did program over the past 5 years (still only computation programs) into those 3 categories"


I guess you could say Im curious about the practical answer to what is very theoretical question. The problem w/ wording it this second way is that there is considerable skew away from groups 2/3 with actually written programs and probably some even with wished for programs.

[/ QUOTE ]

Oh, what do you mean by "wished for," exactly? I still think it would be 1, but...

Would a program that successfully outputs the correct answer to the question "how can we achieve cold fusion" count as one of the "wished for" programs?

[/ QUOTE ]


hahaha


well, I think its fair to say you would need to be able to, at least in theory, devise the algorithm for the program to count here. So (and I could be wrong about this, since physics is my weakest science, and I dont know much beyond the basics about CF), I think this doesnt count.


I think a good example of a wished for program that has never been coded is a program that solves chess.
Reply With Quote
  #10  
Old 11-30-2007, 12:43 AM
CallMeIshmael CallMeIshmael is offline
Senior Member
 
Join Date: Dec 2004
Location: Tis the season, imo
Posts: 7,849
Default Re: Hypothetical CS type question...

FWIW, I tend to do a fair amouont of programming in my spare time, and this question this sort of stems from my observation that, in general, tasks I want to solve almost always fall into either group 1 or 3. Most of the time, a computation takes fewer than 20 minutes or (by best estimates) so long that I would need to live thousands or millions of years to see the result.


It appears that, at least for the tasks Im interested in, the growth in the graph of programs vs time is very VERY fast once you get past 1 hour.
Reply With Quote
Reply

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT -4. The time now is 07:49 AM.


Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2024, vBulletin Solutions Inc.