Two Plus Two Newer Archives

Two Plus Two Newer Archives (http://archives1.twoplustwo.com/index.php)
-   Science, Math, and Philosophy (http://archives1.twoplustwo.com/forumdisplay.php?f=49)
-   -   Best Sorting Algorithm For Large Numbers (http://archives1.twoplustwo.com/showthread.php?t=545135)

gaming_mouse 11-13-2007 01:09 PM

Best Sorting Algorithm For Large Numbers
 
It's been a while since I used sorting algorithms.... so I figured I'd get a good answer here.

If you had to sort a list of random numbers between 0 and 9999999999999999 (16 digits), what is the fastest algorithm?

If the maximum number changes even higher to something that does not fit in a long data type, does the answer change?

Thanks,
gm

adios 11-13-2007 01:20 PM

Re: Best Sorting Algorithm For Large Numbers
 
It's been awhile since I've delved into sorting algorithms but if memory serves you want an "n log n" algorithm for faster sorts as opposed to an "n squared" algorithm.

CrayZee 11-13-2007 02:56 PM

Re: Best Sorting Algorithm For Large Numbers
 
Fastest is a little vague, since you can consider worst case, avg case, etc (sometimes they are the same). Different sorting algorithms also have differing memory requirements, O(1), O(log n), etc. So that should also be taken into consideration if you have large data sets.

Quicksort is the brain dead choice of programmers.

Siegmund 11-13-2007 06:02 PM

Re: Best Sorting Algorithm For Large Numbers
 
As far as algorithm efficiency, yes, quick sort or one of the nearly equivalent O(n log n) methods.

As far as whether the numbers fit into a Long or not... you will want to investigate whether String or Long or user-defined "VeryLong" variants are fastest. You may get faster performance with string comparisons (pad numbers with leading zeros as necessary.)

Ultimately, however, the length of the strings has almost nothing to do with the time needed for the comparison, only with the read and write time for swapping the strings around. The "quick sort" idea of sorting sub-sub-groups and sub-groups first can be adapted to "look at only one bit at a time, and sort into groups based on whether that bit is a 0 or a 1": there is absolutely no reason to compare an entire long string against another entire long string except convenience of programming.

oe39 11-14-2007 02:57 PM

Re: Best Sorting Algorithm For Large Numbers
 
the comparison sort wikipedia page might be of interest?

CrayZee 11-14-2007 03:24 PM

Re: Best Sorting Algorithm For Large Numbers
 
Yeah, not a bad idea. I also forgot about stability issues since it's been awhile...but most sorting algorithms are pretty stable.

A quick Googling dug this up. Maybe you'll find that useful assuming it's accurate.

Specialized sorting algorithms might be faster, but Quicksort is usually good enough. It's fast enough for most purposes and won't blow up on you most of the time. Man hours are more expensive than computing hours esp. if you're solving simple business problems, etc.

gaming_mouse, what are you trying to do? Homework, something real, or pondering sorting theory?

DoTheMath 11-14-2007 05:31 PM

Re: Best Sorting Algorithm For Large Numbers
 
[ QUOTE ]
If you had to sort a list of random numbers between 0 and 9999999999999999 (16 digits), what is the fastest algorithm?

[/ QUOTE ]
The relative speeds of sort algorithms themselves are not generally affected by the size of the individual data values. The sorting algorithms are data type independent. The characteristics of the data that most affect the choice of fastest algorithm are the number of items to be sorted and the degree to which they are already sorted. Another factors is whether the sort is to be done using the memory that stores the data or whether more memory or memory structures are available to be used as an intermediate or replacement storage area. Another factor that might affect the choice of algorithms is the distribution of data within the range.

The issue of number size is more likely to affect the comparison algorithm used by the sort and the data structure of your implementation.

[ QUOTE ]

If the maximum number changes even higher to something that does not fit in a long data type, does the answer change?

[/ QUOTE ]
As long as you are still sorting numbers, generally the answer is no. If the numbers are very large integers that do not fit into a numeric data type supported by your implemention environment, you may find yourself representing your data as a string of (often character) values. This might lead one to consider a LSD radix sort, but I don't believe that this would likely be faster than a comparison sort.

As oe39 said, read wiki, but I would suggest starting with the Sorting Algorithms page, or read Knuth.

Wyman 11-14-2007 05:41 PM

Re: Best Sorting Algorithm For Large Numbers
 
In what context are you sorting?

Is it a one-time sort? Do you want to maintain the sorted list and add/delete items over time? Are there 10000000 elements that you're sorting? 10? 2? Does your data have any structure at all, or are the numbers chosen uniformly at random from 0 to 9999999999999999?

Quick answer: For a specific problem, I'd implement and test a few sorting algorithms to see which is fastest. Check out quicksort and radix sort.

Yobz 11-14-2007 06:41 PM

Re: Best Sorting Algorithm For Large Numbers
 
"Fastest" time-complexity wise with lots of numbers and bound range would be Radix Sort, assuming no previous knowledge of data being sorted (which allows more special case sorting).

gaming_mouse 11-14-2007 06:58 PM

Re: Best Sorting Algorithm For Large Numbers
 
[ QUOTE ]


gaming_mouse, what are you trying to do? Homework, something real, or pondering sorting theory?

[/ QUOTE ]

A friend of mine who's in college asked me about it... I think it was a homework assignment, yes. Anyway, I was trying to help him and I realized I wasn't sure of the answer, so I got curious.

gaming_mouse 11-14-2007 07:07 PM

Re: Best Sorting Algorithm For Large Numbers
 
[ QUOTE ]
"Fastest" time-complexity wise with lots of numbers and bound range would be Radix Sort, assuming no previous knowledge of data being sorted (which allows more special case sorting).

[/ QUOTE ]

Wyman,

I read this quote from the radix wikipedia entry:

When an LSD radix sort processes keys which all have the same fixed length, then the upper bound of the execution time is O(n), where n is the number of keys to be sorted. When processing fixed-length keys, some implementations of LSD radix sorts are slower than O(n · log n) comparison-based sorting algorithms, such as heap sort, unless a sufficiently large amount of input data is processed. What a sufficiently large amount of input data precisely is will vary from computer system to computer system and from implementation to implementation.

It is not clear to me what conditions make the radix better than quicksort or heapsort exactly. Could you explain this?

CrayZee 11-15-2007 04:12 PM

Re: Best Sorting Algorithm For Large Numbers
 
[ QUOTE ]
A friend of mine who's in college asked me about it... I think it was a homework assignment, yes. Anyway, I was trying to help him and I realized I wasn't sure of the answer, so I got curious.

[/ QUOTE ]

Ah, more about theory than engineering then (The QA approach would be to performance test different solutions and pick the best one for the job.)

I'd probably just grab your friend's text or whatever course material he's using to see the whole context of the problem. Seems a little vague to me.

Wyman 11-18-2007 08:05 PM

Re: Best Sorting Algorithm For Large Numbers
 
[ QUOTE ]
[ QUOTE ]
"Fastest" time-complexity wise with lots of numbers and bound range would be Radix Sort, assuming no previous knowledge of data being sorted (which allows more special case sorting).

[/ QUOTE ]

Wyman,

I read this quote from the radix wikipedia entry:

When an LSD radix sort processes keys which all have the same fixed length, then the upper bound of the execution time is O(n), where n is the number of keys to be sorted. When processing fixed-length keys, some implementations of LSD radix sorts are slower than O(n · log n) comparison-based sorting algorithms, such as heap sort, unless a sufficiently large amount of input data is processed. What a sufficiently large amount of input data precisely is will vary from computer system to computer system and from implementation to implementation.

It is not clear to me what conditions make the radix better than quicksort or heapsort exactly. Could you explain this?

[/ QUOTE ]

Sorry I sort of forgot about this thread.

The idea is (roughly) that an algorithm is O(f(n)) if, as n gets large, an input of size n (in this case, n numbers to be sorted) takes at most C*f(n) time to run (where C is a constant that does not depend on n, and where "time" is usually measured by slowest operations -- perhaps number of comparisons, for instance). You may want to reread this a few times.

Example: I have a list of N numbers. To find the largest element in the list, I can just scroll through the list once. I save the largest-so-far, and I compare each new element to the largest-so-far. This takes O(1) space (the amount of space required is independent of N), and O(N) time (the number of comparisons is at worst linear in N).

Now, as it turns out, if the *size* of the numbers on your list is bounded (e.g. all numbers are < 99999), then a radix sort is O(n), where n is the size of the list to sort. Quicksort and heapsort are O(n log n).

The catch is that buried in the definition of "O", I mentioned that "C". This means that an algorithm that achieves its goal in 10000*N comparisons is O(N). This, of course, may be slower than an algorithm that runs in 3*N log N (which is O(N log N) ) if N is not sufficiently large.

So while an O(N) algorithm is *asymptotically* faster than an O(N log N) algorithm, it's not the case that for every input the O(N) algorithm will run faster than the O(N log N) algorithm.


Aside: If you don't have an upper bound on the size of your numbers, the (provably) fastest sorting algorithms are O(n log n). For the reasons above, most sorting algorithms that are built into software use quicksort (O(n log n)) when n > 7, and something like bubble sort or insertion sort (O(n^2)) when n <= 7. Even though the latter algorithms are slower asymptotically, for small enough instances, they are pretty fast.

jukofyork 11-19-2007 11:02 AM

Re: Best Sorting Algorithm For Large Numbers
 
[ QUOTE ]
Aside: If you don't have an upper bound on the size of your numbers, the (provably) fastest sorting algorithms are O(n log n). For the reasons above, most sorting algorithms that are built into software use quicksort (O(n log n)) when n > 7, and something like bubble sort or insertion sort (O(n^2)) when n <= 7. Even though the latter algorithms are slower asymptotically, for small enough instances, they are pretty fast.

[/ QUOTE ]
For very small values of n, then the best choice is usually to use a Sorting Network with hard-coded compare and swap operations to reduce overhead to a minimum.

Juk [img]/images/graemlins/smile.gif[/img]


All times are GMT -4. The time now is 06:30 PM.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2024, vBulletin Solutions Inc.