Validation seconds = 0.7660 Total HighPrecision Clocks = 2622109016 HighPrecision clocks per lookup = 19.599489
I just ran the tests, and haven't yet looked into why there is a difference.
Separately... It makes no perceivable timing difference on our reasonably fast hardware, but I hope this thread is not locking us into this benchmark paradigm, e.g.,
for (c0 = 0; c0 < 52; c0++) for (c1 = c0+1; c1 < 52; c1++) ... for (c6 = c5+1; c6 < 52; c6++)
rather than
for (c0 = 0; c0 < 46; c0++) for (c1 = c0+1; c1 < 47; c1++) ... for (c6 = c5+1; c6 < 52; c6++)
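Something in my twisted cycle-saving brain wants to see the latter, since the tighter bounds skip outer-loop iterations that can never be completed into a 7-card hand.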
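For anyone who wants to convince themselves that the two loop styles visit exactly the same hands, here is a rough sketch in C. The function names count_loose and count_tight, and the deck-size parameter n, are my own inventions for illustration and are not part of anyone's evaluator in this thread. Each function only counts the 7-card combinations it reaches; with n = 52 the tight version's bounds work out to 46, 47, ..., 52 as in the second loop above.

```c
#include <stdint.h>

/* Hypothetical helper: enumerate 7-card hands from an n-card deck with
   every loop bounded by n (the "loose" style), and count them. */
uint64_t count_loose(int n)
{
    uint64_t hands = 0;
    int c0, c1, c2, c3, c4, c5, c6;

    for (c0 = 0; c0 < n; c0++)
    for (c1 = c0 + 1; c1 < n; c1++)
    for (c2 = c1 + 1; c2 < n; c2++)
    for (c3 = c2 + 1; c3 < n; c3++)
    for (c4 = c3 + 1; c4 < n; c4++)
    for (c5 = c4 + 1; c5 < n; c5++)
    for (c6 = c5 + 1; c6 < n; c6++)
        hands++;            /* a real benchmark would do the lookup here */

    return hands;
}

/* Hypothetical helper: same enumeration, but each outer loop stops early
   enough to leave room for the cards that still have to follow it. */
uint64_t count_tight(int n)
{
    uint64_t hands = 0;
    int c0, c1, c2, c3, c4, c5, c6;

    for (c0 = 0; c0 < n - 6; c0++)
    for (c1 = c0 + 1; c1 < n - 5; c1++)
    for (c2 = c1 + 1; c2 < n - 4; c2++)
    for (c3 = c2 + 1; c3 < n - 3; c3++)
    for (c4 = c3 + 1; c4 < n - 2; c4++)
    for (c5 = c4 + 1; c5 < n - 1; c5++)
    for (c6 = c5 + 1; c6 < n; c6++)
        hands++;            /* a real benchmark would do the lookup here */

    return hands;
}
```

Calling both with n = 52 should give the same count, C(52,7) = 133,784,560. The loose version just spends a few extra outer iterations whose inner loops run zero times, which is why the difference is not measurable in practice.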