People have been thinking about prime numbers for over 1700 years, so most of the neat tricks about them have been discovered. I've pondered on a trick that's probably already been thought of, but just in case it hasn't I thought I would blog about it. I actually got the idea for this a couple years ago, but just haven't taken the time to code a proof of concept for it until now. Lately I've been up against the wall with a crazy project at work, and tonight I just wanted to unwind a little and do a little recreational coding to test out this prime number theory I've had rattling around in my brain.
It's well-known that as you search for factors of a number, you only need to search up to that number's square root and no further. And when searching for primes, you only need to check other prime numbers as possible divisors. That already speeds up things significantly. But the additional boost in search speed I'm proposing comes from making a fairly large list of offsets, on the order of billions of entries, that provides a repeated sequence of numbers that can never be prime. These are the ones that you can safely skip past when looking for primes. The length of this repeated sequence is very important. If you have the proper length, then it works with the entire set of prime numbers. Follow along if you dare as we take a quick tour.
For starters, obviously you can throw out every even number except for 2. This sequence of two numbers is repeated throughout the entire prime number set. Similarly as you go along there are much larger predictable sequences that you can use to automatically omit many numbers when checking for divisibility. The length of the sequence has to be the product of a sequence of numbers. It can be a factoral, or the product of any other range of numbers that starts from some starting number to 1 or 2 less than double that number. We'll take a look at a very simple group to start with, just 3 factoral. The length of this sequence is just 6, being 2*3. Let's move to the next larger interesting sequence by adding 4 to the end of the list. It's double the first number 2, so we can throw 2 out if we'd like. For the best efficiency we should. Multiply all the remaining numbers in the sequence together to find the length of a reliable sequence of repeatabiliity that exists through the entire set of prime numbers. You can always count on this repetition no matter how high in the millions, billions, and trillions of numbers you get up to. We now have just 3 and 4 in the sequence, and 3*4 is 12, so we can break all primes up into blocks of 12, and automatically rule out specific numbers in the block as never being prime as so:
| Offset |
Check it? |
| 0 |
No, divisible by 2, 3 and 4 |
| 1 |
Yes |
| 2 |
No, divisible by 2 |
| 3 |
No, divisible by 3 |
| 4 |
No, divisible by 2 and 4 |
| 5 |
Yes |
| 6 |
No, divisible by 2 and 3 |
| 7 |
Yes |
| 8 |
No, divisible by 2 and 4 |
| 9 |
No, divisible by 3 |
| 10 |
No, divisible by 2 |
| 11 |
Yes |
So if we were to consider checking 15, why even bother? We already marked in our "roadmap" that it can't be prime, specifically because it's divisible by 3. As the sequence repeats every 12 numbers, the same thing could be said of 27, ruled out automatically for the same reason. Thus for every 12 numbers we only need to check four of those numbers, which is already 3 times faster than checking every number. And all we have to store is 12 bits for this very simple roadmap. This isn't that exciting yet. But it gets even better when we use a larger roadmap. When we expand the list of numbers that creates the roadmap from just 3 and 4 to be 3,4, and 5, we then have a repeating sequence of 60 numbers. So for every 60 numbers in the sequence of all prime numbers we can rely on there being a pattern, and if we made the same kind of table as above we would find that we only have to check 18 of them. The next largest roadmap is 120 and requires checking only 30, so 4X better. The next is 840, which requires checking 124. As we use larger and larger roadmaps, the return gets better and better. And each entry requires just one bit of memory. 15120 entries can thus be stored in a little under 2K, and allows us to calculate 16X faster. And with just a smidge over a megabyte of RAM we can speed up prime number calculation about 180 times! That's pretty awesome. But consider that none of the even numbers even need to be tested. So we can actually store those roadmaps in half that space, achieving that 180 times faster prime number searching with a roadmap table that's a half of a meg in size.
Okay, how about if we crank it up a bit more and throw 32 megs at the problem? 9*10*11*12*13*14*15*16 equals a roadmap with 518918400 entries, which fits in just under 32 megs of RAM when you throw out all the bits that would have marked even numbers. With a roadmap table that size we are then almost a thousand times faster at calculating prime numbers, only having to check 552333 out of every half billion numbers to see if it's a prime. The roadmap pinpoints the exact numbers for analysis. Everything can fit comfortably in a standard BitArray object available in .NET. And on my puny 1.8 GHz Pentium M it creates this table in about 3 minutes. After having this table in place my machine is doing the same work that 1000 similar machines would have. I can simply hone in on the one in 1000 entries that could actually be prime, bypassing all the rest.
A further enhancement would be to compress the roadmap tables further, as they already have some repetition within them. That adds a bit to the complexity of the algorithm, but may offer 3 to 5 times more roadmap size in the same amount of RAM.
For those who would like to dig deeper, here's a proof of concept WinForms app that I wrote in C# 2.0:

Executable
C# 2.0 Source Code
When you run the program, it shows a list of the sequence of numbers that is used to determine the length of a repeating sequence among all prime numbers. Also the first 100 primes it's calculated as it goes along. After you get past 665280, it really starts to take awhile when building out the sequences. And the code is set to cut off creating the entire roadmap when it gets above half a billion entries so it doesn't run out of room in the BitArray. Of course other ways to allocate memory could then enable absolutely enormous tables stored in memory, and prime number calculation hundreds of thousands of times faster than normal.
So if anyone out there has already seen this trick then please let me know. It's probably not original. But it's been in the back of my mind for a few years, and I'd love to know who else has thought of it before. I certainly hope by disclosing this approach that systems with gargantuan amounts of RAM won't compromise the PKI infrastructure that we all rely on for security!
Well, here it is Monday, and time to put the fun prime number code away for now and get back to working on that other gnarly project. Wish me luck.