Let's make a prime number generator.
There was no real reason for this project - it was just one of those coding ear worms that bug you until you actually do something about it.
So - after staring at a blank developer IDE for a while, I decided what I really wanted to do was create an app that would give me n number of prime numbers in a list. But first, I needed a way to test whether or not a number was prime.
Rules for optimization:
- Don't optimize.
- Optimize later.
I decided that the best way to go for this project would be to determine the next prime number from a given integer. That way, I could generate lists and other such nonsense. I settled on the following sequence:
public int nextPrime(int lastNo)
{
int currentNo = lastNo;
bool isNotPrime = true; while (isNotPrime)
{
isNotPrime = false;
currentNo++;
for (int i = 2; i < currentNo; i++)
{
if (currentNo % i == 0)
{
isNotPrime = true;
break;
}
}
return currentNo;
}
}
It's a fairly simple exercise, so let's quickly walk through it.
The premise of the function is that we want to find the next prime number, not whether the input number itself is prime:
- We create a copy of the input number (currentNo).
- We create a conditional check (is the number not prime?) and set to true. This is so we can pass into the first conditional.
- While the conditional is true (the number is not prime)
- Set the conditional to false. We are setting the assumption that the number is indeed a prime.
- increment the currentNo value by 1.
- Create a for-loop to go through all of the numbers between 2 and the currentNo minus 1.
- If the modulus of currentNo divided by the increment value (i) equals 0, then the number cannot be prime, so we set the test flag as true and break from the loop. Otherwise, we continue iterating through the set range of numbers until we exhaust it, or we manage to change the flag.
- If the test flag is true, re-run the sequence until it is not.
- Return the first value that results in the for-loop not changing the test flag to true.
Our initial process worked, but the performance is dreadful for large sets.
We have a method that will find the next prime number that occurs after any given number.
But I wanted to take the process a bit further and test the Unfortunately, the previous code falls down pretty hard when looking at large numbers.
So I decided to look at some ways to speed up the process. The obvious sticking point was the [for..next] loop, particularly the method in which we determine the next occurring number that doesn't have a divisor lower than the base case.
So the next step was to try to reduce the number of values we are able to evaluate to increase performance. If we work form the assumption that the next prime number will certainly *not* be divisiable by 1/2 of the base number, we can set this number as a cutoff point.
So the logic changes as follows:
- Take the current value, add 1 as x.
-
For each number between 2 and one half of ((x) +1) as y
- See if x is divisible from y.
- If it is, exit the loop, increment the value of x by 1 and repeat the process.
-
If not, continue through the loop.
- If we reach the end of the loop without finding a match, we can assume that the number is prime.
- Return the value of x.
public int nextPrime(int lastNo)
{
int currentNo = lastNo;
bool isNotPrime = true; while (isNotPrime)
{
isNotPrime = false;
currentNo++;
for (int i = 2; i < (currentNo/2+1); i++)
{
if (currentNo % i == 0)
{
isNotPrime = true;
break;
}
}
return currentNo;
}
}
The previous method wasn't very efficient for large number sets, as we can see from the run chart below, but our update cuts the runtime down quite a bit:
Number of Records | Initial Code | Optimized Code |
100 | 0.0007 sec | 0.0005 sec |
1000 | 0.057 sec | 0.031 sec |
10000 | 3.47 sec | 1.59 sec |
100000 | 411.75 sec | 209.99 sec |
But this is still not as optimized as it could be. I made a logical choice about how to reduce the number of checks it makes to determine if a number is prime. I assumed that the number greater that 1/2 of x would be disqualified as a dividing candidate, but could we narrow that down a bit more?
Turns out that we can:
public int nextPrime(int lastNo)
{
int currentNo = lastNo;
bool isNotPrime = true; while (isNotPrime)
{
isNotPrime = false;
currentNo++;
for (int i = 2; i < (Math.Sqrt(currentNo+1); i++)
{
if (currentNo % i == 0)
{
isNotPrime = true;
break;
}
}
return currentNo;
}
}
So - what are we doing now?
- Take the current value, add 1 as x.
-
For each number between 2 and the square root of ((x) +1) as y
- See if x is divisible from y.
- If it is, exit the loop, increment the value of x by 1 and repeat the process.
-
If not, continue through the loop.
- If we reach the end of the loop without finding a match, we can assume that the number is prime.
- Return the value of x.
This way, we've significantly reduced the number of steps taken in determining the next prime number, as the testing demonstrates:
Number of Records | Initial Code | Optimized Code ((x/2) + 1) | Optimized Code (√(x+1)) |
100 | 0.0007 sec | 0.0005 sec | 0.0006 sec |
1,000 | 0.057 sec | 0.031 sec | 0.0016 sec |
10,000 | 3.47 sec | 1.59 sec | 0.034 sec |
100,000 | 411.75 sec | 209.99 sec | 1.02 sec |
1,000,000 | n/a | n/a | 24.90 sec |
5,000,000 | n/a | n/a | 240.06 sec |
Based on the last run of tests, it didn't seem practical to run the first two algorithms through more than 100,000 records. It is clear that at a certain point, the gains in speed go a bit linear. Maybe I'll create a process to collect the data into a database by number of iterations as an excuse to play with Power BI.
Now - I'm running my test on a Dell XPS 13 9730 (i7 4 core processor, 16GB), so I wasn't expecting the earth to actually move. But I am wondering:
- What updates could I make to optimize the process even further?
- At what points are each algorithm alternatives optimal for the number of items?
- Is there any practical use for being able to generate 10,000,000 prime numbers quickly?
(shrugs)
But- it looks like there is a way to sped it up just a tiny bit more . . .
So - the process looks at every integer after (2) and iterates by (1) until it reaches the square root of the number we're testing.' Given the equasion above, I was happy with the performance overall.
So the question we're addressing throughout this entire solution has been: how do we limit the number of tests run to determine if a number is prime or not?
public int nextPrime(int lastNo) { int currentNo = lastNo; bool isNotPrime = true; if (currentNo % 3 == 0 || currentNo % 5 == 0 || currentNo % 7 == 0 || currentNo % 11 == 0) currentNo += 2; while (isNotPrime) { isNotPrime = false; currentNo++; for (int i = 2; i < (Math.Sqrt(currentNo+1)); i+=2) { if (currentNo % i == 0) { isNotPrime = true; break; } } return currentNo; } }
So - what have we done?
Two things:
- We eliminate [3,5,7,11] from our initial process. We only do this once - if the current number is in the list, we'll add [2] and move on through the peocess.
-
We're making the assumption that (because of the process), we're only processing odd numbers. Since no even number can be prime, we can cut the number of tests by half.
We do this by always intoducing an odd number to the funtion. The process always iterates the count by [2].
Number of Records | Initial Code | Optimized Code (√(x+1)) | Optimized Code (((x/2) + 1), odd only) |
100 | 0.0007 sec | 0.0 sec | 0.0 sec |
1,000 | 0.057 sec | 0.0 sec | 0.0 sec |
10,000 | 3.47 sec | 0.07 sec | 0.03 sec |
100,000 | 411.75 sec | 0.243 sec | 0.113 sec |
1,000,000 | n/a | 7.564 sec | 3.88 sec |
10,000,000 | n/a | 243.554 sec | 126.881 sec |
Result! Looks like we were able to actually cut the time of the process almost in half!
And with that, I'll just put this one to bed (unless I think of something else, like including a multithreading element, or measuring the speed to determine the next prime number after 1 billion +1!)