Primality Testing

2018

Introduction

Terminology

Generalized Algorithm

Limit Checking

Performance

Mathematically Speaking



Related functions and programs:
NBR_UTIL - prime number and integer square root functions; e.g., nbrIsPrime() and nbrNextPrime().
primate - prime number algorithms test program.

Introduction

"We don't do this song very much, but I feel like singing so, I guess, that's what we're gonna do."

     - Duane Allman, "Dimples", Ludlow Garage, 1970 (Wikipedia)

I feel like writing so, I guess, that's what I'm going to do!

WARNING: All over the internet, you'll find example code that determines if a number is prime. Nearly all the examples I've seen loop through divisors while "i×i ≤ n" (i.e., while the square of the next divisor to be tested is less than or equal to the number). Squaring the divisor is fine if you're using arbitrary-precision arithmetic, but it can fail at the high end if you're using fixed-size integers. Take unsigned 32-bit numbers for example:

       ~65535.9^2 = 4,294,967,295 (maximum unsigned 32-bit number)
        65535.0^2 = 4,294,836,225

There are 5,853 prime numbers between 65535.0^2 and ~65535.9^2. When checking the next divisor after 65,535 for any of these primes, the i×i computation will overflow, leading to meaningless, possibly infinite loops. This is discussed in more detail in the "Limit Checking" section below.
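
To see the failure concretely, here is a minimal C sketch (not part of the NBR_UTIL package) using one of the 5,853 primes mentioned above; the divisor's square wraps around instead of stopping the loop:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main (void)
    {
        uint32_t n = 4294967291U ;  /* A prime between 65535.0^2 and 2^32-1. */
        uint32_t i = 65537 ;        /* First odd divisor past isqrt(n) = 65535. */
        uint32_t square = i * i ;   /* 4,295,098,369 wraps around to 131,073. */

        /* The wrapped product is still <= n, so a "while (i*i <= n)" loop
           keeps running long after it should have stopped. */
        printf ("i*i = %" PRIu32 ", i*i <= n is %s\n",
                square, (square <= n) ? "true (overflow!)" : "false") ;
        return 0 ;
    }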

NOTE: This prime number package is GROSSLY OVERENGINEERED. I began with an optimization that skipped testing potential divisors divisible by 2 or by 3 in a small, tight loop. I then got to wondering about skipping potential divisors divisible by an even wider range of prime divisors; e.g., skip divisors divisible by 2, 3, 5, 7, or 11. I generalized the code to handle an arbitrary range, [2..], of prime divisors. When skipping divisors divisible by 2 or by 3, the generalized code is virtually as fast as the hard-coded loop.

NOTE: I was REINVENTING THE WHEEL. I basically had and have a grade-school knowledge of prime numbers. After writing the code, I looked up on the internet fast ways of determining primes and I learned that "my" optimization was widely known (see the mathematics section below). So perhaps this was all a waste of time, but the open-ended exploration of the subject on my own was very intellectually satisfying for me.

NOTE: If a SMALL MEMORY FOOTPRINT is important, this package can be compiled to produce compact object code that uses the original hard-coded loop and omits the sequence-of-differences functions and data related to the generalized version of the code. To accomplish this, add

    -DNBR_SKIP_2_3_HARDCODED=1

to the compilation command for "nbr_util.c". If an application only calls nbrIsPrime() and/or nbrNextPrime(), its code does not need to be compiled with the CPP definition. If, however, the application calls any of the sequence-of-differences functions, the application should be compiled with the CPP definition. Macros in "nbr_util.h" substitute meaningful return values for the omitted functions.

NOTE: To jog your memory, if necessary, a NUMBER IS PRIME if it's positive and if its only divisors are 1 and itself; zero and one by definition are not prime. To determine if a number n is prime, you basically test if the number is evenly divisible (remainder 0) by any divisor in the range 2..√n. If the number is evenly divisible by some divisor, then the number is not prime. If none of the divisors evenly divide the number, then the number is prime.
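
For reference, here is that basic test as a bare-bones C function (a minimal sketch, not the package's nbrIsPrime()); note that the loop test divides instead of squaring, for the overflow reason given in the warning above:

    #include <stdbool.h>

    /* Trial division: test every divisor from 2 up to the integer square
       root of n.  "d <= n / d" is the overflow-safe form of "d*d <= n". */
    bool isPrime (unsigned long n)
    {
        if (n < 2)  return false ;              /* 0 and 1 are not prime. */
        for (unsigned long d = 2 ;  d <= n / d ;  d++) {
            if (n % d == 0)  return false ;     /* Found another divisor. */
        }
        return true ;
    }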

I wrote my HASH_UTIL hash table package in the late 1980s, as I recall. It has long been a nuisance that a program using the package has had to be explicitly linked to the C math library ("-lm"). So, about 30 years later, I decided to fix that! Thinking that the math library was required because of the use of sqrt(3) in (i) determining the next prime number larger than the expected table size and (ii) computing the standard deviation of the bucket lengths, I dove into the code. My memory hadn't served me well, because I discovered that the prime number function was not using sqrt(), but was instead iterating until:

    divisor > (number / divisor)

I don't know if I was considering the possibility of overflow at the time I wrote the hash table package, but it was a fortuitous choice for the loop test.

So far, so good, but I was red-faced to see that for three decades my code had been checking each and every possible divisor from 2 on up: 2, 3, 4, 5, 6, 7, ... Skipping all the even numbers after 2 was a painfully obvious optimization. And then I began thinking about also skipping odd numbers divisible by 3: check divisors 5, 7, ... skip 9 ..., 11, 13, ... skip 15 ..., 17, 19, etc. I noticed that the differences between successive divisors to be tested were 2, 4, 2, 4, etc. In other words, a repeating cycle of two differences. I implemented a hard-coded loop ("divisible" means evenly divisible with a remainder of 0):

    if (number equals 0 or 1)  return (not-prime) ;
    if (number equals 2 or 3)  return (is-prime) ;
    if (number divisible by 2 or by 3)  return (not-prime) ;

    for (divisor = 5 to "square root" of number) {
        if (number divisible by divisor)  return (not-prime) ;
        divisor += 2 ;
        if (number divisible by divisor)  return (not-prime) ;
        divisor += 4 ;
    }

    return (is-prime) ;

which checks the following divisors: [2, 3,] 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35, 37, ... Skipping odd divisors divisible by 3 is almost twice as fast as checking all of the odd numbers.
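
In C, the hard-coded loop might look like the sketch below (the package's actual nbrIsPrime() differs in its details). As in the pseudocode, the second divisor in the loop body is not limit checked; that wrinkle is taken up in the "Limit Checking" section.

    #include <stdbool.h>

    /* Skip divisors divisible by 2 or by 3: after the initial checks, test
       divisors 5, 7, 11, 13, ... by alternately adding 2 and 4. */
    bool isPrime23 (unsigned long number)
    {
        if (number < 2)  return false ;
        if ((number == 2) || (number == 3))  return true ;
        if ((number % 2 == 0) || (number % 3 == 0))  return false ;

        for (unsigned long divisor = 5 ;  divisor <= number / divisor ;  ) {
            if (number % divisor == 0)  return false ;
            divisor += 2 ;
            if (number % divisor == 0)  return false ;  /* Not limit checked. */
            divisor += 4 ;
        }
        return true ;
    }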

I was pleased by my independent "discovery" of this heretofore unthought-of optimization (/snark) and my thoughts naturally turned to skipping even more divisors.

Terminology

First, however, some terminology I use in this package:

Skip-Through Prime (STP) - is the maximum prime number for which divisors divisible by one or more of primes 2 through STP are skipped. In the pseudocode above, the skip-through prime is 3. As another example, if the skip-through prime is 11, nbrIsPrime() will skip divisors divisible by any of the primes 2 through 11: 2, 3, 5, 7, 11.

Start Divisor - is the divisor at which to begin the first cycle of testing. This is always the next prime after the skip-through prime. Using the examples above, STP 3's start divisor is 5 and STP 11's start divisor is 13.

Sequence of Differences - is an array of the differences between successive divisors to be tested. The differences skip over divisors that do not need to be tested; i.e., divisors evenly divisible by primes 2 through the skip-through prime.

Beginning with the start divisor, nbrIsPrime() tests if the target number is evenly divisible by the start divisor. If so, the number is not prime and nbrIsPrime() returns. If not, the first difference in the sequence is added to the start divisor to get the next divisor to test. The function continues testing successive divisors; when the end of the sequence is reached, nbrIsPrime() jumps back to the beginning of the sequence to continue with the next pass through the sequence.

For example, in the pseudocode above, the sequence of differences for STP 3 is { 2, 4 }. Beginning with start divisor 5, the loop iterations unfold as follows:

    (1) Test  5, Add 2, Test  7, Add 4
    (2) Test 11, Add 2, Test 13, Add 4
    (3) Test 17, Add 2, Test 19, Add 4
    (4) Test 23, Add 2, Test 25, Add 4
    (5) Test 29, ...

Limit Check - refers to checking if a divisor about to be tested is less than or equal to the square root of the target number. When testing all odd divisors (skip-through prime 2), there is only one test in the loop and the limit check is performed as part of the loop control. For skip-through primes 3 and above, there are multiple tests per loop iteration, corresponding to each difference in the prime's sequence. STP 5 has a sequence of 8 differences, for example, but adding 7 limit checks inside the loop incurs a performance hit. (The pseudocode above doesn't limit check the second test's divisor in the loop.) I discuss this issue in more detail in the "Limit Checking" section below.

Generalized Algorithm

The pseudocode for the generalized implementation inserts an inner loop that steps through the sequence of differences (again, "divisible" means evenly divisible with a remainder of 0):

    if (number equals 0 or 1)  return (not-prime) ;
    if (number equals any prime in 2..STP)  return (is-prime) ;
    if (number divisible by any prime in 2..STP)  return (not-prime) ;

    divisor = start divisor ;
    while (divisor ≤ "square root" of number) {
        for (each difference in sequence) {
            if (number divisible by divisor)  return (not-prime) ;
            divisor += difference ;
        }
    }

    return (is-prime) ;
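
A C rendering of the generalized algorithm might look like the following sketch. The primes 2..STP, the start divisor, and the sequence of differences are passed in by the caller here purely for illustration (the package computes and stores them itself); for STP 5 they would be {2, 3, 5}, 7, and {4, 2, 4, 2, 4, 6, 2, 6}. Unlike the pseudocode above, the sketch also limit checks inside the inner loop, for the reasons covered in the "Limit Checking" section below.

    #include <stdbool.h>
    #include <stddef.h>

    bool isPrimeSTP (unsigned long number,
                     const unsigned long primes[], size_t numPrimes,
                     unsigned long startDivisor,
                     const unsigned long diffs[], size_t numDiffs)
    {
        if (number < 2)  return false ;
        for (size_t i = 0 ;  i < numPrimes ;  i++) {        /* Primes 2..STP. */
            if (number == primes[i])  return true ;
            if (number % primes[i] == 0)  return false ;
        }

        unsigned long divisor = startDivisor ;
        while (divisor <= number / divisor) {               /* Outer limit check. */
            for (size_t i = 0 ;  i < numDiffs ;  i++) {     /* One pass over the sequence. */
                if (number % divisor == 0)  return false ;
                divisor += diffs[i] ;
                if (divisor > number / divisor)  break ;    /* Inner limit check. */
            }
        }
        return true ;
    }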

Figuring out the sequence of differences for skip-through prime 3 was easy enough to do in my head. I wrote a throw-away program to generate differences for skip-through primes 5, 7, and 11; examined the listings to determine when the sequences began repeating (8, 48, and 480, respectively); and noticed the first of a few relations I would find:

seq-length (next-prime) = seq-length (current-prime) × (next-prime - 1)

For example:

    Skip-through prime 2's sequence of differences, { 2 }, is of length 1.
    STP 3 has a sequence of length 2 = 1 × (3 - 1).
    STP 5 has a sequence of length 8 = 2 × (5 - 1).
    STP 7 has a sequence of length 48 = 8 × (7 - 1).
    STP 11 has a sequence of length 480 = 48 × (11 - 1).
    And so on.

Sequence length, for the skip-through primes under consideration (2 through 53 for 64-bit unsigned long integers), increases by one or two orders of magnitude for each new prime. STP 59's sequence length exceeds the maximum representable, unsigned 64-bit value.

range (next-prime) = range (current-prime) × next-prime

The range of a sequence of differences is the range of divisors covered by each pass over the sequence, found by summing all the differences in the sequence. STP 3's sequence is { 2, 4 }, whose range is 2 + 4 = 6; as an example above shows, the first divisor tested in each loop iteration is 5, 11, 17, 23, 29, ... STP 5's 8-element sequence has a range of 30, so the divisors at the beginning of each loop iteration are 7, 37, 67, 97, ... I wasn't clever enough to realize that the formula above is more simply expressed as the product of the preceding primes and the skip-through prime; e.g., the range of STP 7 is 2 × 3 × 5 × 7 = 210. Although STP 53's sequence length fits in an unsigned 64-bit integer, its range doesn't.

density (prime) = seq-length (prime) / range (prime)

Density, in this case, is a measure of how many numbers in a range are actually tested as divisors. The lower the density the better. If even numbers are skipped and odd numbers are tested (STP 2), the density is 50%. Higher skip-through primes lower the density even further at the expense of increasingly lengthy sequences of differences.

max-difference (next-prime) = current-prime × 2

Sometimes! The maximum differences in the sequences for skip-through primes [2,] 3, 5, 7, 11, 13, 17, 19, 29, and 31 obey this relation. For example, the maximum difference in STP 31's sequence is 29 × 2 = 58. However, skip-through prime 23's maximum difference is 40 (not the expected 19 × 2 = 38) and STP 37's is 66 (not the expected 31 × 2 = 62). Using nbrPrimeSeqDiffsMax() to generate the one trillion (10^12) differences in STP 37's sequence and determine the maximum difference originally consumed 99%-100% of the CPU for 8.5 days. I subsequently optimized the function by skipping both even divisors and odd divisors divisible by 3, reducing the time to 6.4 days. I'm not a glutton for punishment, so I did not give serious consideration to optimizing nbrPrimeSeqDiffs() and nbrPrimeSeqDiffsMax() further by using skip-through primes higher than 3. In any case, finding STP 41's maximum among its 44 trillion differences would take three-quarters of a year or more. Maybe there's a mathematical relation that gives the correct maximum difference for all skip-through primes, but nothing jumped out at me.

Knowing a sequence's maximum difference--and using differences in the first place--can significantly reduce memory usage. For unsigned 64-bit integers, the largest skip-through prime, 47, has an estimated maximum difference of 43 × 2 = 86, which can be represented in an unsigned 8-bit integer. STP 47's sequence of differences requires only one-eighth of the memory that the sequence of offsets in the math section below requires. (Actually, STP 53 is the largest skip-through prime in the differences form of the algorithm. It can't be used in the math section's offset form of the algorithm. The last offset for any STP is always one less than the STP's range and STP 53's range exceeds the maximum representable, unsigned 64-bit integer by 14×10^18.) Looking at unsigned 128-bit integers, the largest skip-through prime is 101 with an expected maximum difference of 202, which, if correct, can still be represented in an unsigned 8-bit integer. (This is all theoretical for the most part; the largest skip-through prime I've actually tested was 29 with its one-billion-element sequence.)
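
A quick back-of-the-envelope calculation makes the one-eighth figure concrete (a throwaway sketch; STP 47's sequence length is taken from the table below):

    #include <stdint.h>
    #include <stdio.h>

    int main (void)
    {
        uint64_t length = 85287729364992000ULL ;    /* STP 47's sequence length. */

        /* Differences (max ~86) fit in one byte each; explicit offsets, as in
           the offset form described in the math section below, need eight. */
        printf ("differences: %.1f petabytes\n", length * sizeof (uint8_t) / 1e15) ;
        printf ("offsets:     %.1f petabytes\n", length * sizeof (uint64_t) / 1e15) ;
        return 0 ;
    }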

Time for some actual numbers to get a sense of what we are dealing with. The following numbers are for unsigned 64-bit integers (L/R is the length divided by the range; i.e., density):

 Skip-                              Estimated   Sequence Range
Through   Start                      Maximum       (Sum of
 Prime   Divisor  Sequence Length   Difference   Differences)     L/R
-------  -------  ---------------   ----------  --------------    ---
   2        3                     1      2                    2   50%
   3        5                     2      4                    6   33%
   5        7                     8      6                   30   26%
   7       11                    48     10                  210   22%
  11       13                   480     14                 2310   20%
  13       17                  5760     22                30030   19%
  17       19                 92160     26               510510   18%
  19       23               1658880     34              9699690   17%
  23       29              36495360     38            223092870   16%
  29       31            1021870080     46           6469693230   15%
  31       37           30656102400     58         200560490130   15%
  37       41         1103619686400     62        7420738134810   14%
  41       43        44144787456000     74      304250263527210   14%
  43       47      1854081073152000     82    13082761331670030   14%
  47       53     85287729364992000     86   614889782588491410   13%
  53       59   4434961926979584000     94             overflow   13%
    SIZE_MAX:  18446744073709551615

(The primate program computes the range as both an integer and a floating-point number. If the integer overflows, the floating-point number is substituted in the density calculation. So the density for skip-through prime 53 is correctly listed as 13%.)

For unsigned 128-bit integers, the following figures pick up with skip-through prime 53 (whose range overflowed 64 bits) and run through skip-through prime 101. The density at this point has only dropped down to 11.9%. (The sequence length and range both overflow for STP 103.)

  53 (Skip-Through Prime)    59 (Start Divisor)   94 (Est. Max. Diff.)
  Length:                      4434961926979584000    (4.43496e+18)
   Range:                     32589158477190044730    (3.25892e+19)

  ... through ...

  101 (Skip-Through Prime)  103 (Start Divisor)  202 (Est. Max. Diff.)
  Length:   27739969042773783995307880611840000000    (2.774e+37)
   Range:  232862364358497360900063316880507363070    (2.32862e+38)
SIZE_MAX:  340282366920938463463374607431768211455    (3.40282e+38)

  103 (Skip-Through Prime)    107 (Start Divisor)
  Length:   overflow
   Range:   overflow

Knowing the exploding lengths and ranges for the higher skip-through primes, whether for 64-bit or 128-bit numbers, was of intellectual interest to me, but these higher skip-through primes are obviously not of any practical value. The Prime Pages' Glossary entry for wheel factorization says, in different terminology than mine, that skip-through prime 251 reduces the density to 10% and 75,037 reduces it to 5%. Skip-through prime 134,253,593, with a sequence length of approximately 6.4×10^99, drives the density down to 3%!

Limit Checking

"Limit checking" refers to not testing divisors greater than the square root of the number when determining if the number is prime.

First of all, heed my warning at the very beginning about the danger of overflow using the widely seen "i×i ≤ n" check with fixed-precision integers. The following explanation uses 64-bit unsigned longs, but the warning applies to any bit width.

The integer square root of the maximum representable, 64-bit value, ULONG_MAX, is 0x0FFFFFFFF — all of the lower 32 bits set. (The real square root of ULONG_MAX is 0x0FFFFFFFF plus about 0.9 decimal.) If you square the integer square root, the result is a number that is 8.6 billion less than ULONG_MAX. In other words, the 8.6 billion numbers in the range [root^2..ULONG_MAX] all have the same integer square root: 0x0FFFFFFFF. There are 193 million primes in that range. Consequently, for each of the 193 million primes tested with divisor i:

  1. The divisor i eventually reaches the integer square root, 0x0FFFFFFFF. Its square is still less than or equal to the prime, so the loop continues.

  2. The next divisor to be tested is greater than 0x0FFFFFFFF, so i×i exceeds 64 bits and wraps around to a value far smaller than the prime.

  3. The wrapped-around product keeps the "i×i ≤ n" test true, so the loop marches on past the actual square root; from then on the comparisons are meaningless and the loop may run for an extremely long time.

So 193 million primes are excluded from being tested because of the i×i multiplication. The steps above will occur when testing every possible divisor (i++) or only odd divisors (i += 2).

Additional optimizations in the algorithm make the situation even worse. If the algorithm is extended to also skip divisors divisible by 3 (my first optimization), the divisor is incremented alternately by 2 and by 4. Since 0x0FFFFFFFF is divisible by 3 and will be skipped, i will only reach the preceding odd number, 0x0FFFFFFFD, and then be incremented by 4 to 0x100000001, causing i×i to overflow. The square of the preceding odd number is 25.8 billion less than ULONG_MAX. Testing the primality of the 579 million primes in that range — those whose integer square root is 0x0FFFFFFFD, 0x0FFFFFFFE, or 0x0FFFFFFFF — is impossible.

To avoid overflow with fixed-precision integers, divide both sides of the relational operator by i:

    if (i×i ≤ n) ... continue checking divisors ... (INCORRECT)

    if (i ≤ n/i) ... continue checking divisors ... (CORRECT)

Now that there is no danger of overflow, there are some other aspects of limit checking in relation to my skip-through prime optimization that need to be examined. Remember that the algorithm is basically two nested loops, where the inner loop runs through the elements in the sequence of differences and the outer loop repeats the inner loop as often as necessary to reach and pass the square root of the number (if the number is prime):

    div = STP's start divisor
    while (div ≤ n/div) {
        foreach difference in sequence {
            ... test if n is evenly divisible by div ...
            div = div + difference
        }
    }

In the Wikipedia pseudocode for the related c#k+i algorithm (see the mathematics section below), the "inner loop" corresponding to my inner loop tests the same number of divisors as mine does, but without limit checking the divisors. (I put "inner loop" in quotes because the pseudocode is hard-coded for the STP 3 optimization and the "loop" is a single IF-statement that checks both the 6(k-1)+5 and "6k+1" divisors; again, the c#k+i version of my STP 3 optimization is explained in the mathematics section.) It is a good idea, however, to limit check each divisor inside the inner loop as well as the outer loop:

    div = STP's start divisor
    while (div ≤ n/div) {
        foreach difference in sequence {
            ... test if n is evenly divisible by div ...
            div = div + difference
            if (div > n/div)  break
        }
    }

(Whether the check is at the beginning or end of the inner loop, there will always be one check that duplicates the check in the outer loop on each pass through the sequence.)

Because the length of the sequence and range increase rapidly with higher skip-through primes, there are two cases that are of concern when there is no limit check in the inner loop:

  1. The primality test must run through the entire sequence of differences. If the divisor exceeds the square root early in a pass through the sequence, doing all the remaining tests in the sequence is a waste of time. Assume the function uses STP 11 with its 480-difference sequence. If on some cycle the divisor surpasses the square root at the 5th difference, the inner loop will uselessly continue checking the other 475 divisors, all of which are greater than the square root.

  2. If the range of a sequence is greater than a prime number, continued testing of divisors past the square root might reach a point where the prime number itself is tested as a divisor. "n%n == 0", so the number will be rejected as composite even though it's really prime!

The first case above is a matter of efficiency; the second case requires flagging a number as composite if "n%div == 0" and if "n != div".

Adding the limit check, "div > n/div", to the inner loop caused a big performance hit. My first attempt at ameliorating the slow-down was to add a global boolean variable that the calling program could set to enable or disable limit checking in the inner loop:

    if (gCheckLimits && (div > n/div))  ...

Testing a boolean flag takes no time at all and, when limit checking is disabled, the algorithm is as fast as the original no-limit-checking algorithm. However, I was not satisfied with burdening the calling program with the responsibility of deciding whether to turn limit checking on or off.

My second attempt took advantage of a sequence's range to put off limit checking until the tested divisor got "close" to the square root of the input number. First, I computed a rough estimate of the square root by repeatedly dividing the number by 2 until "root ≤ n/root". Then, in the outer loop, before beginning the inner loop each time, figure out if the rough root estimate falls within the range of numbers covered by the coming pass through the sequence of differences. If not, clear a local boolean variable to disable limit checking in the inner loop. If the estimated root does fall in the range, set the boolean variable to enable checking.

    ... compute rough estimate of square root ...
    checkLimits = false
    div = STP's start divisor
    while (div ≤ n/div) {
        if (div ≤ root ≤ (div+range-1))  checkLimits = true
        foreach difference in sequence {
            ... test if n is evenly divisible by div ...
            div = div + difference
            if (checkLimits && (div > n/div))  break
        }
    }
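
The rough estimate itself is just a halving loop; a minimal sketch in C (the variable names are mine, not the package's):

    /* Rough square-root estimate: halve n until the estimate is no longer
       greater than n divided by the estimate.  The result lands roughly
       between half the true square root and the true square root. */
    unsigned long roughRoot (unsigned long n)
    {
        unsigned long root = n ;
        while ((root > 1) && (root > n / root)) {
            root /= 2 ;
        }
        return root ;
    }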

A big improvement without intervention by the calling program, but ...

  1. The estimate of the square root could be off by as much as half of the actual square root, so limit checking might begin multiple range blocks before the range containing the actual square root. Obviously, it would be faster to only check limits in the last block.

  2. The large range of a higher skip-through prime's sequence may be greater than the estimated or actual square root and limit checking would then begin with the very first divisor of the very first pass through the sequence; i.e., the skip-through prime's start divisor. Thus the algorithm reverts to the slow check-every-divisor mode.

A more accurate estimate of the square root was needed, so I searched for "fast integer square root" on the web and, not surprisingly, up popped numerous references to a technical bulletin, "Fast Integer Square Root", by Ross M. Fosler. (I added an "s" to the title for my own description of the very fast algorithm and how I implemented it, "Fast Integer Square Roots".) My is-prime function now knows exactly when the tested divisors pass the actual square root of the number and the function no longer needs to perform the time-consuming "div > n/div" check in the inner loop:

    root = nbrUnsignedSQRT (n)
    div = STP's start divisor
    while (div ≤ root) {
        foreach difference in sequence {
            ... test if n is evenly divisible by div ...
            div = div + difference
            if (div > root)  break
        }
    }
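
In C, the final shape of the test looks like the earlier isPrimeSTP() sketch with the exact root computed once up front. The isqrt() below is a plain bit-at-a-time stand-in for the package's nbrUnsignedSQRT(), which uses Fosler's method; it assumes 64-bit unsigned longs, as does the rest of this discussion.

    #include <stdbool.h>
    #include <stddef.h>

    /* Simple bit-at-a-time integer square root (stand-in for nbrUnsignedSQRT()). */
    static unsigned long isqrt (unsigned long n)
    {
        unsigned long root = 0, bit = 1UL << 62 ;   /* Highest even power of 2. */
        while (bit > n)  bit >>= 2 ;
        while (bit != 0) {
            if (n >= root + bit) {
                n -= root + bit ;
                root = (root >> 1) + bit ;
            } else {
                root >>= 1 ;
            }
            bit >>= 2 ;
        }
        return root ;
    }

    bool isPrimeSTP (unsigned long number,
                     const unsigned long primes[], size_t numPrimes,
                     unsigned long startDivisor,
                     const unsigned long diffs[], size_t numDiffs)
    {
        if (number < 2)  return false ;
        for (size_t i = 0 ;  i < numPrimes ;  i++) {
            if (number == primes[i])  return true ;
            if (number % primes[i] == 0)  return false ;
        }

        unsigned long root = isqrt (number) ;       /* Exact integer square root. */
        unsigned long divisor = startDivisor ;
        while (divisor <= root) {
            for (size_t i = 0 ;  i < numDiffs ;  i++) {
                if (number % divisor == 0)  return false ;
                divisor += diffs[i] ;
                if (divisor > root)  break ;        /* Cheap inner limit check. */
            }
        }
        return true ;
    }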

Performance

Performance tests were conducted using my primate test program. The command lines are shown below. The "output" following the command lines is not the output of the program, but rather my tabulation of the results of multiple runs. When primate is in "-time" mode, most output is suppressed so that I/O is not a significant factor in the times. Also, the generation of sequences of differences is done before the timer starts counting.

My original prime number function was in my HASH_UTIL package and the largest hash table I ever needed was for about 80,000 telemetry points. (Most of those points were variable names tied to memory locations in the onboard computer and would only be downloaded, selectively, when trying to diagnose a problem.) So my first suite of tests listed all the prime numbers in the range 80,000-200,000. Since listing those numbers is very quick, I had each run repeat the test 1000 times to get a large enough timing figure for comparison's sake.

  1. Determine the prime numbers in the range 80,000-200,000; repeat 1000 times:

        % primate -time -repeat 1000 -list 80000-200000 -alternate odds
    
            STP 2*   209.9 seconds    * hand-coded loop to check odd divisors
    
        % primate -time -repeat 1000 -list 80000-200000 -qprime stp
    
            STP 2    221.3 seconds
            STP 3    127.5 seconds
            STP 5     87.0 seconds
            STP 7     71.0 seconds
            STP 11    65.5 seconds
            STP 13    62.2 seconds
            STP 17    60.3 seconds
            STP 19    59.1 seconds
            STP 23    58.5 seconds
            STP 29    58.0 seconds
    
    The first test ("-alternate odds") used a run-of-the-mill, hand-coded loop to check all odd divisors. The remaining tests ("-qprime stp") used my sequence of differences, hence the slightly higher time for STP 2. As expected, the times decrease as the skip-through prime increases.

The second suite of tests determines the 1,000 largest prime numbers representable in unsigned, 64-bit integers. These numbers are verified by matching them against a pre-computed list of the primes generated by a different program, Achim Flammenkamp's prime_sieve, found at his web page:

"The Sieve of Eratosthenes"
http://wwwhomes.uni-bielefeld.de/achim/prime_sieve.html

[Modifications to his program: (i) define LONG as "unsigned long", (ii) undefine _ANSI_EXTENSION, (iii) uncomment the commented-out code in the definition of "use(x)", and (iv) hard-code the initial sieve size.]

  1. Determine and verify the largest 1,000 prime numbers representable in unsigned, 64-bit integers against a pre-computed (by a different program) list of those primes:

        % primate -time -verify last64 -qprime stp
    
            STP 3     90,842 seconds    (25.2 hours)
            STP 5     55,797 seconds    (15.5 hours)
            STP 7     43,093 seconds    (12.0 hours)
            STP 11    38,338 seconds    (10.6 hours)
            STP 13    35,310 seconds    ( 9.8 hours)
            STP 17    33,234 seconds    ( 9.2 hours)
            STP 19    31,667 seconds    ( 8.8 hours)
            STP 23    30,238 seconds    ( 8.4 hours)
            STP 29    29,430 seconds    ( 8.2 hours)
    

The most time-consuming aspect of my algorithm is generating the sequence of differences for the higher skip-through primes. My computer was able to generate a physical, 1-billion-byte sequence of differences for STP 29, but malloc(3) failed trying to allocate the 31-billion-byte sequence for STP 31. The nbrPrimeSeqDiffsMax() function generates the same sequences of differences, but without requiring any memory for the sequence: each difference is generated, compared to the running maximum value, and then discarded. So I was able to time sequence generation for STPs 31 and 37; the latter took 6.4 days and I didn't have the patience to wait 280+ days for the generation of STP 41's 44-trillion-element sequence to complete.

  1. Generate sequences of differences:

        % primate -time -qmax stp
    
            STP 17        0.04 seconds             92 thousand elements
            STP 19        0.4 seconds               1.7 million elements
            STP 23       11 seconds                36 million elements
            STP 29      407 seconds  (6.8 minutes)  1 billion elements
            STP 31   12,602 seconds  (3.5 hours)   31 billion elements
            STP 37  550,670 seconds  (6.4 days)     1 trillion elements
    

Mathematically Speaking

When the code was pretty much in its final form, bar tweaking of the limit checking, I researched ways to speed up generating sequences of differences (e.g., originally 8.5 days for skip-through prime 37's one trillion differences, a 10^12-length sequence). This research was unsuccessful, but I came upon many queries on programming forums about determining if a number is prime and many responses about an optimized method called "6k +/- 1". I then searched for information about 6k +/- 1 and landed on the Wikipedia page about primality testing using "trial division", i.e., looping through all possible divisors to find a divisor other than 1 for the number. From Wikipedia (Wikipedia: Primality test - Simple methods):

The algorithm can be improved further by observing that all primes greater than 6 are of the form 6k ± 1. This is because all integers can be expressed as (6k + i) for some integer k and for i = -1, 0, 1, 2, 3, or 4; 2 divides (6k + 0), (6k + 2), (6k + 4); and 3 divides (6k + 3). So, a more efficient method is to test if n is divisible by 2 or 3, then to check through all the numbers of the form 6k ± 1 ≤ √n. This is 3 times as fast as testing all m.

Generalising further, it can be seen that all primes greater than c# are of the form c#k + i for i < c# where c and k are integers, c# represents c primorial, and i represents the numbers that are coprime to c#. For example, let c = 6. Then c# = 2 ⋅ 3 ⋅ 5 = 30. All integers are of the form 30k + i for i = 0, 1, 2, …, 29 and k an integer. However, 2 divides 0, 2, 4, …, 28 and 3 divides 0, 3, 6, …, 27 and 5 divides 0, 5, 10, …, 25. So all prime numbers greater than 30 are of the form 30k + i for i = 1, 7, 11, 13, 17, 19, 23, 29 (i.e. for i < 30 such that gcd(i,30) = 1). Note that if i and 30 were not coprime, then 30k + i would be divisible by a prime divisor of 30, namely 2, 3 or 5, and would therefore not be prime. [Last sentence: ... for ex:- 437 ... Dog's bollocks — I learn something new every day!]

So all primes are of the form c#k+i for i meeting the conditions above. (Note that the reverse is not true; numbers of the form c#k+i are not necessarily prime.) Relating this to my code:

c   is my skip-through prime. (Strictly speaking, c can be any number in the range STP ... next-prime - 1; in other words, STP is the largest prime less than or equal to c.)

c#   is "c primorial", where primorial is simply the product of all the primes less than or equal to c; i.e., 2x3x5x7x... This value is identical to my "range" of a sequence of differences. (Wikipedia: Primorial)

k   indexes the blocks of range c# of divisors to be checked. Incrementing k is like reaching the end of a sequence of differences in my code and looping back to start at the beginning of the sequence for the next higher block of divisors.

i   is drawn from the set of numbers in the effective range 1..c#-1 such that each i is coprime with c#. Two numbers are coprime with respect to each other if they share no divisors other than 1. Note that "coprime" is a relation between numbers, not an attribute (like "prime") of a single number. (Wikipedia: Coprime integers)

So the c#k+i optimization is the foundation for my code, unbeknownst to me when I wrote the code! The difference between the c#k+i method and my code is that the former method works with explicit i offsets from the base of a c#k block and my code works with the sequence of differences between successive i offsets.

Now let's consider the set of numbers coprime with c# from which i is drawn. The numbers are in the range 0..c#-1. Zero (0) drops out because the first multiplier in c# is 2, so c#k+0 is always even (divisible by 2). The very first number in the set is always 1 since the only divisor it shares with c# is 1.

The second number in the set is always the next prime after c: the numbers between 1 and the next prime, exclusive, are either equal to or evenly divisible by one of the primes less than or equal to c (that's why the next prime is prime!), so these numbers drop through the "sieve". NOTE that the next prime, the second number in the set, is the start divisor in my algorithm. My sequence of differences begins with the difference between the second and third elements of the set and ends with the difference between the first and second elements.

The remaining numbers in the set are determined by checking the divisors of the numbers after the next prime. If a number is evenly divisible by any of the primes less than or equal to c, the number drops through the "sieve" and out of the set. Since the number of elements in a set rises rapidly with high c's, this is a time-consuming process. This process can itself be optimized by using the c#k+i method. My comparable generation of a sequence of differences uses a hard-coded skip-through prime of 3; as I mentioned above, it didn't seem worth the effort to generalize that bit of code for higher STPs.
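
For small values of c, the set of i's and the corresponding sequence of differences can be generated by brute force, as in this sketch for c = 5 (a throwaway illustration, not the package's nbrPrimeSeqDiffs()):

    #include <stdio.h>

    int main (void)
    {
        unsigned long primes[] = { 2, 3, 5 } ;      /* c = 5, c# = 30 */
        unsigned long primorial = 2 * 3 * 5 ;
        unsigned long set[32], numSet = 0 ;

        /* Collect the i's in 1..c#-1 that are coprime with c#. */
        for (unsigned long i = 1 ;  i < primorial ;  i++) {
            int coprime = 1 ;
            for (int j = 0 ;  j < 3 ;  j++) {
                if (i % primes[j] == 0)  coprime = 0 ;
            }
            if (coprime)  set[numSet++] = i ;
        }
        for (unsigned long k = 0 ;  k < numSet ;  k++)      /* 1 7 11 13 17 19 23 29 */
            printf ("%lu ", set[k]) ;
        printf ("\n") ;

        /* The sequence of differences starts at the start divisor, set[1],
           wraps past c#, and ends with set[1] - set[0]. */
        for (unsigned long k = 1 ;  k < numSet ;  k++) {
            unsigned long next = (k + 1 < numSet) ? set[k+1] : primorial + set[0] ;
            printf ("%lu ", next - set[k]) ;
        }
        printf ("%lu\n", set[1] - set[0]) ;                 /* 4 2 4 2 4 6 2 6 */
        return 0 ;
    }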

Skip-Through Primes

Here are some example skip-through primes in their c#k+i form. (In the first example, 1 is not prime, but the algorithm will work with the given parameters to check every possible divisor, odd and even.)

    c = 1    c# = 1
    i in { 1 }

    c = 2    c# = 2
    i in { 1 }

    c = 3    c# = 6
    i in { 1, 5 }

    c = 5    c# = 30
    i in { 1, 7, 11, 13, 17, 19, 23, 29 }

    c = 7    c# = 210            48 elements in set
    i in { 1, 11, 13, 17, 19, 23, 29, 31, ..., 199, 209 }

    c = 11   c# = 2310		480 elements in set
    i in { 1, 13, 17, 19, 23, 29, 31, 37, ..., 2297, 2309 }

    ...

    c = 23   c# = 223,092,870    36,495,360 elements in set
    i in { 1, 29, 31, 37, 41, ..., 223092841, 223092869 }

    c = 29   c# = 6,469,693,230  1,021,870,080 elements in set
    i in { 1, 31, 37, 41, 43, ..., 6469693199, 6469693229 }

Note that the algorithm has to ignore the very first potential divisor: when k is 0 and i is 1, c#k+i = 1 and any number being divisible by 1 does not mean the number is composite. Effectively, the first divisor checked is the next prime after c, my start divisor. (Of course, this is after the initial check if the target number is equal to or divisible by the primes less than or equal to c).

I just noticed that the difference between the first two elements in a set equals the difference between the last two elements, at least up to c=29. I don't know if there's a mathematical reason for this or if it holds true for all possible sets.

You might be wondering how the "6k +/- 1" form widely seen on the internet relates to the c#k+i scheme of 6k+i for i in { 1, 5 }. Swap the arithmetic operations and "6k -/+ 1" becomes shorthand for the +5 term of the previous k block and the +1 term of the current k block:

    c = 3    c# = 2×3 = 6    i in { 1, 5 }
    Previous k block: 6(k-1)+5 = 6k-6+5 = 6k-1
    Current k block:  6k+1

A graphical representation, I guess you would call it, of the c#k+i optimization is called wheel factorization (Wikipedia: Wheel factorization). Like my algorithm, the algorithm presented on the Wikipedia page (i) explicitly begins checking divisors with the next prime after c and (ii) uses the differences between divisors rather than the i offsets.

Sequence Length

Lastly, a remark upon my formula for the length of a sequence of differences:

seq-length (next-prime) = seq-length (current-prime) ⋅ (next-prime - 1)

As mentioned before, the sequence length is the count (or totient) of the numbers in 1..c#-1 that are coprime with c#. Euler's product formula (Wikipedia: Euler's product formula) can be used to compute the totient, phi(n), for a given number n:

φ(n) = n ⋅ ∏(p|n) (1 - 1/p)

which means n times the product of successive (1 - 1/p)'s for all primes p that are divisors of n:

    phi = n
    for each prime number p that is a divisor of n {
        phi = phi * (1 - 1/p)
    }

(Note that Euler's formula can be used for arbitrary positive integers, but what is of interest in the primality testing context is specifically when n is an instance of a c#.)

With p[k] being a prime number and p[k+1] the next larger prime number, my sequence-length formula can be derived from Euler's formula as follows:

    φ(p[k+1]#) = φ(p[k]#) ⋅ p[k+1] ⋅ (1 - 1/p[k+1])
               = φ(p[k]#) ⋅ p[k+1] ⋅ ((p[k+1] - 1)/p[k+1])
               = φ(p[k]#) ⋅ (p[k+1] - 1)
               = seq-length (current-prime) ⋅ (next-prime - 1)

I am not a mathematician and my heart sank after reading the first two paragraphs on totient's Wikipedia page. The rest of the page is not for the mathematically faint of heart. Fortunately, I really only needed to understand Euler's product formula; some ancillary Wikipedia pages on the notation helped me figure out how to apply the formula. I tried it on three skip-through primes:

    5 (5# = 30, φ(30) = 8);
    7 (7# = 210, φ(210) = 48); and
   11 (11# = 2310, φ(2310) = 480).

The pattern of the calculations became obvious and I could then work out the derivation above of my formula after the fact.
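
A few lines of C confirm those three cases and make the pattern explicit; for primorials, Euler's formula collapses to multiplying in a factor of (p - 1) per prime, which is just the seq-length recurrence (a quick sketch, not package code):

    #include <stdio.h>

    int main (void)
    {
        unsigned long primes[] = { 2, 3, 5, 7, 11 } ;
        unsigned long primorial = 1, phi = 1 ;

        for (int k = 0 ;  k < 5 ;  k++) {
            primorial *= primes[k] ;        /* p#                          */
            phi *= primes[k] - 1 ;          /* phi(p#) gains a factor p-1  */
            printf ("%lu# = %lu, phi = %lu\n", primes[k], primorial, phi) ;
        }
        /* The last three lines printed are phi(30) = 8, phi(210) = 48,
           and phi(2310) = 480, matching the examples above. */
        return 0 ;
    }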

