Enumeration to 1.6*10^15 of the prime quadruplets

Thomas R. Nicely

http://www.trnicely.net
Current e-mail address


Copyright © 2007 Thomas R. Nicely. All rights reserved. This document may be freely reproduced and distributed for educational and non-profit purposes.

Date of first release 23 August 1999.

Last revised 0500 EDT 28 March 2007.

The content of this document (other than the addendum, which was not part of the submission for publication) is essentially that of the original release, except that information rendered obsolete by subsequent events has been removed or modified, in both the main document and the addendum (this liberty is taken in view of the fact that the paper was never accepted for publication). There may also be differences in formatting, and in minor details and corrections.

Abstract

Enumeration of the prime quadruplets (q, q+2, q+6, q+8), and the sum of their reciprocals, is extended to 1.6*10^15. An estimate is obtained for the limit of the sum of the reciprocals, an analog to Brun's constant, B_4 = 0.87058 83800 +/- 0.00000 00005. Error analysis is presented to support the opinion that the stated error bound represents a 99 % confidence level.

Mathematics Subject Classification 2000 (MSC2000)

Primary: 11A41.
Secondary: 11-04, 11Y70, 11Y60.

Key Words and Phrases

Prime numbers, prime quadruplets, prime constellations, Brun's constant, prime k-tuples conjecture.


1. Introduction

Certain groupings (k-tuples, or constellations) of two or more prime numbers, such as the twin-prime pairs K_2 = {(3, 5), (5, 7), (11, 13), (17, 19), ...}, are believed to constitute infinite sets. Other such groupings may occur only a finite number of times, and perhaps not at all; e.g., the prime triplets (q, q+2, q+4) produce only the one instance (3, 5, 7), while there are no prime triplets at all of the form (q, q+1, q+2). Constellations which, due to divisibility considerations and the resulting residue classes occupied, can have only a finite number of instances (possibly none), are termed inadmissible; the remaining constellations are termed admissible. The question of infinitude remains undecided for all the admissible constellations; in the case of the twins, the proposition of infinitude is referred to as the "twin-primes conjecture." Hardy and Littlewood [4] conjectured a much stronger and more quantitative description of the distribution of the admissible constellations, the "prime k-tuples conjecture," which imples not only the infinitude of every admissible constellation, but yields also quite accurate asymptotic formulas for the number of members of a particular admissible constellation which lie below a specified upper bound x.

2. The Prime Quadruplets

Most of the study of the prime constellations has been focused on the twin primes; see [7] and [9] for some recent results. This paper considers instead the constellation of prime quadruplets of the form (q, q+2, q+6, q+8), the set

(2.1)        K_4 = {(5, 7, 11, 13), (11, 13, 17, 19), (101, 103, 107, 109), ...} ,

in which the first element of each member (except the anomalous initial one) is of the form 30n + 11. As with other admissible constellations, it is not known if K_4 is infinite. It is known that there exist extremely large numbers and instances of such prime quadruplets; more than a thousand million of them have been counted beyond 10^15 in this study, and Forbes [2] reports an instance in which the initial prime is a number of 1032 decimal digits,

(2.2)        q = 24947432928741915235*(2^3363 - 2^1121) - 6*(2^1121) - 7 .

Furthermore, the Hardy-Littlewood prime k-tuples conjecture implies that

(2.3)        pi_4(x) ~ L_4(x) = (27/2)*c_4*int(1/(ln(t))^4, t, 2, x) ,

where pi_4(x) represents the count of prime quadruplets (q, q+2, q+6, q+8) such that q <= x, and c_4 is the constant

(2.4)        c_4 = prod((p^3)*(p - 4)/(p - 1)^4, p, 5, +infinity) ,

with the infinite product taken over all the primes save 2 and 3. We shall refer to (2.3) as the Hardy-Littlewood approximation for the prime quadruplets, or simply the Hardy-Littlewood approximation; it clearly implies the infinitude of the quadruplets, and a great deal more. The validity of this conjecture will be central to our methods for estimating B_4 and its error bound. See Riesel [11, pp. 60-83] for detailed discussions and derivations of (2.3), (2.4), and similar consequences of the prime k-tuples conjecture. The constant c_4 has been computed to 45 significant digits by Harley [5],

(2.5)        c_4 = 0.30749 48787 58327 09312 33544 86071 07685 30221 78519... .

For the twin primes, Brun [1] proved that whether or not the set K_2 is infinite, the sum of the reciprocals

(2.6)        B_2 = (1/3 + 1/5) + (1/5 + 1/7) + (1/11 + 1/13) + (1/17 + 1/19) + ... ,

is convergent, in contrast to the known divergence of the sum of the reciprocals of all the primes. The limit of this sum is now referred to as Brun's constant,and has been approximated by Nicely [9] as

(2.7)        B_2 = 1.90216 05823 +/- 0.00000 00008 ,

where the error bound is believed to define a 95 % confidence interval.

In the same fashion, we define Brun's constant B_4 for the prime quadruplets as the limit of the sum of the reciprocals,

(2.8)      B_4 = (1/5 + 1/7 + 1/11 + 1/13) + (1/11 + 1/13 + 1/17 + 1/19)
                                                    + (1/101 + 1/103 + 1/107 + 1/109) + ...

As with B_2, we do not if the series is finite or infinite; but as a consequence of Brun's proof of the convergence of the sum of the reciprocals of the twins, we do know that the series for B_4 is convergent; for if the series is infinite, its terms are a proper subset of the series in (2.6). Since (2.6) is a positive series, its convergence is immune to deletion, rearrangement, or regrouping of terms; thus the series (2.8) defining B_4 must also be convergent. In essence, the quadruplets are constructed from a proper subset of the twins.

Although (2.8) is convergent, the monotonically increasing partial sums approach the limit quite slowly. However, assuming the validity of the Hardy-Littlewood approximation (2.3), a more rapidly converging first order extrapolation may be derived as follows. Define S_4(x) as the partial sum of the reciprocals of the quadruplets,

(2.9)        S_4(x) = sum(1/q + 1/(q+2) + 1/(q+6) + 1/(q+8), q, q <= x) .

Then the remainder term in the series defining B_4 may be approximated by

(2.10)        B_4 - S_4(x) = sum(1/q + 1/(q+2) + 1/(q+6) + 1/(q+8), q, q > x)
                                         ~  int((4/t)*(27c_4/2)*1/(ln(t))^4, t, x, +infinity)
                                         ~  54c_4*int(1/(t*(ln(t))^4)), t, x, +infinity)
                                         ~  54c_4*int(1/u^4, u, ln(x), +infinity)
                                         ~  18c_4/(ln(x))^4 ,

where we have employed the density (27/2)c_4*1/(ln(t))^4 of quadruplets implied by the Hardy-Littlewood approximation, approximated the sum of the reciprocals of a particular quadruplet by 4/t, and used the substitution u = ln(t). This produces a "first order" extrapolation of S_4(x) to B_4, which we denote by F_4(x),

(2.11)        B_4 ~ F_4(x) = S_4(x) + 18c_4/(ln(x))^3

No effective second order extrapolation is known; however, we will present evidence that the error term in (2.11) is O(1/(sqrt(x)*(ln(x))^2), so that

(2.12)        B_4 = F_4(x) + O(1/(sqrt(x)*(ln(x))^2))
                          = S_4(x) + 18c_4/(ln(x))^3 + O(1/(sqrt(x)*(ln(x))^2)) ,

implying that F_4(x) converges to B_4 at a rate O(sqrt(x)/ln x) faster than the partial sums S_4(x).

3. Computational Results

The present study results from the continuation of a project initiated in 1993, an exercise in the use of distributed computing to leverage limited resources (personal computers of moderate cost and power) in order to attack problems in computational number theory which normally would be reserved to supercomputers. Detailed descriptions of the general program, the computational methods employed, and the incidental discovery of the Pentium FDIV flaw may be found in [7], [8], [9], and [10].

Table 1 contains a brief summary of the computational results, including the counts pi_4(x) of the prime quadruplets (q, q+2, q+6, q+8); the values of the discrepancy between pi_4(x) and the Hardy-Littlewood approximation,

(3.1)        delta_4(x) = L_4(x) - pi_4(x) ;

the partial sums S_4(x) of the reciprocals of the quadruplets; and the first order extrapolations F_4(x) of S_4(x) to the limit, according to (2.11), members of a sequence believed to be converging to Brun's B_4 constant.


                               Table 1.
  Counts of prime quadruplets and estimates of Brun's B_4 constant.
======================================================================
      x      pi_4(x) delta_4(x)        S_4(x)             F_4(x)
======================================================================
      10           1      10.29  0.510689310689311  0.964070321217938
     100           2      11.60  0.789976586880612  0.846649213196690
    1000           5      11.49  0.853473194253130  0.870265083531968
   10000          12      12.17  0.863733192400183  0.870817270689693
   10^05          38      14.88  0.867011003684134  0.870638051768363
   10^06         166      17.68  0.868379532753497  0.870478518913352
   10^07         899     -36.05  0.869267876960829  0.870589687487152
   10^08        4768     -33.36  0.869705293632323  0.870590803418512
   10^09       28388       8.84  0.869966856425087  0.870588778250229
   10^10      180529     545.93  0.870134891176928  0.870588272187457
   10^11     1209318     638.22  0.870247695545365  0.870588327409023
   10^12     8398278   -3699.97  0.870326020813441  0.870588394083423
   10^13    60070590    4848.36  0.870382016088034  0.870588379770569
   10^14   441296836   -6103.68  0.870423153466140  0.870588379781931
 2.0e+14   807947960    2717.36  0.870433368925933  0.870588379517423
 3.0e+14  1151928827  -12660.14  0.870438957019776  0.870588379757893
 4.0e+14  1482125418  -15032.60  0.870442759816539  0.870588379787802
 5.0e+14  1802539207  -23557.26  0.870445621161320  0.870588379871401
 6.0e+14  2115416076  -35177.17  0.870447903679533  0.870588379961704
 7.0e+14  2422194981  -49882.89  0.870449795732922  0.870588380059497
 8.0e+14  2723839871  -35301.69  0.870451407176393  0.870588379983029
 9.0e+14  3021126140  -38141.52  0.870452807976233  0.870588379996686
 1.0e+15  3314576487  -26197.22  0.870454044834374  0.870588379948604
 1.1e+15  3604646822  -19485.07  0.870455150797010  0.870588379922608
 1.2e+15  3891706125  -36034.00  0.870456149967533  0.870588379981021
 1.3e+15  4175985018  -18182.67  0.870457060200978  0.870588379923299
 1.4e+15  4457782901  -24552.75  0.870457895584737  0.870588379943262
 1.5e+15  4737286827  -38254.45  0.870458666969910  0.870588379980055
 1.6e+15  5014641832  -29496.94  0.870459383002448  0.870588379958289
======================================================================

The count pi_4(10^8) in Table 1 agrees with the value attributed by Riesel [11, p. 62] to Gruenberger and Armerding [3]. Two values omitted from Table 1, pi_4(2*10^9) = 49262 and pi_4(2*10^10) = 319107, agree with those computed by Hein [6]. Additional checks on the integrity of the results included comparisons of the associated values of pi(x) with published values, and duplications of each calculated interval on two different computer systems. The computed values of pi(x) are not shown here, but are available at http://www.trnicely.net/pi/tabpi.html. Also, more extensive versions of Table 1 are available at http://www.trnicely.net/quads/tabpi4.html.

No previous investigator is known to have considered the sums of the reciprocals, or Brun's B_4 constant.

In support of the accuracy and probable validity of the Hardy-Littlewood conjecture (2.3) for the quadruplets, note that delta_4(x) has about half as many digits as L_4(x). This also supports the deduction

(3.2)        delta_4(x) = O(sqrt(L_4(x)) = O(sqrt(x)/(ln(x))^2) .

Furthermore, this would correspond to an error term in the first order extrapolation (2.11) of order

(3.3)        B_4 - F_4(x) = O((4/x)*sqrt(x)/(ln(x))^2) = O(1/(sqrt(x)*(ln(x))^2)) ,

supporting the error term proposed in (2.12). The consequent consistency of the extrapolated estimates F_4(x) for Brun's B_4 constant provides additional evidence in favor of the Hardy-Littlewood approximation.

4. Brun's B_4 Constant and the Error Analysis

The first order extrapolation F_4(1.6*10^15) is believed to yield an estimate of Brun's B_4 constant accurate to ten significant decimal digits,

(4.1)        B_4 = 0.87058 83800 +/- 0.00000 00005 .

The error estimate is believed to represent a 99 % confidence level, although this is a subjective opinion of the author for which definitive proof remains lacking. More precisely, the author believes that for at least 99 % of the integers in any sufficiently large set of distinct integers x > 1, Brun's B_4 constant would lie between F_4(x) - E_4(x) and F_4(x) + E_4(x). Here E_4(x) is the error bound function stated in (4.6) below, from which the error bound in (4.1) will be obtained. The error estimate is arrived at as follows.

Rather than attempting to bound the error F_4(x) - B_4 directly, we consider the "scaled deviations"

(4.2)        P_4(x) = sqrt(x)*((ln(x))^2)*[F_4(x) - B_4] ,

where the scaling factor sqrt(x)*(ln(x))^2 arises from the conjectured error term in (2.12). No rigorous derivation is known for this error term. One justification for it was given in the derivation of (3.3), based on the observed magnitude of delta_4(x) in the computed data. A second justification arises from further analysis of the data, which reveals that the resulting mean absolute value of P_4(x) (using our best estimate F_4(1.6*10^15) in place of B_4) remains of the same order of magnitude O(1) over most intervals of 10^12 from 0 through 1.6*10^15, as would be expected with a valid scaling factor (deviations at all values of x are thus given similar weight).

We now hypothesize that the deviations F_4(x) - B_4 (and consequently P_4(x) as well) change sign infinitely often; more precisely, given any x_0, however large, there will exist integers x_1 > x_0 and x_2 > x_0 such that F_4(x_1) < B_4 and F_4(x_2) > B_4. We will refer to this as hypothesis [H4]. Note that although [H4] is neither necessary nor sufficient for the Hardy-Littlewood approximation (2.3), it will almost certainly fail (along with (4.1) in its entirety) if Hardy-Littlewood is false. In support of [H4], we note that if our best estimate F_4(1.6*10^15) is used for B_4, then F_4(x) - B_4 is observed to undergo 504 sign changes over the 160081 data points recorded in the present study, with 315 sign changes occurring beyond 10^15.

Given [H4], we then look for the maximum computable value of the function

(4.3)        N_4(x_1, x_2) = sqrt(x_1)*((ln(x_1))^2)*|F_4(x_1) - F_4(x_2)| ,

where x_1 and x_2 are integers such that x_2 > x_1 > 1. N_4 may be considered as a "normalized" measure of the amplitude of the oscillations in F_4(x), or alternately as a measure of the scaled deviation of a "predictor" value F_4(x_1) from a "terminal approximation" F_4(x_2). In contrast to P_4, N_4 has the virtue that it is independent of the uncertain value of B_4. If N_4 has a global maximum, or even an upper bound, then given [H4] this must also represent an upper bound on the absolute value of P_4, thus producing an unconditional error bound; for any specific value |P_4(x_3)| will be exceeded by N_4(x_3, x_4), where x_4 is chosen so that x_4 > x_3 while F_4(x_3) - B_4 and F_4(x_4) - B_4 are of opposite sign ([H4] implies that there is an infinite sequence of such integers x_4 for any given integer x_3). In practice, we are unable to prove that N_4 has a global maximum, although (2.12) would imply that P_4 is bounded. Indeed, it is computationally impractical even to find the absolute maximum of N_4 over all integers in the range under investigation, 1 < x_1 < x_2 <= 1.6*10^15; this would involve the comparison of more than 10^30 data pairs (F_4(x_1), F_4(x_2)), a calculation of doubtful feasibility. What has been determined is that the absolute maximum of N_4 over all of the (more than 10^10 such) data pairs recorded in this study is

(4.4)        N_4(1000000, 20000000) = 21.704105 .

However, additional computations reveal an even larger value of N_4 in the vicinity of this point,

(4.5)        N_4(1002340, 16977221) = 22.6687145 ,

where the values in both (4.4) and (4.5) have been rounded up. Although the value 22.6687145 is still not established as the sought after upper bound on N_4 (and thus on |P_4|), the numerical evidence (zero exceptions among more than 10^10 data pairs extending over fifteen orders of magnitude) indicates that if any larger values of N_4 (and thus |P_4|) exist, they must be relatively rare. Our intuitive conclusion is that for the great majority (99 percent or more?) of integers x > 1, it will be true that |P_4(x)| < 22.6687145, producing the error bound formula E_4(x),

(4.6)        |F_4(x) - B_4| < E_4(x) = 22.6687145/(sqrt(x)*(ln(x))^2) .

In arriving at our error estimate in (4.1), we are assuming that (4.6) does in fact hold specifically for x = 1.6*10^15, so that

(4.7)    |F_4(1.6*10^15) - B_4| < E_4(1.6*10^15) = 22.6687145/(sqrt(1.6*10^15)*(ln(1.6*10^15))^2) ,

giving

(4.8)        |F_4(1.6*10^15) - B_4| < 0.00000 00004 624 ,

which, with rounding up for good measure, produces the error estimate given in (4.1).

The validity of the Hardy-Littlewood conjecture (2.3) is critical to our estimate (4.1) of Brun's B_4 constant and the associated error bound. On the other hand, the particular scaling factor sqrt(x)*(ln(x))^2, taken from the conjectured error term in (2.12), is not indispensable to this line of error analysis. Alternative scaling factors could be employed, presumably corresponding to different error terms, and the author in fact investigated several alternatives. No other yielded superior results; either the resulting error bound was inferior, or a "near-maximum" value of the corresponding function N_4 (meaning one sufficiently large to provide an error bound of equal or better confidence level) could not be located without greatly escalating the computational effort. One notable virtue of this scaling factor was that the resulting maximum computed value of N_4 occurred at relatively small values of x_1 and x_2, permitting a detailed search for the extreme value, and yielding an increased confidence that larger values would prove difficult to unearth, or perhaps even be nonexistent. Nonethless, some other alternative scaling factor might yet produce a superior error bound.

On the other hand, more than ten thousand million pairs of first order extrapolations have agreed with each other within this error bound, without exception. Of course, it carries no unconditional guarantee; overwhelming numerical evidence is occasionally a subtle trap, as the conjecture Li(x)   >   pi(x) so memorably illustrated. However, it would appear that a more rigorous error bound must await a deeper understanding of the subtleties in the distribution of the prime quadruplets in particular, and the prime constellations in general.


References

  1. Viggo Brun, "La série 1/5 + 1/7 + 1/11 + 1/13 + 1/17 + 1/19 + 1/29 + 1/31 + 1/41 + 1/43 + 1/59 + 1/61 + ..., où les dénominateurs sont 'nombres premieres jumeaux' est convergente ou finie," (French) Bulletin des sciences mathématiques 43 (1919), 100-104, 124-128.
  2. Tony Forbes, unpublished work, accessed (5 August 1999) at http://www.ltkz.demon.co.uk/ktuplets.htm.
  3. F. J. Gruenberger and G. Armerding, "Statistics on the first six million primes," UMT 73, Math. Comp. 19 (1965), 503-505.
  4. G. H. Hardy and J. E. Littlewood, "Some problems of 'Partitio Numerorum' III: On the expression of a number as a sum of primes," Acta Math. 44 (1922), 1-70. See also "Collected papers of G. H. Hardy," Clarendon Press, Oxford, 1966, Vol. I, 561-630.
  5. Robert Joseph Harley, unpublished work, accessed (16 September 1996) at http://pauillac.inria.fr/~harley/bits.html.
  6. Robert Hein <rhein(AT)erols(DOT)com>, unpublished work (3 March 1995), received by telefax from U. S. Army CECOM RDEC, Intelligence and Electronic Warfare Directorate, Data Fusion Branch, ECM Lab, Vint Hill Farms Station, Warrenton VA 22186-5100 USA.
  7. Thomas R. Nicely, "Enumeration to 10^14 of the twin primes and Brun's constant," Virginia Journal of Science 46:3 (Fall, 1995), 195-204. MR1401560 (97e:11014). Reprint available electronically at http://www.trnicely.net/twins/twins.html.
  8. Thomas R. Nicely, "New maximal prime gaps and first occurrences," Math. Comp. 68:227 (July, 1999), 1311-1315. MR1627813 (99i:11004). Reprint available electronically at http://www.trnicely.net/gaps/gaps.html.
  9. Thomas R. Nicely, "A new error analysis for Brun's constant," Virginia Journal of Science 52:1 (Spring, 2001), 45-55. MR1853722 (2003d:11184). Reprint available electronically at http://www.trnicely.net/twins/twins4.html.
  10. Thomas R. Nicely and Bertil Nyman, "First occurrence of a prime gap of 1000 or greater," unpublished (21 May 1999), available electronically at http://www.trnicely.net/gaps/gaps2.html.
  11. Hans Riesel, "Prime numbers and computer methods for factorization," 2nd ed., Birkhäuser, Boston, 1994. MR 95h:11142.

Addendum

NOTE: This addendum was not part of the original submission for publication.