Enumeration to 1.6*10^15 of the prime quadruplets
Thomas R. Nicely
Copyright © 2007 Thomas R. Nicely. All rights reserved.
This document may be freely reproduced and distributed for
educational and non-profit purposes.
Date of first release 23 August 1999.
Last revised 0500 EDT 28 March 2007.
The content of this document (other than the addendum, which was
not part of the submission for publication) is essentially that of
the original release, except that information rendered obsolete by
subsequent events has been removed or modified, in both the main
document and the addendum (this liberty is taken in view of the
fact that the paper was never accepted for publication). There may
also be differences in formatting, and in minor details and
corrections.
Abstract
Enumeration of the prime quadruplets (q, q+2, q+6, q+8), and the
sum of their reciprocals, is extended to 1.6*10^15. An estimate
is obtained for the limit of the sum of the reciprocals, an analog
to Brun's constant, B_4 = 0.87058 83800 +/- 0.00000 00005. Error
analysis is presented to support the opinion that the stated error
bound represents a 99 % confidence level.
Mathematics Subject Classification 2000 (MSC2000)
Primary: 11A41.
Secondary: 11-04, 11Y70, 11Y60.
Key Words and Phrases
Prime numbers, prime quadruplets, prime constellations, Brun's
constant, prime k-tuples conjecture.
1. Introduction
Certain groupings (k-tuples, or constellations) of two
or more prime numbers, such as the twin-prime pairs
K_2 = {(3, 5), (5, 7), (11, 13), (17, 19), ...},
are believed to constitute infinite sets. Other such groupings
may occur only a finite number of times, and perhaps not at all;
e.g., the prime triplets (q, q+2, q+4) produce only
the one instance (3, 5, 7), while there are no prime
triplets at all of the form (q, q+1, q+2).
Constellations which, due to divisibility considerations and the
resulting residue classes occupied, can have only a finite number
of instances (possibly none), are termed inadmissible; the
remaining constellations are termed admissible. The
question of infinitude remains undecided for all the admissible
constellations; in the case of the twins, the proposition of
infinitude is referred to as the "twin-primes conjecture."
Hardy and Littlewood [4] conjectured a much stronger and more
quantitative description of the distribution of the admissible
constellations, the "prime k-tuples conjecture," which
imples not only the infinitude of every admissible constellation,
but yields also quite accurate asymptotic formulas for the number
of members of a particular admissible constellation which lie below
a specified upper bound x.
2. The Prime Quadruplets
Most of the study of the prime constellations has been focused
on the twin primes; see [7] and [9] for some recent results. This paper
considers instead the constellation of prime quadruplets of the
form (q, q+2, q+6, q+8), the set
(2.1)
K_4 = {(5, 7, 11, 13), (11, 13, 17, 19), (101, 103, 107, 109), ...} ,
in which the first element of each member (except the anomalous
initial one) is of the form 30n + 11. As with other
admissible constellations, it is not known if K_4 is
infinite. It is known that there exist extremely large numbers
and instances of such prime quadruplets; more than a thousand
million of them have been counted beyond 10^15 in this
study, and Forbes [2] reports an instance in which the initial
prime is a number of 1032 decimal digits,
(2.2)
q = 24947432928741915235*(2^3363 - 2^1121) - 6*(2^1121) - 7 .
Furthermore, the Hardy-Littlewood prime k-tuples conjecture implies that
(2.3)
pi_4(x) ~ L_4(x) = (27/2)*c_4*int(1/(ln(t))^4, t, 2, x) ,
where pi_4(x) represents the count of prime quadruplets
(q, q+2, q+6, q+8) such that q <= x, and c_4 is the constant
(2.4)
c_4 = prod((p^3)*(p - 4)/(p - 1)^4, p, 5, +infinity) ,
with the infinite product taken over all the primes save 2 and 3.
We shall refer to (2.3) as the Hardy-Littlewood approximation for the
prime quadruplets, or simply the Hardy-Littlewood approximation;
it clearly implies the infinitude of the quadruplets, and a great
deal more. The validity of this conjecture will be central to
our methods for estimating B_4 and its error bound.
See Riesel [11, pp. 60-83] for detailed discussions and derivations
of (2.3), (2.4), and similar consequences of the prime k-tuples
conjecture. The constant c_4 has been computed to 45 significant
digits by Harley [5],
(2.5)
c_4 = 0.30749 48787 58327 09312 33544 86071 07685 30221 78519... .
For the twin primes, Brun [1] proved that whether or not the set
K_2 is infinite, the sum of the reciprocals
(2.6)
B_2 = (1/3 + 1/5) + (1/5 + 1/7) + (1/11 + 1/13) + (1/17 + 1/19) + ... ,
is convergent, in contrast to the known divergence of the sum of the
reciprocals of all the primes. The limit of this sum is now referred
to as Brun's constant,and has been approximated by Nicely [9]
as
(2.7)
B_2 = 1.90216 05823 +/- 0.00000 00008 ,
where the error bound is believed to define a 95 % confidence interval.
In the same fashion, we define Brun's constant B_4 for
the prime quadruplets as the limit of the sum of the reciprocals,
(2.8)
B_4 = (1/5 + 1/7 + 1/11 + 1/13) + (1/11 + 1/13 + 1/17 + 1/19)
+ (1/101 + 1/103 + 1/107 + 1/109) + ...
As with B_2, we do not if the series is finite or infinite;
but as a consequence of Brun's proof of the convergence of the sum of
the reciprocals of the twins, we do know that the series for
B_4 is convergent; for if the series is infinite, its terms
are a proper subset of the series in (2.6). Since (2.6) is a positive
series, its convergence is immune to deletion, rearrangement, or
regrouping of terms; thus the series (2.8) defining B_4
must also be convergent. In essence, the quadruplets are constructed
from a proper subset of the twins.
Although (2.8) is convergent, the monotonically
increasing partial sums approach the limit quite slowly. However,
assuming the validity of the Hardy-Littlewood approximation (2.3), a
more rapidly converging first order extrapolation may be derived as
follows. Define S_4(x) as the partial sum of the
reciprocals of the quadruplets,
(2.9)
S_4(x) = sum(1/q + 1/(q+2) + 1/(q+6) + 1/(q+8), q, q <= x) .
Then the remainder term in the series defining B_4 may be approximated by
(2.10)
B_4 - S_4(x) =
sum(1/q + 1/(q+2) + 1/(q+6) + 1/(q+8), q, q > x)
~ int((4/t)*(27c_4/2)*1/(ln(t))^4, t, x, +infinity)
~ 54c_4*int(1/(t*(ln(t))^4)), t, x, +infinity)
~ 54c_4*int(1/u^4, u, ln(x), +infinity)
~ 18c_4/(ln(x))^4 ,
where we have employed the density (27/2)c_4*1/(ln(t))^4 of
quadruplets implied by the Hardy-Littlewood approximation,
approximated the sum of the reciprocals of a particular quadruplet
by 4/t, and used the substitution u = ln(t). This produces a
"first order" extrapolation of S_4(x) to B_4, which we
denote by F_4(x),
(2.11)
B_4 ~ F_4(x) = S_4(x) + 18c_4/(ln(x))^3
No effective second order extrapolation is known; however, we will
present evidence that the error term in (2.11) is
O(1/(sqrt(x)*(ln(x))^2), so that
(2.12)
B_4 = F_4(x) + O(1/(sqrt(x)*(ln(x))^2))
= S_4(x) + 18c_4/(ln(x))^3 + O(1/(sqrt(x)*(ln(x))^2)) ,
implying that F_4(x) converges to B_4 at a rate O(sqrt(x)/ln x) faster
than the partial sums S_4(x).
3. Computational Results
The present study results from the continuation of a project initiated
in 1993, an exercise in the use of distributed computing to leverage
limited resources (personal computers of moderate cost and power) in
order to attack problems in computational number theory which normally
would be reserved to supercomputers. Detailed descriptions of the
general program, the computational methods employed, and the incidental
discovery of the Pentium FDIV flaw may be found in [7], [8], [9],
and [10].
Table 1 contains a brief summary of the computational results,
including the counts pi_4(x) of the prime quadruplets
(q, q+2, q+6, q+8); the values of the discrepancy between pi_4(x)
and the Hardy-Littlewood approximation,
(3.1) delta_4(x) = L_4(x) - pi_4(x) ;
the partial sums S_4(x) of the reciprocals of the quadruplets; and
the first order extrapolations F_4(x) of S_4(x) to the limit,
according to (2.11), members of a sequence believed to be converging
to Brun's B_4 constant.
Table 1.
Counts of prime quadruplets and estimates of Brun's B_4 constant.
======================================================================
x pi_4(x) delta_4(x) S_4(x) F_4(x)
======================================================================
10 1 10.29 0.510689310689311 0.964070321217938
100 2 11.60 0.789976586880612 0.846649213196690
1000 5 11.49 0.853473194253130 0.870265083531968
10000 12 12.17 0.863733192400183 0.870817270689693
10^05 38 14.88 0.867011003684134 0.870638051768363
10^06 166 17.68 0.868379532753497 0.870478518913352
10^07 899 -36.05 0.869267876960829 0.870589687487152
10^08 4768 -33.36 0.869705293632323 0.870590803418512
10^09 28388 8.84 0.869966856425087 0.870588778250229
10^10 180529 545.93 0.870134891176928 0.870588272187457
10^11 1209318 638.22 0.870247695545365 0.870588327409023
10^12 8398278 -3699.97 0.870326020813441 0.870588394083423
10^13 60070590 4848.36 0.870382016088034 0.870588379770569
10^14 441296836 -6103.68 0.870423153466140 0.870588379781931
2.0e+14 807947960 2717.36 0.870433368925933 0.870588379517423
3.0e+14 1151928827 -12660.14 0.870438957019776 0.870588379757893
4.0e+14 1482125418 -15032.60 0.870442759816539 0.870588379787802
5.0e+14 1802539207 -23557.26 0.870445621161320 0.870588379871401
6.0e+14 2115416076 -35177.17 0.870447903679533 0.870588379961704
7.0e+14 2422194981 -49882.89 0.870449795732922 0.870588380059497
8.0e+14 2723839871 -35301.69 0.870451407176393 0.870588379983029
9.0e+14 3021126140 -38141.52 0.870452807976233 0.870588379996686
1.0e+15 3314576487 -26197.22 0.870454044834374 0.870588379948604
1.1e+15 3604646822 -19485.07 0.870455150797010 0.870588379922608
1.2e+15 3891706125 -36034.00 0.870456149967533 0.870588379981021
1.3e+15 4175985018 -18182.67 0.870457060200978 0.870588379923299
1.4e+15 4457782901 -24552.75 0.870457895584737 0.870588379943262
1.5e+15 4737286827 -38254.45 0.870458666969910 0.870588379980055
1.6e+15 5014641832 -29496.94 0.870459383002448 0.870588379958289
======================================================================
The count pi_4(10^8) in Table 1 agrees with the value attributed by
Riesel [11, p. 62] to Gruenberger and Armerding [3]. Two values
omitted from Table 1, pi_4(2*10^9) = 49262 and pi_4(2*10^10) = 319107,
agree with those computed by Hein [6]. Additional checks on the
integrity of the results included comparisons of the associated values
of pi(x) with published values, and duplications of each calculated
interval on two different computer systems. The computed values of
pi(x) are not shown here, but are available at
http://www.trnicely.net/pi/tabpi.html.
Also, more extensive versions of Table 1 are available at
http://www.trnicely.net/quads/tabpi4.html.
No previous investigator is known to have considered the sums of
the reciprocals, or Brun's B_4 constant.
In support of the accuracy and probable validity of the
Hardy-Littlewood conjecture (2.3) for the quadruplets, note that
delta_4(x) has about half as many digits as L_4(x). This also
supports the deduction
(3.2)
delta_4(x) = O(sqrt(L_4(x)) = O(sqrt(x)/(ln(x))^2) .
Furthermore, this would correspond to an error term in the
first order extrapolation (2.11) of order
(3.3)
B_4 - F_4(x) = O((4/x)*sqrt(x)/(ln(x))^2)
= O(1/(sqrt(x)*(ln(x))^2)) ,
supporting the error term proposed in (2.12). The consequent
consistency of the extrapolated estimates F_4(x) for
Brun's B_4 constant provides additional evidence in
favor of the Hardy-Littlewood approximation.
4. Brun's B_4 Constant and the Error Analysis
The first order extrapolation F_4(1.6*10^15) is believed to yield
an estimate of Brun's B_4 constant accurate to ten significant
decimal digits,
(4.1)
B_4 = 0.87058 83800 +/- 0.00000 00005 .
The error estimate is believed to represent a 99 % confidence
level, although this is a subjective opinion of the author for
which definitive proof remains lacking. More precisely, the
author believes that for at least 99 % of the integers in any
sufficiently large set of distinct integers x > 1, Brun's
B_4 constant would lie between F_4(x) - E_4(x) and F_4(x) + E_4(x).
Here E_4(x) is the error bound function stated in (4.6) below,
from which the error bound in (4.1) will be obtained. The error
estimate is arrived at as follows.
Rather than attempting to bound the error F_4(x) - B_4 directly,
we consider the "scaled deviations"
(4.2)
P_4(x) = sqrt(x)*((ln(x))^2)*[F_4(x) - B_4] ,
where the scaling factor sqrt(x)*(ln(x))^2 arises from the
conjectured error term in (2.12). No rigorous derivation
is known for this error term. One justification for it was given
in the derivation of (3.3), based on the observed magnitude of
delta_4(x) in the computed data. A second justification
arises from further analysis of the data, which reveals that the
resulting mean absolute value of P_4(x) (using our best estimate
F_4(1.6*10^15) in place of B_4) remains of the same order of
magnitude O(1) over most intervals of 10^12 from 0 through
1.6*10^15, as would be expected with a valid scaling factor
(deviations at all values of x are thus given similar weight).
We now hypothesize that the deviations F_4(x) - B_4 (and consequently
P_4(x) as well) change sign infinitely often; more precisely, given any
x_0, however large, there will exist integers x_1 > x_0 and
x_2 > x_0 such that F_4(x_1) < B_4 and F_4(x_2) > B_4.
We will refer to this as hypothesis [H4]. Note that although
[H4] is neither necessary nor sufficient for the Hardy-Littlewood
approximation (2.3), it will almost certainly fail (along with
(4.1) in its entirety) if Hardy-Littlewood is false. In support
of [H4], we note that if our best estimate F_4(1.6*10^15) is used for
B_4, then F_4(x) - B_4 is observed to undergo 504 sign changes over
the 160081 data points recorded in the present study, with 315 sign
changes occurring beyond 10^15.
Given [H4], we then look for the maximum computable value of the function
(4.3)
N_4(x_1, x_2) = sqrt(x_1)*((ln(x_1))^2)*|F_4(x_1) - F_4(x_2)| ,
where x_1 and x_2 are integers such that x_2 > x_1 > 1.
N_4 may be considered as a "normalized" measure of the amplitude
of the oscillations in F_4(x), or alternately as a measure of the
scaled deviation of a "predictor" value F_4(x_1) from a "terminal
approximation" F_4(x_2). In contrast to P_4, N_4 has the virtue
that it is independent of the uncertain value of B_4. If N_4 has
a global maximum, or even an upper bound, then given [H4] this must
also represent an upper bound on the absolute value of P_4, thus
producing an unconditional error bound; for any specific value
|P_4(x_3)| will be exceeded by N_4(x_3, x_4), where x_4 is chosen
so that x_4 > x_3 while F_4(x_3) - B_4 and F_4(x_4) - B_4 are of
opposite sign ([H4] implies that there is an infinite sequence of
such integers x_4 for any given integer x_3). In practice, we are
unable to prove that N_4 has a global maximum, although (2.12) would
imply that P_4 is bounded. Indeed, it is computationally impractical
even to find the absolute maximum of N_4 over all integers in the range
under investigation, 1 < x_1 < x_2 <= 1.6*10^15; this would
involve the comparison of more than 10^30 data pairs
(F_4(x_1), F_4(x_2)), a calculation of doubtful feasibility. What has
been determined is that the absolute maximum of N_4 over all of the
(more than 10^10 such) data pairs recorded in this study is
(4.4)
N_4(1000000, 20000000) = 21.704105 .
However, additional computations reveal an even larger value of
N_4 in the vicinity of this point,
(4.5)
N_4(1002340, 16977221) = 22.6687145 ,
where the values in both (4.4) and (4.5) have been rounded up.
Although the value 22.6687145 is still not established as the
sought after upper bound on N_4 (and thus on |P_4|), the numerical
evidence (zero exceptions among more than 10^10 data pairs extending
over fifteen orders of magnitude) indicates that if any larger values
of N_4 (and thus |P_4|) exist, they must be relatively rare. Our
intuitive conclusion is that for the great majority (99 percent or
more?) of integers x > 1, it will be true that
|P_4(x)| < 22.6687145, producing the error bound formula E_4(x),
(4.6)
|F_4(x) - B_4| < E_4(x) = 22.6687145/(sqrt(x)*(ln(x))^2) .
In arriving at our error estimate in (4.1), we are assuming that
(4.6) does in fact hold specifically for x = 1.6*10^15, so that
(4.7)
|F_4(1.6*10^15) - B_4| < E_4(1.6*10^15)
= 22.6687145/(sqrt(1.6*10^15)*(ln(1.6*10^15))^2) ,
giving
(4.8)
|F_4(1.6*10^15) - B_4| < 0.00000 00004 624 ,
which, with rounding up for good measure, produces the error
estimate given in (4.1).
The validity of the Hardy-Littlewood conjecture (2.3) is critical
to our estimate (4.1) of Brun's B_4 constant and the
associated error bound. On the other hand, the particular scaling
factor sqrt(x)*(ln(x))^2, taken from the conjectured
error term in (2.12), is not indispensable to this line of error
analysis. Alternative scaling factors could be employed, presumably
corresponding to different error terms, and the author in fact
investigated several alternatives. No other yielded superior
results; either the resulting error bound was inferior, or a
"near-maximum" value of the corresponding function
N_4 (meaning one sufficiently large to provide an error
bound of equal or better confidence level) could not be located
without greatly escalating the computational effort. One notable
virtue of this scaling factor was that the resulting maximum
computed value of N_4 occurred at relatively small values
of x_1 and x_2, permitting a detailed search
for the extreme value, and yielding an increased confidence that
larger values would prove difficult to unearth, or perhaps even
be nonexistent. Nonethless, some other alternative scaling factor
might yet produce a superior error bound.
On the other hand, more than ten thousand million pairs of first
order extrapolations have agreed with each other within this error
bound, without exception. Of course, it carries no unconditional
guarantee; overwhelming numerical evidence is occasionally a subtle
trap, as the conjecture Li(x) > pi(x) so memorably
illustrated. However, it would appear that a more rigorous error
bound must await a deeper understanding of the subtleties in the
distribution of the prime quadruplets in particular, and the prime
constellations in general.
References
- Viggo Brun, "La série 1/5 + 1/7 + 1/11 + 1/13 +
1/17 + 1/19 + 1/29 + 1/31 + 1/41 + 1/43 + 1/59 + 1/61 + ...,
où les dénominateurs sont 'nombres premieres
jumeaux' est convergente ou finie," (French) Bulletin
des sciences mathématiques 43 (1919), 100-104, 124-128.
- Tony Forbes, unpublished work, accessed (5 August 1999) at
http://www.ltkz.demon.co.uk/ktuplets.htm.
- F. J. Gruenberger and G. Armerding, "Statistics on
the first six million primes," UMT 73, Math. Comp. 19 (1965),
503-505.
- G. H. Hardy and J. E. Littlewood, "Some problems of 'Partitio
Numerorum' III: On the expression of a number as a sum of
primes," Acta Math. 44 (1922), 1-70. See also
"Collected papers of G. H. Hardy," Clarendon Press, Oxford,
1966, Vol. I, 561-630.
- Robert Joseph Harley, unpublished work, accessed (16 September 1996)
at
http://pauillac.inria.fr/~harley/bits.html.
- Robert Hein <rhein(AT)erols(DOT)com>, unpublished work
(3 March 1995), received by telefax from U. S. Army CECOM RDEC,
Intelligence and Electronic Warfare Directorate, Data Fusion Branch,
ECM Lab, Vint Hill Farms Station, Warrenton VA 22186-5100 USA.
- Thomas R. Nicely, "Enumeration to 10^14 of the twin primes
and Brun's constant," Virginia Journal of Science 46:3 (Fall,
1995), 195-204. MR1401560 (97e:11014). Reprint available
electronically at
http://www.trnicely.net/twins/twins.html.
- Thomas R. Nicely, "New maximal prime gaps and first
occurrences," Math. Comp. 68:227 (July, 1999), 1311-1315.
MR1627813 (99i:11004). Reprint available electronically at
http://www.trnicely.net/gaps/gaps.html.
- Thomas R. Nicely, "A new error analysis for Brun's
constant," Virginia Journal of Science 52:1 (Spring, 2001),
45-55. MR1853722 (2003d:11184). Reprint available electronically at
http://www.trnicely.net/twins/twins4.html.
- Thomas R. Nicely and Bertil Nyman, "First occurrence of a prime
gap of 1000 or greater," unpublished (21 May 1999), available
electronically at
http://www.trnicely.net/gaps/gaps2.html.
- Hans Riesel, "Prime numbers and computer methods for
factorization," 2nd ed., Birkhäuser, Boston, 1994.
MR 95h:11142.
Addendum
NOTE: This addendum was not part of the original submission for
publication.
- The most recent count of the prime quadruplets, and the corresponding
values for S_4, and for B_4 and its error bound, are available
on the page Prime constellations
research project.
- More recent work indicates that the error bound formula
(4.6) obtained in this paper is in fact definitive, although no
rigorous proof of this assertion has yet been discovered; additional
details will be provided in a forthcoming paper, "Enumeration of
some prime constellations" (in preparation).
- A more extensive table of the quadruplets is available at
http://www.trnicely.net/quads/tabpi4.html.
- Extensive tables of the values of pi(x), determined by direct
count as part of this project, are being maintained at
http://www.trnicely.net/pi/tabpi.html,
and at
http://www.trnicely.net/pi/tabpix.html.
- Efforts to use weighted and/or unweighted data averaging or
smoothing, or linear regression techniques, in an attempt
to obtain a more accurate value of Brun's B_4
constant, have not met with success. The results thus obtained,
at values of x < 1.6*10^15, were in some cases slightly better,
but also in some cases slightly worse, than the values of F_4(x).
"Better" or "worse" here refers to whether or
not the resulting estimate was closer to the presumably superior
estimates of B_4 obtained from more extensive calculations, i.e.,
F_4(x) for a much larger value of x. However, I have not mounted a
full-scale effort using weighted or non-linear regression; I invite
third parties to carry out such an effort, using the
extensive data tables I have provided.
It may well be possible to produce a more rapidly converging sequence
of estimates by some such method; this would amount to a successful
second order extrapolation. I recommend as critical test values
for this purpose the estimates for Brun's B_4 constant at x = 10^13
and x = 10^14; can your proposed extrapolation produce estimates
substantially superior to F_4(10^13) and F_4(10^14)?