Selected configuration interaction with truncation energy error and application to the Ne atom

Selected configuration interaction SCI for atomic and molecular electronic structure calculations is reformulated in a general framework encompassing all CI methods. The linked cluster expansion is used as an intermediate device to approximate CI coefficients BK of disconnected configurations those that can be expressed as products of combinations of singly and doubly excited ones in terms of CI coefficients of lower-excited configurations where each K is a linear combination of configuration-state-functions CSFs over all degenerate elements of K. Disconnected configurations up to sextuply excited ones are selected by Brown’s energy formula, EK= E−HKK BK 2 / 1−BK 2 , with BK determined from coefficients of singly and doubly excited configurations. The truncation energy error from disconnected configurations, Edis, is approximated by the sum of EKs of all discarded Ks. The remaining connected configurations are selected by thresholds based on natural orbital concepts. Given a model CI space M, a usual upper bound ES is computed by CI in a selected space S, and EM =ES+ E dis+ E, where E is a residual error which can be calculated by well-defined sensitivity analyses. An SCI calculation on Ne ground state featuring 1077 orbitals is presented. Convergence to within near spectroscopic accuracy 0.5 cm−1 is achieved in a model space M of 1.4 109 CSFs 1.1 1012 determinants containing up to quadruply excited CSFs. Accurate energy contributions of quintuples and sextuples in a model space of 6.5 1012 CSFs are obtained. The impact of SCI on various orbital methods is discussed. Since Edis can readily be calculated for very large basis sets without the need of a CI calculation, it can be used to estimate the orbital basis incompleteness error. A method for precise and efficient evaluation of ES is taken up in a companion paper. © 2006 American Institute of Physics. DOI: 10.1063/1.2207620


I. INTRODUCTION
For atoms and small molecules, Schrödinger's equation can be approximated by a matrix-eigenvalue equation, where H is the representation of H in terms of the Slater determinants or N-electron symmetry-eigenfunctions constructed from a given orbital basis.Equation ͑1͒, which can be applied to the complete range of quantum mechanical problems associated to the given system, defines the full configuration interaction ͑CI͒ method 1 and E FCI is the full CI ͑FCI͒ energy.In terms of FCI quantities, the exact eigenvalues E of Schrödinger's equation may be expressed as where ⌬E OBI is the error due to orbital basis incompleteness.2-5 Henceforth the subscript will be dropped in the understanding that the following also applies to excited states.⌬E OBI shall be further discussed in Sec.VII B.
Full CI, on the other hand, is the central referent of all orbital methods based on Hamiltonians obtained from the first principles: 6,7 highly correlated CI ͑HCCI͒, 8 symmetry-adapted-cluster 9 ͑SAC͒ and SAC-CI, [10][11][12] sizeconsistent CI, 13 coupled cluster ͑CC͒ methods, 14,15 manybody perturbation theory ͑MBPT͒, 16,17 electron propagator theory, 18,19 and, more recently, density matrix variational theory, 20,21 the density matrix renormalization group method, 22 iterative CI ͑Refs.23 and 24͒ and extended CC, 25 and ab initio density functional theory. 26͑Quantum Monte Carlo methods 27 are becoming increasingly competitive but use an entirely different methodology.͒Traditional FCI is an impossible task, except to test ab initio electronic structure methods 28 with ͑necessarily͒ too small orbital bases lacking predictive value.The new FCI methods 22,23,29,30 considerably extend the scope of traditional FCI but will continue to be limited by the size of the orbital bases.This paper addresses HCCI methods in general. 8Let us convene calling HCCI any CI method which, despite a formal lack of size extensivity, [31][32][33][34][35] competes on a par with the better founded coupled cluster methods such as CCSD, 34 CCSD͑T͒, 36,37 CCSDT, 38,39 CCSDT͑Q͒, 40 CCSDTQ, 41,42 or even CCSDTQQn, 43 for a given problem at hand.͑S, D, T, Q, Qn, Sx, etc., refer to singles, doubles, triples, quadruples, quintuples, sextuples, etc.͒ Comprehensive studies on the water molecule 44,45 in which the HOH angle is fixed at 110.6°and the OH distance is varied between R e and 3R e with R e Ϸ 1.843 45 a.u.show that in order to compete with CCSDT, a CISDTQ treatment is adequate CISDTQ Ϸ CCSDT.
The previous assertion applies, in general, to similar situations, atoms and small molecules.The reason for the obstinate endurance of CISDTQ vis-à-vis CCSDT is its variational character, not shared by standard CC methods.This is consistent with recent results 46,47 in variational CC calculations 48 achieving close to FCI quality.Accordingly, whereas nonvariational CC methods for N-electron systems include up to N-excited determinants that are normally absent in HCCI, the latter provides the best expansion coefficients up to the level of excitations actually incorporated, and that is enough-at least energy wise-to compensate for lack of size consistency.The problem with straight CISDTQ, nevertheless, has been the excessive computer resources required for current and even future computational power.
Continuing with the water molecule, a recent application of the ultimate of CC tools, namely, CCSDTQQn, 43 suggests that CISDTQQnSx is a clear match, CISDTQQnSx Ϸ CCSDTQQn.This paper presents a complete and efficient approach to approximate CISDTQQnSx for atoms and small molecules to within reliable and acceptable errors by means of selected CI ͑SCI͒ and its corresponding truncation and residual energy errors relative to the full CISDTQQnSx treatment.͑The residual energy error accounts for inaccuracies in the truncation energy error itself and for other errors requiring sensitivity analyses.͒SCI calls for three methodological requirements: ͑i͒ a priori selection of configurations, 8,49-51 ͑ii͒ a priori estimate of truncation energy errors, and ͑iii͒ a posteriori assessment of all other errors not calculated in ͑ii͒.
The present SCI method differs from its predecesors in two important aspects: ͑i͒ truncation energy errors are quantitatively assessed all along making use of Brown's energy formula, 52 and ͑ii͒ the selection scheme targets configurations rather than configuration-state functions ͑CSFs͒ or determinants, both advances, in combination, leading to orders of magnitude improvements in accuracy and precision.CI notation and the Brown formula are given in Sec.II.In Sec.III, the linked cluster expansion is compared with the determinantal cluster expansion to obtain n-excited determinantal CI coefficients in terms of those of lower-excited determinants, as already known 7,53 but unexploited.The expressions for determinantal coefficients are then generalized to approximate configurational coefficients in a quick and reliable way, therefore opening the way for large-scale a priori applications of Brown's formula.
Selection of configurations involves additional conceptualizations discussed in Sec.IV.Truncation and residual energy errors are taken up in Sec.V, and an application on the Ne atom is presented in Sec.VI.Present achievements, their impact on other ab initio methods and conclusions are given in Sec.VII.Various details are given elsewhere. 54Efficiency requirements in connection with the matrix-eigenvalue problem demand the development of yet another variational method presented in a companion paper. 55

II. CI NOTATION AND BROWN's FORMULA
A general HCCI model wave function can be written as 56 K and g label configurations and degenerate elements, respectively, and C gK denotes a CI coefficient.Triply and higher-excited configurations can be classified into disconnected and connected ones.Disconnected configurations are those that can be expressed as products of combinations of singly and doubly excited ones, whereas connected configurations are all others.F gK is an N-electron symmetry eigenfunction or CSF expressed as a linear combination of n K Slater determinants D iK , where O͑⌫ , ␥͒ is a symmetric projection operator 57 for all pertinent symmetry operators ⌫ and a given ͑N-electron͒ irreducible representation ␥. [58][59][60][61] Let ⌿͑−F gK ͒ denote N͑⌿ − F gK C gK ͒ where N is a normalization factor, viz., let us assume that after deletion of F gK , the new wave function ⌿͑−F gK ͒ has the same remaining expansion coefficients except for renormalization.The energy contribution ⌬E gK of F gK can be approximated by which readily yields Brown's formula, 52 In Eq. ͑6͒, E = ͗⌿͉H͉⌿͘.Approximation ͑6͒ is particularly good for small values of ⌬E gK , viz., for expansion terms F gK eventually to be discarded, like triply and up to sextuply excited configurations.As pointed out in Ref. 62 similar equations of perturbational lineage have been used by other authors.Equation ͑6͒ requires previous knowledge of C gK coefficients which so far could only be obtained after making a calculation. 63Quick prediction of C gK s for each g of a given K is probably hopeless.Fortunately, as shown in Sec.III E, it is possible to predict configurational B K coefficients defined below.
First, Eq. ͑3͒ is rewritten as in terms of normalized symmetry configurations G K ,

͑9͒
Similarly as ⌬E gK in Eq. ͑6͒, ⌬E K for expansion ͑7͒ is given by to be used just for estimating an approximate truncation energy error.The variational calculations are still carried out via Eq.͑3͒ but the selection process targets configurations G K instead of F gK s whereby the need to predict C gK coefficients is eliminated.In the next section, predictive formulas for B K coefficients of triply and up to sextuply excited configurations will be discussed.Returning to Eq. ͑10͒, for highly excited configurations, the term ͑E − H KK ͒ is generally of the order of several Hartree, thus E can initially be approximated by any correlated energy, viz., a singles and doubles CI ͑CISD͒ energy.Also, H KK can be well approximated by ͗D iK ͉H͉D iK ͘, where D iK is any determinant of K.In atomic work, where degeneracies g K may easily reach several thousands, thanks to simplification ͑11͒ Brown's formula can be used before generating very expensive F gK s, allowing to make a decision at this early stage whether to incorporate these explicitly in an ensuing variational treatment or to leave them out in the form of a contribution ⌬E K to the truncation energy error.The final expression for total truncation and residual energy errors is postponed to Sec.V.

A. Determinantal CI and Oktay Singanoğlu
A CI expansion in terms of n-excited determinants and a single reference determinant D 0 can be expressed in cluster form as 64 In a landmark paper, 65  In general, however, energy contributions of triples cannot be neglected since they are about equally important as quadruples. 66Analogously, in going to a higher order of approximation, quintuples and sextuples rather than just sextuples must be incorporated, even for closed-shell systems. 44

B. Exponential ansatz for the wave function
Following the linked cluster theorem, [67][68][69] the introduction of an exponential wave function of a cluster operator T, 70,71 ⌿ = exp͑T͒D 0 , ͑13͒ established a powerful theoretical framework free from socalled CI traps, namely, the CI limitation to a given and necessarily low level of spin-orbital excitations.Let the cluster operator be defined as 70 in terms of creation operators a ˆa † for unoccupied orbitals a and annihilation operators a ˆi for occupied orbitals i. Developing the exponentials and collecting terms, 31 the following exact relationships between determinantal CI coefficients c ijk. . .abc. . .and cluster amplitudes t ijk. . .abc. . .are obtained 33 1 and so on for c ijkᐉ abcd and higher-excited CI coefficients.Apart from the coefficient c 0 , Eqs. ͑18͒-͑20͒ are particular cases of Eq. ͑A4͒ of Ref. 53.A hierarchy of coupled-cluster methods may be derived by replacing the right-hand side ͑rhs͒ of ͑18͒-͑20͒ and similar equations into the full CI equations. 33Instead, we shall move in the opposite direction.

C. Predictor of determinantal coefficients
By replacing the rhs of ͑18͒ in ͑19͒ one gets When ͑18͒ and ͑21͒ are replaced into the rhs of ͑20͒ it follows: Equation ͑22͒ shows that c ijk abc coefficients are given exactly in terms of coefficients of lower excited detors plus the irreducible amplitude t ijk abc .Apart from the occurrence of c 0 , these and similar equations for the coefficients associated to higher-excited determinants are particular cases of Eq. ͑A4Ј͒ of Ref. 53.The exciting promise of the above equations stems from the reasonable hypothesis that distinct from c ijl. . .abc. . .coefficients, the t ijk. . .abc. . .amplitudes diminish quickly with the order of excitation ͑in analogy with the virial expansion in imperfect gas theory 65 ͒, and can be neglected.
In general one has Equation ͑23͒ is a shorthand for a predictor of CI coefficients of the n-excited determinants in terms of coefficients of lower excited detors if triply and higher-excited irreducible t ijk. . .abc. . .amplitudes on the rhs can be neglected.If t ijk. . .abc. . .Ϸ 0, as first envisioned by Sinanoğlu, Eq. ͑22͒ and similar ones can be used to estimate c ijk. . .abc. . .coefficients for evaluation of approximate truncation energy errors of disconnected determinants through an equation similar to ͑6͒ or ͑10͒.This is different from Sinanoğlu's original proposal 65 to use Eq.͑22͒ and similar ones as part of a scheme to calculate the total energy itself.Also, the need for large-scale CI is here anticipated as unavoidable.Moreover, as it is well known, 72 t ijk abc Ϸ 0 and even t ijkᐉ abcd Ϸ 0 is not always justified, causing the need of additional considerations to be discussed in Sec.IV.

D. From determinants to configurations
Simplifications that are essential for large-scale application of Brown's formula will now be considered for the first time.In molecules there is no much of an incentive to contract determinantal expansions into CSFs. 8But the situation changes, even in molecules, when the final purpose becomes to contract sums of determinants into symmetry configurations G K , Eq. ͑8͒, embracing all degenerate elements into a single term.Here the effective contraction factor becomes 1/n K , viz., it is equal to the reciprocal of the number of determinants for a given configuration K, which is around 0.05 for CISDTQ with the Abelian point-symmetry groups, falling under 0.0001 in atomic CISDTQ with orbitals of high angular momentum, continuing to decrease for higher exci-tations.Therefore, the configurational counterpart of equations such as ͑22͒ is considered next.
The configurational cluster expansion is given by where ഛ has now taken the place of Ͻ in Eq. ͑12͒, symmetry-orbitals replace spin-orbitals, and the summation over the degeneracy index g has already taken place thus hiding the linear variational coefficients C gK through Eqs.͑8͒ and ͑9͒.Formally, other than for calculation purposes, ͑24͒ is entirely equivalent to ͑12͒ as well as to ͑7͒ and ͑3͒, thus any ⌽ ijk. . .abc. . . is identical with some G K of ͑7͒.

E. Predictor of configurational coefficients
A priori prediction of the C gK coefficients of Eq. ͑3͒ was discussed by Pipano and Shavitt 73 but the lengthy calculations of their proposal were never implemented.Rather than deriving equations similar to ͑23͒, approximate equations to predict the configurational coefficients B ijk. . .abc. . . of Eq. ͑24͒ shall be guessed.The correctness of the guessed equations will be tested by means of actual calculations.
When there are no equal signs among the participating orbitals and all concerned degeneracies are equal to one, the predictor equations for the configurational coefficients B K of ͑9͒ or B ijk. . .abc. . . of ͑24͒ should be identical to those for the c ijk. . .abc. . .coefficients of determinantal expansions, Eq. ͑22͒, and similar ones.The question to be answered then is how Eq. ͑22͒, for example, is to be modified when there are equal orbital indices.Let us consider the extreme case when all occupied orbitals i are equal among themselves, as well as the excited orbitals a.Since the expansion in Eq. ͑24͒ does not contain repeated coefficients, the recipe must be to drop all terms with repeated coefficients.Consequently, for configurational coefficients, Eq. ͑22͒ changes into B ˆiii aaa in the rhs of ͑25͒ is a linked or irreducible coefficient ͑in Sinanoğlu's nomenclature͒ which will be neglected in the evaluation of the left-hand side ͑lhs͒ of ͑25͒.In this way, of the 15 terms of Eq. ͑22͒ only two survive.The codes expressing the 1440 formulas for up to sextuply excited coefficients ͓2͑2q −2͒ formulas for coefficients of q-excited con-figurations͔ were produced by FORTRAN programs and are further discussed elsewhere. 54Moreover, the irreducible components B ˆijk... abc. . .are significant in many triple excitations, and also in those instances where the remaining terms in the rhs of ͑25͒ are zero, as discussed in Sec.IV.

A. Disconnected and connected configurations
The selection process described so far may be summarized as follows: given a model space M, all disconnected configurations K with energy contributions ⌬E K greater than an energy threshold T egy are included in a selected space S that will subsequently be subjected to a variational treatment.However, other configurations also require systematic incorporation since there are two instances when the above criterion is inadequate:  76 independently of the magnitude of the disconnected terms in ͑22͒.
Connected configurations do not exist when using orbital bases lacking spatial symmetry.They necessarily occur when at least one irrep does not appear as a fully occupied orbital in the reference configuration ⌽ 0 , namely, in all atomic and linear-molecule states, and in few-electron molecules with spatial symmetry.Our aim shall just be to guarantee that all deleted connected configurations, together with disconnected triples that were discarded by Brown's energy criterion, contribute less than a given amount of energy.

B. Additional selection criterion
The occurrence of connected configurations makes it necessary to introduce a new requisite: the correlation orbitals a , b , c ,... must be approximate natural orbitals, 77 viz., eigenfunctions of the reduced first-order density matrix or, better yet, average natural orbitals 78 so that orbital symmetry is preserved.
Let ␥͑1,1Ј͒ be the average reduced first-order density matrix with eigenfunctions a and eigenvalues ͑occupation numbers͒ n a , ␥͑1,1Ј͒ = ͚ n a a * ͑1͒ a ͑1Ј͒.

͑27͒
In studies on atomic electron correlation 79 it was found that configurations can be chosen by the following criterion: for each q-excited configuration K the product P͑q , K͒ of corresponding occupation numbers is calculated where K i represents a correlation natural orbital.If g is the symmetry degeneracy of natural orbital a, n Ka = gn a .The whole configuration ͑all corresponding degenerate elements͒ is incorporated if P͑q , K͒ is greater than some occupation number threshold T on , P͑q,K͒ Ͼ T on .͑29͒ A functional form for T on can be expressed in terms of the excitation level q and of a parameter m as where m is shown explicitly on the lhs of ͑30͒ for later purposes.Thus, 10 −m may be interpreted as an average occupation number below which configurations involving a given natural orbital are deleted from an original model space M.
In practice, starting from a sufficiently small energy threshold T egy , the value of m is increased until successive energy lowerings start to converge to within a prescribed residual energy error.Since the actual value of m in ͑30͒ guaranteeing a given contribution to the residual energy error depends on the holes i , j , k ,... of the configuration involved, there is ample room for enriching Eq. ͑30͒. 54

C. Strategy for configuration selection
The following strategy for configuration selection is adopted ͑i͒ All triples with P͑3,K͒ ജ 10 −3m are selected.This criterion is applied to all triply excited configurations alike, disconnected and connected ones.The value of m must be sufficiently high to guarantee that the energy contribution of all deleted connected configurations is negligible.This is all that is to be done to select connected triples.͑ii͒ As to the disconnected triples that were not selected in ͑i͒, all those with ͉⌬E K ͉ ജ T egy are selected while the energy contributions of the discarded ones are accumulated into the total truncation energy error ⌬E dis , Sec.V A. ͑iii͒ All connected quadruples with P͑4,K͒ ജ 10 −4m are selected.This is the mechanism used to incorporate t ijkᐉ abcd s associated to connected configurations.͑iv͒ All disconnected quadruples with ͉⌬E K ͉ ജ T egy are selected while the energy contributions of the discarded ones are accumulated into ⌬E af dis .This implies to neglect all t ijkᐉ abcd associated to disconnected configurations deleted by the T egy test, no matter how significant they might be.
͑v͒ Quintuple-and sextuply excited configurations are selected according to ͑iii͒ and ͑iv͒.

V. ENERGY EXPRESSION
The discussions in Secs.II and IV allow to develop an appropriate notation and a general equation for the CI energy in terms of the usual energy upper bound, a computable ͑rather than formal͒ truncation energy error, and a residual energy error.The latter two contain a part corresponding to disconnected configurations, which can be estimated a priori, and another one due to connected configurations, which can be evaluated after studying energy convergence as a function of the parameter m of Eq. ͑30͒.

A. Effect of truncating disconnected terms
The a priori computable truncation energy error ⌬E dis comes from truncations of disconnected configurations, with ⌬E K given by Eqs.͑10͒ and ͑11͒ and predictor equations for CI coefficients such as Eq.͑25͒ and similar ones. 54⌬E dis decreases monotonically with the threshold T egy introduced in ͑26͒.⌬E dis is an approximation to an exact, usually unknown truncation energy error ⌬E exact dis ,

͑32͒
For large values of ⌬E dis , the unknown quantity ␦E dis is comparatively small.As T egy is made smaller, ⌬E dis becomes tiny and ␦E dis , which may end up being larger than ⌬E dis , can be interpreted as a residual error which may be obtained through sensitivity analyses.
In atoms, ⌬E dis has two sources: ⌬E bf dis from truncations before CSF evaluation and ⌬E af dis from truncations afterwards

B. Effect of truncating connected terms
The existence of connected configurations, and the need to truncate most of them, brings in a new kind of error, to be denoted ␦E con .Distinct from ⌬E dis and analogously as ␦E dis , ␦E con cannot be computed a priori; it can only be estimated after studying suitable patterns of energy convergence, see Ref. 54.For sufficiently small thresholds, ␦E con can also be understood as a residual error.The sign of ␦E con is always negative, since the latter is made up of bonafide variational energy contributions which have not been incorporated into the final calculation.

C. Energy in a model space M
The energy E M in a model space M is written as

͑34͒
where ␦E values are conditioned by various thresholds T. 54 Since E M is well defined, its value can in principle be obtained by a limiting process, letting all thresholds in T to become sufficiently small, thus lim ͑35͒ In very precise calculations, however, one must always settle for threshold values in T that are still too large to qualify as sufficiently small, and therefore the use of residual errors ␦E dis and ␦E con is inevitable Before ␦E becomes known, convergence studies necessarily

͑37͒
eventually converging to the net value E M .Equation ͑37͒ can easily be applied and may well be all that is needed if precision requirements on E M are not too tight.Otherwise, one must fall back into the more detailed Eq. ͑36͒.

A. Choice of system
As a numerical test, the Ne ground state is chosen because it is the simplest well known example 66,[79][80][81][82][83] exhibiting many of the complexities of a highly correlated CI.The basis set consists of 103 energy-optimized radial orbitals 79 up to ᐉ =13:12s12p11d10f10g9h8i7k6l5m4n3o3q3r, amounting to 1077 orbitals.
Use was made of two programs: AUTOCL ͑106 000 lines of code and comments͒, for the calculation of pruned lists of CSFs together with the corresponding truncation energy error ⌬E bf dis , and ATMOL ͑159 000 lines of code and comments͒, for atomic and molecular SCI.The relatively large sizes of the above codes comes from the formulas used to predict energy contributions from quintuply and sextuply excited configurations.Both programs can be downloaded from a website. 54ull CI with the chosen basis calls for 2.4ϫ 10 25 CSFs ͑Ref.84͒ and 1.4ϫ 10 26 determinants disregarding spatial symmetry.CISDTQQnSx up to ᐉ = 7 demands 6.5ϫ 10 12 CSFs ͑4.2ϫ 10 15 distinct determinants͒ CISDTQ only requires 1.4ϫ 10 9 CSFs containing 1.1ϫ 10 12 determinants.Thus the size of the calculation to be presented exceeds by orders of magnitude the size of any calculations previously attempted.
Despite neglect of relativistic effects, cm −1 precision within the CISDTQQnSx model is sought in order to exhibit various challenges and opportunities.

B. Beginning of calculation
A complete calculation starts as follows: .In all, the said CI coefficients were iterated fourfold.
The eigenproblem is determined variationally by a method whose accuracy can be controlled, 55 presently to within less than one microhartree in the largest reported calculations.

C. General strategy
Convergence of connected configurations was first studied in detail 54 as a function of m until reaching very small values of T on ͑m͒ using triples and quadruples truncated after ᐉ = 7 for the purpose of gaining an idea about convergence behavior and expected values of m for ᐉ = 13.
The final parameters for pruning the configuration list before CSF evaluation were obtained from various studies 54 aiming at both a sufficiently small truncation energy error ⌬E bf and a negligible residual error ␦E bf .In particular, the maximum value of degenerate elements g K per configuration was set at g K = 261, whereas a maximum value of n K = 70 000 was chosen for reasons explained in Ref. 54.

D. CISDTQQnSx results for Ne ground state
After the several studies outlined in the previous subsection CISDTQQnSx calculations are presented as a function of T egy in Table I.The occupation number thresholds are set at T on,dis = T on,con =3ϫ 10 −8q The second and third columns show the number of determinants, N sd ͑in 10 9 ͒, and of CSFs, N csf ͑in millions͒, respectively.These large numbers in many routine calculations are comparable to those in state-of-the-art full CI prowess, 28 except that in the present case the orbital bases are considerably larger while the sparseness of the CI matrices is greatly reduced due to the selection process.The fourth column holds the number of nonzero Hamiltonian matrix elements, N hme ͑in 10 12 ͒.In the penultimate row, this number amounts to 9.59ϫ 10 12 , entailing 153 terabytes of disk storage in a traditional application of Davidson's eigensolver 85 in which the matrix elements are expensive to evaluate thus precluding their recalculation at each iteration.Fortunately, this demand is obviated by the use of a select-divide-andconquer method 55 to solve the eigenproblem.
Neglecting residual errors ␦E coming from quintuples and sextuples, the following conclusions are obtained.

͑i͒
For T egy =10 −8 , ⌬E af dis is larger than the actual energy lowering, yielding a too low gross energy.However, as T egy becomes 10 −10 a.u. and smaller, ⌬E af dis values achieve remarkable accuracy, allowing to produce a reliable converged energy as far as disconnected configurations are concerned; it is estimated that ͉␦E af dis ͉ ഛ 0.05 hartree.͑ii͒ In order to estimate ␦E af con , a final calculation with T egy =10 −11 a.u., and T on,dis = T on,con =10 −8q was carried out and reported in the last row of Table I.Considering patterns of energy convergence of connected configurations from previous studies 54 it may be estimated −␦E af con ഛ 0.55± 0.15 hartree.͑iii͒ Studies of ⌬E bf dis values in various circumstances 54 indicate that ␦E bf dis is negligible, around ±0.15 hartree.͑iv͒ From pilot calculations it was estimated −␦E bf con ഛ 0.5 hartree.The significance of connected configurations for still higher values of g K and n K not yet considered is deemed to be equally negligible.Adding both contributions, −␦E bf con ഛ 1 hartree.
As to the truncation error ⌬E bf dis ͑QnSx͒ from quintuples and sextuples, it amounts to 373 hartree subdivided as fol-

E. Comparison with previous Ne results
The best previous variational calculation 79 used the same orbital set and consisted of a multireference CISD ͑MRCI-SD͒ supplemented with connected configurations selected according to Eqs. ͑33͒ and ͑35͒ including 0.35ϫ 10 6 CSFs and 34ϫ 10 6 determinants.Unknown and unsuspected to the authors at the time, 79 its CI energy error amounted to 739 hartree, as it may be deduced from Table I after subtracting the energy contribution from quintuples and sextuples.
From the previous subsection, E S = −128.936541 a.u.͑Ne͒, and E M = −128.936914͑2͒ a.u.͑Ne͒ ͑Table II͒.The energy error ⌬E OBI due to orbital basis incompleteness was computed previously as −643± 20 hartree ͑this value may be deduced from Table VI of Ref. 79͒ through studies of successive saturation with radial functions for a given ᐉ value at the CISD level of approximation, together with patterns of convergence of angular energy limits. 2,66dding E M to ⌬E OBI , an upper bound E u = −128.937557 a.u.͑Ne͒ is obtained, 13 hartree above the "exact" value estimated by Chakravorty et al., 86 and probably fortuitously close to it since septuples and higher excitations are deemed to contribute slightly more than the observed 13 hartree.More accurate estimates of ⌬E OBI and of energy contributions beyond quadruples are needed to test the reliability of this exact energy prediction. 86

A. Achievements
A priori SCI together with truncation and residual energy errors, Eq. ͑36͒, has been generally formulated, and a practical approach to approximate CISDTQQnSx has been given.SCI rests upon ͑i͒ Brown's formula 52 ͑Sec.II͒, ͑ii͒ the use of predictors for configurational CI coefficients to select and assess disconnected configurations via Brown's formula, ͑iii͒ the use of natural orbital concepts to select connected configurations and disconnected triples, and ͑iv͒ sensitivity analyses to determine residual errors.
Predictors for configurational rather than determinantal coefficients are essential to reduce computational requirements by several orders of magnitude.Gross energies converge from below, however, as T egy becomes sufficiently small, ⌬E af dis values become remarkably accurate, well under 1 hartree ͑see Table I͒.
As implemented here, the method can be applied quite in general to an important range of electronic systems including all atoms, for CISDTQQnSx calculations in model CI spaces exceeding 10 12 CSFs and quadrillions of determinants.
SCI has been tested on the Ne ground state using a single computer processor.A rapidly convergent sequence of energies and wave functions ͑Table I͒ together with calculated truncation and residual energy errors is used to achieve an precision of 2 hartree within a CISDTQ model.A less precise result within a CISDTQQnSx model is also given.The final energy result still needs to be complemented by similar analyses with septuply and higher-excited configurations, and also by more accurate estimates of ⌬E OBI due to orbital basis incompleteness, 3,4 as discussed earlier in the Introduction in connection with Eq. ͑2͒.
The largest previous CI calculation 28 involved ten electrons, 34 orbitals, 9.68ϫ 10 9 determinants, 128 processors, and attained absolute convergence to within 5 hartree.For comparison, the largest calculation in this paper ͑penultimate row of Table I͒ also involves ten electrons, 1077 energyoptimized orbitals, 88ϫ 10 9 CSFs expanded in 35ϫ 10 9 determinants in the selected space S with all corresponding CI coefficients being calculated variationally at least once, while energy convergence in S attains a fraction of 1 hartree.͑The last entry of Table I features 41ϫ 10 9 determinants.͒A recent FCI calculation 87 on the eight valence electrons of the C 2 molecule uses 68 orbitals, 65ϫ 10 9 determinants, and 432 processors, achieving convergence with a residual norm of 10 −5 .

B. Estimate of the orbital basis incompleteness error ⌬E OBI
Whatever ab initio method is being used, Eqs.͑31͒-͑33͒ can be applied to quickly estimate the orbital basis incompleteness error ⌬E OBI without ever carrying out a major calculation, if connected configurations can be neglected or do not occur altogether: ⌬E OBI can be approximated by the difference between ⌬E af dis values for one very large basis set and for the original basis.To do so one only needs ͑i͒ CI coefficients of singly, doubly, and triply excited configurations, from CISDT, CCSD͑T͒ or HCCI wave functions, ͑ii͒ one diagonal matrix element between the Slater determinants for each configuration involved together with corresponding one-and two-electron integrals, and ͑iii͒ the configurational coefficients calculated from the predictor equations, Eq. ͑25͒ and similar ones.
A sequence of calculations with increasing basis set size can be used to yield increasingly small ⌬E OBI values ͑in magnitude͒ which may be extrapolated.

C. Impact on other methods and outlook
In order to formulate a theoretical model 88 one must settle for ͑a͒ accuracy with respect to the Breit-Dirac-Schrödinger theory or experiment, ͑b͒ precision with respect to the model itself ͑truncation and residual errors for energies, and sensitivity tests for all properties, in general͒, and ͑c͒ the method to be used, for example, CISDTQQnSx, or any of the suggestions below.
Selected CI is too general to constitute a theoretical model on its own, however, it can be used to formulate new theoretical models or to improve upon existing ones.
Brown's formula, Eq. ͑10͒, used in conjunction with the predictor equations for CI coefficients of higher than double excitations, can smoothly replace perturbation theory in all so called PT2 methods 89,90 since it is more accurate and about as efficient, thanks to Eq. ͑11͒.In principle, it can also be applied beyond PT2.1]91 Multireference CI ͑Refs.92-96͒ continues to be actively developed. 97In carrying the transition to SCI, MRCI can first be supplemented with the truncation energy error ⌬E af dis , Eq. ͑33͒, and with connected configurations, 79 given the case.Next, the configuration generator in MRCI can be extended from MRCI-SD to MRCI-SDTQ.If Qn and Sx excitations are considered only at the level of evaluation of truncation energy errors, the corresponding effort ͑which increases linearly with the number of configurations͒ is a small fraction of the one required for a selected CISDTQ calculation.In any case, introduction of leading selected quintuples and sextuples into the wave function is straightforward.
General incorporation of higher than sextuply excited SCI is not a good idea, in general, although septuples and octuples are feasible in atoms. 54In molecules, variational expansion coefficients c ijk. . .abc. . .can be used to obtain accurate t ijk. . .abc. . .amplitudes, which in turn can be fed into a CC ansatz, 7 hopefully improving the energy and the efficiency of CC methods in a single step without need of CC iterations.Perhaps more interesting, accurate CI wave functions may be used to tailor CCSD. 98he density matrix renormalization group method, 29 and the growing family of iterative CI methods [23][24][25]99,100 can be used with the largest possible bases to estimate the energy errors due to truncations beyond quadruples, thus enhancing SCI capabilities.
By advancing reliable CISDTQQnSx, the present SCI method considerably extends the scope of accurate atomic and molecular ab initio electronic structure applications.If the orbital bases are well chosen 101,102 or well developed ͑for example, through energy optimization 79,103 ͒, one may envision unprecedented accuracy for problems tractable by CIS-DTQQnSx or by the SCI-improved methods mentioned above.
Needless to emphasize, SCI applies mutatis mutandis to the Dirac-Schrödinger equation, 104 and also to the Breit-Dirac-Schrödinger equation, 105 provided only positiveenergy orbitals are used, viz., within the no-pair Hamiltonian approximation. 106In fact, the computer program ATMOL 54 has precisely that capability for general atomic states.
After what has been said, there remains the input/output bottleneck 107 inherent to large-scale applications of Davidson's eigensolver 85 when applied to CI matrices expressed in terms of CSFs.In the following paper, 55 this bottleneck is overcome by a select-divide-and-conquer variational procedure based on the present configuration selection scheme.

ACKNOWLEDGMENTSI
am indebted to Professor Ramon Carbó-Dorca for stimulating conversations and lasting friendship during my sabbatical year at the Institute of Computational Chemistry of University of Girona ͑Catalonia, Spain͒ in 2001-2002.Discussions with various colleagues stirred further motivations, particularly with Professors Ignacio Garzón ͑Mexico, D.F.͒, Ingvar Lindgren ͑Goteborg, Sweden͒, and Alejandro Ramírez-Solís ͑Cuernavaca, Mexico͒, and Dr. Oliverio Jitrik.The sharp and kind criticism of Professor Isahia Shavitt ͑Urbana, Illinois͒ is deeply appreciated.The Dirección General de Servicios de Cómputo Académico ͑DGSCA͒ of Universidad Nacional Autónoma de México, and the Computing Center at my own institute are thanked for their excellent and free services.Support from CONACYT through Grant Nos.E-26726 and 44363-F is gratefully acknowledged.
74r a given set of indices abc...ijk..., only t ijk...abc. . . in the rhs of ͑23͒ is different from zero.Such configurations shall be called connected configurations,74while all others are called disconnected ones.Examples of connected configurations are pdf in Li 2 S and pdfg in Be 1 S.
75͑ii͒ it is not sufficient for triply excited disconnected configurations: here, the largest part of the energy contributions comes from nonnegligible irreducible t ijk abc s and corresponding B ˆijk abc coefficients, is run to obtain approximate natural orbitals.͑ii͒Usingthese approximate natural orbitals, a CISDT is carried out and its energy as well as the CI coefficients of single, double, and leading triple excitations are saved on a file for later use in the prediction of configurational B K coefficients needed for the a priori evaluation of estimates of energy contributions of disconnected configurations.͑iii͒ With data from ͑ii͒, a pruned list of CSFs for CIS-DTQQnSx is calculated using suitable pruning parameters. 54͑iv͒ Using the list of CSFs obtained in ͑iii͒, an approximate CISDTQ wave function is obtained for the purpose of improving over the coefficients of single, double, and triple excitations first calculated in ͑ii͒.
͑v͒ These are used to run ͑iii͒ once more, yielding a very similar list of CSFs.The new CI coefficients of singles, doubles and triples yield more accurate truncation energy errors ⌬E bf dis and ⌬E af dis

TABLE I .
Convergence of the CISDTQQnSx ground state energy of Ne with a 12s12p11d10f10g9h8i7k6l5m4n3o3q3r basis, as a function of T egy =10 −n a.u., using fourfold iterated CI coefficients

TABLE II .
Comparison with best previous calculation using the same orbital basis; energies in a.u.͑Ne͒.