Importance of the Basis Set for the Spin-State Energetics of Iron Complexes

We have performed a systematic investigation of the influence of the basis set on relative spin-state energies for a number of iron compounds. In principle, with an infinitely large basis set, both Slater-type orbital (STO) and Gaussian-type orbital (GTO) series should converge to the same final answer, which is indeed what we observe for both vertical and relaxed spin-state splittings. However, we see throughout the paper that the STO basis sets give consistent and rapidly converging results, while the convergence with respect to the basis set size is much slower for the GTO basis sets. For example, the large GTO basis sets that give good results for the vertical spin-state splittings of compounds 1-3 (6-311+G**, Ahlrichs VTZ2D2P) fail for the relaxed spin-state splittings of compound 4 (where 1 is Fe-(PyPepS)2 (PyPepSH 2 = N-(2-mercaptophenyl)-2-pyridinecarboxamide), 2 is Fe(tsalen)Cl (tsalen = N, N'-ethylenebis-(thiosalicylideneiminato)), 3 is Fe(N(CH2-o-C6H4S) 3)(1-Me-imidazole), and 4 is FeFHOH). Very demanding GTO basis sets like Dunning's correlation-consistent (cc-pVTZ, cc-pVQZ) basis sets are needed to achieve good results for these relaxed spin states. The use of popular (Pople-type) GTO, effective core potentials basis set (ECPB), or mixed ECPB(Fe):GTO(rest) basis sets is shown to lead to substantial deviations (2-10 kcal/mol, 14-24 kcal/mol for 3-21G), in particular for the high spin states that are typically placed at too low energy. Moreover, the use of an effective core potential in the ECPB basis sets results in spin-state splittings that are systematically different from the STO-GTO results.


Introduction
Correctly predicting relative energies of spin states (i.e. spin-state splittings) of transition-metal complexes is a necessary requirement for being able to distinguish between competing pathways for reactions of (bio)inorganic compounds. Because of problems with the B3LYP functional for providing the spin ground-state of iron-sulfur complexes, Reiher and co-workers therefore proposed 1 to lower the amount of Hartree-Fock exchange in B3LYP to 15%. 2 This new functional, called B3LYP*, was indeed shown 3 to improve upon B3LYP, but still failed for spin-crossover systems. 4 More recently, validation studies of Density Functional Theory (DFT) 5 functionals by some of us 6 have shown the excellent performance of the OPBE 7-9 functional for these spin-state splittings, which is corroborated by good results in other studies. [10][11][12][13][14][15][16][17][18][19][20][21][22][23] In fact, "there is no reason to suppose that hybrid functionals are inherently better than pure functionals", because "the pure functionals OLYP and OPBE .. appear to provide the best overall description of the spin-state energetics" (both quotes from ref. 10). This good performance of OPBE (and OLYP 24,25 ) concurs with a recent benchmark study 26 on the energy profiles of S N 2 reactions, where DFT was compared to high-level coupled cluster CCSD(T) data. Indeed, the underestimation of reaction barriers by standard DFT functionals is dramatically reduced 7,26 when using criticism of the OPBE functional might as well be resulting from the choice of basis set used.
Unfortunately, no systematic study comparing the influence of the basis set on spin-state splittings of iron complexes has appeared in the literature so far. Hence, we decided to extend our series of validation studies for spin-state splittings of iron complexes with a systematic investigation of the influence of the basis set.
[here Figure 1] Here we report the relative vertical spin-state energies for three Fe(III) compounds (see Figure 1) that have either a low (1), 33 intermediate (2) 34 or high (3) 35 spin ground state experimentally. The spin-state splittings were computed using the OPBE functional with a variety of basis sets that include STOs, GTOs, ECPBs, and mixed ECPB(Fe):GTO(rest) basis sets. The GTO, ECPB and ECPB(Fe): GTO(rest) basis sets comprise all basis sets in the EMSL library 36 that are available for all elements present in compounds 1-3, apart from the basis sets that are too demanding for these larger molecules (Roos-ANO-aug-dz, cc-pVTZ, cc-pVQZ). Moreover, we also examined the influence of the basis set on the geometry optimization for the three spin-states of a small iron complex (4, FeFHOH), by looking at both the structure and the resulting relaxed spin-state splittings. For this compound 4, we could include the results from the more demanding Roos-ANO-aug-dz, cc-pVTZ and cc-pVQZ GTO basis sets.

Computational details
The calculations using the unrestricted formalism have been performed with four computational chemistry programs: ADF 37,38 (version 2006.01), Gaussian03 39 (revision B.02), NWChem 40 (version 5.0) and ORCA 41 (version 2.5.20). The PBEc correlation functional 9 is implemented differently in the programs, i.e. with PW92 local correlation 42 in Gaussian, with VWN5 43 in ADF, while NWChem and ORCA allow to choose either one of these or VWN3. 43 Given in Table 1 are the ORCA results for compounds 1-3, as calculated with either one of the three local correlation functionals. These results show that the spin-state splittings are virtually the same (difference < 0.1 kcal/mol) when using either PW92 or VWN5 local correlation, which enables us to compare directly between different programs, as long as either one of these two is used.
[here Table 1] The Cartesian coordinates for the three Fe(III) complexes 1-3, representing the experimental structures, have been taken from a previous benchmark study, 6 while the geometry optimization for the Fe(III) compound 4 has partly been performed with the QUILD program 44 (that serves as a wrapper around ADF and ORCA).

Basis sets
There is a fundamental difference between basis sets and effective core potentials (ECPs), which are in fact model Hamiltonians that replace the effect of the core electrons. Here, we refer to the combination of the ECPs with their corresponding valence basis sets together as ECP Basis-sets (ECPBs). Note also that the ECP approach differs fundamentally from the frozen-core approach 38 used in ADF. Although core-electrons are not included in the SCF procedure within the frozen-core approach in ADF, it does include the core orbitals and explicitly orthogonalizes the valence orbitals to them.
Moreover, the core density is obtained and included explicitly, albeit with core orbitals that are kept frozen during the SCF procedure. The influence of keeping the core-orbitals frozen within ADF is usually small (see below).
For an infinitely large basis set, the actual form of the basis functions is irrelevant and either STO or GTO basis sets (but not necessarily the ECPB basis sets !) should give the same final result. In reality however, we choose a finite basis set, in the hope that by increasing it we come closer to the final result.
There are indications that this is indeed observed for STO basis sets (see below) and Dunning's correlation-consistent basis sets, 45,46 but it does not have to be true in general for other type of GTOs or ECPBs. In fact, because the ECPBs include a model Hamiltonian for the core electrons, the combination with an infinitely large basis set for the valence electrons would not necessarily result in the same final answer obtained with an infinitely large STO or GTO basis set.
The basis sets used in this study are generally reported here as they have been used within the programs. There are however a few exceptions, such as the SDD ECPB, which when used for all atoms is called SDDAll within Gaussian, and the mixed ECPB(Fe):GTO(others) basis set combinations that were given as general basis set input to Gaussian03 (Gen keyword). The correlation consistent basis sets for iron (cc-pVTZ-NR, cc-pVQZ-NR) were taken from the basis set exchange 47 website, and combined with the standard cc-pVTZ and cc-pVQZ basis sets for H, F and O in compound 4.

Results and discussion
We have made a systematic investigation into the influence of the type and size of basis set used on the spin-state splittings of several spin-state complexes. In particular, we have calculated the vertical spin-state splittings with the OPBE functional for three iron complexes (see Figure 1), for which the spin ground-state is known experimentally. Recently, it was already shown 6 that the OPBE functional is remarkably successful for the prediction of spin-state splittings, which is confirmed by other and more recent studies. [10][11][12][13][14][15][16][17][18][19][20][21][22][23] However, the actual choice for the DFT functional is not important for the issue at stake in this study, where the basis set is under scrutiny. Furthermore, to investigate the influence of the basis set on the geometry, and the corresponding relaxed spin-state splittings (see Figure 2 for the difference between vertical and relaxed spin-state splittings), we have optimized the geometry of a small iron complex (4, FeFHOH), for each spin state separately and with all basis sets, including the very demanding ones like cc-pVQZ.
[here Figure 2] Influence of the basis set on the vertical spin-state splittings for compounds [1][2][3] For compounds 1-3, no significant amount of spin-contamination was observed, as evidenced by the expectation values for S 2 , which were with all (STO, GTO, ECPB) basis sets very close to the pure spinstate values of 0.75 (doublet), 3.75 (quartet) and 8.75 (sextet).
With the STO basis sets, the OPBE functional predicts correctly 6 a doublet ground state for compound 1, a quartet for 2, and a sextet for 3 (see Table 2). Moreover, the spin-state splittings are very similar in magnitude, irrespective of the size of the basis set, and whether an all-electron or frozen-core basis set is chosen. Furthermore, including scalar relativistic corrections (ZORA 38,48 ) does not have a large influence on the spin-state splittings (see Table 2).
[here Table 2] Included in Tables 2 and 3 is the mean absolute deviation (MAD) of the spin-state splittings calculated with a certain basis set, compared to those calculated with the largest non-ZORA STO basis set (TZ2P). We chose the results with this basis set as reference, for different reasons. First, because there are no experimental values available. Second, only STO basis sets can properly represent the cusps at the nuclear positions as well as the correct exponential asymptotic behavior of the electron density.
Third, the results with the large QZ4P ZORA basis set are obtained with relativistic corrections included, which makes a direct comparison with the results from the GTO and ECPB basis sets (that do not explicitly include relativistic corrections) unfair. The MAD value is reasonably small (0.9 kcal/mol) for the smallest DZ basis, and reduces even further to values of 0.1-0.3 kcal/mol for larger basis sets (see Table 2). The influence of using a frozen-core basis set is shown to be equally small (0.3 kcal/mol or less). Therefore, the STO basis sets give results that are consistent with each other, and that converge rapidly with basis set size.
This picture changes when turning to the GTO basis sets (see Table 3). These results are shown to be sensitive to the number of primitive Gaussian functions, as shown by the MAD values of 3-21G (13.6 kcal/mol) vs. 6-31G (5.4 kcal/mol). Both basis sets have the same number of basis functions, and differ only in the number of (contracted) primitive GTO functions: e.g. for 6-31G, the number of primitives (six) for the core electrons is twice as large as for 3-21G.
The MAD values for GTOs are much larger than the STO values, and they tend to converge much slower with basis set size. For instance for the series of Pople basis sets with six primitive Gaussians for the core electrons (6-31G, .., 6-311+G**), the MAD value goes from 5.4 kcal/mol for 6-31G, to 4.5 kcal/mol for 6-31G*, to 2.7 kcal/mol for 6-311G and only with the 6-311+G** is the performance sufficiently close (0.3 kcal/mol) to the STO basis sets. In contrast, the Ahlrichs basis sets VDZP 49 and VTZ2D2P 49,50 give more reliable results than the Pople-type equivalent, with MAD values of 1.2 and 0.2 kcal/mol respectively.
[here Table 3] Particularly striking is that the (smaller) GTO basis sets have a problem with high spin states. This is most evident by looking at the spin-state splittings for compounds 1 and 3. The latter has a high-spin ground state with the doublet state at around +13 kcal/mol (see the results with the STO basis sets in Table 2). However, with the Pople-type basis sets the doublet is typically placed in the range from +17 to +23 kcal/mol (and +37 for 3-21G, see Table 3), i.e. deviations of 4-10 kcal/mol (24 kcal/mol for 3-21G). Likewise, for compound 1 with a doublet ground state, the sextet state is too low in energy by 4-7 kcal/mol (19 kcal/mol for 3-21G) with the GTO basis sets. Only with the 6-311+G** and the Ahlrichs VTZ2D2P basis set are the high spin states described reasonably well. This overestimation of the stability of high spin states by (smaller) GTO basis sets is not influenced by the number of d-functions of the basis set (5D vs. 6D), nor by the program used, whose effects are of the order of only 0.1 kcal/mol (results not shown in tables).
Also the ECPB basis sets have a problem with describing the high spin states. Similar to the GTO basis sets, most of the ECPBs place the high spin-states at too low energy, and show overall MAD values of 3-5 kcal/mol. The only exception is the LANL2DZ ECPB, which places the high spin state at too high energy, and thus predicts a too small splitting between the doublet and sextet states for compound 3 (of only 11.0 kcal/mol). By mixing the ECPB on iron with GTO basis sets on the rest of the atoms, the high spin states are pushed up even further, and leads to a further reduction to 8.7 kcal/mol for LANL2DZ(Fe):6-311G**(rest). As a result, the MAD value increases with increasing basis set size, unlike the STO or GTO basis sets. The same happens for the LACV series, where by making the calculations technically more accurate (increasing the GTO basis set size) the MAD value increases, from 0.8 kcal/mol for LACVP, to 1.5 kcal/mol for LACV3P and 1.9 kcal/mol for LACV3P++**. Again, this results mostly from the high spin-states that instead of going up in energy, go to lower energies (see sextet state energies of compound 1 in Table 3). Therefore, the introduction of a model Hamiltonian (ECP) for replacing the effect of the core electrons leads to spin-state splittings that are systematically different from the STO-GTO results.

Influence of basis set on geometry and relaxed spin-state splitting of FeFHOH
The spin state energies of compounds 1-3 were all calculated using the same geometry for the three spin states, i.e. vertical spin-state splittings (see Figure 2). However, metal-ligand distances vary with spin state due to different occupations of the molecular valence orbitals. Moreover, the basis set size is likely to influence not only the spin state energies, but also the corresponding geometries (see Figure 2). Therefore, we decided to take a small iron complex and optimize that with all basis sets to investigate the influence of the basis set on the vertical spin-state splitting.
We started first with FeF 3 , but this resulted in problematic SCF in many cases. We therefore turned to the most simple non-symmetric iron complex we could think of (FeFHOH); for this system, all programs were able to locate the different spin states without problems. The doublet state for this molecule is with all basis sets significantly spin-contaminated, as evidenced by the expectation value for S 2 . This is found to be around 1.70 (a pure doublet has a S 2 value of 0.75) by the more accurate basis sets (see Table 5). In principle, we could have used spin-projection techniques 51,52 to correct the energy for the doublets, but these corrections would be of similar magnitude because of the similarity in the value for S 2 , and are therefore ignored here. Given in Table 4 are the iron-ligand distances for the three spin states as obtained with the different basis sets, with the corresponding spin state energies reported in Table 5. The mean absolute deviations with respect to the STO-TZ2P results for the distances (MAD R ) and energies (MAD E ) are given in the last column of these tables.
[here Table 4] For the STO basis sets, the deviations of the distances are similar to the accuracy of geometries for a set of small molecules. 53 I.e., the MAD R value for DZ of ca. 0.025 Å is reduced to 0.006 Å with the TZP basis (see Table 4). Most GTO basis sets have MAD R values of similar magnitude (0.008-0.024 Å), which is also observed for the ECPB and mixed ECPB:GTO basis sets. The more accurate GTO basis sets (Ahlrichs, Roos-ANO-aug-dz, cc-pVTZ, cc-pVQZ) have rather low MAD R values (0.003 Å). Therefore, the changes in geometry due to different d-orbital occupations in the three spin states are reproduced reasonably well by all basis sets.
More important than the geometry is how the basis set influences the spin-state splittings when optimizing the three spin states of FeFHOH separately, i.e. for the relaxed spin-state splittings (see Figure 2). Similar to seen for compounds 1-3, the STO basis sets give a consistent picture, with MAD E values of 0.6 kcal/mol (DZ) or less (0.1 kcal/mol) for the TZP STO basis set (see Table 5). Therefore, we can already conclude that the STO basis sets show similar and consistent accuracy for both spin-state energy and geometry.
[here Table 5] The smaller GTO basis sets show the same problems with the spin-state splittings as for compounds 1-3, in that they predict the high spin state at too low energy (see Table 5). The 3-21G basis even predicts a high spin ground state for the FeFHOH molecule, while the 6-31G and 6-31G* place it at almost the same energy as the quartet ground state. This erratic behavior should not be attributed to the basis set size, as the DZ STO (of similar size as 6-31G*) does give a good prediction for the relative energy of the high spin state.
For compounds 1-3, we saw that by increasing the basis set size, to e.g. 6-311+G** or VTZ2D2P, the difference between GTO and STO results virtually disappeared. This is not the case for compound 4, where the 6-311+G** and VTZ2D2P basis sets place the sextet state at 3.5 and 5.9 kcal/mol respectively (see Table 5), i.e. an underestimation of 5.4 and 3.0 kcal/mol. Because of the small size of compound 4, the more demanding basis sets could also be used, but at great computational cost: instead of minutes, these jobs took days of calculation. The three basis sets (Roos-ANO-aug-dz, cc-pVTZ, cc-pVQZ) significantly improve the results from the other GTO basis sets. This is in particular seen for the high spin-state that is predicted at around +9 kcal/mol, in excellent agreement with the STO results (see Tables 5). Thus, the same results are obtained for relaxed spin-state splittings when sufficiently large STO and GTO basis sets are used, but the convergence towards the basis set limit is significantly slower for the GTO series.
Representing the core electrons with only three GTO primitives (3-21G) is shown again to be insufficient, which shows up in MAD R (0.036 Å) and MAD E (5.7 kcal/mol) values for the 3-21G basis that are the largest deviations observed for all basis sets. Moreover, this basis set is the only one that predicts a high spin state for compound 4.
The belief that increasing the basis set size should always lead to more accurate results is shown to be incorrect. This is in particular true for the Pople-type basis sets and the basis sets containing ECPs. For the Pople series with six primitives for the core electrons, the MAD E value decreases from 4.4 kcal/mol for 6-31G to 4.3 kcal/mol for 6-31G*, to 2.2 kcal/mol for 6-311G, but then increases (by a factor of almost three !) to 6.0 kcal/mol for 6-311+G**. A similar oscillating trend is shown by the LACV series, where the MAD E value increases from 3.7 kcal/mol for LACVP to 5.7 kcal/mol for LACV3P, and then decreases to 4.2 kcal/mol for LACV3P++**. The problems show up again most evidently for the high spin-state, which is predicted ca. 5 kcal/mol too high by LACVP and LACV3P++**, and ca. 5 kcal/mol too low by LACV3P. Similar as for the vertical spin-state splittings of compounds 1-3, the LANL2DZ basis set places the high spin-state at a (5-6 kcal/mol) too high energy, resulting in a significant deviation from the accurate STO and GTO basis sets.

Overall performance of the basis sets for the spin-state splittings
As we have seen throughout this paper, the STO basis sets give consistent behavior while the convergence with respect to the basis set size is much slower for the GTOs. This is graphically illustrated in Figure 3, where we have plotted the MAD (compounds 1-3) and MAD E (compound 4) value for each basis set. Also obvious from the figure is the fact that if sufficiently large GTO basis sets are used, that they give the same spin-state splittings (both vertical and relaxed) as those obtained with the STO basis sets. The same can not be said about the basis sets that include ECPs, which give systematically different results from the STO-GTO data.
The GTO series seem to converge faster to the basis set limit results for vertical splittings than for relaxed splittings. This seems to suggest that a cancellation of errors occurs for the vertical splittings when using GTO basis sets, i.e. it shows up only when different geometries, i.e. the relaxed structures, are used for different spin-states. Because the STO basis sets do not seem to suffer from it, it may have to do with the cusps at the nuclei (present in STO, absent in GTO), that might influence the relative energy of the d-orbitals on the metal.

Concluding remarks
We have made a systematic investigation of the influence of the basis set on the spin state energies of iron compounds when using the OPBE functional. Reliable and consistent results for spin state energies and geometries have been obtained when using Slater-type orbital (STO) and large Gaussian-type orbital (GTO) basis sets. Substantial deviations (2-10 kcal/mol) and inconsistencies are observed when using (small) GTO, Effective Core Potential (ECPB) or mixed ECPB(Fe):GTO(rest) basis sets. Using three primitive GTOs for core electrons (3-21G) is shown to be insufficient and results in large deviations (14 kcal/mol on average for compounds 1-3, 19-24 kcal/mol for high spin states).
An infinitely large GTO basis set should in principle give the same final results as the STO series, which is indeed what we see happening for the vertical and relaxed spin-state splittings. However, the convergence of the result with basis set size is much slower for GTOs than for STOs. This is in particular true for the energy of the high spin state, which is generally predicted at too low energy by the GTOs. In fact for the relaxed spin-state splittings of compound 4, the large (6-311+G** and VTZ2D2P) GTO basis sets that were capable to reproduce the vertical spin-state splittings of compounds 1-3, still underestimate the energy of the high spin-state by 3-5 kcal/mol. The STO results were however confirmed by the large and very demanding GTO basis sets (Roos-ANO-aug-dz, cc-pVTZ, cc-pVQZ).
The ECPB and mixed ECPB(Fe):GTO(rest) basis sets are unable to give an accurate prediction of spin-state splittings. Most ECPBs underestimate the energy for the high spin-state, typically by some 5 kcal/mol, while increasing the valence GTO basis set does not always lead to better results. This shows clearly that the replacement of the effect of core electrons by a model (ECP) Hamiltonian differs systematically from the accurate STO-GTO results.         a) values in parentheses refer to expectation values of S 2 , which should be 0.75, 3.75 and 8.75 for pure spin states; b) number of basis functions used for compound 4; c) Mean Absolute Deviation (kcal/mol) with respect to all-electron TZ2P (STO) results; d) calculated with ADF program, using VWN5 local correlation with all electrons included; e) calculated with ORCA program, using VWN5 local correlation; f) calculated with G03 program, using PW92 local correlation; g) calculated with NWChem, using VWN5 local correlation; h) mixed basis set with ECPB on Fe, GTO on rest of atoms