Comparing Geometry Accuracy
This simple example will demonstrate basic usecases of the pyorca.geometry module.
The geometry of fluoromethane was optimized at different levels of theory, such as from semi-empirical PM3, local BLYP, hybrid B3LYP and double-hybrid B2PLYP. We want to see how accurate and fast are these methods - let’s use PyORCA to analyse that!
import pyorca as po
# load the experimentally determined geometry as a reference
# the order of atoms if [C, F, H, H, H]
with open('fluoromethane.xyz', 'r') as f:
atoms, exp_geometry, _ = po.geometry.read_xyz(f.read())
# this is the list of different levels of theory and the respective output files
calculated = [
("PM3", "fluoromethane_PM3.out"),
("BLYP/def2-SVP", "fluoromethane_BLYP_SVP.out"),
("B3LYP/def2-SVP", "fluoromethane_B3LYP_SVP.out"),
("B3LYP/def2-TZVP", "fluoromethane_B3LYP_TZVP.out"),
("B3LYP/def2-TZVP/D4", "fluoromethane_B3LYP_TZVP_D4.out"),
("B2PLYP/def2-TZVP", "fluoromethane_B2PLYP_TZVP.out"),
("B2PLYP/def2-QZVPP", "fluoromethane_B2PLYP_QZVPP.out"),
]
# this function extracts information about the geometry
# and also calculates RMSD (root mean squared distance) from the experimental structure
def calc_descriptors(exp_geometry, other_geometry):
# align the structures and calculate RMSD
rmds = po.geometry.aligned_rmsd(exp_geometry, other_geometry)
# calculate bond distances and angles
CF_dist = po.geometry.distance(other_geometry[0], other_geometry[1])
CH_dist = po.geometry.distance(other_geometry[0], other_geometry[2])
HCH_angl = po.geometry.angle_degree(other_geometry[2], other_geometry[0], other_geometry[3])
# return all values for printing
return rmds, CF_dist, CH_dist, HCH_angl
# Define template for printing nice table
template = '{: <18} {: <10.3f} {: <10.3f} {: <10.3f} {: <10.3f} {}'
# print table header
print(
template.replace('.3f','')
.format('Method', 'RMSD', 'C-F dist.', 'C-H dist.', 'H-C-H angl.', 'duration')
)
# print the experimental geometry for reference
print(template.format('Experimental', *calc_descriptors(exp_geometry, exp_geometry), '-'))
# process all ORCA output files
for method, filename in calculated:
# parse the file
data = po.OrcaOutput.parse_file(filename)
# load final geometry and calculation duration
final_geometry = data.calculations[0].final_coordinates
duration = data.duration
# print the data
print(template.format(method, *calc_descriptors(exp_geometry, final_geometry), duration))
Method RMSD C-F dist. C-H dist. H-C-H angl. duration
Experimental 0.000 1.365 1.082 110.202 -
PM3 0.010 1.351 1.092 110.362 0:00:03.159000
BLYP/def2-SVP 0.028 1.387 1.112 108.819 0:00:11.827000
B3LYP/def2-SVP 0.020 1.373 1.104 108.958 0:00:24.174000
B3LYP/def2-TZVP 0.013 1.390 1.091 110.109 0:00:43.175000
B3LYP/def2-TZVP/D4 0.013 1.390 1.091 110.104 0:00:55.844000
B2PLYP/def2-TZVP 0.011 1.388 1.089 110.137 0:01:13.593000
B2PLYP/def2-QZVPP 0.010 1.388 1.087 110.193 0:35:23.387000
As we can see, the geometry accuracy gets better with more complex functionals and larger basis sets, but the calculations also take a lot longer.
Semi-empirical methods, such as PM3, can often provide high-quality geometries much faster than DFT, but the results become much less reliable for not-so-common molecules and structural motives.