Part 1: Error analysis and hyperparameter optimization of machine-learned force fields ¶
This tutorial will explain error analysis and hyperparameter optimization of machine-learned force fields (MLFF). In the first part of the tutorial, you will learn how to examine the training-set and test-set errors. The second part of the tutorial will teach you about hyperparameter optimization. By careful hyperparameter optimization, it is possible to improve the accuracy and the performance of the MLFF simultaneously.
Disclaimer: The force field and test set used in this tutorial are tweaked. They are useful to demonstrate the workflow within reasonable time and computational effort, but not for research applications.
1 Training-set and test-set error analysis ¶
1.1 Task¶
Compute the training-set and test-set errors of an MLFF model for given reference data.
Error estimation of an MLFF model gives insights into the accuracy of the trained model and how the model generalizes to out-of-training-set structures. In machine-learning approaches, one generally differentiates between two types of errors: The training-set error and the test-set error. The training-set error is evaluated on the training set. It is computed as the error between the values of the density-functional-theory (DFT) calculation used for fitting and the MLFF prediction.
The test-set error is evaluated using an external test set that was not utilized during the training process. It is considered good practice to sample the configurations in the test set under the same conditions as those employed in the MLFF production runs. For example, the structures in the test set should possess the same number of atoms as the production run, even if the initial training was conducted on structures with fewer atoms. Furthermore, the test set must be collected in the same thermodynamic phase as the one in which the production run will take place. When test-set error analysis is conducted in this manner, it provides insights into how well the MLFF model generalizes from the training set to the production conditions.
In general, there are three cases which can be distinguished for the comparison of training-set and test-set errors:
A low training-set error and a high test-set error indicate overfitting. This scenario means that your force field is bad at extrapolating to structures outside of the training set. Improving this force field is necessary, either by including more training structures or by tuning hyperparameters.
A training-set error and a test-set error that are roughly of the same size indicate that your force field will work as long as both errors are low enough for the desired application.
A high training-set error and a low test-set error indicate that your test set is biased, i.e., not general enough.
For further information about errors in machine-learning approaches study section IV in the paper by A. M. Tokita and J. Behler (DOI: 10.1063/5.0160326) about MLFF validation.
In this tutorial, we are going to refit an MLFF model based on existing ab-initio data stored in an ML_AB file. We assume that we finished the on-the-fly learning. With the ML_AB data we will analyze the training-set error and the test-set error of the obtained MLFF, i.e. we will evaluate this force-field.
1.2 Input¶
To fit a MLFF model based on existing ab-initio data, supply a ML_AB file
that contains the ab-initio data in addition to the VASP-standard-input files. During this tutorial, we will not perform any DFT calculation but rely on previously computed data. Check out the input at $TUTORIALS/MLFF/e01_error_analysis
:
INCAR.refit:
ML_LMLFF = TRUE
ML_MODE = refit
INCAR.run:
IBRION = -1 # no ionic update (default)
ML_LMLFF = TRUE
ML_MODE = run
Li H
1.00000000000000
4.0271000861999999 0.0000000000000000 0.0000000000000000
0.0000000000000000 4.0271000861999999 0.0000000000000000
0.0000000000000000 0.0000000000000000 4.0271000861999999
Li H
4 4
Direct
0.0021931998366077 0.9377313783581447 0.0399062356234574
0.9873787948250990 0.4401410952913111 0.5508732802249812
0.4655028962322062 0.0412653312320183 0.4666666187329044
0.5673178577048460 0.5946132074966266 0.9329094887162737
0.4620390886414084 0.4849929008378770 0.4952311142297970
0.4556333788011961 0.0088289694138672 0.0177579698840076
0.9786514408600437 0.4389104900251807 0.0101037431973456
0.9467029288350567 0.9708730319095862 0.5445141862785567
1x1x1
0
G
1 1 1
0 0 0
Collection of ab-initio data from previous calculations for LiH: Lattice vectors, atomic positions, energies, forces, and stress tensors.
Pseudopotentials for two atom types.
1.2.1 Refitting input¶
To do a refitting, the INCAR file only contains two tags: ML_LMLFF to switch on the machine-learning mode and ML_MODE to select the refitting mode. In general, refitting should always be done after a ML_AB file was obtained during on-the-fly training. This will generally result in a more accurate MLFF because the model weights are determined via linear regression using singular-value decomposition instead of evidence approximation. After refitting, VASP runs faster in production (with ML_MODE=run) because the more performant algorithms require refitting.
Why does the POSCAR file not influence the refitting?
Click to see the answer!
The used POSCAR will not influence the refitting procedure because the refit is based on the ab-initio data stored in the ML_AB file. The structure defined in the POSCAR file is used for a single prediction step, so that the energies and forces which are in the OUTCAR file will belong to it. But we don't care about these results in this exercise about analysing an MLFF.
We use a gamma-only KPOINTS file.
We have to supply a POTCAR file with the proper number of atom types for refitting but it will only be read and not used. The machine-learning algorithms will determine the atom types from the ML_AB file independent of the atom types defined by the POTCAR file.
1.2.1 Error-analysis input¶
The test set can be found in the folder $TUTORIALS/MLFF/test_set
. It contains POSCAR files in the folder $TUTORIALS/MLFF/test_set/structures
and the corresponding DFT data in $TUTORIALS/MLFF/test_set/DFT_data
. The test set consists of 50 independent structures. In practical applications, the test set should be chosen larger, i.e., roughly around 500 structures, depending on the final application of the force field.
1.3 Calculation¶
Determine the training-set error by executing the following commands in the terminal
cd $TUTORIALS/MLFF/e01_*
cp INCAR.refit INCAR
mpirun -np 4 vasp_gam
After refitting, VASP writes the force field to an ML_FFN file and information about the training-set error to the ML_LOGFILE. Check the documentation of the ML_LOGFILE file. How can you grep for the training-set error and print it to the stdout? What units do the training-set error of the energy, force and stress tensor have, respectively?
Click to see the answer!
To print the training-set error, run the following command in the bash shell
grep ERR ML_LOGFILE
The last line contains 5 columns. The third column shows the training-set error of the energy in eV, the fourth column is the force error in eV/Å, and the last column is the error in the stress tensor in kbar.
%%bash
grep ERR e01_*/ML_LOGFILE
Next, we want to analyse the test-set error and compare it to the training-set error. To do this, we have to apply the MLFF model on our test set. Compute energies, forces and stress tensor of the test set by executing the following commands:
cp ML_FFN ML_FF
cp INCAR.run INCAR
bash test_set_analysis.sh
This runs the following script
test_set_analysis.sh
#!/usr/bin/env bash
mkdir MLFF_data
for i in {1..50}; do
echo "test configuration " $i
cp ../test_set/structures/POSCAR.${i} POSCAR
mpirun -np 4 vasp_gam
cp ./vaspout.h5 ./MLFF_data/vaspout_${i}.h5
done
After this calculation has finished, we compute the errors between the DFT data and the MLFF data in the test set with py4vasp by running the following python cell. It will compute the test-set error for every configuration in the test set and plot it.
from py4vasp import MLFFErrorAnalysis
from py4vasp import plot
import numpy as np
# Compute the errors
mlff_error_analysis = MLFFErrorAnalysis.from_files(
dft_data="./test_set/DFTdata/*.h5",
mlff_data="./e01_error_analysis/MLFF_data/*.h5"
)
energy_error = mlff_error_analysis.get_energy_error_per_atom()
force_error = mlff_error_analysis.get_force_rmse()
stress_error = mlff_error_analysis.get_stress_rmse()
x = np.arange(len(energy_error))
plot(x, energy_error, ylabel="Energy Error per atom [eV]", xlabel="Configuration Number")
plot(x, force_error, ylabel="Force RMSE [eV/Å]", xlabel="Configuration Number")
plot(x, stress_error, ylabel="Stress RMSE [kbar]", xlabel="Configuration Number")
To compute the total test-set error averaged over the configurations in the test set, execute:
from py4vasp import MLFFErrorAnalysis
mlff_error_analysis = MLFFErrorAnalysis.from_files(
dft_data="./test_set/DFTdata/*.h5",
mlff_data="./e01_error_analysis/MLFF_data/*.h5",
)
energy_error = mlff_error_analysis.get_energy_error_per_atom(normalize_by_configurations=True)
force_error = mlff_error_analysis.get_force_rmse(normalize_by_configurations=True)
stress_error = mlff_error_analysis.get_stress_rmse(normalize_by_configurations=True)
print(energy_error,force_error,stress_error)
What do you conclude from the comparison of the test set-error and the training-set error in the forces?
Click to see the answer!
The training-set error in the forces is slightly lower compared to the test-set error. Since both errors are small and below 10meV/Å the MLFF model seems to be reliable. The test-set error is of the same order as the training-set error and hence the MLFF model generalizes to structures outside of the training set.
1.4 Questions¶
- What happened if the error analysis yields a low training-set error and a high test-set error?
- Is the POSCAR file required when ML_MODE=refit?
- How is the linear-regression problem addressed when ML_MODE=refit?
2 Optimize the accuracy by hyperparameter tuning ¶
2.1 Task¶
Hyperparameter are user-defined parameters of the MLFF model that will not be optimized during training of the MLFF. Hyperparameter optimization can improve the accuracy and at the same time the performance of the force field. Examples for such parameters are the cutoff radii (ML_RCUT1 and ML_RCUT2) within which the descriptors of the local atomic environments are computed. For further information about hyperparameters and their optimization, we recommend Hyperparameter optimization by M. Feurer and F. Hutter. In our experience, the hyperparameters listed in the following table are the most important to consider in order to optimize the accuracy and performance:
parameter | function | class |
---|---|---|
ML_RCUT1 | cutoff radius 2 body | descriptor |
ML_RCUT2 | cutoff radius 3 body | descriptor |
ML_WTIFOR | weight factor atomic forces | fitting |
ML_WTOTEN | weight factor forces | fitting |
ML_WTSIF | weight factor stress tensor | fitting |
ML_EPS_LOW | sparse factor local reference configurations | sparse |
ML_RDES_SPARSDES | sparse factor descriptors | sparse |
This is only a small subselection of hyperparameters available in VASP. To find more information about these hyperparameters and to find a full list of the parameters available in VASP, please check the VASP Wiki.
In this exercise, you are going to scan the ML_RCUT1 parameter and try to improve your force field.
Cross-validation) is a method to judge the quality of the applied hyperparameters. As a preparatory step, make sure that there is no overfitting by ensuring the training-set error is not significantly smaller than the test-set error. Next, optimize the hyperparameters by varying them systematically. It is unlikely to improve the quality of the force field by randomly changing hyperparameters. Then, do a repeated comparison of training-set and test-set error with different hyperparameters as shown in the previous exercise.
For instance, to find the optimal value for ML_RCUT1, we compute the training-set errors (ML_LOGFILE file) and force fields (ML_FF file) for several values of ML_RCUT1. The ML_FF files allow to do an error analysis on the test set using py4vasp like in the previous exercise. To reduce computational time and disk usage, it is sufficient to consider only the training-set error during the parameter scan. Then, one analyzes the test set only on the MLFF model with the lowest training-set error.
2.2 Input¶
Check out the input at $TUTORIALS/MLFF/e02_hyperparameter_scan
.
INCAR.refit:
ML_LMLFF = TRUE
ML_MODE = refit
ML_RCUT1 = #vary this value
KSPACING = 300
INCAR.run:
IBRION = -1 # no ionic update (default)
ML_LMLFF = TRUE
ML_MODE = run
KSPACING = 300
Click to see the structure!
Li H
1.00000000000000
4.0271000861999999 0.0000000000000000 0.0000000000000000
0.0000000000000000 4.0271000861999999 0.0000000000000000
0.0000000000000000 0.0000000000000000 4.0271000861999999
Li H
4 4
Direct
0.0021931998366077 0.9377313783581447 0.0399062356234574
0.9873787948250990 0.4401410952913111 0.5508732802249812
0.4655028962322062 0.0412653312320183 0.4666666187329044
0.5673178577048460 0.5946132074966266 0.9329094887162737
0.4620390886414084 0.4849929008378770 0.4952311142297970
0.4556333788011961 0.0088289694138672 0.0177579698840076
0.9786514408600437 0.4389104900251807 0.0101037431973456
0.9467029288350567 0.9708730319095862 0.5445141862785567
ML_AB (will be copied from exercise 1)
Collection of ab-initio data from previous calculations for LiH: Lattice vectors, atomic positions, energies, forces, and stress tensors.
Pseudopotentials for two atom types.
The two central input files are the ML_AB file and the INCAR file. While VASP requires the POSCAR and the POTCAR files, they do not affect the hyperparameter scan. We avoid using a KPOINTS file by including the KSPACING tag in the INCAR file. In any case, the reciprocal lattice is irrelevant when refitting a force field.
We are using the ML_AB file from the previous exercise. Here, the ab-initio data contained in the ML_AB file will be refitted several times with different cutoff radii for the two-body descriptor. Create the following script and run it using bash run_hyper_param_scan.sh
to create an INCAR file
to scan over ML_RCUT1 and run VASP:
run_hyperparam_scan.sh
#!/usr/bin/env bash
for rcut in 13.0 14.0 15.0
do
echo "RCUT1" $rcut
# make INCAR file on-the-fly
cat << EOF > INCAR
ML_LMLFF = TRUE
ML_MODE = refit
ML_RCUT1 = ${rcut}
KSPACING = 300
EOF
mpirun -np 4 vasp_gam
cp ML_LOGFILE ML_LOGFILE_${rcut}
cp ML_FFN ML_FF_${rcut}
done
2.3 Calculation¶
To execute the parameter scan for ML_RCUT1, execute the run_hyperparam_scan.sh script.
cd $TUTORIALS/MLFF/e02_*
ln -s ../e01_*/ML_AB .
bash run_hyperparam_scan.sh
After running the calculation, plot the training-set error over the proposed cutoff radii by entering
bash extract_training_error.sh
This will run the following script to extract the training-set error from the ML_LOGFILEs extract_training_error.sh
#!/usr/bin/env bash
rm training_error.dat
# write header to file
echo "# rcut energy force stress " > training_error.dat
for rcut in 12.0 13.0 14.0
do
# extract energy, force, stress training-set
# errors from ML_LOGFILE and
# write to output file
grep ERR ML_LOGFILE_${rcut} | tail -n 1 | awk -v r=${rcut} '{print r, $3, $4, $5}'
done >> training_error.dat
%%bash
cat e02_*/training_error.dat
Now, there is a file named training_error.dat that you can plot by executing the code provided below. In addition to the three data points that were computed, there is also a file called training_error_store.dat which contains more data points. The tutorial is reduced to just three data points in order to save time.
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
data = np.loadtxt("./e02_hyperparameter_scan/training_error.dat")
data_store = np.loadtxt("./e02_hyperparameter_scan/training_error_store.dat")
fig1 = px.scatter(x=data_store[:,0], y=data_store[:,1], labels={"x":"ML_RCUT1 [Å]", "y":"RMSE Energy [eV]"},color_discrete_sequence=['blue'])
fig2 = px.line(x=data_store[:,0], y=data_store[:,1],color_discrete_sequence=['blue'])
# plotting your data
fig3 = px.scatter(x=data[:,0], y=data[:,1], color_discrete_sequence=['red'] )
fig4 = px.line(x=data[:,0], y=data[:,1],color_discrete_sequence=['red'])
fig5 = go.Figure(data=fig1.data + fig2.data + fig3.data + fig4.data, layout = fig1.layout)
fig5.show()
fig1 = px.scatter(x=data_store[:,0], y=data_store[:,2], labels={"x":"ML_RCUT1 [Å]", "y":"RMSE Force [eV/Å]"},color_discrete_sequence=['blue'])
fig2 = px.line(x=data_store[:,0], y=data_store[:,2],color_discrete_sequence=['blue'])
fig3 = px.scatter(x=data[:,0], y=data[:,2], color_discrete_sequence=['red'] )
fig4 = px.line(x=data[:,0], y=data[:,2],color_discrete_sequence=['red'] )
fig5 = go.Figure(data=fig1.data + fig2.data + fig3.data + fig4.data, layout = fig1.layout)
fig5.show()
fig1 = px.scatter(x=data_store[:,0], y=data_store[:,3], labels={"x":"ML_RCUT1 [Å]", "y":"RMSE Stress [kBar]"},color_discrete_sequence=['blue'])
fig2 = px.line(x=data_store[:,0], y=data_store[:,3],color_discrete_sequence=['blue'])
fig3 = px.scatter(x=data[:,0], y=data[:,3],color_discrete_sequence=['red'] )
fig4 = px.line(x=data[:,0], y=data[:,3],color_discrete_sequence=['red'])
fig5 = go.Figure(data=fig1.data + fig2.data + fig3.data + fig4.data, layout = fig1.layout)
fig5.show()
The plot contains three different errors. The energy error, the force error and the error of the stress tensor. The red part of the plot shows the data points computed during this tutorial. In principle, it is always good to have all three errors as small as possible. But there might also be cases where the user has to pay special attention to one of them. For example, for the computation of defect-formation energies, the MLFF model should be optimized to minimize the energy error. In contrast, phonon frequencies require a MLFF model with minimal errors of the forces. To pay special attention to one of the errors, weights for the quantity of interest can be set during fitting. To increase the importance of the total energy during fitting, adjust the value of ML_WTOTEN. Similarly, the importance of the forces is set by ML_WTIFOR and the weight factor for the stress tensor by ML_WTSIF. Focus on the stress tensor if you are interested in an exact prediction of the volume.
From the plot we see that the model with ML_RCUT1=13 is the most accurate. Thus, we keep ML_ABN_13.0 and ML_FFN_13.0 for further analysis. In particular, we repeat the steps of the test-set analysis. Execute the following command in the bash shell:
ln -s ML_FF_13.0 ML_FF
cp INCAR.run INCAR
bash test_set_analysis.sh
This will run the script test_set_analysis.sh
#!/usr/bin/env bash
mkdir MLFF_data
for i in {1..50}
do
echo "test configuration " $i
cp ../test_set/structures/POSCAR.${i} POSCAR
mpirun -np 4 vasp_gam
cp ./vaspout.h5 ./MLFF_data/vaspout_${i}.h5
done
from py4vasp import MLFFErrorAnalysis
mlff_error_analysis = MLFFErrorAnalysis.from_files(
dft_data="./test_set/DFTdata/*.h5",
mlff_data="./e02_hyperparameter_scan/MLFF_data/*.h5",
)
energy_error = mlff_error_analysis.get_energy_error_per_atom(normalize_by_configurations=True)
force_error = mlff_error_analysis.get_force_rmse(normalize_by_configurations=True)
stress_error = mlff_error_analysis.get_stress_rmse(normalize_by_configurations=True)
print(energy_error,force_error,stress_error)
In principle, one should repeat this procedure for every hyperparameter available in VASP to obtain the best MLFF model; however in practice, it is not possible to scan all hyperparameters. Usually VASP has reasonable default choices for these parameters because they were optimized on a selected set of bulk materials and on a molecular data base. Depending on the intended amount of production calculations for your force field, it might be worth considering the parameters of the table above.
2.4 Questions¶
- What are hyperparameters?
- In this exercise, we optimized the hyperparameters. Compared to the previous exercise, how much did this optimization improve the error in the forces?
- Can hyperparameter optimization increase the performance of the force field?
3 Optimization of MLFF performance ¶
3.1 Task¶
Optimize the performance of a force field based on an existing ML_AB file.
Efficient sparsification increases the performance of a force field. For instance, VASP can reduce the number of angular descriptors of the MLFF model using the CUR algorithm (ML_LSPARSDES). The descriptor-covariance matrix A is decomposed into a product of three matrices, $C$, $U$, and $R$, where $C$ consists of a small number of actual columns of $A$, $R$ consists of a small number of actual rows of $A$, and $U$ is a small unitary matrix that guarantees that the product CUR is close to A. While the original CUR algorithm was developed to efficiently select a few significant columns of the matrix A, we use it to search for insignificant angular descriptors and remove them. This decision is based on the leverage scores and ML_RDES_SPARSDENS.
For detailed information on the CUR algorithm, check out Mahoney et al., Comp. Sci. 106(3), 697-702 (2009) and Jinnouchi et al., J. Chem. Phys. 152, 234102 (2020).
Removing descriptors increases the performance of the force field at the cost of its accuracy. To find an optimal tradeoff between accuracy and performance one can perform a Pareto front that has the timing of the code on the x-axis and the test-set accuracy on the y-axis of the plot. From the Pareto front you can select the optimal value for ML_RDES_SPARSDES. It requires the following steps for each value of ML_RDES_SPARSDES:
refit the MLFF model with a certain ML_RDES_SPARSDES value
determine the test-set accuracy of the MLFF model (as in exercise 1)
determine the timing of the MLFF model
To obtain meaningful timings, you need to average the timing of the force-evaluation routine over several MD steps. Moreover, writing to files is usually very slow and hence this task has to be eliminated during a timing run. This is achieved by setting ML_OUTBLOCK and ML_OUTPUT_MODE.
3.2 Input¶
Check out the input at $TUTORIALS/MLFF/e03_sparsification
!
INCAR.sparse:
ML_LMLFF = True
ML_MODE = refit
# the previously determined cutoff is kept
ML_RCUT1 = 13.0
ML_LSPARSDES = True
ML_RDES_SPARSDES = 0.5
KPACING = 300
INCAR.run:
IBRION = -1 # no ionic update (default)
ML_LMLFF = TRUE
ML_MODE = run
KSPACING = 300
INCAR.timing:
# molecular dynamics (MD)
IBRION = 0 # switch on MD
NSW = 100 # number of MD steps
POTIM = 2.0 # time step in fs
# NVT ensemble
MDALGO = 3 # Langevin thermostat
TEBEG = 300 # temperature
LANGEVIN_GAMMA = 3.0 3.0 # friction coefficients
ISIF = 2 # const. volume
RANDOM_SEED = 10 0 0
# production run
ML_LMLFF = .TRUE.
ML_MODE = RUN
ML_OUTBLOCK = 100
ML_OUTPUT_MODE = 0
KPACING = 300
Click to see the structure!
Li H
1.00000000000000
4.0271000861999999 0.0000000000000000 0.0000000000000000
0.0000000000000000 4.0271000861999999 0.0000000000000000
0.0000000000000000 0.0000000000000000 4.0271000861999999
Li H
4 4
Direct
0.0021931998366077 0.9377313783581447 0.0399062356234574
0.9873787948250990 0.4401410952913111 0.5508732802249812
0.4655028962322062 0.0412653312320183 0.4666666187329044
0.5673178577048460 0.5946132074966266 0.9329094887162737
0.4620390886414084 0.4849929008378770 0.4952311142297970
0.4556333788011961 0.0088289694138672 0.0177579698840076
0.9786514408600437 0.4389104900251807 0.0101037431973456
0.9467029288350567 0.9708730319095862 0.5445141862785567
Collection of ab-initio data from previous calculations for LiH: Lattice vectors, atomic positions, energies, forces, and stress tensors.
Pseudopotentials for two atom types.
In this exercise, we use two different INCAR files:
- The INCAR.sparse does a sparsification of the three-body descriptors and refits the force field with the reduced number of descriptors. We set ML_RCUT1=13.0 because in the last exercise we determined it to be the optimal value.
- The INCAR.run computes the energy, forces and stress tensor for a given POSCAR
We are reusing the ML_AB file from the previous exercises to have access to ab-initio data.
The POSCAR file defines the initial structure for the MD run and is not used during the sparsification. The POTCAR file is not used in both cases.
3.3 Calculation¶
Perform the sparsification by entering the following into the terminal:
cd $TUTORIALS/MLFF/e03_*
ln -s ../e01_*/ML_AB
cp INCAR.sparse INCAR
mpirun -np 4 vasp_gam
cp ML_LOGFILE ML_LOGFILE.sparse
cp ML_FFN ML_FF.sparse
Now, we can compare the number of descriptors of the accurate model from the previous exercise with the sparsified model.
The number of descriptors can be found in the ML_LOGFILE with the keyword NDESC
. Enter the grep command as follows:
grep "NDESC" ML_LOGFILE.sparse
%%bash
grep "NDESC" e03_*/ML_LOGFILE.sparse
The number of descriptors can be found in the last line of the output. Compare the number of descriptors before and after sparsification by executing the following in the bash shell:
grep NDESC ../e02_*/ML_LOGFILE_13.0 | tail -n 1
grep NDESC ML_LOGFILE.sparse | tail -n 1
%%bash
grep NDESC e02_*/ML_LOGFILE_13.0 | tail -n 1
grep NDESC e03_*/ML_LOGFILE.sparse | tail -n 1
From the output we can deduce that the number of radial descriptors is the same in both cases, namely 24. The number of angular descriptors cut in half minus 1 from 452 to 225 for both atom types.
Next, we analyse the obtained force field in the same way as in exercise 1. We have to loop over the structures of our test set and store the vaspout.h5 files. Then, we execute the error analysis in py4vasp.
cp INCAR.run INCAR
ln -s ML_FF.sparse ML_FF
bash test_set_analysis.sh
from py4vasp import MLFFErrorAnalysis
mlff_error_analysis = MLFFErrorAnalysis.from_files(
dft_data="./test_set/DFTdata/*.h5",
mlff_data="./e03_sparsification/MLFF_data/*.h5",
)
energy_error = mlff_error_analysis.get_energy_error_per_atom(normalize_by_configurations=True)
force_error = mlff_error_analysis.get_force_rmse(normalize_by_configurations=True)
stress_error = mlff_error_analysis.get_stress_rmse(normalize_by_configurations=True)
print(energy_error,force_error,stress_error)
The following table shows the test-set errors for the three force fields you trained so far:
standard MLFF | MLFF ML_RUCT1 scan | MLFF sparse |
---|---|---|
8.5E-3 eV/Å | 7.9E-3 eV/Å | 8.2E-3 eV/Å |
What can you conclude from the test-set errors in the table?
Click to see the answer!
The sparsification increases the test-set error slightly but reduces the number of angular descriptors by half for every atom type.
The subsequent phase involves evaluating the performance of the MLFF models. This analysis is strongly affected by the number of jobs running simultaneously on the compute node, which significantly influences the timings. Nevertheless, let us discuss how you could analyze the performance of the MLFF models.
First, reduce the I/O of your calculation. The INCAR.timing shows how to reduce the output written during a timed MD run. To run the calculation, you would copy INCAR.timing to INCAR and execute VASP.
Run calculations for different MLFF that you want to compare and save the OUTCAR file.
Finally, average the timing for an MD step by grepping the line
LOOP+
from the OUTCAR file that contains the CPU time and averaging over the CPU time. The following bash lines achieves this for OUTCAR.sparse_timing and OUTCAR.no-sparse_timing, which we have prepared for you in advance. Enter the following into the Terminal:
grep LOOP+ OUTCAR.sparse_timing | awk '{sum+=$7;n++} END {print sum/n}'
grep LOOP+ OUTCAR.no-sparse_timing | awk '{sum+=$7;n++} END {print sum/n}'
Note that here we are using a rather small database of ab-initio data (ML_AB file). The larger the database, the more the sparsification impacts the timing. The maximum achievable speed-up factor is ML_RDES_SPARSDES.
%%bash
grep LOOP+ e03_*/OUTCAR.sparse_timing | awk '{sum+=$7;n++} END {print sum/n}'
grep LOOP+ e03_*/OUTCAR.no-sparse_timing | awk '{sum+=$7;n++} END {print sum/n}'
3.4 Questions¶
- How can you find an optimal tradeoff between performance and accuracy?
- What does ML_RDES_SPARSDES specify?
- What output is written if ML_OUTPUT_MODE=1?
In the last part of this exercise, the obtained sparse MLFF model is used for a small production run. Since the model is now properly trained and all the parameters are stored in the ML_FF file, there is no need anymore to set them in the INCAR file. So, the only tags in the INCAR file which have to be set for the finished MLFF model are ML_LMLFF=TRUE and ML_MODE=run. To remember which hyperparameters were used during the creation of the MLFF model they are stored in the first line of the ML_FF file. The rest of the ML_FF file is stored in a binary format. To check the hyperparameters of the ML_FF model execute the following command
cd $TUTORIALS/MLFF/e04_*
ln -s ../e03_*/ML_FF.sparse ML_FF
head -n 1 ML_FF
which should produce the following output:
ML_FF 0.2.1 binary { "date" : "2023-09-22T14:43:08.294", "ML_LFAST" : True, "ML_DESC_TYPE" : 0, "types" : [ "Li", "H" ],
"training_structures" : 122, "local_reference_cfgs" : [ 737, 752 ], "descriptors" : [ 249, 249 ], "ML_IALGO_LINREG" : 4,
"ML_RCUT1" : 1.3000E+01, "ML_RCUT2" : 5.0000E+00, "ML_W1" : 1.0000E-01, "ML_SION1" : 5.0000E-01, "ML_SION2" : 5.0000E-01,
"ML_LMAX2" : 3, "ML_MRB1" : 12, "ML_MRB2" : 8, "ML_IWEIGHT" : 3, "ML_WTOTEN" : 1.0000E+00, "ML_WTIFOR" : 1.0000E+00,
"ML_WTSIF" : 1.0000E+00 }
You see that we optimized ML_RCUT1 to a value of 13.0 and that the number of descriptors is [249, 249] for the atom types Li and H, respectively.
To use our MLFF model for a production run, we are going to run a molecular-dynamics (MD) simulation and compute the pair-distribution function. The pair-distribution function is related to the static structure factor by means of a Fourier transform. The static structure factor is an interesting quantity because it can be directly obtained from scattering experiments such as X-ray or neutron diffraction experiments. Therefore, the pair distribution gives us the possibility to validate our model on experimental data. For further information on the structure factor check L. Van Hove, Phys. Rev. 95, 249 (1954).
In this part of the tutorial we will determine the pair distribution of LiH at a temperature of 450K and a pressure of 1bar. To maintain the temperature and pressure we use a Langevin thermostat.
4.2 Input¶
Check out the input at $TUTORIALS/MLFF/e04_production
.
# molecular dynamics (MD)
IBRION = 0 # switch on MD
NSW = 2000 # number of MD steps
POTIM = 2.0 # time step in fs
# NVT ensemble
MDALGO = 3 # Langevin thermostat
TEBEG = 450 # temperature
LANGEVIN_GAMMA = 0.1 0.1 # friction coefficients
ISIF = 2 # const. volume
RANDOM_SEED = 10 0 0
# production run
ML_LMLFF = .TRUE.
ML_MODE = RUN
ML_OUTPUT_MODE = 1 # Ensures that PCDAT is printed
# Misc
KPACING = 300
ISYM = -1
LiH
1.0000000000000000
7.8287847029081057 0.0184022147739740 0.0436744842177140
-0.0000000000059867 7.8831855213703577 0.0109444497488786
-0.0000000000095877 -0.0000000000034072 7.6332104165900363
Li H
32 32
Direct
1.9817054671128955 -0.4615590061054727 -1.4210973981086219
1.9908342020561749 -0.4245884677547543 -1.9403613271831182
2.0291987979331716 0.0764675410482946 -1.4473190803670182
2.0003819712963882 0.0434600233223213 -1.9372585128028623
1.5123565198129114 -0.4631422998001105 -1.4223489601879342
1.5009761278969092 -0.4453976364291092 -1.9415619597553522
1.4994078683320884 0.0453111416662948 -1.4395625528728533
1.4920787358153047 0.0749759754227328 -1.9554193908340134
1.4771192200171763 -0.6956853853527750 -1.1740875527008734
1.4857876526841061 -0.6917686740837428 -1.6720984387748790
1.4982238953402405 -0.2052814837011414 -1.2032635147179722
1.5076711504068163 -0.1525020681581678 -1.6625242247821244
2.0237745834200087 -0.6865011313240997 -1.1880793891256338
1.9619016330665227 -0.6920859965869365 -1.7098418060597396
1.9848255490801441 -0.1805732277990417 -1.1730681923310977
1.9930394513513248 -0.1840355873006903 -1.6854182262485986
1.2413563659888442 0.0625339636088172 -1.1946327025499690
1.2718302311426040 0.0484902467170686 -1.6864281291080219
1.2314950738872033 -0.4199584043288853 -1.1989048647882197
1.2661536447359654 -0.4379143690275620 -1.6987405959330355
1.7628840424723418 0.0709537191751443 -1.2065539892018160
1.7525527647373844 0.0770286007123581 -1.6762495528836525
1.7485055244828722 -0.4215982120264667 -1.1845971178647334
1.7448584151990505 -0.4067962476727136 -1.6789000393433424
1.2150187769175835 -0.6663000573428698 -1.9216020378849861
1.2524959087735665 -0.6812575935175010 -1.4462894477063559
1.2531731761067684 -0.1867855213621671 -1.9291100549777591
1.2259867851441322 -0.1856967875773643 -1.4252590344016063
1.7493982411143223 -0.6973680885139401 -1.9354784312219775
1.7260537783199135 -0.7018034121966544 -1.4161537454728923
1.7208441337659361 -0.1685614662605335 -1.9438107029069940
1.7333301563888721 -0.1708630537252926 -1.4381194728742182
1.2649257565892971 -0.6556014343128211 -1.1817810013660568
1.2585150056419852 -0.6950521445902047 -1.7022316528699457
1.2366899715948825 -0.2016212867097995 -1.1889179623891479
1.2435972373518811 -0.1674805421763512 -1.6746199753397368
1.7360565085394983 -0.7014677640229965 -1.1806409696057392
1.7359989851552748 -0.7073444407917674 -1.6708498403301331
1.7474656103131458 -0.1777402172080236 -1.1826824484643708
1.7624647683600680 -0.1757906844274497 -1.6801347751005680
1.2674652456927549 0.0503884660816078 -1.4356097484617691
1.2310466249001411 0.0327844150204245 -1.9403971909709119
1.2098635737050121 -0.4263929003423095 -1.4182292509337020
1.2265347871913885 -0.4279538904443739 -1.9537757614594222
1.7283483994297175 0.0558563680517394 -1.4483120337520408
1.7756017110881346 0.0672337699055441 -1.9278288436644531
1.7750253655838191 -0.4648904046255498 -1.4549195521056619
1.7295923315567368 -0.4219817905587360 -1.9470964548760867
1.4770209675407064 -0.6938438306751112 -1.4335017836486030
1.4723646353634945 -0.6770195558022604 -1.8958872528692552
1.4723304891734692 -0.2127593172465675 -1.4473881729001288
1.4891149694386838 -0.1626854203039640 -1.9603811580254049
1.9926313874455659 -0.6908264960781072 -1.4497987024242049
1.9867827816577046 -0.6980021465438837 -1.9437561167974060
1.9931966385787605 -0.1915982553314239 -1.4347752030340506
1.9783410196089279 -0.1914865493283802 -1.9326203981603027
1.4764840292260042 -0.4461711698868597 -1.1775341299154893
1.4875834918595106 -0.4154993260957937 -1.6775044859519459
1.4810701055683364 0.0320617038904891 -1.2116883860657408
1.5095460900056492 0.0791435657918060 -1.6881031427292668
1.9825393542246055 -0.4577940687402631 -1.1998173435832227
1.9904498339985650 -0.4649735127116422 -1.6891634696973543
2.0269030501276393 0.0453801235753682 -1.1771020717088660
1.9888483400537196 0.0679206981683836 -1.6933672606587671
Force field for LiH.
Pseudopotentials for two atom types.
Check the meaning of all INCAR tags by referring to the VASP Wiki!
The POSCAR used during this exercise was already pre-equilibrated and contains starting velocities.
We want to use the force field we have created in the previouse exercise. Link the file, to avoid duplicating a large file, with the following command:
cd $TUTORIALS/MLFF/e04_*
ln -s ../e03_*/ML_FF.sparse ML_FF
The POTCAR file is again a dummy file because we do a pure MLFF model run. It has to be present but its content is of minor importance.
4.3 Calculation¶
To execute the production run for the perfomant MLFF model, execute the following commands
cd $TUTORIALS/MLFF/e04_*
mpirun -np 4 vasp_std
First, plot the energy and temperature to confirm that the energy is conserved during the MD run and the temperature is fluctuating around the desired value.
from py4vasp import Calculation
my_calc = Calculation.from_path("./e04_production")
my_calc.energy[:].plot("TOTEN temperature")
If both lines are fluctuating around an average value, we have determined that the MD run was executed properly.
Next, plot the pair-distribution function!
my_calc.pair_correlation.plot()
How could you check the energy and temperature without py4vasp?
Click to see the answer!
You could extract the energy and temperature from the OSZICAR file and then plot them with your favorite tool. For instance, first enter the following commands into the terminal:
grep T= OSZICAR | awk '{print $7}' > energy.out
grep T= OSZICAR | awk '{print $3}' > temperature.out
Then, open gnuplot and run
p 'energy.out' w lp
How could you check the pair-distribution function without py4vasp?
Click to see the answer!
Compute the pair-distribution function with the following bash script:
extract_PCDAT.sh
#!/usr/bin/env bash
file=PCDAT
awk <$file >PCDAT.xy '
NR==8 { pcskal=$1}
NR==9 { pcfein=$1}
NR==7 { npaco=$1}
NR>=13 {
line=line+1
if (line==1) s=s+1
if (line==(npaco+1)) {
print " "
line=0
}
else {
a1[line]= a1[line] + $1
a2[line]= a2[line] + $2
a3[line]= a3[line] + $3
a4[line]= a4[line] + $4
print (line-0.5)*pcfein/pcskal,$1,$2, $3, $4, $5
}
}'
The script will create a file called PCDAT.xy, e.g., using gnuplot
set xlabel "r [Å]"
set ylabel "g(r)"
p 'PCDAT.xy' w l
Good job! You have finished Part 1!¶
After doing this tutorial you have accomplished the following tasks:
- computing training and test set errors and interpreting them
- fine tuning the hyperparameters of your model to increase the accuracy of your MLFF model
- timing the force evaluation of the MLFF
- optimizing the performance of the MLFF model by sparsifying the angular descriptor
- using the produced MLFF model to compute real quantities which are comparable to experiment