ML_ISTART
Latest revision as of 14:40, 19 October 2023
ML_ISTART = [integer]
Default: ML_ISTART = 0
Warning: This tag is deprecated; we advise using ML_MODE instead.
Description: This tag selects the mode of operation (e.g. start from scratch, prediction only) of the machine learning force field method.
If the machine learning force field method is enabled via ML_LMLFF = .TRUE., this tag specifies the mode of operation for the VASP run. The following values can be selected:
- ML_ISTART = 0: On-the-fly learning is enabled, starting from scratch. Force predictions from the machine learning force field drive the MD simulation. However, if the error estimate computed in each time step indicates a high force error, an ab initio calculation is performed instead, and the collected energy, forces and stress are used to improve the machine learning force field. Because no force field is available at the beginning of the MD run, ab initio calculations happen frequently at first.
- ML_ISTART = 1: Same as ML_ISTART = 0, but taking pre-existing ab initio data into account. This is the usual choice for continuing a previous MD simulation with machine learning activated. Before the MD run starts, the ML_AB file (copied from the ML_ABN file of a previous run) is read, and the ab initio energies, forces and stresses it contains are used to generate an initial force field. Note that this preparatory learning step adopts the previous choice of local reference configurations, i.e. the reference atomic environments entering the kernel are taken from a list in the ML_AB file. The MD simulation is then started with on-the-fly learning enabled. The structures in the ML_AB file do not need to match the current starting configuration in the POSCAR file in terms of simulation box, elements present or number of atoms. However, if the same elements appear, the initial force field is used for predictions. In any case, the provided training data is included in the final machine learning force field, i.e. the ML_FFN file defines a force field applicable to both the structures in the ML_AB file and the current MD simulation. By restarting repeatedly with ML_ISTART = 1, each time providing the ML_AB file from the last run, it is possible to iteratively extend the applicability of the resulting machine learning force field, e.g. by exploring different temperature ranges or element compositions.
Tip: Setting ML_ISTART = 1 together with NSW = 0 repeats the learning on the given training data and creates a new force field in ML_FFN without performing additional MD steps. In this way, force field parameters (e.g. cutoff radii, number of radial basis functions, etc.) can be varied without recalculating the entire trajectory. Moreover, because Bayesian error estimation is not required when no MD is run, it is possible to switch the regression algorithm via the tag ML_IALGO_LINREG and check whether this gives better fitting results. To prevent the starting structure in the POSCAR file from being processed and possibly added to the training data, set ML_CTIFOR to a large value (e.g. 1000).
- ML_ISTART = 2: Prediction only. In this mode, the previously trained machine learning force field is read from the ML_FF file. The MD simulation is driven by predictions from the force field only; no ab initio calculations are performed and no learning is executed. However, to monitor the quality of the predictions, the Bayesian error estimate of the forces is still computed and logged in the ML_LOGFILE. This setting is typically used when the machine learning force field is considered mature and ready for production runs.
- ML_ISTART = 3: Learning from given ab initio data only, no MD time steps. In this mode, a new machine learning force field is generated from the ab initio data provided in the ML_AB file. The structures are read in and processed one by one, as if harvested from an MD simulation. In other words, the same steps are performed as in on-the-fly training, but the source of data is the series of structures available in ML_AB rather than an MD run. This mode can be used to generate VASP machine learning force fields from pre-computed or external ab initio data sets. At first glance, ML_ISTART = 3 looks very similar to the combination of ML_ISTART = 1 and NSW = 0 described above. However, there is an important difference: ML_ISTART = 3 ignores the list of local reference configurations in the ML_AB file and instead determines a new collection, which is written to the resulting ML_ABN file.
Tip: If calculations with ML_ISTART = 3 are too time-consuming with the default settings, it can help to increase ML_MCONF_NEW to a value of around 10-16 and to set ML_CDOUB = 4. This often accelerates the calculations by a factor of 2-4.
- ML_ISTART = 4: The force field is refitted based on an existing ML_AB file, with the number of local reference configurations taken from that file. The value of NSW in the input is ignored and only a single step is executed. No ab initio calculation is carried out.
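As an illustration of the refitting workflow described in the tip under ML_ISTART = 1, a minimal INCAR fragment might look as follows. This is only a sketch built from the tags named above: it assumes that an ML_AB file (copied from the ML_ABN file of a previous run) is present in the working directory, and all other settings of the calculation are omitted. The value 1000 for ML_CTIFOR is the example value from the tip, not a tuned recommendation.

```
# Sketch: repeat learning on existing training data without running MD.
# Assumes ML_AB (copied from a previous ML_ABN) is in the working directory.
ML_LMLFF  = .TRUE.   # enable the machine learning force field method
ML_ISTART = 1        # read ML_AB and reuse its local reference configurations
NSW       = 0        # no MD steps, only the learning step is performed
ML_CTIFOR = 1000     # large threshold so the POSCAR structure is not added to the training data
```

After such a run, the new force field is written to ML_FFN. To have the local reference configurations selected anew instead, the same fragment would use ML_ISTART = 3, optionally combined with ML_MCONF_NEW and ML_CDOUB as suggested in the tip for that mode.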
Related tags and articles
ML_LMLFF, ML_IALGO_LINREG, ML_IWEIGHT, ML_ICRITERIA, ML_IREG, ML_LSPARSDES, ML_ISCALE_TOTEN, ML_LCOUPLE, ML_LHEAT, ML_LEATOM, ML_MB, ML_MCONF