OOM during Wannier90 AMN computation
Posted: Tue Nov 05, 2024 8:05 am
by francesco_martinelli
Hi all,
I'm trying to produce the input files for a Wannierization, starting from a converged SOC calculation.
Unfortunately, the calculation runs out of memory (OOM) while computing the AMN projections.
Below are the INCAR and the calculation output up to the point where it stops.
It would be great if someone could help me solve this problem.
Best,
Francesco
Code:
Workflow INCAR
#General tags
PREC = Accurate
ENCUT = 600
EDIFF = 1E-8
LORBMOM = TRUE
#KPAR = 4
#NCORE = 8
GGA = PE
ISMEAR = 0
SIGMA = 0.01
NELM = 300
LORBIT = 11
NBANDS = 128
LDAUPRINT = 2
LMAXMIX = 4
LASPH = TRUE
#DOS
NEDOS = 2001
#Magnetism
LSORBIT = TRUE
ISYM = -1
SAXIS = 0 0 1
ISPIN = 2
LNONCOLLINEAR = TRUE
RWIGS = 1.9 1.0 1.5 0.4
MAGMOM = 6*0 3*0 0.0 0.0 0 18*0
#Wannier90
LWANNIER90 = TRUE
NUM_WANN = 6
ICHARG = 11
LCHARG = FALSE
WANNIER90_WIN = "
exclude_bands = 1-76,83-128
guiding_centres = T
begin projections
Re:dxz,dyz,dxy
end projections
dis_win_min = 3
dis_win_max = 7
write_hr = true
write_u_matrices = true
write_xyz = true
num_iter = 0
conv_tol = 1E-9
conv_window = 10
dis_num_iter = 0
dis_conv_tol = 1E-9
dis_conv_window = 10
bands_plot = true
begin kpoint_path
G .0 .0 .0 X .5 .0 .5
X .5 .0 .5 W .5 .25 .75
W .5 .25 .75 L .5 .5 .5
L .5 .5 .5 G .0 .0 .0
G .0 .0 .0 K .375 .375 .75
end kpoint_path
bands_num_points 40"
Code:
running 128 mpi-ranks, with 1 threads/rank, on 1 nodes
distrk: each k-point on 128 cores, 1 groups
distr: one band on 1 cores, 128 groups
vasp.6.4.2 20Jul23 (build Jul 22 2024 14:06:31) complex
POSCAR found type information on POSCAR BaMgReO
POSCAR found : 4 types and 10 ions
Reading from existing POTCAR
scaLAPACK will be used
Reading from existing POTCAR
LDA part: xc-table for Pade appr. of Perdew
WARNING: stress and forces are not correct
POSCAR, INCAR and KPOINTS ok, starting setup
FFT: planning ... GRIDC
FFT: planning ... GRID_SOFT
FFT: planning ... GRID
WAVECAR not read
reading imaginary part of occupancies ...
charge-density read from file: unknown
reading imaginary part of occupancies ...
magnetization density read from file 1
reading imaginary part of occupancies ...
magnetization density read from file 2
reading imaginary part of occupancies ...
magnetization density read from file 3
entering main loop
N E dE d eps ncg rms rms(c)
DAV: 1 0.311775800494E+03 0.31178E+03 -0.52590E+04131072 0.145E+03
DAV: 2 -0.695837669671E+02 -0.38136E+03 -0.38127E+03196608 0.208E+02
DAV: 3 -0.727009882628E+02 -0.31172E+01 -0.31172E+01131072 0.276E+01
DAV: 4 -0.727166305005E+02 -0.15642E-01 -0.15642E-01262144 0.354E+00
DAV: 5 -0.727166532600E+02 -0.22759E-04 -0.22759E-04131072 0.752E-02
DAV: 6 -0.727166534213E+02 -0.16125E-06 -0.16112E-06262144 0.743E-03
DAV: 7 -0.727166534224E+02 -0.10905E-08 -0.11155E-08131072 0.349E-04
Calling wannier_setup of wannier90 in library mode
Wannier90 mode
Computing MMN (overlap matrix elements)
Computing AMN (projections onto localized orbitals)
Re: OOM during Wannier90 AMN computation
Posted: Wed Nov 06, 2024 9:15 am
by henrique_miranda
Hi Francesco,
My suggestion would be to start with a calculation that runs quickly and does not run out of memory:
- Use the default ENCUT, which is chosen based on the maximum value of ENMAX in the POTCAR file
- Reduce the number of k-points (I don't see how many you are using currently because you did not share the KPOINTS file)
- Reduce the number of bands and MPI ranks but still reserve the full node. I see you are manually setting NBANDS=128 but then excluding most of those bands from the Wannierization procedure, so perhaps you can reduce the number of bands. If you still reserve the full node while using fewer MPI ranks, the amount of memory available per MPI rank increases.
Once you have a calculation that runs, you can gradually increase KPOINTS and ENCUT and compare the results with those of the calculation that ran.
My suspicion is that you can still get very accurate results with lower ENCUT and KPOINTS. If that is not the case, then you need to increase the amount of memory per MPI rank, as described in the third point above.
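To illustrate the third point, here is a minimal Python sketch of how the memory available per MPI rank changes when fewer ranks share a fully reserved node. The 128 ranks come from the output posted above; the node memory of 256 GiB and the function name `memory_per_rank_gib` are assumptions made up for the example, so substitute the values for your own machine.

```python
def memory_per_rank_gib(node_memory_gib, ranks_on_node):
    """Memory share of each MPI rank if the node's memory is split evenly."""
    if ranks_on_node <= 0:
        raise ValueError("ranks_on_node must be positive")
    return node_memory_gib / ranks_on_node


# Hypothetical node size; replace with the memory of your own node.
NODE_MEMORY_GIB = 256.0

# 128 ranks is what the posted run used; 64 and 32 are reduced rank counts
# on the same fully reserved node.
for ranks in (128, 64, 32):
    share = memory_per_rank_gib(NODE_MEMORY_GIB, ranks)
    print(f"{ranks:4d} ranks -> {share:.1f} GiB per rank")
```

This only shows the even split of the node memory. It does not estimate how much memory VASP actually needs per rank.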
Let me know if this helps.
Re: OOM during Wannier90 AMN computation
Posted: Wed Nov 13, 2024 10:23 am
by francesco_martinelli
Hi Henrique,
First of all, thank you for your help. The interface now runs correctly (with fewer bands, a lower cutoff, etc.), but I'd like to share with you the .win file generated during the calculation that led to the OOM.
In the 'generated automatically by VASP' section it printed num_bands = 128, even though exclude_bands = 1-76,83-128 was set earlier in the file. If I repeat the same calculation on a different machine (with more memory), the OOM doesn't happen and the num_bands tag is set to 6.
Is this only a memory-related issue, or could it be caused by a different or faulty VASP compilation on the machine that gives the OOM?
Thank you in advance for the reply.
Code:
exclude_bands = 1-76,83-128
guiding_centres = T
begin projections
Re:dxz,dyz,dxy
end projections
dis_win_min = 3
dis_win_max = 7
write_hr = true
write_u_matrices = true
write_xyz = true
num_iter = 0
conv_tol = 1E-9
conv_window = 10
dis_num_iter = 0
dis_conv_tol = 1E-9
dis_conv_window = 10
bands_plot = true
begin kpoint_path
G .0 .0 .0 X .5 .0 .5
X .5 .0 .5 W .5 .25 .75
W .5 .25 .75 L .5 .5 .5
L .5 .5 .5 G .0 .0 .0
G .0 .0 .0 K .375 .375 .75
end kpoint_path
bands_num_points 40
# This part was generated automatically by VASP
num_bands = 128
num_wann = 6
spinors = .true.
begin unit_cell_cart
4.0401000 0.0000000 4.0401000
4.0401000 4.0401000 0.0000000
0.0000000 4.0401000 4.0401000
end unit_cell_cart
begin atoms_cart
Ba 2.0200500 2.0200500 2.0200500
Ba 6.0601500 6.0601500 6.0601500
Mg 4.0401000 4.0401000 4.0401000
Re 0.0000000 0.0000000 0.0000000
O 1.9260773 4.0401000 4.0401000
O 4.0401000 1.9260773 4.0401000
O 4.0401000 4.0401000 1.9260773
O 6.1541227 4.0401000 4.0401000
O 4.0401000 6.1541227 4.0401000
O 4.0401000 4.0401000 6.1541227
end atoms_cart
mp_grid = 8 8 8
begin kpoints
0.000000000000 0.000000000000 0.000000000000
0.125000000000 0.000000000000 0.000000000000
0.250000000000 0.000000000000 0.000000000000
0.375000000000 -0.000000000000 -0.000000000000
...
end kpoints
Re: OOM during Wannier90 AMN computation
Posted: Mon Nov 18, 2024 9:42 am
by henrique_miranda
Good to hear that you were able to run your calculations!
Starting any calculation from low ENCUT and KPOINTS settings is one of the simplest and most important pieces of advice I can give.
Once you obtain the final result you are after, in your case Wannier functions, you can easily run new calculations with larger ENCUT and KPOINTS and check whether these final results change.
In some cases, you might find that instead of using a large ENCUT or KPOINTS 'to be safe' and spending more computational resources, you can get very accurate results with a lower ENCUT or KPOINTS.
As for num_bands = 128 appearing in the wannier90.win file, the short answer is that this can happen when the code is stopped abruptly.
The long answer is rather technical:
The interface between VASP and wannier90 has two stages: setup and execution.
In the setup step, wannier90 reads the wannier90.win file and returns some of the data it contains, such as the indices of the excluded bands, the projections, etc.
This wannier90.win file is created by VASP before the setup call.
There is a small chicken-and-egg problem here: to know the number of bands (num_bands), we first need to read which bands to exclude (exclude_bands) from the wannier90.win file.
To solve this, we first write a wannier90.win with num_bands=NBANDS. Then, before the execution step, we rewrite it with num_bands=NBANDS-num_exclude_bands, where num_exclude_bands is the total number of excluded bands, which we read in the setup step.
Note that wannier90 only needs num_bands in the execution step.
It is between the setup and execution steps that VASP computes the AMN and MMN information and writes it to the AMN and MMN files.
If the code goes OOM there, the wannier90.win file is never rewritten, so it still contains num_bands=NBANDS instead of the correct value.
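The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the relation num_bands=NBANDS-num_exclude_bands and of how one might spot a wannier90.win left over from an aborted run. It is not the code used inside VASP or wannier90: the function names are hypothetical and the parsing handles only the simple `key = value` form that appears in this thread.

```python
import re


def count_excluded_bands(spec):
    """Count distinct band indices in a list such as '1-76,83-128'."""
    excluded = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(x) for x in part.split("-"))
            excluded.update(range(first, last + 1))
        else:
            excluded.add(int(part))
    return len(excluded)


def read_win_value(win_text, key):
    """Return the raw value of the first 'key = value' line, or None."""
    pattern = rf"^\s*{re.escape(key)}\s*[=:]?\s*(.+?)\s*$"
    match = re.search(pattern, win_text, re.MULTILINE | re.IGNORECASE)
    return match.group(1) if match else None


def win_looks_stale(win_text, nbands):
    """Heuristic: num_bands still equals NBANDS although bands are excluded."""
    num_bands = read_win_value(win_text, "num_bands")
    spec = read_win_value(win_text, "exclude_bands")
    if num_bands is None or spec is None:
        return False
    return int(num_bands) == nbands and count_excluded_bands(spec) > 0


# Values taken from the wannier90.win posted in this thread.
SAMPLE_WIN = """exclude_bands = 1-76,83-128
num_bands = 128
num_wann = 6
"""

n_excluded = count_excluded_bands("1-76,83-128")
print("excluded bands:", n_excluded)            # 122
print("expected num_bands:", 128 - n_excluded)  # 6
print("stale file:", win_looks_stale(SAMPLE_WIN, nbands=128))
```

With the values from this thread, 122 bands are excluded, so the rewritten file should contain num_bands = 6, which matches what was seen on the machine with more memory.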
I hope this answers your question!
Re: OOM during Wannier90 AMN computation
Posted: Mon Nov 18, 2024 9:58 am
by francesco_martinelli
Great! Thank you for your explanation!