Background
DeltaSpin Parameter Unification & DFT+U PW Bug Fixes
Summary
This branch (feat/ds-lambda-optimize from repo dyzheng) addresses critical parameter design issues in the DeltaSpin (spin-constrained DFT) module, fixes several DFT+U PW bugs, and stabilizes the direction_only collinear spin-constraint mode. The changes unify the parameter scheme by replacing magic numbers with named parameters, fix incorrect MPI reduction and spin-channel indexing bugs, and restructure the esolver control flow for correctness.
Motivation
1. Magic Number Convention is Undocumented and Error-Prone
The PW basis uses sc_scf_thr = 10.0 as an implicit flag meaning "activate lambda loop immediately". This convention:
- Is completely undocumented in user-facing parameter descriptions
- Relies on comparing sc_scf_thr != 10.0 in reset_value logic to distinguish modes
- Makes it impossible for users to genuinely set a threshold of 10.0 Ry
- Conflicts with the parameter's stated purpose as a "density error threshold"
2. direction_only Mode is Broken for Collinear (nspin=2) Calculations
When sc_direction_only = true with nspin=2:
- mixing_restart is auto-set to sc_scf_thr (~1e-3), which triggers init_mixing() unexpectedly during Phase 1 BFGS updates, polluting the mixing history and destabilizing convergence
- Phase 1 duration is hardcoded to 5 iterations with no user control
- There is no smooth transition from Phase 1 (magnitude constraint) to Phase 2 (direction relaxation), causing a discontinuous Hamiltonian jump
3. DFT+U PW Produces Wrong Results with kpar > 1
- Spin channel selection uses ik >= nk/2 to determine spin-down k-points, which is only correct for the global k-point ordering. When kpar > 1, k-points are distributed across MPI pools, and the second half of local k-points is NOT the spin-down channel. This means DFT+U occupation matrices and energies are wrong in any parallel k-point distribution.
- reduce_double_allpool() uses PARAM.globalv.nproc_in_pool which is not yet initialized at call time, causing incorrect MPI reduction for both DFT+U and DeltaSpin.
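The spin-channel problem can be reproduced with a toy model. The sketch below is not ABACUS code: the function names and the example k-point layouts are hypothetical, and the only assumption taken from the text is that an isk-style table stores the spin channel (0 = up, 1 = down) of each local k-point. It counts how often the positional heuristic disagrees with the table.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Old heuristic: guess the spin channel from the position in the local list.
int spin_from_half(std::size_t ik, std::size_t nk_local) {
    return ik >= nk_local / 2 ? 1 : 0;
}

// Table lookup: read the spin channel stored for this local k-point.
int spin_from_isk(const std::vector<int>& isk, std::size_t ik) {
    assert(ik < isk.size());
    return isk[ik];
}

// Count local k-points where the heuristic disagrees with the table.
std::size_t count_mismatches(const std::vector<int>& isk) {
    std::size_t bad = 0;
    for (std::size_t ik = 0; ik < isk.size(); ++ik) {
        if (spin_from_half(ik, isk.size()) != spin_from_isk(isk, ik)) {
            ++bad;
        }
    }
    return bad;
}
```

For a hypothetical local table {0, 0, 1, 1} the two approaches agree, while for {0, 0, 0, 1} the heuristic mislabels one k-point. Reading the table never depends on how the k-points were distributed.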
4. onsite_op Kernel Crashes for nspin=2 DFT+U/DeltaSpin
The onsite_ps_op kernel always uses spinor (npol=2) indexing, causing buffer overreads and crashes when running nspin=2 DFT+U or DeltaSpin with PW basis.
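The overread can be illustrated with simple index arithmetic. This sketch assumes a hypothetical flat layout [band][polarization][projector] for the projection coefficients; the real kernel layout may differ, and all names here are illustrative only.

```cpp
#include <cassert>
#include <cstddef>

// Number of stored coefficients for the assumed [band][pol][proj] layout.
std::size_t becp_size(std::size_t nbands, std::size_t npol, std::size_t nkb) {
    return nbands * npol * nkb;
}

// Flat offset of (band ib, polarization ip, projector ikb) in that layout.
std::size_t becp_index(std::size_t ib, std::size_t ip, std::size_t ikb,
                       std::size_t npol, std::size_t nkb) {
    return (ib * npol + ip) * nkb + ikb;
}

// Largest offset a kernel touches if it indexes with npol_assumed.
std::size_t max_offset(std::size_t nbands, std::size_t npol_assumed, std::size_t nkb) {
    assert(nbands > 0 && npol_assumed > 0 && nkb > 0);
    return becp_index(nbands - 1, npol_assumed - 1, nkb - 1, npol_assumed, nkb);
}
```

With 4 bands and 3 projectors, a buffer allocated for npol=1 holds 12 entries, but spinor indexing reaches offset 23. That is the kind of out-of-bounds access a separate npol==1 branch avoids.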
5. Unfinished Lambda Strategy Code Creates Maintenance Burden
LambdaUpdateStrategy, HybridDelayedUpdate, AugmentedLagrangianUpdate, and LinearResponseUpdate classes are declared but reference undeclared members, cannot compile, and have no working tests. This dead code complicates the module interface and confuses contributors.
Changes
New Parameters
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sc_scf_thr_mode | string | "threshold" | Controls when the DeltaSpin lambda loop activates. "threshold": activate when drho < sc_scf_thr. "immediate": activate from iteration 2 (replaces sc_scf_thr=10.0 convention). |
| sc_dir_phase1_steps | int | 5 | Number of SCF iterations for Phase 1 (magnitude constraint) in the direction_only two-phase strategy for collinear (nspin=2) calculations. Minimum: 2. |
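A minimal sketch of how the two parameters could be interpreted, following only the descriptions in the table. The enum, the function names, and the 1-based iteration counter are assumptions made for illustration. Whether an out-of-range sc_dir_phase1_steps is rejected or clamped is not stated here, so the sketch only reports validity.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Named activation modes replacing the sc_scf_thr = 10.0 sentinel.
enum class ScfThrMode { Threshold, Immediate };

// Map the input string to a mode; any other value is rejected.
ScfThrMode parse_sc_scf_thr_mode(const std::string& s) {
    if (s == "threshold") return ScfThrMode::Threshold;
    if (s == "immediate") return ScfThrMode::Immediate;
    throw std::invalid_argument("sc_scf_thr_mode must be \"threshold\" or \"immediate\"");
}

// Decide whether the lambda loop runs in SCF iteration `iter` (assumed 1-based).
bool lambda_loop_active(ScfThrMode mode, int iter, double drho, double sc_scf_thr) {
    assert(iter >= 1);
    switch (mode) {
        case ScfThrMode::Immediate: return iter >= 2;
        case ScfThrMode::Threshold: return drho < sc_scf_thr;
    }
    return false;
}

// The table states a minimum of 2 for sc_dir_phase1_steps.
bool valid_phase1_steps(int n) {
    return n >= 2;
}
```

Because the mode is explicit, sc_scf_thr is free to hold any numeric threshold, including 10.0, without changing which branch is taken.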
Bug Fixes
- DFT+U PW spin channel: Replace ik >= psi_p->get_nk()/2 with isk[ik] in cal_occ_pw() for correct spin channel detection under k-point parallelism (kpar > 1)
- MPI reduction: Use GlobalV::NPROC_IN_POOL instead of PARAM.globalv.nproc_in_pool in reduce_double_allpool() for both DFT+U and DeltaSpin
- onsite_op kernel: Restore npol==1/npol==2 branches in onsite_ps_op for CPU, CUDA, and ROCm backends (fixes nspin=2 crash)
- Force/stress nonlocal formula: Correct term order for SOC/DFT+U/DeltaSpin in force_op and stress_op kernels
- LCAO DeltaSpin SCF crash: Fix charge density synchronization issues in LCAO DeltaSpin path
- mixing_restart auto-setting: Rewrite from magic-number-based (sc_scf_thr != 10.0) to clean three-way branching (direction_only → 0, threshold → sc_scf_thr, immediate → scf_thr/10)
- sc_scf_thr default: Unified from conflicting 1e-3/1e-4 to 1e-3; fixed comment from "minimum number of outer scf loop" to "density error threshold"
- PW run_lambda_loop iter indexing: Correct off-by-one error (iter → iter - 1) in PW basis path
- MPI/FFTW cleanup order: Fix segfault at MPI_Finalize
- LCAO DFT+U NSCF assertion: Fix assertion failures in non-self-consistent DFT+U calculations
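The three-way mixing_restart branching listed above can be sketched as a single function. This is an illustration rather than the code in read_input_item_elec_stru.cpp: the function name is hypothetical, and giving direction_only precedence over the mode string is an assumption.

```cpp
#include <cassert>
#include <string>

// Pick the automatic mixing_restart value from the DeltaSpin settings.
// Assumption: direction_only is checked first, then the activation mode.
double auto_mixing_restart(bool sc_direction_only,
                           const std::string& sc_scf_thr_mode,
                           double sc_scf_thr,
                           double scf_thr) {
    assert(sc_scf_thr_mode == "threshold" || sc_scf_thr_mode == "immediate");
    if (sc_direction_only) {
        return 0.0;              // direction_only: no automatic mixing restart
    }
    if (sc_scf_thr_mode == "immediate") {
        return scf_thr / 10.0;   // immediate: tied to the SCF convergence threshold
    }
    return sc_scf_thr;           // threshold: restart when the lambda loop activates
}
```

No branch compares a floating-point input against a sentinel such as 10.0, so the result depends only on the named settings.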
New Features
- Constraint SCF annealing for direction_only nspin=2: After Phase 1 BFGS converges magnitude, lambda decays by factor 0.5^(1/3) per step during Phase 2, with mix_reset() at the Phase transition to clear Broyden history
- linear_scan strategy for PW basis: sc_lambda_strategy = "linear_scan" now works with PW (previously LCAO-only) for energy landscape mapping
- nspin=1 support for PW DeltaSpin: PW path now handles nspin=1 calculations
- Full/incremental lambda update: Both PW and LCAO paths support full_update parameter for calculate_delta_hcc, enabling incremental Hamiltonian correction updates
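A decay factor of 0.5^(1/3) per step means that lambda halves every three Phase 2 steps. The sketch below works through that arithmetic. The function names are hypothetical, and it assumes the factor is applied once per SCF step to a scalar lambda, starting from the value left by Phase 1.

```cpp
#include <cassert>
#include <cmath>

// Apply one annealing step: multiply by 0.5^(1/3), roughly 0.7937.
double decay_step(double lambda) {
    return lambda * std::cbrt(0.5);
}

// Closed form after n Phase 2 steps: lambda0 * 0.5^(n/3).
double annealed_lambda(double lambda0, int n_steps) {
    assert(n_steps >= 0);
    return lambda0 * std::pow(0.5, n_steps / 3.0);
}
```

Applying decay_step three times to 1.0 gives 0.5 up to rounding, which matches annealed_lambda(1.0, 3).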
Refactoring
- Deleted unfinished lambda strategy code: Removed lambda_update_strategies.h/cpp, lambda_strategy_integration.cpp, and their test (4 files, ~740 lines)
- Removed LCAO-specific lambda loop path: Consolidated run_lambda_loop_lcao() into the unified run_lambda_loop()
- Encapsulated LCAO-specific DeltaSpin reset: Moved LCAO-specific operator reset logic into dspin_lcao.cpp/h to avoid #ifdef __LCAO in core module
- Guarded BFGS/linear_scan with __LCAO: Conditional compilation for no-LCAO builds
- Comprehensive English documentation: Added detailed comments for nspin=2/4 paths, two-phase strategy, and parameter semantics throughout module_deltaspin
Test Changes
- Added DFT+U+DeltaSpin NSCF test cases (60-64) with pre-converged charge density and onsite.dm
- Added sc_scf_thr_mode and sc_dir_phase1_steps test cases
- Added nspin=2 direction_only LCAO test case
- Shielded 17 unstable LCAO cases in CI
- Consolidated from 39 to 24 active cases in 17_DS_DFTU test suite
- Updated catch_properties.sh to parse DeltaSpin-specific output fields
Files Changed (Key Files)
DeltaSpin Core
- source/source_lcao/module_deltaspin/spin_constrain.h: Removed strategy classes; added PW support methods
- source/source_lcao/module_deltaspin/spin_constrain.cpp: Unified lambda loop; nspin=1/2/4 path documentation
- source/source_lcao/module_deltaspin/lambda_loop.cpp: Unified PW/LCAO loop; removed LCAO-specific path
- source/source_lcao/module_deltaspin/lambda_loop_helper.cpp: Helper functions for two-phase strategy
- source/source_lcao/module_deltaspin/cal_mw_from_lambda.cpp: Extended for PW basis; MPI reduction fix
- source/source_lcao/module_deltaspin/cal_mw.cpp: Refactored for unified parameter scheme
- source/source_lcao/module_deltaspin/deltaspin_lcao.cpp: Encapsulated LCAO-specific reset logic
- source/source_lcao/module_deltaspin/init_sc.cpp: Updated for new parameters
- Deleted: lambda_update_strategies.h/cpp, lambda_strategy_integration.cpp, cal_h_lambda.cpp, test/lambda_update_strategies_test.cpp
PW DeltaSpin
- source/source_pw/module_pwdft/deltaspin_pw.cpp: Rewrote with sc_scf_thr_mode; added linear_scan and immediate mode
DFT+U PW
- source/source_lcao/module_dftu/dftu_pw.cpp: Fixed isk spin channel; MPI reduction fix
- source/source_lcao/module_dftu/dftu.h: Updated cal_occ_pw signature
- source/source_pw/module_pwdft/dftu_pw.cpp: Same isk and MPI fixes
- source/source_pw/module_pwdft/dftu_pw.h: Same signature update
- source/source_pw/module_pwdft/kernels/onsite_op.cpp: Restored npol=1/2 branches
- source/source_pw/module_pwdft/kernels/cuda/onsite_op.cu: Same GPU fix
- source/source_pw/module_pwdft/kernels/rocm/onsite_op.hip.cu: Same ROCm fix
- source/source_pw/module_pwdft/kernels/force_op.cpp: Fixed nonlocal formula order
- source/source_pw/module_pwdft/kernels/stress_op.cpp: Same fix
Esolver & Parameters
- source/source_esolver/esolver_ks_lcao.cpp: Restructured DeltaSpin control flow with two-phase strategy
- source/source_io/module_parameter/input_parameter.h: Added sc_scf_thr_mode, sc_dir_phase1_steps
- source/source_io/module_parameter/read_input_item_other.cpp: New parameter items and validation
- source/source_io/module_parameter/read_input_item_elec_stru.cpp: Rewrote mixing_restart auto-setting
Force/Stress
- source/source_pw/module_pwdft/kernels/force_op.cpp: Nonlocal formula order fix
- source/source_pw/module_pwdft/kernels/stress_op.cpp: Same fix
- source/source_pw/module_pwdft/kernels/cuda/force_op.cu: GPU variant
- source/source_pw/module_pwdft/kernels/cuda/stress_op.cu: GPU variant
- source/source_pw/module_pwdft/kernels/rocm/force_op.hip.cu: ROCm variant
- source/source_pw/module_pwdft/kernels/rocm/stress_op.hip.cu: ROCm variant
Backward Compatibility
- sc_scf_thr = 10.0: Previously used as an undocumented "immediate activation" flag. Users must now set sc_scf_thr_mode = "immediate" instead. The old sc_scf_thr = 10.0 will still produce drho < 10.0 threshold behavior, which is functionally similar but semantically different.
- sc_scf_nmin: Removed (was already deprecated).
- Deleted lambda strategies: linear_response, augmented_lagrangian, hybrid_delayed are no longer accepted by sc_lambda_strategy.
- All other parameters: Fully backward compatible; default behavior matches old behavior when sc_scf_thr_mode = "threshold".
Describe the solution you'd like
Raise the PR after #7304 is merged.