Discretized 1D Kohn-Sham Equation

Problem Description

In this part, we consider the energy minimization problem for a single-particle Hamiltonian arising from discretizing a 1D Kohn-Sham equation in electronic structure calculations,

\[\begin{split} \begin{aligned} \min_{X \in \mathbb{R}^{n\times p}}\quad &\frac{1}{2} \mathrm{tr}\left( X^\top LX \right) + \frac{\alpha}{4} \rho^\top L^{-1} \rho\\ \text{s. t.} \quad &X^\top X = I_p, \end{aligned} \end{split}\]

where \(\rho := \mathrm{diag}(XX^\top) \in \mathbb{R}^n\) is the vector of diagonal entries of \(XX^\top\), \(L\) is the tridiagonal matrix with \(2\) on its main diagonal and \(-1\) on its sub- and superdiagonals, and \(\alpha > 0\) is a parameter. Due to their simplicity, such problems have become standard test problems for investigating the convergence of self-consistent field (SCF) methods [1]. Clearly, they are smooth optimization problems on the Stiefel manifold, and we show how to solve them with the cdopt package.

Importing modules

We first import all the necessary modules for this optimization problem.

import cdopt 
import numpy as np
import scipy as sp
from scipy.sparse import csr_matrix, diags
from scipy.sparse.linalg import spsolve
import time

Generating data

We generate the necessary data. Since \(L\) is a sparse matrix, we apply functions from scipy.sparse to accelerate the computation.

n = 1000
p = 10
alpha = 1
L = diags([-1,2,-1], [-1,0,1], format='csr', shape=(n,n))
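
As a quick sanity check, the leading block of L shows the intended tridiagonal pattern (the printed values may appear as integers or floats depending on the dtype scipy infers):

print(L.toarray()[:4, :4])   # 2 on the main diagonal, -1 on the sub- and superdiagonals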

Set functions and problems

Then we set the objective function and the Stiefel manifold. Since existing automatic differentiation packages have limited support for operations on sparse matrices, we manually define the gradient and Hessian-vector product of the objective function.
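
For reference, the chain rule yields the derivatives implemented below. Writing \(\mathrm{Diag}(v)\) for the diagonal matrix with the vector \(v\) on its diagonal,

\[\begin{split} \begin{aligned} \nabla f(X) &= LX + \alpha \, \mathrm{Diag}\left(L^{-1}\rho\right) X,\\ \nabla^2 f(X)[D] &= LD + \alpha \, \mathrm{Diag}\left(L^{-1}\mathrm{diag}\left(XD^\top + DX^\top\right)\right) X + \alpha \, \mathrm{Diag}\left(L^{-1}\rho\right) D, \end{aligned} \end{split}\]

where \(\mathrm{diag}(XD^\top + DX^\top)\) is exactly the quantity 2 * np.sum(X*D, 1) computed in obj_hvp below.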

def obj_fun(X):
    rho = np.sum(X ** 2, 1)            # rho = diag(X X^T)
    Linvrho = spsolve(L, rho)          # L^{-1} rho via a sparse direct solve
    LX = L @ X
    return 0.5 * np.sum(X * LX) + alpha / 4 * np.sum(Linvrho * rho)

def obj_grad(X):
    rho = np.sum(X ** 2, 1)
    Linvrho = alpha * spsolve(L, rho)
    return L @ X + Linvrho[:, np.newaxis] * X

def obj_hvp(X, D):
    rho = np.sum(X ** 2, 1)
    rhoXdot = 2 * np.sum(X * D, 1)     # directional derivative of rho along D
    LinvrhoXdot = alpha * spsolve(L, rhoXdot)
    Linvrho = alpha * spsolve(L, rho)
    return L @ D + LinvrhoXdot[:, np.newaxis] * X + Linvrho[:, np.newaxis] * D

M = cdopt.manifold_np.stiefel_np((n,p))   # The Stiefel manifold.
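
Since the derivatives are hand-coded, it is worth verifying them against central finite differences before running any solver. A minimal sketch, using an arbitrary random point and step size:

# Sanity check of obj_grad and obj_hvp by central finite differences.
rng = np.random.default_rng(0)
X0 = rng.standard_normal((n, p))
D0 = rng.standard_normal((n, p))
eps = 1e-6
fd_dir = (obj_fun(X0 + eps * D0) - obj_fun(X0 - eps * D0)) / (2 * eps)
print(abs(fd_dir - np.sum(obj_grad(X0) * D0)))    # should be small relative to the gradient scale
fd_hvp = (obj_grad(X0 + eps * D0) - obj_grad(X0 - eps * D0)) / (2 * eps)
print(np.linalg.norm(fd_hvp - obj_hvp(X0, D0)))   # should be small relative to the Hessian scale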

Describe the optimization problem

The optimization problem is described by the manifold and the derivatives of the objective function.

problem_test = cdopt.core.problem(M, obj_fun, obj_grad=obj_grad, obj_hvp=obj_hvp, beta=30)  # describe the optimization problem and set the penalty parameter beta.
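
Here beta is the penalty parameter of the constraint dissolving function. As a rough sketch of the idea, following the smooth exact penalty of [2] (cdopt's internal construction may differ in details), the Stiefel constraint is dissolved into an unconstrained objective of the form

\[ h(X) = f\left(X\left(\tfrac{3}{2} I_p - \tfrac{1}{2} X^\top X\right)\right) + \frac{\beta}{4}\left\| X^\top X - I_p \right\|_F^2, \]

whose stationary points coincide with those of the original constrained problem once \(\beta\) is sufficiently large.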

Apply optimization solvers

After describing the optimization problem, we can directly obtain the function value, gradient and Hessian-vector product of the constraint dissolving function from the cdopt.core.problem instance.

# The vectorized function value, gradient and Hessian-vector product of the
# constraint dissolving function. Their inputs are 1D NumPy arrays, and their
# outputs are floats or 1D NumPy arrays.
cdf_fun_np = problem_test.cdf_fun_vec_np
cdf_grad_np = problem_test.cdf_grad_vec_np
cdf_hvp_np = problem_test.cdf_hvp_vec_np
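
These callables operate on flattened arrays, so they can be passed directly to generic unconstrained solvers. A quick illustration, reusing the Xinit_vec_np attribute that also appears below:

x0 = problem_test.Xinit_vec_np                 # flattened initial point, a 1D array of length n*p
print(cdf_fun_np(x0))                          # scalar value of the constraint dissolving function
print(cdf_grad_np(x0).shape)                   # (n*p,)
print(cdf_hvp_np(x0, np.ones(n * p)).shape)    # (n*p,), Hessian-vector product along a direction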


## Apply the limited-memory BFGS solver from scipy.optimize
from scipy.optimize import minimize
Xinit = problem_test.Xinit_vec_np  # set initial point
# optimize by the L-BFGS method
t_start = time.time()
out_msg = minimize(cdf_fun_np, Xinit, method='L-BFGS-B', jac=cdf_grad_np, options={'maxcor': 10, 'ftol': 0, 'gtol': 1e-06})
t_end = time.time() - t_start

# Statistics
feas = M.Feas_eval(M.v2m(M.array2tensor(out_msg.x)))   # feasibility of the returned point
stationarity = np.linalg.norm(out_msg['jac'], 2)       # norm of the gradient of the penalty function

result_lbfgs = [out_msg['fun'], out_msg['nit'], out_msg['nfev'], stationarity, feas, t_end]

# print results
print('Solver   fval         iter   f_eval   stationarity   feasibility    CPU time')
print('& L-BFGS & {:.2e}  & {:}  & {:}    & {:.2e}     & {:.2e}     & {:.2f} \\\\'.format(*result_lbfgs))
Solver   fval         iter   f_eval   stationarity   feasibility    CPU time
& L-BFGS & 3.57e+01  & 171  & 177    & 6.45e-06     & 3.98e-08     & 0.34 \\
# optimize by the CG method
t_start = time.time()
out_msg = minimize(cdf_fun_np, Xinit, method='CG', jac=cdf_grad_np, options={'gtol': 1e-06})
t_end = time.time() - t_start

# Statistics
feas = M.Feas_eval(M.v2m(M.array2tensor(out_msg.x)))   # feasibility of the returned point
stationarity = np.linalg.norm(out_msg['jac'], 2)       # norm of the gradient of the penalty function

result_cg = [out_msg['fun'], out_msg['nit'], out_msg['nfev'], stationarity, feas, t_end]

# print results
print('Solver   fval         iter   f_eval   stationarity   feasibility    CPU time')
print('& CG     & {:.2e}  & {:}  & {:}    & {:.2e}     & {:.2e}     & {:.2f} \\\\'.format(*result_cg))
Solver   fval         iter   f_eval   stationarity   feasibility    CPU time
& CG     & 3.57e+01  & 158  & 270    & 5.38e-06     & 1.17e-07     & 0.36 \\
# optimize by the Newton GLTR trust-region method
t_start = time.time()
out_msg = minimize(cdf_fun_np, Xinit, method='trust-krylov', jac=cdf_grad_np, hessp=cdf_hvp_np, options={'gtol': 1e-06})
t_end = time.time() - t_start

# Statistics
feas = M.Feas_eval(M.v2m(M.array2tensor(out_msg.x)))   # feasibility of the returned point
stationarity = np.linalg.norm(out_msg['jac'], 2)       # norm of the gradient of the penalty function

result_tr_krylov = [out_msg['fun'], out_msg['nit'], out_msg['nfev'], stationarity, feas, t_end]

# print results
print('Solver   fval         iter   f_eval   stationarity   feasibility    CPU time')
print('& TR-KRY & {:.2e}  & {:}  & {:}    & {:.2e}     & {:.2e}     & {:.2f} \\\\'.format(*result_tr_krylov))
Solver   fval         iter   f_eval   stationarity   feasibility    CPU time
& TR-KRY & 3.57e+01  & 38  & 39    & 1.09e-08     & 7.94e-11     & 0.69 \\
# optimize by the Newton conjugate gradient trust-region method
t_start = time.time()
out_msg = minimize(cdf_fun_np, Xinit, method='trust-ncg', jac=cdf_grad_np, hessp=cdf_hvp_np, options={'gtol': 1e-06})
t_end = time.time() - t_start

# Statistics
feas = M.Feas_eval(M.v2m(M.array2tensor(out_msg.x)))   # feasibility of the returned point
stationarity = np.linalg.norm(out_msg['jac'], 2)       # norm of the gradient of the penalty function

result_tr_ncg = [out_msg['fun'], out_msg['nit'], out_msg['nfev'], stationarity, feas, t_end]

# print results
print('Solver   fval         iter   f_eval   stationarity   feasibility    CPU time')
print('& TR-NCG & {:.2e}  & {:}  & {:}    & {:.2e}     & {:.2e}     & {:.2f} \\\\'.format(*result_tr_ncg))
Solver   fval         iter   f_eval   stationarity   feasibility    CPU time
& TR-NCG & 3.57e+01  & 41  & 42    & 2.66e-09     & 2.71e-11     & 0.52 \\

References

  1. Lin, L., & Yang, C. (2013). Elliptic preconditioner for accelerating the self-consistent field iteration in Kohn–Sham density functional theory. SIAM Journal on Scientific Computing, 35(5), S277–S298.

  2. Xiao, N., & Liu, X. (2021). Solving optimization problems over the Stiefel manifold by smooth exact penalty function. arXiv preprint arXiv:2110.08986.