Low-Rank Nearest Correlation Estimation#

Problem Description#

Given a symmetrix matrix \(G \in \mathbb{R}^{n\times n}\) and a nonnegative symmetric weigh matrix \(H \in \mathbb{R}^{n\times n}\), the low-rank nearest correlation estimation (NCM) problem aims to find a correlation matrix \(W \in \mathbb{R}^{n\times n}\) that minimizes the weighted distance between \(W\) and \(G\),

\[\begin{split} \begin{aligned} \min_{W \in \mathbb{R}^{n\times n}}\quad &\frac{1}{2} || H\circ( W - G ) ||_F^2 \\ \text{s. t.} \quad &W_{ii} = 1,~ i = 1,2,...,n, ~ \mathrm{rank}(W) \leq p. \end{aligned} \end{split}\]

We consider the low-rand descomposition of \(W = X^\top X\) with \(X = [x_1,..., x_n]^\top \in \mathbb{R}^{n\times p}\). Then the NCM problem becomes

\[\begin{split} \begin{aligned} \min_{X \in \mathbb{R}^{n\times p}}\quad &\frac{1}{2} || H\circ( X^\top X - G ) ||_F^2 \\ \text{s. t.} \quad & ||x_i||_2^2 = 1, ~i = 1,...,n. \end{aligned} \end{split}\]

Clearly, this problem is a smooth optimization problem on the Oblique manifold, and we show how to solve these problems with cdopt package.

Importing modules#

We first import all the necessary modules for this optimization problem.

import cdopt 
import numpy as np
import scipy as sp
import torch
import time

Generating datas#

We generate necessary data and load the datas to GPU device.

n = 1000
p = 40

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
dtype = torch.float64


Y = torch.randn(n,p, device= device, dtype= dtype) 
Y = Y / torch.sqrt(torch.sum(Y ** 2, 1, keepdim= True))

G = Y @ Y.T + 0.5 * torch.randn(n,n, device= device, dtype= dtype)
H = (torch.rand(n,n, device= device, dtype= dtype ) + 1)/2

Set functions and problems#

Then we set the objective function and the Oblique manifold.

def obj_fun(X):
    return 0.5 * torch.sum(  (H * (X@ X.T - G)) ** 2)

M = cdopt.manifold_torch.oblique_torch((n,p), device=device, dtype= dtype)   # The Oblique manifold.

Describe the optimization problem#

The optimization problem can be described by the manifold and the expression of the objective function.

problem_test = cdopt.core.problem(M, obj_fun, beta = 'auto')  # describe the optimization problem and set the penalty parameter \beta.

Apply optimization solvers#

After describe the optimization problem, we can directly function value, gradient and Hessian-vector product from the cdopt.core.Problem class.

# the vectorized function value, gradient and Hessian-vector product of the constraint dissolving function. Their inputs are numpy 1D array, and their outputs are float or numpy 1D array.
cdf_fun_np = problem_test.cdf_fun_vec_np   
cdf_grad_np = problem_test.cdf_grad_vec_np 
cdf_hvp_np = problem_test.cdf_hvp_vec_np


## Apply limit memory BFGS solver from scipy.minimize 
from scipy.optimize import fmin_bfgs, fmin_cg, fmin_l_bfgs_b, fmin_ncg
Xinit = problem_test.Xinit_vec_np  # set initial point
# optimize by L-BFGS method
t_start = time.time()
out_msg = sp.optimize.minimize(cdf_fun_np, Xinit,method='L-BFGS-B',jac = cdf_grad_np, options={'disp': None, 'maxcor': 10, 'ftol': 0, 'gtol': 1e-06, 'eps': 0e-08,})
t_end = time.time() - t_start

# Statistics
feas = M.Feas_eval(M.v2m(M.array2tensor(out_msg.x)))   # Feasibility
stationarity = np.linalg.norm(out_msg['jac'],2)   # stationarity

result_lbfgs = [out_msg['fun'], out_msg['nit'], out_msg['nfev'],stationarity,feas, t_end]

# print results
print('Solver   fval         iter   f_eval   stationarity   feaibility     CPU time')
print('& L-BFGS & {:.2e}  & {:}  & {:}    & {:.2e}     & {:.2e}     & {:.2f} \\\\'.format(*result_lbfgs))
Solver   fval         iter   f_eval   stationarity   feaibility     CPU time
& L-BFGS & 6.98e+04  & 94  & 101    & 2.99e-05     & 1.51e-09     & 0.98 \\
# optimize by CG method
t_start = time.time()
out_msg = sp.optimize.minimize(cdf_fun_np, Xinit,method='CG',jac = cdf_grad_np, options={'disp': None,'gtol': 1e-06, 'eps': 0})
t_end = time.time() - t_start

# Statistics
feas = M.Feas_eval(M.v2m(M.array2tensor(out_msg.x)))   # Feasibility
stationarity = np.linalg.norm(out_msg['jac'],2)   # stationarity

result_cg = [out_msg['fun'], out_msg['nit'], out_msg['nfev'],stationarity,feas, t_end]

# print results
print('Solver   fval         iter   f_eval   stationarity   feaibility     CPU time')
print('& CG     & {:.2e}  & {:}  & {:}    & {:.2e}     & {:.2e}     & {:.2f} \\\\'.format(*result_cg))
Solver   fval         iter   f_eval   stationarity   feaibility     CPU time
& CG     & 6.98e+04  & 103  & 166    & 4.00e-05     & 2.68e-07     & 0.95 \\

Reference#

  1. Gao Y, Sun D F. A majorized penalty approach for calibrating rank constrained correlation matrix problems[J]. Preprint available at http://www.math.nus.edu.sg/~matsundf/MajorPen_May5.pdf, 2010, 4(9): 17.