Debasish Das Santanu Das
debasish.das@verizon.com santanu.das@verizon.com
Big Data Analytics
Verizon
What is it: decompose the observed data (the rating matrix $R$):
$D(R | W,H)=\frac{1}{2}||R - WH||^{2}_{F} + \alpha_{l1w}||W||_{1} + \alpha_{l2w}||W||^{2}_{F} + \lambda_{l1h}||H||_{1} + \lambda_{l2h}||H||^{2}_{F}$
Fixed-point RALS algorithm: setting the gradient to zero yields an iterative update scheme for $W$ and $H$
The current effort aims to add the flexibility to impose additional constraints (e.g. bounds on variables, sparsity, equality constraints)
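For reference, a sketch of the resulting fixed-point updates, assuming only the $l_2$ penalties from the objective above (the $l_1$ terms and the extra constraints are what motivate the QP formulation that follows):
$$H \leftarrow (W^{T}W + 2\lambda_{l2h} I)^{-1} W^{T} R, \qquad W \leftarrow R H^{T} (H H^{T} + 2\alpha_{l2w} I)^{-1}$$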
$D(R | W,H)=\frac{1}{2}\sum^{n}_{i=1}||r_{i}-Wh_{i}||^{2}_{2}$
Solve $n$ independent problems:
Our contribution: given $W$, we compute the cluster probabilities ($H$) using a QP solver in Spark MLlib (note: this is the inner loop)
Objective $f(h): 0.5||r-Wh||^{2}_{2} \Rightarrow 0.5h^{T}(W^{T}W)h - (r^{T}W)h$
Constraints $g(z): z \geq 0$
ADMM formulation: $\text{minimize } f(h) + g(z) \;\; \text{s.t. } h = z$
ADMM Steps
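For reference, a sketch of the scaled-form ADMM iteration that the DirectQpSolver code below follows, with Gram matrix $W^{T}W$, linear term $q = -W^{T}r$, penalty $\rho$, over-relaxation parameter $\alpha$, and $\mathrm{prox}_{g/\rho}$ corresponding to the Proximal step in the code:
$$
\begin{aligned}
h^{k+1} &= (W^{T}W + \rho I)^{-1}\bigl(W^{T}r + \rho(z^{k}-u^{k})\bigr)\\
\hat h^{k+1} &= \alpha h^{k+1} + (1-\alpha) z^{k}\\
z^{k+1} &= \mathrm{prox}_{g/\rho}\bigl(\hat h^{k+1} + u^{k}\bigr)\\
u^{k+1} &= u^{k} + \hat h^{k+1} - z^{k+1}
\end{aligned}
$$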
class DirectQpSolver(nGram: Int,
    lb: Option[DoubleMatrix] = None, ub: Option[DoubleMatrix] = None,
    Aeq: Option[DoubleMatrix] = None,
    alpha: Double = 0.0, rho: Double = 0.0,
    addEqualityToGram: Boolean = false) {

  def solve(H: DoubleMatrix, q: DoubleMatrix,
      beq: Option[DoubleMatrix] = None): DoubleMatrix = {
    // Workspace Gram matrix: add the ADMM penalty rho to the diagonal
    wsH = H + rho*I
    solve(q, beq)
  }

  def solve(q: DoubleMatrix, beq: Option[DoubleMatrix]): DoubleMatrix = {
    // Dense Cholesky factorization, computed once and reused by all ADMM iterations
    val R = Decompose.cholesky(wsH)
    ADMM(R)
  }
  def ADMM(R: DoubleMatrix): DoubleMatrix = {
    rho = 1.0
    alpha = 1.0
    while (k <= MAX_ITERS) {
      // x-update using the cached Cholesky factor
      scale = rho*(z - u) - q
      // x = R \ (R' \ scale)
      x = solveTriangular(R, scale)
      // z-update with over-relaxation
      zold = (1 - alpha)*z
      xHat = alpha*x + zold
      z = xHat + u
      Proximal(z)
      // u-update (scaled dual variable)
      u = u + xHat - z
      if (converged(x, z, u)) return x
      k = k + 1
    }
  }
Objective transformation: $\text{minimize } t \;\; \text{s.t. } 0.5h^{T}(W^{T}W)h - (r^{T}W)h \leq t$
Constraints: $h \geq 0,\;\; A_{eq}h = b_{eq},\;\; Ah \leq b$
Quadratic constraint transformation
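One standard way to turn the quadratic epigraph constraint into the second-order cone form that a conic solver such as ECOS accepts: with the Cholesky factorization $W^{T}W = U^{T}U$ and the shorthand $s = t + (r^{T}W)h$ (both $U$ and $s$ are notation introduced here for this sketch),
$$
0.5\,h^{T}U^{T}Uh - (r^{T}W)h \le t
\;\Longleftrightarrow\;
\left\| \begin{bmatrix} \sqrt{2}\,Uh \\ s - 1 \end{bmatrix} \right\|_{2} \le s + 1
$$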
class QpSolver(nGram: Int, nLinear: Int = 0, diagonal: Boolean = false,
    Equalities: Option[CSCMatrix[Double]] = None,
    Inequalities: Option[CSCMatrix[Double]] = None,
    lbFlag: Boolean = false, ubFlag: Boolean = false) {
  NativeECOS.loadLibraryAndCheckErrors()

  def solve(H: DoubleMatrix, f: Array[Double]): (Int, Array[Double]) = {
    updateHessian(H)
    updateLinearObjective(f)
    // Hand the SOCP data to the native ECOS solver
    val status = NativeECOS.solveSocp(c, G, h, Aeq, beq, linear, cones, x)
    (status, x.slice(0, nGram))
  }
}
Application: image feature extraction / energy spectra, where negative coefficients or factors are counterintuitive

| | OCTAVE | ALS | ECOS | ADMM |
|---|---|---|---|---|
| Negative coefficients | 0 | 2 | 0 | 1 |
| RMSE | N.A. | 2.3e-2 | 3e-4 | 2.7e-4 |
//Spark Driver
val lb = DoubleMatrix.zeros(rank, 1)
val ub = DoubleMatrix.zeros(rank, 1).addi(1.0)
val directQpSolver = new DirectQpSolver(rank, Some(lb), Some(ub)).setProximal(ProjectBox)
val factors = Array.range(0, numUsers).map { index =>
// Compute the full XtX matrix from the lower-triangular part we got above
fillFullMatrix(userXtX(index), fullXtX)
val H = fullXtX.add(YtY.get.value)
val f = userXy(index).mul(-1)
val directQpResult = directQpSolver.solve(H, f)
directQpResult
}
//DirectQpSolver projection operator: clip each coordinate to the box [l, u]
def projectBox(z: Array[Double], l: Array[Double], u: Array[Double]): Unit = {
  for (i <- 0 until z.length) z.update(i, max(l(i), min(z(i), u(i))))
}
Application: signal recovery problems in which the original signal is known to have a sparse representation
| | OCTAVE | ALS | ECOS | ADMM |
|---|---|---|---|---|
| Non-sparse coefficients | 16 | 20 | 18 | 18 |
| RMSE | N.A. | 9e-2 | 2e-2 | 2e-2 |
//Spark Driver
val directQpSolverL1 = new DirectQpSolver(rank).setProximal(ProximalL1)
directQpSolverL1.setLambda(lambdaL1)
val factors = Array.range(0, numUsers).map { index =>
// Compute the full XtX matrix from the lower-triangular part we got above
fillFullMatrix(userXtX(index), fullXtX)
val H = fullXtX.add(YtY.get.value)
val f = userXy(index).mul(-1)
val directQpL1Result = directQpSolverL1.solve(H, f)
directQpL1Result
}
//DirectQpSolver proximal operator: soft-thresholding for the L1 penalty
def proximalL1(z: Array[Double], scale: Double): Unit = {
  for (i <- 0 until z.length)
    z.update(i, max(0, z(i) - scale) - max(0, -z(i) - scale))
}
Application: Support Vector Machine based models with sparse weight representations, or portfolio optimization

| | OCTAVE | ALS | ECOS | ADMM |
|---|---|---|---|---|
| Sum of coefficients | 1 | - | 0.99 | 1 |
| Non-sparse coefficients | 4 | - | 4 | 4 |
| RMSE | N.A. | 1.1 | 2e-4 | 5.5e-5 |
//Equality constraint x1 + x2 + ... + xr = 1
//Bound constraint is 0 ≤ x ≤ 1
val equalityBuilder = new CSCMatrix.Builder[Double](1, rank)
for (i <- 0 until rank) equalityBuilder.add(0, i, 1)
val qpSolverEquality =
new QpSolver(rank, 0, false,
Some(equalityBuilder.result), None, true, true)
qpSolverEquality.updateUb(Array.fill[Double](rank)(1.0))
qpSolverEquality.updateEquality(Array[Double](1.0))
val factors = Array.range(0, numUsers).map { index =>
fillFullMatrix(userXtX(index), fullXtX)
val H = fullXtX.add(YtY.get.value)
val f = userXy(index).mul(-1)
val qpEqualityResult = qpSolverEquality.solve(H, f)
qpEqualityResult
}
| | LS | ECOS | ADMM |
|---|---|---|---|
| Qp | (30)+(57) | (3826)+(6943) | (99)+(143) |
| QpPos | (98)+(320) | (6288)+(11975) | (265)+(2135) |
| QpBounds | (39)+(55) | (6709)+(11951) | (1556)+(1329) |
| QpL1 | (54)+(80) | (32171)+(58766) | (352)+(1593) |
| QpEquality | (63)+(133) | (5231)+(7912) | (14681)+(65893) |
def ADMM(R: DoubleMatrix): DoubleMatrix = {
  rho = 1.0
  alpha = 1.0
  while (k <= MAX_ITERS) {
    // x-update using the cached Cholesky factor
    scale = rho*(z - u) - q
    // x = R \ (R' \ scale)
    x = solveTriangular(R, scale)
    // z-update with over-relaxation
    zold = (1 - alpha)*z
    xHat = alpha*x + zold
    z = xHat + u
    Proximal(z)
    // u-update (scaled dual variable)
    u = u + xHat - z
    if (converged(x, z, u)) return x
    k = k + 1
  }
}
def ADMM(R: DoubleMatrix): DoubleMatrix = {
  rho = 1.0
  alpha = 1.0
  while (k <= MAX_ITERS) {
    zold = z
    uold = u
    alphaOld = alpha
    // x-update using the cached Cholesky factor
    scale = rho*(z - u) - q
    // x = R \ (R' \ scale)
    x = solveTriangular(R, scale)
    // z-update without relaxation
    z = x + u
    Proximal(z)
    // u-update (scaled dual variable)
    u = u + x - z
    updateRho(x, z, u)
    if (converged(x, z, u)) return x
    k = k + 1
    // Nesterov's acceleration
    alpha = (1 + sqrt(1 + 4*alphaOld*alphaOld))/2
    val alphaScaled = (alphaOld - 1)/alpha
    // z-acceleration
    z = z + (z - zold)*alphaScaled
    // u-acceleration
    u = u + (u - uold)*alphaScaled
  }
}
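For reference, the acceleration step in the sketch above assumes the standard Nesterov-type fast-ADMM recurrence:
$$
\alpha_{k+1} = \frac{1 + \sqrt{1 + 4\alpha_{k}^{2}}}{2}, \qquad
\hat z_{k+1} = z_{k+1} + \frac{\alpha_{k} - 1}{\alpha_{k+1}}\,(z_{k+1} - z_{k}), \qquad
\hat u_{k+1} = u_{k+1} + \frac{\alpha_{k} - 1}{\alpha_{k+1}}\,(u_{k+1} - u_{k})
$$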
Comparison of ECOS, MOSEK, ADMM and Accelerated ADMM for Matlab random matrices