c(sparsediff = requireNamespace("sparsediff", quietly = TRUE),
Uno = requireNamespace("Uno", quietly = TRUE))sparsediff Uno
TRUE TRUE
Recognize a smooth but nonconvex problem, send it down CVXR’s nonlinear path with psolve(nlp = TRUE), and understand why a nonlinear solve gives only a local guarantee — and what to do about it.
This chapter uses two CRAN packages beyond CVXR’s built-in solvers — sparsediff (automatic derivatives) and Uno (the nonlinear solver) — and you installed both in the setup. So it is hands-on, like every other chapter. (The ipopt backend is optional and not on CRAN; you do not need it.)
Everything so far stayed inside the convex world, where a solution is guaranteed global. Disciplined Nonlinear Programming (DNLP), new in CVXR 1.9.1, relaxes that: it handles smooth problems built from differentiable atoms, even when they are not convex. You build the problem the usual way, check it with is_dnlp(), and solve with psolve(prob, nlp = TRUE). Every DCP problem is also a DNLP, so the convex world sits inside this larger one.
c(sparsediff = requireNamespace("sparsediff", quietly = TRUE),
Uno = requireNamespace("Uno", quietly = TRUE))sparsediff Uno
TRUE TRUE
A convex problem sent down the nonlinear path just works:
x <- Variable(2)
prob <- Problem(Minimize(sum_squares(x - c(1, 2))))
is_dnlp(prob)[1] TRUE
psolve(prob, nlp = TRUE, solver = "UNO", logger = "SILENT")[1] 0
value(x) [,1]
[1,] 1
[2,] 2
Now something DCP cannot express. Maximizing a convex quadratic is nonconvex — but it is smooth, hence a DNLP. A classic instance is maximizing \(x^\top A x\) over the unit sphere, whose optimum is the largest eigenvalue of \(A\) — giving us a known answer to check against:
\[\text{maximize}\ \ x^\top A x \qquad \text{subject to}\qquad \|x\|_2^2 = 1.\]
set.seed(0)
n <- 3
A <- matrix(rnorm(n * n), n); A <- t(A) %*% A # random PSD matrix
x <- Variable(n)
prob <- Problem(Maximize(quad_form(x, A)), list(sum_squares(x) == 1))
c(is_dcp = is_dcp(prob), is_dnlp = is_dnlp(prob)) # not DCP, but DNLP is_dcp is_dnlp
FALSE TRUE
value(x) <- rep(1, n) # a starting point
opt <- psolve(prob, nlp = TRUE, solver = "UNO", logger = "SILENT")
c(dnlp = opt, max_eigenvalue = max(eigen(A, only.values = TRUE)$values)) dnlp max_eigenvalue
4.659154 4.659154
The nonlinear solver lands exactly on the largest eigenvalue — a nonconvex problem solved from R, in the same modeling language.
This power comes with a caveat you must respect. Unlike convex optimization, a nonlinear solver returns a local optimum that can depend on where it starts — there is no global guarantee. Consider \(\sin(x) + \tfrac{1}{10}x^2\) on \([-10, 10]\), which has several local minima:
set.seed(1)
x <- Variable(bounds = list(-10, 10))
prob <- Problem(Minimize(sin(x) + 0.1 * x^2))
best <- psolve(prob, nlp = TRUE, solver = "UNO", logger = "SILENT", best_of = 5)
c(best_objective = best, x = value(x))best_objective x
-0.7945823 -1.3064400
best_of = 5 solves from five random starting points and keeps the best — a simple, effective hedge against local minima. The lesson: convex problems give you the global optimum for free; nonconvex ones make you work for it (good starting points, random restarts, and a healthy skepticism of any single solve).
On a nonconvex problem an NLP solver may return a local (not global) optimum, give a result that depends on the starting point, or fail to find a feasible point. Supply a starting point with value(x) <- ..., or use best_of to try several.
DNLP adds classically differentiable atoms that are not part of the convex grammar — sin, cos, tan, sinh, tanh, asinh, atanh, normcdf, prod — and lets existing smooth atoms (exp, log, quad_form, quad_over_lin, …) be used in either curvature direction. Three predicates classify an expression under the smooth grammar:
x <- Variable()
classify <- function(e) c(smooth = is_smooth(e),
lin_convex = is_linearizable_convex(e),
lin_concave = is_linearizable_concave(e))
rbind(`sin(x)` = classify(sin(x)),
`abs(x)` = classify(abs(x)),
`-abs(x)` = classify(-abs(x))) smooth lin_convex lin_concave
sin(x) TRUE TRUE TRUE
abs(x) FALSE TRUE FALSE
-abs(x) FALSE FALSE TRUE
A smooth atom like sin(x) linearizes in both directions, so it may appear anywhere; nonsmooth convex/concave atoms keep the curvature DCP assigns them.
Exercise 1. See the local-optimum trap for yourself. Solve \(\sin(x) + \tfrac{1}{10}x^2\) from a single starting point at x0 = 4, then compare with best_of = 20. Do they agree?
x <- Variable(bounds = list(-10, 10))
prob <- Problem(Minimize(sin(x) + 0.1 * x^2))
value(x) <- 4 # a poor starting point
one <- psolve(prob, nlp = TRUE, solver = "UNO", logger = "SILENT")
set.seed(2)
many <- psolve(prob, nlp = TRUE, solver = "UNO", logger = "SILENT", best_of = 20)
c(from_x0_4 = one, best_of_20 = many) from_x0_4 best_of_20
0.8315586 0.8315586
The single solve from x0 = 4 gets stuck in a nearby local minimum (objective \(\approx 0.83\)); the random restarts escape it and find the global one (\(\approx -0.79\)). This is the whole story of nonconvex optimization in one example.
Exercise 2 (⭐). Turn on verbose = TRUE for one of the solves above and read UNO’s log. How is it different from the convex-solver output you saw in Chapter 2?
psolve(prob, nlp = TRUE, solver = "UNO", verbose = TRUE)You will see UNO reformulate the model and run a nonlinear iteration (a Newton or SQP method), rather than the cone solver’s interior-point iterations. Same verbose = TRUE switch, a different engine underneath.
psolve(nlp = TRUE); is_dnlp() checks membership, and every DCP problem qualifies.sin/cos) — but trades the global guarantee for a local one; use starting points and best_of.sparsediff + Uno (the practical R default; ipopt is optional and not on CRAN).For the full treatment — duals on the NLP path, random-restart details, the smooth atom table — see the DNLP Tutorial.