What sparsediff is
sparsediff is a thin R interface to the SparseDiffEngine
C library — the sparse Jacobian and Hessian differentiation backend used
by CVXPY for its Disciplined Nonlinear Programming (DNLP) extension. It
is the R analog of the Python sparsediffpy package and
wraps the same C library.
You build a nonlinear expression graph out of the
sd_* atom constructors, assemble the pieces into a
problem, and then evaluate, at any primal point,
- the objective value and its dense gradient,
- the constraint vector and its sparse Jacobian (in COO form), and
- the lower-triangular Lagrangian Hessian (in COO form).
This is a low-level backend: the intended user is a higher-level modelling layer such as CVXR, not an analyst writing models by hand. The walk-through below shows the moving parts.
library(sparsediff)
engine_version()
#> [1] "0.3.0"Building an expression graph
Leaves are created with sd_variable() (a slice of the
differentiation vector) and sd_parameter() (fixed data).
Everything else is an atom applied to other expressions. Each
constructor returns an expression handle — an external
pointer into the engine’s graph that frees itself when garbage
collected.
Here is f(x) = sum(exp(x)) for a length-3 variable, with
a single linear constraint g(x) = sum(x):
n <- 3L
x <- sd_variable(d1 = n, d2 = 1L, var_id = 0L, n_vars = n) # 0-based var offset
obj <- sd_sum(sd_exp(x), axis = -1L) # sum(exp(x)) (axis -1 = all)
g1 <- sd_sum(x, axis = -1L) # sum(x)var_id is the 0-based, column-major offset of the
variable’s first entry in the flat primal vector; n_vars is
the total number of variables. Reduction axis follows the
engine convention: -1 (all entries), 0 (down
rows), 1 (across columns).
Assembling a problem and its derivative oracle
prob <- sd_problem(objective = obj, constraints = list(g1), verbose = FALSE)
# initialise the (structural) derivative layout once
sd_init_derivatives(prob)
sd_init_jacobian_coo(prob)
sd_init_hessian_coo(prob)Sparsity patterns are structural and fixed after these
sd_init_* calls, so you query them once and reuse them;
only the values are recomputed at each point.
Evaluating value and derivatives
Evaluation is ordered: a forward pass populates the
node values at a primal point u, after which the gradient,
Jacobian values and Hessian values can be read.
u <- c(0, 0.5, 1)
sd_objective_forward(prob, u) # value of sum(exp(x))
#> [1] 5.367003
sd_gradient(prob) # gradient = exp(u)
#> [1] 1.000000 1.648721 2.718282
sd_constraint_forward(prob, u) # value of sum(x)
#> [1] 1.5The constraint Jacobian comes back in COO form (0-based row/column indices) plus the matching nonzero values:
sd_jacobian_sparsity(prob) # rows / cols / nrow / ncol
#> $rows
#> [1] 0 0 0
#>
#> $cols
#> [1] 0 1 2
#>
#> $nrow
#> [1] 1
#>
#> $ncol
#> [1] 3
sd_jacobian_values(prob) # d/dx sum(x) = (1, 1, 1)
#> [1] 1 1 1The Lagrangian Hessian is the lower triangle of
obj_w * Hess(f) + sum_i w[i] * Hess(g_i). Here the
objective weight is 1 and the constraint is linear (zero Hessian), so
the result is diag(exp(u)):
sd_hessian_sparsity(prob)
#> $rows
#> [1] 0 1 2
#>
#> $cols
#> [1] 0 1 2
#>
#> $nrow
#> [1] 3
#>
#> $ncol
#> [1] 3
sd_hessian_values(prob, obj_w = 1, w = 0)
#> [1] 1.000000 1.648721 2.718282Parameters and fast re-evaluation
A sd_parameter() is fixed data that can be
updated between evaluations without rebuilding the
graph — the basis for CVXPY/CVXR’s Disciplined Parametrized Programming
(DPP) fast path. Register the parameters once, then push new values with
sd_update_params():
p <- sd_parameter(d1 = 1L, d2 = 1L, param_id = 0L, n_vars = n, values = 2)
obj2 <- sd_sum(sd_scalar_mult(p, sd_exp(x)), axis = -1L) # theta * sum(exp(x))
prob2 <- sd_problem(obj2, constraints = list(), verbose = FALSE)
sd_register_params(prob2, list(p))
sd_init_derivatives(prob2)
sd_objective_forward(prob2, u) # theta = 2
#> [1] 10.73401
sd_update_params(prob2, 3) # theta -> 3
sd_objective_forward(prob2, u) # re-evaluated, no rebuild
#> [1] 16.10101