Solvers in Python
Daniel P. Palomar (2025). Portfolio Optimization: Theory and Application. Cambridge University Press.
Last update: February 10, 2025
This Python session will introduce different solvers in Python that can be used for portfolio optimization.
General solvers¶
General solvers can deal with arbitrary nonlinear optimization problems, at the expense of perhaps slow convergence and risk of numerical issues.
Package scipy¶
The scipy package offers several general-purpose optimization routines via the optimize module:
- minimize_scalar(): for one-dimensional function optimization, optionally restricted to an interval (for one-dimensional root finding use root_scalar(); a brief example is given after the plots below).
- minimize(): for general-purpose multivariate optimization.
import numpy as np
from scipy import optimize as opt
# Define a function
def foo(x):
    return np.exp(-.5 * x) * np.sin(10 * np.pi * x)
foo(0.5)
4.768779430809241e-16
res = opt.minimize_scalar(foo, bounds = [0, 1], method = "Bounded")
res
message: Solution found. success: True status: 0 fun: -0.8395633414865528 x: 0.34949347963438415 nit: 11 nfev: 11
# We can do it via a lambda (instead of a function definition):
res = opt.minimize_scalar(fun = lambda x: np.exp(-.5 * x) * np.sin(10 * np.pi * x),
bounds = [0, 1], method = "Bounded")
res
message: Solution found. success: True status: 0 fun: -0.8395633414865528 x: 0.34949347963438415 nit: 11 nfev: 11
# Plot using matplotlib
import matplotlib.pyplot as plt
x = np.linspace(0, 1, 1000)
plt.plot(x, foo(x))
plt.plot(res.x, res.fun, 'ro') # Plotting the point as before
plt.show()
# Plot using seaborn
import seaborn as sns
sns.set(style="darkgrid")
x = np.linspace(0, 1, 1000)
sns.lineplot(x=x, y=foo(x))
sns.scatterplot(x=[res.x], y=[res.fun], color="red", s=100)
<Axes: >
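Since root_scalar() was mentioned above for one-dimensional root finding, here is a minimal sketch (reusing foo() and opt from the cells above) that brackets a sign change of foo() around x = 0.3:
# find a root of foo() inside an interval where it changes sign (root expected at x = 0.3)
sol = opt.root_scalar(foo, bracket=[0.25, 0.35], method="brentq")
sol.root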
minimize(): General-purpose optimization function with 14 different optimization methods (see documentation):
- Nelder-Mead: uses the Simplex algorithm. This algorithm is robust in many applications. However, if numerical computation of derivatives can be trusted, algorithms using first and/or second derivative information might be preferred for their better performance in general.
- Powell: very fast, derivative-free implementation of Powell's method.
- CG: uses a nonlinear conjugate gradient algorithm by Polak and Ribiere, a variant of the Fletcher-Reeves method. Only the first derivatives are used.
- BFGS: uses the quasi-Newton method of Broyden, Fletcher, Goldfarb, and Shanno (BFGS). It uses the first derivatives only. BFGS has proven good performance even for non-smooth optimizations. This method also returns an approximation of the Hessian inverse.
- L-BFGS-B: uses the L-BFGS-B algorithm for bound-constrained minimization.
- TNC: uses a truncated Newton algorithm to minimize a function with variables subject to bounds. This algorithm uses gradient information; it is also called Newton Conjugate-Gradient. It differs from the Newton-CG method (described below) in that it wraps a C implementation and allows each variable to be given upper and lower bounds.
- Newton-CG: uses a Newton-CG algorithm (also known as the truncated Newton method). It uses a CG method to compute the search direction.
- COBYLA: uses the Constrained Optimization BY Linear Approximation (COBYLA) method. The algorithm is based on linear approximations to the objective function and each constraint. The method wraps a FORTRAN implementation of the algorithm. The constraint functions 'fun' may return either a single number or an array or list of numbers.
- SLSQP: uses Sequential Least SQuares Programming to minimize a function of several variables with any combination of bounds, equality and inequality constraints. The method wraps the SLSQP Optimization subroutine originally implemented by Dieter Kraft. Note that the wrapper handles infinite values in bounds by converting them into large floating values.
- trust-constr: is a trust-region algorithm for constrained optimization. It switches between two implementations depending on the problem definition. It is the most versatile constrained minimization algorithm implemented in SciPy and the most appropriate for large-scale problems. For equality-constrained problems it is an implementation of the Byrd-Omojokun trust-region SQP method.
- trust-ncg: uses the Newton conjugate gradient trust-region algorithm for unconstrained minimization. This algorithm requires the gradient and either the Hessian or a function that computes the product of the Hessian with a given vector. Suitable for large-scale problems.
- trust-krylov: uses the Newton GLTR trust-region algorithm for unconstrained minimization. This algorithm requires the gradient and either the Hessian or a function that computes the product of the Hessian with a given vector. Suitable for large-scale problems. On indefinite problems it usually requires fewer iterations than the trust-ncg method and is recommended for medium and large-scale problems.
- trust-exact: is a trust-region method for unconstrained minimization in which quadratic subproblems are solved almost exactly. This algorithm requires the gradient and the Hessian (which is not required to be positive definite). In many situations it converges in fewer iterations than the other Newton-type methods and is the most recommended for small and medium-size problems.
- dogleg: uses the dog-leg trust-region algorithm for unconstrained minimization. This algorithm requires the gradient and Hessian; furthermore, the Hessian is required to be positive definite.
Examples¶
This example does a least squares fitting: $$ \begin{array}{ll} \underset{a,b}{\textsf{minimize}} & \sum\limits_{i=1}^{m} \left(y_i - (a+bx_i)\right)^2 \end{array} $$
import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt
# data points to be fitted
x = 1 + np.arange(6)
y = np.array([1, 3, 5, 6, 8, 12])
# l2-norm error function
def obj_fun(param):
    return np.sum((y - (param[0] + param[1] * x))**2)
# call solver with initial value [0, 1]
res = minimize(obj_fun, x0 = np.array([0, 1]), method = 'Nelder-Mead')
print(res)
message: Optimization terminated successfully. success: True status: 0 fun: 2.819047623875957 x: [-1.267e+00 2.029e+00] nit: 64 nfev: 126 final_simplex: (array([[-1.267e+00, 2.029e+00], [-1.267e+00, 2.029e+00], [-1.267e+00, 2.029e+00]]), array([ 2.819e+00, 2.819e+00, 2.819e+00]))
plt.plot(x, y, 'ko') # plot linear regression
plt.plot(x, res.x[0] + res.x[1] * x, 'k-')
[<matplotlib.lines.Line2D at 0x11acfa8b0>]
The next example illustrates the use of the gradient with the famous Rosenbrock banana function: $$ f(\mathbf{x}) = 100(x_2-x_1^2)^2 + (1-x_1)^2 $$ with gradient $$ \nabla f(\mathbf{x}) = \left[ \begin{array}{c} -400x_1(x_2-x_1^2) -2(1-x_1)\\ 200(x_2-x_1^2) \end{array}\right] $$ with the unconstrained minimization problem $$ \begin{array}{ll} \underset{\mathbf{x}\in \mathbb{R}^2}{\textsf{minimize}} & 100(x_2-x_1^2)^2 + (1-x_1)^2 \end{array} $$
# Rosenbrock banana function and its gradient
def f_banana(x):
    return 100 * (x[1] - x[0] * x[0])**2 + (1 - x[0])**2

def gr_banana(x):
    return np.array([-400 * x[0] * (x[1] - x[0] * x[0]) - 2 * (1 - x[0]),
                     200 * (x[1] - x[0] * x[0])])
res = minimize(f_banana, np.array([-1.2, 1]), method='Nelder-Mead')
res
message: Optimization terminated successfully. success: True status: 0 fun: 8.177661197416674e-10 x: [ 1.000e+00 1.000e+00] nit: 85 nfev: 159 final_simplex: (array([[ 1.000e+00, 1.000e+00], [ 1.000e+00, 1.000e+00], [ 1.000e+00, 1.000e+00]]), array([ 8.178e-10, 1.108e-09, 1.123e-09]))
res = minimize(f_banana, np.array([-1.2, 1]), method='BFGS', jac = gr_banana)
res
message: Optimization terminated successfully. success: True status: 0 fun: 2.535309047218261e-15 x: [ 1.000e+00 1.000e+00] nit: 32 jac: [-1.705e-06 8.233e-07] hess_inv: [[ 5.019e-01 1.003e+00] [ 1.003e+00 2.010e+00]] nfev: 39 njev: 39
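Methods such as Newton-CG and the trust-region variants can additionally exploit second-order information. As a sketch, the Hessian of the banana function (obtained by differentiating the gradient above) can be supplied via the hess argument:
# Hessian of the Rosenbrock banana function
def hess_banana(x):
    return np.array([[1200 * x[0]**2 - 400 * x[1] + 2, -400 * x[0]],
                     [-400 * x[0], 200]])

res = minimize(f_banana, np.array([-1.2, 1]), method='Newton-CG',
               jac=gr_banana, hess=hess_banana)
res.x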
This example uses the basin-hopping method (for global optimization):
from scipy import optimize as opt
# "wild" function , global minimum at about -15.81515
def f_wild(x):
return(10*np.sin(0.3*x) * np.sin(1.3 * x**2) + 0.00001 * x**4 + 0.2 * x + 80)
t = np.linspace(-50, 50, 1000)
res = opt.basinhopping(f_wild, 50, niter = 500)
res
message: ['requested number of basinhopping iterations completed successfully'] success: True fun: 67.46773474158677 x: [-1.582e+01] nit: 500 minimization_failures: 72 nfev: 12816 njev: 6044 lowest_optimization_result: message: Optimization terminated successfully. success: True status: 0 fun: 67.46773474158677 x: [-1.582e+01] nit: 2 jac: [ 3.815e-06] hess_inv: [[ 5.917e-05]] nfev: 12 njev: 6
res = minimize(f_wild, res.x, method='BFGS') # now improve locally (typically only by a small bit):
res
message: Optimization terminated successfully. success: True status: 0 fun: 67.46773474158677 x: [-1.582e+01] nit: 0 jac: [ 3.815e-06] hess_inv: [[1]] nfev: 2 njev: 1
plt.plot(t, f_wild(t), 'k-')
plt.plot(res.x, f_wild(res.x), 'ro')
[<matplotlib.lines.Line2D at 0x11ad862e0>]
Note that some methods accept bounds on the optimization variable via the argument bounds. However, make sure to verify in the documentation whether the particular method accepts bounds or not.
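For instance, SLSQP accepts both bounds and general constraints (passed as a list of dictionaries); here is a minimal sketch with made-up data, minimizing a simple quadratic subject to a sum-to-one equality constraint and nonnegativity bounds:
# toy problem: minimize x^T S x subject to sum(x) = 1 and x >= 0 (made-up matrix S)
S = np.array([[0.10, 0.02, 0.01],
              [0.02, 0.08, 0.03],
              [0.01, 0.03, 0.12]])
res = minimize(lambda x: x @ S @ x,
               x0=np.ones(3) / 3,
               method='SLSQP',
               bounds=[(0, None)] * 3,
               constraints=[{'type': 'eq', 'fun': lambda x: np.sum(x) - 1}])
res.x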
Solvers for specific classes of problems¶
If the problem to be solved falls within a particular class of problems, such as LS, LP, MILP, QP, SOCP, or SDP, then it is much better to use a solver specific for that class.
Least Squares (LS)¶
A linear least-squares (LS) problem minimizes the $\ell_2$-norm $\|\mathbf{A}\mathbf{x}-\mathbf{b}\|_2$, possibly with bounds or linear constraints. The function scipy.optimize.lsq_linear from scipy solves over- and underdetermined linear systems in the least-squares sense.
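As a quick sketch (with made-up data), lsq_linear() can also impose simple bounds on the solution via the bounds argument, e.g., for a nonnegative least-squares fit:
import numpy as np
from scipy.optimize import lsq_linear

np.random.seed(42)
A = np.random.randn(20, 5)
b = np.random.randn(20)
# nonnegative least squares: 0 <= x < infinity
res = lsq_linear(A, b, bounds=(0, np.inf))
res.x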
Linear Programming (LP)¶
The package scipy contains the function scipy.optimize.linprog (see documentation) to conveniently solve LPs of the form (equality constraints can also be included via the arguments A_eq and b_eq):
$$
\begin{array}{ll}
\underset{\mathbf{x}\in \mathbb{R}^n}{\textsf{minimize}} & \mathbf{c}^T\mathbf{x}\\
{\textsf{subject to}} & \begin{array}[t]{l} \mathbf{A}\mathbf{x} \le \mathbf{b}\\
\mathbf{l} \le \mathbf{x} \le \mathbf{u}.\end{array}\\
\end{array}
$$
Let's consider the following LP: $$ \begin{array}{ll} \underset{x_0, x_1}{\textsf{minimize}} & 4 x_1 - x_0\\ {\textsf{subject to}} & \begin{array}[t]{l} x_1 - 3 x_0 \leq 6\\ x_0 + 2x_1 \leq 4\\ x_1 \geq -3 \\ -\infty \leq x_0 \leq \infty \end{array} \end{array} $$
from scipy.optimize import linprog
c = [-1, 4]
A = [[-3, 1], [1, 2]]
b = [6, 4]
x0_bounds = (None, None)
x1_bounds = (-3, None)
res = linprog(c, A_ub=A, b_ub=b, bounds=(x0_bounds, x1_bounds), options={"disp": True})
res
Running HiGHS 1.2.0 [date: 2021-07-09, git hash: n/a] Copyright (c) 2022 ERGO-Code under MIT licence terms Presolving model 0 rows, 0 cols, 0 nonzeros 0 rows, 0 cols, 0 nonzeros Presolve : Reductions: rows 0(-2); columns 0(-2); elements 0(-4) - Reduced to empty Solving the original LP from the solution after postsolve Model status : Optimal Objective value : -2.2000000000e+01 HiGHS run time : 0.00
message: Optimization terminated successfully. (HiGHS Status 7: Optimal) success: True status: 0 fun: -22.0 x: [ 1.000e+01 -3.000e+00] nit: 0 lower: residual: [ inf 0.000e+00] marginals: [ 0.000e+00 6.000e+00] upper: residual: [ inf inf] marginals: [ 0.000e+00 0.000e+00] eqlin: residual: [] marginals: [] ineqlin: residual: [ 3.900e+01 0.000e+00] marginals: [-0.000e+00 -1.000e+00] mip_node_count: 0 mip_dual_bound: 0.0 mip_gap: 0.0
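As mentioned above, linprog() also accepts equality constraints; here is a minimal sketch reusing the data above and adding the (made-up) equality constraint $x_0 + x_1 = 5$:
# same LP as above plus the equality constraint x_0 + x_1 = 5
A_eq = [[1, 1]]
b_eq = [5]
res = linprog(c, A_ub=A, b_ub=b, A_eq=A_eq, b_eq=b_eq,
              bounds=(x0_bounds, x1_bounds))
res.x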
Quadratic Programming (QP)¶
The package quadprog
contains the function solve_qp
to conveniently solve QPs of the form:
$$
\begin{array}{ll}
\underset{\mathbf{x}\in \mathbb{R}^n}{\textsf{minimize}} &
-\mathbf{a}^T\mathbf{x} + \frac{1}{2}\mathbf{x}^T\mathbf{G}\mathbf{x}\\
{\textsf{subject to}} & \mathbf{C}^T\mathbf{x} \geq \mathbf{b}\\
\end{array}
$$
from quadprog import solve_qp # pip install quadprog
import numpy as np
# Set up problem:
#     minimize    -(0 5 0) x + 1/2 x^T x
#     subject to  A x >= b
# with b = (-8, 2, 0)^T
#               (-4  2  0)
# and       A = (-3  1 -2)
#               ( 0  0  1)
# Note: solve_qp() expects the constraint in the form C^T x >= b, so we pass C = A^T.
Gmat = np.eye(3)
avec = np.array([0., 5., 0.])
Cmat = np.array([[-4., -3., 0.],
[2., 1., 0.],
[0., -2., 1.]])
bvec = np.array([-8., 2., 0.])
# run solver
res = solve_qp(Gmat, avec, Cmat, bvec)
res
(array([0., 5., 0.]), -12.5, array([0., 5., 0.]), array([1, 0], dtype=int32), array([0., 0., 0.]), array([], dtype=int32))
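The first element of the returned tuple is the solution vector. Since this is a portfolio book, here is a sketch (with a made-up covariance matrix) of a long-only minimum-variance portfolio written in this form, i.e., minimize $\frac{1}{2}\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}$ subject to $\mathbf{1}^T\mathbf{w}=1$ and $\mathbf{w}\ge\mathbf{0}$ (the argument meq=1 marks the first constraint as an equality):
# long-only minimum-variance portfolio (made-up covariance matrix)
Sigma = np.array([[0.10, 0.02, 0.01],
                  [0.02, 0.08, 0.03],
                  [0.01, 0.03, 0.12]])
n = Sigma.shape[0]
# constraints C^T w >= b: first column encodes 1^T w = 1 (equality), the rest encode w >= 0
Cmat = np.column_stack([np.ones(n), np.eye(n)])
bvec = np.concatenate([[1.0], np.zeros(n)])
w = solve_qp(Sigma, np.zeros(n), Cmat, bvec, meq=1)[0]
w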
The package osqp offers an object-oriented way to solve QPs written in the following form: $$ \begin{array}{ll} \underset{\mathbf{x}}{\mbox{minimize}} & \frac{1}{2} \mathbf{x}^T \mathbf{P} \mathbf{x} + \mathbf{q}^T \mathbf{x} \\ \mbox{subject to} & \mathbf{l} \leq \mathbf{A} \mathbf{x} \leq \mathbf{u} \end{array} $$
import osqp # pip install osqp
import numpy as np
import scipy as sp
from scipy import sparse
# Generate problem data
np.random.seed(1)
m = 30
n = 20
Ad = sparse.random(m, n, density=0.7, format='csc')
b = np.random.randn(m)
# OSQP data
P = sparse.block_diag([sparse.csc_matrix((n, n)), sparse.eye(m)], format='csc')
q = np.zeros(n+m)
A = sparse.vstack([
sparse.hstack([Ad, -sparse.eye(m)]),
sparse.hstack([sparse.eye(n), sparse.csc_matrix((n, m))])], format='csc')
l = np.hstack([b, np.zeros(n)])
u = np.hstack([b, np.ones(n)])
# Create an OSQP object
prob = osqp.OSQP()
# Setup workspace
prob.setup(P, q, A, l, u)
res = prob.solve() # Solve problem
res
<osqp.OSQP_results at 0x11ade12f0>
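The returned object bundles the solution and solver information; with recent versions of osqp the primal solution and the status can typically be accessed as follows (recall that the first n entries of the stacked variable correspond to the original x):
res.info.status   # solver status string, e.g. 'solved'
res.x[:n]         # primal solution (first n entries of the stacked variable)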
Second-Order Cone Programming (SOCP)¶
There are several packages:
- Package ecos-python provides an interface to the
Embedded COnic Solver (ECOS), a well-known, efficient, and robust C library for convex problems. Conic and equality constraints can be specified in addition to integer and boolean variable constraints for mixed-integer problems.
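As a rough sketch of the low-level interface (the exact calling convention should be double-checked in the ecos-python documentation; ecos.solve(c, G, h, dims) is assumed here), an SOCP is passed in conic form: minimize $\mathbf{c}^T\mathbf{x}$ subject to $\mathbf{h} - \mathbf{G}\mathbf{x}$ lying in a product of cones whose sizes are listed in a dims dictionary. The toy problem below minimizes $x_1 + x_2$ subject to $\|(x_1, x_2)\|_2 \le 1$:
import ecos                      # pip install ecos
import numpy as np
from scipy import sparse

# s = h - G x must satisfy s[0] >= ||s[1:]||_2, with s = (1, x1, x2)
c = np.array([1.0, 1.0])
G = sparse.csc_matrix(np.array([[ 0.0,  0.0],
                                [-1.0,  0.0],
                                [ 0.0, -1.0]]))
h = np.array([1.0, 0.0, 0.0])
dims = {'l': 0, 'q': [3]}        # no linear cone, one second-order cone of size 3
sol = ecos.solve(c, G, h, dims)  # assumed interface; returns a dict with key 'x'
sol['x']                         # expected to be roughly (-0.707, -0.707)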
Cone Programming and Semi-Definite Programming (SDP)¶
There are several good packages:
- Package Irene was originally written to find reliable approximations to the optimum value of an arbitrary optimization problem. It implements a modification of Lasserre's SDP relaxations based on the generalized truncated moment problem to handle general optimization problems algebraically.
- Package mosek contains a Python
interface to the (commercial) MOSEK optimization library for large-scale LP, QP, and MIP problems, with emphasis on (nonlinear) conic, semidefinite, and convex tasks; academic licenses are available.
- Package scs is a numerical optimization package for solving large-scale
convex cone problems, based on the paper Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding.
Mixed Integer Programming¶
- Package mip
is a collection of Python tools for the modeling and solution of Mixed-Integer Linear programs (MIPs). It provides access to advanced solver features like cut generation, lazy constraints, MIP starts and solution pools.
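As a small sketch of the modeling style (a toy knapsack problem loosely following the package documentation, with made-up data):
from mip import Model, xsum, maximize, BINARY  # pip install mip

# toy knapsack: choose items maximizing profit subject to a weight budget
p = [10, 13, 18, 31, 7, 15]   # profits
w = [11, 15, 20, 35, 10, 33]  # weights
capacity = 47

m = Model("knapsack")
x = [m.add_var(var_type=BINARY) for _ in range(len(p))]
m.objective = maximize(xsum(p[i] * x[i] for i in range(len(p))))
m += xsum(w[i] * x[i] for i in range(len(w))) <= capacity
m.optimize()
selected = [i for i in range(len(p)) if x[i].x >= 0.99]
selected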
Optimization infrastructure packages¶
CVXPY for Convex Problems¶
CVXPY is a Python package that provides an object-oriented modeling language for convex optimization, similar to CVX, CVXR, YALMIP, and Convex.jl. It allows the user to formulate convex optimization problems in a natural mathematical syntax rather than the restrictive standard form required by most solvers. The user specifies an objective and a set of constraints by combining constants, variables, and parameters using a library of functions with known mathematical properties. CVXPY then applies disciplined convex programming (DCP) rules to verify the problem's convexity. Once verified, the problem is converted into standard conic form using graph implementations and passed to a cone solver such as ECOS or SCS. See the CVX tutorials for details. Let's now see several examples.
Least squares
Let's start with a simple LS example: $$ \begin{array}{ll} \underset{\boldsymbol{\beta}}{\textsf{minimize}} & \|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|_2^2\\ \end{array} $$
Let's first solve it via SciPy's lsq_linear:
import numpy as np
from scipy.optimize import lsq_linear
# generate data
m = 100
n = 10
beta_true = np.arange(10) - 4
X = np.random.normal(size=n*m).reshape((m,-1))
y = X @ beta_true + np.random.normal(size=m)
# estimate x
res = lsq_linear(X, y)
res
message: The unconstrained solution is optimal. success: True status: 3 fun: [-1.111e+00 1.494e+00 ... 2.154e-01 -6.792e-01] x: [-4.016e+00 -3.009e+00 -2.134e+00 -6.944e-01 -5.416e-02 7.346e-01 1.870e+00 3.207e+00 3.844e+00 5.001e+00] nit: 0 cost: 50.55498554873578 optimality: 5.650672290843415e-13 active_mask: [ 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00] unbounded_sol: (array([-4.016e+00, -3.009e+00, -2.134e+00, -6.944e-01, -5.416e-02, 7.346e-01, 1.870e+00, 3.207e+00, 3.844e+00, 5.001e+00]), array([ 1.011e+02]), 10, array([ 1.224e+01, 1.170e+01, 1.101e+01, 1.035e+01, 1.006e+01, 9.406e+00, 9.177e+00, 9.033e+00, 7.789e+00, 7.247e+00]))
Now let's do it with CVXPY:
import cvxpy as cp # pip install cvxpy
import numpy as np
# define variable
beta = cp.Variable(n)
# define objective
objective = cp.Minimize(cp.sum_squares(X @ beta - y))
# create problem
prob = cp.Problem(objective)
# solve it!
result = prob.solve()
beta.value
array([-4.01608366, -3.00908591, -2.13356382, -0.69438355, -0.05416262, 0.73455804, 1.86975671, 3.20690015, 3.84395209, 5.00103773])
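Besides the value returned by solve(), the Problem object itself exposes the solver status and the optimal objective value:
prob.status   # e.g. 'optimal'
prob.value    # optimal objective value (same as the result returned by solve())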
We can now easily add a constraint to solve a nonnegative LS:
constraints = [beta >= 0]
prob = cp.Problem(objective, constraints)
result = prob.solve()
beta.value
array([-5.26194620e-17, -5.73574109e-17, -7.18346976e-17, -1.11599212e-16, 4.05790128e-01, 1.60249957e+00, 1.93370680e+00, 2.53939243e+00, 3.92161170e+00, 5.76468454e+00])
We can add other constraints:
constraint1 = beta[2] + beta[3] <= 0
A = np.diag(np.concatenate([np.array([1, 0, 0]), np.array([1] * 7)]))
print(A)
[[1 0 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 0 0 0] [0 0 0 1 0 0 0 0 0 0] [0 0 0 0 1 0 0 0 0 0] [0 0 0 0 0 1 0 0 0 0] [0 0 0 0 0 0 1 0 0 0] [0 0 0 0 0 0 0 1 0 0] [0 0 0 0 0 0 0 0 1 0] [0 0 0 0 0 0 0 0 0 1]]
constraint2 = A @ beta >= 0
# define problem and solve it
prob = cp.Problem(objective, constraints = [constraint1, constraint2])
result = prob.solve()
beta.value
array([-8.12622880e-17, -2.55930726e+00, -2.38179431e+00, -7.57895729e-18, 5.69530272e-01, 1.56146521e+00, 1.83898696e+00, 2.49285673e+00, 3.68068355e+00, 5.85268505e+00])
Robust Huber regression
Let's consider another simple example of robust regression: $$ \begin{array}{ll} \underset{\boldsymbol{\beta}}{\textsf{minimize}} & \sum\limits_{i=1}^m \phi(y_i - \mathbf{x}_i^T\boldsymbol{\beta}) \end{array} $$ where $$ \phi(u) = \begin{cases} u^2 & \text{if }|u| \le M \\ 2Mu-M^2 & \text{if }|u| > M \end{cases} $$
M = 0.1
beta = cp.Variable(n)
obj = cp.sum(cp.huber(y - X @ beta, M))
cp.Problem(cp.Minimize(obj)).solve()
beta.value
array([-4.17042853, -3.04044983, -2.10513397, -0.71206119, -0.01068557, 0.76251828, 1.8334574 , 3.23198934, 3.77406989, 4.90310484])
Elastic net regularization
We will now solve the elastic net regularized problem: $$ \begin{array}{ll} \underset{\boldsymbol{\beta}}{\textsf{minimize}} & \|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|_2^2 + \lambda \left((1-\alpha)\|\boldsymbol{\beta}\|_2^2 + \alpha\|\boldsymbol{\beta}\|_1\right) \end{array} $$
# define the regularization term
def elastic_reg(beta, lmd = 0, alpha = 0):
    ridge = (1 - alpha) * cp.sum(beta ** 2)
    lasso = alpha * cp.norm(beta, 1)
    return lmd * (lasso + ridge)
# define problem and solve it
alpha = 0.5
lmd = 0.5
obj = cp.sum((y - X @ beta)**2) + elastic_reg(beta, lmd, alpha)
cp.Problem(cp.Minimize(obj)).solve()
beta.value
array([-4.00380815, -2.99831083, -2.12711484, -0.69208925, -0.05285668, 0.73604469, 1.86351502, 3.19408019, 3.83259749, 4.9906979 ])
Sparse inverse covariance estimation
Consider the matrix-valued convex problem: $$ \begin{array}{ll} \underset{\mathbf{X}}{\textsf{maximize}} & \textsf{logdet}(\mathbf{X}) - \textsf{Tr}(\mathbf{X}\mathbf{S})\\ \textsf{subject to} & \mathbf{X} \succeq \mathbf{0}, \qquad \sum_{i=1}^n\sum_{j=1}^n\left|X_{ij}\right| \le \alpha \end{array} $$
n = 3
np.random.seed(1)
S = np.random.randn(n, n)
X = cp.Variable((n, n), PSD=True)
obj = cp.log_det(X) - cp.trace(X @ S)
alpha = 1
constr = [cp.sum(cp.abs(X)) <= alpha]
cp.Problem(cp.Maximize(obj), constr).solve()
X.value
array([[ 2.64619429e-01, 1.06538048e-07, -3.78401285e-08], [ 1.06538048e-07, 3.31118026e-01, 8.97513153e-08], [-3.78401285e-08, 8.97513153e-08, 4.04255112e-01]])
np.linalg.eig(X.value)[0] # compute the eigenvalues of X to verify it is PD
array([0.26461943, 0.33111803, 0.40425511])
Conclusion¶
The solvers available in Python should be enough for optimization problems in finance.
The following steps are recommended when approaching an optimization problem:
- If the problem is convex, then start with CVXPY for initial tests.
- When faster speed is required, then:
- If the problem belongs to one of the well defined classes, such as LP, MILP, QP, SOCP, or SDP, use a solver specific for that class (for example, for LPs we recommend scipy.optimize.linprog and for QPs quadprog.solve_qp).
- However, if the problem does not fit any of these classes, then one has to resort to a general nonlinear solver such as the function minimize(). In that case, only a local solution is guaranteed. If a global solution is required, one may use the global optimization functions available in SciPy (e.g., basinhopping()).