Solvers in Python

Daniel P. Palomar (2025). Portfolio Optimization: Theory and Application. Cambridge University Press.

Last update: February 8, 2026

Contributors:


This Python session will introduce different solvers in Python that can be used for portfolio optimization.

General solvers

General solvers can deal with arbitrary nonlinear optimization problems, at the expense of potentially slow convergence and a risk of numerical issues.

Package SciPy

The SciPy library offers several general‑purpose optimization routines via the optimize module.

  • minimize_scalar(): one-dimensional minimization, either unconstrained or over a bounded interval (for one-dimensional root finding use root_scalar()).
  • minimize(): unified interface to many local optimization methods (e.g., Nelder–Mead, Powell, CG, BFGS/L-BFGS-B, TNC, SLSQP, trust-region methods). It supports unconstrained and constrained problems, optional gradients/Hessians, and method-, bounds-, and constraint-selection via keyword arguments (see documentation).
import numpy as np
from scipy import optimize as opt

# Define a function
def foo(x):
    return np.exp(-.5 * x) * np.sin(10 * np.pi * x)

foo(0.5)
np.float64(4.768779430809241e-16)
res = opt.minimize_scalar(foo, bounds = [0, 1], method = "Bounded")
res
 message: Solution found.
 success: True
  status: 0
     fun: -0.8395633414865528
       x: 0.34949347963438415
     nit: 11
    nfev: 11
# We can do it via a lambda (instead of a function definition):
res = opt.minimize_scalar(fun = lambda x: np.exp(-.5 * x) * np.sin(10 * np.pi * x), 
                          bounds = [0, 1], method = "Bounded")
res
 message: Solution found.
 success: True
  status: 0
     fun: -0.8395633414865528
       x: 0.34949347963438415
     nit: 11
    nfev: 11
# Plot
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns  # only for styling
sns.set_theme(style="darkgrid")  # or sns.set_style("darkgrid")
# or, if you prefer: plt.style.use("seaborn-v0_8-darkgrid")

x = np.linspace(0, 1, 1000)
y = foo(x)

plt.figure()
plt.plot(x, y)
plt.plot(res.x, res.fun, "ro")
plt.show()

Examples

This example does a least squares fitting: $$ \begin{array}{ll} \underset{a,b}{\textsf{minimize}} & \sum\limits_{i=1}^{m} \left(y_i - (a+bx_i)\right)^2 \end{array} $$

import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt

# data points to be fitted
x = 1 + np.arange(6)
y = np.array([1, 3, 5, 6, 8, 12])

# l2-norm error function
def obj_fun(param):
    return np.sum((y - (param[0] + param[1] * x))**2)

# call solver with initial value [0, 1]
res = minimize(obj_fun, x0 = np.array([0, 1]), method = 'Nelder-Mead')
print(res)
       message: Optimization terminated successfully.
       success: True
        status: 0
           fun: 2.819047623875957
             x: [-1.267e+00  2.029e+00]
           nit: 64
          nfev: 126
 final_simplex: (array([[-1.267e+00,  2.029e+00],
                       [-1.267e+00,  2.029e+00],
                       [-1.267e+00,  2.029e+00]]), array([ 2.819e+00,  2.819e+00,  2.819e+00]))
plt.plot(x, y, 'ko')  # plot the data points
plt.plot(x, res.x[0] + res.x[1] * x, 'k-')  # plot the fitted regression line
[<matplotlib.lines.Line2D at 0x12261fe30>]

The next example illustrates the use of the gradient with the famous Rosenbrock banana function: $$ f(\mathbf{x}) = 100(x_2-x_1^2)^2 + (1-x_1)^2 $$ with gradient $$ \nabla f(\mathbf{x}) = \left[ \begin{array}{c} -400x_1(x_2-x_1^2) -2(1-x_1)\\ 200(x_2-x_1^2) \end{array}\right] $$ with the unconstrained minimization problem $$ \begin{array}{ll} \underset{\mathbf{x}\in \mathbb{R}^2}{\textsf{minimize}} & 100(x_2-x_1^2)^2 + (1-x_1)^2 \end{array} $$

# Rosenbrock banana function and its gradient
def f_banana(x):
    return 100 * (x[1] - x[0]**2)**2 + (1 - x[0])**2

def gr_banana(x):
    return np.array([-400 * x[0] * (x[1] - x[0]**2) - 2 * (1 - x[0]),
                     200 * (x[1] - x[0]**2)])
res = minimize(f_banana, np.array([-1.2, 1]), method='Nelder-Mead')
res
       message: Optimization terminated successfully.
       success: True
        status: 0
           fun: 8.177661197416674e-10
             x: [ 1.000e+00  1.000e+00]
           nit: 85
          nfev: 159
 final_simplex: (array([[ 1.000e+00,  1.000e+00],
                       [ 1.000e+00,  1.000e+00],
                       [ 1.000e+00,  1.000e+00]]), array([ 8.178e-10,  1.108e-09,  1.123e-09]))
res = minimize(f_banana, np.array([-1.2, 1]), method='BFGS', jac = gr_banana)
res
  message: Optimization terminated successfully.
  success: True
   status: 0
      fun: 2.5353073555215442e-15
        x: [ 1.000e+00  1.000e+00]
      nit: 32
      jac: [-1.705e-06  8.233e-07]
 hess_inv: [[ 5.019e-01  1.003e+00]
            [ 1.003e+00  2.010e+00]]
     nfev: 39
     njev: 39

This example uses the basin-hopping method for global optimization (SciPy's replacement for the deprecated simulated annealing routine):

from scipy import optimize as opt

# "wild" function , global minimum at about -15.81515
def f_wild(x):
    return(10*np.sin(0.3*x) * np.sin(1.3 * x**2) + 0.00001 * x**4 + 0.2 * x + 80)
t = np.linspace(-50, 50, 1000)

res = opt.basinhopping(f_wild, 50, niter = 500)
res
                    message: ['requested number of basinhopping iterations completed successfully']
                    success: True
                        fun: 69.32479803451044
                          x: [-2.566e+01]
                        nit: 500
      minimization_failures: 86
                       nfev: 14561
                       njev: 6801
 lowest_optimization_result:  message: Optimization terminated successfully.
                              success: True
                               status: 0
                                  fun: 69.32479803451044
                                    x: [-2.566e+01]
                                  nit: 5
                                  jac: [ 0.000e+00]
                             hess_inv: [[ 2.274e-05]]
                                 nfev: 16
                                 njev: 8
res = minimize(f_wild, res.x, method='BFGS') # now improve locally (typically only by a small bit):
res
  message: Optimization terminated successfully.
  success: True
   status: 0
      fun: 69.32479803451044
        x: [-2.566e+01]
      nit: 0
      jac: [ 0.000e+00]
 hess_inv: [[1]]
     nfev: 2
     njev: 1
plt.plot(t, f_wild(t), 'k-')
plt.plot(res.x, f_wild(res.x), 'ro')
[<matplotlib.lines.Line2D at 0x12233f740>]

Note that some methods accept bounds on the optimization variable via the argument bounds. However, make sure to verify in the documentation whether the particular method accepts bounds or not.
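For instance, here is a minimal sketch restricting the banana function defined above to a box with the L-BFGS-B method (the box limits and starting point are illustrative choices):

from scipy.optimize import minimize

# minimize the banana function restricted to the box [0, 0.5] x [0, 0.5]
res = minimize(f_banana, np.array([0.2, 0.2]), method='L-BFGS-B',
               bounds=[(0, 0.5), (0, 0.5)])
print(res.x)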

Problem-Specific Solvers

If your optimization problem falls into a standard class—such as LS, LP, QP, SOCP, or SDP—it is often more efficient and robust to use a specialized solver rather than a general-purpose nonlinear optimizer.

Least Squares (LS)

A linear least-squares (LS) problem minimizes the $\ell_2$-norm $\|\mathbf{A}\mathbf{x}-\mathbf{b}\|_2$, possibly subject to bounds or linear constraints.

  • Unconstrained: For simple unconstrained problems, scipy.linalg.lstsq is efficient and robust (see the short sketch after this list).
  • Bounded: The function scipy.optimize.lsq_linear (documentation) solves linear least-squares problems with bound constraints $\mathbf{l} \le \mathbf{x} \le \mathbf{u}$.
  • Nonlinear: For nonlinear least-squares, scipy.optimize.least_squares is the standard tool.
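A quick sketch of the unconstrained and bounded cases (the data A and b below are randomly generated for illustration):

import numpy as np
from scipy.linalg import lstsq
from scipy.optimize import lsq_linear

np.random.seed(42)
A = np.random.randn(50, 5)
b = np.random.randn(50)

# unconstrained LS: minimize ||A x - b||_2
x_ls, residues, rank, sv = lstsq(A, b)

# bounded LS: same objective with 0 <= x <= 1
res = lsq_linear(A, b, bounds=(0, 1))
print(x_ls)
print(res.x)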

Linear Programming (LP)

The library SciPy contains the function scipy.optimize.linprog (documentation) to conveniently solve LPs. Since SciPy 1.9.0, it defaults to the high-performance HiGHS solvers, making it suitable for larger and harder problems.

The standard form (which can also support equality constraints) is: $$ \begin{array}{ll} \underset{\mathbf{x}\in \mathbb{R}^n}{\textsf{minimize}} & \mathbf{c}^T\mathbf{x}\\ {\textsf{subject to}} & \begin{array}[t]{l} \mathbf{A}_{ub}\mathbf{x} \le \mathbf{b}_{ub}\\ \mathbf{A}_{eq}\mathbf{x} = \mathbf{b}_{eq}\\ \mathbf{l} \le \mathbf{x} \le \mathbf{u} \end{array} \end{array} $$

Example

Consider the following LP: $$ \begin{array}{ll} \underset{x_0, x_1}{\textsf{minimize}} & 4 x_1 - x_0\\ {\textsf{subject to}} & \begin{array}[t]{l} -3 x_0 + x_1 \leq 6\\ x_0 + 2x_1 \leq 4\\ x_1 \geq -3 \\ \end{array} \end{array} $$

from scipy.optimize import linprog

c = [-1, 4]
A = [[-3, 1], [1, 2]]
b = [6, 4]
x0_bounds = (None, None)
x1_bounds = (-3, None)
res = linprog(c, A_ub=A, b_ub=b, bounds=(x0_bounds, x1_bounds), options={"disp": True})
res
Running HiGHS 1.8.0 (git hash: 222cce7): Copyright (c) 2024 HiGHS under MIT licence terms
Coefficient ranges:
  Matrix [1e+00, 3e+00]
  Cost   [1e+00, 4e+00]
  Bound  [3e+00, 3e+00]
  RHS    [4e+00, 6e+00]
Presolving model
0 rows, 0 cols, 0 nonzeros  0s
0 rows, 0 cols, 0 nonzeros  0s
Presolve : Reductions: rows 0(-2); columns 0(-2); elements 0(-4) - Reduced to empty
Solving the original LP from the solution after postsolve
Model status        : Optimal
Objective value     : -2.2000000000e+01
Relative P-D gap    :  0.0000000000e+00
HiGHS run time      :          0.00
        message: Optimization terminated successfully. (HiGHS Status 7: Optimal)
        success: True
         status: 0
            fun: -22.0
              x: [ 1.000e+01 -3.000e+00]
            nit: 0
          lower:  residual: [       inf  0.000e+00]
                 marginals: [ 0.000e+00  6.000e+00]
          upper:  residual: [       inf        inf]
                 marginals: [ 0.000e+00  0.000e+00]
          eqlin:  residual: []
                 marginals: []
        ineqlin:  residual: [ 3.900e+01  0.000e+00]
                 marginals: [-0.000e+00 -1.000e+00]
 mip_node_count: 0
 mip_dual_bound: 0.0
        mip_gap: 0.0

Quadratic Programming (QP)

There are several specialized libraries for solving QPs in Python, which are generally preferred over general-purpose nonlinear solvers:

  • quadprog: This library implements a dual algorithm for solving strictly convex QPs (i.e., the quadratic matrix must be positive definite). It is highly efficient for small-to-medium dense problems.
  • OSQP: The Operator Splitting Quadratic Program solver is a modern, high-performance numerical optimization package for solving convex QPs. It is designed for large-scale sparse problems, is very robust, and supports warm-starting. It is currently the default QP solver in CVXPY.
from quadprog import solve_qp  # pip install quadprog
import numpy as np

# quadprog solves:
#   minimize    1/2 x^T G x - a^T x
#   subject to  C^T x >= b
# Here G = I, a = (0, 5, 0)^T, b = (-8, 2, 0)^T, and C (passed below) gives
# the effective constraint matrix
#           (-4  2  0)
#   C^T  =  (-3  1 -2)
#           ( 0  0  1)
Gmat = np.eye(3)
avec = np.array([0., 5., 0.])
Cmat = np.array([[-4., -3., 0.],
                 [2., 1., 0.],
                 [0., -2., 1.]])
bvec = np.array([-8., 2., 0.])

# run solver
res = solve_qp(Gmat, avec, Cmat, bvec)
res
(array([0., 5., 0.]),
 -12.5,
 array([0., 5., 0.]),
 array([1, 0], dtype=int32),
 array([0., 0., 0.]),
 array([], dtype=int32))

The library OSQP offers an object-oriented way to solve QPs written in the following form: $$ \begin{array}{ll} \underset{\mathbf{x}}{\mbox{minimize}} & \frac{1}{2} \mathbf{x}^T \mathbf{P} \mathbf{x} + \mathbf{q}^T \mathbf{x} \\ \mbox{subject to} & \mathbf{l} \leq \mathbf{A} \mathbf{x} \leq \mathbf{u} \end{array} $$

import osqp # pip install osqp
import numpy as np
from scipy import sparse

# Generate problem data
np.random.seed(1)
m = 30
n = 20
Ad = sparse.random(m, n, density=0.7, format='csc')
b = np.random.randn(m)

# OSQP data
P = sparse.block_diag([sparse.csc_matrix((n, n)), sparse.eye(m)], format='csc')
q = np.zeros(n+m)
A = sparse.vstack([
        sparse.hstack([Ad, -sparse.eye(m)]),
        sparse.hstack([sparse.eye(n), sparse.csc_matrix((n, m))])], format='csc')
l = np.hstack([b, np.zeros(n)])
u = np.hstack([b, np.ones(n)])

# Create an OSQP object
prob = osqp.OSQP()

# Setup workspace
prob.setup(P, q, A, l, u)

res = prob.solve() # Solve problem
print(f"Status: {res.x[:5]}...")  # Show first 5 elements
print(f"Optimal value: {res}")
-----------------------------------------------------------------
           OSQP v1.0.0  -  Operator Splitting QP Solver
              (c) The OSQP Developer Team
-----------------------------------------------------------------
problem:  variables n = 50, constraints m = 50
          nnz(P) + nnz(A) = 500
settings: algebra = Built-in,
          OSQPInt = 4 bytes, OSQPFloat = 8 bytes,
          linear system solver = QDLDL v0.1.8,
          eps_abs = 1.0e-03, eps_rel = 1.0e-03,
          eps_prim_inf = 1.0e-04, eps_dual_inf = 1.0e-04,
          rho = 1.00e-01 (adaptive: 50 iterations),
          sigma = 1.00e-06, alpha = 1.60, max_iter = 4000
          check_termination: on (interval 25, duality gap: on),
          time_limit: 1.00e+10 sec,
          scaling: on (10 iterations), scaled_termination: off
          warm starting: on, polishing: off, 
iter   objective    prim res   dual res   gap        rel kkt    rho         time
   1   0.0000e+00   2.50e+00   4.47e+02  -3.58e+03   4.47e+02   1.00e-01    7.24e-05s
  50   1.4079e+01   1.49e-01   2.73e-04  -1.69e+00   1.49e-01   2.39e+00*   2.35e-04s
  75   1.6487e+01   4.25e-04   2.63e-04  -7.88e-03   4.25e-04   2.39e+00    2.80e-04s

status:               solved
number of iterations: 75
optimal objective:    16.4870
dual objective:       16.4949
duality gap:          -7.8795e-03
primal-dual integral: 3.5820e+03
run time:             2.93e-04s
optimal rho estimate: 3.25e+00

Solution (first 5 elements): [ 1.16271794e-01 -1.61643557e-04  1.70167161e-01 -3.94214608e-04
 -1.82469015e-04]
Optimal objective value: 16.4870

Comparison: quadprog vs OSQP

  • quadprog: Simpler interface, good for small-to-medium problems
  • OSQP: More flexible, handles larger problems, supports warm-starting
  • Performance: OSQP typically faster for problems with > 1000 variables
import time

# Compare quadprog vs OSQP (note: the two calls solve different problems --
# the tiny 3-variable dense QP vs the 50-variable sparse QP above --
# so the timings are only indicative)
start = time.time()
res1 = solve_qp(Gmat, avec, Cmat, bvec)
time_quadprog = time.time() - start

start = time.time()
res2 = prob.solve()
time_osqp = time.time() - start

print(f"quadprog: {time_quadprog:.4f}s")
print(f"OSQP: {time_osqp:.4f}s")
iter   objective    prim res   dual res   gap        rel kkt    rho         time
   1   1.6489e+01   3.39e-04   2.10e-04  -6.29e-03   3.39e-04   2.39e+00    3.08e-05s
  25   1.6495e+01   1.52e-06   9.42e-07  -2.82e-05   1.52e-06   2.39e+00    1.40e-04s

status:               solved
number of iterations: 25
optimal objective:    16.4950
dual objective:       16.4950
duality gap:          -2.8229e-05
primal-dual integral: 3.5820e+03
run time:             2.05e-04s
optimal rho estimate: 3.25e+00

quadprog: 0.0001s
OSQP: 0.0003s

Second-Order Cone Programming (SOCP)

There are several libraries for solving SOCPs in Python:

  • Clarabel: An open-source interior-point solver for convex conic optimization problems (including LP, QP, SOCP, and SDP). It is written in Rust with a Python interface and uses a homogeneous embedding algorithm. Clarabel is the modern default solver in CVXPY for conic problems, offering superior stability and performance compared to older open-source solvers (a minimal usage sketch follows after this list).

  • ECOS: The Embedded COnic Solver is a lightweight, efficient numerical software for solving convex second-order cone programs (SOCPs). While it was the standard open-source SOCP solver for many years, it is now largely superseded by Clarabel for general usage.
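Both solvers are most conveniently called through CVXPY (introduced later in this session). Here is a minimal sketch of a small SOCP solved with Clarabel, assuming cvxpy and the Clarabel solver are installed (the data are randomly generated for illustration):

import numpy as np
import cvxpy as cp  # pip install cvxpy

# small SOCP: minimize ||A x - b||_2 subject to sum(x) = 1, x >= 0
np.random.seed(0)
A = np.random.randn(20, 5)
b = np.random.randn(20)

x = cp.Variable(5)
prob = cp.Problem(cp.Minimize(cp.norm(A @ x - b, 2)),
                  [cp.sum(x) == 1, x >= 0])
prob.solve(solver=cp.CLARABEL)  # or solver=cp.ECOS if installed
print(x.value)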

Cone Programming and Semi-Definite Programming (SDP)

There are several good libraries for solving general cone programs and SDPs in Python:

  • Library MOSEK contains a Python interface to the commercial MOSEK optimization library for large-scale LP, QP, and MIP problems, with strong support for (nonlinear) conic, semidefinite, and other convex tasks; academic licenses are available.

  • Library SCS is a numerical optimization package for large-scale convex cone problems based on operator splitting and the homogeneous self-dual embedding framework.

  • Library Clarabel is an interior-point solver for conic optimization problems (including SOCP and SDP) with Python bindings, designed to be robust and efficient on large sparse cone programs.

Mixed Integer Programming (MIP)

  • Python-MIP: A Python package for modeling and solving Mixed-Integer Linear Programs (MIPs). It provides a unified interface to the CBC (default), HiGHS, and Gurobi solvers. It supports advanced features like cut generation, lazy constraints, MIP starts, and solution pools (see the sketch below).
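A minimal sketch of a small knapsack-type MIP with the default CBC backend, assuming the mip package is installed (the data are illustrative):

from mip import Model, xsum, maximize, BINARY  # pip install mip

values = [10, 13, 18, 31, 7, 15]
weights = [11, 15, 20, 35, 10, 33]
capacity = 47

m = Model()  # CBC backend by default
x = [m.add_var(var_type=BINARY) for _ in range(len(values))]
m.objective = maximize(xsum(values[i] * x[i] for i in range(len(values))))
m += xsum(weights[i] * x[i] for i in range(len(values))) <= capacity
m.optimize()
print([v.x for v in x])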

Modeling Language for Convex Optimization (CVXPY)

CVXPY is a Python package that provides an object-oriented modeling language for convex optimization, similar in spirit to CVX, CVXR, YALMIP, and Convex.jl. It allows users to formulate convex optimization problems in a natural mathematical syntax, rather than in the restrictive standard forms required by most solvers.

The user specifies an objective and a set of constraints by combining constants, variables, and parameters using a library of functions with known mathematical properties. CVXPY then applies Disciplined Convex Programming (DCP) to verify the problem’s convexity. Once verified, the problem is converted into a standard conic form using graph implementations and passed to a cone solver such as Clarabel, SCS, or OSQP.

  • Clarabel is the default interior-point solver for general conic problems (LP, SOCP, SDP).
  • OSQP is the default solver for quadratic programs (QP).
  • SCS is used for large-scale conic problems where high precision is less critical.
  • MOSEK is a premier commercial solver (with free academic licenses) widely considered the gold standard for stability and speed on large-scale conic and mixed-integer problems.

See the CVXPY tutorials for details. In the following, we present several illustrative examples.

Least squares

Let's start with a simple LS example: $$ \begin{array}{ll} \underset{\boldsymbol{\beta}}{\textsf{minimize}} & \|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|_2^2\\ \end{array} $$

Let's first solve it via Scipy lsq_linear:

import numpy as np
from scipy.optimize import lsq_linear

# generate data
m = 100
n = 10
beta_true = np.arange(10) - 4
X = np.random.normal(size=n*m).reshape((m,-1))
y = X @ beta_true + np.random.normal(size=m)
# estimate x
res = lsq_linear(X, y)
res
       message: The unconstrained solution is optimal.
       success: True
        status: 3
           fun: [-1.111e+00  1.494e+00 ...  2.154e-01 -6.792e-01]
             x: [-4.016e+00 -3.009e+00 -2.134e+00 -6.944e-01 -5.416e-02
                  7.346e-01  1.870e+00  3.207e+00  3.844e+00  5.001e+00]
           nit: 0
          cost: 50.55498554873579
    optimality: 3.552713678800501e-13
   active_mask: [ 0.000e+00  0.000e+00  0.000e+00  0.000e+00  0.000e+00
                  0.000e+00  0.000e+00  0.000e+00  0.000e+00  0.000e+00]
 unbounded_sol: (array([-4.016e+00, -3.009e+00, -2.134e+00, -6.944e-01,
                       -5.416e-02,  7.346e-01,  1.870e+00,  3.207e+00,
                        3.844e+00,  5.001e+00]), array([ 1.011e+02]), np.int32(10), array([ 1.224e+01,  1.170e+01,  1.101e+01,  1.035e+01,
                        1.006e+01,  9.406e+00,  9.177e+00,  9.033e+00,
                        7.789e+00,  7.247e+00]))

Now let's do it with CVXPY:

import cvxpy as cp  # pip install cvxpy
import numpy as np

# define variable
beta = cp.Variable(n)
# define objective
objective = cp.Minimize(cp.sum_squares(X @ beta - y))
# create problem
prob = cp.Problem(objective)
# solve it!
result = prob.solve()
beta.value
array([-4.01608366, -3.00908591, -2.13356382, -0.69438355, -0.05416262,
        0.73455804,  1.86975671,  3.20690015,  3.84395209,  5.00103773])

We can now easily add a constraint to solve a nonnegative LS:

constraints = [beta >= 0]
prob = cp.Problem(objective, constraints)
result = prob.solve()
beta.value
array([-5.26257863e-17, -5.71912473e-17, -7.18335878e-17, -1.11541124e-16,
        4.05790128e-01,  1.60249957e+00,  1.93370680e+00,  2.53939243e+00,
        3.92161170e+00,  5.76468454e+00])

We can add other constraints:

constraint1 = beta[2] + beta[3] <= 0
A = np.diag(np.concatenate([np.array([1, 0, 0]), np.array([1] * 7)]))
print(A)
[[1 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 1 0 0 0 0 0 0]
 [0 0 0 0 1 0 0 0 0 0]
 [0 0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 0 1 0 0 0]
 [0 0 0 0 0 0 0 1 0 0]
 [0 0 0 0 0 0 0 0 1 0]
 [0 0 0 0 0 0 0 0 0 1]]
constraint2 = A @ beta >= 0

# define problem and solve it
prob = cp.Problem(objective, constraints = [constraint1, constraint2])
result = prob.solve()
beta.value
array([-8.12587069e-17, -2.55930726e+00, -2.38179431e+00, -7.56294003e-18,
        5.69530272e-01,  1.56146521e+00,  1.83898696e+00,  2.49285673e+00,
        3.68068355e+00,  5.85268505e+00])

Robust Huber regression

Let's consider another simple example of robust regression: $$ \begin{array}{ll} \underset{\boldsymbol{\beta}}{\textsf{minimize}} & \sum\limits_{i=1}^m \phi(y_i - \mathbf{x}_i^T\boldsymbol{\beta}) \end{array} $$ where $$ \phi(u) = \begin{cases} u^2 & \text{if }|u| \le M \\ 2M|u|-M^2 & \text{if }|u| > M \end{cases} $$

M = 0.1
beta = cp.Variable(n)
obj = cp.sum(cp.huber(y - X @ beta, M))
cp.Problem(cp.Minimize(obj)).solve()
beta.value
array([-4.17042853, -3.04044983, -2.10513397, -0.71206119, -0.01068557,
        0.76251828,  1.8334574 ,  3.23198934,  3.77406989,  4.90310484])

Elastic net regularization

We will now solve the elastic net problem: $$ \begin{array}{ll} \underset{\boldsymbol{\beta}}{\textsf{minimize}} & \frac{1}{2m}\|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|_2^2 + \lambda \left((1-\alpha)\|\boldsymbol{\beta}\|_2^2 + \alpha\| \boldsymbol{\beta} \|_1\right) \end{array} $$

# define the regularization term
def elastic_reg(beta, lmd = 0, alpha = 0):
  ridge = (1 - alpha) * cp.sum(beta ** 2)
  lasso = alpha * cp.norm(beta, 1)
  return lmd * (lasso + ridge)

# define problem and solve it
alpha = 0.5
lmd = 0.5
beta = cp.Variable(n)
obj = cp.sum((y - X @ beta)**2) + elastic_reg(beta, lmd, alpha)  # the 1/(2m) scaling of the LS term is omitted here for simplicity
cp.Problem(cp.Minimize(obj)).solve()
beta.value
array([-4.00380815, -2.99831083, -2.12711484, -0.69208925, -0.05285668,
        0.73604469,  1.86351502,  3.19408019,  3.83259749,  4.9906979 ])

Sparse inverse covariance estimation

CVXPY will automatically select an installed solver, typically Clarabel (for high precision) or SCS (for large-scale/approximate solutions). You can also explicitly specify a solver (e.g., solver=cp.MOSEK).

Note: In recent versions of CVXPY (1.4+), ECOS is being phased out as a required dependency and Clarabel is now the preferred default solver for most conic problems. Clarabel tends to be more robust and faster than ECOS, especially for SOCP and SDP problems.

Consider the matrix-valued convex problem: $$ \begin{array}{ll} \underset{\mathbf{X}}{\textsf{maximize}} & \textsf{logdet}(\mathbf{X}) - \textsf{Tr}(\mathbf{X}\mathbf{S})\\ \textsf{subject to} & \mathbf{X} \succeq \mathbf{0}, \qquad \sum_{i=1}^n\sum_{j=1}^n\left|X_{ij}\right| \le \alpha \end{array} $$

n = 3
np.random.seed(1)
S = np.random.randn(n, n)
S = (S + S.T) / 2  # Make S symmetric
X = cp.Variable((n, n), PSD=True)
obj = cp.log_det(X) - cp.trace(X @ S)
alpha = 1
constr = [cp.sum(cp.abs(X)) <= alpha]
result = cp.Problem(cp.Maximize(obj), constr).solve(solver=cp.SCS)
X.value
array([[ 2.64619429e-01,  1.06538048e-07, -3.78401286e-08],
       [ 1.06538048e-07,  3.31118026e-01,  8.97513152e-08],
       [-3.78401286e-08,  8.97513152e-08,  4.04255112e-01]])
np.linalg.eig(X.value)[0] # compute the eigenvalues of X to verify it is PD
array([0.26461943, 0.33111803, 0.40425511])

When specifying a solver, we should make sure that it is installed. We can list the installed solvers as follows:

# Check available solvers
print("Available solvers:", cp.installed_solvers())
Available solvers: ['CLARABEL', 'CVXOPT', 'GLPK', 'GLPK_MI', 'HIGHS', 'OSQP', 'SCIPY', 'SCS']

Portfolio Optimization

Consider the following mean-variance portfolio optimization problem: $$ \begin{array}{ll} \underset{\boldsymbol{w}}{\text{maximize}} & \boldsymbol{w}^\mathsf{T}\boldsymbol{\mu} - \frac{1}{2}\boldsymbol{w}^\mathsf{T}\boldsymbol{\Sigma}\boldsymbol{w}\\ \text{subject to} & \boldsymbol{1}^\mathsf{T}\boldsymbol{w}=1, \quad \boldsymbol{w}\ge\boldsymbol{0} \end{array} $$

We will first prepare the data and then we will solve it via three different methods:

  1. Using quadprog
  2. Using CVXPY
  3. Using skfolio

Examples

# Mean-Variance Portfolio Optimization Example
# Problem: maximize mu^T * w - (lambda/2) * w^T * Sigma * w
#          subject to: sum(w) = 1, w >= 0
# Equivalently: minimize -mu^T * w + (lambda/2) * w^T * Sigma * w

import numpy as np
from sklearn.model_selection import train_test_split
from skfolio.datasets import load_sp500_dataset
from skfolio.preprocessing import prices_to_returns

# Load and prepare data (using 5 assets for clarity)
prices = load_sp500_dataset()
prices = prices.iloc[:, :5]  # Select first 5 assets
X = prices_to_returns(prices)
X_train, X_test = train_test_split(X, test_size=0.33, shuffle=False)

# Calculate mean returns and covariance matrix
mu = X_train.mean().values  # Expected returns
Sigma = X_train.cov().values  # Covariance matrix
n = len(Sigma)  # Number of assets
lmbda = 1.0  # Risk aversion parameter

Mean-Variance Portfolio via quadprog

# --- Using quadprog ---
from quadprog import solve_qp

# minimize 1/2 * x^T * G * x - a^T * x
# subject to: C^T * x >= b
G = lmbda * Sigma.copy()
a = mu.copy()
C = np.vstack([np.ones(n), np.eye(n)]).T  # [sum=1, w>=0]
b = np.hstack([1, np.zeros(n)])

weights_quadprog = solve_qp(G, a, C, b, meq=1)[0]  # meq=1: first constraint is equality

print("Weights (quadprog):")
print(np.round(weights_quadprog, 4))
print(f"Sum: {weights_quadprog.sum():.6f}\n")
Weights (quadprog):
[ 0.417 -0.    -0.     0.583  0.   ]
Sum: 1.000000

Mean-Variance Portfolio via CVXPY

# --- Using CVXPY ---
import cvxpy as cp

w = cp.Variable(n)
objective = cp.Minimize(-mu @ w + (lmbda/2) * cp.quad_form(w, Sigma))
constraints = [cp.sum(w) == 1, w >= 0]
problem = cp.Problem(objective, constraints)
problem.solve()

weights_cvxpy = w.value

print("Weights (CVXPY):")
print(np.round(weights_cvxpy, 4))
print(f"Sum: {weights_cvxpy.sum():.6f}\n")
Weights (CVXPY):
[ 0.417 -0.    -0.     0.583 -0.   ]
Sum: 1.000000

Mean-Variance Portfolio via skfolio

# --- Using skfolio ---
from skfolio.optimization import MeanRisk, ObjectiveFunction
from skfolio import RiskMeasure

model = MeanRisk(
    objective_function=ObjectiveFunction.MAXIMIZE_UTILITY,  # uses mean - λ·risk
    risk_measure=RiskMeasure.VARIANCE,
    risk_aversion=lmbda / 2.0,
    min_weights=0.0,            # long-only
    budget=1.0                  # weights sum to 1
)
model.fit(X_train)

weights_skfolio = model.weights_

print("Weights (skfolio):")
print(np.round(weights_skfolio, 4))
print(f"Sum: {weights_skfolio.sum():.6f}\n")
Weights (skfolio):
[0.417 0.    0.    0.583 0.   ]
Sum: 1.000000

Comparison

# Verify all methods give same results
print("\nVerification:")
print(f"quadprog vs CVXPY: {np.allclose(weights_quadprog, weights_cvxpy, atol=1e-3)}")
print(f"CVXPY vs skfolio: {np.allclose(weights_cvxpy, weights_skfolio, atol=1e-3)}")
print(f"quadprog vs skfolio: {np.allclose(weights_quadprog, weights_skfolio, atol=1e-3)}")
Verification:
quadprog vs CVXPY: True
CVXPY vs skfolio: True
quadprog vs skfolio: True

Numerical Stability: QP vs SOCP Formulation

For mean-variance portfolio optimization, there are two equivalent formulations:

Standard QP formulation: $$ \begin{array}{ll} \text{maximize} \quad & \boldsymbol{\mu}^\mathsf{T}\boldsymbol{w} - \frac{\lambda}{2}\boldsymbol{w}^\mathsf{T}\boldsymbol{\Sigma}\boldsymbol{w}\\ \text{subject to} & \boldsymbol{1}^\mathsf{T}\boldsymbol{w}=1, \quad \boldsymbol{w}\ge\boldsymbol{0} \end{array} $$

SOCP reformulation (more numerically stable): $$ \begin{array}{ll} \text{maximize} \quad & \boldsymbol{\mu}^\mathsf{T}\boldsymbol{w} - \frac{\lambda}{2}\sigma^2\\ \text{subject to} & \|\boldsymbol{L}^\mathsf{T}\boldsymbol{w}\|_2 \le \sigma\\ & \boldsymbol{1}^\mathsf{T}\boldsymbol{w}=1, \quad \boldsymbol{w}\ge\boldsymbol{0} \end{array} $$

where $\boldsymbol{L}$ is the Cholesky factor of $\boldsymbol{\Sigma}$ (i.e., $\boldsymbol{L}\boldsymbol{L}^\mathsf{T} = \boldsymbol{\Sigma}$).

Why SOCP is more stable: In the QP formulation, the covariance matrix $\boldsymbol{\Sigma}$ appears directly in the quadratic term. When $\boldsymbol{\Sigma}$ is ill-conditioned (has very small eigenvalues), the KKT linear systems become highly sensitive to numerical errors. The SOCP formulation uses the Cholesky factor $\boldsymbol{L}$ explicitly and enforces the risk constraint via a second-order cone $\|\boldsymbol{L}^\mathsf{T}\boldsymbol{w}\|_2 \le \sigma$, which typically improves numerical scaling. Modern conic solvers like Clarabel handle SOCP constraints very reliably.
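A minimal sketch of this SOCP reformulation in CVXPY, reusing mu, Sigma, n, and lmbda from the portfolio example above (the variable names are illustrative):

import numpy as np
import cvxpy as cp

# Cholesky factor of Sigma (Sigma = L @ L.T); a small ridge may be needed
# if Sigma is not numerically positive definite
L = np.linalg.cholesky(Sigma)

w = cp.Variable(n)
sigma = cp.Variable(nonneg=True)
objective = cp.Maximize(mu @ w - (lmbda / 2) * cp.square(sigma))
constraints = [cp.norm(L.T @ w, 2) <= sigma,
               cp.sum(w) == 1,
               w >= 0]
cp.Problem(objective, constraints).solve()
print(np.round(w.value, 4))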

Domain-specific libraries like skfolio exploit this reformulation automatically for better robustness with real-world covariance estimates.

Conclusion

The solvers available in Python should be sufficient for most optimization problems in finance.

Decision Tree for Solver Selection

  1. Is your problem convex?

    • YES: Start with CVXPY for rapid prototyping. It allows you to write the problem in a natural mathematical form.
    • NO: Use scipy.optimize.minimize (general nonlinear) with methods like SLSQP or trust-constr. For global optimization, consider scipy.optimize.differential_evolution.
  2. Need better performance or large-scale solving?

    • LP: Use scipy.optimize.linprog or MOSEK.
    • QP: Use OSQP (great for sparse/large problems) or MOSEK. quadprog is good for small, dense problems.
    • SOCP / SDP: Use Clarabel (modern open-source default) or MOSEK (commercial, robust). SCS is useful for very large problems where lower precision is acceptable.
    • MIP: Use HiGHS (via scipy or mip library) or MOSEK.
  3. Need global optimization?

    • Use scipy.optimize.basinhopping or scipy.optimize.differential_evolution (see the short sketch below).
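For instance, a minimal sketch applying differential evolution to the "wild" function from earlier (the search box is an illustrative choice):

from scipy.optimize import differential_evolution

res = differential_evolution(f_wild, bounds=[(-50, 50)], seed=0)
print(res.x, res.fun)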

Performance Tips

  • Provide analytical gradients when possible (2-10x speedup).
  • For large problems ($n > 1000$), avoid dense matrix operations; use specialized solvers like OSQP or Clarabel.
  • Warm-start iterative solvers (like OSQP or SCS) when solving similar problems repeatedly (e.g., in a backtest loop); see the sketch below.
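For instance, a minimal sketch of a warm-started loop in CVXPY, reusing mu, Sigma, and n from the portfolio example (the risk-aversion values are illustrative):

import cvxpy as cp

# sweep the risk-aversion parameter; warm_start lets OSQP reuse the
# previous solution as the starting point of the next solve
lmbda_param = cp.Parameter(nonneg=True)
w = cp.Variable(n)
prob = cp.Problem(cp.Maximize(mu @ w - (lmbda_param / 2) * cp.quad_form(w, Sigma)),
                  [cp.sum(w) == 1, w >= 0])
for lam in [0.5, 1.0, 2.0, 5.0]:
    lmbda_param.value = lam
    prob.solve(solver=cp.OSQP, warm_start=True)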