Python Code for Portfolio Optimization
Chapter 13 – Index Tracking Portfolios

Daniel P. Palomar (2025). Portfolio Optimization: Theory and Application. Cambridge University Press.

Last update: March 18, 2026

Contributors:


Libraries

The following libraries are used in the examples:

# Core numerical computing
import numpy as np
import pandas as pd
import warnings
warnings.filterwarnings('ignore')

# For financial data
import yfinance as yf       # Loading financial data
import empyrical as emp     # Performance metrics

# Book data (pip install "git+https://github.com/dppalomar/pob.git#subdirectory=python")
from pob_python import SP500_stocks_2015to2020, SP500_index_2015to2020

# Plotting
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import seaborn as sns
sns.set_theme(style="darkgrid")
import plotly.io as pio
pio.renderers.default = "notebook"
pio.templates.default = "plotly_white"

# Optimization
import cvxpy as cp                  # interface for convex optimization solvers
from qpsolvers import solve_qp

$\def\bm#1{\boldsymbol{#1}}$ $\def\textm#1{\textsf{#1}}$ $\def\T{{\mkern-2mu\raise-1mu\mathsf{T}}}$ $\def\R{\mathbb{R}}$ $\def\E{\rm I\kern-.2em E}$ $\def\w{\bm{w}}$ $\def\bmu{\bm{\mu}}$ $\def\bSigma{\bm{\Sigma}}$

Helper Functions

Let's define the core backtest function:

def backtest(
        portfolio_funcs: dict[str, callable],
        dataset: dict[str, pd.DataFrame],
        lookback: int,
        rebalance_every: int = 1,
        optimize_every: int = 1,
        cost_bps: float = 0,
        verbose: bool = True
) -> tuple[pd.DataFrame, dict[str, pd.DataFrame]]:
    """
    Backtest portfolio strategies over historical data.

    At each time step from ``lookback`` to ``T``, the function:

    1. Re-optimizes portfolio weights every ``optimize_every`` periods by
       calling the user-supplied portfolio function.
    2. Rebalances to the latest optimized weights every ``rebalance_every``
       periods, computing turnover-based transaction costs.
    3. Computes the portfolio return for the period and drifts the weights
       according to realised asset returns.

    Parameters
    ----------
    portfolio_funcs : dict[str, callable]
        Dictionary mapping strategy names to portfolio weight functions.
        Each function receives a dict with keys ``'prices'``
        (``pd.DataFrame``) and ``'index'`` (``pd.DataFrame | None``) for the
        current lookback window, and must return a 1-D ``np.ndarray`` of
        portfolio weights.
    dataset : dict[str, pd.DataFrame]
        Must contain the key ``'prices'`` (T × N asset price DataFrame).
        May optionally contain ``'index'`` (T × 1 benchmark price DataFrame).
    lookback : int
        Number of past periods fed to each portfolio function for estimation.
    rebalance_every : int, default 1
        Number of periods between rebalancing to the target weights.
    optimize_every : int, default 1
        Number of periods between re-optimizing the target weights.
        Must be a multiple of ``rebalance_every``.
    cost_bps : float, default 0
        One-way transaction cost in basis points applied to turnover.
    verbose : bool, default True
        If True, print progress and re-optimization messages.

    Returns
    -------
    portf_rets_df : pd.DataFrame
        DataFrame of portfolio returns (one column per strategy), indexed
        from ``lookback`` to ``T - 1``.
    ws : dict[str, pd.DataFrame]
        Dictionary mapping each strategy name to a DataFrame of portfolio
        weights over the same period.

    Raises
    ------
    ValueError
        If ``optimize_every`` is not a multiple of ``rebalance_every``.
    """
    if optimize_every % rebalance_every != 0:
        raise ValueError(
            f"optimize_every ({optimize_every}) must be a multiple of "
            f"rebalance_every ({rebalance_every})."
        )

    stock_prices = dataset['prices']
    index_prices = dataset.get('index')  # optional benchmark, forwarded to portfolio functions
    T, N = stock_prices.shape
    portf_rets = {}
    ws = {}

    for portfolio_name, portfolio_func in portfolio_funcs.items():
        if verbose:
            print(f"Backtesting portfolio '{portfolio_name}'...")

        # Start fully in cash: zero weights until the first rebalance at t = lookback.
        current_w = np.zeros(N)
        target_w = np.zeros(N)
        w = pd.DataFrame(index=stock_prices.index, columns=stock_prices.columns)
        portf_ret = pd.Series(index=stock_prices.index, dtype=float)
        portf_ret.name = portfolio_name

        for t in range(lookback, T):
            if (t - lookback) % rebalance_every == 0:
                # Re-optimize target weights if necessary
                if (t - lookback) % optimize_every == 0:
                    # The portfolio function only sees the trailing window of
                    # rows [t - lookback, t), i.e., no look-ahead bias.
                    target_w = portfolio_func({
                        'prices': stock_prices.iloc[t - lookback:t],
                        'index': index_prices.iloc[t - lookback:t] if index_prices is not None else None
                    })
                    if verbose:
                        print(f"  Reoptimizing at t={t} "
                              f"(cardinality={np.sum(target_w > 1e-5)})")

                # Rebalance to target weights
                # One-way turnover = L1 distance between drifted and target weights.
                turnover = np.abs(target_w - current_w).sum()
                current_w = target_w.copy()
                transaction_cost = turnover * cost_bps / 10_000
            else:
                transaction_cost = 0.0

            # Store weights
            w.iloc[t] = current_w

            # Period return (simple/arithmetic return over one period)
            Xt = stock_prices.iloc[t] / stock_prices.iloc[t - 1] - 1
            portf_ret.iloc[t] = (Xt * current_w).sum() - transaction_cost

            # Drift weights with asset returns (buy-and-hold between rebalances),
            # then renormalize so the weights keep summing to 1.
            current_w = current_w * (1 + Xt)
            w_sum = current_w.sum()
            if abs(w_sum) > 1e-12:
                current_w = current_w / w_sum

        # Positional slicing: keep only the out-of-sample period t >= lookback.
        portf_rets[portfolio_name] = portf_ret[lookback:]
        ws[portfolio_name] = w[lookback:]

    portf_rets_df = pd.DataFrame(portf_rets)
    return portf_rets_df, ws

What is index tracking?

Index tracking is a portfolio management strategy that aims to replicate the performance of a specific market index. Let's visualize how the SPDR S&P 500 ETF (SPY) tracks the S&P 500 index:

# Download S&P 500 index and SPY ETF data
SP500_SPY_prices = yf.download(["^GSPC", "SPY"], start="2007-01-01")['Close']
SP500_SPY_prices.columns = ['S&P 500', 'SPY']  # readable legend labels

# Plot (log scale so proportional tracking differences are visible across years)
SP500_SPY_prices.plot(figsize=(12, 6), logy=True, linewidth=1.0)
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y'))
plt.gca().xaxis.set_major_locator(mdates.YearLocator())
plt.gca().xaxis.set_minor_locator(mdates.MonthLocator())
plt.title("Tracking of S&P 500 index")
plt.ylabel("dollars")
plt.xlabel(None)
plt.legend(title=None)
plt.tight_layout()
plt.show()

Sparse index tracking

Sparse index tracking is a type of sparse regression problem formulated as $$ \begin{array}{ll} \underset{\w}{\textm{minimize}} & \frac{1}{T}\left\|\bm{r}^\textm{b} - \bm{X}\w\right\|^2_2 + \lambda \|\w\|_0\\ \textm{subject to} & \w \in \mathcal{W}, \end{array} $$ where the matrix $\bm{X}$ contains the returns of the constituent assets, the vector $\bm{r}^\textm{b}$ contains the returns of the index to replicate, the parameter $\lambda$ controls the level of sparsity in the portfolio, and $\mathcal{W}$ is the constraint set which could be $\{\w \mid \bm{1}^\T\w=1, \w\ge\bm{0} \}$.

To design sparse tracking portfolios, we define a function with the simple iterative code based on the MM (Majorization-Minimization) algorithm applied to index tracking:

def sparse_tracking_via_MM(
        X: pd.DataFrame,
        r: pd.Series,
        lambda_val: float,
        w0: np.ndarray | None = None,
        max_iter: int = 5,
        thres: float = 1e-9,
        p: float = 1e-3,
        u: float = 1,
        reg: float = 1e-7,
        rtol: float = 1e-3,
        return_obj: bool = False,
        verbose: bool = False,
) -> np.ndarray | tuple[np.ndarray, list[float]]:
    """
    Sparse index tracking via the MM (majorization-minimization) algorithm.

    Solves the non-convex problem

        min_w  (1/T)||r - Xw||^2 + lambda * sum log(1 + |w_i|/p) / log(1 + u/p)
        s.t.   w >= 0,  1'w = 1

    by iteratively majorizing the log penalty with a weighted-L1 surrogate and
    solving the resulting QP subproblem.

    Parameters
    ----------
    X : pd.DataFrame
        T x N DataFrame of asset returns.
    r : pd.Series
        Length-T Series of benchmark (index) returns.
    lambda_val : float
        Regularization parameter controlling sparsity.
    w0 : np.ndarray or None, default None
        Initial weight vector of length N.  If None, uses 1/N uniform.
    max_iter : int, default 5
        Maximum number of MM iterations.
    thres : float, default 1e-9
        Weights below this value are zeroed after convergence.
    p : float, default 1e-3
        Shape parameter of the log penalty (controls approximation to L0).
    u : float, default 1
        Scale parameter of the log penalty.
    reg : float, default 1e-7
        Tikhonov regularization added to the QP cost matrix, expressed as a
        fraction of the mean diagonal of X'X.  Doubled automatically on
        QP solver failure.
    rtol : float, default 1e-3
        Relative tolerance on the weight vector for early stopping.
    return_obj : bool, default False
        If True, also return the sequence of objective values.
    verbose : bool, default False
        If True, print progress messages.

    Returns
    -------
    w : np.ndarray
        Sparse portfolio weight vector of length N summing to 1.
    obj_values : list[float]
        Objective value at each iteration.  Only returned when
        ``return_obj=True``.
    """
    X_mat = X.values
    r_vec = r.values.reshape(-1, 1)  # works for both Series and 1-column DataFrame inputs
    T, N = X_mat.shape

    # Precompute Gram matrix and cross term
    XtX = X_mat.T @ X_mat
    Xtr = X_mat.T @ r_vec  # (N, 1)

    # NOTE(review): assumes w0, when given, is an np.ndarray — a plain list
    # would break the boolean-mask thresholding below; confirm at call sites.
    w = np.ones(N) / N if w0 is None else w0.copy()
    log_denom = np.log(1 + u / p)  # normalizing constant of the log penalty

    def _objective(w):
        # Exact (non-surrogate) objective: tracking error + concave log penalty.
        res = r_vec - X_mat @ w.reshape(-1, 1)
        tracking = (1 / T) * np.dot(res.ravel(), res.ravel())
        penalty = lambda_val * np.sum(np.log(1 + np.abs(w) / p) / log_denom)
        return tracking + penalty

    obj_values = [_objective(w)] if return_obj else []

    # QP components that don't change across iterations
    P = (2 / T) * XtX
    P = 0.5 * (P + P.T)  # ensure exact symmetry
    A_eq = np.ones((1, N))   # budget constraint 1'w = 1
    b_eq = np.array([1.0])
    G_ub = -np.eye(N)        # long-only constraint w >= 0 written as -w <= 0
    h_ub = np.zeros(N)

    current_reg = reg

    for k in range(max_iter):
        # MM surrogate weights: gradient of the log penalty at the current w,
        # giving the coefficients of the weighted-L1 majorizer.
        d = 1.0 / (log_denom * (p + np.abs(w)))  # (N,)
        w_prev = w.copy()

        # # Solve the convex subproblem using CVXPY
        # w_var = cp.Variable(N)
        # objective = cp.Minimize((1/T) * cp.sum_squares(r_vec - X_mat @ w_var) +
        #                        lambda_val * cp.sum(cp.multiply(d, cp.abs(w_var))))
        # constraints = [cp.sum(w_var) == 1, w_var >= 0]
        # problem = cp.Problem(objective, constraints)

        # Build and solve the QP subproblem
        # (with w >= 0 the weighted-L1 term lambda * d'|w| is linear: lambda * d'w)
        P_reg = P + current_reg * np.mean(np.diag(P)) * np.eye(N)
        q = (-(2 / T) * Xtr).ravel() + lambda_val * d  # must be 1-D

        w_new = solve_qp(P_reg, q, G_ub, h_ub, A_eq, b_eq, solver="quadprog")

        if w_new is None:
            # Solver failure consumes one MM iteration; retry with stronger
            # Tikhonov regularization next time around.
            if verbose:
                print(f"  Iter {k}: QP solver failed, doubling regularization.")
            current_reg *= 2
            continue

        w = w_new

        if return_obj:
            obj_values.append(_objective(w))

        if verbose:
            print(f"  Iter {k}: ||dw||/||w|| = "
                  f"{np.linalg.norm(w - w_prev) / max(np.linalg.norm(w_prev), 1e-12):.2e}")

        if np.linalg.norm(w - w_prev) / max(np.linalg.norm(w_prev), 1e-12) < rtol:
            break

    # Threshold and renormalize (zeroes solver noise, including tiny negatives)
    w[w < thres] = 0.0
    w_sum = w.sum()
    if w_sum > 0:
        w /= w_sum

    if return_obj:
        return w, obj_values
    return w

Simple sanity check:

X = SP500_stocks_2015to2020.iloc[:, :10].pct_change().dropna()
r = SP500_index_2015to2020.pct_change().dropna()

sparse_tracking_via_MM(X.iloc[:, :10], r, lambda_val=5e-7)
array([0.12490579, 0.04051468, 0.08009517, 0.16234165, 0.0636828 ,
       0.07883649, 0.        , 0.11939667, 0.24794548, 0.08228126])

Fixed versus time-varying portfolio

The tracking error (TE) of a fixed portfolio $\w_t=\w$ is $$ \textm{TE} = \frac{1}{T}\left\|\bm{r}^\textm{b} - \bm{X}\w\right\|^2_2, $$ where $\bm{X} \in \R^{T\times N}$ is the matrix containing the $T$ return vectors $\bm{r}_t$ of the $N$ constituent assets along the rows and $\bm{r}^\textm{b} \in \R^{T}$ is the vector containing the $T$ returns of the index.

Now, if the portfolio is not rebalanced, it will naturally evolve over time as $$ \w_t = \frac{\w_{t-1} \odot \left(\bm{1} + \bm{r}_t\right)}{\w_{t-1}^\T \left(\bm{1} + \bm{r}_t\right)} \approx \w_{0} \odot \bm{\alpha}_t, $$ where $\w_{0}$ is the initial portfolio and $$ \bm{\alpha}_t = \prod_{t'=1}^{t}\frac{\bm{1} + \bm{r}_{t'}}{1 + r_{t'}^\textm{b}} $$ denotes weights based on the cumulative returns. The resulting tracking error can be conveniently written as $$ \textm{TE}^\textm{time-varying} = \frac{1}{T}\left\|\bm{r}^\textm{b} - \tilde{\bm{X}}\w\right\|^2_2, $$ where now $\tilde{\bm{X}}$ contains the weighted returns $\tilde{\bm{r}}_t = \bm{r}_t \odot \bm{\alpha}_{t-1}$ along the rows and $\w$ denotes the initial portfolio $\w_0$.

The code to compute $\bm{\alpha}_{t}$ is straightforward:

def compute_alpha(
        X: pd.DataFrame,
        r: pd.Series,
) -> dict[str, pd.DataFrame | np.ndarray]:
    """
    Compute cumulative return ratios and transformed returns.

    Given asset returns X (T x N) and benchmark returns r (T x 1), define

    .. math::

        \\alpha_{t,n} = \\prod_{s=1}^{t}(1 + X_{s,n}) \\;/\\;
                        \\prod_{s=1}^{t}(1 + r_s)

    and the transformed returns
    :math:`\\tilde X_{t,n} = X_{t,n} \\, \\alpha_{t-1,n}` with
    :math:`\\alpha_{0,n} = 1`.

    Parameters
    ----------
    X : pd.DataFrame
        T x N DataFrame of asset returns.
    r : pd.Series
        Length-T Series of benchmark returns.

    Returns
    -------
    dict[str, pd.DataFrame | np.ndarray]
        ``'alpha'`` : T x N cumulative return ratios.
        ``'alpha_lagged'`` : T x N lagged cumulative return ratios
        (first row = 1).
        ``'last_alpha_lagged'`` : length-N array of final-row lagged alpha.
        ``'X_tilde'`` : T x N transformed returns.
    """
    X_cum = (1 + X).cumprod()
    r_cum = (1 + r).cumprod()

    alpha = X_cum.div(r_cum, axis=0)

    alpha_lagged = alpha.shift(1)
    alpha_lagged.iloc[0] = 1.0

    X_tilde = X * alpha_lagged

    return {
        'alpha': alpha,
        'alpha_lagged': alpha_lagged,
        'last_alpha_lagged': alpha_lagged.iloc[-1].values,
        'X_tilde': X_tilde,
    }

We now perform a backtest to assess the tracking error over time with $K=20$ active assets under the approximation (or assumption) of a fixed portfolio $\w$ and with the more accurate approximation of a time-varying portfolio, which shows improved results.

def _prepare_returns(
        dataset: dict[str, pd.DataFrame],
) -> tuple[pd.DataFrame, pd.Series]:
    """
    Compute asset and benchmark returns, aligned on common valid dates.

    Parameters
    ----------
    dataset : dict[str, pd.DataFrame]
        Must contain ``'prices'`` and ``'index'``.

    Returns
    -------
    X : pd.DataFrame
        T x N asset returns (NaN rows dropped).
    r : pd.Series
        Length-T benchmark returns.
    """
    X = dataset['prices'].pct_change().dropna()
    r = dataset['index'].pct_change().dropna().squeeze()
    common_idx = X.index.intersection(r.index)
    return X.loc[common_idx], r.loc[common_idx]


def tracking_using_plain_returns_wfixed(
        dataset: dict[str, pd.DataFrame],
        lambda_val: float = 5e-7,
) -> np.ndarray:
    """
    Index tracking assuming the portfolio weights stay fixed.

    Runs the sparse MM tracker directly on plain (period-by-period)
    asset returns; the resulting weight vector is meant to be held
    constant over the evaluation horizon.

    Parameters
    ----------
    dataset : dict[str, pd.DataFrame]
        Must contain ``'prices'`` and ``'index'``.
    lambda_val : float, default 5e-7
        Sparsity regularization parameter.

    Returns
    -------
    w : np.ndarray
        Sparse portfolio weight vector of length N.
    """
    asset_rets, bench_rets = _prepare_returns(dataset)
    w = sparse_tracking_via_MM(asset_rets, bench_rets, lambda_val=lambda_val, max_iter=10)
    return w


def tracking_using_plain_returns_wt(
        dataset: dict[str, pd.DataFrame],
        lambda_val: float = 5e-7,
) -> np.ndarray:
    """
    Index tracking accounting for the natural drift of the weights.

    The optimization is carried out on alpha-transformed returns, so
    the optimized vector corresponds to the *initial* portfolio that
    then drifts with the market. The initial weights are mapped to
    end-of-window weights via the final lagged alpha and renormalized.

    Parameters
    ----------
    dataset : dict[str, pd.DataFrame]
        Must contain ``'prices'`` and ``'index'``.
    lambda_val : float, default 5e-7
        Sparsity regularization parameter.

    Returns
    -------
    w : np.ndarray
        Sparse portfolio weight vector of length N.
    """
    asset_rets, bench_rets = _prepare_returns(dataset)
    transforms = compute_alpha(asset_rets, bench_rets)

    # Optimize the initial portfolio on the transformed returns.
    w_init = sparse_tracking_via_MM(transforms['X_tilde'], bench_rets,
                                    lambda_val=lambda_val, max_iter=10)

    # Scale to end-of-window weights and renormalize to the simplex.
    w = w_init * transforms['last_alpha_lagged']
    total = w.sum()
    if total > 0:
        w /= total
    return w

Some sanity checks:

w = tracking_using_plain_returns_wfixed(dataset={
        'prices': SP500_stocks_2015to2020.loc["2018":],
        'index': SP500_index_2015to2020.loc["2018":]
    })
sum(w), min(w)
(np.float64(1.0000000000000002), np.float64(0.0))
w = tracking_using_plain_returns_wt(dataset={
        'prices': SP500_stocks_2015to2020.loc["2018":],
        'index': SP500_index_2015to2020.loc["2018":]
    })
sum(w), min(w)
(np.float64(0.9999999999999999), np.float64(0.0))

Let's run the backtest:

# Prep data
T, N = SP500_stocks_2015to2020.shape
T_trn = 2*252  # two years of daily data for the estimation (lookback) window

# Run backtest comparing the fixed-weights vs. drifting-weights formulations
returns, weights = backtest(
    portfolio_funcs={
        "Tracking assuming fixed portfolio": tracking_using_plain_returns_wfixed,
        "Tracking assuming time-varying portfolio": tracking_using_plain_returns_wt
    },
    dataset={
        'prices': SP500_stocks_2015to2020,  #.loc["2018":]
        'index': SP500_index_2015to2020  # .loc["2018":]
    },
    lookback=T_trn,
    rebalance_every=6*21,  # Rebalance every 6 months (assuming 21 trading days per month)
    optimize_every=6*21,
)
Backtesting portfolio 'Tracking assuming fixed portfolio'...
  Reoptimizing at t=504 (cardinality=21)
  Reoptimizing at t=630 (cardinality=27)
  Reoptimizing at t=756 (cardinality=22)
  Reoptimizing at t=882 (cardinality=17)
  Reoptimizing at t=1008 (cardinality=22)
  Reoptimizing at t=1134 (cardinality=21)
  Reoptimizing at t=1260 (cardinality=20)
  Reoptimizing at t=1386 (cardinality=27)
Backtesting portfolio 'Tracking assuming time-varying portfolio'...
  Reoptimizing at t=504 (cardinality=24)
  Reoptimizing at t=630 (cardinality=24)
  Reoptimizing at t=756 (cardinality=20)
  Reoptimizing at t=882 (cardinality=20)
  Reoptimizing at t=1008 (cardinality=22)
  Reoptimizing at t=1134 (cardinality=22)
  Reoptimizing at t=1260 (cardinality=23)
  Reoptimizing at t=1386 (cardinality=26)

We can then plot the tracking error and cardinality over time:

def _align_with_benchmark(
        returns: pd.DataFrame,
        benchmark: pd.DataFrame,
) -> tuple[pd.DataFrame, pd.Series]:
    """Compute benchmark returns and align with portfolio returns."""
    benchmark_rets = benchmark.pct_change().dropna().squeeze()
    common_idx = returns.index.intersection(benchmark_rets.index)
    return returns.loc[common_idx], benchmark_rets.loc[common_idx]


def compute_tracking_error(
        returns: pd.DataFrame,
        benchmark: pd.DataFrame,
) -> pd.Series:
    """
    Compute mean squared tracking error for each portfolio.

    The tracking error is defined as
    :math:`(1/T) \\sum_t (r_{p,t} - r_{b,t})^2`, consistent with the
    least-squares objective used in ``sparse_tracking_via_MM``.

    Parameters
    ----------
    returns : pd.DataFrame
        T x K DataFrame of portfolio returns (one column per strategy).
    benchmark : pd.DataFrame
        T x 1 DataFrame of benchmark prices (converted to returns internally).

    Returns
    -------
    pd.Series
        Mean squared tracking error for each portfolio.
    """
    # Benchmark prices -> returns, restricted to dates shared with the portfolios.
    bench_rets = benchmark.pct_change().dropna().squeeze()
    shared = returns.index.intersection(bench_rets.index)
    deviations = returns.loc[shared].subtract(bench_rets.loc[shared], axis=0)
    return deviations.pow(2).mean()

def plot_tracking_error(
        returns: pd.DataFrame,
        benchmark: pd.DataFrame,
        window: int = 252,
) -> pd.DataFrame:
    """
    Plot rolling mean squared tracking error over time.

    Parameters
    ----------
    returns : pd.DataFrame
        T x K DataFrame of portfolio returns.
    benchmark : pd.DataFrame
        T x 1 DataFrame of benchmark prices.
    window : int, default 252
        Rolling window size (in trading days).

    Returns
    -------
    pd.DataFrame
        T x K DataFrame of squared tracking errors (unsmoothed).
    """
    # Per-period squared deviations from the benchmark return.
    bench_rets = benchmark.pct_change().dropna().squeeze()
    shared = returns.index.intersection(bench_rets.index)
    sq_errors = returns.loc[shared].subtract(bench_rets.loc[shared], axis=0) ** 2

    # Plot the rolling average, but return the raw (unsmoothed) errors.
    sq_errors.rolling(window=window).mean().plot(figsize=(12, 6))
    plt.title(f'Tracking Error (Rolling {window}-day window)')
    plt.xlabel(None)
    plt.ylabel('Tracking Error')
    plt.legend(title='Portfolio')
    plt.grid(True)
    plt.tight_layout()
    plt.show()

    return sq_errors
TE = plot_tracking_error(returns, benchmark=SP500_index_2015to2020, window=252)
def _compute_cardinality(
        weights: dict[str, pd.DataFrame],
        thres: float = 1e-5,
) -> pd.DataFrame:
    """
    Compute portfolio cardinality over time.

    Parameters
    ----------
    weights : dict[str, pd.DataFrame]
        Dictionary mapping strategy names to T x N weight DataFrames.
    thres : float, default 1e-5
        Weights below this threshold are treated as zero.

    Returns
    -------
    pd.DataFrame
        T x K DataFrame with the number of active assets per strategy.
    """
    return pd.DataFrame({
        name: (w > thres).sum(axis=1)
        for name, w in weights.items()
    })


def compute_cardinality(
        weights: dict[str, pd.DataFrame],
        thres: float = 1e-5,
) -> pd.Series:
    """
    Compute mean portfolio cardinality for each strategy.

    Parameters
    ----------
    weights : dict[str, pd.DataFrame]
        Dictionary mapping strategy names to T x N weight DataFrames.
    thres : float, default 1e-5
        Weights below this threshold are treated as zero.

    Returns
    -------
    pd.Series
        Mean cardinality per strategy.
    """
    # Count active positions per period, then average over time.
    per_period = pd.DataFrame({
        name: w_df.gt(thres).sum(axis=1) for name, w_df in weights.items()
    })
    return per_period.mean()


def plot_cardinality(
        weights: dict[str, pd.DataFrame],
        thres: float = 1e-5,
) -> pd.DataFrame:
    """
    Plot portfolio cardinality over time.

    Parameters
    ----------
    weights : dict[str, pd.DataFrame]
        Dictionary mapping strategy names to T x N weight DataFrames.
    thres : float, default 1e-5
        Weights below this threshold are treated as zero.

    Returns
    -------
    pd.DataFrame
        T x K DataFrame with the number of active assets per strategy.
    """
    counts = _compute_cardinality(weights, thres)

    counts.plot(figsize=(12, 6))
    plt.title('Portfolio Cardinality Over Time')
    plt.xlabel(None)
    plt.ylabel('Number of Assets')
    plt.legend(title='Portfolio')
    plt.grid(True)
    plt.tight_layout()
    plt.show()

    return counts
# Per-period cardinality (also plots it over time)
cardinality = plot_cardinality(weights)
# Calculate column means for TE and cardinality
summary_df = pd.DataFrame({
    "TE": TE.mean(),
    "cardinality": cardinality.mean()
})

# Display
summary_df.style.format({
    'TE': '{:.2e}',  # Scientific notation with 2 decimal places
    'cardinality': '{:.2f}'  # Fixed-point with 2 decimal places
})
  TE cardinality
Tracking assuming fixed portfolio 4.63e-06 21.75
Tracking assuming time-varying portfolio 3.71e-06 22.37

Plain versus cumulative returns

Minimizing errors in period-by-period returns does not guarantee better long-term cumulative return tracking, as return errors can accumulate unevenly over time. To control long-term cumulative return deviations, the error measure should use long-term or cumulative returns. However, for short-term hedging, period-by-period return tracking is essential, as outperformance is not the objective.

The price error can be measured by the tracking error in terms of the cumulative returns as $$ \textm{TE}^\textm{cum} = \frac{1}{T}\left\|\bm{r}^\textm{b,cum} - \bm{X}^\textm{cum}\w\right\|^2_2, $$ where $\bm{X}^\textm{cum}$ (similarly $\bm{r}^\textm{b,cum}$) denotes the cumulative returns, whose rows can be obtained as $$ \bm{r}^\textm{cum}_t \approx \sum_{t'=1}^{t}\bm{r}_{t'}. $$

We now perform a backtest to assess the tracking error over time using plain returns and cumulative returns:

def tracking_using_cumreturns_wfixed(
        dataset: dict[str, pd.DataFrame],
        lambda_val: float = 20e-7,
) -> np.ndarray:
    """
    Index tracking with fixed weights on cumulative returns.

    Runs the sparse MM tracker on cumulatively summed asset and benchmark
    returns, so the fit targets long-term (price-level) deviations rather
    than period-by-period ones; the weight vector is held fixed over time.

    Parameters
    ----------
    dataset : dict[str, pd.DataFrame]
        Must contain ``'prices'`` and ``'index'``.
    lambda_val : float, default 20e-7
        Sparsity regularization parameter.

    Returns
    -------
    w : np.ndarray
        Sparse portfolio weight vector of length N.
    """
    asset_rets, bench_rets = _prepare_returns(dataset)
    # Running sums approximate cumulative returns (see the chapter text).
    w = sparse_tracking_via_MM(
        asset_rets.cumsum(), bench_rets.cumsum(),
        lambda_val=lambda_val, max_iter=10,
    )
    return w


def tracking_using_cumreturns_wt(
        dataset: dict[str, pd.DataFrame],
        lambda_val: float = 15e-7,
) -> np.ndarray:
    """
    Index tracking with time-varying weights on cumulative returns.

    Optimizes the initial portfolio over cumulatively summed
    alpha-transformed returns, then maps it to end-of-window weights
    via the final lagged alpha and renormalizes.

    Parameters
    ----------
    dataset : dict[str, pd.DataFrame]
        Must contain ``'prices'`` and ``'index'``.
    lambda_val : float, default 15e-7
        Sparsity regularization parameter.

    Returns
    -------
    w : np.ndarray
        Sparse portfolio weight vector of length N.
    """
    asset_rets, bench_rets = _prepare_returns(dataset)
    transforms = compute_alpha(asset_rets, bench_rets)

    # Fit the initial portfolio on cumulative transformed returns.
    w_init = sparse_tracking_via_MM(
        transforms['X_tilde'].cumsum(), bench_rets.cumsum(),
        lambda_val=lambda_val, max_iter=10,
    )

    # Scale by the final lagged alpha and renormalize to the simplex.
    w = w_init * transforms['last_alpha_lagged']
    total = w.sum()
    if total > 0:
        w /= total
    return w

Let's run the backtest:

# Prep data
T, N = SP500_stocks_2015to2020.shape
T_trn = 2*252  # two years of daily data for the estimation (lookback) window

# Run backtest comparing plain-return vs. cumulative-return tracking
returns, weights = backtest(
    portfolio_funcs={
        "Tracking using plain returns": tracking_using_plain_returns_wfixed,
        "Tracking using cumulative returns": tracking_using_cumreturns_wfixed
    },
    dataset={
        'prices': SP500_stocks_2015to2020,  #.loc["2018":]
        'index': SP500_index_2015to2020  # .loc["2018":]
    },
    lookback=T_trn,
    rebalance_every=6*21,  # Rebalance every 6 months (assuming 21 trading days per month)
    optimize_every=6*21,
)
Backtesting portfolio 'Tracking using plain returns'...
  Reoptimizing at t=504 (cardinality=21)
  Reoptimizing at t=630 (cardinality=27)
  Reoptimizing at t=756 (cardinality=22)
  Reoptimizing at t=882 (cardinality=17)
  Reoptimizing at t=1008 (cardinality=22)
  Reoptimizing at t=1134 (cardinality=21)
  Reoptimizing at t=1260 (cardinality=20)
  Reoptimizing at t=1386 (cardinality=27)
Backtesting portfolio 'Tracking using cumulative returns'...
  Reoptimizing at t=504 (cardinality=19)
  Reoptimizing at t=630 (cardinality=19)
  Reoptimizing at t=756 (cardinality=18)
  Reoptimizing at t=882 (cardinality=17)
  Reoptimizing at t=1008 (cardinality=22)
  Reoptimizing at t=1134 (cardinality=24)
  Reoptimizing at t=1260 (cardinality=23)
  Reoptimizing at t=1386 (cardinality=24)

We can then plot the (normalized) price tracked over time:

def plot_prices(
        returns: pd.DataFrame,
        benchmark: pd.DataFrame,
        title: str = 'Tracking of benchmark index',
) -> None:
    """
    Plot cumulative portfolio value against the benchmark.

    Portfolio returns are compounded into a NAV series and rescaled so
    that all curves start at the benchmark's initial price.

    Parameters
    ----------
    returns : pd.DataFrame
        T x K DataFrame of portfolio returns (one column per strategy).
    benchmark : pd.DataFrame
        T x 1 DataFrame of benchmark prices.
    title : str, default 'Tracking of benchmark index'
        Plot title.
    """
    # Compound returns into a NAV starting at 1.
    nav = (1 + returns).cumprod()
    # Restrict the benchmark to the backtest dates and read its first price.
    bench = benchmark.loc[nav.index]
    start_price = bench.iloc[0, 0]
    # Rescale every NAV curve to start at the benchmark's initial price.
    curves = pd.concat([bench, start_price * nav], axis=1)

    curves.plot(figsize=(12, 6))
    plt.title(title)
    plt.xlabel(None)
    plt.ylabel('dollars')
    plt.legend(title='Portfolio')
    plt.grid(True)
    plt.tight_layout()
    plt.show()
plot_prices(returns, benchmark=SP500_index_2015to2020)

Convergence of the iterative reweighted ℓ₁-norm method

The convergence of the iterative reweighted $\ell_1$-norm method (based on the MM framework) for sparse index tracking is very fast:

# Prep data
X = SP500_stocks_2015to2020.pct_change().dropna()
r = SP500_index_2015to2020.pct_change().dropna()

# Run index tracking MM algorithm, recording the objective at every iteration
w, obj_values = sparse_tracking_via_MM(X, r, lambda_val=5e-7, max_iter=10, return_obj=True)
# Plot convergence of the objective value over MM iterations
plt.figure(figsize=(10, 6))
plt.plot(range(len(obj_values)), obj_values, linewidth=1.2, color='blue')
plt.title('Convergence over iterations')
plt.xlabel('k')
plt.ylabel('Objective Value')
plt.xticks(range(len(obj_values)))
plt.grid(True)
plt.tight_layout()
plt.show()

Comparison of algorithms

We now compare the tracking of the S&P 500 index based on different tracking methods, namely: the full (dense) tracking portfolio using all constituents, a two-stage naive sparse design (truncate the dense solution to the $K$ largest weights and renormalize), and the sparse tracking portfolio obtained with the iterative reweighted $\ell_1$-norm (MM) method.

We now perform a backtest to assess the tracking error over time of the S&P 500 index with $K=20$ active assets for different algorithms. The tracking portfolios are computed on a rolling window basis with a lookback period of two years and recomputed every six months.

def _solve_tracking_qp(
        X_mat: np.ndarray,
        r_vec: np.ndarray,
        reg: float = 1e-7,
) -> np.ndarray:
    """
    Solve the dense least-squares tracking QP.

    Minimizes (1/T)||r - Xw||^2 subject to w >= 0 and 1'w = 1.

    Parameters
    ----------
    X_mat : np.ndarray
        T x N matrix of asset returns.
    r_vec : np.ndarray
        T x 1 vector of benchmark returns.
    reg : float, default 1e-7
        Tikhonov regularization as a fraction of the mean diagonal of X'X.

    Returns
    -------
    w : np.ndarray
        Weight vector of length N.
    """
    T, N = X_mat.shape

    # Quadratic cost: (1/T)||r - Xw||^2 up to a constant, i.e.,
    # 0.5 w' P w + q' w with P = (2/T) X'X and q = -(2/T) X'r.
    gram = (2 / T) * X_mat.T @ X_mat
    gram = 0.5 * (gram + gram.T)  # enforce exact symmetry for the solver
    gram += reg * np.mean(np.diag(gram)) * np.eye(N)  # light Tikhonov regularization
    lin = (-(2 / T) * X_mat.T @ r_vec).ravel()

    # Long-only (w >= 0, written as -w <= 0) and fully invested (1'w = 1).
    return solve_qp(gram, lin, -np.eye(N), np.zeros(N),
                    np.ones((1, N)), np.array([1.0]), solver="quadprog")


def full_tracking(
        dataset: dict[str, pd.DataFrame],
        reg: float = 1e-7,
) -> np.ndarray:
    """
    Full (dense) index tracking via least-squares QP.

    Parameters
    ----------
    dataset : dict[str, pd.DataFrame]
        Must contain ``'prices'`` and ``'index'``.
    reg : float, default 1e-7
        Tikhonov regularization as a fraction of the mean diagonal of X'X.

    Returns
    -------
    w : np.ndarray
        Portfolio weight vector of length N.
    """
    # Align asset and benchmark returns, then hand the raw arrays to the QP solver.
    X, r = _prepare_returns(dataset)
    asset_returns = X.values
    bench_returns = r.values.reshape(-1, 1)
    return _solve_tracking_qp(asset_returns, bench_returns, reg=reg)

Sanity check:

w = full_tracking(dataset={
        'prices': SP500_stocks_2015to2020.loc["2018":],
        'index': SP500_index_2015to2020.loc["2018":]
    })
sum(w), min(w)
(np.float64(0.9999999999999997), np.float64(-2.4190528545014647e-17))
def two_stage_naive_tracking(
        dataset: dict[str, pd.DataFrame],
        K: int = 20,
        reg: float = 1e-7,
) -> np.ndarray:
    """
    Two-stage naive index tracking.

    Stage 1 solves the dense (full) least-squares tracking QP; stage 2
    keeps only the K largest weights and renormalizes them to sum to one.

    Parameters
    ----------
    dataset : dict[str, pd.DataFrame]
        Must contain ``'prices'`` and ``'index'``.
    K : int, default 20
        Number of assets to retain in the sparse portfolio.
    reg : float, default 1e-7
        Tikhonov regularization as a fraction of the mean diagonal of X'X.

    Returns
    -------
    w : np.ndarray
        Sparse portfolio weight vector of length N with at most K non-zeros.
    """
    # Stage 1: dense tracking portfolio over all assets.
    w_dense = full_tracking(dataset, reg=reg)

    # Stage 2: zero out everything except the K largest weights.
    top_k = np.argsort(w_dense)[-K:]
    w_sparse = np.zeros_like(w_dense)
    w_sparse[top_k] = w_dense[top_k]

    # Renormalize to the full-investment budget (guard against an all-zero vector).
    total = w_sparse.sum()
    return w_sparse / total if total > 0 else w_sparse

Sanity check:

w = two_stage_naive_tracking(
    dataset={
        'prices': SP500_stocks_2015to2020.loc['2018':],
        'index': SP500_index_2015to2020.loc['2018':],
    },
    K=10,
)

print(f"Sum: {w.sum():.6f}, Min: {w.min():.6f}, Cardinality: {np.sum(w > 1e-5)}")
Sum: 1.000000, Min: 0.000000, Cardinality: 10
def two_stage_refitted_tracking(
        dataset: dict[str, pd.DataFrame],
        K: int = 20,
        reg: float = 1e-7,
) -> np.ndarray:
    """
    Two-stage index tracking with refitted weights.

    Stage 1 solves the dense tracking QP; stage 2 selects the K assets
    with the largest weights and re-solves the QP restricted to them.

    Parameters
    ----------
    dataset : dict[str, pd.DataFrame]
        Must contain ``'prices'`` and ``'index'``.
    K : int, default 20
        Number of assets to retain.
    reg : float, default 1e-7
        Tikhonov regularization as a fraction of the mean diagonal of X'X.

    Returns
    -------
    w : np.ndarray
        Sparse portfolio weight vector of length N with at most K non-zeros.
    """
    X, r = _prepare_returns(dataset)
    returns_mat = X.values
    bench = r.values.reshape(-1, 1)

    # Stage 1: dense portfolio over all N assets.
    w_dense = _solve_tracking_qp(returns_mat, bench, reg=reg)

    # Stage 2: refit the QP on the K assets with the largest dense weights.
    selected = np.argsort(w_dense)[-K:]
    w_refit = _solve_tracking_qp(returns_mat[:, selected], bench, reg=reg)

    # Scatter the refitted weights back into a full-length vector.
    w = np.zeros(returns_mat.shape[1])
    w[selected] = w_refit
    return w

Sanity check:

w = two_stage_refitted_tracking(dataset={
        'prices': SP500_stocks_2015to2020.loc["2018":],
        'index': SP500_index_2015to2020.loc["2018":]
    }, K=15)
sum(w), min(w), sum(w > 1e-5)
(np.float64(1.0), np.float64(0.0), np.int64(15))
def iterative_reweighted_l1_norm_tracking(
        dataset: dict[str, pd.DataFrame],
        lambda_val: float = 5e-7,
) -> np.ndarray:
    """
    Sparse index tracking via iteratively reweighted L1-norm (MM algorithm).

    Thin wrapper around the log-penalty MM solver
    ``sparse_tracking_via_MM`` applied to plain (non-cumulative) returns,
    i.e. fixed-weight tracking.

    Parameters
    ----------
    dataset : dict[str, pd.DataFrame]
        Must contain ``'prices'`` and ``'index'``.
    lambda_val : float, default 5e-7
        Sparsity regularization parameter.

    Returns
    -------
    w : np.ndarray
        Sparse portfolio weight vector of length N.
    """
    asset_rets, bench_rets = _prepare_returns(dataset)
    w = sparse_tracking_via_MM(asset_rets, bench_rets, lambda_val=lambda_val, max_iter=10)
    return w

Sanity check:

w = iterative_reweighted_l1_norm_tracking(dataset={
        'prices': SP500_stocks_2015to2020.loc["2018":],
        'index': SP500_index_2015to2020.loc["2018":]
    })
sum(w), min(w), sum(w > 1e-5)
(np.float64(1.0000000000000002), np.float64(0.0), np.int64(27))

Let's run the backtest:

# Prep data
T, N = SP500_stocks_2015to2020.shape
T_trn = 2*252  # two-year lookback for the rolling training window

# Parameters for the desired sparsity
# NOTE(review): K and lambda_val are defined here but never passed to the
# portfolio functions below; the functions rely on their own defaults
# (K=20, lambda_val=5e-7) happening to match — confirm this is intended.
K = 20
lambda_val=5e-7

# Run backtest
returns, weights = backtest(
    portfolio_funcs={
        "Full tracking": full_tracking,
        "Two-step design with proportional weights": two_stage_naive_tracking,
        "Two-step design with refitted weights": two_stage_refitted_tracking,
        "Iterative reweighted l1-norm method": iterative_reweighted_l1_norm_tracking
    },
    dataset={
        'prices': SP500_stocks_2015to2020,  #.loc["2018":]
        'index': SP500_index_2015to2020  # .loc["2018":]
    },
    lookback=T_trn,
    rebalance_every=6*21,  # Rebalance every 6 months (assuming 21 trading days per month)
    optimize_every=6*21,
)
Backtesting portfolio 'Full tracking'...
  Reoptimizing at t=504 (cardinality=267)
  Reoptimizing at t=630 (cardinality=272)
  Reoptimizing at t=756 (cardinality=277)
  Reoptimizing at t=882 (cardinality=278)
  Reoptimizing at t=1008 (cardinality=283)
  Reoptimizing at t=1134 (cardinality=271)
  Reoptimizing at t=1260 (cardinality=288)
  Reoptimizing at t=1386 (cardinality=243)
Backtesting portfolio 'Two-step design with proportional weights'...
  Reoptimizing at t=504 (cardinality=20)
  Reoptimizing at t=630 (cardinality=20)
  Reoptimizing at t=756 (cardinality=20)
  Reoptimizing at t=882 (cardinality=20)
  Reoptimizing at t=1008 (cardinality=20)
  Reoptimizing at t=1134 (cardinality=20)
  Reoptimizing at t=1260 (cardinality=20)
  Reoptimizing at t=1386 (cardinality=20)
Backtesting portfolio 'Two-step design with refitted weights'...
  Reoptimizing at t=504 (cardinality=20)
  Reoptimizing at t=630 (cardinality=19)
  Reoptimizing at t=756 (cardinality=20)
  Reoptimizing at t=882 (cardinality=20)
  Reoptimizing at t=1008 (cardinality=20)
  Reoptimizing at t=1134 (cardinality=20)
  Reoptimizing at t=1260 (cardinality=20)
  Reoptimizing at t=1386 (cardinality=20)
Backtesting portfolio 'Iterative reweighted l1-norm method'...
  Reoptimizing at t=504 (cardinality=21)
  Reoptimizing at t=630 (cardinality=27)
  Reoptimizing at t=756 (cardinality=22)
  Reoptimizing at t=882 (cardinality=17)
  Reoptimizing at t=1008 (cardinality=22)
  Reoptimizing at t=1134 (cardinality=21)
  Reoptimizing at t=1260 (cardinality=20)
  Reoptimizing at t=1386 (cardinality=27)

We can then plot the price, tracking error, and cardinality over time:

# Plot portfolio prices against the benchmark, then restrict the
# tracking-error/cardinality comparison to the sparse methods only.
plot_prices(returns, benchmark=SP500_index_2015to2020)
methods = ["Two-step design with proportional weights",
           "Two-step design with refitted weights",
           "Iterative reweighted l1-norm method"]

TE = plot_tracking_error(returns[methods], benchmark=SP500_index_2015to2020, window=252)
cardinality = plot_cardinality({key: weights[key] for key in methods})
# Calculate column means for TE and cardinality
summary_df = pd.DataFrame({
    "TE": TE.mean(),
    "cardinality": cardinality.mean()
})

# Display
summary_df.style.format({
    'TE': '{:.2e}',  # Scientific notation with 2 decimal places
    'cardinality': '{:.2f}'  # Fixed-point with 2 decimal places
})
  TE cardinality
Two-step design with proportional weights 7.56e-06 20.00
Two-step design with refitted weights 5.50e-06 19.87
Iterative reweighted l1-norm method 4.63e-06 21.75

Comparison of formulations

We consider again the comparison between assuming a fixed portfolio and a time-varying one in the definition of tracking error. More specifically, we now study the tradeoff of tracking error versus cardinality. Indeed, the assumption of a time-varying portfolio is the more accurate one, since the portfolio weights naturally drift with realized returns between rebalancing dates.

# Prep data
T, N = SP500_stocks_2015to2020.shape
T_trn = 2*252  # two-year lookback window

# Sweeping loop over the sparsity-regularization parameter lambda
lmd_sweep = 1e-5 * np.array([1, 2, 5, 10, 20, 50, 100, 200, 1000]) / 300
df_list = []
for lmd in lmd_sweep:
    print(f"Backtesting for lambda: {lmd:.2e}")

    # Run backtest.
    # NOTE: lmd is bound as a default argument (lmd=lmd) so each lambda
    # captures the current sweep value by value, guarding against the
    # late-binding-closure pitfall should the calls ever be deferred
    # outside this loop iteration.
    returns, weights = backtest(
        portfolio_funcs={
            "Assuming fixed portfolio": lambda dataset, lmd=lmd: tracking_using_plain_returns_wfixed(dataset, lambda_val=lmd),
            "Assuming time-varying portfolio": lambda dataset, lmd=lmd: tracking_using_plain_returns_wt(dataset, lambda_val=lmd)
        },
        dataset={
            'prices': SP500_stocks_2015to2020,
            'index': SP500_index_2015to2020
        },
        lookback=T_trn,
        rebalance_every=6*21,  # Rebalance every 6 months (assuming 21 trading days per month)
        optimize_every=6*21,
        verbose=False
    )

    # Calculate tracking error and cardinality for this lambda
    TE = compute_tracking_error(returns, benchmark=SP500_index_2015to2020)
    cardinality = compute_cardinality(weights)
    df_list.append(pd.DataFrame({
        "method": TE.index,
        "TE": TE.values,
        "K": cardinality.values,
    }))

# Combine all rows into a single DataFrame
df = pd.concat(df_list, ignore_index=True)
Backtesting for lambda: 3.33e-08
Backtesting for lambda: 6.67e-08
Backtesting for lambda: 1.67e-07
Backtesting for lambda: 3.33e-07
Backtesting for lambda: 6.67e-07
Backtesting for lambda: 1.67e-06
Backtesting for lambda: 3.33e-06
Backtesting for lambda: 6.67e-06
Backtesting for lambda: 3.33e-05
# Plot tracking error versus sparsity level
# (each point along a curve corresponds to one lambda value in the sweep)
sns.lineplot(data=df, x='K', y='TE', hue='method', style='method', linewidth=1.2)
plt.title('Tracking error versus sparsity level')
plt.xlabel('K (active assets)')
plt.ylabel('Tracking Error')
plt.legend(title=None)
plt.grid(True)
plt.tight_layout()
plt.show()

Enhanced index tracking

We now compare the following tracking error (TE) measures to track the S&P 500 index:

  • basic TE: $$ \textm{TE}(\w) = \frac{1}{T}\left\|\bm{r}^\textm{b} - \bm{X}\w\right\|^2_2; $$

  • robust Huberized TE: $$ \textm{Hub-TE}(\w) = \frac{1}{T}\sum_{t=1}^T \phi^{\textm{hub}}(r_t^\textm{b} - \bm{X}_{t,:}\w), $$ where $$\phi^{\textm{hub}}(x)=\begin{cases} x^{2}, & \quad|x|\leq M\\ M\left(2|x| - M\right), & \quad|x|>M; \end{cases}$$

  • $\ell_1$-norm TE: $$ \textm{TE}_1(\w) = \frac{1}{T}\left\|\bm{r}^\textm{b} - \bm{X}\w\right\|_1; $$

  • downside risk (DR) error measure: $$ \textm{DR}(\w) = \frac{1}{T}\left\|\left(\bm{r}^\textm{b} - \bm{X}\w\right)^+\right\|^2_2, $$ where $(\cdot)^+\triangleq\textm{max}(0,\cdot)$.

First, let's define the design functions for each of the methods based on the MM approach for enhanced index tracking:

def _prepare_cumulative_returns(
        dataset: dict[str, pd.DataFrame],
) -> tuple[pd.DataFrame, pd.Series, pd.DataFrame, pd.Series]:
    """
    Compute plain and cumulative returns, aligned on common valid dates.

    Returns
    -------
    X, r, X_cum, r_cum
        Plain asset/benchmark returns plus their cumulative sums.
    """
    plain_X, plain_r = _prepare_returns(dataset)
    cum_X = plain_X.cumsum()
    cum_r = plain_r.cumsum()
    return plain_X, plain_r, cum_X, cum_r

def _tracking_residual(X_cum: pd.DataFrame, w: np.ndarray, r_cum: pd.Series) -> np.ndarray:
    """Compute tracking residual e_t = X_cum @ w - r_cum as a 1-D array."""
    return X_cum.values @ w - r_cum.values
def TE_index_tracking(
        dataset: dict[str, pd.DataFrame],
        lambda_val: float = 150 * 6e-7,
        u: float = 0.5,
        max_iter: int = 10,
) -> np.ndarray:
    """
    Sparse index tracking on cumulative returns (squared tracking error).

    Parameters
    ----------
    dataset : dict[str, pd.DataFrame]
        Must contain ``'prices'`` and ``'index'``.
    lambda_val : float, default 9e-5
        Sparsity regularization parameter.
    u : float, default 0.5
        Scale parameter of the log penalty.
    max_iter : int, default 10
        Maximum MM iterations.

    Returns
    -------
    w : np.ndarray
        Sparse portfolio weight vector of length N.
    """
    # Only the cumulative series are needed for this formulation.
    _, _, cum_X, cum_r = _prepare_cumulative_returns(dataset)
    w = sparse_tracking_via_MM(cum_X, cum_r, lambda_val=lambda_val, u=u, max_iter=max_iter)
    return w
def DR_index_tracking(
        dataset: dict[str, pd.DataFrame],
        lambda_val: float = 150 * 6e-7,
        u: float = 0.5,
        max_iter_mm: int = 10,
        max_iter_outer: int = 5,
) -> np.ndarray:
    """
    Downside-risk index tracking on cumulative returns.

    Iteratively shifts the benchmark upward by the positive part of the
    tracking residual, so that only underperformance is penalized.

    Parameters
    ----------
    dataset : dict[str, pd.DataFrame]
        Must contain ``'prices'`` and ``'index'``.
    lambda_val : float, default 9e-5
        Sparsity regularization parameter.
    u : float, default 0.5
        Scale parameter of the log penalty.
    max_iter_mm : int, default 10
        Maximum MM iterations per subproblem.
    max_iter_outer : int, default 5
        Number of outer reweighting iterations.

    Returns
    -------
    w : np.ndarray
        Sparse portfolio weight vector of length N.
    """
    _, _, cum_X, cum_r = _prepare_cumulative_returns(dataset)

    # Initial fit against the unmodified benchmark.
    w = sparse_tracking_via_MM(cum_X, cum_r, lambda_val=lambda_val, u=u, max_iter=max_iter_mm)

    # Outer loop: absorb overperformance into the benchmark, then refit.
    for _ in range(max_iter_outer):
        e = _tracking_residual(cum_X, w, cum_r)  # (T,)
        shifted_bench = cum_r + np.maximum(0, e)
        w = sparse_tracking_via_MM(cum_X, shifted_bench, lambda_val=lambda_val, u=u, max_iter=max_iter_mm)

    return w
def TE1_index_tracking(
        dataset: dict[str, pd.DataFrame],
        lambda_val: float = 150 * 6e-7,
        u: float = 0.5,
        max_iter_mm: int = 10,
        max_iter_outer: int = 5,
        lambda_init: float = 100 * 6e-7,
) -> np.ndarray:
    """
    L1-norm index tracking on cumulative returns via IRLS.

    Iteratively reweights observations by the inverse square root of
    the absolute tracking residual to approximate the L1-norm objective.

    Parameters
    ----------
    dataset : dict[str, pd.DataFrame]
        Must contain ``'prices'`` and ``'index'``.
    lambda_val : float, default 9e-5
        Sparsity regularization parameter for the IRLS iterations.
    u : float, default 0.5
        Scale parameter of the log penalty.
    max_iter_mm : int, default 10
        Maximum MM iterations per subproblem.
    max_iter_outer : int, default 5
        Number of outer IRLS iterations.
    lambda_init : float, default 6e-5
        Sparsity parameter used only for the initial (warm-start) solve.
        Previously a hard-coded constant inside the function; exposed as a
        parameter so callers can control it. Note it deliberately differs
        from ``lambda_val``.

    Returns
    -------
    w : np.ndarray
        Sparse portfolio weight vector of length N.
    """
    _, _, X_cum, r_cum = _prepare_cumulative_returns(dataset)

    # Warm start: plain least-squares MM fit with lambda_init
    # (not lambda_val, which governs the subsequent IRLS iterations).
    w = sparse_tracking_via_MM(X_cum, r_cum, lambda_val=lambda_init, u=u, max_iter=max_iter_mm)

    for _ in range(max_iter_outer):
        # IRLS weights alpha_t = 1 / (2 |e_t|), applied via their square root
        # to both sides of the LS problem; floor |e_t| to avoid division by zero.
        residual = np.abs(_tracking_residual(X_cum, w, r_cum))  # (T,)
        alpha_sqrt = np.sqrt(1 / (2 * np.maximum(residual, 1e-12)))
        X_cum_w = X_cum.multiply(alpha_sqrt, axis=0)
        r_cum_w = r_cum * alpha_sqrt
        w = sparse_tracking_via_MM(X_cum_w, r_cum_w, lambda_val=lambda_val, u=u, max_iter=max_iter_mm)

    return w
def HubTE_index_tracking(
        dataset: dict[str, pd.DataFrame],
        lambda_val: float = 50 * 5e-8,
        u: float = 0.5,
        M: float = 1e-4,
        max_iter_mm: int = 10,
        max_iter_outer: int = 5,
) -> np.ndarray:
    """
    Huber-loss index tracking on cumulative returns via IRLS.

    Like ``TE1_index_tracking`` but uses a Huber-style weight that clips
    the reweighting at threshold ``M``, making it robust to large
    tracking deviations.

    Parameters
    ----------
    dataset : dict[str, pd.DataFrame]
        Must contain ``'prices'`` and ``'index'``.
    lambda_val : float, default 2.5e-6
        Sparsity regularization parameter.
    u : float, default 0.5
        Scale parameter of the log penalty.
    M : float, default 1e-4
        Huber threshold controlling the transition from L2 to L1.
    max_iter_mm : int, default 10
        Maximum MM iterations per subproblem.
    max_iter_outer : int, default 5
        Number of outer IRLS iterations.

    Returns
    -------
    w : np.ndarray
        Sparse portfolio weight vector of length N.
    """
    _, _, X_cum, r_cum = _prepare_cumulative_returns(dataset)

    # Initial solve on the raw cumulative returns.
    w = sparse_tracking_via_MM(X_cum, r_cum, lambda_val=lambda_val, u=u, max_iter=max_iter_mm)

    for _ in range(max_iter_outer):
        abs_err = np.abs(_tracking_residual(X_cum, w, r_cum))  # (T,)
        # Huber IRLS weight: 1 in the quadratic zone (|e| <= M), M/|e| beyond;
        # its square root is applied to both sides of the LS problem.
        root_weight = np.sqrt(np.minimum(1.0, M / np.maximum(abs_err, 1e-12)))
        w = sparse_tracking_via_MM(
            X_cum.multiply(root_weight, axis=0),
            r_cum * root_weight,
            lambda_val=lambda_val, u=u, max_iter=max_iter_mm,
        )

    return w

Let's run the backtest:

# Prep data
T, N = SP500_stocks_2015to2020.shape
T_trn = 2*252  # two-year lookback window

# Run backtest comparing the four tracking-error measures
# (each portfolio function uses its own default sparsity parameters)
returns, weights = backtest(
    portfolio_funcs={
        "TE based design": TE_index_tracking,
        "Hub-TE based design": HubTE_index_tracking,
        "TE_1 based design": TE1_index_tracking,
        "DR based design": DR_index_tracking
    },
    dataset={
        'prices': SP500_stocks_2015to2020,
        'index': SP500_index_2015to2020
    },
    lookback=T_trn,
    rebalance_every=6*21,  # rebalance every 6 months (21 trading days per month)
    optimize_every=6*21,
    verbose=True
)
Backtesting portfolio 'TE based design'...
  Reoptimizing at t=504 (cardinality=4)
  Reoptimizing at t=630 (cardinality=5)
  Reoptimizing at t=756 (cardinality=5)
  Reoptimizing at t=882 (cardinality=5)
  Reoptimizing at t=1008 (cardinality=7)
  Reoptimizing at t=1134 (cardinality=6)
  Reoptimizing at t=1260 (cardinality=6)
  Reoptimizing at t=1386 (cardinality=5)
Backtesting portfolio 'Hub-TE based design'...
  Reoptimizing at t=504 (cardinality=6)
  Reoptimizing at t=630 (cardinality=5)
  Reoptimizing at t=756 (cardinality=6)
  Reoptimizing at t=882 (cardinality=5)
  Reoptimizing at t=1008 (cardinality=7)
  Reoptimizing at t=1134 (cardinality=5)
  Reoptimizing at t=1260 (cardinality=6)
  Reoptimizing at t=1386 (cardinality=5)
Backtesting portfolio 'TE_1 based design'...
  Reoptimizing at t=504 (cardinality=34)
  Reoptimizing at t=630 (cardinality=31)
  Reoptimizing at t=756 (cardinality=37)
  Reoptimizing at t=882 (cardinality=34)
  Reoptimizing at t=1008 (cardinality=35)
  Reoptimizing at t=1134 (cardinality=32)
  Reoptimizing at t=1260 (cardinality=41)
  Reoptimizing at t=1386 (cardinality=30)
Backtesting portfolio 'DR based design'...
  Reoptimizing at t=504 (cardinality=4)
  Reoptimizing at t=630 (cardinality=5)
  Reoptimizing at t=756 (cardinality=5)
  Reoptimizing at t=882 (cardinality=3)
  Reoptimizing at t=1008 (cardinality=4)
  Reoptimizing at t=1134 (cardinality=4)
  Reoptimizing at t=1260 (cardinality=3)
  Reoptimizing at t=1386 (cardinality=5)
plot_prices(returns, benchmark=SP500_index_2015to2020)

Automatic sparsity control

The sparse index tracking formulation includes a regularization term $\lambda \|\w\|_0$ in the objective or as a constraint $\|\w\|_0 \le k$, where the hyper-parameter $\lambda$ or $k$ controls the sparsity level. Adjusting this hyper-parameter allows achieving different points on the error versus sparsity trade-off curve.

In practice, the goal is to choose a proper operating point on this trade-off curve without computing the entire curve. The false discovery rate (FDR), which is the probability of wrongly including a variable, is a key quantity for this decision. Controlling the FDR is the ideal way to decide whether to include a variable.

The T-Rex method, based on "dummy" variables, provides a practical approach for FDR control in sparse index tracking. Instead of tuning the sparsity hyper-parameter, T-Rex automatically selects the assets to include by controlling the FDR, avoiding the selection of irrelevant assets that are highly correlated with already selected ones. The T-Rex method is efficiently implemented in the R package TRexSelector (and under development in Python as trexselector) based on the paper:

Machkour, J., Tien, S., Palomar, D. P., and Muma, M. (2022). TRexSelector: T-Rex selector: High-dimensional variable selection & FDR control.

We now backtest the tracking portfolios obtained from the sparse penalized regression formulation with the FDR-controlling T-Rex method:

Machkour, J., Palomar, D. P., and Muma, M. (2025). FDR-controlled portfolio optimization for sparse financial index tracking. In Proceedings of the IEEE international conference on acoustics, speech, and signal processing (ICASSP). Hyderabad, India.

#
# T-Rex index tracking in Python under development.
#
# See R version in: https://portfoliooptimizationbook.com/R-code/R-index-tracking.html
#

Index Tracking via skfolio

In the previous sections we used a custom backtest() function. Here we replicate the same experiments using the skfolio library, which provides a scikit-learn–compatible backtesting framework via WalkForward and cross_val_predict.

Libraries and helpers

We import the necessary skfolio components and define two helpers:

  • Portfolio_via_CVXPY: adapted from Chapter 7 to also pass the benchmark returns y to the custom function, so it can solve tracking problems.
  • batch_cross_val_predict: a convenience wrapper (from Chapter 7) that runs cross_val_predict on a list of optimizers with timing information.
import time
from datetime import timedelta
from skfolio import Population
from skfolio.optimization import ConvexOptimization, MeanRisk, ObjectiveFunction
from skfolio.model_selection import cross_val_predict, WalkForward
from skfolio.preprocessing import prices_to_returns
from skfolio.measures import RiskMeasure
# For cardinality constraints (mixed-integer programming):
# pip install "cvxpy[scip]"

The class Portfolio_via_CVXPY wraps an arbitrary CVXPY-based portfolio design function so that it can be used within skfolio's backtesting infrastructure. Compared with the version in Chapter 7, cvxpy_fun receives both the asset returns X and the benchmark returns y, which is required for index-tracking formulations.

class Portfolio_via_CVXPY(ConvexOptimization):
    """Portfolio based on a custom CVXPY function (supports benchmark y).

    Wraps an arbitrary function ``cvxpy_fun(X, y) -> weights`` so it can be
    used inside skfolio's backtesting machinery; unlike the Chapter 7
    version, the benchmark returns ``y`` are also forwarded, which
    index-tracking formulations require.
    """

    def __init__(self, cvxpy_fun=None, **kwargs):
        # scikit-learn convention: store constructor arguments unmodified
        # and defer validation to fit(); extra kwargs (e.g. portfolio_params)
        # are handled by the ConvexOptimization base class.
        super().__init__(**kwargs)
        self.cvxpy_fun = cvxpy_fun

    def fit(self, X, y=None):
        """Compute weights by calling ``cvxpy_fun(X, y)``.

        X is the asset-returns DataFrame and y the benchmark returns
        (may be None if the custom function does not use it). The result
        is stored in ``weights_`` as a NumPy array, and ``self`` is
        returned per the scikit-learn estimator contract.
        """
        if self.cvxpy_fun is None:
            raise ValueError("cvxpy_fun must be provided at initialization")
        weights = self.cvxpy_fun(X, y)
        self.weights_ = np.asarray(weights)
        return self

The helper batch_cross_val_predict iterates over a list of optimizers, calling cross_val_predict on each one and printing elapsed times.

def batch_cross_val_predict(optimizers, X, cv, **kwargs) -> list:
    """Run cross_val_predict on multiple optimizers with timing information."""

    def _elapsed_since(t0):
        # Human-readable elapsed time, truncated to whole seconds.
        return str(timedelta(seconds=int(time.time() - t0)))

    backtests = []
    n_splits = cv.get_n_splits(X)
    batch_start = time.time()

    print(f"Starting batch backtesting with {len(optimizers)} portfolio designs...")
    print(f"Cross-validation: {n_splits} splits")
    print("=" * 70)

    for i, optim in enumerate(optimizers, 1):
        # Prefer the user-supplied display name; fall back to a generic one.
        params = getattr(optim, "portfolio_params", None)
        if params is not None:
            name = params.get("name", f"Portfolio {i}")
        else:
            name = f"Portfolio {i} ({type(optim).__name__})"

        print(f"[{i}/{len(optimizers)}] Backtesting '{name}'...")
        t0 = time.time()

        try:
            bt = cross_val_predict(
                optim, X, cv=cv,
                portfolio_params=getattr(optim, "portfolio_params", {}),
                **kwargs,
            )
        except Exception as e:
            print(f"  Failed '{name}' after {_elapsed_since(t0)}: {str(e)}")
            raise
        backtests.append(bt)
        print(f"  Completed '{name}' in {_elapsed_since(t0)}")

    print("=" * 70)
    print("Batch backtesting completed!")
    print(f"Total time: {_elapsed_since(batch_start)}")
    print(f"Successfully processed {len(backtests)} portfolios")

    return backtests

Tracking error minimization (built-in skfolio)

skfolio's MeanRisk class natively supports a tracking error constraint via the parameter max_tracking_error, defined as the RMSE (root mean square error) of the portfolio returns relative to a benchmark (see the skfolio tracking error example).

Here we create a minimum-variance long-only portfolio constrained to have a tracking error of at most 0.30% versus the S&P 500 index. Unlike the sparse MM formulations in the previous sections, this approach does not enforce sparsity — it may use all available assets.

We start by converting prices to returns via prices_to_returns, which handles both asset and benchmark prices.

# Convert prices to returns (both assets and benchmark)
X, y = prices_to_returns(SP500_stocks_2015to2020, SP500_index_2015to2020)
print(f"Asset returns: {X.shape[0]} observations × {X.shape[1]} assets")
print(f"Benchmark returns: {y.shape[0]} observations")
Asset returns: 1439 observations × 471 assets
Benchmark returns: 1439 observations

The MeanRisk model below minimizes portfolio variance subject to the constraint that the tracking error (RMSE relative to the S&P 500) does not exceed 0.30%. The default settings enforce long-only weights (min_weights=0) and a fully-invested budget (sum(w) = 1).

portf_min_var_TE = MeanRisk(
    objective_function=ObjectiveFunction.MINIMIZE_RISK,
    risk_measure=RiskMeasure.VARIANCE,
    max_tracking_error=0.003,
    portfolio_params=dict(name="Min-Variance with TE constraint"),
)

Quick sanity check: fit on a training slice and inspect the solution.

T_trn = 2 * 252
portf_min_var_TE.fit(X.iloc[:T_trn], y.iloc[:T_trn])
w = portf_min_var_TE.weights_
print(f"Sum of weights: {w.sum():.4f}")
print(f"Cardinality (non-zero weights): {np.sum(w > 1e-5)}")
print(f"Min weight: {w.min():.6f}")
Sum of weights: 1.0000
Cardinality (non-zero weights): 78
Min weight: 0.000000

Sparse index tracking (via Portfolio_via_CVXPY)

The functions sparse_tracking_via_MM and compute_alpha defined earlier in the notebook solve the sparsity-penalized tracking problem via the MM algorithm. We now write thin adapter functions that match the cvxpy_fun(X, y) signature expected by Portfolio_via_CVXPY, where X is a DataFrame of asset returns and y is a Series of benchmark returns (both provided automatically by skfolio during the walk-forward backtest).

We consider two variants:

  • Fixed portfolio: optimizes over plain returns — the weight vector $\mathbf{w}$ is assumed constant between rebalancing dates.
  • Time-varying portfolio: optimizes over $\alpha$-transformed returns, accounting for the natural drift of the portfolio between rebalancing dates.
def tracking_wfixed(X, y, lambda_val=5e-7):
    """Sparse index tracking assuming a fixed portfolio (adapter for skfolio)."""
    # y may arrive as a one-column frame; squeeze it down to a Series.
    benchmark = y.squeeze()
    return sparse_tracking_via_MM(X, benchmark, lambda_val=lambda_val, max_iter=10)


def tracking_wt(X, y, lambda_val=5e-7):
    """Sparse index tracking assuming a time-varying portfolio (adapter for skfolio)."""
    benchmark = y.squeeze()
    # Solve on the alpha-transformed returns, then map back to today's weights.
    alpha = compute_alpha(X, benchmark)
    w = sparse_tracking_via_MM(alpha["X_tilde"], benchmark, lambda_val=lambda_val, max_iter=10)
    w = w * alpha["last_alpha_lagged"]
    # Renormalize to the full-investment budget (guard against all-zero).
    total = w.sum()
    return w / total if total > 0 else w

Wrap each adapter function in a Portfolio_via_CVXPY estimator so that skfolio's cross_val_predict can call them during the walk-forward backtest.

portf_sparse_fixed = Portfolio_via_CVXPY(
    cvxpy_fun=tracking_wfixed,
    portfolio_params=dict(name="Sparse tracking (fixed w)"),
)

portf_sparse_tv = Portfolio_via_CVXPY(
    cvxpy_fun=tracking_wt,
    portfolio_params=dict(name="Sparse tracking (time-varying w)"),
)

Quick sanity check on the training slice (same X, y, T_trn from the previous subsection).

portf_sparse_fixed.fit(X.iloc[:T_trn], y.iloc[:T_trn])
w_fixed = portf_sparse_fixed.weights_
print("=== Fixed portfolio ===")
print(f"  Sum of weights: {w_fixed.sum():.4f}")
print(f"  Cardinality:    {np.sum(w_fixed > 1e-5)}")

portf_sparse_tv.fit(X.iloc[:T_trn], y.iloc[:T_trn])
w_tv = portf_sparse_tv.weights_
print("\n=== Time-varying portfolio ===")
print(f"  Sum of weights: {w_tv.sum():.4f}")
print(f"  Cardinality:    {np.sum(w_tv > 1e-5)}")
=== Fixed portfolio ===
  Sum of weights: 1.0000
  Cardinality:    22

=== Time-varying portfolio ===
  Sum of weights: 1.0000
  Cardinality:    25

Backtest and comparison

We now run a walk-forward backtest on all three portfolios using the same schedule as the original backtest() call:

  • Training window: T_trn = 2 × 252 trading days (~2 years).
  • Test/rebalance window: 6 × 21 trading days (~6 months).

At each fold, the model is refit on the training data and evaluated on the subsequent 6-month out-of-sample window.

Note: skfolio's cross_val_predict computes out-of-sample returns using the fixed optimized weights throughout each test fold (i.e., no intra-fold weight drift). This matches the "fixed portfolio" assumption. In contrast, the custom backtest() function earlier in this notebook drifts weights daily with realized returns, so results will differ slightly.

all_formulations = [
    portf_min_var_TE,
    portf_sparse_fixed,
    portf_sparse_tv,
]

cv = WalkForward(train_size=T_trn, test_size=6 * 21, purged_size=1)

bt_list = batch_cross_val_predict(
    all_formulations,
    X=X,
    cv=cv,
    y=y,
)

bt_population = Population(bt_list)
Starting batch backtesting with 3 portfolio designs...
Cross-validation: 7 splits
======================================================================
[1/3] Backtesting 'Min-Variance with TE constraint'...
  Completed 'Min-Variance with TE constraint' in 0:00:11
[2/3] Backtesting 'Sparse tracking (fixed w)'...
  Completed 'Sparse tracking (fixed w)' in 0:00:09
[3/3] Backtesting 'Sparse tracking (time-varying w)'...
  Completed 'Sparse tracking (time-varying w)' in 0:00:10
======================================================================
Batch backtesting completed!
Total time: 0:00:30
Successfully processed 3 portfolios

Summary statistics

bt_population.summary()
Min-Variance with TE constraint Sparse tracking (fixed w) Sparse tracking (time-varying w)
Mean 0.045% 0.056% 0.059%
Annualized Mean 11.46% 14.20% 14.88%
Variance 0.016% 0.018% 0.018%
Annualized Variance 3.94% 4.43% 4.57%
Semi-Variance 0.0087% 0.0098% 0.0098%
Annualized Semi-Variance 2.20% 2.47% 2.47%
Standard Deviation 1.25% 1.33% 1.35%
Annualized Standard Deviation 19.85% 21.05% 21.38%
Semi-Deviation 0.93% 0.99% 0.99%
Annualized Semi-Deviation 14.84% 15.73% 15.72%
Mean Absolute Deviation 0.65% 0.75% 0.75%
CVaR at 95% 3.19% 3.48% 3.46%
EVaR at 95% 6.75% 6.70% 6.70%
Worst Realization 11.39% 11.48% 11.48%
CDaR at 95% 20.91% 19.03% 18.64%
MAX Drawdown 41.43% 38.02% 35.84%
Average Drawdown 2.92% 3.35% 3.31%
EDaR at 95% 28.69% 26.16% 24.80%
First Lower Partial Moment 0.32% 0.38% 0.38%
Ulcer Index 0.060 0.060 0.058
Gini Mean Difference 1.04% 1.19% 1.19%
Value at Risk at 95% 1.56% 1.90% 1.91%
Drawdown at Risk at 95% 12.71% 11.74% 12.28%
Entropic Risk Measure at 95% 3.00 3.00 3.00
Fourth Central Moment 0.000067% 0.000062% 0.000070%
Fourth Lower Partial Moment 0.000043% 0.000040% 0.000040%
Skew -89.13% -77.23% -41.83%
Kurtosis 2732.17% 2000.00% 2135.79%
Sharpe Ratio 0.036 0.042 0.044
Annualized Sharpe Ratio 0.58 0.67 0.70
Sortino Ratio 0.049 0.057 0.060
Annualized Sortino Ratio 0.77 0.90 0.95
Mean Absolute Deviation Ratio 0.070 0.075 0.079
First Lower Partial Moment Ratio 0.14 0.15 0.16
Value at Risk Ratio at 95% 0.029 0.030 0.031
CVaR Ratio at 95% 0.014 0.016 0.017
Entropic Risk Measure Ratio at 95% 0.00015 0.00019 0.00020
EVaR Ratio at 95% 0.0067 0.0084 0.0088
Worst Realization Ratio 0.0040 0.0049 0.0051
Drawdown at Risk Ratio at 95% 0.0036 0.0048 0.0048
CDaR Ratio at 95% 0.0022 0.0030 0.0032
Calmar Ratio 0.0011 0.0015 0.0016
Average Drawdown Ratio 0.016 0.017 0.018
EDaR Ratio at 95% 0.0016 0.0022 0.0024
Ulcer Index Ratio 0.0076 0.0095 0.010
Gini Mean Difference Ratio 0.044 0.047 0.050
Avg nb of Assets per Portfolio 471.0 471.0 471.0
Number of Portfolios 7 7 7
Number of Failed Portfolios 0 0 0
Number of Fallback Portfolios 0 0 0

Cumulative returns

We plot the cumulative returns of each portfolio and overlay the S&P 500 benchmark (dashed black line) for visual comparison.

fig = bt_population.plot_cumulative_returns()

# Overlay benchmark cumulative returns
existing_x = fig.data[0].x
bench_oos = y.loc[bt_population[0].observations].squeeze()
bench_cum = (1 + bench_oos).cumprod() - 1

fig.add_scatter(
    x=existing_x,
    y=bench_cum.values,
    name="S&P 500",
    line=dict(color="black", dash="dash", width=1.5),
)
fig.update_layout(
    title="Cumulative Returns (out-of-sample)",
    yaxis_title="Growth of $1",
    legend=dict(x=0.02, y=0.98),
)
fig.show()

Tracking error

We compute the mean squared tracking error (consistent with the least-squares objective of sparse_tracking_via_MM) for each portfolio over the out-of-sample period:

$$\mathrm{TE} = \frac{1}{T}\sum_{t=1}^{T}(r_{p,t} - r_{b,t})^2.$$
# Compute out-of-sample mean squared tracking error per portfolio,
# consistent with the least-squares objective of sparse_tracking_via_MM.
results = []
for bt in bt_population:
    oos_ret = bt.returns          # numpy array of out-of-sample portfolio returns
    oos_obs = bt.observations     # date array of the out-of-sample period
    bench_ret = y.loc[oos_obs].values  # benchmark returns aligned on the same dates
    excess = oos_ret - bench_ret
    te_mse = (excess ** 2).mean()  # daily mean squared tracking error
    te_rmse = np.sqrt(te_mse)
    results.append({
        "Portfolio": bt.name,
        "TE (MSE)": f"{te_mse:.2e}",
        "TE (daily RMSE)": f"{te_rmse:.4%}",
        "TE (annualized RMSE)": f"{te_rmse * np.sqrt(252):.2%}",
        # NOTE(review): cardinality is taken from the FIRST fold only (bt[0]);
        # it may vary across folds — confirm this is the intended summary.
        "Cardinality": int(np.sum(bt[0].weights > 1e-5)),
    })

pd.DataFrame(results).set_index("Portfolio")
TE (MSE) TE (daily RMSE) TE (annualized RMSE) Cardinality
Portfolio
Min-Variance with TE constraint 3.31e-04 1.8186% 28.87% 78
Sparse tracking (fixed w) 3.50e-04 1.8716% 29.71% 22
Sparse tracking (time-varying w) 3.56e-04 1.8862% 29.94% 25

Native skfolio tracking approaches

skfolio provides three built-in approaches for tracking error optimization:

  1. Return-based tracking error constraint (max_tracking_error): constrains the tracking error while optimizing another objective (e.g., minimize CVaR).
  2. Weight-based target (target_weights): minimizes tracking error by finding weights close to a target allocation.
  3. Return-based target (BenchmarkTracker): minimizes tracking risk directly on excess returns (portfolio minus benchmark).

All three support sparsity via the cardinality parameter, which limits the number of invested assets using a mixed-integer solver (e.g., SCIP).

from skfolio.optimization import BenchmarkTracker

# Approach 1: return-based TE constraint — minimize CVaR subject to TE ≤ 0.3%.
portf_te_constraint = MeanRisk(
    objective_function=ObjectiveFunction.MINIMIZE_RISK,
    risk_measure=RiskMeasure.CVAR,
    max_tracking_error=0.003,
    portfolio_params={"name": "Min CVaR (TE ≤ 0.3%)"},
)

# Approach 2: weight-based target — stay close to the equal-weighted allocation.
n_assets = X.shape[1]
target_w = np.full(n_assets, 1.0 / n_assets)
portf_target_weights = MeanRisk(
    objective_function=ObjectiveFunction.MINIMIZE_RISK,
    risk_measure=RiskMeasure.STANDARD_DEVIATION,
    target_weights=target_w,
    portfolio_params={"name": "Track EW target (weight-based)"},
)

# Approach 3: minimize tracking risk directly on excess returns (dense weights).
portf_benchmark_tracker = BenchmarkTracker(
    risk_measure=RiskMeasure.STANDARD_DEVIATION,
    portfolio_params={"name": "BenchmarkTracker (dense)"},
)

# Approach 3b: same tracker with a cardinality cap for sparsity; the cap turns
# the problem into a MIP, hence the SCIP solver with a 60s time limit per solve.
portf_benchmark_tracker_sparse = BenchmarkTracker(
    risk_measure=RiskMeasure.STANDARD_DEVIATION,
    cardinality=20,
    solver="SCIP",
    solver_params={"scip_params": {"limits/time": 60}},
    portfolio_params={"name": "BenchmarkTracker (K=20)"},
)

# All four formulations go through the same batch backtest below.
native_formulations = [
    portf_te_constraint,
    portf_target_weights,
    portf_benchmark_tracker,
    portf_benchmark_tracker_sparse,
]

# Walk-forward scheme: train on T_trn days, test on ~6 months (6 * 21 trading
# days), with a 1-day purge between train and test windows.
cv = WalkForward(train_size=T_trn, purged_size=1, test_size=6 * 21)

# Backtest every native formulation under the same cross-validation splits.
bt_native = batch_cross_val_predict(native_formulations, X=X, y=y, cv=cv)

bt_native_pop = Population(bt_native)
Starting batch backtesting with 4 portfolio designs...
Cross-validation: 7 splits
======================================================================
[1/4] Backtesting 'Min CVaR (TE ≤ 0.3%)'...
  Completed 'Min CVaR (TE ≤ 0.3%)' in 0:00:21
[2/4] Backtesting 'Track EW target (weight-based)'...
  Completed 'Track EW target (weight-based)' in 0:00:01
[3/4] Backtesting 'BenchmarkTracker (dense)'...
  Completed 'BenchmarkTracker (dense)' in 0:00:02
[4/4] Backtesting 'BenchmarkTracker (K=20)'...
  Completed 'BenchmarkTracker (K=20)' in 0:07:02
======================================================================
Batch backtesting completed!
Total time: 0:07:27
Successfully processed 4 portfolios
# Performance summary table for the four native formulations, aggregated
# over the walk-forward splits (7 splits per the backtest log above).
bt_native_pop.summary()
Min CVaR (TE ≤ 0.3%) Track EW target (weight-based) BenchmarkTracker (dense) BenchmarkTracker (K=20)
Mean 0.047% 0.049% 0.054% 0.049%
Annualized Mean 11.90% 12.33% 13.59% 12.42%
Variance 0.016% 0.020% 0.018% 0.018%
Annualized Variance 4.07% 5.05% 4.57% 4.66%
Semi-Variance 0.0091% 0.011% 0.0100% 0.010%
Annualized Semi-Variance 2.29% 2.82% 2.51% 2.57%
Standard Deviation 1.27% 1.42% 1.35% 1.36%
Annualized Standard Deviation 20.17% 22.48% 21.37% 21.59%
Semi-Deviation 0.95% 1.06% 1.00% 1.01%
Annualized Semi-Deviation 15.13% 16.79% 15.85% 16.05%
Mean Absolute Deviation 0.68% 0.78% 0.73% 0.76%
CVaR at 95% 3.21% 3.64% 3.48% 3.51%
EVaR at 95% 6.97% 7.33% 6.93% 6.86%
Worst Realization 11.86% 12.46% 11.84% 11.50%
CDaR at 95% 19.60% 24.59% 20.33% 21.43%
MAX Drawdown 40.35% 45.55% 39.21% 40.11%
Average Drawdown 3.24% 3.65% 3.34% 3.42%
EDaR at 95% 28.49% 32.15% 27.34% 28.10%
First Lower Partial Moment 0.34% 0.39% 0.37% 0.38%
Ulcer Index 0.059 0.071 0.061 0.064
Gini Mean Difference 1.07% 1.23% 1.17% 1.20%
Value at Risk at 95% 1.71% 2.03% 1.91% 1.87%
Drawdown at Risk at 95% 10.78% 16.43% 12.55% 14.07%
Entropic Risk Measure at 95% 3.00 3.00 3.00 3.00
Fourth Central Moment 0.000075% 0.000094% 0.000076% 0.000073%
Fourth Lower Partial Moment 0.000047% 0.000059% 0.000046% 0.000045%
Skew -89.81% -76.49% -62.63% -67.16%
Kurtosis 2875.33% 2343.76% 2313.63% 2128.60%
Sharpe Ratio 0.037 0.035 0.040 0.036
Annualized Sharpe Ratio 0.59 0.55 0.64 0.58
Sortino Ratio 0.050 0.046 0.054 0.049
Annualized Sortino Ratio 0.79 0.73 0.86 0.77
Mean Absolute Deviation Ratio 0.070 0.063 0.073 0.065
First Lower Partial Moment Ratio 0.14 0.13 0.15 0.13
Value at Risk Ratio at 95% 0.028 0.024 0.028 0.026
CVaR Ratio at 95% 0.015 0.013 0.015 0.014
Entropic Risk Measure Ratio at 95% 0.00016 0.00016 0.00018 0.00016
EVaR Ratio at 95% 0.0068 0.0067 0.0078 0.0072
Worst Realization Ratio 0.0040 0.0039 0.0046 0.0043
Drawdown at Risk Ratio at 95% 0.0044 0.0030 0.0043 0.0035
CDaR Ratio at 95% 0.0024 0.0020 0.0027 0.0023
Calmar Ratio 0.0012 0.0011 0.0014 0.0012
Average Drawdown Ratio 0.015 0.013 0.016 0.014
EDaR Ratio at 95% 0.0017 0.0015 0.0020 0.0018
Ulcer Index Ratio 0.0080 0.0069 0.0088 0.0077
Gini Mean Difference Ratio 0.044 0.040 0.046 0.041
Avg nb of Assets per Portfolio 471.0 471.0 471.0 471.0
Number of Portfolios 7 7 7 7
Number of Failed Portfolios 0 0 0 0
Number of Fallback Portfolios 0 0 0 0
fig = bt_native_pop.plot_cumulative_returns()

# Reuse the x-axis of the first existing trace so the benchmark lines up.
existing_x = fig.data[0].x
bench_oos = y.loc[bt_native_pop[0].observations].squeeze()
# Compounded benchmark return over the same out-of-sample dates.
bench_cum = (1 + bench_oos).cumprod() - 1

fig.add_scatter(
    name="S&P 500",
    x=existing_x,
    y=bench_cum.values,
    line={"color": "black", "dash": "dash", "width": 1.5},
)
fig.update_layout(
    legend={"x": 0.02, "y": 0.98},
    title="Cumulative Returns – Native skfolio Tracking (out-of-sample)",
    yaxis_title="Growth of $1",
)
fig.show()
def _te_row(bt):
    """Build one summary row: tracking-error stats plus first-split cardinality."""
    diff = bt.returns - y.loc[bt.observations].values
    mse = (diff ** 2).mean()
    rmse = np.sqrt(mse)
    return {
        "Portfolio": bt.name,
        "TE (MSE)": f"{mse:.2e}",
        "TE (daily RMSE)": f"{rmse:.4%}",
        "TE (annualized RMSE)": f"{rmse * np.sqrt(252):.2%}",
        "Cardinality": int(np.sum(bt[0].weights > 1e-5)),
    }


results = [_te_row(bt) for bt in bt_native_pop]

pd.DataFrame(results).set_index("Portfolio")
TE (MSE) TE (daily RMSE) TE (annualized RMSE) Cardinality
Portfolio
Min CVaR (TE ≤ 0.3%) 3.36e-04 1.8328% 29.10% 59
Track EW target (weight-based) 3.75e-04 1.9364% 30.74% 471
BenchmarkTracker (dense) 3.56e-04 1.8857% 29.93% 284
BenchmarkTracker (K=20) 3.59e-04 1.8957% 30.09% 20