Portfolio Optimization
Chapter 6: Portfolio Basics

R code

Published

September 14, 2024

R code examples for Chapter 6 of the book:

Daniel P. Palomar (2024). Portfolio Optimization: Theory and Application. Cambridge University Press.

Loading packages

The following packages are used in the examples:

# basic finance
library(xts)                    # to manipulate time series of stock data
library(quantmod)               # to download stock data
library(PerformanceAnalytics)   # to compute performance measures
library(portfolioBacktest)      # to conduct backtests
library(pob)                    # book package with financial data

# plotting
library(ggplot2)                # for nice plots
library(reshape2)               # to reshape data

# optimization
library(CVXR)
library(riskParityPortfolio)

Preliminaries

We start with basic aspects such as data loading and plotting.

Loading market data

Loading market data could be conveniently done with the package quantmod from Yahoo!Finance or any other source:

# download data from Yahoo!Finance
getSymbols(c("AMD", "MGM", "AAPL", "AMZN", "TSCO"), from = "2019-10-01", to = "2021-12-31")

# extract the adjusted close prices of AAPL
Ad(APPL)

However, for convenience we will use stock data and cryptocurrency data from the official portfolio optimization book package pob:

library(pob)

# crypto data
data(cryptos_2017to2021)
names(cryptos_2017to2021)
[1] "daily"  "hourly"
head(cryptos_2017to2021$hourly[, 1:5])
                         BTC      ETH       ADA       DOT       XRP
2021-01-07 09:00:00 37485.61 1204.525 0.3309979 10.005659 0.3051329
2021-01-07 10:00:00 37040.38 1183.403 0.3085464  9.677910 0.3237329
2021-01-07 11:00:00 37806.57 1201.001 0.3119042  9.823281 0.3579904
2021-01-07 12:00:00 37936.25 1227.162 0.3179058  9.966991 0.3364566
2021-01-07 13:00:00 38154.69 1218.127 0.3185917  9.882065 0.3285119
2021-01-07 14:00:00 38441.49 1223.862 0.3156046  9.785281 0.3412451
# stock S&P500 market data
data(SP500_2015to2020)
names(SP500_2015to2020)
[1] "stocks" "index" 
head(SP500_2015to2020$stocks[, 1:5])
                 A     AAL     AAP   AAPL    ABBV
2015-01-05 37.7523 51.0467 154.217 24.239 50.8722
2015-01-06 37.1642 50.2556 154.108 24.241 50.6204
2015-01-07 37.6574 50.2272 157.420 24.581 52.6663
2015-01-08 38.7862 50.8430 158.800 25.526 53.2171
2015-01-09 38.5016 49.2891 157.991 25.553 51.7614
2015-01-12 38.0463 46.9772 156.641 24.923 51.7457

Plotting data

Plotting data could be conveniently done with the packages xts and PerformanceAnalytics:

stock_prices <- SP500_2015to2020$stocks["2019-10::", "AAPL"]

# basic xts plot
plot(stock_prices)

# using PerformanceAnalytics
rets <- CalculateReturns(stock_prices)
chart.CumReturns(rets)
chart.Drawdown(rets)

However, we prefer to use the package ggplot2 for more beautiful plots (albeit at the expense of a few extra lines of code):

stock_prices <- SP500_2015to2020$stocks["2019-10::", c("AMD", "MGM", "AAPL")]
rets <- stock_prices/lag(stock_prices) - 1

ggplot(fortify(stock_prices, melt = TRUE), aes(x = Index, y = Value, color = Series)) +
  geom_line(linewidth=1) +
  scale_x_date(date_breaks = "2 months", date_labels = "%b %Y", date_minor_breaks = "1 week") +
  labs(title = "Prices", x = "", y = "", color = "stocks")

ggplot(fortify(Drawdowns(rets), melt = TRUE), aes(x = Index, y = Value, color = Series)) +
  geom_line(linewidth=1) +
  scale_x_date(date_breaks = "2 months", date_labels = "%b %Y", date_minor_breaks = "1 week") +
  labs(title = "Drawdown", x = "", y = "", color = "stocks")

Backtesting

We now consider how to perform backtests and explore rebalancing aspects.

Naive backtesting

For illustrative purposes we now backtest the \(1/N\) portfolio:

stock_prices <- SP500_2015to2020$stocks["2019-10::", c("AMD", "MGM", "AAPL", "AMZN", "TSCO")]
N <- ncol(stock_prices)
T <- nrow(stock_prices)

# linear returns
X <- stock_prices/lag(stock_prices) - 1

# portfolio
w <- rep(1/N, N)

# naive backtest (assuming daily rebalancing and no transaction cost)
portf_ret <- xts(X %*% w, index(X))
head(na.omit(portf_ret))
                   [,1]
2019-10-02 -0.013131245
2019-10-03  0.012373766
2019-10-04  0.009848104
2019-10-07 -0.003871650
2019-10-08 -0.016876593
2019-10-09  0.011320010

However, multiplying the matrix of linear returns of the assets X by the portfolio w implicitly assumes that we are rebalancing at every period (and also ignoring the transaction costs). To perform more realistic backtests we can use the package portfolioBactest where one can specify how often the portfolio is reoptimized, rebalanced, and the lookback windows to be used in a rolling-window basis (as well as transaction costs). Let’s start by reproducing the previous naive backtest assuming daily rebalancing:

library(portfolioBacktest)

# define our portfolio
EWP <- function(dataset, ...) {
  N <- ncol(dataset$prices)
  return(rep(1/N, N))
}

# perform backtest
bt <- portfolioBacktest(portfolio_funs = list("1/N" = EWP),
                        dataset_list = list("dataset1" = list("prices" = stock_prices)),
                        lookback = 1, optimize_every = 1, rebalance_every = 1)
Backtesting 1 portfolios over 1 datasets (periodicity = daily data)...
head(bt$`1/N`$dataset1$return)
                 return
2019-10-02 -0.013131245
2019-10-03  0.012373766
2019-10-04  0.009848104
2019-10-07 -0.003871650
2019-10-08 -0.016876593
2019-10-09  0.011320010

Rebalancing in backtesting

Now we can perform a more realistic backtest rebalancing, say, every week (i.e., 5 days), including transaction costs:

bt <- portfolioBacktest(portfolio_funs = list("1/N" = EWP),
                        dataset_list = list("dataset1" = list("prices" = stock_prices)),
                        lookback = 1, optimize_every = 5, rebalance_every = 5,
                        cost = list(buy = 30e-4, sell = 30e-4))
Backtesting 1 portfolios over 1 datasets (periodicity = daily data)...
head(bt$`1/N`$dataset1$wealth)
                 NAV
2019-10-01 1.0000000
2019-10-02 0.9868688
2019-10-03 0.9991699
2019-10-04 1.0088971
2019-10-07 1.0049400
2019-10-08 0.9881182
backtestChartCumReturn(bt)

Let’s observe the evolution of the \(1/N\) portfolio over time for a universe of 5 stocks, showing the effect of rebalancing (indicated with black vertical lines):

stock_prices <- SP500_2015to2020$stocks["2019-10::", c("AMD", "MGM", "AAPL", "AMZN", "TSCO")]

bt <- portfolioBacktest(portfolio_funs = list("1/N" = EWP),
                        dataset_list = list("dataset1" = list("prices" = stock_prices)),
                        lookback = 10, optimize_every = 90, rebalance_every = 90)
Backtesting 1 portfolios over 1 datasets (periodicity = daily data)...
bt$`1/N`$dataset1$w_bop["2020-01::2020-08", ] |>
  fortify(melt = TRUE) |>
  ggplot(aes(x = Index, y = Value, fill = Series)) +
  geom_bar(stat = "identity", width = 4.0) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b %Y", date_minor_breaks = "1 week") +
  labs(title = "Weight allocation over time for portfolio 1/N", x = NULL, y = "weight", color = "stocks") +
  geom_vline(xintercept = as.Date("2020-02-23"), color = "black") +
  geom_vline(xintercept = as.Date("2020-07-03"), color = "black")

Heuristic portfolios

We now compare the following heuristic portfolios:

  • \(1/N\) portfolio: \[ \w = \frac{1}{N}\bm{1}; \]
  • GMRP: \[ \begin{array}{ll} \underset{\w}{\textm{maximize}} & \w^\T\bmu\\ \textm{subject to} & \bm{1}^\T\w=1, \quad \w\ge\bm{0}; \end{array} \]
  • Quintile portfolio (sorting stocks by \(\bm{\mu}\)): \[ \w = \frac{1}{N/5}\left[\begin{array}{c} \begin{array}{c} 1\\ \vdots\\ 1 \end{array}\\ \begin{array}{c} 0\\ \vdots\\ 0 \end{array} \end{array}\right]\begin{array}{c} \left.\begin{array}{c} \\ \\ \\ \end{array}\right\} 20\%\\ \left.\begin{array}{c} \\ \\ \\ \end{array}\right\} 80\% \end{array} \]
  • Quintile portfolio (sorting stocks by \(\bm{\mu}/\bm{\sigma}\)); and
  • Quintile portfolio (sorting stocks by \(\bm{\mu}/\bm{\sigma}^2\)).
data(SP500_2015to2020)
stock_prices <- SP500_2015to2020$stocks["2019::", c("AAPL", "AMZN", "AMD", "GM", "GOOGL", "MGM", "MSFT", "QCOM", "TSCO", "UPS")]
T <- nrow(stock_prices)
T_trn <- round(0.70*T)

#
# Define portfolios
#
QuintP_mu <- function(dataset, ...) {
  N <- ncol(dataset$prices)
  X <- diff(log(dataset$prices))[-1]
  mu <- colMeans(X)
  idx <- order(mu, decreasing = TRUE)
  w <- rep(0, N)
  w[idx[1:round(N/5)]] <- 1/round(N/5)
  return(w)
}

QuintP_mu_over_sigma <- function(dataset, ...) {
  N <- ncol(dataset$prices)
  X <- diff(log(dataset$prices))[-1]
  mu <- colMeans(X)
  Sigma <- cov(X)
  idx <- order(mu/sqrt(diag(Sigma)), decreasing = TRUE)
  w <- rep(0, N)
  w[idx[1:round(N/5)]] <- 1/round(N/5)
  return(w)
}

QuintP_mu_over_sigma2 <- function(dataset, ...) {
  N <- ncol(dataset$prices)
  X <- diff(log(dataset$prices))[-1]
  mu <- colMeans(X)
  Sigma <- cov(X)
  idx <- order(mu/diag(Sigma), decreasing = TRUE)
  w <- rep(0, N)
  w[idx[1:round(N/5)]] <- 1/round(N/5)
  return(w)
}

GMRP <- function(dataset, ...) {
  N <- ncol(dataset$prices)
  X <- diff(log(dataset$prices))[-1]
  mu <- colMeans(X)
  i_max <- which.max(mu)
  w <- rep(0, N)
  w[i_max] <- 1
  return(w)
}

# single backtest
bt <- portfolioBacktest(
  portfolio_funs = list("1/N"                          = EWP,
                        "GMRP"                         = GMRP,
                        "QuintP (sorted by mu)"        = QuintP_mu,
                        "QuintP (sorted by mu/sigma)"  = QuintP_mu_over_sigma,
                        "QuintP (sorted by mu/sigma2)" = QuintP_mu_over_sigma2),
  dataset_list = list("dataset1" = list("prices" = stock_prices)),
  lookback = T_trn, optimize_every = 30, rebalance_every = 1)
Backtesting 5 portfolios over 1 datasets (periodicity = daily data)...
# barplot
data.frame(
  "stocks" = names(stock_prices),
  "1/N"    = as.numeric(bt$`1/N`$dataset1$w_optimized[1, ]),
  "GMRP"   = as.numeric(bt$`GMRP`$dataset1$w_optimized[1, ]),
  "QuintP (sorted by mu)" = as.numeric(bt$`QuintP (sorted by mu)`$dataset1$w_optimized[1, ]),
  "QuintP (sorted by mu/sigma)" = as.numeric(bt$`QuintP (sorted by mu/sigma)`$dataset1$w_optimized[1, ]),
  "QuintP (sorted by mu/sigma2)" = as.numeric(bt$`QuintP (sorted by mu/sigma2)`$dataset1$w_optimized[1, ]),
  check.names = FALSE) |>
  melt(id.vars = "stocks") |>
  ggplot(aes(x = stocks, y = value, fill = variable)) +
  geom_bar(stat = "identity", position = "dodge", color = "black", width = 0.8) +
  labs(title = "Capital weight allocation for heuristic portfolios", y = "weight", fill = "portfolios")

backtestChartCumReturn(bt) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b %Y", date_minor_breaks = "1 day") +
  ggtitle("Cumulative return (out of sample)")

backtestChartDrawdown(bt) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b %Y", date_minor_breaks = "1 day") +
  ggtitle("Drawdown (out of sample)")

summary_bt <- backtestSummary(bt)
options(DT.options = list(dom = "t"))  # only show bare table (no search box)
summaryTable(summary_bt, type = "DT",
             measures = c("Sharpe ratio", "annual return", "annual volatility", "max drawdown")) 

Risk-based portfolios

We now compare the following risk-based portfolios:

  • Global minimum variance portfolio (GMVP): \[ \begin{array}{ll} \underset{\w}{\textm{minimize}} & \w^\T\bSigma\w\\ \textm{subject to} & \bm{1}^\T\w=1, \quad \w\ge\bm{0}; \end{array} \]

  • Inverse volatility portfolio (IVP): \[ \w = \frac{\bm{\sigma}^{-1}}{\bm{1}^\T\bm{\sigma}^{-1}}; \]

  • Risk parity portfolio (RPP);

  • Most diversified portfolio (MDP): \[ \begin{array}{ll} \underset{\w}{\textm{minimize}} & \dfrac{\w^\T\bm{\sigma}}{\sqrt{\w^\T\bmu\w}}\\ \textm{subject to} & \bm{1}^\T\w=1, \quad \w\ge\bm{0}; \end{array} \]

  • Maximum decorrelation portfolio (MDCP): \[ \begin{array}{ll} \underset{\w}{\textm{minimize}} & \w^\T\bm{C}\w\\ \textm{subject to} & \bm{1}^\T\w=1, \quad \w\ge\bm{0}, \end{array} \] where \(\bm{C} = \textm{Diag}\left(\bSigma\right)^{-1/2} \bSigma \textm{Diag}\left(\bSigma\right)^{-1/2}.\)

data(SP500_2015to2020)
stock_prices <- SP500_2015to2020$stocks["2019::", c("AAPL", "AMZN", "AMD", "GM", "GOOGL", "MGM", "MSFT", "QCOM", "TSCO", "UPS")]
T <- nrow(stock_prices)
T_trn <- round(0.70*T)

#
# Define portfolios
#
design_GMVP <- function(Sigma) {
  w <- Variable(nrow(Sigma))
  prob <- Problem(Minimize(quad_form(w, Sigma)),
                  constraints = list(w >= 0, sum(w) == 1))
  result <- solve(prob)
  w <- as.vector(result$getValue(w))
  return(w)
}

GMVP <- function(dataset, ...) {
  N <- ncol(dataset$prices)
  X <- diff(log(dataset$prices))[-1]
  Sigma <- cov(X)
  w <- design_GMVP(Sigma)
  return(w)
}

design_IVP <- function(Sigma) {
  sigma <- sqrt(diag(Sigma))
  w <- 1/sigma
  w <- w/sum(w)
  return(w)
}

IVP <- function(dataset, ...) {
  N <- ncol(dataset$prices)
  X <- diff(log(dataset$prices))[-1]
  Sigma <- cov(X)
  w <- design_IVP(Sigma)
  return(w)
}


RPP <- function(dataset, ...) {
  N <- ncol(dataset$prices)
  X <- diff(log(dataset$prices))[-1]
  Sigma <- cov(X)
  w <- riskParityPortfolio(Sigma)$w
  return(w)
}

design_MSRP <- function(mu, Sigma) {
  w_ <- Variable(nrow(Sigma))
  prob <- Problem(Minimize(quad_form(w_, Sigma)),
                  constraints = list(w_ >= 0, t(mu) %*% w_ == 1))
  result <- solve(prob)
  w <- as.vector(result$getValue(w_)/sum(result$getValue(w_)))
  names(w) <- colnames(Sigma)
  return(w)
}

MDP <- function(dataset, ...) {
  N <- ncol(dataset$prices)
  X <- diff(log(dataset$prices))[-1]
  Sigma <- cov(X)
  w <- design_MSRP(mu = sqrt(diag(Sigma)), Sigma)
  return(w)
}

design_MDCP <- function(Sigma) {
  C <- diag(1/sqrt(diag(Sigma))) %*% Sigma %*% diag(1/sqrt(diag(Sigma)))
  colnames(C) <- colnames(Sigma)
  return(design_GMVP(Sigma = C))
}

MDCP <- function(dataset, ...) {
  N <- ncol(dataset$prices)
  X <- diff(log(dataset$prices))[-1]
  Sigma <- cov(X)
  w <- design_MDCP(Sigma)
  return(w)
}


# single backtest
bt <- portfolioBacktest(portfolio_funs = list("GMVP"  = GMVP,
                                              "IVP"   = IVP,
                                              "RPP"   = RPP,
                                              "MDP"   = MDP,
                                              "MDCP"  = MDCP),
                        dataset_list = list("dataset1" = list("prices" = stock_prices)),
                        lookback = T_trn, optimize_every = 30, rebalance_every = 1)
Backtesting 5 portfolios over 1 datasets (periodicity = daily data)...
# barplot
data.frame(
  "stocks" = names(stock_prices),
  "GMVP"   = as.numeric(bt$`GMVP`$dataset1$w_optimized[1, ]),
  "IVP"    = as.numeric(bt$`IVP`$dataset1$w_optimized[1, ]),
  "RPP"    = as.numeric(bt$`RPP`$dataset1$w_optimized[1, ]),
  "MDP"    = as.numeric(bt$`MDP`$dataset1$w_optimized[1, ]),
  "MDCP"   = as.numeric(bt$`MDCP`$dataset1$w_optimized[1, ]),
  check.names = FALSE) |> 
  melt(id.vars = "stocks") |>
  ggplot(aes(x = stocks, y = value, fill = variable)) +
  geom_bar(stat = "identity", position = "dodge", color = "black", width = 0.8) +
  labs(title = "Capital weight allocation for risk-based portfolios", y = "weight", fill = "portfolios")

backtestChartCumReturn(bt) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b %Y", date_minor_breaks = "1 day") +
  ggtitle("Cumulative return (out of sample)")

backtestChartDrawdown(bt) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b %Y", date_minor_breaks = "1 day") +
  ggtitle("Drawdown (out of sample)")

summary_bt <- backtestSummary(bt)
summaryTable(summary_bt, type = "DT",
             measures = c("Sharpe ratio", "annual return", "annual volatility", "max drawdown"))