I am trying to construct a long/short portfolio in R. Say I have two portfolios, Tech and Mature, and I want to go long the Tech portfolio and short the Mature portfolio.
I collect and download the data using:
tickers <- c("GOOG", "MSFT", "GE", "WAL")
library(tidyquant)
library(dplyr)
library(tidyverse)
library(glue)
data <- tq_get(tickers,
get = "stock.prices",
from = "2010-01-01",
to = "2015-01-01") %>%
group_by(symbol) %>%
tq_transmute(select = adjusted,
mutate_fun = periodReturn,
period = "monthly",
type = "arithmetic") %>%
mutate(portfolio = case_when(
symbol == "GOOG" ~ "Tech",
symbol == "MSFT" ~ "Tech",
symbol == "GE" ~ "Mature",
symbol == "WAL" ~ "Mature"
),
date = ceiling_date(date, "month") - days(1)) %>% # set the date to the end of the month
group_by(portfolio, date) %>%
summarise(
Port_Monthly_Rets = mean(monthly.returns)
)
Which looks like:
# A tibble: 120 x 3
# Groups: portfolio [2]
portfolio date Port_Monthly_Rets
1 Mature 2010-01-31 0.173
2 Mature 2010-02-28 0.0639
3 Mature 2010-03-31 0.0605
4 Mature 2010-04-30 0.283
5 Mature 2010-05-31 -0.104
6 Mature 2010-06-30 -0.111
7 Mature 2010-07-31 0.0659
8 Mature 2010-08-31 -0.130
9 Mature 2010-09-30 0.113
10 Mature 2010-10-31 -0.0563
# ... with 110 more rows
I also download the Fama-French factors:
# Collect Fama French Factors:
temp <- tempfile()
base <- "http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/"
factor <- "North_America_3_Factors"
format <- "_CSV.zip"
full_url <- glue(base, factor, format, .sep = "") # glue's separator argument is .sep, not sep
download.file(full_url,temp,quiet = TRUE)
FF_Factors <- read_csv(unz(temp, "North_America_3_Factors.csv"), skip = 6) %>%
rename(date = X1) %>%
mutate_at(vars(-date), as.numeric) %>%
mutate(date = rollback(ymd(parse_date_time(date, "%Y%m") + months(1)))) %>%
drop_na(date) %>%
rename(Mkt_Rf_3 = `Mkt-RF`,
SMB_3 = SMB,
HML_3 = HML,
RF_3 = RF)
I join everything together and compute the excess returns:
# Join factors with portfolios
df <- data %>%
left_join(FF_Factors, by = "date") %>%
mutate(R_excess = Port_Monthly_Rets - RF_3)
Which now looks like:
# A tibble: 120 x 8
# Groups: portfolio [2]
portfolio date Port_Monthly_Rets Mkt_Rf_3 SMB_3 HML_3 RF_3 R_excess
1 Mature 2010-01-31 0.173 -3.63 0.45 0.89 0 0.173
2 Mature 2010-02-28 0.0639 3.71 0.86 2.35 0 0.0639
3 Mature 2010-03-31 0.0605 6.27 1.5 1.48 0.01 0.0505
4 Mature 2010-04-30 0.283 1.92 3.73 1.9 0.01 0.273
5 Mature 2010-05-31 -0.104 -7.86 0.74 -2.32 0.01 -0.114
6 Mature 2010-06-30 -0.111 -5.5 -1.07 -3.73 0.01 -0.121
7 Mature 2010-07-31 0.0659 6.95 0.02 1.03 0.01 0.0559
8 Mature 2010-08-31 -0.130 -4.44 -1.61 -2.35 0.01 -0.140
9 Mature 2010-09-30 0.113 9.47 3.67 -1.85 0.01 0.103
10 Mature 2010-10-31 -0.0563 3.89 0.98 -1.55 0.01 -0.0663
# ... with 110 more rows
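One thing worth checking at this point is units. The portfolio returns from periodReturn are decimals (0.173 means 17.3%), while the Ken French files report the factors and RF in percent (so Mkt_Rf_3 = -3.63 means -3.63%). Below is a minimal sketch of putting everything on the decimal scale before computing excess returns. The toy data frame and the new column names (Mkt_Rf_dec, RF_dec) are made up for illustration. The values are copied from the first three rows printed above.

```r
# Toy stand-in for df, using the first three rows printed above
toy <- data.frame(
  Port_Monthly_Rets = c(0.173, 0.0639, 0.0605),  # decimals
  Mkt_Rf_3          = c(-3.63, 3.71, 6.27),      # percent
  RF_3              = c(0, 0, 0.01)              # percent
)

# Assumption: factor columns are in percent, so divide by 100 to get decimals
toy$Mkt_Rf_dec <- toy$Mkt_Rf_3 / 100
toy$RF_dec     <- toy$RF_3 / 100

# Excess return with both terms on the same (decimal) scale
toy$R_excess <- toy$Port_Monthly_Rets - toy$RF_dec
toy
```

Rescaling a regressor by a constant rescales its coefficient by the same factor and leaves the t-statistics unchanged, so this mainly affects how the coefficients read and the size of the RF adjustment.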
I run Fama-French regressions on my portfolios:
summary(lm(R_excess ~ Mkt_Rf_3 + SMB_3 + HML_3, data = subset(df, portfolio == "Mature")))
summary(lm(R_excess ~ Mkt_Rf_3 + SMB_3 + HML_3, data = subset(df, portfolio == "Tech")))
For example, the output for the Mature portfolio (see the Call line) looks like:
Call:
lm(formula = R_excess ~ Mkt_Rf_3 + SMB_3 + HML_3, data = subset(df,
portfolio == "Mature"))
Residuals:
Min 1Q Median 3Q Max
-0.10673 -0.03100 -0.01297 0.02820 0.18958
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.010782 0.007560 1.426 0.15934
Mkt_Rf_3 0.009913 0.002130 4.654 0.0000204 ***
SMB_3 0.008572 0.004173 2.054 0.04463 *
HML_3 0.011192 0.004137 2.705 0.00903 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.05491 on 56 degrees of freedom
Multiple R-squared: 0.5271, Adjusted R-squared: 0.5018
F-statistic: 20.81 on 3 and 56 DF, p-value: 0.00000000349
I want to go long the Tech portfolio and short the Mature portfolio. Here is my attempt at creating the long/short portfolio:
LS_Portfolio <- df %>%
select(-Port_Monthly_Rets) %>%
pivot_wider(names_from = portfolio, values_from = c(R_excess)) %>%
mutate(Long_Short = Tech - Mature)
summary(lm(Long_Short ~ Mkt_Rf_3 + SMB_3 + HML_3, data = LS_Portfolio))
Which outputs:
Call:
lm(formula = Long_Short ~ Mkt_Rf_3 + SMB_3 + HML_3, data = LS_Portfolio)
Residuals:
Min 1Q Median 3Q Max
-0.248952 -0.027568 0.003106 0.029772 0.191031
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.017876 0.009873 -1.810 0.07559 .
Mkt_Rf_3 0.002387 0.002782 0.858 0.39457
SMB_3 -0.018172 0.005450 -3.334 0.00152 **
HML_3 -0.012778 0.005404 -2.365 0.02154 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.07171 on 56 degrees of freedom
Multiple R-squared: 0.2307, Adjusted R-squared: 0.1895
F-statistic: 5.597 on 3 and 56 DF, p-value: 0.001983
This looks wrong to me: I would expect Mkt_Rf_3 to be significant. I get very similar regression results when I apply this to a larger portfolio of stocks.
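A sanity check that may help here: when the two legs and the spread are regressed on the same factors over the same months, the long/short coefficients equal the Tech coefficients minus the Mature coefficients, because OLS is linear in the dependent variable. The market loading of the spread is therefore only large when the two legs have clearly different market betas. Here is a minimal sketch on simulated data. All names and numbers are hypothetical and are not taken from the real sample.

```r
set.seed(1)                      # simulated data only, not real returns
n   <- 60                        # 60 months, as in the sample above
mkt <- rnorm(n, 0.01, 0.04)      # hypothetical market excess return (decimal)
smb <- rnorm(n, 0, 0.02)         # hypothetical SMB factor
hml <- rnorm(n, 0, 0.02)         # hypothetical HML factor

# Two hypothetical legs with different market betas (1.2 and 1.0)
tech   <- 1.2 * mkt - 0.3 * smb - 0.2 * hml + rnorm(n, 0, 0.03)
mature <- 1.0 * mkt + 0.4 * smb + 0.5 * hml + rnorm(n, 0, 0.03)
long_short <- tech - mature

fit_tech   <- lm(tech ~ mkt + smb + hml)
fit_mature <- lm(mature ~ mkt + smb + hml)
fit_ls     <- lm(long_short ~ mkt + smb + hml)

# The two columns agree up to floating-point error
cbind(difference = coef(fit_tech) - coef(fit_mature),
      long_short = coef(fit_ls))
```

Taking the single-portfolio output printed earlier as the Mature fit (as its Call line indicates), the same identity would imply a Tech market loading of roughly 0.009913 + 0.002387 = 0.0123, which can be compared against the Tech regression.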
My question is not about the programming, but about how I should construct the long/short portfolio given that I have a number of stocks in each portfolio.
What I do is:
1) Form a portfolio of stocks based on some criteria
2) Compute the average monthly returns of each portfolio
3) Compute the excess return of each portfolio by subtracting the risk-free rate (from the Fama-French data) from the average monthly portfolio return
4) Subtract the Mature portfolio returns from the Tech portfolio returns.
5) Run my regression analysis (which performs poorly).
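The steps above can be sketched in base R as follows. This is only an illustration on simulated data: the stock betas, the risk-free rate, the helper sim_stock, and the use of a single market factor instead of three are all assumptions made to keep the example short.

```r
set.seed(42)                        # simulated data only
n_months <- 60
rf  <- rep(0.0001, n_months)        # hypothetical monthly risk-free rate (decimal)
mkt <- rnorm(n_months, 0.01, 0.04)  # hypothetical market excess return (decimal)

# Step 1: two portfolios of hypothetical stocks (one column per stock)
sim_stock <- function(beta) rf + beta * mkt + rnorm(n_months, 0, 0.05)
tech_stocks   <- sapply(c(1.3, 1.1, 1.2), sim_stock)
mature_stocks <- sapply(c(0.8, 0.9, 0.7), sim_stock)

# Step 2: equal-weighted monthly return of each portfolio
tech_ret   <- rowMeans(tech_stocks)
mature_ret <- rowMeans(mature_stocks)

# Step 3: excess returns, with rf in the same units (decimals) as the returns
tech_ex   <- tech_ret - rf
mature_ex <- mature_ret - rf

# Step 4: long Tech, short Mature; rf cancels in the difference
long_short <- tech_ex - mature_ex

# Step 5: regress the spread on the factor
fit <- lm(long_short ~ mkt)
summary(fit)
```

Because the risk-free rate is subtracted from both legs, the spread in step 4 is the same as the difference of the raw portfolio returns.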
I just want to know where I am going wrong in my logic.