Monday, February 23, 2015

programming - Finance: Portfolio - Long Short Portfolio construction


I am trying to construct a Long / Short portfolio in R. Say I have two portfolios, Tech and Mature, and I want to go long the Tech portfolio and short the Mature portfolio.


I collect and download the data using:


tickers <- c("GOOG", "MSFT", "GE", "WAL")

library(tidyquant)
library(dplyr)
library(tidyverse)
library(glue)

data <- tq_get(tickers,
               get  = "stock.prices",
               from = "2010-01-01",
               to   = "2015-01-01") %>%
  group_by(symbol) %>%
  # monthly arithmetic returns per stock from adjusted prices
  tq_transmute(select     = adjusted,
               mutate_fun = periodReturn,
               period     = "monthly",
               type       = "arithmetic") %>%
  mutate(portfolio = case_when(
           symbol == "GOOG" ~ "Tech",
           symbol == "MSFT" ~ "Tech",
           symbol == "GE"   ~ "Mature",
           symbol == "WAL"  ~ "Mature"
         ),
         date = ceiling_date(date, "month") - days(1)) %>% # set the date to the end of the month
  group_by(portfolio, date) %>%
  # equal-weighted average return of the stocks in each portfolio
  summarise(
    Port_Monthly_Rets = mean(monthly.returns)
  )

Which looks like:


# A tibble: 120 x 3
# Groups:   portfolio [2]
   portfolio date       Port_Monthly_Rets
 1 Mature    2010-01-31            0.173
 2 Mature    2010-02-28            0.0639
 3 Mature    2010-03-31            0.0605
 4 Mature    2010-04-30            0.283
 5 Mature    2010-05-31           -0.104
 6 Mature    2010-06-30           -0.111
 7 Mature    2010-07-31            0.0659
 8 Mature    2010-08-31           -0.130
 9 Mature    2010-09-30            0.113
10 Mature    2010-10-31           -0.0563
# ... with 110 more rows

I also download the Fama French Factors:


# Collect Fama French Factors:

temp     <- tempfile()
base     <- "http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/"
factor   <- "North_America_3_Factors"
format   <- "_CSV.zip"
full_url <- glue(base, factor, format) # unnamed arguments are concatenated without a separator

download.file(full_url, temp, quiet = TRUE)

FF_Factors <- read_csv(unz(temp, "North_America_3_Factors.csv"), skip = 6) %>%
  rename(date = X1) %>% # the unnamed first column is X1 in readr 1.x (newer versions call it ...1)
  mutate_at(vars(-date), as.numeric) %>%
  # parse YYYYMM and move the date to the last day of that month
  mutate(date = rollback(ymd(parse_date_time(date, "%Y%m") + months(1)))) %>%
  drop_na(date) %>%
  rename(Mkt_Rf_3 = `Mkt-RF`,
         SMB_3    = SMB,
         HML_3    = HML,
         RF_3     = RF)

I join everything together and compute the excess returns:


# Join factors with portfolios

df <- data %>%
  left_join(FF_Factors, by = "date") %>%
  mutate(R_excess = Port_Monthly_Rets - RF_3)

Which now looks like:



# A tibble: 120 x 8
# Groups:   portfolio [2]
   portfolio date       Port_Monthly_Rets Mkt_Rf_3 SMB_3 HML_3  RF_3 R_excess
 1 Mature    2010-01-31            0.173     -3.63  0.45  0.89  0      0.173
 2 Mature    2010-02-28            0.0639     3.71  0.86  2.35  0      0.0639
 3 Mature    2010-03-31            0.0605     6.27  1.5   1.48  0.01   0.0505
 4 Mature    2010-04-30            0.283      1.92  3.73  1.9   0.01   0.273
 5 Mature    2010-05-31           -0.104     -7.86  0.74 -2.32  0.01  -0.114
 6 Mature    2010-06-30           -0.111     -5.5  -1.07 -3.73  0.01  -0.121
 7 Mature    2010-07-31            0.0659     6.95  0.02  1.03  0.01   0.0559
 8 Mature    2010-08-31           -0.130     -4.44 -1.61 -2.35  0.01  -0.140
 9 Mature    2010-09-30            0.113      9.47  3.67 -1.85  0.01   0.103
10 Mature    2010-10-31           -0.0563     3.89  0.98 -1.55  0.01  -0.0663
# ... with 110 more rows
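
One thing worth checking in the table above is units. The factor columns look like percentages (for example Mkt_Rf_3 = -3.63), while Port_Monthly_Rets is a decimal return (0.173), so subtracting RF_3 = 0.01 removes one percentage point rather than 0.01%. The following is a minimal sketch, not the original code, of putting everything on the same scale before computing the excess return. The data frame df_demo is hypothetical and copies three rows from the table; dividing by 100 assumes the factor file is quoted in percent.

```r
# Hypothetical mini data frame copying three rows of the joined table above.
df_demo <- data.frame(
  Port_Monthly_Rets = c(0.173, 0.0639, 0.0605),  # decimal returns
  Mkt_Rf_3          = c(-3.63, 3.71, 6.27),      # assumed to be in percent
  SMB_3             = c(0.45, 0.86, 1.5),
  HML_3             = c(0.89, 2.35, 1.48),
  RF_3              = c(0, 0, 0.01)
)

# Assumption: all factor columns are in percent, so divide by 100
# to match the decimal portfolio returns.
factor_cols <- c("Mkt_Rf_3", "SMB_3", "HML_3", "RF_3")
df_demo[factor_cols] <- df_demo[factor_cols] / 100

# Excess return with both terms in decimals.
df_demo$R_excess <- df_demo$Port_Monthly_Rets - df_demo$RF_3

print(df_demo)
```

With a consistent scale the factor loadings become comparable to conventional betas (around 1 for the market factor) instead of being roughly 100 times smaller.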

I run Fama French regressions on my portfolios:


summary(lm(R_excess ~ Mkt_Rf_3 + SMB_3 + HML_3, data = subset(df, portfolio == "Mature")))
summary(lm(R_excess ~ Mkt_Rf_3 + SMB_3 + HML_3, data = subset(df, portfolio == "Tech")))


For example, the output for the Mature portfolio looks like:


Call:
lm(formula = R_excess ~ Mkt_Rf_3 + SMB_3 + HML_3, data = subset(df,
    portfolio == "Mature"))

Residuals:
     Min       1Q   Median       3Q      Max
-0.10673 -0.03100 -0.01297  0.02820  0.18958

Coefficients:
            Estimate Std. Error t value  Pr(>|t|)
(Intercept) 0.010782   0.007560   1.426   0.15934
Mkt_Rf_3    0.009913   0.002130   4.654 0.0000204 ***
SMB_3       0.008572   0.004173   2.054   0.04463 *
HML_3       0.011192   0.004137   2.705   0.00903 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.05491 on 56 degrees of freedom
Multiple R-squared: 0.5271, Adjusted R-squared: 0.5018
F-statistic: 20.81 on 3 and 56 DF, p-value: 0.00000000349

I want to go long on the tech portfolio and short on the mature portfolio. Here is my attempt at creating the Long/Short portfolio.


LS_Portfolio <- df %>%
  select(-Port_Monthly_Rets) %>%
  pivot_wider(names_from = portfolio, values_from = c(R_excess)) %>%
  mutate(Long_Short = Tech - Mature)

summary(lm(Long_Short ~ Mkt_Rf_3 + SMB_3 + HML_3, data = LS_Portfolio))


Which outputs:


Call:
lm(formula = Long_Short ~ Mkt_Rf_3 + SMB_3 + HML_3, data = LS_Portfolio)

Residuals:
      Min        1Q    Median        3Q       Max
-0.248952 -0.027568  0.003106  0.029772  0.191031

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.017876   0.009873  -1.810  0.07559 .
Mkt_Rf_3     0.002387   0.002782   0.858  0.39457
SMB_3       -0.018172   0.005450  -3.334  0.00152 **
HML_3       -0.012778   0.005404  -2.365  0.02154 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.07171 on 56 degrees of freedom
Multiple R-squared: 0.2307, Adjusted R-squared: 0.1895
F-statistic: 5.597 on 3 and 56 DF, p-value: 0.001983


This looks wrong to me: I would expect Mkt_Rf_3 to be significant. I get very similar regression results when I apply this to a larger portfolio of stocks.
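
A property that may help when reading this output: OLS is linear in the dependent variable, so regressing Tech - Mature on the same factors over the same months gives coefficients equal to the Tech coefficients minus the Mature coefficients. The market loading of the long/short portfolio is therefore a difference of two betas and can sit close to zero when both legs have similar market exposure. The sketch below illustrates this with simulated data; every number and variable name in it is hypothetical and none of it comes from the actual stock data.

```r
# Simulated factor returns (decimals); all values are made up for illustration.
set.seed(1)
n   <- 60
mkt <- rnorm(n, mean = 0.01, sd = 0.04)
smb <- rnorm(n, mean = 0,    sd = 0.02)
hml <- rnorm(n, mean = 0,    sd = 0.02)

# Two hypothetical portfolios with similar market betas but different SMB/HML loadings.
tech   <- 1.1 * mkt - 0.2 * smb - 0.3 * hml + rnorm(n, sd = 0.03)
mature <- 0.9 * mkt + 0.3 * smb + 0.4 * hml + rnorm(n, sd = 0.03)
long_short <- tech - mature

# Same regressors, three different dependent variables.
fit_tech   <- lm(tech ~ mkt + smb + hml)
fit_mature <- lm(mature ~ mkt + smb + hml)
fit_ls     <- lm(long_short ~ mkt + smb + hml)

# Coefficients of the long/short regression versus the difference of the leg regressions.
coef_diff <- coef(fit_tech) - coef(fit_mature)
print(rbind(long_short = coef(fit_ls), tech_minus_mature = coef_diff))
```

Under this reading, an insignificant market coefficient on the long/short portfolio would not by itself indicate a construction error; it would be consistent with the two legs having similar market betas.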


My question is not about the programming, but about how I should construct the Long / Short portfolio, given that I have a number of stocks in each portfolio.


What I do is:


1) Form a portfolio of stocks based on some criteria


2) Compute the average monthly returns of each portfolio


3) Compute the excess return of each portfolio by subtracting the risk-free rate (from the Fama French data) from the average monthly portfolio return


4) Take the tech portfolio returns and subtract the mature portfolio returns.


5) Run my regression analysis (which performs poorly).
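
Steps 1 to 4 above can be sketched on toy data as follows. This is only an illustration under stated assumptions: the symbols A to D, their returns, and the risk-free values are hypothetical, and the portfolios are equal-weighted as in step 2. It also shows that the risk-free rate cancels in step 4, since (Tech - RF) - (Mature - RF) = Tech - Mature.

```r
# Toy monthly returns (decimals) for four hypothetical stocks over three months.
rets <- data.frame(
  symbol = rep(c("A", "B", "C", "D"), each = 3),
  date   = rep(as.Date(c("2010-01-31", "2010-02-28", "2010-03-31")), times = 4),
  ret    = c(0.02, -0.01, 0.03,    # A
             0.04,  0.01, -0.02,   # B
             0.01,  0.00, 0.02,    # C
             0.03, -0.03, 0.01)    # D
)

# Step 1: assign each stock to a portfolio (hypothetical criterion).
rets$portfolio <- ifelse(rets$symbol %in% c("A", "B"), "Tech", "Mature")

# Step 2: equal-weighted average return per portfolio and month.
port <- aggregate(ret ~ portfolio + date, data = rets, FUN = mean)

# Step 3: subtract a risk-free rate expressed in decimals (made-up values).
rf   <- data.frame(date = unique(rets$date), rf = c(0, 0, 0.0001))
port <- merge(port, rf, by = "date")
port$excess <- port$ret - port$rf

# Step 4: long Tech, short Mature, matched month by month.
tech   <- port[port$portfolio == "Tech", ]
mature <- port[port$portfolio == "Mature", ]
tech   <- tech[order(tech$date), ]
mature <- mature[order(mature$date), ]

ls_port <- data.frame(
  date       = tech$date,
  long_short = tech$excess - mature$excess,  # difference of excess returns
  raw_diff   = tech$ret - mature$ret         # difference of raw returns
)
print(ls_port)
```

Because the two difference columns coincide, the long/short series is already a zero-investment return, and subtracting the risk-free rate from each leg first does not change it.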


I just want to know where I am going wrong in my logic.




