# IBKR Quant Blog


### 5 Questions For Wesley Gray, AlphaArchitect.com

Momentum investing – betting on the persistence of price trends in the short to medium term – has captured the crowd’s attention in recent years. Consider, for instance, the strong growth in ETF assets in the niche. The first fund launched a bit more than five years ago; today, there are dozens of momentum ETFs, collectively holding nearly $15 billion in assets, according to etfdb.com. That’s still a small piece of the total ETF pie, but the strategy’s allure could keep growth bubbling for years to come. What should investors expect? Does the rising interest in momentum raise concerns about the strategy’s expected return? For some insight, The Capital Spectator asked Wesley Gray at Alpha Architect, a wealth manager near Philadelphia. Gray, who previously worked as a finance professor, is an obvious source for discussing momentum. In addition to managing variations of momentum-based portfolios for clients, he and his team have written extensively about the strategy at AlphaArchitect.com, a popular investing blog. Gray is also the co-author of Quantitative Momentum: A Practitioner’s Guide to Building a Momentum-Based Stock Selection System.

**Why does momentum persist? It’s been identified in the literature for decades and traders have been using it for much longer in one form or another. Most return anomalies are arbitraged away or turn out to be data-mining illusions. Momentum seems to be different. Why?**

This is a debate that still rages in academic circles, but it boils down to a mix of fundamental risk and mispricing that is tough to arbitrage away. Fundamental risk is easy to understand – higher risk generally earns higher returns in a competitive equilibrium. Mispricing is a bit trickier. If the mispricing is easy to exploit – i.e., you can generate 2-plus Sharpe ratio strategies by exploiting momentum – one can be sure the highly leveraged computer geeks at fast-moving hedge funds and proprietary trading shops will take care of the mispricing.
But what if trying to exploit momentum mispricing is akin to eating a hand grenade on occasion? Well, it turns out that strategies designed to “arbitrage” momentum profits away can be incredibly volatile and suffer huge drawdowns – not exactly the low-risk, easy-to-leverage trading strategies that the 200-IQ types look forward to exploiting. Long story short, sometimes even the best evidence-based active investment strategies can create a formidable challenge to investors seeking to exploit them. It’s a kind of quid pro quo: in order to access the potential gain, you must be willing to accept the potential pain. Could momentum be the most epic data-mining result in all of finance? Sure. Could it vanish in the future? Possible. However, if we believe that momentum stocks are 1) naturally riskier and 2) driven by systematic mispricing that is costly to “arbitrage,” we can expect momentum investing to work in the future.¹

**There’s been strong growth in momentum-focused strategies and investment products in recent years. Is there a capacity limit for the strategy? If so, are we near that limit?**

Jack Vogel, one of my business partners, recently published a long piece called “Factor Investing and Trading Costs,” which addresses this question in great detail. The short answer: yes, the capacity on momentum strategies is limited. Some folks argue it’s anywhere from $5 billion to $300 billion-plus in capacity. On the question of “are we near the limit,” I’d guess that we are still a ways off, based on a few things. First, most so-called momentum funds are closet indexers, so their actual momentum exposure is fairly limited even with a large amount of assets under management. Also, David Blitz [Robeco Asset Management] highlights that the ETF market as a whole hasn’t taken a dramatic momentum bet. At some point momentum, or any strategy for that matter, could suffer from too many dollars chasing too few returns.
That said, given the relatively poor performance of momentum over the past decade, I’m not convinced there are huge swaths of short-term-performance-chasing investors looking to dive into stock momentum strategies – I think most [performance-chasing] investors have turned to things like cryptocurrency speculation.

**You’ve previously noted that institutional investors have only dipped their toes into momentum. That’s surprising, given the strategy’s encouraging historical record. What accounts for the reluctance among the investment behemoths to dive in deeper?**

There are almost certainly some large institutional investors implementing uber-sophisticated momentum strategies at scale. However, I’ve spoken to chief investment officers at several multi-billion-dollar endowments who weren’t even familiar with the term and/or the strategy. This was really surprising the first time I had one of these conversations, but then I quickly remembered that not every CIO is buried in academic finance research. Many CIOs are tried-and-true fundamental investors whose philosophies revolve around the “value investing” ethos. So, even in this day and age, when systematic strategies are in vogue in the ETF space, many in the institutional space are still enamored with human stock pickers as opposed to fairly simple systematic investment approaches. I’m not exactly sure why this is the case, but my guess is that there is a potential agency problem at play: the consultants and internal investment staffs wouldn’t have a job if the pension/endowment bought a handful of index or factor funds and called it a day.

**Are momentum strategies sufficiently robust to stand on their own? Or is it advisable to pair them with other strategies, such as value investing and/or a plain-vanilla market-indexing portfolio?**

It depends who you ask. If you ask a value investor, they will say, “buy value,” and never touch momentum – and vice versa for a momentum/technical type.
The answer is that you should probably do both, because value and momentum are excellent diversifiers. AQR Capital Management published an excellent paper [“Value and Momentum Everywhere”] on the subject. Why do so many investors punt on momentum strategies? We wrote a piece, “Evidence-based investing requires less religion and more reason,” where we discuss the fundamental and technical religions in the marketplace. We think a lot of the “anti-momentum” sentiment is driven by a religious-like approach to investing. But why? Taking a step back, the mission for long-term active investors is to beat the market. Active investors should focus on the scientific method to address a basic question: What works? Warren Buffett obviously showed that value investing, irrespective of technical considerations, can work. But George Soros and Paul Tudor Jones also showed that technical analysis can work just as well. An ever-growing body of academic research formalizes the evidence that fundamental strategies (e.g., value and quality) and technical strategies (e.g., momentum and trend-following) both seem to work. Many dogmatic investors, however, looking to confirm what they already believe, selectively adopt the research evidence that fits their investing religion. In contrast, an evidence-based investor will conclude that fundamental and technical analysis strategies can both work because they are two sides of the same coin. They are cousins because they share the common objective of exploiting the poor decisions of market participants influenced by biased decision-making. As Andrew Lo, an influential and forward-looking financial economist at MIT, correctly observes about the debate between fundamental and technical traders, “In the end we all have the same goal, which is to forecast uncertain market prices. We should be able to learn from each other.”

**What’s the biggest risk with momentum investing generally? Is there some aspect of risk that’s unique to momentum?**

Volatility.
For example, our public momentum indexes (see the data on our indexes here) are highly focused and concentrated long-only momentum strategies. These strategies are expected to have around 25% volatility versus 15% for the generic stock market. That’s intense! You’ll almost certainly experience violent portfolio pain that will make you wish you had never heard of momentum investing. But, of course, this intense volatility arguably comes with a reasonable chance of earning excess returns. One can apply trend-following overlays and other risk management strategies to try to ease the momentum pain, but the harsh reality is that volatility will always exist for well-constructed momentum strategies. There are some other risks associated with long/short momentum strategies, which are related to dynamically shifting beta. If one is going down that path, they should certainly read “Momentum Crashes,” by Kent Daniel and Tobias Moskowitz.

¹ For defining momentum, Gray notes: “Momentum can refer to trend-following strategies, also called ‘time series’ momentum, but let’s discuss the classic ‘momentum factor’ in academic finance research. This momentum is a relative strength, or ‘cross-sectional,’ momentum (described here). Quick example to highlight the difference: Consider stocks A and B. A is down 10% and B is down 20% over the past 12 months. A trend-following, or time series momentum, strategy would not buy either of these stocks; however, a cross-sectional momentum strategy would buy A and short/avoid B, because A is relatively stronger than B, despite having poor absolute momentum.”

CapitalSpectator.com is a finance/investment/economics blog that’s edited by James Picerno. The site’s focus is macroeconomics, the business cycle and portfolio strategy (with an emphasis on asset allocation and related analytics).
Picerno is the author of Dynamic Asset Allocation: Modern Portfolio Theory Updated for the Smart Investor (Bloomberg Press, 2010) and Nowcasting The Business Cycle: A Practical Guide For Spotting Business Cycle Peaks (Beta Publishing, 2014). In addition, Picerno publishes The US Business Cycle Risk Report, a weekly newsletter that quantitatively evaluates US recession risk in real time. Picerno is also working on a new book about using R for portfolio analytics, with a publication date expected in mid-2018.

This article is from CapitalSpectator.com and is being posted with CapitalSpectator.com’s permission. The views expressed in this article are solely those of the author and/or CapitalSpectator.com and IB is not endorsing or recommending any investment or trading discussed in the article. This material is for information only and is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad-based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation by IB to buy, sell or hold such security. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.

### Interpreting and Visualizing Autocorrelation

By Jithin J and Karthik Ravindra, Byte Academy

Analyzing time series data needs special attention.
Here, we would like to explore working with time series data and identify the effect of autocorrelation, in order to come up with a more practical approach to working with linear regression models. Autocorrelation is a common feature when using data to estimate some value, say equity prices. It is defined as the situation in which the error terms of the linear regression model are correlated. So, if one error term is positive (or negative), and this fact causes the next error term to also be positive (or negative), we say that the model suffers from autocorrelation. It is a serious problem, as it violates the common assumption that the error term is stochastic and non-deterministic. A stochastic error term is important for the integrity of a linear regression; otherwise the model’s estimates risk being biased. Let’s take an example of some financial data during a stock market crash. The crash on day one increases the likelihood of observing a downward trend for the next few days, perhaps even weeks. If a model suffers from autocorrelation and is used for extrapolation, it will estimate a similar stock market crash in the future as well. Therefore, we must first be able to identify the presence of this trend. To prepare this article, we decided to pick a financial data set. After some quick research we decided to work with the Shiller P/E ratio and estimate the movement of the S&P monthly closing price. The data was taken from: http://www.multpl.com/shiller-pe/table?f=m.

**Domain Knowledge**

The Shiller P/E is a valuation measure usually applied to the US S&P 500 equity market. It is defined as price divided by the average of ten years of earnings (moving average), adjusted for inflation. As such, it is principally used to assess likely future returns from equities over timescales of 10 to 20 years, with higher-than-average values implying lower-than-average long-term annual average returns.
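To see what identifying autocorrelation looks like in practice, here is a minimal Python sketch on simulated residuals (all names are illustrative, not from this article's code): white-noise errors show a lag-1 autocorrelation near zero, while AR(1)-style errors, the "crash today makes tomorrow's drop likelier" situation, show a large one.

```python
import numpy as np

def lag1_autocorr(resid):
    """Lag-1 autocorrelation of a residual series: values near 0 suggest
    no autocorrelation; values near +1 suggest positive autocorrelation."""
    r = np.asarray(resid, dtype=float)
    r = r - r.mean()
    return float(np.sum(r[1:] * r[:-1]) / np.sum(r * r))

rng = np.random.default_rng(0)

# White-noise residuals: no carry-over from one error to the next.
white = rng.normal(size=5000)

# AR(1) residuals: each error carries over 0.8 of the previous one.
ar1 = np.empty(5000)
ar1[0] = rng.normal()
for t in range(1, 5000):
    ar1[t] = 0.8 * ar1[t - 1] + rng.normal()

print(lag1_autocorr(white))  # close to 0
print(lag1_autocorr(ar1))    # close to 0.8
```

A model whose residuals behave like the second series violates the stochastic-error assumption discussed above.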
**Webscraping**

We start by scraping the Shiller P/E ratio and S&P closing prices from http://www.multpl.com/shiller-pe/table?f=m. If you’re interested in the webscraping itself, the Python code is here: https://github.com/jithinjkumar. Once the data has been extracted, we store it in pandas DataFrames: a data frame with a time series index and the S&P closing price and Shiller ratio as columns. Once the data is stored, we need to clean and prepare it for analysis.

**Data Preparation and Data Cleaning Using the Pandas Library: Creating a Time Series**

We have the Shiller ratio data and S&P closing prices in two different data frames, so let’s perform a lookup to bring the Shiller P/E ratio for each month into the closing-price data frame. We have 1769 entries and 4 columns. SandP_Date and sh_Date are both date columns, so we can easily drop one of them, and we need to check for null values. sh_Ratio has 120 null values; we can safely drop these rows from our dataset, as they account for less than 6% of the total. Next we create a time series, for which the S&P date column needs to be formatted correctly so that we can assign the correct data type to each column. Now our DataFrame is in time series format and ready for further analysis. Stay tuned for the next post in this series, in which we will discuss Time Series Analysis.

-------------------------------------------------------

Any trading symbols displayed are for illustrative purposes only and are not intended to portray recommendations.

Byte Academy is based in New York, USA. It offers coding education, classes in FinTech, Blockchain, DataSci, Python + Quant. This article is from Byte Academy and is being posted with Byte Academy’s permission. The views expressed in this article are solely those of the author and/or Byte Academy and IB is not endorsing or recommending any investment or trading discussed in the article.
### R Tip of the Month: Correlation Over Time

In my earlier post from March 2018, I introduced the rollapply function, which executes a function on a rolling-window basis. While this function is very useful, it needs a little modification for more general operations. I originally faced this issue when I tried to compute the correlation matrix across different asset returns on a rolling window. For the demonstration, let’s consider the returns for all sector ETFs excluding real estate:

library(quantmod)
v <- c("XLE","XLU","XLK","XLB","XLP","XLY","XLI","XLV","XLF")
t1 <- "1990-01-01"
P.list <- lapply(v,function(x) get(getSymbols(x,from = t1)) )
P.list <- lapply(P.list,function(x) x[,6])
P <- Reduce(merge,P.list)
names(P) <- v
R <- na.omit(P/lag(P) - 1)

By default, rollapply executes the given function on each time series separately and returns a time series object. For instance,

tail(rollapply(R,25,mean))

returns the 25-day moving average for each series separately.
On the other hand, if I try to compute the moving correlation instead, I get the following:

tail(rollapply(R,25,cor))

This computes each ETF’s correlation with itself rather than with the other ETFs, as rollapply treats each time series separately. As a remedy, add the by.column = F argument to the rollapply function. In this case, the function returns a time series (xts) object, however with 9 × 9 = 81 columns, where each column corresponds to a pairwise correlation between the 9 sector ETFs, rather than a square matrix.

COR <- rollapply(R,25,cor,by.column = F)
dim(COR)
class(COR)

What’s left to be done is to stack these vectors back into a correlation matrix, one for each time period. To do so, I will refer to the plyr package. The plyr package allows users to take an array (a), a data frame (d), or a list (l), execute a given function over that object, and output the results in any of those formats. For our case, I will input the time series COR object as an array and output it as a list, where each element of the list corresponds to a moving correlation matrix.

library(plyr)
COR.list <- alply(COR,1,function(x) matrix(x,nrow = ncol(R), byrow = T ))

The second argument in alply specifies the margin: 1 indicates that the given function is executed over rows, while 2 executes it over columns instead. The third argument, which takes a function, stacks each row of the COR object into a square matrix. As a result, we have:

round(COR.list[[25]],2)

which is identical to the correlation matrix computed over the first 25 days in the data:

round(cor(R[1:25,]),2)

Finally, one can either keep the rolling correlation matrices in a list or transform them back into a time series after some computation (e.g., construct portfolio weights and compute the out-of-sample return as a time series). As a final demonstration, I will show how to stack the list into a time series of average correlation across sectors over time.

# the following computes the average of the upper-triangle correlation matrix elements
COR.mean <- sapply(COR.list, function(x) mean(x[upper.tri(x)]) )
summary(COR.mean)

To get back to a time series object, the following trick should serve well:

library(lubridate)
names(COR.mean) <- date(COR)
COR.mean <- as.xts(COR.mean)
plot(COR.mean)

Note that in order to transform a numerical vector into a time series, I label the values with the corresponding dates and then convert the object to an xts object; lubridate is an extremely useful package for handling date formats.
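As an aside for Python users, the same rolling pairwise correlation, and its per-day average, can be sketched with pandas; the simulated returns and ticker names below are stand-ins for the ETF data above.

```python
import numpy as np
import pandas as pd

# Simulated daily returns for four tickers stand in for the ETF data.
rng = np.random.default_rng(1)
dates = pd.date_range("2020-01-01", periods=200, freq="B")
R = pd.DataFrame(rng.normal(0, 0.01, size=(200, 4)),
                 index=dates, columns=["A", "B", "C", "D"])

# Rolling pairwise correlations over a 25-day window; pandas returns one
# long frame indexed by (date, column) instead of one matrix per date.
roll = R.rolling(25).corr()

# The correlation matrix for the window ending on day 25, the analogue
# of COR.list[[25]] above.
day25 = roll.loc[dates[24]]

# Average upper-triangle correlation per date, the analogue of COR.mean.
n = R.shape[1]
avg_corr = roll.groupby(level=0).apply(
    lambda m: m.values[np.triu_indices(n, k=1)].mean())
```

Here pandas handles the "stack rows back into matrices" step for you: selecting one date from the MultiIndexed result recovers the square matrix directly.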

Visit Majeed’s GitHub – IBKR-R corner for more info.

Learn more about Majeed’s research in R during his presentation at the R/Finance 2018 Conference in Chicago on June 1, 2018: http://www.rinfinance.com/

Majeed Simaan, Ph.D. in Finance, is well versed in research areas related to banking, asset pricing, and financial modeling. His research interests revolve around banking and risk management, with an emphasis on asset allocation and pricing. He has been involved in a number of projects that apply state-of-the-art empirical research tools in the areas of financial networks (interconnectedness), machine learning, and textual analysis. His research has been published in the International Review of Economics and Finance and the Proceedings of the 2016 IEEE Symposium Series on Computational Intelligence. Majeed also pursued graduate training in Mathematical Finance at the London School of Economics (LSE). He has a strong quantitative background in both computing and statistical learning, and holds both a BA and an MA in Statistics from the University of Haifa with a specialization in actuarial science.


### K-Means Clustering For Pair Selection In Python - Overview

In this series, we will cover what K-Means clustering is, how it can be used for solving the age-old problem of pair selection for Statistical Arbitrage, and the advantage of using K-Means for pair selection compared to using a brute force method. We will also create a Statistical Arbitrage strategy using K-Means for pair selection and implement the elbow technique to determine the value of K.

Let’s get started!

Part I – Life Without K-Means

To gain an understanding of why we may want to use K-Means to solve the problem of pair selection, we will attempt to implement a Statistical Arbitrage strategy as if there were no K-Means. That is, we will develop a brute-force solution to our pair selection problem and then apply that solution within our Statistical Arbitrage strategy.

Let’s take a moment to think about why K-Means could be used for trading. What’s the benefit of using K-Means to form subgroups of possible pairs? Couldn’t we just come up with the pairs ourselves?

This is a great question, and one you may well have wondered about. To better understand the strength of a technique like K-Means for Statistical Arbitrage, we’ll walk through trading a Statistical Arbitrage strategy as if there were no K-Means. I’ll be your ghost of trading past, so to speak.

First, let’s identify the key components of any Statistical Arbitrage trading strategy.

1. We must identify assets that have a tradable relationship
2. We must calculate the Z-Score of the spread of these assets, as well as the hedge ratio for position sizing
3. We generate buy and sell decisions when the Z-Score exceeds some upper or lower bound
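As a minimal end-to-end sketch of those three steps on simulated prices (the hedge ratio below is a simple OLS slope, and the ±2 entry bounds are hypothetical choices, not values prescribed by this series):

```python
import numpy as np

rng = np.random.default_rng(2)

# Step 1 (toy version): y tracks x by construction, so the pair has a
# tradable relationship. Real pairs would come from a cointegration test.
x = 100 + np.cumsum(rng.normal(0, 1, 500))
y = 2.0 * x + rng.normal(0, 2, 500)

# Step 2: hedge ratio as the OLS slope of y on x, then the Z-score of
# the spread y - beta * x.
beta = np.polyfit(x, y, 1)[0]
spread = y - beta * x
z = (spread - spread.mean()) / spread.std()

# Step 3: trade signals when the Z-score breaches the hypothetical
# +/-2 bounds.
short_spread = z > 2    # spread rich: sell y, buy beta units of x
long_spread = z < -2    # spread cheap: buy y, sell beta units of x

print(beta)  # close to 2.0 by construction
```

The hedge ratio tells us how many units of x offset one unit of y, which is why it doubles as the position-sizing input in step 2.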

To begin, we need some pairs to trade. But we can’t trade Statistical Arbitrage without knowing whether or not the pairs we select are cointegrated. Cointegration means that some combination of the two assets, the spread, has stable statistical properties over time. Even if the two assets each move randomly, we can count on the relationship between them to remain stable, at least most of the time.

Traditionally, when solving the problem of pair selection, in a world with no K-Means, we must find pairs by brute force or trial and error. This was usually done by grouping stocks together that were merely in the same sector or industry. The idea was that if these stocks were of companies in similar industries, thus having similarities in their operations, their stocks should move similarly as well. But, as we shall see, this is not necessarily the case.

The first step is to think of some pairs of stocks that should yield a trading relationship. We’ll use stocks in the S&P 500, but this process could be applied to any stocks within any index. Hmm, how about Walmart and Target? They’re both retailers and direct competitors. Surely they should be cointegrated and thus would allow us to trade them in a Statistical Arbitrage strategy.

Let’s begin by importing the necessary libraries as well as the data that we will need. We will use 2014-2016 as our analysis period.

#importing necessary libraries

#data analysis/manipulation

import numpy as np
import pandas as pd

#importing pandas datareader to get our data
import pandas_datareader as pdr

#importing matplotlib for the plots below
import matplotlib.pyplot as plt

#importing the Augmented Dickey-Fuller test to check for cointegration
from statsmodels.tsa.stattools import adfuller

Now that we have our libraries, let’s get our data.

#setting start and end dates
start='2014-01-01'
end='2016-01-01'
#importing Walmart and Target using pandas datareader
wmt=pdr.get_data_yahoo('WMT',start,end)
tgt=pdr.get_data_yahoo('TGT',start,end)

Before testing our two stocks for cointegration, let’s take a look at their performance over the period. We’ll create a plot of Walmart and Target.

#Creating a figure to plot on
plt.figure(figsize=(10,8))

#Creating WMT and TGT plots
plt.plot(wmt["Close"],label='Walmart')

plt.plot(tgt['Close'],label='Target')
plt.title('Walmart and Target Over 2014-2016')

plt.legend(loc=0)
plt.show()

In the above plot, we can see a slight correlation at the beginning of 2014. But this doesn’t really give us a clear idea of the relationship between Walmart and Target. To get a definitive idea of the relationship between the two stocks, we’ll create a correlation heat-map.

To begin creating our correlation heatmap, we must first place the Walmart* and Target* prices in the same dataframe. Let’s create a new dataframe for our stocks.

#initializing newDF as a pandas dataframe
newDF=pd.DataFrame()
#adding WMT closing prices as a column to the newDF
newDF['WMT']=wmt['Close']
#adding TGT closing prices as a column to the newDF
newDF['TGT']=tgt['Close']

Now that we have created a new dataframe to hold our Walmart* and Target* stock prices, let’s take a look at it.

We can see that we have the prices of both our stocks in one place.
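As a quick preview of what the heat map will color, the pairwise correlation of the two columns is a single call on this dataframe. A minimal sketch, with simulated prices standing in for the downloaded Yahoo data:

```python
import numpy as np
import pandas as pd

# Stand-in closing prices: a shared random-walk trend plus noise takes
# the place of the downloaded data in newDF.
rng = np.random.default_rng(4)
trend = 70 + np.cumsum(rng.normal(0, 1, 250))
newDF = pd.DataFrame({"WMT": trend + rng.normal(0, 1, 250),
                      "TGT": trend + rng.normal(0, 2, 250)})

# The pairwise correlation matrix: these are exactly the numbers the
# heat map in the next post will color.
corr = newDF.corr()
print(corr.round(2))
```

Keep in mind that a high correlation here does not by itself establish cointegration; that is what the ADF tests in the next post are for.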

In the next post, we will create a correlation heat map of the stocks and run some ADF tests.

------------------------------------------------------------

*Disclaimer: All investments and trading in the stock market involve risk. Any decisions to place trades in the financial markets, including trading in stock or options or other financial instruments is a personal decision that should only be made after thorough research, including a personal risk and financial assessment and the engagement of professional assistance to the extent you believe necessary. The trading strategies or related information mentioned in this article is for informational purposes only.

If you want to learn more about K-Means Clustering for Pair Selection in Python, or to download the code, visit the QuantInsti website and the educational offerings at their Executive Programme in Algorithmic Trading (EPAT™).


### SGX - Update on India: Accessing the World's Fastest Growing Large Economy via Offshore Futures in 2018

Join us for a free webinar with Tariq Dennison, QuantOfAsia, on May 23, 2018, at 4:30pm HKT.

By some measures, India has surpassed China as the world’s fastest growing large economy, but it is still one of the most difficult stock and currency markets for foreign investors to access. This webinar discusses how to use offshore SGX-listed futures to trade India following this year’s updates and explains how the new SGX India Futures work.

Presented by Tariq Dennison, GFM Asset Management, QuantOfAsia

Sponsored by Singapore Exchange

Information posted on IBKR Quant that is provided by third-parties and not by Interactive Brokers does NOT constitute a recommendation by Interactive Brokers that you should contract for the services of that third party. Third-party participants who contribute to IBKR Quant are independent of Interactive Brokers and Interactive Brokers does not make any representations or warranties concerning the services offered, their past or future performance, or the accuracy of the information provided by the third party. Past performance is no guarantee of future results.

