Machine Learning In Investing

Machine learning can be used in investing in a variety of ways. This article shows a few of them and describes the basics.

Hedge funds have tried for years to use machine learning (or AI) in their funds. Today, virtually all quantitative funds use some form of machine learning, and some hedge funds are built around it specifically, spurning human intervention.

Risk Management

Risk usually means standard deviation and positive correlation. However, machine learning can approach risk management in a different way. Traditional correlation analysis studies how stocks move together. Machine learning can also look at how risky investments correlate with alternative data points. For example, a model can look at how the risk of a certain industry’s securities rises and falls with weather data such as rainfall or temperature.

Simple machine learning methods can be used to reduce the prediction errors in mean-variance portfolio optimization. This helps manage the portion of risk that can be reduced through diversification.

Building Portfolios

When discussing machine learning in investing, it’s natural to think about building portfolios. Since machine learning can assist with portfolio optimization, it can also be used to select investments to make up ideal portfolios.

Price Prediction

Of all the ways to use machine learning in investing, this one has gotten the most attention. This is because everyone is looking for the holy grail, also known as ‘the algorithm that accurately says when to buy and sell a security’. Many quant platforms are built around this concept.

Macroeconomic Predictions

In addition to predicting prices, machine learning is also used to predict the state of the economy. Portfolio optimization depends quite a lot on return estimates, and return estimates are only as good as the predicted economic conditions behind them. Traditionally, to figure out what the economy was going to do, you would look at oil, interest rates, or similar macro data points. Now you can throw huge amounts of alternative data at a model designed to look for similarities between past economic periods and current conditions. Such models can refine the estimates further than simple machine learning methods can, which should lead to better return estimates.

A Word About Data

To use more complicated machine learning models, you need a lot of data. How much depends on the problem. It’s safe to say that the starting point for doing meaningful things with deep learning is at least a couple of terabytes.

Limiting Drawdowns Rather Than Maximizing Returns

Prominent asset manager Mary Meeker once said her primary goal is limiting drawdowns. While most portfolio construction is explicitly or implicitly focused on returns, she knew that avoiding events that can decimate the portfolio matters more for long-term gains than trying to maximize growth periods. Famously, a portfolio needs to double to make up for a 50% drawdown.
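
The arithmetic behind that last point is worth seeing for other drawdown sizes too. Here is a minimal sketch; the function name recovery_gain is just an illustrative choice, and drawdowns are expressed as fractions (0.50 means 50%).

```python
def recovery_gain(drawdown: float) -> float:
    """Return the gain needed to get back to the prior peak after a drawdown.

    Both the input and the result are fractions: 0.50 means 50%.
    """
    if not 0.0 <= drawdown < 1.0:
        raise ValueError("drawdown must be in [0, 1)")
    # After losing `drawdown`, the portfolio is worth (1 - drawdown) of its peak,
    # so it has to grow by a factor of 1 / (1 - drawdown) to recover.
    return 1.0 / (1.0 - drawdown) - 1.0


for d in (0.10, 0.20, 0.50, 0.75):
    print(f"{d:.0%} drawdown -> {recovery_gain(d):.1%} gain to recover")
```

The required gain grows faster than the loss itself, which is why avoiding deep drawdowns matters so much for long-term results.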

So how to minimize drawdowns?

Portfolio diversification will naturally reduce drawdowns, as long as the portfolio is actually diversified. The problem is that arriving at the best asset weights requires some process of optimization, and many optimization methods are known to reduce the benefits of diversification, either because they do not keep related assets linked or because of estimation errors in the inputs.

So using robust optimization methods is key.

These robust methods lead to a portfolio with less risk and volatility, and therefore less risk of drawdowns.

For example, take a portfolio of securities randomly selected from 2000-2010 and compute the typical covariance and correlation matrices. Here is what a graph of the correlation matrix looks like:


The yellow diagonal marks a correlation of 1, which is every asset with itself. The other blocks shade from yellow toward dark blue as the correlation between the two assets falls or turns negative.

Now let’s re-sort the assets by their correlations so that the highest are on one end and lowest on the other. This results in a graph with a much more orderly appearance:


In this graph, the assets are now clustered and it looks like a heat map, with hot being high correlation.
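
For readers who want to try the re-sorting step, here is a rough sketch of one way to do it. It is not necessarily how the graphs above were produced. It assumes NumPy and SciPy are installed, uses synthetic returns rather than real 2000-2010 data, and relies on single-linkage hierarchical clustering with the correlation distance sqrt(0.5 * (1 - correlation)), which is one common choice. The helper name cluster_order is made up for this example.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import squareform


def cluster_order(corr: np.ndarray) -> np.ndarray:
    """Return an asset ordering that places highly correlated assets side by side."""
    # Turn correlation into a distance: 0 when corr = 1, 1 when corr = -1.
    dist = np.sqrt(np.clip(0.5 * (1.0 - corr), 0.0, None))
    np.fill_diagonal(dist, 0.0)
    # linkage() expects the condensed (upper-triangle) form of the distance matrix.
    tree = linkage(squareform(dist, checks=False), method="single")
    return leaves_list(tree)


# Synthetic data: two hidden groups of assets, interleaved so the raw matrix looks scrambled.
rng = np.random.default_rng(0)
n_days = 500
factor_a = rng.normal(size=n_days)
factor_b = rng.normal(size=n_days)
returns = np.column_stack([
    (factor_a if i % 2 == 0 else factor_b) + 0.5 * rng.normal(size=n_days)
    for i in range(6)
])

corr = np.corrcoef(returns, rowvar=False)
order = cluster_order(corr)
# Reorder rows and columns together so the matrix stays symmetric.
sorted_corr = corr[np.ix_(order, order)]
print("cluster order:", order)
```

Plotting corr and sorted_corr as heat maps side by side should show the scrambled pattern turning into blocks along the diagonal.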

With each asset now grouped alongside its ‘corr-a-likes’, the optimization produces a set of weights with lower risk, because the benefits of diversification have been preserved.

This lower-risk portfolio results in lower drawdowns, and often higher returns, particularly during more challenging stretches in the market.

RIAengine will be launching a machine learning portfolio optimizer to a small beta group within a couple weeks.

Questions or comments? Love to hear them, please leave a comment below.

Machine Learning in Finance and Investing

The first time I heard the term ‘Artificial Intelligence’ was in the 1980s, when my dad worked for Digital Equipment Corporation, in the Silicon Valley of the East (Massachusetts). And yes, they made computer chips from silicon. In the ’80s, Artificial Intelligence basically referred to complex logical structures, with each decision leading to a tree of subsequent possibilities. That was all the compute power of the ’80s and ’90s allowed.

Fast forward to where we are today, and compute power is vastly greater, and machine learning can now avail itself of that power. Now instead of advanced decision trees, machine learning experts are trying to mimic the human brain.

AI (Artificial Intelligence) and ML (Machine Learning) tools abound today, in every industry it seems. Netflix uses them to understand its users, as do Amazon, Google, and Apple, and Uber uses them to optimize drivers and routes for passengers. Political candidates try to use them to gain an edge on other candidates. They are literally everywhere.

In finance, machine learning is being used to evaluate credit risk, to make suggestions for financial planning goals, and even to manage investments.

Hedge funds have used machine learning for years, poring over many petabytes of data to find opportunities. They spend billions of dollars every year on data, because without great and unique data sets they can’t get the edge that lets them gain 30%, 50% or more in a year (the best hedge funds have earned over 70%).

But machine learning shouldn’t be limited to the ultra privileged at the top of the financial food chain.

We can use machine learning to make incremental improvements in our portfolios. It’s common knowledge that high investment fees erode a portfolio over time and make a dramatic difference in the end result. In the same way, incremental improvements in a portfolio, on both the risk side and the return side, can make a dramatic difference in whether your client reaches the end goal.

And even a portfolio earning the same returns with less risk means a client who isn’t constantly being bullied by the market. You can help them sleep better at night, which adds to your bottom line in the end.

How To Best Evaluate A Portfolio

Recently I saw an article about Monte Carlo Analysis, which (as you know) is a way to evaluate a person’s ability to meet their future income needs. Using a portfolio of securities or asset allocations, it runs through as many rolling periods as are available in the data to come up with a probability of meeting those needs.

Monte Carlo Analysis certainly struggles with all the expectations that go into it, and it will almost certainly give misleading results: “OK Client, you are 87% likely to meet your future income needs based on this plan!” But is that percentage really true?

The article says it’s better to focus on ambiguity rather than just investment risk.

So can Monte Carlo Analysis be improved upon?

Monte Carlo has a lot of uncertainty built into it, from all the ambiguous client expectations through the uncertainty about future markets.

One way to improve upon Monte Carlo Analysis would be to incorporate better analysis of coming market cycles. Using machine learning to look at factors such as interest rates, commodities, global events, and stock market risk, we could place higher weights on the Monte Carlo periods that began when conditions were similar to the present. Rather than treating every period as equally likely to occur, we’d be placing higher weight on the ones that our analysis determined to be more likely.
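
To make the weighting idea concrete, here is a small sketch. It uses a simple Gaussian similarity kernel as a stand-in for whatever machine learning model would actually judge similarity, and every number in it is made up for illustration. The function name similarity_weights, the three features, and the bandwidth parameter are all assumptions of this example, and it assumes NumPy is installed.

```python
import numpy as np


def similarity_weights(past: np.ndarray, current: np.ndarray, bandwidth: float = 1.0) -> np.ndarray:
    """Weight each historical period by how closely its starting conditions match today."""
    # Standardize each feature so that no single unit (percent, dollars) dominates.
    mean, std = past.mean(axis=0), past.std(axis=0)
    z_past = (past - mean) / std
    z_now = (current - mean) / std
    # Gaussian kernel on Euclidean distance: closer periods get larger weights.
    sq_dist = ((z_past - z_now) ** 2).sum(axis=1)
    w = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return w / w.sum()


# Hypothetical inputs: one row per historical starting period.
# Columns: interest rate (%), oil price ($), trailing stock volatility (%).
past_conditions = np.array([
    [2.0, 40.0, 12.0],
    [5.5, 90.0, 25.0],
    [2.5, 45.0, 14.0],
    [7.0, 30.0, 18.0],
])
# Whether the plan met the income goal in the period starting at each row (made-up values).
plan_succeeded = np.array([1.0, 0.0, 1.0, 0.0])
current_conditions = np.array([2.2, 42.0, 13.0])

weights = similarity_weights(past_conditions, current_conditions)
equal_weight_prob = plan_succeeded.mean()
weighted_prob = float(weights @ plan_succeeded)
print(f"equal-weight success rate: {equal_weight_prob:.0%}")
print(f"similarity-weighted success rate: {weighted_prob:.0%}")
```

Because today’s made-up conditions resemble the two periods in which the plan succeeded, the weighted success rate comes out higher than the equal-weight one. With different conditions it could just as easily come out lower.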

The downside would be that this would introduce yet another set of expectations.

So an even more solid approach is to tweak the portfolio such that it’s more likely to achieve needed returns without unnecessary risk. By using the Hierarchical Risk Parity approach, we can optimize the portfolio without obliterating the benefits of diversification.

A Better Portfolio Optimization Method

You may have heard of Markowitz’s Critical Line Algorithm, or CLA. CLA was Markowitz’s attempt to solve the problem of portfolio optimization using quadratic programming. For a couple of reasons it was an ingenious invention.

But it also has its problems.

The problem with CLA is that small changes in the forecasted returns of assets produce large changes in the portfolio. This seems like an obvious flaw, yet the widespread influence of CLA suggests that people are not universally aware of a solution. While the Critical Line Algorithm itself isn’t used universally, its implications are everywhere in portfolio construction in finance.
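
This kind of instability is easy to reproduce on a toy example. The sketch below does not implement CLA, which also handles constraints on the weights. It uses the simpler unconstrained mean-variance solution, with weights proportional to the inverse covariance matrix times the expected returns, purely to illustrate the sensitivity. The three assets and all of their numbers are made up, and it assumes NumPy is installed.

```python
import numpy as np


def mean_variance_weights(mu: np.ndarray, cov: np.ndarray) -> np.ndarray:
    """Unconstrained mean-variance weights, rescaled to sum to 1."""
    raw = np.linalg.solve(cov, mu)
    return raw / raw.sum()


# Three made-up assets; the first two are highly correlated (0.95).
vol = np.array([0.15, 0.15, 0.10])
corr = np.array([
    [1.00, 0.95, 0.20],
    [0.95, 1.00, 0.20],
    [0.20, 0.20, 1.00],
])
cov = np.outer(vol, vol) * corr

mu_base = np.array([0.080, 0.080, 0.050])
mu_bumped = np.array([0.085, 0.080, 0.050])  # forecast for asset 0 raised by half a point

w_base = mean_variance_weights(mu_base, cov)
w_bumped = mean_variance_weights(mu_bumped, cov)
print("base weights:  ", np.round(w_base, 3))
print("bumped weights:", np.round(w_bumped, 3))
print("largest weight change:", np.round(np.abs(w_bumped - w_base).max(), 3))
```

Raising one forecast by half a percentage point shifts a large share of the portfolio between the two nearly interchangeable assets, and even pushes one of them into a short position.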

So what causes these flaws?

According to Marcos Lopez de Prado in Advances in Financial Machine Learning, the problem with CLA is attributable to covariance matrices basically being unaware of asset classes. In other words, when the portfolio assets are run through a covariance matrix to determine whether they move together or against each other, the assets are all treated the same, as if each were replaceable by any other. But the assets were chosen by asset class. You would not consider replacing a large cap stock with a small cap stock if your portfolio approach said you need large caps in the portfolio, but you might replace a large cap stock with another large cap. So those two assets should be considered linked.

With CLA, the more you diversify, the more likely you have estimation errors in the resulting portfolio. This could entirely negate the benefits of diversification!

So when using the covariance result, more effort is needed to keep like assets linked. How can we do this? Using machine learning, we can cluster like assets together and preserve their linkage when optimizing allocations.

But will this work?

As with traditional portfolio optimization approaches, the covariance matrix should still be used. But instead of relying on its results in raw form, we can build the results into a hierarchical structure, which is consistent with how most portfolio construction begins anyway. According to De Prado, now the head of machine learning for AQR Capital, a $226 billion fund, this results in a portfolio optimization that generates less risky portfolios compared to traditional risk parity methods.
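
For the curious, here is a simplified sketch of how a Hierarchical Risk Parity allocation can be computed. It follows the three broad steps De Prado describes (cluster the assets, reorder them, then split the weights top-down), but it is a teaching sketch rather than his implementation. In particular, it clusters directly on the correlation distance and skips the extra distance-of-distances step in the published description. The four assets and their numbers are made up, and it assumes NumPy and SciPy are installed.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import squareform


def cluster_variance(cov: np.ndarray, idx: list) -> float:
    """Variance of a cluster when its members are weighted by inverse variance."""
    sub = cov[np.ix_(idx, idx)]
    ivp = 1.0 / np.diag(sub)
    ivp /= ivp.sum()
    return float(ivp @ sub @ ivp)


def hrp_weights(cov: np.ndarray) -> np.ndarray:
    """Hierarchical Risk Parity weights for a covariance matrix (simplified sketch)."""
    vol = np.sqrt(np.diag(cov))
    corr = cov / np.outer(vol, vol)
    # Step 1: cluster assets using a correlation-based distance.
    dist = np.sqrt(np.clip(0.5 * (1.0 - corr), 0.0, None))
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="single")
    # Step 2: order assets so that similar ones sit next to each other.
    order = [int(i) for i in leaves_list(tree)]
    # Step 3: split the ordered list in halves, giving more weight to the less risky half.
    weights = np.ones(len(order))
    stack = [order]
    while stack:
        items = stack.pop()
        if len(items) < 2:
            continue
        left, right = items[: len(items) // 2], items[len(items) // 2:]
        var_left = cluster_variance(cov, left)
        var_right = cluster_variance(cov, right)
        alpha = 1.0 - var_left / (var_left + var_right)
        weights[left] *= alpha
        weights[right] *= 1.0 - alpha
        stack.extend([left, right])
    return weights


# Made-up example: two volatile, correlated assets and two calmer, correlated assets.
vol = np.array([0.20, 0.18, 0.05, 0.06])
corr = np.array([
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
])
cov = np.outer(vol, vol) * corr
w = hrp_weights(cov)
print("HRP weights:", np.round(w, 3))
```

Note that nothing here inverts the covariance matrix and no return forecasts are needed, which is a large part of why the weights are less sensitive to small changes in the inputs.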