Statistical arbitrage/pairs trading system

Author: absurdo Date: 06.07.2017

To date in our time series analysis posts we have considered linear time series models including ARMA and ARIMA, as well as the GARCH model for conditional heteroskedasticity.

In this article we are going to consider the theoretical basis of state space models, the primary benefit of which is that their parameters can adapt over time. State space models are very general and it is possible to put the models we have considered to date into a state space formulation.

However, in order to keep the analysis straightforward, it is often better to use the simpler representation. The general premise of a state space model is that we have a set of states that evolve in time (such as the hedge ratio between a cointegrated pair of equities), but our observations of these states contain statistical noise (such as market microstructure noise), and hence we are unable to ever directly observe the "true" states.

The goal of the state space model is to infer information about the states, given the observations, as new information arrives. A famous algorithm for carrying out this procedure is the Kalman Filter, which we will also discuss in this article.

In engineering, for instance, a Kalman Filter will be used to estimate values of the state, which are then used to control the system under study. This introduces a feedback loop, often in real-time. For an extremely interesting application of Kalman Filtering, one can consider the recent successful attempt of the private space firm, Space Exploration Technologies, to return and land the first stage of their Falcon 9 rocket back at its original launch site.

The first stage booster was subject to an extremely precise dynamic control problem, involving asymmetric, time-varying mass (fuel sloshing) at hypersonic through to subsonic velocities.

Perhaps the most common usage of a Kalman Filter in quantitative trading is to update hedging ratios between assets in a statistical arbitrage pairs trade, but the algorithm is much more general than this and we will look at other use cases.

Generally, there are three types of inference that we are interested in when considering state space models:

Prediction - Forecasting subsequent values of the state
Filtering - Estimating the current values of the state from past and current observations
Smoothing - Estimating the past values of the state given the observations

Filtering and smoothing are similar, but not the same. Perhaps the best way to think of the difference is that with smoothing we want to understand what has happened to the states in the past given our current knowledge, whereas with filtering we want to know what is happening with the state right now.

In this article we are going to discuss the theory of the state space model and how we can use the Kalman Filter to carry out the various types of inference described above.


In subsequent articles we will apply the Kalman Filter to trading situations, such as cointegrated pairs, as well as asset price prediction. We will be making use of a Bayesian approach to the problem, as this is a natural statistical framework for allowing us to readily update our beliefs in light of new information, which is precisely the desired behaviour of the Kalman Filter.

I want to warn you that state-space models and Kalman Filters suffer from an abundance of mathematical notation, even if the conceptual ideas behind them are relatively straightforward.

I will try and explain all of this notation in depth, as it can be confusing for those new to engineering control problems or state-space models in general.

Linear State-Space Model

This section follows closely the notation utilised in both Cowpertwait et al [1] and Pole et al [2].

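I decided it wasn't particularly helpful to invent my own notation for the Kalman Filter, as I want you to be able to relate it to other research papers or texts. Let's begin by discussing all of the elements of the linear state-space model.

In a linear state-space model the states at each time step are a linear combination of the states at the previous time step plus some random variation, known as system noise. In order to simplify the analysis we are going to suggest that this noise is drawn from a multivariate normal distribution, but of course, other distributions can be used. The relationship is summarised in what is often called the state equation.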
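In the notation of Pole et al [2], which this section follows, the state equation is usually written as shown here. The symbols are an assumption taken from that standard notation rather than from the text above: theta_t is the column vector of states at time t, G_t is the (possibly time-varying) state transition matrix and w_t is the system noise.

```latex
\theta_t = G_t \theta_{t-1} + w_t
```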

However, this is only half of the story.

We also need to discuss the observations, that is, what we actually see, since the states are hidden to us by system noise. The observations are a linear combination of the current state and some additional random variation known as measurement noise, also drawn from a multivariate normal distribution. Both noise terms, as well as the initial state, are normally distributed. Clearly that is a lot of notation to specify the model. For completeness, I'll summarise all of the terms here to help you get to grips with the model.
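The model can be made concrete with a short simulation. This is a minimal sketch, assuming the standard dynamic linear model form used by Pole et al [2]: the state equation theta_t = G_t theta_{t-1} + w_t, the observation equation y_t = F_t^T theta_t + v_t, and the distributions theta_0 ~ N(m_0, C_0), w_t ~ N(0, W_t) and v_t ~ N(0, V_t). For brevity it further assumes a scalar observation and time-invariant F, G, V and W. The function name simulate_lssm and all numerical values are hypothetical.

```python
import numpy as np


def simulate_lssm(n_steps, F, G, V, W, m0, C0, seed=0):
    """Simulate theta_t = G theta_{t-1} + w_t and y_t = F^T theta_t + v_t.

    F  : (p,) observation vector        G  : (p, p) state transition matrix
    V  : scalar measurement variance    W  : (p, p) system noise covariance
    m0 : (p,) mean of initial state     C0 : (p, p) covariance of initial state
    """
    rng = np.random.default_rng(seed)
    p = len(m0)
    theta = rng.multivariate_normal(m0, C0)          # theta_0 ~ N(m0, C0)
    states = np.empty((n_steps, p))
    obs = np.empty(n_steps)
    for t in range(n_steps):
        w = rng.multivariate_normal(np.zeros(p), W)  # system noise w_t
        theta = G @ theta + w                        # state equation
        v = rng.normal(0.0, np.sqrt(V))              # measurement noise v_t
        states[t] = theta
        obs[t] = F @ theta + v                       # observation equation
    return states, obs


# Example: a "local level" model, i.e. a random walk observed with noise.
F = np.array([1.0])
G = np.array([[1.0]])
W = np.array([[0.01]])
V = 0.25
m0 = np.array([0.0])
C0 = np.array([[1.0]])
states, obs = simulate_lssm(200, F, G, V, W, m0, C0)
```

The hidden states are returned alongside the observations only because this is a simulation. In practice only the observations are available, which is the reason a filtering algorithm is needed.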

The Kalman Filter

Now that we've specified the linear state-space model, we need an algorithm to actually solve it.

This is where the Kalman Filter comes in. We can use Bayes' Rule and conjugate priors to help us derive the algorithm.

A Bayesian Approach

Recall Bayes' Rule from the article on Bayesian statistics. We want to apply the rule to the idea of updating the probability of seeing a state given all of the previous data we have and our current observation.
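For reference, in its generic form for a hypothesis H and data D (symbols chosen here purely for illustration), Bayes' Rule reads:

```latex
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}
```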


Once again, we need to introduce more notation, this time to express that our current knowledge is a mixture of our previous knowledge plus our most recent observation. What does this mean?
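In symbols, and assuming the usual convention (as in Pole et al [2]) that D_t denotes all of the data known at time t and y_t the current observation, the statement can be sketched as:

```latex
D_t = (D_{t-1},\, y_t), \qquad
P(\theta_t \mid D_{t-1}, y_t)
  = \frac{P(y_t \mid \theta_t)\, P(\theta_t \mid D_{t-1})}{P(y_t)}
```

Read from left to right: the posterior probability of the state theta_t, given the previous data and the current observation, equals the likelihood of the observation given the state, multiplied by the prior belief about the state given only the previous data, normalised by the probability of the observation.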

While the notation may be somewhat verbose, it is a very natural statement.


One of the extremely useful aspects of Bayesian inference is that if our prior and likelihood are both normally distributed, we can use the concept of conjugate priors to state that our posterior (i.e. our updated belief) will also be normally distributed. We utilised the same concept, albeit with different distributional forms, in the discussion on the inference of binomial proportions.

So how does this help us produce a Kalman Filter? Well, let's specify the terms that we'll be using, from Bayes' Rule above. Firstly, we specify the distributional form of the prior. The latter two parameters will be defined below. Now let's consider the likelihood. We've already outlined these terms in our list above. We won't derive where these values actually come from, but we will simply state them. Thankfully we can use library implementations in R to carry out the "heavy lifting" for us.
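To make the update concrete, here is a minimal sketch of one filtering step. It assumes the standard dynamic linear model recursions in the notation of Pole et al [2]: the prior is theta_t | D_{t-1} ~ N(a_t, R_t), the likelihood is y_t | theta_t ~ N(F_t^T theta_t, V_t) and the posterior is theta_t | D_t ~ N(m_t, C_t). It is written in Python with NumPy rather than R, it handles a scalar observation only, and the function name kalman_step is hypothetical.

```python
import numpy as np


def kalman_step(m_prev, C_prev, y, F, G, V, W):
    """One Kalman Filter step for a scalar observation y.

    m_prev, C_prev : posterior mean (p,) and covariance (p, p) at time t-1
    F : (p,) observation vector     G : (p, p) state transition matrix
    V : scalar measurement variance W : (p, p) system noise covariance
    Returns the posterior (m, C) and the one-step forecast (f, Q).
    """
    a = G @ m_prev              # prior mean:        a_t = G_t m_{t-1}
    R = G @ C_prev @ G.T + W    # prior covariance:  R_t = G_t C_{t-1} G_t^T + W_t
    f = F @ a                   # forecast mean:     f_t = F_t^T a_t
    Q = F @ R @ F + V           # forecast variance: Q_t = F_t^T R_t F_t + V_t
    e = y - f                   # forecast error:    e_t = y_t - f_t
    A = R @ F / Q               # Kalman gain:       A_t = R_t F_t Q_t^{-1}
    m = a + A * e               # posterior mean:    m_t = a_t + A_t e_t
    C = R - np.outer(A, A) * Q  # posterior cov:     C_t = R_t - A_t Q_t A_t^T
    return m, C, f, Q


# Sanity check: prior N(0, 1), no system noise, unit measurement variance, y = 2.
m, C, f, Q = kalman_step(
    m_prev=np.array([0.0]), C_prev=np.array([[1.0]]), y=2.0,
    F=np.array([1.0]), G=np.array([[1.0]]), V=1.0, W=np.array([[0.0]]),
)
# Posterior mean is 1.0 and posterior variance is 0.5, the familiar
# normal-normal conjugate update for two equally precise sources.
```

Calling the function once per observation, feeding each posterior back in as the next step's m_prev and C_prev, gives the filtered estimates of the state.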

Clearly that is a lot of notation! As I said above, we need not worry too much about the excessive verboseness of the Kalman Filter, as we can simply use libraries in R to calculate the algorithm for us. So how does it all fit together? Now that we have an algorithmic procedure for updating our views on the observations and states, we can use it to make predictions, as well as smooth the data.

Prediction

The Bayesian approach to the Kalman Filter leads naturally to a mechanism for prediction. Let's take the expected value of the observation tomorrow, given our knowledge of the data today. However, it is not sufficient to simply calculate the mean. We must also know the variance of tomorrow's observation given today's data, otherwise we cannot truly characterise the distribution from which to draw tomorrow's prediction. Actually, it allows us to write a convenient shorthand for the following.
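The one-step-ahead case can be worked through as a sketch, again assuming the notation of Pole et al [2], where a_{t+1} = G_{t+1} m_t and R_{t+1} = G_{t+1} C_t G_{t+1}^T + W_{t+1} are the prior mean and covariance of tomorrow's state. The derivation uses only the linearity of expectation, the zero mean of the measurement noise v_{t+1}, and its independence from the state.

```latex
\begin{aligned}
E[y_{t+1} \mid D_t]
  &= E[F_{t+1}^T \theta_{t+1} + v_{t+1} \mid D_t]
   = F_{t+1}^T E[\theta_{t+1} \mid D_t]
   = F_{t+1}^T a_{t+1} = f_{t+1} \\
\mathrm{Var}[y_{t+1} \mid D_t]
  &= F_{t+1}^T \,\mathrm{Var}[\theta_{t+1} \mid D_t]\, F_{t+1} + V_{t+1}
   = F_{t+1}^T R_{t+1} F_{t+1} + V_{t+1} = Q_{t+1} \\
y_{t+1} \mid D_t &\sim \mathcal{N}(f_{t+1},\, Q_{t+1})
\end{aligned}
```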


As I've mentioned repeatedly in this article, we should not concern ourselves too much with the verboseness of the Kalman Filter and its notation, rather we should think about the overall procedure and its Bayesian underpinnings. Thus we now have the means of predicting new values of the series. This is an alternative to the predictions produced by combining ARIMA and GARCH. In subsequent articles we will actually carry this out for some real financial data and apply it to a predictive trading model.

We will also be able to use the "filter" aspect to provide us with continually updated views on a linear hedge ratio between a cointegrated pair of assets, such as might be found in a statistical arbitrage strategy.
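As a rough illustration of that last idea, the sketch below tracks a slowly drifting hedge ratio on synthetic data. It is a simplification under stated assumptions, not the method of the subsequent articles: the state is a single hedge ratio beta_t following a random walk (so G_t = 1), the observation is y_t = beta_t x_t + v_t with no intercept, and the noise variances W and V are picked by hand. The function name filter_hedge_ratio and the generated price series are hypothetical.

```python
import numpy as np


def filter_hedge_ratio(x, y, W=1e-4, V=0.25, m0=0.0, C0=1.0):
    """Filter a scalar, random-walk hedge ratio beta_t in y_t = beta_t * x_t + v_t."""
    m, C = m0, C0
    betas = np.empty(len(x))
    for t in range(len(x)):
        F = x[t]                  # time-varying observation coefficient F_t
        R = C + W                 # prior variance (G_t = 1, so a_t = m_{t-1})
        f = F * m                 # one-step forecast of y_t
        Q = F * R * F + V         # forecast variance
        A = R * F / Q             # Kalman gain
        m = m + A * (y[t] - f)    # posterior mean of beta_t
        C = R - A * Q * A         # posterior variance of beta_t
        betas[t] = m
    return betas


# Synthetic prices, for illustration only.
rng = np.random.default_rng(42)
n = 500
x = 50.0 + np.cumsum(rng.normal(0.0, 0.2, n))   # price of asset X
true_beta = np.linspace(1.0, 1.5, n)            # slowly drifting hedge ratio
y = true_beta * x + rng.normal(0.0, 0.5, n)     # price of asset Y
betas = filter_hedge_ratio(x, y)
```

Because each estimate uses only the data up to that time, betas is a filtered series rather than a smoothed one, which is what a live trading rule would have available.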

State Space Models and the Kalman Filter. By Michael Halls-Moore on December 28th.


[1] Cowpertwait et al. Introductory Time Series with R.
