Creating a Rolling Correlation Scanner in Python.
Creating a Scanner to Detect Current & Past Correlation Readings.
Correlations across different assets and markets is crucial to determine any directional bias and trigger for a trade. If we have a buy signal on one asset and a sell signal on another, but yet they are almost perfectly correlated then we know that one of the signals is wrong because empirically, we have the information that they move together in the same direction.
I have just published a new book after the success of my previous one “New Technical Indicators in Python”. It features a more complete description and addition of structured trading strategies with a GitHub page dedicated to the continuously updated code. If you feel that this interests you, feel free to visit the below link, or if you prefer to buy the PDF version, you could contact me on LinkedIn.
Introduction to Correlation
Correlation is the degree of linear relationship between two or more variables. It is bounded between -1 and 1 with one being a perfectly positive correlation, -1 being a perfectly negative correlation, and 0 as an indication of no linear relationship between the variables (they relatively go in random directions). The measure is not perfect and can be biased by outliers and non-linear relationships, it does however provide quick glances to statistical properties. Two famous types of correlation exist and are commonly used:
Spearman correlation measures the relationship between two continuous or ordinal variables. Variables may tend to change together, but not necessarily at a constant rate. It is based on the ranks of values rather than the raw data.
Pearson correlation measures the linear relationship between two continuous variables. A relationship can be considered linear when a change in one is accompanied with a proportional change in the other.
The measure is not perfect and can be biased by outliers and non-linear relationships, it does however provide quick glances to statistical properties.
We can code the correlation function between two variables in Python using the below. Note that it has to be an array and not a data frame:
def adder(Data, times):
for i in range(1, times + 1):
new = np.zeros((len(Data), 1), dtype = float)
Data = np.append(Data, new, axis = 1)
return Data
def deleter(Data, index, times):
for i in range(1, times + 1):
Data = np.delete(Data, index, axis = 1)
return Data
def jump(Data, jump):
Data = Data[jump:, ]
return Data
def rolling_correlation(Data, first_data, second_data, lookback, where):
# Adding an extra column
Data = adder(Data, 1)
for i in range(len(Data)):
try:
Data[i, where] = pearsonr(Data[i - lookback + 1:i + 1, first_data], Data[i - lookback + 1:i + 1, second_data])[0]
except ValueError:
pass
Data = jump(Data, lookback)
return Data
Creating the Scanner
Let us start by fetching historical OHLC data before seeing how to code the indicator in Python.
One of the most famous trading platforms in the retail community is the MetaTrader5 software. It is a powerful tool that comes with its own programming language and its huge online community support. It also offers the possibility to export its historical short-term and long-term FX data.
The first thing we need to do is to simply download the platform from the official website. Then, after creating the demo account, we are ready to import the library in Python that allows to import the OHLC data from MetaTrader5.
A library is a group of structured functions that can be imported into our Python interpreter from where we can call and use the ones we want.
The easiest way to install the MetaTrader5 library is to go to the Python prompt on our computer and type:
pip install MetaTrader5
This should install the library in our local Python. Now, we want to import it to the Python interpreter (such as Pycharm or SPYDER) so that we can use it. Let us actually import all the libraries we will be using for this:
import datetime # Date acquiring
import pytz # Time zone management
import pandas as pd # Mostly for Data frame manipulation
import MetaTrader5 as mt5 # Importing OHLC data
import matplotlib.pyplot as plt # Plotting charts
import numpy as np # Mostly for array manipulation
Anything that comes after “as” is a shortcut. The plt shortcut is there so that each time we want to call a function from that library we do not have to type the full matplotlib.pyplot statement.
The official documentation for the Metatrader5 libary can be found here.
The first thing we can do is to select which time frame we want to import. Let us suppose that there are only two time frames, the 30-minute and the hourly bars. We can therefore create variables that hold the statement to tell the MetaTrader5 library which time frame we want.
# Choosing the 30-minute time frame
frame_M30 = mt5.TIMEFRAME_M30# Choosing the hourly time frame
frame_H1 = mt5.TIMEFRAME_H1
Then, by staying in the spirit of importing variables, we can define the variable that states what date is it now. This helps the algorithm know the stopping date of the import. We can do this by the simple line of code below.
# Defining the variable now to give out the current date
now = datetime.datetime.now()
Note that these code snippets are better used chronologically, hence, I encourage you to copy them in order and then execute them one by one so that you understand the evolution of what you are doing. The below is a function that holds which assets we want. Generally, I use 10 or more but for simplicity, let us consider that there are only two currency pairs: EURUSD and USDCHF.
def asset_list(asset_set):
if asset_set == 'FX': assets = ['EURUSD', 'USDCHF'] return assets
Now, with the key function that gets us the OHLC data. The below establishes a connection to MetaTrader5, applies the current date, and extracts the needed data. Notice the arguments year, month, and day. These will be filled by us to select from when do we want the data to start. Note, I have inputted Europe/Paris as my time zone, you should use your time zone to get more accurate data.
def get_quotes(time_frame, year = 2005, month = 1, day = 1, asset = "EURUSD"):
# Establish connection to MetaTrader 5
if not mt5.initialize():
print("initialize() failed, error code =", mt5.last_error())
quit()
timezone = pytz.timezone("Europe/Paris")
utc_from = datetime.datetime(year, month, day, tzinfo = timezone)
utc_to = datetime.datetime(now.year, now.month, now.day + 1, tzinfo = timezone)
rates = mt5.copy_rates_range(asset, time_frame, utc_from, utc_to)
rates_frame = pd.DataFrame(rates)
return rates_frame
And finally, the last function we will use is the one that uses the below get_quotes function and then cleans the results so that we have a nice array. We have selected data since January 2019 as shown below.
def mass_import(asset, horizon):
if horizon == 'M30':
data = get_quotes(frame_M30, 2019, 1, 1, asset = assets[asset])
data = data.iloc[:, 1:5].values
data = data.round(decimals = 5)
return data
Finally, we are done building the blocks necessary to import the data. To import EURUSD OHLC historical data, we simply use the below code line:
# Choosing the horizon
horizon = 'M30'# Creating an array called EURUSD having M30 data since 2019
EURUSD = mass_import(0, horizon)
And voila, now we have the EURUSD OHLC data from 2019.
Our job now is to import two currency pairs, let us say EURUSD and USDCHF and then chart their 20-period rolling correlation on the hourly values. Here is how we can do that.
# Base Parameters
assets = asset_list('FX')
# Horizon Parameters
horizon = 'H1'
# Mass Imports
EURUSD = mass_import(0, horizon)
USDCHF = mass_import(1, horizon)
# Correlation Window
lookback = 20
# Combining the two Historical Arrays into One
combined_data = np.concatenate((EURUSD[-1000:, 3:4], USDCHF[-1000:, 3:4]), axis = 1)
# Calling the Correlation Function
combined_data = rolling_correlation(combined_data, 0, 1, lookback, 2)
Note that the two generally move in opposite direction which is why we tend to see the correlation close to -1.00. Some anomalies might occur from time to time which make the correlation positive. These are the times where we can trade the EURCHF cross as many opportunities are present.
def indicator_plot_triple(Data, first_data, second_data, third_data, window = 250):
fig, ax = plt.subplots(3, figsize = (10, 5))Chosen = Data[-window:, ]
ax[0].plot(Data[:, 0], color = 'blue')
ax[0].grid()
ax[1].plot(Data[:, 1], color = 'red')
ax[1].grid()
ax[2].plot(Data[:, 2], color = 'purple', linewidth = 1, label = '20-period Correlation')
ax[2].grid()
ax[2].legend()
What we if we take a look at two pairs that are closely related? AUDUSD and NZDUSD are known to move approximately together due to their economical and political proximity.
def asset_list(asset_set):
if asset_set == 'FX':
assets = ['AUDUSD', 'NZDUSD']
return assets
# Mass Imports
AUDUSD = mass_import(0, horizon)
NZDUSD = mass_import(1, horizon)
# Correlation Window
lookback = 20
# Combining the two Historical Arrays into One
combined_data = np.concatenate((AUDUSD[-1000:, 3:4], NZDUSD[-1000:, 3:4]), axis = 1)
# Calling the Correlation Function
combined_data = rolling_correlation(combined_data, 0, 1, lookback, 2)
When we enlarge the correlation window, we tend to see the less noisy picture and a clearer positive correlation like the one shown in the below chart on the same currency pairs.
If you are also interested by more technical indicators and using Python to create strategies, then my best-selling book on Technical Indicators may interest you:
Conclusion
Remember to always do your back-tests. You should always believe that other people are wrong. My indicators and style of trading may work for me but maybe not for you.
I am a firm believer of not spoon-feeding. I have learnt by doing and not by copying. You should get the idea, the function, the intuition, the conditions of the strategy, and then elaborate (an even better) one yourself so that you back-test and improve it before deciding to take it live or to eliminate it. My choice of not providing specific Back-testing results should lead the reader to explore more herself the strategy and work on it more.
To sum up, are the strategies I provide realistic? Yes, but only by optimizing the environment (robust algorithm, low costs, honest broker, proper risk management, and order management). Are the strategies provided only for the sole use of trading? No, it is to stimulate brainstorming and getting more trading ideas as we are all sick of hearing about an oversold RSI as a reason to go short or a resistance being surpassed as a reason to go long. I am trying to introduce a new field called Objective Technical Analysis where we use hard data to judge our techniques rather than rely on outdated classical methods.