Trading A-Z #3: Creating an FX Correlation Heatmap

Article #3 on How to Create a Correlation Heatmap in Python to Aid in Currency Trading.

Sep 18, 2021

Correlation across different assets and markets is crucial in determining any directional bias and trigger for a trade. If we have a buy signal on one asset and a sell signal on another, yet they are almost perfectly correlated, then we know that one of the signals is wrong because, empirically, we know that they move together in the same direction.

Correlation management is an important part of risk management. In this article, we will see how to create a correlation heatmap so as to get a quick glance at the current state of correlation between different FX pairs.

I have just published a new book after the success of my previous one “New Technical Indicators in Python”. It features a more complete description and addition of structured trading strategies with a GitHub page dedicated to the continuously updated code. If you feel that this interests you, feel free to visit the below link, or if you prefer to buy the PDF version, you could contact me on LinkedIn.

The Book of Trading Strategies

The Concept of Correlation

Correlation is the degree of linear relationship between two or more variables. It is bounded between -1 and 1, with 1 being a perfectly positive correlation, -1 being a perfectly negative correlation, and 0 indicating no linear relationship between the variables (they move more or less independently of each other). The measure is not perfect and can be biased by outliers and non-linear relationships, but it does provide a quick glance at the statistical properties of the data. Two well-known types of correlation are commonly used:

Spearman correlation measures the monotonic relationship between two continuous or ordinal variables. The variables may tend to change together, but not necessarily at a constant rate. It is based on the ranks of values rather than the raw data.
Pearson correlation measures the linear relationship between two continuous variables. A relationship can be considered linear when a change in one is accompanied by a proportional change in the other.
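
The difference between the two measures is easy to see on a relationship that always rises but not at a constant rate. The snippet below is a minimal sketch on synthetic data (the exponential curve and the helper names are illustrative assumptions, not part of the original code). Spearman is computed here as the Pearson correlation of the ranks, which is valid because the series have no ties.

```python
import numpy as np

def pearson(x, y):
    # Pearson correlation read from the 2x2 correlation matrix
    return np.corrcoef(x, y)[0, 1]

def spearman_no_ties(x, y):
    # argsort of argsort gives 0-based ranks when there are no ties;
    # Spearman is then the Pearson correlation of those ranks
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return pearson(rank_x, rank_y)

x = np.linspace(0, 5, 50)
y = np.exp(x)  # rises with x, but not at a constant rate

pearson_value = pearson(x, y)
spearman_value = spearman_no_ties(x, y)

print('Pearson: ', round(pearson_value, 3))
print('Spearman:', round(spearman_value, 3))
```

Spearman comes out at 1 because the ordering of the two series is identical, while Pearson stays below 1 because the relationship is not a straight line.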

We can code a rolling correlation function between two variables in Python using the function below. Note that the input has to be a numpy array and not a data frame, with a spare column (at index where) to hold the result:

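from scipy.stats import pearsonr

def jump(Data, rows):
    
    # Helper assumed from context: drops the first rows, which have no full window
    Data = Data[rows:, ]
    
    return Data

def rolling_correlation(Data, first_data, second_data, lookback, where):
    
    for i in range(len(Data)):
        
        try:
            # Pearson correlation over the last lookback rows ending at row i
            Data[i, where] = pearsonr(Data[i - lookback + 1:i + 1, first_data], Data[i - lookback + 1:i + 1, second_data])[0]
            
        except ValueError:
            # Not enough rows yet to fill the window
            pass
    
    Data = jump(Data, lookback) 
    
    return Data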
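
As a cross-check for a function like the one above, pandas offers a built-in rolling Pearson correlation. The sketch below uses two synthetic random-walk series (an assumption for illustration; real closing prices would take their place) and a hypothetical 20-bar lookback.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Two synthetic random walks standing in for closing prices
first = pd.Series(rng.normal(size = 200).cumsum())
second = pd.Series(rng.normal(size = 200).cumsum())

lookback = 20

# Rolling Pearson correlation; the first lookback - 1 values are NaN
rolling_corr = first.rolling(lookback).corr(second)

# The last value should match a direct calculation on the last window
direct = np.corrcoef(first.iloc[-lookback:], second.iloc[-lookback:])[0, 1]

print(round(rolling_corr.iloc[-1], 4), round(direct, 4))
```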

Importing OHLC Data

One of the most famous trading platforms in the retail community is the MetaTrader5 software. It is a powerful tool that comes with its own programming language and a huge online community. Most importantly, it offers the possibility to export its historical short-term and long-term FX data. The first thing we need to do is to download it from the official website. Then, after creating a demo account, we are ready to import the Python library that allows us to import the OHLC data from MetaTrader5.

A library is a group of structured functions that can be imported into our Python interpreter from where we can call and use the ones we want.

The easiest way to install the library is to open the command prompt (terminal) on our computer and type:

pip install MetaTrader5

This should install the library in our local Python. Now, we want to import it in our Python environment (for example in an IDE such as PyCharm or Spyder) so that we can use it. Let us import all the libraries we will be using:

import datetime                 # Date acquiring
import pytz                     # Time zone management
import pandas            as pd  # Mostly for Data frame manipulation
import MetaTrader5       as mt5 # Importing OHLC data
import matplotlib.pyplot as plt # Plotting charts
import numpy             as np  # Mostly for array manipulation

Anything that comes after “as” is a shortcut. The plt shortcut is there so that each time we want to call a function from that library we do not have to type the full matplotlib.pyplot statement.

The first thing we can do is to select which time frame we want to import. Let us suppose that there are only two frames, the 30-minute and the hourly bars. We can therefore create variables that hold the statement to tell the MetaTrader5 library which frame we want.

# Choosing the 30-minute time frame
frame_M30  = mt5.TIMEFRAME_M30

# Choosing the hourly time frame
frame_H1   = mt5.TIMEFRAME_H1

Then, staying in the spirit of defining variables, we can define the variable that holds the current date. This helps the algorithm know the stopping date of the import. We can do this with the simple line of code below.

# Defining the variable now to give out the current date
now = datetime.datetime.now()

Note that these code snippets are best used in order; I encourage you to copy them in sequence and execute them one by one so that you understand the evolution of what you are doing. Below is a function that holds the assets we want. Generally, I use 10 or more, but for simplicity, let us consider only two currency pairs: EURUSD and USDCHF.

def asset_list(asset_set):
    
    if asset_set == 1:
        assets = ['EURUSD', 'USDCHF']    

    return assets

# The import function further down reads the pairs from this list
assets = asset_list(1)

Now for the key function that gets us the OHLC data. The function below establishes a connection to MetaTrader5, applies the current date, and extracts the needed data. Notice the arguments year, month, and day. We fill these in to select when the data should start. Note that I have entered Europe/Paris as my time zone; you should use your own time zone to get more accurate data.

def get_quotes(time_frame, year = 2005, month = 1, day = 1, asset = "EURUSD"):
        
    # Establish connection to MetaTrader 5 
    if not mt5.initialize():
        print("initialize() failed, error code =", mt5.last_error())
        quit()
    
    timezone = pytz.timezone("Europe/Paris")
    
    # With pytz, localize() applies the correct offset (passing the zone as tzinfo does not)
    utc_from = timezone.localize(datetime.datetime(year, month, day))
    
    # Adding a timedelta keeps the end date valid on the last day of a month
    tomorrow = now + datetime.timedelta(days = 1)
    utc_to = timezone.localize(datetime.datetime(tomorrow.year, tomorrow.month, tomorrow.day))
    
    rates = mt5.copy_rates_range(asset, time_frame, utc_from, utc_to)
    
    rates_frame = pd.DataFrame(rates)    

    return rates_frame
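
The end date in the function is meant to be tomorrow's date. Below is a minimal sketch of that date arithmetic using only the standard library: adding a timedelta rolls over correctly at the end of a month, whereas building a date from the day number plus one does not. The helper name next_day is hypothetical.

```python
import datetime

def next_day(moment):
    # timedelta handles month and year boundaries
    return moment + datetime.timedelta(days = 1)

last_of_january = datetime.datetime(2021, 1, 31)
print(next_day(last_of_january))  # 2021-02-01 00:00:00

# Building the date by hand fails on the same day
try:
    datetime.datetime(2021, 1, 31 + 1)
    day_plus_one_failed = False
except ValueError:
    day_plus_one_failed = True

print(day_plus_one_failed)  # True
```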

And finally, the last function we will use is the one that calls the get_quotes function above and then cleans the results so that we have a nice array. We have selected data since January 2019, as shown below.

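def mass_import(asset, horizon):
    
    # asset is an index into the assets list; horizon selects the time frame
    if horizon == 'M30':
        data = get_quotes(frame_M30, 2019, 1, 1, asset = assets[asset])
        # Keep columns 1 to 4 (open, high, low, close) as a numpy array
        data = data.iloc[:, 1:5].values
        data = data.round(decimals = 5)  

    return data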
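
To see what the cleaning step does without a MetaTrader5 connection, the sketch below applies the same iloc selection to a small hand-made data frame. The column names follow the layout that copy_rates_range returns (time, open, high, low, close, then volume and spread fields); the prices themselves are made up for illustration.

```python
import pandas as pd

# Hand-made stand-in for the frame returned by get_quotes
rates_frame = pd.DataFrame({
    'time':        [1546300800, 1546302600, 1546304400],
    'open':        [1.146001, 1.146502, 1.147003],
    'high':        [1.146904, 1.147408, 1.147602],
    'low':         [1.145702, 1.146201, 1.146404],
    'close':       [1.146404, 1.147108, 1.146806],
    'tick_volume': [120, 98, 143],
    'spread':      [1, 1, 2],
    'real_volume': [0, 0, 0],
})

# Same cleaning as in mass_import: keep OHLC, convert to an array, round
data = rates_frame.iloc[:, 1:5].values
data = data.round(decimals = 5)

print(data.shape)  # (3, 4)
print(data[:, 3])  # the closing prices
```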

Finally, we are done building the blocks necessary to import the data. To import EURUSD OHLC historical data, we simply use the below code line:

# Choosing the horizon
horizon = 'M30'

# Creating an array called EURUSD having M30 data since 2019
EURUSD = mass_import(0, horizon)

And voila, now we have the EURUSD OHLC data from 2019.

Creating the Correlation Heatmap

The heatmap will be created using a library called seaborn. It will take care of the plotting part. The data preparation and structuring part will be handled manually using the next code snippets. All we need is the importing function from the previous section. First, let us make sure we have the right libraries imported.

# Importing libraries
import pandas            as pd
import numpy             as np
import seaborn           as sn

The next step is to determine the array of the FX pairs we wish to cover. In our example, we have chosen 9 different currency pairs as shown below:

assets = ['EURUSD', 'USDCHF', 'GBPUSD', 'AUDUSD', 'NZDUSD', 'USDCAD', 'EURCAD', 'EURGBP', 'EURCHF']

The next step is to import them to the Python interpreter using the mass_import function which we have seen previously. For each of the following currency pairs, we will be doing three things:

Importing the OHLC historical data from MetaTrader5 using the mass_import function.
Removing all the columns except for the closing price column as we will calculate the correlation measure on it.
Selecting the last 1000 observations, as correlation is dynamic and is unlikely to resemble the correlation from a few years ago. With the M30 horizon chosen above, these are the last 1000 30-minute bars.

# Mass imports 
EURUSD = mass_import(0, horizon)
EURUSD = EURUSD[:, 3:4]
EURUSD = EURUSD[-1000:, ]

USDCHF = mass_import(1, horizon)
USDCHF = USDCHF[:, 3:4]
USDCHF = USDCHF[-1000:, ]

GBPUSD = mass_import(2, horizon)
GBPUSD = GBPUSD[:, 3:4]
GBPUSD = GBPUSD[-1000:, ]

AUDUSD = mass_import(3, horizon)
AUDUSD = AUDUSD[:, 3:4]
AUDUSD = AUDUSD[-1000:, ]

NZDUSD = mass_import(4, horizon)
NZDUSD = NZDUSD[:, 3:4]
NZDUSD = NZDUSD[-1000:, ]

USDCAD = mass_import(5, horizon)
USDCAD = USDCAD[:, 3:4]
USDCAD = USDCAD[-1000:, ]

EURCAD = mass_import(6, horizon)
EURCAD = EURCAD[:, 3:4]
EURCAD = EURCAD[-1000:, ]

EURGBP = mass_import(7, horizon)
EURGBP = EURGBP[:, 3:4]
EURGBP = EURGBP[-1000:, ]

EURCHF = mass_import(8, horizon)
EURCHF = EURCHF[:, 3:4]
EURCHF = EURCHF[-1000:, ]

The next step is to concatenate (join) every column from every pair into one array using the concatenate function from numpy as shown below.

Correlation_Matrix = np.concatenate((EURUSD, 
                                     USDCHF, 
                                     GBPUSD, 
                                     AUDUSD, 
                                     NZDUSD, 
                                     USDCAD, 
                                     EURCAD ,
                                     EURGBP, 
                                     EURCHF), axis = 1)
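
The import, slicing, and concatenation steps can also be written as a loop. The sketch below is a stand-in under stated assumptions: fake_import is a hypothetical replacement for mass_import that returns random OHLC-shaped arrays, so the snippet runs without MetaTrader5, and only three pairs are used to keep it short.

```python
import numpy as np

assets = ['EURUSD', 'USDCHF', 'GBPUSD']
rng = np.random.default_rng(1)

def fake_import(asset_index, bars = 1500):
    # Hypothetical stand-in for mass_import: a (bars, 4) OHLC-shaped array
    return 1.0 + rng.normal(scale = 0.001, size = (bars, 4)).cumsum(axis = 0)

closes = []

for index in range(len(assets)):
    data = fake_import(index)
    data = data[:, 3:4]      # keep only the close column, still two-dimensional
    data = data[-1000:, ]    # keep the last 1000 bars
    closes.append(data)

# Join the close columns side by side
Correlation_Matrix = np.concatenate(closes, axis = 1)

print(Correlation_Matrix.shape)  # (1000, 3)
```

Slicing with 3:4 rather than 3 keeps each close series as a column of shape (1000, 1), which is what lets concatenate join them along axis 1.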

Then, we want to name the columns, but this can only be done in data frames. We will therefore simply convert the array into a data frame and then name the columns as seen in the code snippet below.

Correlation_Matrix = pd.DataFrame(Correlation_Matrix)

Correlation_Matrix.columns = ['EURUSD', 
                              'USDCHF', 
                              'GBPUSD', 
                              'AUDUSD', 
                              'NZDUSD', 
                              'USDCAD',
                              'EURCAD', 
                              'EURGBP', 
                              'EURCHF']
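
The conversion and the naming can also be done in one call, since the DataFrame constructor accepts a columns argument. Here is a small sketch, with a random array in place of the real prices and only three pairs for brevity:

```python
import numpy as np
import pandas as pd

assets = ['EURUSD', 'USDCHF', 'GBPUSD']

# Random stand-in for the concatenated closing prices
prices = np.random.default_rng(2).normal(size = (1000, len(assets)))

# Convert to a data frame and name the columns in a single step
Correlation_Matrix = pd.DataFrame(prices, columns = assets)

print(Correlation_Matrix.columns.tolist())
```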

We have all we need for the heatmap now. All that is left to do is the single line of seaborn that plots the map. This can be done easily following the syntax below.

# Visual representation
sn.heatmap(Correlation_Matrix.corr(), cmap ="YlGnBu")
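
It can help to look at the numbers behind the colors. The corr method returns a symmetric matrix with ones on the diagonal, and it accepts a method argument to switch from Pearson (the default) to Spearman. The sketch below uses synthetic series; the column names and the way the series are built are assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
base = rng.normal(size = 1000).cumsum()

frame = pd.DataFrame({
    'A': base,
    'B': -base + rng.normal(scale = 0.5, size = 1000),  # moves against A
    'C': rng.normal(size = 1000).cumsum(),               # unrelated random walk
})

pearson_matrix = frame.corr()                      # Pearson by default
spearman_matrix = frame.corr(method = 'spearman')  # rank-based alternative

print(pearson_matrix.round(2))
```

When plotting, passing vmin = -1 and vmax = 1 to sn.heatmap pins the color scale to the full correlation range, and annot = True prints the value in each cell, which makes the colors easier to compare from one run to the next.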

Interpreting the Data

What does the heatmap above tell us? The first thing to notice is the scale on the right, which tells us how to interpret the colors (or the heat). Beige refers to strong negative correlation while dark blue refers to strong positive correlation. The greenish blue is where correlation is close to zero, meaning that the two markets move independently and are not really related.

Take a look at the correlation of the EURUSD vs GBPUSD, AUDUSD, and NZDUSD. As they are USD majors, the greenback’s movements have been dominant, causing them to move together and to be led by the USD.

Unrelated pairs like the NZDUSD and EURCHF will likely have a correlation close to zero, which is the case if we take a look at the above heatmap. Green and blue are the prevalent colors there, which supports this hypothesis.

As expected, the EURUSD and USDCHF have an extremely high negative correlation. When one moves up, the other is likely to move down. CHF is related to the Euro zone and therefore it is likely to have the same relationship with the USD as the EUR has with it.

Before taking a second trade or simultaneous trades, we first need to check the correlation to see whether we are increasing or decreasing our risk.
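
One way to make this check concrete is the standard two-asset volatility formula, where the correlation decides whether a second position stacks on the first or offsets it. The sketch below is illustrative only: the function name, the equal position sizes, the 1% volatilities, and the -0.9 correlation are assumptions, not values measured in this article.

```python
import math

def combined_volatility(size_1, vol_1, size_2, vol_2, correlation):
    # Two-asset formula: sqrt(w1^2 s1^2 + w2^2 s2^2 + 2 w1 w2 rho s1 s2)
    # Sizes are signed: positive for long, negative for short
    variance = ((size_1 * vol_1) ** 2 + (size_2 * vol_2) ** 2
                + 2 * size_1 * size_2 * correlation * vol_1 * vol_2)
    return math.sqrt(max(variance, 0.0))

# Long both pairs with a strong negative correlation: the positions offset
offsetting = combined_volatility(1, 0.01, 1, 0.01, -0.9)

# Long one pair and short the other with the same correlation: the risk stacks
stacked = combined_volatility(1, 0.01, -1, 0.01, -0.9)

print(round(offsetting, 5), round(stacked, 5))
```

With a strongly negative correlation, being long both pairs behaves like a partial hedge, while being long one and short the other is close to doubling the same bet.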

If you are also interested in more technical indicators and in using Python to create strategies, then my best-selling book on Technical Indicators may interest you:

New Technical Indicators in Python

A New Approach to Non-Linear Correlation: The MIC

The Maximal Information Coefficient (MIC) is a measure with origins in information theory that attempts to capture the strength of both linear and non-linear relationships. It does not tell you whether the variables move in the same direction or in opposite ones, but it does tell you how strong the current relationship is, and this is extremely valuable in analyzing different pairs of variables.

Let us try an experiment to show that the MIC can capture non-linear relationships as well. We will simulate a sine and a cosine time series and then calculate the correlation between the two. Here’s the code to plot the two series:

import numpy as np
import matplotlib.pyplot as plt

data_range = np.arange(0, 30, 0.1)
sine = np.sin(data_range)
cosine = np.cos(data_range)

plt.plot(sine, color = 'black', label = 'Sine Function')
plt.plot(cosine, color = 'red', label = 'Cosine Function')
plt.grid()
plt.legend()

Clearly, someone looking at the graph without knowing the functions will conclude that they are somehow related, whether it is the black line leading the red line or that they are both bounded by two levels. What we want to do is to calculate the MIC for these two and compare the result to the two other correlation measures, Spearman and Pearson. We can use the code below to do so.

from scipy.stats import pearsonr
from scipy.stats import spearmanr
from minepy import MINE

# Pearson Correlation
print('Correlation | Pearson: ', round(pearsonr(sine, cosine)[0], 3))

# Spearman Correlation
print('Correlation | Spearman: ', round(spearmanr(sine, cosine)[0], 3))

# MIC
mine = MINE(alpha = 0.6, c = 15)
mine.compute_score(sine,cosine)
MIC = mine.mic()

print('Correlation | MIC: ', round(MIC, 3))

# Output: Correlation | Pearson:  0.035
# Output: Correlation | Spearman:  0.027
# Output: Correlation | MIC: 0.602

The results show the following:

Pearson: Notice the absence of any type of correlation here due to it missing out on the non-linear association.
Spearman: The same situation applies here with an extremely weak correlation, because the relationship is not monotonic. Ranks capture relationships that move consistently in one direction, not ones that cycle up and down.
MIC: The measure returned a strong relationship of 0.60 between the two which is closer to reality and to what we are seeing.
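
The near-zero Pearson value does not mean the two series are unrelated. A quick way to see the hidden dependence, using only numpy: because sin² + cos² = 1, the squared series are perfectly negatively correlated even though the raw series are not. This is an illustrative check on the same simulated data, not a substitute for the MIC.

```python
import numpy as np

data_range = np.arange(0, 30, 0.1)
sine = np.sin(data_range)
cosine = np.cos(data_range)

# Pearson on the raw series misses the relationship
raw_corr = np.corrcoef(sine, cosine)[0, 1]

# Pearson on the squared series reveals it, since cos^2 = 1 - sin^2
squared_corr = np.corrcoef(sine ** 2, cosine ** 2)[0, 1]

print('Pearson on raw series:    ', round(raw_corr, 3))
print('Pearson on squared series:', round(squared_corr, 3))
```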

The advantages of the Maximal Information Coefficient are that it is robust to outliers and that it does not make any assumptions about the distribution of the variables used.

Can the MIC be used in Trading? I would like to believe that it can be. A rolling MIC measure may also be useful as an AutoCorrelation Indicator.

Note that to use the Maximal Information Coefficient library, we have to type the following into the command prompt (terminal):

pip install minepy

Conclusion

This wraps up the third article of the Trading A-Z series, which aims to present all you need to know about Discretionary and Systematic Trading. Remember to always do your research before initiating any position!

All About Trading!
