Sentiment Analysis is a vast and promising field in data analytics and trading. It is a rapidly rising type of analysis that uses the current pulse and market feeling to detect what participants intend to do or what positions they are holding.
Imagine you are planning to see a movie and want to anticipate whether it will be good, so you ask friends who have already seen it for their opinions. If 75% say the movie was good, you will have some confidence that you will like it, because the sentiment around it is mostly positive (ignoring tastes and preferences). The same analogy applies to financial markets in many ways, either quantitative or qualitative.
Sometimes, indicators will be classified as more than one type, meaning a technical indicator can also be a sentiment indicator (e.g. the On-Balance Volume). And the way we analyze can also be technical (e.g. drawing support and resistance lines) or quantitative (e.g. mean-reversion).
Introduction to Sentiment Analysis
Sentiment Analysis is a type of predictive analytics that deals with alternative data other than the basic OHLC price structure. It is usually based on subjective polls and scores but can also be based on more quantitative measures such as expected hedges through market strength or weakness. One well-known objective sentiment indicator is the Commitments of Traders report.
This article will deal with an indicator called the Dark Index, a sophisticated time series provided by squeezemetrics on a daily basis. The data itself harbors extremely valuable information when it comes to equity indices and trading. We will see how to download historical data on the S&P500 automatically in Python as well as the Dark Index, then we will design a trading strategy and evaluate it through the signal quality metric.
For a detailed and thorough collection of trend following trading strategies, you can check out my book. The book features a huge number of classic and modern techniques as it delves into the realm of technical analysis with different trading strategies. The book comes with its own GitHub.
Trend Following Strategies in Python: How to Use Indicators to Follow the Trend
Amazon.com: Trend Following Strategies in Python: How to Use Indicators to Follow the Trend.: 9798756939620: Kaabar…amzn.to
The Dark Index
The Dark Index provides a way of peeking into the secret (dark) exchanges. It is calculated as an aggregate value of many dark pool indicators (another type of release provided by squeezemetrics) and measures the hidden market sentiment. When the values of this indicator are higher than usual, it means that more buying occurred in dark pools than usual, and we can profit from this informational gap. The index therefore traces the activity of liquidity providers, which is why it tends to be negatively correlated with the market it follows (a consequence of hedging activities).
Downloading the Data & Designing the Strategy
We will use Python to download the data from the website automatically, with a library called selenium, which is widely used for automating a browser and fetching data online. The first thing we need to do is import the necessary libraries.
# Importing Libraries
import pandas as pd
import numpy as np
from selenium import webdriver
We will assume Google Chrome for this task. selenium supports other web browsers, so if you use a different one (e.g. Firefox), the code below will still work once you swap in the driver for your browser.
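# Calling Chrome (raw string so that "\u" in the Windows path is not read as an escape sequence)
# Note: Selenium 4 moved the driver path into a Service object; adapt this line to your installed version
driver = webdriver.Chrome(r'C:\user\your_file/chromedriver.exe')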
# URL of the Website from Where to Download the Data
url = "https://squeezemetrics.com/monitor/dix"
# Opening the website
driver.get(url)
# Getting the button by ID (find_element_by_id was removed in Selenium 4; "id" is the value of By.ID)
button = driver.find_element("id", "fileRequest")
To understand what we are doing, think of the code as an assistant that opens a Google Chrome window, types the given address, and searches for the download button until it is found. The download will include the historical data of the S&P500 as well as the Dark Index and the GEX, an indicator discussed in a previous article. For now, the focus is on the Dark Index. All that is left is to click the download button, which is done using the following code:
# Clicking on the button
button.click()
You should see a completed download called DIX in the form of a csv file. It is time to import the historical data file into the Python interpreter and structure it the way we want. Make sure the interpreter's working directory is the downloads folder where the new DIX file is found.
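As a minimal sketch of that last point, the hypothetical helper below waits for the file to show up and returns its full path, so that it can be passed to pandas without changing the working directory. The location of the download folder (a Downloads directory under your home folder) is an assumption; adjust it to your own setup.

```python
import time
from pathlib import Path

def wait_for_download(folder, filename="DIX.csv", timeout=30.0, poll=0.5):
    """Hypothetical helper: poll `folder` until `filename` exists, then return its path."""
    target = Path(folder) / filename
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Chrome writes to a temporary .crdownload file first, so the final
        # name only appears once the download has completed
        if target.exists():
            return target
        time.sleep(poll)
    raise FileNotFoundError(f"{target} did not appear within {timeout} seconds")

# Assumed location of the browser's download folder
downloads = Path.home() / "Downloads"
# csv_path = wait_for_download(downloads)
# my_data = pd.read_csv(csv_path)
```

The two commented lines show the intended use: once the helper returns, the path can go straight into pd.read_csv.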
We will use pandas to read the csv file, then numpy to transform it into an array and shape it. Before we do this, let us first define two basic array manipulation functions:
# A Function to Add a Specified Number of Columns
def adder(Data, times):
    for i in range(1, times + 1):
        new = np.zeros((len(Data), 1), dtype = float)
        Data = np.append(Data, new, axis = 1)
    return Data
# A Function to Delete a Specified Number of Columns
def deleter(Data, index, times):
    for i in range(1, times + 1):
        Data = np.delete(Data, index, axis = 1)
    return Data
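To make the behavior of these two helpers concrete, here is a small usage sketch on a toy array. The numbers are made up for illustration and are not market data; the two functions are repeated so that the snippet runs on its own.

```python
import numpy as np

def adder(Data, times):
    for i in range(1, times + 1):
        new = np.zeros((len(Data), 1), dtype=float)
        Data = np.append(Data, new, axis=1)
    return Data

def deleter(Data, index, times):
    for i in range(1, times + 1):
        Data = np.delete(Data, index, axis=1)
    return Data

# Toy 3x2 array (made-up numbers, not market data)
toy = np.array([[1.0, 2.0],
                [3.0, 4.0],
                [5.0, 6.0]])

wider = adder(toy, 2)             # appends two zero-filled columns on the right
print(wider.shape)                # (3, 4)

narrower = deleter(wider, 0, 1)   # removes the column at index 0 once
print(narrower.shape)             # (3, 3)
print(narrower[0].tolist())       # [2.0, 0.0, 0.0]
```

Both functions return a new array rather than modifying the input, which is why the result is always assigned back to a variable. Note that deleter removes the column at the same index on each pass, so a times value above 1 drops a run of adjacent columns starting at that index.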
Next, we can use the following syntax to organize and clean the data. Remember, the csv file contains a time stamp column followed by three data columns: the S&P500, the Dark Index, and the GEX.
# Importing the CSV File Using pandas
my_data = pd.read_csv('DIX.csv')
# Transforming the File to an Array
my_data = np.array(my_data)
# Eliminating the time stamp and the GEX (keeping the S&P500 and the Dark Index as floats)
my_data = my_data[:, 1:3].astype(float)
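To check which columns survive the slice, here is a sketch on a small in-memory csv. The header names (date, price, dix, gex) and the two rows are assumptions made up for illustration; compare them against the header of your own DIX file before relying on the column positions.

```python
import io
import numpy as np
import pandas as pd

# Made-up rows under an assumed header; the real file may differ
sample_csv = io.StringIO(
    "date,price,dix,gex\n"
    "2020-01-02,3250.0,0.44,3000000000.0\n"
    "2020-01-03,3235.0,0.48,2500000000.0\n"
)

frame = pd.read_csv(sample_csv)
print(list(frame.columns))   # ['date', 'price', 'dix', 'gex']

# Skip column 0 (the time stamp) and keep columns 1 and 2;
# the cast to float is an addition so the array has a numeric dtype
array = np.array(frame)[:, 1:3].astype(float)
print(array.shape)           # (2, 2)
print(array[1, 1])           # 0.48
```

The slice 1:3 stops before index 3, which is why the last column (the GEX in this assumed layout) is dropped along with the time stamp.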
Right about now, we should have a clean 2-column array with the S&P500 and the Dark Index. Let us write the conditions for the trade following the intuition we have seen in the DIX section:
A bullish signal is generated if the DIX reaches or surpasses 0.475, the historical resistance level, while no bullish signal was generated on any of the previous three readings, so that we eliminate duplicates.
No bearish signal is generated so that we harness the DIX’s power in an upward sloping market. The reason we are doing this is that we want to build a trend-following system. However, it may also be interesting to try and find bearish signals on the DIX. For simplicity, we want to see how well it times the market using only long signals.
# Creating the Signal Function
def signal(Data, dix, buy):
    Data = adder(Data, 1)
    for i in range(len(Data)):
        if Data[i, dix] >= 0.475 and Data[i - 1, buy] == 0 and Data[i - 2, buy] == 0 and Data[i - 3, buy] == 0:
            Data[i, buy] = 1
    return Data
my_data = signal(my_data, 1, 2)
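To see the rule at work on numbers small enough to check by hand, the sketch below runs the same logic on a toy array. The prices and DIX readings are made up for illustration, and the two functions are repeated so that the snippet runs on its own.

```python
import numpy as np

def adder(Data, times):
    for i in range(1, times + 1):
        new = np.zeros((len(Data), 1), dtype=float)
        Data = np.append(Data, new, axis=1)
    return Data

def signal(Data, dix, buy):
    Data = adder(Data, 1)
    for i in range(len(Data)):
        # Buy when the DIX is at or above 0.475 and the three preceding rows carry no signal
        if (Data[i, dix] >= 0.475 and Data[i - 1, buy] == 0
                and Data[i - 2, buy] == 0 and Data[i - 3, buy] == 0):
            Data[i, buy] = 1
    return Data

# Column 0: made-up prices, column 1: made-up DIX readings
toy = np.array([[100.0, 0.40],
                [101.0, 0.48],
                [102.0, 0.49],
                [103.0, 0.42],
                [104.0, 0.41],
                [105.0, 0.50],
                [106.0, 0.47],
                [107.0, 0.48]])

toy = signal(toy, 1, 2)
print(toy[:, 2].tolist())   # [0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
```

Rows 2 and 7 also have a DIX at or above 0.475, but they are skipped because a signal already fired within the three rows before them. This is the duplicate filter in action.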
The code above gives us the signals shown in the chart. We can see that they are of good quality and that the false signals typically occur only in times of severe volatility and market-related issues. As a buying-the-dips timing indicator, the Dark Index may be promising. Let us evaluate this using one metric for simplicity, the signal quality.
The signal quality is a metric that resembles a fixed holding period strategy. It is simply the reaction of the market after a specified time period following the signal. Generally, when trading, we tend to use a variable period where we open a position and close it when we get a signal in the opposite direction or when we get stopped out (either positively or negatively). Sometimes, we close out at random time periods. The signal quality is therefore a very simple measure that assumes a fixed holding period and compares the market level at the end of that period with the entry level. In other words, it measures market timing by checking the reaction of the market.
def signal_quality(Data, spx, signal, period, result):
    Data = adder(Data, 1)
    # Stop `period` rows before the end so that i + period stays inside the array
    for i in range(len(Data) - period):
        if Data[i, signal] == 1:
            # Price change between the entry row and the row `period` later
            Data[i + period, result] = Data[i + period, spx] - Data[i, spx]
    return Data
# Using 21 Periods as a Window of Signal Quality Check
my_data = signal_quality(my_data, 0, 2, 21, 3)
positives = my_data[my_data[:, 3] > 0]
negatives = my_data[my_data[:, 3] < 0]
# Calculating Signal Quality (stored under a new name so the function is not overwritten)
quality = len(positives) / (len(negatives) + len(positives))
print('Signal Quality = ', round(quality * 100, 2), '%')
# Output: 71.13 %
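The same metric can be checked by hand on a toy series. The sketch below is a vectorized variant and not the code used for the result above: the helper name fixed_horizon_quality, the prices, and the signals are all made up for illustration. It skips any signal that falls too close to the end of the data to have an exit price.

```python
import numpy as np

def fixed_horizon_quality(prices, signals, period):
    """Hypothetical helper: share of signals followed by a higher price `period` rows later."""
    prices = np.asarray(prices, dtype=float)
    entries = np.flatnonzero(np.asarray(signals) == 1)
    # Drop signals too close to the end to have an exit price
    entries = entries[entries + period < len(prices)]
    changes = prices[entries + period] - prices[entries]
    wins = np.sum(changes > 0)
    losses = np.sum(changes < 0)
    # Unchanged prices count as neither a win nor a loss
    return wins / (wins + losses), changes

# Made-up prices with signals at rows 1, 5 and 9
prices = [100, 101, 103, 99, 105, 104, 108, 107, 110, 109]
signals = [0, 1, 0, 0, 0, 1, 0, 0, 0, 1]

quality, changes = fixed_horizon_quality(prices, signals, period=2)
print(changes.tolist())   # [-2.0, 3.0]
print(quality)            # 0.5
```

The signal at row 1 is followed by a drop from 101 to 99, the signal at row 5 by a rise from 104 to 107, and the signal at row 9 is ignored because there is no price two rows later. One win out of two evaluated signals gives a quality of 50%.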
A signal quality of 71.13% means that out of 100 trades, we tend to see a higher price 21 periods after the signal in about 71 cases. The signal frequency may need to be addressed, as there have not been many signals since 2011 (~97 trades). Of course, perfection is the rarest word in finance, and this technique can sometimes give false signals, as any strategy can.