How to Perform Crucial Statistical Analysis on Stock Data
Using Python to Perform Statistical Analysis
Statistical analysis plays a crucial role in market research because it lets us examine current trends and characteristics from a different angle than the techniques used in fundamental and technical analysis. This article presents a few important concepts in descriptive statistics and shows how to statistically analyze stocks using financial metrics acquired from FMP.
Quick Introduction to Statistical Concepts
Statistical analysis is like the backbone of making sense out of data in various fields, including finance. Imagine you have a bunch of numbers or information, and you want to draw meaningful conclusions from them. That’s where statistical analysis comes into play. It is composed of a few important metrics such as:
Mean: It’s the average of a set of numbers. You add up all the values and then divide by the total number of values.
Median: This is the middle value when your data is arranged in numerical order. If there’s an even number of values, it’s the average of the two middle ones.
Mode: The mode is the value that appears most frequently in your dataset. It’s the one that shows up the most.
Variance: It measures how much each number in a dataset differs from the mean. It’s the average of the squared differences from the mean.
Standard deviation: This is the square root of the variance. It gives you a sense of how spread out the values in your dataset are. A smaller standard deviation means the values are closer to the mean.
Range: It’s the difference between the highest and lowest values in your dataset. It tells you how much your data spans from end to end.
These stats help you understand the story behind your numbers, whether you’re dealing with financial data or anything else. They’re like the basic tools in your statistical toolbox.
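As a rough illustration of the definitions above, here is a minimal sketch using Python's built-in `statistics` module. The sample values are hypothetical and were picked only so that the arithmetic is easy to check by hand. The population versions of variance and standard deviation are used because they match the "average of the squared differences" definition given above.

```python
import statistics

# Hypothetical sample values, chosen only to keep the arithmetic easy to follow
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(values)            # 40 / 8 = 5.0
median = statistics.median(values)        # average of the two middle values: 4.5
mode = statistics.mode(values)            # 4.0 appears three times
variance = statistics.pvariance(values)   # average squared distance from the mean: 4.0
std_dev = statistics.pstdev(values)       # square root of the variance: 2.0
value_range = max(values) - min(values)   # 9.0 - 2.0 = 7.0

print(mean, median, mode, variance, std_dev, value_range)
```

Keep in mind that pandas reports the sample standard deviation (dividing by n - 1) in `describe()`, so its `std` will differ slightly from `pstdev` on the same data.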
Think of statistical analysis as the toolkit that helps you make sense of a pile of numbers and draw meaningful conclusions. It’s like the Sherlock Holmes of data, figuring out the story that the numbers are trying to tell.
You can also check out my other newsletter The Weekly Market Sentiment Report that sends tactical directional views every weekend to highlight the important trading opportunities using a mix between sentiment analysis (COT reports, Put-Call ratio, Gamma exposure index, etc.) and technical analysis.
Quick Introduction to FMP
The Financial Modeling Prep (FMP) API provides real-time stock prices, company financial statements, major index prices, historical stock data, real-time forex rates, and cryptocurrency data.
The FMP stock price API is real-time, and company reports are available in quarterly or annual format, going back up to 30 years. This article shows one way to use the API to import key financial metrics and evaluate them using theoretical financial analysis.
Importing the Quarterly Financial Metrics in Python
The aim of this article is to show how to use a few financial metrics in a statistical way. This means that we will take the historical values of these metrics and interpret their current value relative to the past. By doing so, we may have some clues as to what lies next. For this example, let’s use the price-to-earnings ratio.
The price-to-earnings (P/E) ratio represents the price of a company’s stock relative to its earnings per share (EPS). The formula for calculating the P/E ratio is:
P/E ratio = Price per share / Earnings per share (EPS)
A high P/E ratio may indicate that the stock is overvalued, while a low P/E ratio may suggest undervaluation. However, it’s essential to consider industry benchmarks and the company’s growth prospects when interpreting this ratio.
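To make the definition concrete, here is a small sketch of the calculation. The function name `pe_ratio` and the figures are hypothetical and used for illustration only. Returning `None` when EPS is zero is a choice made for this sketch, since the ratio is undefined in that case.

```python
def pe_ratio(price_per_share, earnings_per_share):
    """Return price / EPS, or None when EPS is zero and the ratio is undefined."""
    if earnings_per_share == 0:
        return None
    return price_per_share / earnings_per_share

# Hypothetical figures for illustration only
print(pe_ratio(150.0, 6.0))    # 25.0
print(pe_ratio(150.0, -0.5))   # -300.0, negative earnings give a negative ratio
```

The second call shows how negative P/E values can arise: when a company reports a loss, EPS is negative and so is the ratio.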
Let’s try importing Apple’s quarterly P/E ratio over the years using this code:
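import json
import ssl
from urllib.request import urlopen

import certifi
import matplotlib.pyplot as plt
import pandas as pd

def get_jsonparsed_data(url):
    """Receive the content of ``url``, parse it as JSON and return the object."""
    # Recent Python versions no longer accept urlopen(..., cafile=...),
    # so the certifi certificate bundle is passed through an SSL context
    context = ssl.create_default_context(cafile=certifi.where())
    with urlopen(url, context=context) as response:
        data = response.read().decode("utf-8")
    return json.loads(data)

url_1 = ("https://financialmodelingprep.com/api/v3/ratios/AAPL?period=quarter&apikey=YOUR_API_KEY_HERE")
ratios = get_jsonparsed_data(url_1)

# Collect the P/E ratio from every quarterly record that contains it
per = []
for dictionary in ratios:
    if 'priceEarningsRatio' in dictionary:
        per.append(dictionary['priceEarningsRatio'])

# Reverse the list so that the values run from oldest to newest
per = per[::-1]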
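If you want to see what the extraction loop does before getting an API key, the following sketch runs the same logic on a hand-made list. The records are made up and only mimic the shape the loop expects. The `date` field and the helper name `extract_metric` are assumptions for illustration, not part of the article's original code.

```python
# Made-up records that mimic the shape the loop expects; not real FMP data
sample_ratios = [
    {"date": "2023-09-30", "priceEarningsRatio": 29.0},
    {"date": "2023-06-30", "priceEarningsRatio": 31.5},
    {"date": "2023-03-31"},  # a record without the key is skipped
    {"date": "2022-12-31", "priceEarningsRatio": 24.2},
]

def extract_metric(records, key="priceEarningsRatio"):
    """Collect `key` from each record that has it, then reverse the order."""
    values = [record[key] for record in records if key in record]
    return values[::-1]

sample_per = extract_metric(sample_ratios)
print(sample_per)  # [24.2, 31.5, 29.0]
```

Passing a different `key` would let the same helper pull out another ratio from the same records.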
Now, let’s analyze it using one simple pandas line:
per = pd.Series(per[17:]) # to remove the first zero values
print(per.describe())
The describe function returns several statistical metrics as shown in the output below:
count 136.000000 # The number of P/E values in the series
mean 19.227536 # The mean of the P/E values
std 32.679236 # The standard deviation (volatility)
min -210.346362 # The smallest value
25% 10.864534 # The 25th percentile
50% 17.129366 # The 50th percentile (also called the median)
75% 25.943987 # The 75th percentile
max 253.139566 # The largest value
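One caveat about the slice `per[17:]`: it hard-codes how many leading zeros the series contained when this article was written. A possible alternative, sketched below, drops leading zeros however many there are. It assumes the unwanted entries are exactly zero, and the helper name `drop_leading_zeros` and the sample list are hypothetical.

```python
from itertools import dropwhile

def drop_leading_zeros(values):
    """Remove zeros at the start of the list, keeping any zeros that appear later."""
    return list(dropwhile(lambda v: v == 0, values))

# Hypothetical series: early quarters with no reported ratio, then real values
raw = [0, 0, 0, 12.5, 0, 18.1, 22.4]
print(drop_leading_zeros(raw))  # [12.5, 0, 18.1, 22.4]
```

With this helper, the pandas line could become `pd.Series(drop_leading_zeros(per))`.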
You have to create an account and get your API key and replace it in the appropriate place in the previous code. I highly recommend this as it takes no longer than a few minutes.
If we calculate the range of the values (the maximum minus the minimum), we get a whopping 463. This suggests that the series is very volatile. The price-to-earnings (P/E) ratio can be influenced by various factors, and certain conditions can contribute to its volatility. Here are some factors that can make the P/E ratio highly volatile:
Earnings volatility: The “E” in P/E stands for earnings. If a company’s earnings fluctuate significantly over time, it can cause the P/E ratio to be more volatile.
Market sentiment: Investor sentiment plays a crucial role in determining stock prices and, consequently, the P/E ratio. If there is heightened market uncertainty, fear, or exuberance, it can lead to more significant price swings, influencing the P/E ratio.
Economic conditions: Economic fluctuations and uncertainties can impact a company’s earnings and, subsequently, its P/E ratio. During economic downturns, companies may experience lower earnings, which can push P/E ratios higher if prices do not fall as quickly. Conversely, rising earnings during economic booms may result in lower P/E ratios.
With such outliers, measures such as the mean are less reliable, and the median is a better choice. The median can be calculated as follows:
per.median()
# Output : 17.12
The median of Apple’s P/E ratio is therefore 17.12, which is not very far from the mean.
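To see why the median holds up better than the mean when outliers are present, consider this small sketch. The numbers are hypothetical P/E-like values, not Apple's actual data.

```python
import statistics

# Hypothetical P/E-like values without and with one extreme quarter
calm = [15.0, 16.0, 17.0, 18.0, 19.0]
with_outlier = calm + [250.0]

calm_mean = statistics.mean(calm)
calm_median = statistics.median(calm)
outlier_mean = round(statistics.mean(with_outlier), 2)
outlier_median = statistics.median(with_outlier)

print(calm_mean, calm_median)        # 17.0 17.0
print(outlier_mean, outlier_median)  # 55.83 17.5
```

A single extreme value more than triples the mean, while the median barely moves.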
Returning to the mean and looking at it from a volatility perspective, the standard deviation is about 32.7, which is larger than the mean itself and makes the P/E ratio a very volatile metric.
One way to tell whether the P/E ratio is extremely high is to compare the current value with the mean plus one standard deviation. The current value is 29, which is well within the normal range (19.22 + 32.67 = 51.89).
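The same comparison can be sketched in a few lines, using the figures from the describe output shown earlier and the current value of 29 quoted in the text. The z-score is an extra step not used in the article: it expresses the distance from the mean in standard deviations. With the unrounded figures, the upper band comes out at about 51.91.

```python
# Figures copied from the describe() output shown earlier in the article
mean, std = 19.227536, 32.679236
minimum, maximum = -210.346362, 253.139566
current = 29.0  # current P/E quoted in the text

value_range = maximum - minimum     # about 463.49
upper_band = mean + std             # about 51.91
z_score = (current - mean) / std    # about 0.30 standard deviations above the mean

is_extreme = current > upper_band
print(round(value_range, 2), round(upper_band, 2), round(z_score, 2), is_extreme)
# 463.49 51.91 0.3 False
```

A z-score of roughly 0.3 means the current value sits well inside one standard deviation of the historical mean.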
Admittedly, the P/E ratio is neither the most stable nor the best financial metric for analyzing a stock, which is why we need to analyze other ratios to get a glimpse of the company’s evolution. Other metrics you can analyze include:
Price/Earnings to Growth Ratio
Return on Equity
Before we end, let’s discuss some common pitfalls in statistical analysis which may inhibit the reliability of your results. Here are some things to steer clear of:
Be aware of the assumptions underlying the statistical tests you use. Violating these assumptions can lead to inaccurate results.
Avoid selectively choosing data that supports your hypothesis while neglecting contradictory information. This confirmation bias can lead to skewed conclusions.
Outliers can significantly impact statistical results. Don’t ignore them; instead, investigate their cause and decide whether to include or exclude them based on a valid reason.
If you explore your data extensively before formal analysis, you might inadvertently find patterns that appear significant but are due to chance. To mitigate this, split your data into training and testing sets.
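For time-ordered data such as quarterly ratios, one simple way to apply the last point is to split chronologically, so that patterns found in the earlier part are checked against the later part. The sketch below is one possible approach under that assumption. The helper name `chronological_split`, the 80/20 proportion, and the sample values are all hypothetical.

```python
def chronological_split(values, train_fraction=0.8):
    """Split an ordered series into an earlier training part and a later test part."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(values) * train_fraction)
    return values[:cut], values[cut:]

# Hypothetical quarterly values in chronological order
series = [10, 12, 11, 15, 14, 18, 17, 21, 20, 24]
train, test = chronological_split(series)
print(len(train), len(test))  # 8 2
```

Keeping the order intact avoids testing on data that is older than the data used to form the hypothesis.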
You can also check out my other newsletter The Weekly Market Analysis Report that sends tactical directional views every weekend to highlight the important trading opportunities using technical analysis that stem from modern indicators. The newsletter is free.