Exploring Neural Network Architectures for Time Series Prediction
Let's Build a Simple Neural Network Model for Time Series Prediction
Whatever happened to plain neural networks? With all the attention on more exotic architectures, the humble multilayer perceptron seems almost forgotten. Let's revisit it, build a simple forecasting model, and discuss its particularities.
Neural Networks in a Nutshell
A neural network is a stack of layers, each made up of nodes (neurons) that perform computations. Every neuron receives inputs, applies a weight to each, adds a bias, and passes the result through an activation function. In mathematical terms:

y = \sigma\left(\sum_{i} w_i x_i + b\right)

Where:
x_i are the input values (e.g., past time steps).
w_i are the weights.
b is the bias.
σ is an activation function (like ReLU or tanh).
y is the neuron's output.
These outputs then serve as inputs to the next layer, propagating forward through the network.
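To make this concrete, here is a minimal NumPy sketch of a single neuron's forward pass (the inputs, weights, and bias below are illustrative values, not parameters from any trained model):

import numpy as np

def relu(z):
    # ReLU activation: max(0, z), applied element-wise
    return np.maximum(0, z)

# Illustrative inputs (e.g., three past time steps) and parameters
x = np.array([0.5, -0.2, 0.1])   # input values
w = np.array([0.8, -0.5, 0.3])   # weights
b = 0.05                         # bias

# y = σ(w · x + b)
y = relu(np.dot(w, x) + b)
print(y)  # ≈ 0.58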
In time series prediction, the input to the network is usually a fixed-size window of past time steps (e.g., last 10 days of stock returns), and the output is a forecasted value (e.g., tomorrow’s return).
We reformulate the sequential nature of time series into a supervised learning setup. If we have a sequence like:

x_1, x_2, ..., x_T

We can train a neural network using sliding windows of size k, where each training example is:

Input: [x_{t-k}, ..., x_{t-1}]
Target: x_t
This approach allows us to use standard feedforward neural networks. However, feedforward models don't preserve temporal order or internal memory, which are critical for sequences.
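To illustrate the windowing on a toy sequence (the numbers and the window size k = 3 are arbitrary, chosen only for demonstration):

import numpy as np

sequence = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
k = 3

# Slide a window of size k over the sequence to build (input, target) pairs
x, y = [], []
for t in range(len(sequence) - k):
    x.append(sequence[t:t + k])   # the k past values as features
    y.append(sequence[t + k])     # the next value as the target
print(np.array(x))   # [[1. 2. 3.] [2. 3. 4.] [3. 4. 5.]]
print(np.array(y))   # [4. 5. 6.]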
Training neural networks on time series data comes with practical challenges:
Stationarity: Neural networks aren’t inherently designed to handle trends or seasonality, so preprocessing (differencing, detrending, normalization) is often necessary, as sketched after this list.
Lag selection: Choosing the window size (how many past steps to consider) impacts performance. Too short, and you lose context. Too long, and you add noise.
Multivariate inputs: If external features (weather, events, etc.) affect the series, they can be included as additional input channels.
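As an example of such preprocessing, here is a minimal sketch assuming a pandas Series of prices (the values are made up; first differencing removes the trend, then a z-score puts the changes on a comparable scale):

import pandas as pd

# Illustrative price series (in practice, real market data)
prices = pd.Series([100.0, 101.5, 101.0, 102.3, 103.1, 102.8])

# First differencing removes the trend component
changes = prices.diff().dropna()

# Z-score normalization standardizes the differenced series
normalized = (changes - changes.mean()) / changes.std()
print(normalized)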
Do you want to master Deep Learning techniques tailored for time series, trading, and market analysis🔥? My book breaks it all down from basic machine learning to complex multi-period LSTM forecasting while going through concepts such as fractional differentiation and forecasting thresholds. Get your copy here 📖!
Creating a Time Series Forecasting Model Using Neural Networks
First of all, we need to make sure that the data is stationary. Why? Because many time series forecasting techniques assume that the underlying data-generating process is stationary, meaning that the statistical properties of the series remain constant over time. The plan of attack is as follows:
Download the historical data of the S&P 500 index.
Difference the prices from the previous step to obtain a stationary time series.
Use 10 lagged values as features (independent variables) on the first 95% of the data.
Select the hyperparameters of the model. In our case, we will use one hidden layer of 100 neurons, the ReLU activation function, and the Adam optimizer.
Fit the model and predict on the remaining 5%. The predictions are one-step-ahead (t+1): at every time step, the model forecasts the next change (up or down).
Evaluate the performance using the RMSE and the hit ratio.
Use the following code to implement the experiment.
import math

import numpy as np
import matplotlib.pyplot as plt
import pandas_datareader as pdr
from sklearn.metrics import mean_squared_error
from sklearn.neural_network import MLPRegressor

def data_preprocessing(data, num_lags, train_test_split):
    # Prepare the data for training: each example uses num_lags past
    # values as features and the next value as the target
    x = []
    y = []
    for i in range(len(data) - num_lags):
        x.append(data[i:i + num_lags])
        y.append(data[i + num_lags])
    # Convert the lists to numpy arrays
    x = np.array(x)
    y = np.array(y)
    # Split the data into training and testing sets
    split_index = int(train_test_split * len(x))
    x_train = x[:split_index]
    y_train = y[:split_index]
    x_test = x[split_index:]
    y_test = y[split_index:]
    return x_train, y_train, x_test, y_test

# Set the dates
start_date = '1970-01-01'
end_date = '2025-05-01'

# Download the S&P 500 index from FRED and difference it
# to obtain a stationary series of changes
data = (pdr.get_data_fred('SP500', start = start_date, end = end_date)).dropna()
data = data.diff().dropna()
data = np.array(data)
data = np.reshape(data, (-1))

# Use 10 lagged values as features and the first 95% for training
x_train, y_train, x_test, y_test = data_preprocessing(data, 10, 0.95)

# Create the model: one hidden layer of 100 neurons
model = MLPRegressor(hidden_layer_sizes=(100,), activation='relu', solver='adam')

# Fit the model to the training data
model.fit(x_train, y_train)

# Predict on the unseen test set (one-step-ahead forecasts)
y_pred = model.predict(x_test)

# Plot the predicted values against the true test values
plt.plot(y_pred[-100:], label = 'Predicted Data', linestyle = '--', marker = 'o')
plt.plot(y_test[-100:], label = 'True Data', marker = 'o', alpha = 0.7)
plt.legend()
plt.grid()
plt.axhline(y = 0, color = 'black', linestyle = '--')
plt.show()

# Evaluate the model using the RMSE and the hit ratio
rmse_test = math.sqrt(mean_squared_error(y_test, y_pred))
print(f"RMSE of Test: {rmse_test}")
same_sign_count = np.sum(np.sign(y_pred) == np.sign(y_test)) / len(y_test) * 100
print('Hit Ratio = ', same_sign_count, '%')
The following chart shows the predicted values in blue and the test values (the remaining 5% that the model never saw during training) in orange.
Before we continue, let’s discuss the hyperparameters of the MLP function:
hidden_layer_sizes: The ith element represents the number of neurons in the ith hidden layer. Think of it as intermediate layers between input and output that allow the network to learn complex patterns.
activation: The activation function for the hidden layers. It introduces non-linearity to help the network learn intricate relationships.
solver: The solver for weight optimization. The one we chose (adam) refers to a stochastic gradient-based optimizer.
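To experiment beyond our single-layer setup, you can pass one size per hidden layer and swap the activation function. The layer sizes and the tanh choice below are arbitrary illustrations, not tuned values:

from sklearn.neural_network import MLPRegressor

# Two hidden layers: 64 neurons, then 32 (illustrative sizes, not tuned)
deeper_model = MLPRegressor(hidden_layer_sizes=(64, 32),
                            activation='tanh',  # tanh instead of ReLU
                            solver='adam')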
The following is the result of the experiment.
RMSE of Test: 84.76767358057366
Hit Ratio = 54.40 %
These results mean little on their own: this was a very simple experiment with neither hyperparameter optimization nor proper validation. Its purpose is simply to refresh your knowledge of MLPs and to show that they constitute a solid machine learning (or deep learning) baseline. A sketch of one validation technique you could add is shown below.
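As a pointer for going further, here is a minimal sketch of one possible validation scheme, scikit-learn's TimeSeriesSplit, which respects temporal order when creating folds (the fold count is arbitrary, and x_train/y_train come from the script above):

import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

tscv = TimeSeriesSplit(n_splits=5)  # 5 folds, chosen arbitrarily
for fold, (train_idx, val_idx) in enumerate(tscv.split(x_train)):
    # Train on earlier data, validate on the fold that follows it
    model = MLPRegressor(hidden_layer_sizes=(100,), activation='relu')
    model.fit(x_train[train_idx], y_train[train_idx])
    mse = mean_squared_error(y_train[val_idx], model.predict(x_train[val_idx]))
    print(f"Fold {fold}: validation RMSE = {np.sqrt(mse):.2f}")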
Every week, I analyze positioning, sentiment, and market structure. Curious what hedge funds, retail, and smart money are doing each week? Then join hundreds of readers here in the Weekly Market Sentiment Report 📜 and stay ahead of the game through chart forecasts, sentiment analysis, volatility diagnosis, and seasonality charts.
Free trial available🆓