Using TensorFlow for Time Series Forecasting
Time series forecasting is an important task in many industries, from finance to weather prediction. It involves predicting future values in a sequence of data based on patterns learned from historical data. What makes this kind of problem distinctive is that the order of data points matters, a challenge that standard machine learning algorithms, which typically assume independent samples, are not well suited to handle.
In this guide, we’ll explore how to use TensorFlow to build models for time series forecasting, focusing on key architectures such as Recurrent Neural Networks (RNNs) and Long Short-Term Memory networks (LSTMs). These networks are designed to handle sequential data effectively, making them ideal for time series tasks.
What is Time Series Forecasting?
Time series data consists of observations recorded at specific intervals, such as daily stock prices, monthly sales figures, or hourly weather readings. The goal of time series forecasting is to predict future values based on the trends, seasonality, and patterns present in the data.
For example, you might want to predict the next day’s stock price based on the prices from the previous days or forecast electricity demand based on past hourly consumption.
Key Characteristics of Time Series Data:
- Temporal Dependency: Data points are not independent, meaning that past values influence future values.
- Stationarity: A time series is considered stationary if its statistical properties (mean, variance) do not change over time. Many forecasting methods require stationary data; for non-stationary data, techniques such as differencing can be used to make it stationary (see the short sketch after this list).
- Seasonality and Trend: Many time series exhibit seasonal patterns or trends over time (e.g., electricity usage spikes during the summer).
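As a quick, hedged sketch of differencing using NumPy (the variable names here are illustrative, not part of the tutorial code):

import numpy as np

# A toy series with a linear trend: its mean changes over time, so it is not stationary
trend_series = np.arange(100, dtype=float) + 0.5 * np.random.randn(100)

# First-order differencing: each value minus the previous one (length shrinks by 1)
differenced = np.diff(trend_series)

# The differenced series fluctuates around a constant mean (~1.0 here, the trend's slope),
# which is the stationary behavior differencing aims to recover
print(differenced.mean(), differenced.std())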
Why Use RNNs and LSTMs for Time Series?
Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks are specifically designed to handle sequential data. Unlike traditional feedforward neural networks, RNNs and LSTMs have loops that allow them to pass information from one step of the sequence to the next, capturing the temporal dependencies in time series data.
RNNs (Recurrent Neural Networks)
RNNs maintain a “hidden state” that is updated at each time step, allowing information to persist throughout the sequence. The mathematical formulation for the hidden state at time t is:

h_t = f(W_x x_t + W_h h_{t-1} + b)

Where:
- x_t is the input at time t,
- h_t is the hidden state at time t (with h_{t-1} the previous hidden state),
- W_x and W_h are the weight matrices for the input and hidden state, b is a bias term, and
- f is the activation function (typically tanh or ReLU).
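To make the recurrence concrete, here is a minimal NumPy sketch of a single RNN step; the dimensions are arbitrary and chosen purely for illustration.

import numpy as np

input_dim, hidden_dim = 3, 4
x_t = np.random.randn(input_dim)               # input at time t
h_prev = np.zeros(hidden_dim)                  # hidden state at time t-1
W_x = np.random.randn(hidden_dim, input_dim)   # input weight matrix
W_h = np.random.randn(hidden_dim, hidden_dim)  # hidden-state weight matrix
b = np.zeros(hidden_dim)                       # bias term

# One step of the recurrence: h_t = f(W_x x_t + W_h h_{t-1} + b), with f = tanh
h_t = np.tanh(W_x @ x_t + W_h @ h_prev + b)
print(h_t.shape)  # (4,)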
However, RNNs suffer from the vanishing gradient problem when processing long sequences. This happens when the gradients used to update weights during backpropagation become too small, making it difficult for the network to learn long-term dependencies. This is where LSTMs come in.
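A quick numeric intuition for why this happens: backpropagation through a long sequence multiplies many per-step gradient factors together, so factors even slightly below 1 collapse toward zero.

# If each of 100 time steps contributes a gradient factor of ~0.9,
# the end-to-end gradient is vanishingly small
print(0.9 ** 100)  # ≈ 2.7e-05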
LSTMs (Long Short-Term Memory Networks)
LSTMs are a special type of RNN designed to remember information over long time periods. They use gates to control the flow of information, which helps the network retain important information while forgetting irrelevant details. While LSTMs are more resistant to the vanishing gradient problem, they are not entirely immune to it.
The key components of an LSTM cell are:
- Forget Gate: Decides what information to discard from the previous cell state.
- Input Gate: Decides what new information to store in the cell state.
- Cell State Update: Updates the cell state by combining the forget gate and input gate outputs.
- Output Gate: Decides what the next hidden state should be.
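In the standard formulation (using the same notation as the RNN equation above, with σ the sigmoid function and ⊙ elementwise multiplication), these components are:

f_t = σ(W_f x_t + U_f h_{t-1} + b_f)      (forget gate)
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)      (input gate)
c̃_t = tanh(W_c x_t + U_c h_{t-1} + b_c)   (candidate values)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t           (cell state update)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)      (output gate)
h_t = o_t ⊙ tanh(c_t)                     (new hidden state)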
This gating mechanism allows LSTMs to mitigate the vanishing gradient problem and learn dependencies over long sequences, making them well suited to time series forecasting tasks.
Building a Time Series Forecasting Model with TensorFlow
Let’s build a simple time series forecasting model using LSTM in TensorFlow. For this tutorial, we’ll use a synthetic dataset that simulates a simple trend with noise.
Step 1: Import Necessary Libraries
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
Step 2: Create a Synthetic Dataset
For demonstration purposes, we’ll generate a sine wave as our time series data.
# Generate synthetic time series data (sine wave with noise)
def generate_time_series(n, noise_factor=0.1):
    time = np.linspace(0, 50, n)
    series = np.sin(time) + noise_factor * np.random.randn(n)
    return series
# Create dataset
n = 1000
series = generate_time_series(n)
plt.plot(series)
plt.title('Synthetic Time Series Data')
plt.show()
Step 3: Prepare Data for Training
To prepare the data for training an LSTM, we need to create sliding windows of data points. Each window will be used to predict the next time step.
# Create sliding windows of data
def create_dataset(series, window_size):
    X, y = [], []
    for i in range(len(series) - window_size):
        X.append(series[i:i+window_size])
        y.append(series[i+window_size])
    return np.array(X), np.array(y)
window_size = 20
X, y = create_dataset(series, window_size)
# Split into training and test sets
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
# Reshape data for LSTM (samples, time steps, features)
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
Step 4: Build the LSTM Model
We’ll use a simple LSTM model for this task.
# Build the LSTM model
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(50, return_sequences=False, input_shape=(window_size, 1)),
    tf.keras.layers.Dense(1)
])
model.compile(optimizer='adam', loss='mse')
# Train the model
history = model.fit(X_train, y_train, epochs=20, validation_data=(X_test, y_test))
Step 5: Evaluate and Visualize Results
Once the model is trained, we can evaluate its performance on the test set and visualize the predictions.
# Predict and plot results
y_pred = model.predict(X_test)
plt.plot(y_test, label='True values')
plt.plot(y_pred, label='Predicted values')
plt.legend()
plt.title('Time Series Forecasting with LSTM')
plt.show()
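Beyond the visual check, it can help to quantify the test error; a minimal sketch using the model’s own compiled loss:

# Compute test MSE (the compiled loss) and a more interpretable RMSE
test_mse = model.evaluate(X_test, y_test, verbose=0)
print(f'Test MSE: {test_mse:.4f}, Test RMSE: {np.sqrt(test_mse):.4f}')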
Practical Tips for Time Series Forecasting
- Scaling the Data: Time series data often needs to be scaled for better performance in neural networks. Use techniques like MinMaxScaler to scale your data between 0 and 1 (fit the scaler on the training portion only, so information from the test period does not leak into training):
  scaler = MinMaxScaler()
  series_scaled = scaler.fit_transform(series.reshape(-1, 1))
- Tuning the Window Size: Choosing the right window size (the number of time steps used to predict the next step) is crucial. Experiment with different sizes to find the one that works best for your data.
- Handling Seasonality: If your time series exhibits seasonal patterns, consider adding features like time-of-year or using models that incorporate seasonality explicitly, such as SARIMA.
- Use Recurrent Dropout: To prevent overfitting, you can apply dropout within the LSTM cells (recurrent dropout) to randomly drop recurrent connections during training (see the first sketch after this list).
- Cross-Validation in Time Series: Unlike many other machine learning setups, time series data cannot be shuffled randomly. Use time series cross-validation, which trains the model on earlier time segments and tests it on later segments (see the second sketch after this list).
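Here is a sketch of the recurrent dropout tip applied to the model above; the 0.2 rates are illustrative defaults, not tuned values.

# LSTM with dropout on the inputs and on the recurrent connections
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(50, dropout=0.2, recurrent_dropout=0.2,
                         input_shape=(window_size, 1)),
    tf.keras.layers.Dense(1)
])
model.compile(optimizer='adam', loss='mse')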
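And a sketch of time series cross-validation with scikit-learn’s TimeSeriesSplit, which keeps every validation fold strictly after its training fold:

from sklearn.model_selection import TimeSeriesSplit

tscv = TimeSeriesSplit(n_splits=5)
for fold, (train_idx, val_idx) in enumerate(tscv.split(X)):
    # Training indices always precede validation indices, preserving temporal order
    X_tr, X_val = X[train_idx], X[val_idx]
    y_tr, y_val = y[train_idx], y[val_idx]
    print(f'Fold {fold}: train size={len(train_idx)}, val size={len(val_idx)}')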
Conclusion
Time series forecasting is a powerful tool for predicting future values based on historical data. With TensorFlow and architectures like RNNs and LSTMs, you can effectively model temporal dependencies and trends in sequential data. By following the steps outlined in this guide, you can start building your own time series forecasting models and apply them to real-world problems like stock price prediction, weather forecasting, or energy consumption forecasting.
As you gain experience, you can experiment with different model architectures, hyperparameters, and preprocessing techniques to improve your forecasts.