Business
Time Series Forecasting on Sales Data
Apr 21, 2020
May 11, 2025

1. Understand the Basics of Time Series

• A time series is a sequence of data points collected or recorded at successive time intervals.
• Components include:
  • Trend: Long-term movement in data.
  • Seasonality: Repeating short-term cycles.
  • Cyclic patterns: Irregular long-term fluctuations.
  • Noise: Random variation.

1. Trend – The Long-Term Direction of Data

Meaning:
A trend is the general movement of data over a long period. It shows whether the values are increasing, decreasing, or staying constant over time.
Example:
Imagine an online store where monthly sales go from $10,000 to $50,000 over three years. Even if there are some up-and-down fluctuations month to month, the overall direction is upward. That's an upward trend.
Real-life examples:
  • Housing prices increasing steadily over 10 years.
  • A growing number of users on a social media app.

2. Seasonality – Short-Term Repeating Patterns

Meaning:
Seasonality refers to regular, predictable patterns that happen at specific time intervals, like daily, weekly, monthly, or yearly.
Example:
  • Ice cream sales increase every summer and decrease every winter.
  • Online shopping spikes during Black Friday, Christmas, or Chinese New Year every year.
These patterns are consistent and repeat due to weather, holidays, or human behavior.

3. Cyclic Patterns – Irregular Long-Term Fluctuations

Meaning:
A cyclical pattern is a wave-like movement in the data that occurs over the long term, but not at fixed intervals like seasonality. These are usually driven by economic or market conditions.
Example:
  • The economy goes through booms and recessions, but we can't say they happen exactly every 3 years.
  • Stock markets rise and fall in cycles, but the timing isn't predictable.
Difference from Seasonality:
  • Seasonality is fixed (e.g., every December).
  • Cycles are not fixed and often hard to predict.

4. Noise – Random, Unpredictable Fluctuations

Meaning:
Noise refers to random variations in the data that don’t follow any pattern. It's caused by unexpected events, errors, or anomalies.
Example:
  • A restaurant’s sales suddenly drop one day because of road construction nearby.
  • A data entry mistake causes an incorrect sales value to appear.
Noise is what we try to filter out during analysis because it’s unpredictable and not meaningful in the long term.

Summary Table:
| Component | Time Scale | Predictable? | Example |
| --- | --- | --- | --- |
| Trend | Long-term | Yes | Sales growing over 3 years |
| Seasonality | Short-term, fixed cycles | Yes | Higher sales during summer every year |
| Cyclic | Long-term, irregular cycles | Partially | Economic recession and recovery cycles |
| Noise | Any time | No | Sales dip due to a one-day power outage |
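
To make the components concrete, here is a minimal sketch that builds a synthetic monthly series from them. It assumes an additive model (series = trend + seasonality + noise); the slope, amplitude, and noise level are arbitrary illustrative values, not real sales data. Cyclic patterns are left out because they have no fixed period to simulate.

```python
import numpy as np

rng = np.random.default_rng(42)      # fixed seed so the example is reproducible
months = np.arange(36)               # three years of monthly observations

# Trend: sales grow steadily from 10,000 to 50,000 over three years
trend = 10_000 + (40_000 / 35) * months

# Seasonality: a pattern that repeats exactly every 12 months
seasonality = 3_000 * np.sin(2 * np.pi * months / 12)

# Noise: random variation that follows no pattern
noise = rng.normal(loc=0, scale=1_000, size=months.size)

# Additive model (an assumption; multiplicative models are also common)
sales = trend + seasonality + noise

# The seasonal term in year 1 is identical to the one in year 2
print(np.allclose(seasonality[:12], seasonality[12:24]))
```

Forecasting works in the opposite direction: starting from `sales` alone, the goal is to recover the trend and seasonality and to treat the rest as noise.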

2. Prepare the Sales Data

• Ensure your data is clean, with consistent date formatting.
• Remove duplicates and fill or otherwise handle missing values.
• Aggregate to the desired frequency (daily, weekly, monthly).
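
These cleaning steps might look like the following in pandas. This is a sketch on a made-up five-row export (the `raw` frame and its values are hypothetical), and linear interpolation is only one way to fill gaps; forward-filling or treating a gap as zero sales may be more appropriate depending on why the data is missing.

```python
import pandas as pd

# Hypothetical raw export: one duplicated row, one missing day, one missing value
raw = pd.DataFrame({
    "date": ["2019-09-09", "2019-09-09", "2019-09-10", "2019-09-12", "2019-09-13"],
    "sales": [1200.0, 1200.0, 1300.0, None, 1250.0],
})

# Consistent date formatting: parse strings into a real datetime column
raw["date"] = pd.to_datetime(raw["date"])

# Remove exact duplicate rows
clean = raw.drop_duplicates()

# Put the series on a regular daily grid so missing days become explicit NaNs
daily = clean.set_index("date").sort_index().asfreq("D")

# Fill missing values (here: linear interpolation between known neighbours)
daily["sales"] = daily["sales"].interpolate(method="linear")

# Aggregate to a coarser frequency, e.g. weekly totals
weekly = daily["sales"].resample("W").sum()

print(daily)
print(weekly)
```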
 
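| date | sales | year | month | day | weekday | week | is_weekend | is_month_end |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2019-09-09 | 1200 | 2019 | 9 | 9 | 0 | 37 | 0 | 0 |
| 2019-09-10 | 1300 | 2019 | 9 | 10 | 1 | 37 | 0 | 0 |
| 2019-09-11 | 1100 | 2019 | 9 | 11 | 2 | 37 | 0 | 0 |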
 
| Column | Purpose | Use in Analysis |
| --- | --- | --- |
| date | Main time index | Identify trends, seasonality, and forecast sales. |
| sales | Sales value for the specific day | Analyze trends, seasonality, and forecast sales. |
| year | Year part of the date | Compare sales year-over-year. |
| month | Month part of the date | Analyze monthly seasonality and trends. |
| day | Day of the month | Identify patterns for specific days of the month. |
| weekday | Day of the week (0=Monday, 6=Sunday) | Identify weekday vs weekend sales patterns. |
| week | Week number of the year | Track trends or seasonality on a weekly basis. |
| is_weekend | Indicator for weekends (1=Yes, 0=No) | Analyze if weekends have higher sales. |
| is_month_end | Indicator for month-end (1=Yes, 0=No) | Examine month-end effects (e.g., promotions, last-minute sales). |
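
The calendar columns described above can be derived from the `date` column with the pandas `.dt` accessor. The sketch below rebuilds the three sample rows; it assumes ISO week numbering for `week`, which matches the value 37 in the sample, and needs pandas 1.1 or later for `isocalendar()`.

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.to_datetime(["2019-09-09", "2019-09-10", "2019-09-11"]),
    "sales": [1200, 1300, 1100],
})

d = df["date"].dt
df["year"] = d.year
df["month"] = d.month
df["day"] = d.day
df["weekday"] = d.dayofweek                          # 0=Monday ... 6=Sunday
df["week"] = d.isocalendar().week.astype(int)        # ISO week number (assumption)
df["is_weekend"] = (d.dayofweek >= 5).astype(int)    # Saturday or Sunday
df["is_month_end"] = d.is_month_end.astype(int)      # last calendar day of the month

print(df)
```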

3. Visualize the Data

• Plot line graphs to observe trends and seasonality.
• Use moving averages or decomposition to separate components.
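
As a rough illustration of separating components, the sketch below applies a 12-month moving average to a synthetic monthly series and then estimates the seasonal part from what is left. It is a simplified version of classical additive decomposition written with pandas only; the series, window size, and variable names are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Synthetic monthly sales (illustrative only): linear trend + yearly pattern + noise
rng = np.random.default_rng(0)
idx = pd.date_range("2014-01-01", periods=48, freq="MS")
t = np.arange(48)
sales = pd.Series(
    5_000 + 100 * t + 1_500 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 200, 48),
    index=idx,
    name="sales",
)

# A 12-month moving average smooths out the yearly pattern and leaves the trend.
# The first and last few months are NaN because the window is incomplete there.
trend = sales.rolling(window=12, center=True).mean()

# Remove the trend, then average each calendar month to estimate seasonality
detrended = sales - trend
seasonal = detrended.groupby(detrended.index.month).transform("mean")

# Whatever remains is treated as noise
residual = sales - trend - seasonal

print(pd.DataFrame({"trend": trend, "seasonal": seasonal, "residual": residual}).dropna().head())
```

Each of these series can be drawn with its `.plot()` method if matplotlib is installed. For real work, `seasonal_decompose` in `statsmodels.tsa.seasonal` offers a more complete implementation of the same idea.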
 
 

4. Stationarize the Series

• Classical statistical models such as ARIMA assume a stationary time series (constant mean and variance over time).
• Use differencing or a log transformation to make the series stationary.
• Check stationarity with the ADF (Augmented Dickey-Fuller) test.
Differencing is a technique used to make a time series stationary by removing its trend component. It works by computing the difference between each data point and the previous one, which eliminates long-term trends such as consistent increases or decreases in the data.
Log transformation is used when your data shows changing variance, for example when fluctuations get bigger as values increase. Taking the logarithm compresses the scale of large values and stabilizes the variance; it requires strictly positive values.
 

 
| Model | Requires Stationary Data? | Reason | How to Make Stationary (if needed) |
| --- | --- | --- | --- |
| ARIMA / SARIMA | Yes | Assumes constant mean and variance, and requires stationary data for proper modeling. | Differencing, Log Transformation |
| LSTM / GRU (Deep Learning) | No | Can capture trends and seasonality directly from the data, even if it’s non-stationary. | Optional: Stationary data can help improve performance. |
| XGBoost / LightGBM (Tree Models) | No | Can handle non-stationary data as long as proper time-based features are included. | Feature engineering (time-based features like month, week). |
| Prophet | No | Handles trend and seasonality internally; no need to make the data stationary. | Optional: Stationary data can be used for improved results. |

5. Choose a Forecasting Model

Common models include:
• ARIMA (AutoRegressive Integrated Moving Average)
• SARIMA (Seasonal ARIMA)
• Prophet (Facebook’s model for trend/seasonality)
• LSTM (for deep learning-based forecasting)
• XGBoost or LightGBM with time features
Sales Forecast for Next 12 Months:
|    | ds | yhat | yhat_lower | yhat_upper |
| --- | --- | --- | --- | --- |
| 48 | 2017-12-31 | 5140.288248 | 2657.765329 | 7442.897327 |
| 49 | 2018-01-31 | 4183.106428 | 1756.435972 | 6345.867659 |
| 50 | 2018-02-28 | 7548.698781 | 5307.431892 | 9923.442633 |
| 51 | 2018-03-31 | 3028.596319 | 670.286161 | 5353.491328 |
| 52 | 2018-04-30 | 5681.844430 | 3419.796569 | 7947.527499 |
| 53 | 2018-05-31 | 9623.025053 | 7407.128556 | 12210.501998 |
| 54 | 2018-06-30 | 12114.798430 | 9959.973203 | 14496.319345 |
| 55 | 2018-07-31 | -347.569620 | -2664.352405 | 1832.156175 |
| 56 | 2018-08-31 | 12416.723603 | 9963.292354 | 14761.788088 |
| 57 | 2018-09-30 | 11983.319951 | 9783.522730 | 14524.616427 |
| 58 | 2018-10-31 | 12056.335799 | 9619.594019 | 14287.291458 |
| 59 | 2018-11-30 | 13705.830357 | 11469.002839 | 16079.450986 |
 

6. Train and Evaluate the Model

• Split the data chronologically into training and test sets (e.g., last 3 months for testing).
• Use RMSE, MAE, and MAPE for evaluation.
• Tune hyperparameters based on validation performance.
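
A chronological split and the three metrics can be sketched as follows. The sales values are made up, and the "model" is a naive baseline that repeats the last training value, used here only so that there is something to score; `evaluate` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

def evaluate(actual, predicted):
    """Compute RMSE, MAE, and MAPE (in percent) for a forecast."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = actual - predicted
    return {
        "RMSE": float(np.sqrt(np.mean(errors ** 2))),
        "MAE": float(np.mean(np.abs(errors))),
        # MAPE is undefined when any actual value is zero
        "MAPE": float(np.mean(np.abs(errors / actual)) * 100),
    }

# Hypothetical monthly sales for one year
idx = pd.date_range("2019-01-01", periods=12, freq="MS")
sales = pd.Series(
    [100, 110, 120, 130, 140, 150, 160, 170, 180, 200, 210, 220],
    index=idx,
    dtype=float,
)

# Chronological split: never shuffle a time series; hold out the last 3 months
train, test = sales.iloc[:-3], sales.iloc[-3:]

# Naive baseline: repeat the last observed training value
predicted = np.full(len(test), train.iloc[-1])

metrics = evaluate(test, predicted)
print(metrics)
```

A baseline like this is a useful reference point: a forecasting model is only worth its added complexity if it beats it on the held-out months.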

7. Make Predictions

• Forecast future sales.
• Plot predicted vs. actual values to evaluate the forecast.

8. Interpret and Apply

• Use forecasts to make decisions (inventory, marketing).
• Consider external factors (promotions, holidays) to improve accuracy.
 
Prophet Model Evaluation Metrics:
• RMSE: $1392.54
• MAE: $1259.31
• MAPE: 16.71%