A comprehensive time series analysis and forecasting project using classical time series methods on airline passenger data (1949-1960).
Dataset: Airline Passengers (International Airline Passengers, 1949-1960)
Time Period: 12 years (144 monthly observations)
Forecast Horizon: 24 months (1961-1962)
Best Model: SARIMA(0,1,1)(0,1,1,12) - "Airline Model"
Model Accuracy: 0.48% MAPE on test set
Project Structure
time-series-analysis/
├── data/                               # Data files
│   ├── airline_passengers.csv          # Original dataset
│   ├── full_processed_data.csv         # Fully processed data
│   ├── train_data.csv                  # Training data (80%)
│   ├── test_data.csv                   # Testing data (20%)
│   ├── decomposition_results.csv       # Trend/seasonal/residual
│   ├── seasonal_pattern.csv            # Monthly seasonal effects
│   ├── model_comparison.csv            # Model performance comparison
│   ├── best_model_forecast.csv         # Test set forecasts
│   ├── best_model_summary.txt          # Model statistics
│   ├── future_forecasts.csv            # 24-month predictions
│   └── forecast_summary.csv            # Forecast statistics
├── notebooks/                          # Analysis scripts
│   ├── 01_exploration.py               # Data exploration
│   ├── 02_preprocessing.py             # Data preprocessing
│   ├── 03_decomposition.py             # Time series decomposition
│   ├── 04_model_selection.py           # ARIMA/SARIMA model selection
│   ├── 05_forecasting.py               # Future predictions
│   └── 06_executive_summary.py         # Executive summary
├── reports/                            # Analysis reports
│   ├── images/                         # Visualizations (12 images)
│   ├── data_exploration_report.txt     # Initial data analysis
│   ├── preprocessing_report.txt        # Data preparation summary
│   ├── decomposition_report.txt        # Time series decomposition
│   ├── model_selection_report.txt      # Model evaluation
│   ├── forecasting_report.txt          # Future predictions
│   └── executive_summary_report.txt    # Complete executive summary
├── requirements.txt                    # Python dependencies
├── LICENSE                             # MIT License
└── README.md                           # This file
# Clone repository
git clone https://github.com/Awande07/time-series-analysis.git
cd time-series-analysis
# Install dependencies
pip install -r requirements.txt
# Run analysis (in order)
python notebooks/01_exploration.py
python notebooks/02_preprocessing.py
python notebooks/03_decomposition.py
python notebooks/04_model_selection.py
python notebooks/05_forecasting.py
python notebooks/06_executive_summary.py
Key Results
Model Performance
Best Model: SARIMA(0,1,1)(0,1,1,12)
Test Accuracy: 0.48% MAPE (Mean Absolute Percentage Error)
AIC Score: -323.7
Residuals: Stationary with mean close to zero
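For readers unfamiliar with the metric, MAPE can be sketched in a few lines. This is a minimal illustration with made-up numbers, not the project's evaluation code; the function name `mape` and the sample values are assumptions introduced here.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Absolute error of each point, relative to the actual value.
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Made-up numbers for illustration only (not the project's test set).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
example_mape = mape(actual, predicted)
print(f"MAPE: {example_mape:.2f}%")  # prints MAPE: 5.00%
```

When a model is fitted to log-transformed data, forecasts should be transformed back to passenger counts before computing MAPE, since a percentage error on the log scale is not comparable to one on the original scale.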
Forecast Results (1961-1962)
Average Forecast: 547,000 passengers/month
Peak Forecast: 739,000 passengers (July 1962)
Trough Forecast: 426,000 passengers (February 1961)
Total Growth: 16.6% over 2 years
Monthly Growth Rate: 0.6%
Uncertainty: ±16.8% (95% confidence interval)
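The total and monthly growth figures above can be related with a short compounding calculation. The sketch below assumes growth compounds monthly across all 24 forecast steps; how the project scripts actually derive the monthly rate is not stated here, so treat this as a consistency check rather than the original computation.

```python
total_growth = 0.166   # 16.6% total growth reported over the forecast horizon
months = 24            # forecast horizon in months

# Assumption: growth compounds monthly over all 24 steps.
monthly_rate = (1 + total_growth) ** (1 / months) - 1
print(f"Implied monthly growth: {monthly_rate:.2%}")

# Compounding the monthly rate over the horizon recovers the total.
recovered_total = (1 + monthly_rate) ** months - 1
print(f"Recovered total growth: {recovered_total:.1%}")
```

Under that assumption the implied monthly rate is roughly 0.6%, in line with the figure reported above.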
Seasonal Insights
Peak Month: July (+63.83 thousand passengers, additive seasonal effect)
Trough Month: November (-53.59 thousand passengers, additive seasonal effect)
Seasonal Amplitude: about 117 thousand passengers (peak to trough)
Pattern: Strong 12-month cycle with summer peaks
Methodology
Data Exploration & Preprocessing
Loaded and explored 144 monthly observations
Applied transformations (log, differencing)
Tested for stationarity (ADF test)
Created train/test split (80%/20%)
Time Series Decomposition
Applied additive and multiplicative decomposition
Identified trend, seasonal, and residual components
Multiplicative model preferred (335,541x better variance ratio)
Model Selection
Tested multiple ARIMA/SARIMA models
Selected based on AIC/BIC criteria
Validated with residual diagnostics
Achieved 0.48% MAPE on test set
Forecasting
Generated 24-month forecasts
Calculated confidence intervals
Analyzed uncertainty and growth patterns
Created business-ready insights
Visualizations
The project includes 12 comprehensive visualizations:
01_data_exploration.png - Initial data exploration
02_seasonality_analysis.png - Seasonal patterns
03_transformations.png - Time series transformations
04_acf_pacf.png - Autocorrelation analysis
05_train_test_split.png - Train/test split
06_decomposition.png - Time series decomposition
07_multiplicative_decomposition.png - Multiplicative model
08_model_performance.png - Model performance
09_residuals_analysis.png - Residuals analysis
10_forecast_analysis.png - Forecast analysis
11_forecast_growth.png - Forecast growth
12_executive_dashboard.png - Executive dashboard
Technical Stack
Python 3.9
pandas - Data manipulation
numpy - Numerical computations
matplotlib & seaborn - Visualization
statsmodels - Time series analysis
scikit-learn - Model evaluation metrics
Files Generated
Data Files (11 files)
Original and processed datasets
Decomposition results
Model comparisons
Future forecasts
Reports (6 files)
Detailed analysis reports
Executive summary
Business recommendations
Visualizations (12 images)
Comprehensive charts and dashboards
Model diagnostics
Forecast visualizations
Business Applications
Capacity Planning: Plan for 547K avg passengers with 739K summer peaks
Resource Allocation: Increase resources by 16.6% over 2 years
Risk Management: Account for ±16.8% forecast uncertainty
Strategic Planning: Support investment decisions with data-driven forecasts
Contributing
Contributions are welcome! Please follow these steps:
Fork the repository
Create a feature branch (git checkout -b feature/improvement)
Commit changes (git commit -am 'Add improvement')
Push to branch (git push origin feature/improvement)
Create Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Author
Awande Gcabashe
GitHub: Awande07
Project: Time Series Analysis & Forecasting
Acknowledgments
Dataset: International Airline Passengers (Box & Jenkins, 1976)
Methodology: Classical time series analysis techniques
Tools: Python, pandas, statsmodels, matplotlib
Inspiration: Real-world business forecasting challenges