This project presents an in-depth data analysis and time series forecasting study on global surface temperature and CO₂ emissions. The goal is to uncover key patterns, understand the impact of greenhouse gases on climate change, and forecast future temperature trends using statistical and machine learning models.
- Tools & Libraries: Python, Pandas, NumPy, Matplotlib, Seaborn, Statsmodels, scikit-learn, TensorFlow (Keras), Prophet
- Techniques: EDA, Time Series Forecasting, Hypothesis Testing, Correlation Analysis, Change Point Detection
This project primarily uses two data sources:
- Global Surface Temperature Anomaly (1850-2024): GISTEMP v4
- Global CO₂ Emissions (1850-2023): CO₂ Emissions
- Data Exploration: Conducted comprehensive EDA on the Global Surface Temperature and CO₂ datasets, analyzing data types, missing values, duplicates, and outliers. Used statistical imputation and interpolation to clean the data.
- Temperature Trend
- Time Series Forecasting: Modeled temperature anomalies using:
  - SARIMA
  - LSTM (Long Short-Term Memory Neural Network)
  - Prophet
  - Final model: SARIMAX, chosen based on performance and reliability.
- Hypothesis Testing: Found that global temperature anomalies are better described by a quadratic trend (accelerating warming) than by a constant linear rate.
- Change Point Detection (CPD): Identified a significant acceleration in global warming over the period 1989 to 2024.
- Correlation Analysis: Found a strong correlation between CO₂ emissions and rising global temperatures, consistent with CO₂ being the dominant contributor to the greenhouse effect.
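The data exploration step mentions removing duplicates and filling gaps by interpolation. A minimal sketch of that kind of cleaning with Pandas might look like the following. The column names, the sample values, and the `clean_series` helper are hypothetical and are not taken from the project's notebooks.

```python
import numpy as np
import pandas as pd

# Hypothetical yearly anomaly table with two gaps and one duplicated year.
raw = pd.DataFrame({
    "year": [2000, 2001, 2002, 2002, 2003, 2004, 2005],
    "anomaly": [0.40, np.nan, 0.62, 0.62, 0.61, np.nan, 0.67],
})


def clean_series(df: pd.DataFrame) -> pd.Series:
    """Drop duplicated years, sort by year, and fill gaps linearly."""
    deduped = df.drop_duplicates(subset="year").set_index("year").sort_index()
    # method="linear" treats rows as equally spaced, which holds for
    # consecutive yearly observations.
    return deduped["anomaly"].interpolate(method="linear")


cleaned = clean_series(raw)
print(cleaned)
```

Linear interpolation is only one option; for long gaps a model-based or seasonal fill may be more appropriate.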
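One common way to test a quadratic trend against a linear one is a nested-model F-test on ordinary least squares fits. The sketch below computes that F statistic with NumPy on synthetic data. The `nested_f_statistic` helper and the data are hypothetical, and the project may have used a different test.

```python
import numpy as np


def nested_f_statistic(t, y):
    """F statistic for adding a t^2 term to a linear OLS trend.

    Returns (F, residual degrees of freedom of the quadratic model).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    tc = t - t.mean()  # centre time to keep the design matrix well conditioned
    x_lin = np.column_stack([np.ones_like(tc), tc])
    x_quad = np.column_stack([x_lin, tc ** 2])

    rss = []
    for design in (x_lin, x_quad):
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        rss.append(float(resid @ resid))

    dof = len(y) - x_quad.shape[1]
    # One extra parameter, so the numerator has 1 degree of freedom.
    f_stat = (rss[0] - rss[1]) / (rss[1] / dof)
    return f_stat, dof


# Synthetic accelerating series (assumption), not the real anomaly data.
rng = np.random.default_rng(42)
years = np.arange(1850, 2025)
elapsed = years - 1850
series = -0.3 + 0.004 * elapsed + 5e-5 * elapsed ** 2 + rng.normal(0, 0.1, years.size)

f_stat, dof = nested_f_statistic(years, series)
print(f"F = {f_stat:.1f} on (1, {dof}) degrees of freedom")
```

A large F favours the quadratic model; a p-value can be obtained with `scipy.stats.f.sf(f_stat, 1, dof)`. The test assumes independent residuals, which real temperature series tend to violate, so the result should be read with that caveat.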
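The README does not say which change point method was used. As a simple stand-in, the sketch below searches exhaustively for the single break year that minimises the combined residual error of two separate straight-line fits. The helper names and the synthetic series (with a slope change placed at 1989 by construction) are assumptions for illustration.

```python
import numpy as np


def line_rss(t, y):
    """Residual sum of squares of an OLS straight-line fit to (t, y)."""
    tc = t - t.mean()
    slope, intercept = np.polyfit(tc, y, 1)
    return float(np.sum((y - (slope * tc + intercept)) ** 2))


def best_trend_break(t, y, min_seg=10):
    """Return the time at which splitting into two line fits gives the lowest RSS."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    # Require at least min_seg points on each side of the break.
    candidates = np.arange(min_seg, len(t) - min_seg)
    costs = [line_rss(t[:k], y[:k]) + line_rss(t[k:], y[k:]) for k in candidates]
    return float(t[candidates[int(np.argmin(costs))]])


# Synthetic series (assumption): slow warming until 1989, faster afterwards.
rng = np.random.default_rng(1)
years = np.arange(1900, 2025)
trend = np.where(
    years < 1989,
    0.002 * (years - 1900),
    0.002 * 89 + 0.025 * (years - 1989),
)
series = trend + rng.normal(0, 0.03, years.size)

break_year = best_trend_break(years, series)
print(f"Estimated break year: {break_year:.0f}")
```

This handles one break only; detecting several change points needs a penalised or recursive search.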
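For the correlation step, a Pearson coefficient can be computed directly with NumPy. The sketch below uses synthetic data and hypothetical names. It also compares the correlation of the raw series with that of their year-to-year changes, since two series that both trend upward will correlate strongly in levels almost regardless of their relationship.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    return float(np.corrcoef(x, y)[0, 1])


# Synthetic yearly data (assumption): emissions rise steadily and
# temperature responds linearly with added noise.
rng = np.random.default_rng(7)
co2 = np.linspace(0.2, 37.0, 174) + rng.normal(0, 0.2, 174)
temperature = 0.03 * co2 - 0.4 + rng.normal(0, 0.08, 174)

r_levels = pearson_r(co2, temperature)
# Differencing removes the shared trend before correlating.
r_changes = pearson_r(np.diff(co2), np.diff(temperature))

print(f"Correlation of levels:  {r_levels:.2f}")
print(f"Correlation of changes: {r_changes:.2f}")
```

A high correlation in levels is consistent with a causal link but does not establish one on its own; the differenced value shows how much of the association survives once the common trend is removed.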
Key insights and visualizations from this analysis have been published to Tableau Public, providing an interactive overview of global temperature anomalies, CO₂ emissions, and long-term climate trends for easy exploration and sharing. Tableau Dashboard
For any questions, suggestions, or feedback, feel free to open an issue in this GitHub repository.