Commit

Added more stuff.
jejjohnson committed Jan 10, 2024
1 parent 63ce720 commit 0a82e1e
Showing 14 changed files with 1,353 additions and 3 deletions.
16 changes: 15 additions & 1 deletion _toc.yml
@@ -3,10 +3,24 @@ root: notes/index.md
chapters:
- title: Introduction
sections:
- file: notes/project.md
- file: notes/litreview.md
- title: Theory
sections:
- file: notes/theory/evt
- file: notes/theory/eval
- file: notes/theory/bayes
- title: Data
sections:
- file: notes/data/datasets
- file: notes/data/datasets
- file: notes/data/data_access
- title: Modeling
sections:
- file: notes/modeling/features_manual
- file: notes/modeling/features_ai
- file: notes/modeling/software
- title: Cookbook
sections:
- file: notes/cookbook/filtering
- file: notes/cookbook/anomalies
- file: notes/cookbook/spatial_mean
3 changes: 2 additions & 1 deletion myst.yml
@@ -1,7 +1,8 @@
# See docs at: https://mystmd.org/guide/frontmatter
version: 1
project:
title: Bayesian Extreme Value Modeling For Climate
title: Bayesian Methods for Extreme Value Modeling
subtitle: Applications in Climate
# description:
keywords: []
authors: []
124 changes: 124 additions & 0 deletions notes/cookbook/anomalies.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,124 @@
---
title: Anomalies in EO
subject: Anomalies with Spatiotemporal Data
short_title: EO Data Anomalies
authors:
- name: J. Emmanuel Johnson
affiliations:
- CSIC
- UCM
- IGEO
orcid: 0000-0002-6739-0053
email: juanjohn@ucm.es
license: CC-BY-4.0
keywords: simulations
abbreviations:
ERA5: ECMWF Reanalysis Version 5
CMIP6: Coupled Model Intercomparison Project Phase 6
AMIP6: Atmospheric Model Intercomparison Project Phase 6
PDEs: Partial Differential Equations
RHS: Right Hand Side
TLDR: Too Long Did Not Read
SSP: Shared Socioeconomic Pathways
GPD: Generalized Pareto Distribution
GEV: Generalized Extreme Value
---


## Climatology

$$
\begin{aligned}
\text{Climatology Equation}: && && \bar{y}_c(t) &= \frac{1}{N_s}\sum_{n=1}^{N_s}\boldsymbol{y}(\mathbf{x}_n,t) \\
\text{Climatology Function}: && && \bar{y}_c&: \Omega_\text{Globe}\times\mathcal{T}_\text{Reference} \rightarrow \mathbb{R}^{D_y} \\
\text{Spatial Domain}: && && \mathbf{x}&\in\Omega_\text{Globe}\subseteq\mathbb{R}^{D_s}\\
\text{Temporal Domain}: && && t&\in\mathcal{T}_\text{Reference}\subseteq\mathbb{R}^+
\end{aligned}
$$

:::{seealso} Tutorials
:class: dropdown

[**ClimateMatch**](https://comptools.climatematch.io/tutorials/W1D1_ClimateSystemOverview/student/W1D1_Tutorial5.html).
A simple tutorial showcasing how the `groupby` function works for monthly and seasonal means.

[**Xarray**](https://docs.xarray.dev/en/stable/examples/monthly-means.html).
A tutorial that showcases how to calculate seasonal averages from time series of monthly means.

:::

## Anomalies


This formulation has a problem: the climatology cannot be subtracted directly from the global time series, because the two are defined on different temporal domains.

$$
\begin{aligned}
\text{Climatology}: && &&
\boldsymbol{\bar{y}} &= \boldsymbol{\bar{y}}_c(t) && && t\in\mathcal{T}_\text{Reference}\subseteq\mathbb{R}^+ \\
\text{Data}: && &&
\boldsymbol{y} &= \boldsymbol{y}(\mathbf{x},t)
&& && t\in\mathcal{T}_\text{Globe}\subseteq\mathbb{R}^+ && &&
\mathbf{x}\in\Omega\subseteq\mathbb{R}^{D_s}
\end{aligned}
$$

From a code perspective, the mismatch between the global data and the climatology looks like this:

```python
# global data
data_globe: Array["Nx Ny Nt"] = ...
# climatology reference period
data_climatology: Array["Nc"] = ...
# IMPOSSIBLE to subtract one timeseries from another (even with broadcasting)
data_anomaly: Array["Nx Ny Nt"] = data_globe - data_climatology

```
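
One way to resolve the mismatch, sketched here under the assumption that the climatology holds one value per calendar month (so `Nc = 12`) and is kept per grid point, is to look up the climatological value that matches the month of each time step before subtracting. The array names, shapes, and synthetic data are illustrative only, and the reference period is simply the whole record.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical global data on a small grid: (Nx, Ny, Nt), 3 years of monthly steps
n_x, n_y, n_t = 4, 3, 36
data_globe = rng.normal(size=(n_x, n_y, n_t))

# calendar month (0..11) of every time step; assumes the series starts in January
month_index = np.arange(n_t) % 12

# monthly climatology per grid point: (Nx, Ny, 12)
data_climatology = np.stack(
    [data_globe[..., month_index == m].mean(axis=-1) for m in range(12)],
    axis=-1,
)

# expand the climatology back onto the data's time axis, then subtract
data_anomaly = data_globe - data_climatology[..., month_index]
```

Indexing the climatology with `month_index` is what puts both arrays on the same time axis; grouping by `"time.month"` in `xarray` performs the same alignment.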

**TODO**: Need to figure out how this works.


**Pseudocode**

https://xcdat.readthedocs.io/en/latest/examples/climatology-and-departures.html

### Example 1: Monthly Mean

This example was taken from the [xarray documentation](https://docs.xarray.dev/en/latest/examples/weather-data.html#Calculate-monthly-anomalies).

```python
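import xarray as xr

# `ds` is assumed to be an xr.Dataset with a datetime "time" coordinate

# calculate the monthly climatology (one mean per calendar month)
climatology: xr.Dataset = ds.groupby("time.month").mean("time")

# calculate anomalies by subtracting each month's climatology
anomalies: xr.Dataset = ds.groupby("time.month") - climatology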
```

***

### Example 2: Monthly Standardization

We can also calculate standardized monthly anomalies.
This requires both the monthly mean and the monthly standard deviation.


```python
# calculate monthly mean and standard deviation
climatology_mean: xr.Dataset = ds.groupby("time.month").mean("time")
climatology_std: xr.Dataset = ds.groupby("time.month").std("time")

# create standardization function
std_fn = lambda x, mean, std: (x - mean) / std

# calculate standardized anomalies
anomalies: xr.Dataset = xr.apply_ufunc(
    std_fn,
    ds.groupby("time.month"),
    climatology_mean,
    climatology_std,
)
```

***

### Example 3: Seasonal
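
A possible sketch of the seasonal case, assuming the standard meteorological seasons (DJF, MAM, JJA, SON) and a monthly series that starts in January. It uses plain NumPy on synthetic data, and all names are illustrative; in `xarray`, the analogous grouping key is `"time.season"`.

```python
import numpy as np

# hypothetical monthly series covering 4 years, starting in January
n_years = 4
months = np.tile(np.arange(1, 13), n_years)  # calendar month, 1..12
series = np.sin(2 * np.pi * (months - 1) / 12) + 0.01 * np.arange(months.size)

# map months to seasons: 0=DJF, 1=MAM, 2=JJA, 3=SON
season_names = np.array(["DJF", "MAM", "JJA", "SON"])
season_index = (months % 12) // 3

# seasonal climatology: one mean per season
seasonal_climatology = np.array(
    [series[season_index == s].mean() for s in range(4)]
)

# seasonal anomalies: subtract the mean of the season each step belongs to
seasonal_anomalies = series - seasonal_climatology[season_index]
```

The only change from the monthly examples is the grouping key: each time step is assigned to a season instead of a calendar month.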
105 changes: 105 additions & 0 deletions notes/cookbook/filtering.md
@@ -0,0 +1,105 @@
---
title: Filtering EO Data
subject: Filtering Spatiotemporal Data
short_title: Filtering EO Data
authors:
- name: J. Emmanuel Johnson
affiliations:
- CSIC
- UCM
- IGEO
orcid: 0000-0002-6739-0053
email: juanjohn@ucm.es
license: CC-BY-4.0
keywords: simulations
abbreviations:
ERA5: ECMWF Reanalysis Version 5
CMIP6: Coupled Model Intercomparison Project Phase 6
AMIP6: Atmospheric Model Intercomparison Project Phase 6
PDEs: Partial Differential Equations
RHS: Right Hand Side
TLDR: Too Long Did Not Read
SSP: Shared Socioeconomic Pathways
GPD: Generalized Pareto Distribution
GEV: Generalized Extreme Value
---

> This is my quick start guide to filtering Earth Observation data.
> Filtering is a very useful tool.
> It can be used for smoothing, which removes high-frequency signals, and more generally for removing either high-frequency or low-frequency signals.
> It can also be used to calculate the min, max, or mean values over a window of data.
> The unique thing about EO data is that it is spatiotemporal.
> This means we often have to decide whether to apply the filter operation along the spatial axes, the temporal axis, or both.

## Formulation

Most filtering operations are special cases of convolutions.

$$
\begin{aligned}
\text{Convolution Operator}: && && \boldsymbol{\bar{y}}{(\mathbf{x})} &= (\boldsymbol{f} \circledast \boldsymbol{y})(\mathbf{x}) \\
\text{Continuous Convolution}: && && &= \int_{-\infty}^\infty
\boldsymbol{y}(\tau)\boldsymbol{f}(\mathbf{x}-\tau)d\tau \\
\text{Discrete Convolution}: && && &= \sum_{m=-\infty}^\infty
\boldsymbol{y}(m)\boldsymbol{f}(n-m)
\end{aligned}
$$


Essentially, this is:

1. An element-wise multiplication of the data points within a window with the filter coefficients.
2. A reduction over the weighted points within that window (a sum, in the case of a convolution).

For each window, this is equivalent to the dot product of two vectors, where one vector is the data points within the window and the other is the (reversed) filter coefficients.
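
The window-wise dot product can be checked numerically. The following is a minimal sketch, assuming a 1D signal, no padding, and an arbitrary 3-tap kernel chosen only for illustration. The kernel is reversed in the dot product because the convolution sum indexes the filter as $f(n-m)$.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

signal = np.array([1.0, 2.0, 4.0, 7.0, 11.0, 16.0])
kernel = np.array([0.5, 0.3, 0.2])  # illustrative filter coefficients

# every length-3 window of the signal: shape (4, 3)
windows = sliding_window_view(signal, window_shape=kernel.size)

# dot product of each window with the reversed kernel
filtered_dot = windows @ kernel[::-1]

# the same result from NumPy's discrete convolution without padding
filtered_conv = np.convolve(signal, kernel, mode="valid")
```

Both arrays have `len(signal) - len(kernel) + 1` entries, since only windows that fit entirely inside the signal are kept.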

***

## From Scratch


We can write some basic pseudo-code for filtering.
A simple starting point is a moving average over a fixed-size window.

```python
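import jax.numpy as jnp
import kernex as kex
from jaxtyping import Array, Float

# define kernel operator: a moving average over windows of 5 time steps
kernel_size = (5,)
strides = (1,)

@kex.kmap(kernel_size=kernel_size, strides=strides)
def filter_all(u: Float[Array, "... Nt"]):
    return jnp.average(u)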
```


We can use some more advanced filtering methods.
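
As one illustration of what other window operations could look like (the choice of statistics here is an assumption, not a list from these notes), the same sliding-window idea extends to reductions other than the mean. The sketch uses plain NumPy and a window of 5 samples to match the kernel size above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

u = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])

# all windows of 5 consecutive samples: shape (4, 5)
windows = sliding_window_view(u, window_shape=5)

# reduce each window with a different statistic
window_min = windows.min(axis=-1)
window_max = windows.max(axis=-1)
window_median = np.median(windows, axis=-1)
```

Unlike the mean, the min, max, and median are not convolutions, but they reuse the same windowing step.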

***

## `xarray`

We can use `xarray` to compute the rolling mean.

See [docs](https://docs.xarray.dev/en/stable/generated/xarray.DataArray.rolling.html) for more detailed examples.

```python
import numpy as np
import pandas as pd
import xarray as xr

# initialize data: 12 monthly time steps
data: np.ndarray = np.linspace(0, 11, num=12)
time: pd.DatetimeIndex = pd.date_range("1999-12-15", periods=12, freq=pd.DateOffset(months=1))

# create an xarray DataArray
da: xr.DataArray = xr.DataArray(data, coords={"time": time}, dims="time")

# compute rolling mean over a centered window of 5 time steps (5 months here)
time_window: int = 5
da_smooth: xr.DataArray = da.rolling(time=time_window, center=True).mean()

# (Optional) remove the NaNs at the end-points
da_smooth = da_smooth.dropna("time")
```

***

## `gcm_filters`

There is another package called [gcm-filters](https://gcm-filters.readthedocs.io/en/latest/index.html), which provides a convenient wrapper around `xarray` objects.
It features many more advanced filters that take into account things like masks and curvilinear grids.
45 changes: 45 additions & 0 deletions notes/cookbook/spatial_mean.md
@@ -0,0 +1,45 @@
---
title: Spatial Averaging
subject: Averaging Spatial Data
short_title: Spatial Averaging
authors:
- name: J. Emmanuel Johnson
affiliations:
- CSIC
- UCM
- IGEO
orcid: 0000-0002-6739-0053
email: juanjohn@ucm.es
license: CC-BY-4.0
keywords: simulations
abbreviations:
ERA5: ECMWF Reanalysis Version 5
CMIP6: Coupled Model Intercomparison Project Phase 6
AMIP6: Atmospheric Model Intercomparison Project Phase 6
PDEs: Partial Differential Equations
RHS: Right Hand Side
TLDR: Too Long Did Not Read
SSP: Shared Socioeconomic Pathways
GPD: Generalized Pareto Distribution
GEV: Generalized Extreme Value
---


This example was taken from the [xarray documentation](https://docs.xarray.dev/en/latest/examples/area_weighted_temperature.html).


```python
# `da` is assumed to be an xr.DataArray with a "lat" coordinate in degrees
weights: xr.DataArray = np.cos(np.deg2rad(da.lat))
weights.name = "weights"
```
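
The cosine weights matter because grid cells on a regular latitude-longitude grid cover less area towards the poles. The following is a minimal NumPy sketch of the resulting weighted mean, assuming a synthetic field on a hypothetical 5° by 2.5° grid; all names and values are illustrative. The linked `xarray` example does the equivalent with `da.weighted(weights).mean(("lon", "lat"))`.

```python
import numpy as np

# hypothetical regular grid: latitude and longitude in degrees
lat = np.linspace(-87.5, 87.5, 36)
lon = np.linspace(0.0, 357.5, 144)

# synthetic field of shape (Nlat, Nlon), largest at the equator
field = np.cos(np.deg2rad(lat))[:, None] * np.ones((lat.size, lon.size))

# cell area on a regular lat-lon grid shrinks with cos(latitude)
weights = np.cos(np.deg2rad(lat))

# broadcast the latitude weights across longitude and normalize
weights_2d = np.broadcast_to(weights[:, None], field.shape)
weighted_mean = (field * weights_2d).sum() / weights_2d.sum()

# unweighted mean for comparison; it over-counts the polar rows
unweighted_mean = field.mean()
```

For this field the weighted mean is larger than the unweighted one, because the low values near the poles contribute less once the shrinking cell area is taken into account.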

:::{seealso} Tutorials
:class: dropdown

[**xarray**](https://docs.xarray.dev/en/latest/examples/area_weighted_temperature.html).
An example from `xarray` that showcases the area-weighted temperature.

[**xcdat**](https://xcdat.readthedocs.io/en/latest/examples/spatial-average.html).
An example for calculating geospatial weighted averages from monthly time series.
:::
