diff --git a/doc/user-guide/filtering-and-resampling.ipynb b/doc/user-guide/filtering-and-resampling.ipynb
new file mode 100644
index 0000000..9fbe14d
--- /dev/null
+++ b/doc/user-guide/filtering-and-resampling.ipynb
@@ -0,0 +1,685 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "b314e777-7ffb-4e62-b4c5-ce8a785c5181",
+   "metadata": {},
+   "source": [
+    "# Filtering and resampling Xarray datasets with xbatcher\n",
+    "\n",
+    "In machine learning, we often want to discard invalid observations or modify the distribution of a target variable. This notebook demonstrates how a `BatchGenerator` can produce filtered or resampled datasets when passed functions that identify usable data or assign a sample weight to each patch."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7158f5f3-42f5-4dcd-87ee-045d9d0e85f5",
+   "metadata": {},
+   "source": [
+    "### Libraries and toy data"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "5d912ff0-d808-4704-8dea-b9e1b5a53bf1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import matplotlib.pyplot as plt\n",
+    "import numpy as np\n",
+    "import xarray as xr\n",
+    "\n",
+    "import xbatcher as xb"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "7fb892c1-50fd-48c8-8567-b150946b53c9",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "
<xarray.Dataset> Size: 31MB\n",
+       "Dimensions: (lat: 25, time: 2920, lon: 53)\n",
+       "Coordinates:\n",
+       " * lat (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0\n",
+       " * lon (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0\n",
+       " * time (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00\n",
+       "Data variables:\n",
+       " air (time, lat, lon) float64 31MB ...\n",
+       "Attributes:\n",
+       " Conventions: COARDS\n",
+       " title: 4x daily NMC reanalysis (1948)\n",
+       " description: Data is from NMC initialized reanalysis\\n(4x/day). These a...\n",
+       " platform: Model\n",
+       " references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...