Commit
Update headings for compatibility with nbsphinx/ docu rendering #25
MarcoHuebner committed Sep 18, 2024
1 parent 22d89d6 commit e7f1371
Showing 1 changed file with 70 additions and 16 deletions.
86 changes: 70 additions & 16 deletions nb/02_Geo_visualization_example.ipynb
@@ -4,11 +4,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Geo-Visualization Example Notebook\n",
"Geo-Visualization Example Notebook\n",
"==================================\n",
"\n",
"Welcome to the `Geo-Visualization Example` notebook! This notebook is designed to guide you through the process of visualizing geographical data from the Regionalstatistik database using Python and pystatis as API wrapper.\n",
"\n",
"## Libraries Overview\n",
"Libraries Overview\n",
"------------------\n",
"\n",
"In this notebook, we will require the following additional libraries:\n",
"\n",
"- GeoPandas: An open-source project that makes working with geospatial data in python easier. It extends the datatypes used by pandas to allow spatial operations on geometric types. GeoPandas enables us to work with geospatial data in Python similarly to how we work with pandas for regular data.\n",
@@ -35,7 +38,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Import Required Libraries"
"### Import Required Libraries\n",
"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
]
},
{
@@ -56,7 +60,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualization on the Level of Bundesländer\n",
"Visualization on the Level of Bundesländer\n",
"------------------------------------------\n",
"\n",
"In this first example, we will visualize the ratio of international students among students on the level of the Bundesländer. We will use the table with code `21311-01-01-4` from the Regionalstatistik API for the student data and `12411-01-01-4` for the population data. You can find the data by either search on the website or use the `Find` class which we also provide in `pystatis` to skim through the available data.\n",
"\n",
@@ -245,7 +250,13 @@
"metadata": {},
"source": [
"### Load Regionalstatistik Data\n",
"\n",
"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To now fill the map with our tables of interest, we need to query the data from the Regionalstatistik API. We will use our the `pystatis` library - more specifically the `Table` class - to query the data."
]
},
@@ -293,7 +304,13 @@
"metadata": {},
"source": [
"### Process Students Data\n",
"\n",
"~~~~~~~~~~~~~~~~~~~~~~~~~"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To determine the ratio of international students among students per year and region we need to first filter the data for the relevant columns. We will then merge the two tables and calculate the ratio of international students among students."
]
},
@@ -436,10 +453,21 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The ratio is now calculated and grouped by `Kreise and kreisfreie Städte` (districts and urban districts) as well as further parameters and can be visualized on the map of Germany. The missing data for `Aachen, Kreis` will be discussed later.\n",
"\n",
"The ratio is now calculated and grouped by `Kreise and kreisfreie Städte` (districts and urban districts) as well as further parameters and can be visualized on the map of Germany. The missing data for `Aachen, Kreis` will be discussed later."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Plot the Development of International Student Ratio for All Bundesländer\n",
"\n",
"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before we do this, we will first convert the grouped data into a DataFrame and re-sort the data by year to have a look at the time development for individual Bundesländer first. Lastly, we merge the DataFrame with international student ratios with the geopandas DataFrame to visualize the data on the map."
]
},
@@ -648,7 +676,13 @@
"metadata": {},
"source": [
"### Plot the Development of International Student Ratio on the Map of Germany\n",
"\n",
"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As mentioned before, we merge the DataFrame now with international student ratios with the geopandas DataFrame to visualize the data on the map of Germany."
]
},
@@ -725,7 +759,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualization on the Level of Landkreise\n",
"Visualization on the Level of Landkreise\n",
"----------------------------------------\n",
"\n",
"In this second example, we will visualize the ratio of international students among students on the level of individual Landkreise. For this, we additionally need to load the map of Germany which outlines the individual Landkreise."
]
@@ -846,7 +881,13 @@
"metadata": {},
"source": [
"### Process Students Data\n",
"\n",
"~~~~~~~~~~~~~~~~~~~~~~~~~"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can re-determine the ratio of international students among students per year and region. We will first look again at specific regions to see the time development of the ratio of international students among students before we then merge the DataFrame with the geopandas DataFrame to visualize the data on the map."
]
},
@@ -1021,7 +1062,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Plot the Development of International Student Ratio for Köln and Aachen"
"### Plot the Development of International Student Ratio for Köln and Aachen\n",
"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
]
},
{
@@ -1252,15 +1294,27 @@
"Having looked at the time development of the ratio of international students among students for specific regions shows a continues increase in the ratio of international students among students - in different strengths. However, it also shows that for example for `Aachen, Kreis`, there is no data available for all years in question.\n",
"\n",
"### Investigating Missing Data and the Data Quality Parameter\n",
"\n",
"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A more detailed investigation of the `quality` parameter of the data would be necessary to potentially determine the reason for the missing data. While this is in principle supported by the API via `quality=\"on\"`, regionalstatistik is the only of the three GENESIS databases to not actively support this. As a workaround, the website can be used to determine potential quality parameters of the data.\n",
"\n",
"Looking at the data on the [website](https://www.regionalstatistik.de/genesis/online?operation=ergebnistabelleUmfang&levelindex=3&levelid=1719518083070&downloadname=21311-01-01-4#abreadcrumb) reveals that there are indeed no values for `Aachen, Kreis` (more specifically, \"-\" means \"nichts vorhanden\"), while the data for `Aachen, kreisfreie Stadt` is unknown or to be kept secret (\".\" means \"Zahlenwert unbekannt oder geheimzuhalten\").\n",
"\n",
"(Explanation of legend [here](https://www.regionalstatistik.de/genesis/online?operation=ergebnistabelleQualitaet&language=de&levelindex=3&levelid=1719518083070#abreadcrumb))\n",
"\n",
"### Plot the Development of International Student Ratio on the Map of Germany With Finer Granularity\n",
"\n",
"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As before we merge the DataFrame with international student ratios with the geopandas DataFrame to visualize the data on the map of Germany. However, this time we will visualize the data on the level of individual Landkreise."
]
},
@@ -1379,7 +1433,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
"version": "3.11.9"
}
},
"nbformat": 4,