Intro to Databases in Industry: Data Cleaning, Querying, and Modeling at Scale
Speakers:
- Rodolfo Lourenzutti, University of British Columbia
- Arman Seyed-Ahmadi, University of British Columbia
- Diego Ardila, Shopify
You can install PostgreSQL on your own machine and load the database dump files provided in the databases/ folder to recreate the workshop databases locally for further practice. The instructions to do so are provided here.
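As a rough illustration of what loading a dump involves, the sketch below builds the PostgreSQL client commands for one file. It is not the workshop's official procedure, and the linked instructions take precedence. The file name `databases/example.sql` and the database name `example` are hypothetical placeholders, and the dump format is an assumption: a `.sql` file is treated as a plain-text dump replayed with `psql`, and anything else as an archive restored with `pg_restore`.

```python
import subprocess
from pathlib import Path


def build_restore_commands(dump_path, db_name):
    """Return the argv lists that create db_name and load dump_path into it."""
    commands = [["createdb", db_name]]
    if Path(dump_path).suffix == ".sql":
        # Assumption: a .sql file is a plain-text dump, replayed with psql.
        commands.append(["psql", "-d", db_name, "-f", str(dump_path)])
    else:
        # Assumption: any other file is a custom- or tar-format archive.
        commands.append(["pg_restore", "-d", db_name, str(dump_path)])
    return commands


def restore(dump_path, db_name):
    """Run the commands; needs the PostgreSQL client tools on PATH."""
    for argv in build_restore_commands(dump_path, db_name):
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical names; the commands are printed, not executed.
    for argv in build_restore_commands("databases/example.sql", "example"):
        print(" ".join(argv))
```

Calling `restore(...)` with a real dump file from the databases/ folder would execute the commands. Connection options such as user and host are left to the usual PostgreSQL environment variables.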
The Jupyter notebooks in this repository use a few packages to run SQL commands within the Python environment of the notebooks, all of which are listed in environment.yml. To reproduce this environment and make it accessible to Jupyter Lab, you need to install the nb_conda_kernels package in your base environment (or whichever environment Jupyter Lab is installed in) using the following command in your terminal:
conda install nb_conda_kernels
Then run the following command to recreate the environment:
conda env create -f environment.yml
A new environment called ssc2022 should appear in the list of kernels when you launch Jupyter Lab on your computer.
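If the kernel does not show up, one thing worth checking is whether conda created the environment at all. The sketch below is a minimal helper for that, assuming `conda env list --json` prints an object with an `envs` list of environment paths. The function names are hypothetical, and the path handling assumes the paths use the separators of the machine running the check.

```python
import json
import subprocess
from pathlib import Path


def env_names(conda_json):
    """Extract environment names from `conda env list --json` output."""
    paths = json.loads(conda_json).get("envs", [])
    # The environment name is taken to be the last component of its path.
    return [Path(p).name for p in paths]


def has_env(conda_json, name="ssc2022"):
    """Return True if an environment with the given name is listed."""
    return name in env_names(conda_json)


def query_conda():
    """Ask conda for its environments; needs conda on PATH."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

With conda installed, `has_env(query_conda())` reports whether ssc2022 exists. This confirms only that the environment was created, not that Jupyter Lab has picked up its kernel.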
© 2022 Arman Seyed-Ahmadi, Rodolfo Lourenzutti, Diego Ardila
Software licensed under the MIT License, non-software content licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License. See the license file for more information.