Although Senzing can be used with Jupyter, Jupyter notebooks are not part of the Senzing product.
If you are beginning your journey with Senzing, please start with the Senzing Quick Start guides.
The docker-jupyter repository holds example Senzing Jupyter notebooks in the notebooks subdirectory.
The senzing/jupyter docker image is a Senzing-ready image hosting the example Senzing notebooks.
These notebooks are built upon the docker images published by the Jupyter organization on DockerHub. The default base image is jupyter/minimal-notebook. More information is available at Jupyter Docker Stacks.
In addition, the Jupyter notebooks can be viewed on nbviewer.jupyter.org. For example, visit Senzing examples on NbViewer.
- 🤔 - A "thinker" icon means that a little extra thinking may be required. Perhaps you'll need to make some choices. Perhaps it's an optional step.
- ✏️ - A "pencil" icon means that the instructions may need modification before performing.
- ⚠️ - A "warning" icon means that something tricky is happening, so pay attention.
This repository and demonstration require 9 GB free disk space.
Budget 40 minutes to get the demonstration up-and-running, depending on CPU and network speeds.
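The disk space requirement can be checked before starting. The following is a minimal sketch, not part of the repository: the helper name `has_free_kb` is hypothetical, and checking the current directory is an assumption (Docker may store images on a different filesystem, so adjust the path as needed).

```shell
# Hypothetical helper: succeed if DIR has at least REQUIRED_KB kilobytes available.
has_free_kb() {
    dir="$1"
    required_kb="$2"
    # df -P prints POSIX-format output; -k reports sizes in 1024-byte blocks.
    # Line 2, column 4 of that output is the available space.
    available_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$available_kb" -ge "$required_kb" ]
}

# 9 GB expressed in kilobytes (9 * 1024 * 1024).
REQUIRED_KB=9437184

if has_free_kb . "$REQUIRED_KB"; then
    echo "Enough free disk space."
else
    echo "Less than 9 GB free in $(pwd)."
fi
```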
This repository assumes a working knowledge of:
- If Senzing has not been initialized, visit "How to initialize Senzing with Docker".
Configuration values are specified by environment variable or command line parameter.
Non-Senzing configuration can be seen at Jupyter Docker Stacks.
- JUPYTER_NOTEBOOKS_SHARED_DIR
- SENZING_DATA_VERSION_DIR
- SENZING_ETC_DIR
- SENZING_G2_DIR
- SENZING_NETWORK
- SENZING_RUNAS_USER
- SENZING_VAR_DIR
- ✏️ Specify the directory containing the Senzing installation. Use the same SENZING_VOLUME value used when performing "How to initialize Senzing with Docker". Example:

      export SENZING_VOLUME=/opt/my-senzing

  - Here's a simple test to see if SENZING_VOLUME is correct. The following commands should return file contents. Example:

        cat ${SENZING_VOLUME}/g2/g2BuildVersion.json
        cat ${SENZING_VOLUME}/data/3.0.0/libpostal/data_version

  - ⚠️ macOS - File sharing must be enabled for SENZING_VOLUME.
  - ⚠️ Windows - File sharing must be enabled for SENZING_VOLUME.
- Identify the data_version, etc, g2, and var directories. Example:

      export SENZING_DATA_VERSION_DIR=${SENZING_VOLUME}/data/3.0.0
      export SENZING_ETC_DIR=${SENZING_VOLUME}/etc
      export SENZING_G2_DIR=${SENZING_VOLUME}/g2
      export SENZING_VAR_DIR=${SENZING_VOLUME}/var
🤔 Optional: Use if docker container is part of a docker network.
- List docker networks. Example:

      sudo docker network ls

- ✏️ Specify docker network. Choose value from NAME column of docker network ls. Example:

      export SENZING_NETWORK=*nameofthe_network*

- Construct parameter for docker run. Example:

      export SENZING_NETWORK_PARAMETER="--net ${SENZING_NETWORK}"
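Because the network step is optional, the parameter could also be built so that it stays empty when no network was chosen. The sketch below is one possible way to do that; the function name `network_parameter` is hypothetical.

```shell
# Hypothetical helper: print "--net NAME" when NAME is non-empty, else nothing.
# "${1:+word}" expands to word only when $1 is set and non-empty.
network_parameter() {
    printf '%s' "${1:+--net $1}"
}

# Leaves SENZING_NETWORK_PARAMETER empty if SENZING_NETWORK is unset or empty.
SENZING_NETWORK_PARAMETER="$(network_parameter "${SENZING_NETWORK:-}")"
export SENZING_NETWORK_PARAMETER
```

With an empty value, the unquoted `${SENZING_NETWORK_PARAMETER}` in a later `docker run` expands to nothing, so the command is unaffected.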
🤔 Optional: Some databases need additional support. For other databases, these steps may be skipped.
- Db2: See Support Db2 instructions to set SENZING_OPT_IBM_DIR_PARAMETER.
- MS SQL: See Support MS SQL instructions to set SENZING_OPT_MICROSOFT_DIR_PARAMETER.
- ✏️ Set environment variables. Example:

      export JUPYTER_NOTEBOOKS_SHARED_DIR=$(pwd)
      export WEBAPP_PORT=8888

- 🤔 Optional: Run Jupyter without token authentication. Example:

      export JUPYTER_PARAMETERS="start.sh jupyter notebook --NotebookApp.token=''"

- Run docker container. Example:

      sudo docker run \
        --interactive \
        --name senzing-jupyter \
        --publish ${WEBAPP_PORT}:8888 \
        --rm \
        --tty \
        --volume ${JUPYTER_NOTEBOOKS_SHARED_DIR}:/notebooks/shared \
        --volume ${SENZING_DATA_VERSION_DIR}:/opt/senzing/data \
        --volume ${SENZING_ETC_DIR}:/etc/opt/senzing \
        --volume ${SENZING_G2_DIR}:/opt/senzing/g2 \
        --volume ${SENZING_VAR_DIR}:/var/opt/senzing \
        ${SENZING_NETWORK_PARAMETER} \
        ${SENZING_OPT_IBM_DIR_PARAMETER} \
        ${SENZING_OPT_MICROSOFT_DIR_PARAMETER} \
        senzing/jupyter ${JUPYTER_PARAMETERS}
- If token authentication is disabled, access your Jupyter notebooks at: http://127.0.0.1:8888/

- If token authentication is enabled, locate the URL in the Docker log. Example:

      Copy/paste this URL into your browser when you connect for the first time, to login with a token:
      http://(a152e5586fdc or 127.0.0.1):8888/?token=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

  Adjust the URL by replacing the parenthesized host list with 127.0.0.1. Example:

      http://127.0.0.1:8888/?token=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

  Paste the URL into a web browser.
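The URL adjustment above can be sketched as a small text substitution. The helper name `adjust_jupyter_url` is hypothetical, and the sketch assumes the log line has the `http://(container-id or 127.0.0.1):port/...` shape shown in the example.

```shell
# Hypothetical helper: replace the "(container-id or 127.0.0.1)" host list
# in a Jupyter log URL with plain 127.0.0.1.
adjust_jupyter_url() {
    # In a basic regular expression, unescaped parentheses match literally.
    printf '%s\n' "$1" | sed 's#http://([^)]*)#http://127.0.0.1#'
}

# Example use:
#   adjust_jupyter_url 'http://(a152e5586fdc or 127.0.0.1):8888/?token=...'
```

The sketch leaves the port untouched. Since the container is published with `${WEBAPP_PORT}:8888`, a `WEBAPP_PORT` other than 8888 would presumably also require changing the port in the URL by hand.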
The Jupyter notebooks in notebooks/senzing-examples are of two types:
- References - Information on specific method invocations and their parameters. Examples:
- Guides - Illustrations of how to use methods to accomplish tasks. Often points to appropriate "Reference" entries for specific method invocations. Examples:
The following software programs need to be installed:
For more information on environment variables, see Environment Variables.
- Set these environment variable values:

      export GIT_ACCOUNT=senzing
      export GIT_REPOSITORY=docker-jupyter
      export GIT_ACCOUNT_DIR=~/${GIT_ACCOUNT}.git
      export GIT_REPOSITORY_DIR="${GIT_ACCOUNT_DIR}/${GIT_REPOSITORY}"

- Follow steps in clone-repository to install the Git repository.

- Set environment variables for Senzing directories. See Volumes. Example:

      export SENZING_VOLUME=/opt/my-senzing
      export SENZING_DATA_DIR=${SENZING_VOLUME}/data
      export SENZING_DATA_VERSION_DIR=${SENZING_DATA_DIR}/3.0.0
      export SENZING_ETC_DIR=${SENZING_VOLUME}/etc
      export SENZING_G2_DIR=${SENZING_VOLUME}/g2
      export SENZING_VAR_DIR=${SENZING_VOLUME}/var

- Set environment variables. Example:

      export PYTHONPATH=${SENZING_G2_DIR}/python
      export LD_LIBRARY_PATH=${SENZING_G2_DIR}/lib:${SENZING_G2_DIR}/lib/debian
      export SENZING_SQL_CONNECTION="sqlite3://na:na@${SENZING_VAR_DIR}/sqlite/G2C.db"

- Start Jupyter notebook. Example:

      cd ${GIT_REPOSITORY_DIR}
      jupyter notebook
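When running outside Docker, a quick check that the SQLite file named in `SENZING_SQL_CONNECTION` exists may catch path mistakes early. This is a minimal sketch that assumes the connection string has the `sqlite3://na:na@/path/to/file` form shown above; the helper `sqlite_path_from_connection` is hypothetical.

```shell
# Hypothetical helper: print the file path portion of a connection string of
# the form sqlite3://user:password@/path/to/G2C.db.
sqlite_path_from_connection() {
    # "${1#*@}" removes the shortest leading match of "*@",
    # i.e. everything up to and including the first "@".
    printf '%s\n' "${1#*@}"
}

# Example use:
#   db_file=$(sqlite_path_from_connection "$SENZING_SQL_CONNECTION")
#   [ -f "$db_file" ] || echo "SQLite database not found: $db_file" >&2
```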
- Option #1: Using docker command and GitHub.

      sudo docker build --tag senzing/jupyter https://github.com/senzing/docker-jupyter.git#main

- Option #2: Using docker command and local repository.

      cd ${GIT_REPOSITORY_DIR}
      sudo docker build --tag senzing/jupyter .

- Option #3: Using make command.

      cd ${GIT_REPOSITORY_DIR}
      sudo make docker-build

  Note: sudo make docker-build-development-cache can be used to create cached docker layers.
- See docs/errors.md.