From 6610fe8eb64389ba93702306c5fa4523eb095c85 Mon Sep 17 00:00:00 2001
From: Marion
Date: Thu, 22 Aug 2024 18:15:00 -0700
Subject: [PATCH 01/40] back up doc, delete old docs

---
 docs/authx-setup.md                     |  54 -------
 docs/authz-permissions.md               | 188 ------------------------
 docs/backing-up-and-restoring-candig.md | 101 +++++++++++++
 docs/configure-prometheus.md            |  84 -----------
 docs/configure-vault.md                 | 164 ---------------------
 docs/run-vault-helper-tool.md           | 172 ----------------------
 6 files changed, 101 insertions(+), 662 deletions(-)
 delete mode 100644 docs/authx-setup.md
 delete mode 100644 docs/authz-permissions.md
 create mode 100644 docs/backing-up-and-restoring-candig.md
 delete mode 100644 docs/configure-prometheus.md
 delete mode 100644 docs/configure-vault.md
 delete mode 100644 docs/run-vault-helper-tool.md

diff --git a/docs/authx-setup.md b/docs/authx-setup.md
deleted file mode 100644
index 58a486ee2..000000000
--- a/docs/authx-setup.md
+++ /dev/null
@@ -1,54 +0,0 @@
-# CanDIGv2 Authentication and Authorization Module
-
-## Components
-
-- Keycloak
-- Tyk
-
-## Deploy
-
-Make sure the relevant details in `.env` are correct.
-
-`make init-authx`
-
-## Clean
-
-`make clean-authx`
-
-## Adding New API
-
-Let's say the new API is called `example` and the route it redirects to us `http://example.org`.
-This section will help you figure out how to add the details to the setup.
-
-- Copy an API template file like `lib/tyk/configuration_templates/api_candig.json.tpl` and give it a name.
- e.g. api_example.json.tpl
-- Change the appropriate pieces inside this template.
-- Add the following variables to your environment file `.env`
- ```
- TYK_EXAMPLE_API_ID=666
- TYK_EXAMPLE_API_NAME=Example
- TYK_EXAMPLE_API_SLUG=example
- TYK_EXAMPLE_API_TARGET=http://example.org
- TYK_EXAMPLE_API_LISTEN_PATH=example
- ```
- See section `## Extra APIs can be added here`
-- Add the path in the `SESSION_ENDPOINTS` array.
If you fail to add proper paths, then your application
- will not redirect to login page properly.
-- Add the new section of the API to `lib/tyk/configuration_templates/policies.json.tpl` under
- the key `access_rights`
-- Add the new section of the API to `lib/tyk/tyk_key_generation.sh` under
- the key `access_rights`
-- Add the new line to copy the file to the image in the `lib/tyk/Dockerfile`
-- Add the new line to `envsubst` in `lib/tyk/tyk_setup.sh` (see section `# Extra APIs can be added here`)
-- Redeploy the container OR use Tyk API (TODO: ask Jimmy about this)
-- Regenerate the key. `lib/tyk/tyk_key_generation.sh` has clues for now.
- If the environment variables needed by the `tyk_key_generation.sh` are set, then the script should work
-
-## Technical Debt Notes
-
-- This setup is flaky at best because of a myriad of styles used:
-- Tyk's setup adds a `tmp` directory inside the lib/tyk which is sad because it deviates
- from the repo's setup of a global `tmp` directory.
-- Tyk's setup does not have a way via this repo/make to add new APIs or call key requests etc.
-- Keycloak's setup `curl`s APIs with `-k` option which is insecure.
-- Add commands to add new APIs (see above).
\ No newline at end of file
diff --git a/docs/authz-permissions.md b/docs/authz-permissions.md
deleted file mode 100644
index 78fe9b30a..000000000
--- a/docs/authz-permissions.md
+++ /dev/null
@@ -1,188 +0,0 @@
-# Authorization Permissions
-
-JWTs that are sent by Vault contain permissions for OPA to act on. This
-document notes the details of the structure of the JWTs and other related
-information.
-
-## JWT at a glance - the end goal?
-
-We are not sure yet if our structure will be absolutely identical to ga4gh.
-
-- `iss`: Issuer
-- `iat`: Issued time
-- `exp`: Expiration time
-- `jti`: Token ID
-- `sub`: Suject ID (TODO: how to link to IdP's ID)
-- `ga4gh_passport_v1`: List of the visas, could easily just be one item
- for now but the idea is to use GA4GH terminology.
Child `ga4gh_visa_v1`.
- - `ga4gh_visa_v1`: a JWT with items
- - `type`: Either of these (there are more but we need these)
- - `LinkedIdentities`: [LinkedIdentities, GA4GH]
- - `ControlledAccessGrants`: [ControlledAccessGrants, GA4GH]
- - `value`: A string that represents any of the scope,
- process, identifier and version of the assertion. The format of
- the string can vary by the Passport Visa Type.
- For CanDIG, we perhaps need the value to be the dataset ID and
- level of the access. (TODO: data spec from Ksenia will help)
- - `source`: A URL Field that provides at a minimum the
- organization that made the assertion. If there is no organization making
- the assertion, the "source" MUST be set to "https://no.organization".
- - `asserted`: time at assertion
- - `by`; [by field, GA4GH] (TODO: specify who? self? dac?)
- - `iss`:
- - `sub`:
- - `iat`:
- - `exp`:
-
-
-## Draft v003
-
-The most basic version with only dataset level permissions.
-
-### Data model at the time of this writing
-
-- Project top level has
- - One or more Datasets
- - One or more Phenopackets (+ some extra data)
- - One or more Patients
-
-```json
-{
- "aud": "cq_candig",
- "exp": 1603988812,
- "iat": 1603902412,
- "iss": "/v1/identity/oidc",
-
- "ga4gh_passport_v1": {
- "ga4gh_visa_v1": {
- "type": "ControlledAccessGrants",
- "value": {
- "dataset1234": {
- "access": 4
- }
- }
- }
- },
-
- "sub": "b6a4b63c...9a7a247db34f"
-}
-```
-
-## Draft v002
-
-Includes basics of v001 but untroduces association with GA4GH.
-
-- Not using JWT within a JWT structure
-- `value` is not a string (as a result)
-- `level` is for the entire database unless `entities` is listed
-- `entities` is a list of URI spec'd strings where
- - `namespace` is a specific table or schema or version
- - uri paths can contain wild cards
-
-
-```json
-{
- "aud": "cq_candig",
- "exp": 1603988812,
- "iat": 1603902412,
- "iss": "/v1/identity/oidc",
- "namespace": "root",
-
- "ga4gh_passport_v1": {
- "ga4gh_visa_v1": {
- "type": "ControlledAccessGrants",
- "value": {
- "dataset1234": {
- "level": 4,
- "entities": [
- "namespace://path/to/entity1",
- "namespace2://path/to/entity2/*"
- ]
- }
- }
- }
- },
-
- "sub": "b6a4b63c...9a7a247db34f"
-}
-```
-
-## Draft v001
-
-Trying to keep things minimal for the first draft we will focus on a few
-options -
-
-1. Core JWT claims (fields) that are important security wise
-2. Reduced focus on row/column level permissions
-
-The payload may look like -
-
-```json
-{
- "aud": "cq_candig", // set by `allowed_client_ids`
- "exp": 1603988812,
- "iat": 1603902412,
- "iss": "/v1/identity/oidc",
- "namespace": "root",
- "permissions": { //depends on Vault Role template
- "dataset123": "4"
- },
- "sub": "b6a4b63c...9a7a247db34f" //Entity ID in Vault (requester)
-}
-```
-
-### Template
-
-The following part depends on the role's template -
-
-```json
-"permissions": {
- "dataset123": "4"
-},
-```
-
-The template for that role could look like -
-
-```json
-{
- "key": "test-key",
- "client_id": "cq_candig",
- "template": "{\"permissions\": {{identity.entity.metadata}}}"
-}
-```
-
-This tells Vault to fetch all the key-values in metadata
-field of the entity.
-
-### Audience `aud`
-
-The `aud` claim is set by the command which creates the `key`
-
-```json
-vault write identity/oidc/key/test-key -<` to any timeseries scraped from this config.
- - job_name: 'cnv_service'
-
- # Override the global default and scrape targets from this job every 5 seconds.
- scrape_interval: 5s
-
- static_configs:
- - targets: ['127.0.0.0:3000']
-```
-
-The above example will configure Prometheus to monitor the service running on http//127.0.0.0:3000 by reading its `/metrics` endpoint.
-
-You can read more about how to configure Prometheus at Prometheus' [webpage](https://prometheus.io/docs/prometheus/latest/getting_started/).
-
-To set the `/metrics` endpoint each framework has its own way on configuring it. Below we describe how configure it using Flask and DJango frameworks.
-
-## For Flask Applications
-
-In order to expose the `/metrics` endpoint of Flask Application, the below steps must be followed:
-
-- Install `prometheus_flask_exporter` library from Pypi
-- On the file where you instantiate your Flask app, import `PrometheusMetrics` and add your app:
-
-```python
-from prometheus_flask_exporter import PrometheusMetrics
-
-app = Flask(__name__)
-metrics = PrometheusMetrics(app)
-```
-
-That’s really it! By adding an import and a line to initialize PrometheusMetrics you’ll get request duration metrics and request counters exposed on the `/metrics` endpoint of the Flask application it’s registered on.
-
-These are the basics configuration, for more information please visit Prometheus Flask exporter Github's [page](https://github.com/rycus86/prometheus_flask_exporter)
-
-## For DJango Applications
-
-In order to expose the `/metrics` endpoint of DJango Application, the below steps must be followed:
-
-- Install `django-prometheus` library from Pypi
-- In your `settings.py`:
-
-```python
-INSTALLED_APPS = (
- ...
- 'django_prometheus',
- ...
-)
-
-MIDDLEWARE_CLASSES = (
- 'django_prometheus.middleware.PrometheusBeforeMiddleware',
- # All your other middlewares go here, including the default
- # middlewares like SessionMiddleware, CommonMiddleware,
- # CsrfViewmiddleware, SecurityMiddleware, etc.
- 'django_prometheus.middleware.PrometheusAfterMiddleware',
-)
-```
-
-In your `urls.py`:
-
-```python
-urlpatterns = [
- ...
- path('', include('django_prometheus.urls')),
-]
-```
-
-By adding middlewares and an url you’ll get request duration metrics and request counters exposed on the /metrics endpoint of the DJango application.
-
-These are the basics configuration, for more information please visit django-prometheus Github's [webpage](https://github.com/korfuri/django-prometheus)
-
-## Starting Prometheus service
-
-Once all the configuration is done you may start Prometheus by running one of the following commands.
-
-- `make tox-prometheus`
-- `make conda-prometheus`
diff --git a/docs/configure-vault.md b/docs/configure-vault.md
deleted file mode 100644
index e5f09636f..000000000
--- a/docs/configure-vault.md
+++ /dev/null
@@ -1,164 +0,0 @@
-# Setup
-
-Disclaimer: These are notes that were taken to build a PoC in a dev environment. To read the steps taken to provision
-the user entitlements dynamically, see the `Makefile`'s `compose-authx-setup` command, and `./etc/setup/scripts/subtasks/vault_setup.sh`
-
-## In dev
-
-More convenient to listen on 0.0.0.0 (will include IPs and custom hosts,
-since the client calling are inside docker containers).
-
-```
-vault server -dev -dev-listen-address=0.0.0.0:8200 -log-level=debug
-```
-
-Load env variables for convenience (the token is included in the server
-output).
-
-```
-export VAULT_ADDR='http://127.0.0.1:8200'
-export VAULT_TOKEN="..."
-```
-
-May also need to enable audit to see the logs...
-
-```
-vault audit enable file file_path=/tmp/vault-audit.log
-```
-
-## Configure vault to accept JWT tokens
-
-Enable the plugin
-
-```
-vault auth enable jwt
-```
-
-Next, we need a policy (to give us access to some paths in vault):
-
-```
-vault policy write tyk vault-policy.hcl
-```
-
-In vault-policy.hcl there is:
-```
-path "identity/oidc/token/*" {
- capabilities = ["create", "read"]
-}
-```
-
-Then we need a role to associate users authenticating with OIDC, here's
-some more info regarding these fields:
-
- - user_claim: The name of the OIDC claim used to match with vault entities
-(might change depending on your provider)
- - bound_audiences: Correspond to the OIDC client ID
-
-```
-vault write auth/jwt/role/researcher \
- user_claim=preferred_username \
- bound_audiences=cq_candig \
- role_type=jwt \
- policies=tyk \
- ttl=1h
-```
-
-Next we need to provide some information for vault to validate the JWT token.
-Again the OIDC discovery url will change based on the provider, the url down below
-is based on a setup such as with candig_compose & Keycloak.
-
-If you are using a different OIDC provider, be aware that vault will want the
-URL without "/.well-known/openid-configuration/".
-
-```
-vault write auth/jwt/config \
- oidc_discovery_url="http://$CANDIG_DOMAIN_NAME:8081/auth/realms/candig" \
- bound_issuer="http://$CANDIG_DOMAIN_NAME:8081/auth/realms/candig" \
- default_role="researcher"
-```
-
-Warning!
It might happen, when writing and overwriting such configuration,
-that the command line tool will output errors like:
-
-```
-Failed to parse K=V data: invalid key/value pair " ": format must be key=value
-```
-
-In which case you can also write config into vault with such a format:
-```
-vault write auth/jwt/config -<
Date: Thu, 22 Aug 2024 18:23:54 -0700
Subject: [PATCH 02/40] minor edits

---
 docs/backing-up-and-restoring-candig.md | 44 ++++++++++++++-----------
 1 file changed, 25 insertions(+), 19 deletions(-)

diff --git a/docs/backing-up-and-restoring-candig.md b/docs/backing-up-and-restoring-candig.md
index 6a5c25878..c0696764b 100644
--- a/docs/backing-up-and-restoring-candig.md
+++ b/docs/backing-up-and-restoring-candig.md
@@ -1,36 +1,41 @@
-# Backing up and restoring CanDIG
+# Backing up and restoring CanDIG data

-There are two kinds of data stored in CanDIG that we recommend saving backup copies regularly.
-1. Clinical and genomic data stored in CanDIGs's postgres databases
+There are two kinds of data stored in CanDIG that we recommend backing up regularly.
+1. Clinical and Genomic metadata stored in CanDIGs's postgres databases
 1. Authorization data stored in vault that details user's authorization to access/edit ingested data

-We recommend taking back ups after each ingest event and to store your one or more copies of your backups on a separate secure server to your CanDIG installation. We also recommend encrypting your backup so that it cannot be accessed by an unauthorizaed user.
+We recommend taking back ups after each ingest event and to store one or more copies of your backups on a separate secure server from your CanDIG installation. We also recommend encrypting your backup so that it cannot be accessed by an unauthorizaed user.

 ## Backing up postgres databases

-Both clinical and genomic metadata are stored within databases running in the postgres container.
To backup the data stored in these databases: +Both clinical and genomic metadata are stored within databases running in the postgres container `metadata-db`. -1. Open a terminal inside the running postgres docker container with: +The commands below assume that you are connected to the machine that is hosting the dockerized CanDIGv2 stack. + +To backup the data stored in these databases: + +1. Open an interactive terminal inside the running postgres docker container with: ```bash docker exec -it candigv2_metadata-db_1 bash ``` -1. Dump contents of the two databases to file. `-d` specifies the database to dump, `-f` specifies the filename. Below we use the date and the name of the database being backed up: +1. Dump contents of the two databases to files. `-d` specifies the database to dump, `-f` specifies the filename. Below we use the date and the name of the database being backed up: ```bash pg_dump -U admin -d genomic -f yyyy-mm-dd-genomic-backup.sql pg_dump -U admin -d metadata -f yyyy-mm-dd-clinical-backup.sql ``` -You should then have two files with a complete copy of the sql databases. +You should then have two files, each with a complete copy of each of the databases. + You can now exit the container by entering ```bash exit ``` -You should copy these to a secure location outside of the running container and consider encrypting them or otherwise ensuring that unauthorized users will not have access to the information. To copy from the container on to the docker host you can use a command similar to: +You should copy these to a secure location outside of the running container and consider encrypting them or otherwise ensuring that unauthorized users will not have access to the information. 
To copy from the container on to the docker host, you can use a command similar to:

 ```bash
 docker cp candigv2_metadata-db_1:yyyy-mm-dd-genomic-backup.sql /desired/path/target
@@ -39,35 +44,37 @@ docker cp candigv2_metadata-db_1:yyyy-mm-dd-clinical-backup.sql /desired/path/ta

 ## Restoring postgres databases

-To restore the databases that we have backed up, assuming you have the CanDIG stack up and running 1. Stop the running katsu and htsget containers which are connected to the database
+To restore the databases that we have backed up, assuming you have the CanDIG stack up and running
+
+1. Stop the running katsu and htsget containers which are connected to the databases

 ```bash
 docker stop candigv2_katsu_1
 docker stop candigv2_htsget_1
 ```

-1. Then we copy the `sql` backup files into the running postgres container
+1. Then we need to copy the `sql` backup files into the running postgres container

 ```bash
 docker cp /path/to/backup/yyyy-mm-dd-genomic-backup.sql candigv2_metadata-db_1:/yyyy-mm-dd-genomic-backup.sql
 docker cp /path/to/backup/yyyy-mm-dd-clinical-backup.sql candigv2_metadata-db_1:/yyyy-mm-dd-clinical-backup.sql
 ```

-1. Next we need to delete the initialized databases so we can replace them with the backed up versions.
+Next we need to delete the initialized databases so we can replace them with the backed up versions.

-First, open an interactive terminal to the postgres container
+1. Open an interactive terminal to the postgres container

 ```bash
 docker exec -it candigv2_metadata-db_1 bash
 ```

-Then connect to a database other than the ones we want to drop with
+1. Then connect to the psql commandline prompt with a database other than the ones we want to drop:

 ```bash
 psql -U admin -d template1
 ```

-Then drop the two existing databases, create empty replacement databases and exit the psql commandline prompt
+1.
Then drop the two existing databases, create empty replacement databases then quit the psql commandline prompt

 ```bash
 DROP DATABASE metadata;
@@ -77,16 +84,16 @@ CREATE DATABASE genomic;
 \q
 ```

-1. Then load the backed up copies from file with these commands:
+1. Load the backed up copies from file with these commands:

 ```bash
 psql -U admin -d metadata < yyyy-mm-dd-clinical-backup.sql
 psql -U admin -d genomic < yyyy-mm-dd-genomic-backup.sql
 ```

-Then exit the interactive terminal with the `exit` command.
+1. Exit the interactive terminal with the `exit` command.

-1. Then restart the katsu and htsget services
+1. Restart the katsu and htsget services

 ```bash
 docker start candigv2_katsu_1
@@ -98,4 +105,3 @@ You should be able to see the restored data in the data portal.
 ## Backing up Authorization data


-
From b3c4b26f6dea9c84dacf8f7f0472836ab93c3bf5 Mon Sep 17 00:00:00 2001
From: Marion
Date: Mon, 26 Aug 2024 11:07:51 -0700
Subject: [PATCH 03/40] update names

---
 docs/backing-up-and-restoring-candig.md | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/docs/backing-up-and-restoring-candig.md b/docs/backing-up-and-restoring-candig.md
index c0696764b..05b4944b8 100644
--- a/docs/backing-up-and-restoring-candig.md
+++ b/docs/backing-up-and-restoring-candig.md
@@ -8,7 +8,7 @@ We recommend taking back ups after each ingest event and to store one or more co

 ## Backing up postgres databases

-Both clinical and genomic metadata are stored within databases running in the postgres container `metadata-db`.
+Both clinical and genomic metadata are stored within databases running in the postgres container `postgres-db`.

 The commands below assume that you are connected to the machine that is hosting the dockerized CanDIGv2 stack.

@@ -17,14 +17,14 @@ To backup the data stored in these databases:

 1.
Open an interactive terminal inside the running postgres docker container with:

 ```bash
-docker exec -it candigv2_metadata-db_1 bash
+docker exec -it candigv2_postgres-db_1 bash
 ```

 1. Dump contents of the two databases to files. `-d` specifies the database to dump, `-f` specifies the filename. Below we use the date and the name of the database being backed up:

 ```bash
 pg_dump -U admin -d genomic -f yyyy-mm-dd-genomic-backup.sql
-pg_dump -U admin -d metadata -f yyyy-mm-dd-clinical-backup.sql
+pg_dump -U admin -d clinical -f yyyy-mm-dd-clinical-backup.sql
 ```

 You should then have two files, each with a complete copy of each of the databases.
@@ -38,8 +38,8 @@ exit
 You should copy these to a secure location outside of the running container and consider encrypting them or otherwise ensuring that unauthorized users will not have access to the information. To copy from the container on to the docker host, you can use a command similar to:

 ```bash
-docker cp candigv2_metadata-db_1:yyyy-mm-dd-genomic-backup.sql /desired/path/target
-docker cp candigv2_metadata-db_1:yyyy-mm-dd-clinical-backup.sql /desired/path/target
+docker cp candigv2_postgres-db_1:yyyy-mm-dd-genomic-backup.sql /desired/path/target
+docker cp candigv2_postgres-db_1:yyyy-mm-dd-clinical-backup.sql /desired/path/target
 ```

 ## Restoring postgres databases
@@ -56,8 +56,8 @@ docker stop candigv2_htsget_1

 1.
Then we need to copy the `sql` backup files into the running postgres container

 ```bash
-docker cp /path/to/backup/yyyy-mm-dd-genomic-backup.sql candigv2_metadata-db_1:/yyyy-mm-dd-genomic-backup.sql
-docker cp /path/to/backup/yyyy-mm-dd-clinical-backup.sql candigv2_metadata-db_1:/yyyy-mm-dd-clinical-backup.sql
+docker cp /path/to/backup/yyyy-mm-dd-genomic-backup.sql candigv2_postgres-db_1:/yyyy-mm-dd-genomic-backup.sql
+docker cp /path/to/backup/yyyy-mm-dd-clinical-backup.sql candigv2_postgres-db_1:/yyyy-mm-dd-clinical-backup.sql
 ```

 Next we need to delete the initialized databases so we can replace them with the backed up versions.
@@ -65,7 +65,7 @@ Next we need to delete the initialized databases so we can replace them with the
 1. Open an interactive terminal to the postgres container

 ```bash
-docker exec -it candigv2_metadata-db_1 bash
+docker exec -it candigv2_postgres-db_1 bash
 ```

 1. Then connect to the psql commandline prompt with a database other than the ones we want to drop:
@@ -77,8 +77,8 @@ psql -U admin -d template1
 1. Then drop the two existing databases, create empty replacement databases then quit the psql commandline prompt

 ```bash
-DROP DATABASE metadata;
-CREATE DATABASE metadata;
+DROP DATABASE clinical;
+CREATE DATABASE clinical;
 DROP DATABASE genomic;
 CREATE DATABASE genomic;
 \q
@@ -87,7 +87,7 @@ CREATE DATABASE genomic;

 1.
Load the backed up copies from file with these commands:

 ```bash
-psql -U admin -d metadata < yyyy-mm-dd-clinical-backup.sql
+psql -U admin -d clinical < yyyy-mm-dd-clinical-backup.sql
 psql -U admin -d genomic < yyyy-mm-dd-genomic-backup.sql
 ```

From 88e9b00f18c17e9b6db0c852ce6abba1931d6134 Mon Sep 17 00:00:00 2001
From: Marion
Date: Fri, 6 Sep 2024 10:26:53 -0700
Subject: [PATCH 04/40] update doc

---
 docs/install-candig.md | 59 +++++-------------------------------------
 lib/htsget/htsget_app  |  2 +-
 2 files changed, 7 insertions(+), 54 deletions(-)

diff --git a/docs/install-candig.md b/docs/install-candig.md
index 47b192028..1955dcaa3 100644
--- a/docs/install-candig.md
+++ b/docs/install-candig.md
@@ -15,6 +15,12 @@ Docker Engine (also known as Docker CE) is recommened over Docker Desktop for li

 Note that CanDIG requires **Docker Compose v2**, which is provided alongside the latest version of Docker Engine. Versions of Docker which do not provide Docker Compose will unfortunately not work with CanDIG.

+## Resource requirements
+
+We have successfully run and installed the CanDIGv2 stack on VMs with 4 CPUs and 8GB of memory.
+
+We recommend giving Docker at least 4 CPUs and 4GB of memory.
+
 ## Install OS Dependencies


@@ -307,33 +313,6 @@ If you can see the data portal at http://candig.docker.internal:5080/, your inst

 Confirm your installation with the [automatic tests](/docs/ingest-and-test.md).

-### Old
-The `init-docker` command will initialize CanDIGv2 and set up docker networks, volumes, configs, secrets, and perform other miscellaneous actions needed before deploying a CanDIGv2 stack. Running `init-docker` will override any previous configurations and secrets.
-
-```bash
-# initialize docker environment
-make init-docker
-
-## Do one of the following:
-# pull latest CanDIGv2 images:
-make docker-pull
-
-# or build images:
-make build-images
-
-# deploy stack
-make compose
-make init-authx # If this command fails, try the #update-firewall section of this Markdown file
-
-# Specific cached modules may be out of date, so to disable caching for a specific module, add BUILD_OPTS='--no-cache' at the end of make like so:
-# make build-htsget-server BUILD_OPTS='--no-cache'
-# make compose-htsget-server
-# make build-% and compose-% will work for any folder name in lib/
-
-# TODO: post deploy auth configuration
-
-```
-
 ## Update Firewall

 If the command still fails, it may be necessary to disable your local firewall, or edit it to allow requests from all ports used in the Docker stack.
@@ -347,32 +326,6 @@ sudo ufw allow from $DOCKER_BRIDGE_IP to

 Re-run `make clean-authx` and `make init-authx` and it should work.

-## Cleanup CanDIGv2 Compose Environment
-
-Use the following steps to clean up running CanDIGv2 services in a docker-compose configuration. Note that these steps are destructive and will remove **ALL** containers, secrets, volumes, networks, certs, and images. If you are using docker in a shared environment (i.e. with other non-CanDIGv2 containers running) please consider running the cleanup steps manually instead.
-
-The following steps are performed by `make clean-all`:
-
-```bash
-# 1. stop and remove running stacks
-make clean-compose
-
-# 2. stop and remove remaining containers
-make clean-containers
-
-# 3. remove all configs/secrets from docker and local dir
-make clean-secrets
-
-# 4. remove all docker volumes and local data dir
-make clean-volumes
-
-# 5. delete all cached images
-make clean-images
-
-# 6. remove bin dir (inlcuding miniconda)
-make clean-bin
-```
-
 ## For Apple Silicon

 ### 1.
Install OS Dependencies
diff --git a/lib/htsget/htsget_app b/lib/htsget/htsget_app
index 2871e8ed3..abadc3eaa 160000
--- a/lib/htsget/htsget_app
+++ b/lib/htsget/htsget_app
@@ -1 +1 @@
-Subproject commit 2871e8ed36af57943e55c38af1b203b29111258d
+Subproject commit abadc3eaa2eb0ef14c8a24fa324e7e1aa538a579

From a8b7bd557dc71c838b00ce9b215cc01325d2533b Mon Sep 17 00:00:00 2001
From: Marion
Date: Wed, 11 Sep 2024 04:33:59 +1000
Subject: [PATCH 05/40] add interact doc

---
 docs/interact-with-the-stack.md | 82 +++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)
 create mode 100644 docs/interact-with-the-stack.md

diff --git a/docs/interact-with-the-stack.md b/docs/interact-with-the-stack.md
new file mode 100644
index 000000000..c4e7350ad
--- /dev/null
+++ b/docs/interact-with-the-stack.md
@@ -0,0 +1,82 @@
+# Interacting with the stack using Make
+
+The [Makefile](Makefile) contains a number of make targets that make interacting the stack more user-friendly. All Makefile commands need to be run from the root directory of the CanDIGv2 repo.
+
+
+## Stopping services
+
+All services can be stopped with:
+
+```bash
+make stop-all
+```
+
+Individual services can be stopped using the docker command:
+```bash
+
+```
+
+## Starting services
+
+Logging must be started first, postgres should be started before any relying services
+
+## Cleaning and rebuilding individual services
+
+Any individual services can be cleaned with:
+
+```bash
+make clean-
+```
+
+for example:
+
+```bash
+make clean-htsget
+```
+
+This stops the container, deletes the container and deletes the image (does it delete the actual data in postgres though?)
+
+> [!NOTE]
+>
+
+## Non-destructive Rebuild
+
+
+To rebuild the CanDIGv2 without destroying data in postgres or keycloak the make target `rebuild-keep-data` with:
+
+```bash
+make rebuild-keep-data
+```
+
+> [!NOTE]
+> If there are changes that have changed the structure of the database or impacted the versions of other CANDIG_DATA_MODULES this way of rebuilding cannot be used.
+
+## Destructive Cleanup
+
+Use the following steps to clean up running CanDIGv2 services in a docker-compose configuration.
+
+> [!CAUTION]
+> Note that these steps are destructive and will remove **ALL** containers, secrets, volumes, networks, certs, and images. If you are using docker in a shared environment (i.e. with other non-CanDIGv2 containers running) please consider running the cleanup steps manually instead.
+
+The following steps are performed by `make clean-all`:
+
+```bash
+# 1. stop and remove running stacks
+make clean-compose
+
+# 2. stop and remove remaining containers
+make clean-containers
+
+# 3. remove all configs/secrets from docker and local dir
+make clean-secrets
+
+# 4. remove all docker volumes and local data dir
+make clean-volumes
+
+# 5.
delete all cached images
+make clean-images
+```
+
+## Rebuild stack from scratch
+
+

From f4f008b33c03a20de3c283a4b86227a6233df7b2 Mon Sep 17 00:00:00 2001
From: Marion
Date: Wed, 11 Sep 2024 15:50:19 +1000
Subject: [PATCH 06/40] update/add docs

---
 ...ing-candig.md => backup-restore-candig.md} |   0
 docs/install-candig.md                        |   4 +-
 docs/interact-with-the-stack.md               | 107 +++++++++++++++--
 docs/production-candig.md                     | 113 ++++++++++++++++++
 4 files changed, 210 insertions(+), 14 deletions(-)
 rename docs/{backing-up-and-restoring-candig.md => backup-restore-candig.md} (100%)
 create mode 100644 docs/production-candig.md

diff --git a/docs/backing-up-and-restoring-candig.md b/docs/backup-restore-candig.md
similarity index 100%
rename from docs/backing-up-and-restoring-candig.md
rename to docs/backup-restore-candig.md
diff --git a/docs/install-candig.md b/docs/install-candig.md
index 1955dcaa3..e19a6c7a8 100644
--- a/docs/install-candig.md
+++ b/docs/install-candig.md
@@ -10,7 +10,6 @@ Docker Engine (also known as Docker CE) is recommended over Docker Desktop for l

 Note that CanDIG requires **Docker Compose v2**, which is provided alongside the latest version of Docker Engine. Versions of Docker which do not provide Docker Compose will unfortunately not work with CanDIG.

-
 Docker Engine (also known as Docker CE) is recommened over Docker Desktop for linux installations.

 Note that CanDIG requires **Docker Compose v2**, which is provided alongside the latest version of Docker Engine. Versions of Docker which do not provide Docker Compose will unfortunately not work with CanDIG.
@@ -21,6 +20,9 @@ We have successfully run and installed the CanDIGv2 stack on VMs with 4 CPUs and

 We recommend giving Docker at least 4 CPUs and 4GB of memory.

+## Production vs Development Environments
+
+CanDIG can be installed and deployed as below for development situations where no real data will ever be ingested into the system.
For critical differences in production deployments, please see the [Guide to CanDIG production deployments](production-candig.md). ## Install OS Dependencies diff --git a/docs/interact-with-the-stack.md b/docs/interact-with-the-stack.md index c4e7350ad..05d20a694 100644 --- a/docs/interact-with-the-stack.md +++ b/docs/interact-with-the-stack.md @@ -2,7 +2,6 @@ The [Makefile](Makefile) contains a number of make targets that make interacting the stack more user-friendly. All Makefile commands need to be run from the root directory of the CanDIGv2 repo. - ## Stopping services All services can be stopped with: @@ -13,19 +12,41 @@ make stop-all Individual services can be stopped using the docker command: ```bash - +docker container stop candigv2__1 +``` +eg. to stop the ingest container this would be: +``` +docker container stop candigv2_candig-ingest_1 ``` ## Starting services Logging must be started first, postgres should be started before any relying services +When all containers are stopped the following command can be used to start all CanDIGv2 containers + +``` +make start-all +``` + +To start a single container, the following docker command can be used: + +``` +docker container start candigv2__1 +``` +e.g. for the ingest container: +``` +docker container start candigv2_candig-ingest_1 +``` + ## Cleaning and rebuilding individual services +If any individual services are updated, they will need to be cleaned, rebuilt and recomposed. + Any individual services can be cleaned with: ```bash -make clean- +make clean- ``` for example: @@ -34,13 +55,30 @@ for example: make clean-htsget ``` -This stops the container, deletes the container and deletes the image (does it delete the actual data in postgres though?) +This stops the container, deletes the container and deletes the image. > [!NOTE] -> +> For services that use the postgres container to save data, i.e. 
htsget (genomic data) and katsu (clinical data), deleting and rebuilding the service will not delete the data in postgres. If there have been changes to the underlying database, the postgres database will need to be deleted and rebuilt. -## Non-destructive Rebuild +To rebuild and recompose a service first run: + +```bash +make build- +``` + +> [!NOTE] +> Containers that have an associated volume will need to have the volume rebuilt with `make docker-volumes` before the container can be composed successfully. + +Then compose the container with: + +```bash +make compose- +``` +> [!IMPORTANT] +> Some services can't be rebuilt individually without causing issues with the stack. If you are facing issues with modules related to auth, it is recommended to rebuild the entire stack to ensure everything is in sync. + +## Non-destructive Rebuild To rebuild the CanDIGv2 without destroying data in postgres or keycloak the make target `rebuild-keep-data` with: @@ -56,27 +94,70 @@ make rebuild-keep-data Use the following steps to clean up running CanDIGv2 services in a docker-compose configuration. > [!CAUTION] -> Note that these steps are destructive and will remove **ALL** containers, secrets, volumes, networks, certs, and images. If you are using docker in a shared environment (i.e. with other non-CanDIGv2 containers running) please consider running the cleanup steps manually instead. +> Note that these steps are destructive and will remove **ALL** logs, containers, secrets, volumes, networks, certs, and images. If you are using docker in a shared environment (i.e. with other non-CanDIGv2 containers running) please consider running the cleanup steps manually instead. The following steps are performed by `make clean-all`: ```bash -# 1. stop and remove running stacks +# 1. delete log files +make clean-logs + +# 2. stop and remove running stacks make clean-compose -# 2. stop and remove remaining containers +# 3.
stop and remove remaining containers make clean-containers -# 3. remove all configs/secrets from docker and local dir +# 4. remove all configs/secrets from docker and local dir make clean-secrets -# 4. remove all docker volumes and local data dir +# 5. remove all docker volumes and local data dir make clean-volumes -# 5. delete all cached images +# 6. delete all cached images make clean-images ``` -## Rebuild stack from scratch +See the [Makefile](../Makefile) for the exact commands that each of these targets runs. + +## Rebuild entire stack from scratch + +1. Perform any backups of data necessary if in a non-testing environment. (see [backup and restore doc](backup-restore-candig.md) for detailed instructions.) + +2. Clean up the current containers with `make clean-all` + +3. When complete, build all containers again with `make build-all` +## Troubleshooting + +### Conda env not activated + +If you get an error when running a make command, something like: + +``` +bash: python: command not found +``` +or an error message about `dotenv` not being found. + +Ensure the candig conda environment is activated in your terminal with `conda activate candig`. + +### docker volumes not remade + +If you get an error where after cleaning an individual service, when composing, it gets stuck at + +``` +waiting for x service to start ... +``` + +Use CTRL + c to exit the process then try running `make docker-volumes` and then try composing again with `make compose-` + +### No rule to make target + +It is common to move around within the repo and not realise where you are. If you try to run a make command and get the error + +``` +make: *** No rule to make target `clean-candig-ingest'. Stop. +``` + +Check to make sure you are in the root of the CanDIGv2 repo as the commands only work while in the same directory as the Makefile. 
diff --git a/docs/production-candig.md b/docs/production-candig.md new file mode 100644 index 000000000..12d981097 --- /dev/null +++ b/docs/production-candig.md @@ -0,0 +1,113 @@ +# Guide to CanDIG production deployments + +Apart from the basic steps in the [CanDIGv2 Install Guide](install-candig.md) to get the candig stack up and running, there are additional settings and security recommendations that need to set up in a production level environment. We provide the following as general advice, but it is important for all CanDIG deployers to also consult with their institutional infrastructure security personnel to ensure that their deployment meets the necessary level of data security. + +## Proxy + +It is essential to setup a proxy so that only specific ports are open to the internet. The software used for this is up to the deployer. + +Basically, the only ports that should be available are to tyk (5080) and keycloak (8080). + +Some specific examples of how existing institutes have approached this are below. + +### HAProxy - UHN & BCGSC + +At UHN, the candig.uhnresearch.ca domain is under a proxy, so requests to a specific service go through the following stack: + +``` +[ CLIENT ] ---> | HTTPS | ---> [ UHN PROXY ] ---> | HTTP | ---> [ CANDIG_PROD / TYK ] ---> | HTTP | ---> [ CANDIG_DATA_SERVICES ] +``` + +Specifically, the UHN proxy forwards all candig.uhnresearch.ca and candigauth.uhnresearch.ca requests (all ports) to candig1:5080 (tyk) and candig1:8080 (keycloak) respectively, thereby acting as a firewall. All CanDIGv2 microservices can only be accessed through Tyk. + +### nginx - C3G + +## Virtual Machine behind Virtual Private Network + +Any user that can access the VM where the CanDIG stack is running can access potentially private data. Users that have access to this VM should be strictly controlled to those users who are authorized to see any data that is ingested into the stack. 
One option is to use a VPN to ensure only those with access to the VPN can access the running VM. This strategy is currently being used at UHN and BCGSC. + +## .env settings + +The following default settings in the `.env` file should be changed when deploying CanDIG in a production environment: + +| default value | value in prod environment | function | +|-------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------|----------| +| `CANDIG_DEBUG_MODE=1` | `CANDIG_DEBUG_MODE=0` | | +| `CANDIG_PRODUCTION_MODE=0` | `CANDIG_PRODUCTION_MODE=1` | | +| `CANDIG_DOMAIN=candig.docker.internal` | `CANDIG_DOMAIN=` | | +| `CANDIG_AUTH_DOMAIN=candig.docker.internal` | `CANDIG_AUTH_DOMAIN=` | | +| `CANDIG_SITE_LOCATION=LOCAL` | `CANDIG_SITE_LOCATION=` | | +| `FEDERATION_SELF_SERVER`={see .env} | update id, province, province-code | | +| `KEYCLOAK_PUBLIC_PROTO=http` | `KEYCLOAK_PUBLIC_PROTO=https` | | +| `${KEYCLOAK_PUBLIC_PROTO}://${CANDIG_AUTH_DOMAIN}:${KEYCLOAK_PORT}` | `KEYCLOAK_PUBLIC_URL=${KEYCLOAK_PUBLIC_PROTO}://${CANDIG_AUTH_DOMAIN}` | | +| `KEYCLOAK_PRIVATE_URL=${KEYCLOAK_PRIVATE_PROTO}://${CANDIG_AUTH_DOMAIN}:${KEYCLOAK_PORT}` | `KEYCLOAK_PRIVATE_URL=${KEYCLOAK_PRIVATE_PROTO}://keycloak:${KEYCLOAK_PORT}` | | +| `TYK_LOGIN_TARGET_URL=http://${CANDIG_DOMAIN}:${TYK_SERVICE_PUBLIC_PORT}` | `TYK_LOGIN_TARGET_URL=https://${CANDIG_DOMAIN}` | | +| `TYK_USE_SSL=false` | `TYK_USE_SSL=true` | | +| `CANDIG_DATA_PORTAL_URL=http://${CANDIG_DOMAIN}:${CANDIG_DATA_PORTAL_PORT}/data-portal` | `CANDIG_DATA_PORTAL_URL=https://${CANDIG_DOMAIN}:${CANDIG_DATA_PORTAL_PORT}/data-portal` | | + +## Changing the default site admin + +When CanDIG is initially deployed, a `site_admin` user will be created by default. The username and password for this user can be found in the `env.sh` file. 
It is important to change this default to a real user who should have site administration privileges. + +1. Login to the data portal with the credentials you wish to make a site administrator to ensure the user can login successfully +2. Get a site admin token using the default site admin user: +```bash +source env.sh +``` + +```bash +CURL_OUTPUT=$(curl -s --request POST \ + --url $KEYCLOAK_PUBLIC_URL'/auth/realms/candig/protocol/openid-connect/token' \ + --header 'Content-Type: application/x-www-form-urlencoded' \ + --data grant_type=password \ + --data client_id=$CANDIG_CLIENT_ID \ + --data client_secret=$CANDIG_CLIENT_SECRET \ + --data username=$CANDIG_SITE_ADMIN_USER \ + --data password=$CANDIG_SITE_ADMIN_PASSWORD \ + --data scope=openid) +``` + +```bash +export TOKEN=$(echo $CURL_OUTPUT | grep -Eo 'access_token":"[a-zA-Z0-9._\-]+' | cut -d '"' -f3) +``` + +3. Set the role of the real user to a site admin with the following curl command: + +```bash +curl -X POST $CANDIG_URL'/ingest/site-role/admin/email/' -H 'Authorization: Bearer '$TOKEN +``` + +4. Check the role assignment was successful by verifying the following command returns `True`: + +```bash +curl -X GET $CANDIG_URL'/ingest/site-role/admin/email/' -H 'Authorization: Bearer '$TOKEN +``` + +5. 
Delete the default site admin user using your new real user site admin token + +```bash +CURL_OUTPUT=$(curl -s --request POST \ + --url $KEYCLOAK_PUBLIC_URL'/auth/realms/candig/protocol/openid-connect/token' \ + --header 'Content-Type: application/x-www-form-urlencoded' \ + --data grant_type=password \ + --data client_id=$CANDIG_CLIENT_ID \ + --data client_secret=$CANDIG_CLIENT_SECRET \ + --data username= \ + --data password= \ + --data scope=openid) +``` + +```bash +export TOKEN=$(echo $CURL_OUTPUT | grep -Eo 'access_token":"[a-zA-Z0-9._\-]+' | cut -d '"' -f3) +``` + +```bash +curl -X GET $CANDIG_URL'/ingest/site-role/admin/email/site_admin@test.ca' -H 'Authorization: Bearer '$TOKEN +``` + + +## Connecting Keycloak to instituional LDAP + + +## Federating with other CanDIG production instances + From 26d826514d3ae6e533b1400d83a9a20c3f0a45a1 Mon Sep 17 00:00:00 2001 From: Marion Date: Wed, 11 Sep 2024 15:55:30 +1000 Subject: [PATCH 07/40] add c3g --- docs/production-candig.md | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/docs/production-candig.md b/docs/production-candig.md index 12d981097..3de4b8fd2 100644 --- a/docs/production-candig.md +++ b/docs/production-candig.md @@ -2,11 +2,11 @@ Apart from the basic steps in the [CanDIGv2 Install Guide](install-candig.md) to get the candig stack up and running, there are additional settings and security recommendations that need to set up in a production level environment. We provide the following as general advice, but it is important for all CanDIG deployers to also consult with their institutional infrastructure security personnel to ensure that their deployment meets the necessary level of data security. -## Proxy +## Reverse Proxy It is essential to setup a proxy so that only specific ports are open to the internet. The software used for this is up to the deployer. -Basically, the only ports that should be available are to tyk (5080) and keycloak (8080). 
+Basically, the only ports that should be available are to tyk (443) and keycloak (80). Some specific examples of how existing institutes have approached this are below. @@ -20,7 +20,14 @@ At UHN, the candig.uhnresearch.ca domain is under a proxy, so requests to a spec Specifically, the UHN proxy forwards all candig.uhnresearch.ca and candigauth.uhnresearch.ca requests (all ports) to candig1:5080 (tyk) and candig1:8080 (keycloak) respectively, thereby acting as a firewall. All CanDIGv2 microservices can only be accessed through Tyk. -### nginx - C3G +### OpenStack security group & nginx - C3G + + OpenStack security group that allows access to ports 80 and 443 acts as a Firewall. + + nginx acts as a reverse proxy which: + 1. Re-routes http traffic to https + 2. Provides SSL certificates + 3. Routes ${CANDIG_DOMAIN} and ${CANDIG_AUTH_DOMAIN} http[s] traffic from outside to the appropriate microservice (tyk or keycloak respectively) and port. ## Virtual Machine behind Virtual Private Network From 94744d0f3848252f20e93e05a77a531c11c332ba Mon Sep 17 00:00:00 2001 From: Marion Date: Wed, 11 Sep 2024 16:04:28 +1000 Subject: [PATCH 08/40] smaller table --- docs/production-candig.md | 32 ++++++++++++++++---------------- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/docs/production-candig.md b/docs/production-candig.md index 3de4b8fd2..6e7dc1069 100644 --- a/docs/production-candig.md +++ b/docs/production-candig.md @@ -2,9 +2,9 @@ Apart from the basic steps in the [CanDIGv2 Install Guide](install-candig.md) to get the candig stack up and running, there are additional settings and security recommendations that need to set up in a production level environment. We provide the following as general advice, but it is important for all CanDIG deployers to also consult with their institutional infrastructure security personnel to ensure that their deployment meets the necessary level of data security. 
-## Reverse Proxy +## Reverse Proxy & Firewall -It is essential to setup a proxy so that only specific ports are open to the internet. The software used for this is up to the deployer. +It is essential to setup a reverse proxy and firewall so that only specific ports are open to the internet. The software used for this is up to the deployer and is considered outside of the CanDIG stack. Basically, the only ports that should be available are to tyk (443) and keycloak (80). @@ -37,20 +37,20 @@ Any user that can access the VM where the CanDIG stack is running can access pot The following default settings in the `.env` file should be changed when deploying CanDIG in a production environment: -| default value | value in prod environment | function | -|-------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------|----------| -| `CANDIG_DEBUG_MODE=1` | `CANDIG_DEBUG_MODE=0` | | -| `CANDIG_PRODUCTION_MODE=0` | `CANDIG_PRODUCTION_MODE=1` | | -| `CANDIG_DOMAIN=candig.docker.internal` | `CANDIG_DOMAIN=` | | -| `CANDIG_AUTH_DOMAIN=candig.docker.internal` | `CANDIG_AUTH_DOMAIN=` | | -| `CANDIG_SITE_LOCATION=LOCAL` | `CANDIG_SITE_LOCATION=` | | -| `FEDERATION_SELF_SERVER`={see .env} | update id, province, province-code | | -| `KEYCLOAK_PUBLIC_PROTO=http` | `KEYCLOAK_PUBLIC_PROTO=https` | | -| `${KEYCLOAK_PUBLIC_PROTO}://${CANDIG_AUTH_DOMAIN}:${KEYCLOAK_PORT}` | `KEYCLOAK_PUBLIC_URL=${KEYCLOAK_PUBLIC_PROTO}://${CANDIG_AUTH_DOMAIN}` | | -| `KEYCLOAK_PRIVATE_URL=${KEYCLOAK_PRIVATE_PROTO}://${CANDIG_AUTH_DOMAIN}:${KEYCLOAK_PORT}` | `KEYCLOAK_PRIVATE_URL=${KEYCLOAK_PRIVATE_PROTO}://keycloak:${KEYCLOAK_PORT}` | | -| `TYK_LOGIN_TARGET_URL=http://${CANDIG_DOMAIN}:${TYK_SERVICE_PUBLIC_PORT}` | `TYK_LOGIN_TARGET_URL=https://${CANDIG_DOMAIN}` | | -| `TYK_USE_SSL=false` | `TYK_USE_SSL=true` | | -| 
`CANDIG_DATA_PORTAL_URL=http://${CANDIG_DOMAIN}:${CANDIG_DATA_PORTAL_PORT}/data-portal` | `CANDIG_DATA_PORTAL_URL=https://${CANDIG_DOMAIN}:${CANDIG_DATA_PORTAL_PORT}/data-portal` | | +| value in prod environment | function | +|------------------------------------------------------------------------------------------|----------| +| `CANDIG_DOMAIN=` | | +| `CANDIG_AUTH_DOMAIN=` | | +| `CANDIG_DEBUG_MODE=0` | | +| `CANDIG_PRODUCTION_MODE=1` | | +| `CANDIG_SITE_LOCATION=`update to your site location | | +| `FEDERATION_SELF_SERVER` - update id, province, province-code | | +| `KEYCLOAK_PUBLIC_PROTO=https` | | +| `KEYCLOAK_PUBLIC_URL=${KEYCLOAK_PUBLIC_PROTO}://${CANDIG_AUTH_DOMAIN}` | | +| `KEYCLOAK_PRIVATE_URL=${KEYCLOAK_PRIVATE_PROTO}://keycloak:${KEYCLOAK_PORT}` | | +| `TYK_LOGIN_TARGET_URL=https://${CANDIG_DOMAIN}` | | +| `TYK_USE_SSL=true` | | +| `CANDIG_DATA_PORTAL_URL=https://${CANDIG_DOMAIN}:${CANDIG_DATA_PORTAL_PORT}/data-portal` | | ## Changing the default site admin From b23394658a59af6a12f080afd5241154f41fc62f Mon Sep 17 00:00:00 2001 From: Marion Date: Wed, 11 Sep 2024 16:05:37 +1000 Subject: [PATCH 09/40] fix typo --- docs/production-candig.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/production-candig.md b/docs/production-candig.md index 6e7dc1069..3a6e8148b 100644 --- a/docs/production-candig.md +++ b/docs/production-candig.md @@ -113,7 +113,7 @@ curl -X GET $CANDIG_URL'/ingest/site-role/admin/email/site_admin@test.ca' -H 'Au ``` -## Connecting Keycloak to instituional LDAP +## Connecting Keycloak to institutional LDAP ## Federating with other CanDIG production instances From 79cbe5e91198a2984baa1888a9190074c31c85a8 Mon Sep 17 00:00:00 2001 From: Marion Date: Wed, 11 Sep 2024 16:06:48 +1000 Subject: [PATCH 10/40] edit for consistency --- docs/production-candig.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/production-candig.md b/docs/production-candig.md index 3a6e8148b..6a0ed421f 100644 --- 
a/docs/production-candig.md +++ b/docs/production-candig.md @@ -57,7 +57,9 @@ The following default settings in the `.env` file should be changed when deployi When CanDIG is initially deployed, a `site_admin` user will be created by default. The username and password for this user can be found in the `env.sh` file. It is important to change this default to a real user who should have site administration privileges. 1. Login to the data portal with the credentials you wish to make a site administrator to ensure the user can login successfully + 2. Get a site admin token using the default site admin user: + ```bash source env.sh ``` @@ -81,7 +83,7 @@ export TOKEN=$(echo $CURL_OUTPUT | grep -Eo 'access_token":"[a-zA-Z0-9._\-]+' | 3. Set the role of the real user to a site admin with the following curl command: ```bash -curl -X POST $CANDIG_URL'/ingest/site-role/admin/email/' -H 'Authorization: Bearer '$TOKEN +curl -X POST $CANDIG_URL'/ingest/site-role/admin/email/' -H 'Authorization: Bearer '$TOKEN ``` 4. Check the role assignment was successful by verifying the following command returns `True`: From bc44ae311182a4571573485901c16e71ca076015 Mon Sep 17 00:00:00 2001 From: Marion Date: Wed, 11 Sep 2024 16:12:38 +1000 Subject: [PATCH 11/40] add logs to backup --- docs/backup-restore-candig.md | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/docs/backup-restore-candig.md b/docs/backup-restore-candig.md index 05b4944b8..1c334c3b6 100644 --- a/docs/backup-restore-candig.md +++ b/docs/backup-restore-candig.md @@ -1,10 +1,13 @@ # Backing up and restoring CanDIG data -There are two kinds of data stored in CanDIG that we recommend backing up regularly. +There are three kinds of data stored in CanDIG that we recommend backing up regularly. 1. Clinical and Genomic metadata stored in CanDIGs's postgres databases -1. Authorization data stored in vault that details user's authorization to access/edit ingested data +2. 
Authorization data stored in vault that details user's authorization to access/edit ingested data +3. Logs -We recommend taking back ups after each ingest event and to store one or more copies of your backups on a separate secure server from your CanDIG installation. We also recommend encrypting your backup so that it cannot be accessed by an unauthorizaed user. +For data types 1 and 2, we recommend taking backups after each ingest event and storing one or more copies of your backups on a separate secure server from your CanDIG installation. We also recommend encrypting your backup so that it cannot be accessed by an unauthorized user. + +Logs can be backed up on a regular schedule and, at a minimum, should be saved elsewhere when performing a rebuild of the stack. ## Backing up postgres databases @@ -104,4 +107,8 @@ You should be able to see the restored data in the data portal. ## Backing up Authorization data +## Backing up logs + +Logs are stored in `tmp/logging`. The contents of this folder should be saved periodically. + From 3a39fbb60b7b3093e6e3d5cc717d1421e06c03ac Mon Sep 17 00:00:00 2001 From: Marion Date: Thu, 12 Sep 2024 17:29:07 -0700 Subject: [PATCH 12/40] Update docs/production-candig.md Co-authored-by: OrdiNeu --- docs/production-candig.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/production-candig.md b/docs/production-candig.md index 6a0ed421f..5e6b8ca41 100644 --- a/docs/production-candig.md +++ b/docs/production-candig.md @@ -18,7 +18,7 @@ At UHN, the candig.uhnresearch.ca domain is under a proxy, so requests to a spec [ CLIENT ] ---> | HTTPS | ---> [ UHN PROXY ] ---> | HTTP | ---> [ CANDIG_PROD / TYK ] ---> | HTTP | ---> [ CANDIG_DATA_SERVICES ] ``` -Specifically, the UHN proxy forwards all candig.uhnresearch.ca and candigauth.uhnresearch.ca requests (all ports) to candig1:5080 (tyk) and candig1:8080 (keycloak) respectively, thereby acting as a firewall. All CanDIGv2 microservices can only be accessed through Tyk.
+Specifically, the UHN proxy forwards all candig.uhnresearch.ca and candigauth.uhnresearch.ca requests (port 443) to candig1:5080 (tyk) and candig1:8080 (keycloak) respectively, thereby acting as a firewall. All CanDIGv2 microservices can only be accessed through Tyk. ### OpenStack security group & nginx - C3G From 7fbe5786085f153e6b7d9d632d07c3128d2215fe Mon Sep 17 00:00:00 2001 From: Marion Date: Fri, 13 Sep 2024 11:34:23 +1000 Subject: [PATCH 13/40] move province settings --- docs/install-candig.md | 44 ------------------------------------ docs/production-candig.md | 47 ++++++++++++++++++++++++++++++++++++++- 2 files changed, 46 insertions(+), 45 deletions(-) diff --git a/docs/install-candig.md b/docs/install-candig.md index e19a6c7a8..861211f8c 100644 --- a/docs/install-candig.md +++ b/docs/install-candig.md @@ -214,50 +214,6 @@ conda activate candig ## Deploy CanDIGv2 Services with Compose -### Site Specific Settings -You will need to modify the `.env` file to reflect your site's specific settings. Set CANDIG_SITE_LOCATION to the name of your site, such as UHN, BCGSC, or C3G. For federation settings, set the id, name, province, and province-code for `FEDERATION_SELF_SERVER` variable in the `.env`. The `name` within the `FEDERATION_SELF_SERVER` variable should match the `CANDIG_SITE_LOCATION` variable. - -```bash -ProvCodes = [ - 'ca-ab', - 'ca-bc', - 'ca-mb', - 'ca-nb', - 'ca-nl', - 'ca-nt', - 'ca-ns', - 'ca-nu', - 'ca-on', - 'ca-pe', - 'ca-qc', - 'ca-sk', - 'ca-yt' -]; -``` - -```bash -CANDIG_SITE_LOCATION=UHN # or your site's location -... -FEDERATION_SELF_SERVER="{'id': 'UHN', 'url': '${FEDERATION_SERVICE_URL}/${TYK_FEDERATION_API_LISTEN_PATH}','location': {'name': '${CANDIG_SITE_LOCATION}','province': 'ON','province-code': 'ca-on'}}" -``` -#### Setting Site Logo -To customize the site logo, you need to place your image in the candig-data-portal either before building or within the container after running the build-all or install-all commands. 
The image should be located at `CanDIGv2/lib/candig-data-portal/candig-data-portal/src/assets/images/users/siteLogo.png`. This will overwrite the default logo. - -File requirements: -- Name the file siteLogo.png -- The image should be square and will be set to 34x34 pixels -- The image format must be PNG - -If the portal is already running, copy the logo into the Docker container using this command: - -```bash - docker cp Your_images_path/siteLogo.png candigv2_candig-data-portal_1:/app/candig-data-portal/src/assets/images/users - ``` - Otherwise: - - ```bash - cp your_image_path/siteLogo.png CanDIGv2/lib/candig-data-portal/candig-data-portal/src/assets/images/users/siteLogo.png - ``` ### New diff --git a/docs/production-candig.md b/docs/production-candig.md index 5e6b8ca41..e8ae9cbad 100644 --- a/docs/production-candig.md +++ b/docs/production-candig.md @@ -44,7 +44,7 @@ The following default settings in the `.env` file should be changed when deployi | `CANDIG_DEBUG_MODE=0` | | | `CANDIG_PRODUCTION_MODE=1` | | | `CANDIG_SITE_LOCATION=`update to your site location | | -| `FEDERATION_SELF_SERVER` - update id, province, province-code | | +| `FEDERATION_SELF_SERVER` - update id, province, province-code see [section below]() | | | `KEYCLOAK_PUBLIC_PROTO=https` | | | `KEYCLOAK_PUBLIC_URL=${KEYCLOAK_PUBLIC_PROTO}://${CANDIG_AUTH_DOMAIN}` | | | `KEYCLOAK_PRIVATE_URL=${KEYCLOAK_PRIVATE_PROTO}://keycloak:${KEYCLOAK_PORT}` | | @@ -52,6 +52,51 @@ The following default settings in the `.env` file should be changed when deployi | `TYK_USE_SSL=true` | | | `CANDIG_DATA_PORTAL_URL=https://${CANDIG_DOMAIN}:${CANDIG_DATA_PORTAL_PORT}/data-portal` | | +### Setting location information +You will need to modify the `FEDERATION_SELF_SERVER` file to reflect your site's specific settings. Set `CANDIG_SITE_LOCATION` to the name of your site, such as UHN, BCGSC, or C3G. For federation settings, set the id, name, province, and province-code for `FEDERATION_SELF_SERVER` variable in the `.env`. 
+ +| Province/Territory | province | province-code | +|------------------------------|-------------|------------------| +| Alberta | AB | ca-ab | +| British Columbia | BC | ca-bc | +| Manitoba | MB | ca-mb | +| New Brunswick | NB | ca-nb | +| Newfoundland and Labrador | NL | ca-nl | +| Northwest Territories | NT | ca-nt | +| Nova Scotia | NS | ca-ns | +| Nunavut | NU | ca-nu | +| Ontario | ON | ca-on | +| Prince Edward Island | PE | ca-pe | +| Quebec | QC | ca-qc | +| Saskatchewan | SK | ca-sk | +| Yukon | YT | ca-yt | + +Example from UHN: + +```bash +CANDIG_SITE_LOCATION=UHN # or your site's location +... +FEDERATION_SELF_SERVER="{'id': 'UHN', 'url': '${FEDERATION_SERVICE_URL}/${TYK_FEDERATION_API_LISTEN_PATH}','location': {'name': '${CANDIG_SITE_LOCATION}','province': 'ON','province-code': 'ca-on'}}" +``` +## Setting Site Logo +To customize the site logo, you need to place your image in the candig-data-portal either before building or within the container after running the build-all or install-all commands. The image should be located at `CanDIGv2/lib/candig-data-portal/candig-data-portal/src/assets/images/users/siteLogo.png`. This will overwrite the default logo. + +File requirements: +- Name the file siteLogo.png +- The image should be square and will be set to 34x34 pixels +- The image format must be PNG + +If the portal is already running, copy the logo into the Docker container using this command: + +```bash + docker cp Your_images_path/siteLogo.png candigv2_candig-data-portal_1:/app/candig-data-portal/src/assets/images/users + ``` + Otherwise: + + ```bash + cp your_image_path/siteLogo.png CanDIGv2/lib/candig-data-portal/candig-data-portal/src/assets/images/users/siteLogo.png + ``` + ## Changing the default site admin When CanDIG is initially deployed, a `site_admin` user will be created by default. The username and password for this user can be found in the `env.sh` file. 
It is important to change this default to a real user who should have site administration privileges. From 5d2c06d45071ec41f95ddc587be45d9474dc07a5 Mon Sep 17 00:00:00 2001 From: Marion Date: Fri, 13 Sep 2024 11:41:45 +1000 Subject: [PATCH 14/40] update table --- docs/production-candig.md | 33 +++++++++++++++++---------------- 1 file changed, 17 insertions(+), 16 deletions(-) diff --git a/docs/production-candig.md b/docs/production-candig.md index e8ae9cbad..3b5229e27 100644 --- a/docs/production-candig.md +++ b/docs/production-candig.md @@ -37,23 +37,23 @@ Any user that can access the VM where the CanDIG stack is running can access pot The following default settings in the `.env` file should be changed when deploying CanDIG in a production environment: -| value in prod environment | function | -|------------------------------------------------------------------------------------------|----------| -| `CANDIG_DOMAIN=` | | -| `CANDIG_AUTH_DOMAIN=` | | -| `CANDIG_DEBUG_MODE=0` | | -| `CANDIG_PRODUCTION_MODE=1` | | -| `CANDIG_SITE_LOCATION=`update to your site location | | -| `FEDERATION_SELF_SERVER` - update id, province, province-code see [section below]() | | -| `KEYCLOAK_PUBLIC_PROTO=https` | | -| `KEYCLOAK_PUBLIC_URL=${KEYCLOAK_PUBLIC_PROTO}://${CANDIG_AUTH_DOMAIN}` | | -| `KEYCLOAK_PRIVATE_URL=${KEYCLOAK_PRIVATE_PROTO}://keycloak:${KEYCLOAK_PORT}` | | -| `TYK_LOGIN_TARGET_URL=https://${CANDIG_DOMAIN}` | | -| `TYK_USE_SSL=true` | | -| `CANDIG_DATA_PORTAL_URL=https://${CANDIG_DOMAIN}:${CANDIG_DATA_PORTAL_PORT}/data-portal` | | +| value in prod environment | +|------------------------------------------------------------------------------------------| +| `CANDIG_DOMAIN=` | +| `CANDIG_AUTH_DOMAIN=` | +| `CANDIG_DEBUG_MODE=0` | +| `CANDIG_PRODUCTION_MODE=1` | +| `CANDIG_SITE_LOCATION=` e.g. 
UHN, BC | +| `FEDERATION_SELF_SERVER` - update id, province, province-code; see [section below](#setting-location-information) | +| `KEYCLOAK_PUBLIC_PROTO=https` | +| `KEYCLOAK_PUBLIC_URL=${KEYCLOAK_PUBLIC_PROTO}://${CANDIG_AUTH_DOMAIN}` | +| `KEYCLOAK_PRIVATE_URL=${KEYCLOAK_PRIVATE_PROTO}://keycloak:${KEYCLOAK_PORT}` | +| `TYK_LOGIN_TARGET_URL=https://${CANDIG_DOMAIN}` | +| `TYK_USE_SSL=true` | +| `CANDIG_DATA_PORTAL_URL=https://${CANDIG_DOMAIN}:${CANDIG_DATA_PORTAL_PORT}/data-portal` | ### Setting location information -You will need to modify the `FEDERATION_SELF_SERVER` file to reflect your site's specific settings. Set `CANDIG_SITE_LOCATION` to the name of your site, such as UHN, BCGSC, or C3G. For federation settings, set the id, name, province, and province-code for `FEDERATION_SELF_SERVER` variable in the `.env`. +You will need to modify the `FEDERATION_SELF_SERVER` variable to reflect your site's specific settings. Set `CANDIG_SITE_LOCATION` to the name of your site, such as UHN, BCGSC, or C3G. For federation settings, set the id, name, province, and province-code for the `FEDERATION_SELF_SERVER` variable in the `.env` file. See the table below for the codes for each Canadian province and territory: | Province/Territory | province | province-code | |------------------------------|-------------|------------------| @@ -71,13 +71,14 @@ You will need to modify the `FEDERATION_SELF_SERVER` file to reflect your site's | Saskatchewan | SK | ca-sk | | Yukon | YT | ca-yt | -Example from UHN: +Example values from UHN, which is located in Ontario: ```bash CANDIG_SITE_LOCATION=UHN # or your site's location ...
FEDERATION_SELF_SERVER="{'id': 'UHN', 'url': '${FEDERATION_SERVICE_URL}/${TYK_FEDERATION_API_LISTEN_PATH}','location': {'name': '${CANDIG_SITE_LOCATION}','province': 'ON','province-code': 'ca-on'}}" ``` + ## Setting Site Logo To customize the site logo, you need to place your image in the candig-data-portal either before building or within the container after running the build-all or install-all commands. The image should be located at `CanDIGv2/lib/candig-data-portal/candig-data-portal/src/assets/images/users/siteLogo.png`. This will overwrite the default logo. From 29d41234aaad9c3be60d505fa6eb1afea8fb9dfb Mon Sep 17 00:00:00 2001 From: Marion Date: Fri, 13 Sep 2024 11:52:33 +1000 Subject: [PATCH 15/40] add federation --- docs/production-candig.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/docs/production-candig.md b/docs/production-candig.md index 3b5229e27..0e0c759df 100644 --- a/docs/production-candig.md +++ b/docs/production-candig.md @@ -166,3 +166,11 @@ curl -X GET $CANDIG_URL'/ingest/site-role/admin/email/site_admin@test.ca' -H 'Au ## Federating with other CanDIG production instances +To federate your own node with another CanDIG node, follow the instructions in the [federation-service README](https://github.com/CanDIG/federation_service#how-to-register-peer-servers). + +Federation is a two way process, where you need to register another server with your node, and the other node needs to register you node, by exchanging valid site administration bearer tokens. + +Once two nodes are federated, summary data from federated nodes will appear in both nodes' data portals and will be viewable by all users who are able to authenticate to either node. + +Access to patient level data through specific program authorization is managed by the node that hosts the data for that program. 
For example, if a user from UHN needs to be given authorization to a program hosted within the BC node, a site administrator from BC will need to add a program authorization for that UHN user to that program within the BC CanDIG node. + From 57abbe74998cf0daa21ee9dfbd40e7728cfc03a5 Mon Sep 17 00:00:00 2001 From: Marion Date: Fri, 13 Sep 2024 11:54:26 +1000 Subject: [PATCH 16/40] add link --- docs/production-candig.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/production-candig.md b/docs/production-candig.md index 0e0c759df..930777a81 100644 --- a/docs/production-candig.md +++ b/docs/production-candig.md @@ -164,6 +164,7 @@ curl -X GET $CANDIG_URL'/ingest/site-role/admin/email/site_admin@test.ca' -H 'Au ## Connecting Keycloak to institutional LDAP + ## Federating with other CanDIG production instances To federate your own node with another CanDIG node, follow the instructions in the [federation-service README](https://github.com/CanDIG/federation_service#how-to-register-peer-servers). @@ -172,5 +173,5 @@ Federation is a two way process, where you need to register another server with Once two nodes are federated, summary data from federated nodes will appear in both nodes' data portals and will be viewable by all users who are able to authenticate to either node. -Access to patient level data through specific program authorization is managed by the node that hosts the data for that program. For example, if a user from UHN needs to be given authorization to a program hosted within the BC node, a site administrator from BC will need to add a program authorization for that UHN user to that program within the BC CanDIG node. +Access to patient level data through specific program authorization is managed by the node that hosts the data for that program. 
For example, if a user from UHN needs to be given authorization to a program hosted within the BC node, a site administrator from BC will need to [add a program authorization](https://github.com/CanDIG/candigv2-ingest#6-adding-a-dac-style-program-authorization-for-a-user) for that UHN user to that program within the BC CanDIG node. From 7384b1ef8778a0e373f396c8657cfcde53066e35 Mon Sep 17 00:00:00 2001 From: Marion Date: Fri, 13 Sep 2024 12:02:20 +1000 Subject: [PATCH 17/40] rewording from @DavidBrownlee --- docs/production-candig.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/production-candig.md b/docs/production-candig.md index 930777a81..b4bb0995e 100644 --- a/docs/production-candig.md +++ b/docs/production-candig.md @@ -22,7 +22,7 @@ Specifically, the UHN proxy forwards all candig.uhnresearch.ca and candigauth.uh ### OpenStack security group & nginx - C3G - OpenStack security group that allows access to ports 80 and 443 acts as a Firewall. + An OpenStack security group is applied as a firewall that allows ingress traffic to ports 80 and 443 only. nginx acts as a reverse proxy which: 1. Re-routes http traffic to https @@ -169,9 +169,9 @@ curl -X GET $CANDIG_URL'/ingest/site-role/admin/email/site_admin@test.ca' -H 'Au To federate your own node with another CanDIG node, follow the instructions in the [federation-service README](https://github.com/CanDIG/federation_service#how-to-register-peer-servers). -Federation is a two way process, where you need to register another server with your node, and the other node needs to register you node, by exchanging valid site administration bearer tokens. +Federation is a two way process, where you need to register another server with your node, and the other node needs to register your node, by exchanging valid site administration bearer tokens. 
-Once two nodes are federated, summary data from federated nodes will appear in both nodes' data portals and will be viewable by all users who are able to authenticate to either node. +Once two nodes are federated, summary data from federated nodes will appear in both nodes' data portals and will be viewable by all users who are able to login. Access to patient level data through specific program authorization is managed by the node that hosts the data for that program. For example, if a user from UHN needs to be given authorization to a program hosted within the BC node, a site administrator from BC will need to [add a program authorization](https://github.com/CanDIG/candigv2-ingest#6-adding-a-dac-style-program-authorization-for-a-user) for that UHN user to that program within the BC CanDIG node. From c7eec30a0f031fae9332863eb79fd3bc9bf309f3 Mon Sep 17 00:00:00 2001 From: Marion Date: Fri, 13 Sep 2024 13:28:36 +1000 Subject: [PATCH 18/40] add expandables --- README.md | 46 ++------------------------------- docs/install-candig.md | 58 ++++++++++++++++++++++++++++++++---------- 2 files changed, 47 insertions(+), 57 deletions(-) diff --git a/README.md b/README.md index c99870e73..0964cc858 100644 --- a/README.md +++ b/README.md @@ -13,48 +13,7 @@ CanDIG uses a make-based deployment process, with services containerized in Dock * [CanDIG Deployment Guide](./docs/install-candig.md) -View additional Makefile options with `make help`. - -### `.env` Environment File - -You need an `.env` file in the project root directory, which contains a set of global variables that are used as reference to the various parameters, plugins, and config options that operators can modify for testing purposes. This repo contains an example `.env` file in `etc/env/example.env`. - -For a basic desktop sandbox setup, the example variable file needs very little (if any) modification. 
- -When deploying CanDIGv2 -using `make`, `.env` is imported by `make` and all uncommented variables are added as environment variables via -`export`. - -Some of the functionality that is controlled through `.env` are: - -* operating system flags -* change docker network, driver, and swarm host -* modify ports, protocols, and plugins for various services -* version control and app pinning -* pre-defined defaults for turnkey deployment - -Environment variables defined in the `.env` file can be read in `docker-compose` scripts through the variable substitution operator -`${VAR}`. - -```yaml -# example compose YAML using variable substitution with default option -services: - consul: - image: progrium/consul - network_mode: ${DOCKER_MODE} -... -``` -### Configuring CanDIG modules - -Not all CanDIG modules are required for a minimal installation. The `CANDIG_MODULES` setting defines which modules are included in the deployment. - -By default (if you copy the sample file from `etc/env/example.env`) the installation includes the minimal list of modules: - -``` - CANDIG_MODULES=keycloak vault minio postgres redis htsget katsu candig-data-portal query tyk opa federation candig-ingest -``` - -Optional modules follow the `#` and include various monitoring components, workflow execution, and some older modules not generally installed. +See [Interact with the stack](docs/interact-with-the-stack.md) for a guide to additional options or view all Makefile options with `make help`. ### Configuring CanDIG modules @@ -63,12 +22,11 @@ Not all CanDIG modules are required for a minimal installation. 
The `CANDIG_MODU By default (if you copy the sample file from `etc/env/example.env`) the installation includes the minimal list of modules: ``` - CANDIG_MODULES=keycloak vault minio postgres redis htsget katsu candig-data-portal query tyk opa federation candig-ingest + CANDIG_MODULES=logging keycloak vault redis postgres htsget katsu query tyk opa federation candig-ingest candig-data-portal ``` Optional modules follow the `#` and include various monitoring components, workflow execution, and some older modules not generally installed. - ## Project Structure ```plaintext diff --git a/docs/install-candig.md b/docs/install-candig.md index 861211f8c..8e480f397 100644 --- a/docs/install-candig.md +++ b/docs/install-candig.md @@ -26,7 +26,8 @@ CanDIG can be installed and deployed as below for development situations where n ## Install OS Dependencies -### Debian +
+Debian 1. Update system/install dependencies @@ -71,7 +72,11 @@ sudo systemctl start docker sudo usermod -aG docker $(whoami) ``` -### Ubuntu +
+ +
+ +Ubuntu 1. Update system/install dependencies ```bash @@ -121,7 +126,11 @@ groups getent group docker ``` -### CentOS 7 +
+ +
+ +CentOS 7 1. Update system/install dependencies @@ -166,8 +175,11 @@ sudo usermod -aG docker $(whoami) ``` yq >= 4 is required. See [https://github.com/mikefarah/yq/#install](https://github.com/mikefarah/yq/#install) for install options. +
+ +
-### Note for WSL Systems +Note for WSL Systems Miniconda3 must be installed at `~/miniconda3` on WSL systems to avoid an infinite symlink loop. Add `CONDA_INSTALL = ~/miniconda3` above `CONDA = $(CONDA_INSTALL)/bin/conda` in the Makefile to avoid this issue. You can also use the below command to move the miniconda3 installation to the correct location. @@ -177,15 +189,7 @@ bash bin/miniconda_install.sh -f -b -u -p ~/miniconda3 yq >= 4 is required, but the conda version is outdated. Install the latest version system-wide by following the instructions at [the yq GitHub](https://github.com/mikefarah/yq/#install). -### Note for WSL Systems -Miniconda3 must be installed at `~/miniconda3` on WSL systems to avoid an infinite symlink loop. Add `CONDA_INSTALL = ~/miniconda3` above `CONDA = $(CONDA_INSTALL)/bin/conda` in the Makefile to avoid this issue. You can also use the below command to move the miniconda3 installation to the correct location. - - -```bash -bash bin/miniconda_install.sh -f -b -u -p ~/miniconda3 -``` - -yq >= 4 is required, but the conda version is outdated. Find a way to install it system-wide. +
## Initialize CanDIGv2 Repo @@ -214,7 +218,35 @@ conda activate candig ## Deploy CanDIGv2 Services with Compose +### `.env` Environment File + +You need an `.env` file in the project root directory, which contains a set of global variables that are used as reference to the various parameters, plugins, and config options that operators can modify for testing purposes. This repo contains an example `.env` file in `etc/env/example.env`. + +For a basic desktop sandbox setup, the example variable file needs very little (if any) modification. + +When deploying CanDIGv2 +using `make`, `.env` is imported by `make` and all uncommented variables are added as environment variables via +`export`. +Some of the functionality that is controlled through `.env` are: + +* operating system flags +* change docker network, driver, and swarm host +* modify ports, protocols, and plugins for various services +* version control and app pinning +* pre-defined defaults for turnkey deployment + +Environment variables defined in the `.env` file can be read in `docker-compose` scripts through the variable substitution operator +`${VAR}`. + +```yaml +# example compose YAML using variable substitution with default option +services: + consul: + image: progrium/consul + network_mode: ${DOCKER_MODE} +... +``` ### New From b1f92c4bc19ebd29e468531ca29b3852a4c835ed Mon Sep 17 00:00:00 2001 From: Marion Date: Fri, 13 Sep 2024 13:31:42 +1000 Subject: [PATCH 19/40] reorg --- README.md | 14 +------------- docs/install-candig.md | 14 ++++++++++++++ 2 files changed, 15 insertions(+), 13 deletions(-) diff --git a/README.md b/README.md index 0964cc858..b2ec36e42 100644 --- a/README.md +++ b/README.md @@ -9,24 +9,12 @@ dataflow for genomic data. ## Installation -CanDIG uses a make-based deployment process, with services containerized in Docker. To deploy CanDIGv2, follow the installation guide in `docs/`: +CanDIG uses a make-based deployment process, with services containerized in Docker. 
To deploy CanDIGv2, follow the installation guides in `docs/`: * [CanDIG Deployment Guide](./docs/install-candig.md) See [Interact with the stack](docs/interact-with-the-stack.md) for a guide to additional options or view all Makefile options with `make help`. -### Configuring CanDIG modules - -Not all CanDIG modules are required for a minimal installation. The `CANDIG_MODULES` setting defines which modules are included in the deployment. - -By default (if you copy the sample file from `etc/env/example.env`) the installation includes the minimal list of modules: - -``` - CANDIG_MODULES=logging keycloak vault redis postgres htsget katsu query tyk opa federation candig-ingest candig-data-portal -``` - -Optional modules follow the `#` and include various monitoring components, workflow execution, and some older modules not generally installed. - ## Project Structure ```plaintext diff --git a/docs/install-candig.md b/docs/install-candig.md index 8e480f397..a18b2b4ee 100644 --- a/docs/install-candig.md +++ b/docs/install-candig.md @@ -247,6 +247,20 @@ services: network_mode: ${DOCKER_MODE} ... ``` +
+ +Configuring CanDIG modules + +Not all CanDIG modules are required for a minimal installation. The `CANDIG_MODULES` setting defines which modules are included in the deployment. + +By default (if you copy the sample file from `etc/env/example.env`) the installation includes the minimal list of modules: + +``` + CANDIG_MODULES=logging keycloak vault redis postgres htsget katsu query tyk opa federation candig-ingest candig-data-portal +``` + +Optional modules follow the `#` and include various monitoring components, workflow execution, and some older modules not generally installed. +
### New From cc9c3fb4e252ae2dd3a34a2b3af215575ce479f2 Mon Sep 17 00:00:00 2001 From: Marion Date: Fri, 13 Sep 2024 13:37:02 +1000 Subject: [PATCH 20/40] more collapses --- docs/install-candig.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/docs/install-candig.md b/docs/install-candig.md index a18b2b4ee..91b8e57d6 100644 --- a/docs/install-candig.md +++ b/docs/install-candig.md @@ -180,8 +180,8 @@ yq >= 4 is required. See [https://github.com/mikefarah/yq/#install](https://git
Note for WSL Systems -Miniconda3 must be installed at `~/miniconda3` on WSL systems to avoid an infinite symlink loop. Add `CONDA_INSTALL = ~/miniconda3` above `CONDA = $(CONDA_INSTALL)/bin/conda` in the Makefile to avoid this issue. You can also use the below command to move the miniconda3 installation to the correct location. +Miniconda3 must be installed at `~/miniconda3` on WSL systems to avoid an infinite symlink loop. Add `CONDA_INSTALL = ~/miniconda3` above `CONDA = $(CONDA_INSTALL)/bin/conda` in the Makefile to avoid this issue. You can also use the below command to move the miniconda3 installation to the correct location. ```bash bash bin/miniconda_install.sh -f -b -u -p ~/miniconda3 @@ -216,9 +216,9 @@ make init-conda conda activate candig ``` -## Deploy CanDIGv2 Services with Compose +
-### `.env` Environment File +More info about the `.env` Environment File You need an `.env` file in the project root directory, which contains a set of global variables that are used as reference to the various parameters, plugins, and config options that operators can modify for testing purposes. This repo contains an example `.env` file in `etc/env/example.env`. @@ -247,6 +247,9 @@ services: network_mode: ${DOCKER_MODE} ... ``` + +
+
Configuring CanDIG modules @@ -262,9 +265,7 @@ By default (if you copy the sample file from `etc/env/example.env`) the installa Optional modules follow the `#` and include various monitoring components, workflow execution, and some older modules not generally installed.
-### New - -`install-all` will perform all of the steps of the old method (section below) including the conda install, building images explicitly. **Note**: On Mac M1, you will not be able to use make install-all; instead, use the conda installation instructions as described above. Build-all will then build and compose the containers for you. +`install-all` will perform all of the steps to deploy CanDIG including the conda install, building images explicitly. **Note**: On Mac M1, you will not be able to use make install-all; instead, use the conda installation instructions as described above. Build-all will then build and compose the containers for you. ```bash From e460dff04212770af5823ddd5d08af017fc57075 Mon Sep 17 00:00:00 2001 From: Marion Date: Fri, 13 Sep 2024 13:40:07 +1000 Subject: [PATCH 21/40] rearrange sections --- docs/install-candig.md | 110 ++++++++++++++++++++++------------------- 1 file changed, 58 insertions(+), 52 deletions(-) diff --git a/docs/install-candig.md b/docs/install-candig.md index 91b8e57d6..408fe6484 100644 --- a/docs/install-candig.md +++ b/docs/install-candig.md @@ -191,6 +191,64 @@ yq >= 4 is required, but the conda version is outdated. Install the latest vers
+
+ +For Apple Silicon + +### 1. Install OS Dependencies + +- Install dependencies + +```bash +brew install gettext +brew link --force gettext +brew install jq +brew install yq +``` + +- Get [Docker Desktop for Apple Silicon](https://docs.docker.com/desktop/install/mac-install/). Be sure to start it. + +### 2. Initialize CanDIGv2 Repo + +```bash +git clone -b develop https://github.com/CanDIG/CanDIGv2.git +cd CanDIGv2 +git submodule update --init --recursive +cp -i etc/env/example.env .env +``` + +### 3. Update .env file + +```bash +# find out your ip and add to LOCAL_IP_ADDR +LOCAL_IP_ADDR=xxx.xx.x.x +# change OS +VENV_OS=arm64mac +``` + +Edit /etc/hosts on the machine (`sudo nano /etc/hosts`): + +```bash +::1 candig.docker.internal +``` + +### 4. Initialize conda + +```bash +make bin-all +make init-conda +conda activate candig +``` + +### 5. Build and test + +```bash +make build-all +make test-integration +``` + +
+ ## Initialize CanDIGv2 Repo ```bash @@ -331,59 +389,7 @@ sudo ufw allow from $DOCKER_BRIDGE_IP to Re-run `make clean-authx` and `make init-authx` and it should work. -## For Apple Silicon - -### 1. Install OS Dependencies - -- Install dependencies - -```bash -brew install gettext -brew link --force gettext -brew install jq -brew install yq -``` - -- Get [Docker Desktop for Apple Silicon](https://docs.docker.com/desktop/install/mac-install/). Be sure to start it. - -### 2. Initialize CanDIGv2 Repo - -```bash -git clone -b develop https://github.com/CanDIG/CanDIGv2.git -cd CanDIGv2 -git submodule update --init --recursive -cp -i etc/env/example.env .env -``` - -### 3. Update .env file - -```bash -# find out your ip and add to LOCAL_IP_ADDR -LOCAL_IP_ADDR=xxx.xx.x.x -# change OS -VENV_OS=arm64mac -``` - -Edit /etc/hosts on the machine (`sudo nano /etc/hosts`): -```bash -::1 candig.docker.internal -``` - -### 4. Initialize conda - -```bash -make bin-all -make init-conda -conda activate candig -``` - -### 5. Build and test - -```bash -make build-all -make test-integration -``` Once everything has run without errors, take a look at the documentation for [ingesting data and testing the deployment](ingest-and-test.md) as well as From c877a765c9577b74c0f3b4280b6e9cb5d30caf0b Mon Sep 17 00:00:00 2001 From: Marion Date: Fri, 13 Sep 2024 13:47:37 +1000 Subject: [PATCH 22/40] add stable branch --- docs/install-candig.md | 6 +----- docs/production-candig.md | 4 ++++ 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/install-candig.md b/docs/install-candig.md index 408fe6484..e9e88d067 100644 --- a/docs/install-candig.md +++ b/docs/install-candig.md @@ -1,10 +1,6 @@ # CanDIGv2 Install Guide ---- - -These instructions work for server deployments or local linux deployments. For local OSX using M1 architecture, there are [modification instructions](#modifications-for-apple-silicon-m1) instructions at the bottom of this file. 
For WSL you can follow the linux instructions and follow WSL instructions for firewall file at [update firewall](#update-firewall). - -Before beginning, you should set up your environment variables as described in the [README](../README.md). +These instructions work for server deployments or local linux deployments. For local OSX using M1 architecture, there are modification instructions in the [install-os-dependencies](#install-os-dependencies) section. For WSL you can follow the linux instructions and follow WSL instructions for firewall file at [update firewall](#update-firewall). Docker Engine (also known as Docker CE) is recommended over Docker Desktop for linux installations. diff --git a/docs/production-candig.md b/docs/production-candig.md index b4bb0995e..b55b78fc3 100644 --- a/docs/production-candig.md +++ b/docs/production-candig.md @@ -2,6 +2,10 @@ Apart from the basic steps in the [CanDIGv2 Install Guide](install-candig.md) to get the candig stack up and running, there are additional settings and security recommendations that need to set up in a production level environment. We provide the following as general advice, but it is important for all CanDIG deployers to also consult with their institutional infrastructure security personnel to ensure that their deployment meets the necessary level of data security. +## Stable branch + +Production deployments should use the latest stable release of CanDIGv2 which uses the stable branches and fixed versions of all other submodules and packages. The develop versions of CanDIG software is under active development and should not be used for production purposes. When new stable releases are made, we recommend updating as soon as possible. It is possible that CanDIG nodes running different stable releases will not be able to be federated. + ## Reverse Proxy & Firewall It is essential to setup a reverse proxy and firewall so that only specific ports are open to the internet. 
The software used for this is up to the deployer and is considered outside of the CanDIG stack. From 766e565c6e69216d4b8822c92d726c7e7d82faf5 Mon Sep 17 00:00:00 2001 From: Marion Date: Fri, 13 Sep 2024 13:48:38 +1000 Subject: [PATCH 23/40] stable --- docs/production-candig.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/production-candig.md b/docs/production-candig.md index b55b78fc3..490aa5122 100644 --- a/docs/production-candig.md +++ b/docs/production-candig.md @@ -4,7 +4,7 @@ Apart from the basic steps in the [CanDIGv2 Install Guide](install-candig.md) to ## Stable branch -Production deployments should use the latest stable release of CanDIGv2 which uses the stable branches and fixed versions of all other submodules and packages. The develop versions of CanDIG software is under active development and should not be used for production purposes. When new stable releases are made, we recommend updating as soon as possible. It is possible that CanDIG nodes running different stable releases will not be able to be federated. +Production deployments should use the latest [stable release of CanDIGv2](https://github.com/CanDIG/CanDIGv2/releases) which uses the stable branches and fixed versions of all other submodules and packages. The develop versions of CanDIG software is under active development and should not be used for production purposes. When new stable releases are made, we recommend updating as soon as possible. It is possible that CanDIG nodes running different stable releases will not be able to be federated. 
## Reverse Proxy & Firewall From 3252dbf27e588e565f7197e528ae41404f202a59 Mon Sep 17 00:00:00 2001 From: Marion Date: Mon, 16 Sep 2024 05:26:08 +1000 Subject: [PATCH 24/40] updates --- docs/production-candig.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/production-candig.md b/docs/production-candig.md index 490aa5122..53815e925 100644 --- a/docs/production-candig.md +++ b/docs/production-candig.md @@ -4,7 +4,7 @@ Apart from the basic steps in the [CanDIGv2 Install Guide](install-candig.md) to ## Stable branch -Production deployments should use the latest [stable release of CanDIGv2](https://github.com/CanDIG/CanDIGv2/releases) which uses the stable branches and fixed versions of all other submodules and packages. The develop versions of CanDIG software is under active development and should not be used for production purposes. When new stable releases are made, we recommend updating as soon as possible. It is possible that CanDIG nodes running different stable releases will not be able to be federated. +Production deployments should use the latest [stable release of CanDIGv2](https://github.com/CanDIG/CanDIGv2/releases) which uses the stable branches and fixed versions of all other submodules and packages. The develop versions of CanDIG software are under active development and should not be used for production purposes. When new stable releases are made, we recommend updating as soon as possible. It is possible that CanDIG nodes running different stable releases will not be able to be federated. 
## Reverse Proxy & Firewall From df6a6efe06ab407f1c407a7a0c7c0641159cbbb8 Mon Sep 17 00:00:00 2001 From: Marion Date: Wed, 25 Sep 2024 11:34:27 -0700 Subject: [PATCH 25/40] Update install-candig.md --- docs/install-candig.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/docs/install-candig.md b/docs/install-candig.md index e9e88d067..761ec7f5d 100644 --- a/docs/install-candig.md +++ b/docs/install-candig.md @@ -270,6 +270,13 @@ make init-conda conda activate candig ``` +Copy and edit the `.env` file to match your configuration + +``` +cp etc/env/example.env .env +``` + +
More info about the `.env` Environment File From e4d87c665a114a3aaa04bc2aa6c5950d23ebf8cb Mon Sep 17 00:00:00 2001 From: Marion Date: Thu, 26 Sep 2024 12:34:23 -0700 Subject: [PATCH 26/40] update to develop --- lib/candig-ingest/candigv2-ingest | 2 +- lib/federation/federation | 2 +- lib/katsu/katsu_service | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/lib/candig-ingest/candigv2-ingest b/lib/candig-ingest/candigv2-ingest index b614329b3..96419ae0c 160000 --- a/lib/candig-ingest/candigv2-ingest +++ b/lib/candig-ingest/candigv2-ingest @@ -1 +1 @@ -Subproject commit b614329b3528e26be65fa918ead34d67b0d49049 +Subproject commit 96419ae0ce81aea4b31167899403c04e2bdcf9d0 diff --git a/lib/federation/federation b/lib/federation/federation index 3c79f4b2a..95648fe45 160000 --- a/lib/federation/federation +++ b/lib/federation/federation @@ -1 +1 @@ -Subproject commit 3c79f4b2aeab862adf7feb4e7073b2d0fee4b8a5 +Subproject commit 95648fe456ddab156799f206fca322dcd8d5c87b diff --git a/lib/katsu/katsu_service b/lib/katsu/katsu_service index 889bc1b4c..ea2a1929d 160000 --- a/lib/katsu/katsu_service +++ b/lib/katsu/katsu_service @@ -1 +1 @@ -Subproject commit 889bc1b4c74b44911c81775d1b0437a56705fe66 +Subproject commit ea2a1929d439137c19d36f65e154eca1c82eae0d From b74958f512d7b9886c17cfbd79d7c2659532f274 Mon Sep 17 00:00:00 2001 From: Marion Date: Thu, 26 Sep 2024 12:40:08 -0700 Subject: [PATCH 27/40] docs updates --- docs/install-candig.md | 10 ++++++++-- docs/production-candig.md | 2 +- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/docs/install-candig.md b/docs/install-candig.md index 761ec7f5d..32be1c817 100644 --- a/docs/install-candig.md +++ b/docs/install-candig.md @@ -68,6 +68,8 @@ sudo systemctl start docker sudo usermod -aG docker $(whoami) ``` +Continue to [Initialize CanDIGv2 Repo](#initialize-candigv2-repo) section below +
@@ -122,6 +124,8 @@ groups getent group docker ``` +Continue to [Initialize CanDIGv2 Repo](#initialize-candigv2-repo) section below +
@@ -171,6 +175,8 @@ sudo usermod -aG docker $(whoami) ``` yq >= 4 is required. See [https://github.com/mikefarah/yq/#install](https://github.com/mikefarah/yq/#install) for install options. +Continue to [Initialize CanDIGv2 Repo](#initialize-candigv2-repo) section below +
@@ -185,6 +191,8 @@ bash bin/miniconda_install.sh -f -b -u -p ~/miniconda3 yq >= 4 is required, but the conda version is outdated. Install the latest version system-wide by following the instructions at [the yq GitHub](https://github.com/mikefarah/yq/#install). +Continue to [Initialize CanDIGv2 Repo](#initialize-candigv2-repo) section below +
@@ -392,8 +400,6 @@ sudo ufw allow from $DOCKER_BRIDGE_IP to Re-run `make clean-authx` and `make init-authx` and it should work. - - Once everything has run without errors, take a look at the documentation for [ingesting data and testing the deployment](ingest-and-test.md) as well as [how to modify code and test changes](docker-and-submodules.md) in diff --git a/docs/production-candig.md b/docs/production-candig.md index 53815e925..630d45ed4 100644 --- a/docs/production-candig.md +++ b/docs/production-candig.md @@ -167,7 +167,7 @@ curl -X GET $CANDIG_URL'/ingest/site-role/admin/email/site_admin@test.ca' -H 'Au ## Connecting Keycloak to institutional LDAP - +You will need to work with your site IT administrator in order to connect an external authentication service to the running Keycloak. ## Federating with other CanDIG production instances From 0c0df7165db02a53240f670e30e9ea0b8a727c54 Mon Sep 17 00:00:00 2001 From: Marion Date: Thu, 26 Sep 2024 12:46:23 -0700 Subject: [PATCH 28/40] add env section --- docs/install-candig.md | 66 ++++++++++++++++++++++++------------------ 1 file changed, 38 insertions(+), 28 deletions(-) diff --git a/docs/install-candig.md b/docs/install-candig.md index 32be1c817..10d8c5896 100644 --- a/docs/install-candig.md +++ b/docs/install-candig.md @@ -68,7 +68,7 @@ sudo systemctl start docker sudo usermod -aG docker $(whoami) ``` -Continue to [Initialize CanDIGv2 Repo](#initialize-candigv2-repo) section below +Continue to [Configure .env](#configure-.env) section below
@@ -124,7 +124,7 @@ groups getent group docker ``` -Continue to [Initialize CanDIGv2 Repo](#initialize-candigv2-repo) section below +Continue to [Configure .env](#configure-.env) section below @@ -175,7 +175,7 @@ sudo usermod -aG docker $(whoami) ``` yq >= 4 is required. See [https://github.com/mikefarah/yq/#install](https://github.com/mikefarah/yq/#install) for install options. -Continue to [Initialize CanDIGv2 Repo](#initialize-candigv2-repo) section below +Continue to [Configure .env](#configure-.env) section below @@ -191,7 +191,7 @@ bash bin/miniconda_install.sh -f -b -u -p ~/miniconda3 yq >= 4 is required, but the conda version is outdated. Install the latest version system-wide by following the instructions at [the yq GitHub](https://github.com/mikefarah/yq/#install). -Continue to [Initialize CanDIGv2 Repo](#initialize-candigv2-repo) section below +Continue to [Configure .env](#configure-.env) section below @@ -253,30 +253,7 @@ make test-integration -## Initialize CanDIGv2 Repo - -```bash -# 1. initialize repo and submodules -git clone -b develop https://github.com/CanDIG/CanDIGv2.git -cd CanDIGv2 -git submodule update --init --recursive - -# 2. copy and edit .env with your site's local configuration -cp -i etc/env/example.env .env - -# 3. (IF NOT USING MAKE INSTALL-ALL) option A: install miniconda and initialize candig virtualenv (use this option -# for systems installations). Installs miniconda in the candigv2 repo. -make bin-conda # If this fails on WSL, see the Note for WSL Systems section below -make init-conda - -# 3. (IF NOT USING MAKE INSTALL-ALL) option B: if you want to use an existing conda installation on your local -# at the top of the Makefile, set CONDA_BASE to your existing conda installation -make mkdir # skip most of bin-conda, but need the dir-creating step -make init-conda - -# 4. Activate the candig virtualenv. 
It may be necessary to restart your shell before doing this -conda activate candig -``` +## Configure .env Copy and edit the `.env` file to match your configuration @@ -284,6 +261,14 @@ Copy and edit the `.env` file to match your configuration cp etc/env/example.env .env ``` +Update any of the information you want or need to customize including: + +```bash +# find out your ip and add to LOCAL_IP_ADDR +LOCAL_IP_ADDR=xxx.xx.x.x +# change OS +VENV_OS= +```
@@ -319,6 +304,31 @@ services:
+## Initialize CanDIGv2 Repo + +```bash +# 1. initialize repo and submodules +git clone -b develop https://github.com/CanDIG/CanDIGv2.git +cd CanDIGv2 +git submodule update --init --recursive + +# 2. copy and edit .env with your site's local configuration +cp -i etc/env/example.env .env + +# 3. (IF NOT USING MAKE INSTALL-ALL) option A: install miniconda and initialize candig virtualenv (use this option +# for systems installations). Installs miniconda in the candigv2 repo. +make bin-conda # If this fails on WSL, see the Note for WSL Systems section below +make init-conda + +# 3. (IF NOT USING MAKE INSTALL-ALL) option B: if you want to use an existing conda installation on your local +# at the top of the Makefile, set CONDA_BASE to your existing conda installation +make mkdir # skip most of bin-conda, but need the dir-creating step +make init-conda + +# 4. Activate the candig virtualenv. It may be necessary to restart your shell before doing this +conda activate candig +``` +
Configuring CanDIG modules From 3d335f2a3548862aacc61aedde6bad0e7c065cbb Mon Sep 17 00:00:00 2001 From: Marion Date: Thu, 26 Sep 2024 12:55:00 -0700 Subject: [PATCH 29/40] rearrange --- docs/install-candig.md | 42 ++++++++++++++++++++++++------------------ 1 file changed, 24 insertions(+), 18 deletions(-) diff --git a/docs/install-candig.md b/docs/install-candig.md index 10d8c5896..581b382f0 100644 --- a/docs/install-candig.md +++ b/docs/install-candig.md @@ -253,12 +253,21 @@ make test-integration
-## Configure .env
+## Initialize CanDIGv2 Repo
+
+
+### 1. Initialize repo and submodules
+```bash
+git clone -b develop https://github.com/CanDIG/CanDIGv2.git
+cd CanDIGv2
+git submodule update --init --recursive
+```
+
+### 2. Copy and edit .env with your site's local configuration

-Copy and edit the `.env` file to match your configuration
 ```
-cp etc/env/example.env .env
+cp -i etc/env/example.env .env
 ```
 Update any of the information you want or need to customize including:
@@ -302,30 +311,27 @@ services:
 ...
 ```
-
+### 3. option A: install miniconda and initialize candig virtualenv

-## Initialize CanDIGv2 Repo
+Use this option for systems installations. It installs miniconda in the candigv2 repo.
 ```bash
-# 1. initialize repo and submodules
-git clone -b develop https://github.com/CanDIG/CanDIGv2.git
-cd CanDIGv2
-git submodule update --init --recursive
+make bin-conda # If this fails on WSL, see the Note for WSL Systems section above
+make init-conda
+```

-# 2. copy and edit .env with your site's local configuration
-cp -i etc/env/example.env .env
+### 3. option B. Use an existing Conda installation

-# 3. (IF NOT USING MAKE INSTALL-ALL) option A: install miniconda and initialize candig virtualenv (use this option
-# for systems installations). Installs miniconda in the candigv2 repo.
-make bin-conda # If this fails on WSL, see the Note for WSL Systems section below
-make init-conda
+If you want to use an existing conda installation on your local at the bottom of the [.env](../etc/env/example.env#L310), set `CONDA_INSTALL` to your existing conda installation path

-# 3. (IF NOT USING MAKE INSTALL-ALL) option B: if you want to use an existing conda installation on your local
-# at the top of the Makefile, set CONDA_BASE to your existing conda installation
+```bash
 make mkdir # skip most of bin-conda, but need the dir-creating step
 make init-conda
+```

-# 4. Activate the candig virtualenv. It may be necessary to restart your shell before doing this
+### 4. Activate the candig virtualenv. It may be necessary to restart your shell before doing this
+
+```bash
 conda activate candig
 ```

From 1d009197bae65a23d8d65975643a9dd32a31d605 Mon Sep 17 00:00:00 2001
From: Marion
Date: Thu, 26 Sep 2024 12:56:57 -0700
Subject: [PATCH 30/40] missing tag

---
 docs/install-candig.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/install-candig.md b/docs/install-candig.md
index 581b382f0..5af210084 100644
--- a/docs/install-candig.md
+++ b/docs/install-candig.md
@@ -311,6 +311,8 @@ services:
 ...
 ```
+
+
 ### 3. option A: install miniconda and initialize candig virtualenv
 Use this option for systems installations. It installs miniconda in the candigv2 repo.

From 6141d84d95ee4a90f8ee53c8e7bc0ac37f985366 Mon Sep 17 00:00:00 2001
From: Marion
Date: Thu, 26 Sep 2024 12:59:43 -0700
Subject: [PATCH 31/40] rearrange

---
 docs/install-candig.md | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/docs/install-candig.md b/docs/install-candig.md
index 5af210084..fcf3a5481 100644
--- a/docs/install-candig.md
+++ b/docs/install-candig.md
@@ -352,6 +352,8 @@ By default (if you copy the sample file from `etc/env/example.env`) the installa
 Optional modules follow the `#` and include various monitoring components, workflow execution, and some older modules not generally installed.
+## Build and compose all the modules
+
 `install-all` will perform all of the steps to deploy CanDIG including the conda install, building images explicitly. **Note**: On Mac M1, you will not be able to use make install-all; instead, use the conda installation instructions as described above. Build-all will then build and compose the containers for you.
@@ -365,6 +367,13 @@ make install-all
 make build-all
 ```
+Once everything has run without errors, take a look at the documentation for
+[ingesting data and testing the deployment](ingest-and-test.md) as well as
+[how to modify code and test changes](docker-and-submodules.md) in
+the context of the CanDIG stack.
+
+## Troubleshooting
+
 On some machines (MacOS), if you get an error something like:
 ```
 Please ensure the value of $CANDIG_DOMAIN in your .env file points to this machine
@@ -405,7 +414,7 @@ If you can see the data portal at http://candig.docker.internal:5080/, your inst
 Confirm your installation with the [automatic tests](/docs/ingest-and-test.md).
-## Update Firewall
+### Update Firewall
 If the command still fails, it may be necessary to disable your local firewall, or edit it to allow requests from all ports used in the Docker stack.
@@ -417,8 +426,3 @@ sudo ufw allow from $DOCKER_BRIDGE_IP to
 ```
 Re-run `make clean-authx` and `make init-authx` and it should work.
-
-Once everything has run without errors, take a look at the documentation for
-[ingesting data and testing the deployment](ingest-and-test.md) as well as
-[how to modify code and test changes](docker-and-submodules.md) in
-the context of the CanDIG stack.

From a45d3669b626596df158252efa65d897f6b48641 Mon Sep 17 00:00:00 2001
From: Marion
Date: Thu, 26 Sep 2024 13:01:09 -0700
Subject: [PATCH 32/40] add note

---
 docs/install-candig.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/docs/install-candig.md b/docs/install-candig.md
index fcf3a5481..44f495f4e 100644
--- a/docs/install-candig.md
+++ b/docs/install-candig.md
@@ -354,7 +354,10 @@ Optional modules follow the `#` and include various monitoring components, workf
 ## Build and compose all the modules
-`install-all` will perform all of the steps to deploy CanDIG including the conda install, building images explicitly. **Note**: On Mac M1, you will not be able to use make install-all; instead, use the conda installation instructions as described above. Build-all will then build and compose the containers for you.
+`install-all` will perform all of the steps to deploy CanDIG including the conda install, building images explicitly.
+
+> [!IMPORTANT]
+> On Mac M1, you will not be able to use make install-all; instead, use the conda installation instructions as described above. Build-all will then build and compose the containers for you.
 ```bash

From c51ee51cb955c649213770f70b12a5237b6792fa Mon Sep 17 00:00:00 2001
From: Marion
Date: Thu, 26 Sep 2024 13:09:23 -0700
Subject: [PATCH 33/40] issue and doc update

---
 .github/ISSUE_TEMPLATE/deployment-error.md | 13 ++++++++-----
 docs/install-candig.md | 8 +++++---
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/.github/ISSUE_TEMPLATE/deployment-error.md b/.github/ISSUE_TEMPLATE/deployment-error.md
index 54966584a..776e46d52 100644
--- a/.github/ISSUE_TEMPLATE/deployment-error.md
+++ b/.github/ISSUE_TEMPLATE/deployment-error.md
@@ -7,13 +7,16 @@ assignees: ''
 ---
-**Operating system**
+## Operating system
 - OS (linux, WSL, OSX darwin, OSX arm)
-**Step where failure occurred (make-?)**
+## What version are you deploying

-**Are there errors in `tmp/error.txt`?**
+## Step where failure occurred (make-?)

-**Are there errors in the docker logs for the affected container(s)?**
+## Are there errors in `tmp/error.txt`?

-**Any other information**
+## Are there errors in the central log for the affected container(s)? `tmp/logs/*.log`
+
+
+## Any other information
diff --git a/docs/install-candig.md b/docs/install-candig.md
index 44f495f4e..70f0a3ceb 100644
--- a/docs/install-candig.md
+++ b/docs/install-candig.md
@@ -364,19 +364,21 @@ Optional modules follow the `#` and include various monitoring components, workf
 make install-all
 ```
-`build-all` will do the same without running bin-conda and init-conda:
+`build-all` will do the same without running `bin-conda` and `init-conda`:
 ```bash
 make build-all
 ```
 Once everything has run without errors, take a look at the documentation for
-[ingesting data and testing the deployment](ingest-and-test.md) as well as
-[how to modify code and test changes](docker-and-submodules.md) in
+[ingesting data and testing the deployment](ingest-and-test.md) as well as [Interacting with the stack using Make](interact-with-the-stack.md)
+and if you are a developer: [how to modify code and test changes](docker-and-submodules.md) in
 the context of the CanDIG stack.
 ## Troubleshooting
+Below are some common issues that our users have encountered. If you run into any other issues not addressed here, please reach out via a github issue in this repo.
+
 On some machines (MacOS), if you get an error something like:
 ```
 Please ensure the value of $CANDIG_DOMAIN in your .env file points to this machine

From 907bf225ff3b2cb9ddd8274453171b22b8a5054b Mon Sep 17 00:00:00 2001
From: Marion
Date: Thu, 26 Sep 2024 13:16:00 -0700
Subject: [PATCH 34/40] update prod

---
 docs/production-candig.md | 13 ++++++++-----
 lib/katsu/katsu_service | 2 +-
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/docs/production-candig.md b/docs/production-candig.md
index 630d45ed4..799f80755 100644
--- a/docs/production-candig.md
+++ b/docs/production-candig.md
@@ -108,7 +108,9 @@ When CanDIG is initially deployed, a `site_admin` user will be created by defaul
 1. Login to the data portal with the credentials you wish to make a site administrator to ensure the user can login successfully
-2. Get a site admin token using the default site admin user:
+2. ssh into the VM running your CanDIG deployment and cd into the currently deployed repo directory
+
+3. Get a site admin token using the default site admin user:
 ```bash
 source env.sh
@@ -130,19 +132,19 @@ CURL_OUTPUT=$(curl -s --request POST \
 export TOKEN=$(echo $CURL_OUTPUT | grep -Eo 'access_token":"[a-zA-Z0-9._\-]+' | cut -d '"' -f3)
 ```
-3. Set the role of the real user to a site admin with the following curl command:
+4. Set the role of the real user to a site admin with the following curl command:
 ```bash
 curl -X POST $CANDIG_URL'/ingest/site-role/admin/email/' -H 'Authorization: Bearer '$TOKEN
 ```
-4. Check the role assignment was successful by verifying the following command returns `True`:
+5. Check the role assignment was successful by verifying the following command returns `True`:
 ```bash
 curl -X GET $CANDIG_URL'/ingest/site-role/admin/email/' -H 'Authorization: Bearer '$TOKEN
 ```
-5. Delete the default site admin user using your new real user site admin token
+6. Delete the default site admin user using your new real user site admin token
 ```bash
 CURL_OUTPUT=$(curl -s --request POST \
@@ -161,9 +163,10 @@ export TOKEN=$(echo $CURL_OUTPUT | grep -Eo 'access_token":"[a-zA-Z0-9._\-]+' |
 ```
 ```bash
-curl -X GET $CANDIG_URL'/ingest/site-role/admin/email/site_admin@test.ca' -H 'Authorization: Bearer '$TOKEN
+curl -X DELETE $CANDIG_URL'/ingest/site-role/admin/email/site_admin@test.ca' -H 'Authorization: Bearer '$TOKEN
 ```
+Keep the site admin user and password secure at all times.
 ## Connecting Keycloak to institutional LDAP
diff --git a/lib/katsu/katsu_service b/lib/katsu/katsu_service
index ea2a1929d..889bc1b4c 160000
--- a/lib/katsu/katsu_service
+++ b/lib/katsu/katsu_service
@@ -1 +1 @@
-Subproject commit ea2a1929d439137c19d36f65e154eca1c82eae0d
+Subproject commit 889bc1b4c74b44911c81775d1b0437a56705fe66

From 362385b624ac4b5e01448edced352488bbd4d86b Mon Sep 17 00:00:00 2001
From: Marion
Date: Thu, 26 Sep 2024 13:25:04 -0700
Subject: [PATCH 35/40] added backup

---
 docs/production-candig.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/production-candig.md b/docs/production-candig.md
index 799f80755..a7d7f64bb 100644
--- a/docs/production-candig.md
+++ b/docs/production-candig.md
@@ -182,3 +182,7 @@ Once two nodes are federated, summary data from federated nodes will appear in b
 Access to patient level data through specific program authorization is managed by the node that hosts the data for that program. For example, if a user from UHN needs to be given authorization to a program hosted within the BC node, a site administrator from BC will need to [add a program authorization](https://github.com/CanDIG/candigv2-ingest#6-adding-a-dac-style-program-authorization-for-a-user) for that UHN user to that program within the BC CanDIG node.
+## Backing up production data
+
+It is not expected that a CanDIG instance would hold the only copy of any ingested data. However, recognising that the ETL and ingest process takes significant time and effort, it is a good idea to regularly backup all data stored in CanDIG. Steps for how to do this can be found in [Backing up and restoring CanDIG data](backup-restore-candig.md)
+

From d935cb5dfeb496c2f6d9e748b00b86c452f3a5c3 Mon Sep 17 00:00:00 2001
From: Marion
Date: Fri, 27 Sep 2024 10:07:51 -0700
Subject: [PATCH 36/40] add vault backup, update mods

---
 docs/backup-restore-candig.md | 24 +++++++++++++++++++++++-
 lib/candig-ingest/candigv2-ingest | 2 +-
 lib/federation/federation | 2 +-
 lib/htsget/htsget_app | 2 +-
 lib/katsu/katsu_service | 2 +-
 5 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/docs/backup-restore-candig.md b/docs/backup-restore-candig.md
index 1c334c3b6..e23c56e28 100644
--- a/docs/backup-restore-candig.md
+++ b/docs/backup-restore-candig.md
@@ -105,7 +105,29 @@ docker start candigv2_htsget_1
 You should be able to see the restored data in the data portal.
-## Backing up Authorization data
+## Backing up Secrets and Authorization data
+
+Secrets in CanDIG are stored within Vault. To back up Vault, run the command:
+
+```
+make backup-vault
+```
+
+This command creates a tar ball at `tmp/vault/backup.tar.gz`. This should be saved into a secure location outside the server your CanDIG deployment is running. You may want to change the name of the backup to include the date and type of backup for future reference, e.g. `YYYY-MM-DD-vault-backup.tar.gz`
+
+To restore the vault backup, copy the backup tarball into the vault directory in the CanDIG stack and rename it to `restore.tar.gz`:
+
+```
+cp /path/to/backup.tar.gz path/to/CanDIGv2/lib/vault/restore.tar.gz
+```
+
+Then run
+
+```
+make restore-vault
+```
+
+All previous secrets and authorizations should be restored to the stack.
 ## Backing up logs
diff --git a/lib/candig-ingest/candigv2-ingest b/lib/candig-ingest/candigv2-ingest
index 96419ae0c..129cfb0af 160000
--- a/lib/candig-ingest/candigv2-ingest
+++ b/lib/candig-ingest/candigv2-ingest
@@ -1 +1 @@
-Subproject commit 96419ae0ce81aea4b31167899403c04e2bdcf9d0
+Subproject commit 129cfb0afd270c392259717917ac290f4020c781
diff --git a/lib/federation/federation b/lib/federation/federation
index 95648fe45..3c79f4b2a 160000
--- a/lib/federation/federation
+++ b/lib/federation/federation
@@ -1 +1 @@
-Subproject commit 95648fe456ddab156799f206fca322dcd8d5c87b
+Subproject commit 3c79f4b2aeab862adf7feb4e7073b2d0fee4b8a5
diff --git a/lib/htsget/htsget_app b/lib/htsget/htsget_app
index 638e3b249..ae8d6d7ec 160000
--- a/lib/htsget/htsget_app
+++ b/lib/htsget/htsget_app
@@ -1 +1 @@
-Subproject commit 638e3b249491526a877771391f0f89a321d0be46
+Subproject commit ae8d6d7ec5fcd12b5530d9736227114ec8dd93db
diff --git a/lib/katsu/katsu_service b/lib/katsu/katsu_service
index 889bc1b4c..16b5dc204 160000
--- a/lib/katsu/katsu_service
+++ b/lib/katsu/katsu_service
@@ -1 +1 @@
-Subproject commit 889bc1b4c74b44911c81775d1b0437a56705fe66
+Subproject commit 16b5dc204c79dfda6eb6b4f4c1ef55db720a200b

From a073b9ed527ebfaf12496a16d3ee8db07f52999d Mon Sep 17 00:00:00 2001
From: Marion
Date: Fri, 27 Sep 2024 10:09:45 -0700
Subject: [PATCH 37/40] update log dir

---
 docs/backup-restore-candig.md | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/docs/backup-restore-candig.md b/docs/backup-restore-candig.md
index e23c56e28..199e0e596 100644
--- a/docs/backup-restore-candig.md
+++ b/docs/backup-restore-candig.md
@@ -131,6 +131,4 @@ All previous secrets and authorizations should be restored to the stack.
 ## Backing up logs
-Logs are stored in `tmp/logging`. The contents of this folder should be saved periodically.
-
-
+Logs are stored in `tmp/logs`. The contents of this folder should be saved periodically.
From 9e89b4769e7f16afeaa5a08d624e17b41d3d9527 Mon Sep 17 00:00:00 2001
From: Marion
Date: Fri, 27 Sep 2024 10:18:05 -0700
Subject: [PATCH 38/40] more detail to vault backup

---
 docs/backup-restore-candig.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/backup-restore-candig.md b/docs/backup-restore-candig.md
index 199e0e596..a700ecd2f 100644
--- a/docs/backup-restore-candig.md
+++ b/docs/backup-restore-candig.md
@@ -107,7 +107,7 @@ You should be able to see the restored data in the data portal.
 ## Backing up Secrets and Authorization data
-Secrets in CanDIG are stored within Vault. To back up Vault, run the command:
+Secrets and Authorization data in CanDIG are stored within Vault. These should be backed up regularly so that they can be restored should there be a system crash and before the CanDIG stack is rebuilt. To back up Vault, run the command:
 ```
 make backup-vault
@@ -127,7 +127,7 @@ Then run
 make restore-vault
 ```
-All previous secrets and authorizations should be restored to the stack.
+All previous secrets and authorizations should be restored to the stack. The tarball is renamed to `restored.tar.gz` and can be deleted.
 ## Backing up logs

From 5617d99a3f4b8ae72f573b4e2dccdf2f4ce5f733 Mon Sep 17 00:00:00 2001
From: Marion
Date: Fri, 27 Sep 2024 11:17:24 -0700
Subject: [PATCH 39/40] add note

---
 docs/ingest-and-test.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/ingest-and-test.md b/docs/ingest-and-test.md
index d096172a7..f94d980db 100644
--- a/docs/ingest-and-test.md
+++ b/docs/ingest-and-test.md
@@ -25,6 +25,8 @@ Run the tests with:
 make test-integration
 ```
+> [!Note]
+> These tests will not work if the default site administrator has been changed.
 ## Manual tests

From f08d6d2b455aa3300c4f057470145e37f22ff931 Mon Sep 17 00:00:00 2001
From: Marion
Date: Fri, 27 Sep 2024 11:22:53 -0700
Subject: [PATCH 40/40] add double proxy note

---
 docs/production-candig.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/production-candig.md b/docs/production-candig.md
index a7d7f64bb..eb81133b5 100644
--- a/docs/production-candig.md
+++ b/docs/production-candig.md
@@ -24,6 +24,9 @@ At UHN, the candig.uhnresearch.ca domain is under a proxy, so requests to a spec
 Specifically, the UHN proxy forwards all candig.uhnresearch.ca and candigauth.uhnresearch.ca requests (port 443) to candig1:5080 (tyk) and candig1:8080 (keycloak) respectively, thereby acting as a firewall. All CanDIGv2 microservices can only be accessed through Tyk.
+> [!Note]
+> BCGSC initially had an issue with having a double proxy which caused a double URL bug. Switching to a single proxy resolved this issue. Please reach out if you need help solving this.
+
 ### OpenStack security group & nginx - C3G
 An OpenStack security group is applied as a firewall that allows ingress traffic to ports 80 and 443 only.