GCP Help
Get the project name from your team leader and substitute it for $PROJECT_NAME in the commands below.
export PROJECT_NAME=$PROJECT_NAME
export VM_NAME=yolo
export CPU_SIZE=e2-standard-16
export TPU_ZONE=us-central1-a
gcloud config set project $PROJECT_NAME
ctpu up --vm-only --name=$VM_NAME --project=$PROJECT_NAME --zone=$TPU_ZONE --disk-size-gb=50 --machine-type=$CPU_SIZE --tf-version=2.4.1
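Because the project name has to be filled in by hand, it can help to confirm that every variable is set before calling `gcloud` or `ctpu`. A minimal sketch, assuming a POSIX shell; `require_vars` is a hypothetical helper, not part of either tool:

```shell
# Hypothetical helper: report every named variable that is unset or empty.
# Returns 0 when all are set, 1 otherwise.
require_vars() {
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_vars PROJECT_NAME VM_NAME CPU_SIZE TPU_ZONE` prints one line per missing variable and returns a non-zero status, so it can be chained in front of the `ctpu up` command with `&&`.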
Inside the VM:
export TPU_NAME=fat-boy
export BRANCH_NAME=yolo_debug
export EXPERIMENT_NAME=yolov4-coco-tpuV2-256
export MODEL_DIR=$EXPERIMENT_NAME
export LOG_FILE=$EXPERIMENT_NAME
git clone https://github.com/PurdueCAM2Project/TensorFlowModels
cd TensorFlowModels
git checkout $BRANCH_NAME
pip install -r yolo/requirements.txt
nohup python3 -m yolo.train_vm --mode=train_and_eval --experiment=yolo_custom --config_file=yolo/configs/experiments/$EXPERIMENT_NAME.yaml --model_dir="gs://tensorflow2/$MODEL_DIR" > $LOG_FILE &
tail -f $LOG_FILE
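The `nohup` line above does not record the process ID of the training job, which makes it harder to check on or stop later. One possible variant, sketched with a hypothetical `run_in_background` helper, writes the PID next to the log file and makes the standard-error redirect explicit:

```shell
# Hypothetical helper: run a command under nohup with stdout and stderr in one
# log file, and save its PID to "<log file>.pid".
run_in_background() {
  log_file=$1
  shift
  # Redirect stdin from /dev/null so the job never waits on the terminal.
  nohup "$@" > "$log_file" 2>&1 < /dev/null &
  echo "$!" > "$log_file.pid"
}
```

It would be called as `run_in_background "$LOG_FILE" python3 -m yolo.train_vm` followed by the same arguments as above. Afterwards, `kill -0 "$(cat "$LOG_FILE.pid")"` succeeds while the job is still running, and `kill "$(cat "$LOG_FILE.pid")"` stops it.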
export PROJECT_NAME=$PROJECT_NAME
export VM_NAME=yolo
export TPU_SIZE=v2-64
export CPU_SIZE=e2-standard-16
export TPU_ZONE=us-central1-a
gcloud config set project $PROJECT_NAME
ctpu up --tpu-size=$TPU_SIZE --name=$VM_NAME --project=$PROJECT_NAME --zone=$TPU_ZONE --disk-size-gb=50 --machine-type=$CPU_SIZE --tf-version=2.4.1
Inside the VM:
export VM_NAME=yolo
export TPU_NAME=$VM_NAME
export BRANCH_NAME=yolo_debug
export EXPERIMENT_NAME=yolov4-coco-tpuV2-256
export MODEL_DIR=$EXPERIMENT_NAME
export LOG_FILE=$EXPERIMENT_NAME
git clone https://github.com/PurdueCAM2Project/TensorFlowModels
cd TensorFlowModels
git checkout $BRANCH_NAME
pip install -r yolo/requirements.txt
nohup python3 -m yolo.train_vm --mode=train_and_eval --experiment=yolo_custom --config_file=yolo/configs/experiments/$EXPERIMENT_NAME.yaml --model_dir="gs://tensorflow2/$MODEL_DIR" > $LOG_FILE &
tail -f $LOG_FILE
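If `$EXPERIMENT_NAME` does not match a file under `yolo/configs/experiments/`, the problem would only surface in the log after the job has been launched. A small pre-flight check can catch it earlier. This is a sketch with a hypothetical `experiment_config` helper, assuming it is run from the `TensorFlowModels` directory:

```shell
# Hypothetical helper: print the config path for an experiment name,
# or fail with a message if the file does not exist.
experiment_config() {
  config="yolo/configs/experiments/$1.yaml"
  if [ ! -f "$config" ]; then
    echo "config not found: $config" >&2
    return 1
  fi
  echo "$config"
}
```

Running `experiment_config "$EXPERIMENT_NAME"` before the `nohup` command prints the path that will be passed to `--config_file`, or returns a non-zero status if the branch does not contain that experiment.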
Follow the instructions here: https://cloud.google.com/sdk/docs/install
Then download ctpu to the bin folder in the Cloud SDK from this Git repo: https://github.com/tensorflow/tpu/blob/master/tools/ctpu/README.md
First, read through Google's documentation on creating and deleting TPUs and on linking buckets to the VM. It shows how to do this in the GCP console, in your terminal, and in the Cloud Shell (ctpu commands).
https://cloud.google.com/tpu/docs/creating-deleting-tpus
https://cloud.google.com/tpu/docs/storage-buckets
Run this command in the GCP Cloud Shell:
gcloud beta services identity create --service tpu.googleapis.com --project $PROJECT_NAME
https://cloud.google.com/compute/docs/instances/adding-removing-ssh-keys#createsshkeys
Logging in through the SSH button in the GCP console creates a new VM user for the Google account you are signed in with. That is fine when you are signed in as cam2tensorflow, but if you are using your personal account, you need to create your own key to access the cam2tensorflow user:
ssh-keygen -t rsa -f ~/.ssh/gcp -C cam2tensorflow
Add the contents of your generated ~/.ssh/gcp.pub to the project metadata: https://console.cloud.google.com/compute/metadata
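The `-C cam2tensorflow` comment is worth double-checking before pasting the key. Compute Engine appears to take the VM user name from the trailing comment of the public key; treat that as an assumption and confirm it against the SSH keys documentation linked above. A quick check, using a hypothetical `key_username` helper:

```shell
# Hypothetical helper: print the comment (third field) of an OpenSSH
# public key file, e.g. "ssh-rsa AAAA... cam2tensorflow".
key_username() {
  awk '{ print $3 }' "$1"
}
```

With the key generated above, `key_username ~/.ssh/gcp.pub` should print `cam2tensorflow`. Once the key is in the metadata, a connection would look like `ssh -i ~/.ssh/gcp cam2tensorflow@EXTERNAL_IP`, where `EXTERNAL_IP` is a placeholder for the VM's external address.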
https://cloud.google.com/iam/docs/creating-managing-service-account-keys
Go to https://console.cloud.google.com/iam-admin/serviceaccounts, click on the developer service account, and create a key for it.
pip install google-cloud-storage
TODO: Understand the Python