GPU-Enabled Azure Function on Azure Kubernetes Service

Enable AKS GPU capabilities

From: https://learn.microsoft.com/en-us/azure/aks/gpu-cluster#use-the-aks-specialized-gpu-image-preview

Also: https://github.com/puthurr/python-azure-function-gpu

az login

## Add aks-preview extension to Azure CLI
az extension add --name aks-preview
az extension update --name aks-preview

## Register GPU-Dedicated VHD Preview Feature
az feature register --namespace "Microsoft.ContainerService" --name "GPUDedicatedVHDPreview"
## Check registration status; re-run until "state" shows "Registered" (this can take a few minutes)
az feature show --namespace "Microsoft.ContainerService" --name "GPUDedicatedVHDPreview"

## Register Node Public IP Tags Preview Feature
# az feature register --namespace "Microsoft.ContainerService" --name "NodePublicIPTagsPreview"
# az feature show --namespace "Microsoft.ContainerService" --name "NodePublicIPTagsPreview"

## Refresh the ContainerService provider registration once the feature shows "Registered"
az provider register --namespace Microsoft.ContainerService
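
Feature registration is asynchronous, so `az feature show` may need to be run several times. A minimal sketch of a polling loop, assuming the state is read from `properties.state`; `wait_for_feature` is a hypothetical helper name, and the interval and retry count are arbitrary defaults:

```shell
# Sketch only: wait_for_feature is a hypothetical helper, not an Azure CLI command.
# Arguments: <namespace> <feature> [interval-seconds] [max-tries]
wait_for_feature() {
  local namespace="$1" feature="$2" interval="${3:-30}" max_tries="${4:-40}"
  local state i=1
  while [ "$i" -le "$max_tries" ]; do
    # Ask the CLI for just the registration state as plain text
    state="$(az feature show --namespace "$namespace" --name "$feature" \
      --query properties.state --output tsv)"
    echo "attempt $i: $feature is ${state:-unknown}" >&2
    if [ "$state" = "Registered" ]; then
      return 0
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  # Gave up without seeing "Registered"
  return 1
}

# Usage (uncomment to run against your subscription):
# wait_for_feature Microsoft.ContainerService GPUDedicatedVHDPreview \
#   && az provider register --namespace Microsoft.ContainerService
```

Chaining the provider refresh with `&&` keeps it from running before the feature is actually registered.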

Deploy Resource Group and AKS

## Create Resource Group
az group create --name dolly-dev-rg --location eastus

## Create AKS Cluster
az aks create --resource-group dolly-dev-rg --name dolly-aks --node-count 1 --generate-ssh-keys
# az aks create --resource-group dolly-dev-rg --name dolly-aks --node-count 1 --generate-ssh-keys --enable-node-public-ip --node-public-ip-tags RoutingPreference=Internet
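
When re-running these steps, it can help to skip creation if the resource group is already there. A small sketch, assuming `az group exists` prints `true` or `false`; `ensure_resource_group` is a hypothetical helper name:

```shell
# Sketch only: ensure_resource_group is a hypothetical helper.
# Arguments: <resource-group-name> <location>
ensure_resource_group() {
  local name="$1" location="$2"
  # az group exists prints "true" or "false"
  if [ "$(az group exists --name "$name")" = "true" ]; then
    echo "resource group $name already exists"
  else
    az group create --name "$name" --location "$location" --output none
    echo "resource group $name created in $location"
  fi
}

# Usage (uncomment to run against your subscription):
# ensure_resource_group dolly-dev-rg eastus
```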

Create a Node Pool

## Create Nodepool
az aks nodepool add --resource-group dolly-dev-rg --cluster-name dolly-aks --name dollywood --node-count 1 --node-vm-size Standard_NC24ads_A100_v4 --node-taints sku=gpu:NoSchedule --aks-custom-headers UseGPUDedicatedVHD=true --enable-cluster-autoscaler --min-count 1 --max-count 1
# az aks nodepool add --resource-group dolly-dev-rg --cluster-name dolly-aks --name dollywood --node-count 1 --node-vm-size Standard_NC24ads_A100_v4 --node-taints sku=gpu:NoSchedule --aks-custom-headers UseGPUDedicatedVHD=true --enable-cluster-autoscaler --min-count 1 --max-count 1 --enable-node-public-ip --node-public-ip-tags RoutingPreference=Internet
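
Because the node pool is tainted with `sku=gpu:NoSchedule`, a pod only lands on it if it carries a matching toleration, and it only gets a GPU if it sets an `nvidia.com/gpu` limit. The sketch below prints a minimal pod manifest with both. It is not the contents of `dolly-gpu-k8s.yaml`, which this README does not show; `gpu_pod_manifest` is a hypothetical helper, and the pod name and image are whatever you pass in:

```shell
# Sketch only: gpu_pod_manifest is a hypothetical helper that prints a pod manifest.
# Arguments: <pod-name> <container-image>
gpu_pod_manifest() {
  local pod_name="$1" image="$2"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${pod_name}
spec:
  containers:
    - name: ${pod_name}
      image: ${image}
      resources:
        limits:
          nvidia.com/gpu: 1   # ask the scheduler for one GPU
  tolerations:
    - key: sku                # matches --node-taints sku=gpu:NoSchedule
      operator: Equal
      value: gpu
      effect: NoSchedule
EOF
}

# Usage (uncomment and supply your own pod name and image):
# gpu_pod_manifest my-gpu-pod myregistry.example.com/my-image:latest | kubectl apply -f -
```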

Deploy and Access Pod

## Get AKS Credentials for kubectl
az aks get-credentials --resource-group dolly-dev-rg --name dolly-aks

## Apply YAML file
kubectl apply -f dolly-gpu-k8s.yaml

## Open an interactive shell in the Pod
kubectl exec -it dolly-gpu-1 -- /bin/bash
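
`kubectl exec` fails if the pod is still pulling its image or starting up. A sketch that waits for readiness and then runs a non-interactive GPU check, assuming `nvidia-smi` is available inside the container image; `check_pod_gpu` is a hypothetical helper name:

```shell
# Sketch only: check_pod_gpu is a hypothetical helper.
# Arguments: <pod-name> [timeout, e.g. 300s]
check_pod_gpu() {
  local pod="$1" timeout="${2:-300s}"
  # Block until the pod reports Ready, or give up after the timeout
  kubectl wait --for=condition=Ready "pod/$pod" --timeout="$timeout" || return 1
  # Assumption: the container image ships nvidia-smi
  kubectl exec "$pod" -- nvidia-smi
}

# Usage (uncomment once the manifest has been applied):
# check_pod_gpu dolly-gpu-1
```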

Confirm GPUs are Available

kubectl get nodes
## Replace the node name below with the GPU node listed by kubectl get nodes
kubectl describe node aks-dollywood-24483328-vmss000000

With the pod running, the Allocated resources section should show 1 nvidia.com/gpu:

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                310m (1%)   500m (2%)
  memory             220Mi (0%)  1762Mi (0%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
  nvidia.com/gpu     1           1
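
To check this without reading the whole `describe` output, the GPU row can be picked out with `awk`. A minimal sketch, assuming the table layout shown above; `allocated_gpus` is a hypothetical helper that reads `kubectl describe node` output on stdin:

```shell
# Sketch only: allocated_gpus is a hypothetical helper.
# Reads `kubectl describe node` output on stdin and prints the Requests value
# from the nvidia.com/gpu row of the "Allocated resources" table.
# The Capacity/Allocatable rows use "nvidia.com/gpu:" (with a colon), so they do not match.
allocated_gpus() {
  awk '$1 == "nvidia.com/gpu" { print $2; found = 1; exit }
       END { if (!found) exit 1 }'
}

# Usage (uncomment and replace the node name):
# kubectl describe node aks-dollywood-24483328-vmss000000 | allocated_gpus
```

The function exits non-zero when no GPU row is present, which makes it usable in a script condition.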

Get Service Endpoint

## Get Public IP Address (shown in the EXTERNAL-IP column)
kubectl get services
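
The external IP can take a short while to be assigned. A sketch that polls for it, assuming the manifest defines a `LoadBalancer` service; `service_external_ip` is a hypothetical helper, and the service name is whatever your manifest uses:

```shell
# Sketch only: service_external_ip is a hypothetical helper.
# Arguments: <service-name> [max-tries] [interval-seconds]
service_external_ip() {
  local service="$1" tries="${2:-30}" interval="${3:-10}"
  local ip i=1
  while [ "$i" -le "$tries" ]; do
    # Empty until the load balancer has been provisioned
    ip="$(kubectl get service "$service" \
      --output jsonpath='{.status.loadBalancer.ingress[0].ip}')"
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  return 1
}

# Usage (uncomment and replace the service name):
# service_external_ip my-service
```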