The following instructions show, step by step, how to use SusQL to aggregate the energy consumed by a Jupyter notebook that uses a GPU and runs on OpenShift AI.
The following are assumed to be installed and available.
oc command)
To use GPU functionality seamlessly within the cluster, a GPU and the necessary software must be available.

SusQL can aggregate the energy consumed by any code that runs in a Jupyter notebook. The following sample code demonstrates the use of GPU resources. It is also a good test case to verify that the GPU is configured correctly.
pip install pycaret[full]
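import time

import torch

# Use the GPU when one is available; otherwise fall back to the CPU.
if torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')

matrix_size = 16384
x = torch.randn(matrix_size, matrix_size)
y = torch.randn(matrix_size, matrix_size)

# Move both operands to the selected device.
x_gpu = x.to(device)
y_gpu = y.to(device)

for i in range(10):
    # CUDA kernels launch asynchronously, so synchronize before reading the clock.
    if device.type == 'cuda':
        torch.cuda.synchronize()
    start = time.time()
    result_gpu = torch.matmul(x_gpu, y_gpu)
    if device.type == 'cuda':
        torch.cuda.synchronize()
    print("Run time using device", result_gpu.device, "is", "{:.7f}".format(time.time() - start))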
A Jupyter Notebook can be created and run through the following steps:
pip command in its own cell to run once before running the rest of the Python code.)

Although the OpenShift Web Console can be used to set labels on existing workloads, this is also easy to do from the command line:
The following command removes a SusQL label on the Jupyter Notebook Server pod, in case one happens to be defined.
$ oc label pod $(oc get po -n rhods-notebooks | grep jupyter | head -1 | cut -f 1 -d" ") -n rhods-notebooks "susql.label/1-"
pod/jupyter-nb-kube-3aadmin-0 unlabeled
Next, this command sets the label susql.label/1 to openshiftaij for the Jupyter notebook server pod running in namespace rhods-notebooks.
$ oc label pod $(oc get po -n rhods-notebooks | grep jupyter | head -1 | cut -f 1 -d" ") -n rhods-notebooks "susql.label/1=openshiftaij"
pod/jupyter-nb-kube-3aadmin-0 labeled
Finally, this command verifies that the label has been set:
$ oc describe pod $(oc get po -n rhods-notebooks | grep jupyter | head -1 | cut -f 1 -d" ") -n rhods-notebooks | grep -i susql
susql.label/1=openshiftaij
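The three commands above share one pod-selection pipeline (`oc get po ... | grep jupyter | head -1 | cut -f 1 -d" "`). As a minimal sketch, the same selection can be expressed in Python. The helper names `first_jupyter_pod` and `label_command` are hypothetical, and the sample `oc get po` listing is hand-written for illustration (only the pod name is taken from the output shown above).

```python
from typing import List, Optional


def first_jupyter_pod(oc_get_po_output: str) -> Optional[str]:
    """Mimic `grep jupyter | head -1 | cut -f 1 -d" "` on `oc get po` output."""
    for line in oc_get_po_output.splitlines():
        if "jupyter" in line:
            # cut -f 1 -d" " keeps everything before the first space.
            return line.split(" ", 1)[0]
    return None


def label_command(pod: str, namespace: str, label: str) -> List[str]:
    """Build the argument list for `oc label pod`.

    A label of the form "key=value" sets it; "key-" removes it.
    """
    return ["oc", "label", "pod", pod, "-n", namespace, label]


# Illustrative listing only; the column values are assumed, not captured from a cluster.
sample = (
    "NAME                        READY   STATUS    RESTARTS   AGE\n"
    "jupyter-nb-kube-3aadmin-0   2/2     Running   0          5m\n"
)
pod = first_jupyter_pod(sample)
cmd = label_command(pod, "rhods-notebooks", "susql.label/1=openshiftaij")
print(" ".join(cmd))
```

On a machine where `oc` is logged in to the cluster, the resulting list could be passed to `subprocess.run(cmd, check=True)`.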
First create a LabelGroup definition file called openshiftaij.yaml as follows:
---
apiVersion: susql.ibm.com/v1
kind: LabelGroup
metadata:
  name: openshiftaij
  namespace: rhods-notebooks
spec:
  labels:
    - openshiftaij
---
And apply the file:
$ oc apply -f openshiftaij.yaml
labelgroup.susql.ibm.com/openshiftaij created
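When several notebooks need their own LabelGroup, the definition can also be generated rather than written by hand. The following is a sketch under assumptions: `label_group_manifest` is a hypothetical helper, and the manifest is emitted as JSON (which `oc apply -f` accepts alongside YAML) only to avoid a third-party YAML dependency. The field names mirror the definition file above.

```python
import json


def label_group_manifest(name: str, namespace: str, labels: list) -> dict:
    """Return a LabelGroup manifest with the same fields as openshiftaij.yaml."""
    return {
        "apiVersion": "susql.ibm.com/v1",
        "kind": "LabelGroup",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"labels": list(labels)},
    }


manifest = label_group_manifest("openshiftaij", "rhods-notebooks", ["openshiftaij"])
# Write this text to a file (for example openshiftaij.json) and apply it with `oc apply -f`.
print(json.dumps(manifest, indent=2))
```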
susql_total_energy_joules{susql_label_1="openshiftaij"}
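The expression above can also be evaluated programmatically. The sketch below assumes the metric is served by a Prometheus-compatible HTTP API at `/api/v1/query`; the base URL `http://prometheus.example:9090` is a placeholder, the helper names are hypothetical, and the response body is hand-written in the instant-query format rather than captured from a cluster.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url: str, label_value: str) -> str:
    """Return an instant-query URL for the SusQL total-energy metric."""
    promql = 'susql_total_energy_joules{susql_label_1="%s"}' % label_value
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})


def total_joules(response_body: str) -> float:
    """Sum the sample values in an instant-query (vector) response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("query failed")
    # Each vector entry carries value = [timestamp, "value-as-string"].
    return sum(float(item["value"][1]) for item in payload["data"]["result"])


# Hand-written response for illustration; not real cluster data.
sample_body = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"susql_label_1": "openshiftaij"}, "value": [0, "17963"]}
        ],
    },
})
url = build_query_url("http://prometheus.example:9090", "openshiftaij")
print(url)
print(total_joules(sample_body))
```

In a real cluster the URL would be fetched with an HTTP client and any authentication the Prometheus endpoint requires; those details depend on the installation and are not covered here.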
If you have cloned the susql-operator GitHub repository, you can also run the test/susqltop command to view the energy aggregation from the command line.
$ test/susqltop
NameSpace        LabelGroup    Labels            TotalEnergy (J)
rhods-notebooks  openshiftaij  ["openshiftaij"]  17963.00
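The total is reported in joules. As a small worked example, the value from the sample output above can be converted to watt-hours, which is often easier to compare with power-meter readings. The helper name `joules_to_watt_hours` is hypothetical.

```python
JOULES_PER_WATT_HOUR = 3600.0  # 1 Wh is 1 W sustained for 3600 s


def joules_to_watt_hours(joules: float) -> float:
    """Convert an energy value in joules to watt-hours."""
    return joules / JOULES_PER_WATT_HOUR


total_energy_j = 17963.00  # value from the sample susqltop output above
print("{:.2f} Wh".format(joules_to_watt_hours(total_energy_j)))
```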