Session: 2 for 1: Easy AI Computational Benchmarking Across Multiple Cloud Resources/Linux OpenStack Kubernetes Infrastructure: A LOKI Stack

Easy AI Computational Benchmarking Across Multiple Cloud Resources – Dharhas Pothina

Determining the most efficient cloud hardware for training, evaluating, or deploying a deep learning model can be time-consuming, and running the model on poorly chosen resources can be costly. Historically, benchmarking the computational performance of AI models required sophisticated infrastructure or expensive SaaS products, which are often out of reach for teams without dedicated DevOps expertise or deep pockets.

All the tools needed to explore computational performance are available in the open source ecosystem. By combining these tools, we will show how to easily run computational performance tests across a range of cloud GPU resources for deep learning models, such as fine-tuning large language models (LLMs), to assess code quality and determine optimal resource usage.

We will walk through a typical workflow, from initial experimentation in Jupyter to a pipeline that runs models on multiple GPU resources (via Hera and Argo Workflows) to evaluate resource efficiency and profile code, all with minimal Python code. Resource usage is stored in a Prometheus database, and detailed profiling data from PyTorch Profiler is saved to logs that are viewable in TensorBoard. This workflow will be demonstrated on Nebari, a new open source data science platform that provides scalable CPU and GPU compute and common data science tools on the cloud of your choice with minimal cloud expertise.
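As a rough illustration of the measurement at the core of such a workflow, the sketch below times a workload and converts the runtime into a cost per run. It is not taken from the talk: it uses only the Python standard library and runs locally, whereas the pipeline described above would run the same kind of step on each GPU resource through Hera and Argo Workflows. The names `benchmark`, `cost_per_run`, and `toy_workload` are hypothetical, and the hourly price is a placeholder rather than a real cloud price.

```python
import statistics
import time
from typing import Callable


def benchmark(workload: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock time in seconds over `repeats` runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def cost_per_run(seconds: float, hourly_price: float) -> float:
    """Convert a runtime and an hourly instance price into a cost per run."""
    return seconds / 3600.0 * hourly_price


def toy_workload() -> int:
    # Hypothetical stand-in for one training or evaluation step.
    return sum(i * i for i in range(100_000))


if __name__ == "__main__":
    seconds = benchmark(toy_workload)
    # hourly_price=1.0 is a placeholder, not a real cloud price.
    cost = cost_per_run(seconds, hourly_price=1.0)
    print(f"median runtime: {seconds:.4f} s, cost per run: {cost:.8f}")
```

Running the same measurement on each candidate resource and comparing cost per run, rather than runtime alone, is one simple way to see whether a faster instance is worth its higher price.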

Linux OpenStack Kubernetes Infrastructure: A LOKI Stack – Kendall Nelson

Recently, the 2022 OpenStack User Survey showed that Kubernetes is now deployed on over 85% of OpenStack deployments. Kubernetes ON OpenStack. You might be thinking: "Wait, together? I thought that as a user I had to choose one or the other: VMs or containers?" Usage of OpenStack and Kubernetes together in production has risen to 21% (from 16% in 2021). These deployments primarily run on a Linux distribution. Together, Linux, OpenStack, and Kubernetes are becoming a powerhouse of a stack that can handle many different use cases and workloads. This talk will show attendees how OpenStack and Kubernetes fit together and give some examples of their use in production today.