Jack Jiang

Product Manager at Microsoft

Running more with less: Multi-instance GPU (MIG) with Dynamic Resource Allocation (DRA) on AKS

March 3, 2026 · 8 min read

Sachi Desai

Product Manager for AI/ML, GPU workloads on Azure Kubernetes Service

Jack Jiang

Product Manager at Microsoft

GPUs power a wide range of production Kubernetes workloads across industries. For example, media platforms rely on them for video encoding/transcoding, financial services firms run quantitative risk simulations, and research groups process and visualize large datasets. In each of these scenarios, GPUs significantly improve job throughput, yet individual workloads often consume only a portion of the available device.

By default, Kubernetes schedules GPUs as entire units; when a workload requires only a fraction of a GPU, the rest of the device sits idle. Over time, this leads to lower hardware utilization and higher infrastructure costs within a cluster.

Multi-instance GPU (MIG) combined with Dynamic Resource Allocation (DRA) helps address this challenge. MIG partitions a physical GPU into isolated instances with dedicated compute and memory resources, while DRA enables those instances to be provisioned and bound dynamically through Kubernetes resource claims. Rather than treating a GPU as an indivisible resource, the cluster can allocate right-sized GPU partitions to multiple workloads at the same time!

Deploying KubeVirt on AKS

February 6, 2026 · 8 min read

Jack Jiang

Product Manager at Microsoft

Harshit Gupta

Senior Software Engineer at Microsoft

Many organizations still depend on virtual machines (VMs) to run applications that must meet technical, regulatory, or operational requirements. While Kubernetes adoption continues to grow, not every workload can or should be redesigned for containers.

KubeVirt is a Cloud Native Computing Foundation (CNCF) incubating open-source project that allows users to run, deploy, and manage VMs in their Kubernetes clusters.

In this post, you will learn how KubeVirt lets you run, deploy, and manage VMs in AKS.

Delve into Dynamic Resource Allocation, devices, and drivers on Kubernetes

November 17, 2025 · 13 min read

Sachi Desai

Product Manager for AI/ML, GPU workloads on Azure Kubernetes Service

Jack Jiang

Product Manager at Microsoft

Dynamic Resource Allocation (DRA) is often mentioned in discussions about GPUs and specialized devices designed for high-performance AI and video processing jobs. But what exactly is it?

Azure VM Generations and AKS

April 23, 2025 · 6 min read

Jack Jiang

Product Manager at Microsoft

Ally Ford

Product Manager 2 at Microsoft

Sarah Zhou

Product Manager at Microsoft

What are Virtual Machine Generations?

If you use Azure, you may be familiar with virtual machines. What you may not know is that Azure now offers two generations of virtual machines!

Before going further, let's first break down virtual machines. Azure virtual machines are offered in various "sizes," which are defined by the amount and type of each resource allocated, such as CPU, memory, storage, and network bandwidth. These resources are tied to a portion of a physical server's hardware capabilities. A single physical server may be divided among many different VM size series or configurations that draw on its resources.

As the physical hardware ages and newer components become available, older hardware and VMs get retired, while newer generation hardware and VM products are made available.

In this blog, we will go over Generation 1 and newer Generation 2 virtual machines. Both have their own use cases, and picking the right one to suit your workloads is critical in ensuring you get the best possible experience, capabilities, and cost.
