Scaling Anyscale Ray Workloads on AKS
This post focuses on running Anyscale's managed Ray service on AKS, using the Anyscale Runtime (formerly RayTurbo) for an optimized Ray experience. For open-source Ray on AKS, see our Ray on AKS overview.
Ray is an open-source distributed compute framework for scaling Python and AI workloads from a laptop to clusters with thousands of nodes. Anyscale provides a managed ML/AI platform and an optimized Ray runtime that offers better scalability, observability, and operability than self-managed open-source Ray on KubeRay, including intelligent autoscaling, enhanced monitoring, and fault-tolerant training.
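To make Ray's scaling model concrete, here is a minimal sketch of fanning work out as Ray tasks; the `square` function and the input range are illustrative placeholders, and the same code runs unchanged on a laptop or a multi-node cluster:

```python
import ray

ray.init()  # connects to an existing Ray cluster, or starts a local one

@ray.remote
def square(x: int) -> int:
    # Any plain Python function becomes a distributable task via @ray.remote.
    return x * x

# Each .remote() call is scheduled on some node in the cluster;
# ray.get() blocks until all results are back.
futures = [square.remote(i) for i in range(1000)]
print(sum(ray.get(futures)))
```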
As part of Microsoft and Anyscale's strategic collaboration to deliver Azure-native distributed AI/ML computing at scale, we've been working closely with Anyscale to enhance the production readiness of Ray workloads on Azure Kubernetes Service (AKS) in three critical areas:
- Elastic scalability through multi-cluster, multi-region capacity aggregation
- Data persistence with unified storage across the ML/AI development and operations lifecycle
- Operational simplicity through automated credential management with service principals (see the sketch after this list)
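To show what service-principal credentials look like in practice (a generic Azure illustration, not the Anyscale-managed flow itself, which automates this for you), here is a minimal sketch using the azure-identity Python SDK; the placeholder tenant, client, and secret values are assumptions you would replace with your own:

```python
from azure.identity import ClientSecretCredential

# Placeholder service principal credentials; in an automated setup these
# are issued and rotated for you rather than handled by hand.
credential = ClientSecretCredential(
    tenant_id="<tenant-id>",
    client_id="<client-id>",
    client_secret="<client-secret>",
)

# Request an Azure Resource Manager token to confirm the principal can authenticate.
token = credential.get_token("https://management.azure.com/.default")
print("token acquired, expires at:", token.expires_on)
```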
Whether you're fine-tuning models with DeepSpeed or LLaMA-Factory, or deploying inference endpoints for LLMs ranging from small models to large-scale reasoning models, Anyscale on AKS delivers a production-grade ML/AI platform that scales with your needs.
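As a flavor of what an inference endpoint looks like in Ray terms, here is a minimal sketch using open-source Ray Serve; the `Echo` deployment, route, and replica count are illustrative placeholders rather than a serving recipe from this post:

```python
from starlette.requests import Request

from ray import serve

@serve.deployment(num_replicas=2)  # replicas scale out across cluster nodes
class Echo:
    async def __call__(self, request: Request) -> dict:
        # A real deployment would run model inference here.
        payload = await request.json()
        return {"echo": payload}

# Bind and run the deployment; Serve exposes it over HTTP (port 8000 by default).
serve.run(Echo.bind(), route_prefix="/echo")
```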