
Simplifying AI Development at Scale: Google Cloud Integrates Anyscale's RayTurbo with GKE

Engineering teams can now build and scale AI applications more simply and with dramatically improved performance, thanks to a new partnership between Google Cloud and Anyscale.

This collaboration will integrate Anyscale RayTurbo—a high-performance runtime for Ray—with Google Kubernetes Engine (GKE), creating a unified platform that acts as a distributed operating system for AI. The news comes as organizations increasingly choose Kubernetes as the de facto infrastructure platform for AI training and inference.

By combining Ray’s Python-native approach to distributed computing with GKE’s robust container and workload orchestration, we are delivering a simpler, more scalable, and hyper-efficient way to manage AI workloads. To learn more about Anyscale RayTurbo on GKE, sign up here.

Ray, the Compute Engine for AI

Developers have embraced the open source Ray project for its ability to seamlessly handle complex, distributed Python workloads, scaling easily and efficiently across CPUs, GPUs, and TPUs. With fine-grained parallelism and intuitive Pythonic APIs, Ray makes it natural to express distributed AI workloads, from multimodal data processing and training to inference.
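To give a feel for that programming model, here is a minimal sketch using open source Ray's core APIs; the function and class names are placeholders, not part of any specific workload:

```python
import ray

ray.init()  # attaches to an existing Ray cluster if one is configured, otherwise starts a local one

# Any Python function becomes a distributed task with one decorator.
@ray.remote
def clean(batch):
    return [text.strip().lower() for text in batch]

# An actor holds state across calls; a GPU can be requested with @ray.remote(num_gpus=1).
@ray.remote
class Embedder:
    def embed(self, batch):
        # Placeholder for a real model call.
        return [hash(text) for text in batch]

batches = [["Hello World", " Ray on GKE "], ["Distributed  Python"]]

# Launch tasks in parallel; the resulting object references are passed straight into the actor.
cleaned = [clean.remote(b) for b in batches]
embedder = Embedder.remote()
vectors = ray.get([embedder.embed.remote(c) for c in cleaned])
print(vectors)
```

The same code runs unchanged on a laptop or a multi-node cluster, which is what makes Ray attractive as a compute layer for AI pipelines.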

Thousands of organizations—such as Coinbase, Attentive, Spotify, and Uber—rely on Ray as their compute engine for AI, using it to build, train, and run models in production. Capable of scaling to thousands of nodes and achieving throughput exceeding millions of tasks per second, Ray has become critical infrastructure for the AI community.

Ray + Kubernetes = A Distributed Operating System for AI

Platform engineers have long trusted Kubernetes, and specifically GKE from Google Cloud, for its powerful orchestration, resource isolation, and autoscaling capabilities. To support these users, Google and Anyscale have already partnered in open source on KubeRay, which runs OSS Ray deployments on Kubernetes. Through that collaboration, GKE has provided an excellent foundation for Ray workloads, with first-class GPU support, low-latency networking, and a dependable cluster autoscaler that reaches unmatched scale.
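As a rough illustration, once a KubeRay-managed cluster is running on GKE, work can be submitted to it with Ray's job submission SDK. The sketch below is illustrative only: the service hostname, entrypoint script, and dependencies are hypothetical and depend on your own RayCluster and application.

```python
from ray.job_submission import JobSubmissionClient

# Hypothetical address of the Ray head service created by KubeRay;
# the actual hostname depends on your RayCluster name and namespace.
client = JobSubmissionClient("http://raycluster-sample-head-svc.default.svc.cluster.local:8265")

# Ship the working directory and pip dependencies alongside the job via runtime_env.
job_id = client.submit_job(
    entrypoint="python train.py",  # placeholder entrypoint script
    runtime_env={"working_dir": "./", "pip": ["torch"]},
)

print(client.get_job_status(job_id))
```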

This partnership takes GKE further by natively integrating Anyscale RayTurbo, an optimized version of open source Ray that boosts task execution speed, increases throughput, and improves GPU and TPU utilization. Together, they form a distributed operating system tailored for AI, enabling teams to build, deploy, and scale applications with the underlying infrastructure abstracted away.

Empowering AI Teams for Scale

This collaboration can help eliminate the bottlenecks in AI development and production. Developers can accelerate model experimentation rather than wrestling with over-provisioned GPUs, brittle scaling logic, or DevOps overhead. Platform engineers can launch optimized RayTurbo clusters on GKE with ease and speed. The result is a combined platform that efficiently manages the dynamic, variable compute patterns of AI workloads, supporting increasingly sophisticated applications at massive scale.

The advantages of this integration are transformative:

  • Optimized Performance: RayTurbo delivers up to 4.5X faster multimodal data processing, 54% higher throughput in model serving, and up to 50% fewer nodes needed for online model serving, significantly cutting costs.

  • Enhanced GKE Features: Google Cloud is introducing Kubernetes capabilities tailored for RayTurbo on GKE, including TPU support, dynamic resource allocation, topology-aware scheduling, custom compute classes, improved horizontal and vertical pod autoscaling, and dynamic container mutation – further boosting performance and efficiency.

A Distributed Operating System for AI

Both Google Cloud and Anyscale are committed to making AI applications as straightforward to build and run as writing Python code. By partnering to create this distributed OS for AI, we’re empowering developers and platform engineers to tackle sophisticated AI projects with unparalleled performance and flexibility. This is a major step forward in accelerating AI innovation, and we’re excited to see what our customers will build with it.

To learn more and get started with Anyscale RayTurbo on GKE, sign up here.

Ready to try Anyscale?

Access Anyscale today to see how companies using Anyscale and Ray benefit from rapid time-to-market and faster iterations across the entire AI lifecycle.