The Modern AI Infrastructure

Trusted by Cohere, OpenAI, and Uber

Ray is the most popular open source framework for scaling and productionizing AI workloads. From Generative AI and LLMs to computer vision, Ray powers the world’s most ambitious AI workloads.

The Leader in Performance

  • 23x higher throughput*
  • 75% lower cost*
  • 60s to scale to 1,000 nodes
  • $1/TB: world record for shuffling 100TB*

Trusted by the world’s leading AI teams

From ChatGPT to Spotify recommendations to Uber ETA predictions, see how innovators are succeeding with Ray and Anyscale.

LLM / Gen AI
OpenAI

"At OpenAI, we are tackling some of the world’s most complex and demanding computational problems. Ray powers our solutions to the thorniest of these problems and allows us to iterate at scale much faster than we could before. As an example, we use Ray to train our largest models, including ChatGPT."

Greg Brockman

Co-founder and President

LLM / Gen AI
Uber

"We chose Ray as the unified compute backend for our machine learning and deep learning platform because it has allowed us to significantly improve performance and fault tolerance, while also reducing the complexity of our technology stack. Ray has brought significant value to our business, and has enabled us to rapidly pretrain, fine-tune and evaluate our LLMs."

Min Cai

Distinguished Engineer

ML WORKLOAD
AWS

"One of the biggest problems that Ray helped us resolve is improving scalability, latency, and cost-efficiency of very large workloads. We were able to improve the scalability by an order of magnitude, reduce the latency by over 90%, and improve the cost efficiency by over 90%. It was financially infeasible for us to approach that problem with any other distributed compute framework that we tried."

Patrick Ames

Principal Engineer

Start your LLM journey with Open Models

Public Cloud

Anyscale Endpoints

Get started fast with a serverless API

  • Serverless API for serving and fine-tuning
  • State-of-the-art open LLMs such as Llama 2 and Mistral
  • Embedding and Function calling APIs
  • Free for the first 1M tokens
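The serverless API above follows the familiar OpenAI-style chat completions convention. As a minimal sketch of what a request looks like (the base URL and API key below are placeholders, not real Anyscale values; check the Anyscale Endpoints documentation for the actual endpoint):

```python
import json
import urllib.request

# Placeholders -- substitute the real endpoint URL and your own API key.
BASE_URL = "https://example-endpoint.invalid/v1"
API_KEY = "YOUR_API_KEY"


def build_chat_request(model: str, prompt: str) -> dict:
    """Assemble the JSON body for an OpenAI-style chat completion call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }


def send(body: dict) -> bytes:
    """POST the request body to the chat completions route."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


# Build (but do not send) a sample request for an open model.
body = build_chat_request("meta-llama/Llama-2-7b-chat-hf", "Say hello.")
```

Because the API is OpenAI-compatible, existing OpenAI client libraries can typically be pointed at it by swapping the base URL and key.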
Private Cloud

Anyscale Private Endpoints

Architected for data privacy and governance

  • Take control of your LLM endpoints and deploy in your own cloud account in just a few clicks
  • Optimized endpoints with vLLM and continuous batching for lower latency and higher throughput
  • Establish governance over your LLMs and applications with enterprise access controls

Efficiently scale all of your ML workloads

Open Source

Ray

The open source, scalable, and flexible framework for all of your AI workloads and Python applications.

  • A single framework for training, batch, and real-time workloads running on CPUs, GPUs, and other accelerators
  • Out-of-the-box support for Graviton, Trainium, Inferentia, TPUs, and GPUs
Managed

Anyscale Platform

The managed AI application platform from the creators of Ray

  • Accelerate experiments across your teams with Anyscale Workspaces
  • Move seamlessly from research to production with Anyscale Jobs and Services
  • Schedule workloads intelligently across clouds, regions, zones, and instance types for better compute cost efficiency and availability

Ready to try Anyscale?

Try Anyscale today to see how teams using Anyscale and Ray benefit from rapid time-to-market and faster iteration across the entire AI lifecycle.