Anyscale Exclusive ML Library for LLM Inference
RayLLM is an ML library for LLM inference, available exclusively on Anyscale.
Combining the performance gains of Anyscale’s optimized vLLM with the production readiness and scalability of Anyscale’s Ray Serve, RayLLM is the best way to deploy and manage open-source LLMs.
We’ve optimized vLLM for throughput, latency, and model parallelism. Tune engine performance to your specific needs and cut both batch and online inference costs.
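The exact knobs depend on your workload, but a minimal sketch using vLLM’s own Python API (not RayLLM’s configuration, which this page doesn’t detail) gives a feel for the trade-offs. The model name and values below are illustrative, not recommendations:

```python
# Illustrative vLLM engine tuning; model name and values are examples only.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # any HF-supported model
    tensor_parallel_size=2,        # shard the model across 2 GPUs
    gpu_memory_utilization=0.90,   # fraction of GPU memory for weights + KV cache
    max_num_seqs=256,              # cap on concurrent sequences (throughput vs. latency)
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Summarize Ray Serve in one sentence."], params)
print(outputs[0].outputs[0].text)
```

Raising `max_num_seqs` generally favors batch throughput, while lowering it (or adding GPUs via `tensor_parallel_size`) favors per-request latency.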
RayLLM supports any model. It’s compatible with OpenAI’s API and offers JSON mode and function calling support. Plus, you can easily deploy multi-LoRA and vision-language models.
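Because the endpoint speaks the OpenAI API, any standard OpenAI client works against it. Here is a minimal sketch with the official openai Python package showing JSON mode; the base URL, API key, and model name are placeholders, and function calling uses the same client interface:

```python
# Querying an OpenAI-compatible endpoint; URL, key, and model are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://my-rayllm-endpoint.example.com/v1",  # hypothetical endpoint
    api_key="MY_API_KEY",
)

response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.2",
    messages=[{"role": "user", "content": "List three uses of Ray Serve as JSON."}],
    response_format={"type": "json_object"},  # JSON mode
)
print(response.choices[0].message.content)
```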
Fine-tune and deploy any open-source model supported on HuggingFace, including popular models such as LLaMA and Mistral. Run any inference engine, including vLLM, TRT-LLM, and TGI.
RayLLM is built on top of Ray Serve, Anyscale’s highly scalable and efficient ML serving system. Deploy LLMs with confidence, with support for head node recovery, multi-AZ deployments, and zero-downtime upgrades.
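For a sense of the layer RayLLM builds on, here is a minimal Ray Serve deployment. The echo service below is a toy stand-in for illustration, not RayLLM itself:

```python
# A minimal Ray Serve deployment; the Echo class is a toy example, not RayLLM.
from ray import serve
from starlette.requests import Request


@serve.deployment(num_replicas=2)  # scale the service across the cluster
class Echo:
    async def __call__(self, request: Request) -> str:
        body = await request.json()
        return body.get("prompt", "")


serve.run(Echo.bind(), route_prefix="/echo")
```

Ray Serve handles replica scheduling, autoscaling, and request routing, which is what lets RayLLM focus on the LLM-specific layer on top.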
Build, deploy, and manage scalable AI and Python applications on the leading AI platform. Unlock your AI potential with Anyscale.