Workers AI

Run machine learning models, powered by serverless GPUs, on Cloudflare’s global network.

| Name | Summary |
| --- | --- |
| Object Detection | Object detection models detect instances of objects such as people, faces, or license plates in an image. The task takes an image as input and returns a list of detected objects, each containing a label, a probability score, and the coordinates of its bounding box. |
| LLaVA | LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture. |
| BART Large CNN | BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. You can use this model for text summarization. |
| DreamShaper 8 LCM | Stable Diffusion model that has been fine-tuned to be better at photorealism without sacrificing range. |
| Stable Diffusion XL Base | Stable Diffusion model that has been fine-tuned to be better at photorealism without sacrificing range. |
| Stable Diffusion XL Lightning | Stable Diffusion model that has been fine-tuned to be better at photorealism without sacrificing range. |
| Stable Diffusion v1-5 Img to Img | Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images. Img2img generates a new image from an input image with Stable Diffusion. |
| stable-diffusion-v1-5-inpainting | Stable Diffusion Inpainting is a latent text-to-image diffusion model capable of generating photo-realistic images from any text input, with the extra capability of inpainting pictures by using a mask. |
| llama-3.1-8b-instruct | The Meta Llama 3.1 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models. The Llama 3.1 instruction-tuned, text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks. |
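
The Object Detection entry describes its output as a list of detected objects, each with a label, a probability score, and box coordinates. As a rough sketch of how such a result might be post-processed, the TypeScript below keeps only confident detections and computes box areas. The field names (`label`, `score`, `box`, `xmin`, `ymin`, `xmax`, `ymax`), the helper functions, and the sample data are assumptions made for illustration; the listing does not specify the exact response schema, so check the model's documentation before relying on them.

```typescript
// Assumed shape of one detected object. Field names are illustrative only.
interface BoundingBox {
  xmin: number;
  ymin: number;
  xmax: number;
  ymax: number;
}

interface Detection {
  label: string; // e.g. "person"
  score: number; // probability score in [0, 1]
  box: BoundingBox; // surrounding box coordinates, assumed to be in pixels
}

// Keep detections at or above minScore, most confident first.
// filter() returns a new array, so the input is not mutated by sort().
function filterDetections(detections: Detection[], minScore: number): Detection[] {
  return detections
    .filter((d) => d.score >= minScore)
    .sort((a, b) => b.score - a.score);
}

// Area of a box; degenerate or inverted boxes count as zero.
function boxArea(box: BoundingBox): number {
  const width = Math.max(0, box.xmax - box.xmin);
  const height = Math.max(0, box.ymax - box.ymin);
  return width * height;
}

// Hypothetical sample data standing in for a model response.
const sample: Detection[] = [
  { label: "face", score: 0.87, box: { xmin: 40, ymin: 30, xmax: 80, ymax: 70 } },
  { label: "license plate", score: 0.41, box: { xmin: 200, ymin: 150, xmax: 260, ymax: 170 } },
  { label: "person", score: 0.98, box: { xmin: 10, ymin: 20, xmax: 110, ymax: 220 } },
];

const confident = filterDetections(sample, 0.5);
for (const d of confident) {
  console.log(`${d.label}: score=${d.score}, area=${boxArea(d.box)}`);
}
```

The 0.5 threshold is arbitrary; a suitable cutoff depends on how costly false positives are for the application.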