Why Replicate is joining Cloudflare

Cloudflare BlogDecember 1, 2025

Why It Matters

Replicate's acquisition by Cloudflare signifies a major step towards building a comprehensive, integrated AI infrastructure. It promises to simplify the deployment and scaling of complex AI applications by combining model serving with a global network and full-stack primitives.

Key Takeaways

  • Replicate, founded in 2019, aimed to democratize access to research-grade ML models by abstracting away infrastructure complexities.
  • They developed Cog for model packaging and the Replicate platform for running models as cloud API endpoints, successfully scaling with models like Stable Diffusion.
  • The modern AI stack has evolved beyond just model inference, requiring a full suite of services like microservices, storage, and databases.
  • Replicate is joining Cloudflare to leverage Cloudflare's extensive network, Workers, R2, and other primitives to build a complete, integrated AI infrastructure layer.
  • This acquisition will enable faster edge models, model pipelines on Workers, and streaming model I/O, realizing a vision where "the network is the computer" for AI.

Keywords

Machine Learning PlatformAI InfrastructureCloudflare WorkersEdge ComputingGPU Cluster ManagementModel InferenceDistributed Systems

Content Preview

We're happy to announce that as of today Replicate is officially part of Cloudflare.

When we started Replicate in 2019, OpenAI had just open sourced GPT-2, and few people outside of the machine learning community paid much attention to AI. But for those of us in the field, it felt like something big was about to happen. Remarkable models were being created in academic labs, but you needed a metaphorical lab coat to be able to run them.

We made it our mission to get research models out of the lab into the hands of developers. We wanted programmers to creatively bend and twist these models into products that the researchers would never have thought of.

We approached this as a tooling problem. Just like tools like Heroku made it possible to run websites without managing web servers, we wanted to build tools for running models without having to understand backpropagation or deal with CUDA errors.

The first tool we built was Cog: a standard packaging format for machine learning models. Then we built Replicate as the platform to run Cog models as API endpoints in the cloud. We abstracted away both the low-level machine learning, and the complicated GPU cluster management you need to run inference at scale.

It turns out the timing was just right. When Stable Diffusion was released in 2022 we had mature infrastructure that could handle the massive developer interest in running these models. A ton of fantastic apps and products were built on Replicate, apps that often ran a single model packaged in a slick UI to solve a particular use case.

Since then, AI Engineering has matured into a serious craft. AI apps are no longer just about running models. The modern AI stack has model inference, but also microservices, content delivery, object storage, caching, databases, telemetry, etc. We see many of our customers building complex heterogenous stacks where the Replicate models are one part of a higher-order system across several platforms.

This is why we’re joining Cloudflare. Replicate has the tools and primitives for running models. Cloudflare has the best network, Workers, R2, Durable Objects, and all the other primitives you need to build a full AI stack.

The AI stack lives entirely on the network. Models run on data center GPUs and are glued together by small cloud functions that call out to vector databases, fetch objects from blob storage, call MCP servers, etc. “The network is the computer” has never been more true.

At Cloudflare, we’ll now be able to build the AI infrastructure layer we have dreamed of since we started. We’ll be able to do things like run fast models on the edge, run model pipelines on instantly-booting Workers, stream model inputs and outputs with WebRTC, etc.

We’re proud of what we’ve built at Replicate. We were the first generative AI serving platform, and we defined the abstractions and design patterns that most of our peers have adopted. We’ve grown a wonderful community of builders and researchers around our product.

Continue reading on the original blog to support the author

Read Full Article
Why Replicate is joining Cloudflare - Enggist