Today, we’re launching SmartInference, a revolutionary addition to the Raindrop platform that brings 30+ state-of-the-art AI models directly to your fingertips. For the first time, developers can access models like Kimi K2 (1 trillion parameters) and GPT-OSS 120B through a single multi-agentic AI platform, with performance that rivals enterprise deployments: for example, 250+ tokens per second of throughput at just 350ms latency for GPT-OSS-120B.
Most agentic platform vendors either cap support at 70B-class models or leave model selection, routing, and operation entirely up to you. Raindrop SmartInference takes a different approach: you describe your application requirements, and SmartInference chooses the best model to meet them (manual model selection is, of course, still available).
Every developer building AI agents faces the same frustrating reality: you’re either limited to smaller models that can’t handle complex reasoning, or you’re managing a maze of API keys, rate limits, and infrastructure decisions. Want to use DeepSeek R1’s 671B parameters for advanced reasoning? That typically requires an enterprise contract. Need Whisper Large v3 for transcription alongside Kimi K2 for complex tool use? Good luck orchestrating multiple providers.
Even worse, most platforms force you to become an AI infrastructure expert just to get started. You need to understand model capabilities, optimize for latency, manage costs, and handle failovers—all before writing a single line of your actual application.
SmartInference eliminates these barriers. Raindrop automatically picks the best model for each task in your application. Building a customer support agent? It might select Llama 3.1 8B Instant for greeting and routing (ultra-fast responses), then seamlessly hand off to DeepSeek R1 for problem-solving that requires deep reasoning.
You focus on what your agent should do. SmartInference handles everything else.
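To make the routing idea concrete, here is a minimal sketch of the kind of decision SmartInference automates: fast, cheap models for simple turns, and large reasoning models only when the task demands it. The model names mirror the families above, but the routing logic itself is purely illustrative, not Raindrop’s actual implementation or API.

```python
def route_model(task: str) -> str:
    """Pick a model tier from a rough task description (illustrative only)."""
    reasoning_markers = ("debug", "plan", "analyze", "prove", "dispute")
    if any(marker in task.lower() for marker in reasoning_markers):
        return "deepseek-r1"           # deep reasoning, used sparingly
    if "transcribe" in task.lower():
        return "whisper-large-v3"      # audio transcription
    return "llama-3.1-8b-instant"      # fast default for greetings and routing

print(route_model("Greet the customer"))           # llama-3.1-8b-instant
print(route_model("Analyze this refund dispute"))  # deepseek-r1
```

The point of the platform is that you never write this logic yourself; SmartInference makes an equivalent (and far richer) choice per task, factoring in latency, cost, and capability.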
Kimi Family
GPT-OSS Family
DeepSeek Family
Llama Family
These aren’t experimental models running on underpowered infrastructure. Our SmartInference service delivers:
This is the same performance enterprises expect from their private deployments—now available to every developer for just $5/month during our public beta.
Here’s how simple it is to create a sophisticated multi-model agent with Raindrop and SmartInference:
raindrop build deploy
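The agent code itself depends on the Raindrop SDK. As a hedged illustration only, here is a sketch of a two-tier support agent: the client class, method names, and task labels below are hypothetical stand-ins, not the real Raindrop API, and the model names match the families listed earlier.

```python
class SmartInferenceClient:
    """Stand-in client: shows which model tier would serve each call."""

    def complete(self, task: str, prompt: str) -> str:
        # On the real platform, SmartInference picks the model automatically;
        # here we mimic that choice with a simple task label.
        model = "deepseek-r1" if task == "reasoning" else "llama-3.1-8b-instant"
        return f"[{model}] {prompt}"

client = SmartInferenceClient()
# Fast model for the greeting; large reasoning model only when needed.
greeting = client.complete("chat", "Hi! How can I help with your order?")
analysis = client.complete("reasoning", "Diagnose why the order was double-billed.")
print(greeting)   # [llama-3.1-8b-instant] Hi! How can I help with your order?
print(analysis)   # [deepseek-r1] Diagnose why the order was double-billed.
```

The structure is the takeaway: one platform call per step, with model selection delegated to SmartInference rather than hard-coded against multiple provider SDKs.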
SmartInference unlocks a whole new class of applications with bigger, smarter models that are only used when needed. Build agents that can:
SmartInference is available immediately to all Raindrop Users.
You get:
You can build production agents in 25 minutes with zero infrastructure or backend skills.
That’s not marketing speak. That’s the reality of what our users are already doing. They’re building applications that are “more than just a database”, things that would typically require a team of ML engineers, a significant cloud budget, and months of lead time. They’re deploying in minutes, not months, and iterating rapidly without worrying about infrastructure.
The age of being limited by model availability is over. Sign up for Raindrop public beta access at liquidmetal.run and start building with SmartInference today.
For $5/month during public beta, you can build any agent, MCP, API, or production backend you can conceive—powered by the most advanced AI models on the planet.
Don’t wait: this pricing will never be offered again after beta, and early adopters are grandfathered in for life.
Raindrop SmartInference is available now as part of the Raindrop public beta. Setup takes less than 5 minutes. Visit liquidmetal.run to get started.