The Problem: One Model Doesn't Fit All
When you're running 14 AI agents that handle everything from answering customer phone calls to generating financial reports, you hit a fundamental problem: no single AI model is the best choice for every task.
Claude Sonnet excels at empathetic customer conversations. GPT-4o is better at structured data processing. Haiku is 15x cheaper than Opus for simple classifications. GPT-5.4 handles complex reasoning at a fraction of Opus's cost. If you default everything to one premium model, you're either overpaying or underperforming.
At Solid#, 14 AI agents serve businesses across 52 industries. A plumber's customer service bot, a dental practice's appointment scheduler, and a law firm's intake agent all have different needs. SmartRouter is how we solved this.
Architecture: A 5-Priority Routing Chain
SmartRouter doesn't just pick a model at random. It runs through a priority chain that balances customer preferences, experimental data, prompt characteristics, historical performance, and task-specific heuristics.
Priority 1: Company Preference
Every business on Solid# can set their preferred AI provider per task type. A company that wants all customer interactions handled by Claude can configure that. SmartRouter respects this — but first validates that the company actually has credentials for that provider. Multi-tenancy means every company can have different provider configurations.
Priority 1.5: A/B Experiments
SmartRouter supports live A/B testing between providers. We can run an experiment like “Does GPT-5.4 outperform Claude Sonnet for dental practice customer service?” and let the data decide. Each experiment gets an ID, and routing decisions are tagged so we can measure quality, cost, and latency per variant.
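One common way to split traffic for an experiment like this is deterministic hashing, so the same company always lands on the same variant. The sketch below assumes that approach; the function names, the ID formats, and the shape of the tag are hypothetical and not taken from SmartRouter itself.

```python
import hashlib

def assign_variant(experiment_id: str, company_id: str, variants: list[str]) -> str:
    """Pick a variant deterministically from the experiment and company IDs."""
    digest = hashlib.sha256(f"{experiment_id}:{company_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

def tag_decision(experiment_id: str, company_id: str, variants: list[str]) -> dict:
    """Build the tag stored with a routing decision so metrics can be grouped per variant."""
    return {
        "experiment_id": experiment_id,
        "variant": assign_variant(experiment_id, company_id, variants),
    }

# Hypothetical experiment: GPT-5.4 vs. Claude Sonnet for dental customer service.
tag = tag_decision("exp-dental-cs", "company-123", ["gpt-5.4", "claude-sonnet"])
print(tag)
```

Hashing rather than random choice keeps a company's experience stable for the life of the experiment, which makes per-variant quality, cost, and latency comparisons easier to trust.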
Priority 1.7: Context-Aware Routing
This is where it gets interesting. SmartRouter looks at the actual prompt before deciding:
- Short queries (<200 tokens): Route to fast, cheap models like Haiku or GPT-4o Mini. A customer asking “What are your hours?” doesn't need Opus.
- Long analytical queries (>5,000 tokens): Route to powerful reasoning models like GPT-5.4. A complex business analysis needs depth.
- Exception: Orchestrator agents never downgrade. ADA (our AI coordinator) orchestrates 14 agents and 608 MCP tools. Even a 5-word question to ADA might trigger a complex multi-agent workflow, so she always gets a reasoning-class model.
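The three rules above can be sketched as a small function. The 200 and 5,000 token thresholds come from the text; the four-characters-per-token estimate and the tier names are assumptions for illustration, and a real router would use the provider's tokenizer.

```python
from typing import Optional

SHORT_PROMPT_TOKENS = 200    # threshold from the text
LONG_PROMPT_TOKENS = 5_000   # threshold from the text

def estimate_tokens(prompt: str) -> int:
    """Rough estimate of ~4 characters per token (an assumption, not a tokenizer)."""
    return max(1, len(prompt) // 4)

def context_tier(prompt: str, is_orchestrator: bool = False) -> Optional[str]:
    """Return 'fast', 'reasoning', or None when prompt length gives no signal."""
    if is_orchestrator:
        return "reasoning"   # orchestrators never downgrade, whatever the length
    tokens = estimate_tokens(prompt)
    if tokens < SHORT_PROMPT_TOKENS:
        return "fast"
    if tokens > LONG_PROMPT_TOKENS:
        return "reasoning"
    return None              # mid-length prompt: defer to the next priority

print(context_tier("What are your hours?"))                        # fast
print(context_tier("What are your hours?", is_orchestrator=True))  # reasoning
```

Returning None for mid-length prompts lets the later priorities (performance data, then heuristics) make the call.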
Priority 2: Performance-Based Routing
SmartRouter tracks every request's success rate, latency, and cost in Redis. Over time, it learns which provider actually performs best for each task type. If GPT-4o starts outperforming Claude for email generation in production, the router shifts automatically. This is the VC feedback loop: “Let your orchestrator learn.”
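A minimal sketch of that feedback loop follows. It keeps counters in an in-memory dictionary where production would use Redis, and it ranks providers by success rate with average latency as the tie-breaker. The minimum sample count and the ranking rule are assumptions, not SmartRouter's actual scoring.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

MIN_SAMPLES = 50  # hypothetical: ignore providers without enough data

@dataclass
class Stats:
    requests: int = 0
    successes: int = 0
    total_latency_ms: float = 0.0

    @property
    def success_rate(self) -> float:
        return self.successes / self.requests if self.requests else 0.0

    @property
    def avg_latency_ms(self) -> float:
        return self.total_latency_ms / self.requests if self.requests else float("inf")

# Keyed by (task_type, provider); stands in for the Redis counters.
stats: dict[tuple[str, str], Stats] = defaultdict(Stats)

def record(task_type: str, provider: str, ok: bool, latency_ms: float) -> None:
    s = stats[(task_type, provider)]
    s.requests += 1
    s.successes += int(ok)
    s.total_latency_ms += latency_ms

def best_provider(task_type: str, candidates: list[str]) -> Optional[str]:
    """Return the best-performing candidate, or None if data is too thin."""
    eligible = [p for p in candidates if stats[(task_type, p)].requests >= MIN_SAMPLES]
    if not eligible:
        return None
    return max(
        eligible,
        key=lambda p: (stats[(task_type, p)].success_rate,
                       -stats[(task_type, p)].avg_latency_ms),
    )
```

When `best_provider` returns None, the router would fall through to the next priority instead of guessing from a handful of requests.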
Priority 3: Heuristic Chain
When there's no company preference, no experiment, and not enough performance data, SmartRouter falls back to curated task-to-provider mappings. These are informed by thousands of hours of testing each model against each task type.
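Taken together, the five priorities behave like an ordered list of resolvers where the first one with an answer wins. The sketch below shows only that control flow. Each resolver reads a precomputed field from a request dictionary, and those field names are hypothetical placeholders for the real lookups.

```python
from typing import Callable, Optional

Request = dict
Resolver = Callable[[Request], Optional[str]]

# Each resolver returns a provider name, or None to pass to the next priority.
def company_preference(req: Request) -> Optional[str]:
    return req.get("company_preference")

def ab_experiment(req: Request) -> Optional[str]:
    return req.get("experiment_variant")

def context_aware(req: Request) -> Optional[str]:
    return req.get("context_choice")

def performance_based(req: Request) -> Optional[str]:
    return req.get("learned_best")

def heuristic_chain(req: Request) -> Optional[str]:
    return req.get("heuristic_default")

PRIORITY_CHAIN: list[tuple[str, Resolver]] = [
    ("1: company preference", company_preference),
    ("1.5: A/B experiment", ab_experiment),
    ("1.7: context-aware", context_aware),
    ("2: performance-based", performance_based),
    ("3: heuristic chain", heuristic_chain),
]

def route(req: Request) -> tuple[str, str]:
    """Return (provider, priority label) from the first resolver that answers."""
    for label, resolver in PRIORITY_CHAIN:
        choice = resolver(req)
        if choice is not None:
            return choice, label
    raise LookupError("no resolver produced a provider")

print(route({"heuristic_default": "claude-sonnet"}))
# ('claude-sonnet', '3: heuristic chain')
```

Returning the label alongside the provider makes every routing decision explainable after the fact.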
21 Task Types, Each with an Optimal Provider Chain
Every AI request in Solid# is classified into one of 21 task types. Each task type has an ordered chain of preferred providers; the table below shows 19 of them:
| Category | Task Type | Primary → Fallback | Why |
|---|---|---|---|
| Customer-Facing | Customer Chat | Claude Sonnet → GPT-4o → Gemini | Claude excels at empathy and nuance |
| Customer-Facing | Customer Support | Claude Sonnet → GPT-4o → Haiku | Empathy + fast fallback for simple tickets |
| Content | Creative Writing | Claude Sonnet → Gemini → GPT-4o | Claude's creative strength |
| Content | Email Generation | Claude Sonnet → GPT-4o → Haiku | Natural language + fast fallback |
| Content | SMS Generation | Haiku → GPT-4o Mini → Sonnet | Short text = fast + cheap |
| Content | Marketing Copy | Claude Sonnet → GPT-4o → Gemini | Persuasion + quality |
| Data | Data Import | GPT-4o → GPT-4.1 → Claude Sonnet | GPT best for structured data |
| Data | Data Extraction | GPT-4o → Claude Sonnet → Gemini | Parsing accuracy |
| Data | Data Analysis | GPT-4o → Gemini → Claude Sonnet | Analytical reasoning |
| Data | Report Generation | Claude Sonnet → GPT-4o → Gemini | Narrative quality |
| Code | Code Generation | GPT-4o → Claude Sonnet → Grok | Speed + accuracy |
| Code | SQL Generation | GPT-4o → Claude Sonnet → GPT-4.1 | Database expertise |
| Reasoning | Reasoning | GPT-5.4 → GPT-4.1 → Sonnet | 7.5x cheaper than Opus on input cost, comparable quality |
| Reasoning | Planning | GPT-5.4 → Claude Sonnet → GPT-4.1 | Complex multi-step planning |
| Reasoning | Decision Making | GPT-5.4 → GPT-4.1 → Sonnet | Analytical depth |
| Reasoning | Orchestration | GPT-5.4 → Claude Sonnet → GPT-4.1 | ADA needs the smartest model |
| Speed | Quick Response | Haiku → GPT-4o Mini → Grok | Latency under 500ms |
| Speed | Autocomplete | Haiku → GPT-4o Mini → Sonnet | Sub-second responses |
| Speed | Classification | Haiku → GPT-4o Mini → GPT-4o | Fast categorization |
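In code, a table like this is naturally an ordered mapping from task type to provider list. The sketch below encodes four rows from the table; the snake_case keys and model identifiers are assumed naming conventions, not the platform's real identifiers.

```python
# Ordered provider chains, primary first. Rows copied from the table above.
PROVIDER_CHAINS: dict[str, list[str]] = {
    "customer_chat":  ["claude-sonnet", "gpt-4o", "gemini"],
    "sms_generation": ["haiku", "gpt-4o-mini", "claude-sonnet"],
    "sql_generation": ["gpt-4o", "claude-sonnet", "gpt-4.1"],
    "classification": ["haiku", "gpt-4o-mini", "gpt-4o"],
}

def primary(task_type: str) -> str:
    """First-choice provider for a task type."""
    return PROVIDER_CHAINS[task_type][0]

def fallbacks(task_type: str) -> list[str]:
    """Remaining providers, in the order they should be tried."""
    return PROVIDER_CHAINS[task_type][1:]

print(primary("customer_chat"))    # claude-sonnet
print(fallbacks("customer_chat"))  # ['gpt-4o', 'gemini']
```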
12 Provider Configurations Across 5 Vendors
SmartRouter maintains a registry of every model with its strengths, weaknesses, cost per token, latency profile, and context window. This isn't just a list — it's a decision matrix.
| Model | Vendor | Input Cost | Latency | Best For |
|---|---|---|---|---|
| GPT-4.1 Nano | OpenAI | $0.10/1M | ~500ms | Bulk tasks, cheapest option |
| GPT-4o Mini | OpenAI | $0.15/1M | ~400ms | Speed, classification, simple tasks |
| GPT-4.1 Mini | OpenAI | $0.40/1M | ~800ms | Coding, function calling |
| Claude Haiku 4.5 | Anthropic | $1/1M | ~500ms | Quick responses, classification |
| GPT-5.4 | OpenAI | $2/1M | ~2s | Reasoning (7.5x cheaper than Opus on input cost) |
| GPT-4.1 | OpenAI | $2/1M | ~1.5s | Coding, agents, instruction following |
| GPT-4o | OpenAI | $2.50/1M | ~1.2s | Data processing, SQL, code |
| Claude Sonnet 4.6 | Anthropic | $3/1M | ~1.5s | Empathy, creative writing, chat |
| Gemini 1.5 Pro | Google | $1.25/1M | ~1.8s | Long context, multimodal |
| Grok Beta | xAI | $5/1M | ~1s | Real-time data |
| GPT-4 Turbo | OpenAI | $10/1M | ~2s | Legacy reasoning (being replaced) |
| Claude Opus 4.6 | Anthropic | $15/1M | ~3s | Complex analysis (expensive) |
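To show how a registry becomes a decision matrix rather than a list, here is a sketch with four rows from the table stored as typed records, plus a query for the cheapest model that has a given strength. The field names and the `cheapest_for` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ModelProfile:
    name: str
    vendor: str
    input_cost_per_1m: float    # USD per 1M input tokens
    latency_ms: int             # typical latency from the table
    best_for: tuple[str, ...]

REGISTRY = [
    ModelProfile("GPT-4.1 Nano", "OpenAI", 0.10, 500, ("bulk tasks",)),
    ModelProfile("GPT-4o Mini", "OpenAI", 0.15, 400, ("speed", "classification")),
    ModelProfile("Claude Haiku 4.5", "Anthropic", 1.00, 500,
                 ("quick responses", "classification")),
    ModelProfile("Claude Sonnet 4.6", "Anthropic", 3.00, 1500,
                 ("empathy", "creative writing", "chat")),
]

def cheapest_for(strength: str, max_latency_ms: Optional[int] = None) -> Optional[ModelProfile]:
    """Cheapest registered model with the given strength, optionally latency-capped."""
    matches = [
        m for m in REGISTRY
        if strength in m.best_for
        and (max_latency_ms is None or m.latency_ms <= max_latency_ms)
    ]
    return min(matches, key=lambda m: m.input_cost_per_1m, default=None)

print(cheapest_for("classification").name)  # GPT-4o Mini
```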
Multi-Tenant Model Selection
Here's what makes SmartRouter different from a simple config file: every company on the platform can have different providers configured. A dental practice might only have OpenAI credentials. A law firm might require Anthropic for compliance reasons. A tech company might bring their own API keys for all providers.
SmartRouter handles this by checking the LLMProvider table for each company before making a routing decision. If a company doesn't have credentials for the optimal provider, SmartRouter automatically falls back to the next provider in the chain that the company does have access to. No errors, no manual configuration needed.
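A minimal sketch of that credential-aware fallback is shown here. An in-memory dictionary stands in for the LLMProvider table, and the company IDs, vendor keys, and function name are hypothetical.

```python
from typing import Optional

# (model, vendor) pairs in preference order for customer chat.
CUSTOMER_CHAT_CHAIN = [
    ("claude-sonnet", "anthropic"),
    ("gpt-4o", "openai"),
    ("gemini", "google"),
]

# Stand-in for the LLMProvider table: vendors each company has credentials for.
COMPANY_CREDENTIALS: dict[str, set[str]] = {
    "dental-practice": {"openai"},
    "law-firm": {"anthropic"},
}

def resolve_for_company(company_id: str, chain: list[tuple[str, str]]) -> Optional[str]:
    """Return the first model in the chain whose vendor the company can use."""
    vendors = COMPANY_CREDENTIALS.get(company_id, set())
    for model, vendor in chain:
        if vendor in vendors:
            return model
    return None  # no usable provider; the caller decides how to surface this

print(resolve_for_company("dental-practice", CUSTOMER_CHAT_CHAIN))  # gpt-4o
print(resolve_for_company("law-firm", CUSTOMER_CHAT_CHAIN))         # claude-sonnet
```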
The Cost Impact: Why This Matters
The difference between routing correctly and defaulting to a premium model is dramatic:
- SMS generation: Haiku at $1/1M tokens vs. Opus at $15/1M = 15x cost reduction
- Reasoning tasks: GPT-5.4 at $2/1M vs. Opus at $15/1M = 7.5x cost reduction with comparable quality
- Classification: GPT-4.1 Nano at $0.10/1M vs. Sonnet at $3/1M = 30x cost reduction
- Customer chat: Sonnet at $3/1M is the right choice here. Nano at $0.10/1M is 30x cheaper but produces poor-quality conversations. Some tasks need the premium model.
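The ratios above follow directly from the input prices in the registry table. A short sketch of the arithmetic, using hypothetical short keys for the models:

```python
# Input prices in USD per 1M tokens, copied from the registry table.
INPUT_COST_PER_1M = {
    "nano": 0.10,
    "haiku": 1.00,
    "gpt-5.4": 2.00,
    "sonnet": 3.00,
    "opus": 15.00,
}

def cost_ratio(premium: str, cheaper: str) -> float:
    """How many times more the premium model costs per input token."""
    return INPUT_COST_PER_1M[premium] / INPUT_COST_PER_1M[cheaper]

def input_cost_usd(model: str, tokens: int) -> float:
    """Input cost in USD for a given number of tokens."""
    return INPUT_COST_PER_1M[model] * tokens / 1_000_000

print(f"{cost_ratio('opus', 'haiku'):.1f}x")    # 15.0x
print(f"{cost_ratio('opus', 'gpt-5.4'):.1f}x")  # 7.5x
print(f"{cost_ratio('sonnet', 'nano'):.1f}x")   # 30.0x
```

These figures cover input tokens only. Output pricing is not listed in the table, so real savings may differ.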
SmartRouter works hand-in-hand with CognitiveLimiter — our cost control system. SmartRouter picks the most cost-effective model for the task. CognitiveLimiter enforces the budget so no company ever gets a surprise bill.
Provider Health Monitoring
AI providers go down. APIs degrade. Latency spikes. SmartRouter tracks provider health in Redis and automatically routes around problems:
- Latency tracking: P95 latency per provider, per task type
- Error rates: Rolling window of failures per provider
- Automatic failover: If a provider is unhealthy, SmartRouter skips it in the chain and moves to the next option
- Degradation detection: A provider can be “degraded” (slow but working) vs. “unhealthy” (failing). Degraded providers get deprioritized but not excluded
When Anthropic has an API outage (it happens), customer-facing agents automatically fall through to GPT-4o within milliseconds. The business owner never notices.
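The healthy / degraded / unhealthy distinction can be sketched with a rolling window per provider. The window size and both thresholds below are hypothetical, and the in-memory deque stands in for the Redis-backed tracking described above.

```python
from collections import deque

WINDOW = 100                  # hypothetical rolling-window size
UNHEALTHY_ERROR_RATE = 0.5    # hypothetical: at or above this, skip the provider
DEGRADED_P95_MS = 5_000       # hypothetical: above this, deprioritize

class ProviderHealth:
    def __init__(self) -> None:
        self.samples: deque = deque(maxlen=WINDOW)  # (ok, latency_ms) pairs

    def record(self, ok: bool, latency_ms: float) -> None:
        self.samples.append((ok, latency_ms))

    def status(self) -> str:
        if not self.samples:
            return "healthy"  # no data yet: assume healthy
        errors = sum(1 for ok, _ in self.samples if not ok)
        if errors / len(self.samples) >= UNHEALTHY_ERROR_RATE:
            return "unhealthy"
        latencies = sorted(latency for _, latency in self.samples)
        p95 = latencies[min(len(latencies) - 1, int(0.95 * len(latencies)))]
        return "degraded" if p95 > DEGRADED_P95_MS else "healthy"

def order_chain(chain: list[str], health: dict[str, ProviderHealth]) -> list[str]:
    """Drop unhealthy providers; move degraded ones behind healthy ones."""
    usable = [p for p in chain if health[p].status() != "unhealthy"]
    # sorted() is stable, so the original preference order is kept within each group.
    return sorted(usable, key=lambda p: health[p].status() == "degraded")

health = {name: ProviderHealth() for name in ("claude-sonnet", "gpt-4o", "gemini")}
for _ in range(10):
    health["claude-sonnet"].record(False, 100)   # simulated outage
    health["gpt-4o"].record(True, 1_200)
    health["gemini"].record(True, 6_000)         # slow but working
print(order_chain(["claude-sonnet", "gpt-4o", "gemini"], health))
# ['gpt-4o', 'gemini']
```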
Agent-to-Task Mapping: Every Agent Has a Default
Each of our 14 AI agents has a default task type that SmartRouter uses when no explicit task type is provided. This means agents get the right model without every caller needing to specify one. For example:
- Sarah (Customer Service) → Customer Chat → Claude Sonnet (empathy)
- Marcus (Growth Intelligence) → Data Analysis → GPT-4o (structured data)
- ADA (Orchestrator) → Orchestration → GPT-5.4 (reasoning power)
- Devon (Operations) → Code Generation → GPT-4o (technical accuracy)
- Maya (Content) → Creative Writing → Claude Sonnet (narrative quality)
- Jordan (Sales) → Reasoning → GPT-5.4 (analytical depth for deal strategy)
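The defaults listed above amount to a lookup that an explicit task type can override. A sketch, with hypothetical lowercase keys for the agents and task types:

```python
from typing import Optional

# Agent defaults from the list above.
AGENT_DEFAULT_TASK: dict[str, str] = {
    "sarah": "customer_chat",
    "marcus": "data_analysis",
    "ada": "orchestration",
    "devon": "code_generation",
    "maya": "creative_writing",
    "jordan": "reasoning",
}

def resolve_task_type(agent: str, explicit: Optional[str] = None) -> str:
    """Use the caller's task type if given, else the agent's default."""
    return explicit if explicit is not None else AGENT_DEFAULT_TASK[agent]

print(resolve_task_type("sarah"))                    # customer_chat
print(resolve_task_type("sarah", "sms_generation"))  # sms_generation
```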
What We Learned Building SmartRouter
- Model strengths are real and measurable. Claude genuinely writes more empathetic customer service responses. GPT-4o genuinely handles structured data better. These aren't marketing claims — they show up in production quality metrics.
- Cost differences between models reach 150x. Routing a simple classification to Opus ($15/1M) instead of Nano ($0.10/1M) is like hiring a surgeon to take your temperature. The math matters at scale.
- Context-aware routing is the biggest single optimization. Most customer messages are under 200 tokens. Routing them to fast/cheap models without degrading quality saves more than any other single change.
- Multi-tenancy forces good architecture. When every company can have different providers, you can't hardcode anything. This constraint made SmartRouter more flexible than it would have been otherwise.
- Orchestrator agents are special. We learned the hard way that ADA should never be downgraded to a cheap model based on prompt length. A 5-word question to an orchestrator can trigger a complex multi-agent workflow.
Open Architecture, Not a Black Box
SmartRouter isn't a proprietary secret. The architecture pattern — task classification, provider chain with fallbacks, performance-based learning, company-specific overrides — is something any team building multi-model AI systems should consider. We publish this because the problem of intelligent model selection is universal, and our solution is one battle-tested approach.
SmartRouter is part of Solid#'s AI infrastructure layer, alongside CognitiveLimiter (cost control) and PromptGuard (security). Together, they form the foundation that lets 14 AI agents serve businesses across 52 industries without vendor lock-in, budget surprises, or security vulnerabilities.