The Problem: One Model Doesn't Fit All

When you're running AI agents that handle everything from answering customer phone calls to generating financial reports, you hit a fundamental problem: no single AI model is the best choice for every task.

Claude Sonnet excels at empathetic customer conversations. GPT-4o is better at structured data processing. Haiku is up to 15x cheaper than Opus for simple classifications. GPT-5.4 handles complex reasoning at a fraction of Opus's cost. If you default everything to one premium model, you're either overpaying or underperforming.

At Solid#, AI agents serve businesses across 54 industries. A plumber's customer service bot, a dental practice's appointment scheduler, and a law firm's intake agent all have different needs. SmartRouter is how we solved this.

Architecture: A 5-Priority Routing Chain

SmartRouter doesn't just pick a model at random. It runs through a priority chain that balances customer preferences, experimental data, prompt characteristics, historical performance, and task-specific heuristics.
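A priority chain like this can be sketched as an ordered list of resolvers, where the first one that returns a model wins. The resolver names, the request dictionary, and its keys are all hypothetical, and each resolver is a stub that reads a precomputed field rather than doing real work:

```python
from typing import Callable, Optional

# A resolver inspects the request and returns a model name, or None to defer.
Resolver = Callable[[dict], Optional[str]]


def company_preference(req: dict) -> Optional[str]:
    return req.get("company_preference")


def ab_experiment(req: dict) -> Optional[str]:
    return req.get("experiment_variant")


def context_aware(req: dict) -> Optional[str]:
    return req.get("context_choice")


def performance_based(req: dict) -> Optional[str]:
    return req.get("best_performer")


def heuristic_chain(req: dict) -> Optional[str]:
    # Last resort: a curated default (hypothetical value for this sketch).
    return req.get("heuristic_default", "claude-sonnet")


# Order matters: earlier entries take precedence over later ones.
PRIORITY_CHAIN: list[tuple[str, Resolver]] = [
    ("1: company preference", company_preference),
    ("1.5: a/b experiment", ab_experiment),
    ("1.7: context-aware", context_aware),
    ("2: performance-based", performance_based),
    ("3: heuristic chain", heuristic_chain),
]


def route(req: dict) -> tuple[str, str]:
    """Return (model, priority label) from the first resolver that answers."""
    for label, resolver in PRIORITY_CHAIN:
        model = resolver(req)
        if model is not None:
            return model, label
    raise LookupError("no resolver produced a model")
```

Returning the label alongside the model makes it easy to log which priority level made each decision.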

Priority 1: Company Preference

Every business on Solid# can set their preferred AI provider per task type. A company that wants all customer interactions handled by Claude can configure that. SmartRouter respects this — but first validates that the company actually has credentials for that provider. Multi-tenancy means every company can have different provider configurations.
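As a minimal sketch of "respect the preference, but validate credentials first", the lookup below uses in-memory dictionaries in place of real per-company settings. The company IDs, task names, and data structures are assumptions made for illustration:

```python
from typing import Optional

# Hypothetical stand-ins for per-company settings and stored credentials.
PREFERENCES = {
    ("acme-dental", "customer_chat"): "anthropic",
    ("lex-law", "customer_chat"): "anthropic",
}
CREDENTIALS = {
    "acme-dental": {"openai"},
    "lex-law": {"anthropic", "openai"},
}


def preferred_provider(company_id: str, task_type: str) -> Optional[str]:
    """Return the company's preferred provider only if it is actually usable."""
    choice = PREFERENCES.get((company_id, task_type))
    if choice is None:
        return None  # no preference set; defer to the next priority
    if choice not in CREDENTIALS.get(company_id, set()):
        return None  # preference set but no credentials; defer instead of failing
    return choice
```

Returning `None` rather than raising lets the lower priorities take over when a preference cannot be honored.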

Priority 1.5: A/B Experiments

SmartRouter supports live A/B testing between providers. We can run an experiment like “Does GPT-5.4 outperform Claude Sonnet for dental practice customer service?” and let the data decide. Each experiment gets an ID, and routing decisions are tagged so we can measure quality, cost, and latency per variant.
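One common way to split traffic for an experiment like this is deterministic hashing, so the same conversation always lands on the same variant. This is a sketch of that technique, not a description of SmartRouter's actual assignment logic; the function names and the shape of the tagged decision are assumptions:

```python
import hashlib


def assign_variant(experiment_id: str, subject_id: str, variants: list[str]) -> str:
    """Deterministically map a subject (e.g. a conversation ID) to one variant."""
    key = f"{experiment_id}:{subject_id}".encode("utf-8")
    # Use the first 8 bytes of the digest as an integer bucket.
    bucket = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % len(variants)
    return variants[bucket]


def tagged_decision(experiment_id: str, subject_id: str, variants: list[str]) -> dict:
    """Return a routing decision tagged for later per-variant analysis."""
    return {
        "experiment_id": experiment_id,
        "subject_id": subject_id,
        "provider": assign_variant(experiment_id, subject_id, variants),
    }
```

Because the assignment depends only on the experiment ID and subject ID, no assignment table needs to be stored, and quality, cost, and latency can be grouped by the `experiment_id` and `provider` tags afterward.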

Priority 1.7: Context-Aware Routing

This is where it gets interesting. SmartRouter looks at the actual prompt before deciding:

Short queries (<200 tokens): Route to fast, cheap models like Haiku or GPT-4o Mini. A customer asking “What are your hours?” doesn't need Opus.
Long analytical queries (>5,000 tokens): Route to powerful reasoning models like GPT-5.4. A complex business analysis needs depth.
Exception: Orchestrator agents never downgrade. ADA (our AI coordinator) orchestrates AI agents and the full verb surface — CLI, MCP, WebMCP, and UCP. Even a 5-word question to ADA might trigger a complex multi-agent workflow, so she always gets a reasoning-class model.
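The rules above can be sketched as a small function. The thresholds come from the text; the tier names, the agent set, and the characters-per-token estimate are assumptions, and a real tokenizer would give more accurate counts:

```python
from typing import Optional

SHORT_TOKENS = 200     # below this, prefer a fast, cheap model
LONG_TOKENS = 5_000    # above this, prefer a reasoning-class model
ORCHESTRATOR_AGENTS = {"ada"}  # agents that are never downgraded


def estimate_tokens(prompt: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(prompt) // 4)


def context_tier(prompt: str, agent: str) -> Optional[str]:
    """Return 'fast', 'reasoning', or None when prompt size gives no signal."""
    if agent.lower() in ORCHESTRATOR_AGENTS:
        return "reasoning"  # orchestrators skip the length check entirely
    tokens = estimate_tokens(prompt)
    if tokens < SHORT_TOKENS:
        return "fast"
    if tokens > LONG_TOKENS:
        return "reasoning"
    return None  # mid-sized prompt: defer to the next priority
```

Checking the orchestrator exception before measuring the prompt is the important design choice: the length rule never gets a chance to downgrade ADA.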

Priority 2: Performance-Based Routing

SmartRouter tracks every request's success rate, latency, and cost in Redis. Over time, it learns which provider actually performs best for each task type. If GPT-4o starts outperforming Claude for email generation in production, the router shifts automatically. This is the VC feedback loop: “Let your orchestrator learn.”
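A simplified version of this feedback loop is sketched below. An in-memory dictionary stands in for Redis, and the minimum sample size and the ranking rule (success rate, then latency, then cost) are assumptions rather than SmartRouter's actual scoring:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

MIN_SAMPLES = 50  # assumed threshold before the stats are trusted


@dataclass
class Stats:
    requests: int = 0
    successes: int = 0
    total_latency_ms: float = 0.0
    total_cost_usd: float = 0.0

    @property
    def success_rate(self) -> float:
        return self.successes / self.requests if self.requests else 0.0

    @property
    def avg_latency_ms(self) -> float:
        return self.total_latency_ms / self.requests if self.requests else 0.0

    @property
    def avg_cost_usd(self) -> float:
        return self.total_cost_usd / self.requests if self.requests else 0.0


# Keyed by (task_type, provider); stands in for the Redis store.
STATS: dict[tuple[str, str], Stats] = defaultdict(Stats)


def record(task_type: str, provider: str, ok: bool, latency_ms: float, cost_usd: float) -> None:
    s = STATS[(task_type, provider)]
    s.requests += 1
    s.successes += int(ok)
    s.total_latency_ms += latency_ms
    s.total_cost_usd += cost_usd


def best_provider(task_type: str, min_samples: int = MIN_SAMPLES) -> Optional[str]:
    """Pick the best-performing provider, or None if there is too little data."""
    candidates = [
        (provider, s)
        for (task, provider), s in STATS.items()
        if task == task_type and s.requests >= min_samples
    ]
    if not candidates:
        return None  # defer to the heuristic chain
    # Highest success rate wins; ties go to lower latency, then lower cost.
    provider, _ = max(
        candidates,
        key=lambda c: (c[1].success_rate, -c[1].avg_latency_ms, -c[1].avg_cost_usd),
    )
    return provider
```

The `min_samples` guard is what hands control back to the heuristic chain when a task type has not yet accumulated enough traffic.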

Priority 3: Heuristic Chain

When there's no company preference, no experiment, and not enough performance data, SmartRouter falls back to curated task-to-provider mappings. These are informed by thousands of hours of testing each model against each task type.

21 Task Types, Each with an Optimal Provider Chain

Every AI request in Solid# is classified into one of 21 task types, and each task type has an ordered chain of preferred providers. The table below shows 19 of them:

Category	Task Type	Primary → Fallback	Why
Customer-Facing	Customer Chat	Claude Sonnet → GPT-4o → Gemini	Claude excels at empathy and nuance
Customer-Facing	Customer Support	Claude Sonnet → GPT-4o → Haiku	Empathy + fast fallback for simple tickets
Content	Creative Writing	Claude Sonnet → Gemini → GPT-4o	Claude's creative strength
Content	Email Generation	Claude Sonnet → GPT-4o → Haiku	Natural language + fast fallback
Content	SMS Generation	Haiku → GPT-4o Mini → Sonnet	Short text = fast + cheap
Content	Marketing Copy	Claude Sonnet → GPT-4o → Gemini	Persuasion + quality
Data	Data Import	GPT-4o → GPT-4.1 → Claude Sonnet	GPT best for structured data
Data	Data Extraction	GPT-4o → Claude Sonnet → Gemini	Parsing accuracy
Data	Data Analysis	GPT-4o → Gemini → Claude Sonnet	Analytical reasoning
Data	Report Generation	Claude Sonnet → GPT-4o → Gemini	Narrative quality
Code	Code Generation	GPT-4o → Claude Sonnet → Grok	Speed + accuracy
Code	SQL Generation	GPT-4o → Claude Sonnet → GPT-4.1	Database expertise
Reasoning	Reasoning	GPT-5.4 → GPT-4.1 → Sonnet	7.5x cheaper than Opus on input cost, comparable quality
Reasoning	Planning	GPT-5.4 → Claude Sonnet → GPT-4.1	Complex multi-step planning
Reasoning	Decision Making	GPT-5.4 → GPT-4.1 → Sonnet	Analytical depth
Reasoning	Orchestration	GPT-5.4 → Claude Sonnet → GPT-4.1	ADA needs the smartest model
Speed	Quick Response	Haiku → GPT-4o Mini → Grok	Latency under 500ms
Speed	Autocomplete	Haiku → GPT-4o Mini → Sonnet	Sub-second responses
Speed	Classification	Haiku → GPT-4o Mini → GPT-4o	Fast categorization
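In code, a table like this reduces to a mapping from task type to an ordered list of models. The sketch below includes a handful of rows from the table; the identifier spellings (`customer_chat`, `claude-sonnet`, and so on) are assumptions:

```python
# Ordered provider chains per task type: primary first, then fallbacks.
TASK_CHAINS: dict[str, list[str]] = {
    "customer_chat": ["claude-sonnet", "gpt-4o", "gemini"],
    "sms_generation": ["claude-haiku", "gpt-4o-mini", "claude-sonnet"],
    "data_import": ["gpt-4o", "gpt-4.1", "claude-sonnet"],
    "reasoning": ["gpt-5.4", "gpt-4.1", "claude-sonnet"],
    "classification": ["claude-haiku", "gpt-4o-mini", "gpt-4o"],
}


def provider_chain(task_type: str) -> list[str]:
    """Return a copy of the chain so callers cannot mutate the shared table."""
    return list(TASK_CHAINS[task_type])


def primary(task_type: str) -> str:
    return TASK_CHAINS[task_type][0]
```

An unknown task type raises `KeyError` here; a production router would more likely fall back to a default chain.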

12 Provider Configurations Across 4 Vendors

SmartRouter maintains a registry of every model with its strengths, weaknesses, cost per token, latency profile, and context window. This isn't just a list — it's a decision matrix.

Model	Vendor	Input Cost	Latency	Best For
GPT-4.1 Nano	OpenAI	$0.10/1M	~500ms	Bulk tasks, cheapest option
GPT-4o Mini	OpenAI	$0.15/1M	~400ms	Speed, classification, simple tasks
GPT-4.1 Mini	OpenAI	$0.40/1M	~800ms	Coding, function calling
Claude Haiku 4.5	Anthropic	$1/1M	~500ms	Quick responses, classification
Gemini 1.5 Pro	Google	$1.25/1M	~1.8s	Long context, multimodal
GPT-5.4	OpenAI	$2/1M	~2s	Reasoning (7.5x cheaper than Opus on input cost)
GPT-4.1	OpenAI	$2/1M	~1.5s	Coding, agents, instruction following
GPT-4o	OpenAI	$2.50/1M	~1.2s	Data processing, SQL, code
Claude Sonnet 4.6	Anthropic	$3/1M	~1.5s	Empathy, creative writing, chat
Grok Beta	xAI	$5/1M	~1s	Real-time data
GPT-4 Turbo	OpenAI	$10/1M	~2s	Legacy reasoning (being replaced)
Claude Opus 4.6	Anthropic	$15/1M	~3s	Complex analysis (expensive)
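To show how a registry can act as a decision matrix rather than a list, here is a sketch that stores a few rows from the table as records and queries them. The `ModelProfile` fields, the tag strings, and the `cheapest_for` helper are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelProfile:
    name: str
    vendor: str
    input_cost_per_1m: float  # USD per 1M input tokens
    latency_ms: int           # typical latency from the table
    best_for: tuple[str, ...]


# A subset of the table above, for illustration.
REGISTRY = [
    ModelProfile("GPT-4.1 Nano", "OpenAI", 0.10, 500, ("bulk",)),
    ModelProfile("GPT-4o Mini", "OpenAI", 0.15, 400, ("speed", "classification")),
    ModelProfile("Claude Haiku 4.5", "Anthropic", 1.00, 500, ("quick responses", "classification")),
    ModelProfile("GPT-5.4", "OpenAI", 2.00, 2000, ("reasoning",)),
    ModelProfile("Claude Opus 4.6", "Anthropic", 15.00, 3000, ("complex analysis",)),
]


def cheapest_for(tag: str, max_latency_ms: int) -> Optional[ModelProfile]:
    """Cheapest model that lists `tag` as a strength and meets the latency budget."""
    matches = [
        m for m in REGISTRY
        if tag in m.best_for and m.latency_ms <= max_latency_ms
    ]
    return min(matches, key=lambda m: m.input_cost_per_1m) if matches else None
```

For example, `cheapest_for("classification", 1000)` selects GPT-4o Mini from these rows, while `cheapest_for("reasoning", 1000)` returns `None` because the only reasoning model listed exceeds the latency budget.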

Multi-Tenant Model Selection

Here's what makes SmartRouter different from a simple config file: every company on the platform can have different providers configured. A dental practice might only have OpenAI credentials. A law firm might require Anthropic for compliance reasons. A tech company might bring their own API keys for all providers.

SmartRouter handles this by checking the LLMProvider table for each company before making a routing decision. If a company doesn't have credentials for the optimal provider, SmartRouter automatically falls back to the next provider in the chain that the company does have access to. No errors, no manual configuration needed.
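The fallback behavior amounts to walking the chain and taking the first model whose vendor the company has credentials for. In this sketch a plain set stands in for the LLMProvider table lookup, and the `VENDOR_OF` mapping and function name are assumptions:

```python
from typing import Optional

# Hypothetical model-to-vendor mapping.
VENDOR_OF = {
    "claude-sonnet": "anthropic",
    "claude-haiku": "anthropic",
    "gpt-4o": "openai",
    "gpt-4o-mini": "openai",
    "gemini": "google",
}


def resolve_for_company(chain: list[str], company_vendors: set[str]) -> Optional[str]:
    """Return the first model in the chain the company can actually call."""
    for model in chain:
        if VENDOR_OF.get(model) in company_vendors:
            return model
    return None  # company has no credentials for any model in this chain
```

A dental practice with only OpenAI credentials asking for customer chat would skip Claude Sonnet and land on GPT-4o without any error surfacing to the caller.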

The Cost Impact: Why This Matters

The difference between routing correctly and defaulting to a premium model is dramatic:

SMS generation: Haiku at $1/1M tokens vs. Opus at $15/1M = 15x cost reduction
Reasoning tasks: GPT-5.4 at $2/1M vs. Opus at $15/1M = 7.5x cost reduction with comparable quality
Classification: GPT-4.1 Nano at $0.10/1M vs. Sonnet at $3/1M = 30x cost reduction
Customer chat: Sonnet at $3/1M is the optimal choice. Nano at $0.10/1M would be 30x cheaper but delivers poor conversation quality. Some tasks need the premium model.
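The ratios above follow directly from the input prices in the provider table. A short sketch of the arithmetic, with hypothetical short model keys and a hypothetical monthly volume:

```python
# Input prices in USD per 1M tokens, taken from the provider table.
INPUT_COST_PER_1M = {
    "nano": 0.10,
    "haiku": 1.00,
    "gpt-5.4": 2.00,
    "sonnet": 3.00,
    "opus": 15.00,
}


def cost_ratio(expensive: str, cheap: str) -> float:
    """How many times more the expensive model costs per input token."""
    return round(INPUT_COST_PER_1M[expensive] / INPUT_COST_PER_1M[cheap], 2)


def monthly_input_cost(model: str, tokens_per_month: int) -> float:
    """Input-token spend in USD for a given monthly volume."""
    return round(tokens_per_month / 1_000_000 * INPUT_COST_PER_1M[model], 2)


# Example: 500M input tokens of SMS generation per month (hypothetical volume).
haiku_bill = monthly_input_cost("haiku", 500_000_000)  # 500.0
opus_bill = monthly_input_cost("opus", 500_000_000)    # 7500.0
```

These figures cover input tokens only; output pricing differs per model and would change the totals.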

SmartRouter works hand-in-hand with CognitiveLimiter — our cost control system. SmartRouter picks the most cost-effective model for the task. CognitiveLimiter enforces the budget so no company ever gets a surprise bill.

Provider Health Monitoring

AI providers go down. APIs degrade. Latency spikes. SmartRouter tracks provider health in Redis and automatically routes around problems:

Latency tracking: P95 latency per provider, per task type
Error rates: Rolling window of failures per provider
Automatic failover: If a provider is unhealthy, SmartRouter skips it in the chain and moves to the next option
Degradation detection: A provider can be “degraded” (slow but working) vs. “unhealthy” (failing). Degraded providers get deprioritized but not excluded

When Anthropic has an API outage (it happens), customer-facing agents automatically fall through to GPT-4o within milliseconds. The business owner never notices.
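The health states described above can be sketched with a rolling window per provider. The window size and both thresholds are assumed values, an in-process `deque` stands in for Redis, and the sketch tracks health per provider only, not per task type:

```python
import math
from collections import deque

WINDOW = 100                 # assumed rolling-window size (requests)
UNHEALTHY_ERROR_RATE = 0.5   # assumed threshold
DEGRADED_P95_MS = 3000.0     # assumed threshold


class ProviderHealth:
    def __init__(self, window: int = WINDOW) -> None:
        # Each sample is (ok, latency_ms); old samples fall off automatically.
        self.samples: deque = deque(maxlen=window)

    def record(self, ok: bool, latency_ms: float) -> None:
        self.samples.append((ok, latency_ms))

    def error_rate(self) -> float:
        if not self.samples:
            return 0.0
        return sum(1 for ok, _ in self.samples if not ok) / len(self.samples)

    def p95_latency_ms(self) -> float:
        if not self.samples:
            return 0.0
        latencies = sorted(latency for _, latency in self.samples)
        # Nearest-rank percentile.
        return latencies[max(0, math.ceil(0.95 * len(latencies)) - 1)]

    def status(self) -> str:
        if self.error_rate() >= UNHEALTHY_ERROR_RATE:
            return "unhealthy"
        if self.p95_latency_ms() > DEGRADED_P95_MS:
            return "degraded"
        return "healthy"


def usable_chain(chain: list[str], health: dict[str, ProviderHealth]) -> list[str]:
    """Drop unhealthy providers; keep degraded ones but move them to the back."""
    def status(p: str) -> str:
        return health[p].status() if p in health else "healthy"

    healthy = [p for p in chain if status(p) == "healthy"]
    degraded = [p for p in chain if status(p) == "degraded"]
    return healthy + degraded
```

Providers with no recorded samples are treated as healthy here, which is one possible default; a stricter router might probe them first.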

Agent-to-Task Mapping: Every Agent Has a Default

Each of our AI agents has a default task type that SmartRouter uses when no explicit task type is provided. This means agents get the right model without every caller needing to specify:

Sarah (Customer Service) → Customer Chat → Claude Sonnet (empathy)
Marcus (Growth Intelligence) → Data Analysis → GPT-4o (structured data)
ADA (Orchestrator) → Orchestration → GPT-5.4 (reasoning power)
Devon (Operations) → Code Generation → GPT-4o (technical accuracy)
Maya (Content) → Creative Writing → Claude Sonnet (narrative quality)
Jordan (Sales) → Reasoning → GPT-5.4 (analytical depth for deal strategy)
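The defaults above map naturally onto a lookup that an explicit task type can override. The lowercase keys and the function name are assumptions for this sketch:

```python
from typing import Optional

# Default task type per agent, taken from the list above.
AGENT_DEFAULT_TASK = {
    "sarah": "customer_chat",
    "marcus": "data_analysis",
    "ada": "orchestration",
    "devon": "code_generation",
    "maya": "creative_writing",
    "jordan": "reasoning",
}


def resolve_task_type(agent: str, explicit_task_type: Optional[str] = None) -> str:
    """An explicit task type always wins; otherwise use the agent's default."""
    if explicit_task_type is not None:
        return explicit_task_type
    return AGENT_DEFAULT_TASK[agent.lower()]  # KeyError for unknown agents
```

So a call on Sarah's behalf with no task type resolves to `customer_chat`, while the same agent sending an SMS can pass `sms_generation` explicitly and get the cheaper chain.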

What We Learned Building SmartRouter

Model strengths are real and measurable. Claude genuinely writes more empathetic customer service responses. GPT-4o genuinely handles structured data better. These aren't marketing claims — they show up in production quality metrics.
Cost differences between models are 30x or more. Routing a simple classification to Opus ($15/1M) instead of Nano ($0.10/1M) costs 150x more on input tokens, which is like hiring a surgeon to take your temperature. The math matters at scale.
Context-aware routing is the biggest single optimization. Most customer messages are under 200 tokens. Routing them to fast/cheap models without degrading quality saves more than any other single change.
Multi-tenancy forces good architecture. When every company can have different providers, you can't hardcode anything. This constraint made SmartRouter more flexible than it would have been otherwise.
Orchestrator agents are special. We learned the hard way that ADA should never be downgraded to a cheap model based on prompt length. A 5-word question to an orchestrator can trigger a complex multi-agent workflow.

Open Architecture, Not a Black Box

SmartRouter isn't a proprietary secret. The architecture pattern — task classification, provider chain with fallbacks, performance-based learning, company-specific overrides — is something any team building multi-model AI systems should consider. We publish this because the problem of intelligent model selection is universal, and our solution is one battle-tested approach.

SmartRouter is part of Solid#'s AI infrastructure layer, alongside CognitiveLimiter (cost control) and PromptGuard (security). Together, they form the foundation that lets 14 AI agents serve businesses across 54 industries without vendor lock-in, budget surprises, or security vulnerabilities.

SmartRouter: How We Select the Optimal AI Model Per Task