AI Infrastructure

Published March 2026

SmartRouter: How We Select the Optimal AI Model Per Task

A deep dive into the routing engine that decides whether your customer service message goes to Claude, your data import goes to GPT-4o, or your quick classification goes to Haiku — and why getting this wrong costs real money.

The Problem: One Model Doesn't Fit All

When you're running 14 AI agents that handle everything from answering customer phone calls to generating financial reports, you hit a fundamental problem: no single AI model is the best choice for every task.

Claude Sonnet excels at empathetic customer conversations. GPT-4o is better at structured data processing. Haiku is an order of magnitude cheaper than premium models for simple classifications. GPT-5.4 handles complex reasoning at a fraction of Opus's cost. If you default everything to one premium model, you're either overpaying or underperforming.

At Solid#, 14 AI agents serve businesses across 52 industries. A plumber's customer service bot, a dental practice's appointment scheduler, and a law firm's intake agent all have different needs. SmartRouter is how we solved this.

Architecture: A 5-Priority Routing Chain

SmartRouter doesn't just pick a model at random. It runs through a priority chain that balances customer preferences, experimental data, prompt characteristics, historical performance, and task-specific heuristics.

Priority 1: Company Preference

Every business on Solid# can set their preferred AI provider per task type. A company that wants all customer interactions handled by Claude can configure that. SmartRouter respects this — but first validates that the company actually has credentials for that provider. Multi-tenancy means every company can have different provider configurations.
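Priority 1 can be sketched in a few lines. This is an illustrative simplification, not Solid#'s actual code: the function and data shapes are assumptions.

```python
# Sketch of Priority 1: honor a company's preferred provider only if the
# company actually holds credentials for it. Names are illustrative.
def route_by_preference(preferences, credentials, task_type):
    """Return the company's preferred provider for task_type, or None."""
    preferred = preferences.get(task_type)
    if preferred and preferred in credentials:
        return preferred
    return None  # fall through to the next priority in the chain

prefs = {"customer_chat": "anthropic"}
creds = {"openai"}  # this company never configured Anthropic keys
assert route_by_preference(prefs, creds, "customer_chat") is None
creds.add("anthropic")
assert route_by_preference(prefs, creds, "customer_chat") == "anthropic"
```

Returning None rather than raising is the key design choice: a missing or unusable preference simply hands control to the next priority.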

Priority 1.5: A/B Experiments

SmartRouter supports live A/B testing between providers. We can run an experiment like “Does GPT-5.4 outperform Claude Sonnet for dental practice customer service?” and let the data decide. Each experiment gets an ID, and routing decisions are tagged so we can measure quality, cost, and latency per variant.
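One common way to implement this kind of bucketing is a deterministic hash of the experiment and company IDs, so a company stays in the same variant across requests. A sketch under that assumption (the identifiers are made up):

```python
import hashlib

# Illustrative sketch of Priority 1.5: deterministically assign a company to
# an experiment variant, and tag the routing decision with the experiment ID
# so quality, cost, and latency can be measured per variant later.
def assign_variant(experiment_id, company_id, variants):
    digest = hashlib.sha256(f"{experiment_id}:{company_id}".encode()).hexdigest()
    provider = variants[int(digest, 16) % len(variants)]
    return {"provider": provider, "experiment_id": experiment_id}

decision = assign_variant("exp-dental-cs-01", "company-42",
                          ["gpt-5.4", "claude-sonnet"])
assert decision["provider"] in {"gpt-5.4", "claude-sonnet"}
# The same inputs always land in the same bucket:
assert decision == assign_variant("exp-dental-cs-01", "company-42",
                                  ["gpt-5.4", "claude-sonnet"])
```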

Priority 1.7: Context-Aware Routing

This is where it gets interesting. SmartRouter looks at the actual prompt before deciding:

  • Short queries (<200 tokens): Route to fast, cheap models like Haiku or GPT-4o Mini. A customer asking “What are your hours?” doesn't need Opus.
  • Long analytical queries (>5,000 tokens): Route to powerful reasoning models like GPT-5.4. A complex business analysis needs depth.
  • Exception: Orchestrator agents never downgrade. ADA (our AI coordinator) orchestrates 14 agents and 608 MCP tools. Even a 5-word question to ADA might trigger a complex multi-agent workflow, so she always gets a reasoning-class model.
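The three rules above fit in one small function. This is a sketch using the thresholds from the text; the model names and the rough 4-characters-per-token estimate are assumptions, not Solid#'s real tokenizer:

```python
# Sketch of Priority 1.7: short → cheap, long → reasoning-class,
# orchestrators never downgraded.
def route_by_context(prompt, is_orchestrator=False):
    tokens = len(prompt) // 4  # rough token estimate (assumption)
    if is_orchestrator:
        return "gpt-5.4"       # never downgrade the orchestrator
    if tokens < 200:
        return "claude-haiku"  # short query -> fast, cheap model
    if tokens > 5000:
        return "gpt-5.4"       # long analytical query -> reasoning model
    return None                # mid-sized: defer to later priorities

assert route_by_context("What are your hours?") == "claude-haiku"
assert route_by_context("What are your hours?", is_orchestrator=True) == "gpt-5.4"
assert route_by_context("x" * 40000) == "gpt-5.4"
```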

Priority 2: Performance-Based Routing

SmartRouter tracks every request's success rate, latency, and cost in Redis. Over time, it learns which provider actually performs best for each task type. If GPT-4o starts outperforming Claude for email generation in production, the router shifts automatically. This is the VC feedback loop: “Let your orchestrator learn.”
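A minimal sketch of performance-based routing, assuming the Redis counters boil down to success/failure tallies and a P95 latency per (task, provider) pair. The scoring rule and sample threshold here are illustrative assumptions, not the production formula:

```python
# Rolling stats per (task, provider) -- stored in Redis in production,
# a plain dict here for illustration.
stats = {
    ("email_generation", "claude-sonnet"): {"ok": 940, "fail": 60, "p95_ms": 1500},
    ("email_generation", "gpt-4o"):        {"ok": 978, "fail": 22, "p95_ms": 1200},
}

def best_by_performance(task, min_samples=500):
    """Pick the provider with the best success rate; break ties on latency."""
    candidates = []
    for (t, provider), s in stats.items():
        total = s["ok"] + s["fail"]
        if t == task and total >= min_samples:
            candidates.append((s["ok"] / total, -s["p95_ms"], provider))
    return max(candidates)[2] if candidates else None

# GPT-4o has started outperforming Claude for email generation in
# production, so the router shifts automatically:
assert best_by_performance("email_generation") == "gpt-4o"
```

Requiring a minimum sample count is what keeps this priority from firing on noise; with too little data, the router falls through to the heuristic chain below.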

Priority 3: Heuristic Chain

When there's no company preference, no experiment, and not enough performance data, SmartRouter falls back to curated task-to-provider mappings. These are informed by thousands of hours of testing each model against each task type.

21 Task Types, Each with an Optimal Provider Chain

Every AI request in Solid# is classified into one of 21 task types. Each task type has an ordered chain of preferred providers:

| Category | Task Type | Primary → Fallback | Why |
| --- | --- | --- | --- |
| Customer-Facing | Customer Chat | Claude Sonnet → GPT-4o → Gemini | Claude excels at empathy and nuance |
| Customer-Facing | Customer Support | Claude Sonnet → GPT-4o → Haiku | Empathy + fast fallback for simple tickets |
| Content | Creative Writing | Claude Sonnet → Gemini → GPT-4o | Claude's creative strength |
| Content | Email Generation | Claude Sonnet → GPT-4o → Haiku | Natural language + fast fallback |
| Content | SMS Generation | Haiku → GPT-4o Mini → Sonnet | Short text = fast + cheap |
| Content | Marketing Copy | Claude Sonnet → GPT-4o → Gemini | Persuasion + quality |
| Data | Data Import | GPT-4o → GPT-4.1 → Claude Sonnet | GPT best for structured data |
| Data | Data Extraction | GPT-4o → Claude Sonnet → Gemini | Parsing accuracy |
| Data | Data Analysis | GPT-4o → Gemini → Claude Sonnet | Analytical reasoning |
| Data | Report Generation | Claude Sonnet → GPT-4o → Gemini | Narrative quality |
| Code | Code Generation | GPT-4o → Claude Sonnet → Grok | Speed + accuracy |
| Code | SQL Generation | GPT-4o → Claude Sonnet → GPT-4.1 | Database expertise |
| Reasoning | Reasoning | GPT-5.4 → GPT-4.1 → Sonnet | 9x cheaper than Opus, same quality |
| Reasoning | Planning | GPT-5.4 → Claude Sonnet → GPT-4.1 | Complex multi-step planning |
| Reasoning | Decision Making | GPT-5.4 → GPT-4.1 → Sonnet | Analytical depth |
| Reasoning | Orchestration | GPT-5.4 → Claude Sonnet → GPT-4.1 | ADA needs the smartest model |
| Speed | Quick Response | Haiku → GPT-4o Mini → Grok | Latency under 500ms |
| Speed | Autocomplete | Haiku → GPT-4o Mini → Sonnet | Sub-second responses |
| Speed | Classification | Haiku → GPT-4o Mini → GPT-4o | Fast categorization |
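In code, each row is just an ordered list walked until a usable provider is found. A sketch with three of the chains (not the full mapping; identifiers are illustrative):

```python
# A few of the task-to-provider chains expressed as ordered fallback lists.
TASK_CHAINS = {
    "customer_chat":  ["claude-sonnet", "gpt-4o", "gemini"],
    "sms_generation": ["claude-haiku", "gpt-4o-mini", "claude-sonnet"],
    "reasoning":      ["gpt-5.4", "gpt-4.1", "claude-sonnet"],
}

def heuristic_provider(task_type, unavailable=frozenset()):
    """Walk the chain; return the first provider not marked unavailable."""
    for provider in TASK_CHAINS.get(task_type, []):
        if provider not in unavailable:
            return provider
    return None

assert heuristic_provider("customer_chat") == "claude-sonnet"
# If the primary is down, the chain degrades gracefully:
assert heuristic_provider("customer_chat", {"claude-sonnet"}) == "gpt-4o"
```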

12 Provider Configurations Across 5 Vendors

SmartRouter maintains a registry of every model with its strengths, weaknesses, cost per token, latency profile, and context window. This isn't just a list — it's a decision matrix.

| Model | Vendor | Input Cost | Latency | Best For |
| --- | --- | --- | --- | --- |
| GPT-4.1 Nano | OpenAI | $0.10/1M | ~500ms | Bulk tasks, cheapest option |
| GPT-4o Mini | OpenAI | $0.15/1M | ~400ms | Speed, classification, simple tasks |
| GPT-4.1 Mini | OpenAI | $0.40/1M | ~800ms | Coding, function calling |
| Claude Haiku 4.5 | Anthropic | $1/1M | ~500ms | Quick responses, classification |
| GPT-5.4 | OpenAI | $2/1M | ~2s | Reasoning (9x cheaper than Opus) |
| GPT-4.1 | OpenAI | $2/1M | ~1.5s | Coding, agents, instruction following |
| GPT-4o | OpenAI | $2.50/1M | ~1.2s | Data processing, SQL, code |
| Claude Sonnet 4.6 | Anthropic | $3/1M | ~1.5s | Empathy, creative writing, chat |
| Gemini 1.5 Pro | Google | $1.25/1M | ~1.8s | Long context, multimodal |
| Grok Beta | xAI | $5/1M | ~1s | Real-time data |
| GPT-4 Turbo | OpenAI | $10/1M | ~2s | Legacy reasoning (being replaced) |
| Claude Opus 4.6 | Anthropic | $15/1M | ~3s | Complex analysis (expensive) |
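Treating the registry as a decision matrix means the router can put a dollar figure on any candidate route before committing. A sketch with four entries copied from the table above; the per-request math is illustrative:

```python
# A slice of the provider registry: input cost per million tokens and a
# typical latency, keyed by an assumed internal model ID.
REGISTRY = {
    "gpt-4.1-nano":  {"input_per_m": 0.10, "latency_ms": 500},
    "claude-haiku":  {"input_per_m": 1.00, "latency_ms": 500},
    "claude-sonnet": {"input_per_m": 3.00, "latency_ms": 1500},
    "claude-opus":   {"input_per_m": 15.00, "latency_ms": 3000},
}

def input_cost(model, tokens):
    """Dollar cost of `tokens` input tokens on `model`."""
    return REGISTRY[model]["input_per_m"] * tokens / 1_000_000

# Routing a 150-token classification to Nano instead of Opus is a 150x
# difference on input cost alone:
ratio = input_cost("claude-opus", 150) / input_cost("gpt-4.1-nano", 150)
assert round(ratio) == 150
```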

Multi-Tenant Model Selection

Here's what makes SmartRouter different from a simple config file: every company on the platform can have different providers configured. A dental practice might only have OpenAI credentials. A law firm might require Anthropic for compliance reasons. A tech company might bring their own API keys for all providers.

SmartRouter handles this by checking the LLMProvider table for each company before making a routing decision. If a company doesn't have credentials for the optimal provider, SmartRouter automatically falls back to the next provider in the chain that the company does have access to. No errors, no manual configuration needed.
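The multi-tenant check reduces to intersecting the optimal chain with the company's configured providers. A sketch, with the LLMProvider lookup stubbed as a plain set:

```python
# Sketch of the per-company credential filter applied to a routing chain.
def route_for_company(chain, company_credentials):
    """First provider in the chain the company can actually call."""
    for provider in chain:
        if provider in company_credentials:
            return provider
    return None  # no usable provider configured for this company

chain = ["anthropic", "openai", "google"]
# A dental practice that only configured OpenAI keys:
assert route_for_company(chain, {"openai"}) == "openai"
# A law firm that requires (and configured) Anthropic:
assert route_for_company(chain, {"anthropic", "openai"}) == "anthropic"
```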

The Cost Impact: Why This Matters

The difference between routing correctly and defaulting to a premium model is dramatic:

  • SMS generation: Haiku at $1/1M tokens vs. Opus at $15/1M = 15x cost reduction
  • Reasoning tasks: GPT-5.4 at $2/1M vs. Opus at $15/1M = 7.5x cost reduction with comparable quality
  • Classification: GPT-4.1 Nano at $0.10/1M vs. Sonnet at $3/1M = 30x cost reduction
  • Customer chat: Sonnet at $3/1M (optimal) vs. Nano at $0.10/1M = poor quality. Some tasks need the premium model

SmartRouter works hand-in-hand with CognitiveLimiter — our cost control system. SmartRouter picks the most cost-effective model for the task. CognitiveLimiter enforces the budget so no company ever gets a surprise bill.

Provider Health Monitoring

AI providers go down. APIs degrade. Latency spikes. SmartRouter tracks provider health in Redis and automatically routes around problems:

  • Latency tracking: P95 latency per provider, per task type
  • Error rates: Rolling window of failures per provider
  • Automatic failover: If a provider is unhealthy, SmartRouter skips it in the chain and moves to the next option
  • Degradation detection: A provider can be “degraded” (slow but working) vs. “unhealthy” (failing). Degraded providers get deprioritized but not excluded
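The degraded-vs-unhealthy distinction can be sketched as a reordering of the chain rather than a binary filter. Health states and thresholds here are illustrative assumptions:

```python
# Sketch of health-aware ordering: unhealthy providers are skipped entirely,
# degraded ones stay in the chain but move to the back.
def order_by_health(chain, health):
    healthy  = [p for p in chain if health.get(p) == "healthy"]
    degraded = [p for p in chain if health.get(p) == "degraded"]
    return healthy + degraded  # "unhealthy" providers are excluded

chain = ["claude-sonnet", "gpt-4o", "gemini"]
# Anthropic is slow but working; Gemini is failing outright:
health = {"claude-sonnet": "degraded", "gpt-4o": "healthy", "gemini": "unhealthy"}
assert order_by_health(chain, health) == ["gpt-4o", "claude-sonnet"]
```

Keeping degraded providers available matters: a slow Sonnet is still a better last resort for customer chat than no response at all.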

When Anthropic has an API outage (it happens), customer-facing agents automatically fall through to GPT-4o within milliseconds. The business owner never notices.

Agent-to-Task Mapping: Every Agent Has a Default

Each of our 14 AI agents has a default task type that SmartRouter uses when no explicit task type is provided. This means agents get the right model without every caller needing to specify:

  • Sarah (Customer Service) → Customer Chat → Claude Sonnet (empathy)
  • Marcus (Growth Intelligence) → Data Analysis → GPT-4o (structured data)
  • ADA (Orchestrator) → Orchestration → GPT-5.4 (reasoning power)
  • Devon (Operations) → Code Generation → GPT-4o (technical accuracy)
  • Maya (Content) → Creative Writing → Claude Sonnet (narrative quality)
  • Jordan (Sales) → Reasoning → GPT-5.4 (analytical depth for deal strategy)
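The defaults above amount to a lookup that only applies when the caller doesn't specify a task type. A sketch with assumed agent identifiers:

```python
# Sketch of agent-to-default-task mapping (a subset, with illustrative IDs).
AGENT_DEFAULT_TASK = {
    "sarah":  "customer_chat",
    "marcus": "data_analysis",
    "ada":    "orchestration",
}

def task_for_request(agent, explicit_task=None):
    """Use the caller's task type if given, else the agent's default."""
    return explicit_task or AGENT_DEFAULT_TASK.get(agent)

assert task_for_request("ada") == "orchestration"
# An explicit task type always wins over the agent default:
assert task_for_request("sarah", "sms_generation") == "sms_generation"
```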

What We Learned Building SmartRouter

  1. Model strengths are real and measurable. Claude genuinely writes more empathetic customer service responses. GPT-4o genuinely handles structured data better. These aren't marketing claims — they show up in production quality metrics.
  2. Cost differences are 30x+ between models. Routing a simple classification to Opus instead of Nano is like hiring a surgeon to take your temperature. The math matters at scale.
  3. Context-aware routing is the biggest single optimization. Most customer messages are under 200 tokens. Routing them to fast/cheap models without degrading quality saves more than any other single change.
  4. Multi-tenancy forces good architecture. When every company can have different providers, you can't hardcode anything. This constraint made SmartRouter more flexible than it would have been otherwise.
  5. Orchestrator agents are special. We learned the hard way that ADA should never be downgraded to a cheap model based on prompt length. A 5-word question to an orchestrator can trigger a complex multi-agent workflow.

Open Architecture, Not a Black Box

SmartRouter isn't a proprietary secret. The architecture pattern — task classification, provider chain with fallbacks, performance-based learning, company-specific overrides — is something any team building multi-model AI systems should consider. We publish this because the problem of intelligent model selection is universal, and our solution is one battle-tested approach.

SmartRouter is part of Solid#'s AI infrastructure layer, alongside CognitiveLimiter (cost control) and PromptGuard (security). Together, they form the foundation that lets 14 AI agents serve businesses across 52 industries without vendor lock-in, budget surprises, or security vulnerabilities.
