LLM Optimizer

Find the smartest model for your budget

Data updated 47m ago from Artificial Analysis · 515 models

Model Creator Intelligence Coding Input ($/M tokens) Output ($/M tokens) Blended ($/M tokens) Speed (tokens/s) TTFT (s) Value
Qwen3.5 0.8B (Reasoning) Alibaba 10.5 0.0 $0.01 $0.05 $0.02 0.0 0.00 525.0
Qwen3.5 0.8B (Non-reasoning) Alibaba 9.9 1.0 $0.01 $0.05 $0.02 88.5 0.26 495.0
Qwen3.5 4B (Reasoning) Alibaba 27.1 17.5 $0.03 $0.15 $0.06 189.8 0.23 451.7
Qwen3.5 2B (Reasoning) Alibaba 16.3 3.5 $0.02 $0.10 $0.04 0.0 0.00 407.5
Qwen3.5 4B (Non-reasoning) Alibaba 22.6 13.7 $0.03 $0.15 $0.06 172.0 0.24 376.7
Qwen3.5 2B (Non-reasoning) Alibaba 14.7 4.9 $0.02 $0.10 $0.04 319.3 0.25 367.5
Qwen3.5 9B (Reasoning) Alibaba 32.4 25.3 $0.10 $0.15 $0.11 49.4 0.35 286.7
gpt-oss-20B (high) OpenAI 24.5 18.5 $0.05 $0.20 $0.09 269.4 0.43 278.4
MiMo-V2-Flash (Feb 2026) Xiaomi 41.5 33.5 $0.10 $0.30 $0.15 126.4 1.44 276.7
DeepSeek V4 Flash (Reasoning, Max Effort) DeepSeek 46.5 38.7 $0.14 $0.28 $0.17 103.6 0.75 265.7
DeepSeek V4 Flash (Reasoning, High Effort) DeepSeek 46.0 39.8 $0.14 $0.28 $0.17 0.0 0.00 262.9
MiMo-V2-Flash (Reasoning) Xiaomi 39.2 31.8 $0.10 $0.30 $0.15 128.3 1.47 261.3
Gemma 3n E4B Instruct Google 6.4 4.2 $0.02 $0.04 $0.03 52.4 0.48 256.0
NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) NVIDIA 24.3 19.0 $0.06 $0.22 $0.10 181.2 1.09 253.1
Step 3.5 Flash StepFun 37.8 31.6 $0.10 $0.30 $0.15 152.4 1.09 252.0
gpt-oss-20B (low) OpenAI 20.8 14.4 $0.06 $0.20 $0.10 273.3 0.48 218.9
NVIDIA Nemotron Nano 9B V2 (Reasoning) NVIDIA 14.8 8.3 $0.04 $0.16 $0.07 120.2 0.22 211.4
Hy3-preview (Reasoning) Tencent 41.9 36.5 $0.12 $0.43 $0.20 99.2 2.40 209.5
DeepSeek V4 Flash (Non-reasoning) DeepSeek 36.5 35.2 $0.14 $0.28 $0.17 97.8 0.86 208.6
MiMo-V2-Flash (Non-reasoning) Xiaomi 30.3 25.8 $0.10 $0.30 $0.15 128.3 1.22 202.0
LFM2 24B A2B Liquid AI 10.5 3.6 $0.03 $0.12 $0.05 154.2 0.26 201.9
Granite 4.1 8B IBM 12.4 7.3 $0.05 $0.10 $0.06 133.9 0.40 196.8
GLM-4.7-Flash (Reasoning) Z AI 30.1 25.9 $0.07 $0.40 $0.15 78.0 0.85 196.7
GPT-5 nano (high) OpenAI 26.8 20.3 $0.05 $0.40 $0.14 152.6 82.64 194.2
GPT-5 nano (medium) OpenAI 25.9 22.9 $0.05 $0.40 $0.14 165.8 28.98 187.7
Ling 2.6 Flash InclusionAI 26.2 23.2 $0.10 $0.30 $0.15 0.0 0.00 174.7
Nova Micro Amazon 10.3 4.1 $0.04 $0.14 $0.06 341.4 0.58 168.9
Hy3-preview (Non-reasoning) Tencent 33.7 34.3 $0.12 $0.43 $0.20 87.8 2.75 168.5
Nemotron 3 Nano Omni 30B A3B Reasoning NVIDIA 21.4 14.8 $0.07 $0.30 $0.13 295.9 0.63 163.4
Gemma 4 26B A4B (Reasoning) Google 31.2 22.4 $0.13 $0.40 $0.20 0.0 0.00 157.6
Gemma 4 31B (Non-reasoning) Google 32.3 33.9 $0.14 $0.40 $0.20 31.2 0.72 157.6
NVIDIA Nemotron Nano 9B V2 (Non-reasoning) NVIDIA 13.2 7.5 $0.05 $0.20 $0.09 125.4 0.65 153.5
NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning) NVIDIA 13.2 15.8 $0.05 $0.20 $0.09 100.0 0.28 150.0
GLM-4.7-Flash (Non-reasoning) Z AI 22.1 11.0 $0.07 $0.40 $0.15 195.1 1.24 144.4
Gemma 4 26B A4B (Non-reasoning) Google 27.1 29.1 $0.13 $0.40 $0.20 73.5 0.77 136.9
Qwen2.5 Turbo Alibaba 12.0 $0.05 $0.20 $0.09 70.5 1.26 136.4
Grok 4 Fast (Reasoning) xAI 35.1 27.4 $0.20 $0.50 $0.28 0.0 0.00 127.6
gpt-oss-120b (high) OpenAI 33.3 28.6 $0.15 $0.60 $0.26 329.7 0.48 127.1
Llama 3.2 Instruct 1B Meta 6.3 0.6 $0.05 $0.05 $0.05 90.1 0.56 126.0
Gemma 3 4B Instruct Google 6.3 2.9 $0.04 $0.08 $0.05 0.0 0.00 126.0
DeepSeek V3.2 (Reasoning) DeepSeek 41.7 36.7 $0.30 $0.45 $0.34 0.0 0.00 123.7
Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) Google 21.6 18.2 $0.10 $0.40 $0.17 0.0 0.00 123.4
Mistral Small 3 Mistral 12.7 $0.07 $0.19 $0.10 159.4 0.49 122.1
Nova Lite Amazon 12.7 5.1 $0.06 $0.24 $0.10 83.0 0.64 121.0
Llama 3.1 Instruct 8B Meta 11.8 4.9 $0.10 $0.10 $0.10 220.5 0.45 118.0
Mistral Small 3.2 Mistral 15.1 13.3 $0.09 $0.25 $0.13 134.2 0.36 118.0
Ministral 3 3B Mistral 11.2 4.8 $0.10 $0.10 $0.10 201.7 0.34 112.0
Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning) Google 19.4 14.5 $0.10 $0.40 $0.17 0.0 0.00 110.9
Llama Nemotron Super 49B v1.5 (Reasoning) NVIDIA 18.7 15.1 $0.10 $0.40 $0.17 50.4 0.30 106.9
DeepSeek V3.2 Exp (Reasoning) DeepSeek 32.9 33.3 $0.28 $0.41 $0.31 0.0 0.00 106.1
Mistral Small 4 (Reasoning) Mistral 27.8 24.3 $0.15 $0.60 $0.26 164.0 0.56 106.1
Mistral Small 3.1 Mistral 14.5 13.9 $0.10 $0.23 $0.14 153.9 0.47 105.1
Devstral Small (Jul '25) Mistral 15.2 12.1 $0.10 $0.30 $0.15 203.2 0.40 101.3
Granite 4.0 H Small IBM 10.8 8.5 $0.06 $0.25 $0.11 423.5 8.80 100.9
Gemini 2.5 Flash-Lite (Reasoning) Google 17.6 9.5 $0.10 $0.40 $0.17 236.2 14.03 100.6
GPT-5 nano (minimal) OpenAI 13.8 14.2 $0.05 $0.40 $0.14 158.4 0.68 100.0
Ministral 3 8B Mistral 14.8 10.0 $0.15 $0.15 $0.15 112.6 0.36 98.7
Llama 2 Chat 7B Meta 9.7 $0.05 $0.25 $0.10 95.0 7.68 97.0
GPT-5.4 nano (xhigh) OpenAI 44.0 43.9 $0.20 $1.25 $0.46 159.3 1.98 95.0
DeepSeek V4 Pro (Reasoning, Max Effort) DeepSeek 51.5 47.5 $0.43 $0.87 $0.54 43.7 1.22 94.7
MiniMax-M2.7 MiniMax 49.6 41.9 $0.30 $1.20 $0.53 50.3 1.16 94.5
Qwen3.5 Omni Flash Alibaba 25.9 14.0 $0.10 $0.80 $0.28 231.6 0.94 94.2
Qwen3 30B A3B (Non-reasoning) Alibaba 12.5 13.3 $0.08 $0.29 $0.13 67.6 1.06 94.0
gpt-oss-120b (low) OpenAI 24.5 15.5 $0.15 $0.60 $0.26 365.1 0.48 93.5
Grok 3 mini Reasoning (high) xAI 32.1 25.2 $0.30 $0.50 $0.35 51.7 0.46 91.7
DeepSeek V3.2 Exp (Non-reasoning) DeepSeek 28.4 30.0 $0.28 $0.41 $0.31 0.0 0.00 91.6
DeepSeek V4 Pro (Reasoning, High Effort) DeepSeek 49.8 43.2 $0.43 $0.87 $0.54 41.7 1.14 91.5
Llama 3 Instruct 8B Meta 6.4 4.0 $0.04 $0.14 $0.07 81.9 0.50 91.4
Mercury 2 Inception 32.8 30.6 $0.25 $0.75 $0.38 718.8 2.96 87.5
NVIDIA Nemotron 3 Super 120B A12B (Reasoning) NVIDIA 36.0 31.2 $0.30 $0.75 $0.41 155.2 0.87 87.4
Qwen3 30B A3B (Reasoning) Alibaba 15.3 11.0 $0.09 $0.45 $0.18 67.9 1.28 85.0
Grok 4 Fast (Non-reasoning) xAI 23.1 19.0 $0.20 $0.50 $0.28 0.0 0.00 84.0
Seed-OSS-36B-Instruct ByteDance Seed 25.2 16.7 $0.21 $0.57 $0.30 40.4 1.82 84.0
Llama Nemotron Super 49B v1.5 (Non-reasoning) NVIDIA 14.6 10.5 $0.10 $0.40 $0.17 49.0 0.28 83.4
KAT Coder Pro V2 KwaiKAT 43.8 45.6 $0.30 $1.20 $0.53 106.6 1.91 83.4
Granite 3.3 8B (Non-reasoning) IBM 7.0 3.4 $0.03 $0.25 $0.09 330.7 21.77 82.4
GPT-5.4 nano (medium) OpenAI 38.1 35.0 $0.20 $1.25 $0.46 161.5 3.72 82.3
Hermes 4 - Llama-3.1 70B (Reasoning) Nous Research 16.0 14.4 $0.13 $0.40 $0.20 71.7 0.61 80.8
Trinity Large Thinking Arcee AI 31.9 27.2 $0.23 $0.88 $0.40 136.8 0.60 80.8
Ministral 3 14B Mistral 16.0 10.9 $0.20 $0.20 $0.20 88.2 0.42 80.0
MiniMax-M2.5 MiniMax 41.9 37.4 $0.30 $1.20 $0.53 96.7 1.03 79.8
Solar Mini Upstage 11.9 $0.15 $0.15 $0.15 72.4 0.77 79.3
Qwen3.6 35B A3B (Reasoning) Alibaba 43.5 35.2 $0.25 $1.49 $0.56 169.5 1.19 78.1
MiniMax-M2.1 MiniMax 39.4 32.8 $0.30 $1.20 $0.53 84.1 1.10 75.0
GPT-4.1 nano OpenAI 13.0 11.2 $0.10 $0.40 $0.17 183.4 0.41 74.3
Gemini 2.5 Flash-Lite (Non-reasoning) Google 12.7 7.4 $0.10 $0.40 $0.17 260.6 5.95 72.6
DeepSeek V4 Pro (Non-reasoning) DeepSeek 39.3 38.4 $0.43 $0.87 $0.54 42.4 1.16 72.2
Gemma 3 27B Instruct Google 10.3 9.6 $0.11 $0.25 $0.14 0.0 0.00 71.0
Mistral Small 4 (Non-reasoning) Mistral 18.6 16.4 $0.15 $0.60 $0.26 161.9 0.54 71.0
Gemini 2.0 Flash (Feb '25) Google 18.5 13.6 $0.15 $0.60 $0.26 0.0 0.00 70.6
Qwen3 30B A3B 2507 Instruct Alibaba 15.0 14.2 $0.15 $0.40 $0.21 109.9 1.20 70.4
Qwen3 235B A22B 2507 Instruct Alibaba 25.0 22.1 $0.20 $0.82 $0.36 65.5 1.29 70.2
MiniMax-M2 MiniMax 36.1 29.2 $0.30 $1.20 $0.53 95.5 1.09 68.8
KAT-Coder-Pro V1 KwaiKAT 36.0 18.3 $0.30 $1.20 $0.53 95.5 2.89 68.6
MiMo-V2.5 Xiaomi 49.0 42.1 $0.36 $1.80 $0.72 90.7 2.19 68.1
Qwen3 4B (Non-reasoning) Alibaba 12.5 $0.11 $0.42 $0.19 104.5 1.02 66.5
Olmo 3 7B Instruct Allen Institute for AI 8.1 3.4 $0.10 $0.20 $0.13 0.0 0.00 64.8
Llama 3.2 Instruct 3B Meta 9.7 $0.15 $0.15 $0.15 52.4 0.70 64.7
Hermes 4 - Llama-3.1 70B (Non-reasoning) Nous Research 12.6 9.2 $0.13 $0.40 $0.20 70.0 0.60 63.6
Ling-flash-2.0 InclusionAI 15.7 16.7 $0.14 $0.57 $0.25 80.0 1.61 63.6
DeepSeek V3.1 Terminus (Non-reasoning) DeepSeek 28.5 31.9 $0.27 $1.00 $0.45 0.0 0.00 62.9
Gemma 3 12B Instruct Google 8.8 6.3 $0.09 $0.29 $0.14 0.0 0.00 62.9
GLM-4.5-Air Z AI 23.2 23.8 $0.17 $0.98 $0.37 59.9 1.69 62.4
GPT-5 mini (high) OpenAI 41.2 35.3 $0.25 $2.00 $0.69 94.0 75.51 59.9
Qwen3 32B (Reasoning) Alibaba 16.5 13.8 $0.20 $0.52 $0.28 98.7 1.04 59.8
Gemini 3.1 Flash-Lite Preview Google 33.5 30.1 $0.25 $1.50 $0.56 276.1 5.27 59.5
Qwen3 VL 30B A3B (Reasoning) Alibaba 19.7 13.1 $0.20 $0.75 $0.34 125.8 1.11 58.3
Qwen3 8B (Non-reasoning) Alibaba 10.6 7.1 $0.18 $0.20 $0.18 65.2 1.33 57.3
Qwen3 Coder 30B A3B Instruct Alibaba 20.0 19.4 $0.19 $0.84 $0.35 99.3 1.51 56.8
Ring-flash-2.0 InclusionAI 14.0 10.6 $0.14 $0.57 $0.25 0.0 0.00 56.7
GPT-5 mini (medium) OpenAI 38.9 32.8 $0.25 $2.00 $0.69 93.7 12.72 56.5
MiMo-V2-Omni-0327 Xiaomi 44.9 36.9 $0.40 $2.00 $0.80 92.3 1.67 56.1
GPT-5.1 Codex mini (high) OpenAI 38.6 36.4 $0.25 $2.00 $0.69 218.1 2.17 56.1
Qwen3 32B (Non-reasoning) Alibaba 14.5 $0.15 $0.59 $0.26 101.3 1.15 55.8
Qwen3.5 35B A3B (Reasoning) Alibaba 37.1 30.3 $0.25 $2.00 $0.69 151.1 1.24 53.9
Qwen3 VL 30B A3B Instruct Alibaba 16.0 14.3 $0.20 $0.60 $0.30 125.0 1.01 53.3
GPT-5.4 nano (Non-Reasoning) OpenAI 24.4 27.9 $0.20 $1.25 $0.46 146.6 0.57 52.7
GLM-4.6V (Reasoning) Z AI 23.4 19.7 $0.30 $0.90 $0.45 42.3 1.36 52.0
Qwen3.5 27B (Reasoning) Alibaba 42.1 34.9 $0.30 $2.40 $0.82 83.7 1.55 51.0
Qwen3 Coder Next Alibaba 28.3 22.9 $0.35 $1.20 $0.56 173.5 0.84 50.3
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) NVIDIA 14.9 11.7 $0.20 $0.60 $0.30 0.0 0.00 49.7
GPT-4o mini OpenAI 12.6 $0.15 $0.60 $0.26 61.7 0.65 48.1
Phi-4 Microsoft 10.4 11.2 $0.13 $0.50 $0.22 22.6 0.57 47.5
Apertus 8B Instruct Swiss AI Initiative 5.9 1.4 $0.10 $0.20 $0.13 0.0 0.00 47.2
Llama 4 Scout Meta 13.5 6.7 $0.17 $0.66 $0.29 106.6 0.57 46.2
Qwen3 VL 8B Instruct Alibaba 14.3 7.3 $0.18 $0.70 $0.31 146.2 1.11 46.1
Ring-2.6-1T InclusionAI 38.5 33.3 $0.30 $2.50 $0.85 119.6 1.86 45.3
Qwen3.5 35B A3B (Non-reasoning) Alibaba 30.7 16.8 $0.25 $2.00 $0.69 143.0 1.13 44.6
Qwen3.5 27B (Non-reasoning) Alibaba 37.2 33.4 $0.28 $2.50 $0.83 89.3 1.37 44.6
Qwen3.6 Plus Alibaba 50.0 42.9 $0.50 $3.00 $1.13 53.0 1.69 44.4
Qwen2.5 Instruct 72B Alibaba 15.6 11.9 $0.36 $0.40 $0.37 54.9 1.23 42.2
GLM-4.7 (Reasoning) Z AI 42.1 36.3 $0.60 $2.20 $1.00 88.5 0.80 42.1
DeepSeek V3.2 (Non-reasoning) DeepSeek 32.1 34.6 $0.50 $1.60 $0.78 0.0 0.00 41.4
Gemini 3 Flash Preview (Reasoning) Google 46.4 42.6 $0.50 $3.00 $1.13 195.4 5.01 41.2
Nova 2.0 Lite (high) Amazon 34.5 23.4 $0.30 $2.50 $0.85 160.3 16.03 40.6
Ling-2.6-1T InclusionAI 33.6 33.1 $0.30 $2.50 $0.85 0.0 0.00 39.5
Kimi K2.5 (Reasoning) Kimi 46.8 39.6 $0.58 $3.00 $1.19 43.5 1.16 39.5
Llama 4 Maverick Meta 18.4 15.6 $0.35 $0.85 $0.47 119.6 0.68 38.7
Kimi K2 Thinking Kimi 40.9 34.8 $0.60 $2.50 $1.07 97.6 0.85 38.0
GLM-4.6V (Non-reasoning) Z AI 17.1 11.1 $0.30 $0.90 $0.45 42.3 1.25 38.0
Qwen3.5 122B A10B (Reasoning) Alibaba 41.6 34.7 $0.40 $3.20 $1.10 146.2 1.06 37.8
Qwen3.6 35B A3B (Non-reasoning) Alibaba 31.5 17.6 $0.38 $2.25 $0.84 167.3 1.34 37.3
Qwen3 Coder 480B A35B Instruct Alibaba 24.8 24.6 $0.30 $1.80 $0.68 65.7 1.61 36.7
Qwen3 Omni 30B A3B (Reasoning) Alibaba 15.6 12.7 $0.25 $0.97 $0.43 86.5 0.96 36.3
Qwen3 1.7B (Non-reasoning) Alibaba 6.8 2.3 $0.11 $0.42 $0.19 140.1 0.93 36.2
Mistral 7B Instruct Mistral 7.4 $0.20 $0.23 $0.21 99.4 0.37 35.9
MiMo-V2.5-Pro Xiaomi 53.8 45.5 $1.00 $3.00 $1.50 53.4 2.35 35.9
Qwen3 4B (Reasoning) Alibaba 14.2 $0.11 $1.26 $0.40 102.7 1.03 35.7
Qwen3 8B (Reasoning) Alibaba 13.2 9.0 $0.11 $1.15 $0.37 61.4 1.42 35.7
Llama 3.2 Instruct 11B (Vision) Meta 8.7 4.2 $0.24 $0.24 $0.24 87.0 0.42 35.5
Hermes 3 - Llama-3.1 70B Nous Research 10.6 $0.30 $0.30 $0.30 34.8 0.41 35.3
Qwen3 235B A22B 2507 (Reasoning) Alibaba 29.5 23.2 $0.40 $2.15 $0.84 69.4 1.18 35.2
Nova 2.0 Lite (medium) Amazon 29.7 23.9 $0.30 $2.50 $0.85 167.2 16.58 34.9
Reka Flash (Sep '24) Reka AI 12.0 $0.20 $0.80 $0.35 85.9 1.73 34.3
GLM-4.7 (Non-reasoning) Z AI 34.2 32.0 $0.60 $2.20 $1.00 96.1 0.85 34.2
Grok 4.3 (high) xAI 53.2 41.0 $1.25 $2.50 $1.56 136.3 22.97 34.0
Mistral Small (Sep '24) Mistral 10.2 $0.20 $0.60 $0.30 154.9 0.56 34.0
Qwen3.6 27B (Reasoning) Alibaba 45.8 36.5 $0.60 $3.60 $1.35 60.8 1.42 33.9
GLM-4.6 (Reasoning) Z AI 32.5 29.5 $0.55 $2.20 $0.96 46.3 0.81 33.7
DeepSeek V3.1 (Non-reasoning) DeepSeek 28.1 28.4 $0.56 $1.67 $0.83 0.0 0.00 33.7
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) NVIDIA 10.1 5.9 $0.20 $0.60 $0.30 222.9 0.72 33.7
Qwen3 14B (Non-reasoning) Alibaba 12.8 12.4 $0.23 $0.82 $0.38 63.6 1.01 33.6
Qwen3.5 397B A17B (Reasoning) Alibaba 45.0 41.3 $0.60 $3.60 $1.35 52.1 1.71 33.3
Qwen3 30B A3B 2507 (Reasoning) Alibaba 22.4 14.6 $0.28 $1.85 $0.67 146.2 1.06 33.3
Nova 2.0 Omni (medium) Amazon 28.0 15.1 $0.30 $2.50 $0.85 0.0 0.00 32.9
MiMo-V2-Pro Xiaomi 49.2 41.4 $1.00 $3.00 $1.50 61.5 1.46 32.8
GPT-4.1 mini OpenAI 22.9 18.5 $0.40 $1.60 $0.70 93.0 0.42 32.7
Qwen3.5 122B A10B (Non-reasoning) Alibaba 35.9 31.6 $0.40 $3.20 $1.10 156.0 1.08 32.6
GLM-5 (Reasoning) Z AI 49.8 44.2 $1.00 $3.20 $1.55 71.8 0.69 32.1
DeepSeek V3.1 (Reasoning) DeepSeek 27.7 29.7 $0.59 $1.69 $0.86 0.0 0.00 32.0
Jamba 1.5 Mini AI21 Labs 8.0 $0.20 $0.40 $0.25 0.0 0.00 32.0
Gemini 2.5 Flash (Reasoning) Google 27.0 22.2 $0.30 $2.50 $0.85 214.8 11.10 31.8
Jamba 1.6 Mini AI21 Labs 7.9 $0.20 $0.40 $0.25 188.2 0.67 31.6
DeepSeek V3 (Dec '24) DeepSeek 16.5 16.4 $0.40 $0.89 $0.52 0.0 0.00 31.5
Kimi K2.6 Kimi 53.9 47.1 $0.95 $4.00 $1.71 103.2 0.98 31.5
Grok 4.3 (medium) xAI 48.8 35.1 $1.25 $2.50 $1.56 132.7 14.47 31.2
Gemini 3 Flash Preview (Non-reasoning) Google 35.0 37.8 $0.50 $3.00 $1.13 189.7 0.79 31.1
Kimi K2.5 (Non-reasoning) Kimi 37.3 25.8 $0.60 $3.00 $1.20 39.9 1.12 31.1
ERNIE 4.5 300B A47B Baidu 15.0 14.5 $0.28 $1.10 $0.48 23.3 1.57 30.9
Mistral Large 3 Mistral 22.8 22.7 $0.50 $1.50 $0.75 61.0 0.56 30.4
Qwen3 0.6B (Non-reasoning) Alibaba 5.7 1.4 $0.11 $0.42 $0.19 222.9 0.92 30.3
GLM-4.6 (Non-reasoning) Z AI 30.2 30.2 $0.60 $2.20 $1.00 59.5 1.74 30.2
GPT-5 mini (minimal) OpenAI 20.7 21.9 $0.25 $2.00 $0.69 90.9 0.79 30.1
Qwen3 VL 235B A22B Instruct Alibaba 20.8 16.5 $0.30 $1.90 $0.70 50.6 1.10 29.7
Qwen3.5 397B A17B (Non-reasoning) Alibaba 40.1 37.4 $0.60 $3.60 $1.35 52.9 1.64 29.7
GPT-5.4 mini (xhigh) OpenAI 48.9 51.5 $0.75 $4.50 $1.69 160.3 4.09 29.0
Nova 2.0 Lite (low) Amazon 24.6 13.6 $0.30 $2.50 $0.85 160.5 5.85 28.9
Kimi K2 0905 Kimi 30.9 25.9 $0.60 $2.50 $1.07 21.5 2.02 28.7
Grok 4.3 (low) xAI 43.9 31.6 $1.25 $2.50 $1.56 126.3 5.04 28.1
Qwen3.6 27B (Non-reasoning) Alibaba 37.1 26.6 $0.60 $3.60 $1.35 60.9 1.50 27.5
Nova 2.0 Omni (low) Amazon 23.2 13.9 $0.30 $2.50 $0.85 0.0 0.00 27.3
Reka Flash 3 Reka AI 9.5 8.9 $0.20 $0.80 $0.35 90.9 19.94 27.1
Mistral Medium 3.1 Mistral 21.3 18.3 $0.40 $2.00 $0.80 64.9 0.53 26.6
QwQ 32B Alibaba 19.7 $0.66 $1.00 $0.74 30.6 0.46 26.4
GLM-4.5 (Reasoning) Z AI 26.4 26.3 $0.60 $2.20 $1.00 50.6 1.14 26.4
GLM-5 (Non-reasoning) Z AI 40.6 39.0 $1.00 $3.20 $1.55 63.6 1.04 26.2
Qwen3.5 Omni Plus Alibaba 38.6 27.6 $0.40 $4.80 $1.50 55.7 1.26 25.7
MiniMax M1 80k MiniMax 24.4 14.5 $0.55 $2.20 $0.96 0.0 0.00 25.3
Kimi K2 Kimi 26.3 22.1 $0.58 $2.40 $1.04 25.9 1.51 25.3
Qwen3 VL 8B (Reasoning) Alibaba 16.7 9.8 $0.18 $2.10 $0.66 135.8 1.03 25.3
Kimi K2.6 (Non-reasoning) Kimi 42.9 38.4 $0.95 $4.00 $1.71 118.3 1.21 25.1
Qwen3 Omni 30B A3B Instruct Alibaba 10.7 7.2 $0.25 $0.97 $0.43 109.0 0.93 24.9
Claude 3 Haiku Anthropic 12.3 6.7 $0.25 $1.25 $0.50 0.0 0.00 24.6
Magistral Small 1.2 Mistral 18.2 14.8 $0.50 $1.50 $0.75 108.7 0.38 24.3
Gemini 2.5 Flash (Non-reasoning) Google 20.6 17.8 $0.30 $2.50 $0.85 204.4 0.48 24.2
GLM-5.1 (Reasoning) Z AI 51.4 43.4 $1.40 $4.40 $2.15 66.2 0.78 23.9
MiMo-V2.5-Pro (Non-reasoning) Xiaomi 35.6 36.8 $1.00 $3.00 $1.50 56.5 2.07 23.7
Llama 3.3 Instruct 70B Meta 14.5 10.7 $0.58 $0.71 $0.62 93.7 0.63 23.5
Mistral Medium 3 Mistral 18.8 13.6 $0.40 $2.00 $0.80 43.7 0.52 23.5
Devstral Medium Mistral 18.7 15.9 $0.40 $2.00 $0.80 69.4 0.51 23.4
Qwen3 Next 80B A3B Instruct Alibaba 20.1 15.3 $0.50 $2.00 $0.88 147.5 1.12 23.0
GPT-5.4 mini (medium) OpenAI 37.7 37.5 $0.75 $4.50 $1.69 159.3 6.12 22.3
Llama 3.1 Instruct 70B Meta 12.5 10.9 $0.56 $0.56 $0.56 31.3 0.70 22.3
Qwen3 14B (Reasoning) Alibaba 16.2 13.1 $0.23 $2.22 $0.73 60.8 1.03 22.2
Qwen3 235B A22B (Non-reasoning) Alibaba 17.0 14.0 $0.45 $1.80 $0.79 63.2 1.24 21.6
Nova 2.0 Lite (Non-reasoning) Amazon 18.0 12.5 $0.30 $2.50 $0.85 224.5 0.80 21.2
GLM-5.1 (Non-reasoning) Z AI 43.8 35.8 $1.40 $4.40 $2.15 43.0 0.83 20.4
DeepSeek R1 Distill Llama 70B DeepSeek 16.0 11.4 $0.70 $1.05 $0.79 46.8 0.32 20.3
Qwen3 1.7B (Reasoning) Alibaba 8.0 1.4 $0.11 $1.26 $0.40 140.4 1.00 20.1
Grok 4.3 (Non-reasoning) xAI 31.0 25.1 $1.25 $2.50 $1.56 124.9 0.45 19.8
Nova 2.0 Omni (Non-reasoning) Amazon 16.6 13.8 $0.30 $2.50 $0.85 0.0 0.00 19.5
DeepSeek V3 0324 DeepSeek 22.3 22.0 $1.20 $1.25 $1.21 0.0 0.00 18.4
DeepSeek V3.1 Terminus (Reasoning) DeepSeek 33.9 33.7 $1.64 $2.75 $1.91 0.0 0.00 17.7
Qwen3.6 Max Preview Alibaba 51.8 44.9 $1.30 $7.80 $2.92 34.3 2.03 17.7
o4-mini (high) OpenAI 33.1 25.6 $1.10 $4.40 $1.93 171.2 15.68 17.2
Claude 4.5 Haiku (Reasoning) Anthropic 37.1 32.6 $1.25 $5.00 $2.19 101.8 10.41 17.0
GLM-4.5V (Reasoning) Z AI 15.1 10.9 $0.60 $1.80 $0.90 35.2 1.15 16.8
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) NVIDIA 15.0 13.1 $0.60 $1.80 $0.90 52.8 0.69 16.7
Qwen3 Max Thinking Alibaba 39.8 30.5 $1.20 $6.00 $2.40 48.1 1.39 16.6
Grok 4.20 0309 v2 (Reasoning) xAI 49.3 40.5 $2.00 $6.00 $3.00 143.4 18.06 16.4
Gemini 3.5 Flash (high) Google 55.3 45.0 $1.50 $9.00 $3.38 214.1 13.13 16.4
Qwen3 0.6B (Reasoning) Alibaba 6.5 0.9 $0.11 $1.26 $0.40 222.4 0.92 16.3
Grok 4.20 0309 (Reasoning) xAI 48.5 42.2 $2.00 $6.00 $3.00 141.8 18.88 16.2
Qwen3.7 Max Alibaba 56.6 50.1 $2.50 $7.50 $3.75 197.6 1.60 15.1
Mixtral 8x7B Instruct Mistral 7.7 $0.45 $0.70 $0.51 0.0 0.00 15.0
Qwen3 Next 80B A3B (Reasoning) Alibaba 26.7 19.5 $0.50 $6.00 $1.88 166.3 1.15 14.2
Claude 4.5 Haiku (Non-reasoning) Anthropic 31.0 29.6 $1.25 $5.00 $2.19 98.0 0.63 14.2
GLM-4.5V (Non-reasoning) Z AI 12.7 10.8 $0.60 $1.80 $0.90 45.2 34.35 14.1
Qwen3 VL 32B Instruct Alibaba 17.2 15.6 $0.70 $2.80 $1.23 79.2 1.24 14.0
GPT-5.1 (high) OpenAI 47.7 44.7 $1.25 $10.00 $3.44 149.3 14.25 13.9
GPT-5.4 mini (Non-Reasoning) OpenAI 23.3 25.3 $0.75 $4.50 $1.69 155.7 0.52 13.8
Qwen3 Max Thinking (Preview) Alibaba 32.5 24.5 $1.20 $6.00 $2.40 48.6 1.77 13.5
o3-mini OpenAI 25.9 17.9 $1.10 $4.40 $1.93 182.9 6.10 13.5
DeepSeek R1 0528 (May '25) DeepSeek 27.1 24.0 $1.35 $4.20 $2.06 0.0 0.00 13.1
o3-mini (high) OpenAI 25.2 17.3 $1.10 $4.40 $1.93 167.1 20.36 13.1
Mistral Medium 3.5 Mistral 39.2 35.4 $1.50 $7.50 $3.00 150.4 0.52 13.1
GPT-5 (high) OpenAI 44.6 36.0 $1.25 $10.00 $3.44 82.9 83.02 13.0
GPT-5 Codex (high) OpenAI 44.6 38.9 $1.25 $10.00 $3.44 177.2 7.93 13.0
Gemini 3.5 Flash (minimal) Google 43.3 47.1 $1.50 $9.00 $3.38 211.3 0.79 12.8
Gemini 3.1 Pro Preview Google 57.2 55.5 $2.00 $12.00 $4.50 125.9 23.69 12.7
Qwen3 VL 235B A22B (Reasoning) Alibaba 27.6 20.9 $0.84 $6.17 $2.17 38.2 1.86 12.7
GPT-5.1 Codex (high) OpenAI 43.1 36.6 $1.25 $10.00 $3.44 175.5 3.14 12.5
Hermes 4 - Llama-3.1 405B (Reasoning) Nous Research 18.6 16.0 $1.00 $3.00 $1.50 28.8 0.80 12.4
GPT-5 (medium) OpenAI 42.0 38.9 $1.25 $10.00 $3.44 83.4 28.76 12.2
GPT-3.5 Turbo OpenAI 9.0 10.7 $0.50 $1.50 $0.75 116.2 0.43 12.0
Hermes 4 - Llama-3.1 405B (Non-reasoning) Nous Research 17.6 18.1 $1.00 $3.00 $1.50 32.7 0.75 11.7
GPT-5 (low) OpenAI 39.2 30.7 $1.25 $10.00 $3.44 85.5 7.54 11.4
Llama 3.1 Nemotron Instruct 70B NVIDIA 13.4 10.8 $1.20 $1.20 $1.20 300.7 0.26 11.2
GPT-5.3 Codex (xhigh) OpenAI 53.6 53.1 $1.75 $14.00 $4.81 76.2 56.81 11.1
o3 OpenAI 38.4 38.4 $2.00 $8.00 $3.50 142.2 5.53 11.0
Qwen3 Max (Preview) Alibaba 26.1 25.5 $1.20 $6.00 $2.40 52.6 1.98 10.9
Gemini 3 Pro Preview (high) Google 48.4 46.5 $2.00 $12.00 $4.50 125.3 25.64 10.8
Claude 3.5 Haiku Anthropic 18.7 10.7 $1.00 $4.00 $1.75 0.0 0.00 10.7
GPT-5.2 (xhigh) OpenAI 51.3 48.7 $1.75 $14.00 $4.81 74.5 92.24 10.7
Nova 2.0 Pro Preview (medium) Amazon 35.7 30.4 $1.25 $10.00 $3.44 128.6 16.55 10.4
Qwen3 Max Alibaba 31.4 26.4 $1.66 $7.22 $3.05 32.6 1.85 10.3
GPT-5.2 Codex (xhigh) OpenAI 49.0 43.0 $1.75 $14.00 $4.81 100.7 1.81 10.2
GPT-5.4 (xhigh) OpenAI 56.8 57.2 $2.50 $15.00 $5.63 79.4 161.32 10.1
Gemini 2.5 Pro Google 34.6 32.0 $1.25 $10.00 $3.44 132.4 20.72 10.1
Grok 4.20 0309 (Non-reasoning) xAI 29.7 25.4 $2.00 $6.00 $3.00 135.0 0.52 9.9
Command-R (Mar '24) Cohere 7.4 $0.50 $1.50 $0.75 0.0 0.00 9.9
Magistral Medium 1.2 Mistral 27.1 21.7 $2.00 $5.00 $2.75 40.2 0.51 9.9
GPT-5.2 (medium) OpenAI 46.6 44.2 $1.75 $14.00 $4.81 0.0 0.00 9.7
Grok 4.20 0309 v2 (Non-reasoning) xAI 29.0 22.0 $2.00 $6.00 $3.00 128.6 0.62 9.7
Nova Pro Amazon 13.5 11.0 $0.80 $3.20 $1.40 0.0 0.00 9.6
Qwen3 VL 32B (Reasoning) Alibaba 24.7 14.5 $0.70 $8.40 $2.63 97.2 1.28 9.4
Nova 2.0 Pro Preview (low) Amazon 31.9 24.5 $1.25 $10.00 $3.44 138.3 5.27 9.3
Gemini 3 Pro Preview (low) Google 41.3 39.4 $2.00 $12.00 $4.50 0.0 0.00 9.2
Llama 3.2 Instruct 90B (Vision) Meta 11.9 $1.38 $1.38 $1.38 48.5 0.54 8.6
Gemini 2.5 Pro Preview (May' 25) Google 29.5 $1.25 $10.00 $3.44 0.0 0.00 8.6
GPT-5.4 (low) OpenAI 47.9 45.6 $2.50 $15.00 $5.63 64.0 1.55 8.5
GPT-5.1 (Non-reasoning) OpenAI 27.4 27.3 $1.25 $10.00 $3.44 135.0 0.66 8.0
Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) Anthropic 51.7 50.9 $3.75 $15.00 $6.56 50.3 41.52 7.9
DeepSeek R1 (Jan '25) DeepSeek 18.8 15.9 $1.68 $4.70 $2.43 0.0 0.00 7.7
Llama 3 Instruct 70B Meta 8.9 6.8 $0.65 $2.75 $1.18 46.2 0.64 7.6
Qwen3 235B A22B (Reasoning) Alibaba 19.8 17.4 $0.70 $8.40 $2.63 65.1 1.18 7.5
GPT-4.1 OpenAI 26.3 21.8 $2.00 $8.00 $3.50 130.4 0.66 7.5
GPT-5.2 (Non-reasoning) OpenAI 33.6 34.7 $1.75 $14.00 $4.81 66.6 0.58 7.0
GPT-5 (minimal) OpenAI 23.9 25.0 $1.25 $10.00 $3.44 81.1 0.73 7.0
Claude Sonnet 4.6 (Non-reasoning, High Effort) Anthropic 44.4 46.4 $3.75 $15.00 $6.56 45.5 1.07 6.8
Nova 2.0 Pro Preview (Non-reasoning) Amazon 23.1 20.5 $1.25 $10.00 $3.44 139.2 0.76 6.7
Claude 4.5 Sonnet (Reasoning) Anthropic 43.0 38.6 $3.75 $15.00 $6.56 48.3 7.84 6.6
Claude Sonnet 4.6 (Non-reasoning, Low Effort) Anthropic 42.6 43.0 $3.75 $15.00 $6.56 45.3 1.04 6.5
GPT-5 (ChatGPT) OpenAI 21.8 21.2 $1.25 $10.00 $3.44 180.3 0.46 6.3
GPT-5.4 (Non-reasoning) OpenAI 35.4 41.0 $2.50 $15.00 $5.63 66.6 0.73 6.3
Mistral Small (Feb '24) Mistral 9.0 $1.00 $3.00 $1.50 151.3 0.56 6.0
Claude 4 Sonnet (Reasoning) Anthropic 38.7 34.1 $3.75 $15.00 $6.56 45.8 7.97 5.9
Qwen2.5 Max Alibaba 16.3 $1.60 $6.40 $2.80 49.1 1.13 5.8
Apertus 70B Instruct Swiss AI Initiative 7.7 1.9 $0.82 $2.92 $1.34 0.0 0.00 5.7
Claude 4.5 Sonnet (Non-reasoning) Anthropic 37.1 33.5 $3.75 $15.00 $6.56 45.7 1.44 5.7
GPT-5.5 (xhigh) OpenAI 60.2 59.1 $5.00 $30.00 $11.25 65.9 68.36 5.4
Claude Opus 4.7 (Adaptive Reasoning, Max Effort) Anthropic 57.3 52.5 $6.25 $25.00 $10.94 46.7 13.62 5.2
GPT-5.5 (high) OpenAI 58.9 58.5 $5.00 $30.00 $11.25 66.0 17.23 5.2
GPT-5.5 (medium) OpenAI 56.7 56.2 $5.00 $30.00 $11.25 68.9 5.11 5.0
Mistral Large 2 (Nov '24) Mistral 15.1 13.8 $2.00 $6.00 $3.00 33.3 0.59 5.0
Claude 4 Sonnet (Non-reasoning) Anthropic 33.0 30.6 $3.75 $15.00 $6.56 47.5 1.12 5.0
Claude Opus 4.6 (Adaptive Reasoning, Max Effort) Anthropic 52.9 48.1 $6.25 $25.00 $10.94 46.3 7.00 4.8
Claude Opus 4.7 (Non-reasoning, High Effort) Anthropic 51.8 53.1 $6.25 $25.00 $10.94 44.5 1.39 4.7
Llama 3.1 Instruct 405B Meta 17.4 14.5 $2.75 $6.50 $3.69 45.0 0.62 4.7
Claude 3.7 Sonnet (Non-reasoning) Anthropic 30.8 26.7 $3.75 $15.00 $6.56 0.0 0.00 4.7
Pixtral Large Mistral 14.0 $2.00 $6.00 $3.00 55.8 0.51 4.7
Claude Opus 4.5 (Reasoning) Anthropic 49.7 47.8 $6.25 $25.00 $10.94 50.0 10.45 4.5
GPT-5.5 (low) OpenAI 50.8 52.1 $5.00 $30.00 $11.25 62.1 1.60 4.5
Mistral Large 2 (Jul '24) Mistral 13.0 $2.00 $6.00 $3.00 0.0 0.00 4.3
GPT-4o (Aug '24) OpenAI 18.6 16.6 $2.50 $10.00 $4.38 138.7 0.57 4.3
Claude Opus 4.6 (Non-reasoning, High Effort) Anthropic 46.5 47.6 $6.25 $25.00 $10.94 45.8 1.45 4.3
GPT-4o (Nov '24) OpenAI 17.3 16.7 $2.50 $10.00 $4.38 205.7 0.45 4.0
Claude Opus 4.5 (Non-reasoning) Anthropic 43.1 42.9 $6.25 $25.00 $10.94 50.7 1.47 3.9
Nova Premier Amazon 19.0 13.8 $2.50 $12.50 $5.00 70.9 1.08 3.8
Grok 4 xAI 41.5 40.5 $5.50 $27.50 $11.00 0.0 0.00 3.8
GPT-5.5 (Non-reasoning) OpenAI 40.9 48.6 $5.00 $30.00 $11.25 61.7 0.85 3.6
Grok 3 xAI 25.2 19.8 $4.00 $20.00 $8.00 0.0 0.00 3.1
Jamba 1.7 Large AI21 Labs 10.9 7.8 $2.00 $8.00 $3.50 60.6 0.89 3.1
Command A Cohere 13.5 9.9 $2.50 $10.00 $4.38 61.1 0.41 3.1
Jamba 1.5 Large AI21 Labs 10.7 $2.00 $8.00 $3.50 0.0 0.00 3.1
Jamba 1.6 Large AI21 Labs 10.6 $2.00 $8.00 $3.50 60.9 0.89 3.0
Claude 3.5 Sonnet (Oct '24) Anthropic 15.9 30.2 $3.75 $15.00 $6.56 0.0 0.00 2.4
Mistral Medium Mistral 9.0 $2.75 $8.10 $4.09 56.1 0.76 2.2
Claude 3.5 Sonnet (June '24) Anthropic 14.2 26.0 $3.75 $15.00 $6.56 0.0 0.00 2.2
GPT-4o (May '24) OpenAI 14.5 24.2 $5.00 $15.00 $7.50 137.7 0.52 1.9
Claude 3 Sonnet Anthropic 10.3 $3.00 $15.00 $6.00 0.0 0.00 1.7
Mistral Large (Feb '24) Mistral 9.9 $4.00 $12.00 $6.00 0.0 0.00 1.7
Command-R+ (Apr '24) Cohere 8.3 $3.00 $15.00 $6.00 0.0 0.00 1.4
Claude 4.1 Opus (Reasoning) Anthropic 42.0 36.5 $18.75 $75.00 $32.81 35.0 8.20 1.3
Claude 4 Opus (Reasoning) Anthropic 39.0 34.0 $18.75 $75.00 $32.81 33.6 7.34 1.2
o1 OpenAI 30.7 20.5 $15.00 $60.00 $26.25 97.3 19.98 1.2
o3-pro OpenAI 40.7 $20.00 $80.00 $35.00 23.1 66.16 1.2
Claude 4.1 Opus (Non-reasoning) Anthropic 36.0 $18.75 $75.00 $32.81 34.4 1.77 1.1
Claude 4 Opus (Non-reasoning) Anthropic 33.0 $18.75 $75.00 $32.81 33.9 1.60 1.0
GPT-4 Turbo OpenAI 13.7 21.5 $10.00 $30.00 $15.00 35.5 1.11 0.9
o1-preview OpenAI 23.7 34.0 $16.50 $66.00 $28.88 0.0 0.00 0.8
Claude 3 Opus Anthropic 18.0 19.5 $18.75 $75.00 $32.81 0.0 0.00 0.5
GPT-4 OpenAI 12.8 13.1 $30.00 $60.00 $37.50 45.1 0.99 0.3
o1-pro OpenAI 25.8 $150.00 $600.00 $262.50 0.0 0.00 0.1
Grok-1 xAI 11.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Muse Spark Meta 52.2 47.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemma 4 E4B (Reasoning) Google 18.8 13.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemma 4 31B (Reasoning) Google 39.2 38.7 $0.00 $0.00 $0.00 35.0 0.98 0.0
Gemma 3 270M Google 7.7 0.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemma 4 E4B (Non-reasoning) Google 14.8 6.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemma 4 E2B (Non-reasoning) Google 12.1 8.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemma 4 E2B (Reasoning) Google 15.2 9.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Devstral Small 2 Mistral 19.5 20.7 $0.00 $0.00 $0.00 57.0 0.53 0.0
Devstral 2 Mistral 22.0 23.7 $0.00 $0.00 $0.00 59.8 0.73 0.0
R1 1776 Perplexity 12.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Falcon-H1R-7B TII UAE 15.8 9.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Grok 4.1 Fast (Non-reasoning) xAI 23.6 19.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Grok 4.1 Fast (Reasoning) xAI 38.6 30.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
Grok Code Fast 1 xAI 28.7 23.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Phi-4 Mini Instruct Microsoft 8.4 3.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Phi-4 Multimodal Instruct Microsoft 10.0 $0.00 $0.00 $0.00 17.2 0.36 0.0
LFM2 8B A1B Liquid AI 7.0 2.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
LFM2.5-VL-1.6B Liquid AI 6.2 1.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
LFM2.5-1.2B-Instruct Liquid AI 8.0 0.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
LFM2.5-1.2B-Thinking Liquid AI 8.1 1.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
LFM2 2.6B Liquid AI 8.0 1.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Solar Pro 2 (Reasoning) Upstage 14.9 12.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Solar Open 100B (Reasoning) Upstage 21.7 10.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Solar Pro 2 (Non-reasoning) Upstage 13.6 11.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Solar Pro 3 Upstage 25.9 13.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
NVIDIA Nemotron 3 Nano 4B NVIDIA 14.7 10.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) NVIDIA 14.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Nemotron Cascade 2 30B A3B NVIDIA 28.4 25.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) NVIDIA 14.3 7.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Llama 3.3 Nemotron Super 49B v1 (Reasoning) NVIDIA 18.5 9.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Llama 65B Meta 7.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Kimi Linear 48B A3B Instruct Kimi 14.4 14.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Step3 VL 10B StepFun 15.5 13.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
Step 3.5 Flash 2603 StepFun 38.5 34.6 $0.00 $0.00 $0.00 149.5 0.86 0.0
Olmo 3.1 32B Think Allen Institute for AI 13.9 9.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Molmo 7B-D Allen Institute for AI 9.2 1.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Molmo2-8B Allen Institute for AI 7.3 4.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Olmo 3 7B Think Allen Institute for AI 9.4 7.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Olmo 3.1 32B Instruct Allen Institute for AI 12.2 5.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Granite 4.0 H 1B IBM 8.0 2.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Granite 4.0 Micro IBM 7.7 5.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Granite 4.1 3B IBM 8.5 5.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Granite 4.0 350M IBM 6.1 0.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Granite 4.1 30B IBM 14.7 10.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Granite 4.0 H 350M IBM 5.4 0.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Granite 4.0 1B IBM 7.3 2.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) Nous Research 7.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepHermes 3 - Mistral 24B Preview (Non-reasoning) Nous Research 10.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
Exaone 4.0 1.2B (Reasoning) LG AI Research 8.3 3.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
K-EXAONE (Reasoning) LG AI Research 32.1 27.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
EXAONE 4.0 32B (Reasoning) LG AI Research 16.7 14.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
EXAONE 4.0 32B (Non-reasoning) LG AI Research 11.7 9.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Exaone 4.0 1.2B (Non-reasoning) LG AI Research 8.1 2.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
EXAONE 4.5 33B LG AI Research 30.2 23.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
K-EXAONE (Non-reasoning) LG AI Research 23.4 13.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
MiMo-V2-Omni Xiaomi 43.4 35.5 $0.00 $0.00 $0.00 96.0 1.36 0.0
Qwen Chat 14B Alibaba 7.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
ERNIE 5.0 Thinking Preview Baidu 29.1 29.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Sarvam 105B (high) Sarvam 18.2 9.8 $0.00 $0.00 $0.00 121.0 1.23 0.0
Sarvam 30B (high) Sarvam 12.3 7.9 $0.00 $0.00 $0.00 83.1 1.25 0.0
INTELLECT-3 Prime Intellect 22.2 19.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Motif-2-12.7B-Reasoning Motif Technologies 19.1 11.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
K2 Think V2 MBZUAI Institute of Foundation Models 24.1 15.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
K2-V2 (medium) MBZUAI Institute of Foundation Models 18.7 14.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
K2-V2 (high) MBZUAI Institute of Foundation Models 20.6 16.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
K2-V2 (low) MBZUAI Institute of Foundation Models 14.4 10.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Mi:dm K 2.5 Pro Korea Telecom 23.1 12.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
HyperCLOVA X SEED Think (32B) Naver 23.7 17.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
LongCat Flash Lite LongCat 23.9 16.5 $0.00 $0.00 $0.00 103.2 4.30 0.0
Tri-21B-Think Trillion Labs 18.6 6.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Tri-21B-think Preview Trillion Labs 20.0 7.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Nanbeige4.1-3B Nanbeige 16.1 8.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
MiniCPM-V 4.6 1.3B OpenBMB 12.7 0.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
MiniCPM5-1B (Non-reasoning) OpenBMB 17.9 0.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
JT-MINI China Mobile 25.4 21.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
JT-35B-Flash China Mobile 36.1 28.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
GLM 5V Turbo (Reasoning) Z AI 42.9 36.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
GLM-5-Turbo Z AI 46.8 36.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Command A+ Cohere 37.2 29.3 $0.00 $0.00 $0.00 201.2 0.17 0.0
Tiny Aya Global Cohere 4.7 1.2 $0.00 $0.00 $0.00 124.4 0.30 0.0
Apriel-v1.6-15B-Thinker ServiceNow 27.6 22.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Jamba 1.7 Mini AI21 Labs 8.1 3.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Jamba Reasoning 3B AI21 Labs 9.6 2.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3.5 9B (Non-reasoning) Alibaba 27.3 21.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Ring-1T InclusionAI 22.8 16.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Ling-mini-2.0 InclusionAI 9.2 5.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Ling-1T InclusionAI 19.0 18.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Doubao Seed Code ByteDance Seed 33.5 31.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
o1-mini OpenAI 20.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
GPT-4o (March 2025, chatgpt-4o-latest) OpenAI 18.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
GPT-4o (ChatGPT) OpenAI 14.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
GPT-4.5 (Preview) OpenAI 20.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Llama 2 Chat 70B Meta 8.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Llama 2 Chat 13B Meta 8.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.0 Pro Experimental (Feb '25) Google 18.1 25.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.0 Flash (experimental) Google 16.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 1.5 Pro (Sep '24) Google 16.0 23.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.0 Flash-Lite (Preview) Google 14.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 1.5 Flash (Sep '24) Google 13.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 1.5 Flash-8B Google 11.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 1.0 Pro Google 8.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemma 3 1B Instruct Google 5.6 0.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.5 Pro Preview (Mar '25) Google 30.3 46.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.0 Flash Thinking Experimental (Jan '25) Google 19.6 24.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.5 Flash Preview (Reasoning) Google 24.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.5 Flash Preview (Non-reasoning) Google 17.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning) Google 25.7 22.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.5 Flash Preview (Sep '25) (Reasoning) Google 31.1 24.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemma 3n E2B Instruct Google 4.8 2.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
PALM-2 Google 8.6 4.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 1.5 Pro (May '24) Google 12.0 19.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemma 3n E4B Instruct Preview (May '25) Google 10.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.0 Flash-Lite (Feb '25) Google 14.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.0 Flash Thinking Experimental (Dec '24) Google 12.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 1.5 Flash (May '24) Google 10.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 1.0 Ultra Google 10.1 17.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Claude Instant Anthropic 7.4 7.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Claude 3.7 Sonnet (Reasoning) Anthropic 34.7 27.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Claude 2.1 Anthropic 9.3 14.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Claude 2.0 Anthropic 9.1 12.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
Mixtral 8x22B Instruct Mistral 9.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Magistral Small 1 Mistral 16.8 11.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Mistral Saba Mistral 12.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Magistral Medium 1 Mistral 18.8 16.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Devstral Small (May '25) Mistral 18.0 12.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek R1 Distill Qwen 32B DeepSeek 17.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek R1 Distill Qwen 14B DeepSeek 15.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek-V2.5 (Dec '24) DeepSeek 12.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek-Coder-V2 DeepSeek 10.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek R1 Distill Llama 8B DeepSeek 12.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek LLM 67B Chat (V1) DeepSeek 8.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek R1 Distill Qwen 1.5B DeepSeek 9.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek R1 0528 Qwen3 8B DeepSeek 16.4 7.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek V3.2 Speciale DeepSeek 29.4 37.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek-V2-Chat DeepSeek 9.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek Coder V2 Lite Instruct DeepSeek 8.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek-V2.5 DeepSeek 12.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Sonar Perplexity 15.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Sonar Pro Perplexity 15.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Sonar Reasoning Perplexity 17.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
Sonar Reasoning Pro Perplexity 24.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Grok Beta xAI 13.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Grok 3 Reasoning Beta xAI 21.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Grok 2 (Dec '24) xAI 13.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
OpenChat 3.5 (1210) OpenChat 8.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Phi-3 Mini Instruct 3.8B Microsoft 10.1 3.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
LFM 40B Liquid AI 8.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
LFM2 1.2B Liquid AI 6.3 0.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Solar Pro 2 (Preview) (Reasoning) Upstage 18.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Solar Pro 2 (Preview) (Non-reasoning) Upstage 16.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
DBRX Instruct Databricks 8.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
MiniMax M1 40k MiniMax 20.9 14.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Llama 3.1 Tulu3 405B Allen Institute for AI 14.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Olmo 3 32B Think Allen Institute for AI 12.1 10.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
OLMo 2 32B Allen Institute for AI 10.6 2.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
OLMo 2 7B Allen Institute for AI 9.3 1.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Sarvam M (Reasoning) Sarvam 8.4 7.5 $0.00 $0.00 $0.00 135.8 1.17 0.0
Apriel-v1.5-15B-Thinker ServiceNow 28.3 18.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Arctic Instruct Snowflake 8.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen2.5 Coder Instruct 32B Alibaba 12.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen2 Instruct 72B Alibaba 11.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen Chat 72B Alibaba 8.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen1.5 Chat 110B Alibaba 9.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
QwQ 32B-Preview Alibaba 15.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3 VL 4B Instruct Alibaba 9.6 4.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3 VL 4B (Reasoning) Alibaba 13.7 6.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen2.5 Coder Instruct 7B Alibaba 10.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3 4B 2507 (Reasoning) Alibaba 18.2 9.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen2.5 Instruct 32B Alibaba 13.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3 4B 2507 Instruct Alibaba 12.9 9.0 $0.00 $0.00 $0.00 0.0 0.00 0.0

The most expensive model is rarely the best choice

There is a persistent assumption in the industry that higher price means higher quality. The data tells a different story. Sort the table above by value score and you will see that many of the highly ranked models cost a fraction of the flagship offerings while coming close to them on intelligence and coding benchmarks. A model at $0.50 per million tokens can score within a few points of one at $15. That is a 30x price difference for nearly identical capability.

This matters at scale. Suppose your application processes a million requests a day at a modest 100 tokens per request. The difference between a $2/M model and a $0.20/M model is then not a rounding error: it is about $5,400 over a 30-day month, and the gap grows in proportion to request size. The cheaper model might also respond faster, because smaller or better-optimized models frequently achieve higher throughput. When that happens, the expensive model costs more and delivers less speed.
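The arithmetic behind that comparison can be sketched in a few lines. This is a minimal illustration, not a billing calculator: the function name `monthly_cost`, the 100 tokens per request, and the 30-day month are all assumptions chosen for the example, and the sketch uses a single blended price rather than separate input and output rates.

```python
def monthly_cost(requests_per_day: int, tokens_per_request: int,
                 price_per_million: float, days: int = 30) -> float:
    """Return the dollar cost of a workload over `days` days.

    `price_per_million` is a blended price in dollars per million tokens.
    """
    tokens_per_day = requests_per_day * tokens_per_request
    return tokens_per_day / 1_000_000 * price_per_million * days


# Hypothetical workload: 1M requests/day at 100 tokens each (assumed figures).
expensive = monthly_cost(1_000_000, 100, 2.00)
cheap = monthly_cost(1_000_000, 100, 0.20)
savings = expensive - cheap

print(f"${expensive:,.0f} vs ${cheap:,.0f}: saves ${savings:,.0f} per month")
```

Raising `tokens_per_request` to 1,000 multiplies every figure by ten, which is why the token count per request matters as much as the headline price.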

The flagship models from the largest providers do lead on the hardest benchmarks — frontier math, PhD-level science, complex multi-step reasoning. But most production workloads are not PhD-level science. They are classification, extraction, summarization, code generation, and conversational tasks. For these, a mid-tier model is not a compromise — it is the right tool. Using a $10/M model to parse invoices is like hiring a surgeon to apply a bandage.

Reasoning models add another dimension. They spend extra compute “thinking” before answering, which boosts accuracy on hard problems but also increases latency and cost. If your use case does not require multi-step logical deduction, a standard model will often give you the same answer in a fraction of the time. The filter above lets you isolate reasoning models so you can compare them separately.

The value score in this table — intelligence divided by blended price — exists to make this tradeoff visible at a glance. The best model for your project is not the smartest one available. It is the smartest one you need, at the lowest price that delivers it.
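As a rough sketch of how that score and the selection rule could be computed: the names `Model`, `value_score`, and `cheapest_sufficient` are hypothetical, and the three rows are copied from the table. Rows listed at $0.00 display a value of 0.0, so the sketch returns 0.0 for a non-positive price and skips those rows when choosing a model. Treating such rows as unpriced rather than free is an assumption.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    intelligence: float
    blended_price: float  # dollars per million tokens


def value_score(model: Model) -> float:
    """Intelligence divided by blended price; 0.0 when no price is listed."""
    if model.blended_price <= 0:
        return 0.0  # assumption: $0.00 means "no pricing data", not "free"
    return model.intelligence / model.blended_price


def cheapest_sufficient(models: list[Model],
                        min_intelligence: float) -> Optional[Model]:
    """Return the lowest-priced model that meets the intelligence bar."""
    candidates = [m for m in models
                  if m.blended_price > 0 and m.intelligence >= min_intelligence]
    return min(candidates, key=lambda m: m.blended_price, default=None)


models = [
    Model("Command A+", 37.2, 0.00),
    Model("Qwen3.5 0.8B (Reasoning)", 10.5, 0.02),
    Model("MiMo-V2-Flash (Feb 2026)", 41.5, 0.15),
]

# Rank by value, highest first.
for m in sorted(models, key=value_score, reverse=True):
    print(f"{m.name}: {value_score(m):.1f}")

# Pick the cheapest model that scores at least 40 on intelligence.
print(cheapest_sufficient(models, min_intelligence=40))
```

Ranking by value alone puts the smallest model first, while `cheapest_sufficient` sets the capability bar first and minimizes price second, which is closer to the rule stated above.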

If you are building on European infrastructure and want to find providers that respect data sovereignty, check out Voie.fi — an open index of 1,400+ European-headquartered digital infrastructure providers across compute, storage, payments, security, AI, and more. It helps teams replace US-dominant services with genuinely European alternatives that avoid CLOUD Act and FISA exposure.