LLM Optimizer

Find the smartest model that fits your budget

Updated 3 hours 31 minutes ago from Artificial Analysis · 444 models

Model Developer Intelligence Coding Input $/M Output $/M Blended $/M Speed t/s TTFT s Context Value
Qwen3.5 9B (Reasoning) Alibaba 32.4 25.3 $0.10 $0.15 $0.11 49.8 0.31 286.7
MiMo-V2-Flash (Feb 2026) Xiaomi 41.5 33.5 $0.10 $0.30 $0.15 129.1 1.38 276.7
MiMo-V2-Flash (Reasoning) Xiaomi 39.2 31.8 $0.10 $0.30 $0.15 125.7 1.56 261.3
gpt-oss-20B (high) OpenAI 24.5 18.5 $0.06 $0.20 $0.09 271.1 0.48 260.6
Gemma 3n E4B Instruct Google 6.4 4.2 $0.02 $0.04 $0.03 50.8 0.32 256.0
Step 3.5 Flash StepFun 37.8 31.6 $0.10 $0.30 $0.15 85.0 3.11 252.0
Devstral Small (May '25) Mistral 18.0 12.2 $0.06 $0.12 $0.07 0.0 0.00 240.0
NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) NVIDIA 24.3 19.0 $0.06 $0.24 $0.10 175.9 0.91 231.4
gpt-oss-20B (low) OpenAI 20.8 14.4 $0.06 $0.20 $0.09 265.1 0.49 221.3
NVIDIA Nemotron Nano 9B V2 (Reasoning) NVIDIA 14.8 8.3 $0.04 $0.16 $0.07 124.0 0.31 211.4
MiMo-V2-Flash (Non-reasoning) Xiaomi 30.4 25.8 $0.10 $0.30 $0.15 134.5 1.33 202.7
LFM2 24B A2B Liquid AI 10.5 3.6 $0.03 $0.12 $0.05 214.4 0.26 201.9
GLM-4.7-Flash (Reasoning) Z AI 30.1 25.9 $0.07 $0.40 $0.15 86.5 0.68 198.0
GPT-5 nano (high) OpenAI 26.8 20.3 $0.05 $0.40 $0.14 130.6 104.17 194.2
GPT-5 nano (medium) OpenAI 25.9 22.9 $0.05 $0.40 $0.14 136.8 51.11 187.7
Nova Micro Amazon 10.3 4.1 $0.04 $0.14 $0.06 304.2 0.37 168.9
NVIDIA Nemotron Nano 9B V2 (Non-reasoning) NVIDIA 13.2 7.5 $0.05 $0.20 $0.09 140.8 0.57 153.5
NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning) NVIDIA 13.2 15.8 $0.05 $0.20 $0.09 74.2 0.32 151.7
GLM-4.7-Flash (Non-reasoning) Z AI 22.1 11.0 $0.07 $0.40 $0.15 90.1 1.19 145.4
Grok 4.1 Fast (Reasoning) xAI 38.6 30.9 $0.20 $0.50 $0.28 164.9 8.83 140.4
Qwen2.5 Turbo Alibaba 12.0 $0.05 $0.20 $0.09 61.0 1.02 137.9
DeepSeek V3.2 (Reasoning) DeepSeek 41.7 36.7 $0.28 $0.42 $0.32 31.6 1.42 132.4
Grok 4 Fast (Reasoning) xAI 35.1 27.4 $0.20 $0.50 $0.28 128.3 3.64 127.6
gpt-oss-120B (high) OpenAI 33.3 28.6 $0.15 $0.60 $0.26 258.2 0.51 126.6
Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) Google 21.6 18.1 $0.10 $0.40 $0.17 400.9 5.64 123.4
Nova Lite Amazon 12.7 5.1 $0.06 $0.24 $0.10 201.8 0.40 121.0
Llama 3.1 Instruct 8B Meta 11.8 4.9 $0.10 $0.10 $0.10 191.3 0.47 118.0
Llama 3.2 Instruct 3B Meta 9.7 $0.09 $0.09 $0.09 51.3 0.39 114.1
QwQ 32B-Preview Alibaba 15.2 $0.12 $0.18 $0.14 61.6 0.45 112.6
Ministral 3 3B Mistral 11.2 4.8 $0.10 $0.10 $0.10 293.0 0.26 112.0
Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning) Google 19.4 14.5 $0.10 $0.40 $0.17 348.3 0.34 110.9
Llama Nemotron Super 49B v1.5 (Reasoning) NVIDIA 18.7 15.2 $0.10 $0.40 $0.17 51.8 0.38 106.9
DeepSeek V3.2 Exp (Reasoning) DeepSeek 32.9 33.3 $0.28 $0.42 $0.32 31.8 1.50 104.4
Mistral Small 4 (Reasoning) Mistral 26.9 24.3 $0.15 $0.60 $0.26 0.0 0.00 102.3
DeepSeek V3.2 (Non-reasoning) DeepSeek 32.1 34.6 $0.28 $0.42 $0.32 33.0 1.55 101.9
Devstral Small (Jul '25) Mistral 15.2 12.1 $0.10 $0.30 $0.15 177.0 0.34 101.3
Granite 4.0 H Small IBM 10.8 8.5 $0.06 $0.25 $0.11 321.5 8.66 100.9
Mistral Small 3.2 Mistral 15.1 13.3 $0.10 $0.30 $0.15 191.8 0.30 100.7
Gemini 2.5 Flash-Lite (Reasoning) Google 17.6 9.5 $0.10 $0.40 $0.17 321.6 11.71 100.6
GPT-5 nano (minimal) OpenAI 13.8 14.2 $0.05 $0.40 $0.14 125.8 0.76 100.0
Ministral 3 8B Mistral 14.8 10.0 $0.15 $0.15 $0.15 170.4 0.27 98.7
Llama 2 Chat 7B Meta 9.7 $0.05 $0.25 $0.10 117.3 0.54 97.0
Mistral Small 3.1 Mistral 14.5 13.9 $0.10 $0.30 $0.15 159.7 0.41 96.7
GPT-5.4 nano (xhigh) OpenAI 44.4 43.9 $0.20 $1.25 $0.46 220.9 2.31 95.9
MiniMax-M2.7 MiniMax 49.6 41.9 $0.30 $1.20 $0.53 47.1 2.12 94.5
gpt-oss-120B (low) OpenAI 24.5 15.5 $0.15 $0.60 $0.26 273.5 0.49 93.2
Grok 3 mini Reasoning (high) xAI 32.1 25.2 $0.30 $0.50 $0.35 198.0 0.37 91.7
Llama 3 Instruct 8B Meta 6.4 4.0 $0.04 $0.14 $0.07 82.7 0.38 91.4
DeepSeek V3.2 Exp (Non-reasoning) DeepSeek 28.4 30.0 $0.28 $0.42 $0.32 32.9 1.38 90.2
Mercury 2 Inception 32.8 30.6 $0.25 $0.75 $0.38 974.9 3.94 87.5
NVIDIA Nemotron 3 Super 120B A12B (Reasoning) NVIDIA 36.0 31.2 $0.30 $0.75 $0.41 367.1 0.54 87.4
Grok 4.1 Fast (Non-reasoning) xAI 23.6 19.5 $0.20 $0.50 $0.28 121.9 0.33 85.8
Mistral Small 3 Mistral 12.7 $0.10 $0.30 $0.15 159.7 0.40 84.7
Grok 4 Fast (Non-reasoning) xAI 23.1 19.0 $0.20 $0.50 $0.28 133.7 0.28 84.0
Seed-OSS-36B-Instruct ByteDance Seed 25.2 16.7 $0.21 $0.57 $0.30 42.3 2.10 84.0
Llama Nemotron Super 49B v1.5 (Non-reasoning) NVIDIA 14.6 10.5 $0.10 $0.40 $0.17 52.3 0.31 83.4
Granite 3.3 8B (Non-reasoning) IBM 7.0 3.4 $0.03 $0.25 $0.09 444.6 7.32 82.4
GPT-5.4 nano (medium) OpenAI 38.1 35.0 $0.20 $1.25 $0.46 218.6 3.84 82.3
Hermes 4 - Llama-3.1 70B (Reasoning) Nous Research 16.0 14.4 $0.13 $0.40 $0.20 80.5 0.56 80.8
Ministral 3 14B Mistral 16.0 10.9 $0.20 $0.20 $0.20 123.3 0.31 80.0
MiniMax-M2.5 MiniMax 41.9 37.4 $0.30 $1.20 $0.53 51.5 3.27 79.8
Solar Mini Upstage 11.9 $0.15 $0.15 $0.15 90.3 1.30 79.3
MiniMax-M2.1 MiniMax 39.4 32.8 $0.30 $1.20 $0.53 51.1 1.91 75.0
GPT-4.1 nano OpenAI 13.0 11.2 $0.10 $0.40 $0.17 105.6 0.39 74.3
Gemini 2.5 Flash-Lite (Non-reasoning) Google 12.7 7.4 $0.10 $0.40 $0.17 252.9 0.37 72.6
Mistral Small 4 (Non-reasoning) Mistral 18.6 16.4 $0.15 $0.60 $0.26 160.2 0.41 70.7
Gemini 2.0 Flash (Feb '25) Google 18.5 13.6 $0.15 $0.60 $0.26 0.0 0.00 70.3
MiniMax-M2 MiniMax 36.1 29.2 $0.30 $1.20 $0.53 54.8 1.97 68.8
KAT-Coder-Pro V1 KwaiKAT 36.0 18.3 $0.30 $1.20 $0.53 42.9 1.24 68.6
Qwen3 4B (Non-reasoning) Alibaba 12.5 $0.11 $0.42 $0.19 104.3 0.95 66.5
Olmo 3 7B Instruct Allen Institute for AI 8.2 3.4 $0.10 $0.20 $0.13 87.2 0.42 65.6
DeepSeek R1 Distill Qwen 32B DeepSeek 17.2 $0.27 $0.27 $0.27 60.4 0.47 63.7
Hermes 4 - Llama-3.1 70B (Non-reasoning) Nous Research 12.6 9.2 $0.13 $0.40 $0.20 80.3 0.59 63.6
Ling-flash-2.0 InclusionAI 15.7 16.7 $0.14 $0.57 $0.25 67.2 1.74 63.6
Llama 3.2 Instruct 1B Meta 6.3 0.6 $0.10 $0.10 $0.10 91.2 0.41 63.0
GPT-5 mini (high) OpenAI 41.2 35.3 $0.25 $2.00 $0.69 74.6 84.77 59.9
Gemini 3.1 Flash-Lite Preview Google 33.5 30.1 $0.25 $1.50 $0.56 214.5 7.81 59.5
Ring-flash-2.0 InclusionAI 14.0 10.6 $0.14 $0.57 $0.25 70.1 1.97 56.7
GPT-5 mini (medium) OpenAI 38.9 32.9 $0.25 $2.00 $0.69 70.7 20.89 56.5
GPT-5.1 Codex mini (high) OpenAI 38.6 36.4 $0.25 $2.00 $0.69 171.9 5.11 56.1
Grok Code Fast 1 xAI 28.7 23.7 $0.20 $1.50 $0.53 194.6 3.65 54.7
GLM-4.5-Air Z AI 23.2 23.8 $0.20 $1.10 $0.42 105.4 0.64 54.6
Llama 3.2 Instruct 11B (Vision) Meta 8.7 4.3 $0.16 $0.16 $0.16 76.8 0.38 54.4
Qwen3.5 35B A3B (Reasoning) Alibaba 37.1 30.3 $0.25 $2.00 $0.69 130.3 0.99 53.9
GPT-5.4 nano (Non-Reasoning) OpenAI 24.4 27.9 $0.20 $1.25 $0.46 223.1 0.53 52.7
GLM-4.6V (Reasoning) Z AI 23.4 19.7 $0.30 $0.90 $0.45 27.9 1.18 52.0
Qwen3.5 27B (Reasoning) Alibaba 42.1 34.9 $0.30 $2.40 $0.82 91.5 1.32 51.0
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) NVIDIA 14.9 11.8 $0.20 $0.60 $0.30 132.5 0.40 49.7
GPT-4o mini OpenAI 12.6 $0.15 $0.60 $0.26 54.8 0.51 47.9
Phi-4 Microsoft Azure 10.4 11.2 $0.13 $0.50 $0.22 12.6 0.61 47.5
Apertus 8B Instruct Swiss AI Initiative 5.9 1.4 $0.10 $0.20 $0.13 143.6 2.14 47.2
Qwen3 Coder Next Alibaba 28.3 22.9 $0.35 $1.20 $0.60 149.6 0.87 47.2
Llama 4 Scout Meta 13.5 6.7 $0.17 $0.66 $0.29 128.1 0.45 46.2
Qwen3 VL 8B Instruct Alibaba 14.3 7.3 $0.18 $0.70 $0.31 139.4 1.01 46.1
Qwen3 VL 30B A3B Instruct Alibaba 16.1 14.3 $0.20 $0.80 $0.35 118.5 0.97 46.0
DeepSeek V3.1 Terminus (Non-reasoning) DeepSeek 28.5 31.9 $0.34 $1.50 $0.63 0.0 0.00 45.5
Qwen3.5 27B (Non-reasoning) Alibaba 37.2 33.4 $0.30 $2.40 $0.82 91.4 1.43 45.1
Qwen3.5 35B A3B (Non-reasoning) Alibaba 30.7 16.8 $0.25 $2.00 $0.69 116.5 1.04 44.6
Qwen3 30B A3B 2507 Instruct Alibaba 15.0 14.2 $0.20 $0.80 $0.35 55.3 1.05 42.9
DeepSeek V3.1 Terminus (Reasoning) DeepSeek 33.9 33.7 $0.40 $2.00 $0.80 0.0 0.00 42.4
GLM-4.7 (Reasoning) Z AI 42.1 36.3 $0.60 $2.20 $1.00 81.9 0.71 42.1
Gemini 3 Flash Preview (Reasoning) Google 46.4 42.6 $0.50 $3.00 $1.13 194.6 5.52 41.2
Olmo 3.1 32B Instruct Allen Institute for AI 12.2 5.6 $0.20 $0.60 $0.30 54.1 0.47 40.7
Kimi K2.5 (Reasoning) Kimi 46.8 39.5 $0.60 $3.00 $1.20 33.9 1.35 39.0
Kimi K2 Thinking Kimi 40.9 34.8 $0.60 $2.50 $1.07 92.9 0.64 38.0
GLM-4.6V (Non-reasoning) Z AI 17.1 11.1 $0.30 $0.90 $0.45 21.3 5.89 38.0
Qwen3.5 122B A10B (Reasoning) Alibaba 41.6 34.7 $0.40 $3.20 $1.10 134.4 1.01 37.8
Llama 4 Maverick Meta 18.4 15.6 $0.31 $0.91 $0.49 127.3 0.47 37.8
GLM-4.7 (Non-reasoning) Z AI 34.2 32.0 $0.55 $2.15 $0.94 82.6 0.66 36.5
Qwen3 Omni 30B A3B (Reasoning) Alibaba 15.6 12.7 $0.25 $0.97 $0.43 102.6 0.95 36.3
Qwen3 1.7B (Non-reasoning) Alibaba 6.8 2.3 $0.11 $0.42 $0.19 136.5 0.89 36.2
Qwen3 30B A3B (Non-reasoning) Alibaba 12.5 13.3 $0.20 $0.80 $0.35 61.4 1.02 35.7
Qwen3 4B (Reasoning) Alibaba 14.2 $0.11 $1.26 $0.40 103.8 0.96 35.7
Hermes 3 - Llama-3.1 70B Nous Research 10.6 $0.30 $0.30 $0.30 41.5 0.32 35.3
Nova 2.0 Lite (medium) Amazon 29.7 23.9 $0.30 $2.50 $0.85 238.8 10.21 34.9
Reka Flash (Sep '24) Reka AI 12.0 $0.20 $0.80 $0.35 85.4 1.27 34.3
Qwen3 8B (Non-reasoning) Alibaba 10.6 7.1 $0.18 $0.70 $0.31 82.2 1.06 34.2
Mistral Small (Sep '24) Mistral 10.2 $0.20 $0.60 $0.30 170.1 0.41 34.0
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) NVIDIA 10.1 5.9 $0.20 $0.60 $0.30 138.7 0.55 33.7
DeepSeek V3.1 (Non-reasoning) DeepSeek 28.1 28.4 $0.56 $1.68 $0.84 0.0 0.00 33.5
Qwen3.5 397B A17B (Reasoning) Alibaba 45.0 41.3 $0.60 $3.60 $1.35 54.1 1.47 33.3
GLM-4.6 (Reasoning) Z AI 32.5 29.5 $0.57 $2.20 $0.98 87.4 0.77 33.1
Nova 2.0 Omni (medium) Amazon 28.0 15.1 $0.30 $2.50 $0.85 0.0 0.00 32.9
MiMo-V2-Pro Xiaomi 49.2 41.4 $1.00 $3.00 $1.50 92.9 1.55 32.8
GPT-4.1 mini OpenAI 22.9 18.5 $0.40 $1.60 $0.70 70.2 0.52 32.7
Qwen3.5 122B A10B (Non-reasoning) Alibaba 35.9 31.6 $0.40 $3.20 $1.10 127.1 0.99 32.6
GLM-5 (Reasoning) Z AI 49.8 44.2 $1.00 $3.20 $1.55 68.8 0.95 32.1
Jamba 1.5 Mini AI21 Labs 8.0 $0.20 $0.40 $0.25 0.0 0.00 32.0
Gemini 2.5 Flash (Reasoning) Google 27.0 22.2 $0.30 $2.50 $0.85 219.7 12.97 31.8
DeepSeek V3.1 (Reasoning) DeepSeek 27.7 29.7 $0.60 $1.70 $0.88 0.0 0.00 31.7
Jamba 1.6 Mini AI21 Labs 7.9 $0.20 $0.40 $0.25 182.9 0.64 31.6
GLM-4.5 (Reasoning) Z AI 26.4 26.3 $0.49 $1.90 $0.84 39.3 0.99 31.3
Gemini 3 Flash Preview (Non-reasoning) Google 35.0 37.8 $0.50 $3.00 $1.13 190.8 0.73 31.1
Kimi K2.5 (Non-reasoning) Kimi 37.3 25.8 $0.60 $3.00 $1.20 35.2 1.91 31.1
ERNIE 4.5 300B A47B Baidu 15.0 14.5 $0.28 $1.10 $0.48 31.5 1.85 30.9
Mistral Large 3 Mistral 22.8 22.7 $0.50 $1.50 $0.75 58.8 0.53 30.4
Qwen3 0.6B (Non-reasoning) Alibaba 5.7 1.4 $0.11 $0.42 $0.19 222.7 0.87 30.3
GLM-4.6 (Non-reasoning) Z AI 30.2 30.2 $0.60 $2.20 $1.00 80.4 2.31 30.2
GPT-5 mini (minimal) OpenAI 20.7 21.9 $0.25 $2.00 $0.69 75.8 0.85 30.1
Qwen3 30B A3B 2507 (Reasoning) Alibaba 22.4 14.7 $0.20 $2.40 $0.75 142.5 0.99 29.9
Qwen3.5 397B A17B (Non-reasoning) Alibaba 40.1 37.4 $0.60 $3.60 $1.35 55.4 1.49 29.7
Mistral 7B Instruct Mistral 7.4 $0.25 $0.25 $0.25 162.2 0.27 29.6
Nova 2.0 Lite (low) Amazon 24.6 13.6 $0.30 $2.50 $0.85 209.0 5.05 28.9
GPT-5.4 mini (xhigh) OpenAI 48.1 51.5 $0.75 $4.50 $1.69 218.5 6.57 28.5
Nova 2.0 Omni (low) Amazon 23.2 13.9 $0.30 $2.50 $0.85 0.0 0.00 27.3
Kimi K2 0905 Kimi 30.9 25.9 $0.80 $2.25 $1.14 51.6 0.81 27.2
Reka Flash 3 Reka AI 9.5 8.9 $0.20 $0.80 $0.35 55.2 1.28 27.1
Mistral Medium 3.1 Mistral 21.3 18.3 $0.40 $2.00 $0.80 78.7 0.40 26.6
QwQ 32B Alibaba 19.7 $0.66 $1.00 $0.74 32.3 0.46 26.4
DeepSeek V3 (Dec '24) DeepSeek 16.5 16.4 $0.40 $0.89 $0.63 0.0 0.00 26.4
Qwen3 VL 30B A3B (Reasoning) Alibaba 19.7 13.1 $0.20 $2.40 $0.75 127.6 0.98 26.3
Kimi K2 Kimi 26.3 22.1 $0.57 $2.40 $1.00 37.6 0.95 26.2
GLM-5 (Non-reasoning) Z AI 40.6 39.0 $1.00 $3.20 $1.55 67.6 1.07 26.2
MiniMax M1 80k MiniMax 24.4 14.5 $0.55 $2.20 $0.96 0.0 0.00 25.3
Qwen3 VL 8B (Reasoning) Alibaba 16.7 9.8 $0.18 $2.10 $0.66 136.8 1.05 25.3
Qwen3 Omni 30B A3B Instruct Alibaba 10.7 7.2 $0.25 $0.97 $0.43 105.7 0.88 24.9
Claude 3 Haiku Anthropic 12.3 6.7 $0.25 $1.25 $0.50 130.0 0.46 24.6
Magistral Small 1.2 Mistral 18.2 14.8 $0.50 $1.50 $0.75 104.0 0.33 24.3
Gemini 2.5 Flash (Non-reasoning) Google 20.6 17.8 $0.30 $2.50 $0.85 201.8 0.47 24.2
Mistral Medium 3 Mistral 18.8 13.6 $0.40 $2.00 $0.80 45.3 0.42 23.5
Devstral Medium Mistral 18.7 15.9 $0.40 $2.00 $0.80 138.0 0.41 23.4
Qwen3 Next 80B A3B Instruct Alibaba 20.1 15.3 $0.50 $2.00 $0.88 142.5 0.95 23.0
Llama 3.3 Instruct 70B Meta 14.5 10.7 $0.58 $0.71 $0.64 92.7 0.54 22.7
GPT-5.4 mini (medium) OpenAI 37.7 37.5 $0.75 $4.50 $1.69 213.7 6.99 22.3
Llama 3.1 Instruct 70B Meta 12.5 10.9 $0.56 $0.56 $0.56 31.4 0.56 22.3
Qwen3 Coder 30B A3B Instruct Alibaba 20.0 19.4 $0.45 $2.25 $0.90 25.4 1.45 22.2
Nova 2.0 Lite (Non-reasoning) Amazon 18.0 12.5 $0.30 $2.50 $0.85 176.3 0.55 21.2
Qwen3 14B (Non-reasoning) Alibaba 12.8 12.4 $0.35 $1.40 $0.61 64.6 0.99 20.9
Qwen3 235B A22B 2507 Instruct Alibaba 25.0 22.1 $0.70 $2.80 $1.23 69.6 1.11 20.4
Qwen3 30B A3B (Reasoning) Alibaba 15.3 11.0 $0.20 $2.40 $0.75 59.1 1.09 20.4
Qwen3 1.7B (Reasoning) Alibaba 8.0 1.4 $0.11 $1.26 $0.40 137.8 0.90 20.1
Qwen3 8B (Reasoning) Alibaba 13.2 9.0 $0.18 $2.10 $0.66 83.4 0.93 20.0
Nova 2.0 Omni (Non-reasoning) Amazon 16.6 13.8 $0.30 $2.50 $0.85 224.3 0.62 19.5
Claude 4.5 Haiku (Reasoning) Anthropic 37.1 32.6 $1.00 $5.00 $2.00 144.0 11.28 18.6
DeepSeek R1 Distill Llama 70B DeepSeek 16.0 11.4 $0.70 $1.05 $0.88 54.3 0.79 18.3
DeepSeek V3 0324 DeepSeek 22.3 22.0 $1.25 $1.45 $1.25 0.0 0.00 17.8
o4-mini (high) OpenAI 33.1 25.6 $1.10 $4.40 $1.93 128.7 20.69 17.2
Qwen3 VL 235B A22B Instruct Alibaba 20.8 16.5 $0.70 $2.80 $1.23 57.5 1.03 17.0
GLM-4.5V (Reasoning) Z AI 15.1 10.9 $0.60 $1.80 $0.90 45.3 0.96 16.8
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) NVIDIA 15.0 13.1 $0.60 $1.80 $0.90 42.3 0.66 16.7
Qwen3 Max Thinking Alibaba 39.9 30.5 $1.20 $6.00 $2.40 34.6 1.64 16.6
Llama 3.2 Instruct 90B (Vision) Meta 11.9 $0.72 $0.72 $0.72 41.6 0.38 16.5
Qwen3 0.6B (Reasoning) Alibaba 6.5 0.9 $0.11 $1.26 $0.40 222.7 0.90 16.3
Grok 4.20 Beta 0309 (Reasoning) xAI 48.5 42.2 $2.00 $6.00 $3.00 218.5 14.99 16.2
Claude 4.5 Haiku (Non-reasoning) Anthropic 31.1 29.6 $1.00 $5.00 $2.00 99.6 0.51 15.6
Mixtral 8x7B Instruct Mistral 7.7 $0.54 $0.60 $0.54 0.0 0.00 14.3
Qwen3 Next 80B A3B (Reasoning) Alibaba 26.7 19.5 $0.50 $6.00 $1.88 142.0 1.03 14.2
GLM-4.5V (Non-reasoning) Z AI 12.7 10.8 $0.60 $1.80 $0.90 51.5 24.57 14.1
Qwen3 VL 32B Instruct Alibaba 17.2 15.6 $0.70 $2.80 $1.23 69.3 1.11 14.0
Qwen3 235B A22B (Non-reasoning) Alibaba 17.0 14.0 $0.70 $2.80 $1.23 44.2 1.18 13.9
GPT-5.1 (high) OpenAI 47.7 44.7 $1.25 $10.00 $3.44 92.2 37.94 13.9
GPT-5.4 mini (Non-Reasoning) OpenAI 23.3 25.3 $0.75 $4.50 $1.69 191.4 0.43 13.8
Qwen3 Max Thinking (Preview) Alibaba 32.5 24.5 $1.20 $6.00 $2.40 41.6 1.71 13.5
o3-mini OpenAI 25.9 17.9 $1.10 $4.40 $1.93 137.6 6.94 13.5
o3-mini (high) OpenAI 25.2 17.3 $1.10 $4.40 $1.93 133.8 24.66 13.1
Qwen3 Max Alibaba 31.4 26.4 $1.20 $6.00 $2.40 32.4 1.76 13.1
GPT-5 (high) OpenAI 44.6 36.0 $1.25 $10.00 $3.44 80.9 91.36 13.0
GPT-5 Codex (high) OpenAI 44.6 38.9 $1.25 $10.00 $3.44 190.0 8.57 13.0
Gemini 3.1 Pro Preview Google 57.2 55.5 $2.00 $12.00 $4.50 113.4 23.00 12.7
GPT-5.1 Codex (high) OpenAI 43.1 36.6 $1.25 $10.00 $3.44 118.1 4.29 12.5
Hermes 4 - Llama-3.1 405B (Reasoning) Nous Research 18.6 16.0 $1.00 $3.00 $1.50 32.5 0.75 12.4
Qwen3 14B (Reasoning) Alibaba 16.2 13.1 $0.35 $4.20 $1.31 63.4 1.06 12.3
GPT-5 (medium) OpenAI 42.0 39.0 $1.25 $10.00 $3.44 84.7 52.77 12.2
GPT-3.5 Turbo OpenAI 9.0 10.7 $0.50 $1.50 $0.75 98.1 0.40 12.0
Qwen3 32B (Non-reasoning) Alibaba 14.5 $0.70 $2.80 $1.23 105.2 0.97 11.8
Hermes 4 - Llama-3.1 405B (Non-reasoning) Nous Research 17.6 18.1 $1.00 $3.00 $1.50 31.4 0.69 11.7
Claude 3.5 Haiku Anthropic 18.7 10.7 $0.80 $4.00 $1.60 0.0 0.00 11.7
DeepSeek R1 0528 (May '25) DeepSeek 27.1 24.0 $1.35 $5.40 $2.36 0.0 0.00 11.5
GPT-5 (low) OpenAI 39.2 30.7 $1.25 $10.00 $3.44 69.7 11.92 11.4
Qwen3 235B A22B 2507 (Reasoning) Alibaba 29.5 23.2 $0.70 $8.40 $2.63 41.9 1.26 11.2
GPT-5.3 Codex (xhigh) OpenAI 54.0 53.1 $1.75 $14.00 $4.81 77.7 80.44 11.2
Llama 3.1 Nemotron Instruct 70B NVIDIA 13.4 10.8 $1.20 $1.20 $1.20 31.6 0.35 11.2
o3 OpenAI 38.4 38.4 $2.00 $8.00 $3.50 71.6 9.65 11.0
Qwen3 Max (Preview) Alibaba 26.1 25.5 $1.20 $6.00 $2.40 44.9 1.70 10.9
Gemini 3 Pro Preview (high) Google 48.4 46.5 $2.00 $12.00 $4.50 114.8 32.56 10.8
GPT-5.2 (xhigh) OpenAI 51.3 48.7 $1.75 $14.00 $4.81 71.7 72.39 10.7
Qwen3 VL 235B A22B (Reasoning) Alibaba 27.6 20.9 $0.70 $8.40 $2.63 51.5 1.15 10.5
Nova 2.0 Pro Preview (medium) Amazon 35.7 30.4 $1.25 $10.00 $3.44 150.5 11.82 10.4
Llama 3 Instruct 70B Meta 8.9 6.8 $0.58 $1.75 $0.87 38.7 0.51 10.2
GPT-5.2 Codex (xhigh) OpenAI 49.0 43.0 $1.75 $14.00 $4.81 99.6 5.48 10.2
GPT-5.4 (xhigh) OpenAI 57.2 57.3 $2.50 $15.00 $5.63 74.1 179.07 10.2
Gemini 2.5 Pro Google 34.6 31.9 $1.25 $10.00 $3.44 125.4 24.01 10.1
Grok 4.20 Beta 0309 (Non-reasoning) xAI 29.7 25.4 $2.00 $6.00 $3.00 190.9 0.35 9.9
Command-R (Mar '24) Cohere 7.4 $0.50 $1.50 $0.75 0.0 0.00 9.9
Magistral Medium 1.2 Mistral 27.1 21.7 $2.00 $5.00 $2.75 92.6 0.43 9.9
GPT-5.2 (medium) OpenAI 46.6 44.2 $1.75 $14.00 $4.81 0.0 0.00 9.7
Nova Pro Amazon 13.5 11.0 $0.80 $3.20 $1.40 0.0 0.00 9.6
Qwen3 VL 32B (Reasoning) Alibaba 24.7 14.5 $0.70 $8.40 $2.63 96.0 1.08 9.4
Nova 2.0 Pro Preview (low) Amazon 31.9 24.5 $1.25 $10.00 $3.44 171.5 4.57 9.3
Gemini 3 Pro Preview (low) Google 41.3 39.4 $2.00 $12.00 $4.50 114.2 3.99 9.2
Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) Anthropic 51.7 50.9 $3.00 $15.00 $6.00 72.5 33.23 8.6
Gemini 2.5 Pro Preview (May' 25) Google 29.5 $1.25 $10.00 $3.44 0.0 0.00 8.6
Qwen3 Coder 480B A35B Instruct Alibaba 24.8 24.6 $1.50 $7.50 $3.00 61.5 1.65 8.3
GPT-5.1 (Non-reasoning) OpenAI 27.4 27.3 $1.25 $10.00 $3.44 88.9 0.68 8.0
DeepSeek R1 (Jan '25) DeepSeek 18.8 15.9 $1.35 $4.00 $2.36 0.0 0.00 8.0
Qwen3 235B A22B (Reasoning) Alibaba 19.8 17.4 $0.70 $8.40 $2.63 48.8 1.16 7.5
GPT-4.1 OpenAI 26.3 21.8 $2.00 $8.00 $3.50 85.4 0.52 7.5
Claude Sonnet 4.6 (Non-reasoning, High Effort) Anthropic 44.4 46.4 $3.00 $15.00 $6.00 53.2 1.41 7.4
Claude 4.5 Sonnet (Reasoning) Anthropic 43.0 38.6 $3.00 $15.00 $6.00 53.5 7.94 7.2
Claude Sonnet 4.6 (Non-reasoning, Low Effort) Anthropic 42.6 43.0 $3.00 $15.00 $6.00 52.3 1.21 7.1
GPT-5.2 (Non-reasoning) OpenAI 33.6 34.7 $1.75 $14.00 $4.81 69.0 0.62 7.0
GPT-5 (minimal) OpenAI 23.9 25.1 $1.25 $10.00 $3.44 61.4 1.08 7.0
Grok 4 xAI 41.5 40.5 $3.00 $15.00 $6.00 45.4 9.17 6.9
Nova 2.0 Pro Preview (Non-reasoning) Amazon 23.1 20.5 $1.25 $10.00 $3.44 162.4 0.46 6.7
Claude 4 Sonnet (Reasoning) Anthropic 38.7 34.1 $3.00 $15.00 $6.00 51.7 10.99 6.5
GPT-5 (ChatGPT) OpenAI 21.8 21.2 $1.25 $10.00 $3.44 142.5 0.55 6.3
GPT-5.4 (Non-reasoning) OpenAI 35.4 41.0 $2.50 $15.00 $5.63 64.2 0.64 6.3
Qwen3 32B (Reasoning) Alibaba 16.5 13.8 $0.70 $8.40 $2.63 102.5 1.02 6.3
Claude 4.5 Sonnet (Non-reasoning) Anthropic 37.1 33.5 $3.00 $15.00 $6.00 54.3 1.49 6.2
Mistral Small (Feb '24) Mistral 9.0 $1.00 $3.00 $1.50 160.9 0.42 6.0
Qwen2.5 Max Alibaba 16.3 $1.60 $6.40 $2.80 47.3 1.10 5.8
Claude 3.7 Sonnet (Reasoning) Anthropic 34.7 27.6 $3.00 $15.00 $6.00 0.0 0.00 5.8
Apertus 70B Instruct Swiss AI Initiative 7.7 1.9 $0.82 $2.92 $1.34 63.1 1.67 5.7
Claude 4 Sonnet (Non-reasoning) Anthropic 33.0 30.6 $3.00 $15.00 $6.00 51.8 1.18 5.5
Claude Opus 4.6 (Adaptive Reasoning, Max Effort) Anthropic 53.0 48.1 $5.00 $25.00 $10.00 51.0 11.89 5.3
Claude 3.7 Sonnet (Non-reasoning) Anthropic 30.8 26.7 $3.00 $15.00 $6.00 0.0 0.00 5.1
Mistral Large 2 (Nov '24) Mistral 15.1 13.8 $2.00 $6.00 $3.00 38.7 0.46 5.0
Claude Opus 4.5 (Reasoning) Anthropic 49.7 47.8 $5.00 $25.00 $10.00 59.4 10.05 5.0
Llama 3.1 Instruct 405B Meta 17.4 14.5 $2.75 $6.50 $3.69 32.0 0.49 4.7
Pixtral Large Mistral 14.0 $2.00 $6.00 $3.00 54.3 0.41 4.7
Claude Opus 4.6 (Non-reasoning, High Effort) Anthropic 46.5 47.6 $5.00 $25.00 $10.00 50.7 2.10 4.7
Mistral Large 2 (Jul '24) Mistral 13.0 $2.00 $6.00 $3.00 0.0 0.00 4.3
Claude Opus 4.5 (Non-reasoning) Anthropic 43.1 42.9 $5.00 $25.00 $10.00 60.3 1.35 4.3
GPT-4o (Aug '24) OpenAI 18.6 16.6 $2.50 $10.00 $4.38 80.6 0.58 4.3
Grok 3 xAI 25.2 19.8 $3.00 $15.00 $6.00 68.6 0.33 4.2
GPT-4o (Nov '24) OpenAI 17.3 16.7 $2.50 $10.00 $4.38 119.7 0.48 4.0
Nova Premier Amazon 19.0 13.8 $2.50 $12.50 $5.00 66.7 0.88 3.8
Jamba 1.7 Large AI21 Labs 10.9 7.8 $2.00 $8.00 $3.50 52.6 0.73 3.1
Command A Cohere 13.5 9.9 $2.50 $10.00 $4.38 51.6 0.52 3.1
Jamba 1.5 Large AI21 Labs 10.7 $2.00 $8.00 $3.50 0.0 0.00 3.1
Jamba 1.6 Large AI21 Labs 10.6 $2.00 $8.00 $3.50 56.2 0.74 3.0
Claude 3.5 Sonnet (Oct '24) Anthropic 15.9 30.2 $3.00 $15.00 $6.00 0.0 0.00 2.6
Claude 3.5 Sonnet (June '24) Anthropic 14.2 26.0 $3.00 $15.00 $6.00 0.0 0.00 2.4
Mistral Medium Mistral 9.0 $2.75 $8.10 $4.09 75.4 0.39 2.2
GPT-4o (May '24) OpenAI 14.5 24.2 $5.00 $15.00 $7.50 65.1 0.69 1.9
Claude 3 Sonnet Anthropic 10.3 $3.00 $15.00 $6.00 0.0 0.00 1.7
Mistral Large (Feb '24) Mistral 9.9 $4.00 $12.00 $6.00 0.0 0.00 1.7
Claude 4.1 Opus (Reasoning) Anthropic 42.0 36.5 $15.00 $75.00 $30.00 40.4 9.50 1.4
Command-R+ (Apr '24) Cohere 8.3 $3.00 $15.00 $6.00 0.0 0.00 1.4
Claude 4 Opus (Reasoning) Anthropic 39.0 34.0 $15.00 $75.00 $30.00 40.5 7.70 1.3
Claude 4.1 Opus (Non-reasoning) Anthropic 36.0 $15.00 $75.00 $30.00 37.3 1.33 1.2
o1 OpenAI 30.8 20.5 $15.00 $60.00 $26.25 95.7 21.09 1.2
o3-pro OpenAI 40.7 $20.00 $80.00 $35.00 16.7 108.78 1.2
Claude 4 Opus (Non-reasoning) Anthropic 33.0 $15.00 $75.00 $30.00 37.6 1.25 1.1
GPT-4 Turbo OpenAI 13.7 21.5 $10.00 $30.00 $15.00 23.0 1.00 0.9
o1-preview OpenAI 23.7 34.0 $16.50 $66.00 $28.88 0.0 0.00 0.8
Claude 3 Opus Anthropic 18.0 19.5 $15.00 $75.00 $30.00 0.0 0.00 0.6
GPT-4 OpenAI 12.8 13.1 $30.00 $60.00 $37.50 33.5 0.74 0.3
o1-pro OpenAI 25.8 $150.00 $600.00 $262.50 0.0 0.00 0.1
Grok-1 xAI 11.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemma 3 4B Instruct Google 6.3 2.9 $0.00 $0.00 $0.00 31.7 1.18 0.0
Gemma 3 270M Google 7.7 0.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemma 3n E2B Instruct Google 4.8 2.2 $0.00 $0.00 $0.00 38.3 0.40 0.0
Gemma 3 27B Instruct Google 10.3 9.6 $0.00 $0.00 $0.00 29.7 2.06 0.0
Gemma 3 12B Instruct Google 8.8 6.3 $0.00 $0.00 $0.00 31.5 8.04 0.0
Gemma 3 1B Instruct Google 5.5 0.2 $0.00 $0.00 $0.00 38.8 0.59 0.0
Devstral 2 Mistral 22.0 23.7 $0.00 $0.00 $0.00 84.7 0.37 0.0
Devstral Small 2 Mistral 19.5 20.7 $0.00 $0.00 $0.00 196.8 0.34 0.0
DeepSeek V3.2 Speciale DeepSeek 29.4 37.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek R1 0528 Qwen3 8B DeepSeek 16.4 7.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
R1 1776 Perplexity 12.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Falcon-H1R-7B TII UAE 15.8 9.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Phi-4 Mini Instruct Microsoft Azure 8.4 3.6 $0.00 $0.00 $0.00 43.9 0.32 0.0
Phi-4 Multimodal Instruct Microsoft Azure 10.0 $0.00 $0.00 $0.00 16.5 0.35 0.0
LFM2.5-1.2B-Instruct Liquid AI 8.0 0.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
LFM2.5-VL-1.6B Liquid AI 6.2 1.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
LFM2 2.6B Liquid AI 8.0 1.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
LFM2.5-1.2B-Thinking Liquid AI 8.1 1.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
LFM2 8B A1B Liquid AI 7.0 2.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Solar Open 100B (Reasoning) Upstage 21.7 10.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Solar Pro 2 (Non-reasoning) Upstage 13.6 11.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Solar Pro 2 (Reasoning) Upstage 14.9 12.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) NVIDIA 14.3 7.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Llama 3.3 Nemotron Super 49B v1 (Reasoning) NVIDIA 18.5 9.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
NVIDIA Nemotron 3 Nano 4B NVIDIA 14.7 10.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) NVIDIA 14.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Kimi Linear 48B A3B Instruct Kimi 14.4 14.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Step3 VL 10B StepFun 15.4 13.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
Molmo 7B-D Allen Institute for AI 9.2 1.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Molmo2-8B Allen Institute for AI 7.3 4.4 $0.00 $0.00 $0.00 136.3 0.44 0.0
Olmo 3.1 32B Think Allen Institute for AI 13.9 9.8 $0.00 $0.00 $0.00 92.2 0.65 0.0
Olmo 3 7B Think Allen Institute for AI 9.4 7.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Granite 4.0 Micro IBM 7.7 5.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Granite 4.0 1B IBM 7.3 2.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
Granite 4.0 H 350M IBM 5.4 0.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Granite 4.0 H 1B IBM 8.0 2.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Granite 4.0 350M IBM 6.1 0.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) Nous Research 7.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepHermes 3 - Mistral 24B Preview (Non-reasoning) Nous Research 10.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
Exaone 4.0 1.2B (Non-reasoning) LG AI Research 8.1 2.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
EXAONE 4.0 32B (Non-reasoning) LG AI Research 11.7 9.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Llama 65B Meta 7.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Exaone 4.0 1.2B (Reasoning) LG AI Research 8.3 3.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
EXAONE 4.0 32B (Reasoning) LG AI Research 16.7 14.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
K-EXAONE (Non-reasoning) LG AI Research 23.4 13.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
K-EXAONE (Reasoning) LG AI Research 32.1 27.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
MiMo-V2-Omni Xiaomi 43.4 35.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
ERNIE 5.0 Thinking Preview Baidu 29.1 29.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Sarvam 105B (high) Sarvam 18.2 9.8 $0.00 $0.00 $0.00 87.9 1.76 0.0
Sarvam 30B (high) Sarvam 12.3 7.9 $0.00 $0.00 $0.00 283.2 1.55 0.0
INTELLECT-3 Prime Intellect 22.2 19.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Motif-2-12.7B-Reasoning Motif Technologies 19.1 11.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
K2-V2 (low) MBZUAI Institute of Foundation Models 14.4 10.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
K2-V2 (high) MBZUAI Institute of Foundation Models 20.6 16.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
K2 Think V2 MBZUAI Institute of Foundation Models 24.1 15.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
K2-V2 (medium) MBZUAI Institute of Foundation Models 18.7 14.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Mi:dm K 2.5 Pro Korea Telecom 23.1 12.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
HyperCLOVA X SEED Think (32B) Naver 23.7 17.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
LongCat Flash Lite LongCat 23.9 16.5 $0.00 $0.00 $0.00 124.7 3.04 0.0
Tri-21B-think Preview Trillion Labs 20.0 7.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Tri-21B-Think Trillion Labs 18.6 6.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Nanbeige4.1-3B Nanbeige 16.1 8.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
GLM-5-Turbo Z AI 46.8 36.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen Chat 14B Alibaba 7.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Tiny Aya Global Cohere 4.7 1.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Apriel-v1.6-15B-Thinker ServiceNow 27.6 22.0 $0.00 $0.00 $0.00 134.6 0.25 0.0
Jamba 1.7 Mini AI21 Labs 8.1 3.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Jamba Reasoning 3B AI21 Labs 9.6 2.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3.5 0.8B (Reasoning) Alibaba 10.5 0.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3.5 2B (Reasoning) Alibaba 16.3 3.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3.5 2B (Non-reasoning) Alibaba 14.7 4.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3.5 9B (Non-reasoning) Alibaba 27.3 21.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3.5 4B (Non-reasoning) Alibaba 22.6 13.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3.5 0.8B (Non-reasoning) Alibaba 9.9 1.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3.5 4B (Reasoning) Alibaba 27.1 17.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Ling-mini-2.0 InclusionAI 9.2 5.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Ling-1T InclusionAI 19.0 18.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Ring-1T InclusionAI 22.8 16.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Doubao Seed Code ByteDance Seed 33.5 31.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
o1-mini OpenAI 20.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
GPT-4.5 (Preview) OpenAI 20.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
GPT-4o (March 2025, chatgpt-4o-latest) OpenAI 18.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
GPT-4o (ChatGPT) OpenAI 14.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Llama 2 Chat 70B Meta 8.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Llama 2 Chat 13B Meta 8.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.0 Pro Experimental (Feb '25) Google 18.1 25.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.0 Flash (experimental) Google 16.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 1.5 Pro (Sep '24) Google 16.0 23.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.0 Flash-Lite (Preview) Google 14.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 1.5 Flash (Sep '24) Google 13.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 1.5 Flash-8B Google 11.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 1.0 Pro Google 8.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.5 Flash Preview (Reasoning) Google 24.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning) Google 25.7 22.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 1.0 Ultra Google 10.1 17.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 1.5 Flash (May '24) Google 10.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.0 Flash-Lite (Feb '25) Google 14.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.5 Flash Preview (Non-reasoning) Google 17.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.0 Flash Thinking Experimental (Dec '24) Google 12.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.0 Flash Thinking Experimental (Jan '25) Google 19.6 24.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.5 Flash Preview (Sep '25) (Reasoning) Google 31.1 24.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemma 3n E4B Instruct Preview (May '25) Google 10.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 1.5 Pro (May '24) Google 12.0 19.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
PALM-2 Google 8.6 4.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Gemini 2.5 Pro Preview (Mar' 25) Google 30.3 46.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Claude Instant Anthropic 7.4 7.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Claude 2.1 Anthropic 9.3 14.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Claude 2.0 Anthropic 9.1 12.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
Mixtral 8x22B Instruct Mistral 9.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Mistral Saba Mistral 12.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Magistral Small 1 Mistral 16.8 11.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Magistral Medium 1 Mistral 18.8 16.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek R1 Distill Qwen 14B DeepSeek 15.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek-V2.5 (Dec '24) DeepSeek 12.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek-Coder-V2 DeepSeek 10.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek R1 Distill Llama 8B DeepSeek 12.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek LLM 67B Chat (V1) DeepSeek 8.4 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek R1 Distill Qwen 1.5B DeepSeek 9.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek-V2.5 DeepSeek 12.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek Coder V2 Lite Instruct DeepSeek 8.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
DeepSeek-V2-Chat DeepSeek 9.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Sonar Perplexity 15.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Sonar Reasoning Pro Perplexity 24.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Sonar Pro Perplexity 15.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Sonar Reasoning Perplexity 17.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
Grok Beta xAI 13.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Grok 3 Reasoning Beta xAI 21.6 $0.00 $0.00 $0.00 0.0 0.00 0.0
Grok 2 (Dec '24) xAI 13.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
OpenChat 3.5 (1210) OpenChat 8.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
Phi-3 Mini Instruct 3.8B Microsoft Azure 10.1 3.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
LFM 40B Liquid AI 8.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
LFM2 1.2B Liquid AI 6.3 0.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Solar Pro 2 (Preview) (Reasoning) Upstage 18.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Solar Pro 2 (Preview) (Non-reasoning) Upstage 16.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
DBRX Instruct Databricks 8.3 $0.00 $0.00 $0.00 0.0 0.00 0.0
MiniMax M1 40k MiniMax 20.9 14.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
Llama 3.1 Tulu3 405B Allen Institute for AI 14.1 $0.00 $0.00 $0.00 0.0 0.00 0.0
OLMo 2 32B Allen Institute for AI 10.6 2.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
OLMo 2 7B Allen Institute for AI 9.3 1.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Olmo 3 32B Think Allen Institute for AI 12.1 10.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Sarvam M (Reasoning) Sarvam 8.4 7.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Apriel-v1.5-15B-Thinker ServiceNow 28.3 18.7 $0.00 $0.00 $0.00 131.7 0.18 0.0
Arctic Instruct Snowflake 8.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen2.5 Instruct 72B Alibaba 15.6 11.9 $0.00 $0.00 $0.00 54.2 1.10 0.0
Qwen2.5 Coder Instruct 32B Alibaba 12.9 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen2 Instruct 72B Alibaba 11.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3 VL 4B (Reasoning) Alibaba 13.7 6.7 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen2.5 Instruct 32B Alibaba 13.2 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3 4B 2507 (Reasoning) Alibaba 18.2 9.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3 VL 4B Instruct Alibaba 9.6 4.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen2.5 Coder Instruct 7B Alibaba 10.0 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen1.5 Chat 110B Alibaba 9.5 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen Chat 72B Alibaba 8.8 $0.00 $0.00 $0.00 0.0 0.00 0.0
Qwen3 4B 2507 Instruct Alibaba 12.9 9.1 $0.00 $0.00 $0.00 0.0 0.00 0.0

The most expensive model is not always the best

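There is a persistent assumption in the industry that a higher price means higher quality. The data tells a different story. Sort the table above by value score and many of the top-ranked models cost a small fraction of a flagship's price while matching or beating some flagship models on the intelligence and coding benchmarks. For example, MiMo-V2-Flash (Feb 2026) scores 41.5 on intelligence at a blended $0.15/M, the same intelligence score as Grok 4 at $6.00/M.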

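This matters at scale. If your application handles 1 million requests a day, the gap between a $2/M model and a $0.20/M model is not a rounding error: depending on how many tokens each request uses, it can come to thousands of dollars a month. The cheaper model is sometimes faster as well, because small or optimized models often achieve higher throughput.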
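The arithmetic behind that claim can be sketched as follows. The function name `monthly_cost`, the 30-day month, and the figure of 100 tokens per request are illustrative assumptions, not values from the table; the sketch also applies a single flat price per million tokens rather than separate input and output prices.

```python
def monthly_cost(requests_per_day, tokens_per_request, price_per_million, days=30):
    """Estimated monthly spend in dollars at a flat price per million tokens."""
    tokens_per_day = requests_per_day * tokens_per_request
    return tokens_per_day / 1_000_000 * price_per_million * days


# Hypothetical workload: 1M requests/day at 100 tokens each (assumed).
expensive = monthly_cost(1_000_000, 100, 2.00)
cheap = monthly_cost(1_000_000, 100, 0.20)
savings = expensive - cheap

print(f"${expensive:,.0f} vs ${cheap:,.0f} per month, difference ${savings:,.0f}")
```

Under these assumptions the gap is roughly $5,400 a month. It grows linearly with tokens per request, so a workload averaging 1,000 tokens per request would see about ten times that.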

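Flagship models from the major providers do lead on the hardest benchmarks: frontier mathematics, PhD-level science, and complex multi-step reasoning. But most production workloads are not PhD-level science. They are classification, extraction, summarization, code generation, and conversational tasks. For these, a mid-tier model is not a compromise; it is the right tool.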

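Reasoning models add yet another dimension. They "think" with extra compute before answering, which improves accuracy on hard problems but also increases latency and cost. For use cases that do not need multi-step logical reasoning, a standard model can give the same answer in around a tenth of the time.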
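One rough way to compare response times is to add time to first token to the time needed to stream the answer. The sketch below assumes that approximation holds and that the table's TTFT column reflects the wait before the first answer token; the function name and the 200-token answer length are hypothetical. The speed and TTFT figures are copied from the Grok 4.1 Fast rows of the table.

```python
def estimated_response_seconds(ttft_s, speed_tps, output_tokens):
    """Rough end-to-end time: wait for the first token, then stream the rest."""
    if speed_tps <= 0:
        # A speed of 0.0 in the table presumably means no measurement is available.
        raise ValueError("speed must be positive")
    return ttft_s + output_tokens / speed_tps


answer_tokens = 200  # assumed answer length

# Grok 4.1 Fast (Reasoning): 164.9 t/s, TTFT 8.83 s
reasoning = estimated_response_seconds(8.83, 164.9, answer_tokens)
# Grok 4.1 Fast (Non-reasoning): 121.9 t/s, TTFT 0.33 s
standard = estimated_response_seconds(0.33, 121.9, answer_tokens)

print(f"reasoning: {reasoning:.1f} s, standard: {standard:.1f} s")
```

For short answers the TTFT term dominates, which is where the gap between a reasoning model and its standard counterpart is widest.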

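The value score in this table (intelligence divided by blended price) exists to make this trade-off visible at a glance. The best model for your project is not the smartest one available. It is the one that delivers the intelligence you need at the lowest price.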
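A minimal sketch of the value score and of the "smart enough at the lowest price" selection follows. The page does not say how the blended price is computed; the 3:1 input-to-output weighting used here is an assumption that happens to reproduce the rows below. The function names are hypothetical, and because the listed prices are rounded, other rows may differ slightly.

```python
def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average price per million tokens (3:1 weighting is an assumption)."""
    total = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total


def value_score(intelligence, blended):
    """Intelligence divided by blended price; 0.0 when no price is listed,
    mirroring the table rows that show $0.00 prices with a value of 0.0."""
    return intelligence / blended if blended > 0 else 0.0


def cheapest_meeting(models, min_intelligence):
    """Lowest-priced model whose intelligence meets the threshold, or None."""
    candidates = [
        m for m in models
        if m["intelligence"] >= min_intelligence and m["blended"] > 0
    ]
    return min(candidates, key=lambda m: m["blended"], default=None)


# A few rows copied from the table: name, intelligence, input $/M, output $/M.
rows = [
    ("MiMo-V2-Flash (Feb 2026)", 41.5, 0.10, 0.30),
    ("Step 3.5 Flash", 37.8, 0.10, 0.30),
    ("Gemma 3n E4B Instruct", 6.4, 0.02, 0.04),
    ("Grok 4", 41.5, 3.00, 15.00),
]
models = [
    {"name": name, "intelligence": iq, "blended": blended_price(inp, out)}
    for name, iq, inp, out in rows
]
for m in models:
    m["value"] = round(value_score(m["intelligence"], m["blended"]), 1)
    print(m["name"], m["value"])

# Hypothetical requirement: an intelligence score of at least 40.
pick = cheapest_meeting(models, min_intelligence=40)
print("cheapest model meeting the bar:", pick["name"])
```

With these four rows the computed scores match the table's Value column (276.7, 252.0, 256.0, and 6.9), and the selection returns MiMo-V2-Flash (Feb 2026) rather than Grok 4, even though both score 41.5 on intelligence.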

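If you are building on European infrastructure and want providers that respect data sovereignty, take a look at Voie.fi. It is an open index of more than 1,400 Europe-headquartered digital infrastructure providers covering compute, storage, payments, security, AI, and more.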