What is the cheapest LLM API right now?

The cheapest tracked model by blended (output-weighted) cost is Command R7B (12-2024) from Cohere at $0.04/1M input and $0.15/1M output.

Cheapest LLM API by task tier

Models are ranked by blended cost per 1M tokens, with output weighted 3:1 over input (the typical agent mix), and grouped by minimum context window. Live status shown per row.
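The blended figure can be reproduced with a short calculation. The sketch below assumes "output weighted 3:1" means a weighted average with weight 1 on the input price and weight 3 on the output price, which matches the Blended column in the tables on this page; the function name blended_cost and its output_weight parameter are illustrative, not part of any published API.

```python
def blended_cost(input_per_m: float, output_per_m: float,
                 output_weight: float = 3.0) -> float:
    """Weighted average of input and output price per 1M tokens.

    Assumption: the input price has weight 1 and the output price has
    weight `output_weight` (3 by default, i.e. a 3:1 output:input mix).
    """
    return (input_per_m + output_weight * output_per_m) / (1.0 + output_weight)


# Command R7B (12-2024): $0.04 in, $0.15 out -> (0.04 + 3 * 0.15) / 4
command_r7b = blended_cost(0.04, 0.15)
print(f"Command R7B blended: ${command_r7b:.4f} per 1M tokens")

# Gemini 3.1 Flash-Lite: $0.25 in, $1.50 out -> (0.25 + 3 * 1.50) / 4
gemini_31_lite = blended_cost(0.25, 1.50)
print(f"Gemini 3.1 Flash-Lite blended: ${gemini_31_lite:.4f} per 1M tokens")
```

A workload with a different input/output mix can pass its own output_weight; a more input-heavy mix may change the ranking.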

Bottom line: the cheapest tracked model overall is Command R7B (12-2024) from Cohere, at $0.04/1M input and $0.15/1M output, for a blended cost of $0.12/1M.

Cheapest with ≥128K context

#	Model	Provider	Input /1M	Output /1M	Context	Blended
1	Command R7B (12-2024)	Cohere	$0.04	$0.15	128K	$0.12
2	GPT-OSS 20B	Groq	$0.08	$0.30	131K	$0.24
3	DeepSeek V4 Flash	DeepSeek	$0.14	$0.28	1M	$0.25
4	Llama 3.3 70B (via OpenRouter)	OpenRouter	$0.10	$0.32	131K	$0.27
5	Llama 4 Scout (17Bx16E)	Groq	$0.11	$0.34	131K	$0.28
6	GPT-4.1 nano	OpenAI	$0.10	$0.40	1M	$0.33
7	Gemini 2.5 Flash-Lite	Google Gemini	$0.10	$0.40	1.048576M	$0.33
8	GPT-4o mini (legacy)	OpenAI	$0.15	$0.60	128K	$0.49

Absolute cheapest (any context)

#	Model	Provider	Input /1M	Output /1M	Context	Blended
1	Command R7B (12-2024)	Cohere	$0.04	$0.15	128K	$0.12
2	GPT-OSS 20B	Groq	$0.08	$0.30	131K	$0.24
3	DeepSeek V4 Flash	DeepSeek	$0.14	$0.28	1M	$0.25
4	Llama 3.3 70B (via OpenRouter)	OpenRouter	$0.10	$0.32	131K	$0.27
5	Llama 4 Scout (17Bx16E)	Groq	$0.11	$0.34	131K	$0.28
6	GPT-4.1 nano	OpenAI	$0.10	$0.40	1M	$0.33
7	Gemini 2.5 Flash-Lite	Google Gemini	$0.10	$0.40	1.048576M	$0.33
8	GPT-4o mini (legacy)	OpenAI	$0.15	$0.60	128K	$0.49

Cheapest with ≥1M context

#	Model	Provider	Input /1M	Output /1M	Context	Blended
1	DeepSeek V4 Flash	DeepSeek	$0.14	$0.28	1M	$0.25
2	GPT-4.1 nano	OpenAI	$0.10	$0.40	1M	$0.33
3	Gemini 2.5 Flash-Lite	Google Gemini	$0.10	$0.40	1.048576M	$0.33
4	DeepSeek V4 Pro	DeepSeek	$0.44	$0.87	1M	$0.76
5	DeepSeek V4 Pro (via OpenRouter)	OpenRouter	$0.44	$0.87	1.048576M	$0.76
6	Gemini 3.1 Flash-Lite	Google Gemini	$0.25	$1.50	1.048576M	$1.19

Automate this: agents call cheapest_model(min_context=…, operational_only=true) over MCP/x402 to route each request to the cheapest live model.
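To make the selection logic concrete, here is a minimal local sketch of what such a lookup might do. It is not the hosted cheapest_model tool and does not speak MCP or x402: the Model dataclass, the MODELS list, and the Python signature are hypothetical stand-ins. Prices are copied from the tables above, context sizes are approximated in tokens (128K taken as 128,000), and the operational flag is a placeholder set to True because live status is not reproduced in this text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    provider: str
    input_per_m: float       # USD per 1M input tokens
    output_per_m: float      # USD per 1M output tokens
    context_tokens: int      # approximate context window in tokens
    operational: bool = True  # placeholder: live status is not in this text

    @property
    def blended(self) -> float:
        # Assumed 3:1 output-weighted average, as described on this page.
        return (self.input_per_m + 3 * self.output_per_m) / 4


# A subset of the rows from the tables above (illustrative snapshot).
MODELS = [
    Model("Command R7B (12-2024)", "Cohere", 0.04, 0.15, 128_000),
    Model("GPT-OSS 20B", "Groq", 0.08, 0.30, 131_000),
    Model("DeepSeek V4 Flash", "DeepSeek", 0.14, 0.28, 1_000_000),
    Model("GPT-4.1 nano", "OpenAI", 0.10, 0.40, 1_000_000),
    Model("Gemini 2.5 Flash-Lite", "Google Gemini", 0.10, 0.40, 1_048_576),
]


def cheapest_model(min_context: int = 0,
                   operational_only: bool = True) -> Optional[Model]:
    """Return the lowest-blended-cost model meeting the constraints, or None."""
    candidates = [
        m for m in MODELS
        if m.context_tokens >= min_context
        and (m.operational or not operational_only)
    ]
    return min(candidates, key=lambda m: m.blended, default=None)


if __name__ == "__main__":
    for need in (128_000, 1_000_000, 2_000_000):
        pick = cheapest_model(min_context=need)
        print(need, "->", pick.name if pick else "no match")
```

Returning None when nothing qualifies lets the caller decide whether to relax the context requirement or fail the request, rather than silently routing to a model that cannot fit the prompt.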