chore(pricing): Update fireworks-ai pricing #549

Open
siddharthsambharia-portkey wants to merge 46 commits into main from pricing-update/fireworks-ai

siddharthsambharia-portkey (Collaborator) commented Mar 17, 2026

🔄 Pricing Update: fireworks-ai

📊 Summary (complete_diff mode)

| Change Type | Count |
| --- | --- |
| ➕ Models added | 17 |
| 🔄 Models updated (merged) | 4 |

➕ New Models

  • deepseek-v3p1
  • deepseek-v3p2
  • glm-4p7
  • glm-5
  • gpt-oss-120b
  • gpt-oss-20b
  • llama-v3p3-70b-instruct
  • minimax-m2p1
  • minimax-m2p5
  • qwen3-8b
  • qwen3-vl-30b-a3b-instruct
  • qwen3-vl-30b-a3b-thinking
  • flux-1-dev-fp8
  • flux-1-schnell-fp8
  • flux-kontext-pro
  • flux-kontext-max
  • qwen3-embedding-8b

🔄 Updated Models

  • kimi-k2-instruct-0905
  • kimi-k2-thinking
  • kimi-k2p5
  • mixtral-8x22b-instruct

Model → Pricing Category Mapping

| Model ID | Category | Input | Output | Notes |
| --- | --- | --- | --- | --- |
| deepseek-v3p1 | Named: DeepSeek V3 family | $0.56 | $1.68 | cache read = $0.28 (50%) |
| deepseek-v3p2 | Named: DeepSeek V3 family | $0.56 | $1.68 | cache read = $0.28 (50%) |
| glm-4p7 | Named: GLM-4.7 | $0.60 | $2.20 | cache read = $0.30 (50%) |
| glm-5 | Named: GLM-5 | $1.00 | $3.20 | cache read = $0.20 (page override) |
| gpt-oss-120b | Named: gpt-oss-120b | $0.15 | $0.60 | cache read = $0.075 (50%) |
| gpt-oss-20b | Named: gpt-oss-20b | $0.07 | $0.30 | cache read = $0.035 (50%) |
| kimi-k2-instruct-0905 | Named: Kimi K2 | $0.60 | $2.50 | cache read = $0.30 (50%) |
| kimi-k2-thinking | Named: Kimi K2 | $0.60 | $2.50 | cache read = $0.30 (50%) |
| kimi-k2p5 | Named: Kimi K2.5 | $0.60 | $3.00 | cache read = $0.10 (page override) |
| llama-v3p3-70b-instruct | Tier: >16B | $0.90 | $0.90 | cache read = $0.45 (50%) |
| minimax-m2p1 | Named: MiniMax M2 | $0.30 | $1.20 | cache read = $0.03 (page override) |
| minimax-m2p5 | Named: MiniMax M2 | $0.30 | $1.20 | cache read = $0.03 (page override) |
| mixtral-8x22b-instruct | Tier: MoE 56.1–176B | $1.20 | $1.20 | cache read = $0.60 (50%) |
| qwen3-8b | Tier: 4B–16B | $0.20 | $0.20 | cache read = $0.10 (50%) |
| qwen3-vl-30b-a3b-instruct | Named: Qwen3 VL 30B A3B | $0.15 | $0.60 | cache read = $0.075 (50%) |
| qwen3-vl-30b-a3b-thinking | Named: Qwen3 VL 30B A3B | $0.15 | $0.60 | cache read = $0.075 (50%) |
| flux-1-dev-fp8 | Image: per-step | $0.00025/step | | |
| flux-1-schnell-fp8 | Image: per-step | $0.00035/step | | |
| flux-kontext-pro | Image: per-image | $0.04/image | | |
| flux-kontext-max | Image: per-image | $0.08/image | | |
| qwen3-embedding-8b | Embedding | $0.008 | $0 | |
| qwen3-reranker-8b | SKIPPED | | | Reranker, excluded per skill rules |
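
The image rows are billed per step or per image rather than per token, so a request's cost depends on the billing unit. The following is a minimal sketch of that arithmetic, not code from this PR: the prices are copied from the table above, while `IMAGE_PRICING`, `image_cost`, and the step counts used with it are hypothetical names and values chosen for illustration.

```python
from decimal import Decimal

# Prices copied from the image rows of the table above (USD).
# The dictionary layout and the "per_step"/"per_image" labels are assumptions.
IMAGE_PRICING = {
    "flux-1-dev-fp8": ("per_step", Decimal("0.00025")),
    "flux-1-schnell-fp8": ("per_step", Decimal("0.00035")),
    "flux-kontext-pro": ("per_image", Decimal("0.04")),
    "flux-kontext-max": ("per_image", Decimal("0.08")),
}


def image_cost(model: str, images: int = 1, steps: int = 1) -> Decimal:
    """Estimate the USD cost of generating `images` images (hypothetical helper).

    For per-step models the cost scales with the number of inference steps;
    for per-image models `steps` is ignored.
    """
    unit, price = IMAGE_PRICING[model]
    if unit == "per_step":
        return price * steps * images
    return price * images
```

Using `Decimal` rather than `float` keeps sub-cent prices exact when they are multiplied by step counts.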

Pricing Rules Applied

  • Cache read: 50% of the input price by default (tier-based and named models alike); page-specified overrides apply for GLM-5 ($0.20), Kimi K2.5 ($0.10), and MiniMax M2 ($0.03)
  • Batch inference: 50% of the serverless input price and 50% of the serverless output price for all text/vision models
  • No cache_write_price is set (Fireworks charges for cache reads only)
  • Pricing source: 3 consistent scrapes of https://fireworks.ai/pricing (2026-03-08), verified against live page fetch (2026-03-30, same page size ~280KB)
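
The rules above can be sketched as a small derivation function. This is an illustrative sketch only, not the Pricing Agent's actual implementation: `derive_prices`, `CACHE_READ_OVERRIDES`, and the output key names are hypothetical, and the override values are the ones listed in this PR.

```python
from decimal import Decimal

HALF = Decimal("0.5")

# Page-specified cache-read overrides named in the rules above (USD).
# Mapping them onto these model IDs follows the pricing table in this PR.
CACHE_READ_OVERRIDES = {
    "glm-5": Decimal("0.20"),
    "kimi-k2p5": Decimal("0.10"),
    "minimax-m2p1": Decimal("0.03"),
    "minimax-m2p5": Decimal("0.03"),
}


def derive_prices(model: str, input_price: str, output_price: str) -> dict:
    """Derive cache-read and batch prices for a text/vision model.

    Hypothetical helper; the key names are assumptions, not the repo's schema.
    """
    inp = Decimal(input_price)
    out = Decimal(output_price)
    return {
        "input": inp,
        "output": out,
        # Default: 50% of the input price, unless the page lists an override.
        "cache_read": CACHE_READ_OVERRIDES.get(model, inp * HALF),
        # Batch inference: 50% of each serverless price.
        "batch_input": inp * HALF,
        "batch_output": out * HALF,
        # Deliberately no cache-write entry: only cache reads are charged.
    }
```

For example, `derive_prices("deepseek-v3p1", "0.56", "1.68")` yields a cache-read price equal to the $0.28 shown in the table, while `derive_prices("glm-5", "1.00", "3.20")` picks up the $0.20 override instead of the 50% default.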

Generated by Pricing Agent on 2026-03-30
