CoreWeave tops AI inference benchmark for Moonshot AI model
CoreWeave Inc. (NASDAQ: CRWV) achieved the highest ranking for inference speed and price-performance for Moonshot AI's Kimi K2.6 model in an independent benchmark conducted by Artificial Analysis, the company announced.
The benchmark evaluated 11 inference providers on the open-source model. CoreWeave delivered 205 tokens per second at $0.70 per million tokens, with pricing calculated on a 7:2:1 agentic blend, according to the company's press release.
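As a rough illustration of how a blended per-token price can be computed, the sketch below takes a weighted average of per-category prices. The press release does not spell out what each part of the 7:2:1 ratio covers, so the three categories and the prices used here are hypothetical placeholders, not CoreWeave's actual rates or Artificial Analysis's methodology.

```python
def blended_price(prices_per_million, weights):
    """Return the weighted-average price in USD per million tokens.

    prices_per_million: price of each token category (hypothetical values).
    weights: relative share of each category in the blend, e.g. (7, 2, 1).
    """
    if len(prices_per_million) != len(weights):
        raise ValueError("prices and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(p * w for p, w in zip(prices_per_million, weights))
    return weighted_sum / total_weight


# Hypothetical per-category prices (USD per million tokens), for illustration only.
example_prices = (0.20, 1.00, 4.00)
# The 7:2:1 ratio from the article; which category each weight covers is an assumption.
example_weights = (7, 2, 1)

print(f"${blended_price(example_prices, example_weights):.2f} per million tokens")
```

With these placeholder inputs the blend works out to (7 × 0.20 + 2 × 1.00 + 1 × 4.00) / 10 = $0.74 per million tokens. The heavier a category's weight, the more its price dominates the headline figure.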
According to the company, the results reflect optimization across its memory architecture, runtime, and interconnect systems. CoreWeave used NVFP4 quantization with Eagle3 speculative decoding on NVIDIA GB300 NVL72 hardware to achieve the performance metrics.
"Training launched the first wave of AI, and inference will define the next one," said Chen Goldberg, Executive Vice President of Product and Engineering at CoreWeave. "This benchmark reflects the investments we've made across our full stack."
George Cameron, Co-founder at Artificial Analysis, said the benchmark aims to give organizations transparency into how inference offerings perform. "CoreWeave performed strongly across speed and price-performance dimensions in our benchmarking of providers of Kimi K2.6," Cameron stated.
CoreWeave offers three inference services: Serverless Inference for immediate API access, Dedicated Inference for predictable production scaling, and Inference on CoreWeave Kubernetes Service for direct infrastructure control.
Artificial Analysis independently tested Moonshot AI's Kimi K2.6 across more than 10 metrics, including MMLU-Pro, GPQA, and agentic coding tasks, to evaluate speed, cost, and reasoning capability.
CoreWeave previously received Platinum rankings in the SemiAnalysis ClusterMAX 1.0 and 2.0 evaluations and recorded benchmark results in MLPerf testing, according to the press release.
