The service supports Meta's Llama 3.1 models and, according to the company, delivers faster and more efficient inference than competing offerings, the latest sign of movement in the growing AI chip market.
SambaNova Systems has introduced a cloud service for AI software developers, joining competitors such as Cerebras and Groq in challenging Nvidia's dominance of the AI chip market. The company's new offering, SambaNova Cloud, is an API service powered by its SN40L AI chip and aimed at accelerating inference for developers. The service supports Meta's Llama 3.1 models, including the largest 405-billion-parameter variant.
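For developers, the draw of such a service is a hosted endpoint that can be called like any other model API. The sketch below shows what a request might look like, assuming the service exposes an OpenAI-compatible chat-completions interface; the base URL, model identifier, and environment-variable name are illustrative assumptions, not confirmed details of SambaNova Cloud.

```python
# Minimal sketch: querying a hosted Llama 3.1 405B endpoint.
# Assumes an OpenAI-compatible chat-completions API; the base URL,
# model name, and env var below are illustrative placeholders.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.sambanova.ai/v1",   # assumed endpoint
    api_key=os.environ["SAMBANOVA_API_KEY"],  # assumed env var name
)

response = client.chat.completions.create(
    model="Meta-Llama-3.1-405B-Instruct",  # assumed model identifier
    messages=[
        {"role": "user", "content": "Explain inference in one paragraph."}
    ],
    max_tokens=200,
)
print(response.choices[0].message.content)
```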
The launch underscores the growing importance of inference in generative AI. The company pointed to rising demand for alternatives to Nvidia, especially from enterprises seeking faster inference. Inference, the stage in which a trained model is run to produce outputs, is an opening for companies like SambaNova: smaller AI chip makers have recently launched products targeting the inference market, claiming advantages over GPUs from Nvidia, AMD, and others.
In July, Groq made Llama 3.1 models available on its GroqCloud Dev Console, which serves more than 300,000 developers; it uses its language processing unit (LPU) rather than GPUs for inference. In August, Cerebras Systems introduced Cerebras Inference, promising higher speeds for Llama models and claiming better performance and cost efficiency than Nvidia GPUs. Both Groq and Cerebras rely on SRAM to boost inference performance. SambaNova's SN40L takes a different approach: a three-tier memory system combining SRAM, HBM, and DDR5, which lets it handle both training and inference workloads efficiently. SambaNova's McGonnell argued that this architecture is more efficient than competitors' designs that lean heavily on SRAM.
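To see why a tiered design matters, consider the arithmetic: a 405-billion-parameter model needs roughly 810 GB just to hold 16-bit weights, far more than fits in on-chip SRAM or even HBM alone. The sketch below illustrates the general placement idea behind such hierarchies; the tier capacities and the capacity-only placement rule are invented for illustration and are not SambaNova specifications.

```python
# Illustrative sketch of a three-tier memory placement policy
# (SRAM -> HBM -> DDR5), the general idea behind tiered designs.
# All capacities below are invented example values.
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    capacity_gb: float  # illustrative capacity, not a real spec
    used_gb: float = 0.0

    def fits(self, size_gb: float) -> bool:
        return self.used_gb + size_gb <= self.capacity_gb


# Fastest tier first; a real scheduler would also weigh bandwidth
# and access patterns, not just capacity.
tiers = [
    Tier("SRAM", 0.5),     # small, on-chip, highest bandwidth
    Tier("HBM", 64.0),     # mid capacity and bandwidth
    Tier("DDR5", 1500.0),  # large capacity for full model weights
]


def place(tensor_name: str, size_gb: float) -> str:
    """Place a tensor in the fastest tier that has room for it."""
    for tier in tiers:
        if tier.fits(size_gb):
            tier.used_gb += size_gb
            return f"{tensor_name} ({size_gb} GB) -> {tier.name}"
    return f"{tensor_name} ({size_gb} GB) -> no tier fits"


print(place("kv_cache", 0.3))               # hot data lands in SRAM
print(place("active_layer_weights", 40.0))  # spills to HBM
print(place("full_405b_weights", 810.0))    # parked in DDR5
```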
The company’s cloud service runs the 405-billion-parameter Llama model at over 100 tokens per second, surpassing Groq and Cerebras in speed and efficiency. The global AI chip market, forecasted to grow significantly, offers opportunities for new players as demand for Nvidia’s H100 creates supply shortages. Major cloud providers and traditional chipmakers are also expanding their AI offerings to meet market needs.