Has the hunt for AI compute uncovered the next Cerebras?
Source: TechCrunch
The raging demand for computers to run AI models has only accelerated, but there are two major obstacles that anyone in the business needs to overcome: getting the right chips, and getting them into data centers where they can start generating revenue.
The Right Chip for Inference
The demand for GPUs has exploded, yet it’s becoming conventional wisdom that they aren’t the best‑suited chips for running AI models once they have been trained. Inference—the phase where a model actively generates responses—has different computational requirements than training, and a new class of chips is being designed specifically for it.
Recent moves such as Nvidia’s $20 billion acquisition of Groq and Cerebras’ $57 billion IPO highlight the growing focus on inference‑optimized hardware. With capacity strained at both companies, the co‑founders of General Compute—CEO Finn Puklowski and CTO Jason Goodison—looked elsewhere.
General Compute’s Chip Strategy
General Compute is turning to SambaNova, an Intel‑backed chipmaker focused on inference that has slipped somewhat out of the mainstream conversation. SambaNova’s upcoming chips are pitched as offering:
- A more flexible architecture
- Larger memory for storing context during inference calculations
- Claimed performance beating GPUs and other specialized chips (e.g., those from Groq and Cerebras)
Puklowski says the new chips will generate 600–700 tokens per second, compared with roughly 250 tokens per second for GPUs. General Compute has $300 million of SN50 chips on order and will be the first neocloud to deploy them.
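As a rough illustration of what those throughput figures mean, the sketch below converts them into generation time and relative speedup. The 250 and 600–700 tokens-per-second figures are the ones Puklowski quotes; the workload size, the constant-rate assumption, and the function names are hypothetical and chosen only for illustration.

```python
# Throughput figures quoted by Puklowski, in tokens per second.
GPU_TPS = 250
SN50_TPS_LOW = 600
SN50_TPS_HIGH = 700

# Hypothetical workload size, chosen only for illustration.
WORKLOAD_TOKENS = 1_000_000


def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to emit `tokens` at a constant rate, counting decoding only."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


def speedup(new_tps: float, baseline_tps: float) -> float:
    """How many times faster `new_tps` is than `baseline_tps`."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return new_tps / baseline_tps


if __name__ == "__main__":
    rates = [
        ("GPU (quoted)", GPU_TPS),
        ("SN50 low (quoted)", SN50_TPS_LOW),
        ("SN50 high (quoted)", SN50_TPS_HIGH),
    ]
    for label, tps in rates:
        minutes = generation_seconds(WORKLOAD_TOKENS, tps) / 60
        ratio = speedup(tps, GPU_TPS)
        print(f"{label}: {minutes:.1f} min, {ratio:.1f}x vs GPU")
```

On the quoted numbers alone, the new chips would generate tokens roughly 2.4 to 2.8 times faster than a GPU. Real workloads also spend time on prompt processing, tool calls, and network latency, which this sketch ignores.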
Advantages for Data Center Deployment
- Air‑cooled rather than water‑cooled
- Lower power consumption
These characteristics allow the chips to be installed in existing data‑center facilities without new infrastructure investments.
Deployment Model: Colocation and Repurposing
General Compute is pursuing colocation deals, in which it installs its hardware in third‑party facilities. Its partners include traditional data‑center providers as well as crypto miners looking to repurpose their infrastructure now that the cost of mining Bitcoin often exceeds its price.
The company launched its cloud offering last week, claiming it is already the fastest at running MiniMax 2.7, a powerful open‑source LLM.
Funding and Strategic Partnerships
- $15 million seed round at a $60 million post‑money valuation, led by FUSE VC with participation from Carya Venture Partners and Village Global Ventures.
- Joe Hasselmann, a venture investor who backed Groq in 2021, invested through his new fund Evercrest Capital Partners. He sees parallels between SambaNova’s partnership with General Compute and the relationships CoreWeave has with Nvidia, as well as Groq’s chip‑making paired with its former cloud offering.
“They do need a healthy mix of customers that are going to put their chips in environments that are going to have high growth to them,” Hasselmann said. “As much as General Compute is making a bet on SambaNova, SambaNova is making a bet on General Compute.”
The Broader Inference Landscape
The key question is which computer architecture will capture the most value in the AI future. Inference clouds are implicit bets on a world of multiple models and agents, where no single provider dominates and speed and cost of inference become the primary competitive variables.
- Example: OpenRouter’s $113 million Series B round reflects the market’s appetite for platforms that give customers access to multiple models to optimize token spend.
- Speed matters for both price and capability. Puklowski aims to shrink hour‑long coding‑agent workloads to 5–10 minutes and to enable faster audio agents for customer service.
“If you use ChatGPT and it gives you 50 tokens per second, that’s still a heck of a lot faster than we can read,” Puklowski told TechCrunch. “Now that things have moved to agent‑to‑agent, where agents are out there reading on our behalf or pinging databases, they need to go faster.”