In a recent conversation on the “20VC with Harry Stebbings” show, Jonathan Ross—Founder and CEO of Groq—shared a bold vision of where AI is headed. From debates on scaling laws to the vastly underestimated costs of inference, Ross’s perspective challenges many commonly held beliefs about AI hardware, business models, and global competition. Below is a deep dive into some of the most compelling insights from the discussion.
Ross begins by revisiting the “scaling laws” popularized by OpenAI. These laws suggest that the bigger the model and the more data you feed it, the better it performs. However, Ross argues that the quality and nature of the data matter greatly: training on massive amounts of low-quality internet text is not the same as training on carefully curated or synthetic data.
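For context, the scaling-law framing Ross is pushing back on is often written in the parametric form fitted in the Chinchilla paper (Hoffmann et al., 2022), where predicted loss depends only on parameter count and token count. Data quality appears nowhere in the formula, which is exactly the blind spot Ross highlights. A minimal sketch using the paper’s published constants:

```python
# Parametric scaling law fitted in the Chinchilla paper (Hoffmann et al., 2022).
# Predicted loss depends only on parameter count N and training tokens D;
# data quality never enters the formula, which is the gap Ross points at.
def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    E, A, B = 1.69, 406.4, 410.7      # fitted constants reported in the paper
    alpha, beta = 0.34, 0.28
    return E + A / n_params**alpha + B / n_tokens**beta

# Chinchilla itself: 70B parameters trained on 1.4T tokens.
print(chinchilla_loss(70e9, 1.4e12))  # ≈ 1.94 nats per token
```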
He points out that large language models (LLMs) can generate their own synthetic data, effectively producing “better-than-human” training materials for themselves. In this process, the LLM learns from higher-quality, self-generated data, creating an iterative loop of improvement. This synthetic data approach refines models faster than simply throwing more tokens at them in a brute-force manner.
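In outline, the loop Ross describes might look like the sketch below. Here `generate`, `quality_score`, and `fine_tune` are hypothetical stand-ins rather than any real training API; only the iterative structure is the point.

```python
import random

def quality_score(text: str) -> float:
    """Hypothetical stand-in for a verifier or reward model."""
    return random.random()

def generate(model: dict, prompt: str) -> str:
    """Hypothetical stand-in for sampling from the model."""
    return f"answer to {prompt!r} (after {model['updates']} updates)"

def fine_tune(model: dict, examples: list) -> dict:
    """Hypothetical stand-in for a fine-tuning step."""
    model["updates"] += len(examples)
    return model

def self_improve(model: dict, prompts: list, rounds: int = 3,
                 threshold: float = 0.7) -> dict:
    for _ in range(rounds):
        candidates = [generate(model, p) for p in prompts]
        # Curate: keep only outputs the verifier rates highly, so the model
        # trains on better-than-average versions of its own work.
        curated = [c for c in candidates if quality_score(c) >= threshold]
        model = fine_tune(model, curated)
    return model

print(self_improve({"updates": 0}, ["q1", "q2", "q3"]))
```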
While advanced training techniques and improved data quality can make models more intuitive (the fast, pattern-matching “system 1”), Ross highlights the separate and equally important “system 2” aspect: deliberate reasoning. True progress in AI requires balancing massive data ingestion with more sophisticated reasoning algorithms. Ultimately, Ross sees ongoing algorithmic innovation, combined with more efficient training methods, as essential for further breakthroughs in performance.
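Ross doesn’t name a specific technique here, but self-consistency sampling is one well-known example of spending extra compute on “system 2”-style deliberation: draw several independent reasoning chains, then take a majority vote over their final answers. A toy sketch, with `sample_answer` as a hypothetical stand-in for one sampled chain:

```python
import random
from collections import Counter

def sample_answer(prompt: str) -> str:
    """Hypothetical stand-in for one sampled chain-of-thought answer."""
    return random.choice(["42", "42", "41"])  # noisy, but biased toward the truth

def self_consistency(prompt: str, n_samples: int = 15) -> str:
    # Spend extra inference-time compute: sample several independent
    # reasoning chains, then majority-vote over the final answers.
    answers = [sample_answer(prompt) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 6 * 7?"))  # usually "42"
```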
Contrary to popular belief, Ross says that the bulk of AI’s operational costs lie not in training but in inference—the real-time computation that occurs every time a model is used. At Google, he notes, inference consumed “10 to 20 times the compute” of training. And in his view, most of the market has underestimated the enormity of the inference challenge.
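A back-of-envelope calculation shows how serving costs overtake a one-time training run. It uses the standard approximations of roughly 6·N·D FLOPs for training and 2·N FLOPs per generated token for inference; the model size and traffic figures below are illustrative assumptions, not numbers from the interview:

```python
# When does cumulative inference compute overtake the training run?
# Standard approximations: training ~ 6*N*D FLOPs, inference ~ 2*N FLOPs
# per generated token. All workload numbers are illustrative assumptions.
N = 70e9                     # model parameters
D = 1.4e12                   # training tokens
training_flops = 6 * N * D   # ~5.9e23 FLOPs, paid once

tokens_per_query = 500       # assumed average output length
queries_per_day = 200e6      # assumed traffic for a popular service
inference_flops_per_day = 2 * N * tokens_per_query * queries_per_day

days_to_match = training_flops / inference_flops_per_day
print(f"{days_to_match:.0f} days of serving ≈ one full training run")
# ~42 days here; over a multi-year deployment, inference dominates by far.
```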
Ross believes that NVIDIA’s GPUs dominate training today, but a different architectural approach is needed for massive-scale inference. His company, Groq, focuses on LPUs (Language Processing Units)—a design he argues is better optimized for inference at scale: it’s faster, more energy-efficient, and significantly cheaper per token of output.
NVIDIA famously enjoys very high margins. But for Ross, the future is in high-volume, commodity-like inference. Rather than fight NVIDIA in training—a battle he calls “already solved” by GPUs—Ross wants Groq to own the inference market, offering a fraction of the cost and a fraction of the energy usage. It’s a strategic “divide and conquer”: let NVIDIA keep training, while Groq tackles inference at orders of magnitude lower cost.
One of Groq’s architectural advantages is a more efficient approach to memory. GPUs rely on specialized high-bandwidth memory (HBM), which is expensive and in chronically short supply. By contrast, Groq’s design keeps model parameters in on-chip memory, streaming data through a deterministic pipeline and avoiding much of the energy overhead of constantly shuttling weights to and from external memory. This approach, says Ross, leads to lower cost and up to three times better energy efficiency.
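A rough energy comparison makes the argument concrete. The per-byte costs below are order-of-magnitude assumptions broadly in line with published circuit-level estimates, not Groq or NVIDIA specifications:

```python
# Order-of-magnitude sketch of why weight placement matters. The energy
# costs per byte are illustrative assumptions, NOT vendor specifications.
PJ = 1e-12                 # one picojoule, in joules

hbm_pj_per_byte = 40.0     # assumed cost of an off-chip HBM access
sram_pj_per_byte = 2.0     # assumed cost of an on-chip SRAM access

model_bytes = 70e9         # 70B parameters at one byte each (e.g., int8)

def joules_per_token(pj_per_byte: float) -> float:
    # At batch size 1, autoregressive decoding re-reads every weight
    # once per generated token, so weight movement scales with model size.
    return model_bytes * pj_per_byte * PJ

print(f"HBM:  {joules_per_token(hbm_pj_per_byte):.2f} J/token")   # 2.80
print(f"SRAM: {joules_per_token(sram_pj_per_byte):.2f} J/token")  # 0.14
# Under these assumptions, weight movement alone is ~20x cheaper on-chip;
# real systems differ, but the direction of the argument holds.
```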
Ross predicts a near-future surge in data center construction, potentially creating a short-term oversupply that will prove insufficient once AI truly hits its stride. Many data center developers, he warns, do not fully understand the complexities of power, cooling, and water requirements. “There’s going to be a mismatch,” Ross cautions, with huge sums of money going into projects that ultimately lack the necessary infrastructure. Eventually, real AI demand will outstrip this half-built supply, sending operators scrambling once again.
For years, Groq lacked product-market fit. Ross candidly describes how the company nearly ran out of money and resorted to “Groq Bonds”—an internal scheme asking employees to swap salary for equity. Remarkably, most employees embraced the plan, uniting the team around its long-term mission rather than short-term comfort. That shared conviction kept the company afloat.
Ross clarifies a recent deal that was widely (but incorrectly) touted as “Groq raising $1.5 billion.” In reality, it’s a revenue-generating partnership in Saudi Arabia, where Groq will deploy its chips across data centers, with the partner fronting most of the capital. Groq pays back the cost—plus an agreed return—out of its revenue. Because Groq’s hardware and infrastructure are profitable at scale, Ross says, the company can avoid massive dilutive equity rounds and still ramp up at astonishing speed.
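The mechanics resemble vendor-financed capacity. In the toy cash-flow sketch below, every figure beyond the reported $1.5 billion is invented purely for illustration (the actual terms were not disclosed); it simply shows how revenue can retire the partner’s capital plus an agreed return without new equity:

```python
# Toy sketch of revenue-financed deployment. All numbers except the
# reported $1.5B are invented for illustration; real terms are undisclosed.
capital_fronted = 1.5e9      # partner funds the hardware build-out
agreed_return = 0.15         # assumed return owed on that capital
owed = capital_fronted * (1 + agreed_return)

annual_revenue = 1.0e9       # assumed revenue from the deployed capacity
payout_ratio = 0.5           # assumed share of revenue used for repayment

years = 0
while owed > 0:
    owed -= annual_revenue * payout_ratio
    years += 1
print(f"Capital plus return repaid in ~{years} years, with no new equity.")
```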
Ross refers to China’s AI ecosystem as “Sputnik 2.0” for the West, noting that China’s deep-seated technological ambition fuels a competitive mindset. However, he questions whether China’s models can be as open or uncensored as Western counterparts—a major factor that could limit the global adoption of China’s AI tools. In his view, strictly controlled LLMs are at a disadvantage compared to freely innovating, less-restricted systems.
While Europe may appear behind the U.S. and China, Ross argues there’s a wellspring of talent waiting to be activated by the right incentives. Rather than focusing on regulations, he believes European governments should foster the entrepreneurial spirit through “zones of risk-taking.” If the continent embraces faster labor mobility and simpler rules for startups, Ross sees no reason Europe can’t experience a breakthrough moment similar to Silicon Valley’s rise.
Ross draws on John Maynard Keynes’s “beauty contest” analogy to describe the current AI investment frenzy. With billions flowing into AI startups, even unproven ideas can raise staggering sums. He anticipates inevitable shakeouts and incineration of capital among the also-rans. Nevertheless, he contends the market winners—those solving real, unmet needs—will generate far more value than the total money lost.
Ross also outlines a progression for LLMs, with each stage building on the last. Companies that master these stages, he believes, will become defining forces in the industry, likely outpacing even the largest incumbents.
Ross envisions AI driving tremendous abundance: from cost-free (or nearly free) workforce equivalents to large-scale solutions for energy or housing. However, he worries about “financial diabetes,” where a life of ease dulls human ambition and problem-solving skills. As AI takes on more decisions, society risks losing the grit that comes from adversity.
Ross also speculates that if radical longevity is possible—akin to the sudden breakthrough in weight-loss drugs—then AI-powered biomedical research could make it happen faster than anyone expects. Though the timeline is uncertain, the convergence of AI and drug discovery raises the possibility of drastically extending human lifespans.
Jonathan Ross’s conversation with Harry Stebbings offers a glimpse into AI’s future that stands apart from the daily headlines. For Ross, inference is where the real battle looms: efficient, affordable deployment of large-scale models for billions of daily queries. Groq’s strategy—optimizing for cost, speed, and energy—marks a sharp departure from the GPU-centric mindset.
While there may be disagreements on exactly how quickly this shift will unfold, Ross’s overarching message resonates: AI’s most significant transformations often happen where people least expect them. If training has been the early star, inference is set to steal the spotlight—reshaping hardware architectures, business models, and the balance of global technological power.
In a world where “your job is to get positioned for the wave, not to follow it,” Jonathan Ross is betting on a new paradigm. Whether or not Groq achieves its ambitious goal of powering at least half of the world’s AI inference compute by 2027, or its broader mission of preserving human agency in the age of AI, the conversation highlights a pivotal shift: AI is not just evolving; it is reshaping the foundations of technology and industry.