NVIDIA’s 75% Gross Margins Won’t Last. Here’s Why.

NVIDIA is currently running 73.4% gross margins with a recent peak of 78.3% in early 2024. That’s extraordinary. It’s also unsustainable.

We’ve seen this movie before.

History Repeats: The Intel Playbook

Intel in their heyday (2010): 65% gross margins
Intel today (2025): 32% gross margins

What happened? Competitors didn’t need to match Intel’s performance immediately. They just needed to be “good enough” at a better price.

AMD, ARM, and custom silicon chipped away at the margins over roughly 15 years. What they eroded was the economic moat, not the technology moat.

Intel’s technical dominance remained intact for years. But once competitors achieved “good enough” performance at 30-40% lower prices, the business model crumbled. Market share followed margins, not performance leadership.

The Same Pattern Is Playing Out in AI Infrastructure

NVIDIA’s current margins suggest that an H100 costs roughly $7.5K-$10K to produce but sells for $30K+.
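The arithmetic behind that estimate is short: gross margin is (price - cost) / price, so the implied unit cost is price × (1 - margin). The sketch below applies it to a $30K sale price at the margins quoted in this post. It assumes the company-wide gross margin carries over to a single product, which is a simplification, and the function name is only illustrative.

```python
def implied_unit_cost(price: float, gross_margin: float) -> float:
    """Unit cost implied by a sale price and a gross margin fraction."""
    if not 0.0 <= gross_margin < 1.0:
        raise ValueError("gross_margin must be in [0, 1)")
    return price * (1.0 - gross_margin)


H100_PRICE = 30_000  # USD, the "$30K+" figure from the text

# Margins quoted in the post: recent peak, headline figure, current figure.
for margin in (0.783, 0.75, 0.734):
    cost = implied_unit_cost(H100_PRICE, margin)
    print(f"margin {margin:.1%}: implied cost ~ ${cost:,.0f}")
```

At a 75% margin the implied cost is $7,500, the low end of the range above. The $10K high end corresponds to a margin of about 67% on the same $30K price.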

For AI companies burning millions on LLM costs, that spread is an open invitation:

  • AMD MI300X at 44% margin → 39% cost reduction potential
  • Custom inference chips (Google TPU, AWS Inferentia) → even wider gaps
  • Specialized silicon optimized for transformers → purpose-built efficiency

The metric that matters isn’t TFLOPS or tensor cores. It’s $/token.

And that’s where the arbitrage begins.
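To make $/token concrete, here is a minimal sketch that amortizes a chip's purchase price over its service life, adds power cost, and divides by tokens served. Every number in it is a hypothetical placeholder, not a benchmark or vendor figure, and the `Accelerator` class, chip names, and default parameters are assumptions made for illustration. It also ignores networking, cooling, and software costs.

```python
from dataclasses import dataclass

SECONDS_PER_YEAR = 365 * 24 * 3600


@dataclass
class Accelerator:
    name: str
    price_usd: float       # purchase price (placeholder)
    tokens_per_sec: float  # sustained inference throughput (placeholder)
    power_kw: float        # average draw under load (placeholder)


def usd_per_million_tokens(chip: Accelerator, years: float = 3.0,
                           utilization: float = 0.6,
                           usd_per_kwh: float = 0.10) -> float:
    """Hardware plus energy cost per one million tokens served."""
    active_seconds = years * SECONDS_PER_YEAR * utilization
    tokens = chip.tokens_per_sec * active_seconds
    # Energy is only counted while the chip is busy (a simplification).
    energy_cost = chip.power_kw * (active_seconds / 3600) * usd_per_kwh
    return (chip.price_usd + energy_cost) / tokens * 1e6


# Hypothetical pair: chip_b has 80% of the throughput at half the price.
chip_a = Accelerator("chip_a", price_usd=30_000, tokens_per_sec=3_000, power_kw=0.7)
chip_b = Accelerator("chip_b", price_usd=15_000, tokens_per_sec=2_400, power_kw=0.7)

for chip in (chip_a, chip_b):
    print(f"{chip.name}: ${usd_per_million_tokens(chip):.3f} per 1M tokens")
```

With these placeholder inputs the slower chip comes out cheaper per token, which is the "good enough at a better price" dynamic in miniature. Real comparisons depend heavily on measured throughput for the specific model and batch size.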

Why This Matters Now

1. Performance parity takes 3-5 years. Price parity takes 6-12 months.

Training models still demands cutting-edge hardware. But inference, which now represents 60-80% of AI compute workloads, doesn’t need the latest architecture. It needs efficiency and low cost.

2. AI workloads are increasingly inference-heavy.

Every production LLM deployment shifts the balance from training (one-time) to inference (continuous). GPT-4, Claude, Gemini—they’re all inference machines at scale. That’s where margin compression hits hardest.

3. Margin compression always targets the fattest margins first.

It’s not a question of if, it’s when. Competitors don’t need to beat NVIDIA on every dimension—just the dimension customers care about most: total cost of ownership.

4. Intel’s decline from 65% → 32% margins took roughly 15 years (2010 to 2025).

In AI, everything moves 10x faster. The infrastructure cycle that took Intel over a decade will compress into 18-36 months for NVIDIA.
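As a quick sanity check on that window, the sketch below takes the 2010 and 2025 endpoints quoted earlier for Intel and divides the span by a speedup factor. The speedup itself is this post's assumption, not a measured quantity, and the function name is illustrative.

```python
INTEL_START_YEAR = 2010  # 65% gross margin, per the text
INTEL_END_YEAR = 2025    # 32% gross margin, per the text


def compressed_months(start_year: int, end_year: int, speedup: float) -> float:
    """Months a cycle would take if it ran `speedup` times faster."""
    return (end_year - start_year) * 12 / speedup


for speedup in (5, 10):
    months = compressed_months(INTEL_START_YEAR, INTEL_END_YEAR, speedup)
    print(f"{speedup}x faster: {months:.0f} months")
```

A 10x speedup gives 18 months and a 5x speedup gives 36 months, so the 18-36 month window corresponds to AI moving somewhere between 5x and 10x faster than the CPU market did.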

The Infrastructure Arbitrage Game Has Just Begun

For AI companies, the winners in 2026+ won’t be the ones with the fastest GPUs. They’ll be the ones with the best $/token economics.

NVIDIA will remain the performance leader. But leadership and profitability are different games.

The question isn’t whether NVIDIA’s margins will compress—it’s how fast, and who captures the value on the way down.

What do you think? How long can 73% margins survive in a commoditizing market?

