Why Is NVIDIA the Most Valuable Company in the World? The AI Stack, the CUDA Moat, and the Threats That Could Unseat It
NVIDIA dominates AI because it controls the software stack, particularly CUDA, that makes its hardware indispensable. While others supply the physical manufacturing, NVIDIA owns the layer where algorithms meet silicon. The switching costs are so high that researchers, cloud providers, and enterprises have built entire ecosystems around its platform, making displacement practically impossible even where competing hardware is comparable.
Andrew Baker | Group CIO, Capitec Bank | May 2026
NVIDIA sits at the top of the most important value chain in the world right now. But it does not build the machines that make its chips, it does not own the factories, and it does not fabricate the transistors. So why is it worth more than most national economies, and why can nobody unseat it?
To answer that properly, you have to understand the full supply chain, because NVIDIA’s dominance makes no sense without first understanding who comes before it. The story starts in the Netherlands, in a nondescript industrial park in Veldhoven, at a company most people have never heard of.
1. ASML: The Machine That Builds the Machine
Every advanced chip in your phone, your laptop, or a data centre running a large language model starts its life in a factory that depends entirely on machines built by a single company. [1] That company is ASML, a Dutch manufacturer whose extreme ultraviolet lithography systems hold a complete monopoly on the technology required to produce the world’s most advanced semiconductors. The entire global AI compute buildout, every H100, every Blackwell GPU, every custom AI accelerator from Google or Amazon, flows through machines assembled in one industrial town in the south of the Netherlands.
To appreciate what ASML does, you need to understand the basic physics of chipmaking. A modern processor is a wafer of silicon onto which billions of transistors have been etched, each one a microscopic switch that controls the flow of electricity. The smaller you can make those transistors, the more you can pack onto a single chip, and the more powerful and efficient that chip becomes. The process of etching those patterns is called lithography, and it works by shining light through a blueprint onto the silicon surface. The governing physics constraint is severe: the finest pattern you can print scales with the wavelength of the light you use, and the deep-ultraviolet light of previous machine generations could not print small enough to keep pace with what AI workloads now demand. [2]
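The relationship is captured by the Rayleigh criterion, the standard first-order model of lithographic resolution. As a worked aside, the numbers below are representative textbook values rather than the specification of any particular machine:

```latex
% Minimum printable feature size (critical dimension)
\mathrm{CD} \;=\; k_1 \, \frac{\lambda}{\mathrm{NA}}
% \lambda : wavelength of the light source
% NA      : numerical aperture of the projection optics
% k_1     : process-dependent factor, with a practical floor near 0.25
%
% Deep UV:  \lambda = 193  nm, NA ≈ 1.35, k_1 ≈ 0.3  →  CD ≈ 43 nm per exposure
% EUV:      \lambda = 13.5 nm, NA = 0.33, k_1 ≈ 0.4  →  CD ≈ 16 nm per exposure
```

Shrinking λ from 193 to 13.5 nanometres moves single-exposure resolution by more than a factor of two even against the best deep-ultraviolet immersion optics, and it is why EUV, and only EUV, keeps leading-edge transistor scaling alive without resorting to ever more complex multi-patterning.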
ASML solved this by building machines that generate extreme ultraviolet light at a wavelength of 13.5 nanometres, roughly a five thousandth of the width of a human hair. The engineering required to produce this at commercial scale is genuinely science fiction made real. Inside each machine, a laser fires 50,000 pulses per second at individual droplets of molten tin travelling at high speed through a vacuum. The laser vaporises each droplet into a plasma that emits a burst of EUV light hotter than the surface of the sun, and that light is then guided by mirrors polished to within a single atom of flatness, through a vacuum chamber, through a patterned chip blueprint, and onto a silicon wafer. The whole assembly weighs around 200 tonnes, costs up to 400 million euros per unit, and draws on components from over 800 specialist suppliers across the globe, including optical systems from Zeiss in Germany that took fifteen years to develop. [3]
Why can nobody else build these machines? The honest answer is accumulated time. ASML spent thirty years and tens of billions of euros developing this technology, iterating through engineering failures, building supplier relationships that cannot be recreated overnight, and accumulating manufacturing knowledge that is simply not available on the open market. [4] Competitors like Nikon and Canon pursued their own EUV programmes, ran out of willingness to absorb the financial risk, and quietly withdrew. Nikon stopped mentioning its EUV work in annual reports from 2013 onward. China has poured enormous state resources into the problem and remains locked out of the most advanced process nodes as a result. The learning curve involved is not a curve so much as a cliff face, and ASML has been ascending it since the 1990s while every other potential entrant watched from the bottom and eventually turned away. [5]
The strategic implication runs deep across the entire semiconductor industry. ASML is the tollbooth through which all advanced semiconductor progress must pass, and the US government understood this clearly enough to pressure the Dutch government to restrict ASML’s sales of its newest machines to China, a chokepoint that has become one of the defining structural features of technology competition between the two largest economies in the world. [6]
2. TSMC: The Factory That Builds What NVIDIA Designs
Once ASML’s machines are installed in a fab, you still need an organisation with the process expertise to run them at industrial scale and produce consistent, high-yield chips. That organisation is Taiwan Semiconductor Manufacturing Company, and it is arguably the most strategically critical factory operation on the planet. [7] TSMC operates what is called a foundry model, meaning it designs nothing itself and instead manufactures chips for others at a level of precision and scale that no rival has come close to matching. Apple’s processors, AMD’s CPUs, Qualcomm’s mobile chips, and critically NVIDIA’s AI accelerators all roll off TSMC’s production lines in Taiwan.
TSMC controls roughly 71% of the global contract chip manufacturing market, with Samsung in second place at around 8%. [8] The gap between them is not simply a matter of installed capacity. TSMC leads on process technology advancement, which means it consistently achieves the finest geometries at the highest production yields, and doing so reliably, at volume, with the quality controls required for chips that cost hundreds of dollars each to produce demands decades of accumulated operational expertise that rivals cannot shortcut with capital expenditure alone.
NVIDIA relies on TSMC almost entirely for its most advanced products. The H100 and the entire Blackwell generation are manufactured at TSMC, and when TSMC encounters supply constraints or yield challenges with an advanced packaging technology such as CoWoS, the impact appears immediately in NVIDIA’s ability to ship product to the customers who need it most. [9] This dependency is not a weakness unique to NVIDIA but rather a structural feature of the entire advanced chip ecosystem, one that every major chip designer including Apple, AMD, and Qualcomm shares equally.
3. The Full Stack
The value chain looks like this: ASML builds the machines that make fabrication possible; TSMC runs those machines at industrial scale; NVIDIA designs the chips that TSMC manufactures and wraps them in a software platform; that platform underpins AI frameworks like PyTorch and TensorFlow; and those frameworks train the large language models and AI systems that the world is currently spending hundreds of billions of dollars to build. NVIDIA sits in the middle of this chain but captures more value than anyone else in it. ASML and TSMC both hold extraordinary and durable competitive positions, yet NVIDIA commands a market valuation that dwarfs them both combined. The reason is not the hardware alone. It is the software layer that NVIDIA spent two decades building on top of the hardware, making its silicon not just fast but structurally indispensable to the entire AI industry.
4. From Gamer Chips to the Engine of Intelligence
NVIDIA was founded in 1993 to build graphics processing units for gaming, and for most of its first two decades that is what it primarily did. GPUs are architecturally different from CPUs in a way that turns out to be profoundly consequential for AI. A CPU is optimised for sequential processing, meaning a small number of extremely powerful cores designed to handle complex and branching logic quickly. A GPU is designed for parallel processing, meaning thousands of simpler cores that execute the same operation across enormous datasets simultaneously. For gaming, this parallelism is what allows a system to calculate lighting, shadows, and textures across millions of pixels at frame rates fast enough to feel real.
Training a neural network requires exactly the same kind of workload at a mathematical level. The core operations of deep learning, including the matrix multiplications, gradient calculations, and weight updates applied across billions of parameters, are embarrassingly parallel in nature, meaning they are ideally suited to being distributed across thousands of cores running simultaneously. When researchers started experimenting seriously with GPUs for machine learning in the late 2000s, the performance gains over CPUs were not incremental improvements. They were transformational, often an order of magnitude better or more, and they changed what was computationally feasible for AI research almost overnight. [10]
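The scale of that gain is easy to reproduce. Below is a minimal sketch, assuming PyTorch and a CUDA-capable GPU are available, that times the same matrix multiplication on CPU and GPU; on typical data centre hardware the gap is one to two orders of magnitude:

```python
import time
import torch

def time_matmul(device: str, n: int = 4096, iters: int = 10) -> float:
    """Average seconds per n x n matrix multiplication on the given device."""
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    torch.matmul(a, b)                 # warm-up: exclude one-off allocation costs
    if device == "cuda":
        torch.cuda.synchronize()       # GPU kernels launch asynchronously
    start = time.perf_counter()
    for _ in range(iters):
        torch.matmul(a, b)
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters

cpu_t = time_matmul("cpu")
if torch.cuda.is_available():
    gpu_t = time_matmul("cuda")
    print(f"CPU {cpu_t * 1e3:.1f} ms | GPU {gpu_t * 1e3:.1f} ms | {cpu_t / gpu_t:.0f}x faster")
```

The operation being timed here is the same one that dominates a transformer training step, which is why the speedup translates so directly from this toy benchmark to real workloads.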
The 2012 AlexNet breakthrough, in which a deep neural network trained on NVIDIA GPUs crushed every competitor in the ImageNet image recognition competition, is widely cited as the moment the modern AI era began in earnest. [11] What rarely gets sufficient emphasis is that AlexNet would not have been possible without the CUDA programming framework that NVIDIA had been quietly building since 2006, six years before anyone outside a handful of machine learning research groups understood why it would matter commercially.
5. CUDA: The Moat That Hardware Cannot Buy
CUDA stands for Compute Unified Device Architecture. It is a parallel computing platform and programming model launched by NVIDIA in 2006 that allows software developers to write code which runs directly on NVIDIA GPUs. At the time of its launch, this was a significant and genuinely risky investment. GPUs were niche gaming and graphics hardware, and convincing developers to invest in learning an entirely new programming paradigm, with no AI market yet in existence and no guarantee that one would emerge, required a bet on a future that was far from certain. [12]
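To make the programming model concrete, here is roughly what CUDA code looks like at the level developers actually write it. This is a minimal sketch, using the CuPy library as a convenient illustration (an assumption for brevity; production kernels are typically written in CUDA C++ directly): one kernel function is executed simultaneously by roughly a million threads, each handling a single element.

```python
import cupy as cp  # assumes CuPy and an NVIDIA GPU are available

# A minimal CUDA kernel: each GPU thread adds one pair of vector elements.
vec_add = cp.RawKernel(r'''
extern "C" __global__
void vec_add(const float* x, const float* y, float* out, int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;  // this thread's global index
    if (i < n) out[i] = x[i] + y[i];
}
''', 'vec_add')

n = 1 << 20
x = cp.random.rand(n, dtype=cp.float32)
y = cp.random.rand(n, dtype=cp.float32)
out = cp.empty_like(x)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
vec_add((blocks,), (threads_per_block,), (x, y, out, cp.int32(n)))

assert cp.allclose(out, x + y)
```

Almost nobody in AI writes kernels like this day to day. The entire point of the ecosystem NVIDIA built on top, described next, is that they do not have to, while the performance of everything they do write rests on millions of such kernels tuned over two decades.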
What NVIDIA did over the following two decades was build an entire ecosystem on top of this foundation in a way that compounded its value with every passing year. The base CUDA platform enabled general-purpose computing on the GPU, and around it NVIDIA layered purpose-built libraries: cuDNN for deep neural network operations, cuBLAS for linear algebra, NCCL for multi-GPU communication across distributed systems, and hundreds of additional specialist libraries, each optimised down to the hardware level in ways that generic code simply could not match. When PyTorch and TensorFlow emerged as the dominant AI development frameworks, they were built with deep CUDA integration at their core. PhD students in machine learning learned CUDA as part of the standard curriculum. Academic papers published CUDA benchmark results as the accepted industry standard. The entire intellectual infrastructure of the AI research field was effectively written on top of NVIDIA’s platform, and this was not an accident so much as the deliberate outcome of a sustained strategy to make the platform indispensable before the market arrived. [13]
The implications for competitive switching are severe and deserve to be understood clearly. A company that has built its model training pipeline on NVIDIA hardware is not simply making a hardware procurement choice. Its engineers are trained on NVIDIA’s profiling and debugging tools. Its performance-critical kernel code is written in CUDA. Its model validation benchmarks were run on NVIDIA hardware and validated against NVIDIA performance characteristics. To migrate to an alternative, that company must retrain its engineering staff, rewrite its optimised code, revalidate that model performance is equivalent or better on the new platform, and accept operational risk throughout the entire transition period. Estimates for rewriting a significant CUDA-based production system for AMD’s competing ROCm platform consistently run to months of engineering effort and hundreds of thousands of dollars per project, and that is before accounting for the productivity loss during transition. [14] For a hyperscaler running thousands of such systems, the economics of migration are deeply unfavourable even when the alternative hardware offers comparable raw performance on selected benchmarks.
This is the CUDA moat in its full form. It is not a software trick or a contractual lock-in mechanism. It is a self-reinforcing ecosystem built on time, on massive developer investment, and on accumulated optimisation depth that no competitor has managed to replicate at comparable scale. AMD’s ROCm platform is maturing and genuinely improving year on year, but independent engineering assessments consistently describe it as materially behind CUDA in stability, library completeness, and framework integration depth. Intel’s oneAPI faces structurally similar challenges. The performance gap on hardware benchmarks has narrowed, but the software ecosystem gap remains wide enough to determine real purchasing decisions in practice, and that is what matters to an organisation deciding where to deploy its next billion dollars of AI infrastructure investment. [15] [16]
6. Why NVIDIA, Not ASML or TSMC, Commands the Highest Valuation
ASML holds a literal monopoly on the machines that make advanced chips. TSMC manufactures roughly 71% of the world’s contracted advanced semiconductor output. Both are extraordinary businesses with competitive positions that are genuinely difficult to challenge on any short time horizon. So why does NVIDIA sit at a market valuation of 3 to 4.5 trillion dollars while ASML sits closer to 300 billion, and why has NVIDIA briefly touched five trillion dollars while TSMC remains at a fraction of that figure? [17]
The answer lies in where value accumulates in a technology stack, and specifically in the difference between manufacturing economics and software economics. ASML is a capital-intensive industrial manufacturer. Its machines are extraordinary feats of engineering, but each one is enormously expensive to build and takes years to produce, and the company’s revenue is fundamentally bounded by its manufacturing throughput and the rate at which its customers build new fabs. TSMC is a precision manufacturer operating at breathtaking scale, with all the capital intensity that implies. Its margins are excellent by semiconductor industry standards but remain structurally constrained by the cost of running and continuously upgrading fabrication plants that cost tens of billions of dollars each to construct.
NVIDIA captures value at the layer where software meets silicon, and software economics are categorically different from manufacturing economics. The GPU itself is fabricated by TSMC, but the CUDA ecosystem, the developer relationships, the AI framework integrations, the enterprise software stack, the DGX server systems, and the NVLink interconnect fabric are all NVIDIA’s own intellectual property. Once you have written those libraries and embedded your platform in the frameworks that every AI researcher and engineer depends on, the marginal cost of serving an additional customer approaches zero. The gross margins on NVIDIA’s data centre segment reflect this reality, running consistently in the high seventies to low eighties as a percentage of revenue, numbers that no hardware manufacturer running physical factories can approach. [18]
The pricing power dimension compounds the advantage further. Because demand for AI training compute has consistently outstripped NVIDIA’s production capacity since ChatGPT’s release in late 2022, the company has been setting prices rather than discovering them through competitive market pressure. H100 GPUs with a retail list price of around 30,000 dollars were trading on secondary markets for multiples of that figure during the peak supply crunch of 2023 and 2024. Customers who needed to train frontier models had no viable alternative and they paid without meaningful negotiating leverage. This is what near-monopoly pricing power looks like when it is operating in a market where demand is existential to the buyer and supply is genuinely constrained. [19]
7. Who Is Challenging NVIDIA, and How Seriously?
NVIDIA’s dominance at roughly 80 to 90% market share in AI training accelerators invites challenge from every direction: competitors seeking revenue share, hyperscaler customers seeking independence from a single supplier, and governments concerned about strategic concentration in a technology that is fast becoming critical national infrastructure. [20] The threats come from meaningfully different sources and deserve honest assessment rather than either dismissal or overcrediting.
AMD, with its MI300 and upcoming MI400 GPU series, is the most credible hardware challenger. Its ROCm software platform is closing the gap year on year but remains materially behind CUDA in ecosystem depth and production stability. Google’s TPUs are excellent for Google’s own models and deeply optimised for their specific workloads, but are not available to third parties at meaningful commercial scale. Amazon’s Trainium chips are growing in adoption within the AWS ecosystem, though migration requires significant developer effort and the technology remains tightly ecosystem-specific. Broadcom has secured significant custom AI accelerator design partnerships with hyperscalers and projects substantial revenue in custom silicon by 2027, but these designs rely on deep customer co-design and are not general-purpose substitutes for NVIDIA’s platform. Intel’s Gaudi accelerators have given its data centre business early signs of recovery but remain a distant challenger in AI-native compute workloads. Huawei and China’s domestic chip efforts are meaningful within China given US export restrictions on NVIDIA hardware but are of limited relevance to NVIDIA’s global competitive position outside that geography. [21]
The pattern that emerges from examining these challengers is consistent and instructive. The most credible hardware competitors all run into the same structural obstacle: they can build hardware that competes on selected benchmarks, but the software ecosystem surrounding NVIDIA is not something that can be replicated by engineering effort or capital expenditure on any short time horizon. The custom silicon programmes at individual hyperscalers reduce their own dependence on NVIDIA without creating externally available alternatives that the broader market can adopt without enormous friction and sustained investment commitment.
There is a more nuanced and structurally important threat worth watching closely. As AI workloads shift from training, which is massively parallel and strongly suited to GPU architecture, toward inference, the computational requirements change in ways that matter for the competitive landscape. Inference is often more latency-sensitive, more workload-specific, and less computationally intensive per query than training large models from scratch. Custom silicon designed around the economics of inference, measured in cost per token at scale rather than raw training throughput, can outperform general-purpose GPUs on the metrics that drive production deployment decisions. This is where the CUDA moat is at its weakest, because inference optimisation is newer territory with less accumulated ecosystem depth and fewer sunk engineering costs on the customer side. Custom silicon already represents around 20% of the AI accelerator market by revenue and is projected to grow to nearly 28% in 2026, representing the most significant structural shift in the competitive landscape since the AI buildout began in earnest. [22]
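A back-of-envelope model shows why inference economics are decided on different terms. Every figure below is an illustrative assumption rather than a vendor number, but the shape of the calculation is the one inference operators actually run:

```python
# Illustrative cost per million tokens served by one accelerator.
gpu_price_usd = 30_000          # assumed purchase price
lifetime_years = 4              # assumed amortisation horizon
power_kw = 1.0                  # assumed sustained board power
electricity_usd_per_kwh = 0.08  # assumed industrial tariff
tokens_per_second = 10_000      # assumed serving throughput

hours = lifetime_years * 365 * 24
capex_per_hour = gpu_price_usd / hours                # ~$0.86/hour
energy_per_hour = power_kw * electricity_usd_per_kwh  # ~$0.08/hour
tokens_per_hour = tokens_per_second * 3_600

usd_per_million = (capex_per_hour + energy_per_hour) / tokens_per_hour * 1e6
print(f"~${usd_per_million:.3f} per million tokens")  # ~$0.026 under these assumptions
```

A chip that doubles tokens per second at the same power and price halves that number, and nothing in the calculation rewards CUDA’s accumulated training optimisations. That is the narrow but real opening the custom silicon vendors are aiming at.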
8. What Protects the Revenue Moat Going Forward
Several structural advantages make NVIDIA’s position durable well beyond the current investment cycle, though that durability should not be mistaken for permanence or invulnerability to long-term erosion through accumulated and patient competitive pressure.
The first protection is the full-stack expansion that NVIDIA has been executing deliberately for several years. NVLink, its proprietary high-speed interconnect, is how large GPU clusters communicate internally at the speeds required to train frontier models across thousands of GPUs simultaneously. NVLink Fusion extends this architecture to scenarios where customers want to combine NVIDIA’s networking and compute components with their own custom silicon, which means that even organisations building their own chips find themselves pulled partially back into the NVIDIA ecosystem rather than exiting it entirely. Omniverse provides a simulation and digital twin platform that extends NVIDIA’s presence into physical AI and industrial applications well beyond the data centre. NVIDIA AI Enterprise is a software subscription layer that creates a recurring financial relationship with customers extending beyond each hardware procurement cycle. [23]
The second protection is the talent pipeline, which is less visible than hardware or software metrics but arguably more durable over a long time horizon. Universities teach CUDA as the standard parallel computing curriculum. Graduate students in machine learning and AI write their dissertations on NVIDIA hardware using NVIDIA tools. The engineers entering the AI industry over the next decade have been trained on NVIDIA’s ecosystem and will carry that knowledge and those preferences into the organisations they join. This creates a compounding onboarding advantage for any organisation that remains on NVIDIA and a genuine skills-scarcity problem for any organisation that decides to migrate away, because the people who know how to optimise for alternative platforms are far scarcer than those who know CUDA deeply. [24]
The third protection is the pace of hardware generation cycles and the revenue they generate for reinvestment. NVIDIA’s roadmap has moved from the Hopper generation to Blackwell and onward toward Vera Rubin, with each generation delivering meaningful performance improvements and capturing revenue at a scale that funds R&D investment that no startup and few large competitors can match on the software ecosystem side. The Blackwell generation alone is projected to contribute approximately 320 billion dollars in data centre revenue across 2026, and that revenue funds the software ecosystem investment that makes the next generation of hardware even more defensible. The flywheel does not merely sustain itself. Each revolution makes it harder to interrupt from the outside. [25]
9. Where Does NVIDIA Go From Here?
Jensen Huang has said NVIDIA has visibility to one trillion dollars in cumulative revenue across the Blackwell and Vera Rubin chip generations through 2027. [26] The four major hyperscalers, Microsoft, Google, Amazon, and Meta, have collectively committed in excess of 650 billion dollars in AI infrastructure spending for 2026 alone, and NVIDIA captures a dominant share of that capital flow. NVIDIA’s market capitalisation briefly crossed five trillion dollars in April 2026, making it the first chip company in history to reach that milestone and, at that moment, the largest company in the world by valuation. [27]
NVIDIA’s strategy beyond the current GPU supercycle runs in several directions simultaneously. Sovereign AI, the growing conviction among governments that nations need their own AI computing infrastructure rather than routing all sensitive AI workloads through US hyperscaler data centres, is emerging as a substantial and strategically distinct market. NVIDIA is actively deploying AI infrastructure in partnership with governments and telecommunications operators across France, Germany, Italy, and the United Kingdom, and these are not marginal pilot programmes. National governments operate on different procurement timelines and with different strategic motivations from private hyperscalers, and NVIDIA’s full-stack capability combined with its established brand in AI infrastructure makes it the default partner for sovereign AI buildouts of this kind. [28]
Physical AI, meaning the application of AI to robotics, industrial automation, and systems that interact with the physical world rather than processing text or images inside data centres, is the frontier that Huang has articulated most clearly as NVIDIA’s next major market. The Vera Rubin architecture is designed with agentic and physically embodied AI workloads in mind rather than being purely optimised for large language model training. As AI capability shifts from software products accessed through text interfaces toward systems that perceive and manipulate the physical environment, the computational requirements and latency constraints change significantly, and NVIDIA’s Isaac robotics platform and Jetson edge computing product line position it early in what could become the next major compute supercycle after the current data centre buildout matures. [29]
The risks are real and deserve honest acknowledgement rather than dismissal. Custom silicon continues to gain share in inference workloads at a pace that matters strategically. US export restrictions on advanced chip sales to China have permanently removed NVIDIA from what was previously a significant revenue market. The Vera Rubin architecture has experienced development delays that create uncertainty precisely when NVIDIA needs to demonstrate continued momentum to justify its valuation premium. And as the performance gap between NVIDIA’s hardware and its competitors narrows on published benchmarks, the argument for remaining on the NVIDIA platform increasingly rests on the CUDA ecosystem rather than on raw hardware superiority, making the health and continued developer investment in that ecosystem an increasingly important variable to monitor going forward. [30]
10. Power: The Physical Constraint Nobody Wants to Talk About
The AI buildout has a ceiling that no amount of software elegance or hardware innovation can simply engineer away, and that ceiling is power. Every GPU consumes electricity. Every rack generates heat. Every data centre sits on a grid that was not designed for this scale of demand, and the numbers have reached a point where power availability is now as binding a constraint on AI progress as chip supply.
The trajectory is stark. NVIDIA’s H100 GPU, the workhorse of the 2023 and 2024 AI buildout, draws approximately 700 watts per unit. The Blackwell B200 draws 1,000 watts. [31] A GB200 NVL72 rack, which houses 72 Blackwell GPUs and their interconnects, draws between 120 and 140 kilowatts, and fewer than 5% of the world’s existing data centres are capable of supporting even 50 kilowatts per rack. [32] When NVIDIA ships the Vera Rubin generation, a single NVL72 rack is projected to draw power at levels that make industrial-grade liquid cooling a baseline requirement rather than an option. Air cooling physically cannot dissipate the thermal output of these systems, which is why the liquid cooling supply chain has become one of the most consequential secondary markets in the AI infrastructure buildout.
At the cluster level, the numbers move from watts to megawatts to gigawatts in a way that strains comprehension. NVIDIA’s partnership with OpenAI alone involves the deployment of a minimum of 10 gigawatts of AI data centre capacity, equivalent to the power consumption of an entire small country. [33] Europe’s largest planned AI campus in France carries a projected capacity of 1.4 gigawatts. AI infrastructure programmes in Saudi Arabia are planned at up to 500 megawatts. Goldman Sachs projects that the AI buildout will require approximately 7.6 trillion dollars in cumulative capital expenditure between 2026 and 2031, and a substantial fraction of that is not chips or software but power infrastructure: grid connections, cooling systems, and the substations and transmission capacity required to feed facilities at this scale. [34]
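A quick sanity check, using assumed round numbers rather than vendor specifications, shows how rack-level watts compound into national-scale gigawatts:

```python
# Rough power arithmetic for a Blackwell-class deployment (illustrative figures).
gpus_per_rack = 72
watts_per_gpu = 1_000      # accelerator board power
overhead_factor = 1.7      # assumed: CPUs, switch trays, cooling, power conversion

rack_kw = gpus_per_rack * watts_per_gpu * overhead_factor / 1_000
print(f"~{rack_kw:.0f} kW per rack")   # ~122 kW, inside the quoted 120-140 kW range

buildout_gw = 10                       # the OpenAI partnership figure
racks = buildout_gw * 1e9 / (rack_kw * 1_000)
print(f"{buildout_gw} GW ≈ {racks:,.0f} racks ≈ {racks * gpus_per_rack / 1e6:.1f} million GPUs")
```

Under these assumptions a 10 gigawatt commitment implies on the order of eighty thousand racks and several million GPUs, before counting the cooling plant and grid infrastructure needed to feed them.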
Microsoft has disclosed an 80 billion dollar backlog of Azure AI orders that cannot be fulfilled due to power constraints, not chip shortages or software readiness but the simple inability to connect enough electricity to enough buildings fast enough. [35] The power constraint is now the binding limit on AI expansion in many of the most desirable data centre markets, including northern Virginia, the UK, and parts of the Netherlands, where permitting for new power connections has slowed or stalled. This is driving hyperscalers toward increasingly aggressive strategies: long-term power purchase agreements with nuclear operators, behind-the-meter generation through small modular reactors and dedicated gas turbines, and a global search for jurisdictions with surplus grid capacity and favourable permitting regimes.
The efficiency dimension matters as much as the raw power numbers. NVIDIA’s Blackwell architecture delivers approximately 10 times the inference throughput per megawatt compared to the prior Hopper generation, meaning that a facility with a fixed power budget can serve ten times more inference requests. [36] This is not a marginal improvement. It fundamentally changes the economics of AI deployment for any operator whose costs are driven by electricity rather than by hardware procurement. As inference workloads grow relative to training workloads, power efficiency per token becomes the metric that determines profitability, and NVIDIA has understood this shift and oriented its Blackwell and Vera Rubin architectures accordingly. The company that can deliver the most intelligence per kilowatt-hour wins the power-constrained AI market, and NVIDIA currently leads on that metric by a substantial margin.
For anyone building or procuring AI infrastructure, power is no longer a facilities question delegated to the data centre team. It is a strategic constraint that shapes which models can be trained, at what cost, in which locations, and on what timeline. NVIDIA’s ability to keep improving performance per watt is now as important to its competitive position as its ability to deliver raw compute throughput.
11. NVLink and the Interconnect Layer: Why GPUs Have to Talk Fast
A single GPU, no matter how powerful, cannot train a large language model. GPT-4 has an estimated 1.8 trillion parameters. The models being trained on frontier infrastructure today are larger still, and growing. No single chip has enough memory to hold a model of that size, which means training requires distributing the work across thousands of GPUs simultaneously. The central challenge of large-scale AI training is therefore not how fast any individual GPU can compute, but how fast the GPUs in a cluster can communicate with each other as they exchange gradients, synchronise weights, and pass intermediate results back and forth during the training loop.
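The memory arithmetic makes the point concrete. The parameter count below is the estimate quoted above; the per-GPU memory figure is an assumption in line with current-generation accelerators:

```python
# Why one GPU cannot hold a frontier model: weight memory alone.
params = 1.8e12           # estimated GPT-4-scale parameter count
bytes_per_param = 2       # 16-bit weights
hbm_per_gpu_gb = 192      # assumed HBM capacity of a current-generation accelerator

weights_gb = params * bytes_per_param / 1e9
min_gpus = -(-weights_gb // hbm_per_gpu_gb)   # ceiling division
print(f"weights: {weights_gb:,.0f} GB -> at least {min_gpus:.0f} GPUs just to hold them")
```

And that is only the weights. Training adds gradients, optimiser states, and activations, multiplying the memory footprint several times over, which is how the GPU count climbs from dozens into the thousands.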
This is where interconnect technology becomes the decisive variable, and why NVIDIA’s NVLink and NVSwitch represent a moat almost as significant as CUDA itself. Traditional servers connect their components via PCIe, the standard bus that links GPUs and CPUs. A PCIe Gen 5 x16 connection delivers approximately 128 gigabytes per second of bidirectional bandwidth, 64 in each direction. NVLink 5, NVIDIA’s current generation interconnect, delivers 1.8 terabytes per second of bidirectional bandwidth per GPU, more than 14 times the throughput of PCIe at the same generation. [37] This is not a refinement of the same approach. It is a categorically different speed of communication that changes what distributed AI workloads can do in practice.
NVSwitch extends this principle from pairs of GPUs to entire systems. Rather than allowing GPUs to communicate only with immediate neighbours, NVSwitch creates an all-to-all fabric where every GPU in a system can communicate with every other GPU simultaneously at full NVLink bandwidth, without any data needing to traverse the CPU or the system memory. A single NVSwitch chip acts as a non-blocking crossbar, meaning no GPU has to wait for another to finish transmitting before it can receive. In the GB200 NVL72 rack, 72 Blackwell GPUs connected via NVLink 5 and NVSwitch deliver 130 terabytes per second of aggregate system bandwidth and can be addressed as a single unified accelerator rather than as 72 discrete devices. [38] The Vera Rubin NVL72 pushes this further to 260 terabytes per second, with 72 GPUs effectively functioning as a single 3.6 exaflop compute engine.
The practical consequence for AI training is decisive. During a distributed training run, GPUs need to perform what are called collective operations, principally the all-reduce operation in which every GPU shares its locally computed gradient updates with every other GPU so they can collectively update the model weights. The speed at which this collective communication completes determines how long a training step takes, and therefore how long and how expensive a training run is. On PCIe-connected systems, this communication becomes a severe bottleneck as the number of GPUs scales. On NVLink-connected systems with NVSwitch fabrics, the communication overhead becomes small relative to the compute time, which is the only regime in which large-scale training is economically viable. Training a frontier model on a PCIe-connected cluster is not merely slower than doing it on an NVLink cluster. At sufficient scale, it may be practically impossible within a commercial time and cost budget. [39]
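A simple bandwidth-bound model makes the gap tangible. The sketch assumes a textbook ring all-reduce, ignores latency and overlap with compute, and uses an assumed model size, so treat it as intuition rather than a benchmark:

```python
# Bandwidth-bound estimate of one gradient all-reduce across 72 GPUs.
params = 70e9            # assumed 70-billion-parameter model
bytes_per_grad = 2       # 16-bit gradients
n = 72                   # GPUs participating

payload = params * bytes_per_grad              # bytes each GPU contributes
ring_bytes = 2 * (n - 1) / n * payload         # per-GPU traffic in a ring all-reduce

# Per-direction link bandwidth: PCIe Gen5 x16 vs NVLink 5.
for link, bandwidth in [("PCIe Gen5 x16", 64e9), ("NVLink 5", 900e9)]:
    print(f"{link}: {ring_bytes / bandwidth * 1e3:,.0f} ms per all-reduce")
```

Under these assumptions the same synchronisation step takes on the order of four seconds over PCIe and a third of a second over NVLink. Repeat it for every one of the hundreds of thousands of steps in a training run and the difference compounds into weeks of wall-clock time and millions of dollars of idle hardware.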
NVLink Fusion, launched in 2025, extends this ecosystem in a strategically important direction. It allows third-party CPUs and accelerators from companies including Qualcomm, Fujitsu, Marvell, and MediaTek to integrate into NVLink fabrics, which means that organisations building their own custom silicon can connect that silicon to NVIDIA’s interconnect infrastructure rather than building a competing fabric from scratch. This move pulls potential defectors partially back into the NVIDIA ecosystem even as they invest in custom chip development. The competing UALink standard, backed by AMD, Intel, Google, Microsoft, and others, published its initial specification in April 2025 and represents the industry’s best attempt to create an open alternative to NVLink’s proprietary fabric. [40] It remains in early stages, and closing the performance and ecosystem gap with NVLink represents a multi-year engineering challenge. Until UALink matures into a viable production alternative, the interconnect layer represents a second dimension along which NVIDIA’s competitive position is difficult to replicate at comparable performance.
12. The Arms Race Dynamic: Why Hyperscalers Cannot Stop Spending
The five largest hyperscalers, Amazon, Microsoft, Google, Meta, and Oracle, are projected to spend in excess of 600 billion dollars on capital expenditure in 2026, a 36% increase over already historic levels in 2025, with approximately 75% of that spend directly tied to AI infrastructure. [41] Amazon alone has committed 200 billion dollars in capital expenditure for the year, a figure that is expected to push its free cash flow into negative territory for the first time in years. Alphabet’s guidance has reached up to 190 billion dollars, matching Microsoft. Meta has committed up to 145 billion dollars. Big Tech AI capital expenditure has gone from 162 billion dollars in 2022 to a projected 700 billion dollars in 2026, and Goldman Sachs projects cumulative hyperscaler AI investment of 1.15 trillion dollars between 2025 and 2027 alone. [42]
To understand why this spending continues at a rate that is visibly straining cash flows and forcing recourse to debt markets, you need to understand the competitive terror that drives it. Google co-founder Larry Page was quoted saying he was willing to go bankrupt rather than lose this race. [43] Truist Securities lead internet analyst Youssef Squali articulated the logic clearly: whoever gets to artificial general intelligence first will have an incredible competitive advantage over everybody else, it is that fear of missing out that all these players are experiencing, and, in his view, it is the right strategy. The hyperscalers are not spending because they can clearly demonstrate return on investment from each marginal dollar of AI infrastructure. Several of them have acknowledged internally that they cannot. They are spending because the cost of being left behind by a competitor that did spend is existential in a way that the cost of overspending is not. An enterprise cloud business that trails a competitor in AI capability by two or three years faces a structural revenue problem that no efficiency programme can fix. An enterprise cloud business that spent too aggressively in 2026 and 2027 faces a cash flow problem that is uncomfortable but recoverable.
This asymmetry in outcomes is what makes the AI arms race structurally self-sustaining even in the face of genuine uncertainty about near-term ROI. Barclays analysts covering Meta noted that they were now modelling negative free cash flow for 2027 and 2028 and described this as somewhat shocking, while simultaneously maintaining an overweight rating on the stock. [44] The financial logic is that the cost of not competing outweighs the cost of the spending itself. Microsoft’s CFO Amy Hood acknowledged at earnings that despite the aggressive capital commitment, the company expected to remain capacity-constrained through at least 2026 as it worked to bring GPU, CPU, and storage infrastructure online. Demand is not the constraint. The ability to build and power facilities fast enough to meet it is. For NVIDIA, this dynamic is as favourable as it is possible for a supplier to be. Its primary customers are compelled to buy by competitive forces that operate independently of whether NVIDIA’s products deliver demonstrable returns on each specific deployment, and those customers have publicly committed to capital expenditure plans that lock in demand through the end of the decade.
13. The ROI Question Nobody Can Fully Answer
The most important unresolved question in the AI industry is whether the infrastructure being built at such extraordinary scale will generate returns commensurate with the investment. It is also the question that receives the least honest treatment in public discourse, largely because the people who know most about the answer have the strongest incentives to express confidence rather than uncertainty.
The current state of the evidence is genuinely mixed. Less than half of IT leaders surveyed said their AI projects were profitable in 2024, with a third breaking even and 14% recording losses. [45] The hyperscalers have not demonstrated positive ROI on their AI infrastructure investments at scale, even as revenue growth from AI-enabled cloud services has been strong. Microsoft’s Azure AI revenue grew 62% year on year in the most recent quarter. Google Cloud AI revenue grew 48%. Amazon’s Bedrock platform processed three times more API calls in the first quarter of 2026 than in all of 2025. [46] These are meaningful numbers, but they are revenue growth figures, not return on investment calculations. The capital being deployed is not being measured against these figures in any rigorous public accounting.
Goldman Sachs has flagged a structural risk that deserves more attention than it typically receives. If AI accelerators are purchased at 50,000 dollars per unit and depreciated over five years, but the next generation of hardware delivers dramatically better performance per dollar before that depreciation schedule expires, operators will be carrying the cost of assets that no longer drive the economic value they once did. Multiply that dynamic across hundreds of thousands of devices and the risk becomes a threat to the fundamental economics of the AI ecosystem. [47] The rate of GPU generation improvement has been rapid enough that early-vintage Hopper hardware is already substantially less competitive per dollar than current Blackwell hardware, and Vera Rubin is expected to continue that trajectory. For organisations that bought heavily at Hopper prices, the depreciation maths on those assets is becoming uncomfortable.
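The mismatch can be expressed in a few lines. The sketch compares straight-line book value against an assumed rate of performance-per-dollar improvement in successor hardware; the improvement rate is an illustration, not a forecast:

```python
# Straight-line book value vs. eroding economic value of an AI accelerator.
purchase_price = 50_000     # unit price from the Goldman framing
dep_years = 5               # straight-line depreciation schedule
perf_doubling_years = 2     # assumed: successors double perf-per-dollar every 2 years

for year in range(1, dep_years + 1):
    book = purchase_price * (1 - year / dep_years)
    economic = purchase_price / 2 ** (year / perf_doubling_years)
    print(f"year {year}: book ${book:>7,.0f}   market-equivalent ${economic:>7,.0f}")
```

Under these assumptions the asset sits on the balance sheet above its market-equivalent value for the first three years of a five-year schedule, and a faster improvement cadence widens the gap. That overhang, multiplied across hundreds of thousands of devices, is the risk Goldman is flagging.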
The most credible bullish case for ROI rests on inference rather than training. Training a frontier model is expensive and the return is diffuse: it produces a model that may or may not generate enough commercial value to justify the compute cost. Inference, by contrast, is an operational cost that scales with usage, and usage is clearly growing at a rate that makes the economics clearer. OpenAI went from 1 billion dollars in revenue in 2023 to more than 13 billion dollars in annualised revenue by early 2025, growth driven almost entirely by inference API calls and usage of deployed models. [48] Anthropic projects revenues in the range of 20 to 26 billion dollars in 2026. The inference monetisation wave appears to be arriving, and if it continues to accelerate, the ROI calculus shifts materially in favour of the investment. The question is whether it arrives fast enough and at sufficient scale to justify the pace of infrastructure investment, or whether the industry builds five years of capacity in two years and enters a period of overcapacity that compresses margins across the entire ecosystem. That question does not yet have a clear answer, and anyone who claims it does is expressing a position, not a fact.
14. Geopolitics: The Taiwan Risk the Market Partly Ignores
The entire advanced semiconductor supply chain, and therefore the entire AI infrastructure buildout, rests on a single geopolitical assumption: that Taiwan remains politically stable and that TSMC’s fabs in Hsinchu and Tainan continue to operate. This assumption is not unreasonable, but the risk of it being violated is real enough and consequential enough that any serious analysis of NVIDIA’s long-term position has to address it directly.
TSMC produces approximately 92% of the world’s most advanced chips at 7nm and below. [49] The company’s concentration in a single island of 36,000 square kilometres, 180 kilometres from the Chinese mainland, is the single largest systemic risk in global technology. Analysts have estimated that a disruption of Taiwan’s semiconductor output through conflict, blockade, or even a severe natural disaster could cost the global economy up to 2.5 trillion dollars per year in losses. [50] A Chinese military invasion or a sustained blockade of Taiwan would halt NVIDIA’s ability to produce its most advanced products entirely, since there is no alternative production capacity for Blackwell or Vera Rubin class chips at the required process nodes outside Taiwan. TSMC’s Arizona fabs, which began producing 4nm chips in 2024 and are scaling toward 3nm, represent a meaningful start at geographic diversification, but they currently represent a small fraction of TSMC’s total advanced node capacity and cannot substitute for Taiwan in any near-term scenario.
The geopolitical constraint on NVIDIA’s supply chain operates at two levels simultaneously. The first is the direct Taiwan risk described above, which most serious analysts assess as low probability in the near term but cannot dismiss as negligible. Defence strategy simulations run by the Center for Strategic and International Studies found that a Chinese amphibious invasion of Taiwan, even if ultimately defeated by US and allied forces, would involve catastrophic losses on all sides and multiple weeks of conflict that would paralyse semiconductor production regardless of the outcome. [51] The second level is the export restriction regime that the US government has imposed on advanced chip sales to China, which has already had material financial consequences for NVIDIA. The company forecasted a 5.5 billion dollar hit from restrictions on H20 chip sales to China, and that revenue has effectively disappeared overnight from the financial model. [52] Further tightening of export controls, which multiple US administrations have shown consistent appetite for, represents a permanent and unpredictable source of revenue risk for any company whose most advanced products are also the most strategically sensitive.
TSMC is responding to the Taiwan risk by diversifying its geographic footprint in ways that would have been unthinkable a decade ago. Beyond Arizona, TSMC has established a fab in Kumamoto, Japan, in partnership with Sony and Denso, and is in discussions regarding European facilities. In early 2025, TSMC announced a fresh 100 billion dollar investment to build five additional chip facilities in the US, a commitment that reflects both genuine strategic intent and the enormous political pressure being applied by the US government under successive administrations. [53] This diversification reduces the concentration risk over a decade-long horizon but does not eliminate the near-term dependency on Taiwan, where the deepest process technology expertise, the best-trained workforce, and the highest-yield production capacity remain overwhelmingly concentrated.
15. The Efficiency Threat: DeepSeek and the Model That Shook the Market
On January 20, 2025, a Chinese AI lab released a model that briefly erased 589 billion dollars of NVIDIA’s market capitalisation in a single trading session, the largest single-day market capitalisation loss for any company in stock market history. [59] The lab was DeepSeek, a unit of the quantitative hedge fund High-Flyer. The model was R1, a 671 billion parameter reasoning system trained using approximately 2,000 NVIDIA H800 GPUs, at a reported compute cost of around 6 million dollars. For comparison, Meta’s LLaMA 3 required over 16,000 H100 GPUs for training, and OpenAI’s GPT-4 cost an estimated 80 to 100 million dollars in compute. [60] R1 matched or exceeded GPT-4 level performance on multiple academic benchmarks including mathematics, coding, and structured reasoning tasks, and was released as open source under an MIT licence. The investor reaction was logical and immediate: if a Chinese lab could train a frontier-capable model for a fraction of the assumed cost, using older and less powerful hardware, did the entire premise of the multi-hundred-billion-dollar AI infrastructure buildout need to be reconsidered?
The technical reasons for DeepSeek’s efficiency are worth understanding specifically, because they point to a structural trend rather than an isolated achievement. R1 used a Mixture of Experts architecture in which only approximately 37 billion of its 671 billion parameters are active during any given inference pass, meaning the model delivers most of its capability while running a fraction of its total parameter count. [61] The training used FP8 mixed precision computation, which substantially reduces memory requirements and compute cost compared to standard FP32 training. It applied reinforcement learning from the base model directly rather than relying on expensive supervised fine-tuning at scale. It used multi-head latent attention to compress memory usage to between 5 and 13% of prior methods. And it developed a proprietary DualPipe algorithm that optimised GPU-to-GPU communication during training, reducing the interconnect overhead that typically scales with cluster size. Taken together, these are not tricks or shortcuts. They are genuine architectural advances that deliver more capability per GPU-hour, and the important observation is that they were developed by a team operating under US export restrictions on advanced hardware, which created a structural incentive to optimise for efficiency that well-resourced labs working with abundant H100 clusters did not face in the same way.
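The arithmetic behind the Mixture of Experts saving is worth seeing directly. It uses the standard rule of thumb that a forward pass costs roughly two floating-point operations per active parameter per token:

```python
# Per-token compute of an MoE model vs. a dense model of the same total size.
total_params = 671e9     # R1's total parameter count
active_params = 37e9     # parameters active for any given token

flops_dense = 2 * total_params   # ~2 FLOPs per parameter per token (rule of thumb)
flops_moe = 2 * active_params

print(f"active fraction: {active_params / total_params:.1%}")        # ~5.5%
print(f"per-token compute vs dense: {flops_moe / flops_dense:.1%}")  # ~5.5%, an ~18x saving
```

An architecture that serves most of the capability of 671 billion parameters for the per-token compute of 37 billion is, in effect, an eighteenfold efficiency gain on the metric that determines serving cost, before any of the precision or communication optimisations are counted.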
DeepSeek’s trajectory since R1 has continued in the same direction. DeepSeek V3.2, released in late 2025, pushed the architecture further. DeepSeek V4-Pro, released in April 2026, is a 1.6 trillion parameter model that benchmarks within a few points of the most capable Western models on coding tasks while pricing its API at 3.48 dollars per million output tokens, compared to 25 dollars for comparable Western frontier models, a sevenfold price gap at near-identical benchmark performance. [62]
The market’s panicked reaction to R1 was partly overcorrection and partly legitimate signal, and it is important to distinguish between the two. The overcorrection argument rests on Jevons Paradox, named after the 19th-century economist who observed that efficiency improvements in coal engines led to more coal consumption rather than less, because cheaper coal-powered processes enabled applications that were previously uneconomical. The same dynamic is structurally plausible in AI: if inference becomes cheaper per token, the rational response is to run more inference across more applications, not to run the same inference on less hardware. When DeepSeek’s API went live at its disruptive price point, AWS GPU reservation prices for H100 instances went up, not down, because demand for inference compute increased faster than the price reduction implied. [63] The total hardware required to serve a world in which AI inference is cheap and ubiquitous may be larger, not smaller, than the hardware required to serve a world in which AI inference is expensive and selective.
The legitimate signal is more nuanced and more important for NVIDIA specifically. If model efficiency improvements continue at the rate the evidence suggests, the compute required to train a model of any given capability level will continue to fall. This directly pressures NVIDIA’s training revenue, which represents the highest-margin portion of its data centre business and the segment most tightly tied to the frontier labs and hyperscalers that pay the highest prices for the most capable hardware. A world in which the next generation of frontier models can be trained on a few thousand GPUs rather than tens of thousands is a world in which the addressable market for NVIDIA’s highest-end training clusters contracts. The counterargument, which has historical support, is that frontier capability requirements will simply continue to expand to consume whatever compute is available, as labs chase performance levels that require more compute even as efficiency per unit of compute improves. The history of computing broadly supports this view, but it is an assumption about human ambition rather than a mathematical certainty. The honest position is that both forces are real, they operate simultaneously, and the net effect on NVIDIA’s training revenue over a five to ten year horizon is genuinely uncertain in a way that the current valuation does not fully price in.
There is also a geopolitical dimension to the DeepSeek story that deserves explicit acknowledgment. DeepSeek was built under the constraint of US export restrictions on advanced chips, and optimised for efficiency partly because it had no choice. DeepSeek’s R2 project, which attempted to train on Huawei Ascend chips to eliminate dependence on NVIDIA hardware entirely, was delayed by over six months due to hardware instability problems and eventually pivoted back to NVIDIA GPUs for the critical training phases. [64] This failure is a data point in favour of NVIDIA’s moat: even a well-resourced and technically sophisticated Chinese lab found NVIDIA’s ecosystem sufficiently superior that abandoning it for domestic alternatives was practically infeasible on an acceptable timeline. But DeepSeek’s efficiency achievements also demonstrate that constrained engineering can close a capability gap in ways that have direct implications for how much compute the world actually needs to build powerful AI, which is ultimately a question NVIDIA’s valuation cannot afford to answer incorrectly.
16. The Valuation Question: Is Any of This Priced Correctly?
NVIDIA’s valuation, which has briefly touched five trillion dollars and has settled in a range of three to four trillion dollars through early 2026, is simultaneously the most argued-about number in financial markets and the one where reasonable people can reach genuinely different conclusions depending on which assumptions they make about the sustainability of the AI buildout. [54]
At a trailing price to earnings ratio in the low fifties and a forward ratio of approximately 36 times earnings, NVIDIA trades at a premium to most semiconductor peers but below the highest-multiple comparable companies in the AI ecosystem. [55] The forward multiple of 36 times assumes that the current trajectory of data centre revenue growth, which has been running at 60 to 66% year on year, continues at rates high enough to grow into the valuation. NVIDIA’s revenue mix has shifted dramatically: data centre now represents 88% of total revenue at roughly 115 billion dollars in fiscal 2025, with gaming reduced to approximately 9%. [56] The company that most investors still conceptually associate with gaming chips is now primarily a data centre infrastructure business with margins that would be remarkable in any industry.
The bull case for the valuation rests on three compounding factors. First, the hyperscalers have collectively committed to capital expenditure that will sustain NVIDIA’s data centre revenue growth through at least 2027, and Goldman Sachs projects cumulative AI infrastructure investment of 7.6 trillion dollars through 2031. [57] Second, NVIDIA is simultaneously capturing the hardware revenue and building the software subscription and services layer that will generate recurring revenue independent of each hardware cycle. Third, new markets in sovereign AI, physical AI, and agentic computing extend the total addressable market well beyond the current data centre buildout in ways that are not yet reflected in current revenue.
The bear case is structurally coherent and deserves to be taken seriously. The current valuation assumes that data centre revenue growth continues at rates that have no historical precedent in hardware markets. It assumes no material disruption from custom silicon migration, from the model efficiency trend evidenced by DeepSeek and its successors, from inference efficiency improvements that reduce the compute required per query, from geopolitical disruption to the Taiwan supply chain, or from further export restriction tightening. It assumes that NVIDIA successfully makes the transition from a hardware-dominated revenue model to a platform and software model before hardware commodity economics begin to compress its margins. And it prices in a degree of sustained competitive dominance that very few technology companies have managed to maintain for the decade-length horizon that a 35 to 40 times forward earnings multiple implicitly requires. EPS grew approximately 70% in fiscal 2025 while the stock price increased less than 40%, which means the multiple has actually been compressing as earnings have expanded, a dynamic that is reassuring but sustainable only if the earnings growth trajectory is maintained. [58]
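The multiple-compression point is a two-line calculation using the approximate figures just quoted, with the price change treated as exactly 40% for illustration:

```python
# How a P/E multiple compresses when earnings grow faster than the share price.
eps_growth = 1.70      # EPS up ~70% in fiscal 2025
price_growth = 1.40    # share price up ~40% (illustrative; text says "less than 40%")

print(f"multiple change: {price_growth / eps_growth - 1:.0%}")  # about -18%
```

An 18% compression in a single year, while earnings grew 70%, is the market quietly discounting the durability of that growth even as it rewards the level.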
The honest answer is that NVIDIA’s valuation is defensible under a bullish but not implausible set of assumptions, and fragile under a sceptical but equally defensible alternative set. The most important variables to watch are whether inference monetisation grows fast enough to sustain hyperscaler capex commitment beyond 2027, whether custom silicon takes share in inference at a rate that accelerates meaningfully, and whether any single geopolitical or regulatory event removes a major market from NVIDIA’s addressable revenue base the way China restrictions already have. At three to four trillion dollars, the market is expressing confidence that none of those risks will materialise in a way that breaks the growth trajectory. That may be correct, but it is a bet on a future that is not yet assured.
17. The Strategic Takeaway
What NVIDIA represents is a masterclass in platform strategy worth understanding regardless of whether you approach it as an investor, an enterprise technology buyer, or someone building AI-native products. The company did not win because it made the best chip at any given moment in the competitive cycle. It won because it made a twenty-year investment in making its hardware indispensable through software, and then the world’s most consequential technology shift arrived and needed precisely what it had spent those two decades quietly building while gaming remained its primary market.
ASML built the machine that makes the machine, and holds a monopoly on that position so complete that even nation-states with unlimited resource commitments cannot replicate it on any relevant time horizon. TSMC built the operational expertise to run those machines at a scale and yield rate that no competitor has matched. NVIDIA, however, is the layer where raw silicon becomes programmable intelligence, and it has made that layer sticky enough through software depth, developer investment, and ecosystem scale that the entire AI industry currently runs on top of it and finds the cost of migration prohibitive.
The question for the next decade is not whether NVIDIA will remain relevant to the AI industry, because it clearly will. The question is whether the structural shift toward inference workloads, the growth of custom silicon programmes at the hyperscalers, and the gradual maturation of competing software platforms will erode the CUDA advantage fast enough to compress NVIDIA’s extraordinary margins before the company completes its transition into its next strategic position in physical AI and sovereign infrastructure. On current evidence, the moat is holding and the reinvestment machine is operating at full speed. NVIDIA understands the long game it is playing, which is why it is simultaneously selling chips, software subscriptions, interconnect systems, infrastructure partnerships, and sovereign AI programmes. It is not merely defending an existing position. It is continuously expanding the territory that requires defending, which is a fundamentally different strategic posture from every other company in this supply chain.
ASML and TSMC are extraordinary monopolies in physical production. NVIDIA is building a monopoly in the intellectual infrastructure of AI itself, and that distinction matters enormously for anyone trying to understand where value in this industry will accumulate over the next decade.
Andrew Baker is Group CIO at Capitec Bank. He writes about AI strategy, cloud infrastructure, and technology leadership at andrewbaker.ninja and publishes longer-form work at @futureherman on Substack.
References
[1] ASML (2025). ASML Annual Report 2025: Revenue, EUV Shipments, and Market Position.
[2] Works in Progress Magazine. The World’s Most Complex Machine.
[3] CNBC (2022). Inside ASML, the Company Advanced Chipmakers Use for EUV Lithography.
[4] Strange Ventures Review (2025). ASML’s 30-Year Monopoly: The Moonshot Bet No One Can Replicate.
[5] Tom’s Hardware (2026). ASML’s Roadmap for Chipmaking Lithography Tools Examined: From DUV to Hyper-NA.
[6] TrendForce (2025). ASML EUV Dominance and China’s Semiconductor Equipment Self-Sufficiency Push.
[7] The Motley Fool (2025). ASML Is the Silent Monopoly Behind the Entire Tech Industry.
[8] PredictStreet via Financial Content (2025). NVIDIA NASDAQ NVDA Deep Dive: AI Dominance and Future Frontiers.
[9] PredictStreet via Financial Content (2025). NVIDIA Supply Chain Constraints and Blackwell Production Challenges.
[10] Aidan Pak via Medium (2024). The CUDA Advantage: How NVIDIA Came to Dominate AI and the Role of GPU Memory in Large-Scale Model Training.
[11] Future Bridge (2025). The Rise of NVIDIA: How AI Hardware Became the Hottest Sector in US Semiconductors.
[12] AmiNext (2025). Why NVIDIA’s True Moat Isn’t Chips, But CUDA: An Investor’s Guide to the Ecosystem Wars.
[13] The Product Brief via Medium (2026). NVIDIA’s CUDA Moat: How Developer Lock-In Built a Trillion-Dollar AI Empire.
[14] Compute Forecast (2026). Why CUDA’s Software Moat Matters More Than Any GPU Spec.
[15] Klover AI (2025). NVIDIA AI Strategy: Analysis of Sustained Dominance in AI.
[16] Introl (2026). NVIDIA’s Unassailable Position: CUDA Moat and Competition Analysis.
[17] CNN Business (2026). How NVIDIA Became the First $5 Trillion Company, in 4 Charts.
[18] The Motley Fool (2026). I Nailed My NVIDIA Market Cap Prediction in 2025. Here’s Where I Predict It’s Going in 2026.
[19] Intellectia AI (2026). NVIDIA Hits $5 Trillion Market Cap on Soaring AI Chip Demand in 2026.
[20] International Data Corporation, cited in CNN Business (2026). NVIDIA 81% Data Centre Chip Market Share by Revenue.
[21] Built In (2026). The Next Wave of AI Infrastructure Must Target NVIDIA’s CUDA Moat.
[22] I/O Fund (2026). Nvidia Stock Prediction: The Path to a $20 Trillion Market Cap Is Strengthening. TrendForce custom silicon market share data cited therein.
[23] ABI Research (2026). NVIDIA’s Strategy: Dominating AI Through Ecosystem, Access, and Interconnect.
[24] The Product Brief via Medium (2026). NVIDIA’s CUDA Moat: Talent Pipeline and Developer Ecosystem Lock-In.
[25] I/O Fund (2026). Nvidia’s $20 Trillion Thesis Is Intact: Blackwell Revenue Projections for 2026.
[26] CNN Business (2026). Jensen Huang States $1 Trillion Revenue Visibility at GTC 2026.
[27] Intellectia AI (2026). NVIDIA Becomes First Chip Company to Cross $5 Trillion Market Capitalisation.
[28] CNN Business (2026). NVIDIA’s Sovereign AI Infrastructure Partnerships Across Europe.
[29] I/O Fund (2026). Physical AI and the Vera Rubin Architecture as NVIDIA’s Next Strategic Frontier.
[30] I/O Fund (2026). Nvidia’s $20 Trillion Thesis: Competitive Risks in Inference and Custom Silicon Growth.
[31] Navitas Semiconductor (2024). NVIDIA’s Grace Hopper Runs at 700W, Blackwell Will Be 1kW: How Is the Power Supply Industry Enabling Data Centers to Run These Advanced AI Processors?
[32] Navitas Semiconductor (2024). Blackwell Configurations Will Require 60kW to 120kW Per Rack; Fewer Than 5% of the World’s Data Centers Can Support Even 50kW Per Rack.
[33] EnkiAI (2026). NVIDIA’s AI Energy Demand: A 10GW Challenge in 2025.
[34] Goldman Sachs Global Institute (2026). Tracking Trillions: The Assumptions Shaping the Scale of the AI Build-Out.
[35] Futurum Group (2026). AI Capex 2026: The $690B Infrastructure Sprint. Microsoft Azure $80 billion power-constrained backlog cited therein.
[36] NVIDIA Corporation (2026). NVIDIA Blackwell Delivers 10x Throughput Per Megawatt for Mixture-of-Experts Models Versus Prior Hopper Generation. SEC Form 8-K FY2025.
[37] Continuum Labs (2024). NVLink Switch: Architecture and Bandwidth Specifications.
[38] IntuitionLabs (2026). NVIDIA NVLink Explained: A Guide to the GPU Interconnect. GB200 NVL72 system bandwidth specifications cited therein.
[39] APXML Advanced AI Infrastructure Course. High-Bandwidth Interconnects: NVLink, InfiniBand and Their Role in AI Cluster Architecture.
[40] IntuitionLabs (2026). UALink 200G 1.0 Specification Published by AMD, Intel, Google, Microsoft and Others as Open NVLink Alternative.
[41] CreditSights (2025). Technology: Hyperscaler Capex 2026 Estimates.
[42] Goldman Sachs (2025). Why AI Companies May Invest More Than $500 Billion in 2026: Cumulative Hyperscaler Investment Projections. Cited in Introl (2026).
[43] IEEE ComSoc Technology Blog (2025). Hyperscaler Capex Over $600bn in 2026: Larry Page Quoted on AI Race.
[44] CNBC (2026). Tech AI Spending Approaches $700 Billion in 2026; Barclays Models Negative Free Cash Flow for Meta in 2027 and 2028.
[45] CIO Dive (2025). Nvidia Shows Strong AI Demand as Enterprises Grapple With ROI. IBM Research: Less than half of IT leaders said AI projects were profitable in 2024, cited therein.
[46] Tech Insider (2026). Big Tech AI Spending: $700B Capex Race in 2026. Azure AI revenue 62% YoY, Google Cloud AI 48%, Amazon Bedrock API calls 3x in Q1 2026 cited therein.
[47] Goldman Sachs Global Institute (2026). Tracking Trillions: GPU Depreciation Risk and the Economics of Early-Vintage AI Hardware.
[48] I/O Fund (2025). Nvidia Stock and the AI Monetisation Supercycle No One Is Pricing In. OpenAI revenue trajectory from $1B in 2023 to $13B annualised cited therein.
[49] Franki Tabor (2025). US-China Trade Tensions and Taiwan’s Semiconductor Nexus. TSMC produces over 60% of all semiconductors and 92% of advanced chips at 7nm and below.
[50] Franki Tabor (2025). Analysts Warn Disruption of Taiwan’s Semiconductor Output Could Cost the Global Economy $2.5 Trillion in Annual Losses.
[51] AEI (2024). How Disruptive Would a Chinese Invasion of Taiwan Be? Center for Strategic and International Studies wargaming exercise cited therein.
[52] Verdantix (2025). War Games and Wafers: The Semiconductor Industry on a Geopolitical Edge. NVIDIA $5.5 billion forecasted hit from H20 export restrictions cited therein.
[53] Franki Tabor (2025). TSMC Announces $100 Billion Investment in Five Additional US Chip Facilities.
[54] Intellectia AI (2026). NVIDIA Hits $5 Trillion Market Cap: Valuation Range and Market Dynamics.
[55] XS Trading (2026). Nvidia Stock Price Prediction 2026: P/E Ratio, Forward Multiple, and Valuation Analysis.
[56] Winvesta (2026). NVIDIA Revenue Breakdown 2026: Gaming vs Data Centre Analysis. Data centre 88% of revenue at $115B in fiscal 2025 cited therein.
[57] Goldman Sachs Global Institute (2026). Tracking Trillions: $7.6 Trillion Cumulative AI CapEx Projected Between 2026 and 2031.
[58] Seeking Alpha (2026). Is Nvidia’s Valuation Still Justified? What Really Matters Heading Into 2026. EPS rose 70% in 2025 while stock price increased less than 40%, resulting in declining P/E multiple.
[59] Meta Intelligence (2026). DeepSeek V4 and R2 Deep Dive: R1 Release January 2025 and NVIDIA $589 Billion Single-Day Market Cap Loss.
[60] Bain and Company (2025). DeepSeek: A Game Changer in AI Efficiency? Training cost and GPU comparisons with GPT-4 and LLaMA 3 cited therein.
[61] Hypotenuse AI (2025). What Is DeepSeek R1 and Why Is It Making Waves in AI? Mixture of Experts architecture and training optimisation techniques detailed therein.
[62] Build Fast With AI (2026). DeepSeek V4-Pro Review: Benchmarks, Pricing and Architecture. 1.6 trillion parameter model at $3.48 per million output tokens.
[63] Remio AI (2025). Latest Updates and Industry Impact of DeepSeek V3.1 in 2025. AWS H100 GPU prices increased after DeepSeek release due to rising inference demand cited therein.
[64] Meta Intelligence (2026). DeepSeek R2 Delayed by Huawei Ascend Chip Instability; Project Pivoted Back to NVIDIA GPUs.