HBM vs Optical Interconnects

Comparison

High Bandwidth Memory (HBM) and optical interconnects are two foundational technologies addressing the bandwidth crisis in AI computing — but they operate at fundamentally different layers of the data movement hierarchy. HBM solves the memory wall: delivering terabytes per second of bandwidth between DRAM and processor cores across millimeters of silicon. Optical interconnects solve the network wall: moving data between chips, servers, and racks across meters to kilometers using photons instead of electrons. Together, they form the two most critical bandwidth technologies in modern AI datacenters, and their co-evolution will determine how far AI infrastructure can scale. This comparison examines where each technology excels, where they overlap, and how they complement each other in the AI compute stack.

Feature Comparison

Dimension	High Bandwidth Memory	Optical Interconnects
Primary Function	Processor-to-memory data transfer (on-package)	Chip-to-chip, rack-to-rack, and datacenter-scale data transport
Signal Medium	Electrical signals through through-silicon vias (TSVs) and silicon interposer	Photons through optical fiber and silicon photonic waveguides
Typical Reach	~5 mm (on-package via interposer)	Meters to 80+ km (fiber optic links)
Bandwidth (Current Gen)	HBM3e: roughly 1–1.2 TB/s per stack (8 TB/s aggregate across the eight stacks on NVIDIA B200); HBM4: up to 2 TB/s per stack via 2048-bit interface	800 Gb/s per port today; NVIDIA Quantum-X switches deliver 115 Tb/s aggregate (144 × 800G ports); 1.6T transceivers arriving 2026
Energy Efficiency	~2.5 pJ/bit (HBM3); improving to sub-2 pJ/bit with HBM4	5–6 pJ/bit (CPO in 2026); 0.05–0.2 pJ/bit for fiber transmission alone; targeting sub-1 pJ/bit end-to-end
Key Manufacturers	SK Hynix (market leader), Samsung, Micron	Broadcom, NVIDIA, Intel, Cisco (Acacia), Ayar Labs, Coherent
Manufacturing Process	3D DRAM die stacking with TSVs; advanced packaging on TSMC interposers	Silicon photonics fabricated on standard CMOS processes; co-packaged with switch/accelerator ASICs
Cost Profile	5–10× cost of standard DRAM per GB; major contributor to $30K–40K+ accelerator cost	800G transceivers ~$500–1,000 each; CPO expected to reduce per-port cost at scale by 2028
Supply Constraints	Severe: only 3 manufacturers; complex 3D stacking limits yields; most constrained AI component	Moderate: transceiver supply tightening with AI demand; CPO still ramping to volume production
Scaling Trajectory	HBM4 (2026): 2048-bit interface, 64 GB/stack; HBM4e (2027): 3.25 TB/s; roadmap extends to HBM8	1.6T transceivers (2026); CPO volume adoption 2028–2030; future optical memory interfaces targeting 100+ TB/s
Role in AI Training	Stores model weights and activations on-package; bandwidth determines batch size and model scale	Connects GPUs across the training cluster; bandwidth determines gradient synchronization speed
Role in AI Inference	Bandwidth-limited: token generation speed dominated by weight-reading from HBM	Enables distributed inference across multiple servers; critical for disaggregated inference architectures

Detailed Analysis

Different Layers of the Same Bandwidth Problem

The AI compute stack has a layered bandwidth hierarchy, and HBM and optical interconnects address different but equally critical levels. HBM operates at the innermost layer: the interface between processor cores and memory dies, separated by just a few millimeters of silicon. HBM achieves its extraordinary bandwidth (up to 8 TB/s per accelerator in multi-stack HBM3e configurations) by stacking DRAM dies vertically and connecting them with thousands of through-silicon vias, creating an ultra-wide 1024-bit or 2048-bit data bus per stack. Optical interconnects, by contrast, handle everything beyond the package edge: chip-to-chip links within a server, server-to-switch uplinks, and the datacenter fabric that ties thousands of AI accelerators into a single training cluster. Neither technology can substitute for the other: faster HBM cannot compensate for a slow network fabric, and faster optical links cannot help if the GPU is bottlenecked reading weights from memory.
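The link between bus width and bandwidth is simple arithmetic: peak bandwidth per stack is the bus width multiplied by the per-pin data rate. The sketch below illustrates this; the function name is hypothetical, and the per-pin rates (9.6 Gb/s and 8 Gb/s) are assumptions chosen for illustration rather than figures from this comparison.

```python
def stack_bandwidth_gbs(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth of one HBM stack in GB/s.

    Every pin moves pin_rate_gbps gigabits per second, so the whole bus
    moves bus_width_bits times that; dividing by 8 converts bits to bytes.
    """
    return bus_width_bits * pin_rate_gbps / 8


# Assumed per-pin data rates (illustrative only).
hbm3e_stack = stack_bandwidth_gbs(bus_width_bits=1024, pin_rate_gbps=9.6)
hbm4_stack = stack_bandwidth_gbs(bus_width_bits=2048, pin_rate_gbps=8.0)

print(f"1024-bit stack at 9.6 Gb/s per pin: {hbm3e_stack:.1f} GB/s")
print(f"2048-bit stack at 8.0 Gb/s per pin: {hbm4_stack:.1f} GB/s")
```

Under these assumptions, doubling the bus width lets a stack reach about 2 TB/s even at a lower per-pin rate, which is why interface width rather than signaling speed drives the generational jump.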

The Memory Wall vs. The Network Wall

AI workloads face two distinct bandwidth walls. The memory wall limits single-accelerator performance: during LLM inference at small batch sizes, each generated token requires reading the entire model's weights from HBM, making token generation rate roughly proportional to memory bandwidth. In this bandwidth-bound regime, NVIDIA's B200 with 8 TB/s of HBM3e bandwidth can generate tokens roughly 2.4× faster than the H100 with 3.35 TB/s, from the memory bandwidth improvement alone. The network wall limits multi-accelerator scaling: when training a model across thousands of GPUs, gradient synchronization requires all-reduce operations that push massive data volumes across the interconnect fabric. As clusters grow from hundreds to hundreds of thousands of accelerators, the aggregate network bandwidth must scale superlinearly, a challenge only optical interconnects can meet without the power budget consuming the entire datacenter.
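The memory-wall arithmetic can be sketched as a simple upper bound: if every generated token streams all weights once, the decode rate cannot exceed bandwidth divided by weight size. The function name, the 30-billion-parameter model, and the 2 bytes per parameter are hypothetical values chosen for illustration; real throughput also depends on batch size, KV-cache traffic, and compute limits.

```python
def decode_tokens_per_second(bandwidth_tbs: float,
                             params_billion: float,
                             bytes_per_param: float) -> float:
    """Upper bound on batch-1 decode rate for a memory-bound dense model."""
    weight_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_tbs * 1e12 / weight_bytes


# Hypothetical 30B-parameter model stored at 2 bytes per parameter (60 GB).
h100_rate = decode_tokens_per_second(3.35, params_billion=30, bytes_per_param=2)
b200_rate = decode_tokens_per_second(8.0, params_billion=30, bytes_per_param=2)

print(f"3.35 TB/s: {h100_rate:.1f} tokens/s upper bound")
print(f"8.00 TB/s: {b200_rate:.1f} tokens/s upper bound")
print(f"speedup:   {b200_rate / h100_rate:.2f}x")
```

The speedup depends only on the bandwidth ratio (8 / 3.35, about 2.4), so the model size cancels out as long as the workload stays memory-bound.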

Energy Efficiency at Different Scales

Both technologies are engaged in a race to reduce energy per bit, but at very different operating points. HBM3 achieves approximately 2.5 pJ/bit for memory access, a 68% improvement over GDDR6X, and HBM4 is pushing below 2 pJ/bit through shorter TSV paths and optimized signaling. Optical interconnects tell a more nuanced story: the fiber transmission itself is extraordinarily efficient (0.05–0.2 pJ/bit), but the electro-optical conversion at each end (the lasers, modulators, photodetectors, and transimpedance amplifiers) adds significant overhead. Current pluggable transceivers consume roughly 15 pJ/bit end-to-end, but co-packaged optics (CPO) have driven this down to 5–6 pJ/bit by placing the optical engine next to the ASIC, which shortens the electrical path and removes the need for power-hungry long-reach SerDes and retiming circuits. NVIDIA reports that CPO reduces networking power consumption by up to 3.5× compared to pluggable modules. For AI datacenters where interconnect power can consume 20–30% of total facility power, this efficiency gain is transformative.
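Energy per bit translates directly into power once a line rate is fixed: power equals bits per second times joules per bit. The sketch below applies this to a single 800 Gb/s port using the figures quoted above; the function name is hypothetical, and 5.5 pJ/bit is an assumed midpoint of the 5–6 pJ/bit range.

```python
def link_power_watts(rate_gbps: float, energy_pj_per_bit: float) -> float:
    """Power drawn by a link: (bits per second) x (joules per bit)."""
    bits_per_second = rate_gbps * 1e9
    joules_per_bit = energy_pj_per_bit * 1e-12
    return bits_per_second * joules_per_bit


pluggable_w = link_power_watts(800, 15.0)  # ~15 pJ/bit pluggable transceiver
cpo_w = link_power_watts(800, 5.5)         # assumed midpoint of 5-6 pJ/bit for CPO

print(f"pluggable: {pluggable_w:.1f} W per 800G port")
print(f"CPO:       {cpo_w:.1f} W per 800G port")
print(f"ratio:     {pluggable_w / cpo_w:.2f}x")
```

A few watts per port looks small in isolation, but it is multiplied by every port in a fabric with many thousands of links.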

Supply Chain Dynamics and Strategic Risk

HBM faces one of the most constrained supply chains in the semiconductor industry. Only three companies (SK Hynix, Samsung, and Micron) can manufacture it, and the complex 3D stacking process yields fewer good units per wafer than standard DRAM. HBM has become the single most supply-constrained component in AI infrastructure, often more so than the GPUs themselves. SK Hynix commands roughly 50% market share and has seen its revenue multiply as AI demand surged. Optical interconnects have a broader but still concentrated vendor ecosystem: Broadcom leads in switch ASICs and CPO integration (its Tomahawk 6-Davisson is the first 102.4 Tbps CPO switch), while companies like Coherent (formerly II-VI) and Lumentum supply the optical components. The transition to CPO introduces new supply chain dependencies on silicon photonics fabrication, which relies on TSMC and GlobalFoundries process nodes, adding another layer to an already complex AI supply chain.

Convergence: When Memory Meets Optics

Perhaps the most exciting development is the emerging convergence of these two technologies. Researchers and companies are developing optical memory interfaces that would use photonic links to connect HBM stacks to processors, potentially achieving bandwidths exceeding 100 TB/s — an order of magnitude beyond current electrical interposers. This would effectively extend HBM's reach from millimeters to meters, enabling disaggregated memory architectures where pools of HBM could be shared across multiple accelerators via optical fabric. NVIDIA's roadmap hints at this direction: its 2026 platforms integrate CPO for switch-level interconnects, and future generations may bring optical I/O directly onto GPU packages. Intel has demonstrated a fully integrated optical I/O chiplet that could serve as a bridge between on-package memory and optical fabric. If realized, this convergence would dissolve the current boundary between the memory hierarchy and the network hierarchy entirely.

Investment and Market Trajectory

Both technologies are experiencing explosive market growth driven by AI demand. The HBM market is projected to exceed $25 billion by 2026, driven by insatiable demand from NVIDIA, AMD, and hyperscaler custom accelerators. The optical interconnect market for AI datacenters reached $9.94 billion in 2025 and is projected to hit $31 billion by 2033, with 800G transceiver shipments doubling year-over-year in 2025 alone. The critical difference is maturity: HBM is a proven, volume-production technology with a clear generational roadmap (HBM4 through HBM8), while CPO — the most transformative optical interconnect architecture — is just entering initial commercial deployment in 2026, with large-scale adoption not expected until 2028–2030. For infrastructure planners, HBM choices are constrained by what accelerator vendors offer, while optical interconnect decisions involve active architecture choices between pluggable modules, linear-drive optics, and co-packaged solutions.

Best For

Single-GPU Inference Performance

HBM

Token generation speed in LLM inference is directly limited by HBM bandwidth. A higher-bandwidth HBM generation (e.g., HBM3e vs. HBM3) delivers roughly proportional speedups for memory-bound decoding. Optical interconnects play no role in single-device inference.

Multi-Node Training at Scale (1,000+ GPUs)

Optical Interconnects

Training clusters spanning multiple racks require optical fabric for gradient synchronization. At 1,000+ GPU scale, all-reduce traffic makes interconnect bandwidth, not memory bandwidth, the primary scaling bottleneck, and optical links are the way to relieve it.
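To see why all-reduce bandwidth dominates at this scale, consider a bandwidth-only estimate for one common algorithm, ring all-reduce, in which each GPU sends and receives 2(N-1)/N times the payload. This is a sketch under stated assumptions: the text does not specify which all-reduce algorithm is used, the function name is hypothetical, latency and overlap with compute are ignored, and the 20 GB gradient payload is an illustrative value.

```python
def ring_allreduce_seconds(num_gpus: int,
                           payload_bytes: float,
                           link_gbps: float) -> float:
    """Bandwidth-only time estimate for a ring all-reduce.

    Each GPU transfers 2 * (N - 1) / N times the payload over its link
    (reduce-scatter followed by all-gather). Per-hop latency is ignored.
    """
    bytes_on_wire = 2 * (num_gpus - 1) / num_gpus * payload_bytes
    link_bytes_per_second = link_gbps * 1e9 / 8
    return bytes_on_wire / link_bytes_per_second


payload = 20e9  # hypothetical: 10B gradients at 2 bytes each

t_800g = ring_allreduce_seconds(1024, payload, link_gbps=800)
t_1600g = ring_allreduce_seconds(1024, payload, link_gbps=1600)

print(f"800G link per GPU:  {t_800g * 1000:.0f} ms per all-reduce")
print(f"1.6T link per GPU:  {t_1600g * 1000:.0f} ms per all-reduce")
```

In this model the synchronization time is set almost entirely by per-GPU link bandwidth, so doubling the port speed halves the estimate regardless of HBM bandwidth.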

Large Model Hosting (100B+ Parameters)

Both Critical

Models exceeding single-GPU HBM capacity require tensor parallelism across multiple devices, making both HBM bandwidth (for per-GPU weight loading) and optical interconnects (for inter-GPU tensor communication) simultaneously critical.
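A rough capacity check shows when a model spills across devices. The sketch below estimates the minimum number of GPUs needed just to hold the weights; the function name is hypothetical, the per-GPU HBM capacities (80 GB and 192 GB) and the 80% usable fraction reserved for weights are assumptions for illustration, and real deployments also budget for KV cache and activations.

```python
import math


def min_gpus_for_weights(params_billion: float,
                         bytes_per_param: float,
                         hbm_gb_per_gpu: float,
                         usable_fraction: float = 0.8) -> int:
    """Smallest GPU count whose combined usable HBM holds the model weights."""
    weight_gb = params_billion * bytes_per_param  # 1e9 params * bytes / 1e9 bytes per GB
    usable_gb = hbm_gb_per_gpu * usable_fraction
    return math.ceil(weight_gb / usable_gb)


# Hypothetical 100B-parameter model at 2 bytes per parameter (200 GB of weights).
gpus_80gb = min_gpus_for_weights(100, 2, hbm_gb_per_gpu=80)
gpus_192gb = min_gpus_for_weights(100, 2, hbm_gb_per_gpu=192)

print(f"80 GB per GPU:  at least {gpus_80gb} GPUs")
print(f"192 GB per GPU: at least {gpus_192gb} GPUs")
```

As soon as the result exceeds one, every forward pass involves inter-GPU communication, which is where the interconnect enters the picture alongside HBM.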

Datacenter Power Optimization

Optical Interconnects

Co-packaged optics reduce networking power by up to 3.5× vs. pluggable modules. At datacenter scale, where interconnect power is 20–30% of total consumption, the transition from pluggable modules to co-packaged optics yields the largest absolute power savings.
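Combining the two figures above gives a back-of-the-envelope bound on facility-level savings. The sketch assumes, optimistically, that the full interconnect share benefits from the full 3.5× reduction, so the results are upper bounds rather than predictions; the function name is hypothetical.

```python
def facility_power_saving(interconnect_share: float,
                          reduction_factor: float) -> float:
    """Fraction of total facility power saved when the interconnect share
    of that power is cut by reduction_factor."""
    return interconnect_share * (1 - 1 / reduction_factor)


low = facility_power_saving(0.20, 3.5)
high = facility_power_saving(0.30, 3.5)

print(f"interconnect at 20% of power: up to {low:.1%} of facility power saved")
print(f"interconnect at 30% of power: up to {high:.1%} of facility power saved")
```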

AI Accelerator Cost Reduction

HBM

HBM is the single largest cost component in AI accelerators at 5–10× the price of standard DRAM. Yield improvements and generational density increases in HBM have the greatest impact on reducing per-accelerator cost.

Edge AI / On-Device Inference

HBM

Edge deployments typically use single accelerators where memory bandwidth determines performance, though cost and power budgets mean many edge devices rely on LPDDR or GDDR rather than HBM itself. Optical interconnects are unnecessary at edge scale, where electrical connections between components span only centimeters.

Disaggregated / Composable Infrastructure

Optical Interconnects

Future architectures that pool memory and compute resources across a fabric require ultra-low-latency optical links. Optical memory interfaces targeting 100+ TB/s could enable shared HBM pools — a paradigm shift only optics can deliver.

Next-Generation AI Supercomputers (100K+ GPUs)

Both Critical

Frontier-scale systems assembled from rack-scale building blocks such as NVIDIA's GB200 NVL72, and from future Rubin platforms, require both maximum HBM bandwidth per accelerator and optical fabric to connect hundreds of thousands of GPUs. Neither technology alone is sufficient at this scale.

The Bottom Line

High Bandwidth Memory and optical interconnects are not competing technologies — they are complementary solutions to different layers of AI's bandwidth crisis. HBM is the critical near-term technology: it directly determines single-accelerator performance for the memory-bandwidth-bound workloads that dominate AI inference, and its constrained supply chain makes it the pacing factor for AI infrastructure buildout today. Optical interconnects are the critical scaling technology: as training clusters grow beyond thousands of accelerators and datacenter power budgets tighten, only photonic solutions can deliver the bandwidth density and energy efficiency needed to sustain AI's exponential growth. The most important trend to watch is their convergence — optical memory interfaces and photonic I/O chiplets could merge these two technology domains, enabling disaggregated architectures where HBM pools are connected via optical fabric. For infrastructure planners, the practical advice is straightforward: HBM choices are dictated by your accelerator vendor, but optical interconnect architecture decisions (pluggable vs. CPO, InfiniBand vs. Ethernet) remain active design choices with long-term implications for datacenter scalability and power efficiency.
