Neural Radiance Fields (NeRF)
Neural Radiance Fields (NeRF) is a technique that uses neural networks to represent 3D scenes as continuous volumetric functions, enabling photorealistic synthesis of novel viewpoints from a set of input photographs. Introduced by Mildenhall et al. in 2020, NeRF demonstrated that a relatively simple neural network (a multilayer perceptron) could learn to encode the color and density at every point in 3D space, producing renderings of stunning quality.
The core idea is elegant. A NeRF model takes a 5D input, a 3D spatial coordinate (x, y, z) plus a 2D viewing direction (θ, φ), and outputs the emitted color and volume density at that point; density depends only on position, while color may vary with viewing direction. To render an image, rays are cast from the camera through each pixel, the network is evaluated at sample points along each ray, and its outputs are composited using the volume rendering equations. The viewing-direction input allows the model to capture view-dependent effects such as specular reflections and glossy highlights.
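The compositing step can be sketched directly from the volume rendering quadrature used in the original paper. The sketch below assumes sample placement and the network query have already produced per-sample densities and colors for one ray; only the compositing is shown.

```python
import numpy as np

def composite_ray(sigmas, colors, deltas):
    """Composite per-sample densities and colors along one ray using the
    standard NeRF volume-rendering quadrature.

    sigmas: (N,) volume densities at the N sample points
    colors: (N, 3) RGB colors at the sample points
    deltas: (N,) distances between adjacent samples
    """
    # Opacity of each segment: alpha_i = 1 - exp(-sigma_i * delta_i)
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Transmittance T_i: probability the ray reaches sample i unoccluded
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alphas)[:-1]])
    weights = trans * alphas  # contribution of each sample to the pixel
    return (weights[:, None] * colors).sum(axis=0), weights

# One ray, 4 samples: a dense "red surface" at the second sample
sigmas = np.array([0.0, 50.0, 0.0, 0.0])
colors = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]], float)
deltas = np.full(4, 0.25)
rgb, w = composite_ray(sigmas, colors, deltas)
# rgb is dominated by the red sample: nearly all weight falls on it,
# and samples behind the dense point are occluded (near-zero transmittance)
```

Because the weights are a product of opacity and transmittance, samples behind an opaque surface contribute almost nothing, which is how the same formula handles both solid surfaces and semi-transparent media.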
Training optimizes the neural network to reproduce the input photographs when rendered from their known camera positions: the rendering procedure is differentiable, so the photometric error between rendered and observed pixel colors can be minimized directly by gradient descent. No explicit 3D geometry is stored; the scene's structure is entirely encoded in the network weights. This implicit representation captures fine details, complex materials, and subtle lighting effects that challenge traditional reconstruction methods.
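To make the objective concrete, here is a toy NumPy loop that minimizes photometric mean-squared error by gradient descent. The "renderer" is a stand-in (a plain linear map from ray inputs to RGB, not a real NeRF); in practice the MLP and volume renderer are differentiated automatically by a framework such as PyTorch or JAX.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for "render rays from known cameras": a linear map from each
# ray's 5D input to an RGB value. A real NeRF replaces this with an MLP
# plus the volume-rendering step; the training objective is the same.
rays = rng.normal(size=(1024, 5))        # (x, y, z, theta, phi) per ray
true_params = rng.normal(size=(5, 3))
target_rgb = rays @ true_params          # "observed" pixel colors

params = np.zeros((5, 3))
lr = 0.01
for _ in range(500):
    pred = rays @ params
    # Photometric loss gradient: d/dparams of mean squared pixel error
    grad = 2.0 * rays.T @ (pred - target_rgb) / len(rays)
    params -= lr * grad

loss = np.mean((rays @ params - target_rgb) ** 2)
# loss approaches zero: the model has learned to reproduce the "photos"
```

The key point the sketch illustrates is that supervision is purely 2D (pixel colors); the 3D structure emerges because only a consistent scene can explain all viewpoints at once.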
The original NeRF was slow: training took a day or more per scene, and rendering a single frame took tens of seconds to minutes. Subsequent work dramatically improved both. Instant-NGP (Müller et al., NVIDIA, 2022) used multiresolution hash-grid encodings to train NeRFs in seconds to minutes and render in milliseconds. Zip-NeRF combined mip-NeRF 360's anti-aliased sampling with grid-based representations for both quality and speed. These advances brought NeRF from a research curiosity to a practical tool.
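The hash-grid idea can be sketched in a few lines. This is a simplified illustration in the spirit of Instant-NGP, not NVIDIA's implementation: table size, level count, and growth factor are illustrative, the feature tables are random here (in training they are learned parameters), and inputs are assumed normalized to [0, 1).

```python
import numpy as np

PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def hash_encode(xyz, n_levels=4, table_size=2**14, features=2,
                base_res=16, growth=1.5, seed=0):
    """Sketch of a multiresolution hash encoding: each 3D point (in [0,1))
    gathers trilinearly interpolated feature vectors from several hashed
    voxel grids of increasing resolution."""
    rng = np.random.default_rng(seed)
    tables = rng.normal(scale=1e-2, size=(n_levels, table_size, features))
    out = []
    for level in range(n_levels):
        res = int(base_res * growth ** level)
        scaled = xyz * res
        lo = np.floor(scaled).astype(np.uint64)   # voxel corner indices
        frac = scaled - lo                        # position inside the voxel
        feat = np.zeros((len(xyz), features))
        for corner in range(8):                   # 8 corners of the voxel
            offset = np.array([(corner >> d) & 1 for d in range(3)],
                              dtype=np.uint64)
            corner_idx = lo + offset
            # Spatial hash: XOR of coordinates multiplied by large primes
            h = (corner_idx * PRIMES).astype(np.uint64)
            idx = (h[:, 0] ^ h[:, 1] ^ h[:, 2]) % table_size
            # Trilinear interpolation weight for this corner
            w = np.prod(np.where(offset == 1, frac, 1.0 - frac), axis=1)
            feat += w[:, None] * tables[level][idx]
        out.append(feat)
    return np.concatenate(out, axis=1)  # (n_points, n_levels * features)

pts = np.random.default_rng(1).random((5, 3))  # sample points in [0, 1)
enc = hash_encode(pts)                          # shape (5, 4 levels * 2)
```

Because most of the scene's capacity lives in the lookup tables rather than in MLP weights, the network that follows the encoding can be tiny, which is where the large speedups come from.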
However, 3D Gaussian Splatting (2023) has emerged as a strong competitor, offering faster training, real-time rendering, and easier editing. The two approaches represent different tradeoffs: NeRFs excel at compact, continuous representations and view-dependent effects; Gaussian splats excel at speed and editability. Both are evolving rapidly, and hybrid approaches are emerging.
For spatial computing and mixed reality, NeRF and its successors solve a critical problem: turning real-world photographs into immersive 3D experiences. Combined with photogrammetry for initial camera estimation and neural rendering for real-time display, these techniques are making captured reality a viable content type alongside traditional 3D assets.