Google Announces New Eighth-Gen TPUs, Splitting Architecture to Power "Agentic Era"

Google Cloud officially declared the beginning of the "Agentic Era" at its annual Cloud Next '26 conference today, April 22, 2026, by announcing the launch of its eighth-generation Tensor Processing Units (TPUs). In a strategic shift, Google has split the architecture into two distinct, purpose-built chips: TPU 8t and TPU 8i, designed to handle the continuous reasoning and multi-step workflows required by autonomous AI agents.
The TPU 8t (Training) is engineered specifically to accelerate the development of massive frontier models. Google claims the chip delivers nearly 3x higher compute performance than the previous generation, significantly shrinking training timelines for complex AI.
Key technical milestones for TPU 8t include:
Massive Scale: A single superpod contains 9,600 chips, providing 121 exaflops of compute and two petabytes of shared memory.
Global Fabric: Utilizing the Virgo network and JAX/Pathways, Google can now link more than one million TPUs across multiple data center sites into a single training cluster.
Efficiency and Precision: The architecture introduces native 4-bit floating point (FP4) support to double throughput and reportedly offers twice the performance per watt compared to the prior generation.
Specialized Accelerators: The "SparseCore" engine handles the irregular memory-access patterns of embedding-heavy workloads, keeping the dense compute units from stalling during training.
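The superpod figures above imply some straightforward per-chip numbers. The sketch below is back-of-envelope arithmetic using only the totals quoted in the article (9,600 chips, 121 exaflops, 2 PB shared memory); the derived per-chip values are simple division, not official specifications.

```python
# Back-of-envelope per-chip math from the article's superpod figures.
# Inputs are the quoted totals; outputs are derived, not official specs.

chips_per_superpod = 9_600
superpod_exaflops = 121        # total superpod compute, as quoted
superpod_memory_pb = 2         # total shared memory, petabytes, as quoted

# 1 exaflop = 1,000 petaflops; 1 PB = 1,000,000 GB (decimal units).
per_chip_pflops = superpod_exaflops * 1_000 / chips_per_superpod
per_chip_memory_gb = superpod_memory_pb * 1_000_000 / chips_per_superpod

print(f"~{per_chip_pflops:.1f} PFLOPS per chip")               # ~12.6
print(f"~{per_chip_memory_gb:.0f} GB shared memory per chip")  # ~208
```

Roughly 12.6 PFLOPS and 208 GB of pooled memory per chip, if the quoted totals divide evenly across the pod.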
Designed specifically for the "agentic" compute loop—reasoning, planning, and execution—the TPU 8i (Inference) focuses on ultra-low latency and high-concurrency requests. Google reports an 80% improvement in performance per dollar for inference tasks compared to its predecessor.
Operational features of TPU 8i include:
Memory Architecture: The system triples on-chip SRAM to 384 MB and increases high-bandwidth memory (HBM) to 288 GB to host massive Key-Value (KV) Caches directly on silicon.
Collectives Acceleration Engine (CAE): This dedicated engine reduces on-chip latency by up to 5x, accelerating the reduction and synchronization steps essential for "chain-of-thought" processing.
Boardfly Topology: For the 8i, Google adopted a specialized Boardfly network topology to minimize all-to-all latency across clusters of up to 1,024 active chips.
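To see why 288 GB of HBM matters for hosting KV caches on-package, the sketch below sizes the cache for a hypothetical decoder model. The model dimensions (layers, heads, head size, sequence length, FP16 values) are illustrative assumptions, not specifications of any Google model; only the 288 GB HBM figure comes from the article.

```python
# Illustrative KV-cache sizing. Model dimensions below are assumptions
# for illustration only; just the 288 GB HBM figure is from the article.

def kv_cache_gib(layers: int, kv_heads: int, head_dim: int,
                 seq_len: int, bytes_per_elem: int = 2) -> float:
    """GiB needed for keys + values across all layers, one sequence."""
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem
    return per_token * seq_len / 2**30

# Hypothetical model: 80 layers, 8 KV heads (grouped-query attention),
# head_dim 128, FP16 cache entries, 128k-token context.
per_seq = kv_cache_gib(layers=80, kv_heads=8, head_dim=128, seq_len=128_000)
hbm_gib = 288 * 10**9 / 2**30  # the article's 288 GB HBM, in GiB

print(f"{per_seq:.1f} GiB per 128k-token sequence")          # 39.1
print(f"~{hbm_gib / per_seq:.0f} such sequences fit in HBM") # ~7
```

Even a single long-context sequence can consume tens of gigabytes of cache, which is why keeping the KV cache in on-package memory, rather than paging it over the network, matters for agentic inference latency.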
The new chips are integrated into Google’s AI Hypercomputer, a unified stack that combines the eighth-generation TPUs with the Virgo data center fabric and high-performance storage solutions like Google Cloud Managed Lustre, which now delivers 10 TB/s of bandwidth.
This announcement positions Google as a direct vertical competitor to both NVIDIA and Arm in the merchant and custom silicon markets. While NVIDIA remains a primary partner—Google also announced upcoming availability for NVIDIA Vera Rubin NVL72 systems—the TPU 8-series provides a sovereign alternative for customers deeply integrated into the Google Cloud ecosystem.
Both TPU 8t and 8i are expected to reach general availability later this year.
This article was generated with the support of our AI agent, which has been rigorously trained under the supervision of well-qualified journalists. While we strive for the highest quality in every article, if you find anything amiss, please contact us to let us know.