Google unveils its 8th-generation TPUs for AI training and inference

Google has just unveiled its 8th generation of TPUs (Tensor Processing Units), announcing two separate chips: one for AI training and one for inference. These are Google’s TPU 8t and TPU 8i. These workload-specialised chips aim to deliver more performance per dollar, with higher raw performance and better power efficiency at scale.

Both of these new TPUs will be available later this year and can be used as part of Google’s AI Hypercomputer platform.

Separate chips for AI Training and Inference

With TPU 8t, Google aimed to deliver more computational throughput and maximise interchip bandwidth. Google’s new 8t Superpods now scale to 9,600 chips with 2 petabytes of shared HBM memory and 3x the computational performance per pod of its last-generation Superpods.
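As a rough sanity check on those Superpod figures, the sketch below divides the quoted pod-level HBM by the quoted chip count to estimate the average memory per chip. The decimal unit conversion (1 PB = 1,000,000 GB) and the helper name `hbm_per_chip_gb` are assumptions made for illustration, and since "2 petabytes" is a rounded figure, the result is only approximate.

```python
# Back-of-envelope estimate: average HBM per chip in a TPU 8t Superpod.
# Inputs are the figures quoted in the article.
# Assumption: decimal units, i.e. 1 PB = 1,000,000 GB.
POD_CHIPS = 9_600
POD_HBM_PB = 2


def hbm_per_chip_gb(pod_hbm_pb: float, chips: int) -> float:
    """Return the average HBM per chip in GB (hypothetical helper)."""
    return pod_hbm_pb * 1_000_000 / chips


per_chip = hbm_per_chip_gb(POD_HBM_PB, POD_CHIPS)
print(f"Roughly {per_chip:.0f} GB of HBM per chip")
```

This is an average derived from rounded numbers, not an official per-chip specification.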

With 10x faster storage access and Google’s new TPUDirect feature, data can now be pulled directly into the TPUs and delivered faster. This enables higher levels of hardware utilisation, boosting performance.

With its new TPU 8t, Google aims to reduce the AI model development cycle from months to weeks. This chip is targeted at training, where raw computational power is vital.

Big upgrades over Ironwood

With the TPU 8i, Google pairs 288GB of HBM memory with 384MB of SRAM. This is 3x the SRAM of Google’s last-generation Ironwood chip. This allows Google’s new TPU to keep more data on chip, improving performance.

With 8i pods, Google has doubled the number of CPU hosts per server and started using its new Axion ARM CPUs. These changes deliver increased full-system performance. Google has also minimised lag with new on-chip Collectives Acceleration Engines (CAEs), which reduce on-chip latencies by 5x.

Overall, these changes reportedly deliver 80% more performance per dollar to users. Google claims that businesses can serve nearly twice as many customers at the same cost as before.
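To see how an 80% gain in performance per dollar maps onto "nearly twice as many customers", here is a minimal sketch of the arithmetic. It assumes that the number of customers served scales linearly with delivered performance, which is a simplification. The function names are hypothetical and exist only for this illustration.

```python
# Sketch: what an 80% performance-per-dollar gain implies.
# Assumption: customers served scale linearly with delivered performance.


def capacity_at_same_cost(perf_per_dollar_gain: float) -> float:
    """Relative number of customers served for an unchanged budget."""
    return 1.0 + perf_per_dollar_gain


def cost_for_same_load(perf_per_dollar_gain: float) -> float:
    """Relative cost of serving an unchanged number of customers."""
    return 1.0 / (1.0 + perf_per_dollar_gain)


gain = 0.80  # the "80% more performance per dollar" figure from the article
capacity = capacity_at_same_cost(gain)
cost = cost_for_same_load(gain)

print(f"{capacity:.1f}x the customers at the same cost")
print(f"{cost:.0%} of the previous cost for the same load")
```

Under that assumption, the same budget serves 1.8x as many customers, which is consistent with the "nearly twice" claim. Equivalently, an unchanged workload would cost a little over half as much as before.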

With its 8th-generation TPU technology, Google has delivered boosted performance and increased value. That said, developing two separate chips moving forward will likely increase development costs significantly. It also means that its new chips are a little less versatile than its last-generation TPU, which was made for both inference and training. While purpose-built hardware has its advantages, more generalised hardware can be more readily repurposed as workflows change. AI is an evolving market, so what’s optimal today could change tomorrow.

You can join the discussion on Google’s new 8th-generation TPUs on the OC3D Forums.

Mark Campbell

A Northern Irish father, husband, and techie that works to turn tea and coffee into articles when he isn’t painting his extensive minis collection or using things to make other things.
