tl;dr: click here to view the GitHub repo.

On June 1, 2021, Google finally released TPU VMs to the public after quietly announcing the private alpha late last year.

Inside a TPU VM — 96 vCPU and 335 GB of RAM — prettyyy

The key difference between TPU VMs and the previous way of using TPUs (i.e. TPU Nodes) is that with TPU Nodes you never had direct access to the TPU host itself, and all communication had to take place over gRPC.

This meant that you had to set up a separate VM that then communicated with the TPU host, which often introduced problems that you would not otherwise run into when training on GPUs.
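To make the difference concrete, here is a minimal sketch of what direct access looks like, assuming JAX is installed on the machine you are logged into. On a TPU VM the TPU cores should show up as local devices, with no gRPC address to configure. The helper name `summarize_devices` is made up for this illustration; on a machine without a TPU the same code simply reports the CPU or GPU backend instead.

```python
import jax


def summarize_devices():
    """Return (platform, device id) pairs for every device JAX can see locally."""
    # jax.devices() lists the devices of the default backend.
    # On a TPU VM the platform is expected to be "tpu"; elsewhere "cpu" or "gpu".
    return [(device.platform, device.id) for device in jax.devices()]


if __name__ == "__main__":
    for platform, device_id in summarize_devices():
        print(f"{platform}:{device_id}")
```

With TPU Nodes, by contrast, the training process ran on your own VM and had to be pointed at the remote TPU host before it could see any of these devices.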

Tri Songz Nguyen

Chief Architect @ Growth Engine

