
Maybe only for their own models


Now any Google customer can use Trillium for training any model?


[Google employee] Yes, you can use TPUs in Compute Engine and GKE, among other places, for whatever you'd like. I just checked and the v6 are available.
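
For anyone who wants to try it, here's a minimal sketch of creating a TPU node with the google-cloud-tpu Python client. The accelerator-type and runtime-version strings for Trillium (v6e) are my assumptions, as are the project and zone values; check the TPU docs for what's actually available in your zone.

    # Minimal sketch: create a TPU VM node via the google-cloud-tpu client.
    # pip install google-cloud-tpu
    from google.cloud import tpu_v2

    client = tpu_v2.TpuClient()

    project = "my-project"   # hypothetical project ID
    zone = "us-central1-a"   # hypothetical zone; v6e availability varies
    parent = f"projects/{project}/locations/{zone}"

    node = tpu_v2.Node(
        accelerator_type="v6e-8",           # assumed Trillium type string
        runtime_version="v2-alpha-tpuv6e",  # assumed runtime; check the docs
    )

    operation = client.create_node(
        parent=parent,
        node_id="my-trillium-node",
        node=node,
    )
    print(operation.result())  # blocks until the node is ready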


Is there not going to be a v6p?


Can't speculate on futures, but here's the current version log ... https://cloud.google.com/tpu/docs/system-architecture-tpu-vm...


Google trained Llama-2-70B on Trillium chips


I thought Llama was trained by Meta.


> Google trained Llama

Source? This would make quite the splash in the market


It's in the article: "When training the Llama-2-70B model, our tests demonstrate that Trillium achieves near-linear scaling from a 4-slice Trillium-256 chip pod to a 36-slice Trillium-256 chip pod at a 99% scaling efficiency."
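
Spelling out the arithmetic in that quote (my reading, not the article's): a 4-slice pod of Trillium-256 is 1,024 chips and a 36-slice pod is 9,216, so perfectly linear scaling would be a 9x speedup, and 99% efficiency means about 8.9x in practice.

    # Back-of-the-envelope check of the scaling claim quoted above.
    chips_per_slice = 256
    small_pod = 4 * chips_per_slice    # 1,024 chips
    large_pod = 36 * chips_per_slice   # 9,216 chips

    ideal_speedup = large_pod / small_pod  # 9.0x if scaling were perfectly linear
    efficiency = 0.99                      # "99% scaling efficiency" per the article
    print(f"observed ~{ideal_speedup * efficiency:.2f}x vs ideal {ideal_speedup:.0f}x")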


I'm pretty sure they're doing fine-tune training, using Llama because it is a widely known and available sample. They used SDXL elsewhere for the same reason.

Llama 2 was released well over a year ago and was trained by Meta in partnership with Microsoft.


They can just train another one.


Llama 2's final weights are public. The data used to train it, and even the process used to train it, are not. Google can't just train another Llama 2 from scratch.

They could train something similar, but it'd be super weird if they called it Llama 2. They could call it something like "Gemini", or if it's open weights, "Gemma".


The article says they used maxtext to load the weights and pretrain on additional data. It looks like the instructions for doing that are here: https://github.com/AI-Hypercomputer/maxtext/blob/main/gettin...
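
For reference, a continued-pretraining launch in MaxText looks roughly like the sketch below. The config keys reflect my reading of the repo and may have changed; the gs:// paths, run name, and step count are hypothetical placeholders.

    # Rough sketch: continued pretraining on converted Llama-2-70B weights
    # with MaxText. Keys are my best reading of the repo's configs; all
    # gs:// paths and the run name are hypothetical.
    import subprocess

    subprocess.run(
        [
            "python3", "MaxText/train.py", "MaxText/configs/base.yml",
            "model_name=llama2-70b",
            # checkpoint previously converted from Meta's released weights:
            "load_parameters_path=gs://my-bucket/llama2-70b/checkpoint",
            "base_output_directory=gs://my-bucket/maxtext-runs",
            "run_name=llama2-continued-pretrain",
            "dataset_path=gs://my-bucket/my-dataset",
            "steps=1000",
            "per_device_batch_size=1",
        ],
        check=True,
    )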


They don't mean literally LLaMA. They mean a model with the same architecture.



