They are seeing 60k downloads of 6GB models per day, which works out to about 33Gbps of sustained bandwidth (assuming no burstiness in when people visit, which is a poor assumption).
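A quick back-of-envelope check of that figure (the 60k/day and 6GB numbers come from the thread; the rest is just unit conversion):

```python
# Back-of-envelope: sustained bandwidth for 60k downloads of a 6GB model per day.
downloads_per_day = 60_000
model_size_gb = 6  # gigabytes per download

seconds_per_day = 24 * 60 * 60  # 86,400

total_gigabits_per_day = downloads_per_day * model_size_gb * 8  # GB -> Gb
sustained_gbps = total_gigabits_per_day / seconds_per_day

print(f"{sustained_gbps:.1f} Gbps")  # ~33.3 Gbps, if traffic were perfectly even
```

Real traffic peaks well above that average, which is why the "no burstiness" caveat matters.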
That is starting to get out of the range of what is easily available. 10Gbps circuits are commodities (I had one at my desk at my last job), but 100Gbps circuits are still pretty pricey. And it's not necessarily trivial to get that kind of throughput out of a file server out of the box; this bandwidth is in the range of CPU <-> video card transfers, not disk <-> CPU or CPU <-> network. Some tweaking is for sure going to be necessary if you are self-hosting this, and now you're tuning network parameters and writing a custom file server instead of writing your game.
The cloud here is making something possible that otherwise wouldn't have been, which is pretty cool. Being able to go from zero infrastructure to 30Gbps of file serving without lifting a finger is somewhat impressive... but with that fast iteration time comes the entity that did all the work wanting its cut. That seems fair to me, though perhaps not economically viable. Such is life.
Wait... you mean the actual computation is running client-side in the browser? I didn't even open this "game", but I assumed the high cost was because there is a separate GPT-2 running on a GPU for each and every user.
They send it to Google Colab, which lets you run the model on a powerful server for free. 6GB of download, even at this crazy bandwidth cost, is still cheap versus provisioning that kind of beast of a server themselves for 60k people each day. I remember when I tried it, it used 10GB of memory, which was crazy!
You'd save on transferring the bytes around, but now you would have to self-host Jupyter and the GPUs it uses. That is going to be even more expensive than IP transit, because now you need enough 12GB GPUs in your datacenter to serve 60,000 sessions a day.
Like I said before, this is one of those things that wouldn't exist without the Cloud. If you run things on your user's computers, you have to send them a lot of bits. If you run things on your own computers, you're spared that bandwidth, but now have to have enough "computers" to satisfy your users. It's simply something that's not super cheap to run these days.
I will admit that it is surprising that Google <-> Google traffic is billed at the normal egress rates, but the reasoning does make sense -- a 30Gbps flow is nothing to sneeze at. That is using some tangible resources.
No, the models are running in Colab - but in the end-user's account - so each 'run' costs the lab 6GB of internet egress when the model is downloaded from GCS to the Colab VM.
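To put a rough number on that egress: the per-GB price below is an assumption for illustration (GCP internet egress has historically been on the order of $0.08–0.12/GB depending on tier and destination), while the run count and model size come from the thread.

```python
# Rough daily egress bill, assuming every run re-downloads the full 6GB model.
runs_per_day = 60_000
model_size_gb = 6
egress_price_per_gb = 0.12  # USD; assumed list-price tier, not a quoted figure

daily_egress_gb = runs_per_day * model_size_gb          # 360,000 GB/day
daily_cost_usd = daily_egress_gb * egress_price_per_gb  # ~$43,200/day at this rate

print(f"{daily_egress_gb:,} GB/day -> ${daily_cost_usd:,.0f}/day")
```

Even if the real negotiated rate is much lower, the order of magnitude explains why a lab would care who pays for each download.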