I think people just don't realize how big computers have gotten since 2006. A t2.micro was an OK desktop computer back then. Today you can have something 1000 times as big for a few tens of thousands of dollars. You can easily run a company that serves the whole of the US out of a closet.
It's just wild to me how seemingly nobody is exploiting this.
Our industry has really lost sight of reality and the goals we're trying to achieve.
Sufficient scalability, sufficient performance, and as much developer productivity as we can manage given the other two constraints.
That is the goal, not a bunch of cargo-culty complex infra. If you can achieve it with a single machine, fucking do it.
A monolith-ish app, running on e.g. an Epyc with 192 cores and a couple TB of RAM???? Are you kidding me? That is so much computing power, to the point where for a lot of scenarios it can replace giant chunks of complex cloud infrastructure.
And for something approaching a majority of businesses it can probably replace all of it.
(Yes, I know you need at least one other "big honkin server", located elsewhere, for failover. And yes, this doesn't work for all sets of requirements, etc.)
I feel this every day I talk with cloud-brained coworkers.
I manage an infrastructure with tens of thousands of VMs, and everyone is obsessed with autoscaling and clustering and every other thing the vendor sales dept shoved down their throats. Meanwhile they fail to realize that they could spend <5% of what we currently do and just use the datacenter cages we _already have_ and a big fat rack of 2S 9754 1U servers.
The kicker? These VMs are never more than 8 cores apiece, and applications never scale to more than 3 or 4 in a set, with sub-40% CPU utilization each. Most arguments against cloud abuse like this get ignored because VPs see Microsoft (Azure in this case) as some holy grail for everything, and I frankly don't have it in me to keep fighting application dev teams that don't know anything about server admin.
And that's without getting into absolutely asinine price/perf SaaS offerings like Cosmos DB.
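To put rough numbers on the consolidation argument, here is a back-of-the-envelope sketch. The inputs are assumptions, not measurements: 20,000 VMs stands in for "tens of thousands", 8 cores and 40% utilization are the upper bounds quoted above, 256 cores per host assumes a dual-socket EPYC 9754 (128 cores per socket), and the 60% target host utilization is a made-up headroom figure. It looks at CPU only and ignores RAM, storage, network, and failover capacity.

```python
def hosts_needed(vm_count, cores_per_vm, vm_util_pct,
                 cores_per_host, target_host_util_pct):
    """Rough host count from CPU demand alone (hypothetical helper)."""
    # Work in "core-percent" units so everything stays in integer math.
    busy_core_pct = vm_count * cores_per_vm * vm_util_pct
    usable_core_pct_per_host = cores_per_host * target_host_util_pct
    # Ceiling division: a partially used host still has to be bought.
    return -(-busy_core_pct // usable_core_pct_per_host)


# Assumed inputs, see the caveats above.
print(hosts_needed(vm_count=20_000, cores_per_vm=8, vm_util_pct=40,
                   cores_per_host=256, target_host_util_pct=60))
```

Under those assumptions the CPU demand fits in a few hundred hosts. Real sizing would be driven by whichever resource runs out first, which is often memory rather than cores.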
Well, the problem nowadays is that what can be done has become what must be done, totally bypassing the question of what should be done. So now a single service serving 5 million requests for a business is replaced by 20 microservices generating 150 million requests of traffic, with distributed transactions, logging (MBs of logs per request), monitoring, metrics and so on. All of it leads to massive infrastructure bloat. Do it for a dozen more applications and the future is cloudy now.
Once management is convinced by salespeople or consultants, any technical argument can be brushed away as not seeing the strategic big picture of managing enterprise infrastructure.
Yes. That's a risk assessment every company must make. What's the probability of downtime vs the development slowdown and the operating costs of a fully redundant infrastructure?
I worked for a payments company (think credit cards). We designed the system to maintain very high availability in the payment flow. Multi-region, multi-AZ in AWS. But all other flows such as user registration, customer care or even bill settlement had to stop during that one incident when our main datacenter lost power after a testing switch. The outage lasted for three hours and it happened exactly once in five years.
In that specific case, investing in higher availability by architecting in more redundancy would not have been worth it. We had more downtime caused by bad code and poorly thought-out deployments. But that risk equation will be different for everyone.
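For a sense of scale, the arithmetic behind that single incident can be sketched as below. It only counts the one three-hour power outage over five years; the downtime from bad code and deployments mentioned above is not quantified in the comment, so it is left out. The function names are hypothetical, and a year is taken as 365.25 days.

```python
HOURS_PER_YEAR = 365.25 * 24  # average year length, including leap days


def availability(downtime_hours, years):
    """Fraction of time the system was up over the given period."""
    return 1 - downtime_hours / (years * HOURS_PER_YEAR)


def downtime_budget_hours(target_availability, years):
    """Hours of downtime allowed while still meeting the target."""
    return (1 - target_availability) * years * HOURS_PER_YEAR


# One 3-hour outage in 5 years, versus a "four nines" budget.
print(f"{availability(3, 5):.5%}")
print(f"{downtime_budget_hours(0.9999, 5):.2f} hours")
```

Three hours in five years comes out to roughly 99.993% availability, which is inside a 99.99% budget of about 4.4 hours for the same period. That is the kind of comparison the risk equation boils down to before pricing in extra redundancy.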
Very few businesses live and die by their system uptime. Sure, downtime is bad, but having a recovery plan and good backups (or modest multi-site redundancy, if you're really worried) is sufficient for most.