This raises so many questions that I can't even fully take them in, and that worries me. Please, if well-formed responses or discussions come out of this, somebody link them back here. A very, very interesting read.
The basic point is:
You think your dataset is big, but it probably isn't. A well-written C program running on a modern laptop can handle a graph of a few billion nodes just fine. Before paying for 128 cores running a "scalable" implementation, first check your dataset size. Is it in the terabytes? If not, you're probably fine running your computations on a single core.
Google and Facebook need cluster-computing solutions like MapReduce and Spanner. You most likely don't, and are years away from needing one. Even a successful startup like Snapchat or Tinder could probably handle all of its analytics and graph-processing needs on a single core. So worry about the other things first.
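As a rough sense of what "a well-written C program on a laptop" can look like here, a minimal sketch (not the article's actual code) of single-threaded connected components over a binary edge list using union-find; the file name "edges.bin", the node-count bound, and the uint32 edge format are all assumptions for illustration:

    /* Sketch: single-threaded connected components via union-find.
       Assumes "edges.bin" (hypothetical) holds pairs of uint32_t node ids
       and that ids stay below MAX_NODES. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define MAX_NODES 200000000u   /* assumed upper bound on node ids */

    static uint32_t *parent;

    static uint32_t find(uint32_t x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  /* path halving */
            x = parent[x];
        }
        return x;
    }

    int main(void) {
        parent = malloc((size_t)MAX_NODES * sizeof *parent);
        if (!parent) { perror("malloc"); return 1; }
        for (uint32_t i = 0; i < MAX_NODES; i++) parent[i] = i;

        FILE *f = fopen("edges.bin", "rb");
        if (!f) { perror("fopen"); return 1; }

        uint32_t e[2];
        while (fread(e, sizeof(uint32_t), 2, f) == 2) {
            uint32_t a = find(e[0]), b = find(e[1]);
            if (a != b) parent[a] = b;      /* union (rank omitted for brevity) */
        }

        fclose(f);
        free(parent);
        return 0;
    }

The whole working state is one 800 MB array for 200 million nodes, and the edges are just streamed past it once, which is why this sort of thing fits comfortably on a laptop.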
Seems to me that there is more to it than this, though. The overhead that many algorithms add for scalability has a sizeable cost, and that cost hasn't shrunk at anything like the rate at which Moore's Law has grown what can be done on a single local machine.
There is a similar passage in Knuth's Stanford GraphBase, where he examines several graph algorithms and finds that the more advanced ones have better asymptotic behavior, but the simpler algorithms outperform them at realistic input sizes.
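The standard toy version of that constant-factor effect (my own example, not one from the GraphBase): insertion sort is O(n^2) but tends to beat the library's O(n log n) qsort on small arrays, which is exactly why production sorts switch to it for small subarrays. Something like this shows the crossover; N and REPS are arbitrary choices:

    /* Compare insertion sort against stdlib qsort on small arrays.
       Try N in the 16-64 range to see the simple algorithm win. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    static void insertion_sort(int *a, int n) {
        for (int i = 1; i < n; i++) {
            int key = a[i], j = i - 1;
            while (j >= 0 && a[j] > key) { a[j + 1] = a[j]; j--; }
            a[j + 1] = key;
        }
    }

    static int cmp_int(const void *p, const void *q) {
        int x = *(const int *)p, y = *(const int *)q;
        return (x > y) - (x < y);
    }

    int main(void) {
        enum { N = 32, REPS = 1000000 };
        int a[N], b[N];
        srand(42);
        for (int i = 0; i < N; i++) a[i] = rand();

        clock_t t0 = clock();
        for (int r = 0; r < REPS; r++) {
            for (int i = 0; i < N; i++) b[i] = a[i];
            insertion_sort(b, N);
        }
        clock_t t1 = clock();
        for (int r = 0; r < REPS; r++) {
            for (int i = 0; i < N; i++) b[i] = a[i];
            qsort(b, N, sizeof(int), cmp_int);
        }
        clock_t t2 = clock();

        printf("insertion sort: %.2fs  qsort: %.2fs\n",
               (double)(t1 - t0) / CLOCKS_PER_SEC,
               (double)(t2 - t1) / CLOCKS_PER_SEC);
        return 0;
    }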
Bearing in mind that I am not an expert, I do want to hear more of this discussion. And I fully welcome explanations of where I'm wrong. :)
In the article he's dealing with a graph of billions of edges, probably not billions of nodes. That's been my experience, too, with loading a graph into memory on a laptop: billions of edges and hundreds of millions of nodes is fine, but getting close to being too much.
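A rough back-of-envelope of why that's near the limit (my numbers, not from the article): with 32-bit node ids in a CSR layout, 2 billion edges and 200 million nodes come to roughly 10 GB, which is most of a typical laptop's RAM before you've allocated anything per node for the algorithm itself:

    /* Back-of-envelope CSR memory estimate (illustrative numbers). */
    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint64_t edges = 2000000000ULL;  /* 2 billion edges (assumed) */
        uint64_t nodes = 200000000ULL;   /* 200 million nodes (assumed) */
        uint64_t bytes = edges * sizeof(uint32_t)         /* edge target array */
                       + (nodes + 1) * sizeof(uint64_t);  /* per-node offsets  */
        printf("~%.1f GB\n", bytes / 1e9);                /* prints ~9.6 GB */
        return 0;
    }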