
I had to propagate colors across a graph with a billion or so edges and had an easy time doing it in-memory in Java on a 32GB laptop using sub-byte data structures. In fact, it takes more time to serialize and deserialize the data than it takes to do the actual calculation.
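
For anyone wondering what "sub-byte" might look like in practice, here is a rough sketch, not the actual code: it assumes a "color" is a small integer packed at 2 bits per node into a long[], that 0 means "uncolored", and that propagation just copies a color along each directed edge until nothing changes. PackedColors, ColorPropagation and the src/dst edge arrays are names made up for the example.

```java
/** Hypothetical helper: stores one small color per node, packed into a long[]. */
final class PackedColors {
    private final long[] words;
    private final int bits;   // bits per node: 1, 2 or 4, so a value never straddles two words
    private final long mask;  // low `bits` bits set

    PackedColors(int nodes, int bits) {
        if (bits != 1 && bits != 2 && bits != 4) {
            throw new IllegalArgumentException("bits must be 1, 2 or 4");
        }
        this.bits = bits;
        this.mask = (1L << bits) - 1;
        long totalBits = (long) nodes * bits;
        this.words = new long[(int) ((totalBits + 63) >>> 6)];
    }

    int get(int node) {
        long bitPos = (long) node * bits;
        int shift = (int) (bitPos & 63);
        return (int) ((words[(int) (bitPos >>> 6)] >>> shift) & mask);
    }

    void set(int node, int color) {
        long bitPos = (long) node * bits;
        int w = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        // Clear the node's slot, then write the new color into it.
        words[w] = (words[w] & ~(mask << shift)) | ((color & mask) << shift);
    }
}

final class ColorPropagation {
    /**
     * Assumed rule: color 0 means "uncolored"; a colored source node pushes its
     * color to an uncolored destination. Repeats until a full pass changes nothing.
     * Returns the number of passes over the edge list.
     */
    static int propagate(PackedColors colors, int[] src, int[] dst) {
        int passes = 0;
        boolean changed = true;
        while (changed) {
            changed = false;
            passes++;
            for (int e = 0; e < src.length; e++) {
                int c = colors.get(src[e]);
                if (c != 0 && colors.get(dst[e]) == 0) {
                    colors.set(dst[e], c);
                    changed = true;
                }
            }
        }
        return passes;
    }
}
```

With this layout a billion edges held as two int[] arrays take about 8GB, and the colors for a billion nodes at 2 bits each take about 250MB, so the whole thing fits comfortably in 32GB.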

That said, the trend is toward much bigger data sets, and I frequently break non-scalable tools during data profiling. You can stretch the non-scalable approach by using multiple cores (which is sometimes easy), SIMD instructions, or the GPU. I have sometimes gone down the rabbit hole of optimizing something non-scalable only to hit a wall anyway, so I am increasingly using scalable systems.


