Mostly the latter. Scala 3 is almost completely irrelevant to the big data space so far. Databricks took six years to upgrade their proprietary Spark runtime to Scala 2.13. Flink dropped its Scala API before even moving to 2.13. I don't know if Scio will seriously attempt the move to Scala 3. All of them suffer from the Twitter libraries being abandoned, which isn't insurmountable, but it's still an annoyance.
And I don't think it matters anymore. I predict that the JVM will eventually be out of the equation. We're already seeing JVM query engines being replaced by proprietary or open source equivalents written in C++ or Rust. Large-scale distribution is less of a selling point with modern cloud computing: do you really need 100 executors when you can get a bare metal instance with 192, 256, or 384 cores?
People want a dataframe API in Python because that's what the ML/DS/AI crowd knows. Queries and processing will be done in C++ or Rust, with little or no need for a distributed runtime. The JVM and Scala solve a problem that simply won't exist anymore.
Yeah, this is certainly the correct take. There's an alternate timeline where the Scala community spent its peak making the language better for numeric computing/ML rather than building the Nth category-theoretic framework, but here we are. At a job almost a decade ago, we made some progress on an open source dataframe library (and, unfortunately, a proprietary data visualization one) for Scala, but we didn't get far enough before the company closed and the project died [1].
Still my favorite language that I've had the privilege to work with professionally, for over a decade. That said, in this post-JVM world, I'm actually excited to see a lot more OCaml discussion on here lately. The Jane Street work on OxCaml is terrific after a long period of stagnation for the language. I'm using it for most of my projects these days.