shuaiboi's comments

shuaiboi · 2025-07-24T10:53:55 1753354435

bjarne (creator of c++) has a quote about this:

Unified function call: The notational distinction between x.f(y) and f(x,y) comes from the flawed OO notion that there always is a single most important object for an operation. I made a mistake adopting that. It was a shallow understanding at the time (but extremely fashionable). Even then, I pointed to sqrt(2) and x+y as examples of problems caused by that view.

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p19...

saghm · 2025-07-24T13:22:50 1753363370

The main benefit of x.f(y) IMO isn't emphasizing x as something special, but allowing a flat chain of operations rather than nesting them. I think the differences are more obvious if you take things a step further and compare x.f(y).g(z) and g(f(x, y), z). At the end of the day, the difference is just syntax, so the goals should be to aid the programmer in writing correct code and to aid anyone reading the code (including the original programmer at a later point in time!) in understanding the code. There are tradeoffs to using "method" syntax as well, but to me that mostly is an argument for having both options available.

roland-s · 2025-08-04T21:24:20 1754342660

The pipe operator eliminates the syntax problem.

Ocaml example:

  let increment x = x + 1
  let square x = x \* x
  let to_string x = string_of_int x

  let result = 5 |> increment |> square |> to_string
  (\* result will be "36" \*)

atsbbg · 2025-07-24T14:28:41 1753367321

That's exactly the context of where this quote comes from. He wanted to introduce Unified call syntax[1] which would have made both of those equivalent.

But he still has a preference for f(x,y). With x.f(y) gives you have chaining but it also gets rid of multiple dispatch / multimethods that are more natural with f(x,y). Bjarne has been trying to add this back into C++ for quite some time now.

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n44...

saghm · 2025-07-24T15:09:28 1753369768

That makes sense! It wasn't immediately obvious to me from the context of this discussion though, since it seemed like you were responding to the parent comment's framing of it as a choice between one or the other. This might just be from me never having heard of the term "unified call syntax" though, and I admit that I didn't expect to need to read through the PDF you linked before super carefully after opening it and seeing a bunch of C++-specific details that I knew would go over my head. (On a good day, I can remember what words SFINAE is an acronym for, but I don't think I ever really felt like I got comfortable enough with C++ to fully understand how the template code I saw actually encoded behavior that fit what I'd expect those words to mean).

nickitolas · 2025-07-24T14:04:46 1753365886

If my memory isn't failing me, that was part of the reason rust went with a postfix notation for their async keyword ("thing().await") instead of the more common syntax ("await thing()")

saghm · 2025-07-24T14:29:41 1753367381

Yep, and that itself was similar to the rationale for introducing `?` as a postfix operator where the `try!(...)` macro had previously been used. In retrospect, it's kind of funny to look back and see how controversial that was at the time, because despite there being plenty of criticism of the async ecosystem in once, the postfix `.await` might be the one thing that seems to consistently be praised by people needing to use it. People might not like using async, but when we do use it, it seems like we're pretty happy with the syntax for `.await`.

Ferret7446 · 2025-07-26T07:11:50 1753513910

The flaw here isn't OO, but the idea that it should be applied in all cases. The point of OO methods is to control access and mutation of an object's private members. Rather than have clients know to, e.g., lock specific fields before accessing, you expose a method that handles the locking appropriately.

ivanjermakov · 2025-07-25T00:55:40 1753404940

There is the most important argument regardless, whether it is a method or regular function - first one (or last one in languages supporting currying). While it's true that there are functions with parameters of equal importance, most of them are commutative anyway.

shuaiboi · on Sept 29, 2024

Just a guess... wouldl like to hear the answer as well.

they probably have a monotonicity detector somewhere, which can decide whether to keep all the values or discard them. If they keep them, they probably use something like a segment tree to index.

ryzhyk · on Sept 29, 2024

That's right, we perform static dataflow analysis to determine what data can get discarded. GC itself is done lazily as part of LSM tree maintenance. For MAX specifically, we don't have this optimization yet. In the general case, incrementally maintaining the MAX aggregate in the presence of insertions and deletions requires tracking the entire contents of the group, which is what we do. If the collection can be proved to be append-only, then it's sufficient to store only the current max element. This optimization is yet coming to Feldera.

lsuresh · on Sept 29, 2024

Yes, we do a lot of work with monotonicity detection. It's central to we perform automatic garbage collection based on lateness.

shuaiboi · on Sept 29, 2024

This is pretty neat but I'm wondering how well this implementation obeys dataframe algebra. Ponder goes into detail about how dataframes and relations aren't the same, but your dataframe zset seems to be more or less the exact same thing as the relation zset?

https://youtu.be/7TyIjqvfWto?si=CMFH30DFEWxkltlw&t=1095

rebanevapustus · on Sept 29, 2024

It does not. The example I give on the README is only meant to show how easy it is to use it to "streamify" regular relational Dataframe operations.

shuaiboi · on Sept 29, 2024

would something like dbsp support spreadsheet style computations? Most of the financial world is stuck behind spreadsheets and the entire process of productioinizing spreadsheets is broken:

* Engineers don't have time to understand the spreadsheet logic and translate everything into an incremental version for production.

* Analysts don't understand the challenges with stream processing.

* SQL is still too awkward of a language for finance.

* Excel is a batch environment, which makes it hard to codify it as a streaming calculation.

If I understand correctly, your paper implies as long as there is a way to describe spreadsheets as a Zset, some incremental version of the program can be derived? Spreadsheets are pretty close to a relational table, but it would be a ZSet algebra on cells, not rows, similar to functional reactive programming. So dbsp on cells would be incremental UDFs, not just UDAFs?

thoughts??

lsuresh · on Sept 29, 2024

Great question. DBSP should work here -- spreadsheets are by definition incremental (and there's even recursive queries there with cells depending on each other).

Note that we use Z-Sets to bridge SQL/tables with DBSP, but Z-Sets aren't general enough for spreadsheets.