
So is this similar to Amazon's Athena? I'm trying to place what a 'realtime distributed OLAP datastore' is, or competes with, in cloudy/naive terms.


Here's a better side-by-side comparison of Apache Pinot vs. Apache Druid vs. Clickhouse made back in April 2023. Let me know if anyone sees any updates or corrections that should be made. Would be more than happy to update.

https://startree.ai/blog/a-tale-of-three-real-time-olap-data...


Potentially dumb question, but why does “who viewed my profile” (the feature mentioned in the post as the original use case at LinkedIn) require a realtime OLAP datastore anyway?


That's a great question. A bit of history. Here's more on Pinot, back when it was invented at LinkedIn, before it was ASF-incubated as "Apache Pinot."

LinkedIn originally had been using traditional OLTP systems to power the "Who Viewed My Profile" app; they simply hit a wall. So LinkedIn looked at other analytics databases available at the time. Most just couldn't handle the large number of QPS that, as a social media platform, they readily anticipated. This is why they defined the category as "user-facing, real-time analytics."

[This is terminology mostly specific to the Apache Pinot crowd, though I see that StarRocks / CelerData also recently started talking about user-facing analytics. So I wrote up an article explaining it here:

https://startree.ai/resources/what-is-user-facing-analytics ]

Other similar extant systems LinkedIn benchmarked at the time just couldn't give them the numbers they needed at the low latency, high concurrency and large scale they anticipated — terabytes to petabytes of data. So they wrote their own solution.

Pinot was originally intended for marketing purposes to capture live intent & action data.

The same Pinot infrastructure eventually grew to other real-time use cases. "Who Viewed My Profile" was followed by "Company Follow Analytics," then sales or recruiting, and even internal A/B testing.

[I wasn't at LinkedIn while this happened; this lore was passed down to me by others. Disclosure: yes I work at StarTree.

https://engineering.linkedin.com/analytics/real-time-analyti... ]


LinkedIn is just another social network, only more work-related. When you post something new, you do want to see how many likes/comments it gets. Just like Instagram. "Who viewed my profile" is real on LinkedIn. It can be someone who is hiring, someone who may watch your talk, someone who may buy your product. I personally started some business conversations when I realized someone viewed my profile, or at least added them as new connections.


Sure, I understand why you’d want real-time. I guess the OLAP part is the one I’m not sure about here.

Most of these use cases seem more “pull all data for a specific user/post/company” and less “do analytical queries on millions+ of column values.”


Got it. To that end, Apache Pinot has a special index, called the star-tree index, that allows certain dimensions to be drilled down on more granularly than others. It's part of what makes Pinot so fast.

https://engineering.linkedin.com/blog/2019/06/star-tree-inde...
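The core idea is that the index pre-aggregates records along a configured dimension split order, so a drill-down query can hit pre-aggregated nodes instead of scanning raw rows. A toy sketch of that idea in plain Python (hypothetical dimensions and data, not Pinot's actual implementation):

```python
from collections import defaultdict

# Toy fact rows: (country, browser, clicks). Hypothetical data.
rows = [
    ("US", "chrome", 3), ("US", "chrome", 5), ("US", "safari", 2),
    ("DE", "chrome", 4), ("DE", "firefox", 1), ("US", "safari", 7),
]

# Star-tree-style pre-aggregation along split order [country, browser]:
# one pre-aggregated record per (country, browser) pair, plus a star (*)
# node per country that collapses the browser dimension entirely.
pre_agg = defaultdict(int)
for country, browser, clicks in rows:
    pre_agg[(country, browser)] += clicks
    pre_agg[(country, "*")] += clicks  # star node: any browser

# A query like SELECT SUM(clicks) WHERE country = 'US' reads the star
# node directly instead of scanning all raw rows with country = 'US'.
print(pre_agg[("US", "*")])       # 17
print(pre_agg[("US", "chrome")])  # 8
```

The real index also bounds how many raw records any leaf can cover, so you can trade index size against worst-case scan cost.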


My impression is that it’s in the same space as RedShift, Snowflake, Citus, Greenplum, ClickHouse.


It's hardly comparable with ClickHouse. Even loading a table with 100M rows is not an easy endeavor in Pinot: https://github.com/ClickHouse/ClickBench/blob/main/pinot/ben...


Yeah, ClickHouse seems to me to be best in class, but I think Pinot is still aiming at the same general market segment.


that benchmark is very limited in the use cases it covers (no heavy grouping, no joins, very small dataset); I think it is more for self-advertisement purposes


JOINs are a non-trivial problem with data in the terabyte-to-petabyte range. Apache Pinot had to re-architect itself with a multi-stage query engine in order to handle native query-time JOINs. There was a separate blog released today that dealt with that specifically:

https://startree.ai/blog/query-time-joins-in-apache-pinot-1-...

[Edit: if users just naively throw query-time JOINs at a problem, they might not get results in the time they want — a non-optimized real-time JOIN took upwards of 39 seconds. With predicate pushdowns, and partition-aware JOINs, suddenly results could be done in 3.3 seconds — faster by an order of magnitude. Still kinda long for a typical page or mobile app refresh but survivable. And yes, even faster subsecond results were possible with additional compute parallelism, but that has a cloud services cost associated with it.

Pre-ingestion JOINs, such as through Apache Flink, would generally be more performant than query-time JOINs. And that's how most people do them right now anyway. So just because you can do query-time JOINs doesn't mean you should if you haven't thought about it ahead of time. If you can, optimize the data partitioning to ensure the best performance. The good news is that users now have a lot more flexibility in how they want to do JOINs.]
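To illustrate why partition-aware JOINs help (my own toy sketch, not Pinot internals): when both tables are hash-partitioned on the join key the same way, each worker joins only its local partitions, with no cross-partition shuffle. Table names and data here are hypothetical.

```python
# Toy partition-aware equi-join: both sides partitioned on the join key,
# so each partition pair can be hash-joined locally.
N_PARTITIONS = 4

def partition(rows, key_idx):
    parts = [[] for _ in range(N_PARTITIONS)]
    for row in rows:
        # Same hash function on both sides puts matching keys in the
        # same partition index (consistent within one process).
        parts[hash(row[key_idx]) % N_PARTITIONS].append(row)
    return parts

orders = [("o1", "alice", 30), ("o2", "bob", 15), ("o3", "alice", 99)]
users = [("alice", "US"), ("bob", "DE")]

joined = []
for o_part, u_part in zip(partition(orders, 1), partition(users, 0)):
    lookup = {u[0]: u for u in u_part}  # local hash join per partition
    for o in o_part:
        if o[1] in lookup:
            joined.append(o + lookup[o[1]][1:])

print(sorted(joined))
# [('o1', 'alice', 30, 'US'), ('o2', 'bob', 15, 'DE'), ('o3', 'alice', 99, 'US')]
```

In the real system the win is that the expensive shuffle stage disappears; predicate pushdown then shrinks what each partition has to read in the first place.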


> JOINs are a non-trivial problem with data in the terabyte-to-petabyte range.

but that benchmark is 250 GB, plus I think all the established players support joins well; I work with BQ closely and it works very well.

> So just because you can do query-time JOINs doesn't mean you should if you haven't thought about it ahead of time.

prebuilding a denormalized table can significantly increase your data size, so it is a tradeoff that depends on your data structure and cardinality.
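A quick back-of-envelope on that tradeoff (entirely hypothetical numbers): pre-joining a fact table against a wide dimension copies the dimension's columns onto every fact row, so the blow-up scales with fact-row count times dimension width, not with the dimension's own size.

```python
# Hypothetical sizing: 1B fact rows joined against a 10M-row dimension.
fact_rows, fact_width = 1_000_000_000, 40   # bytes per fact row
dim_rows, dim_width = 10_000_000, 200       # bytes per dimension row

# Normalized: store each table once, join at query time.
normalized = fact_rows * fact_width + dim_rows * dim_width

# Denormalized: dimension columns copied onto every fact row.
denormalized = fact_rows * (fact_width + dim_width)

print(normalized // 10**9, "GB vs", denormalized // 10**9, "GB")
# 42 GB vs 240 GB
```

Columnar compression softens this in practice, but the direction of the tradeoff holds.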


Are you comparing BigQuery to Pinot? They are for totally different use cases.


I don't agree with "totally", many usecases, especially where joins are involved totally overlap.


Very true.


Exactly.


I’m speaking from experience, not from those benchmarks.


somehow I have a different experience: anything beyond a simple "select a few fields from t where k = ?" usually causes all kinds of errors and stability issues on a good-sized dataset.


You can be both right. I would say that ClickHouse is a focused system - it is really really good at the things it's good at, and it is not good at the rest.

Anecdote: I tested ClickHouse as possible replacement of Trino+DeltaLake+S3 for some of our use cases.

When querying precomputed flat tables it was easily 10x to 100x faster. When running complex ETL to prepare those tables, I gave up: a CTAS with six CTEs that takes Trino 30 seconds to compute on the fly turned into six intermediate tables that took 40 minutes to compute, and I wasn't even halfway done.

The tricky part is, how do you know whether your use case fits? But you have to ask this question about all the "specialized tools", including Pinot.


I think Pinot has more focus on real-time stuff, the ones you listed would be more in the data warehouse category


I’ve used Snowflake and ClickHouse, at least, in near-real-time settings. I think you’re right that Pinot (and Druid/ClickHouse) are more suitable for this than the others, though.


RedShift, Snowflake, Citus, Greenplum and Athena are OLAP engines, but not real-time focused. For this one, it is more similar to Druid, ClickHouse or Rockset.

The 1.0 version of Pinot seems to bring a lot of maturity; they seem to have added a new engine that can do joins now. I'm not sure how stable it is, but it seems interesting.

As for what this kind of database is useful for: operational analytics on large data that also updates in real time. In my domain that would be things like having insight into large supply chains or manufacturing operations, like power plants or factories; just in general, for monitoring stuff. I know it's also used in security and finance (for fraud).


Not really, no. Athena is most likely just Presto under the hood.

This is a completely different beast in terms of expected query latency (sub-second) and as a result has different constraints about structure of data, amount of data queried and cost naturally.

You can think of Pinot and Druid as solving for the real-time analytics cases, like say for instance you run an ad network and you want to provide your customers with a live dashboard of ad impressions and allow segmentation of that dashboard by cohorts and other metrics on the fly.


Nowadays, Athena is Trino under the hood, but pretty heavily patched.


Yeah that makes sense.



