I've been doing this with ChatGPT directly, but its cutoff date is 2021. It's pretty good, though, at telling you the research name so you can validate it yourself.
As someone who is starting a master's in science and will be looking at lots of research papers, I've been wondering what the best use of this could be.
If I have my own PDFs, I guess I could get ChatGPT to create summaries in some structured way, perhaps in a single file with citation:summary, and then send up that file with every question I ask?
An article today (in Spiegel) suggested these tools for PDF upload and various levels of support for extracting information from papers and writing new ones:
> jenni ai seems interesting for doing research/thesis style work.
“Join the Jenni influencer program and earn money for your posts. Earn up to $5,000 per post. We'll send you $1 for every 1,000 views your post receives. Minimum payout $20 (20k views).” — https://jenni.ai/influencer-program
Sounds very interesting to say it sounds interesting.
> If I have my own PDFs, I guess I could get ChatGPT to create summaries in some structured way, perhaps in a single file with citation:summary, and then send up that file with every question I ask?
Extract the text, put it into a database (a vector DB is the hot thing to pair with an LLM and is probably ideal, but ChatGPT does a pretty good job using Wikipedia as a "database" with a dirt-simple ReAct pattern implementation, so you probably don't need to be that fancy to get value out of it), and then use tooling to let ChatGPT use that database for questions.
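To make the "database plus tooling" idea concrete, here's a minimal, self-contained sketch of the retrieval step. The bag-of-words `embed` is a toy stand-in for a real embedding model (the surrounding retrieval logic is the same either way), and `retrieve`/`build_prompt` are hypothetical names:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in "embedding": a bag-of-words count vector. A real pipeline
    # would call an actual embedding model here instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank stored paper chunks against the question and keep the top k.
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(question: str, chunks: list[str]) -> str:
    # The retrieved passages become the context you send to ChatGPT,
    # with an instruction to answer (and cite) only from them.
    context = "\n---\n".join(retrieve(question, chunks))
    return (f"Answer using ONLY the sources below, and cite them.\n"
            f"Sources:\n{context}\n\nQuestion: {question}")
```

Swap `embed` for a real model and the chunk list for a DB query and you have the basic shape of the tooling.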
FWIW, it is pretty common to read the abstract, skim the intro, skip to the conclusion/results, and only then read the rest of the paper (if you even need to at that point).
Actually, what I was asking about was how to handle my storage and later retrieval of hundreds of papers. I don't mean just the database tool -- I currently use Zotero and make sure articles are tagged -- but I'm wondering if I can build a system for my articles where I can ask ChatGPT "are there any studies that show X causes Y" and have it use my articles directly, and not give me bullshit citations like it does 90% of the time now.
(To be clear, I'm pleasantly surprised when it does give me some real, useful citations of articles I didn't know about, but I'd like to get its accuracy way up, at least for articles I have in my possession that I could feed it for context.)
First you just extract all of the text from the PDFs. It's helpful to use arXiv instead, since it gives you LaTeX, which you can parse to separate content from style. Then store that in a DuckDB database with zstd compression, since it works great on text.
Then use some encoder model (decoder models are all the rage, but they're not necessary here) to embed all of these texts into Qdrant.
Then use your Qdrant database happily ever after with whatever you like, such as Vicuna or Guanaco 30B GPTQ. Sprinkle in LangChain or whatever you like. Galpaca 30B may be appropriate for science text.
As part of my testing of https://flowch.ai (which we are developing), I've been uploading multiple PDFs (including scientific papers). I then ask a question, instruct it in the prompt to cite its sources, and it'll name the PDFs it's drawing its data and quotes from. Works well!
Elicit is very useful! It sometimes can't answer very specific and complex questions, but it also doesn't tend to hallucinate anywhere near as much as ChatGPT. Rather, it will be quite upfront when it can't really find an answer to something.
For my query "Can an LLM be used to build an agent?", Elicit gave me highly relevant results, whereas Consensus could not find any relevant or recent results. I'm guessing the difference is that Consensus does not index papers from arXiv.
I actually think it would make sense for Consensus to highlight how they differ from e.g. Elicit, which has already been on the market for quite some time.
> The age of the observable universe is 11 billion years, which is in close agreement with the inflationary model prediction that the age of the universe is two-thirds of the Hubble time.
In contrast, when I ask Google "how old is the universe" I get as top result:
> 26.7 billion years old
> Current estimates place the Big Bang 13.8 billion years ago. University of Ottawa adjunct professor Rajendra Gupta has calculated that it is, in fact, 26.7 billion years old – nearly twice as old as the current accepted model. extracted from https://cosmosmagazine.com/space/astrophysics/universe-27-bi....
I'm not super familiar with them, but taking a quick look, it looks like they have a lot of papers published by the Allen Institute in Seattle[0] (which I am much more familiar with). So I would say they at least carry very reputable papers.
Here[1] is a summary of the Semantic Scholar database in the National Library of Medicine (note: also from an Allen Institute spinoff, but it made it easy to search for something I know is reputable).
Yes, I am a subscriber. They have little icons like “rigorous journal”, “systematic review”, “meta-analysis” and (if you give this any weight) “highly cited”. I believe you can filter by those in your search as well.
Embed articles and throw the results in a vector database.
Throw up search results that just use cosine similarity on the vectors, with questionable metrics and no explanation of how anything is calculated.
Charge yearly because you know people will churn after a month or two.
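For what it's worth, the "search" in step 2 of this recipe really is this small, which is part of the point:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Score the query embedding against a stored article embedding;
    # the app just returns the highest-scoring articles.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```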
I'll play DA here - there's quite a bit of engineering surrounding these apps that can appear hidden to folks from the outside looking in. Various levels of prompt engineering and in-context learning might be necessary to get optimal results, and this could mean significantly more complexity at the application level.
Every time I hear or read "prompt engineering", I can't help but cringe a bit. I'm not sure why, but it's the same reaction I would have if I heard someone say "Google search query engineering".
Compared to Google search, there definitely is skill involved in knowing how to google things well. We're all accustomed to googling things many times per day, so I think a lot of people forget that being able to get the results you want is a skill that has to be learned.
But I would never refer to being "good at writing Google search queries" as any kind of engineering. Yet is becoming good at searching Google any less difficult than getting good at writing LLM prompts?
I'd love to hear the other side of the argument. How difficult is it to become good at "prompt engineering"? Why do we even call it "prompt engineering" instead of just "writing effective prompts"?
Edit: I think the main gripe I have with the term "prompt engineering" is it makes the skill of writing prompts sound a lot harder than it actually is. Maybe I'm underestimating how difficult it is to learn how to write good prompts?
IMO you're right that "prompt engineering" is a cringe-y term because it implies what you're doing is mostly writing prompts. That being said, I don't think that's what it actually entails, any more than "backend engineering" is mostly writing SQL queries. Prompt engineering is building the systems around the prompts e.g. writing the LangChain or whatever code, parsing stuff and interacting with structured DBs, message queues, etc (and occasionally writing prompts too, but that's a relatively smaller part). It requires some domain-specific knowledge e.g. chain of thought and retrieval augmented generation techniques, some basic linear algebra, keeping up to date with new models and new ways of running them (ggml? gptq? openai functions vs logit masking llama 2?), but it's more or less backend engineering with a twist.
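As a small, concrete example of that "parsing stuff" glue work (a sketch; `extract_json` is a hypothetical helper name): the model is asked to return JSON, but replies often wrap it in prose or a code fence, so the system has to fish it out.

```python
import json
import re

def extract_json(reply: str) -> dict:
    # Typical glue code in an LLM-backed system: find the first-to-last
    # brace span in the reply and parse it as JSON.
    match = re.search(r"\{.*\}", reply, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model reply")
    return json.loads(match.group(0))
```

This naive brace-matching breaks on multiple top-level objects, which is exactly the kind of edge case that makes the role more than "writing prompts".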
I've seen some of the more serious people switch to the term "Applied AI" which I think encompasses the role a lot better. Also I've seen a decent number of grifter types saying they're "prompt engineers" when what they mean is they're writing prompts into ChatGPT's UI, which I think is part of what drives the cringing feeling when you hear the phrase "prompt engineer" and probably drives some of the movement away from the term for engineers.
This is a great recap of the present role. I have been following a conversation around "AI Engineer" as a label (although I like "Applied AI" better) from the Latent Space podcast team, so there's a fair amount of meme hype cycle, but it's also active and actionable. They are setting up a conference in October, and you can see the discussion unfold around the blog post below on X-itter.
They also have a great Slack community and are rapidly turning out features. Everything I have seen suggests that they are competent and committed to the mission of making it easier to do good science.
It is easy to be cynical about the gold rush, but don’t throw the babies out with the bathwater.
I asked it about UBI and I guess there isn't enough data yet, because it called it "fiscally unbearable and morally unacceptable" (which seems to be what only 1 uncited source said)! At least it admitted consensus was low...
> Summary
> Top 10 papers analyzed
> Some studies suggest that universal basic income (UBI) can generate support for structural reforms and improve mental health, while other studies argue it may be fiscally unbearable, morally unacceptable, and increase wealth inequality.
Topics are "highly political" when there is insufficient evidence for them. Which is why heliocentrism is not "highly political" (and also why it once was).
It just occurred to me recently that AI could and should be set up to replace peer reviews of scientific papers. (See the book "Science Fictions"). I asked ChatGPT v3.5 what it would take to do exactly that and what the algorithm would look like. I was very impressed with the response.
Looking further down the road, if we connect AI to reality to any significant degree and train it to be completely objective, the powers-that-be will ban it. After all, they have their narratives to push and their propaganda to spew. I no longer ask myself why the conventional wisdom is so often wrong.
I felt really excited clicking on some of the suggested prompts, but that excitement quickly fell apart when trying to generate a summary for what I consider to be a fairly simple custom search:
impact of airbnb listings on house prices → can't summarise, need to post it as a question.
how do airbnb listings impact house prices? → can summarise, can't create a consensus, must use a yes/no question.
do more airbnb listings increase house prices? → can summarise, can't create a consensus because there aren't enough relevant search results. But the maximum is 20 (according to the info icon) and it found 11 highly relevant articles, so I really don't understand how there aren't enough relevant search results.
Same. Yet another "AI product" with zero product-market fit and zero usefulness. What sucks is that even if you do get a consensus, it's not even traceable (e.g. it does not properly cite sources), so where could I possibly use the conclusion it draws?