Try https://github.com/SecureAI-Tools/SecureAI-Tools -- it's an open-source application layer for Retrieval-Augmented Generation (RAG). It allows you to use any LLM -- you can use OpenAI APIs, or run models locally with Ollama.
We just added this in the latest release (v0.0.2). You can now create a document collection and upload as many PDFs into it as needed. The documents are processed in the background and once processing finishes, you can create as many chats with it as needed.
We are trying to build a single platform for all your AI tool needs. Chat-with-LLM and chat-with-documents are just a couple of the apps or experiences we have started with, but we have ambitious goals. In the future, we would love to provide an SDK that exposes common abstractions and lets everyone build apps/experiences for the long tail of use cases.
It is secure because it lets you fully customize where the data is processed (i.e. LLM inference), where it is stored, what the data-retention policies are, and so on. You can choose to use a locally running LLM (as in my second video) or a secure third-party service provider like Azure OpenAI.
For example, if you want GDPR compliance, you can choose Azure OpenAI running in the EU region. For HIPAA compliance, you should choose a service provider that offers a Business Associate Agreement (BAA). You can even run it in air-gapped facilities (like GitLab's offline mode [1]). In all of these cases, you can always run an Ollama-like inference service on your own infra and point SecureAI Tools to it.
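To give a rough idea of what talking to a self-hosted, Ollama-like inference service looks like from the client side, here is a minimal sketch that builds a request for Ollama's `/api/generate` endpoint. The base URL, the model name, and both helper functions are assumptions made for illustration; this is not how SecureAI Tools itself is configured or implemented.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local port.
OLLAMA_BASE_URL = "http://localhost:11434"


def build_generate_request(prompt, model="llama2", base_url=OLLAMA_BASE_URL):
    """Build (but do not send) a POST request for Ollama's /api/generate endpoint."""
    # "stream": False asks for a single JSON reply instead of a stream of chunks.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read())["response"]
```

Because the base URL is just a parameter, the same client code works whether the service runs on your laptop, on a box inside an air-gapped network, or anywhere else on your own infra.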
I love the idea of allowing anyone to build apps or experiences on top of some of these common elements.
We have briefly discussed an approach where we make some of these common elements available as abstractions and let people build "apps" on top of them. It would operate somewhat like Google's app store, in that the head use cases (email, photos, camera, etc.) are first-party apps, but anyone can build and publish a third-party app using the Android SDK.
Right now it supports selecting & uploading a _few_ PDFs on chat-creation. Those PDFs get indexed online -- i.e. while the user waits. So it doesn't scale well with the number of PDFs selected in a chat, because you have to wait for all of them to be indexed before the chat responds to your initial question/prompt.
We plan to make this indexing process offline, where you can create a document collection based on either a directory upload or an integrated data source like Google Drive, Notion, Confluence, etc. Then the system would start indexing that collection in the background and notify you once indexing is complete. Once a collection is indexed, users can select it when creating a new chat and query against it.
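To make the online vs. background distinction concrete, here is a minimal sketch of the background flow described above, using a job queue and a worker thread. The class name, the `index_fn` and `on_complete` callbacks, and the status strings are all hypothetical; the real implementation may look nothing like this.

```python
import queue
import threading


class CollectionIndexer:
    """Indexes document collections on a background thread."""

    def __init__(self, index_fn, on_complete):
        self._index_fn = index_fn        # hypothetical: indexes a single document
        self._on_complete = on_complete  # hypothetical: notifies the user
        self._jobs = queue.Queue()
        self._lock = threading.Lock()
        self.status = {}                 # collection name -> "queued" | "indexing" | "ready"
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, name, documents):
        """Queue a collection and return immediately; the caller never waits on indexing."""
        with self._lock:
            self.status[name] = "queued"
        self._jobs.put((name, list(documents)))

    def _worker(self):
        # Process collections one at a time, in the order they were submitted.
        while True:
            name, documents = self._jobs.get()
            with self._lock:
                self.status[name] = "indexing"
            for doc in documents:
                self._index_fn(name, doc)
            with self._lock:
                self.status[name] = "ready"
            self._on_complete(name)
            self._jobs.task_done()

    def wait(self):
        """Block until every queued collection has been indexed (handy in tests)."""
        self._jobs.join()
```

A chat would then only be allowed to select collections whose status is "ready", so the first prompt no longer pays the indexing cost.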
Let us know if you have any thoughts on this proposed solution.
Exactly. You download something from arXiv and the filename has no meaningful content in it. Generally speaking, you want the filename to be descriptive in some way, and extracting the title of the document is a good start.
You can just import them all into Paperpile, which has good ways of inferring metadata like title, author, year, etc., and then connect it to your Google Drive, which will download them with nice filenames.
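As a small sketch of the renaming step, assuming you already have the title from somewhere (PDF metadata, the first page, a reference manager): the function name and the `id-title.pdf` layout below are just one hypothetical convention.

```python
import re


def descriptive_filename(title, arxiv_id, max_length=80):
    """Turn a paper title into a filesystem-friendly name, keeping the arXiv ID."""
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-").lower()
    # Trim long titles, and avoid ending on a dangling hyphen.
    slug = slug[:max_length].rstrip("-")
    return f"{arxiv_id}-{slug}.pdf"


print(descriptive_filename("Attention Is All You Need", "1706.03762"))
# prints: 1706.03762-attention-is-all-you-need.pdf
```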
> 1. Does chat-with-pdfs function work with scanned PDFs?
Not yet. We don't do OCR or anything to extract text from images yet. But that would be an awesome feature, so we would love to add it in the future.
> 2. In the video example for chat-with-pdfs you show uploading a document interactively. The part of processing is quite slow. Can the tool be fed these documents offline as well?
Not as of right now. But we do have plans to make that an offline/background job so that we can feed a larger corpus of documents into it and query against it later.
If I've already run OCR on my PDFs and that's added now as an invisible layer, would it work then?
I've had a workflow digitizing my incoming paper documents, running OCR, and tagging them, all locally, and it would be great to have an easy front-end to talk to them.
I haven't tried this myself, but I think it should work. It would be worth trying at least, so I highly encourage you to play with it and file an issue if you run into any problems.
I haven't found an OCR tool reliable enough when it comes to scanned PDFs containing financial data where accuracy of amounts in the document is very important.
People spend an inordinate amount of time and money solving this problem rather than spending the same amount of money in lobbying and standardization efforts for financial institutions. I’ll throw this out there: when all you know is a hammer, everything looks like a nail.
I created this for myself, to automate frequent searches like "XYZ Crunchbase" or "ABC Angel List". Sharing it here because I am sure it will be helpful for a lot of people on HN.
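For anyone curious what automating searches like these can look like, here is a minimal sketch that builds one search URL per suffix. It is not how the tool above works; the search engine URL, the function name, and the default suffixes are assumptions for illustration.

```python
from urllib.parse import urlencode

# Assumption: the search engine accepts the query in a `q` parameter.
SEARCH_URL = "https://www.google.com/search"


def search_urls(name, suffixes=("Crunchbase", "Angel List")):
    """Return one search URL per suffix, e.g. for the query "XYZ Crunchbase"."""
    urls = []
    for suffix in suffixes:
        # urlencode escapes the query; spaces become "+".
        query = urlencode({"q": f"{name} {suffix}"})
        urls.append(f"{SEARCH_URL}?{query}")
    return urls
```

Passing each returned URL to `webbrowser.open` from the standard library would open the searches in your default browser.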