Dear Community,
We're excited to share several updates to our open-source Retrieval-Augmented Generation (RAG) guides. Thanks to the incredible contributions from members of our Discord community (which has grown to 510 members in just two weeks!), we've added the following new guides:
1. Reliable RAG: Our basic RAG guide now includes a pre-filtering step for relevant data, uses an LLM as a judge to verify that the generated answer isn't a hallucination, and provides users with a visualization of the original documents. This visualization highlights the sentences that contributed to the final answer.
2. Propositions Chunking: Based on the Medium article I published a few days ago, this guide demonstrates how to create a collection of self-contained, atomic facts from text and store them in a database. (Inspired by the paper "Dense X Retrieval: What Retrieval Granularity Should We Use?")
3. RAG on Tabular Data: A guide on implementing RAG for CSV files.
4. Document Augmentation through Question Generation: This guide shows how to enhance retrieval by augmenting documents with hypothetical questions that could be asked about them.
5. Microsoft Graph RAG: An implementation that uses Microsoft's GraphRAG approach.
6. Runnable Scripts: We've added runnable script versions of most guides, so you can easily run them on your own data with customizable hyperparameters.
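To give a feel for the question-generation idea in guide 4, here is a minimal sketch in plain Python. It is not the code from the guide: the `generate_questions` callable is a hypothetical stand-in for an LLM call, the questions are canned toy data, and simple bag-of-words cosine similarity replaces a real embedding model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks, generate_questions):
    """Index each chunk under its own text plus every generated question.

    `generate_questions` is a hypothetical callable (an LLM call in practice)
    that maps a chunk to a list of questions it could answer.
    """
    index = []
    for chunk_id, chunk in enumerate(chunks):
        for text in [chunk, *generate_questions(chunk)]:
            index.append((Counter(tokenize(text)), chunk_id))
    return index


def retrieve(query, index, chunks):
    """Return the chunk whose text or generated question best matches the query."""
    query_vec = Counter(tokenize(query))
    _, best_id = max(index, key=lambda entry: cosine(query_vec, entry[0]))
    return chunks[best_id]


# Toy data; the canned questions stand in for LLM output (an assumption).
chunks = [
    "The Eiffel Tower was completed in 1889 for the World's Fair.",
    "Photosynthesis converts sunlight, water, and carbon dioxide into glucose.",
]
canned_questions = {
    chunks[0]: ["When was the Eiffel Tower built?"],
    chunks[1]: ["How do plants make their food?"],
}

index = build_index(chunks, lambda chunk: canned_questions[chunk])
answer = retrieve("How do plants make food?", index, chunks)
```

In this toy setup the query shares no words with the photosynthesis chunk itself, so it can only be found through the generated question attached to it. That is the gap the augmentation is meant to close.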
We're thrilled to see the community actively expanding these resources. If you haven't already, we invite you to join our Discord community and contribute to this growing knowledge base!
Link to our RAG Techniques Repo
Stay tuned for more updates, and happy RAG-ing!
Best regards,
Nir.