Microsoft Unveils GraphRAG, Outperforms Traditional RAG in Data Discovery

Microsoft has released GraphRAG, a graph-based approach to retrieval-augmented generation (RAG) that enables question-answering over private or previously unseen datasets. GraphRAG is now available on GitHub. The tool offers more structured information retrieval and more comprehensive response generation than traditional RAG approaches.

Accompanying the GraphRAG code repository is a solution accelerator, providing an easy-to-use API experience hosted on Azure, deployable without coding. GraphRAG employs a large language model (LLM) to automate the extraction of a knowledge graph from any collection of text documents. This graph-based data index can report on the semantic structure of the data prior to user queries by detecting “communities” of densely connected nodes in a hierarchical fashion.
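To make the idea of hierarchical community detection concrete, here is a minimal Python sketch. It is not GraphRAG's own code or algorithm: the LLM extraction step is replaced by hand-written (entity, relation, entity) triples, and communities are found with a simple greedy modularity merge chosen only because it fits in a few lines. The function name `hierarchical_communities` and the sample data are hypothetical.

```python
from itertools import combinations


def hierarchical_communities(triples):
    """Return nested partitions (finest first) via greedy modularity merging.

    Illustrative stand-in only; assumes no duplicate edges or self-loops.
    """
    edges = [(src, dst) for src, _relation, dst in triples]
    m = len(edges)
    # Every node starts in its own community, keyed by a representative node.
    members = {node: {node} for edge in edges for node in edge}
    degree = {node: 0 for node in members}
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1

    levels = []
    while True:
        best = None
        for a, b in combinations(sorted(members), 2):
            # Count edges that connect community a with community b.
            links = sum(
                1 for u, v in edges
                if (u in members[a] and v in members[b])
                or (u in members[b] and v in members[a])
            )
            if links == 0:
                continue
            deg_a = sum(degree[n] for n in members[a])
            deg_b = sum(degree[n] for n in members[b])
            # Change in modularity if the two communities were merged.
            gain = links / m - (deg_a * deg_b) / (2 * m * m)
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, a, b)
        if best is None:
            return levels  # no merge improves modularity any further
        _, a, b = best
        members[a] |= members.pop(b)
        # Record the partition after each merge as one level of the hierarchy.
        levels.append(sorted(sorted(group) for group in members.values()))


# Hypothetical triples standing in for LLM-extracted entities and relations.
triples = [
    ("Ada", "works_with", "Ben"), ("Ada", "works_with", "Cy"),
    ("Ben", "works_with", "Cy"), ("Cy", "reports_to", "Dee"),
    ("Dee", "works_with", "Eli"), ("Dee", "works_with", "Fay"),
    ("Eli", "works_with", "Fay"),
]
levels = hierarchical_communities(triples)
for depth, partition in enumerate(levels):
    print(depth, partition)
```

Each entry in `levels` is a coarser grouping than the one before it, which is the property a hierarchical index needs: summaries can be written for small, tightly connected groups as well as for the larger groups that contain them.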

Each community summary describes its entities and their relationships, offering an overview of a dataset without needing to know specific questions in advance. In recent evaluations, GraphRAG demonstrated its ability to answer “global questions” that address the entire dataset, a task where naive RAG approaches often fail. By considering all input texts, GraphRAG’s community summaries provide more comprehensive and diverse answers.

Global questions are answered with a map-reduce approach: community reports are grouped up to the LLM context window size, the question is mapped across each group to create community answers, and these are reduced into a final global answer. Comparative studies using GPT-4 showed that GraphRAG outperformed traditional RAG.
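The map-reduce step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GraphRAG's implementation: `llm` is any caller-supplied function that takes a prompt string and returns text, token counts are approximated by whitespace-separated words, and the names `batch_reports` and `global_answer` as well as the prompt wording are hypothetical.

```python
from typing import Callable, List


def batch_reports(reports: List[str], max_tokens: int) -> List[List[str]]:
    """Greedily pack community reports into groups that fit the budget."""
    batches, current, used = [], [], 0
    for report in reports:
        size = len(report.split())  # crude token estimate (assumption)
        if current and used + size > max_tokens:
            batches.append(current)
            current, used = [], 0
        # A single oversized report still gets a batch of its own.
        current.append(report)
        used += size
    if current:
        batches.append(current)
    return batches


def global_answer(
    question: str,
    reports: List[str],
    llm: Callable[[str], str],
    max_tokens: int,
) -> str:
    """Map the question over each batch, then reduce the partial answers."""
    # Map: one call per batch produces a community-level answer.
    partial_answers = [
        llm(f"Question: {question}\n\nReports:\n" + "\n".join(batch))
        for batch in batch_reports(reports, max_tokens)
    ]
    # Reduce: one final call combines the community answers.
    return llm(
        f"Question: {question}\n\nPartial answers:\n" + "\n".join(partial_answers)
    )
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client, so the batching and reduction logic can be exercised with a stub function.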