Sunday, April 21, 2024

The ABC of AI: Retrieval-Augmented Generation (RAG) and grounding

This is the first in a series of blog posts about more advanced generative AI and Large Language Model (LLM) concepts, which I use as notes to myself (check out Why I blog; you might want to consider it as well).

Retrieval-augmented generation (RAG) is an AI framework that enhances the quality of responses generated by large language models (LLMs). LLMs are trained on a massive amount of data and learn statistical relationships between words, but they lack true comprehension of their meaning. So when they are faced with specific questions in a dynamic or domain-specific context they can fall short, and that is where RAG comes in.

RAG integrates information retrieval into LLM answers by using these steps:

  1. User inputs a prompt: when you ask a question, RAG starts from your input prompt
  2. RAG retrieves relevant information from an external knowledge base based on the user prompt
  3. RAG combines this external content with your original prompt, creating a richer input for the LLM, which then generates a grounded response (see the sketch below)
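
To make these steps concrete, here is a minimal sketch of the flow in Python. The tiny in-memory knowledge base, the keyword-overlap retriever and the call_llm() placeholder are all my own assumptions for illustration; a real setup would use an embedding-based retriever and an actual LLM API.

```python
# Minimal RAG sketch. KNOWLEDGE_BASE, retrieve() and call_llm() are
# illustrative placeholders, not a specific product's implementation.

KNOWLEDGE_BASE = [
    "Our support desk is open Monday to Friday, 09:00-17:00 CET.",
    "Premium customers get a 4-hour response time on critical tickets.",
    "Password resets are handled via the self-service portal.",
]

def retrieve(prompt: str, top_k: int = 2) -> list[str]:
    """Step 2: rank knowledge-base entries by naive keyword overlap with the prompt."""
    prompt_words = set(prompt.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(prompt_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_augmented_prompt(prompt: str) -> str:
    """Step 3: combine the retrieved context with the original prompt."""
    context = "\n".join(retrieve(prompt))
    return f"Answer using only this context:\n{context}\n\nQuestion: {prompt}"

def call_llm(augmented_prompt: str) -> str:
    """Placeholder for whatever LLM API you use (OpenAI, Azure OpenAI, etc.)."""
    return f"<LLM response to: {augmented_prompt!r}>"

# Step 1: the user prompt drives the whole flow.
print(call_llm(build_augmented_prompt("When can I reach the support desk?")))
```

The important point is that the LLM only ever sees the augmented prompt, so its answer can be grounded in data it was never trained on.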
RAG and grounding are related concepts in the context of enhancing LLMs. Grounding is the process of providing an LLM with information that is specific to your use case. RAG is one of the techniques used for grounding; another technique is dense retrieval (see Dense X Retrieval: What retrieval granularity should we use for more details).

Microsoft Copilot offers a good example of grounding and RAG in use. The reason Copilot is able to give more targeted responses is that it uses grounding to improve the specificity of the prompt.

Copilot uses Microsoft Graph, which can retrieve information about relationships between users, activities and organizational data (such as information in Power Platform/Dataverse and/or Dynamics 365, and information from e-mails, chats, documents and meetings) as part of the prompt grounding process. Copilot combines the user prompt with the additional information retrieved through Microsoft Graph and then sends the result to the LLM. For more details, see Microsoft Copilot for Microsoft 365 overview.
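
Copilot's internal pipeline is not public, so the following is only a conceptual sketch of what grounding a prompt with Microsoft Graph data could look like. The ACCESS_TOKEN value, the choice of the /me/messages endpoint as the context source, and the grounded_prompt() helper are my own assumptions for illustration, not how Copilot actually works.

```python
# Conceptual sketch only: Copilot's grounding pipeline is not public.
# This just shows the idea of pulling user context from Microsoft Graph and
# folding it into the prompt before it goes to an LLM.
import requests

ACCESS_TOKEN = "<OAuth token with Mail.Read permission>"  # placeholder

def recent_mail_snippets(top: int = 5) -> list[str]:
    """Fetch subjects and previews of the user's most recent e-mails via Microsoft Graph."""
    resp = requests.get(
        "https://graph.microsoft.com/v1.0/me/messages",
        params={"$top": top, "$select": "subject,bodyPreview"},
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()
    return [f"{m['subject']}: {m['bodyPreview']}" for m in resp.json()["value"]]

def grounded_prompt(user_prompt: str) -> str:
    """Combine organizational context (here: recent mail) with the user's prompt."""
    context = "\n".join(recent_mail_snippets())
    return f"Context from the user's mailbox:\n{context}\n\nUser question: {user_prompt}"

# The grounded prompt is what gets sent to the LLM instead of the raw question.
```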


The most common systems used to provide external data to RAG-enabled LLMs are vector databases and feature stores.
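
As a rough illustration of what a vector database does for RAG, here is an in-memory sketch: documents are stored as embedding vectors and retrieval is a nearest-neighbour search by cosine similarity. The embed() function is a deliberately naive stand-in of my own; in practice you would use a proper embedding model and a managed vector store.

```python
# Minimal stand-in for a vector database: documents are stored as embedding
# vectors and retrieval is nearest-neighbour search by cosine similarity.
# embed() is a toy hashing trick; a real setup would use an embedding model.
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy embedding: hash each word into a fixed-size vector (illustration only)."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

documents = [
    "Invoices are paid within 30 days of receipt.",
    "The cafeteria serves lunch between 11:30 and 13:30.",
    "Expense reports must be submitted before the 5th of each month.",
]
index = np.stack([embed(d) for d in documents])  # the "vector database"

def search(query: str, top_k: int = 2) -> list[str]:
    """Return the documents whose embeddings are most similar to the query embedding."""
    scores = index @ embed(query)  # dot product = cosine similarity (vectors are normalised)
    best = np.argsort(scores)[::-1][:top_k]
    return [documents[i] for i in best]

print(search("When do invoices get paid?"))
```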
