This is the first in a series of blog posts about more advanced generative AI and Large Language Model (LLM) concepts, which I use as notes to myself (check out Why I blog; you might want to consider it as well).
Retrieval-augmented generation (RAG) is an AI framework that enhances the quality of responses generated by large language models (LLMs). LLMs are trained on a massive amount of data and understand statistical relationships between words, but lack true comprehension of their meanings. When faced with specific questions in a dynamic context, they often fall short; that is where RAG comes in.
RAG integrates information retrieval into LLM answers by using these steps:
- The user inputs a prompt: when you ask a question, RAG starts from your input prompt
- RAG retrieves relevant information from an external knowledge base based on that prompt
- RAG combines the retrieved content with your original prompt, creating a richer input for the LLM
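The steps above can be sketched in a few lines of Python. The in-memory knowledge base, the keyword-overlap retriever, and the prompt template below are all illustrative assumptions (a real system would call an embedding model and an LLM API), but they show how retrieval output gets folded into the prompt:

```python
# Toy knowledge base standing in for an external data source (assumption).
KNOWLEDGE_BASE = [
    "Dataverse stores business data for Power Platform apps.",
    "Microsoft Graph exposes users, mail, chats and meetings.",
    "Vector databases index embeddings for similarity search.",
]

def retrieve(prompt: str, top_k: int = 2) -> list[str]:
    """Naive keyword-overlap retrieval; real RAG systems use vector search."""
    words = set(prompt.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_augmented_prompt(prompt: str) -> str:
    """Combine retrieved context with the original prompt (RAG step 3)."""
    context = "\n".join(retrieve(prompt))
    return f"Answer using this context:\n{context}\n\nQuestion: {prompt}"

# The augmented prompt, not the raw question, is what gets sent to the LLM.
print(build_augmented_prompt("What does a vector database do?"))
```

The point is that the LLM never sees the knowledge base directly; it only sees the context the retriever chose, spliced into the prompt.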
RAG and grounding are related concepts in the context of enhancing LLMs. Grounding is the process of providing LLMs with information about a specific use-case. RAG is one of the techniques which is used for grounding (another technique is dense retrieval - see Dense X Retrieval: What retrieval granularity should we use for more details).
Microsoft Copilot offers a good example of grounding and RAG in use. The reason why Copilot is able to give more targeted responses is that it uses grounding to improve the specificity of the prompt.
Copilot uses Microsoft Graph, which can retrieve information about relationships between users, activities and organizational data (like info in Power Platform/Dataverse and/or Dynamics 365, and info from e-mails, chats, documents and meetings) as part of the prompt grounding process. Microsoft Copilot takes the user prompt together with the additional info retrieved through Microsoft Graph and sends it to the LLM. For more details see Microsoft Copilot for Microsoft 365 overview.
The most common systems to provide external data for RAG LLMs are vector databases and feature stores.
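At its core, a vector database answers a similarity-search query: given an embedding of the user prompt, find the stored documents whose embeddings are closest, typically by cosine similarity. The tiny hand-made vectors and document names below are purely illustrative assumptions; a real system would get embeddings from a model:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Pretend index: document name -> embedding (hand-made for illustration).
index = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.8, 0.2],
    "privacy notice": [0.0, 0.2, 0.9],
}

# Pretend embedding of the query "how do I get a refund?" (assumption).
query_embedding = [0.85, 0.15, 0.05]

# Nearest-neighbour lookup: the document most similar to the query.
best = max(index, key=lambda name: cosine(query_embedding, index[name]))
print(best)  # → refund policy
```

Production vector databases (such as the Azure Cosmos DB vector support referenced below) replace this linear scan with approximate nearest-neighbour indexes so the lookup stays fast at scale.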
References:
- Implement Retrieval Augmented Generation (RAG) with Azure OpenAI Service (Microsoft Learn)
- RAGAs - How to evaluate RAG pipelines
- Intro to Retrieval-Augmented Generation (RAG) with Generative AI and OpenAI on Azure
- Azure Cosmos DB - Vector Database
- How vector search and semantic ranking improve your GPT prompts (Microsoft Mechanics)
- The 5 types of LLM apps (YouTube)
- RAG application with Azure OpenAI & Azure Cognitive Search (French legal use case - Python notebook)
- Microsoft Copilot for Microsoft 365 overview
- Grounding language model with chunking-free in-context retrieval
- Grounding LLMs
- Langchain - use cases - Q&A with RAG
- Retrieval Augmented Generation (RAG) for LLMs