
The RAG Loop

Follow a query through the complete RAG pipeline — from user question to AI-generated answer grounded in your data.

RAG in one sentence: Instead of hoping the LLM memorized the answer, we find the relevant documents and hand them to the LLM along with the question. The pipeline has six distinct steps, each transforming the data in a specific way.
The six steps:
  1. User Query — A natural language question is the starting point of every RAG loop.
  2. Embed Query — The question is converted to a vector using the same embedding model that processed the documents, so they share the same semantic space.
  3. Vector Search — The query vector is compared against all stored document vectors using cosine similarity. An HNSW index makes this fast even across millions of chunks.
  4. Retrieve Chunks — The top-K most similar chunks are fetched with their text, similarity scores, and source metadata.
  5. Augment Prompt — The retrieved chunks are inserted into a prompt template alongside the original question, giving the LLM the context it needs.
  6. LLM Response — The model generates an answer grounded in the retrieved context, not in potentially outdated training data.
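Steps 2 through 4 can be sketched with plain NumPy. This is a brute-force version of the vector search: it normalizes every vector so a dot product equals cosine similarity, then takes the top-K scores. The toy 3-dimensional "embeddings" stand in for the hundreds of dimensions a real model emits, and an HNSW index would replace the exhaustive scan at scale, returning the same ranking much faster.

```python
import numpy as np

def cosine_top_k(query_vec, doc_vecs, k=3):
    """Return indices and scores of the k chunks most similar to the query.

    Brute-force cosine similarity over all stored vectors; a production
    system would get the same ranking from an HNSW index instead of
    scanning every row.
    """
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                       # cosine similarity per chunk
    top = np.argsort(scores)[::-1][:k]   # highest scores first
    return top, scores[top]

# Toy 3-dimensional vectors for illustration only.
docs = np.array([[1.0, 0.0, 0.0],
                 [0.9, 0.1, 0.0],
                 [0.0, 1.0, 0.0]])
query = np.array([1.0, 0.05, 0.0])

idx, scores = cosine_top_k(query, docs, k=2)
```

Because both the query and the documents pass through the same embedding model, "similar meaning" reliably shows up as a high cosine score, which is what makes the retrieval step work at all.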
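Step 5, augmenting the prompt, is ordinary string templating. A minimal sketch is below; the template wording, the chunk fields (`source`, `text`), and the instruction to admit insufficient context are illustrative choices, not a fixed format — teams tune their own templates.

```python
def augment_prompt(question, chunks):
    """Insert retrieved chunks into a prompt template alongside the question.

    `chunks` is assumed to be a list of dicts with `source` and `text`
    keys, matching the metadata retrieved in step 4.
    """
    context = "\n\n".join(
        f"[Source: {c['source']}]\n{c['text']}" for c in chunks
    )
    return (
        "Answer the question using only the context below.\n"
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )

chunks = [
    {"source": "handbook.md",
     "text": "Refunds are processed within 5 business days."},
]
prompt = augment_prompt("How long do refunds take?", chunks)
```

Keeping the source names in the prompt also lets the model cite where its answer came from, which makes the final response easier to verify.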
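The six steps wire together into a short loop. In this sketch, `embed`, `search`, and `llm` are hypothetical callables standing in for whatever embedding model, vector store, and chat model you use — they are not a specific library's API.

```python
def rag_answer(question, embed, search, llm, k=3):
    """One pass through the RAG loop.

    embed(text) -> vector, search(vector, k) -> list of chunk texts,
    and llm(prompt) -> answer are placeholders for real components.
    """
    q_vec = embed(question)              # step 2: embed the query
    chunks = search(q_vec, k)            # steps 3-4: search and retrieve
    prompt = (                           # step 5: augment the prompt
        "Context:\n" + "\n\n".join(chunks) +
        f"\n\nQuestion: {question}"
    )
    return llm(prompt)                   # step 6: grounded answer

# Stub components to show the data flow end to end.
answer = rag_answer(
    "What is our refund policy?",
    embed=lambda text: [0.0, 0.0, 0.0],
    search=lambda vec, k: ["Refunds take 5 business days."],
    llm=lambda prompt: prompt,           # echo the prompt for illustration
)
```

The key property of the loop is that the model only ever sees the question plus freshly retrieved context, so the answer tracks your data rather than whatever the model memorized during training.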