💡 WRAP-UP
Built a complete RAG system with:
✅ Optimized retrieval (k=3, 86.67% precision)
✅ Evaluated prompts (8.0/10 quality)
✅ Real-time monitoring (7 charts)
✅ Full Docker deployment
✅ Hallucination prevention
#LLMZOOMCAMP #BuildInPublic
🔄 REPRODUCIBILITY
Everything needed to run this:
📦 requirements.txt with pinned versions
🐳 Docker Compose for one-command deploy
📚 Complete documentation
🎯 Sample data included
Clone, configure API key, run. That's it!
#LLMZOOMCAMP
⏱️ PERFORMANCE NUMBERS
• Retrieval: < 1 second
• Processing: 1,400 chunks/min
• Batch size: 5,000 docs
• Dataset: 10+ technical books (15,354 chunks)
Fast enough for real-time queries!
#LLMZOOMCAMP #Performance
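A quick back-of-the-envelope check that those numbers hang together (values taken straight from the posts above, constants named for illustration):

```python
import math

CHUNKS_PER_MIN = 1_400   # measured processing speed
TOTAL_CHUNKS = 15_354    # full dataset: 10+ technical books
BATCH_SIZE = 5_000       # docs sent per ingestion batch

minutes = TOTAL_CHUNKS / CHUNKS_PER_MIN
batches = math.ceil(TOTAL_CHUNKS / BATCH_SIZE)
print(f"{batches} batches, ~{minutes:.0f} min end to end")
```

So a full re-ingestion is 4 batches and roughly 11 minutes, which is what makes "set it and forget it" ingestion practical.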
🔥 SMART INGESTION
Auto-detects existing vector DB or creates new one
Handles PDFs + TXT files
Batch processing for large collections
Graceful error handling
Set it and forget it!
#LLMZOOMCAMP #DataEngineering
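The auto-detect step boils down to "reuse the persisted store if it's there, otherwise build it." A minimal sketch, assuming a local persistence directory (path and helper names are illustrative, not the project's actual code):

```python
from pathlib import Path

PERSIST_DIR = Path("chroma_db")  # illustrative persistence path

def db_exists(persist_dir: Path) -> bool:
    """Consider a vector store present if the persistence
    directory exists and is non-empty."""
    return persist_dir.is_dir() and any(persist_dir.iterdir())

def get_or_build_store(persist_dir: Path = PERSIST_DIR) -> str:
    if db_exists(persist_dir):
        return "loaded existing vector DB"
    # otherwise: read PDFs/TXT, chunk, embed, and persist in batches
    return "built new vector DB"
```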
📊 EVALUATION FRAMEWORK
Retrieval: Precision + keyword relevance
LLM: Quality scoring (accuracy, depth, honesty)
Ran ~50 test queries across both evaluations.
Measure everything. Improve what matters.
#LLMZOOMCAMP #MLOps
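The retrieval half of such a framework can be as simple as keyword-based precision@k. A sketch of that idea (the scoring rule here is an assumption, not the project's exact formula):

```python
def precision_at_k(retrieved_chunks, expected_keywords, k=3):
    """Fraction of the top-k chunks that mention at least one
    expected keyword (case-insensitive)."""
    top = retrieved_chunks[:k]
    hits = sum(
        any(kw.lower() in chunk.lower() for kw in expected_keywords)
        for chunk in top
    )
    return hits / len(top) if top else 0.0

score = precision_at_k(
    ["Docker uses namespaces.", "Compose files are YAML.", "Unrelated text."],
    ["docker", "compose"],
)
```

Averaging this over a test set of queries gives a single precision number per retrieval configuration.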
🛡️ PREVENTING HALLUCINATIONS
Tested with out-of-scope questions.
System correctly says "I cannot tell you based on the provided context" instead of making things up.
Honesty > Confidence
#LLMZOOMCAMP #AIEthics
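That refusal behavior usually comes from the prompt itself: a grounding instruction plus an explicit fallback phrase. A minimal sketch (wording and template names are illustrative):

```python
GROUNDED_PROMPT = """Answer the question using ONLY the context below.
If the context does not contain the answer, reply exactly:
"I cannot tell you based on the provided context"

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(context: str, question: str) -> str:
    """Fill the grounding template with retrieved chunks and the user query."""
    return GROUNDED_PROMPT.format(context=context, question=question)
```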
🎨 USER EXPERIENCE
Two-tab Streamlit interface:
1. Q&A System with source previews
2. Analytics Dashboard
Auto-initialization on startup = zero-config for users
Good UX = better adoption!
#LLMZOOMCAMP #UX
SCALING CHALLENGES
Hit API limits at 15,000+ document chunks!
Solution: Batch processing (5,000 chunks/batch)
Result: ~1,400 chunks/min processing speed
Always plan for scale from day one.
#LLMZOOMCAMP #Scaling
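The batching fix is a few lines: slice the chunk list so no single embedding call exceeds the API limit. A sketch (function name is illustrative):

```python
def batched(items, batch_size=5_000):
    """Yield successive slices so no single API call exceeds the limit."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# e.g. 15,354 chunks -> batches of 5000, 5000, 5000, 354
sizes = [len(b) for b in batched(list(range(15_354)))]
```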
🐳 FULL CONTAINERIZATION
Docker Compose with:
• Named volumes for persistence
• Health checks
• Resource limits (2 CPU, 4GB RAM)
• Non-root user for security
• Auto-restart policies
One command deploy!
#LLMZOOMCAMP #DevOps #Docker
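A docker-compose.yml covering those points might look roughly like this (service name, port, volume path, and health endpoint are illustrative; note that `deploy.resources.limits` is honored by `docker compose` but originated as a Swarm field):

```yaml
services:
  app:
    build: .
    user: "1000:1000"                # non-root user for security
    ports:
      - "8501:8501"                  # Streamlit default port
    volumes:
      - chroma_data:/app/chroma_db   # named volume for persistence
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4g                 # resource limits (2 CPU, 4GB RAM)
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8501/_stcore/health"]
      interval: 30s
      retries: 3
    restart: unless-stopped          # auto-restart policy
volumes:
  chroma_data:
```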
📊 MONITORING MATTERS
Built an integrated dashboard with 7 real-time charts:
- Feedback distribution
- Response times
- Query volume
- Activity patterns
User feedback: 👍/👎 buttons after every answer
#LLMZOOMCAMP #DataViz
🤖 PROMPT ENGINEERING
Tested 4 prompt templates on quality:
• Expert Technical: 8.0/10 ✅
• Detailed Context: 7.9/10
• Structured: 7.0/10
• Concise: 6.2/10
Comprehensive wins over brevity for technical Q&A!
#LLMZOOMCAMP #PromptEngineering
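A template comparison like this reduces to averaging judge scores per template and taking the max. Schematically (the judge is stubbed out here; names and the toy scoring rule are illustrative):

```python
from statistics import mean

def compare_templates(templates, questions, judge):
    """Run every question through every template and average the
    judge's 0-10 quality scores; return the winner and all scores."""
    results = {
        name: mean(judge(tmpl, q) for q in questions)
        for name, tmpl in templates.items()
    }
    return max(results, key=results.get), results

# stub judge for illustration only: longer templates score higher
best, scores = compare_templates(
    {"concise": "Answer: {q}",
     "expert": "You are an expert. Explain {q} in depth."},
    ["What is RAG?"],
    judge=lambda tmpl, q: min(10, len(tmpl) / 5),
)
```

In the real system the stub would be replaced by an LLM-as-judge call scoring accuracy, depth, and honesty.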
🔍 RETRIEVAL OPTIMIZATION
Evaluated 4 different approaches:
• Semantic (k=3): 86.67% precision ✅
• Semantic (k=5): 84.00%
• Semantic (k=10): 84.00%
• MMR (k=5): 84.00%
Less is more! k=3 won with best relevance.
#LLMZOOMCAMP #MachineLearning
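For reference, MMR (one of the four approaches above) greedily trades off relevance against redundancy. A minimal sketch over precomputed similarity scores (the lambda value and inputs are illustrative):

```python
def mmr_select(query_sims, doc_sims, k=5, lam=0.5):
    """Greedy Maximal Marginal Relevance.
    query_sims[i]  : similarity of doc i to the query
    doc_sims[i][j] : similarity between docs i and j
    """
    selected, candidates = [], list(range(len(query_sims)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            # penalize docs too similar to anything already picked
            redundancy = max((doc_sims[i][j] for j in selected), default=0.0)
            return lam * query_sims[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected

# docs 0 and 1 are near-duplicates; MMR picks 0, then skips 1 for 2
picked = mmr_select(
    query_sims=[0.9, 0.85, 0.3],
    doc_sims=[[1.0, 0.95, 0.1], [0.95, 1.0, 0.1], [0.1, 0.1, 1.0]],
    k=2,
)
```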
🛠️ TECH STACK
• LLM: Google Gemini 2.5 Pro
• Embeddings: text-embedding-004
• Vector DB: ChromaDB
• Framework: LangChain
• UI: Streamlit
• Container: Docker
All production-ready with monitoring!
#LLMZOOMCAMP #TechStack
🔍 THE PROBLEM
Ever spent hours searching through multiple technical PDFs for one piece of info? Me too!
DocuMind solves this with AI-powered semantic search. Ask questions in natural language, get instant answers with sources.
#LLMZOOMCAMP #RAG
🚀 Just completed my #DataTalksClub LLM Zoomcamp project: DocuMind - an end-to-end RAG system for technical documents!
Built with Google Gemini, LangChain, ChromaDB & Streamlit.
Let me share what I learned... 🧵
#LLMZOOMCAMP #BuildInPublic #AI
Just completed my #LLMZoomcamp final project: AA Assistant, a RAG-powered chatbot providing trustworthy Alcoholics Anonymous information to people seeking help.
Tech Stack
• NVIDIA NIM for LLM inference
• Jina embeddings v2 for Spanish/English
• FastAPI + Qdrant vector DB
github.com/marcelonieva...
🤖 Agentic RAG + Function Calling + MCP = AI superpowers.
RAG gives LLMs fresh knowledge, Function Calling lets them trigger actions, and MCP standardizes tool access across platforms. Open standard → smarter, action-driven AI.
#AI #LLM #RAG #MCP #FunctionCalling #LLMZoomcamp
🔌 MCP: The USB Port for AI Tools
MCP (Model Context Protocol) gives LLMs a toolbox: tools are discoverable, callable, and work the same across platforms. No more glue code. MCP = open standard → LLMs + tools = instant collaboration.
#AI #LLM #MCP #ModelContextProtocol #AItools #LLMZoomcamp
🛠️ Function Calling lets AI run tools like APIs during a conversation.
In Agentic RAG, this means: fetch live data ✅
run computations ✅
trigger services ✅
💡 Your AI doesn't just talk, it gets things done.
#AI #FunctionCalling #AgenticRAG #LLM #LLMZoomcamp
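Under the hood, function calling is a dispatch loop: the model emits a structured call, your code executes it and feeds the result back. A schematic sketch (the tool registry and call format here are illustrative; real provider APIs describe tools with JSON schemas):

```python
TOOLS = {
    "get_weather": lambda city: f"22C and sunny in {city}",  # stub for a live API
    "add": lambda a, b: a + b,
}

def dispatch(call: dict):
    """Execute a model-emitted tool call shaped like
    {"name": "add", "args": {"a": 2, "b": 3}}."""
    fn = TOOLS[call["name"]]
    return fn(**call["args"])

result = dispatch({"name": "add", "args": {"a": 2, "b": 3}})
```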
🔍 Just explored Agentic RAG: Retrieval-Augmented Generation with autonomous agents.
📌 It doesn't just find info, it decides how to use it.
Like giving AI both a library card and a research assistant.
#AI #RAG #AgenticRAG #MachineLearning #LLMZoomcamp
The future of LLM evaluation involves a greater emphasis on context, ethical considerations, and user-centric metrics. Collaboration across research and industry will be vital. #llmzoomcamp
Challenges in LLM evaluation include data contamination, outdated benchmarks, and the inherent subjectivity of human judgments. Addressing these requires ongoing innovation. #AIChallenges #LLMEvaluation #llmzoomcamp
Key metrics like perplexity, BLEU, and ROUGE are valuable. However, multi-faceted approaches are essential to capture the nuances of LLM performance across different tasks and use cases. #LLMmetrics #NLP #llmzoomcamp
Building robust LLM applications requires continuous evaluation, from pre-production testing to post-production monitoring with real user data #llmzoomcamp
The use of "LLM-as-a-judge" is promising for evaluating LLMs at scale. However, it's important to remember that they inherit the biases and limitations of LLMs themselves.
#llmzoomcamp
Benchmarking LLMs helps to understand their capabilities, but real-world scenarios reveal their true performance. Avoid relying solely on leaderboards #llmzoomcamp
Evaluating LLMs goes beyond accuracy. Assessing relevance, coherence, factual correctness, fairness, and safety is necessary to ensure they are truly useful and reliable. #llmzoomcamp
🚀 Built a comprehensive search evaluation system this week! Learned to compare multiple search approaches systematically. Now I can evaluate any search system with confidence! #LLMZOOMCAMP #SearchEvaluation #VectorSearch
⚡ Key learning: Different search methods have different strengths! Learned when to use exact text search vs semantic vector search vs scalable vector databases. Context matters! #LLMZOOMCAMP
🎯 Explored ROUGE evaluation for text generation quality! Learned how to measure how well generated text matches reference text - crucial skill for building better RAG systems! #LLMZOOMCAMP
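ROUGE-1, the simplest variant, is just unigram overlap between generated and reference text. A minimal sketch of the F1 form (real evaluations use a library such as rouge-score, which adds stemming and the longest-common-subsequence ROUGE-L variant):

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: harmonic mean of clipped unigram precision and recall."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # each word counted at most min(ref, cand) times
    if not overlap:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
```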