Medical Chatbot RAG
Flask + Pinecone + LangChain retrieval app with a containerized AWS EC2 deployment path.
A retrieval-augmented medical assistant built to surface relevant health information through semantic search and context-aware generation. The project focused on turning a knowledge base into a usable chatbot interface with a containerized deployment path to AWS.
- Built an end-to-end workflow from embedding generation to chat interface
- Used Pinecone retrieval so answers were based on indexed content
- Explored containerized deployment and CI/CD-oriented AWS hosting patterns
Impact
A working end-to-end retrieval app, plus hands-on practice with containerized deployment
Role
Python / AI Developer
Timeline
2024
Problem
Users need fast access to understandable medical information, but answers should stay tied to source material in a high-trust domain.
Solution
Combined semantic retrieval with LLM response generation so answers were informed by retrieved domain content.
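As a minimal sketch of how retrieved content can be tied into generation, the helper below assembles a prompt from passages that a retrieval step has already returned. The function name `build_grounded_prompt`, the prompt wording, and the passage numbering are illustrative assumptions, not the project's actual code.

```python
from typing import List


def build_grounded_prompt(question: str, passages: List[str]) -> str:
    """Assemble a prompt that asks the model to answer only from the passages.

    Hypothetical helper: the wording is an assumption, not the project's prompt.
    """
    if not passages:
        # With no retrieved context, make that explicit so the model can decline.
        context = "(no relevant passages were retrieved)"
    else:
        # Number each passage so an answer can point back to its source.
        context = "\n".join(
            f"[{i}] {p.strip()}" for i, p in enumerate(passages, start=1)
        )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )
```

The returned string would then be sent to the language model; keeping the passages numbered makes it easier to show users which source an answer drew on.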
Architecture
A Flask application serves the chat experience, Pinecone stores the vector index, LangChain orchestrates retrieval, and OpenAI handles generation over retrieved context.
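The division of responsibilities described above can be sketched with plain Python callables. In this sketch the retriever and generator are stand-ins: in the described architecture the retriever would query Pinecone through LangChain and the generator would call an OpenAI model. `ChatService`, `fake_retrieve`, `fake_generate`, and the `message` payload key are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Pluggable stages. Real implementations would wrap the vector store and the
# LLM; plain callables keep this sketch runnable on its own.
Retriever = Callable[[str, int], List[str]]
Generator = Callable[[str, List[str]], str]


@dataclass
class ChatService:
    """Hypothetical service object that a web view could delegate to."""

    retrieve: Retriever
    generate: Generator
    top_k: int = 3

    def handle(self, payload: Dict[str, str]) -> Dict[str, object]:
        question = (payload.get("message") or "").strip()
        if not question:
            return {"error": "empty message", "status": 400}
        # Retrieval runs first so generation only sees retrieved context.
        passages = self.retrieve(question, self.top_k)
        answer = self.generate(question, passages)
        return {"answer": answer, "sources": passages, "status": 200}


# Stand-in components, for illustration only (not the project's data).
def fake_retrieve(question: str, k: int) -> List[str]:
    corpus = {
        "hypertension": "Hypertension is persistently elevated blood pressure.",
        "anemia": "Anemia is a reduced red blood cell count.",
    }
    q = question.lower()
    hits = [text for topic, text in corpus.items() if topic in q]
    return hits[:k]


def fake_generate(question: str, passages: List[str]) -> str:
    # A real generator would prompt an LLM with the passages as context.
    return passages[0] if passages else "I do not know."


service = ChatService(retrieve=fake_retrieve, generate=fake_generate)
```

A Flask route could pass the parsed JSON body to `service.handle` and return the resulting dict as JSON. Keeping the two stages injectable also lets the chat flow be exercised without network access.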
Challenges
- Balancing answer usefulness with domain sensitivity
- Keeping retrieval quality high across varied medical questions
- Reducing latency without oversimplifying the pipeline
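One possible way to approach the retrieval-quality and latency challenges, shown here only as a sketch and not as the project's method, is to drop matches below a similarity threshold and to cache query embeddings. The toy bag-of-words `embed` function, the `VOCAB` tuple, and the `min_score` default are assumptions that stand in for a real embedding model and a tuned cutoff.

```python
import math
from functools import lru_cache
from typing import List, Sequence, Tuple

VOCAB = ("fever", "headache", "rash", "cough")  # toy vocabulary (assumption)


@lru_cache(maxsize=1024)
def embed(text: str) -> Tuple[float, ...]:
    """Toy bag-of-words embedding; cached so repeated texts cost nothing."""
    words = text.lower().split()
    return tuple(float(words.count(term)) for term in VOCAB)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query: str, docs: List[str], min_score: float = 0.5
) -> List[Tuple[float, str]]:
    """Rank docs by similarity and keep only those at or above min_score."""
    q = embed(query)
    scored = [(cosine(q, embed(d)), d) for d in docs]
    kept = [(s, d) for s, d in scored if s >= min_score]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)
```

An empty result from `search` gives the application a clear signal to decline rather than generate from weak context, which is one way to respect domain sensitivity.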