GenAI Data Pipeline
In this post, I recap a conversation with Asish Nelapati on how to architect and deploy real-world Generative AI systems. We covered the full pipeline—from data ingestion and chunking to embedding, RAG vs fine-tuning, and model deployment strategies. It’s a practical guide for anyone looking to build intelligent, production-ready AI workflows.
This week, I had the chance to sit down with Asish Nelapati — someone I’ve worked with for over a year and whom I turn to when I want to think deeply (and practically) about Data Science, AI/ML, and full-stack systems.
Asish recently completed his master’s from Penn State and brings a rare blend of academic rigor and engineering versatility — from LLM research papers to React Native interfaces. It’s that full-spectrum perspective that makes conversations with him so rich.
In this session, we broke down what it really takes to go from raw data to a working GenAI system. Here are a few highlights.
Segment 1 – Designing the Data Pipeline
We walked through a typical Generative AI pipeline:
• Start with data sources: PDFs, HTML, transcripts, SQL/NoSQL databases
• Clean and preprocess the content (strip scripts, dedupe, etc.)
• Chunk the data – via fixed token size or semantic grouping
• Generate embeddings, store them in a vector DB along with metadata
• Use retrieval + context injection (e.g., user role, chat history) to power dynamic prompts
The goal? Structured, searchable knowledge that makes your chatbot feel like it actually knows what it’s talking about.
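The pipeline steps above can be sketched end to end in a few lines. This is a minimal toy illustration, not a production recipe: the fixed-size chunker and the bag-of-words "embedding" stand in for a real tokenizer and embedding model (e.g., a sentence-transformers or hosted encoder), and a plain list plays the role of the vector DB.

```python
import math
from collections import Counter

def chunk_fixed(tokens, size, overlap=0):
    """Step 3: split a token list into fixed-size chunks (with optional overlap)."""
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, len(tokens), step)]

def embed(tokens):
    """Step 4 (toy version): bag-of-words counts as a stand-in embedding.
    A real pipeline would call an embedding model here."""
    return Counter(tokens)

def cosine(a, b):
    """Similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def index(chunks):
    """Step 4: store (embedding, metadata) pairs — our stand-in 'vector DB'."""
    return [(embed(c), {"chunk_id": i, "text": " ".join(c)})
            for i, c in enumerate(chunks)]

def retrieve(store, query_tokens, k=2):
    """Step 5: rank stored chunks by similarity to the query."""
    q = embed(query_tokens)
    ranked = sorted(store, key=lambda rec: cosine(q, rec[0]), reverse=True)
    return [meta for _, meta in ranked[:k]]
```

The retrieved chunk text (plus user role, chat history, etc.) is what gets injected into the prompt as context before the LLM call.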
Segment 2 – RAG vs Fine-Tuning
A classic question: When do you fine-tune vs build a Retrieval-Augmented Generation (RAG) pipeline?
- Use RAG when your data changes frequently
- Use Fine-tuning when deep domain knowledge is required
Also, PSA: “training” ≠ “fine-tuning” — and many people use the terms interchangeably.
Segment 3 – Model Deployment = Software Release
Your model is ready — now what?
You’ve got 3 main paths:
1. Managed: Azure ML, Amazon Bedrock, etc.
2. Self-hosted: EC2, Kubernetes
3. Specialized platforms: Hugging Face, Scale AI
Model serving needs its own versioning, testing, and release cadence — just like any other software product.
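To make the "versioning and release cadence" point concrete, here's a minimal in-memory sketch of a model registry with promotion stages. The class names and the staged → canary → production lifecycle are illustrative assumptions; real setups would lean on something like MLflow's Model Registry or a cloud-native equivalent.

```python
from dataclasses import dataclass

@dataclass
class ModelVersion:
    name: str
    version: str            # semantic version, e.g. "1.2.0"
    artifact_uri: str       # where the weights live (object store path, model hub repo, ...)
    status: str = "staged"  # illustrative lifecycle: staged -> canary -> production -> retired

class ModelRegistry:
    """Toy registry tracking which version of each model is live."""

    def __init__(self):
        self._versions = {}  # (name, version) -> ModelVersion

    def register(self, mv: ModelVersion):
        self._versions[(mv.name, mv.version)] = mv

    def promote(self, name, version, status):
        """Move a version through the release stages, like cutting a software release."""
        self._versions[(name, version)].status = status

    def production(self, name):
        """Resolve which version serving traffic should load."""
        for (n, _), mv in self._versions.items():
            if n == name and mv.status == "production":
                return mv
        return None
```

The payoff of treating models this way: rollbacks become a one-line `promote` call, and serving code always resolves "the production version" instead of hard-coding a checkpoint path.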
We wrapped up with this: there’s no one-size-fits-all in Generative AI. But having a mental map helps — and we hope this conversation gave you one.
Curious which parts of this you’ve already tackled — and where you’re stuck. RAG vs fine-tuning? Chunking vs embedding? Serving vs scaling?
Drop your thoughts below or shoot me a DM — happy to dive deeper.
PS: it took me a lot more time than I anticipated to edit this video — hope you find it helpful 😀
#AI #GenAI #RAG #LLM #DataScience #AIStack #BuildInPublic