Blog

Articles, tutorials, and posts.

Jun 6, 2026 · 54 min read

Building LLM Fine-Tuning Data Without Hand-Labeling a Single Example

Fine-tuning a language model on domain knowledge sounds manageable until you estimate the labor involved. A single high-quality instruction-response pair might…

llm, fine-tuning, data-synthesis, langchain, python

Series

Jun 2, 2026 · 8 min read

Problem Solving with DSA

A Python-first walkthrough of the data structures and techniques that show up in coding interviews and real systems work.

Series

Jun 1, 2026 · 4 min read

Machine Learning

A comprehensive guide to machine learning — from fundamentals to advanced techniques

May 31, 2026 · 10 min read

Building a SQL Agent with MLflow Observability

[Architecture diagram: SQL Agent with MLflow Observability. Labels: User, natural language SQL question; SQL Agent, LangChain + NVIDIA, ReAct / Too…]

sql, mlflow, langchain, agents, observability

May 24, 2026 · 14 min read

From Raw Text to Reasoning Conversations: Building a Robust Data Pipeline for LLM Fine-Tuning

A practical, end‑to‑end guide to building a synthetic data pipeline for LLM fine‑tuning, covering single‑turn and multi‑turn conversation generation, chain‑of‑thought reasoning, and intelligent data creation from raw text using Self‑Instruct, agentic workflows, and NVIDIA NeMo Data Designer.

llm-fine-tuning, synthetic-data, data-pipeline, conversational-ai, alpaca, self-instruct, chain-of-thought, nemo-data-designer

May 23, 2026 · 16 min read

Synthesizing Evaluation Data for RAG Systems: A Deep Dive

The Problem We're Solving Evaluating RAG (Retrieval-Augmented Generation) systems is challenging. You need: - Ground truth Q&A pairs that accurately reflect you…

RAG, LLM, data-synthesis, evaluation, langchain, nvidia-api

May 17, 2026 · 7 min read

Data Engineering for Foundation Models: The Alchemist’s Thought

A human-friendly guide to curating, cleaning, and composing datasets that give large language models their skills and personality—covering data quality, deduplication, annotation, synthesis, and governance, framed as a modern alchemist’s craft.

data-engineering, llm, foundation-models, dataset-curation, ai

May 17, 2026 · 9 min read

The Tale of Meaningful Vectors: Contrastive Learning for Text Embeddings, Told with Pen and Paper

A beginner-friendly, pen-and-paper walkthrough of contrastive learning for text embeddings, from simple averaging encoders and InfoNCE loss to state-of-the-art models like SimCSE and Qwen3 Embedding, revealing how vectors capture meaning and power modern semantic search.

contrastive-learning, text-embeddings, deep-learning

May 11, 2026 · 12 min read

Multimodal RAG: Building a Smart Document Assistant - 100% Local

Welcome to this comprehensive guide on building a multimodal Retrieval-Augmented Generation (RAG) system! In this tutorial, we'll create an AI assistant that ca…

python, qdrant, langgraph, multimodal, rag

May 9, 2026 · 13 min read

How to evaluate an Agentic AI system for reliability and scalability

A practical framework for testing agentic AI beyond unit tests—covering tools, planning, memory, and resilience. Real-world failures, research-backed metrics, and a maturity model to ensure your agent earns user trust.

ai, engineering, agentic-ai, evaluation, llm

May 8, 2026 · 3 min read

Understanding Recursion with Python

Learn the fundamentals of recursion and how to think recursively in Python

python, algorithms, recursion

Apr 12, 2026 · 5 min read

Build Better with AI: Why Design, Evaluation and Real-World Testing Are Your Most Important Skills

AI is handling the execution. Your edge is knowing how to design what it builds, evaluate what it produces, and validate it in the real world.

ai, engineering, agentic-ai, skills

Apr 11, 2026 · 6 min read

The Four Trade-offs Every Agent System Forces You to Make

Building AI agents isn't just about picking a model. It's about making deliberate decisions under constraint — and understanding what you're giving up every time.

AI Engineering, Agent Systems, Multi-Agent, System Design

Series

Apr 11, 2026 · 6 min read

Statistics & Probability

A comprehensive guide to learning statistics and probability

Series

Apr 1, 2026 · 4 min read

Python Tutorials

A comprehensive guide to learning Python programming from scratch

Mar 28, 2026 · 13 min read

Quantization - Making Giant AI Models Fit in the Real World

Hands-on quantization, worked by hand and in NumPy, with detailed explanations.

LLM, Quantization, Math

Feb 23, 2026 · 6 min read

Human In The Loop with LangChain and SQL tools

A hands-on look at how human-in-the-loop works behind the scenes in long, complex workflows.

LLM, LangChain, HumanInTheLoop

Feb 3, 2026 · 21 min read

🔥 Microservices Interview Questions for AI / ML Engineers

A practical deep dive into designing microservices for AI and ML systems, focusing on real-world challenges like latency, scaling, fault tolerance, and system reliability. Written from a distributed systems perspective, not MLOps or model training.

AI, MLOps, Microservices