vLLM Part 1: PagedAttention & the LLM Serving Problem

How vLLM Rethinks Memory Management to Serve LLMs at Scale

Large language models are transforming every corner of software, but serving them in production is brutally expensive. A single LLaMA-13B model needs roughly 26 GB of GPU memory in 16-bit precision just for its weights, and that is before accounting for the memory needed to actually process requests. When dozens or hundreds... [Read More]

Quantization with BitsAndBytes: Running Large Models on Consumer Hardware

A practical guide to model quantization using Hugging Face, bitsandbytes, and accelerate

Large language models have grown at a staggering pace. Models like LLaMA-2 70B carry 70 billion parameters, which in full 32-bit floating point precision would require roughly 280 GB of GPU memory just to load the weights. Even a 7B parameter model needs around 28 GB in FP32. For most... [Read More]

A/B Testing Part 3: Execution & Decision-Making

From Running Experiments to Making Confident Deployment Decisions

In Part 1 we covered experiment design fundamentals, and in Part 2 we explored the statistical framework and metric selection. In this final part, we tackle the practical realities of running experiments — the pitfalls that can invalidate your results, the infrastructure needed to run experiments reliably, and the decision-making... [Read More]

A/B Testing Part 2: Statistical Framework & Metrics

Choosing the Right Metrics and Statistical Foundations for A/B Tests

In Part 1 we covered the foundations of A/B testing: what it is, why it matters, and how to design experiments with proper user segmentation and traffic allocation. Now we turn to the statistical machinery that makes A/B testing rigorous — how to determine sample sizes, choose the right metrics,... [Read More]

A/B Testing Part 1: Foundations & Experiment Design

Model Evaluation in the Wild: A Practical Guide to A/B Testing

In the high-stakes world of machine learning deployment, launching a new model is like piloting a spacecraft: every decision matters, and there’s no room for blind leaps of faith. Enter A/B testing, the mission control center of model deployment that transforms uncertainty into calculated progress. Think of A/B testing... [Read More]

Mathematical modeling of Infectious disease spread

Essential concepts related to infectious disease modeling

With the recent COVID-19 pandemic, it is important to develop models that can better track how the disease will spread over time. We can check the total COVID-19 cases that have occurred across the world from January 2020 to the present day. This will provide a... [Read More]

Particles and Quantum mechanics

Essential concepts related to particle physics

This post presents fundamental concepts of subatomic particles and quantum mechanics. Its objective is to give you a clear and precise understanding of quantum mechanics, quantum fields, and the taxonomy of particles. But before delving further into these topics, take a... [Read More]

Blockchain and cryptocurrency

Understanding the essential concepts of blockchain and cryptocurrency

Business networks today are often inefficient because each participant keeps its own records, or ledger, of all transactions with the parties it does business with. This process is expensive because effort is duplicated and intermediaries add costs for their services. One solution to this problem... [Read More]