In Part 1, we explored why LLM serving is expensive, how naive KV cache allocation wastes 60–80% of GPU memory, and how PagedAttention eliminates that waste through OS-style paging. Now it is time to put theory into practice.
[Read More]
vLLM Part 1: PagedAttention & the LLM Serving Problem
How vLLM Rethinks Memory Management to Serve LLMs at Scale
Large language models are transforming every corner of software, but serving them in production is brutally expensive. A single LLaMA-13B model can consume over 26 GB of GPU memory just for its weights, and that is before accounting for the memory needed to actually process requests. When dozens or hundreds...
[Read More]
Quantization with BitsAndBytes: Running Large Models on Consumer Hardware
A practical guide to model quantization using Hugging Face, bitsandbytes, and accelerate
Large language models have grown at a staggering pace. Models like LLaMA-2 70B carry 70 billion parameters, which in full 32-bit floating point precision would require roughly 280 GB of GPU memory just to load the weights. Even a 7B parameter model needs around 28 GB in FP32. For most...
[Read More]
A/B Testing Part 3: Execution & Decision-Making
From Running Experiments to Making Confident Deployment Decisions
In Part 1 we covered experiment design fundamentals, and in Part 2 we explored the statistical framework and metric selection. In this final part, we tackle the practical realities of running experiments — the pitfalls that can invalidate your results, the infrastructure needed to run experiments reliably, and the decision-making...
[Read More]
A/B Testing Part 2: Statistical Framework & Metrics
Choosing the Right Metrics and Statistical Foundations for A/B Tests
In Part 1 we covered the foundations of A/B testing: what it is, why it matters, and how to design experiments with proper user segmentation and traffic allocation. Now we turn to the statistical machinery that makes A/B testing rigorous — how to determine sample sizes, choose the right metrics,...
[Read More]
A/B Testing Part 1: Foundations & Experiment Design
Model Evaluation in the Wild: A Practical Guide to A/B Testing
In the high-stakes world of machine learning deployment, launching a new model is like piloting a spacecraft - every decision matters, and there’s no room for blind leaps of faith. Enter A/B testing, the mission control center of model deployment that transforms uncertainty into calculated progress. Think of A/B testing...
[Read More]
Mathematical modeling of Infectious disease spread
Essential concepts related to infectious disease modeling
With the recent COVID-19 pandemic, it is important to develop models that can better track how the disease propagates over time. We can check the total COVID-19 cases that have occurred across the world from January 2020 to the present day. This will provide a...
[Read More]
Mystery Maths
Underlying mathematical patterns of the Universe
In this post, we will see some of the nice mysteries around the mathematical patterns of the universe. We will see the relationship between energy, frequency and vibration. According to Universal law, everything vibrates at some frequency. Sound is also a vibration and so are thoughts. Everything that manifests itself...
[Read More]
Universe Theories
Theoretical concepts related to Universe
In this post, fundamental concepts of matter, particles and multiverses are presented. The motivation of this post is to build a precise and clear understanding of the different theories related to the exploration of the universe. Theories like the Big Bang, the multiverse, and eternal inflation, among several others, can be subjected to different opinions and...
[Read More]
Particles and Quantum mechanics
Essential concepts related to particle physics
In this post, fundamental concepts related to subatomic particles and quantum mechanics are presented. The objective of this post is to provide you with a clear and precise understanding of topics related to quantum mechanics, quantum fields and the taxonomy of particles. But before delving further into these topics, take a...
[Read More]
MLOps Notes
Notes on MLOps
In this post, we briefly discuss several tools that can be useful when developing and structuring machine learning projects. Specifically, we will focus on the fundamental tools related to MLOps.
[Read More]
Machine Learning model deployment
Ways of deploying machine learning models
Developing intelligent machine learning models that solve a particular problem with considerable accuracy is in itself a great challenge. We can manage to build the best possible model, but unless we know how to put it into production, it’s hard to get it to create the maximum amount of possible...
[Read More]
Interactive Plots using Vega
Fundamentals
Vega is a visualization grammar, a declarative format for creating, saving, and sharing interactive visualization designs. Below are examples of interactive plots created using Vega:
[Read More]
Social Psychology
Essential concepts related to human and social psychology
In this post, fundamental concepts of social psychology will be described. First, we need to understand some preliminary terms related to this branch of psychology and then some of the important topics will be briefly discussed.
[Read More]
Amazon Web Services
Fundamental services related to Amazon Web Services (AWS)
Amazon Web Services (AWS) is built on the idea of on-demand, pay-as-you-go IT services delivered over the internet. These cloud computing web services provide a set of primitive abstract technical infrastructure and distributed computing building blocks and tools. Different services offered by AWS are listed below:
[Read More]
Amazon Web Services S3
Understanding AWS S3
Amazon Simple Storage Service (S3) is an object storage service that stores data as objects within buckets. An S3 bucket is a storage location that holds files, which are referred to as data objects.
[Read More]
SAS Programming
Fundamentals-1 post related to SAS Programming
This post covers SAS programming and describes the basic tools and techniques required for performing elementary data analysis.
[Read More]
Time and Gravity
Fundamental concepts and techniques related to time travel
In this post, fundamental concepts related to time travel are presented. We will first review some basic related concepts and then explore some ways in which time travel might be achieved.
[Read More]
Learning Spanish
Fundamentals
Getting started
Here are some basic Spanish-to-English translations:
[Read More]
Docker
Example of deploying a simple application with Docker
Running Python code in Docker
[Read More]
Map Reduce and Spark
Useful resources related to Map Reduce
Useful resources:
Map Reduce Primer
[Read More]
Blockchain and cryptocurrency
Understanding the essential concepts of blockchain and cryptocurrency
Business networks today are often inefficient because each participant in the network keeps records, or a ledger, of all transactions between all the parties that the business interacts with. This process is expensive because of duplication of effort and intermediaries adding costs for their services. One solution to this problem...
[Read More]
Spatial data tools
Useful resources related to free and open-source spatial data and tools
Useful resources:
Free tools and data sources
[Read More]
MongoDB
Basic concepts related to MongoDB databases
MongoDB is designed to provide high-availability access to your data. It does this by enabling you to maintain redundant copies of your data in a cluster called a replica set. For example, if we configure our Atlas cluster to be a 3-server replica set, then in the event of the...
[Read More]
Docker
Essentials
What is Docker?
At its core, Docker is tooling for managing containers. It simplified an existing technology to make it usable by the masses.
[Read More]
My new Website
Welcome to my world
This is my new website for writing tutorials, thoughts, and useful information about my work.
[Read More]