CausalInferenceLab · jhkimon · May 11, 2026 · May 11, 2026 · May 11, 2026 · May 12, 2026
diff --git a/.claude/settings.json b/.claude/settings.json
@@ -0,0 +1,10 @@
+{
+  "permissions": {
+    "allow": [
+      "Bash(pip list*)",
+      "Bash(pkill -f \"jupyter-book\"*)",
+      "Bash(jupyter-book *)",
+      "Bash(cd /Users/jhkim/Desktop/personal_study/causal/causal-studio/book*)"
+    ]
+  }
+}
diff --git a/book/ipw_basics/ipw_in_practice_en.ipynb b/book/ipw_basics/ipw_in_practice_en.ipynb
diff --git a/book/ipw_basics/ipw_in_practice_ko.ipynb b/book/ipw_basics/ipw_in_practice_ko.ipynb
diff --git a/book/ipw_basics/what_is_ipw_en.ipynb b/book/ipw_basics/what_is_ipw_en.ipynb
@@ -0,0 +1,305 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "59ed8c13",
+   "metadata": {},
+   "source": [
+    "**🌐 Language:** **English** | [한국어 →](/what-is-ipw-ko)\n",
+    "\n",
+    "# What is IPW?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "21eb50f7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "import numpy as np\n",
+    "import statsmodels.formula.api as smf"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "eeefd017",
+   "metadata": {},
+   "source": [
+    "## Does Taking Medication Reduce Hospitalization Days?\n",
+    "\n",
+    "> **Does medication actually help you recover faster?**\n",
+    "\n",
+    "This question touches everyday life. Whether a drug shortens a patient's hospital stay, and which treatments a hospital should adopt as standard care, both depend on getting the answer right. A wrong answer could mean missing an effective treatment, or trusting one that doesn't work.\n",
+    "\n",
+    "Let's consider a specific scenario. Suppose men tend to get sicker than women, so they stay in the hospital longer and take medication more often. Women tend to be less sick, so they have shorter stays and take less medication.\n",
+    "\n",
+    "In this setting, can the data we observe truly reflect the effect of the treatment? How can we use the data to estimate the treatment effect without distortion or bias?"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cb92ecce",
+   "metadata": {},
+   "source": [
+    "Let's reframe our question in the language of causal inference. Rather than simply comparing \"people who took the drug vs. those who didn't,\" we want to compare \"the same person when they took the drug vs. when they didn't.\" Since we can never observe both outcomes for the same person at the same time, we instead target the average effect across the entire population.\n",
+    "\n",
+    "In other words, we ask: \"How much does the average number of hospitalization days differ between a world where everyone takes the drug and one where no one does?\"\n",
+    "\n",
+    "$$\n",
+    "ATE = \\mathbb{E}[Y_1 - Y_0]\n",
+    "$$\n",
+    "\n",
+    "Here, $Y_1$ is the number of hospitalization days when the drug is taken, and $Y_0$ is the number when it is not. In causal inference, this is called the **Average Treatment Effect (ATE)**."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f18b57da",
+   "metadata": {},
+   "source": [
+    "The simplest approach is to compare the average hospitalization days between those who took the drug and those who didn't. This naive comparison is valid when treatment is randomly assigned (RCT) — random assignment makes treatment independent of confounders like sex, so the two groups can be compared directly.\n",
+    "\n",
+    "But our situation is different. Most people who took the drug are male, and most who didn't are female. Since men tend to be sicker to begin with, their longer hospital stays may have nothing to do with the drug — **it's sex, not the drug, that's driving the difference**.\n",
+    "\n",
+    "```mermaid\n",
+    "graph LR\n",
+    "    Treatment[Drug T] --> Outcome[Hospitalization Days Y]\n",
+    "    Sex[Sex X] --> Treatment\n",
+    "    Sex --> Outcome\n",
+    "```\n",
+    "\n",
+    "A variable that affects both treatment assignment and the outcome is called a **confounder**. Confounders are what cause our estimate of the ATE to be distorted."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0bb3b73b",
+   "metadata": {},
+   "source": [
+    "Let's use a concrete example to see whether this actually causes a problem."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "629ae592",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "drug_example = pd.DataFrame(dict(\n",
+    "    sex= [\"M\",\"M\",\"M\",\"M\",\"M\",\"M\", \"W\",\"W\",\"W\",\"W\"],\n",
+    "    drug=[1,1,1,1,1,0,  1,0,1,0],\n",
+    "    days=[5,5,5,5,5,8,  2,4,2,4]\n",
+    "))\n",
+    "drug_example"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4cbd6140",
+   "metadata": {},
+   "source": [
+    "| Sex | Treated (T=1), number of people | Untreated (T=0), number of people |\n",
+    "|---|---|---|\n",
+    "| Male | 5 | 1 |\n",
+    "| Female | 2 | 2 |\n",
+    "\n",
+    "In reality we can't observe both $Y_0$ and $Y_1$ for the same person, but for the purposes of this explanation, let's assume we can, and that everyone of the same sex shares the same potential outcomes.\n",
+    "\n",
+    "- Male: $Y_1 = 5, Y_0 = 8 \\Rightarrow Y_1 - Y_0 = -3$\n",
+    "- Female: $Y_1 = 2, Y_0 = 4 \\Rightarrow Y_1 - Y_0 = -2$"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3d838db9",
+   "metadata": {},
+   "source": [
+    "## The Problem with Naive Comparison\n",
+    "\n",
+    "Let's compute the naive comparison. The treated group average is $(5 \\times 5 + 2 \\times 2)/7 = 29/7$, and the untreated group average is $(1 \\times 8 + 2 \\times 4)/3 = 16/3$."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ef7bd655",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "naive_ate = drug_example.query(\"drug==1\")[\"days\"].mean() - drug_example.query(\"drug==0\")[\"days\"].mean()\n",
+    "print(f\"Naive ATE: {naive_ate:.4f}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6e15a062",
+   "metadata": {},
+   "source": [
+    "$$\n",
+    "\\hat{ATE}_{naive} = 29/7 - 16/3 \\approx -1.19\n",
+    "$$\n",
+    "\n",
+    "This result doesn't look right. The drug reduces hospitalization by 3 days for men and 2 days for women, so any average effect must lie between $-3$ and $-2$. Yet the naive comparison gives only $-1.19$, which is smaller in magnitude than either group's true effect.\n",
+    "\n",
+    "This distortion isn't from the drug — it comes from the **difference in group composition** (the unequal sex distribution between treated and untreated groups).\n",
+    "\n",
+    "### True ATE\n",
+    "\n",
+    "Since 6 males each have a −3 day effect and 4 females each have a −2 day effect:\n",
+    "\n",
+    "$$\n",
+    "ATE = \\frac{-3 \\times 6 + (-2) \\times 4}{10} = -2.6\n",
+    "$$\n",
+    "\n",
+    "The naive comparison substantially **underestimates** the magnitude of the drug's true effect ($-1.19$ versus $-2.6$)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f77d024f",
+   "metadata": {},
+   "source": [
+    "## Solution: Inverse Probability Weighting (IPW)\n",
+    "\n",
+    "The root cause of the problem is that **the sex composition of the treated and untreated groups was different**. What if we could reweight the data so that the two groups appear to have the same sex composition?\n",
+    "\n",
+    "The core idea is simple. **Assign each individual a weight that makes the data look as if both groups had the same composition.** If a certain sex is overrepresented in one group, reduce their influence; if underrepresented, increase it. When sex is the source of confounding, correcting for this imbalance through reweighting makes a simple mean comparison valid.\n",
+    "\n",
+    "What we want is for the treated and untreated groups to have the same sex composition as the full population. Since men usually take the drug, untreated men are rare. We need to \"inflate\" that underrepresented group by multiplying each of its members by the inverse of the probability of the treatment they actually received.\n",
+    "\n",
+    "This is the idea behind **Inverse Probability Weighting (IPW)**. Each individual receives the following weight:\n",
+    "\n",
+    "$$\n",
+    "w =\n",
+    "\\begin{cases}\n",
+    "\\dfrac{1}{P(T=1 \\mid X)} & \\text{if } T=1 \\\\[6pt]\n",
+    "\\dfrac{1}{P(T=0 \\mid X)} & \\text{if } T=0\n",
+    "\\end{cases}\n",
+    "$$\n",
+    "\n",
+    "This is the inverse of the probability of receiving the treatment the individual actually received — **rarer observations receive larger weights**.\n",
+    "\n",
+    "The two cases are typically combined into a single expression:\n",
+    "\n",
+    "$$\n",
+    "w = \\frac{T}{P(T=1 \\mid X)} + \\frac{1-T}{P(T=0 \\mid X)}\n",
+    "$$\n",
+    "\n",
+    "When $T = 1$, only the first term survives; when $T = 0$, only the second."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dce47cdf",
+   "metadata": {},
+   "source": [
+    "### Why Does This Weight Create Balance?\n",
+    "\n",
+    "The treatment probabilities by sex are:\n",
+    "\n",
+    "$$\n",
+    "\\text{Male: } P(T=1 \\mid X=M)=\\frac{5}{6}, \\quad P(T=0 \\mid X=M)=\\frac{1}{6}\n",
+    "$$\n",
+    "\n",
+    "$$\n",
+    "\\text{Female: } P(T=1 \\mid X=W)=\\frac{1}{2}, \\quad P(T=0 \\mid X=W)=\\frac{1}{2}\n",
+    "$$\n",
+    "\n",
+    "The weights are the reciprocals of these values:\n",
+    "\n",
+    "- Male + treated: $6/5$, &nbsp; Male + untreated: $6$\n",
+    "- Female + treated: $2$, &nbsp; Female + untreated: $2$\n",
+    "\n",
+    "There is originally only 1 untreated male, but with a weight of 6, he **counts as if there were 6 of him**. Conversely, the 5 treated males each get a weight of $6/5$, so their total weight also sums to 6.\n",
+    "\n",
+    "As a result, within the male group, the total weight is 6 for both treated and untreated, a perfect balance. The same holds for females, whose total weight is 4 in each arm. With this balanced dataset, a simple weighted mean comparison recovers the true ATE:\n",
+    "\n",
+    "$$\n",
+    "\\hat{ATE}_{IPW} = \\frac{5 \\times 6 + 2 \\times 4}{6+4} - \\frac{8 \\times 6 + 4 \\times 4}{6+4} = -2.6\n",
+    "$$"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "91852872",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ps = drug_example.groupby(\"sex\")[\"drug\"].mean()\n",
+    "drug_example[\"ps\"] = drug_example[\"sex\"].map(ps)\n",
+    "drug_example[\"w\"] = (\n",
+    "    drug_example[\"drug\"] / drug_example[\"ps\"]\n",
+    "    + (1 - drug_example[\"drug\"]) / (1 - drug_example[\"ps\"])\n",
+    ")\n",
+    "\n",
+    "ate_ipw = (\n",
+    "    (drug_example[\"drug\"] * drug_example[\"days\"] * drug_example[\"w\"]).sum()\n",
+    "    / (drug_example[\"drug\"] * drug_example[\"w\"]).sum()\n",
+    "    - ((1 - drug_example[\"drug\"]) * drug_example[\"days\"] * drug_example[\"w\"]).sum()\n",
+    "    / ((1 - drug_example[\"drug\"]) * drug_example[\"w\"]).sum()\n",
+    ")\n",
+    "print(f\"IPW ATE: {ate_ipw:.4f}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ca99e148",
+   "metadata": {},
+   "source": [
+    "The reweighted dataset is not the original observed data. It can be interpreted as a **pseudo-population** — an artificial construct in which treatment and sex appear independent. In this pseudo-population, no particular sex group is systematically over- or under-treated, recreating a situation equivalent to random assignment. As a result, a simple mean comparison is sufficient to estimate the causal effect."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "229f20cc",
+   "metadata": {},
+   "source": [
+    "## A Step Further: The Propensity Score\n",
+    "\n",
+    "The treatment probability $e(X) = P(T=1 \\mid X)$ used to construct the weights is called the **propensity score**. It represents each individual's probability of receiving the treatment given their covariates $X$.\n",
+    "\n",
+    "In our example, since $X$ was a single binary variable, we could compute $P(T \\mid X)$ by directly counting the treatment rate within each sex group.\n",
+    "\n",
+    "In practice, however, this probability is rarely known. When there are multiple covariates or continuous variables involved, simple counting won't work. This is why we **estimate** the propensity score using a model — most commonly logistic regression.\n",
+    "\n",
+    "Writing the estimated propensity score as $\\hat{e}(X)$, the IPW weight becomes:\n",
+    "\n",
+    "$$\n",
+    "\\hat{w}_i =\n",
+    "\\frac{T_i}{\\hat{e}(X_i)} + \\frac{1 - T_i}{1 - \\hat{e}(X_i)}\n",
+    "$$\n",
+    "\n",
+    "That said, this approach depends heavily on how well we estimate the propensity score. If the model fails to capture important nonlinearities or interactions — say, between age and sex — then $\\hat{e}(X)$ may not reflect the true treatment probability. The resulting weights would be inaccurate, covariate balance may not be achieved even after reweighting, and confounding may not be fully removed."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "742f0c3c",
+   "metadata": {},
+   "source": [
+    "## References\n",
+    "\n",
+    "- [Causal Inference: What If](https://miguelhernan.org/whatifbook) — Miguel A. Hernán, James M. Robins\n",
+    "- [Causal Inference for The Brave and True](https://matheusfacure.github.io/python-causality-handbook/landing-page.html) — Matheus Facure\n",
+    "- [Causal Inference for The Brave and True (Korean translation)](https://causalinferencelab.github.io/Causal-Inference-with-Python/landing-page.html) — CausalInferenceLab"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "name": "python",
+   "version": "3.11.0"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}