Causal inference for credit risk: why prediction alone isn't enough

Source: DEV Community
There's a pattern I've seen repeatedly in financial ML: a model achieves excellent predictive performance — AUC above 0.80, stable on holdout — and the team ships it. Then, six months later, someone asks "but why is the model denying more applicants from this postal code?" and nobody has a good answer. Prediction and causation are different things, and conflating them is expensive in credit risk specifically.

The core issue

When you train a credit risk model, you're typically predicting P(default | features). This is a conditional probability — it tells you what tends to be true about people who look like this applicant. It doesn't tell you what caused their credit behavior, and it doesn't tell you what will happen if you lend to them.

This distinction matters for two reasons. First, selection bias. Your training data only contains outcomes for people who were previously approved for credit. The people who were denied — perhaps by a prior model or manual policy — have no observed outcomes.
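A tiny simulation makes the prediction-versus-causation point concrete. Everything here is hypothetical — the variable names (distress, utilization) and all the parameters are invented for illustration. The setup: an unobserved confounder generates both a visible feature and default, so the conditional probability the model learns is a real association that no intervention on the feature could change.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical data-generating process: unobserved financial distress
# drives both a visible feature (credit utilization) and default.
distress = rng.normal(size=n)
utilization = 0.8 * distress + rng.normal(scale=0.6, size=n)
default = (distress + rng.normal(scale=0.5, size=n) > 1.5).astype(int)

# Associational quantity P(default | utilization): clearly elevated
# for high-utilization applicants...
p_high = default[utilization > 1.0].mean()
p_low = default[utilization <= 1.0].mean()
print(f"P(default | high utilization) = {p_high:.3f}")
print(f"P(default | low utilization)  = {p_low:.3f}")

# ...yet an intervention that forces utilization down, do(utilization),
# cannot move defaults here, because `default` is generated from distress
# alone. The association is real; the causal effect of utilization is zero.
```

A model fit on this data would (correctly, in the predictive sense) score high-utilization applicants as riskier — and anyone who then told applicants "pay down your balances to reduce your risk" would be making a causal claim the model never supported.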
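The selection problem can be simulated too. This sketch is illustrative only — the features, the legacy approval rule, and the unobserved underwriter signal are all assumptions — but it shows the mechanism: when past approvals depended on information correlated with default that isn't in your feature set, a model fit on the approved book recovers distorted coefficients relative to the full applicant pool.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 100_000

# Hypothetical applicant pool: income protects, debt hurts, and an
# unobserved signal u (think: things a human underwriter noticed in
# the file) also drives default.
income = rng.normal(size=n)
debt = rng.normal(size=n)
u = rng.normal(size=n)
logit = -2.0 - income + debt + 1.5 * u
default = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
X = np.column_stack([income, debt])

# Legacy policy: underwriters partially saw u and declined risky cases,
# so approval depends on information missing from X.
approved = (income - debt - 1.5 * u) > 0

# Model trained only on the approved book (what you actually observe)...
m_sel = LogisticRegression().fit(X[approved], default[approved])
# ...versus the full population (normally unobservable).
m_full = LogisticRegression().fit(X, default)

print("income coefficient, approved-only:", m_sel.coef_[0][0].round(2))
print("income coefficient, full pool:    ", m_full.coef_[0][0].round(2))
# Conditioning on approval induces a correlation between income and u
# (a collider effect), attenuating the income coefficient on the
# approved sample relative to the full pool.
```

The approved-only model isn't wrong about the book it was trained on — it's wrong about the applicants it will actually score, which includes the kind of people the old policy rejected.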