The Single-Door Criterion

Based on a SEMnet discussion with Holger Steinmetz.

Introduction

The single-door criterion helps to identify single path coefficients in structural equation models (SEMs). It is especially useful in cases where the SEM as a whole is not "identified", i.e., where it is impossible to find a unique set of coefficients for the whole model that best explains the observed covariance matrix. In this tutorial, we illustrate the single-door criterion with four examples.

Definition of the single-door criterion

Let G be any recursive causal graph in which α is the path coefficient associated with link X → Y, and let G_α denote the diagram that results when X → Y is deleted from G. The coefficient α is identifiable if there exists a set of variables Z such that

1. Z contains no descendant of Y, and
2. Z d-separates X from Y in G_α.

If Z satisfies these two conditions, then α is equal to the regression coefficient β_YX.Z. Conversely, if Z does not satisfy these conditions, then β_YX.Z is not a consistent estimand of α (Chen & Pearl, 2014, p. 10).
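The two conditions can also be checked mechanically. The following is a minimal sketch, not the DAGitty implementation: it assumes the graph is given as a list of directed edges (parent, child), tests d-separation via the moralized ancestral graph, and uses hypothetical function names (`d_separated`, `single_door`). A bidirected edge such as M ↔ Y would have to be encoded by adding an explicit latent parent of both variables, which is an assumption of this sketch.

```python
from collections import defaultdict


def _ancestors_of(edges, nodes):
    """Return the given nodes together with all of their ancestors."""
    parents = defaultdict(set)
    for a, b in edges:
        parents[b].add(a)
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(edges, x, y, z):
    """True if the set z d-separates x from y in the DAG given by edges."""
    z = set(z)
    keep = _ancestors_of(edges, {x, y} | z)
    # Build the moral graph of the ancestral subgraph: drop edge directions
    # and connect every pair of parents that share a child.
    nbrs, parents = defaultdict(set), defaultdict(set)
    for a, b in edges:
        if a in keep and b in keep:
            nbrs[a].add(b)
            nbrs[b].add(a)
            parents[b].add(a)
    for ps in parents.values():
        for p in ps:
            for q in ps:
                if p != q:
                    nbrs[p].add(q)
    # x and y are d-separated iff no path avoiding z connects them.
    seen, stack = {x}, [x]
    while stack:
        for n in nbrs[stack.pop()]:
            if n == y:
                return False
            if n not in seen and n not in z:
                seen.add(n)
                stack.append(n)
    return True


def single_door(edges, x, y, z):
    """Check both single-door conditions for the edge x -> y and the set z."""
    z = set(z)
    nodes = {n for e in edges for n in e}
    descendants_of_y = {n for n in nodes
                        if n != y and y in _ancestors_of(edges, {n})}
    if z & descendants_of_y:          # condition 1 violated
        return False
    pruned = [e for e in edges if e != (x, y)]   # this is G_alpha
    return d_separated(pruned, x, y, z)          # condition 2


# Hypothetical example: X -> Y with a confounder C and a child D of Y.
edges = [("X", "Y"), ("C", "X"), ("C", "Y"), ("Y", "D")]
print(single_door(edges, "X", "Y", ["C"]))       # C blocks X <- C -> Y
print(single_door(edges, "X", "Y", []))          # back-door path stays open
print(single_door(edges, "X", "Y", ["C", "D"]))  # D is a descendant of Y
```

In this example only Z = {C} satisfies both conditions; the empty set leaves the path via C open, and adding D violates the first condition.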

Let's start to illuminate this definition using our examples.

All graphs on this page are clickable; clicking on a variable shows the effect of conditioning on it, i.e., including it into the set Z.

Examples

Model 1

We first consider a simple mediation model.

M 1 @-0.361,-0.183
X E @-1.884,-0.236
Y O @1.091,-0.125

M Y
X M

This model is clearly identified and has positive degrees of freedom. The single-door criterion confirms that the coefficient on the path X → M is identified: after removing the path X → M, X is unconditionally d-separated from M (i.e., d-separated given the empty set Z={}):

M 1 @-0.361,-0.183
X E @-1.884,-0.236
Y O @1.091,-0.125

M Y

The parameter M → Y is also identified for the same reason.
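To see why plain regressions suffice in Model 1, it may help to write the model as linear equations. The symbols a and b for the two path coefficients and ε_M, ε_Y for the error terms are our own notation here, and we assume the errors are uncorrelated with each other and with X:

```latex
\begin{align*}
M &= aX + \varepsilon_M, \qquad Y = bM + \varepsilon_Y,\\
\beta_{MX} &= \frac{\operatorname{Cov}(X, M)}{\operatorname{Var}(X)}
            = \frac{a\operatorname{Var}(X)}{\operatorname{Var}(X)} = a,\\
\beta_{YM} &= \frac{\operatorname{Cov}(M, Y)}{\operatorname{Var}(M)}
            = \frac{b\operatorname{Var}(M)}{\operatorname{Var}(M)} = b.
\end{align*}
```

Both regression coefficients use the empty adjustment set, matching the graphical argument above.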

Model 2

We now add a direct effect to the simple mediation model.

M 1 @-0.354,-0.559
X 1 @-1.884,-0.236
Y 1 @1.041,-0.270

M Y
X M Y

This model is just-identified (df=0). The coefficient X → M is identified because, when we remove the path X → M, we get

M O @-0.354,-0.559
X E @-1.884,-0.236
Y 1 @1.041,-0.270

M Y
X Y

and again, X and M are unconditionally d-separated. Importantly, including Y in our set Z would violate the first condition of the single-door criterion, because Y is a descendant of M. It would also open a biasing path from X to M via the collider Y:

M O @-0.354,-0.559
X E @-1.884,-0.236
Y A @1.041,-0.270

M Y
X Y

Hence, the regression of M on X no longer identifies the coefficient X → M if Y is added as a covariate.
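The size of this bias can be illustrated on model-implied (population) covariances. The sketch below uses hypothetical coefficient values that are not part of the tutorial, with unit variances for X and for both error terms, and the standard formula for a partial regression coefficient with one covariate.

```python
# Hypothetical population values for Model 2 (chosen for illustration only).
a, b, c = 0.5, 0.4, 0.3          # X -> M, M -> Y, X -> Y
var_x = var_em = var_ey = 1.0    # variance of X and of the two error terms

# Model-implied moments for M = a*X + e_M and Y = b*M + c*X + e_Y.
var_m = a**2 * var_x + var_em
cov_xm = a * var_x
cov_xy = b * cov_xm + c * var_x
cov_my = b * var_m + c * cov_xm
var_y = b**2 * var_m + c**2 * var_x + 2 * b * c * cov_xm + var_ey

# Regression of M on X alone (Z = {}): satisfies the single-door criterion.
beta_mx = cov_xm / var_x

# Regression of M on X and Y (Z = {Y}): Y is a descendant of M and a collider.
beta_mx_given_y = (cov_xm * var_y - cov_my * cov_xy) / (var_x * var_y - cov_xy**2)

print("true a          :", a)
print("beta_MX         :", beta_mx)
print("beta_MX.Y       :", round(beta_mx_given_y, 4))
```

With these values, the unadjusted coefficient equals a, while the coefficient adjusted for Y is clearly smaller.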

The coefficient M → Y is identified because, after removing the path M → Y, holding X constant d-separates M and Y:

M E @-0.354,-0.559
X A @-1.884,-0.236
Y O @1.041,-0.270

X M
X Y

And finally, X → Y is also identified: once the path X → Y is removed, holding M constant d-separates X and Y.
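As a rough numerical check of the last two claims, we can simulate data from Model 2 and compute the two adjusted regression coefficients. This is only a sketch: the coefficient values, the sample size, the seed, and the helper names `cov` and `partial_beta` are our own assumptions, and all variables are generated with standard normal errors.

```python
import random

random.seed(1)
a, b, c = 0.5, 0.4, 0.3   # hypothetical values for X -> M, M -> Y, X -> Y
n = 20000

xs, ms, ys = [], [], []
for _ in range(n):
    x = random.gauss(0, 1)
    m = a * x + random.gauss(0, 1)
    y = b * m + c * x + random.gauss(0, 1)
    xs.append(x)
    ms.append(m)
    ys.append(y)


def cov(u, v):
    """Sample covariance of two equally long lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v)) / (len(u) - 1)


def partial_beta(y, x, z):
    """Coefficient of x in the least-squares regression of y on x and z."""
    num = cov(y, x) * cov(z, z) - cov(y, z) * cov(x, z)
    den = cov(x, x) * cov(z, z) - cov(x, z) ** 2
    return num / den


b_hat = partial_beta(ys, ms, xs)   # M -> Y, adjusting for Z = {X}
c_hat = partial_beta(ys, xs, ms)   # X -> Y, adjusting for Z = {M}
print("b:", b, "estimate:", round(b_hat, 3))
print("c:", c, "estimate:", round(c_hat, 3))
```

Both estimates should be close to the values used to generate the data, up to sampling error.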

Model 3

Now it becomes trickier: We add an error covariance between M and Y to model a confounder C that influences M and Y.

digraph G { 
M [pos="-0.521,-0.265"]
X [exposure,pos="-1.749,-0.238"]
Y [outcome,pos="1.029,-0.228"]
M <-> Y [pos="0.645,-0.279"]
X -> Y
X -> M -> Y
}

This model is not identified (df = -1). Does this mean that we cannot identify any coefficients? No: the single-door criterion tells us that we can still simply regress M on X to identify the coefficient X → M. To check, this is what the graph looks like after removing the X → M arrow:

dag G { 
M [outcome,pos="-0.521,-0.265"]
X [exposure,pos="-1.749,-0.238"]
Y [pos="1.029,-0.228"]
M <-> Y [pos="0.645,-0.279"]
X -> Y
M -> Y
}

Indeed, X and M are independent now, so the criterion is fulfilled. But what about the other coefficients? Let us first check X → Y. After removing this path, we get

dag G {
M [pos="-0.521,-0.265"]
X [exposure,pos="-1.749,-0.238"]
Y [outcome,pos="1.029,-0.228"]
M <-> Y [pos="0.645,-0.279"]
X -> M
M -> Y
}

and we see that X and Y are not d-separated, because of the path X → M → Y. Our only chance to separate them would be to condition on M, which does not work either: this closes the path X → M → Y but opens the path X → M ↔ Y, on which M is a collider:

dag G {
M [adjusted,pos="-0.521,-0.265"]
X [exposure,pos="-1.749,-0.238"]
Y [outcome,pos="1.029,-0.228"]
M <-> Y [pos="0.645,-0.279"]
X -> M
M -> Y
}

For the same reason, we cannot identify the error covariance M ↔ Y using the single-door criterion.

Conclusion: The single-door criterion can help us identify some, but not all, coefficients in un-identified structural equation models.
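For Model 3, these claims can be checked on model-implied covariances. The sketch below again uses hypothetical values, with unit variances for X and both error terms and an error covariance rho between M and Y. It shows that the regression of M on X still returns the X → M coefficient, while neither the unadjusted nor the M-adjusted regression of Y on X returns the X → Y coefficient.

```python
# Hypothetical population values for Model 3 (chosen for illustration only).
a, b, c = 0.5, 0.4, 0.3          # X -> M, M -> Y, X -> Y
rho = 0.5                        # error covariance M <-> Y
var_x = var_em = var_ey = 1.0

# Model-implied moments for M = a*X + e_M and Y = b*M + c*X + e_Y.
var_m = a**2 * var_x + var_em
cov_xm = a * var_x
cov_xy = b * cov_xm + c * var_x        # both errors are uncorrelated with X
cov_my = b * var_m + c * cov_xm + rho  # rho enters because e_M is part of M

beta_mx = cov_xm / var_x               # Z = {} for X -> M: criterion holds
beta_yx = cov_xy / var_x               # Z = {} for X -> Y: path via M is open
beta_yx_given_m = (cov_xy * var_m - cov_my * cov_xm) / (var_x * var_m - cov_xm**2)

print("a:", a, " beta_MX   :", beta_mx)
print("c:", c, " beta_YX   :", beta_yx)
print("c:", c, " beta_YX.M :", round(beta_yx_given_m, 4))
```

With unit variances, the unadjusted coefficient equals the total effect c + a*b, and the M-adjusted coefficient works out to c - a*rho, so it is biased whenever the error covariance is nonzero.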

Model 4

Finally, let's add an instrument W for M to identify all coefficients:

digraph G { 
M [pos="-0.521,-0.265"]
W [pos="-1.75,-0.3"]
X [pos="-1.75,-0.23"]
Y [pos="1.029,-0.228"]
M <-> Y [pos="0.645,-0.279"]
W -> M
X -> M -> Y
X -> Y
}

This model is again just-identified; when simulating it, all parameter estimates are correct. But what will the single-door criterion tell us about the identifiability of the parameters?

The coefficient W → M (effect of the instrument) is identified because, after deleting the corresponding path, W becomes unconditionally independent of M:

digraph G { 
M [outcome,pos="-0.521,-0.265"]
W [exposure,pos="-1.75,-0.3"]
X [pos="-1.75,-0.23"]
Y [pos="1.029,-0.228"]
M <-> Y [pos="0.645,-0.279"]
X -> M -> Y
X -> Y
}

In the same way, we can identify the coefficient X → M.

But what about X → Y? Here nothing has changed compared to Model 3: the only candidate for Z would be M, but again, holding M constant opens a path between X and Y via the collider M:

digraph G { 
M [adjusted,pos="-0.521,-0.265"]
W [pos="-1.75,-0.3"]
X [exposure,pos="-1.75,-0.23"]
Y [outcome,pos="1.029,-0.228"]
W -> M 
M <-> Y [pos="0.645,-0.279"]
X -> M -> Y
X -> Y
}

For similar reasons, we cannot apply the criterion to the coefficients M → Y and M ↔ Y.

The bottom line

When coefficients are not identifiable using the single-door criterion, they can still be identifiable by other means, such as instrumental variables or even by estimating the whole model.
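As an illustration of the instrumental-variable route in Model 4, the sketch below simulates data and recovers M → Y from the ratio Cov(W,Y)/Cov(W,M), then backs out X → Y and the error covariance from the remaining covariances. All numeric values, the seed, and the variable names are our own assumptions. The correlated errors are generated from a shared component u, which is one of several possible ways to produce M ↔ Y.

```python
import math
import random

random.seed(2)
a, b, c, d = 0.5, 0.4, 0.3, 0.8   # X -> M, M -> Y, X -> Y, W -> M (hypothetical)
rho = 0.5                         # error covariance M <-> Y (hypothetical)
n = 20000

ws, xs, ms, ys = [], [], [], []
for _ in range(n):
    w, x, u = random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)
    # Errors with unit variance and covariance rho via the shared component u.
    e_m = math.sqrt(rho) * u + math.sqrt(1 - rho) * random.gauss(0, 1)
    e_y = math.sqrt(rho) * u + math.sqrt(1 - rho) * random.gauss(0, 1)
    m = a * x + d * w + e_m
    y = b * m + c * x + e_y
    ws.append(w)
    xs.append(x)
    ms.append(m)
    ys.append(y)


def cov(u, v):
    """Sample covariance of two equally long lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v)) / (len(u) - 1)


# W affects Y only through M, so the covariance ratio estimates M -> Y.
b_iv = cov(ws, ys) / cov(ws, ms)
# X is uncorrelated with e_Y, so Cov(X,Y) = b*Cov(X,M) + c*Var(X).
c_hat = (cov(xs, ys) - b_iv * cov(xs, ms)) / cov(xs, xs)
# Cov(M,Y) = b*Var(M) + c*Cov(X,M) + rho.
rho_hat = cov(ms, ys) - b_iv * cov(ms, ms) - c_hat * cov(xs, ms)

print("b:", b, "IV estimate:", round(b_iv, 3))
print("c:", c, "estimate   :", round(c_hat, 3))
print("rho:", rho, "estimate :", round(rho_hat, 3))
```

The three quantities that the single-door criterion could not reach are recovered here, up to sampling error, because W provides the extra information needed.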