Parameter Choices Matter in Structural Causal Models
Given a shortage of reliable training data, causal discovery algorithms are often benchmarked on data generated by synthetic structural causal models. We argue that the parameters in the simulation of such models may have a large and potentially unintended impact on the difficulty of the causal discovery task. Concretely, we show that marginal variance tends to increase along the causal order for generically sampled additive noise models. We introduce var-sortability as a measure of the agreement between the order of increasing marginal variance and the causal order. For commonly sampled graphs and model parameters, we show that the remarkable performance of some causal discovery algorithms can be explained by high var-sortability and matched by a simple baseline method. Yet, this performance may not transfer to real-world data where var-sortability may be moderate or dependent on the choice of measurement scales. We further present work in progress on parameter choices that result in high var-sortability and on how high levels of var-sortability can still be exploited after standardization. In combination, these findings may put into question the validity of common causal discovery benchmarks and point to ways of improving them.
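To make the central claim concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear additive noise model with standard normal noise and a strictly upper-triangular weight matrix. It also assumes one plausible reading of var-sortability: the fraction of ancestor-descendant pairs in which the ancestor has the smaller marginal variance. All function names (`simulate_linear_anm`, `ancestor_matrix`, `var_sortability`) and the example weights are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_linear_anm(W, n_samples, rng):
    """Sample X_j = sum_i W[i, j] * X_i + N_j with standard normal noise N_j.

    Assumes W is strictly upper triangular, so the index order 0, 1, ..., d-1
    is a valid causal order (an assumption of this sketch).
    """
    d = W.shape[0]
    X = np.zeros((n_samples, d))
    for j in range(d):
        # Parents of j have smaller indices and are already filled in.
        X[:, j] = X @ W[:, j] + rng.standard_normal(n_samples)
    return X


def ancestor_matrix(W):
    """Boolean matrix A with A[i, j] True iff a directed path i -> j exists."""
    d = W.shape[0]
    adj = (W != 0).astype(int)
    reach = np.zeros_like(adj)
    power = np.eye(d, dtype=int)
    for _ in range(d - 1):
        # After k iterations, power marks pairs joined by a path of length k.
        power = (power @ adj > 0).astype(int)
        reach |= power
    return reach.astype(bool)


def var_sortability(X, W):
    """Fraction of ancestor-descendant pairs ordered by increasing variance.

    This pairwise definition is an assumption of the sketch; the text only
    describes the measure informally.
    """
    variances = X.var(axis=0)
    i, j = np.nonzero(ancestor_matrix(W))
    if len(i) == 0:
        return float("nan")
    return float(np.mean(variances[i] < variances[j]))


rng = np.random.default_rng(0)

# Hypothetical chain 0 -> 1 -> 2 with edge weights of 2.
W = np.array([[0.0, 2.0, 0.0],
              [0.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])
X = simulate_linear_anm(W, n_samples=10_000, rng=rng)

score = var_sortability(X, W)

# Simple baseline: take the order of increasing marginal variance as the causal order.
baseline_order = np.argsort(X.var(axis=0))

# Changing the measurement scale of one variable changes the score.
X_rescaled = X * np.array([10.0, 1.0, 1.0])
score_rescaled = var_sortability(X_rescaled, W)

print(score, baseline_order, score_rescaled)
```

For this chain the theoretical marginal variances are 1, 5, and 21, so sorting by variance recovers the causal order. Multiplying the first variable by 10 raises its variance to about 100 and breaks that agreement, which illustrates why the score can depend on the choice of measurement scales.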