A Pareto-smoothing method for causal inference using generalized Pareto distribution

Publication: Article, 01 Feb 2020, English
Publisher: Elsevier BV
Journal: Neurocomputing, volume 378, pages 142-152 (ISSN: 0925-2312)
Funded by: ARC | Discovery Projects - Gran...

Authors: Fujin Zhu; Jie Lu; Adi Lin; Guangquan Zhang

doi: 10.1016/j.neucom.2019.09.095


Abstract

Causal inference aims to estimate the treatment effect of an intervention on a target outcome variable and has received great attention across fields ranging from economics and statistics to machine learning. Observational causal inference is challenging because pre-treatment variables may influence both the treatment and the outcome, resulting in confounding bias. The classic inverse propensity weighting (IPW) estimator is theoretically able to eliminate this confounding bias. In observational studies, however, the propensity scores used in the IPW estimator must be estimated from finite observational data and may take extreme values. This yields highly variable importance weights, which in turn make the estimated causal effect unstable or even misleading. In this paper, by reframing the IPW estimator in the importance sampling framework, we propose a Pareto-smoothing method to tackle this problem. The generalized Pareto distribution (GPD) from extreme value theory is fitted to the upper tail of the estimated importance weights, and those tail weights are replaced using the order statistics of the fitted GPD. To validate the performance of the new method, we conducted extensive experiments on simulated and semi-simulated datasets. Compared with two existing methods for importance weight stabilization, namely weight truncation and self-normalization, the proposed method generally achieves better performance in settings with a small sample size and high-dimensional covariates. Its application to a real-world health dataset indicates its utility in estimating causal effects for program evaluation.
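The stabilization strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `pareto_smooth` and `ipw_ate`, the 20% tail fraction, the 99th-percentile truncation cutoff, the mid-point quantile grid, the cap at the largest raw weight, and the choice to smooth each treatment arm separately are all assumptions made here for illustration. The sketch relies only on NumPy and `scipy.stats.genpareto`.

```python
import numpy as np
from scipy.stats import genpareto


def pareto_smooth(weights, tail_frac=0.2):
    """Replace the largest weights with quantiles of a GPD fitted to the tail.

    tail_frac is an assumed tuning choice, not a value taken from the paper.
    """
    w = np.asarray(weights, dtype=float)
    n = w.size
    m = min(int(np.ceil(tail_frac * n)), n - 1)  # number of tail weights
    if m < 5:  # too few points to fit a tail; leave weights unchanged
        return w.copy()
    order = np.argsort(w)
    tail_idx = order[-m:]            # indices of the m largest weights, ascending
    threshold = w[order[-m - 1]]     # largest weight below the tail
    exceedances = w[tail_idx] - threshold
    # Fit the GPD to the exceedances with the location fixed at zero.
    shape, _, scale = genpareto.fit(exceedances, floc=0)
    # Expected order statistics approximated by mid-point quantiles (assumption).
    probs = (np.arange(1, m + 1) - 0.5) / m
    smoothed = threshold + genpareto.ppf(probs, shape, loc=0, scale=scale)
    out = w.copy()
    # Cap at the largest raw weight as a safeguard (assumption).
    out[tail_idx] = np.minimum(smoothed, w.max())
    return out


def ipw_ate(y, t, e, method="pareto"):
    """IPW estimate of the average treatment effect.

    y: outcomes, t: binary treatment indicator, e: estimated propensity scores.
    method: "none", "truncate", "self_normalize", or "pareto".
    """
    y, t, e = (np.asarray(a, dtype=float) for a in (y, t, e))
    treated = t == 1
    arm_means = []
    for mask, p in ((treated, e), (~treated, 1.0 - e)):
        w = 1.0 / p[mask]  # importance weights for this arm
        if method == "truncate":
            w = np.minimum(w, np.quantile(w, 0.99))  # assumed cutoff
        elif method == "pareto":
            w = pareto_smooth(w)
        if method == "self_normalize":
            arm_means.append(np.sum(w * y[mask]) / np.sum(w))
        else:
            arm_means.append(np.sum(w * y[mask]) / y.size)
    return arm_means[0] - arm_means[1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=2000)
    e = 1.0 / (1.0 + np.exp(-2.0 * x))         # true propensity (simulated)
    t = rng.binomial(1, e)
    y = 2.0 * t + x + rng.normal(size=x.size)  # simulated effect of 2
    for method in ("none", "truncate", "self_normalize", "pareto"):
        print(method, ipw_ate(y, t, e, method))
```

In this sketch only the upper tail of the weights is altered: weights at or below the threshold pass through unchanged, which is what separates smoothing from truncation (which flattens the tail to a constant) and from self-normalization (which rescales every weight).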

Related Organizations

Beijing Institute of Technology
China (People's Republic of)
University of Technology Sydney
Australia

Impact by BIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	3
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Average
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Average
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Average


bronze


Funded by

ARC | Discovery Projects - Grant ID: DP170101632