One hopes that statistics will be used to analyze data and make beneficial decisions regarding people's health, finances, and well-being. But the data fed to a statistical analysis may systematically differ from the data where these decisions are ultimately applied. For instance, suppose we analyze data in one country and conclude that microcredit is effective at alleviating poverty; based on this analysis, we decide to distribute microcredit in other locations and in future years. We might then ask: can we trust our conclusion to apply under new conditions? If we found that a very small percentage of the original data was instrumental in determining the original conclusion, we might expect the conclusion to be unstable under new conditions. We therefore propose a method to assess the sensitivity of data analyses to the removal of a small fraction of the data set. Analyzing all possible data subsets of a certain size is computationally prohibitive, so we provide an approximation, which we call the Approximate Maximum Influence Perturbation. Our approximation is automatically computable and works for common estimators, including (but not limited to) OLS, IV, GMM, MLE, MAP, and variational Bayes. We show that any non-robustness our metric finds is conclusive. Empirically, we find that while some applications are robust, in others the sign of a treatment effect can be changed by dropping well under 1% of the data, even in simple models and even when standard errors are small.
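To make the idea concrete, here is a minimal sketch of the linear approximation for OLS, using simulated data and illustrative variable names; it is ours, not the speaker's implementation. The effect of dropping an observation is approximated by its influence score, and the most harmful observations are dropped greedily until the estimate's sign flips.

```python
import numpy as np

# Minimal sketch of the Approximate Maximum Influence Perturbation for OLS.
# Simulated data; illustrative only, not the authors' implementation.
rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + treatment
y = X @ np.array([0.0, 0.1]) + rng.normal(size=n)      # small true effect

XtX_inv = np.linalg.inv(X.T @ X)
theta_hat = XtX_inv @ (X.T @ y)
resid = y - X @ theta_hat

# Influence-function (linear) approximation: dropping observation i changes
# the treatment coefficient by roughly -[(X'X)^{-1} x_i e_i]_1.
infl = -(X @ XtX_inv)[:, 1] * resid

# Greedily drop observations whose removal pushes the coefficient down the
# most; count how many suffice to flip a (here, positive) estimate's sign.
cum_change = np.cumsum(np.sort(infl))
flipped = np.nonzero(theta_hat[1] + cum_change < 0)[0]
if flipped.size:
    k = flipped[0] + 1
    print(f"estimate {theta_hat[1]:.3f}: sign flips after dropping "
          f"approximately {k} of {n} points ({100 * k / n:.2f}%)")
else:
    print("no sign change found by the linear approximation")
```

Refitting the model without the flagged points then verifies the flip outright, which is why any non-robustness the metric reports can be made conclusive.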
Tamara Broderick is an Associate Professor in the Department of Electrical Engineering and Computer Science at MIT. She is a member of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), the MIT Statistics and Data Science Center, and the Institute for Data, Systems, and Society (IDSS). She completed her Ph.D. in Statistics at the University of California, Berkeley in 2014. Previously, she received an AB in Mathematics from Princeton University (2007), a Master of Advanced Study for completion of Part III of the Mathematical Tripos from the University of Cambridge (2008), an MPhil by research in Physics from the University of Cambridge (2009), and an MS in Computer Science from the University of California, Berkeley (2013). Her recent research has focused on developing and analyzing models for scalable Bayesian machine learning. She has been awarded selection to the COPSS Leadership Academy (2021), an Early Career Grant (ECG) from the Office of Naval Research (2020), an AISTATS Notable Paper Award (2019), an NSF CAREER Award (2018), a Sloan Research Fellowship (2018), an Army Research Office Young Investigator Program (YIP) award (2017), Google Faculty Research Awards, an Amazon Research Award, the ISBA Lifetime Members Junior Researcher Award, the Savage Award (for an outstanding doctoral dissertation in Bayesian theory and methods), the Evelyn Fix Memorial Medal and Citation (for the Ph.D. student on the Berkeley campus showing the greatest promise in statistical research), the Berkeley Fellowship, an NSF Graduate Research Fellowship, a Marshall Scholarship, and the Phi Beta Kappa Prize (for the graduating Princeton senior with the highest academic average).
We consider the Bayesian analysis of models in which the unknown distribution of the outcomes is specified up to a set of conditional moment restrictions. The nonparametric exponentially tilted empirical likelihood function is constructed to satisfy a sequence of unconditional moments based on an increasing (in sample size) vector of approximating functions (such as tensor splines based on the splines of each conditioning variable). For any given sample size, results are robust to the number of expanded moments. We derive Bernstein-von Mises theorems for the behavior of the posterior distribution under both correct and incorrect specification of the conditional moments, subject to growth rate conditions (slower under misspecification) on the number of approximating functions. A large-sample theory for comparing different conditional moment models is also developed. The central result is that the marginal likelihood criterion selects the model that is less misspecified. We also introduce sparsity-based model search for high-dimensional conditioning variables, and provide efficient MCMC computations for high-dimensional parameters. Along with clarifying examples, the framework is illustrated with real-data applications to risk-factor determination in finance, and causal inference under conditional ignorability.
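As a rough sketch in our own notation (the paper's exact construction may differ), the conditional restriction E[g(Z, θ) | X] = 0 is expanded into unconditional moments via the approximating functions, and the posterior is built from exponentially tilted probabilities:

```latex
% Sketch in our notation, following the standard exponentially tilted
% empirical likelihood (ETEL) construction; details may differ from the paper.
% Expansion via K approximating functions q^K(x) (e.g., tensor-product splines):
%   E[ g(Z,\theta) \mid X ] = 0  \implies  E[ g(Z,\theta) \otimes q^K(X) ] = 0.
\[
  \hat{p}_i(\theta)
    = \frac{\exp\{\hat{\lambda}(\theta)^{\top} g_i^{K}(\theta)\}}
           {\sum_{j=1}^{n} \exp\{\hat{\lambda}(\theta)^{\top} g_j^{K}(\theta)\}},
  \qquad
  \hat{\lambda}(\theta)
    = \arg\min_{\lambda} \frac{1}{n} \sum_{i=1}^{n}
      \exp\{\lambda^{\top} g_i^{K}(\theta)\},
\]
\[
  \pi(\theta \mid \text{data}) \;\propto\;
  \pi(\theta) \prod_{i=1}^{n} \hat{p}_i(\theta),
\]
% where g_i^K(\theta) = g(z_i,\theta) \otimes q^K(x_i) stacks the expanded
% moments for observation i.
```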
Siddhartha Chib is the Harry C. Hartkopf Professor of Econometrics and Statistics in the Olin Business School at Washington University in St. Louis. He received his bachelor's degree from Delhi University in 1979 and his Ph.D. in Economics from the University of California, Santa Barbara in 1986. He works in Bayesian statistics, econometrics, and Markov chain Monte Carlo (MCMC) methods. Professor Chib is a Fellow of the American Statistical Association, the Journal of Econometrics, and the International Society for Bayesian Analysis. He is an Associate Editor of the Journal of Computational and Graphical Statistics and of Statistics and Computing. He also directs the annual NBER-NSF Seminar on Bayesian Inference in Econometrics and Statistics (SBIES), which features presentations by young and established researchers working on the theory and application of Bayesian methods. He teaches statistics and econometrics to students in the MBA, specialized MS, and doctoral programs.
Single-case experimental designs hold great promise for enabling participants to create personalized protocols and make individualized treatment decisions. The most common single-case design used in the health sciences is the N-of-1 trial, a multi-crossover withdrawal/reversal design in which participants receive a set of two or more treatments multiple times in a randomized order. Other forms, such as the multiple-baseline (stepped wedge) design, are also common in the behavioral and social sciences. In contrast to traditional group or cluster randomized designs, single-case designs can measure individual treatment efficacy. By combining single-case trials in a multilevel structure, they can also assess average treatment effects in populations and subgroups, as well as treatment effect heterogeneity. Implementation of single-case designs has lagged in healthcare because of a lack of infrastructure for designing and running the trials, analyzing their data, and reporting and interpreting their results. Mobile device applications provide a means to implement such trials on a large scale. We discuss some completed and ongoing N-of-1 studies in which we have combined mobile device applications with server-driven statistical analytics using an R package to return results to individuals. Issues that arise include defining treatments and sequences of treatments, synthesizing treatment networks, incorporating patient-specific prior information, automating the choice of appropriate statistical models and the assessment of model assumptions, and automating graphical displays and text to facilitate appropriate interpretation by non-technical users. Development of smart tools that solve these problems could help transform health care research by expanding the settings in which it is carried out and making findings directly applicable to and interpretable by individual trial participants.
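As a concrete, deliberately simplified illustration, the sketch below simulates a two-treatment multi-crossover N-of-1 trial and estimates the within-person treatment effect with an ordinary least squares model that includes a linear time trend. The design, data, and model here are our illustrative assumptions, not the speaker's R package.

```python
import numpy as np

# Sketch of a two-treatment N-of-1 (multi-crossover) analysis on
# simulated data; illustrative only, not the authors' software.
rng = np.random.default_rng(1)
blocks = ["A", "B", "B", "A", "A", "B"]   # randomized treatment sequence
per_block = 7                              # one outcome per day per block
treat = np.repeat([b == "B" for b in blocks], per_block).astype(float)
t = np.arange(treat.size)
y = 5.0 - 1.5 * treat + 0.02 * t + rng.normal(0, 1, treat.size)

# OLS with intercept, treatment indicator, and linear time trend.
X = np.column_stack([np.ones_like(treat), treat, t])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (len(y) - X.shape[1])
se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
print(f"within-person treatment effect (B - A): {beta[1]:.2f} (SE {se:.2f})")
```

In a deployed system, the model choice itself (for example, adding autocorrelation or carryover terms, or patient-specific priors) would need to be automated, which is exactly the challenge the talk describes.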
Dr. Schmid is Professor and Chair of Biostatistics at the Brown University School of Public Health, where he co-founded the Center for Evidence Synthesis in Health. He directs the Biostatistics, Epidemiology and Research Design (BERD) Core of the Rhode Island Center to Advance Translational Science. He is a Fellow of the American Statistical Association, founding Editor of the journal Research Synthesis Methods, long-time statistical editor of the American Journal of Kidney Diseases, and a former member of the FDA's Drug Safety and Risk Management Committee. His research focuses on Bayesian methods for meta-analysis, on methods for developing and assessing predictive models using data from multiple sources, and on the analysis of data from N-of-1 trials. Dr. Schmid graduated from Haverford College with a BA in Mathematics and received his PhD in Statistics from Harvard University.