Doctor of Philosophy (PhD)
Arts and Sciences
Dr. Solomon Harrar
Clinical trials are often used to assess drug efficacy and safety. Participants are sometimes pre-stratified into different groups by diagnostic tools. However, these diagnostic tools are fallible. The traditional approach ignores this problem and assumes the diagnostic devices are perfect, an assumption that leads to inefficient and biased estimators. In this era of personalized medicine and measurement-based care, the issues of bias and efficiency are of paramount importance. Despite their prominence, only a few studies have evaluated treatment effects in the presence of misclassification, and only in some special cases; most others focus on assessing the accuracy of the diagnostic devices. In this dissertation, we aim to fill this methodological gap in the estimation of treatment effects in the multivariate and nonparametric contexts. We focus on a pre-post design and address the problem of misclassification in three distinct situations.
In clinical trials with multiple continuous endpoints, we model the outcome variables as a mixture of multivariate normal distributions to account for the effect of misclassification errors. We propose two methods for estimating and testing treatment effects. When the misclassification error rates are known from previous studies, we develop moment-based tests and confidence interval procedures that are accurate in finite samples. When the misclassification error rates are unknown, we propose likelihood-based procedures for estimation and testing via the EM algorithm. In addition, methods for sample size and power calculations are developed. The moment-based methods can also be used when the misclassification rates are unknown, provided validation samples are available. In this case, consistent estimators of the misclassification error rates are derived using a novel distance-based criterion.
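To make the idea of a moment-based correction concrete, the following is a minimal sketch and not the dissertation's actual procedure, which this abstract does not spell out. It assumes two true strata, a known matrix `P` with `P[j, k]` equal to the probability that a participant classified into group `j` truly belongs to stratum `k`, and identity covariance within each stratum. Under those assumptions the mean vector of each classified group is a `P`-weighted mixture of the true stratum means, so the true means can be recovered by solving a linear system. The function name `unmix_means` and all numeric values are hypothetical.

```python
import numpy as np


def unmix_means(observed_means, P):
    """Recover true stratum means from means of the classified groups.

    observed_means: (G, d) array, row j = sample mean vector of classified group j.
    P: (G, G) array, P[j, k] = P(true stratum k | classified as j); rows sum to 1.
    Assumes P is known and invertible (illustrative assumption).
    """
    observed_means = np.asarray(observed_means, dtype=float)
    P = np.asarray(P, dtype=float)
    if not np.allclose(P.sum(axis=1), 1.0):
        raise ValueError("each row of P must sum to 1")
    # E[observed_means] = P @ true_means, so invert the mixing step.
    return np.linalg.solve(P, observed_means)


# Hypothetical simulation: two strata, two endpoints.
rng = np.random.default_rng(0)
true_means = np.array([[0.0, 0.0], [2.0, 1.0]])
P = np.array([[0.9, 0.1], [0.2, 0.8]])
n = 20000

observed = []
for j in range(2):
    # True stratum of each participant who was classified into group j.
    strata = rng.choice(2, size=n, p=P[j])
    y = true_means[strata] + rng.standard_normal((n, 2))
    observed.append(y.mean(axis=0))
observed = np.array(observed)

naive = observed                      # treats the classification as perfect
corrected = unmix_means(observed, P)  # accounts for misclassification
print("naive:\n", naive)
print("corrected:\n", corrected)
```

In this toy setting the naive group means are pulled toward each other, while the corrected means should sit close to `true_means`. The sketch covers point estimation only; it does not address testing, confidence intervals, or sample size calculation.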
When the data are measured on a nonmetric scale, or when the distribution of the data is heavy-tailed or skewed, the normality assumption is not valid. In this case, we develop a fully nonparametric method to assess treatment effects. We model the distribution of the outcomes as a nonparametric mixture of unknown distributions. To overcome identifiability problems, we assume the availability of training data from the component distributions. In the nonparametric setting, functionals of these distribution functions are used to characterize treatment effects. We provide consistent estimators of the misclassification error rates and of the treatment effect, together with their asymptotic distributions. We do not require any assumptions regarding the existence of moments of any order.
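As a rough illustration of how training data can identify a nonparametric mixture, the sketch below estimates a mixing weight by matching empirical distribution functions in a least-squares sense. This is an assumed, generic distance-based estimator chosen for illustration; the abstract does not state the dissertation's estimator. The function names `ecdf` and `estimate_mixing_weight`, the Cauchy components, and the weight 0.3 are all hypothetical. Cauchy data are used because they have no finite moments, so only distribution functions are involved.

```python
import numpy as np


def ecdf(sample, grid):
    """Empirical distribution function of `sample` evaluated at `grid`."""
    sample = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(sample, grid, side="right") / sample.size


def estimate_mixing_weight(mixed, train_a, train_b):
    """Estimate w in H = (1 - w) * F_a + w * F_b by least squares on ECDFs.

    mixed: sample from the mixture H.
    train_a, train_b: training samples from the components F_a and F_b.
    """
    mixed = np.asarray(mixed, dtype=float)
    train_a = np.asarray(train_a, dtype=float)
    train_b = np.asarray(train_b, dtype=float)
    grid = np.sort(np.concatenate([mixed, train_a, train_b]))
    H, Fa, Fb = ecdf(mixed, grid), ecdf(train_a, grid), ecdf(train_b, grid)
    diff = Fb - Fa
    denom = np.dot(diff, diff)
    if denom == 0.0:
        raise ValueError("component samples are indistinguishable")
    # Minimizer of sum((H - Fa - w * diff)**2), clipped to [0, 1].
    w = np.dot(H - Fa, diff) / denom
    return float(np.clip(w, 0.0, 1.0))


# Hypothetical heavy-tailed example: Cauchy components shifted by 2.
rng = np.random.default_rng(1)
n = 20000
train_a = rng.standard_cauchy(n)
train_b = rng.standard_cauchy(n) + 2.0
from_b = rng.random(n) < 0.3          # true mixing weight is 0.3
mixed = rng.standard_cauchy(n) + 2.0 * from_b

w_hat = estimate_mixing_weight(mixed, train_a, train_b)
print("estimated mixing weight:", w_hat)
```

The estimate should land near the true weight of 0.3 for samples of this size. A treatment-effect functional of the component distribution functions could then be computed after accounting for the estimated weight, though that step is not shown here.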
Typically, clinical trials involve the collection of baseline covariates that are associated with both the misclassification of a patient and the treatment outcomes. In this situation, we propose a nonparametric finite mixture of regression models to approximate the distribution of outcomes. We establish identifiability conditions and derive an estimation procedure using kernel methods and the EM algorithm.
Simulation results show a significant advantage of the proposed methods in terms of bias reduction, coverage probability, and power. The applications of the methods are illustrated with datasets from sleep deprivation and electroencephalogram (EEG) studies.
Ye, Zi, "Estimating and Testing Treatment Effects with Misclassified Multivariate Data" (2021). Theses and Dissertations--Statistics. 60.