
CARMA Annotations

The following list of articles and annotations was created as part of the activities of the Center for Annotating Research Methods and Analyses (CARMA) project funded by IMEDS of the Reagan-Udall Foundation. This list is updated regularly.

Annotations

The following list includes information on each article along with a brief annotation. A searchable version of this list is available in the Zotero public library used for this project at www.zotero.org/groups/carma_public

Annotations Added

September 2016

June 2016

March 2016

Added September 2016

  • A Bayesian approach to estimating causal vaccine effects on binary post-infection outcomes

    Author Jincheng Zhou
    Author Haitao Chu
    Author Michael G. Hudgens
    Author M. Elizabeth Halloran
    Volume 35
    Issue 1
    Pages 53-64
    Publication Statistics in Medicine
    Date January 15, 2016
    Abstract To estimate causal effects of vaccine on post-infection outcomes, Hudgens and Halloran (2006) defined a post-infection causal vaccine efficacy estimand VEI based on the principal stratification framework. They also derived closed forms for the maximum likelihood estimators of the causal estimand under some assumptions. Extending their research, we propose a Bayesian approach to estimating the causal vaccine effects on binary post-infection outcomes. The identifiability of the causal vaccine effect VEI is discussed under different assumptions on selection bias. The performance of the proposed Bayesian method is compared with the maximum likelihood method through simulation studies and two case studies — a clinical trial of a rotavirus vaccine candidate and a field study of pertussis vaccination. For both case studies, the Bayesian approach provided similar inference as the frequentist analysis. However, simulation studies with small sample sizes suggest that the Bayesian approach provides smaller bias and shorter confidence interval length. Copyright © 2015 John Wiley & Sons, Ltd.

    Notes:

    • One of the common aims of vaccine studies is to evaluate post-infection outcomes such as morbidity, mortality, and secondary transmission. Because the group of vaccinated individuals who become infected may not be comparable to the group of unvaccinated individuals who become infected, comparisons of vaccine outcomes conditional on infection can be biased. Previous studies have addressed this issue and have provided frameworks based on principal stratification, where the latent principal strata are not affected by treatment assignment and thus the causal effects can be determined. Frequentist approaches to causal inference have limitations when the sample size is small, and maximum likelihood estimation (MLE) methods are not suitable when parameter estimates lie on the boundary of the parameter space. The study proposes a new Bayesian approach using Markov chain Monte Carlo (MCMC) simulations for causal inference in the context of vaccines and binary post-infection outcomes. The authors reviewed the assumptions (stable unit treatment value or no interference, independent treatment assignment, and monotonicity), estimation, and identifiability of vaccine effect estimates from a previous study. They then presented Bayesian estimation methods in which posterior computation is carried out by MCMC with 100,000 iterations. To compare the Bayesian method with the MLE method, the authors reanalyzed two case studies (pertussis and rotavirus vaccines) using this approach and found no noticeable difference between the methods, although the credible intervals were narrower for the Bayesian method. The proposed Bayesian approach provides a better alternative to the frequentist approach when the sample size is small or when suitable prior distributions can be constructed to incorporate known constraints on model parameters. Future research is needed to extend the approach to non-binary outcomes and to assess the impact of relaxing the assumptions.
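
      A minimal sketch of the Bayesian idea, using invented counts of infected trial participants and simple conjugate Beta posteriors rather than the authors' principal-stratification model and MCMC sampler:

      import numpy as np

      # Hypothetical counts among *infected* participants (not from the paper):
      # y = number with the binary post-infection outcome (e.g., severe disease).
      n_vac, y_vac = 120, 18      # infected vaccinees
      n_unv, y_unv = 150, 45      # infected controls

      rng = np.random.default_rng(42)
      draws = 100_000             # mirrors the 100,000 iterations mentioned above

      # Beta(1, 1) priors give Beta posteriors for each outcome risk; posterior
      # sampling here stands in for the paper's MCMC scheme and ignores the
      # selection (principal-stratification) model entirely.
      risk_vac = rng.beta(1 + y_vac, 1 + n_vac - y_vac, draws)
      risk_unv = rng.beta(1 + y_unv, 1 + n_unv - y_unv, draws)

      # Post-infection vaccine efficacy on the relative-risk scale: VE_I = 1 - RR.
      ve_i = 1 - risk_vac / risk_unv
      lo, hi = np.percentile(ve_i, [2.5, 97.5])
      print(f"posterior median VE_I = {np.median(ve_i):.3f}, 95% CrI ({lo:.3f}, {hi:.3f})")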

  • A Bayesian approach to probabilistic sensitivity analysis in structured benefit-risk assessment

    Author Ed Waddingham
    Author Shahrul Mt-Isa
    Author Richard Nixon
    Author Deborah Ashby
    Volume 58
    Issue 1
    Pages 28-42
    Publication Biometrical Journal
    Date January 1, 2016
    Abstract Quantitative decision models such as multiple criteria decision analysis (MCDA) can be used in benefit-risk assessment to formalize trade-offs between benefits and risks, providing transparency to the assessment process. There is however no well-established method for propagating uncertainty of treatment effects data through such models to provide a sense of the variability of the benefit-risk balance. Here, we present a Bayesian statistical method that directly models the outcomes observed in randomized placebo-controlled trials and uses this to infer indirect comparisons between competing active treatments. The resulting treatment effects estimates are suitable for use within the MCDA setting, and it is possible to derive the distribution of the overall benefit-risk balance through Markov Chain Monte Carlo simulation. The method is illustrated using a case study of natalizumab for relapsing-remitting multiple sclerosis.

    Notes:

    • Transparent benefit-risk assessment has emerged as an important aspect of decision-making. One method for such assessment is multiple criteria decision analysis (MCDA). This method is based on specifying the expected consequences of alternative treatments, establishing their utilities, and optimizing the decision by selecting the treatment that maximizes value for the decision-maker. Currently, there are no standard probabilistic methods for assigning uncertainty to MCDA models. This study provides a case study of MCDA for natalizumab, a treatment for multiple sclerosis, in which the authors developed Bayesian methods to assess the uncertainty. The study describes three processes: fitting a suitable Bayesian model for treatment outcomes, using Markov Chain Monte Carlo (MCMC) simulation to generate a distribution for the benefit-risk score, and assessing how information on these distributions could influence the natalizumab benefit-risk decision. The authors compared three active treatments: natalizumab, interferon-beta, and glatiramer acetate. They used data from clinical trials, package inserts, or assumptions and carried out direct and indirect comparisons. The disadvantage of the methods used in this study is the complexity of fitting the models; however, Bayesian modelling is well suited to hierarchical models, complex evidence networks, and situations where the analysis must be updated frequently.
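
      A minimal sketch of propagating uncertainty through an MCDA value function by Monte Carlo; the criteria, weights, and Beta posteriors below are invented for illustration and are not the natalizumab case-study inputs.

      import numpy as np

      rng = np.random.default_rng(0)
      n_draws = 20_000

      # Invented MCDA weights (benefits positive, risks negative) and Beta
      # posteriors (alpha, beta) for each outcome probability per treatment.
      weights = {"relapse_avoided": 0.6, "serious_infection": -0.3, "discontinuation": -0.1}
      posteriors = {
          "treatment_A": {"relapse_avoided": (60, 40), "serious_infection": (3, 97), "discontinuation": (10, 90)},
          "treatment_B": {"relapse_avoided": (45, 55), "serious_infection": (1, 99), "discontinuation": (15, 85)},
      }

      def benefit_risk_scores(params):
          """Draw outcome probabilities and combine them with the MCDA weights."""
          score = np.zeros(n_draws)
          for criterion, (a, b) in params.items():
              score += weights[criterion] * rng.beta(a, b, n_draws)
          return score

      diff = benefit_risk_scores(posteriors["treatment_A"]) - benefit_risk_scores(posteriors["treatment_B"])
      print(f"P(A has the higher benefit-risk score) = {np.mean(diff > 0):.3f}")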

  • Adaptive truncated weighting for improving marginal structural model estimation of treatment effects informatively censored by subsequent therapy

    Author Xiaofei Bai
    Author Jingyi Liu
    Author Li Li
    Author Douglas Faries
    Volume 14
    Issue 6
    Pages 448-454
    Publication Pharmaceutical Statistics
    Date November 1, 2015
    Abstract Randomized clinical trials are designed to estimate the direct effect of a treatment by randomly assigning patients to receive either treatment or control. However, in some trials, patients who discontinued their initial randomized treatment are allowed to switch to another treatment. Therefore, the direct treatment effect of interest may be confounded by subsequent treatment. Moreover, the decision on whether to initiate a second-line treatment is typically made based on time-dependent factors that may be affected by prior treatment history. Due to these time-dependent confounders, traditional time-dependent Cox models may produce biased estimators of the direct treatment effect. Marginal structural models (MSMs) have been applied to estimate causal treatment effects even in the presence of time-dependent confounders. However, the occurrence of extremely large weights can inflate the variance of the MSM estimators. In this article, we proposed a new method for estimating weights in MSMs by adaptively truncating the longitudinal inverse probabilities. This method provides balance in the bias variance trade-off when large weights are inevitable, without the ad hoc removal of selected observations. We conducted simulation studies to explore the performance of different methods by comparing bias, standard deviation, confidence interval coverage rates, and mean square error under various scenarios. We also applied these methods to a randomized, open-label, phase III study of patients with nonsquamous non-small cell lung cancer. Copyright © 2015 John Wiley & Sons, Ltd.

    Notes:

    • In many randomized controlled trials, participants have changes in their randomly assigned treatment as warranted by clinical circumstances. This non-random switching creates the analytic problem of estimating direct treatment effects that are confounded, in a time-varying manner, by subsequent treatment. Existing methods include marginal structural models (MSMs) that use stabilized inverse probability of treatment (IPT) weighting to adjust for time-varying confounding. However, this approach can render estimation unstable when extreme weights occur. Extreme weights arise when the expected treatment switch does not occur in alignment with the time-varying confounders. One solution to extreme weights is post-hoc identification and exclusion of such observations. Instead, the authors propose an adaptive truncation approach. In adaptive truncation, stabilized IPT weights are calculated for each individual at every time point and truncated at the 99th percentile of the weights at that time point. The truncated weight from time point i-1 is used to update the weight at time point i, so that, given their multiplicative longitudinal structure, extreme weights are prevented from being carried forward and growing further. The authors used this approach in marginal structural Cox models but note that it can be adapted to other MSMs. Their simulations show that adaptive truncation yields lower mean squared errors in the presence of large weights; when the weight distribution stays within a normal range, there is no difference between the regular MSM and the adaptively truncated MSM. The primary advantage of adaptive truncation is that no post-hoc exclusion exercise is needed; however, the approach balances bias and variance rather than eliminating the problem of extreme weights.
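
      A rough sketch of the per-time-point truncation described above, written against a hypothetical long-format table of time-specific stabilized weights (column names are invented; this is not the authors' code):

      import numpy as np
      import pandas as pd

      def adaptive_truncated_weights(df, id_col="id", time_col="time", w_col="sw", q=0.99):
          """Cumulative stabilized IPT weights with truncation at the q-th quantile
          at each time point; the truncated product, not the untruncated one, is
          carried forward to the next time point."""
          df = df.sort_values([id_col, time_col])
          cum = pd.Series(1.0, index=df[id_col].unique())
          out = []
          for t, grp in df.groupby(time_col, sort=True):
              ids = grp[id_col].to_numpy()
              w = cum.loc[ids].to_numpy() * grp[w_col].to_numpy()
              w = np.minimum(w, np.quantile(w, q))   # truncate at the 99th percentile
              cum.loc[ids] = w                       # carry the truncated weight forward
              out.append(pd.DataFrame({id_col: ids, time_col: t, "w_trunc": w}))
          return pd.concat(out, ignore_index=True)

      # Toy example with invented time-specific weights (one extreme value at time 2)
      toy = pd.DataFrame({"id": [1, 1, 2, 2, 3, 3], "time": [1, 2, 1, 2, 1, 2],
                          "sw": [1.1, 0.9, 0.8, 5.0, 1.0, 1.2]})
      print(adaptive_truncated_weights(toy))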

  • Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models

    Author Amand Schmidt
    Author Olaf Klungel
    Author Rolf Groenwold
    Volume 27
    Issue 1
    Pages 133-142
    Publication Epidemiology
    Date January 2016
    Abstract Background: Postlaunch data on medical treatments can be analyzed to explore adverse events or relative effectiveness in real-life settings. These analyses are often complicated by the number of potential confounders and the possibility of model misspecification. Methods: We conducted a simulation study to compare the performance of logistic regression, propensity score, disease risk score, and stabilized inverse probability weighting methods to adjust for confounding. Model misspecification was induced in the independent derivation dataset. We evaluated performance using relative bias confidence interval coverage of the true effect, among other metrics. Results: At low events per coefficient (1.0 and 0.5), the logistic regression estimates had a large relative bias (greater than −100%). Bias of the disease risk score estimates was at most 13.48% and 18.83%. For the propensity score model, this was 8.74% and >100%, respectively. At events per coefficient of 1.0 and 0.5, inverse probability weighting frequently failed or reduced to a crude regression, resulting in biases of −8.49% and 24.55%. Coverage of logistic regression estimates became less than the nominal level at events per coefficient ≤5. For the disease risk score, inverse probability weighting, and propensity score, coverage became less than nominal at events per coefficient ≤2.5, ≤1.0, and ≤1.0, respectively. Bias of misspecified disease risk score models was 16.55%. Conclusion: In settings with low events/exposed subjects per coefficient, disease risk score methods can be useful alternatives to logistic regression models, especially when propensity score models cannot be used. Despite better performance of disease risk score methods than logistic regression and propensity score models in small events per coefficient settings, bias, and coverage still deviated from nominal.

    Notes:

    • Adjusting for confounding in comparative effectiveness or safety research involving newly launched drugs is particularly challenging, since both the exposure and the outcome are typically rare, while regression methods require a minimum of 8-10 outcomes per adjusted covariate to produce unbiased and precise estimates. The authors compared the performance of logistic regression, propensity score (PS), disease risk score (DRS), and stabilized inverse probability weighting (IPW) methods in this simulation study. They found that when events per covariate were low, logistic regression had a large bias, while IPW methods failed or reduced to crude regression. DRS and PS methods resulted in less relative bias than the other models when the events per covariate ratio was 1.0 (13.48% and 8.74%, respectively). When the events per covariate ratio was 0.5, the DRS resulted in the least bias (18.83% for DRS versus >100% for the PS and logistic regression methods, and 24.55% for IPW). The authors note that despite the better performance of DRS in the case of small events per covariate, results were still biased. These results are applicable to any pharmacoepidemiologic study involving newly launched drugs, or to the study of rare events.
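
      A minimal sketch of one common disease risk score analysis (fit the outcome model among the unexposed, score everyone, then adjust for the score), on simulated data; this illustrates the general DRS idea and is not a re-creation of the paper's simulation scenarios.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(1)
      n, true_log_or = 5000, 0.5

      # Simulated confounders, exposure, and a fairly rare binary outcome (all invented).
      x = rng.normal(size=(n, 5))
      p_exp = 1 / (1 + np.exp(-(x @ np.array([0.4, 0.3, 0.2, 0.0, -0.3]))))
      exposed = rng.binomial(1, p_exp)
      lin = -3 + x @ np.array([0.5, 0.4, 0.3, 0.2, 0.1]) + true_log_or * exposed
      y = rng.binomial(1, 1 / (1 + np.exp(-lin)))

      # 1) Fit the disease risk score model among the unexposed only.
      drs_fit = sm.Logit(y[exposed == 0], sm.add_constant(x[exposed == 0])).fit(disp=0)
      # 2) Score everyone on their predicted baseline outcome risk.
      drs = drs_fit.predict(sm.add_constant(x))
      # 3) Estimate the exposure effect adjusted for the (logit of the) score.
      design = sm.add_constant(np.column_stack([exposed, np.log(drs / (1 - drs))]))
      outcome_fit = sm.Logit(y, design).fit(disp=0)
      print("adjusted log-OR for exposure:", round(outcome_fit.params[1], 3))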

  • A generalized semiparametric mixed model for analysis of multivariate health care utilization data

    Author Zhuokai Li
    Author Hai Liu
    Author Wanzhu Tu
    Pages 0962280215615159
    Publication Statistical Methods in Medical Research
    Date 2015-11-23
    Abstract Health care utilization is an outcome of interest in health services research. Two frequently studied forms of utilization are counts of emergency department (ED) visits and hospital admissions. These counts collectively convey a sense of disease exacerbation and cost escalation. Different types of event counts from the same patient form a vector of correlated outcomes. Traditional analysis typically model such outcomes one at a time, ignoring the natural correlations between different events, and thus failing to provide a full picture of patient care utilization. In this research, we propose a multivariate semiparametric modeling framework for the analysis of multiple health care events following the exponential family of distributions in a longitudinal setting. Bivariate nonparametric functions are incorporated to assess the concurrent nonlinear influences of independent variables as well as their interaction effects on the outcomes. The smooth functions are estimated using the thin plate regression splines. A maximum penalized likelihood method is used for parameter estimation. The performance of the proposed method was evaluated through simulation studies. To illustrate the method, we analyzed data from a clinical trial in which ED visits and hospital admissions were considered as bivariate outcomes.

    Notes:

    • Analysis of healthcare data often involves measurement of more than one outcome concurrently; however, the temporal dependence and underlying correlations between the outcomes within individual patients are often not taken into account. In addition, independent variables are often assumed to have a linear relationship with the outcomes, resulting in possible model misspecification and biased results. The authors of this paper propose a generalized multivariate semiparametric mixed model for analysis of multiple health utilization count outcomes and nonlinear independent variable effects. Previously developed bivariate semiparametric models are generalized to count outcomes using the exponential family of distributions and model parameters are estimated using maximum penalized likelihood. Interactions of independent variables which have nonlinear influences are modeled using thin plate regression splines and simulation studies are performed to assess the performance of the proposed method. A real world example using clinical trial healthcare utilization data is also presented. Under settings of both strong and moderate between-outcome correlation, the proposed method showed strong model fitting performance with low parameter estimation bias and confidence interval coverage probabilities close to the 95% nominal level. The proposed model also showed superior performance compared to both generalized additive mixed models and generalized linear mixed models. Using the method on clinical trial data, strong within-subject correlation was noted between ED visits and hospital admission rates (modeled as count outcomes), and non-linear patterns were noted for the effect of independent variables (depression severity and functional health status) on the outcomes; the model was thus deemed to be appropriate for data of this type. Though used with clinical trial data in this study, the methods proposed can be applied to observational studies of medication effects where correlated outcomes and/or non-linear independent variable effects (and interaction term effects) are suspected. While previously proposed methods for handling correlated outcome data usually focus on continuous outcomes, the general model proposed here by the authors will allow for its use with a wide variety of data types and study designs.

  • A Method for the Minimization of Competition Bias in Signal Detection from Spontaneous Reporting Databases

    Author Mickael Arnaud
    Author Francesco Salvo
    Author Ismaïl Ahmed
    Author Philip Robinson
    Author Nicholas Moore
    Author Bernard Bégaud
    Author Pascale Tubert-Bitter
    Author Antoine Pariente
    Volume 39
    Issue 3
    Pages 251-260
    Publication Drug Safety
    Date 2015/12/29
    Abstract Introduction The two methods for minimizing competition bias in signal of disproportionate reporting (SDR) detection—masking factor (MF) and masking ratio (MR)—have focused on the strength of disproportionality for identifying competitors and have been tested using competitors at the drug level. Objectives The aim of this study was to develop a method that relies on identifying competitors by considering the proportion of reports of adverse events (AEs) that mention the drug class at an adequate level of drug grouping to increase sensitivity (Se) for SDR unmasking, and its comparison with MF and MR. Methods Reports in the French spontaneous reporting database between 2000 and 2005 were selected. Five AEs were considered: myocardial infarction, pancreatitis, aplastic anemia, convulsions, and gastrointestinal bleeding; related reports were retrieved using standardized Medical Dictionary for Regulatory Activities (MedDRA®) queries. Potential competitors of AEs were identified using the developed method, i.e. Competition Index (ComIn), as well as MF and MR. All three methods were tested according to Anatomical Therapeutic Chemical (ATC) classification levels 2–5. For each AE, SDR detection was performed, first in the complete database, and second after removing reports mentioning competitors; SDRs only detected after the removal were unmasked. All unmasked SDRs were validated using the Summary of Product Characteristics, and constituted the reference dataset used for computing the performance for SDR unmasking (area under the curve [AUC], Se). Results Performance of the ComIn was highest when considering competitors at ATC level 3 (AUC: 62 %; Se: 52 %); similar results were obtained with MF and MR. Conclusion The ComIn could greatly minimize the competition bias in SDR detection. Further study using a larger dataset is needed.

    Notes:

    • The authors developed a new method, the Competition Index (ComIn), to minimize competition bias in signal of disproportionate reporting (SDR) detection, and compared this new method with two existing methods, the masking factor (MF) and the masking ratio (MR). The existing methods involve two steps: first, identify the potential competitors using an index of competition (MF or MR), and then remove the reports related to the identified competitors. MF and MR rely on the strength of disproportionality at the individual-drug level, whereas the new approach uses the proportion of adverse event (AE) reports mentioning a drug class to identify potential competitors.

      The authors used the French spontaneous reporting database to test the three methods on selected serious AEs across Anatomical Therapeutic Chemical (ATC) classification levels. The area under the curve was used to estimate performance for SDR unmasking. They found that ComIn identified more competitors than MF and MR by considering competitors at the drug-class level rather than at the individual-drug level used by MF and MR. With the combination of ATC level 3 and a threshold of 12%, ComIn provided the best performance for unmasking SDRs and for sensitivity of SDR unmasking. Although the differences among the three methods were not statistically significant, ComIn has a simpler definition and consumes fewer computational resources than the other two approaches. In conclusion, the new ComIn method can greatly minimize competition bias in SDR detection and shows ability similar to MF and MR for unmasking SDRs.
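
      A toy sketch of the two-step logic (flag competitor drug classes for an AE, remove their reports, and recompute a disproportionality statistic). The reporting odds ratio, the 50% class-share threshold, and the report table below are illustrative inventions; the paper's exact ComIn definition and 12% threshold are not reproduced here.

      import pandas as pd

      def reporting_odds_ratio(reports, drug, ae):
          """2x2 reporting odds ratio for one drug-AE pair (0.5 continuity correction)."""
          a = ((reports.drug == drug) & (reports.ae == ae)).sum()
          b = ((reports.drug == drug) & (reports.ae != ae)).sum()
          c = ((reports.drug != drug) & (reports.ae == ae)).sum()
          d = ((reports.drug != drug) & (reports.ae != ae)).sum()
          return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))

      # Toy spontaneous-report table: one row per report (columns invented).
      reports = pd.DataFrame({
          "drug":       ["A", "A", "B", "B", "B", "B", "C", "C", "A", "C"],
          "drug_class": ["X", "X", "Y", "Y", "Y", "Y", "X", "X", "X", "X"],
          "ae":         ["MI", "GI", "MI", "MI", "MI", "GI", "GI", "GI", "GI", "MI"],
      })

      # ComIn-like step: share of the AE's reports that mention each drug class.
      class_share = reports[reports.ae == "MI"].drug_class.value_counts(normalize=True)
      competitors = class_share[class_share > 0.5].index    # illustrative threshold

      print("ROR before unmasking:", round(reporting_odds_ratio(reports, "A", "MI"), 2))
      unmasked = reports[~reports.drug_class.isin(competitors) | (reports.drug == "A")]
      print("ROR after removing competitor-class reports:",
            round(reporting_odds_ratio(unmasked, "A", "MI"), 2))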

  • A normalization method for combination of laboratory test results from different electronic healthcare databases in a distributed research network

    Author Dukyong Yoon
    Author Martijn J. Schuemie
    Author Ju Han Kim
    Author Dong Ki Kim
    Author Man Young Park
    Author Eun Kyoung Ahn
    Author Eun-Young Jung
    Author Dong Kyun Park
    Author Soo Yeon Cho
    Author Dahye Shin
    Author Yeonsoo Hwang
    Author Rae Woong Park
    Volume 25
    Issue 3
    Pages 307-316
    Publication Pharmacoepidemiology and Drug Safety
    Date March 1, 2016
    Abstract Purpose Distributed research networks (DRNs) afford statistical power by integrating observational data from multiple partners for retrospective studies. However, laboratory test results across care sites are derived using different assays from varying patient populations, making it difficult to simply combine data for analysis. Additionally, existing normalization methods are not suitable for retrospective studies. We normalized laboratory results from different data sources by adjusting for heterogeneous clinico-epidemiologic characteristics of the data and called this the subgroup-adjusted normalization (SAN) method. Methods Subgroup-adjusted normalization renders the means and standard deviations of distributions identical under population structure-adjusted conditions. To evaluate its performance, we compared SAN with existing methods for simulated and real datasets consisting of blood urea nitrogen, serum creatinine, hematocrit, hemoglobin, serum potassium, and total bilirubin. Various clinico-epidemiologic characteristics can be applied together in SAN. For simplicity of comparison, age and gender were used to adjust population heterogeneity in this study. Results In simulations, SAN had the lowest standardized difference in means (SDM) and Kolmogorov–Smirnov values for all tests (p < 0.05). In a real dataset, SAN had the lowest SDM and Kolmogorov–Smirnov values for blood urea nitrogen, hematocrit, hemoglobin, and serum potassium, and the lowest SDM for serum creatinine (p < 0.05). Conclusion Subgroup-adjusted normalization performed better than normalization using other methods. The SAN method is applicable in a DRN environment and should facilitate analysis of data integrated across DRN partners for retrospective observational studies. Copyright © 2015 John Wiley & Sons, Ltd.

    Notes:

    • Existing approaches to normalizing data from different sources (rank-based and scale-adjustment methods) take into account neither the underlying clinical meaning of the values nor population heterogeneity. This problem is relevant to observational research that uses laboratory values from distributed research networks. These networks allow analysis of data from contributing data partners at different healthcare institutions without compromising privacy or confidentiality. However, laboratory practices differ by institution, introducing systematic errors into pooled estimates. To circumvent this problem, the authors developed a new method, subgroup-adjusted normalization (SAN). After stratifying the contributing datasets by clinico-epidemiologic characteristics (for example, age and gender), normalized values in the second dataset are generated using the adjusted mean and standard deviation of the first dataset. The adjusted mean and standard deviation for the first dataset are obtained using patient counts from the second dataset (whereas conventional normalization uses patient counts from the same dataset). The authors compared the distributions produced by various normalization techniques using the standardized difference in means (SDM) and Kolmogorov-Smirnov (KS) values and found that the SAN approach generated distributions with the lowest SDM and KS values. This approach has applications in epidemiologic research that pools data from healthcare institutions with differing practices.
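
      A rough sketch of the subgroup-adjusted idea as described above: the reference site's stratum means and standard deviations are re-weighted by the target site's stratum counts before z-scoring the target values. The variables, strata, and data are invented, and details (including the pooled-SD step) may differ from the paper's implementation.

      import numpy as np
      import pandas as pd

      def san_normalize(ref, target, value="creatinine", strata=("age_band", "sex")):
          """Normalize `value` in `target` using reference-site stratum statistics
          weighted by the target site's stratum counts (sketch only)."""
          strata = list(strata)
          ref_stats = ref.groupby(strata)[value].agg(["mean", "std"])
          w = target.groupby(strata).size()
          w = w / w.sum()
          adj_mean = (ref_stats["mean"] * w).sum()
          adj_sd = np.sqrt((ref_stats["std"] ** 2 * w).sum())   # simple pooled-variance approximation
          out = target.copy()
          out[value + "_san"] = (out[value] - adj_mean) / adj_sd
          return out

      # Toy data for two sites with different age mixes (values invented).
      rng = np.random.default_rng(3)
      site1 = pd.DataFrame({"age_band": rng.choice(["<65", "65+"], 400, p=[0.7, 0.3]),
                            "sex": rng.choice(["F", "M"], 400)})
      site1["creatinine"] = 0.9 + 0.3 * (site1.age_band == "65+") + rng.normal(0, 0.20, 400)
      site2 = pd.DataFrame({"age_band": rng.choice(["<65", "65+"], 400, p=[0.4, 0.6]),
                            "sex": rng.choice(["F", "M"], 400)})
      site2["creatinine"] = 1.0 + 0.3 * (site2.age_band == "65+") + rng.normal(0, 0.25, 400)

      print(san_normalize(site1, site2).creatinine_san.describe().round(2))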

  • A simulation study of sample size demonstrated the importance of the number of events per variable to develop prediction models in clustered data

    Author L. Wynants
    Author W. Bouwmeester
    Author K. G. M. Moons
    Author M. Moerbeek
    Author D. Timmerman
    Author S. Van Huffel
    Author B. Van Calster
    Author Y. Vergouwe
    Volume 68
    Issue 12
    Pages 1406-1414
    Publication Journal of Clinical Epidemiology
    Date 2015-12-01
    Abstract Objectives This study aims to investigate the influence of the amount of clustering [intraclass correlation (ICC) = 0%, 5%, or 20%], the number of events per variable (EPV) or candidate predictor (EPV = 5, 10, 20, or 50), and backward variable selection on the performance of prediction models. Study Design and Setting Researchers frequently combine data from several centers to develop clinical prediction models. In our simulation study, we developed models from clustered training data using multilevel logistic regression and validated them in external data. Results The amount of clustering was not meaningfully associated with the models' predictive performance. The median calibration slope of models built in samples with EPV = 5 and strong clustering (ICC = 20%) was 0.71. With EPV = 5 and ICC = 0%, it was 0.72. A higher EPV related to an increased performance: the calibration slope was 0.85 at EPV = 10 and ICC = 20% and 0.96 at EPV = 50 and ICC = 20%. Variable selection sometimes led to a substantial relative bias in the estimated predictor effects (up to 118% at EPV = 5), but this had little influence on the model's performance in our simulations. Conclusion We recommend at least 10 EPV to fit prediction models in clustered data using logistic regression. Up to 50 EPV may be needed when variable selection is performed.

    Notes:

    • Clinical prediction models for dichotomous outcomes are often constructed using multivariable logistic regression. A general rule of thumb for these analyses is to include at least 10 outcome events per variable (EPV), or candidate predictor, when developing prediction models. The validity of such rules is typically evaluated by determining the degree of bias in the regression estimates at varying EPVs. The authors of this study also evaluate different EPVs in terms of the predictive performance of the regression model. They created large source populations with intraclass correlations of 0%, 5%, and 20% to reflect situations with varying degrees of clustering. Random intercept models were fitted to each sample and covariates were selected through backward selection. The authors investigate the influence of the number of EPVs on various model criteria: bias in the estimated regression coefficients and predictive performance (c-statistic or c-index, within-cluster c-index, and calibration slopes).

      The authors showed that the number of EPV influences the parameter estimates of the prediction model as well as its predictive performance in external data. At low EPV (e.g., 5) and stronger clustering (e.g., ICC = 20%), the model was overfitted and its predictions were too extreme. At EPV = 50, however, discrimination between events and nonevents improved. Clustering, as measured by the intraclass correlation, did not appear to affect model predictive performance. Additionally, they found that a larger sample size, regardless of EPV, improved the stability of the prediction model, indicating that nonevents are an important contributor to model performance.

      This study enhances the discussion of adequate sample size and EPV to guide the creation of prediction models using multivariable logistic regression.
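
      A minimal sketch of the kind of check described above: simulate training data at a target EPV, fit a logistic prediction model, and estimate the calibration slope in a large external validation sample. The coefficients and event rate are invented, and no clustering or multilevel structure is included here.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(7)
      beta = np.array([-2.2, 0.5, 0.4, 0.3, 0.2, 0.1])   # intercept + 5 predictors (invented)

      def simulate(n):
          x = rng.normal(size=(n, 5))
          p = 1 / (1 + np.exp(-(beta[0] + x @ beta[1:])))
          return sm.add_constant(x), rng.binomial(1, p)

      def calibration_slope(events_per_variable, n_valid=20_000):
          # Choose the training size so the expected event count matches the target EPV.
          event_rate = 0.10                                # roughly implied by the intercept above
          n_train = int(events_per_variable * 5 / event_rate)
          x_tr, y_tr = simulate(n_train)
          model = sm.Logit(y_tr, x_tr).fit(disp=0)
          x_va, y_va = simulate(n_valid)
          lp = x_va @ model.params                         # linear predictor in validation data
          recal = sm.Logit(y_va, sm.add_constant(lp)).fit(disp=0)
          return recal.params[1]                           # slope < 1 indicates overfitting

      for epv in (5, 10, 50):
          print(f"EPV={epv:>2}: calibration slope = {calibration_slope(epv):.2f}")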

  • Asymptotically Unbiased Estimation of Exposure Odds Ratios in Complete Records Logistic Regression

    Author Jonathan W. Bartlett
    Author Ofer Harel
    Author James R. Carpenter
    Volume 182
    Issue 8
    Pages 730-736
    Publication American Journal of Epidemiology
    Date 10/15/2015
    Abstract Missing data are a commonly occurring threat to the validity and efficiency of epidemiologic studies. Perhaps the most common approach to handling missing data is to simply drop those records with 1 or more missing values, in so-called “complete records” or “complete case” analysis. In this paper, we bring together earlier-derived yet perhaps now somewhat neglected results which show that a logistic regression complete records analysis can provide asymptotically unbiased estimates of the association of an exposure of interest with an outcome, adjusted for a number of confounders, under a surprisingly wide range of missing-data assumptions. We give detailed guidance describing how the observed data can be used to judge the plausibility of these assumptions. The results mean that in large epidemiologic studies which are affected by missing data and analyzed by logistic regression, exposure associations may be estimated without bias in a number of settings where researchers might otherwise assume that bias would occur.

    Notes:

    • Complete case analysis (CCA) is a simple method commonly used to address missing data in epidemiologic research, but it can result in substantial bias when data are not missing completely at random (MCAR). The authors provide guidance on the conditions under which CCA yields an unbiased odds ratio from binary logistic regression, even when data are missing at random (MAR) or missing not at random (MNAR). These include circumstances where missingness depends only on the outcome (the intercept is biased, but the confounder and exposure coefficients are unbiased), only on the covariates or exposure but not the outcome (all coefficients are unbiased), or only on the outcome and confounders but not the exposure (only the exposure coefficient is unbiased). They also discuss circumstances involving model misspecification and interaction, and how to assess the plausibility of the missingness assumptions from the observed data in addition to conceptual knowledge. They use simulations based on data from a cohort study of professional flight crew to illustrate these concepts. The authors emphasize that other limitations of CCA, such as reduced precision due to discarding data, remain. This article is pertinent to any research where missing data are an issue and logistic regression is used to estimate an exposure-outcome relationship.
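
      A small simulation sketch of one of the conditions above: when missingness depends only on the exposure (not on the outcome), the complete-records logistic regression still recovers the exposure log odds ratio. The data-generating values are invented; this is not the authors' flight-crew example.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(11)
      n, true_log_or = 50_000, 0.7

      confounder = rng.normal(size=n)
      exposure = rng.binomial(1, 1 / (1 + np.exp(-0.8 * confounder)))
      p_outcome = 1 / (1 + np.exp(-(-2 + true_log_or * exposure + 0.6 * confounder)))
      outcome = rng.binomial(1, p_outcome)

      # Records are "complete" with a probability that depends only on the exposure
      # (a MAR mechanism that does not involve the outcome).
      complete = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 1.5 * exposure)))) == 1

      def exposure_log_or(mask):
          design = sm.add_constant(np.column_stack([exposure[mask], confounder[mask]]))
          return sm.Logit(outcome[mask], design).fit(disp=0).params[1]

      print("full data        :", round(exposure_log_or(np.ones(n, bool)), 3))
      print("complete records :", round(exposure_log_or(complete), 3))   # close, despite the missingness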

  • A system to build distributed multivariate models and manage disparate data sharing policies: implementation in the scalable national network for effectiveness research

    Author Daniella Meeker
    Author Xiaoqian Jiang
    Author Michael E. Matheny
    Author Claudiu Farcas
    Author Michel D’Arcy
    Author Laura Pearlman
    Author Lavanya Nookala
    Author Michele E. Day
    Author Katherine K. Kim
    Author Hyeoneui Kim
    Author Aziz Boxwala
    Author Robert El-Kareh
    Author Grace M. Kuo
    Author Frederic S. Resnic
    Author Carl Kesselman
    Author Lucila Ohno-Machado
    Volume 22
    Issue 6
    Pages 1187-1195
    Publication Journal of the American Medical Informatics Association
    Date 2015/11/01
    Abstract Background Centralized and federated models for sharing data in research networks currently exist. To build multivariate data analysis for centralized networks, transfer of patient-level data to a central computation resource is necessary. The authors implemented distributed multivariate models for federated networks in which patient-level data is kept at each site and data exchange policies are managed in a study-centric manner.Objective The objective was to implement infrastructure that supports the functionality of some existing research networks (e.g., cohort discovery, workflow management, and estimation of multivariate analytic models on centralized data) while adding additional important new features, such as algorithms for distributed iterative multivariate models, a graphical interface for multivariate model specification, synchronous and asynchronous response to network queries, investigator-initiated studies, and study-based control of staff, protocols, and data sharing policies.Materials and Methods Based on the requirements gathered from statisticians, administrators, and investigators from multiple institutions, the authors developed infrastructure and tools to support multisite comparative effectiveness studies using web services for multivariate statistical estimation in the SCANNER federated network.Results The authors implemented massively parallel (map-reduce) computation methods and a new policy management system to enable each study initiated by network participants to define the ways in which data may be processed, managed, queried, and shared. The authors illustrated the use of these systems among institutions with highly different policies and operating under different state laws.Discussion and Conclusion Federated research networks need not limit distributed query functionality to count queries, cohort discovery, or independently estimated analytic models. Multivariate analyses can be efficiently and securely conducted without patient-level data transport, allowing institutions with strict local data storage requirements to participate in sophisticated analyses based on federated research networks.

    Notes:

    • There has been growing interest in the formation of large electronic database networks and clinical data research networks (CDRNs), such as PCORnet and the HMO Research Network, for comparative effectiveness research (CER). However, data sharing across multiple institutions faces various regulatory and ethical challenges (e.g., patient privacy), and institutions use disparate data infrastructures and analytic tools, which can limit the scalability of research. The authors discuss their implementation of the SCAlable National Network for Effectiveness Research (SCANNER) infrastructure, a platform that supports existing research networks while adding new features such as algorithms for distributed iterative multivariate models, a graphical interface for multivariate model specification, synchronous and asynchronous responses to network queries, investigator-initiated studies, and study-based control of staff, protocols, and data sharing policies. The authors use two CER case studies to show the potential applications.

      Institutions that wish to conduct collaborative research can become part of the SCANNER network (details in the article). Because any in-network institution can propose a research protocol, the platform can potentially support large-scale studies if many institutions participate.
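
      A bare-bones illustration of the general idea behind distributed multivariate estimation (not SCANNER's actual implementation or API): each site keeps its patient-level data and shares only aggregate gradient and Hessian contributions for a shared coefficient vector, and a coordinator runs Newton-Raphson for a logistic model.

      import numpy as np

      rng = np.random.default_rng(5)

      def make_site(n):
          """Invent one site's patient-level data (kept local in a real network)."""
          x = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
          y = rng.binomial(1, 1 / (1 + np.exp(-(x @ np.array([-1.0, 0.8, -0.5])))))
          return x, y

      sites = [make_site(n) for n in (800, 1200, 500)]

      def local_contributions(x, y, beta):
          """Each site returns only aggregate gradient/Hessian terms, never records."""
          p = 1 / (1 + np.exp(-(x @ beta)))
          grad = x.T @ (y - p)
          hess = -(x.T * (p * (1 - p))) @ x
          return grad, hess

      beta = np.zeros(3)
      for _ in range(25):                    # coordinator's Newton-Raphson iterations
          grads, hessians = zip(*(local_contributions(x, y, beta) for x, y in sites))
          beta = beta - np.linalg.solve(sum(hessians), sum(grads))

      print("pooled-equivalent coefficients:", np.round(beta, 3))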

  • A tutorial on variable selection for clinical prediction models: feature selection methods in data mining could improve the results

    Author Farideh Bagherzadeh-Khiabani
    Author Azra Ramezankhani
    Author Fereidoun Azizi
    Author Farzad Hadaegh
    Author Ewout W. Steyerberg
    Author Davood Khalili
    Volume 71
    Pages 76-85
    Publication Journal of Clinical Epidemiology
    Date 2016/03/01
    Abstract Objectives Identifying an appropriate set of predictors for the outcome of interest is a major challenge in clinical prediction research. The aim of this study was to show the application of some variable selection methods, usually used in data mining, for an epidemiological study. We introduce here a systematic approach. Study Design and Setting The P-value-based method, usually used in epidemiological studies, and several filter and wrapper methods were implemented to select the predictors of diabetes among 55 variables in 803 prediabetic females, aged ≥20 years, followed for 10–12 years. To develop a logistic model, variables were selected from a train data set and evaluated on the test data set. The measures of Akaike information criterion (AIC) and area under the curve (AUC) were used as performance criteria. We also implemented a full model with all 55 variables. Results We found that the worst and the best models were the full model and models based on the wrappers, respectively. Among filter methods, symmetrical uncertainty gave both the best AUC and AIC. Conclusion Our experiment showed that the variable selection methods used in data mining could improve the performance of clinical prediction models. An R program was developed to make these methods more feasible and visualize the results.

    Notes:

    • The authors introduce a systematic approach to help researchers identify potential predictors of an outcome of interest. The process involves three basic steps: 1) subset generation, 2) subset evaluation, and 3) stopping criteria. The most common variable selection methods can then be grouped into four categories: 1) filter (subsets of variables are generated and evaluated before modeling starts), 2) wrapper (subsets are generated before modeling and variables are included based on the performance of the model), 3) embedded (variables are proposed and evaluated while building the model), and 4) hybrid (combines the filter and wrapper methods). The first step, subset generation, can be completed using three search approaches: heuristic (forward, backward, and greedy hill climbing), complete (similar to heuristic but allowing backtracking to find an optimal subset), and random search. Next, subset evaluation is based on the area under the curve (AUC) or the AIC for the wrapper methods; several other performance evaluation approaches are available for the filter methods. Lastly, a stopping criterion is applied based on completion of the search, failure to produce better subsets on subsequent iterations, or reaching an acceptable threshold. Using this systematic approach and a real-world dataset from the Tehran Lipid and Glucose Study, the authors showed that best-first search or hill-climbing approaches combined with AUC or AIC performance measures outperform traditional forward or backward variable selection. These alternatives to traditional methods are easy to use and are important for clinical prediction modeling.
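
      A toy sketch of a wrapper approach in the spirit of the tutorial: a greedy forward search scored by validation AUC with a simple stopping rule. The data are simulated, and the paper's R program and best-first/hill-climbing variants are not reproduced here.

      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import roc_auc_score
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(9)
      n, n_vars = 800, 12                    # invented data: 3 informative, 9 noise variables
      X = rng.normal(size=(n, n_vars))
      p = 1 / (1 + np.exp(-(-1 + X[:, 0] + 0.8 * X[:, 1] - 0.6 * X[:, 2])))
      y = rng.binomial(1, p)
      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

      def wrapper_forward_selection(max_vars=5):
          """Greedy forward search: add the variable that most improves test-set AUC."""
          selected, best_auc = [], 0.5
          while len(selected) < max_vars:
              scores = {}
              for j in set(range(n_vars)) - set(selected):
                  cols = selected + [j]
                  model = LogisticRegression(max_iter=1000).fit(X_tr[:, cols], y_tr)
                  scores[j] = roc_auc_score(y_te, model.predict_proba(X_te[:, cols])[:, 1])
              j_best = max(scores, key=scores.get)
              if scores[j_best] <= best_auc:     # stopping rule: no further improvement
                  break
              selected.append(j_best)
              best_auc = scores[j_best]
          return selected, round(best_auc, 3)

      print(wrapper_forward_selection())         # expect roughly variables 0, 1, and 2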

  • Augmented Beta rectangular regression models: A Bayesian perspective

    Author Jue Wang
    Author Sheng Luo
    Volume 58
    Issue 1
    Pages 206-221
    Publication Biometrical Journal
    Date January 1, 2016
    Abstract Mixed effects Beta regression models based on Beta distributions have been widely used to analyze longitudinal percentage or proportional data ranging between zero and one. However, Beta distributions are not flexible to extreme outliers or excessive events around tail areas, and they do not account for the presence of the boundary values zeros and ones because these values are not in the support of the Beta distributions. To address these issues, we propose a mixed effects model using Beta rectangular distribution and augment it with the probabilities of zero and one. We conduct extensive simulation studies to assess the performance of mixed effects models based on both the Beta and Beta rectangular distributions under various scenarios. The simulation studies suggest that the regression models based on Beta rectangular distributions improve the accuracy of parameter estimates in the presence of outliers and heavy tails. The proposed models are applied to the motivating Neuroprotection Exploratory Trials in Parkinson's Disease (PD) Long-term Study-1 (LS-1 study, n = 1741), developed by The National Institute of Neurological Disorders and Stroke Exploratory Trials in Parkinson's Disease (NINDS NET-PD) network.

    Notes:

    • In longitudinal studies, mixed effects beta regression models based on beta distributions are often used to analyze percentage or proportional data ranging between zero and one over time. Beta distributions are susceptible to outliers in situations where a higher proportion of events occurs in the tail areas of the distribution. Various methods have been proposed to address this limitation, but these approaches trade off the ability to account for within-subject correlation in longitudinal data.

      To address these issues, the authors proposed a mixed effects model based on the beta rectangular distribution, augmented with probabilities of zero and one, within a generalized linear mixed model framework. They conducted several simulation studies that included a fixed proportion of outliers (1%) to assess the performance of their model under several scenarios (or settings). These models were then applied to data from the Neuroprotection Exploratory Trials in Parkinson’s Disease Long-term Study-1. They assessed performance with Bayesian fit statistics and shape parameter estimates.

      The simulation and application analyses suggested that regression models based on beta rectangular distributions improve the accuracy of parameter estimates in the presence of outliers and heavy tails, reducing bias and increasing precision.
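
      A small sketch of what the augmented beta rectangular distribution looks like as a data-generating mechanism: point masses at 0 and 1 plus a mixture of a uniform ("rectangular") component and a beta component. The parameter values are invented, and this sampler only illustrates the distribution; it is not the authors' mixed effects model.

      import numpy as np

      def sample_augmented_beta_rectangular(n, p0=0.05, p1=0.03, theta=0.2,
                                            alpha=2.0, beta=5.0, rng=None):
          """Draws from: P(X=0)=p0, P(X=1)=p1, and otherwise a mixture
          theta * Uniform(0,1) + (1 - theta) * Beta(alpha, beta)."""
          rng = rng or np.random.default_rng()
          probs = [p0, p1, (1 - p0 - p1) * theta, (1 - p0 - p1) * (1 - theta)]
          component = rng.choice(4, size=n, p=probs)
          x = np.empty(n)
          x[component == 0] = 0.0                                           # boundary mass at 0
          x[component == 1] = 1.0                                           # boundary mass at 1
          x[component == 2] = rng.uniform(size=(component == 2).sum())      # rectangular part
          x[component == 3] = rng.beta(alpha, beta, (component == 3).sum()) # beta part
          return x

      x = sample_augmented_beta_rectangular(10_000, rng=np.random.default_rng(2))
      print(f"share at 0: {np.mean(x == 0):.3f}, share at 1: {np.mean(x == 1):.3f}, mean: {x.mean():.3f}")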

  • Automating the generation of lexical patterns for processing free text in clinical documents

    Author Frank Meng
    Author Craig Morioka
    Volume 22
    Issue 5
    Pages 980-986
    Publication Journal of the American Medical Informatics Association
    Date 2015/09/01
    Abstract Objective Many tasks in natural language processing utilize lexical pattern-matching techniques, including information extraction (IE), negation identification, and syntactic parsing. However, it is generally difficult to derive patterns that achieve acceptable levels of recall while also remaining highly precise.Materials and Methods We present a multiple sequence alignment (MSA)-based technique that automatically generates patterns, thereby leveraging language usage to determine the context of words that influence a given target. MSAs capture the commonalities among word sequences and are able to reveal areas of linguistic stability and variation. In this way, MSAs provide a systemic approach to generating lexical patterns that are generalizable, which will both increase recall levels and maintain high levels of precision.Results The MSA-generated patterns exhibited consistent F1-, F.5-, and F2- scores compared to two baseline techniques for IE across four different tasks. Both baseline techniques performed well for some tasks and less well for others, but MSA was found to consistently perform at a high level for all four tasks.Discussion The performance of MSA on the four extraction tasks indicates the method’s versatility. The results show that the MSA-based patterns are able to handle the extraction of individual data elements as well as relations between two concepts without the need for large amounts of manual intervention.Conclusion We presented an MSA-based framework for generating lexical patterns that showed consistently high levels of both performance and recall over four different extraction tasks when compared to baseline methods.

    Notes:

    • Natural language processing (NLP) of electronic medical records (EMRs) often uses lexical pattern-matching techniques to identify targets in unstructured clinical text. Current automated pattern generation systems typically use static windows of tokens (e.g., two to four words) that allow some built-in variation; however, these patterns often do not reflect the context of language usage, which can be consistent in certain areas and highly variable in others. The authors present a method of automated pattern generation that uses multiple sequence alignments (MSAs) to capture the similarities and differences between groups of phrases depending on context. They apply this method to determine lung cancer patient status from radiology reports. They found that the MSA method was able to handle extraction of individual data elements and relations between two concepts without the need for large amounts of manual annotation, and also showed high levels of performance and recall over four different extraction tasks compared with using static windows of tokens. The authors caution that this method is still immature for complex cases and requires further research, as it only handles tokens, wildcards, and some generalized symbols. This study is applicable to researchers interested in furthering the development of NLP methods but is probably still premature for use in studies to identify relevant diseases and diagnoses.

  • Bayesian dose–response analysis for epidemiological studies with complex uncertainty in dose estimation

    Author Deukwoo Kwon
    Author F. Owen Hoffman
    Author Brian E. Moroz
    Author Steven L. Simon
    Volume 35
    Issue 3
    Pages 399-423
    Publication Statistics in Medicine
    Date February 10, 2016
    Abstract Most conventional risk analysis methods rely on a single best estimate of exposure per person, which does not allow for adjustment for exposure-related uncertainty. Here, we propose a Bayesian model averaging method to properly quantify the relationship between radiation dose and disease outcomes by accounting for shared and unshared uncertainty in estimated dose. Our Bayesian risk analysis method utilizes multiple realizations of sets (vectors) of doses generated by a two-dimensional Monte Carlo simulation method that properly separates shared and unshared errors in dose estimation. The exposure model used in this work is taken from a study of the risk of thyroid nodules among a cohort of 2376 subjects who were exposed to fallout from nuclear testing in Kazakhstan. We assessed the performance of our method through an extensive series of simulations and comparisons against conventional regression risk analysis methods. When the estimated doses contain relatively small amounts of uncertainty, the Bayesian method using multiple a priori plausible draws of dose vectors gave similar results to the conventional regression-based methods of dose–response analysis. However, when large and complex mixtures of shared and unshared uncertainties are present, the Bayesian method using multiple dose vectors had significantly lower relative bias than conventional regression-based risk analysis methods and better coverage, that is, a markedly increased capability to include the true risk coefficient within the 95% credible interval of the Bayesian-based risk estimate. An evaluation of the dose–response using our method is presented for an epidemiological study of thyroid disease following radiation exposure. Copyright © 2015 John Wiley & Sons, Ltd.

    Notes:

    • Dose-response relationships are often investigated using frequentist methods with the mean or median doses. This assumes that the unknown true values of individual doses vary at random about unbiased mean doses assigned to a particular cohort. Only a few studies have examined the impact of shared uncertainties in dose-response analyses. Estimating individual doses in the absence of direct measurements can lead to large and complex uncertainties involving both systematic and random errors. The authors proposed a Bayesian risk analysis that uses multiple sets (vectors) of doses generated by a two-dimensional Monte Carlo simulation method, and compared it with conventional analyses based on mean, median, and individual doses. The approach separates shared and unshared errors in dose estimation. The performance of the method was evaluated in a series of simulation tests and compared against conventional regression risk analysis. When large and combined (shared and unshared) uncertainties were present, the Bayesian method had significantly lower relative bias and an increased capability to include the true coefficient within its credible intervals. The true coefficients in this case were specified in order to compare the performance of the Bayesian and conventional methods.

  • Bias Formulas for Estimating Direct and Indirect Effects When Unmeasured Confounding Is Present

    Author Saskia le Cessie
    Volume 27
    Issue 1
    Pages 125-132
    Publication Epidemiology
    Date January 2016
    Abstract Mediation analysis examines the influence of intermediate factors in the causal pathway between an exposure and an outcome. It yields estimates of the direct effect of the exposure on the outcome and of the indirect effect through the intermediate variable. Both estimates can be biased if the relationship between the mediator and the outcome is confounded. In this article, we study the effect of unmeasured confounding on direct and indirect effect estimates for a continuous mediator and an outcome that may be either binary, count, or continuous. We formulate the effect of the confounder on the intermediate and on the outcome directly in regression models, which makes the formulas intuitive to use by applied users. The formulas are derived under the assumption that the confounder follows a normal distribution. In simulations, the formulas for a linear response model performed well, also as it did when the unmeasured confounder was binary. For a rare binary outcome, the formulas for logistic regression performed well if the unmeasured confounder followed a normal distribution, but for a binary confounder the bias in the direct effect was overcorrected. We applied the formulas to data from a case–control study (Leiden Thrombophilia Study) on risk factors for venous thrombosis. This showed that unmeasured confounding can severely bias the estimates of direct and indirect effects.

    Notes:

    • While mediation analysis has been extensively discussed in the literature, techniques to assess bias associated with its methods are still relatively undeveloped. Existing bias formulas for the estimation of the impact of unmeasured confounding in mediation analysis are often difficult to use, particularly for continuous confounders. The author of this study derives and presents simple to use bias formulas for calculating corrected direct and indirect effects of an exposure on multiple types of outcome data in the presence of a continuous mediator and an unmeasured continuous or binary confounder of the mediator-outcome relationship. Parametric structural models are developed which allow for the direct specification of the effect of the unmeasured confounder on the mediator and outcome. In turn, the direct and indirect effect of an exposure on an outcome can be calculated, including in cases where exposure-mediator interaction is present. The performance of the bias formulas is tested through simulation studies for both continuous and binary outcomes, varying the distribution of the mediator and unmeasured confounder for the continuous outcome model and the prevalence of the response for the binary outcome model. A real-world example using the bias formulas with data from an observational case-control study of risk factors for venous thrombosis is also presented. In the continuous outcome model, the bias correction formulas are demonstrated to nearly eliminate bias in the model coefficients and direct and indirect effect estimates, including when the mediator is non-normally distributed and the unmeasured confounder is binary instead of continuous. While still substantially eliminating bias compared to naïve estimates, the bias correction formulas do not perform as well in the binary outcome model, particularly when the unmeasured confounder is binary. In the case-control example which assumed a continuous unmeasured confounder, the direct effect was underestimated and the indirect effect overestimated when the confounder affected both the mediator and the risk of the outcome in the same direction. The reverse was true when the confounder decreased the mediator and increased the risk of the outcome. Overall, a large effect of the unmeasured confounder on the mediator and outcome was needed to demonstrate a substantial bias to the direct and indirect effect estimates for the exposure-outcome relationship.

      The formulas developed here can be readily used in sensitivity analyses in pharmacoepidemiologic studies which use mediation analysis to determine if significant positive (or negative) results in the direct and indirect effect estimates can be explained away by unmeasured confounding of the mediator-outcome relationship. If not, the researcher, reviewer, and reader can be more confident that the results obtained from a particular study are valid and represent a true effect in the population of interest. The formulas presented by the author are also shown to be rather flexible, allowing for use in multiple study designs with multiple data types and continuous mediator distributions. Future research could focus on derivation of similar bias formulas for unmeasured confounding of the exposure-mediator or exposure-outcome relationship.

  • Comparative validity of methods to select appropriate cutoff weight for probabilistic linkage without unique personal identifiers

    Author Ying Zhu
    Author Chih-Ying Chen
    Author Yutaka Matsuyama
    Author Yasuo Ohashi
    Author Jessica M. Franklin
    Author Soko Setoguchi
    Volume 25
    Issue 4
    Pages 444-452
    Publication Pharmacoepidemiology and Drug Safety
    Date April 1, 2016
    Abstract Purpose Record linkage can enhance data quality of observational database studies. Probabilistic linkage, a method that allows partial match of linkage variables, overcomes disagreements arising from errors and omissions in data entry but also results in false-positive links. The study aimed to assess the validity of probabilistic linkage in the absence of unique personal identifiers (UPI) and the methods of cutoff weight selection. Methods We linked an implantable cardioverter defibrillator placement registry to Medicare inpatient files of 1 year with anonymous nonunique variables and assessed the validity of three methods of cutoff selection against an internally derived gold standard with UPI. Results Of the 64 890 registry records with an expected linkage rate of 55–65%, 55% were linked at cutoffs associated with positive predictive value (PPV) of ≥90%. Histogram inspection suggested an approximate range of optimal cutoffs. The duplicate method made accurate estimates of cutoff and PPV if the method's assumption was met. With adjusted estimates of the sizes of true matches and searched files, the odds formula method made relatively accurate estimates of cutoff and PPV. Conclusions Probabilistic linkage without UPI generated valid linkages when an optimal cutoff was chosen. Cutoff selection remains challenging; however, histogram inspection, the duplicate method, and the odds formula method can be used in conjunction when a gold standard is not available. Copyright © 2015 John Wiley & Sons, Ltd.

    Notes:

    • Record linkage allows two or more databases with complementary information to be combined, providing more complete and useful data in observational studies. Probabilistic linkage allows linkage of data that are 'less than perfect'; in other words, unique personal identifiers (UPI) may not be readily available to ease linkage, and values may be missing or susceptible to entry errors. Because it allows partial matches of linkage variables, this method requires the selection of a cutoff weight that distinguishes matches from non-matches. The aim of this study was to compare three methods for cutoff selection (histogram inspection, the duplicate method, and the odds formula), validating each method against an internal gold standard (GS). The Centers for Medicare and Medicaid Services Implantable Cardioverter Defibrillator (ICD) Registry was linked to the Medicare Provider Analysis and Review (MedPAR) dataset, and the validity of each of the three cutoff selection methods was assessed against an internally derived GS containing UPI. Cleaned data including provider ID, admission date, and other UPI were used to construct the GS dataset, and the dataset used to test probabilistic linkage was the cleaned data without UPI. In the probabilistic linkage dataset, 64,890 ICD records were linked to 11,617,956 MedPAR records, generating 1,784,437 linked pairs above an initial cutoff weight. With an expected linkage rate of 55-65%, each of the three methods for cutoff weight selection was within or close to the expected range. Validation of the test linkage against the GS revealed that 55% of records were linked at cutoffs achieving a positive predictive value (PPV) of ≥90%. The duplicate method provided cutoff weights that achieved a PPV between 88.3% and 98.3%. The odds formula with block adjustment provided a cutoff weight that estimated a PPV of 90%. While linkage enhances the quality of data, robust methods must be used to ensure the accuracy of linked data and thereby the validity of results. Probabilistic linkage without UPI was shown to be convenient, generating valid linkages, minimizing false positives, maximizing PPV, and not compromising sensitivity. This method of linking data helps ensure the validity of results while greatly enhancing the use of more complete data in pharmacoepidemiology.
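
      A toy sketch of probabilistic linkage weights and cutoff screening: Fellegi-Sunter-style agreement weights (log2 of m/u for agreement, log2 of (1-m)/(1-u) for disagreement) are summed over linkage fields, and a cutoff is chosen against a gold standard to reach a target PPV. The fields, m- and u-probabilities, and candidate pairs are all invented; the paper's duplicate and odds-formula methods are not reproduced here.

      import numpy as np
      import pandas as pd

      # Invented m- (P(agree | true match)) and u- (P(agree | non-match)) probabilities.
      fields = {"dob": (0.98, 0.01), "sex": (0.99, 0.50),
                "provider": (0.95, 0.05), "admit_date": (0.90, 0.02)}

      def match_weight(agreements):
          """Sum of log2 likelihood ratios over the linkage fields."""
          w = 0.0
          for field, agrees in agreements.items():
              m, u = fields[field]
              w += np.log2(m / u) if agrees else np.log2((1 - m) / (1 - u))
          return w

      # Toy candidate pairs with agreement patterns and a gold-standard match flag.
      rng = np.random.default_rng(4)
      pairs = pd.DataFrame({"true_match": rng.binomial(1, 0.3, 2000).astype(bool)})
      for field, (m, u) in fields.items():
          pairs[field] = np.where(pairs.true_match,
                                  rng.binomial(1, m, 2000),
                                  rng.binomial(1, u, 2000)).astype(bool)
      pairs["weight"] = pairs[list(fields)].apply(lambda r: match_weight(r.to_dict()), axis=1)

      # Screen cutoffs against the gold standard for the lowest one with PPV >= 90%.
      for cutoff in np.arange(-10.0, 20.0, 1.0):
          linked = pairs[pairs.weight >= cutoff]
          if len(linked) and linked.true_match.mean() >= 0.90:
              print(f"cutoff {cutoff:.1f}: PPV {linked.true_match.mean():.2f}, "
                    f"sensitivity {linked.true_match.sum() / pairs.true_match.sum():.2f}")
              break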

  • Comparing survival curves based on medians

    Author Zhongxue Chen
    Author Guoyi Zhang
    Volume 16
    Pages 33
    Publication BMC Medical Research Methodology
    Date 2016
    Abstract Although some nonparametric methods have been proposed in the literature to test for the equality of median survival times for censored data in medical research, in general they have inflated type I error rates, which make their use limited in practice, especially when the sample sizes are small.

    Notes:

    • Current nonparametric methods for comparing survival curves based on median survival time often suffer from inflated type I error rates, particularly when sample sizes are small. In addition, the test statistics associated with these methods are often difficult to calculate. The authors of this paper propose a nonparametric test for comparison of median survival times which is designed to control type I error and incorporate an easy-to-use test statistic. Simulation studies are conducted to compare the empirical type I error rate and power of the proposed method to previously described nonparametric methods for comparison of median survival times.  To test the robustness of the methods, the survival time distributions and parameters, censoring rate, and sample size are varied across simulations. Under large sample size conditions (n ≥ 100), the proposed method and a previously developed method (Brookmeyer and Crowley) have similar empirical type I error rates and power regardless of survival time distribution and censoring rate. Under small sample size conditions (n ≤ 30), the proposed method improves control of type I error over previously developed methods, particularly when censoring is present. Empirical power is somewhat smaller for the proposed method compared to previous methods with small sample sizes, but this is likely due to the inflated type I error rates associated with the previous methods. The method developed by the authors presents a viable alternative for comparison of median survival times in pharmacoepidemiologic studies using time-to-event data where nonparametric analysis is appropriate. As it appears to be particularly useful when small sample sizes and high censoring rates are involved, this method may provide an avenue for more valid inference in studies assessing rare exposures and/or long-term follow-up.
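
      For readers unfamiliar with the quantity being compared, the sketch below estimates Kaplan-Meier median survival times for two simulated groups in Python. It does not implement the authors' proposed test statistic; the data, sample sizes, and censoring rates are invented solely to show how the group medians are obtained before any comparison.

        import numpy as np

        def km_median(time, event):
            """Kaplan-Meier estimate: return the first time at which S(t) drops to <= 0.5."""
            order = np.argsort(time)
            time, event = time[order], event[order]
            n = len(time)
            surv = 1.0
            for i, (t, d) in enumerate(zip(time, event)):
                at_risk = n - i
                if d:  # observed event (not censored)
                    surv *= (at_risk - 1) / at_risk
                if surv <= 0.5:
                    return t
            return np.nan  # median not reached

        rng = np.random.default_rng(0)
        # Simulated survival times with random censoring (hypothetical data).
        for label, scale in [("group A", 10.0), ("group B", 14.0)]:
            t_event = rng.exponential(scale, size=200)
            t_cens = rng.exponential(40.0, size=200)
            time = np.minimum(t_event, t_cens)
            event = (t_event <= t_cens).astype(int)
            print(label, "KM median survival ~", round(km_median(time, event), 2))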

  • Competing risk bias in Kaplan–Meier risk estimates can be corrected

    Author Carl van Walraven
    Author Steven Hawken
    Volume 70
    Pages 101-105
    Publication Journal of Clinical Epidemiology
    Date 2016/02/01
    Abstract Background Kaplan–Meier (KM) analyses are frequently used to measure outcome risk over time. These analyses overestimate risk whenever competing events are present. Many published KM analyses are susceptible to such competing risk bias. This study derived and validated a model that predicted true outcome risk based on the biased KM risk. Methods We simulated survival data sets having a broad range of 1-year true outcome and competing event risk. Unbiased true outcome risk estimates were calculated using the cumulative incidence function (CIF). Multiple linear regression was used to determine the independent association of CIF-based true outcome risk with the biased KM risk and the proportion of all outcomes that were competing events. Results The final model found that both the biased KM-based risk and the proportion of all outcomes that were competing events were strongly associated with CIF-based risk. In validation populations that used a variety of distinct survival hazard functions, the model accurately predicted the CIF (R2 = 1). Conclusions True outcome risk can be accurately predicted from KM estimates susceptible to competing risk bias.

    Notes:

    • The purpose of this research was to test a method to quantify the over-inflation of Kaplan-Meier (KM) estimates resulting from competing risk bias. Using simulated data sets of “true” and “competing” events, the investigators developed a multivariable linear regression model to predict the true outcome risk that is not influenced by competing events (i.e., the cumulative incidence function, CIF). The derived model predicts the CIF using the KM risk, the proportion of all events that are competing events, and an interaction between these variables as the inputs. The model was then tested in validation data sets: 1) the first validation used the remaining one-third of data sets not used for derivation; 2) the second validation used data sets with survival functions based on the Weibull model; 3) the final validation used data sets with survival functions based on the lognormal model. The derivation and first validation data sets were almost identical; however, there were differences between the data sets in the second and third validations. The investigators found that the model fit the data very well (R2 = 1) and that all predictors were significant in the model with negligible standard errors (rounded to 0). The investigators concluded that this model could be used to accurately estimate the unbiased risk from a KM estimate. A major limitation of this research is that use of the model requires knowledge of the proportion of all events that are competing events. Since this proportion is very unlikely to be known when conducting a survival analysis, the utility of this method is restricted.
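
      The structure of the described regression can be illustrated in a simple constant-hazard setting, where the biased KM risk and the true CIF have closed forms. The sketch below is not the authors' derivation or their fitted coefficients; it only shows, on a hypothetical grid of hazards, how one might regress the CIF on the KM risk, the competing-event proportion, and their interaction.

        import numpy as np

        # Constant-hazard setting at t = 1: closed forms for the biased KM risk
        # (competing events treated as censoring) and the true cumulative incidence (CIF).
        def km_risk(h_out, t=1.0):
            return 1 - np.exp(-h_out * t)

        def cif(h_out, h_comp, t=1.0):
            total = h_out + h_comp
            return (h_out / total) * (1 - np.exp(-total * t))

        # Grid of scenarios spanning a range of outcome and competing-event hazards.
        h_out = np.repeat(np.linspace(0.01, 0.5, 25), 25)
        h_comp = np.tile(np.linspace(0.01, 0.5, 25), 25)

        km = km_risk(h_out)
        p_comp = h_comp / (h_out + h_comp)  # proportion of all events that are competing
        y = cif(h_out, h_comp)

        # Linear model: CIF ~ KM + p_comp + KM:p_comp (mirrors the predictors described above).
        X = np.column_stack([np.ones_like(km), km, p_comp, km * p_comp])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        pred = X @ beta
        r2 = 1 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()
        print("coefficients:", np.round(beta, 3), " R^2 =", round(r2, 4))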

  • Controlling confounding of treatment effects in administrative data in the presence of time-varying baseline confounders

    Author David T. Gilbertson
    Author Brian D. Bradbury
    Author James B. Wetmore
    Pages n/a-n/a
    Publication Pharmacoepidemiology and Drug Safety
    Date January 1, 2015
    Abstract Purpose Confounding, a concern in nonexperimental research using administrative claims, is nearly ubiquitous in claims-based pharmacoepidemiology studies. A fixed-length look-back window for assessing comorbidity from claims is common, but it may be advantageous to use all historical claims. We assessed how the strength of association between a baseline-identified condition and subsequent mortality varied by when the condition was measured and investigated methods to control for confounding. Methods For Medicare beneficiaries undergoing maintenance hemodialysis on 1 January 2008 (n = 222 343), we searched all Medicare claims, 1 January 2001 to 31 December 2007, for four conditions representing chronic and acute diseases, and classified claims by number of months preceding the index date. We used proportional hazard models to estimate the association between time of condition and subsequent mortality. We simulated a confounded comorbidity–exposure relationship and investigated an alternative method of adjustment when the association between the condition and mortality varied by proximity to follow-up start. Results The magnitude of the mortality hazard ratio estimates for each condition investigated decreased toward unity as time increased between index date and most recent manifestation of the condition. Simulation showed more biased estimates of exposure–outcome associations if proximity to follow-up start was not considered. Conclusions Using all-available claims information during a baseline period, we found that for all conditions investigated, the association between a comorbid condition and subsequent mortality varied considerably depending on when the condition was measured. Improved confounding control may be achieved by considering the timing of claims relative to follow-up start. Copyright © 2015 John Wiley & Sons, Ltd.

    Notes:

    • The choice of time frame for measuring baseline covariates remains controversial in pharmacoepidemiologic studies. The authors aimed to investigate the impact of baseline covariate measurement design under two approaches: use of "all-available claims" and use of a fixed-length baseline window. The authors used Medicare data to analyze the impact of the timing of comorbidity (baseline covariate) identification on the strength of the association between the comorbidity and 1-year mortality (the exposure-outcome effect), using all-available claims. In addition, the authors conducted a simulation study to compare the all-available and fixed-length window methods for assessment of the exposure-outcome effect, under assumptions of time-varying and time-invariant confounders. They found that the exposure-outcome effect (measured as a hazard ratio in this study) decreased as the time between the diagnosis of the comorbidity and the start of follow-up increased, for both acute and chronic conditions. Estimates generated from outpatient settings were consistently lower than those from inpatient settings. The results of the simulation analyses showed that the various models produced similar estimates for time-invariant confounders; however, for time-varying confounders, the all-available claims approach generated less biased results. The authors also discussed possible explanations for their findings, such as depletion of susceptibles, as well as factors that would affect the optimal choice of approach. In summary, they concluded that the all-available approach improves confounding adjustment and recommended considering the timing of covariate identification as well as the source of the data used.
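
      A minimal pandas sketch of the two covariate-assessment designs is shown below. The claims table, diagnosis codes, and index date are hypothetical; the sketch contrasts a fixed 365-day baseline window with an all-available lookback and records how recently the condition last appeared, which is the timing information the authors suggest incorporating.

        import pandas as pd

        # Hypothetical pre-index claims history.
        claims = pd.DataFrame({
            "patient_id": [1, 1, 2, 3],
            "claim_date": pd.to_datetime(["2006-03-10", "2007-11-02", "2005-06-20", "2007-12-15"]),
            "diagnosis": ["E11", "E11", "E11", "I50"],  # hypothetical ICD codes
        })
        index_date = pd.Timestamp("2008-01-01")
        condition_codes = {"E11"}

        hist = claims[claims["diagnosis"].isin(condition_codes) & (claims["claim_date"] < index_date)]

        # Fixed-length 365-day baseline window vs. all available history.
        fixed = hist[hist["claim_date"] >= index_date - pd.Timedelta(days=365)]
        flag_fixed = fixed.groupby("patient_id").size().rename("claims_fixed_365d")
        flag_all = hist.groupby("patient_id").size().rename("claims_all_available")

        # Months between the most recent qualifying claim and the index date (proximity).
        recency = ((index_date - hist.groupby("patient_id")["claim_date"].max()).dt.days // 30)
        print(pd.concat([flag_fixed, flag_all, recency.rename("months_since_last_claim")], axis=1))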

  • Creating a National Provider Identifier (NPI) to Unique Physician Identification Number (UPIN) Crosswalk for Medicare Data

    Author Helen M. Parsons
    Author Lindsey R. Enewold
    Author Robert Banks
    Author Michael J. Barrett
    Author Joan L. Warren
    Publication Medical Care
    Date Dec 17, 2015
    Abstract INTRODUCTION: Many health services researchers are interested in assessing long term, individual physician treatment patterns, particularly for cancer care. In 2007, Medicare changed the physician identifier used on billed services from the Unique Physician Identification Number (UPIN) to the National Provider Identifier (NPI), precluding the ability to use Medicare claims data to evaluate individual physician treatment patterns across this transition period. METHODS: Using the 2007-2008 carrier (physician) claims from the linked Surveillance, Epidemiology and End Results (SEER) cancer registry-Medicare data and Medicare's NPI and UPIN Directories, we created a crosswalk that paired physician NPIs included in SEER-Medicare data with UPINs. We evaluated the ability to identify an NPI-UPIN match by physician sex and specialty. RESULTS: We identified 470,313 unique NPIs in the 2007-2008 SEER-Medicare carrier claims and found a UPIN match for 90.1% of these NPIs (n=423,842) based on 3 approaches: (1) NPI and UPIN coreported on the SEER-Medicare claims; (2) UPINs reported on the NPI Directory; or (3) a name match between the NPI and UPIN Directories. A total of 46.6% (n=219,315) of NPIs matched to the same UPIN across all 3 approaches, 34.1% (n=160,277) agreed across 2 approaches, and 9.4% (n=44,250) had a match identified by 1 approach only. NPIs were paired to UPINs less frequently for women and primary care physicians compared with other specialists. DISCUSSION: National Cancer Institute has created a crosswalk resource available to researchers that links NPIs and UPINs based on the SEER-Medicare data. In addition, the documented process could be used to create other NPI-UPIN crosswalks using data beyond SEER-Medicare.

    Notes:

    • Medicare changed the physician identifier used on billed services from the Unique Physician Identification Number (UPIN) to the National Provider Identifier (NPI) in 2007; as a result, researchers are not able to follow individual physician treatment patterns across this transition period.  To bridge this gap, the authors created an NPI-UPIN crosswalk using three approaches: 1) NPIs and UPINs co-reported on the linked Surveillance, Epidemiology and End Results (SEER) cancer registry-Medicare claims, 2) UPINs reported on the NPI Directory, or 3) a name match between the NPI and UPIN Directories.  They were able to identify a UPIN match for 90.1% of NPIs based on these three methods, although matches were less frequent for female and primary care physicians.  Their crosswalk can be used by researchers examining physician treatment patterns that span 2007 in the SEER-Medicare data, and the paper also serves as an example for those wishing to construct their own crosswalks using other data.
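
      A simplified pandas sketch of the crosswalk construction is given below. The NPI and UPIN values and the three input tables are invented; the sketch only shows how pairs from the three matching approaches described above might be merged and how the number of agreeing approaches could be tallied for each NPI.

        import pandas as pd

        # Hypothetical inputs mirroring the three matching approaches described above.
        claims_pairs = pd.DataFrame({"npi": ["111", "222"], "upin_claims": ["A1", "B2"]})      # co-reported on claims
        npi_directory = pd.DataFrame({"npi": ["111", "333"], "upin_directory": ["A1", "C3"]})  # UPIN listed in NPI Directory
        name_match = pd.DataFrame({"npi": ["111", "222", "333"], "upin_name": ["A1", "B9", "C3"]})  # name-based match

        crosswalk = (claims_pairs
                     .merge(npi_directory, on="npi", how="outer")
                     .merge(name_match, on="npi", how="outer"))

        def consolidate(row):
            candidates = [u for u in (row["upin_claims"], row["upin_directory"], row["upin_name"]) if pd.notna(u)]
            # Keep the most frequently proposed UPIN and count how many approaches support it.
            best = max(set(candidates), key=candidates.count)
            return pd.Series({"upin": best, "n_supporting_approaches": candidates.count(best)})

        print(crosswalk.join(crosswalk.apply(consolidate, axis=1))[["npi", "upin", "n_supporting_approaches"]])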

  • Developing Prediction Models from Results of Regression Analysis: Woodpecker™ Technique

    Author Goldfarb Rumyantzev AS
    Author Ning Dong
    Volume 07
    Issue 01
    Publication Journal of Biometrics & Biostatistics
    Date 2015

    Notes:

    • The authors proposed an approach by which information already published in research studies can be converted into prediction models.  The technique, called Woodpecker™, can help build a prediction model and a risk stratification tool from a multivariate regression model. Limitations based on the assumptions of traditional multivariate regression models still hold, and in addition, the Woodpecker method can only estimate risk among populations included in previous studies. Epidemiological research studies often focus on quantifying average risks at the population level; however, there has been interest from clinicians in developing prediction models in which the heterogeneity of treatment effects is measured.  Risk prediction at the individual level is more informative to both clinicians and patients when making important clinical decisions.  The proposed method is an attempt to use previously published literature to build such prediction models.  In addition, the authors compared the numerically generated results with results using real-world data from the NHANES dataset; the two approaches produced similar risk categories.  The tool has the potential to be directly embedded in electronic health record systems to generate automated prediction results for patients.

  • Development and demonstration of a state model for the estimation of incidence of partly undetected chronic diseases

    Author Ralph Brinks
    Author Barbara H. Bardenheier
    Author Annika Hoyer
    Author Ji Lin
    Author Sandra Landwehr
    Author Edward W. Gregg
    Volume 15
    Pages 98
    Publication BMC Medical Research Methodology
    Date 2015
    Abstract Estimation of incidence of the state of undiagnosed chronic disease provides a crucial missing link for the monitoring of chronic disease epidemics and determining the degree to which changes in prevalence are affected or biased by detection.

    Notes:

    • The majority of chronic conditions involve a phase in which the condition is present but undiagnosed, which may vary from days to months to several years. Effective monitoring and intervention strategies for chronic disease epidemics should involve accurate estimation of undiagnosed disease incidence, though few methods to do this currently exist. The authors of this study propose a four-compartment chronic disease state model which expands on the traditional illness-death model by incorporating an undiagnosed state. With this model, prevalence and mortality information for healthy and diagnosed health states are used to estimate undiagnosed and diagnosed chronic disease incidence. A series of partial differential equations are used to estimate undiagnosed and diagnosed disease incidence in the model. A detection ratio (a rate ratio) is used to assess whether changes in disease prevalence observed over time are influenced by changes in disease detection (e.g. improved surveillance) as opposed to a true change in incidence. The proposed model is tested using a simulation study and diabetes data from the Health and Retirement Study (HRS). In simulations, the authors found that the incidence rates of undiagnosed disease (from the healthy state) and diagnosed disease (from the undiagnosed state) in the state model constructed from prevalence and mortality data were very similar to the true incidence rates. Incidence rates of undiagnosed and diagnosed diabetes from HRS data were calculated using the proposed methods, though high levels of uncertainty in the incidence estimates were observed due to high levels of uncertainty in the input data. The methods proposed in this study can be used to better characterize the true occurrence of disease over time, and may be useful in demonstrating the potential population impact (i.e. total population who may benefit or be at risk) of pharmacologic intervention strategies for certain chronic diseases. Knowledge of undiagnosed disease incidence can also aid the researcher in determining the safety and effectiveness of drug treatments, for example, by informing whether the change in the occurrence of an outcome (i.e., disease diagnosis) following treatment is due to a true effect of the drug or simply an improvement in monitoring and detecting disease that was previously undiagnosed.
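
      To make the compartmental structure concrete, the sketch below runs a four-state model (healthy, undiagnosed, diagnosed, dead) forward in time with assumed transition rates and reports the resulting prevalence of undiagnosed and diagnosed disease. Note that this is the reverse of the authors' estimation problem, which recovers the incidence rates from prevalence and mortality data; all rates here are illustrative assumptions.

        # Assumed (illustrative) annual rates; the paper estimates the two incidence
        # rates from prevalence and mortality data rather than assuming them.
        lam0 = 0.02   # incidence of undiagnosed disease (healthy -> undiagnosed)
        lam1 = 0.25   # detection rate (undiagnosed -> diagnosed)
        m0, m1, m2 = 0.01, 0.015, 0.03   # mortality from healthy, undiagnosed, diagnosed states

        dt, years = 0.1, 40
        S, U, D = 1.0, 0.0, 0.0   # proportions in healthy, undiagnosed, diagnosed states
        for _ in range(int(years / dt)):
            dS = -(lam0 + m0) * S
            dU = lam0 * S - (lam1 + m1) * U
            dD = lam1 * U - m2 * D
            S, U, D = S + dS * dt, U + dU * dt, D + dD * dt

        alive = S + U + D
        print(f"after {years} y: undiagnosed prevalence ~ {U / alive:.3f}, "
              f"diagnosed prevalence ~ {D / alive:.3f}")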

  • Diagnostic accuracy of algorithms to identify hepatitis C status, AIDS status, alcohol consumption and illicit drug use among patients living with HIV in an administrative healthcare database

    Author Madeleine Durand
    Author Yishu Wang
    Author François Venne
    Author Jacques Lelorier
    Author Cécile L. Tremblay
    Author Michal Abrahamowicz
    Volume 24
    Issue 9
    Pages 943-950
    Publication Pharmacoepidemiology and Drug Safety
    Date September 1, 2015
    Abstract Purpose This study aims to develop and evaluate diagnostic algorithms for AIDS, hepatitis C status, alcohol abuse and illicit drug use in the administrative healthcare database of the Province of Quebec, Canada (Régie de l'assurance-maladie du Québec (RAMQ)). Methods We selected HIV-positive patients contributing to both the RAMQ database and a local clinical database, which was used as gold standard. We developed algorithms to identify the diagnoses of interest in RAMQ using data from hospital discharge summaries and medical and pharmaceutical claims databases. We estimated and compared sensitivity, specificity, positive predictive and negative predictive values and area under receiver operating curve for each algorithm. Results Four hundred twenty patients contributed to both databases. Prevalence of conditions of interest in the clinical database was as follows: AIDS 233 (55%), hepatitis C infection 105 (25%), alcohol abuse 106 (25%), illicit drug use 144 (34%) and intravenous drug use 107 (25%). Sensitivity to detect AIDS, hepatitis C, alcohol abuse, illicit drug use and intravenous drug use was 46% [95%CI: 39–53], 26% [18–35], 50% [37–57], 64% [55–72] and 70% [61–79], respectively. Specificity to detect these conditions was 91% [86–95], 97% [94–98], 92% [88–95], 95% [92–97] and 90% [87–93], respectively. Positive predictive values were 87% [80–92], 71% [54–85], 68% [56–78], 87% [79–93] and 72% [62–80], respectively. Area under receiver operating curve varied from 0.62 [0.57–0.65] for hepatitis C to 0.80 [0.76–0.85] for intravenous drug use. Conclusions Sensitivity was low to detect AIDS, alcohol abuse, illicit drug use and especially hepatitis C in RAMQ. Researchers must be aware of the potential for residual confounding and must consider additional methods to control for confounding. Copyright © 2015 John Wiley & Sons, Ltd.

    Notes:

    • In patient populations with HIV, algorithms for measurement of important confounding variables, such as comorbidities and lifestyle factors, have yet to be validated in administrative claims data. The authors of this study develop and validate algorithms for AIDS, hepatitis C, alcohol abuse, and illicit drug use using the Canadian RAMQ administrative health claims database linked to a comprehensive clinical database of HIV-positive patients. Using medical charts in the comprehensive clinical database as the gold standard for validation, test characteristics and receiver operating characteristic (ROC) curves were measured for four different administrative claims-based algorithms for each variable of interest in a population of HIV patients. The diagnostic algorithms were also validated within tertiles of follow-up time in the administrative claims database. For all variables except diagnosis of AIDS, the most inclusive algorithm resulted in the greatest area under the ROC curve; for AIDS diagnosis, the algorithm using only inpatient codes provided the best sensitivity/specificity trade-off. The sensitivity (SN), specificity (SP), positive predictive value (PPV), and area under the ROC curve (aROC) for the best-performing algorithm for each variable were as follows: AIDS (SN=46%, SP=91%, PPV=87%, aROC=0.69), hepatitis C (SN=26%, SP=97%, PPV=71%, aROC=0.62), alcohol abuse (SN=50%, SP=92%, PPV=68%, aROC=0.71), illicit drug use (SN=64%, SP=95%, PPV=87%, aROC=0.80), and IV drug use (SN=70%, SP=90%, PPV=72%, aROC=0.80). Sensitivity significantly increased with increasing follow-up time for all variables. The results of this study demonstrate that valid identification of common confounding conditions in HIV patients using administrative claims-based algorithms currently remains a challenge for pharmacoepidemiologists. Researchers conducting traditional claims-based studies in HIV populations which attempt to adjust for these variables are likely to suffer from substantial residual confounding, leading to biased results. This study demonstrates the need for alternative methods to adequately detect and adjust for key confounders in the HIV population when using administrative claims data, such as through linkage to additional data sources.
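
      The validation arithmetic reported above reduces to a two-by-two comparison of the claims-based flag against the gold standard. The toy arrays in the sketch below are not the RAMQ data; they simply show how sensitivity, specificity, PPV, and NPV are computed from such a comparison.

        import numpy as np

        # Toy data: 1 = condition present. gold = clinical database, claims = algorithm flag.
        gold = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
        claims = np.array([1, 0, 1, 0, 0, 0, 1, 0, 0, 0])

        tp = int(((claims == 1) & (gold == 1)).sum())
        fp = int(((claims == 1) & (gold == 0)).sum())
        fn = int(((claims == 0) & (gold == 1)).sum())
        tn = int(((claims == 0) & (gold == 0)).sum())

        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        ppv = tp / (tp + fp)
        npv = tn / (tn + fn)
        print(f"SN={sensitivity:.2f} SP={specificity:.2f} PPV={ppv:.2f} NPV={npv:.2f}")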

  • Estimating time-varying drug adherence using electronic records: extending the proportion of days covered (PDC) method

    Author Maarten J. Bijlsma
    Author Fanny Janssen
    Author Eelko Hak
    Pages n/a-n/a
    Publication Pharmacoepidemiology and Drug Safety
    Date January 1, 2015
    Abstract Purpose Accurate measurement of drug adherence is essential for valid risk-benefit assessments of pharmacologic interventions. To date, measures of drug adherence have almost exclusively been applied for a fixed-time interval and without considering changes over time. However, patients with irregular dosing behaviour commonly have a different prognosis than patients with stable dosing behaviour. Methods We propose a method, based on the proportion of days covered (PDC) method, to measure time-varying drug adherence and drug dosage using electronic records. We compare a time-fixed PDC method with the time-varying PDC method through detailed examples and through summary statistics of 100 randomly selected patients on statin therapy. Results We demonstrate that time-varying PDC method better distinguishes an irregularly dosing patient from a stably dosing patient and demonstrate how the time-fixed method can result in a biassed estimate of drug adherence. Furthermore, the time-varying PDC method may be better used to reduce certain types of confounding and misclassification of exposure. Conclusions The time-varying PDC method may improve longitudinal and time-to-event studies that associate adherence with a clinical outcome or (intervention) studies that seek to describe changes in adherence over time. Copyright © 2015 John Wiley & Sons, Ltd.

    Notes:

    • Measurement of drug adherence is often a key component of drug safety studies, but current measures generally do not consider changes or irregularities in adherence over time and instead focus on fixed-time intervals. This can result in biased associations between medication use and clinical outcomes. The authors of this paper extend the proportion of days covered (PDC) method to time-varying conditions using electronic pharmacy dispensing data. To calculate time-varying drug adherence, the days’ supply dispensed and the total time interval for the period between a prescription fill (k) and the dispensing date of the second subsequent fill (k + 2) are calculated, and the supply is divided by the interval length to obtain an adherence estimate, which is then assigned to the first prescription fill (k). By repeating this process for each prescription fill, an adherence estimate can be calculated and assigned to each fill. This method was then compared to the time-fixed PDC method for a random sample of 100 patients initiating statin therapy. The authors found that the time-varying PDC method provided the most additional adherence information for patients with irregular fill behavior, but performed similarly to the time-fixed PDC method for patients with regular prescription fills, regardless of whether adherence was high or low. When comparing the average absolute differences between the time-fixed and time-varying PDC adherence estimates for 100 statin initiators followed for 1 year, the two methods produced average adherence estimates that differed by more than 15% for 25% of patients. This paper provides evidence to support the use of a time-varying PDC approach for adherence estimation, particularly when adherence is suspected to be unstable over time. The calculation of adherence measurements at multiple time points could be useful as a time-varying covariate in pharmacoepidemiologic studies, allowing more accurate estimates of the effect of adherence on safety or efficacy endpoints to be obtained. The use of time-varying adherence as an outcome may also allow more accurate estimation of the impact of unstable patient variables and health events on subsequent medication adherence.
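
      A minimal sketch of the described calculation is shown below, assuming (as one reading of the description) that the numerator for fill k is the days' supply dispensed at fills k and k + 1 and the denominator is the time from fill k to fill k + 2. The dispensing dates and supplies are hypothetical.

        from datetime import date

        # Hypothetical dispensing history: (fill date, days' supply).
        fills = [
            (date(2015, 1, 1), 30),
            (date(2015, 2, 5), 30),
            (date(2015, 3, 20), 30),
            (date(2015, 5, 15), 30),
        ]

        def time_varying_pdc(fills):
            """Adherence assigned to fill k: supply from fills k and k+1
            divided by days between fill k and fill k+2 (capped at 1)."""
            estimates = []
            for k in range(len(fills) - 2):
                supplied = fills[k][1] + fills[k + 1][1]
                interval = (fills[k + 2][0] - fills[k][0]).days
                estimates.append((fills[k][0], min(supplied / interval, 1.0)))
            return estimates

        for fill_date, pdc in time_varying_pdc(fills):
            print(fill_date, round(pdc, 2))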

  • Estimating time-varying exposure-outcome associations using case-control data: logistic and case-cohort analyses

    Author Ruth H. Keogh
    Author Punam Mangtani
    Author Laura Rodrigues
    Author Patrick Nguipdop Djomo
    Volume 16
    Pages 2
    Publication BMC Medical Research Methodology
    Date 2016
    Abstract Traditional analyses of standard case-control studies using logistic regression do not allow estimation of time-varying associations between exposures and the outcome. We present two approaches which allow this. The motivation is a study of vaccine efficacy as a function of time since vaccination.

    Notes:

    • Case-control studies typically assess the association between exposure and disease using logistic regression; however, this traditional analysis does not allow estimation of time-varying exposure-outcome associations.  In this study, two approaches are described for estimating time-varying associations between exposures and outcomes using standard (unmatched) case-control data.  The motivation for this work was a study assessing vaccine efficacy as a function of time since vaccination.  The first method estimates the association between exposure and outcome within consecutive non-overlapping time periods, assuming a step-function form for the time-varying association; separate logistic regressions are performed within a successive series of time periods, reusing controls across periods.  The second method treats the case-control data as if they had arisen from a case-cohort sub-study within a cohort; controls can then contribute information to all sampled risk sets up to their index time, and this approach allows modelling of time-varying exposure-outcome associations as a continuous function of time rather than restricting estimation to discrete time periods as in the first approach.  Extensions of these methods accommodate frequency matching, left truncation, and time-varying exposures.  Using simulated data, both methods gave correct estimates of the time-varying effect of exposures and their standard errors.  Using each control only once resulted in a substantial loss of efficiency, whereas reusing controls led to efficiency gains.  The case-cohort analysis produced effect estimates closest to the true hazard ratio.  Case-control studies can therefore be used to estimate time-varying exposure-outcome associations that may not have been previously considered.  The case-cohort analysis has several advantages, allowing estimation of time-varying exposure-outcome associations as a continuous function of time rather than assuming a step-function form, and involving no ambiguity over the definition of a control.  Given these advantages, this approach adds another viable study design option in pharmacoepidemiology that avoids loss of efficiency.
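
      The step-function (first) approach can be sketched with simulated data: separate logistic regressions of case status on exposure are fitted within consecutive time periods, with the full control series reused in each period. The data below are invented and the models are fitted with statsmodels; this is an illustration of the reuse-of-controls idea, not the study's vaccine analysis.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(1)

        # Simulated unmatched case-control data (hypothetical): exposure is protective
        # early and its effect wanes in the later period.
        controls_exp = rng.binomial(1, 0.30, size=2000)
        cases = {
            "period_1": rng.binomial(1, 0.15, size=300),   # stronger protection early
            "period_2": rng.binomial(1, 0.25, size=300),   # weaker protection later
        }

        for period, cases_exp in cases.items():
            # Reuse the full control series in every period, as in the step-function approach.
            y = np.concatenate([np.ones_like(cases_exp), np.zeros_like(controls_exp)])
            x = sm.add_constant(np.concatenate([cases_exp, controls_exp]).astype(float))
            fit = sm.Logit(y, x).fit(disp=0)
            print(period, "OR for exposure ~", round(float(np.exp(fit.params[1])), 2))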

  • Feasibility of Prioritizing Drug–Drug-Event Associations Found in Electronic Health Records

    Author Juan M. Banda
    Author Alison Callahan
    Author Rainer Winnenburg
    Author Howard R. Strasberg
    Author Aurel Cami
    Author Ben Y. Reis
    Author Santiago Vilar
    Author George Hripcsak
    Author Michel Dumontier
    Author Nigam Haresh Shah
    Volume 39
    Issue 1
    Pages 45-57
    Publication Drug Safety
    Date 2015/10/08
    Abstract Background and Objective Several studies have demonstrated the ability to detect adverse events potentially related to multiple drug exposure via data mining. However, the number of putative associations produced by such computational approaches is typically large, making experimental validation difficult. We theorized that those potential associations for which there is evidence from multiple complementary sources are more likely to be true, and explored this idea using a published database of drug–drug-adverse event associations derived from electronic health records (EHRs). Methods We prioritized drug–drug-event associations derived from EHRs using four sources of information: (1) public databases, (2) sources of spontaneous reports, (3) literature, and (4) non-EHR drug–drug interaction (DDI) prediction methods. After pre-filtering the associations by removing those found in public databases, we devised a ranking for associations based on the support from the remaining sources, and evaluated the results of this rank-based prioritization. Results We collected information for 5983 putative EHR-derived drug–drug-event associations involving 345 drugs and ten adverse events from four data sources and four prediction methods. Only seven drug–drug-event associations (<0.5 %) had support from the majority of evidence sources, and about one third (1777) had support from at least one of the evidence sources. Conclusions Our proof-of-concept method for scoring putative drug–drug-event associations from EHRs offers a systematic and reproducible way of prioritizing associations for further study. Our findings also quantify the agreement (or lack thereof) among complementary sources of evidence for drug–drug-event associations and highlight the challenges of developing a robust approach for prioritizing signals of these associations.

    Notes:

    • Comprehensive assessment of drug-drug-event (DDE) associations, medical manifestations causally or non-causally associated with concomitant use of two drugs, can be difficult to achieve given the large number of possible drug use combinations. Validation of results from data mining studies aimed at detecting DDE associations in electronic health records (EHRs) has been challenging given the vast number of plausible associations produced. The authors of this study propose a method of ranking potential DDE associations detected in studies of EHR data by assessing support for the association in four information sources: public databases, spontaneous reports, literature, and non-EHR drug-drug interaction prediction models. Potential DDE associations supported by multiple information sources were deemed more likely to represent true signals. Using a set of 5,983 putative DDE associations among 345 approved drugs and 10 adverse events from a previous study of EHR data as the basis for ranking, the authors built a dictionary of drug and adverse event identifiers to match to information from the four types of data sources.  Ranking was performed for each DDE association, with the appearance in more information sources (after crossing an evidence “threshold” in each of the four source types) equating to a higher prioritization/ranking. After removing DDE associations in public data sources (which were assumed to be true), non-EHR prediction methods were found to provide evidence of support for the most DDE associations while literature provided evidence for very few associations. Only 7 DDE associations were supported by the majority of information sources while 1,777 DDE associations were supported by at least one source. This study provides a systematic method by which potential DDE associations can be identified for further study by drug safety researchers based on the degree of complementary evidence across data sources. Given the low number of DDE associations identified in multiple data sources, additional data source categories and sub-categories could be examined to allow a more granular ranking of DDE associations, based on the needs of the researcher. Researchers may also be interested in attempting to adapt the proposed ranking system to take characteristics of DDE associations into account, such as severity or rarity.
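
      The ranking step can be sketched as a simple tally: after removing associations already present in public databases, count how many of the remaining evidence sources exceed a support threshold and sort by that count. The associations, scores, and thresholds below are made up for illustration and do not reflect the study's data.

        # Hypothetical support scores per evidence source for EHR-derived
        # drug-drug-event associations (higher = more support).
        associations = {
            ("drugA", "drugB", "acute kidney injury"): {"spontaneous": 3, "literature": 1, "prediction": 0.9},
            ("drugC", "drugD", "hyperkalemia"):        {"spontaneous": 0, "literature": 0, "prediction": 0.2},
            ("drugE", "drugF", "bradycardia"):         {"spontaneous": 5, "literature": 2, "prediction": 0.8},
        }
        in_public_db = {("drugC", "drugD", "hyperkalemia")}          # pre-filtered: assumed already known
        thresholds = {"spontaneous": 1, "literature": 1, "prediction": 0.5}

        def support_count(scores):
            """Number of evidence sources whose score meets its threshold."""
            return sum(scores[src] >= cut for src, cut in thresholds.items())

        ranked = sorted(
            ((assoc, support_count(scores)) for assoc, scores in associations.items()
             if assoc not in in_public_db),
            key=lambda item: item[1], reverse=True,
        )
        for assoc, n_sources in ranked:
            print(n_sources, "supporting sources:", assoc)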

  • Identification of smoking using Medicare data — a validation study of claims-based algorithms

    Author Rishi J Desai
    Author Daniel H Solomon
    Author Nancy Shadick
    Author Christine Iannaccone
    Author Seoyoung C Kim
    Volume 25
    Issue 4
    Pages 472-475
    Publication Pharmacoepidemiology and Drug Safety
    Date April 1, 2016
    Abstract Purpose This study examined the accuracy of claims-based algorithms to identify smoking against self-reported smoking data. Methods Medicare patients enrolled in the Brigham and Women's Hospital Rheumatoid Arthritis Sequential Study were identified. For each patient, self-reported smoking status was extracted from Women's Hospital Rheumatoid Arthritis Sequential Study and the date of this measurement was defined as the index-date. Two algorithms identified smoking in Medicare claims: (i) only using diagnoses and procedure codes and (ii) using anti-smoking prescriptions in addition to diagnoses and procedure codes. Both algorithms were implemented: first, only using 365-days pre-index claims and then using all available pre-index claims. Considering self-reported smoking status as the gold standard, we calculated specificity, sensitivity, positive predictive value, negative predictive value (NPV), and area under the curve (AUC). Results A total of 128 patients were included in this study, of which 48% reported smoking. The algorithm only using diagnosis and procedure codes had the lowest sensitivity (9.8%, 95%CI 2.4%–17.3%), NPV (54.9%, 95%CI 46.1%–63.9%), and AUC (0.55, 95%CI 0.51–0.59) when applied in the period of 365 days pre-index. Incorporating pharmacy claims and using all available pre-index information improved the sensitivity (27.9%, 95%CI 16.6%–39.1%), NPV (60.4%, 95%CI 51.3%–69.5%), and AUC (0.64, 95%CI 0.58–0.70). The specificity and positive predictive value was 100% for all the algorithms tested. Conclusion Claims-based algorithms can identify smokers with limited sensitivity but very high specificity. In the absence of other reliable means, use of a claims-based algorithm to identify smoking could be cautiously considered in observational studies. Copyright © 2016 John Wiley & Sons, Ltd.

    Notes:

    • One of the limitations of administrative claims is the lack of information on behavioral characteristics of patients, such as smoking. Tobacco use is a risk factor for many diseases and can be a confounder of exposure-outcome associations, so researchers conducting observational studies may wish to use this information to adjust for confounding. Although some studies use indirect indicators or proxies for smoking, there is limited information on the accuracy of these indicators. This paper is a validation study of claims-based algorithms to identify smoking. The authors used data from a cohort of Medicare-insured patients with rheumatoid arthritis enrolled in the Brigham and Women’s Hospital Rheumatoid Arthritis Sequential Study (BRASS). Patients enrolled in this study provided self-reported information on smoking status, which was considered the gold standard. Among the patients who were also enrolled in Medicare (final sample size = 128), two algorithms based on ICD-9 and CPT codes for proxy indicators of smoking or tobacco use were compared against the gold standard. The better-performing algorithm had a sensitivity of 27.9%, specificity of 100%, PPV of 100%, NPV of 60.4%, and AUC of 0.64. Combining medical and pharmacy claims and using all available pre-index information improved algorithm performance. Unlike previously published validation studies of smoking algorithms, this study used self-reported smoking status, rather than chart review, as the gold standard. Because of the low sensitivity, using these algorithms to control for smoking may not completely eliminate confounding, and residual confounding may remain.

  • Implications of immortal person-time when outcomes are nonfatal

    Author Caihua Liang
    Author John D. Seeger
    Author David D. Dore
    Volume 26
    Issue 3
    Pages 212-217
    Publication Annals of Epidemiology
    Date 2016/03/01
    Abstract Purpose The amount of immortal time bias in studies with nonfatal outcomes is unclear. To quantify the magnitude of bias from mishandling of immortal person-time in studies of nonfatal outcomes. Methods We derived formulas for quantifying bias from misclassified or excluded immortal person-time in settings with nonfatal outcomes, assuming a constant rate of outcome. In the situation of misclassified or excluded immortal person-time, the quantification includes the immortal time and corresponding events mistakenly attributed to the exposed group (misclassified) or excluded from study (excluded) that must be attributed to the comparison group. Results With misclassified immortal person-time, the magnitude of bias varies according to the incidence rate ratio of immortal time and comparison group as well as the rate ratio of immortal time and exposed group: toward null for both ratios less than 1, no bias for both ratios equal to 1, away from null for both ratios greater than 1. For one ratio less than 1 and the other greater than 1, the direction and magnitude of bias can be obtained from the formula provided. With excluded immortal person-time, the magnitude of bias is associated with the incidence rate ratio of immortal time and comparison group: toward null for the ratio less than 1, no bias for the ratio equal to 1, and away from null for the ratio greater than 1. Conclusions Bias due to immortal person-time in studies with nonfatal outcomes can vary widely and can be quantified under assumptions that apply to many studies.

    Notes:

    • This article explores the impact of immortal-time bias on the incidence of non-fatal outcomes, and derives a method to quantify the magnitude of this bias. Methods to quantify the impact of immortal time bias when the outcome is fatal have previously been investigated in the literature, so the investigators in this study aimed to modify these methods for application to non-fatal outcomes.

      In cohort studies, the time of enrollment in the cohort often precedes the time of initiation of the exposure of interest. When the outcome of interest is fatal, this creates a window of time during which an exposed person could not have had the outcome, and it is appropriate to consider this time unexposed. For exposed groups, if the time between cohort entry and exposure initiation is misclassified as exposed time, or if it is excluded, immortal time bias can be introduced, resulting in under- or over-estimation of the incidence of the outcome. If the outcome of interest is non-fatal, similar bias can occur, but the magnitude may differ because the event can occur during the immortal time.

      The formula developed for quantifying the relative bias when immortal time is mishandled for non-fatal outcomes is an extension of the formula described by Suissa et al. for fatal outcomes. The non-fatal outcome formula contains additional parameters, such as the ratio of unexposed person-time to immortal time, the ratio of exposed person-time to immortal time, the incidence rate ratio of immortal time to unexposed time, and the incidence rate ratio of immortal time to exposed time. The investigators determined that, for immortal time misclassified as exposed, when the ratio of the incidence rate during immortal time to the incidence rate during exposed time is equal to one, and the ratio of the incidence rate during immortal time to the incidence rate during unexposed time is also equal to one, the relative immortal time bias is negligible. If both ratios are greater than one, the bias is away from the null; if both ratios are smaller than one, the bias is toward the null. When the immortal time is excluded, similar biases result, but they depend only on the magnitude of the ratio of the incidence rate during immortal time to the incidence rate during unexposed time.

      The investigators also applied this framework to a cohort study about the incidence of acute pancreatitis and exenatide use. Systematic bias was noted when immortal time for non-fatal outcomes was mishandled in this example. Overall, this study provides a new method for quantifying this particular type of immortal time bias, which can guide investigators on the importance of correcting it in epidemiological cohort studies. 
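
      The person-time bookkeeping behind this bias can be illustrated with toy numbers. The sketch below is not the authors' bias formula; it simply contrasts an incidence rate ratio computed with the immortal person-time (and its nonfatal events) correctly attributed to the comparison experience against one computed with that time misclassified as exposed.

        # Toy person-time bookkeeping (hypothetical numbers). "Immortal" time is the
        # interval between cohort entry and treatment initiation among the exposed.
        events_exposed, pt_exposed = 40, 1000.0          # after treatment initiation
        events_immortal, pt_immortal = 5, 300.0          # nonfatal events can occur here
        events_unexposed, pt_unexposed = 90, 1500.0

        def irr(e1, pt1, e0, pt0):
            """Incidence rate ratio: (e1/pt1) / (e0/pt0)."""
            return (e1 / pt1) / (e0 / pt0)

        # Correct handling: immortal time and its events belong to the comparison experience.
        correct = irr(events_exposed, pt_exposed,
                      events_unexposed + events_immortal, pt_unexposed + pt_immortal)

        # Misclassification: immortal time and its events credited to the exposed group.
        misclassified = irr(events_exposed + events_immortal, pt_exposed + pt_immortal,
                            events_unexposed, pt_unexposed)

        print(f"IRR correct ~ {correct:.2f}, IRR with misclassified immortal time ~ {misclassified:.2f}")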

  • Improving Information on Maternal Medication Use by Linking Prescription Data to Congenital Anomaly Registers: A EUROmediCAT Study

    Author Linda de Jonge
    Author Ester Garne
    Author Rosa Gini
    Author Susan E. Jordan
    Author Kari Klungsoyr
    Author Maria Loane
    Author Amanda J. Neville
    Author Anna Pierini
    Author Aurora Puccini
    Author Daniel S. Thayer
    Author David Tucker
    Author Anne Vinkel Hansen
    Author Marian K. Bakker
    Volume 38
    Issue 11
    Pages 1083-1093
    Publication Drug Safety
    Date 2015/07/08
    Abstract Introduction Research on associations between medication use during pregnancy and congenital anomalies is significative for assessing the safe use of a medicine in pregnancy. Congenital anomaly (CA) registries do not have optimal information on medicine exposure, in contrast to prescription databases. Linkage of prescription databases to the CA registries is a potentially effective method of obtaining accurate information on medicine use in pregnancies and the risk of congenital anomalies. Methods We linked data from primary care and prescription databases to five European Surveillance of Congenital Anomalies (EUROCAT) CA registries. The linkage was evaluated by looking at linkage rate, characteristics of linked and non-linked cases, first trimester exposure rates for six groups of medicines according to the prescription data and information on medication use registered in the CA databases, and agreement of exposure. Results Of the 52,619 cases registered in the CA databases, 26,552 could be linked. The linkage rate varied between registries over time and by type of birth. The first trimester exposure rates and the agreements between the databases varied for the different medicine groups. Information on anti-epileptic drugs and insulins and analogue medicine use recorded by CA registries was of good quality. For selective serotonin reuptake inhibitors, anti-asthmatics, antibacterials for systemic use, and gonadotropins and other ovulation stimulants, the recorded information was less complete. Conclusion Linkage of primary care or prescription databases to CA registries improved the quality of information on maternal use of medicines in pregnancy, especially for medicine groups that are less fully registered in CA registries.

    Notes:

    • The study of associations between medication use during pregnancy and congenital anomalies is important for assessment of medication safety in pregnant women; however, registry data used to assess congenital anomalies often have poor quality data on maternal medication exposure. The authors of this study link 5 European congenital anomaly registries to administrative prescription databases to assess the quality of data linkage and maternal medication exposure for several common prescription drug classes. A distributed database model was used with linkages for each registry performed locally; local datasets were not combined due to country-specific regulations. Successfully linked and non-linked congenital anomaly cases were compared on the basis of year and type of birth. Exposure rates in each registry and prescription database for six medication classes were also compared, and the percent agreement between the registries and prescription databases were calculated. In total, 65.7% of cases in the congenital anomaly registries could be linked to the prescription databases, ranging from 31.7% of cases in the Welsh registry to 100% of cases in the Danish registry. Differences in calculated first trimester medication exposure rates using the registries versus the prescription databases tended to be small, though significantly greater exposure rates were observed in the prescription databases for medications taken on a short term basis. Percentage agreement in exposure status between the registries and prescription databases was generally high for anti-epileptics and insulins, and low for systemic antibiotics and ovulation stimulants (e.g., gonadotropins). The results of this study demonstrate the potential usefulness of linking pharmacy dispensing data to congenital anomaly registry data for studying associations between maternal medication exposure and fetal outcomes; however, the utility of this approach appears to vary substantially by medication class and the extent of medication-related information already present in the registry data. Linkage of such data sources may be most useful when short-term medications are of interest or when there is a priori knowledge of poor quality medication information in the registries used.

  • Improving the efficiency of estimation in the additive hazards model for stratified case–cohort design with multiple diseases

    Author Soyoung Kim
    Author Jianwen Cai
    Author David Couper
    Volume 35
    Issue 2
    Pages 282-293
    Publication Statistics in Medicine
    Date January 30, 2016
    Abstract The case–cohort study design has often been used in studies of a rare disease or for a common disease with some biospecimens needing to be preserved for future studies. A case–cohort study design consists of a random sample, called the subcohort, and all or a portion of the subjects with the disease of interest. One advantage of the case–cohort design is that the same subcohort can be used for studying multiple diseases. Stratified random sampling is often used for the subcohort. Additive hazards models are often preferred in studies where the risk difference, instead of relative risk, is of main interest. Existing methods do not use the available covariate information fully. We propose a more efficient estimator by making full use of available covariate information for the additive hazards model with data from a stratified case–cohort design with rare (the traditional situation) and non-rare (the generalized situation) diseases. We propose an estimating equation approach with a new weight function. The proposed estimators are shown to be consistent and asymptotically normally distributed. Simulation studies show that the proposed method using all available information leads to efficiency gain and stratification of the subcohort improves efficiency when the strata are highly correlated with the covariates. Our proposed method is applied to data from the Atherosclerosis Risk in Communities study. Copyright © 2015 John Wiley & Sons, Ltd.

    Notes:

    • Existing methods for studying multiple diseases in case-cohort studies do not make full use of the available covariate information, and the resulting analyses may be less precise. The authors proposed an estimator that makes full use of available covariate information for the additive hazards model with data from a stratified case-cohort design, with and without the rare disease assumption. They introduced weights defined by the inverse of the sampling probabilities and used these weights to calculate a baseline cumulative hazard function, which preserves all available covariate information irrespective of the outcome. In an analysis of data from the Atherosclerosis Risk in Communities (ARIC) study, they investigated the association between genetic variation in PTGS1 and coronary heart disease and stroke (multiple diseases/outcomes), and showed that carrying the polymorphism is associated with increased risk of both outcomes (CHD and stroke). They compared these results with those from previously reported methods. The proposed estimators were shown to be consistent and asymptotically normally distributed, and simulation analyses showed a reduction in variance and more precise confidence intervals around the point estimates.

  • Interaction between a single exposure and age in cohort-based hazard rate models impacted the statistical distribution of age at onset

    Author Shizue Izumi
    Author Ritsu Sakata
    Author Michiko Yamada
    Author John Cologne
    Publication Journal of Clinical Epidemiology
    Date Nov 10, 2015
    Abstract OBJECTIVE: Statistical interaction between a single, instantaneous exposure and attained age (age during follow-up; attained age = age at exposure + time since exposure) is used in risk analyses to assess potential effect modification by unmeasured factors correlated with age. However, the impact of such interaction on the statistical distribution of age-at-onset of outcome (disease or death) is infrequently assessed. We therefore explored the impact of such interaction on the shape of the onset-age distribution. STUDY DESIGN AND SETTING: We use for illustration age-at-onset of radiation-related early menopause in a cohort of female Japanese Atomic Bomb Survivors. The statistical distribution of age-at-onset was derived from a parametric hazard rate model fit to the data, assuming an underlying Gaussian onset-age distribution among nonexposed women. RESULTS: Commonly used forms of exposure-by-age (attained age) interaction led to unnatural estimates of the age-specific rate function and unreasonable estimates of the onset-age distribution among exposed women, including positive risk of menopause before menarche. CONCLUSION: We recommend that researchers examine the distribution of age-at-onset and exposure-age interaction when conducting risk analyses. To distinguish this from potential etiologic interaction between exposure and unmeasured factors represented by age as a surrogate, richer models or additional data may be required.

    Notes:

    • The purpose of this research was to explore the impact of a statistical interaction between age and exposure on the onset-age distribution of an outcome and to compare approaches to modeling this interaction. The investigators also aimed to distinguish this interaction from the traditional practice of introducing an age-exposure interaction in risk analyses as a surrogate for unmeasured age-related modifiers. Data from the Life Span Study cohort on the effect of radiation exposure on onset-age of menopause among Japanese Atomic Bomb Survivors were used to investigate age effect modification when risk is age-dependent (or constrained to a specific age distribution) but may also serve as a surrogate for unmeasured age-related factors, using both additive and multiplicative models. Survival analysis results were used to derive onset-age density functions among exposed and unexposed women. Onset-age density distributions were determined using six models: three excess absolute rate models and three excess relative risk models, each with either no age effect modification, age effect modification using a linear term for log of age, or age effect modification using a quadratic transformation of age. The investigators found that the absolute model with a quadratic term for the age interaction was the preferred model for this association; onset-age densities derived from the other models were unnatural, considering the biologically plausible effect of radiation on onset-age of menopause. This research demonstrates the need for careful handling of age interactions in epidemiological studies. The investigators emphasize the importance of choosing an approach to modeling effect modification that considers the subject matter, not just statistical criteria; failure to do so may lead to misinterpretation of risk estimates. Using an age-exposure interaction when the interaction could represent either the effect of exposure on the onset-age distribution or modification of the exposure-outcome relationship by unmeasured age-related factors is problematic. There is no simple way to account for both types of interaction; rather, each issue requires explicit exploration and careful interpretation of estimates. Future research should consider methods to overcome this challenge.

  • Leveraging MEDLINE indexing for pharmacovigilance – Inherent limitations and mitigation strategies

    Author Rainer Winnenburg
    Author Alfred Sorbello
    Author Anna Ripple
    Author Rave Harpaz
    Author Joseph Tonning
    Author Ana Szarfman
    Author Henry Francis
    Author Olivier Bodenreider
    Volume 57
    Pages 425-435
    Publication Journal of Biomedical Informatics
    Date October 2015
    Abstract Background Traditional approaches to pharmacovigilance center on the signal detection from spontaneous reports, e.g., the U.S. Food and Drug Administration (FDA) adverse event reporting system (FAERS). In order to enrich the scientific evidence and enhance the detection of emerging adverse drug events that can lead to unintended harmful outcomes, pharmacovigilance activities need to evolve to encompass novel complementary data streams, for example the biomedical literature available through MEDLINE. Objectives (1) To review how the characteristics of MEDLINE indexing influence the identification of adverse drug events (ADEs); (2) to leverage this knowledge to inform the design of a system for extracting ADEs from MEDLINE indexing; and (3) to assess the specific contribution of some characteristics of MEDLINE indexing to the performance of this system. Methods We analyze the characteristics of MEDLINE indexing. We integrate three specific characteristics into the design of a system for extracting ADEs from MEDLINE indexing. We experimentally assess the specific contribution of these characteristics over a baseline system based on co-occurrence between drug descriptors qualified by adverse effects and disease descriptors qualified by chemically induced. Results Our system extracted 405,300 ADEs from 366,120 MEDLINE articles. The baseline system accounts for 297,093 ADEs (73%). 85,318 ADEs (21%) can be extracted only after integrating specific pre-coordinated MeSH descriptors and additional qualifiers. 22,889 ADEs (6%) can be extracted only after considering indirect links between the drug of interest and the descriptor that bears the ADE context. Conclusions In this paper, we demonstrate significant improvement over a baseline approach to identifying ADEs from MEDLINE indexing, which mitigates some of the inherent limitations of MEDLINE indexing for pharmacovigilance. ADEs extracted from MEDLINE indexing are complementary to, not a replacement for, other sources.

    Notes:

    • Pharmacovigilance activities involving analysis of spontaneous reporting systems for detection of adverse effects have well-known limitations.  The biomedical literature has been explored as one potential data source to supplement traditional pharmacovigilance efforts. Two methods are currently being explored for extracting adverse drug event (ADE) information from the literature: 1) natural language processing of titles, abstracts, and full-text articles to identify relevant ADEs, and 2) use of structured indexing, such as Medical Subject Heading (MeSH) terms, that classifies drugs and diseases.  The authors discuss the properties, challenges, limitations, and feasibility of using MeSH terms to extract ADEs from MEDLINE, as well as their experience and future research needs.  They first ran a query to retrieve all articles relevant to ADEs, then identified the relevant MeSH indexing terms, and finally extracted the drug-manifestation (ADE) pairs along with provenance and metadata information (e.g., publication date and type). They conclude that MEDLINE indexing can be used as a complementary resource that may detect ADE citations other methods miss, but not as a replacement for those methods.  This article is relevant to those exploring novel data sources for pharmacovigilance activities and offers an alternative to natural language processing methods when using the biomedical literature for signal detection.
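
      The baseline co-occurrence rule described in the abstract can be sketched as a simple filter over MeSH descriptor-qualifier pairs: an ADE pair is extracted when an article's indexing contains a drug descriptor qualified by "adverse effects" together with a disease descriptor qualified by "chemically induced". The indexing records below are invented, not real MEDLINE entries.

        # Invented MEDLINE-style indexing: (descriptor, qualifier, is_drug) triples per article.
        articles = {
            "PMID_1": [("Warfarin", "adverse effects", True),
                       ("Gastrointestinal Hemorrhage", "chemically induced", False)],
            "PMID_2": [("Metformin", "therapeutic use", True),
                       ("Lactic Acidosis", "chemically induced", False)],
        }

        def baseline_ades(indexing):
            """Co-occurrence rule: drug/adverse-effects descriptor paired with a
            disease/chemically-induced descriptor in the same article."""
            drugs = [d for d, q, is_drug in indexing if is_drug and q == "adverse effects"]
            events = [d for d, q, is_drug in indexing if not is_drug and q == "chemically induced"]
            return [(drug, event) for drug in drugs for event in events]

        for pmid, indexing in articles.items():
            print(pmid, baseline_ades(indexing))  # PMID_2 yields nothing: drug lacks the qualifier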

  • Longitudinal average attributable fraction as a method for studying time-varying conditions and treatments on recurrent self-rated health: the case of medications in older adults with multiple chronic conditions

    Author Heather G. Allore
    Author Yilei Zhan
    Author Mary Tinetti
    Author Mark Trentalange
    Author Gail McAvay
    Volume 25
    Issue 9
    Pages 681-686.e4
    Publication Annals of Epidemiology
    Date 2015-09-01
    Abstract Purpose The objective is to modify the longitudinal extension of the average attributable fraction (LE-AAF) for recurrent outcomes with time-varying exposures and control for covariates. Methods We included Medicare Current Beneficiary Survey participants with two or more chronic conditions enrolled from 2005 to 2009 with follow-up through 2011. Nine time-varying medications indicated for nine time-varying common chronic conditions and 14 of 18 forward-selected participant characteristics were used as control variables in the generalized estimating equations step of the LE-AAF to estimate associations with the recurrent universal health outcome self-rated health (SRH). Modifications of the LE-AAF were made to accommodate these indicated medication-condition interactions and covariates. Variability was empirically estimated by bias-corrected and accelerated bootstrapping. Results In the adjusted LE-AAF, thiazide, warfarin, and clopidogrel had significant contributions of 1.2%, 0.4%, 0.2%, respectively, to low (poor or fair) SRH; whereas there were no significant contributions of the other medications to SRH. Hyperlipidemia significantly contributed 4.6% to high SRH. All the other conditions except atrial fibrillation contributed significantly to low SRH. Conclusions Our modifications to the LE-AAF method apply to a recurrent binary outcome with time-varying factors accounting for covariates.

    Notes:

    • In research involving multiple chronic conditions (MCC) and multiple medication use, it can be difficult to measure the contribution of a medication to the outcome, especially when medication use and health status are both time-varying and recurrent outcomes are of interest. A modified average attributable fraction for time-to-event outcomes with time-varying medical conditions has previously been used to address this challenge. The authors use a modified version of the longitudinal extension of the average attributable fraction (LE-AAF) to estimate the contribution of recommended medications to recurrent measures of self-rated health (SRH) in a cohort of older adults with MCC. In the adjusted LE-AAF, only thiazide, warfarin, and clopidogrel had significant contributions to low SRH. Hyperlipidemia significantly contributed to high SRH, while all the other conditions except atrial fibrillation contributed significantly to low SRH. This article is applicable to studies of recurrent binary outcomes with time-varying covariates, and may be of special interest to those examining MCC where medication use and disease status change over time.
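
      The generalized estimating equations step inside the LE-AAF can be sketched with statsmodels as follows; the data and variable names below are simulated placeholders, and the attributable-fraction and bootstrap steps of the method are only indicated in comments.

```python
# Sketch of the GEE step within the LE-AAF: a recurrent binary outcome (low
# self-rated health) regressed on time-varying medications/conditions with an
# exchangeable working correlation. Data and variable names are simulated.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_subj, n_waves = 400, 5
pid = np.repeat(np.arange(n_subj), n_waves)
thiazide = rng.binomial(1, 0.3, n_subj * n_waves)
hyperlipidemia = rng.binomial(1, 0.4, n_subj * n_waves)
eta = -1.0 + 0.3 * thiazide - 0.4 * hyperlipidemia
low_srh = rng.binomial(1, 1 / (1 + np.exp(-eta)))
df = pd.DataFrame({"pid": pid, "low_srh": low_srh,
                   "thiazide": thiazide, "hyperlipidemia": hyperlipidemia})

model = smf.gee("low_srh ~ thiazide + hyperlipidemia", groups="pid", data=df,
                family=sm.families.Binomial(),
                cov_struct=sm.cov_struct.Exchangeable())
print(model.fit().summary())
# The LE-AAF then converts fitted probabilities under observed vs. counterfactual
# exposure histories into attributable fractions, with bootstrap confidence intervals.
```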

  • Modeling intensive longitudinal data with mixtures of nonparametric trajectories and time-varying effects

    Author John J. Dziak
    Author Runze Li
    Author Xianming Tan
    Author Saul Shiffman
    Author Mariya P. Shiyko
    Volume 20
    Issue 4
    Pages 444-469
    Publication Psychological Methods
    Date Dec 2015
    Abstract Behavioral scientists increasingly collect intensive longitudinal data (ILD), in which phenomena are measured at high frequency and in real time. In many such studies, it is of interest to describe the pattern of change over time in important variables as well as the changing nature of the relationship between variables. Individuals' trajectories on variables of interest may be far from linear, and the predictive relationship between variables of interest and related covariates may also change over time in a nonlinear way. Time-varying effect models (TVEMs; see Tan, Shiyko, Li, Li, & Dierker, 2012) address these needs by allowing regression coefficients to be smooth, nonlinear functions of time rather than constants. However, it is possible that not only observed covariates but also unknown, latent variables may be related to the outcome. That is, regression coefficients may change over time and also vary for different kinds of individuals. Therefore, we describe a finite mixture version of TVEM for situations in which the population is heterogeneous and in which a single trajectory would conceal important, interindividual differences. This extended approach, MixTVEM, combines finite mixture modeling with non- or semiparametric regression modeling, to describe a complex pattern of change over time for distinct latent classes of individuals. The usefulness of the method is demonstrated in an empirical example from a smoking cessation study. We provide a versatile SAS macro and R function for fitting MixTVEMs. (PsycINFO Database Record

    Notes:

    • When using longitudinal data, the variables and the nature of their relationships with other variables may change over time. As a result, individuals’ trajectories may be linear or non-linear over time. Typically, researchers use time-varying effect models (TVEM) to address these concerns by allowing regression coefficients to vary with time. Furthermore, the regression coefficients may change over time and vary for different types of individuals, possibly due to unmeasured, latent variables. To quantify these inter-individual differences, the authors introduce an extended approach called MixTVEM, which combines finite mixture modeling with non- or semiparametric regression modeling to describe complex patterns of change over time for different latent classes of individuals. The approach addresses two major challenges: heterogeneity, where differences exist between study individuals beyond the measured variables, and non-linearity, where the trajectory of variables cannot always be modeled as linear or quadratic. MixTVEM overcomes the issue of non-linearity by modeling coefficients as smooth functions of time rather than constants, and the issue of heterogeneity by assuming that individuals belong to one of multiple latent classes.
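
      MixTVEM itself is distributed by the authors as a SAS macro and an R function. As a rough, non-mixture illustration of the time-varying-coefficient idea, the sketch below lets the coefficient of a covariate vary smoothly with time by interacting it with a spline basis; the data are simulated and this is not the authors' implementation.

```python
# Rough illustration of a time-varying effect (no latent classes): the coefficient
# of covariate x changes smoothly over time because x is interacted with a B-spline
# basis in time. This is not the authors' MixTVEM implementation.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
t = rng.uniform(0, 10, n)
x = rng.normal(size=n)
beta_t = 0.5 * np.sin(t / 2)                     # true smoothly varying effect of x
y = 1.0 + beta_t * x + rng.normal(scale=0.5, size=n)
df = pd.DataFrame({"y": y, "t": t, "x": x})

# bs() is supplied by patsy through the formula interface
fit = smf.ols("y ~ bs(t, df=4) * x", data=df).fit()
print(fit.params.filter(like="x"))               # pieces of the time-varying effect of x
```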

  • Modelling and extraction of variability in free-text medication prescriptions from an anonymised primary care electronic medical record research database

    Author George Karystianis
    Author Therese Sheppard
    Author William G. Dixon
    Author Goran Nenadic
    Volume 16
    Pages 18
    Publication BMC Medical Informatics and Decision Making
    Date 2016
    Abstract Free-text medication prescriptions contain detailed instruction information that is key when preparing drug data for analysis. The objective of this study was to develop a novel model and automated text-mining method to extract detailed structured medication information from free-text prescriptions and explore their variability (e.g. optional dosages) in primary care research databases.

    Notes:

    • Individual prescription information recorded in electronic health records (EHRs) often includes free-text information entered by healthcare providers, such as alternate drug dosing and directions for use. Natural language processing methods have been used previously to identify prescribing information in free-text clinical notes, though little research has been done which aims to code this information in a way that is useful for inclusion in further research. The authors of this study propose automated methods to model and extract detailed medication information from provider-entered free-text in EHRs and convert it to a structured source of data that can be directly used in observational research studies. To implement the proposed method, a set of dictionaries was created in order to identify four medication attributes (dose number, frequency, interval, and unit) that may be present in free text for individual prescriptions. A rule-based system was then applied to extract the minimum and maximum values for each attribute and provide a common structure to the extracted information. The performance of the method was tested using prescription data in the Clinical Practice Research Datalink. Using manual annotation of free-text data in 220 prescriptions as the gold standard, the automated extraction system correctly identified all dose attributes in the free text for each prescription 91% of the time, with individual medication dose attributes (e.g., frequency, interval) identified with an accuracy ranging from 94% to 100%. Using a larger prescription sample, the authors found that 24% of CPRD prescriptions had variability (i.e., different min/max values) in at least one of the attributes assessed. The methods proposed in this study have clear applications for pharmacoepidemiologists who utilize data sources with free-text prescription information. The extraction and availability of variables representing the variability in drug exposure for individual prescriptions will allow researchers greater flexibility in how they choose to model medication use in relation to safety and efficacy endpoints. This is particularly true for medications that may be taken on an as needed basis or those prescribed with non-traditional directions for use, which can make accurate exposure assessment difficult using traditional claims data.
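
      The rule-based extraction of minimum and maximum values for dose attributes can be illustrated with a simple regular-expression sketch. The patterns below are illustrative only and far cruder than the published system.

```python
# Illustrative (much simplified) extraction of min/max dose number and interval
# from free-text prescription instructions, in the spirit of the rule-based system.
import re

DOSE = re.compile(r"(?:take\s+)?(\d+)(?:\s*(?:-|to)\s*(\d+))?\s*(tablet|capsule|puff)s?", re.I)
EVERY = re.compile(r"every\s+(\d+)(?:\s*(?:-|to)\s*(\d+))?\s*(hour|day|week)s?", re.I)

def extract(text):
    out = {}
    m = DOSE.search(text)
    if m:
        low, high, unit = m.group(1), m.group(2) or m.group(1), m.group(3)
        out["dose"] = {"min": int(low), "max": int(high), "unit": unit.lower()}
    m = EVERY.search(text)
    if m:
        low, high, unit = m.group(1), m.group(2) or m.group(1), m.group(3)
        out["interval"] = {"min": int(low), "max": int(high), "unit": unit.lower()}
    return out

print(extract("Take 1-2 tablets every 4 to 6 hours as needed"))
# {'dose': {'min': 1, 'max': 2, 'unit': 'tablet'},
#  'interval': {'min': 4, 'max': 6, 'unit': 'hour'}}
```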

  • Optimal full matching for survival outcomes: a method that merits more widespread use

    Author Peter C. Austin
    Author Elizabeth A. Stuart
    Volume 34
    Issue 30
    Pages 3949-3967
    Publication Statistics in Medicine
    Date December 30, 2015
    Abstract Matching on the propensity score is a commonly used analytic method for estimating the effects of treatments on outcomes. Commonly used propensity score matching methods include nearest neighbor matching and nearest neighbor caliper matching. Rosenbaum (1991) proposed an optimal full matching approach, in which matched strata are formed consisting of either one treated subject and at least one control subject or one control subject and at least one treated subject. Full matching has been used rarely in the applied literature. Furthermore, its performance for use with survival outcomes has not been rigorously evaluated. We propose a method to use full matching to estimate the effect of treatment on the hazard of the occurrence of the outcome. An extensive set of Monte Carlo simulations were conducted to examine the performance of optimal full matching with survival analysis. Its performance was compared with that of nearest neighbor matching, nearest neighbor caliper matching, and inverse probability of treatment weighting using the propensity score. Full matching has superior performance compared with that of the two other matching algorithms and had comparable performance with that of inverse probability of treatment weighting using the propensity score. We illustrate the application of full matching with survival outcomes to estimate the effect of statin prescribing at hospital discharge on the hazard of post-discharge mortality in a large cohort of patients who were discharged from hospital with a diagnosis of acute myocardial infarction. Optimal full matching merits more widespread adoption in medical and epidemiological research. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

    Notes:

    • Propensity scores are widely used in epidemiological studies; however, a related approach, full matching, is rarely applied. In full matching, matched strata are formed so that the weighted distance in propensity scores between treated and untreated subjects within strata is minimized. The authors proposed using full matching for survival outcomes and evaluated its performance against other propensity score methods (nearest neighbor matching [NNM], nearest neighbor caliper matching, and inverse probability of treatment weighting [IPTW]). The authors used Monte Carlo simulations to compare several performance measures, including the degree to which differences between treatment groups were reduced, bias in the estimated marginal hazard ratio, and the variability of the estimated hazard ratios. The results showed that full matching and IPTW had comparable performance. In contrast, full matching had superior performance compared with NNM and caliper matching, which produced more biased estimates. They also found that the performance of full matching improved as the prevalence of treatment increased. The authors then applied full matching to an empirical study evaluating the effect of statin prescribing at hospital discharge on the time from discharge to death in a population hospitalized for acute myocardial infarction. The authors concluded that full matching should be more widely adopted for survival analyses in epidemiological studies.
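
      Optimal full matching itself is usually carried out with R packages such as optmatch or MatchIt, which have no direct Python equivalent; the sketch below shows only the IPTW comparator discussed in the annotation, on simulated data with hypothetical variable names.

```python
# Sketch of the IPTW comparator: estimate the propensity score, form stabilized
# weights, and fit a weighted Cox model for the marginal hazard ratio of treatment.
# Data are simulated; variable names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n = 1000
age = rng.normal(65, 10, n)
treated = rng.binomial(1, 1 / (1 + np.exp(-(age - 65) / 10)))
time = rng.exponential(5 / np.exp(0.02 * (age - 65) - 0.3 * treated))
event = rng.binomial(1, 0.8, n)
df = pd.DataFrame({"age": age, "treated": treated, "time": time, "event": event})

# Propensity score and stabilized inverse probability of treatment weights
ps = smf.logit("treated ~ age", data=df).fit(disp=False).predict(df)
p_treat = df["treated"].mean()
df["sw"] = np.where(df["treated"] == 1, p_treat / ps, (1 - p_treat) / (1 - ps))

# Weighted Cox model; only 'treated' enters as a covariate
cph = CoxPHFitter()
cph.fit(df[["time", "event", "treated", "sw"]], duration_col="time",
        event_col="event", weights_col="sw", robust=True)
cph.print_summary()
```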

  • Optimal time lags in panel studies

    Author Christian Dormann
    Author Mark A. Griffin
    Volume 20
    Issue 4
    Pages 489-505
    Publication Psychological Methods
    Date Dec 2015
    Abstract Cross-lagged regression coefficients are frequently used to test hypotheses in panel designs. However, these coefficients have particular properties making them difficult to interpret. In particular, cross-lagged regression coefficients may vary, depending on the respective time lags between different sets of measurement occasions. This article introduces the concept of an optimal time lag. Further, it is demonstrated that optimal time lags in panel studies are related to the stabilities of the variables investigated, and that in unidirectional systems, they may be unrelated to the size of possible true effects. The results presented also suggest that optimal time lags for panel designs are usually quite short. Implications are (a) that interpreting cross-lagged regression coefficients requires taking the time lag between measurement occasions into account, and (b) that in much research, far shorter time lags than those frequently found in the literature are justifiable, and we call for more "shortitudinal" studies in the future. (PsycINFO Database Record

    Notes:

    • The authors pointed out the limitations of cross-lagged regression coefficients and aimed to investigate optimal time lags in panel studies. They defined an optimal time lag as “the lag that is required to yield the maximum effect of X predicting Y at a later time.” They first introduced three models (a continuous time model, a discrete time model, and a final model) for regression-based designs and then calculated and illustrated optimal time lags, both by deriving them within the three models and by using existing data. Their results showed that optimal time lags for panel studies are usually short (<6 months) and that the stability of the independent variable, the stability of the dependent variable, and the cross-lagged effect are all related to the optimal time lag. The authors concluded that “shortitudinal” pilot studies should be conducted to estimate optimal time lags for panel studies; use of a short time lag in pilot studies could reveal important information about psychological processes over time as well as identify the optimal time lag for the planned study.
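
      The definition quoted above ("the lag that is required to yield the maximum effect of X predicting Y at a later time") can be made concrete with a small numerical sketch. In a continuous-time bivariate model with drift matrix A, the discrete-time coefficient matrix at lag dt is the matrix exponential exp(A*dt), so an optimal lag can be located by scanning the cross-lagged entry over a grid of lags. The drift values below are hypothetical and are not taken from the article.

```python
# Numerical sketch: for a continuous-time bivariate model with drift matrix A,
# the discrete-time regression matrix at lag dt is expm(A * dt); the "optimal"
# lag is the dt at which the cross-lagged coefficient of X on later Y peaks.
# The drift values are hypothetical.
import numpy as np
from scipy.linalg import expm

A = np.array([[-0.60, 0.00],    # X: moderate stability, no effect of Y on X
              [0.25, -1.00]])   # Y: lower stability, cross effect of X on Y
lags = np.linspace(0.1, 12, 200)                 # candidate lags (e.g., months)
cross = [expm(A * dt)[1, 0] for dt in lags]      # coefficient of X(t) on Y(t + dt)

best = lags[int(np.argmax(cross))]
print(f"cross-lagged effect peaks at a lag of roughly {best:.1f} time units")
```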

  • Optimal Use of Available Claims to Identify a Medicare Population Free of Coronary Heart Disease

    Author Shia T. Kent
    Author Monika M. Safford
    Author Hong Zhao
    Author Emily B. Levitan
    Author Jeffrey R. Curtis
    Author Ryan D. Kilpatrick
    Author Meredith L. Kilgore
    Author Paul Muntner
    Volume 182
    Issue 9
    Pages 808-819
    Publication American Journal of Epidemiology
    Date 11/01/2015
    Abstract We examined claims-based approaches for identifying a study population free of coronary heart disease (CHD) using data from 8,937 US blacks and whites enrolled during 2003–2007 in a prospective cohort study linked to Medicare claims. Our goal was to minimize the percentage of persons at study entry with self-reported CHD (previous myocardial infarction or coronary revascularization). We assembled 6 cohorts without CHD claims by requiring 6 months, 1 year, or 2 years of continuous Medicare fee-for-service insurance coverage prior to study entry and using either a fixed-window or all-available look-back period. We examined adding CHD-related claims to our “base algorithm,” which included claims for myocardial infarction and coronary revascularization. Using a 6-month fixed-window look-back period, 17.8% of participants without claims in the base algorithm reported having CHD. This was reduced to 3.6% using an all-available look-back period and adding other CHD claims to the base algorithm. Among cohorts using all-available look-back periods, increasing the length of continuous coverage from 6 months to 1 or 2 years reduced the sample size available without lowering the percentage of persons with self-reported CHD. This analysis demonstrates approaches for developing a CHD-free cohort using Medicare claims.

    Notes:

    • As part of the REGARDS study, which seeks to understand the higher rate of stroke mortality among US blacks versus whites and in the southern US region compared with other regions, the authors completed a validation study for identifying a cohort free of coronary heart disease (CHD) using Medicare claims data linked to self-reported medical history. They examined the impact of different lengths of required continuous Medicare coverage (6 months, 1 year, or 2 years) for both a fixed-window look-back and an all-available look-back that included claims filed during the entire period in which a beneficiary had continuous Medicare coverage. The authors found that using all available claims instead of the fixed-window look-back minimized the percentage of participants without CHD claims who self-reported CHD, and that increasing the length of required continuous coverage beyond 6 months reduced the sample size without lowering the percentage of participants with self-reported CHD. This research may be of interest when identifying the presence or absence of CHD in studies using claims or patient self-report.
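
      The two look-back strategies compared in the paper can be sketched in pandas: for each participant, check for any qualifying CHD claim either in a fixed window before study entry or in all available claims before entry. The data, identifiers, and column names below are hypothetical.

```python
# Sketch of the two look-back approaches for excluding prevalent CHD:
# (a) fixed 6-month window before study entry vs. (b) all available claims.
# All identifiers, dates, and flags are hypothetical.
import pandas as pd

claims = pd.DataFrame({
    "id":   [1, 1, 2, 3],
    "date": pd.to_datetime(["2004-01-10", "2006-03-01", "2005-11-20", "2006-06-15"]),
    "chd":  [True, False, True, False],   # flag from the CHD claims algorithm
})
entry = pd.Series(pd.to_datetime(["2006-07-01", "2006-07-01", "2006-07-01"]),
                  index=[1, 2, 3], name="entry_date")

df = claims.merge(entry, left_on="id", right_index=True)

fixed_window = df[(df["date"] >= df["entry_date"] - pd.DateOffset(months=6))
                  & (df["date"] < df["entry_date"])]
has_chd_fixed = fixed_window.groupby("id")["chd"].any()

all_available = df[df["date"] < df["entry_date"]]
has_chd_all = all_available.groupby("id")["chd"].any()

chd_free_fixed = entry.index.difference(has_chd_fixed[has_chd_fixed].index)
chd_free_all = entry.index.difference(has_chd_all[has_chd_all].index)
print("CHD-free (6-month window):", list(chd_free_fixed))
print("CHD-free (all available): ", list(chd_free_all))
```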

  • Performance of Time-Varying Predictors in Multilevel Models Under an Assumption of Fixed or Random Effects

    Author Rachel Baird
    Author Scott E. Maxwell
    Publication Psychological Methods
    Date Mar 7, 2016
    Abstract Time-varying predictors in multilevel models are a useful tool for longitudinal research, whether they are the research variable of interest or they are controlling for variance to allow greater power for other variables. However, standard recommendations to fix the effect of time-varying predictors may make an assumption that is unlikely to hold in reality and may influence results. A simulation study illustrates that treating the time-varying predictor as fixed may allow analyses to converge, but the analyses have poor coverage of the true fixed effect when the time-varying predictor has a random effect in reality. A second simulation study shows that treating the time-varying predictor as random may have poor convergence, except when allowing negative variance estimates. Although negative variance estimates are uninterpretable, results of the simulation show that estimates of the fixed effect of the time-varying predictor are as accurate for these cases as for cases with positive variance estimates, and that treating the time-varying predictor as random and allowing negative variance estimates performs well whether the time-varying predictor is fixed or random in reality. Because of the difficulty of interpreting negative variance estimates, 2 procedures are suggested for selection between fixed-effect and random-effect models: comparing between fixed-effect and constrained random-effect models with a likelihood ratio test or fitting a fixed-effect model when an unconstrained random-effect model produces negative variance estimates. The performance of these 2 procedures is compared. (PsycINFO Database Record

    Notes:

    • In longitudinal data analysis, researchers must often take into account time-varying predictors (TVPs), that is, predictor variables that change over time. However, there may be a limited number of time points at which measures of the TVP are available. Typical statistical models assume that the effect of the TVP affects all individuals equally, with no allowance for individual variability (fixed-effect models). The authors conducted two simulation studies comparing the performance of modeling the TVP effect as fixed or random. The fixed model converged, but the random model had poor convergence except when negative variance estimates were allowed, which are uninterpretable. Whether the random model converged depended on the number of time points, the number of subjects, and the variance. The authors argue that negative variance estimates in a random-effect model are more likely when the true variance is close to zero, and they propose a non-convergence and negative-variance procedure: first fit the random-effect model, and if it does not converge or returns a negative variance estimate, fit a fixed-effect model instead.
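
      A minimal statsmodels illustration of the fixed-versus-random choice for a time-varying predictor (TVP): the first model treats the TVP effect as fixed across individuals, the second adds a random slope. The data are simulated, and the likelihood ratio comparison shown is an informal check rather than the paper's full procedure (which also addresses boundary and negative-variance issues).

```python
# Fixed vs. random effect of a time-varying predictor (TVP) in a multilevel model,
# on simulated data. The likelihood ratio comparison is informal and ignores the
# boundary issue discussed in the paper.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(42)
n_subj, n_time = 200, 6
subj = np.repeat(np.arange(n_subj), n_time)
tvp = rng.normal(size=n_subj * n_time)                  # time-varying predictor
slope = 0.5 + rng.normal(scale=0.3, size=n_subj)[subj]  # subject-specific TVP slopes
y = 1.0 + slope * tvp + rng.normal(scale=0.5, size=n_subj * n_time)
df = pd.DataFrame({"y": y, "tvp": tvp, "subj": subj})

m_fixed = smf.mixedlm("y ~ tvp", df, groups="subj").fit(reml=False)
m_random = smf.mixedlm("y ~ tvp", df, groups="subj", re_formula="~tvp").fit(reml=False)

lr = 2 * (m_random.llf - m_fixed.llf)
print("LR statistic:", round(lr, 2), "approx. p:", stats.chi2.sf(lr, df=2))
```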

  • Predictive Bayesian inference and dynamic treatment regimes

    Author Olli Saarela
    Author Elja Arjas
    Author David A. Stephens
    Author Erica E. M. Moodie
    Volume 57
    Issue 6
    Pages 941-958
    Publication Biometrical Journal
    Date November 1, 2015
    Abstract While optimal dynamic treatment regimes (DTRs) can be estimated without specification of a predictive model, a model-based approach, combined with dynamic programming and Monte Carlo integration, enables direct probabilistic comparisons between the outcomes under the optimal DTR and alternative (dynamic or static) treatment regimes. The Bayesian predictive approach also circumvents problems related to frequentist estimators under the nonregular estimation problem. However, the model-based approach is susceptible to misspecification, in particular of the “null-paradox” type, which is due to the model parameters not having a direct causal interpretation in the presence of latent individual-level characteristics. Because it is reasonable to insist on correct inferences under the null of no difference between the alternative treatment regimes, we discuss how to achieve this through a “null-robust” reparametrization of the problem in a longitudinal setting. Since we argue that causal inference can be entirely understood as posterior predictive inference in a hypothetical population without covariate imbalances, we also discuss how controlling for confounding through inverse probability of treatment weighting can be justified and incorporated in the Bayesian setting.

    Notes:

    • Bayesian posterior predictive inference with Monte Carlo integration is one method that can be used to select the optimal treatment pathway for a patient based on the observed patient history at the time of each decision (a dynamic treatment regime, DTR), but it is susceptible to model misspecification. In this paper, the authors discuss two approaches for improving parametric Bayesian inference for DTRs: 1) incorporation of inverse probability of treatment weighting (IPTW) and 2) a “null-robust” reparametrization of the problem. The two methods are illustrated in a simulated dataset as well as real data from a simplified HIV cohort. They conclude that incorporating IPTW into Bayesian predictive inference can help remove residual confounding due to factors that are not part of the planned decision rule and can approximate doubly robust estimation, since only one of the two models needs to be correctly specified to obtain an unbiased estimate; the practical value of the null-robust reparametrization is less clear. This paper has practical implications for researchers seeking to identify optimal treatment pathways and reduce confounding in a Bayesian setting.

  • Sensitivity analyses for partially observed recurrent event data

    Author Mouna Akacha
    Author Emmanuel O. Ogundimu
    Volume 15
    Issue 1
    Pages 4-14
    Publication Pharmaceutical Statistics
    Date January 1, 2016
    Abstract Recurrent events involve the occurrences of the same type of event repeatedly over time and are commonly encountered in longitudinal studies. Examples include seizures in epileptic studies or occurrence of cancer tumors. In such studies, interest lies in the number of events that occur over a fixed period of time. One considerable challenge in analyzing such data arises when a large proportion of patients discontinues before the end of the study, for example, because of adverse events, leading to partially observed data. In this situation, data are often modeled using a negative binomial distribution with time-in-study as offset. Such an analysis assumes that data are missing at random (MAR). As we cannot test the adequacy of MAR, sensitivity analyses that assess the robustness of conclusions across a range of different assumptions need to be performed. Sophisticated sensitivity analyses for continuous data are being frequently performed. However, this is less the case for recurrent event or count data. We will present a flexible approach to perform clinically interpretable sensitivity analyses for recurrent event data. Our approach fits into the framework of reference-based imputations, where information from reference arms can be borrowed to impute post-discontinuation data. Different assumptions about the future behavior of dropouts dependent on reasons for dropout and received treatment can be made. The imputation model is based on a flexible model that allows for time-varying baseline intensities. We assess the performance in a simulation study and provide an illustration with a clinical trial in patients who suffer from bladder cancer. Copyright © 2015 John Wiley & Sons, Ltd.

    Notes:

    • Methods used to analyze clinical trial data where recurrent events are of interest often assume that the reasons for attrition (drop-out and discontinuation) are not associated with the outcome and that data are missing at random (MAR), an untestable assumption. The paper discusses using reference-based imputation as a sensitivity analysis tool, in which data from the reference arm are used to impute post-discontinuation data under various plausible assumptions about the future behavior of dropouts. The authors discuss how missing recurrent event information can be imputed under the MAR mechanism, an approach in which reference-arm data are used to build an imputation model for the missing information in the experimental arm, and an approach in which, after discontinuation, the event rate for patients in the treatment arm is assumed to be a certain percentage or absolute amount higher than under MAR. These methods are illustrated using data from a bladder cancer clinical trial. The sensitivity analyses may be used to test the robustness of results in pharmacoepidemiologic studies involving recurrent events where different reasons for censoring are assumed.
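
      The reference-based idea can be illustrated with a deliberately simplified sketch: estimate the event rate in the reference arm and impute post-discontinuation counts for experimental-arm dropouts from that rate. The published approach is richer (a negative binomial working model with time-varying baseline intensities and formal multiple imputation); all quantities below are hypothetical.

```python
# Simplified sketch of reference-based imputation for recurrent event counts:
# dropouts in the experimental arm have their unobserved follow-up imputed from
# the event rate estimated in the reference arm ("jump to reference" style).
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical reference-arm data: observed counts and exposure times (years)
ref_counts = np.array([2, 0, 1, 3, 1, 0, 2])
ref_time = np.array([1.0, 1.0, 0.8, 1.0, 1.0, 0.6, 1.0])
ref_rate = ref_counts.sum() / ref_time.sum()          # events per person-year

# Hypothetical experimental-arm dropouts: observed counts and unobserved remaining time
obs_counts = np.array([1, 0, 2])
remaining_time = np.array([0.5, 0.75, 0.25])

n_imputations = 100
completed = []
for _ in range(n_imputations):
    imputed = rng.poisson(ref_rate * remaining_time)  # impute the unobserved period
    completed.append(obs_counts + imputed)
completed = np.array(completed)
print("mean imputed total counts per dropout:", completed.mean(axis=0))
```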

  • Should methods of correction for multiple comparisons be applied in pharmacovigilance?

    Author Lorenza Scotti
    Author Silvana Romio
    Author Arianna Ghirardi
    Author Andrea Arfè
    Author Manuela Casula
    Author Lorna Hazell
    Author Francesco Lapi
    Author Alberico Catapano
    Author Miriam Sturkenboom
    Author Giovanni Corrao
    Volume 12
    Issue 4
    Publication Epidemiology, Biostatistics and Public Health
    Date 2015-12-17
    Abstract Purpose. In pharmacovigilance, spontaneous reporting databases are devoted to the early detection of adverse event ‘signals’ of marketed drugs. A common limitation of these systems is the wide number of concurrently investigated associations, implying a high probability of generating positive signals simply by chance. However it is not clear if the application of methods aimed to adjust for the multiple testing problems are needed when at least some of the drug-outcome relationship under study are known. To this aim we applied a robust estimation method for the FDR (rFDR) particularly suitable in the pharmacovigilance context. Methods. We exploited the data available for the SAFEGUARD project to apply the rFDR estimation methods to detect potential false positive signals of adverse reactions attributable to the use of non-insulin blood glucose lowering drugs. Specifically, the number of signals generated from the conventional disproportionality measures and after the application of the rFDR adjustment method was compared. Results. Among the 311 evaluable pairs (i.e., drug-event pairs with at least one adverse event report), 106 (34%) signals were considered as significant from the conventional analysis. Among them 1 resulted in false positive signals according to rFDR method. Conclusions. The results of this study seem to suggest that when a restricted number of drug-outcome pairs is considered and warnings about some of them are known, multiple comparisons methods for recognizing false positive signals are not so useful as suggested by theoretical considerations.

    Notes:

    • Due to the large number of associations being investigated simultaneously, signal detection using spontaneous reporting databases may be prone to generating positive signals purely by chance. Traditional drug safety studies have used methods to correct for multiple comparisons; however, it is unclear whether these methods are applicable in the pharmacovigilance setting. The authors of this study assessed the usefulness of applying multiple testing correction methods in a pharmacovigilance setting. This was accomplished by assessing signals generated via the proportional reporting ratio (PRR) using a previously developed robust false discovery rate (rFDR) method. An analysis of the safety of oral hypoglycemic agents using spontaneous reporting data (FDA-AERS and EudraVigilance) was used as an illustration. Of the 106 signals generated among 311 evaluable drug-event pairs using the PRR, only 1 was determined to be a false positive using the rFDR method (i.e., 105 signals were confirmed). The results of this study may assure pharmacoepidemiologists conducting signal detection studies that correction for multiple comparisons may not dramatically impact results, particularly when a relatively small number of drug-outcome pairs is being considered. However, pharmacoepidemiologists may wish to test this assumption using other disproportionality measures, varying thresholds, and a greater number of drug-outcome pairs.
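
      For readers unfamiliar with the disproportionality measure, the PRR for a drug-event pair is computed from the 2x2 table of report counts. The sketch below then applies an ordinary Benjamini-Hochberg false discovery rate correction across pairs as a familiar stand-in for the robust FDR (rFDR) estimator used in the paper; all counts are hypothetical.

```python
# Proportional reporting ratio (PRR) for hypothetical drug-event report counts,
# followed by a Benjamini-Hochberg FDR adjustment across pairs. Note: the paper
# uses a robust FDR (rFDR) estimator, not BH; BH is shown here only as a familiar
# stand-in for a multiple-comparison correction.
import numpy as np
from scipy.stats import chi2_contingency
from statsmodels.stats.multitest import multipletests

# Each row: [a, b, c, d] = reports of (event & drug), (other events & drug),
#                          (event & other drugs), (other events & other drugs)
tables = np.array([
    [30, 970, 120, 98880],
    [5, 995, 600, 98400],
    [12, 988, 100, 98900],
])

prr = (tables[:, 0] / (tables[:, 0] + tables[:, 1])) / (
      tables[:, 2] / (tables[:, 2] + tables[:, 3]))
pvals = [chi2_contingency(t.reshape(2, 2))[1] for t in tables]
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")

for i, (r, p, keep) in enumerate(zip(prr, p_adj, reject)):
    print(f"pair {i}: PRR={r:.1f}, adjusted p={p:.3g}, signal kept={keep}")
```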

  • Some extensions in continuous models for immunological correlates of protection

    Author Andrew J. Dunning
    Author Jennifer Kensler
    Author Laurent Coudeville
    Author Fabrice Bailleux
    Volume 15
    Pages 107
    Publication BMC Medical Research Methodology
    Date 2015
    Abstract A scaled logit model has previously been proposed to quantify the relationship between an immunological assay and protection from disease, and has been applied in a number of settings. The probability of disease was modelled as a function of the probability of exposure, which was assumed to be fixed, and of protection, which was assumed to increase smoothly with the value of the assay.

    Notes:

    • The scaled logit model has been used to quantify the relationship between immunological assays and protection from disease in vaccine trials, modeling the probability of disease as a function of a fixed exposure parameter and a parametric protection curve that increases smoothly with the assay value. Using data from 16 vaccine clinical trials, the authors investigate alternative protection curve functions (symmetrical, incomplete protection, and non-symmetrical) and suggest criteria for assessing model fit. The authors found that none of the models was uniformly optimal across datasets, but all performed better than the traditional scaled logit model. These extended models are most likely to be useful in vaccine clinical trials that involve immunological assays.

  • Structural Nested Mean Models to Estimate the Effects of Time-Varying Treatments on Clustered Outcomes

    Author Jiwei He
    Author Alisa Stephens-Shields
    Author Marshall Joffe
    Volume 11
    Issue 2
    Pages 203-222
    Publication The International Journal of Biostatistics
    Date Nov 1, 2015
    Abstract In assessing the efficacy of a time-varying treatment structural nested models (SNMs) are useful in dealing with confounding by variables affected by earlier treatments. These models often consider treatment allocation and repeated measures at the individual level. We extend SNMMs to clustered observations with time-varying confounding and treatments. We demonstrate how to formulate models with both cluster- and unit-level treatments and show how to derive semiparametric estimators of parameters in such models. For unit-level treatments, we consider interference, namely the effect of treatment on outcomes in other units of the same cluster. The properties of estimators are evaluated through simulations and compared with the conventional GEE regression method for clustered outcomes. To illustrate our method, we use data from the treatment arm of a glaucoma clinical trial to compare the effectiveness of two commonly used ocular hypertension medications.

    Notes:

    • Structural nested mean models (SNMMs) are useful for assessing the efficacy of a time-varying treatment when there is confounding by variables that are themselves affected by earlier treatment, and they typically consider treatment allocation and repeated measures at the individual level. In this paper, the authors extend SNMMs to clustered observations with time-varying confounding and treatments, formulating models with both cluster- and unit-level treatments and deriving semiparametric estimators of the model parameters. For unit-level treatments, the authors also consider interference, that is, the effect of treatment on outcomes in other units of the same cluster. The properties of the estimators are evaluated through simulations and compared with a conventional GEE regression approach for clustered outcomes. The method is illustrated using data from the treatment arm of a glaucoma clinical trial to compare the effectiveness of two commonly used ocular hypertension medications. This approach may be relevant to pharmacoepidemiologic analyses of time-varying treatments when outcomes are clustered.

  • Synthesis of clinical prediction models under different sets of covariates with one individual patient data

    Author Daisuke Yoneoka
    Author Masayuki Henmi
    Author Norie Sawada
    Author Manami Inoue
    Volume 15
    Pages 101
    Publication BMC Medical Research Methodology
    Date 2015
    Abstract Recently, increased development of clinical prediction models has been reported in the medical literature. However, evidence synthesis methodologies for these prediction models have not been sufficiently studied, especially for practical situations such as a meta-analyses where only aggregated summaries of important predictors are available. Also, in general, the covariate sets involved in the prediction models are not common across studies. As in ordinary model misspecification problems, dropping relevant covariates would raise potentially serious biases to the prediction models, and consequently to the synthesized results.

    Notes:

    • Though development of prediction models is becoming increasingly common, they often produce imprecise estimates due to small sample sizes combined with large numbers of predictors in the model. Methods have been proposed which synthesize regression coefficients from prediction models in a multivariate meta-analysis fashion, but they either require a common set of covariates across models or rely on mean imputation for omitted coefficients. The authors of this study propose methods to synthesize logistic clinical prediction models when covariate sets differ across models, assuming homogeneity of studies. The authors then describe bias assessment methods for regression coefficients when important predictors are omitted from a study's prediction model, demonstrate the desirable properties of the proposed synthesis method, validate the method through simulation studies, and apply the methods to a real-world synthesis of Japanese stroke data. Compared to previous synthesis methods, the proposed methods resulted in smaller biases and more precise estimates while also accommodating covariate sets that differed between studies. Using 10 real-world stroke data sets, the proposed synthesis methods produced narrower confidence intervals (i.e., lower standard errors) and slightly better prediction performance compared to the multivariate meta-analysis methods proposed previously. The results of this study may be useful to pharmacoepidemiologists who wish to incorporate regression results from prior prediction models into their own prediction models in order to increase precision. The proposed methods may also be extended to other observational study types (e.g., case-control, cross-sectional) where prior study results can be used to improve precision in models which aim to identify causal effects, regardless of differences in covariate sets.

  • Testing Nonnested Structural Equation Models

    Author Edgar C. Merkle
    Author Dongjun You
    Author Kristopher J. Preacher
    Publication Psychological Methods
    Date Aug 3, 2015
    Abstract In this article, we apply Vuong's (1989) likelihood ratio tests of nonnested models to the comparison of nonnested structural equation models (SEMs). Similar tests have been previously applied in SEM contexts (especially to mixture models), though the nonstandard output required to conduct the tests has limited their use and study. We review the theory underlying the tests and show how they can be used to construct interval estimates for differences in nonnested information criteria. Through both simulation and application, we then study the tests' performance in nonmixture SEMs and describe their general implementation via free R packages. The tests offer researchers a useful tool for nonnested SEM comparison, with barriers to test implementation now removed. (PsycINFO Database Record

    Notes:

    • Likelihood ratio tests can help decide which model fits better when comparing simple versus complex nested models. However, this statistic does not extend to scenarios where models are not nested. In non-nested cases, the Akaike information criterion (AIC) or Bayesian information criterion (BIC) can be used, and the model with the lower criterion score is considered better. The issues in the context of structural equation models (SEMs) are that the BIC can have large sampling variability and that one cannot conclude whether two models fit equally well. Another method (Pornprasertmanit, Wu, and Little, 2013) can help decide whether two models fit equally well, but it is computationally expensive. In this study, the authors describe tests for non-nested models, based on the theory of Vuong (1989), that can help identify which model fits better, whether two models fit equally well, and whether they are distinguishable. Non-nested models can be subclassified as overlapping and strictly non-nested; for strictly non-nested SEMs, the parameter spaces do not overlap at all and the models can have different functional forms. The authors describe the theoretical background, a method for estimating confidence intervals for differences in BIC, and an example of how the method can be applied. The conditions required for the Vuong tests include the existence of second-order derivatives of the log-likelihood, invertibility of the models’ information matrices, and independent and identically distributed data vectors. The authors recommend the following steps for model comparison: i) evaluate the models for equivalence using the nesting and equivalence test (NET); if the models are not equivalent, ii) test whether the models are distinguishable; if they are, iii) compare the models using the non-nested likelihood ratio test or the confidence interval for the BIC difference.

  • Tests for detecting overdispersion in models with measurement error in covariates

    Author Yingsi Yang
    Author Man Yu Wong
    Volume 34
    Issue 27
    Pages 3563-3576
    Publication Statistics in Medicine
    Date November 30, 2015
    Abstract Measurement error in covariates can affect the accuracy in count data modeling and analysis. In overdispersion identification, the true mean–variance relationship can be obscured under the influence of measurement error in covariates. In this paper, we propose three tests for detecting overdispersion when covariates are measured with error: a modified score test and two score tests based on the proposed approximate likelihood and quasi-likelihood, respectively. The proposed approximate likelihood is derived under the classical measurement error model, and the resulting approximate maximum likelihood estimator is shown to have superior efficiency. Simulation results also show that the score test based on approximate likelihood outperforms the test based on quasi-likelihood and other alternatives in terms of empirical power. By analyzing a real dataset containing the health-related quality-of-life measurements of a particular group of patients, we demonstrate the importance of the proposed methods by showing that the analyses with and without measurement error correction yield significantly different results. Copyright © 2015 John Wiley & Sons, Ltd.

    Notes:

    • Count data models are frequently used in medical, economic, and other research areas to analyze the association between an exposure and the frequency of events. Most existing count models assume that covariates are measured precisely; however, measurement error is widespread and can substantially affect a statistical analysis, either by biasing estimates or by reducing the power of hypothesis tests. Count data are commonly analyzed with the Poisson model, a generalized linear model (GLM) characterized by the assumption that the mean equals the variance (equidispersion). In practice, many datasets display overdispersion relative to the Poisson model, and the negative binomial model is then employed to account for it. While previous work has addressed parameter estimation and reduction of confounding effects under measurement error, little work has focused on reliable statistical tests of the mean-variance equality assumption when covariates are measured with error. The objective of this paper was to develop three tests for detecting overdispersion when covariates are measured with error. The approach is based on an approximate likelihood function for GLMs under the classical measurement error framework. The approximate maximum likelihood estimator (AMLE) is compared with the quasi-likelihood estimator (QLE), and three test statistics are developed: two score tests based on the AMLE and the QLE, respectively, and a modified version of Rao's score test. Efficiency comparisons between the AMLE and QLE for the Poisson and negative binomial regression models show that the AMLE is consistently more efficient. All three proposed test statistics are more powerful than existing tests for overdispersion, providing guidance on the choice between Poisson and negative binomial models when covariates are measured with error; the test based on the AMLE is the most powerful of the three. When applied to a real study assessing the health-related quality of life (HRQoL) of Chinese patients with colorectal neoplasms, the AMLE-based test again outperformed the QLE-based test and other alternatives. Detecting overdispersion in the presence of measurement error matters because variation caused by measurement error can obscure the true mean-variance relationship, yielding imprecise estimates and potentially misleading conclusions. Given that count data are an important source of evidence for drug safety studies and can carry policy implications, accounting for the influence of measurement error in count data analysis is paramount.
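
      As background for the proposed tests (which additionally correct for covariate measurement error), a standard overdispersion check without any measurement-error correction can be run by comparing Poisson and negative binomial fits; the data below are simulated.

```python
# Standard overdispersion check with no measurement-error correction (background
# for the paper's extended tests): compare Poisson and negative binomial fits on
# simulated overdispersed counts via a likelihood ratio statistic.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(3)
n_obs = 2000
x = rng.normal(size=n_obs)
mu = np.exp(0.5 + 0.4 * x)
y = rng.negative_binomial(2, 2 / (2 + mu))     # overdispersed counts with mean mu
df = pd.DataFrame({"y": y, "x": x})

pois = smf.poisson("y ~ x", data=df).fit(disp=False)
nb = smf.negativebinomial("y ~ x", data=df).fit(disp=False)

lr = 2 * (nb.llf - pois.llf)
# The dispersion parameter sits on the boundary under the null, so halve the
# chi-square(1) tail probability.
print("LR =", round(lr, 1), "approx. p:", stats.chi2.sf(lr, df=1) / 2)
```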

  • The Impact of Sparse Follow-up on Marginal Structural Models for Time-to-Event Data

    Author Nassim Mojaverian
    Author Erica E. M. Moodie
    Author Alex Bliu
    Author Marina B. Klein
    Volume 182
    Issue 12
    Pages 1047-1055
    Publication American Journal of Epidemiology
    Date 12/15/2015
    Abstract The impact of risk factors on the amount of time taken to reach an endpoint is a common parameter of interest. Hazard ratios are often estimated using a discrete-time approximation, which works well when the by-interval event rate is low. However, if the intervals are made more frequent than the observation times, missing values will arise. We investigated common analytical approaches, including available-case (AC) analysis, last observation carried forward (LOCF), and multiple imputation (MI), in a setting where time-dependent covariates also act as mediators. We generated complete data to obtain monthly information for all individuals, and from the complete data, we selected “observed” data by assuming that follow-up visits occurred every 6 months. MI proved superior to LOCF and AC analyses when only data on confounding variables were missing; AC analysis also performed well when data for additional variables were missing completely at random. We applied the 3 approaches to data from the Canadian HIV–Hepatitis C Co-infection Cohort Study (2003–2014) to estimate the association of alcohol abuse with liver fibrosis. The AC and LOCF estimates were larger but less precise than those obtained from the analysis that employed MI.

    Notes:

    • There is increasing interest in analytic methods that allow for time-varying exposures and covariates in time-to-event analysis. In particular, there is a need for statistical methods to handle time-varying confounders that may also mediate the exposure-outcome relationship. Marginal structural models (MSMs) are one way of handling such variables; however, they rely on certain assumptions regarding per-interval event rates that, when not met, may induce bias. Mojaverian and colleagues identify this as, fundamentally, a missing-data problem, and this paper investigates how missing-data methods may be used to reduce bias in MSMs with time-varying confounder/mediators.

      Three methods are examined: available-case (AC) analysis (analyzing only observed data, which requires an assumption of missing completely at random [MCAR]), last observation carried forward (LOCF, which may result in bias even if MCAR holds), and multiple imputation (MI, which requires data missing at random [MAR]). Each of these methods was simulated in an MSM Cox model using a previously published data-generating algorithm, with different data collection intervals and different permutations of missingness on the exposure, confounder, and outcome variables. The underlying data were structured to be MCAR. The analysis showed that, while MI performed well when only the confounder variable was missing, no method was successful in handling both a missing confounder and a missing exposure. With too high a rate of missing information, the MI procedure failed in some cases; in those cases, the AC method was a possible alternative, as it was unbiased (given MCAR) but statistically inefficient. These results were replicated in an empirical example.

      Based on the simulation and empirical results, the authors propose the use of multiple imputation when the time between longitudinal observations exceeds the desired analytic intervals. However, they acknowledge that the traditional AC approach may also be a viable option if the MCAR assumption holds.
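
      A minimal sketch of the multiple-imputation step for a confounder observed only at sparse visits is shown below; scikit-learn's IterativeImputer is used as a generic chained-equations-style imputer, and the subsequent weighted (MSM) Cox fit is only indicated in comments. The data and variable names are hypothetical.

```python
# Sketch of multiply imputing a sparsely observed time-varying confounder before
# fitting a marginal structural model. IterativeImputer is a generic
# chained-equations-style imputer; the weighted outcome model is not shown.
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(5)
n, months = 300, 24
df = pd.DataFrame({
    "id": np.repeat(np.arange(n), months),
    "month": np.tile(np.arange(months), n),
    "alcohol": rng.binomial(1, 0.3, n * months).astype(float),
    "fibrosis_score": rng.normal(size=n * months),
})
# Confounder observed only at 6-monthly visits; other months are missing
df.loc[df["month"] % 6 != 0, "fibrosis_score"] = np.nan

imputations = []
for m in range(5):  # five imputed datasets
    imp = IterativeImputer(sample_posterior=True, random_state=m)
    filled = df.copy()
    filled[["month", "alcohol", "fibrosis_score"]] = imp.fit_transform(
        filled[["month", "alcohol", "fibrosis_score"]])
    imputations.append(filled)
# Each completed dataset would then be used to build IPT weights and fit the
# weighted (MSM) Cox model, with estimates combined by Rubin's rules.
print(imputations[0]["fibrosis_score"].isna().sum())  # 0 after imputation
```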

  • Toward high-throughput phenotyping: unbiased automated feature extraction and selection from knowledge sources

    Author Sheng Yu
    Author Katherine P. Liao
    Author Stanley Y. Shaw
    Author Vivian S. Gainer
    Author Susanne E. Churchill
    Author Peter Szolovits
    Author Shawn N. Murphy
    Author Isaac S. Kohane
    Author Tianxi Cai
    Volume 22
    Issue 5
    Pages 993-1000
    Publication Journal of the American Medical Informatics Association
    Date 2015/09/01
    Abstract Objective Analysis of narrative (text) data from electronic health records (EHRs) can improve population-scale phenotyping for clinical and genetic research. Currently, selection of text features for phenotyping algorithms is slow and laborious, requiring extensive and iterative involvement by domain experts. This paper introduces a method to develop phenotyping algorithms in an unbiased manner by automatically extracting and selecting informative features, which can be comparable to expert-curated ones in classification accuracy.Materials and methods Comprehensive medical concepts were collected from publicly available knowledge sources in an automated, unbiased fashion. Natural language processing (NLP) revealed the occurrence patterns of these concepts in EHR narrative notes, which enabled selection of informative features for phenotype classification. When combined with additional codified features, a penalized logistic regression model was trained to classify the target phenotype.Results The authors applied our method to develop algorithms to identify patients with rheumatoid arthritis and coronary artery disease cases among those with rheumatoid arthritis from a large multi-institutional EHR. The area under the receiver operating characteristic curves (AUC) for classifying RA and CAD using models trained with automated features were 0.951 and 0.929, respectively, compared to the AUCs of 0.938 and 0.929 by models trained with expert-curated features.Discussion Models trained with NLP text features selected through an unbiased, automated procedure achieved comparable or slightly higher accuracy than those trained with expert-curated features. The majority of the selected model features were interpretable.Conclusion The proposed automated feature extraction method, generating highly accurate phenotyping algorithms with improved efficiency, is a significant step toward high-throughput phenotyping.

    Notes:

    • Electronic health records (EHRs) are a rich source of information for medical research; however, accurately obtaining information from unstructured text remains a challenge. Two approaches used to identify a desired patient cohort or phenotype (e.g., all patients with a certain disease) are 1) labor-intensive manual review relying on human expertise to suggest logical combinations of features that match a particular phenotype, and 2) statistical or machine learning methods that optimize combinations of features to most accurately identify the relevant phenotype. The authors describe their technique of using automatically extracted features to generate algorithms that identified patients with rheumatoid arthritis and coronary artery disease in a large multi-institutional EHR. First, comprehensive medical concepts were collected automatically from publicly available knowledge sources (e.g., Medscape); next, natural language processing (NLP) was used to identify occurrence patterns of these concepts in EHR notes; finally, these features were combined with additional codified features, and a penalized logistic regression model was trained to classify the target disease phenotype. The authors found that models trained with NLP text features selected through this automated procedure achieved comparable or slightly higher accuracy than those trained with expert-curated features, and the majority of the selected model features were interpretable. The findings are relevant to those attempting to abstract data from EHRs for research or reporting purposes.
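
      The final classification step described in the annotation (a penalized logistic regression over NLP concept counts and codified features) can be sketched with scikit-learn; the data and feature structure below are simulated placeholders.

```python
# Sketch of the phenotype classification step: L1-penalized logistic regression
# over NLP concept counts and codified features, trained against a labeled set.
# Data and features are simulated placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(11)
n, n_features = 500, 40                       # 40 candidate NLP/codified features
X = rng.poisson(1.0, size=(n, n_features))    # concept counts (log-scaled in practice)
true_w = np.zeros(n_features)
true_w[:5] = 1.0                              # only a few features truly informative
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ true_w - 5))))

clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print("cross-validated AUC:", auc.mean().round(3))
```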

  • Useful Interplay Between Spontaneous ADR Reports and Electronic Healthcare Records in Signal Detection

    Author Alexandra C. Pacurariu
    Author Sabine M. Straus
    Author Gianluca Trifirò
    Author Martijn J. Schuemie
    Author Rosa Gini
    Author Ron Herings
    Author Giampiero Mazzaglia
    Author Gino Picelli
    Author Lorenza Scotti
    Author Lars Pedersen
    Author Peter Arlett
    Author Johan van der Lei
    Author Miriam C. Sturkenboom
    Author Preciosa M. Coloma
    Volume 38
    Issue 12
    Pages 1201-1210
    Publication Drug Safety
    Date 2015/09/14
    Abstract Background and Objective Spontaneous reporting systems (SRSs) remain the cornerstone of post-marketing drug safety surveillance despite their well-known limitations. Judicious use of other available data sources is essential to enable better detection, strengthening and validation of signals. In this study, we investigated the potential of electronic healthcare records (EHRs) to be used alongside an SRS as an independent system, with the aim of improving signal detection. Methods A signal detection strategy, focused on a limited set of adverse events deemed important in pharmacovigilance, was performed retrospectively in two data sources—(1) the Exploring and Understanding Adverse Drug Reactions (EU-ADR) database network and (2) the EudraVigilance database—using data between 2000 and 2010. Five events were considered for analysis: (1) acute myocardial infarction (AMI); (2) bullous eruption; (3) hip fracture; (4) acute pancreatitis; and (5) upper gastrointestinal bleeding (UGIB). Potential signals identified in each system were verified using the current published literature. The complementarity of the two systems to detect signals was expressed as the percentage of the unilaterally identified signals out of the total number of confirmed signals. As a proxy for the associated costs, the number of signals that needed to be reviewed to detect one true signal (number needed to detect [NND]) was calculated. The relationship between the background frequency of the events and the capability of each system to detect signals was also investigated. Results The contribution of each system to signal detection appeared to be correlated with the background incidence of the events, being directly proportional to the incidence in EU-ADR and inversely proportional in EudraVigilance. EudraVigilance was particularly valuable in identifying bullous eruption and acute pancreatitis (71 and 42 % of signals were correctly identified from the total pool of known associations, respectively), while EU-ADR was most useful in identifying hip fractures (60 %). Both systems contributed reasonably well to identification of signals related to UGIB (45 % in EudraVigilance, 40 % in EU-ADR) but only fairly for signals related to AMI (25 % in EU-ADR, 20 % in EudraVigilance). The costs associated with detection of signals were variable across events; however, it was often more costly to detect safety signals in EU-ADR than in EudraVigilance (median NNDs: 7 versus 5). Conclusion An EHR-based system may have additional value for signal detection, alongside already established systems, especially in the presence of adverse events with a high background incidence. While the SRS appeared to be more cost effective overall, for some events the costs associated with signal detection in the EHR might be justifiable.

    Notes:

    • This study evaluated whether using electronic healthcare records (EHRs) alongside a spontaneous reporting system (SRS) enhances signal detection for five adverse events: acute myocardial infarction (AMI), bullous eruption, hip fracture, acute pancreatitis, and upper gastrointestinal bleeding (UGIB). The proportional reporting ratio (PRR) was used to evaluate signals in the EudraVigilance SRS database, while the Longitudinal Gamma Poisson Shrinker (LGPS) method and the Longitudinal Evaluation of Observational Profiles of Adverse events Related to Drugs (LEOPARD) method were used to identify signals in the Exploring and Understanding Adverse Drug Reactions (EU-ADR) EHR network database. All signals were verified using an automatic tool that searches Medline-indexed publications. The authors evaluated the performance of the two systems by calculating the percentage of unilaterally identified signals out of the total number of confirmed signals, sensitivity, the number needed to detect (NND) a true signal, and Spearman's correlation coefficient. They found that, except for bullous eruption, it was more costly (higher NND) to detect a true signal in the EHR-based system (EU-ADR). The ability to detect signals depended on the nature of the adverse event, the database used, the rareness of the drug-event pair, the background incidence of the event, and the contribution of each database. The study methods and findings are of interest to those involved in pharmacovigilance, particularly those exploring the addition of an EHR-based system to traditional SRS methods.

  • Using epidemiological registry data to provide background rates as context for adverse events in a rheumatoid arthritis drug development program: a coordinated approach

    Author Fredrik Nyberg
    Author Johan Askling
    Author Niklas Berglind
    Author Stefan Franzén
    Author Meilien Ho
    Author Marie Holmqvist
    Author Laura Horne
    Author Kathy Lampl
    Author Kaleb Michaud
    Author Dimitrios A. Pappas
    Author George Reed
    Author Deborah Symmons
    Author Eiichi Tanaka
    Author Trung N. Tran
    Author Suzanne M. M. Verstappen
    Author Eveline Wesby-van Swaay
    Author Hisashi Yamanaka
    Author Jeffrey D. Greenberg
    Volume 24
    Issue 11
    Pages 1121-1132
    Publication Pharmacoepidemiology and Drug Safety
    Date November 1, 2015
    Abstract Purpose Observational studies can provide context for adverse events observed in clinical trials, especially for infrequent events or long-term risks. We developed methods to improve safety contextualization for a rheumatoid arthritis drug development program through coordinated analyses of multiple registries. Methods We identified and characterized differences and similarities across five registries (Swedish Rheumatology Quality of Care Register, Consortium of Rheumatology Researchers of North America [CORRONA], Norfolk Arthritis Register, Institute of Rheumatology Rheumatoid Arthritis, and the new CORRONA International), harmonized outcome definitions, and investigated whether restricted subcohorts improved comparability with trial populations. To address confounding, we identified risk predictors for outcomes of interest (mortality, cardiovascular disease, infection, and malignancy). We used patient-level analyses at each registry and central analysis of standardized group-level data. Results Despite data differences, the coordinated approach enabled consistent variable definitions for key baseline characteristics and outcomes. Selection of restricted subcohorts (e.g., using active joint count criteria) improved baseline comparability with trial patients for some rheumatoid arthritis disease activity measures, but less for other characteristics (e.g., age and comorbidity); however, such selection decreased sample size considerably. For most outcomes, age was the most important risk predictor, emphasizing the importance of age/sex standardization to address confounding. The prospective approach enabled use of recent relevant data; the distributed analysis safeguarded confidentiality of registry data. Conclusions Compared with reliance on published data alone, a forward-looking coordinated approach across multiple observational data sources can improve comparability and consistency and better support sensitivity analyses and data interpretation, in contextualizing safety data from clinical trials. This approach may have utility to support safety assessments across diverse diseases and drug development programs and satisfy future regulatory requirements. Copyright © 2015 John Wiley & Sons, Ltd.

    Notes:

    • Observational studies have long been used to support the safety profile of medications tested in clinical trials, particularly for rare and/or long-term adverse events (AEs); however, differences in patient populations and designs between observational studies and clinical trials have made direct comparison difficult. The authors of this study propose a coordinated analysis approach using data from multiple disease registries to improve contextualization of AE rates observed in clinical trial programs and overcome barriers associated with comparison of safety data from observational studies and clinical trials. Using a rheumatoid arthritis (RA) drug development program as an example, five RA-specific disease registries were chosen with data analyzed at both the individual registry and aggregate level. Baseline and outcome variable definitions in the registries were matched as closely as possible to those used in the clinical trial program. Potential confounding was addressed in subcohort analyses of the registries by mimicking the inclusion criteria and follow-up time observed in the RA clinical trial program and by identifying significant predictors for AE outcomes to use in standardization and restriction of the data. Age and sex standardization to the clinical trial population was generally found to have the greatest impact on the crude rate of AEs observed in the registry data. Subcohorts that were restricted based on trial inclusion/exclusion criteria, follow-up time, and additional standardization for Health Assessment Questionnaire score somewhat improved baseline comparability with clinical trial patients, but significantly reduced sample size and had relatively little impact on the observed AE rates. The results of this study provide an approach for pharmacoepidemiologists to more accurately contextualize AE data observed in clinical trials by harmonizing variable/outcome definitions and coordinating analyses across multiple observational registries. Researchers may also be interested in assessing the impact of this approach on AE rates across other disease states and secondary data sources. Because a prospective approach is taken, pharmacoepidemiologists may also be interested in using these methods to rapidly assess and confirm potential safety signals which may emerge in clinical trials.
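
    Direct age/sex standardization, which the authors found to be the most influential adjustment, weights the registry's stratum-specific event rates by the clinical-trial population's stratum distribution. A minimal sketch with hypothetical strata and counts (not the registries' data):

```python
import pandas as pd

# hypothetical age/sex strata: registry event counts, person-years, and the
# corresponding distribution of the clinical-trial population
strata = pd.DataFrame({
    "stratum": ["F<65", "F>=65", "M<65", "M>=65"],
    "registry_events": [12, 30, 15, 40],
    "registry_person_years": [4000, 2500, 3500, 2000],
    "trial_weight": [0.35, 0.15, 0.40, 0.10],
})

rate = strata["registry_events"] / strata["registry_person_years"]
standardized_rate = (rate * strata["trial_weight"]).sum()
print(f"standardized rate per 1000 person-years: {1000 * standardized_rate:.2f}")
```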

  • Using Linked Data for polarity classification of patients’ experiences

    Author Samira Noferesti
    Author Mehrnoush Shamsfard
    Volume 57
    Pages 6-19
    Publication Journal of Biomedical Informatics
    Date 2015-10-01
    Abstract Polarity classification is the main subtask of sentiment analysis and opinion mining, well-known problems in natural language processing that have attracted increasing attention in recent years. Existing approaches mainly rely on the subjective part of text in which sentiment is expressed explicitly through specific words, called sentiment words. These approaches, however, are still far from being good in the polarity classification of patients’ experiences since they are often expressed without any explicit expression of sentiment, but an undesirable or desirable effect of the experience implicitly indicates a positive or negative sentiment. This paper presents a method for polarity classification of patients’ experiences of drugs using domain knowledge. We first build a knowledge base of polar facts about drugs, called FactNet, using extracted patterns from Linked Data sources and relation extraction techniques. Then, we extract generalized semantic patterns of polar facts and organize them into a hierarchy in order to overcome the missing knowledge issue. Finally, we apply the extracted knowledge, i.e., polar fact instances and generalized patterns, for the polarity classification task. Different from previous approaches for personal experience classification, the proposed method explores the potential benefits of polar facts in domain knowledge aiming to improve the polarity classification performance, especially in the case of indirect implicit experiences, i.e., experiences which express the effect of one entity on other ones without any sentiment words. Using our approach, we have extracted 9703 triplets of polar facts at a precision of 92.26 percent. In addition, experiments on drug reviews demonstrate that our approach can achieve 79.78 percent precision in polarity classification task, and outperforms the state-of-the-art sentiment analysis and opinion mining methods.

    Notes:

    • Polarity classification is a key component of natural language processing methods aimed at automatically extracting people’s sentiments and opinions from free-text data sources and classifying them as positive or negative. However, current polarity classification methods rely on sentiment being expressed directly and explicitly via specific sets of words in the text source, with poor ability to classify experiences based on objective/factual statements and indirect opinion/sentiment. To address this, the authors of this study propose an automated polarity classification method for patients’ drug experiences using linked and structured internet data, with the ability to classify objective statements and indirectly expressed experiences. Information was extracted from linked internet data sources (e.g., Drugbank, DailyMed, SIDER) to build FactNet, a knowledge base of drug facts and generalized semantic patterns which indicate a positive or negative patient experience (i.e., polar facts). Generalized patterns of polar facts derived from FactNet were then applied to a patient experience polarity classification task, with the effectiveness of the proposed approach compared to previous methods; a toy illustration of polar-fact matching follows this entry. The authors found that the proposed polarity classification method resulted in high precision (i.e., the percentage of classified instances that are correct) and recall (i.e., the percentage of all true instances that are correctly identified). Using these metrics, the method showed superior performance compared to other baseline and state-of-the-art polarity classification methods. The methods proposed in this paper may be useful to pharmacoepidemiologists interested in studying patients’ self-reported experiences with medications in a drug safety context where large amounts of free-text processing are required (e.g., social media and patient support forums). There may also be applicability of these methods to other free-text data sources used in drug safety research, such as clinical notes recorded by healthcare providers in patient electronic health records.
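
    The core idea, classifying an experience by matching implicit "polar facts" (desirable or undesirable drug effects) rather than explicit sentiment words, can be shown with a toy matcher. The pattern list below is invented for illustration; FactNet and the authors' pattern-generalization step are not reproduced here.

```python
# Toy polar-fact matcher: hypothetical (verb, effect) patterns with a polarity label.
POLAR_FACTS = {
    ("reduce", "pain"): "positive",
    ("lower", "blood pressure"): "positive",
    ("cause", "nausea"): "negative",
    ("increase", "appetite"): "negative",
}

def classify(review: str) -> str:
    """Return a polarity if any polar-fact pattern is matched, else 'unknown'."""
    text = review.lower()
    for (verb, effect), polarity in POLAR_FACTS.items():
        if verb in text and effect in text:
            return polarity
    return "unknown"

print(classify("This drug really reduced my pain within a week."))  # positive
print(classify("It caused constant nausea."))                       # negative
```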

  • Validation of a coding algorithm for intra-abdominal surgeries and adhesion-related complications in an electronic medical records database

    Author Frank I. Scott
    Author Ronac Mamtani
    Author Kevin Haynes
    Author David S. Goldberg
    Author Najjia N. Mahmoud
    Author James D. Lewis
    Pages n/a-n/a
    Publication Pharmacoepidemiology and Drug Safety
    Date January 1, 2016
    Abstract Purpose Epidemiological data on adhesion-related complications following intra-abdominal surgery are limited. We tested the accuracy of recording of these surgeries and complications within The Health Improvement Network (THIN), a primary care database within the UK. Methods Individuals within THIN from 1995 to 2011 with an incident intra-abdominal surgery and subsequent bowel obstruction (SBO) or adhesiolysis were identified using diagnostic codes. To compute positive predictive values (PPVs), requests were sent to treating physicians of patients with these diagnostic codes to confirm the surgery, SBO, or adhesiolysis code. Completeness of recording was estimated by comparing observed surgical rates within THIN to expected rates derived from the Hospital Episode Statistics dataset within England. Cumulative incidence rates of adhesion-related complications at 5 years were compared with a previously published cohort within Scotland. Results Two hundred seventeen of 245 (89%) questionnaires were returned (180 SBO and 37 adhesiolysis). The PPV of codes for surgery was 94.5% (95%CI: 91–97%). The 88.8% of procedure types were correctly coded. The PPV for SBO and adhesiolysis was 86.1% (95%CI: 80–91%) and 89.2% (95%CI: 75–97%), respectively. Colectomy, appendectomy, and cholecystectomy rates within THIN were 99%, 95%, and 84% of rates observed in national Hospital Episode Statistics data, respectively. Cumulative incidence rates of adhesion related complications following colectomy, appendectomy, and small bowel surgery were similar to those published previously. Conclusions Surgical procedures, SBO, and adhesiolysis can be accurately identified within THIN using diagnostic codes. THIN represents a new tool for assessing patient-specific risk factors for adhesion-related complications and long-term outcomes. Copyright © 2016 John Wiley & Sons, Ltd.

    Notes:

    • The accuracy of identifying adhesion-related complications (small bowel obstruction and lysis of adhesions) and intra-abdominal surgeries using diagnostic codes in large healthcare databases has not been well characterized.  The authors used the UK primary care database The Health Improvement Network (THIN) and the Hospital Episode Statistics dataset to test the validity of coding for intra-abdominal surgeries and related complications.  The positive predictive values exceeded 80% for these conditions, suggesting that researchers can use the THIN database to accurately identify intra-abdominal surgeries and adhesion-related complications.

  • Validation of Case Finding Algorithms for Hepatocellular Cancer From Administrative Data and Electronic Health Records Using Natural Language Processing

    Author Yvonne Sada
    Author Jason Hou
    Author Peter Richardson
    Author Hashem El-Serag
    Author Jessica Davila
    Volume 54
    Issue 2
    Pages e9-e14
    Publication Medical Care
    Date Feb 2016
    Abstract BACKGROUND: Accurate identification of hepatocellular cancer (HCC) cases from automated data is needed for efficient and valid quality improvement initiatives and research. We validated HCC International Classification of Diseases, 9th Revision (ICD-9) codes, and evaluated whether natural language processing by the Automated Retrieval Console (ARC) for document classification improves HCC identification. METHODS: We identified a cohort of patients with ICD-9 codes for HCC during 2005-2010 from Veterans Affairs administrative data. Pathology and radiology reports were reviewed to confirm HCC. The positive predictive value (PPV), sensitivity, and specificity of ICD-9 codes were calculated. A split validation study of pathology and radiology reports was performed to develop and validate ARC algorithms. Reports were manually classified as diagnostic of HCC or not. ARC generated document classification algorithms using the Clinical Text Analysis and Knowledge Extraction System. ARC performance was compared with manual classification. PPV, sensitivity, and specificity of ARC were calculated. RESULTS: A total of 1138 patients with HCC were identified by ICD-9 codes. On the basis of manual review, 773 had HCC. The HCC ICD-9 code algorithm had a PPV of 0.67, sensitivity of 0.95, and specificity of 0.93. For a random subset of 619 patients, we identified 471 pathology reports for 323 patients and 943 radiology reports for 557 patients. The pathology ARC algorithm had PPV of 0.96, sensitivity of 0.96, and specificity of 0.97. The radiology ARC algorithm had PPV of 0.75, sensitivity of 0.94, and specificity of 0.68. CONCLUSIONS: A combined approach of ICD-9 codes and natural language processing of pathology and radiology reports improves HCC case identification in automated data.

    Notes:

    • Obtaining accurate information from the unstructured text of electronic health records (EHR) for research remains a challenge.  The authors conducted a validation study of identifying hepatocellular cancer (HCC) cases using Veterans Affairs administrative claims and EHR data.  To identify HCC, International Classification of Diseases, 9th Revision (ICD-9) codes were obtained from claims data and combined with data extracted from pathology and radiology reports in the EHR using natural language processing (NLP) by the Automated Retrieval Console (ARC).  ARC is NLP-based software that allows researchers without programming expertise to conduct NLP-assisted document-level classification.  The ICD-9 method using two ICD-9 codes and no other malignancy resulted in a positive predictive value (PPV), sensitivity, and specificity of 67%, 95%, and 93%, respectively, while NLP using pathology reports from the EHR resulted in a PPV, sensitivity, and specificity of 96%, 96%, and 97%, respectively.  Radiology reports did not perform as well.  This research is of interest to researchers attempting to identify HCC from administrative and/or EHR databases and also serves as a methods paper for other validation research using EHR data.

  • Validation of International Classification of Diseases coding for bone metastases in electronic health records using technology-enabled abstraction

    Author Alexander Liede
    Author Rohini Khorana Hernandez
    Author Maayan Roth
    Author Geoffrey Calkins
    Author Leonardo Nicacio
    Author Katherine Larrabee
    Pages 441
    Publication Clinical Epidemiology
    Date 11/2015

    Notes:

    • The authors evaluated the validity of using ICD-9 code 198.5 to identify cases of bone metastases among breast cancer patients.  They compared abstraction from structured electronic health records (EHR) versus administrative claims, with a technology-enabled chart abstraction method applied to the EHR as the reference standard.  The technology-enabled abstraction method takes a modular approach in which software was developed to collect only specific data elements (e.g., dates of bone metastasis) across many patients’ charts instead of reviewing any individual patient chart in its entirety.  The sensitivity, specificity, positive predictive value, and negative predictive value were 0.67, 0.98, 0.71, and 0.98, respectively, when using structured EHR data, and 0.78, 0.98, 0.72, and 0.99 when using administrative claims data (a worked sketch of these metrics follows this entry).  The authors state that technology-enabled chart abstraction of unstructured EHR data greatly improves data quality by minimizing false negatives. This article may be of interest to those wishing to identify bone metastases among patients with breast cancer, as well as researchers conducting validation studies using EHR data.
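
    The validation metrics quoted above all come from a 2x2 table of algorithm classification against the chart-abstraction reference standard. A short worked sketch with hypothetical counts (not the study's data):

```python
def validation_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# e.g. 78 true positives, 30 false positives, 22 false negatives, 870 true negatives
print(validation_metrics(78, 30, 22, 870))
```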

  • Validity of ankylosing spondylitis diagnoses in The Health Improvement Network

    Author Maureen Dubreuil
    Author Christine Peloquin
    Author Yuqing Zhang
    Author Hyon K. Choi
    Author Robert D. Inman
    Author Tuhina Neogi
    Volume 25
    Issue 4
    Pages 399-404
    Publication Pharmacoepidemiology and Drug Safety
    Date April 1, 2016
    Abstract Background/Purpose Because ankylosing spondylitis (AS) is uncommon, large medical record databases offer important opportunities for pharmacoepidemiologic research. However, the validity of AS diagnoses recorded by a general practitioner (GP) is unknown. We assessed the validity of algorithms for identifying AS in The Health Improvement Network (THIN). Methods THIN is a database of GP records for over 10 million persons in the UK. In 2014, we administered a questionnaire to GPs of 100 adults for whom an AS diagnosis had been recorded. As high positive predictive value (PPV) is critically important in AS research, we sought to determine the PPV of an AS diagnostic code relative to the GP's clinical impression as the gold standard. Other AS algorithms included: more than one AS diagnostic code, prescription of a nonsteroidal anti-inflammatory drug (NSAID), disease modifying anti-rheumatic drug (DMARD) or biologic. Results In 61 of 85 returned questionnaires, the GP's clinical impression confirmed AS yielding an overall PPV of 72%. PPV was 89% for two AS codes >7 days apart, and was 86% for an AS code plus a DMARD/biologic. Sensitivity was reduced with algorithms requiring two AS codes (64%) and a DMARD/biologic prescription (30%). Algorithms requiring prescription of an NSAID, or the absence of OA or RA had lower PPV (71–75%) and higher sensitivity (95–98%). Conclusion An AS identification algorithm of two AS diagnoses separated by >7 days provided the highest PPV. This algorithm should be used for pharmacoepidemiologic studies in THIN. Copyright © 2016 John Wiley & Sons, Ltd.

    Notes:

    • Electronic health record (EHR) databases are well suited for pharmacoepidemiologic research in ankylosing spondylitis (AS), an uncommon chronic inflammation of the skeleton, due to their longitudinal nature and an AS disease process that may span many years. However, the validity of AS diagnoses recorded in EHRs is unknown. The authors of this study used the UK’s Health Improvement Network EHR database to validate several AS diagnostic algorithms, using general practitioners’ clinical impression of the patient (assessed by questionnaire) as the gold standard. The algorithm using at least one AS diagnostic code was found to have a positive predictive value (PPV) of 71.8%, while two AS diagnostic codes at least seven days apart had a PPV of 88.6% and at least one AS diagnostic code in combination with a disease-modifying antirheumatic drug (DMARD) or biologic had a PPV of 85.7%. Other algorithms tested had PPVs between 72% and 75%. The algorithm requiring two AS diagnoses resulted in a sensitivity of 64% while the algorithm requiring a DMARD or biologic had a sensitivity of 30%. The results of this study demonstrate that use of diagnostic algorithms in the THIN EHR database provides relatively valid results for identification of AS patients. An algorithm using two AS diagnostic codes at least 7 days apart appears to provide the best tradeoff between PPV and sensitivity, and may be used for patient identification in studies using EHR data. Analogous algorithms could also be explored for other rare chronic diseases and in other EHR databases for use in pharmacoepidemiologic research.

  • Validity of diagnostic codes and laboratory measurements to identify patients with idiopathic acute liver injury in a hospital database

    Author Renate Udo
    Author Anke H. Maitland-van der Zee
    Author Toine C. G. Egberts
    Author Johanna H. den Breeijen
    Author Hubert G. M. Leufkens
    Author Wouter W. van Solinge
    Author Marie L. De Bruin
    Pages n/a-n/a
    Publication Pharmacoepidemiology and Drug Safety
    Date June 1, 2015
    Abstract Purpose The development and validation of algorithms to identify cases of idiopathic acute liver injury (ALI) are essential to facilitate epidemiologic studies on drug-induced liver injury. The aim of this study is to determine the ability of diagnostic codes and laboratory measurements to identify idiopathic ALI cases. Methods In this cross-sectional validation study, patients were selected from the hospital-based Utrecht Patient Oriented Database between 2008 and 2010. Patients were identified using (I) algorithms based on ICD-9-CM codes indicative of idiopathic ALI combined with sets of liver enzyme values (ALT > 2× upper limit of normal (ULN); AST > 1ULN + AP > 1ULN + bilirubin > 1ULN; ALT > 3ULN; ALT > 3ULN + bilirubin > 2ULN; ALT > 10ULN) and (II) algorithms based on solely liver enzyme values (ALT > 3ULN + bilirubin > 2ULN; ALT > 10ULN). Hospital medical records were reviewed to confirm final diagnosis. The positive predictive value (PPV) of each algorithm was calculated. Results A total of 707 cases of ALI were identified. After medical review 194 (27%) patients had confirmed idiopathic ALI. The PPV for (I) algorithms with an ICD-9-CM code as well as abnormal tests ranged from 32% (13/41) to 48% (43/90) with the highest PPV found with ALT > 2ULN. The PPV for (II) algorithms with liver test abnormalities was maximally 26% (150/571). Conclusions The algorithm based on ICD-9-CM codes indicative of ALI combined with abnormal liver-related laboratory tests is the most efficient algorithm for identifying idiopathic ALI cases. However, cases were missed using this algorithm, because not all ALI cases had been assigned the relevant diagnostic codes in daily practice. Copyright © 2015 John Wiley & Sons, Ltd.

    Notes:

    • The study of drug-induced liver injury using routinely collected healthcare data depends on the ability to accurately detect a diagnosis of idiopathic acute liver injury (ALI). There is currently limited information available on the validity of code- and laboratory-based algorithms for the detection of idiopathic ALI. The authors of this study assessed algorithms for the detection of idiopathic ALI in a Dutch hospital database (the Utrecht Patient Oriented Database) using liver-related laboratory tests (i.e., AST, ALT, alkaline phosphatase, and bilirubin), alone or in combination with ICD-9 diagnostic codes indicative of liver injury on a hospital discharge. Identified idiopathic ALI cases were confirmed by medical record review, and positive predictive values (PPVs) were calculated for each algorithm. Algorithms using laboratory measurements alone identified a larger number of true idiopathic ALI patients in the study sample compared to the combined algorithms (163 vs. 46) but resulted in a substantially lower average PPV (26% vs. 47%). With both sets of algorithms, more non-idiopathic ALI cases were identified than idiopathic cases, and no individual algorithm had a PPV greater than 48%. This study demonstrates that the tested algorithms for identifying idiopathic ALI based on liver-related laboratory tests and ICD-9 diagnostic codes produce a large number of false positives. Without further validation, these algorithms would thus be suboptimal for pharmacoepidemiologists interested in conducting research on drug-related liver injury using secondary healthcare data. Further algorithm refinement and exploration of other methods are needed to accurately identify this patient population for use in drug safety research.

  • Validity of Race, Ethnicity, and National Origin in Population-based Cancer Registries and Rapid Case Ascertainment Enhanced With a Spanish Surname List

    Author Lisa C. Clarke
    Author Rudolph P. Rull
    Author John Z. Ayanian
    Author Robert Boer
    Author Dennis Deapen
    Author Dee W. West
    Author Katherine L. Kahn
    Publication Medical Care
    Date Aug 9, 2013
    Abstract BACKGROUND:: Accurate information regarding race, ethnicity, and national origins is critical for identifying disparities in the cancer burden. OBJECTIVES:: To examine the use of a Spanish surname list to improve the quality of race-related information obtained from rapid case ascertainment (RCA) and to estimate the accuracy of race-related information obtained from cancer registry records collected by routine reporting. SUBJECTS:: Self-reported survey responses of 3954 participants from California enrolled in the Cancer Care Outcomes Research and Surveillance Consortium. MEASURES:: Sensitivity, specificity, positive predictive value, and percent agreement. We used logistic regression to identify predictors of underreporting and overreporting of a race/ethnicity. RESULTS:: Use of the Spanish surname list increased the sensitivity of RCA for Latino ethnicity from 37% to 83%. Sensitivity for cancer registry records collected by routine reporting was ≥95% for whites, blacks, and Asians, and specificity was high for all groups (86%-100%). However, patterns of misclassification by race/ethnicity were found that could lead to biased cancer statistics for specific race/ethnicities. Discordance between self-reported and registry-reported race/ethnicity was more likely for women, Latinos, and Asians. CONCLUSIONS:: Methods to improve race and ethnicity data, such as using Spanish surnames in RCA and instituting data collection guidelines for hospitals, are needed to ensure minorities are accurately represented in clinical and epidemiological research.

    Notes:

    • Misclassification and missing data are important problems in observational studies when the population of interest is a diverse, multiethnic population. Accurate demographic information is essential for identifying disparities in the cancer burden. Rapid Case Ascertainment (RCA), which identifies cancer cases soon after diagnosis, is widely used by US cancer registries. However, data captured through RCA are not quality controlled and often have incomplete demographic information and misidentification of American Indian/Alaska Native race and Latino (or Hispanic) ethnicity. There is potential for such reporting to be improved by using surname lists.

      The authors evaluated RCA supplemented with a Spanish surname list and compared it with the race, ethnicity, and national origin information routinely reported to cancer registries. Both approaches used self-reported race/ethnicity as the gold standard.

      The Cancer Care Outcomes Research and Surveillance (CanCORS) Consortium was used as the reference database for this study. During baseline telephone interviews, trained interviewers collected key self-reported data on patient demographics, treatment, health history, and other characteristics. Race/ethnicity was initially obtained using RCA and later assessed during the telephone interview. For patients for whom racial information had not been collected, last name was used to assist in the identification of race/ethnicity.

      Using self-reported race/ethnicity as the reference or gold standard, RCA supplemented with a Spanish surname list increased the sensitivity of recorded Latino ethnicity from 37% to 83% and decreased specificity from 98% to 95%. Specificity also increased for other races. In general, agreement of race/ethnicity between self-report and cancer registry records was good, but it was quite low for some Asian/Pacific Islander and Latino origins and, especially, for American Indian/Alaska Native race. Discordance between self-reported and registry-reported race/ethnicity was more likely for women, Latinos, and Asians.

      The findings of the study provide a strong foundation for pharmacoepidemiology investigators using RCA and registry data to study cancer-related disparities. Patterns of misclassification by race/ethnicity can lead to biased cancer statistics for specific race/ethnicities. The results and methodologies reported could be used to standardize the collection of race and ethnicity data and thereby address racial and ethnic disparities, especially in the quality of cancer care and cancer outcomes.

  • WebDISCO: a web service for distributed cox model learning without patient-level data sharing

    Author Chia-Lun Lu
    Author Shuang Wang
    Author Zhanglong Ji
    Author Yuan Wu
    Author Li Xiong
    Author Xiaoqian Jiang
    Author Lucila Ohno-Machado
    Volume 22
    Issue 6
    Pages 1212-1219
    Publication Journal of the American Medical Informatics Association
    Date 2015/11/01
    Abstract Objective The Cox proportional hazards model is a widely used method for analyzing survival data. To achieve sufficient statistical power in a survival analysis, it usually requires a large amount of data. Data sharing across institutions could be a potential workaround for providing this added power. Methods and materials The authors develop a web service for distributed Cox model learning (WebDISCO), which focuses on the proof-of-concept and algorithm development for federated survival analysis. The sensitive patient-level data can be processed locally and only the less-sensitive intermediate statistics are exchanged to build a global Cox model. Mathematical derivation shows that the proposed distributed algorithm is identical to the centralized Cox model. Results The authors evaluated the proposed framework at the University of California, San Diego (UCSD), Emory, and Duke. The experimental results show that both distributed and centralized models result in near-identical model coefficients with differences in the range 10^-15 to 10^-12. The results confirm the mathematical derivation and show that the implementation of the distributed model can achieve the same results as the centralized implementation. Limitation The proposed method serves as a proof of concept, in which a publicly available dataset was used to evaluate the performance. The authors do not intend to suggest that this method can resolve policy and engineering issues related to the federated use of institutional data, but they should serve as evidence of the technical feasibility of the proposed approach. Conclusions WebDISCO (Web-based Distributed Cox Regression Model; https://webdisco.ucsd-dbmi.org:8443/cox/) provides a proof-of-concept web service that implements a distributed algorithm to conduct distributed survival analysis without sharing patient level data.
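
    The key exchange in such a federated Cox fit (sites share only aggregate risk-set statistics at the shared event times, and a central server runs Newton-Raphson on the Breslow partial likelihood) can be sketched with numpy. The simulation, site structure, and variable names below are invented for illustration; this is not the WebDISCO code or web service.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_site(n, beta_true):
    """Simulate one site's private data: covariates, follow-up time, event flag."""
    X = rng.normal(size=(n, len(beta_true)))
    T = rng.exponential(1.0 / np.exp(X @ beta_true))   # event times
    C = rng.uniform(0.5, 3.0, size=n)                   # censoring times
    time = np.minimum(T, C)
    event = (T <= C).astype(int)
    return X, time, event

def local_statistics(X, time, event, event_times, beta):
    """Aggregates a site would share at each shared event time:
    S0 = sum exp(x'b), S1 = sum x exp(x'b), S2 = sum x x' exp(x'b) over the
    local risk set, plus the covariate sum and count of local events."""
    w = np.exp(X @ beta)
    stats = []
    for t in event_times:
        at_risk = time >= t
        died = (time == t) & (event == 1)
        S0 = w[at_risk].sum()
        S1 = (w[at_risk, None] * X[at_risk]).sum(axis=0)
        S2 = (w[at_risk, None, None] * X[at_risk, :, None] * X[at_risk, None, :]).sum(axis=0)
        stats.append((S0, S1, S2, X[died].sum(axis=0), died.sum()))
    return stats

beta_true = np.array([0.5, -0.3])
sites = [simulate_site(200, beta_true), simulate_site(300, beta_true)]
event_times = np.unique(np.concatenate([t[e == 1] for _, t, e in sites]))

beta = np.zeros(2)
for _ in range(20):  # Newton-Raphson at the central server
    per_site = [local_statistics(X, t, e, event_times, beta) for X, t, e in sites]
    grad = np.zeros(2)
    hess = np.zeros((2, 2))
    for k in range(len(event_times)):
        S0 = sum(s[k][0] for s in per_site)
        S1 = sum(s[k][1] for s in per_site)
        S2 = sum(s[k][2] for s in per_site)
        x_events = sum(s[k][3] for s in per_site)
        d = sum(s[k][4] for s in per_site)
        if d == 0:
            continue
        grad += x_events - d * S1 / S0
        hess -= d * (S2 / S0 - np.outer(S1 / S0, S1 / S0))
    beta = beta - np.linalg.solve(hess, grad)

print("estimated beta:", beta)   # should be close to beta_true
```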

Added June 2016

    Author Patrick Aboagye-Sarfo
    Author Qun Mai
    Author Frank M. Sanfilippo
    Author David B. Preen
    Author Louise M. Stewart
    Author Daniel M. Fatovich
    Volume 57
    Pages 62-73
    Publication Journal of Biomedical Informatics
    Date 2015-10-01
    Abstract Objective To develop multivariate vector-ARMA (VARMA) forecast models for predicting emergency department (ED) demand in Western Australia (WA) and compare them to the benchmark univariate autoregressive moving average (ARMA) and Winters’ models. Methods Seven-year monthly WA state-wide public hospital ED presentation data from 2006/07 to 2012/13 were modelled. Graphical and VARMA modelling methods were used for descriptive analysis and model fitting. The VARMA models were compared to the benchmark univariate ARMA and Winters’ models to determine their accuracy to predict ED demand. The best models were evaluated by using error correction methods for accuracy. Results Descriptive analysis of all the dependent variables showed an increasing pattern of ED use with seasonal trends over time. The VARMA models provided a more precise and accurate forecast with smaller confidence intervals and better measures of accuracy in predicting ED demand in WA than the ARMA and Winters’ method. Conclusion VARMA models are a reliable forecasting method to predict ED demand for strategic planning and resource allocation. While the ARMA models are a closely competing alternative, they under-estimated future ED demand.

  • Notes:
    • In this study, the authors compared the accuracy and validity of multivariate vector autoregressive moving average (VARMA) forecast models against ARMA and Winters’ models to predict emergency department (ED) demand in Western Australia, using time-series analyses of monthly ED visits from state-wide hospitals over a seven-year period.  The dataset was split into test data and validation data to assess the accuracy and validity of the three methods.  The VARMA model allows several dependent time series to be modeled together and accounts for both cross- and within-correlations of the series. Unlike VARMA models, ARMA models only allow one time series to be modeled at a time, while the Winters’ method uses a triple exponential smoothing process. Time series plots are presented for the variables age group, place of treatment, treatment category, and disposition by each method. All three methods were able to forecast the direction and magnitude of ED demand with reasonable monthly forecast errors. The VARMA models performed the best, followed by ARMA, whereas the Winters’ method was less accurate and produced poor forecasts for certain variables. VARMA modeling may be a useful technique for health-related time series because it takes the inter-relationships among all of the series into account, for example, in time series studies of healthcare utilization; a minimal VARMA fitting sketch follows this entry.
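
    A minimal sketch of fitting a VARMA-type model to monthly ED demand series with statsmodels' VARMAX, assuming a DataFrame with one column per subgroup. The column names and simulated series are hypothetical; this is not the authors' exact specification, only an illustration of the model class.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.varmax import VARMAX

rng = np.random.default_rng(1)
idx = pd.date_range("2006-07-01", periods=84, freq="MS")   # 7 years of months
ed = pd.DataFrame({
    "admitted": 5000 + 10 * np.arange(84) + rng.normal(0, 150, 84),
    "discharged": 9000 + 15 * np.arange(84) + rng.normal(0, 250, 84),
}, index=idx)

model = VARMAX(ed, order=(1, 1), trend="c")   # VARMA(1,1) with a constant term
fit = model.fit(disp=False)
forecast = fit.forecast(steps=12)             # 12-month-ahead ED demand
print(forecast.head())
```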

    Author Zeynep Kalaylioglu
    Author Haydar Demirhan
    Publication Statistical methods in medical research
    Date 2015 Nov 6
    Abstract Joint mixed modeling is an attractive approach for the analysis of a scalar response measured at a primary endpoint and longitudinal measurements on a covariate. In the standard Bayesian analysis of these models, measurement error variance and the variance/covariance of random effects are a priori modeled independently. The key point is that these variances cannot be assumed independent given the total variation in a response. This article presents a joint Bayesian analysis in which these variance terms are a priori modeled jointly. Simulations illustrate that analysis with multivariate variance prior in general lead to reduced bias (smaller relative bias) and improved efficiency (smaller interquartile range) in the posterior inference compared with the analysis with independent variance priors.

  • Notes:
    • In a standard approach, when a primary endpoint and longitudinal measurements on a covariate are modeled jointly, the measurement error variance and the variance/covariance of the random effects are modeled independently a priori, but these variances cannot be independent given the total variation in a response. This study presents a Bayesian approach in which the variance terms are modeled jointly. The Bayesian approach incorporated prior knowledge, and distributional assumptions, about coefficients for cross-sectional covariates and unknown regression coefficients for random effects, in addition to a traditional random effect model. For the joint Bayesian analysis, prior distributions were assigned to the decomposed variance-covariance matrix using the Cholesky method. A simulation showed that the joint prior approach led to smaller relative bias and a narrower interquartile range than the standard approach. The approach was applied to analyze the association between HbA1c and obstetric labor complications among women with diabetes mellitus; overall, the results from the joint variance prior and the independent variance priors were generally similar. For a large variance-covariance matrix, computational complexities may arise. Nevertheless, with reduced bias and improved efficiency, the proposed approach is suitable when the dimension of the random/latent terms is moderate.

    Author D. Leann Long
    Author John S. Preisser
    Author Amy H. Herring
    Author Carol E. Golin
    Volume 64
    Issue 5
    Pages 815-830
    Publication Journal of the Royal Statistical Society: Series C (Applied Statistics)
    Date November 1, 2015
    Abstract Public health research often concerns relationships between exposures and correlated count outcomes. When counts exhibit more 0s than expected under Poisson sampling, the zero-inflated Poisson (ZIP) model with random effects may be used. However, the latent class formulation of the ZIP model can make marginal inference on the population sampled challenging. The paper presents a marginalized ZIP model with random effects to model directly the mean of the mixture distribution consisting of ‘susceptible’ individuals and excess 0s, providing straightforward inference for overall exposure effects. Simulations evaluate finite sample properties, and the new methods are applied to a motivational interviewing-based safer sex intervention trial, designed to reduce the number of unprotected sexual acts, to illustrate the new methods.

  • Notes:
    • In exposure-outcome studies where the outcome is correlated count data (e.g., repeat measures on the same subject), the counts can sometimes have more zeros than expected under the Poisson distribution. When this occurs, the zero-inflated Poisson (ZIP) model with random effects may be used; however, results can be difficult to interpret since it produces subject-specific rather than marginal estimates. The authors propose using a marginalized ZIP model with random effects instead, so that more straightforward inference for overall exposure effects can be made. A simulation study illustrated the properties of the marginalized ZIP model with random effects, and the methods were applied to data from the SafeTalk trial, a clinical study that aimed to reduce the number of unprotected sexual acts (measured at three time points during follow-up) by HIV-positive patients through a motivational-interviewing safer-sex intervention. The authors noted that the marginalized ZIP model produced lower relative median bias in most cases as well as appropriate coverage, while the ZIP model had much smaller coverage probabilities, indicating that it underestimated the variability due to the excess zeros. A small sketch of a standard ZIP fit follows this entry.
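
    A small sketch of a zero-inflated Poisson fit with statsmodels, to illustrate the kind of outcome (excess zeros in counts) discussed above. The marginalized ZIP with random effects proposed by the authors is not available in statsmodels; the standard ZIP below, on simulated data with an invented treatment indicator, is shown only for orientation.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedPoisson

rng = np.random.default_rng(2)
n = 500
treat = rng.integers(0, 2, n)                 # hypothetical intervention indicator
mu = np.exp(1.0 - 0.4 * treat)                # Poisson mean among "susceptible" subjects
always_zero = rng.random(n) < 0.3             # structural (excess) zeros
y = np.where(always_zero, 0, rng.poisson(mu))

X = sm.add_constant(treat)
# intercept-only inflation part (column of ones), logit link by default
zip_fit = ZeroInflatedPoisson(y, X, exog_infl=np.ones((n, 1))).fit(disp=False)
print(zip_fit.summary())
```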

    Author David E. Clark
    Author Kaitlin R. Ostrander
    Author Brad M. Cushing
    Publication Health services research
    Date 2015 Oct 20
    Abstract OBJECTIVE: Simultaneously evaluate postoperative mortality, length of stay (LOS), and readmission. DATA SOURCE: National Surgical Quality Improvement Program (NSQIP). DESIGN: Retrospective cohort. METHODS: Data from elective general surgical patients were obtained from the 2012 NSQIP Participant Use File. For each postoperative day, each patient's state was classified as index hospitalization, discharged home, discharged to long-term care (LTC), readmitted, or dead. Transition rates were estimated using exponential regression, assuming constant rates for specified time periods. These estimates were combined into a multistate model, simulated results of which were compared to observed outcomes. FINDINGS: Age, comorbidities, more complex procedures, and longer index LOS were associated with lower rates of discharge home and higher rates of death, discharge to LTC, and readmission. The longer patients had been discharged, the less likely they were to die or be readmitted. The model predicted 30-day mortality 0.38 percent (95 percent CI: 0.36-0.41), index LOS 2.85 days (95 percent CI: 2.83-2.86), LTC discharge 2.76 percent (95 percent CI: 2.69-2.82), and readmissions 5.53 percent (95 percent CI: 5.43-5.62); observed values were 0.39 percent, 2.82 days, 2.87 percent, and 5.70 percent, respectively. CONCLUSIONS: Multistate models can simultaneously predict postoperative mortality, LOS, discharge destination, and readmissions, which allows multidimensional comparison of surgical outcomes.

  • Notes:
    • In this study, the researchers developed a "multistate" or "compartmental" model that can simultaneously estimate the effects of covariates on each transition of a patient's postoperative status, and thus enable prediction of cumulative outcomes (mortality, LOS, and readmission) that reflect overall health and cost. The multistate model was constructed by combining the estimates obtained from "piecewise exponential" (PEW) models for each state transition. Compared to separate analyses of each outcome, this approach offers the potential for a more useful multidimensional comparison of surgical outcomes for different hospitals and regions.
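
    One building block of this approach, a piecewise exponential ("PEW") model for a single transition, can be fitted as a Poisson GLM with the log of person-time as an offset. The sketch below uses simulated data and hypothetical day cutpoints; the full multistate simulation in the paper is not reproduced.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 2000
age = rng.normal(65, 10, n)
rate = np.exp(-1.0 + 0.02 * (age - 65))          # true transition rate depends on age
time_to_transition = rng.exponential(1.0 / rate)

# split follow-up into postoperative-day intervals 0-2, 2-5, 5+ (piecewise constant rate)
cuts = [0, 2, 5, np.inf]
rows = []
for t, a in zip(time_to_transition, age):
    for lo, hi in zip(cuts[:-1], cuts[1:]):
        if t <= lo:
            break
        rows.append({"age": a, "interval": f"{lo}-{hi}",
                     "event": int(lo < t <= hi),
                     "exposure": min(t, hi) - lo})
long = pd.DataFrame(rows)

X = pd.get_dummies(long[["age", "interval"]], drop_first=True).astype(float)
X = sm.add_constant(X)
glm = sm.GLM(long["event"], X, family=sm.families.Poisson(),
             offset=np.log(long["exposure"]))
print(glm.fit().summary())
```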

    Author Julian Wolfson
    Author Sunayan Bandyopadhyay
    Author Mohamed Elidrisi
    Author Gabriela Vazquez-Benitez
    Author David M. Vock
    Author Donald Musgrove
    Author Gediminas Adomavicius
    Author Paul E. Johnson
    Author Patrick J. O'Connor
    Volume 34
    Issue 21
    Pages 2941-2957
    Publication Statistics in Medicine
    Date September 20, 2015
    Abstract Predicting an individual's risk of experiencing a future clinical outcome is a statistical task with important consequences for both practicing clinicians and public health experts. Modern observational databases such as electronic health records provide an alternative to the longitudinal cohort studies traditionally used to construct risk models, bringing with them both opportunities and challenges. Large sample sizes and detailed covariate histories enable the use of sophisticated machine learning techniques to uncover complex associations and interactions, but observational databases are often ‘messy’, with high levels of missing data and incomplete patient follow-up. In this paper, we propose an adaptation of the well-known Naive Bayes machine learning approach to time-to-event outcomes subject to censoring. We compare the predictive performance of our method with the Cox proportional hazards model which is commonly used for risk prediction in healthcare populations, and illustrate its application to prediction of cardiovascular risk using an electronic health record dataset from a large Midwest integrated healthcare system. Copyright © 2015 John Wiley & Sons, Ltd.

  • Notes:
    • Longitudinal cohort studies (e.g., the Framingham Heart Study) have traditionally been used to construct risk models. However, such studies are costly, may be limited in their ability to describe unique features of contemporary patient populations, and may struggle to predict risk over shorter time horizons when the event rate is low. An alternative is to use electronic health data for risk prediction. This paper is important because it potentially addresses the limitations of modern observational databases: messy data with high levels of missing data and incomplete follow-up. The researchers proposed an adaptation of the Naive Bayes machine learning approach to time-to-event outcomes subject to censoring. They also compared the predictive performance of this method with the Cox proportional hazards model and illustrated its application to predicting cardiovascular risk using electronic health record data; a rough illustration of that comparison, under strong simplifying assumptions, follows this entry.
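
    The comparison can be illustrated, under strong simplifying assumptions, by predicting a fixed-horizon risk with a Cox model (lifelines) versus a plain Gaussian naive Bayes classifier. The authors' censoring-aware adaptation of naive Bayes is not implemented here; censoring is simply avoided by simulating complete follow-up, and all variable names are invented.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
n = 3000
X = pd.DataFrame({"age": rng.normal(60, 10, n), "sbp": rng.normal(130, 15, n)})
hazard = np.exp(0.04 * (X["age"] - 60) + 0.02 * (X["sbp"] - 130))
time = rng.exponential((20 / hazard).to_numpy())   # years to event, fully observed
df = X.assign(time=time, event=1)

train, test = df.iloc[:2000], df.iloc[2000:]

# Cox model risk score (higher partial hazard = higher risk)
cox = CoxPHFitter().fit(train, duration_col="time", event_col="event")
cox_risk = cox.predict_partial_hazard(test[["age", "sbp"]])

# naive Bayes trained on the binary "event within 5 years" outcome
nb = GaussianNB().fit(train[["age", "sbp"]], train["time"] < 5)
nb_risk = nb.predict_proba(test[["age", "sbp"]])[:, 1]

y5 = test["time"] < 5
print("Cox AUC:", roc_auc_score(y5, cox_risk), "NB AUC:", roc_auc_score(y5, nb_risk))
```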

    Author Emmanuel Caruana
    Author Sylvie Chevret
    Author Matthieu Resche-Rigon
    Author Romain Pirracchio
    Publication Journal of Clinical Epidemiology
    Date May 1, 2015
    Abstract OBJECTIVES: The propensity score (PS) is a balancing score. Following PS matching, balance checking usually relies on estimating separately the standardized absolute mean difference for each baseline characteristic. The average standardized absolute mean difference and the Mahalanobis distances have been proposed to summarize the information across the covariates. However, they might be minimized when nondesirable variables such as instrumental variables (IV) are included in the PS model. We propose a new weighted summary balance measure that takes into account, for each covariate, its strength of association with the outcome. STUDY DESIGN AND SETTING: This new measure was evaluated using a simulation study to assess whether minimization of the measure coincided with minimally biased estimates. All measures were then applied to a real data set from an observational cohort study. RESULTS: Contrarily to the other measures, our proposal was minimized when including the confounders, which coincided with minimal bias and mean squared error, but increased when including an IV in the PS model. Similar findings were observed in the real data set. CONCLUSION: A balance measure taking into account the strength of association between the covariates and the outcome may be helpful to identify the most parsimonious PS model.

  • Notes:
    • This paper details a method for selecting the most pertinent variables to match upon when using propensity score matching, proposed as a safeguard against non-parsimonious models and against the inclusion of instrumental variables in the propensity model.  In the proposed method, each standardized mean difference is weighted by its prognostic value, and an estimator is calculated by summing across variables (equation 1). The prognostic value is calculated by multiplying the maximum likelihood coefficient of the variable in an outcome model by the standard deviation of the variable in the original cohort (equation 2).  The model with the minimized weighted value is considered the most efficient. This method was evaluated in both a simulated and a real dataset.  For the simulated dataset, propensity score matching performance was evaluated on bias, balance of covariates, and variance of the balance measures; the propensity score model predicted to be the most efficient performed most favorably on these characteristics.  For the real dataset, two propensity scoring strategies minimized the weighted value, and the best model varied depending on the choice of balance measure. This method could be useful in choosing variables for propensity score matching; however, as seen in the real-data example, the choice of model may not be clear-cut. The authors recommend combining the balance measure with clinical understanding of the disease state to choose an appropriate model; a small sketch of the weighted balance measure follows this entry.
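
    A minimal sketch of the weighted summary balance measure described above: each covariate's absolute standardized mean difference after matching is weighted by a prognostic value, taken here as the absolute outcome-model coefficient times the covariate's standard deviation in the original cohort. The logistic outcome model, DataFrame layout, and column names are assumptions for illustration, not the authors' code.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def weighted_balance(matched, original, covariates, treatment, outcome):
    """Weighted sum of absolute standardized mean differences in the matched cohort.

    matched, original: DataFrames with the covariates, treatment, and outcome columns.
    """
    # prognostic weights from an outcome model fitted on the original cohort
    X = sm.add_constant(original[covariates])
    coefs = sm.Logit(original[outcome], X).fit(disp=False).params[covariates]
    weights = coefs.abs() * original[covariates].std()

    treated = matched[matched[treatment] == 1]
    control = matched[matched[treatment] == 0]
    pooled_sd = np.sqrt((treated[covariates].var() + control[covariates].var()) / 2)
    smd = (treated[covariates].mean() - control[covariates].mean()).abs() / pooled_sd
    return float((weights * smd).sum())
```

    The candidate propensity score model yielding the smallest value of this measure would be taken as the most efficient under the authors' proposal.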

    Author Alden L. Gross
    Author Melinda C. Power
    Author Marilyn S. Albert
    Author Jennifer A. Deal
    Author Rebecca F. Gottesman
    Author Michael Griswold
    Author Lisa M. Wruck
    Author Thomas H. Mosley
    Author Josef Coresh
    Author A. Richey Sharrett
    Author Karen Bandeen-Roche
    Volume 26
    Issue 6
    Pages 878-887
    Publication Epidemiology
    Date 11/2015

  • Notes:
    • When outcome measurement tools have changed or have been updated over time, traditional approaches cannot properly analyze longitudinally collected data. This study proposed latent variable methods, using factor analysis, to link different cognitive test results that have been collected longitudinally. The study analyzed data from the Atherosclerosis Risk in Communities Neurocognitive Study (ARIC NCS), whose primary objectives included measuring vascular risk factors and cognitive decline. The same set of tests was performed during the first and second data collection periods, but an updated version was administered during the third data collection phase. For the longitudinal cognitive decline analysis, the authors performed factor analysis structuring common covariation. Incorporating the test results from the third data collection period was achieved in two ways: an average of standardized scores (i.e., a Z-score-based approach) and confirmatory factor analysis (i.e., a latent variable approach). The errors in measuring cognitive decline and the associations between the cognitive decline estimates and diabetes were compared between the two approaches. Overall, the latent variable approach accounted for errors in measuring cognitive traits better than the average of standardized scores, and the estimated association between diabetes and cognitive function change was more accurate with the latent variable approach. The method could be used to analyze other longitudinally collected patient-reported outcome measures where the actual instrument changes over time; a brief sketch contrasting the two scoring approaches follows this entry.
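
    A brief sketch contrasting the two scoring approaches on simulated test data: an average of standardized scores versus a score from a one-factor model. The test names and data-generating assumptions are hypothetical; this is not the ARIC NCS analysis itself, and the exploratory factor model below stands in for the confirmatory model used by the authors.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(4)
n = 1000
ability = rng.normal(size=n)                 # latent cognitive trait
tests = pd.DataFrame({
    "word_recall": 0.8 * ability + rng.normal(0, 0.6, n),
    "digit_symbol": 0.7 * ability + rng.normal(0, 0.7, n),
    "fluency": 0.6 * ability + rng.normal(0, 0.8, n),
})

# (1) average of standardized scores
z_composite = ((tests - tests.mean()) / tests.std()).mean(axis=1)

# (2) one-factor latent variable score
factor_score = FactorAnalysis(n_components=1, random_state=0).fit_transform(tests).ravel()

print("corr with latent trait:",
      np.corrcoef(z_composite, ability)[0, 1],
      np.corrcoef(factor_score, ability)[0, 1])
```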

    Author Qifan Song
    Author Faming Liang
    Volume 77
    Issue 5
    Pages 947-972
    Publication Journal of the Royal Statistical Society: Series B (Statistical Methodology)
    Date November 1, 2015
    Abstract We propose a Bayesian variable selection approach for ultrahigh dimensional linear regression based on the strategy of split and merge. The approach proposed consists of two stages: split the ultrahigh dimensional data set into a number of lower dimensional subsets and select relevant variables from each of the subsets, and aggregate the variables selected from each subset and then select relevant variables from the aggregated data set. Since the approach proposed has an embarrassingly parallel structure, it can be easily implemented in a parallel architecture and applied to big data problems with millions or more of explanatory variables. Under mild conditions, we show that the approach proposed is consistent, i.e. the true explanatory variables can be correctly identified by the approach as the sample size becomes large. Extensive comparisons of the approach proposed have been made with penalized likelihood approaches, such as the lasso, elastic net, sure independence screening and iterative sure independence screening. The numerical results show that the approach proposed generally outperforms penalized likelihood approaches: the models selected by the approach tend to be more sparse and closer to the true model.

  • Notes:
    • Variable selection for ultrahigh dimensional regression (i.e., where the number of explanatory variables is significantly larger than the number of subjects) is problematic. The difficulty has previously been handled by minimizing a penalized likelihood function, but several penalization methods do not perform well when the number of potential predictors is very large; the sure independence screening (SIS) approach and iterative SIS (ISIS) address this, yet remain difficult to apply to big data. To overcome these limitations, this study proposed a Bayesian variable selection method based on a split-and-merge (SAM) strategy. The SAM consists of two stages. The first stage splits the ultrahigh dimensional data into multiple lower dimensional subsets in which Bayesian predictor selection is performed; each predictor has a prior probability of being included in the model, and the maximum number of predictor variables must be pre-specified. Conditional on the prior, the posterior probability densities of the responses and the predictors, with the vector of regression coefficients, are calculated. The purpose of this first stage is to screen extraneous variables out of the final model. Variables selected in the first stage are aggregated in the second stage, where Bayesian variable selection is performed again on the aggregated dataset. The SAM approach showed lower false selection rates and negative selection rates with a low prediction dimension compared with SIS and ISIS, demonstrating that SAM generally performs better at selecting true predictors. Further, the SAM approach can partially incorporate joint information across predictors, which cannot be achieved with SIS and ISIS. Ultrahigh dimensional data have been unusual in pharmacoepidemiology, and the benefit of this advanced method may be marginal in a typical retrospective study; for under-explored areas using big data in pharmacoepidemiology, such as single nucleotide polymorphism or metabolic phenotype information, however, SAM has the potential to offer value. A rough frequentist analogue of the split-and-merge idea is sketched after this entry.
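
    A rough frequentist analogue of the split-and-merge idea, using lasso screening within each split instead of the authors' Bayesian posterior variable selection (which is not implemented here): split the columns into blocks, select variables within each block, then re-select on the merged set. Data and block sizes are arbitrary illustrations.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(5)
n, p = 200, 5000
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[[10, 500, 2500]] = [2.0, -1.5, 1.0]           # true predictors
y = X @ beta + rng.normal(size=n)

# stage 1: split predictors into blocks and screen within each block
blocks = np.array_split(np.arange(p), 10)
kept = []
for cols in blocks:
    fit = LassoCV(cv=5).fit(X[:, cols], y)
    kept.extend(cols[np.abs(fit.coef_) > 1e-6])

# stage 2: merge the survivors and select again on the aggregated set
kept = np.array(sorted(kept))
final = LassoCV(cv=5).fit(X[:, kept], y)
selected = kept[np.abs(final.coef_) > 1e-6]
print("selected columns:", selected)
```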

    Author Lai Wei
    Author John Scott
    Volume 24
    Issue 9
    Pages 922-933
    Publication Pharmacoepidemiology and Drug Safety
    Date September 1, 2015
    Abstract Purpose Spontaneous adverse event reporting systems are critical tools for monitoring the safety of licensed medical products. Commonly used signal detection algorithms identify disproportionate product–adverse event pairs and may not be sensitive to more complex potential signals. We sought to develop a computationally tractable multivariate data-mining approach to identify product–multiple adverse event associations. Methods We describe an application of stepwise association rule mining (Step-ARM) to detect potential vaccine-symptom group associations in the US Vaccine Adverse Event Reporting System. Step-ARM identifies strong associations between one vaccine and one or more adverse events. To reduce the number of redundant association rules found by Step-ARM, we also propose a clustering method for the post-processing of association rules. Results In sample applications to a trivalent intradermal inactivated influenza virus vaccine and to measles, mumps, rubella, and varicella (MMRV) vaccine and in simulation studies, we find that Step-ARM can detect a variety of medically coherent potential vaccine-symptom group signals efficiently. In the MMRV example, Step-ARM appears to outperform univariate methods in detecting a known safety signal. Conclusions Our approach is sensitive to potentially complex signals, which may be particularly important when monitoring novel medical countermeasure products such as pandemic influenza vaccines. The post-processing clustering algorithm improves the applicability of the approach as a screening method to identify patterns that may merit further investigation. Copyright © 2015 John Wiley & Sons, Ltd.

  • Notes:
    • For spontaneous reporting systems (SRSs), commonly used signal detection algorithms can only identify bivariate associations between individual products and individual adverse events and may not be sensitive to more complex associations. The authors proposed a stepwise association rule mining (Step-ARM) algorithm that detects potential vaccine-symptom group associations in the US Vaccine Adverse Event Reporting System (VAERS). They also proposed a post-processing algorithm that finds clusters of multi-symptom events associated with the vaccine of interest and addresses the redundant-rule problem of association rule mining (ARM). After applying the method to three data examples, they concluded that Step-ARM is able to place adverse events in a medically coherent context and is appropriately sensitive to more complex signals. The post-processing algorithm significantly reduces the number of redundant association rules and can be used to identify a set of patterns that may merit further investigation. The method could also be applied to other spontaneous reporting systems, using symptom groups of preferred terms defined by researchers in advance; a toy association-rule-mining sketch follows this entry.
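
    A toy sketch of ordinary association rule mining on report-level one-hot data, to show the kind of vaccine-to-symptom-group rules that Step-ARM searches for. The stepwise search and the rule-clustering post-processing proposed by the authors are not reproduced; mlxtend's standard apriori and association_rules functions are used instead, and the vaccine/symptom columns are made up.

```python
import numpy as np
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

rng = np.random.default_rng(6)
n = 5000
vaccine = rng.random(n) < 0.2
fever = np.where(vaccine, rng.random(n) < 0.30, rng.random(n) < 0.05)
rash = np.where(vaccine, rng.random(n) < 0.25, rng.random(n) < 0.05)
headache = rng.random(n) < 0.10

reports = pd.DataFrame({"MMRV": vaccine, "fever": fever,
                        "rash": rash, "headache": headache})

freq = apriori(reports, min_support=0.01, use_colnames=True)
rules = association_rules(freq, metric="confidence", min_threshold=0.2)

# keep rules whose antecedent is the vaccine and whose consequent is a symptom group
mask = rules["antecedents"].apply(lambda s: s == frozenset({"MMRV"}))
print(rules.loc[mask, ["consequents", "support", "confidence", "lift"]])
```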

    Author Kathrine Grell
    Author Peter J. Diggle
    Author Kirsten Frederiksen
    Author Joachim Schüz
    Author Elisabeth Cardis
    Author Per K. Andersen
    Volume 34
    Issue 23
    Pages 3170-3180
    Publication Statistics in Medicine
    Date October 15, 2015
    Abstract We study methods for how to include the spatial distribution of tumours when investigating the relation between brain tumours and the exposure from radio frequency electromagnetic fields caused by mobile phone use. Our suggested point process model is adapted from studies investigating spatial aggregation of a disease around a source of potential hazard in environmental epidemiology, where now the source is the preferred ear of each phone user. In this context, the spatial distribution is a distribution over a sample of patients rather than over multiple disease cases within one geographical area. We show how the distance relation between tumour and phone can be modelled nonparametrically and, with various parametric functions, how covariates can be included in the model and how to test for the effect of distance. To illustrate the models, we apply them to a subset of the data from the Interphone Study, a large multinational case-control study on the association between brain tumours and mobile phone use. Copyright © 2015 John Wiley & Sons, Ltd.

  • Notes:
    • Spatial epidemiology is traditionally used to describe and analyze geographic variation in disease in relation to environmental, demographic, socioeconomic, genetic, and infection-related risk factors. This paper adapts a point process model of this type to the human brain, investigating the spatial distribution of tumors in relation to exposure to radio frequency electromagnetic fields (RF-EMF) emitted by mobile phones; tumor locations are defined in three-dimensional space across multiple patients rather than as multiple tumors within one person.

A total of 478 glioma cases with an estimated single voxel of tumor origin and tumor localization data were included in the study, with the sample split into regular and non-regular phone users. The model assumes that the spatial distribution of RF energy is similar in the two halves of the brain and therefore that the baseline spatial distribution of tumors is similar in both halves. It tests the relation between tumor intensity and distance to the earpoint on the side of phone use, hypothesizing that the effect diminishes as distance from the phone grows. The model is advantageous in that the baseline spatial distribution of tumors drops out of the analysis, and it supports inference when covariates are included. Bootstrapping with replacement was performed using the tumor point data. Covariates include 'phone use', a continuous covariate defined as call time (accounting for hands-free device use).

The model shows that distance to the preferred ear for mobile phone use was associated with brain tumor intensity among subjects who were mobile phone users. Its advantage is that, unlike approaches that localize tumors only to a broad region of the brain, it pinpoints the exact tumor location in relation to mobile phone use.

Author Haftom T. Abebe
Author Frans Es Tan
Author Gerard Jp van Breukelen
Author Martijn Pf Berger
Volume 24
Issue 5
Pages 594-611
Publication Statistical methods in medical research
Date 2015 Oct

Abstract

In medicine and health sciences, binary outcomes are often measured repeatedly to study their change over time. A problem for such studies is that designs with an optimal efficiency for some parameter values may not be efficient for other values. To handle this problem, we propose Bayesian designs which formally account for the uncertainty in the parameter values for a mixed logistic model which allows quadratic changes over time. Bayesian D-optimal allocations of time points are computed for different priors, costs, covariance structures and values of the autocorrelation. Our results show that the optimal number of time points increases with the subject-to-measurement cost ratio, and that neither the optimal number of time points nor the optimal allocations of time points appear to depend strongly on the prior, the covariance structure or on the size of the autocorrelation. It also appears that for subject-to-measurement cost ratios up to five, four equidistant time points, and for larger cost ratios, five or six equidistant time points are highly efficient. Our results are compared with the actual design of a respiratory infection study in Indonesia and it is shown that, selection of a Bayesian optimal design will increase efficiency, especially for small cost ratios.

  • Notes:
    • The authors first point out two concerns in longitudinal studies: the common use of equally spaced measurement times and the assumption that the response changes linearly over time. This study used a Bayesian approach to identify the optimal number and allocation of distinct time points in longitudinal studies with repeated measurement of a binary outcome, taking the cost of measurement into account. Specifically, the authors used Bayesian D-optimal designs for a logistic mixed-effects model with quadratic time effects and autocorrelation between the repeated measurements. They found that estimates from the Bayesian D-optimal designs were not sensitive to different starting points and were hardly affected by the value of the autocorrelation. In addition, the designs generated very similar results under different priors and covariance structures. The optimal number of time points increased as the subject-to-measurement cost ratio increased. Relative efficiency was highest (>0.96) when the number of equidistant time points equaled the optimal number of time points. For a cost ratio ≤5, four equidistant time points were highly efficient, while for a cost ratio >5, five or six equidistant time points were highly efficient (>0.95). Applying the Bayesian D-optimal design to an empirical study, they found that when the cost ratio was small, the six equidistant time points of the original study design lost at most 24% efficiency compared with the Bayesian D-optimal design. The authors concluded that Bayesian D-optimal design is an effective and robust way to determine optimal time points for repeated binary outcome measurement in longitudinal studies.

Author Mehreteab Aregay
Author Andrew B. Lawson
Author Christel Faes
Author Russell S. Kirby
Publication Statistical methods in medical research
Date 2015 Sep 29

Abstract

In disease mapping, a scale effect due to an aggregation of data from a finer resolution level to a coarser level is a common phenomenon. This article addresses this issue using a hierarchical Bayesian modeling framework. We propose four different multiscale models. The first two models use a shared random effect that the finer level inherits from the coarser level. The third model assumes two independent convolution models at the finer and coarser levels. The fourth model applies a convolution model at the finer level, but the relative risk at the coarser level is obtained by aggregating the estimates at the finer level. We compare the models using the deviance information criterion (DIC) and Watanabe-Akaike information criterion (WAIC) that are applied to real and simulated data. The results indicate that the models with shared random effects outperform the other models on a range of criteria.

  • Notes:
    • Aregay 2015 evaluates four multiscale models for the scale effect seen when data from a finer geographical level are aggregated to a coarser level, with the result that variation at the finer level may be lost. Models 1 and 2 are similar: both are Poisson models in which spatial association is defined at the coarser (unit) level and shared with the finer level (i.e., shared spatially correlated random effects), but model 2 additionally includes a spatially structured random effect at the finer level. Model 3 assumes that the finer and coarser levels are independent and uses two separate convolution models, one for each level. Model 4 uses a convolution model at the finer level, and the risk at the coarser level is obtained by aggregating the finer-level estimates. The models were applied to both simulated and real-world data: models 1 and 2 consistently outperformed models 3 and 4 in the simulations, and model 1 also converged most quickly in the real-world application, but otherwise the models did not perform significantly differently.

Author Yanxun Xu
Author Peter Müller
Author Abdus S. Wahed
Author Peter F. Thall
Pages 1-46
Publication Journal of the American Statistical Association
Date September 30, 2015

  • Notes:
    • Frequentist approaches focused on the initial therapy are the techniques most frequently used to estimate clinical outcomes, such as overall survival, associated with a given treatment. However, when a dataset or a trial involves multi-stage regimens, results from these traditional approaches do not represent the possible treatment sequences well. Through multiple simulations, this study showed that Bayesian nonparametric (BNP) estimation for dynamic treatment regimes (DTRs) is a feasible approach for analyzing multi-stage chemotherapy trial data. The main outcome of this study was mean survival time for each dynamic treatment regime. Baseline covariates and non-random treatment assignment were addressed using a dependent Dirichlet process prior with a Gaussian process base measure (DDP-GP), where priors for the empirical Bayes parameters were determined by preliminary fits for each transition. The estimates were used to calculate the transition time between two disease stages, and the sum of the transition time estimates represented the overall survival for each DTR in each simulation. The study compared the BNP-DTR approach with inverse probability of treatment weighting (IPTW) and augmented IPTW (AIPTW). Simulations using leukemia data demonstrated that BNP-DTR estimated overall survival closer to the true overall survival, and more precisely, than IPTW and AIPTW.

Author Aaron T. Porter
Author Scott H. Holan
Author Christopher K. Wikle
Volume 165
Pages 78-90
Publication Journal of Statistical Planning and Inference
Date October 2015

Abstract

We introduce a general hierarchical Bayesian framework that incorporates a flexible nonparametric data model specification through the use of empirical likelihood methodology, which we term semiparametric hierarchical empirical likelihood (SHEL) models. Although general dependence structures can be readily accommodated, we focus on spatial modeling, a relatively underdeveloped area in the empirical likelihood literature. Importantly, the models we develop naturally accommodate spatial association on irregular lattices and irregularly spaced point-referenced data. We illustrate our proposed framework by means of a simulation study and through three real data examples. First, we develop a spatial Fay–Herriot model in the SHEL framework and apply it to the problem of small area estimation in the American Community Survey. Next, we illustrate the SHEL model in the context of areal data (on an irregular lattice) through the North Carolina sudden infant death syndrome (SIDS) dataset. Finally, we analyze a point-referenced dataset from the North American Breeding Bird Survey that considers dove counts for the state of Missouri. In all cases, we demonstrate superior performance of our model, in terms of mean squared prediction error, over standard parametric analyses.

  • Notes:
    • The Bayesian hierarchical modeling (BHM) framework consists of at least three levels of hierarchy: the data model, the process model, and the parameter model. Modeling typically proceeds by selecting a parametric distribution for each level. BHM has been used extensively, but its combination with empirical likelihood (EL) has remained largely underdeveloped. This study combined EL with a semiparametric BHM, and the resulting semiparametric hierarchical empirical likelihood (SHEL) model was applied to spatial data analysis. By using EL, the proposed approach allows the data to be modeled directly, avoiding inappropriate transformation of the data to fit a parametric distribution. In addition, the hierarchical structure accommodates dependence among outcomes in a spatial or temporal setting by specifying structural dependence in a higher-level model. The approach was applied to the North Carolina sudden infant death syndrome (NC-SIDS) dataset and to the North American Breeding Bird Survey data. By modeling the correlation between spatial regions within a Bayesian framework, the SHEL approach estimated the model parameters (posterior medians) more precisely (with smaller variance) and with smaller mean squared prediction error than the parametric model-based approach. The SHEL model can address spatial correlation and handle spatially point-referenced data, achieving better performance than the parametric approach.

Author Thomas R. Sullivan
Author Amy B. Salter
Author Philip Ryan
Author Katherine J. Lee
Volume 182
Issue 6
Pages 528-534
Publication American Journal of Epidemiology
Date 09/15/2015

Abstract

Multiple imputation (MI) is increasingly being used to handle missing data in epidemiologic research. When data on both the exposure and the outcome are missing, an alternative to standard MI is the “multiple imputation, then deletion” (MID) method, which involves deleting imputed outcomes prior to analysis. While MID has been shown to provide efficiency gains over standard MI when analysis and imputation models are the same, the performance of MID in the presence of auxiliary variables for the incomplete outcome is not well understood. Using simulated data, we evaluated the performance of standard MI and MID in regression settings where data were missing on both the outcome and the exposure and where an auxiliary variable associated with the incomplete outcome was included in the imputation model. When the auxiliary variable was unrelated to missingness in the outcome, both standard MI and MID produced negligible bias when estimating regression parameters, with standard MI being more efficient in most settings. However, when the auxiliary variable was also associated with missingness in the outcome, alarmingly MID produced markedly biased parameter estimates. On the basis of these results, we recommend that researchers use standard MI rather than MID in the presence of auxiliary variables associated with an incomplete outcome.

  • Notes:
    • Multiple imputation (MI) is used to handle missing data in research. A more recent variant is "multiple imputation, then deletion" (MID); MID uses the same imputation methods as MI but restricts the analysis to cases in which the outcome was observed (i.e., it deletes cases with imputed outcomes, because these contribute no additional information after imputation). The authors evaluated the performance of standard MI and MID in regression settings where both the outcome and the exposure were missing and an auxiliary variable (one associated with the outcome) was included in the imputation model. When the auxiliary variable was unrelated to missingness in the outcome, MI and MID performed similarly in terms of bias, with MI often marginally more efficient. When the auxiliary variable was also related to the outcome being unobserved (i.e., to missingness), MID produced markedly biased estimates; thus, MI should be used when the auxiliary variable included in the model is associated with missingness in the outcome (a small sketch contrasting the two approaches follows below).
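
A small sketch contrasting standard MI with "impute, then delete" (MID), under stated assumptions: a continuous outcome y that is partly missing, an exposure x, and an auxiliary variable z; IterativeImputer serves as a generic imputation engine, and the pooling below keeps only the point estimate from Rubin's rules.

```python
# Standard MI vs. MID: both impute m times; MID additionally drops rows whose
# outcome was imputed before each analysis.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
n, m = 500, 20
x = rng.normal(size=n)
z = rng.normal(size=n)
y = 1.0 + 0.5 * x + 0.8 * z + rng.normal(size=n)
y_obs = y.copy()
y_obs[rng.random(n) < 0.3] = np.nan            # outcome missing for ~30% of subjects
data = np.column_stack([y_obs, x, z])

def pooled_slope(deletion=False):
    """Pooled point estimate of the x coefficient across m imputations."""
    estimates = []
    for i in range(m):
        imputed = IterativeImputer(sample_posterior=True, random_state=i).fit_transform(data)
        yi, xi, zi = imputed[:, 0], imputed[:, 1], imputed[:, 2]
        if deletion:                            # MID: keep only rows with an observed outcome
            keep = ~np.isnan(y_obs)
            yi, xi, zi = yi[keep], xi[keep], zi[keep]
        design = np.column_stack([np.ones_like(xi), xi, zi])
        beta = np.linalg.lstsq(design, yi, rcond=None)[0]
        estimates.append(beta[1])
    return np.mean(estimates)

print("standard MI:", pooled_slope(deletion=False))
print("MID:        ", pooled_slope(deletion=True))
```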

Author Robert Leaman
Author Ritu Khare
Author Zhiyong Lu
Volume 57
Pages 28-37
Publication Journal of Biomedical Informatics
Date October 2015

Abstract

Background Identifying key variables such as disorders within the clinical narratives in electronic health records has wide-ranging applications within clinical practice and biomedical research. Previous research has demonstrated reduced performance of disorder named entity recognition (NER) and normalization (or grounding) in clinical narratives than in biomedical publications. In this work, we aim to identify the cause for this performance difference and introduce general solutions. Methods We use closure properties to compare the richness of the vocabulary in clinical narrative text to biomedical publications. We approach both disorder NER and normalization using machine learning methodologies. Our NER methodology is based on linear-chain conditional random fields with a rich feature approach, and we introduce several improvements to enhance the lexical knowledge of the NER system. Our normalization method – never previously applied to clinical data – uses pairwise learning to rank to automatically learn term variation directly from the training data. Results We find that while the size of the overall vocabulary is similar between clinical narrative and biomedical publications, clinical narrative uses a richer terminology to describe disorders than publications. We apply our system, DNorm-C, to locate disorder mentions and in the clinical narratives from the recent ShARe/CLEF eHealth Task. For NER (strict span-only), our system achieves precision = 0.797, recall = 0.713, f-score = 0.753. For the normalization task (strict span + concept) it achieves precision = 0.712, recall = 0.637, f-score = 0.672. The improvements described in this article increase the NER f-score by 0.039 and the normalization f-score by 0.036. We also describe a high recall version of the NER, which increases the normalization recall to as high as 0.744, albeit with reduced precision. Discussion We perform an error analysis, demonstrating that NER errors outnumber normalization errors by more than 4-to-1. Abbreviations and acronyms are found to be frequent causes of error, in addition to the mentions the annotators were not able to identify within the scope of the controlled vocabulary. Conclusion Disorder mentions in text from clinical narratives use a rich vocabulary that results in high term variation, which we believe to be one of the primary causes of reduced performance in clinical narrative. We show that pairwise learning to rank offers high performance in this context, and introduce several lexical enhancements – generalizable to other clinical NER tasks – that improve the ability of the NER system to handle this variation. DNorm-C is a high performing, open source system for disorders in clinical text, and a promising step toward NER and normalization methods that are trainable to a wide variety of domains and entities. (DNorm-C is open source software, and is available with a trained model at the DNorm demonstration website: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/#DNorm.)

  • Notes:
    • In clinical natural language processing, a common task is to identify key clinical variables such as diagnoses and treatments; however, disorder (disease) named entity recognition (NER) and normalization in clinical narratives often perform worse than in biomedical publications. In this paper, the authors describe a way to identify and normalize all disorders mentioned in a clinical narrative using DNorm, a machine learning method that learns the similarity between mentions and concept names from a specified controlled vocabulary directly from training data. They also compare the performance of this method in biomedical publications and clinical narratives, and evaluate reasons for the underperformance in clinical notes. Errors made by DNorm on the test set were manually analyzed and classified as normalization errors or NER errors (false positives, false negatives, and boundary errors). In this smaller dataset of 299 clinical notes, a comparative analysis of language use in clinical narrative and biomedical publications showed no difference in the total number of distinct words used, but clinical narrative text used a larger number of phrases to refer to a larger variety of disorders. Most normalization errors were between related concepts; NER errors were more frequent, with more false negatives (annotated disorders that were not matched) and boundary errors than false positives. DNorm is available as an open source system (a toy illustration of normalization framed as a ranking problem follows below).
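
A toy illustration of normalization cast as a ranking problem: candidate concept names from a controlled vocabulary are ranked against a mention by cosine similarity of character n-gram TF-IDF vectors. DNorm instead learns the similarity function with pairwise learning to rank, so this sketch only conveys the ranking formulation; the tiny vocabulary below is invented.

```python
# Rank controlled-vocabulary concept names against a mention by surface similarity
# (a crude stand-in for DNorm's learned similarity).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

vocabulary = {            # hypothetical concept_id -> preferred name
    "C0023890": "liver cirrhosis",
    "C0011849": "diabetes mellitus",
    "C0020538": "hypertensive disease",
}

def rank_concepts(mention, vocabulary):
    names = list(vocabulary.values())
    vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
    matrix = vec.fit_transform(names + [mention])
    sims = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    return sorted(zip(vocabulary.keys(), names, sims), key=lambda t: t[2], reverse=True)

for cid, name, score in rank_concepts("cirrhosis of the liver", vocabulary):
    print(f"{cid}  {name:<22} {score:.3f}")
```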

Author Cuizhen Niu
Author Xu Guo
Author Wangli Xu
Author Lixing Zhu
Volume 168
Pages 1-19
Publication Journal of Statistical Planning and Inference
Date January 2016

Abstract

The partial linear regression model is wildly used due to its well established theories, flexibility and easy interpretation. This paper aims to investigate the specification test of nonparametric component in partial linear model with response missing at random. Two quadratic conditional moment tests are proposed and both two test statistics own limiting normal distributions when nonparametric component is correctly specified. Our tests’ virtue is that p -values can be easily determined based on limiting null distributions which are intractable for existing tests. The tests can detect the alternative hypotheses distinct from the null hypothesis at the optimal nonparametric rate for local smoothing-based methods. Simulation studies reveal that our tests can control type I error well and have excellent power performance. A HIV clinical trial real data is analyzed for illustrating our methods.

  • Notes:

The partial linear regression model (PLM) is used as an alternative to multiple linear regression to overcome problems such as over-fitting when there are many explanatory variables; it is a useful compromise between fully parametric and nonparametric models. However, when the response variable is missing at random, model checking, estimation of regression coefficients, and assessment of goodness of fit become problematic in the PLM. Researchers typically apply specification tests to judge whether the nonparametric component can be adequately described by a particular parametric or nonparametric form before further analysis, but with existing methods it is difficult to determine p-values for these tests. The authors developed two quadratic conditional moment test statistics. These are local smoothing-based tests whose limiting null distributions are normal, making p-values easy to compute. The tests operate under the null hypothesis that the nonparametric component is correctly specified. Based on the resulting p-values, researchers facing a similar problem can decide whether a parametric or nonparametric approach is appropriate. While these tests are novel for their ease of computing p-values, their application in pharmacoepidemiology might be limited: they were developed for a specific use case (model checking when the response variable is missing at random) that might not transfer readily to pharmacoepidemiology.

Author Junhui Wang
Author Xiaotong Shen
Author Yiwen Sun
Author Annie Qu
Pages 0-00
Publication Journal of the American Statistical Association
Date September 11, 2015

  • Notes:
    • doi: 10.1080/01621459.2015.1089771
    • Wang et al. outline a natural language processing method for extracting subjective information and opinions and for using the detected relationships among the extracted information to predict how the subjective information should be classified. The authors propose a margin-based ordinal classifier built on the unstructured predictors to address the partial loss of information that occurs when unstructured information is converted into numeric representations, and they use parallelized algorithms and large-margin classifiers to keep the number of records and/or predictors manageable, thereby attempting to solve the two major problems in unstructured data analysis. The authors demonstrate the process and show through applications to simulated and real data that the proposed method is useful.

Author Tim P. Morris
Author Ian R. White
Author James R. Carpenter
Author Simon J. Stanworth
Author Patrick Royston
Volume 34
Issue 25
Pages 3298-3317
Publication Statistics in Medicine
Date November 10, 2015

Abstract

Multivariable fractional polynomial (MFP) models are commonly used in medical research. The datasets in which MFP models are applied often contain covariates with missing values. To handle the missing values, we describe methods for combining multiple imputation with MFP modelling, considering in turn three issues: first, how to impute so that the imputation model does not favour certain fractional polynomial (FP) models over others; second, how to estimate the FP exponents in multiply imputed data; and third, how to choose between models of differing complexity. Two imputation methods are outlined for different settings. For model selection, methods based on Wald-type statistics and weighted likelihood-ratio tests are proposed and evaluated in simulation studies. The Wald-based method is very slightly better at estimating FP exponents. Type I error rates are very similar for both methods, although slightly less well controlled than analysis of complete records; however, there is potential for substantial gains in power over the analysis of complete records. We illustrate the two methods in a dataset from five trauma registries for which a prognostic model has previously been published, contrasting the selected models with that obtained by analysing the complete records only. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  • Notes:
    • Morris et al. propose and evaluate methods for combining two statistical tools that are commonly used separately in medical research: fractional polynomial (FP) models and multiple imputation (MI). Combining the two is difficult because the quantities needed by one method are not generated by the other (e.g., a likelihood is needed to build an FP model, but MI does not produce a single likelihood). The first approach combines FPs with adapted imputation methods, such as substantive-model-compatible fully conditional specification or drawing FP exponents via bootstrapping. The second approach estimates the FP model using either log-likelihoods (all the imputed data are stacked and treated as one complete dataset, the FP models are fitted to it, and parameters are estimated by maximizing the log-likelihood) or Wald statistics (maximizing the Wald statistic for the test of parameter = 0). In a simulation comparing the two approaches for identifying the FP dimension d, the Wald statistic method performed slightly better than the log-likelihood method, but the difference was minimal. The authors also compared the two approaches for choosing between FPs of different dimensions and found that both stacking and Wald statistics can be used to build multivariable FPs. They apply the methods to a real-world data example and recommend the log-likelihood (also referred to as stacking) or Wald methods over analysis of complete records, while noting that choosing satisfactory models and imputations requires care in practice.

Author Lu Wang
Author Lan Xue
Volume 167
Pages 27-40
Publication Journal of Statistical Planning and Inference
Date December 2015

Abstract

Monotone additive models are useful in estimating productivity curve or analyzing disease risk where the predictors are known to have monotonic effects on the response. Existing literature mainly focuses on univariate monotone smoothing. Available methods for estimation of monotone additive models are either difficult to interpret or have no asymptotic guarantees. In this paper, we propose a one-step backfitted constrained polynomial spline method for monotone additive models. It is not only easy to compute by taking numerical advantages of linear programming, but also enjoys the optimal rate of convergence asymptotically. The simulation study and application of our method to Norwegian Farm data suggest that the proposed method has superior performance than the existing ones, especially when the data has outliers.

  • Notes:
    • Various estimation and inferential methods have been used to fit nonparametric and semiparametric data, including additive models. The authors propose a new method, a one-step backfitted constrained polynomial spline estimator, for estimating an additive model when each additive component is constrained to be a monotonically increasing or decreasing function. The model approximates the nonparametric components using polynomial splines and obtains smooth monotone estimates by applying the constrained polynomial spline within a backfitting procedure. The authors provide mathematical proofs, use simulation studies to evaluate the numerical performance of the proposed method, and demonstrate its applicability using Norwegian farm data as an example (a simplified sketch of monotone-constrained backfitting follows below).
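
A brief sketch of fitting an additive model under monotonicity constraints by backfitting, using isotonic regression for each component. This is a simpler smoother than the constrained polynomial splines proposed in the paper, so treat it only as an illustration of the constrained-backfitting idea; the data are simulated.

```python
# Backfitting with a monotone (isotonic) smoother for each additive component.
import numpy as np
from sklearn.isotonic import IsotonicRegression

def monotone_additive_fit(X, y, increasing=(True, True), n_iter=20):
    n, p = X.shape
    intercept = y.mean()
    components = np.zeros((n, p))
    for _ in range(n_iter):
        for j in range(p):
            # partial residuals for component j
            others = [k for k in range(p) if k != j]
            resid = y - intercept - components[:, others].sum(axis=1)
            iso = IsotonicRegression(increasing=increasing[j], out_of_bounds="clip")
            components[:, j] = iso.fit(X[:, j], resid).predict(X[:, j])
            components[:, j] -= components[:, j].mean()   # center for identifiability
    return intercept, components

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 2))
y = 2 * X[:, 0] ** 2 + np.log1p(5 * X[:, 1]) + rng.normal(scale=0.2, size=300)
intercept, components = monotone_additive_fit(X, y)
print(intercept, components[:3])
```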

Author David C. Lee
Author Judith A. Long
Author Stephen P. Wall
Author Brendan G. Carr
Author Samantha N. Satchell
Author R. Scott Braithwaite
Author Brian Elbel
Volume 105
Issue 9
Pages e67-e74
Publication American Journal of Public Health
Date July 16, 2015

Abstract

Objectives. We sought to improve public health surveillance by using a geographic analysis of emergency department (ED) visits to determine local chronic disease prevalence.Methods. Using an all-payer administrative database, we determined the proportion of unique ED patients with diabetes, hypertension, or asthma. We compared these rates to those determined by the New York City Community Health Survey. For diabetes prevalence, we also analyzed the fidelity of longitudinal estimates using logistic regression and determined disease burden within census tracts using geocoded addresses.Results. We identified 4.4 million unique New York City adults visiting an ED between 2009 and 2012. When we compared our emergency sample to survey data, rates of neighborhood diabetes, hypertension, and asthma prevalence were similar (correlation coefficient = 0.86, 0.88, and 0.77, respectively). In addition, our method demonstrated less year-to-year scatter and identified significant variation of disease burden within neighborhoods among census tracts.Conclusions. Our method for determining chronic disease prevalence correlates with a validated health survey and may have higher reliability over time and greater granularity at a local level. Our findings can improve public health surveillance by identifying local variation of disease prevalence.

  • Notes:
    • In 2012 the Institute of Medicine (IOM) made understanding local population health a priority in an effort to offer care to the patients who need it most. Estimating disease burden with a population-based approach is difficult because of the reliance on national telephone-based surveys, which may be hampered by small sample sizes or low response rates. With the increasing availability of big data, and given earlier work using emergency claims data to track acute illness, there is potential to leverage these data to detect chronic disease prevalence. Because patients of all socioeconomic strata can use emergency services, emergency department (ED) claims offer an attractive avenue for public health surveillance of chronic disease prevalence at a local level over time. To ascertain chronic disease prevalence, unique patients with a diagnosis of diabetes, asthma, or hypertension between 2009 and 2012 were identified in the ED claims of the Statewide Planning and Research Cooperative System (SPARCS) and compared with estimates from the New York City (NYC) Community Health Survey (CHS). The CHS is a citywide and neighborhood-level survey that identifies disease prevalence using telephone-based stratified random sampling to track chronic disease and health behaviors of noninstitutionalized NYC residents aged 18 years or older. SPARCS is an all-payer data reporting system for all hospital charges, ambulatory surgery procedures, and ED visits in New York State. The main outcome was the correlation of neighborhood-level, age-adjusted chronic disease prevalence between the two sources, CHS and SPARCS; disease prevalence was also captured by United Hospital Fund (UHF) neighborhoods. Additionally, for diabetes, longitudinal trends by UHF neighborhood were analyzed using logistic regression, and prevalence at the census tract level was examined using geocoding. Based on SPARCS, 4.4 million adults visited an ED between 2009 and 2012, and the correlation coefficients for diabetes, hypertension, and asthma between the two data sources were 0.86, 0.88, and 0.77, respectively. Longitudinal estimates of diabetes revealed large year-to-year variation in the CHS and statistically significant increases in 27 of 34 UHF neighborhoods using SPARCS. Using this geographic method of determining chronic disease prevalence, ED claims correlated well with the standard for measuring population health, offering a mechanism to capture prevalence in a heterogeneous population, potentially with more stable longitudinal estimates and greater granularity for identifying clusters of disease. Applied to pharmacoepidemiology, this approach could inform medication utilization patterns and potentially allow exploration of the effects of medications on outcomes; linked to census or geographic data, it could help ascertain regional patterns of medication use and provide insights on priorities for improving population health (a compact sketch of the prevalence-correlation step follows below).
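
A compact sketch of the comparison step only: neighborhood-level prevalence is computed from unique ED patients and correlated with survey-based prevalence. The DataFrames, column names, and numbers are hypothetical placeholders, and the age adjustment used in the paper is omitted.

```python
# Correlate ED-claims-derived prevalence with survey-based prevalence by neighborhood.
import pandas as pd
from scipy.stats import pearsonr

ed_patients = pd.DataFrame({          # one row per unique ED patient (toy data)
    "neighborhood": ["A", "A", "A", "B", "B", "C", "C", "C", "C"],
    "has_diabetes": [1, 0, 1, 0, 0, 1, 1, 0, 1],
})
survey = pd.DataFrame({               # survey-based prevalence for the same neighborhoods
    "neighborhood": ["A", "B", "C"],
    "diabetes_prev": [0.60, 0.05, 0.70],
})

ed_prev = (ed_patients.groupby("neighborhood")["has_diabetes"]
           .mean().rename("ed_diabetes_prev").reset_index())
merged = ed_prev.merge(survey, on="neighborhood")
r, p_value = pearsonr(merged["ed_diabetes_prev"], merged["diabetes_prev"])
print(merged)
print(f"correlation = {r:.2f} (p = {p_value:.2f})")
```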

Author Valerie A. Smith
Author John S. Preisser
Publication Statistical methods in medical research
Date 2015 Sep 1

Abstract

The marginalized two-part (MTP) model for semicontinuous data proposed by Smith et al. provides direct inference for the effect of covariates on the marginal mean of positively continuous data with zeros. This brief note addresses mischaracterizations of the MTP model by Gebregziabher et al. Additionally, the MTP model is extended to incorporate the three-parameter generalized gamma distribution, which takes many well-known distributions as special cases, including the Weibull, gamma, inverse gamma, and log-normal distributions.

  • Notes:
    • The goal of this paper was to correct misconceptions regarding a novel method for modeling marginal effects with semicontinuous data (SCD). SCD are characterized by a point mass at zero, with the remaining data typically right-skewed. When observed as an outcome (for example, rare event counts or healthcare spending), such data present a unique challenge for modeling and estimating treatment effects. Traditional methods involve two-part models that separately model the dichotomous probability of a nonzero outcome and the distribution of the remaining positive, continuous outcomes. The interpretation of such methods is restricted, with the resulting treatment effects limited to those who had a nonzero outcome rather than the full population. The authors previously proposed marginalized two-part (MTP) models as a way of directly modeling the marginal effects of covariates on the overall mean of the full data distribution. The MTP, described in detail elsewhere, is not conditional on a positive outcome and so can provide population-level effect estimates through modeling of the unconditional marginal mean.

In this paper, the authors clarify claims made by Gebregziabher and colleagues regarding how the MTP is implemented and parameterized. These clarifications, related to the unconditional nature of the MTP, are necessary to ensure appropriate coding and interpretation of MTP models; recommendations for implementing an MTP model in statistical software are also provided. In addition, the authors show how to extend the model to handle different distributions for the positive part of the SCD (Weibull, gamma, and others as special cases of the generalized gamma). Though limited in scope and application, this paper provides necessary clarification for researchers interested in leveraging MTP models when examining outcomes with a semicontinuous distribution.

Author Wolf-Peter Schmidt
Author Sophie Boisson
Author Michael G. Kenward
Volume 26
Issue 6
Pages 839-845
Publication Epidemiology
Date 11/2015

  • Notes:
    • In this study, a maximum likelihood estimator of episode duration was derived, which allows estimation of the mean duration of disease episodes from cross-sectional studies. With illness status observed on two consecutive days, the estimator of the mean episode duration in a population equals one plus twice the ratio of the number of subjects with the condition on both days to the number of subjects ill on only one day. The estimator assumes non-overlapping episodes and a time-constant incidence rate, and it is more precise for shorter than for longer average episode durations. This method may help to address the difficulty of close surveillance when estimating disease duration (a one-line worked example follows below).
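
The closed-form estimator described above, written out as code; the counts in the example are made-up illustrative numbers.

```python
# Mean episode duration = 1 + 2 * (ill on both days) / (ill on exactly one day)
def mean_episode_duration(n_ill_both_days, n_ill_one_day):
    return 1 + 2 * n_ill_both_days / n_ill_one_day

print(mean_episode_duration(n_ill_both_days=40, n_ill_one_day=20))  # -> 5.0 days
```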

Author Roee Gutman
Author Donald B. Rubin
Volume 34
Issue 26
Pages 3381-3398
Publication Statistics in Medicine
Date November 20, 2015

Abstract

Estimation of causal effects in non-randomized studies comprises two distinct phases: design, without outcome data, and analysis of the outcome data according to a specified protocol. Recently, Gutman and Rubin (2013) proposed a new analysis-phase method for estimating treatment effects when the outcome is binary and there is only one covariate, which viewed causal effect estimation explicitly as a missing data problem. Here, we extend this method to situations with continuous outcomes and multiple covariates and compare it with other commonly used methods (such as matching, subclassification, weighting, and covariance adjustment). We show, using an extensive simulation, that of all methods considered, and in many of the experimental conditions examined, our new ‘multiple-imputation using two subclassification splines’ method appears to be the most efficient and has coverage levels that are closest to nominal. In addition, it can estimate finite population average causal effects as well as non-linear causal estimands. This type of analysis also allows the identification of subgroups of units for which the effect appears to be especially beneficial or harmful. Copyright © 2015 John Wiley & Sons, Ltd.

  • Notes:
    • Many methods exist for generating an unbiased causal effect estimates in observational studies, almost all of which face limitations or risk of residual bias. Gutman and Rubin have previously proposed a novel method of estimating treatment effects by leveraging multiple imputation (MI) methods to approximate a subject’s unobserved potential outcomes directly, from which a variety of treatment effects may then be calculated with theoretically minimal bias. This paper expands on the previous method, multiple imputation with two subclassification splines (MITSS), by demonstrating its application with continuous outcomes and multiple covariates. The procedure relies on a mixture of propensity scores (to account for multiple covariates), and Bayesian modelling with multiple imputation (for causal inference). It involves sub-classifying subjects into groups by propensity score (PS), then using observed data to estimate the treatment and control vector parameters within a regression spline whose knots are defined by the boundaries of the PS subgroups. These parameters are then used to conduct multiple imputation on the missing control potential outcomes (in the treatment group) and treatment potential outcomes (in the control group), out of which the treatment effect of interest can then be directly calculated.

The authors demonstrate the application of the MITSS method through simulated and applied examples. Results from multiple methods were compared, including MITSS, two expanded matching methods (M-N-m and M-C-m), full matching (FM), inverse probability weighting (IPW), and doubly robust methods (DR). Models were parameterized with both a single covariate and multiple covariates, and each method's interval coverage (coverage of the treatment effect), bias, and variance (interval width) were examined. Across scenarios, MITSS generally outperformed the other methods, particularly FM, IPW, and DR, with improved interval coverage and smaller bias and interval width. The main exception occurred when covariate values in the treatment and control groups differed substantially, in which case no method performed well.

MITSS is limited when the treatment and control covariate distributions do not overlap, perhaps more so than other methods, as it implicitly extrapolates from one group to the other and can introduce out-of-bounds values. MITSS may also not be the most statistically efficient method, depending on a priori knowledge of the expected distribution of potential outcomes. However, MITSS provides a novel and flexible way of estimating a range of treatment effects with accuracy and minimal bias, and may be a powerful tool in causal effect modelling (a stripped-down sketch of the subclassify-and-impute idea follows below).
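
A stripped-down illustration of the idea behind MITSS on simulated data: subclassify on the propensity score, model each arm's outcome within subclass, impute each subject's missing potential outcome, and average the imputed individual effects. Unlike MITSS proper, there are no regression splines and no Bayesian posterior draws, so this conveys only the "causal inference as missing data" structure.

```python
# Simplified subclassify-and-impute estimate of the average treatment effect.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 2))
treat = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * x[:, 0] - 0.3 * x[:, 1]))))
y = 1.0 + 2.0 * treat + x[:, 0] + 0.5 * x[:, 1] + rng.normal(size=n)

ps = LogisticRegression().fit(x, treat).predict_proba(x)[:, 1]
subclass = np.digitize(ps, np.quantile(ps, [0.2, 0.4, 0.6, 0.8]))   # PS quintiles

y_missing = np.empty(n)          # imputed potential outcome under the *other* arm
for s in np.unique(subclass):
    for arm in (0, 1):
        donors = (subclass == s) & (treat == arm)
        recipients = (subclass == s) & (treat == 1 - arm)
        if donors.sum() > 2 and recipients.sum() > 0:
            model = LinearRegression().fit(x[donors], y[donors])
            y_missing[recipients] = model.predict(x[recipients])

effects = np.where(treat == 1, y - y_missing, y_missing - y)
print("estimated average treatment effect:", effects.mean())   # close to the true value of 2
```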

Author Panteha Hayati Rezvan
Author Ian R. White
Author Katherine J. Lee
Author John B. Carlin
Author Julie A. Simpson
Volume 15
Issue 1
Pages 83
Publication BMC Medical Research Methodology
Date 2015-10-13

Abstract

Multiple imputation (MI) is a well-recognised statistical technique for handling missing data. As usually implemented in standard statistical software, MI assumes that data are ‘Missing at random’ (MAR); an assumption that in many settings is implausible. It is not possible to distinguish whether data are MAR or ‘Missing not at random’ (MNAR) using the observed data, so it is desirable to discover the impact of departures from the MAR assumption on the MI results by conducting sensitivity analyses. A weighting approach based on a selection model has been proposed for performing MNAR analyses to assess the robustness of results obtained under standard MI to departures from MAR.

  • Notes:
    • Multiple imputation (MI) is a statistical technique that replaces missing data with plausible values based on the observed data, the underlying assumption being that missingness is a function of observed data ('missing at random', MAR). This assumption might not hold in all studies. When data are 'missing not at random' (MNAR), missingness depends on the unobserved values themselves, and the missing data are therefore hard to impute without subject-matter expertise. Two approaches have been proposed in the literature to assess the performance of MI when the MAR assumption is not met: a weighting approach and a pattern-mixture approach. The current study evaluates the weighting approach. In this approach, estimates from standard MI performed under MAR are generated and then re-weighted according to the assumed magnitude of departure from MAR (quantified by how far the sensitivity parameter δ is from zero, informed by subject-matter experts). The authors calculated these weights, based on work published by Carpenter et al., for a single simulated dataset, and evaluated the regression coefficient of a partially observed continuous outcome on the exposure variable in a simple bivariate analysis of the imputed data. They found that the MNAR estimates (for the marginal mean outcome or the regression coefficient) were biased and did not converge to the true value as the number of imputations increased. The issue persisted even when the sample size of the simulated dataset was increased to 1,000, the departure from MAR was set to a moderate level (δ = 0.5), and the partially observed outcome was coded as a binary variable. The authors conclude that the weighting approach is likely to produce biased estimates for data that are MNAR and that there may be no substitute for subject-matter knowledge when dealing with such data.

Author Judita Preiss
Author Mark Stevenson
Author Robert Gaizauskas
Volume 22
Issue 5
Pages 987-992
Publication Journal of the American Medical Informatics Association
Date 2015/09/01

Abstract

Objective Literature-based discovery (LBD) aims to identify “hidden knowledge” in the medical literature by: (1) analyzing documents to identify pairs of explicitly related concepts (terms), then (2) hypothesizing novel relations between pairs of unrelated concepts that are implicitly related via a shared concept to which both are explicitly related. Many LBD approaches use simple techniques to identify semantically weak relations between concepts, for example, document co-occurrence. These generate huge numbers of hypotheses, difficult for humans to assess. More complex techniques rely on linguistic analysis, for example, shallow parsing, to identify semantically stronger relations. Such approaches generate fewer hypotheses, but may miss hidden knowledge. The authors investigate this trade-off in detail, comparing techniques for identifying related concepts to discover which are most suitable for LBD.Materials and methods A generic LBD system that can utilize a range of relation types was developed. Experiments were carried out comparing a number of techniques for identifying relations. Two approaches were used for evaluation: replication of existing discoveries and the “time slicing” approach.1Results Previous LBD discoveries could be replicated using relations based either on document co-occurrence or linguistic analysis. Using relations based on linguistic analysis generated many fewer hypotheses, but a significantly greater proportion of them were candidates for hidden knowledge.Discussion and Conclusion The use of linguistic analysis-based relations improves accuracy of LBD without overly damaging coverage. LBD systems often generate huge numbers of hypotheses, which are infeasible to manually review. Improving their accuracy has the potential to make these systems significantly more usable.

  • Notes:
    • Literature-based discovery (LBD) uses natural language processing to find novel connections across disjoint bodies of scientific literature. Semi-automating literature review in this way could help pharmacoepidemiology identify intermediary pathways between seemingly disconnected exposures and outcomes. Several approaches to relation extraction are described and evaluated in this paper. Newer approaches that use linguistic analysis, such as extracting subject-relation-object triples (SemRep), binary relations expressed by verbs (ReVerb), or grammatical relations between pairs of words (Stanford Parser), are shown to perform better than simple co-occurrence-based approaches (finding words in the same sentence, title, or document). Linguistic analysis techniques generate fewer, but more accurate, leads for hypothesis generation than co-occurrence-based approaches (a toy co-occurrence-based example of the discovery pattern follows below).
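
A toy version of the classic "ABC" discovery pattern using simple co-occurrence: B terms that co-occur with concept A in one document set and with concept C in a disjoint set suggest a hidden A-C link. The documents are invented; the paper's point is that relations extracted by linguistic analysis (e.g., SemRep or ReVerb triples) yield fewer but more accurate candidates than this kind of raw co-occurrence.

```python
# Co-occurrence-based ABC literature-based discovery (Swanson-style), in miniature.
def cooccurring_terms(target, documents):
    return set().union(*(d for d in documents if target in d)) - {target}

docs_about_a = [{"fish oil", "blood viscosity", "platelet aggregation"},
                {"fish oil", "vascular reactivity"}]
docs_about_c = [{"raynaud disease", "blood viscosity"},
                {"raynaud disease", "platelet aggregation", "cold exposure"}]

b_terms = (cooccurring_terms("fish oil", docs_about_a)
           & cooccurring_terms("raynaud disease", docs_about_c))
print("candidate linking (B) terms:", b_terms)   # shared intermediates suggest a hidden A-C link
```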

Author Anna Ivanova
Author Geert Molenberghs
Author Geert Verbeke
Publication Statistical methods in medical research
Date 2015 Oct 7

Abstract

In longitudinal studies, continuous, binary, categorical, and survival outcomes are often jointly collected, possibly with some observations missing. However, when it comes to modeling responses, the ordinal ones have received less attention in the literature. In a longitudinal or hierarchical context, the univariate proportional odds mixed model (POMM) can be regarded as an instance of the generalized linear mixed model (GLMM). When the response of the joint multivariate model encompass ordinal responses, the complexity further increases. An additional problem of model fitting is the size of the collected data. Pseudo-likelihood based methods for pairwise fitting, for partitioned samples and, as introduced in this paper, pairwise fitting within partitioned samples allow joint modeling of even larger numbers of responses. We show that that pseudo-likelihood methodology allows for highly efficient and fast inferences in high-dimensional large datasets.

  • Notes:
    • The authors compared the performance of three methods, all based on the pseudo-likelihood function, with the full maximum likelihood approach for modeling longitudinal data with ordinal responses. The three methods were (1) pairwise modeling, (2) modeling of independent subsamples, and (3) pairwise modeling within independent subsamples. Method 3 was developed by the authors and is essentially a combination of methods 1 and 2; it involves three steps: divide the data into M independent subsamples, apply pairwise fitting within each subsample separately, and combine the results based on the pseudo-likelihood function. The four modeling strategies were tested on data from the Belgian diabetes project, which evaluated whether having a diabetes care team improved HbA1c, lowered LDL, and decreased systolic blood pressure (sBP), three jointly modeled outcomes, each categorized as an ordinal variable (e.g., HbA1c <7%, 7 to <8%, and >8%). The authors found that all three pseudo-likelihood methods yielded approximations similar to the full maximum likelihood method and that the standard errors were also approximately equal. The benefit of the pseudo-likelihood methods is mainly a reduction in computational time, which can be long and cumbersome when the full maximum likelihood model is used to jointly model several ordinal outcomes.

Author Ming-Yen Cheng
Author Toshio Honda
Author Jin-Ting Zhang
Pages 0-0
Publication Journal of the American Statistical Association
Date September 1, 2015

  • Notes:
    • Varying coefficient models (VCMs) are nonparametric models that allow regression coefficients to vary with covariate values. They can be useful when analyzing data that may not meet the assumptions of traditional linear models, or when the research question involves decomposing covariate-on-covariate effects on the outcome. However, with the growth of high-dimensional datasets (e.g., genetic databases with many thousands of marker variables), the usual variable selection procedures for building VCMs have shown weaknesses. In this paper the authors propose a novel method for selecting VCM variables from high-dimensional data using groupwise forward selection with a BIC stopping criterion (fBIC).

The authors provide mathematical proofs for their method, as well as simulated and applied analyses comparing their procedure (fBIC) with forward selection under other stopping criteria (fAIC, fEBIC) and with a third algorithm proposed elsewhere in the literature (gNIS). Simulation results showed that fBIC generally outperformed the other forward selection methods, selecting more true covariates and fewer false covariates, primarily because of the nature of the penalty in each stopping criterion. When the methods were applied to a breast cancer dataset with more than 24,000 gene expression measurements (along with other clinical risk factors), results were similar: fBIC selected covariates (genes) with a strong impact on the selected base covariate in the VCM (e.g., estrogen receptor), whereas fEBIC and gNIS identified more or fewer covariates, respectively. The fBIC-identified model fit well.

The fBIC method seems well suited to fitting VCMs in high-dimensional datasets, as demonstrated in simulation. However, the article lacked a "gold standard" against which to compare the applied analysis, without which it is difficult to gauge the true practical implications of the method (a minimal forward-selection sketch with a BIC stopping rule follows below).
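
A minimal forward-selection loop with a BIC stopping rule for an ordinary linear model. The paper applies this idea groupwise to varying coefficient models in ultrahigh dimensions; this sketch only shows the "add the best variable while BIC keeps dropping" mechanic on simulated data.

```python
# Forward selection that stops as soon as BIC no longer improves.
import numpy as np
import statsmodels.api as sm

def forward_select_bic(X, y):
    selected, remaining = [], list(range(X.shape[1]))
    current_bic = sm.OLS(y, np.ones((len(y), 1))).fit().bic   # intercept-only model
    while remaining:
        bics = {j: sm.OLS(y, sm.add_constant(X[:, selected + [j]])).fit().bic
                for j in remaining}
        best = min(bics, key=bics.get)
        if bics[best] >= current_bic:          # stop once BIC no longer improves
            break
        selected.append(best)
        remaining.remove(best)
        current_bic = bics[best]
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = 1.5 * X[:, 3] - 2.0 * X[:, 17] + rng.normal(size=200)
print(forward_select_bic(X, y))   # should recover columns 3 and 17
```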

Author David B. Richardson
Author Ghassan B. Hamra
Author Richard F. MacLehose
Author Stephen R. Cole
Author Haitao Chu
Volume 182
Issue 5
Pages 459-467
Publication American Journal of Epidemiology
Date 09/01/2015

Abstract

In cohort mortality studies, there often is interest in associations between an exposure of primary interest and mortality due to a range of different causes. A standard approach to such analyses involves fitting a separate regression model for each type of outcome. However, the statistical precision of some estimated associations may be poor because of sparse data. In this paper, we describe a hierarchical regression model for estimation of parameters describing outcome-specific relative rate functions and associated credible intervals. The proposed model uses background stratification to provide flexible control for the outcome-specific associations of potential confounders, and it employs a hierarchical “shrinkage” approach to stabilize estimates of an exposure's associations with mortality due to different causes of death. The approach is illustrated in analyses of cancer mortality in 2 cohorts: a cohort of dioxin-exposed US chemical workers and a cohort of radiation-exposed Japanese atomic bomb survivors. Compared with standard regression estimates of associations, hierarchical regression yielded estimates with improved precision that tended to have less extreme values. The hierarchical regression approach also allowed the fitting of models with effect-measure modification. The proposed hierarchical approach can yield estimates of association that are more precise than conventional estimates when one wishes to estimate associations with multiple outcomes.

  • Notes:
    • In epidemiologic cohort studies there is often interest in examining the effect of one exposure on mortality from multiple causes. The traditional approach, running a separate regression for each cause, may suffer from reduced precision due to sparse data. The authors propose hierarchical regression to address this: regression parameters (i.e., cause-specific exposure effects) are allowed to vary around an overall mean with residual variation, estimated via Markov chain Monte Carlo simulation. This approach still yields outcome-type (cause) specific associations, but with lower mean squared error than fitting each exposure/outcome-type model separately, resulting in narrower interval estimates.

The authors present two empirical examples to demonstrate the method. One assessed TCDD (a chemical carcinogen) exposure and its association with mortality from different categories of cancer. Comparing the results of separate Poisson regressions with those of a combined-outcome hierarchical regression, the coefficients from the hierarchical model were stabilized, particularly for less common cancer types (rare outcomes), while still allowing interpretation of independent effect estimates. The hierarchical approach does require a priori assumptions about the underlying pattern of variation, and where this pattern is heterogeneous across outcome types the method may struggle; sensitivity analysis around the choice of prior distribution is strongly recommended (a small numeric illustration of the shrinkage idea follows below).
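
A small numeric illustration of the shrinkage idea: cause-specific log rate ratios are pulled toward their common mean, with the least precise (sparse-data) estimates pulled hardest. This is simple empirical-Bayes-style shrinkage, not the full MCMC-fitted hierarchical regression used in the paper, and the estimates and standard errors below are invented.

```python
# Shrink cause-specific estimates toward the overall mean in proportion to their precision.
import numpy as np

log_rr = np.array([0.80, 0.10, 1.50, -0.20])   # per-cause estimates from separate models
se = np.array([0.20, 0.15, 0.90, 0.60])        # large SE = sparse outcome

tau2 = max(np.var(log_rr, ddof=1) - np.mean(se**2), 0.01)   # crude between-cause variance
weight = tau2 / (tau2 + se**2)                              # reliability of each estimate
shrunk = weight * log_rr + (1 - weight) * log_rr.mean()

for original, adjusted in zip(log_rr, shrunk):
    print(f"{original:+.2f} -> {adjusted:+.2f}")   # imprecise estimates move most toward the mean
```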

Author Patricia Rodríguez De Gil
Author Aarti P. Bellara
Author Rheta E. Lanehart
Author Reginald S. Lee
Author Eun Sook Kim
Author Jeffrey D. Kromrey
Volume 50
Issue 5
Pages 520-532
Publication Multivariate Behavioral Research
Date September 3, 2015

Abstract

Considering that the absence of measurement error in research is a rare phenomenon and its effects can be dramatic, we examine the impact of measurement error on propensity score (PS) analysis used to minimize selection bias in behavioral and social observational studies. A Monte Carlo study was conducted to explore the effects of measurement error on the treatment effect and balance estimates in PS analysis across seven different PS conditioning methods. In general, the results indicate that even low levels of measurement error in the covariates lead to substantial bias in estimates of treatment effects and concomitant reduction in confidence interval coverage across all methods of conditioning on the PS.

  • Notes:
    • Measurement error in covariates is usually assumed to be absent in research comparing exposure groups. The authors examined the impact of covariate measurement error on propensity score analysis using a Monte Carlo simulation. They varied several factors: number of covariates, population treatment effect, the covariates' relationship to treatment, the covariates' relationship to the outcome, correlation between covariates, sample size, covariate reliability, and propensity score conditioning method, resulting in a total of 60,480 conditions. Study outcomes included statistical bias, RMSE, Type I error control, and confidence interval coverage and width for the treatment effect. The study found that a larger number of covariates and more measurement error led to more biased treatment effect estimates. In addition, covariate reliability had a profound effect on Type I error control and confidence interval coverage, especially when the sample size was large and the relationship between treatment and covariates was strong. Most importantly, even a small amount of measurement error (0.2) could have a substantial impact on the accuracy of inferences. The authors argue that covariate reliability should be considered during covariate selection and reported as a standard element of results, and they discuss two existing approaches (errors-in-variables logistic models and latent variable models) that can account for measurement error (a tiny Monte Carlo sketch of the phenomenon follows below).
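
A tiny Monte Carlo sketch of the phenomenon studied: a confounder is measured with error (reliability below 1), the propensity score is estimated from the error-prone version, and the weighted treatment effect is compared with the truth. Inverse probability weighting stands in here for the several PS conditioning methods examined in the paper; reliability is taken as the share of observed covariate variance due to the true covariate, and all numbers are illustrative.

```python
# Bias in a PS-weighted treatment effect when the confounder is measured with error.
import numpy as np
from sklearn.linear_model import LogisticRegression

def simulated_bias(reliability, n=5000, true_effect=1.0, n_reps=50, seed=0):
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_reps):
        x_true = rng.normal(size=n)
        x_obs = np.sqrt(reliability) * x_true + np.sqrt(1 - reliability) * rng.normal(size=n)
        treat = rng.binomial(1, 1 / (1 + np.exp(-1.5 * x_true)))
        y = true_effect * treat + 1.0 * x_true + rng.normal(size=n)

        ps = LogisticRegression().fit(x_obs.reshape(-1, 1), treat).predict_proba(
            x_obs.reshape(-1, 1))[:, 1]
        w = np.where(treat == 1, 1 / ps, 1 / (1 - ps))   # inverse probability of treatment weights
        est = (np.average(y[treat == 1], weights=w[treat == 1])
               - np.average(y[treat == 0], weights=w[treat == 0]))
        estimates.append(est)
    return np.mean(estimates) - true_effect

for r in (1.0, 0.8, 0.6):
    print(f"reliability {r}: bias = {simulated_bias(r):+.3f}")   # bias grows as reliability falls
```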

Author

Bradley G. Hammill

Author

Lesley H. Curtis

Author

Laura G. Qualls

Author

Susan N. Hastings

Author

Virginia Wang

Author

Matthew L. Maciejewski

Volume

53

Issue

11

Pages

974-979

Publication

Medical Care

Date

11/2015

  • Notes:
    • In this study, the authors linked laboratory data to Medicare claims data using the Medicare Health Insurance Claim (HIC) number and/or Social Security number (SSN), date of birth, and sex. The linkage has some limitations. The first is coverage, as only testing done by a specific vendor's independent laboratories is reported. Second, the laboratory results represent tests ordered for a variety of indications in clinical care. Third, inpatient and outpatient institutional laboratory tests are not represented in these data. However, the linkage of laboratory data to Medicare claims makes detailed clinical data available and may broaden the use of Medicare claims data in comparative effectiveness research.

Author

Richard Wyss

Author

Alan R. Ellis

Author

M. Alan Brookhart

Author

Michele Jonsson Funk

Author

Cynthia J. Girman

Author

Ross J. Simpson

Author

Til Stürmer

Volume

24

Issue

9

Pages

951-961

Publication

Pharmacoepidemiology and Drug Safety

Date

September 1, 2015

Abstract

Purpose We use simulations and an empirical example to evaluate the performance of disease risk score (DRS) matching compared with propensity score (PS) matching when controlling large numbers of covariates in settings involving newly introduced treatments. Methods We simulated a dichotomous treatment, a dichotomous outcome, and 100 baseline covariates that included both continuous and dichotomous random variables. For the empirical example, we evaluated the comparative effectiveness of dabigatran versus warfarin in preventing combined ischemic stroke and all-cause mortality. We matched treatment groups on a historically estimated DRS and again on the PS. We controlled for a high-dimensional set of covariates using 20% and 1% samples of Medicare claims data from October 2010 through December 2012. Results In simulations, matching on the DRS versus the PS generally yielded matches for more treated individuals and improved precision of the effect estimate. For the empirical example, PS and DRS matching in the 20% sample resulted in similar hazard ratios (0.88 and 0.87) and standard errors (0.04 for both methods). In the 1% sample, PS matching resulted in matches for only 92.0% of the treated population and a hazard ratio and standard error of 0.89 and 0.19, respectively, while DRS matching resulted in matches for 98.5% and a hazard ratio and standard error of 0.85 and 0.16, respectively. Conclusions When PS distributions are separated, DRS matching can improve the precision of effect estimates and allow researchers to evaluate the treatment effect in a larger proportion of the treated population. However, accurately modeling the DRS can be challenging compared with the PS. Copyright © 2015 John Wiley & Sons, Ltd.

  • Notes:
    • Annotation - Wan-Ju

Wyss R, Ellis AR, Brookhart MA, Jonsson Funk M, Girman CJ, Simpson RJ Jr, Stürmer T.

Matching on the disease risk score in comparative effectiveness research of new treatments.

Pharmacoepidemiol Drug Safety. 2015 Sep;24(9):951-61. doi: 10.1002/pds.3810. Epub 2015 Jun 25.

The goal of this study was to compare the performance of disease risk score (DRS) matching and propensity score (PS) matching in both a simulation and an empirical study evaluating a newly marketed treatment. The DRS models covariates that are associated with the study outcome and can be used to control for confounding between exposure groups. In this study, the DRS was modeled within a historical cohort not exposed to the study drug, while the PS was modeled within the study cohort. The distributions of the DRS and PS between groups were compared and the area of overlap was estimated under different scenarios, with variation in sample size, strength of covariate-treatment associations, strength of covariate-outcome associations, treatment-effect heterogeneity, and caliper distances. Bias, mean squared error (MSE), and precision of effect estimates were calculated as the performance measures for DRS and PS. Estimates from the DRS model generally had higher precision, lower MSE, and a higher proportion of matched treated subjects than estimates from the PS model. Specifically, the DRS performed better in scenarios with small sample sizes, strong covariate effects, and heterogeneous treatment effects. Although modeling the DRS requires certain assumptions and can suffer from model misspecification, the strengths and advantages of using the DRS were presented and discussed. In summary, a DRS model estimated using historical data is a valid approach for studies evaluating newly introduced medications, and it can be superior to a PS model in certain scenarios, particularly when the PS distributions of the exposed and unexposed do not sufficiently overlap.
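
The sketch below outlines the general workflow the annotation describes: fit the DRS in a historical unexposed cohort, score the current study cohort, then match on the DRS. The data frames `historical` and `study`, the column names, and the caliper of 0.2 standard deviations are all hypothetical, and matching is done with replacement for brevity, which differs from most applied implementations.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Hypothetical data frames: `historical` holds untreated patients from before
# the new drug's launch; `study` holds the current cohort with treatment `a`.
covs = ["age", "chf", "diabetes", "prior_stroke"]        # illustrative names

# 1. Fit the disease risk score in the historical (unexposed) cohort.
drs_model = LogisticRegression(max_iter=1000).fit(historical[covs], historical["outcome"])

# 2. Score everyone in the study cohort on the historical DRS.
study = study.copy()
study["drs"] = drs_model.predict_proba(study[covs])[:, 1]

# 3. 1:1 nearest-neighbour matching on the DRS within a caliper (with replacement).
treated = study[study["a"] == 1]
control = study[study["a"] == 0]
nn = NearestNeighbors(n_neighbors=1).fit(control[["drs"]])
dist, idx = nn.kneighbors(treated[["drs"]])
caliper = 0.2 * study["drs"].std()
keep = dist[:, 0] <= caliper
matched = pd.concat([treated[keep], control.iloc[idx[keep, 0]]])
```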

Author

Chung-Wei Shen

Author

Yi-Hau Chen

Volume

16

Issue

4

Pages

740-753

Publication

Biostatistics (Oxford, England)

Date

Oct 2015

Abstract

Missing observations and covariate measurement error commonly arise in longitudinal data. However, existing methods for model selection in marginal regression analysis of longitudinal data fail to address the potential bias resulting from these issues. To tackle this problem, we propose a new model selection criterion, the Generalized Longitudinal Information Criterion, which is based on an approximately unbiased estimator for the expected quadratic error of a considered marginal model accounting for both data missingness and covariate measurement error. The simulation results reveal that the proposed method performs quite well in the presence of missing data and covariate measurement error. On the contrary, the naive procedures without taking care of such complexity in data may perform quite poorly. The proposed method is applied to data from the Taiwan Longitudinal Study on Aging to assess the relationship of depression with health and social status in the elderly, accommodating measurement error in the covariate as well as missing observations.

  • Notes:
    • Longitudinal studies that account for the time-varying nature of covariates use marginal regression models to determine the relationship between an outcome variable and a given set of covariates. Common challenges with repeated measures include missing data and error in the measurement of covariates (due to instrumentation or other issues). Yi and colleagues previously proposed a Generalized Method of Moments (GMM) approach, an extension of GEE, to estimate parameters in a marginal regression model when data can be missing or measured with error. Currently, there are three existing approaches to variable selection in marginal regression models: the Quasi-likelihood under the Independence model Criterion (QIC) method, the GCp method, and the Missing Longitudinal Information Criterion (MLIC) method. While the QIC and GCp methods apply only to well-measured and completely observed data, the MLIC method is applicable when there are missing data; however, it does not allow for error in covariate measurement. This paper describes the development of a new model selection criterion, the Generalized Longitudinal Information Criterion (GLIC), that produces an approximately unbiased estimate of the expected error in the presence of missing or poorly measured data. The criterion includes three terms: (1) a measure of goodness of fit (mean squared error between observed and fitted outcomes); (2) a penalty for overfitting (covariance between observed and fitted values); and (3) the trace of the true covariance. In developing this criterion, the authors made two assumptions: (1) data are missing at random (MAR); and (2) the variance of the measurement error can be estimated from repeated measures. In practice, the model with the smaller GLIC statistic has the better fit. The authors applied the GMM approach with GLIC and non-GLIC methods to simulated and real-world data and found that GMM with GLIC was more likely to select empirically significant covariates (α = 0.05). The authors have also extended their work to other regression models such as normal linear and log-binomial models. This work may interest pharmacoepidemiologists working with repeated measures data and time-varying confounding, but there are conceptual and logistical barriers to practical application. Conceptually, the assumptions made in developing the statistic (MAR, using repeated measures to estimate the variance of measurement error) may not hold in all settings. Logistically, use of this statistic will require extensive statistical programming (R/SAS/Stata packages are unavailable). Even if these barriers are overcome, empirical and clinical significance must be balanced when selecting variables for a model.

Author

Andrew D. A. C. Smith

Author

Jon Heron

Author

Gita Mishra

Author

Mark S. Gilthorpe

Author

Yoav Ben-Shlomo

Author

Kate Tilling

Volume

26

Issue

5

Pages

719-726

Publication

Epidemiology

Date

09/2015

  • Notes:
    • In this study, the authors use a new method, a Least Angle Regression (LARS) approach to the Least Absolute Shrinkage and Selection Operator (lasso), to evaluate a small number of models that test hypothesized relationships between repeated exposures over the life course and an outcome. The LARS lasso approach is contrasted with the structured F-test approach and the Akaike information criterion using data from a previously published example (socioeconomic status over the life course and body mass index) and through simulated datasets with two or three binary exposures. The method involves first encoding a set of hypothesized models as covariates, and then using the LARS procedure for the lasso to identify the covariate subset that best accounts for the outcome variation. Variable selection is aided visually with an elbow plot or guided by a hypothesis test. Results showed that this method identified the same hypotheses as previous research and the correct hypothesis in simulations when sample sizes were large. When little is known about the association between the exposures and outcome over the life course (e.g., cumulative risk versus risk during a critical period such as childhood), the LARS approach can be used to suggest the most likely of a predefined set of hypothesized associations.
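
The toy sketch below shows only the encoding step the annotation describes: hypothesized life-course structures (two critical-period terms and an accumulation term) are built from hypothetical binary exposures at three periods, and scikit-learn's lars_path traces the lasso path so the order of entry can be inspected. The variable names and data-generating values are invented for illustration; the elbow plot and formal hypothesis test used by the authors are not shown.

```python
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(1)
n = 1000
e = rng.binomial(1, 0.4, size=(n, 3))          # exposures at 3 life-course periods

# Encode competing hypotheses as covariates: critical period in childhood,
# critical period in adulthood, and accumulation of exposure over the life course.
H = np.column_stack([e[:, 0], e[:, 1], e.sum(axis=1)]).astype(float)
y = 2.0 * e[:, 0] + rng.normal(size=n)         # truth: childhood critical period

# Standardize, then trace the lasso path with LARS; the order in which the
# encoded hypotheses enter the model suggests which one the data favor.
Hs = (H - H.mean(axis=0)) / H.std(axis=0)
alphas, active, coefs = lars_path(Hs, y - y.mean(), method="lasso")
print("order of entry (0=childhood, 1=adulthood, 2=accumulation):", active)
```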

Author

Jessie K. Edwards

Author

Stephen R. Cole

Author

Daniel Westreich

Author

Heidi Crane

Author

Joseph J. Eron

Author

W. Christopher Mathews

Author

Richard Moore

Author

Stephen L. Boswell

Author

Catherine R. Lesko

Author

Michael J. Mugavero

Volume

26

Issue

5

Pages

645-652

Publication

Epidemiology

Date

09/2015

  • Notes:
    • Joint marginal structural models are useful for determining the joint effects of two or more exposures when there is time-varying confounding affected by prior exposure. However, these models assume that covariates are measured without error. Several studies have demonstrated discordance between provider documentation of health behaviors (in electronic health records) and self-reported behaviors. Existing approaches to account for measurement error include regression calibration and Bayesian methods; however, these approaches do not work when the extent of measurement error depends on group membership (for example, providers are more likely to accurately document the health behaviors of sicker individuals). The current paper describes an alternative approach to account for measurement error (differential or non-differential) in a joint marginal structural model that estimates the joint effect of exposures in the presence of time-varying confounding. The approach uses multiple imputation, treating misclassified data as missing data. The authors apply the method to observational data from the Centers for AIDS Research Network of Integrated Clinical Systems (CNICS) to determine the joint effect of initiating therapy and current smoking status on all-cause mortality in patients with HIV. The covariate susceptible to measurement error was smoking status. A subgroup of patients (n = 3,686) with both provider-documented and self-reported smoking status served as an internal validation group. In the analysis using provider-documented smoking status, the authors fit a joint marginal structural Cox model using inverse-probability weights to control for time-varying confounding. In the analysis accounting for measurement error, multiple imputation was used to assign self-reported smoking status to the full cohort based on information from the subgroup; joint marginal models were fit for each imputation and the joint effects of the exposures of interest were estimated by averaging across imputations. Using provider-documented data, the prevalence of smoking was 39% and the results suggested that smoking (irrespective of therapy initiation) had a protective effect on all-cause mortality in HIV patients. Accounting for measurement error with multiple imputation, the prevalence of smoking was 50% and the results suggested that individuals who smoked and did not initiate therapy were at highest risk of mortality, while patients who initiated therapy (irrespective of smoking status) had lower mortality risk. These findings were consistent with the literature. In observational data, we often assume that a lack of information (for example, the absence of an ICD-9 code) implies the absence of a risk factor; the current paper provides a quantitative approach to investigate the impact of such measurement errors on results.
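
The sketch below captures the core imputation step under stated assumptions: the misclassification-prone self-report is treated as missing outside a validation subset, imputed repeatedly from an imputation model fit in that subset, and an outcome model is fit per imputation. The column names (`smoke_ehr`, `smoke_self`, `art`, `dead`, `age`, `cd4`) are hypothetical, a plain logistic outcome model stands in for the authors' weighted joint marginal structural Cox model, and only the pooled point estimates (not Rubin's-rules variances) are returned.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

covs = ["age", "cd4", "smoke_ehr"]   # hypothetical predictors of self-reported smoking

def impute_and_fit(df, rng, n_imputations=20):
    # Fit the imputation model in the validation subset with both measures.
    val = df[df["smoke_self"].notna()]
    imp_model = sm.Logit(val["smoke_self"], sm.add_constant(val[covs])).fit(disp=0)

    estimates = []
    for _ in range(n_imputations):
        d = df.copy()
        p = np.asarray(imp_model.predict(sm.add_constant(d[covs])))
        miss = d["smoke_self"].isna().to_numpy()
        d.loc[miss, "smoke_self"] = rng.binomial(1, p[miss])

        # Outcome model per imputation (logistic for brevity; the paper fits a
        # weighted joint marginal structural Cox model instead).
        X = sm.add_constant(d[["art", "smoke_self", "age", "cd4"]].astype(float))
        estimates.append(sm.Logit(d["dead"], X).fit(disp=0).params)

    # Pool point estimates across imputations (Rubin's-rules variance omitted).
    return pd.DataFrame(estimates).mean()
```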

Author

Wei Wang

Author

Michael E. Griswold

Publication

Statistical methods in medical research

Date

2015 Sep 1

Abstract

The Tobit model, also known as a censored regression model to account for left- and/or right-censoring in the dependent variable, has been used in many areas of applications, including dental health, medical research and economics. The reported Tobit model coefficient allows estimation and inference of an exposure effect on the latent dependent variable. However, this model does not directly provide overall exposure effects estimation on the original outcome scale. We propose a direct-marginalization approach using a reparameterized link function to model exposure and covariate effects directly on the truncated dependent variable mean. We also discuss an alternative average-predicted-value, post-estimation approach which uses model-predicted values for each person in a designated reference group under different exposure statuses to estimate covariate-adjusted overall exposure effects. Simulation studies were conducted to show the unbiasedness and robustness properties for both approaches under various scenarios. Robustness appears to diminish when covariates with substantial effects are imbalanced between exposure groups; we outline an approach for model choice based on information criterion fit statistics. The methods are applied to the Genetic Epidemiology Network of Arteriopathy (GENOA) cohort study to assess associations between obesity and cognitive function in the non-Hispanic white participants.

  • Notes:
    • The authors explore two approaches, the direct-marginalization approach and the average-predicted-value (APV) approach, in simulation studies to evaluate their unbiasedness and robustness under various scenarios. Both approaches use the Tobit regression model likelihood for parameter estimation in order to compare response means between groups while controlling for baseline covariates. Unlike linear regression models, Tobit regression models can account for left- or right-censoring of the dependent variable. The direct-marginalization approach allows a non-linear relationship between the dependent variable mean and the marginal mean for likelihood specification and parameter estimation, and the exposure coefficient directly provides the estimated overall exposure effect. The APV approach compares model-predicted response values for each individual under different exposure statuses. The direct-marginalization approach assumes homogeneous effects on the marginal mean, whereas the APV approach assumes homogeneous effects on the latent response scale. Simulation studies, together with an application to the Genetic Epidemiology Network of Arteriopathy (GENOA) cohort study, showed that both approaches provided valid and robust inferences when covariates were balanced between exposure groups, whether or not there was an exposure effect. However, when covariates are unbalanced, both methods can give biased estimates, especially when the covariate has a large effect on the dependent variable. Based on the simulation studies, the authors recommend using information criteria such as AIC and BIC to choose between the direct-marginalization and APV models when modeling the exposure-outcome relationship, particularly because robustness diminishes when influential covariates are imbalanced between exposure groups.
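
The snippet below sketches the APV idea for a left-censored-at-zero Tobit model, assuming the Tobit coefficients and scale have already been estimated elsewhere: the censored-outcome mean is predicted for every subject with exposure set to 1 and then to 0, and the two averages are contrasted. The closed-form censored-normal mean is standard, but the function names and the left-censoring point are illustrative choices, not the authors' exact formulation.

```python
import numpy as np
from scipy.stats import norm

def censored_mean(mu, sigma):
    # E[max(Y*, 0)] for Y* ~ N(mu, sigma^2): Phi(mu/sigma)*mu + sigma*phi(mu/sigma)
    z = mu / sigma
    return norm.cdf(z) * mu + sigma * norm.pdf(z)

def apv_contrast(X, beta_hat, sigma_hat, exposure_col):
    """Average-predicted-value exposure effect on the censored outcome scale,
    given a design matrix X and previously estimated Tobit parameters."""
    X1, X0 = X.copy(), X.copy()
    X1[:, exposure_col], X0[:, exposure_col] = 1.0, 0.0
    return (censored_mean(X1 @ beta_hat, sigma_hat).mean()
            - censored_mean(X0 @ beta_hat, sigma_hat).mean())
```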

Author

David B. Richardson

Author

Alexander P. Keil

Author

Eric Tchetgen Tchetgen

Author

Glinda Cooper

Volume

26

Issue

5

Pages

727-732

Publication

Epidemiology

Date

09/2015

  • Notes:
    • In epidemiologic cohort studies, standardized mortality ratios (SMRs) are a common method of assessing relative mortality, comparing observed mortality rates to expected mortality rates based on an external reference population while controlling for covariates. The accuracy of the SMR relies on the assumption that the external reference mortality rate represents the counterfactual of the observed outcomes, that is, what would have happened had exposure not occurred. In this paper, Richardson and colleagues propose a new application of negative control outcome analysis to handle situations where the external reference is not an accurate counterfactual and the SMR assumption does not hold. By leveraging a negative control, a third variable that is independent of the causal effect of interest but suffers from the same sources of bias, the proposed method may reduce, or even remove, potential bias in the SMR. It does so by using the known, exposure-independent bias of the negative control to adjust the rate ratios of interest.

The authors demonstrate the utility of this method using a case study of exposure to ionizing radiation and bladder cancer mortality in a longitudinal cohort of Oak Ridge National Laboratory workers. In the unadjusted analysis, the SMR was elevated for hourly-paid men but not for monthly-paid men. As there was no conceptual reason to expect a difference, the result was suspected to reflect a "healthy worker" bias. After negative control adjustment, the SMRs for the two groups were not statistically different. This novel approach is a potentially elegant way to use SMRs with reduced bias. However, use of a negative control requires certain assumptions about the causal relationships (or lack thereof) among the exposure of interest, the outcome of interest, and the negative control. In particular, the method relies on the unverifiable assumption that the negative control outcome is not caused by the exposure of interest but is subject to similar bias factors. A strong causal framework may help determine whether a particular scenario meets this assumption.
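
To make the adjustment idea concrete, the toy calculation below divides the SMR for the outcome of interest by the SMR for the negative control outcome, on the assumption that the two share the same multiplicative bias (for example, a healthy-worker effect). The numbers are invented for illustration, and the paper's actual adjustment is derived more formally than this simple ratio.

```python
# Illustrative numbers only.
smr_outcome = 1.35            # observed/expected for the outcome of interest
smr_negative_control = 1.30   # SMR for an outcome believed not to be caused by exposure

# If both SMRs share the same multiplicative bias, dividing them removes it.
adjusted_smr = smr_outcome / smr_negative_control
print(round(adjusted_smr, 2))  # ~1.04: the apparent elevation is largely shared bias
```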

Author

John L. Hopper

Volume

182

Issue

10

Pages

863-867

Publication

American Journal of Epidemiology

Date

11/15/2015

Abstract

How can the “strengths” of risk factors, in the sense of how well they discriminate cases from controls, be compared when they are measured on different scales such as continuous, binary, and integer? Given that risk estimates take into account other fitted and design-related factors—and that is how risk gradients are interpreted—so should the presentation of risk gradients. Therefore, for each risk factor X0, I propose using appropriate regression techniques to derive from appropriate population data the best fitting relationship between the mean of X0 and all the other covariates fitted in the model or adjusted for by design (X1, X2, … , Xn). The odds per adjusted standard deviation (OPERA) presents the risk association for X0 in terms of the change in risk per s = standard deviation of X0 adjusted for X1, X2, … , Xn, rather than the unadjusted standard deviation of X0 itself. If the increased risk is relative risk (RR)-fold over A adjusted standard deviations, then OPERA = exp[ln(RR)/A] = RRs. This unifying approach is illustrated by considering breast cancer and published risk estimates. OPERA estimates are by definition independent and can be used to compare the predictive strengths of risk factors across diseases and populations.

  • Notes:
    • This paper outlines a method for comparing the strengths of risk factors (i.e., relative risks) across different types of risk factors, e.g., categorical, ordinal, and continuous. The author proposes the odds per adjusted standard deviation, or OPERA: OPERA = exp[ln(RR)/A] = RR^s, where RR is the relative risk reported over A adjusted standard deviations of the risk factor and s is the adjusted standard deviation (so that A = 1/s when the RR is expressed per unit of the risk factor). The logic is that every variable, no matter the type (e.g., binary, continuous), has a standard deviation that can be calculated from population data and adjusted for relevant covariates. The paper discusses a number of examples of the formula for binary, continuous, and ordinal risk factors, and some special cases such as interaction between covariates or variation in risk gradients with other covariates. The method can be extremely useful for comparing the size of risk between risk factors that are different types of variables, and it has been used in citations discussed in the paper.
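
The snippet below is a literal transcription of the OPERA formula from the note, with illustrative numbers: a relative risk reported over A adjusted standard deviations is rescaled to a per-adjusted-SD odds.

```python
import numpy as np

def opera(rr, a):
    """Odds per adjusted standard deviation: a relative risk `rr` reported
    over `a` adjusted standard deviations rescaled to one adjusted SD."""
    return np.exp(np.log(rr) / a)

# Illustrative example: a risk factor with adjusted SD s = 0.3 and RR = 2.0
# per unit change corresponds to A = 1/s adjusted standard deviations.
s = 0.3
print(round(opera(2.0, 1 / s), 2))   # 2.0 ** 0.3 ≈ 1.23
```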

Author

Tibor Schuster

Author

Menglan Pang

Author

Robert W. Platt

Volume

24

Issue

9

Pages

1004-1007

Publication

Pharmacoepidemiology and Drug Safety

Date

September 1, 2015

Abstract

Purpose The high-dimensional propensity score algorithm attempts to improve control of confounding in typical treatment effect studies in pharmacoepidemiology and is increasingly being used for the analysis of large administrative databases. Within this multi-step variable selection algorithm, the marginal prevalence of non-zero covariate values is considered to be an indicator for a count variable's potential confounding impact. We investigate the role of the marginal prevalence of confounder variables on potentially caused bias magnitudes when estimating risk ratios in point exposure studies with binary outcomes. Methods We apply the law of total probability in conjunction with an established bias formula to derive and illustrate relative bias boundaries with respect to marginal confounder prevalence. Results We show that maximum possible bias magnitudes can occur at any marginal prevalence level of a binary confounder variable. In particular, we demonstrate that, in case of rare or very common exposures, low and high prevalent confounder variables can still have large confounding impact on estimated risk ratios. Conclusions Covariate pre-selection by prevalence may lead to sub-optimal confounder sampling within the high-dimensional propensity score algorithm. While we believe that the high-dimensional propensity score has important benefits in large-scale pharmacoepidemiologic studies, we recommend omitting the prevalence-based empirical identification of candidate covariates. Copyright © 2015 John Wiley & Sons, Ltd.

  • Notes:
    • Within the multi-step algorithm for the high-dimensional propensity score (hd-PS), the authors examine the role of a covariate's marginal prevalence in the variable selection step. Based on the bias-term formula used in the hd-PS algorithm and the law of total probability, the authors discuss the relationship among three parameters: the marginal prevalence of the confounder (Pc), the prevalence of exposure (Pe), and the relative risk between confounder and outcome (RRcd). They give a hypothetical example showing that even a confounder with low marginal prevalence can have a large confounding effect on the exposure-outcome association. They then vary the values of the three parameters and conclude that confounders with small marginal prevalence can still generate substantial bias in the estimated association. Therefore, the hd-PS step in which variables with low marginal prevalence are excluded may not be appropriate, and the authors recommend that investigators omit the prevalence-based empirical pre-selection of candidate covariates from the hd-PS algorithm.
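
The short calculation below uses the Bross-type bias multiplier that the hd-PS literature employs for prioritizing covariates to show how a confounder with a low marginal prevalence can still carry a large bias when the exposure is rare. The parameter values are invented for illustration, and the formula is the standard binary-confounder bias term rather than a reproduction of the paper's derivations.

```python
def bias_multiplier(pc1, pc0, rr_cd):
    """Multiplicative bias in the exposure-outcome risk ratio from an
    unadjusted binary confounder, where pc1 and pc0 are the confounder
    prevalences among the exposed and unexposed and rr_cd is the
    confounder-outcome risk ratio (Bross-type formula)."""
    return (pc1 * (rr_cd - 1) + 1) / (pc0 * (rr_cd - 1) + 1)

# Rare exposure with a confounder that is rare overall yet strongly imbalanced
# between the exposed and unexposed groups.
pe, pc1, pc0, rr_cd = 0.02, 0.30, 0.01, 5.0
marginal_prevalence = pe * pc1 + (1 - pe) * pc0   # ≈ 0.016, i.e. "low prevalence"
print(round(marginal_prevalence, 3), round(bias_multiplier(pc1, pc0, rr_cd), 2))  # 0.016 2.12
```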

Author

Wan-Hsiang Hsu

Author

A. Gregory DiRienzo

Publication

Statistical methods in medical research

Date

2015 Oct 1

Abstract

We propose a flexible continuation ratio (CR) model for an ordinal categorical response with potentially ultrahigh dimensional data that characterizes the unique covariate effects at each response level. The CR model is the logit of the conditional discrete hazard function for each response level given covariates. We propose two modeling strategies, one that keeps the same covariate set for each hazard function but allows regression coefficients to arbitrarily change with response level, and one that allows both the set of covariates and their regression coefficients to arbitrarily change with response. Evaluating a covariate set is accomplished by using the nonparametric bootstrap to estimate prediction error and their robust standard errors that do not rely on proper model specification. To help with interpretation of the selected covariate set, we flexibly estimate the conditional cumulative distribution function given the covariates using the separate hazard function models. The goodness-of-fit of our flexible CR model is assessed with graphical and numerical methods based on the cumulative sum of residuals. Simulation results indicate the methods perform well in finite samples. An application to B-cell acute lymphocytic leukemia data is provided.

  • Notes:
    • This article is particularly relevant to the use of ultrahigh-dimensional data to fit regression models that predict an ordered response variable. Traditionally, proportional odds models are used to predict ordinal response variables. With ultrahigh-dimensional data, the sparsity assumption comes into play: the assumption that there is a relatively small number of important predictors when the number of covariates exceeds the sample size. This is common in studies where the exposures are gene or protein expression data and the outcome is disease progression. An added consideration is that the predictors and their effects may change with each level of the ordinal response variable. For this scenario, the authors develop several model-building strategies that select "important" covariates to fit a flexible continuation ratio model with improved accuracy. In brief, the recommended strategy involves four steps. The first step is penalized variable screening using iterative sure independence screening (ISIS) based on the smoothly clipped absolute deviation (SCAD) penalty; log-likelihood contributions for each subject and covariate are assessed iteratively so as to retain only those covariates with nonzero estimates. In the second step, a bootstrap estimate of prediction error is computed for candidate models (intercept-only, one-variable, and two-variable models). In the third step, models are compared pairwise, with the model containing the smaller number of variables serving as the "reference" model and its comparator as the "competing" model; under the null hypothesis, the comparison statistic converges to a standard normal distribution. During this iterative process, several null hypotheses are rejected; among those rejected, the reference model with the largest number of covariates is identified and its corresponding competing model is recommended as the final choice. In the last step, the correctness of the model is checked using a Kolmogorov-type supremum test. The authors provide R code for this procedure.
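
The fragment below sketches only the continuation ratio structure itself: with level-specific coefficients, the model can be fit as a sequence of logistic regressions, one per non-terminal response level, each restricted to subjects still "at risk" of that level. The screening, bootstrap prediction-error, and goodness-of-fit steps described above are not shown, and the function and variable names are illustrative.

```python
import numpy as np
import statsmodels.api as sm

def fit_continuation_ratio(X, y, levels):
    """Fit a continuation ratio model with level-specific coefficients:
    for each non-terminal level k, model P(Y = k | Y >= k, X) with a
    logistic regression on the subset still at risk of level k."""
    fits = {}
    for k in levels[:-1]:
        at_risk = y >= k
        fits[k] = sm.Logit((y[at_risk] == k).astype(int),
                           sm.add_constant(X[at_risk])).fit(disp=0)
    return fits
```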

Author

Ridgeway Greg

Author

Kovalchik Stephanie Ann

Author

Griffin Beth Ann

Author

Kabeto Mohammed U.

Volume

3

Issue

2

Pages

237

Publication

Journal of Causal Inference

Date

2015

  • Notes:
    • In many studies that apply propensity score analysis to survey data, the sampling weights are ignored or not properly accounted for in the analysis. In this paper the authors argue that the sampling weights should be employed in both phases: (1) generating the propensity score and (2) the final outcome model. The authors derive the theoretical justification for using sampling weights in the estimation of propensity score weights. They also conduct a simulation study to examine which approach to handling sampling weights (ignoring them, using them as a covariate, or using them as a weight in the PS model and outcome model) produces the least biased treatment effect estimates. They found that using sampling weights as observation weights at all stages of the analysis reduced covariate imbalance and produced the most robust and accurate treatment effect estimates, especially in the presence of a complex survey design and model misspecification. They then compared the inclusion and exclusion of sampling weights in real data and, consistently, found better covariate balance when sampling weights were employed at all stages of the analysis. The authors conclude that the most robust approach is to use the sampling weights in the PS model and to use the product of the sampling weights and PS weights in the final outcome model. This approach can also be generalized to other weights in survey data, such as nonresponse weights.
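
The sketch below follows the recommendation summarized above, assuming hypothetical arrays X (covariates), a (treatment), y (outcome), and sw (survey sampling weights): the sampling weights enter the propensity score model as observation weights, and the product of the sampling weights and the inverse-probability-of-treatment weights is used in the outcome step, which here is a simple weighted mean difference for brevity.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def weighted_ps_analysis(X, a, y, sw):
    # 1. Estimate the propensity score using the sampling weights.
    ps_model = LogisticRegression(max_iter=1000).fit(X, a, sample_weight=sw)
    ps = ps_model.predict_proba(X)[:, 1]

    # 2. Combine sampling weights with inverse-probability-of-treatment weights.
    iptw = np.where(a == 1, 1 / ps, 1 / (1 - ps))
    w = sw * iptw

    # 3. Weighted outcome contrast using the combined weights.
    return (np.average(y[a == 1], weights=w[a == 1])
            - np.average(y[a == 0], weights=w[a == 0]))
```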

Author

Yong Chen

Author

Jing Ning

Author

Chunyan Cai

Volume

16

Issue

4

Pages

727-739

Publication

Biostatistics (Oxford, England)

Date

Oct 2015

Abstract

In longitudinal data analyses, the observation times are often assumed to be independent of the outcomes. In applications in which this assumption is violated, the standard inferential approach of using the generalized estimating equations may lead to biased inference. Current methods require the correct specification of either the observation time process or the repeated measure process with a correct covariance structure. In this article, we construct a novel pairwise likelihood method for longitudinal data that allows for dependence between observation times and outcomes. This method investigates the marginal covariate effects on the repeated measure process, while leaving the probability structure of the observation time process unspecified. The novelty of this method is that it yields consistent estimator of the marginal covariate effects without specification of the observation time process or the covariance structure of the repeated measures process. Large sample properties of the regression coefficient estimates and a pairwise likelihood ratio test procedure are established. Simulation studies demonstrate that the proposed method performs well in finite samples. An analysis of weight loss data from a web-based program is presented to illustrate the proposed method.

  • Notes:
    • Chen et al. discuss a new pairwise likelihood method for longitudinal data that allows for dependence between observation times and outcomes; the proposed method does not require modeling the observation time process (it is treated as a nuisance) or specifying the covariance structure of the repeated measures process. The authors achieve this by assuming that the probability of observing the response variable can be written as a product of functions of the response variable and the covariates; the existence of this relationship is the main assumption of the method. To solve for beta (the parameter of interest), the authors maximize the log pairwise likelihood or solve the score equations obtained by setting its derivative equal to zero. Of note, the log pairwise likelihood does not behave like a true likelihood function, so its large-sample properties must be established separately. This estimation method can be extended to flexible semiparametric models because the nonparametric component does not need to be estimated in order to estimate the marginal covariate effect (beta). In a simulation study comparing generalized estimating equations (GEE) with the new method under scenarios with no, weak, or strong relationships between the repeated measure process and the observation time process, the GEE method was nearly unbiased only when there was no relationship between the two processes, whereas the proposed method remained fairly unbiased across the spectrum of relationship strengths. In a sensitivity analysis, the authors found that the proposed method is somewhat robust even when its main assumption does not hold.

Author

Stefanie Muff

Author

Lukas F. Keller

Volume

57

Issue

6

Pages

1068-1083

Publication

Biometrical Journal

Date

November 1, 2015

Abstract

Covariate measurement error may cause biases in parameters of regression coefficients in generalized linear models. The influence of measurement error on interaction parameters has, however, only rarely been investigated in depth, and if so, attenuation effects were reported. In this paper, we show that also reverse attenuation of interaction effects may emerge, namely when heteroscedastic measurement error or sampling variances of a mismeasured covariate are present, which are not unrealistic scenarios in practice. Theoretical findings are illustrated with simulations. A Bayesian approach employing integrated nested Laplace approximations is suggested to model the heteroscedastic measurement error and covariate variances, and an application shows that the method is able to reveal approximately correct parameter estimates.

  • Notes:
    • When two covariates are correlated and one of them is mismeasured, regression estimates for both can be biased either toward the null (attenuation) or away from the null (reverse attenuation). Approaches such as regression calibration and Bayesian analyses help address bias in parameter estimates for covariates, but approaches to address bias in estimates for multiplicative interaction terms involving mismeasured covariates do not yet exist. The current article demonstrates reverse attenuation of interaction terms when the error or sampling variance of the mismeasured covariate is heteroscedastic, which can introduce spurious interaction effects and undermine the validity of a study. A new approach is proposed to model such mismeasurement, using integrated nested Laplace approximations (INLA) to estimate marginals that can subsequently be used to fit generalized linear mixed models. The authors successfully tested this approach in a subgroup drawn from the Framingham Heart Study, where the error-prone covariate was long-term systolic blood pressure and the interaction term was smoking by blood pressure. The practical application of this approach depends on whether researchers know the nature of the measurement error in a covariate included in the interaction term, that is, whether the error or sampling variance of the mismeasured covariate depends on the other variable in the interaction term. It is important to acknowledge the potential for these problems when working with certain types of data, such as patient self-report or physician-documented data, and to use appropriate error modeling to avoid introducing spurious interaction effects.

Author

Philip Pallmann

Author

Mias Pretorius

Author

Christian Ritz

Publication

Statistical methods in medical research

Date

2015 Sep 18

Abstract

We discuss several aspects of multiple inference in longitudinal settings, focusing on many-to-one and all-pairwise comparisons of (a) treatment groups simultaneously at several points in time, or (b) time points simultaneously for several treatments. We assume a continuous endpoint that is measured repeatedly over time and contrast two basic modeling strategies: fitting a joint model across all occasions (with random effects and/or some residual covariance structure to account for heteroscedasticity and serial dependence), and a novel approach combining a set of simple marginal, i.e. occasion-specific models. Upon parameter and covariance estimation with either modeling approach, we employ a variant of multiple contrast tests that acknowledges correlation between time points and test statistics. This method provides simultaneous confidence intervals and adjusted p-values for elementary hypotheses as well as a global test decision. We compare via simulation the powers of multiple contrast tests based on a joint model and multiple marginal models, respectively, and quantify the benefit of incorporating longitudinal correlation, i.e. the advantage over Bonferroni. Practical application is illustrated with data from a clinical trial on bradykinin receptor antagonism.

  • Notes:
    • When repeated measurements are made (within the same treatment at different time points or across treatments over time) and missing data are prevalent, as often occurs in longitudinal data analysis, traditional inferential methods (e.g., multivariate ANOVA) may fall short, especially when variance changes over time and the homoscedasticity assumption is violated. The authors compare joint extended linear and conditional independence models with a combination of occasion-specific marginal linear models, with the goals of obtaining accurate mean parameters and covariance matrices and retaining enough power to avoid Type II errors. They conducted a simulation using data from a clinical trial of bradykinin receptor antagonism to compare the joint model with multiple marginal models. They found that the marginal-models approach works well in most situations when multiple comparisons are made across treatments at multiple occasions, with power equivalent to tests based on joint models. However, when comparing measurements at multiple occasions within treatment arms, the marginal-models approach can produce too many incorrect rejections of the null hypothesis when the sample size is small.

Author

Dario Gasbarra

Author

Elja Arjas

Author

Aki Vehtari

Author

Rémy Slama

Author

Niels Keiding

Volume

21

Issue

4

Pages

594-625

Publication

Lifetime Data Analysis

Date

2015/06/12

Abstract

This paper was inspired by the studies of Niels Keiding and co-authors on estimating the waiting time-to-pregnancy (TTP) distribution, and in particular on using the current duration design in that context. In this design, a cross-sectional sample of women is collected from those who are currently attempting to become pregnant, and then by recording from each the time she has been attempting. Our aim here is to study the identifiability and the estimation of the waiting time distribution on the basis of current duration data. The main difficulty in this stems from the fact that very short waiting times are only rarely selected into the sample of current durations, and this renders their estimation unstable. We introduce here a Bayesian method for this estimation problem, prove its asymptotic consistency, and compare the method to some variants of the non-parametric maximum likelihood estimators, which have been used previously in this context. The properties of the Bayesian estimation method are studied also empirically, using both simulated data and TTP data on current durations collected by Slama et al. (Hum Reprod 27(5):1489–1498, 2012).

  • Notes:
    • This study aimed to identify and estimate the distribution of waiting time to pregnancy under the current duration design. Unlike the traditional cohort or case-control designs usually used in this field, the current duration design takes a cross-sectional sample of women who are currently attempting to become pregnant and records how long each has been attempting. However, the current duration design usually generates unstable estimates because very short waiting times are only rarely selected into the sample. The authors used a Bayesian method, applying a Markov chain Monte Carlo algorithm, to estimate the waiting time distribution and compared this method with nonparametric maximum likelihood estimators (NPMLE) in both a simulation study and an empirical study. They demonstrated that nonparametric Bayesian modeling and inference is a viable alternative approach for analyzing studies using the current duration design in both settings. The authors discussed the dependence of the posterior inference on the specification of the hyperparameters controlling the prior distribution in Bayesian modeling and suggested that complementary studies with different designs may help address this issue.

Author

Quentin Boucherie

Author

Vanessa Pauly

Author

Elisabeth Frauger

Author

Xavier Thirion

Author

Vincent Pradel

Author

Joëlle Micallef

Volume

24

Issue

9

Pages

991-998

Publication

Pharmacoepidemiology and Drug Safety

Date

September 1, 2015

Abstract

Purpose In pharmacoepidemiology, one of the main concerns is analysis of drug exposure time. However, in real-life settings, patient's behavior is complex and characterized by drug exposure dynamics. Multi-state models allow assessing the probabilities of various patterns, instead of just continuous use and/or discontinuation. The aim of this study was to illustrate with methadone, the use of multi-state model (MSM) in a large claims database. Methods This study is based on the French health insurance reimbursement database. Methadone exposure is defined using four states for each period of follow-up: syrup only, capsule only, syrup-capsule and no dispensing. The model considered 12 possible transitions (including reverse transitions) from one state to another. To describe these transitions a time-homogeneous Markov model was used. Results A total of 1265 methadone users were included. When patients belonged to the syrup-capsule state, they had a 61.8% chance of moving to capsules the following month and 20.9% of moving to syrup. The probability of moving from the syrup to the non-user state was 13.6% (11.7% from capsule state to non-user state). The average length of stay was 5.9 months (CI95%: [5.5–6.4]) in capsule state, 4.9 (CI95%: [4.6–5.2]) in syrup state and 5.9 (CI95%: [5.5–6.3]) in non user state. Conclusion MSM provided a good description of methadone patterns of use. It outlined behaviors which have led to a rapid spread of capsule formulation among methadone users. Therefore, it illustrates the utility of MSM for modeling multiple sequences of drug use in a large claims database. Copyright © 2015 John Wiley & Sons, Ltd.

  • Notes:
    • Treatment persistence is frequently measured using “time to event” models that censors an individual’s follow up upon treatment discontinuation. In fact, real world treatment patterns are commonly more complex, with discontinuation possibly an informative treatment state, either in and of itself, or as a state between periods of persistence. In this study, the authors propose extending Markov modelling methods, specifically multi-state modelling (MSM), to model these complexities. Proposed benefits of such a model include the ability to estimate the instantaneous probability of transition between states and mean time spent in each state. A proportional intensities multi-state model allows for inclusion of covariates for adjustment.

The article demonstrates MSM's potential functionality in the context of methadone maintenance treatment (MMT), modeled as moving into, out of, or remaining in the following states: capsule form only, syrup form only, both dosage forms, or no dispensing. Data from France's national health insurance reimbursement database were used, from which a sample of 1,265 incident users of MMT was identified. Model fit was assessed by comparing observed with expected transition rates. Results indicated that, between the syrup and capsule formulations, subjects had a higher probability of moving to, and spent a longer time in, the methadone capsule state. The "no dispensing" (traditionally, discontinuation) state was shown to be a dynamic rather than an absorbing state.

A number of limitations were identified, particularly a lack of power in the multivariable-adjusted model, possibly due to small effect sizes. In addition, MSM methods must meet common Markov assumptions, such as being "memoryless" and time-homogeneous (transition probabilities are not influenced by the duration of follow-up). However, when these assumptions are met, multi-state modelling provides a useful tool that allows researchers to acknowledge, and follow patients beyond, treatment discontinuation.
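
The snippet below sketches the basic machinery of a time-homogeneous multi-state (continuous-time Markov) model of the kind described above: a transition intensity matrix Q yields transition probabilities over any interval via the matrix exponential, and mean sojourn times follow from the diagonal of Q. The four states mirror the methadone example, but the intensity values are invented for illustration and are not the paper's estimates.

```python
import numpy as np
from scipy.linalg import expm

# Illustrative transition intensity matrix Q (rows sum to zero) for four
# states: syrup only, capsule only, syrup + capsule, no dispensing.
Q = np.array([
    [-0.25,  0.10,  0.02,  0.13],
    [ 0.05, -0.20,  0.02,  0.13],
    [ 0.21,  0.62, -0.90,  0.07],
    [ 0.08,  0.09,  0.01, -0.18],
])

# One-month transition probability matrix under time homogeneity: P(t) = expm(Q t).
P_one_month = expm(Q * 1.0)

# Mean sojourn time (average length of stay) in each state is -1 / q_ii.
mean_stay_months = -1 / np.diag(Q)

print(np.round(P_one_month, 3))
print(np.round(mean_stay_months, 1))
```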

Added March 2016

Author

Guan Wang

Author

Kenneth Jung

Author

Rainer Winnenburg

Author

Nigam H. Shah

Volume

22

Issue

6

Pages

1196-1204

Publication

Journal of the American Medical Informatics Association

Date

2015/11/01

Abstract

Objective Adverse drug events (ADEs) are undesired harmful effects resulting from use of a medication, and occur in 30% of hospitalized patients. The authors have developed a data-mining method for systematic, automated detection of ADEs from electronic medical records.Materials and Methods This method uses the text from 9.5 million clinical notes, along with prior knowledge of drug usages and known ADEs, as inputs. These inputs are further processed into statistics used by a discriminative classifier which outputs the probability that a given drug–disorder pair represents a valid ADE association. Putative ADEs identified by the classifier are further filtered for positive support in 2 independent, complementary data sources. The authors evaluate this method by assessing support for the predictions in other curated data sources, including a manually curated, time-indexed reference standard of label change events.Results This method uses a classifier that achieves an area under the curve of 0.94 on a held out test set. The classifier is used on 2 362 950 possible drug–disorder pairs comprised of 1602 unique drugs and 1475 unique disorders for which we had data, resulting in 240 high-confidence, well-supported drug-AE associations. Eighty-seven of them (36%) are supported in at least one of the resources that have information that was not available to the classifier.Conclusion This method demonstrates the feasibility of systematic post-marketing surveillance for ADEs using electronic medical records, a key component of the learning healthcare system.

  • Notes:
    • ANNOTATION - Wang et al. A method for systematic discovery of adverse drug events from clinical notes

Using discriminative classifiers, such as logistic regression models, to identify drug-adverse event (ADE) associations from observational data sources has demonstrated superior performance to traditional disproportionality analysis (DPA). Discriminative classifiers use data inputs to identify drug-ADE associations based on a learned function. One major advantage of this approach is that features such as prior knowledge about drug usage and ADEs can be incorporated with the inputs, which can improve the predictive ability of the model. While discriminative classifiers have been used successfully for ADE surveillance with data from spontaneous reporting systems (SRSs), fewer attempts have been made to use data extracted from electronic medical records (EMRs) or free-text clinical notes (CNs) via natural language processing, even though these data sources may be more complete and suffer from fewer biases. In this article, the investigators describe the development, testing, and validation of a classifier using data extracted from the CNs in STRIDE by an efficient text-processing system (an NCBO Annotator-based text-processing pipeline). The research aims to distinguish true drug-ADE associations from drug-ADE pairs merely reported in the data, using a classifier that incorporates features of drug-drug and disorder-disorder similarity and prior knowledge of drug use and ADEs. Overall, this classifier performed better than other methods such as DPA, demonstrating high sensitivity and specificity in the test set (AUC 0.94). In the validation, the PPV of the classifier was not improved over other methods; however, applying additional filters to the classifier's results improved the PPV. This research has implications for advancing the use of EMR data for ADE surveillance. The method allows additional features, such as prior knowledge of ADEs, to be incorporated in the model, which may improve signal detection, and the process is efficient, automated, and easily transferable to other healthcare institutions with various EMR systems.

Author

Ying Li

Author

Patrick B. Ryan

Author

Ying Wei

Author

Carol Friedman

Volume

38

Issue

10

Pages

895-908

Publication

Drug Safety

Date

2015/07/08

Abstract

Introduction Observational healthcare data contain information useful for hastening detection of adverse drug reactions (ADRs) that may be missed by using data in spontaneous reporting systems (SRSs) alone. There are only several papers describing methods that integrate evidence from healthcare databases and SRSs. We propose a methodology that combines ADR signals from these two sources. Objectives The aim of this study was to investigate whether the proposed method would result in more accurate ADR detection than methods using SRSs or healthcare data alone. Research Design We applied the method to four clinically serious ADRs, and evaluated it using three experiments that involve combining an SRS with a single facility small-scale electronic health record (EHR), a larger scale network-based EHR, and a much larger scale healthcare claims database. The evaluation used a reference standard comprising 165 positive and 234 negative drug–ADR pairs. Measures Area under the receiver operator characteristics curve (AUC) was computed to measure performance. Results There was no improvement in the AUC when the SRS and small-scale HER were combined. The AUC of the combined SRS and large-scale EHR was 0.82 whereas it was 0.76 for each of the individual systems. Similarly, the AUC of the combined SRS and claims system was 0.82 whereas it was 0.76 and 0.78, respectively, for the individual systems. Conclusions The proposed method resulted in a significant improvement in the accuracy of ADR detection when the resources used for combining had sufficient amounts of data, demonstrating that the method could integrate evidence from multiple sources and serve as a tool in actual pharmacovigilance practice.

  • Notes:
    • ANNOTATION - Li et al. A method to combine signals from spontaneous reporting systems and observational healthcare data to detect adverse drug reactions

The purpose of this research was to develop and test the performance of a framework that combines adverse drug reaction (ADR) surveillance data from a spontaneous reporting system, FAERS, with electronic healthcare data. The investigators evaluated the signal discrimination performance, measured by the area under the ROC curve (AUC), of the following data combinations for detecting four specified ADRs: (1) FAERS plus a small single-hospital electronic health record (EHR); (2) FAERS plus a large EHR; and (3) FAERS plus claims data. In addition, the investigators developed a confounder adjustment method for calculating signal scores from FAERS and the small EHR data. The confounder adjustment method used a two-step LASSO regression: in step 1, a standard logistic LASSO regression identified significantly associated confounders; in step 2, a weighted linear LASSO regression selected covariates associated with drug use. The association between the ADR and the medication was then estimated with adjustment for all confounders, and the one-sided p-values served as the signal scores. For the large EHR data and the claims data, signal scores (p-values) were calculated from the log relative risk and its standard error. The investigators calibrated the signal scores using a reference negative set for each ADR of interest and then combined the scores from the heterogeneous databases. Performance of the framework was evaluated by comparing its AUC to that of the data sources used independently.

The combined method performed better than the independent methods for FAERS plus the large EHR and FAERS plus claims data, but not for FAERS plus the small EHR. The investigators suggest that the lack of significant improvement for FAERS plus the small EHR data was a result of a small sample size and/or sampling bias. While this study presents an interesting approach to combining data from two sources for signal detection, there are significant concerns about the validity of the results. First, the appropriateness of using p-values as signal scores is questionable. Second, little information is provided about the formula used to combine signal scores from two databases; more transparency regarding this method is necessary for understanding the implications of this research.

Author

Melissa M. Parker

Author

Howard H. Moffet

Author

Alyce Adams

Author

Andrew J. Karter

Volume

22

Issue

5

Pages

957-961

Publication

Journal of the American Medical Informatics Association

Date

2015/09/01

Abstract

Objective Identifying patients who are medication nonpersistent (fail to refill in a timely manner) is important for healthcare operations and research. However, consistent methods to detect nonpersistence using electronic pharmacy records are presently lacking. We developed and validated a nonpersistence algorithm for chronically used medications.Materials and Methods Refill patterns of adult diabetes patients (n = 14,349) prescribed cardiometabolic therapies were studied. We evaluated various grace periods (30-300 days) to identify medication nonpersistence, which is defined as a gap between refills that exceeds a threshold equal to the last days’ supply dispensed plus a grace period plus days of stockpiled medication. Since data on medication stockpiles are typically unavailable for ongoing users, we compared nonpersistence to rates calculated using algorithms that ignored stockpiles.Results When using grace periods equal to or greater than the number of days’ supply dispensed (i.e., at least 100 days), this novel algorithm for medication nonpersistence gave consistent results whether or not it accounted for days of stockpiled medication. The agreement (Kappa coefficients) between nonpersistence rates using algorithms with versus without stockpiling improved with longer grace periods and ranged from 0.63 (for 30 days) to 0.98 (for a 300-day grace period).Conclusions Our method has utility for health care operations and research in prevalent (ongoing) and new user cohorts. The algorithm detects a subset of patients with inadequate medication-taking behavior not identified as primary nonadherent or secondary nonadherent. Healthcare systems can most comprehensively identify patients with short- or long-term medication underutilization by identifying primary nonadherence, secondary nonadherence, and nonpersistence.

  • Notes:
    • ANNOTATION - Parker et al. An algorithm to identify medication nonpersistence using electronic pharmacy databases

Medication nonpersistence can be defined as the "failure to refill a medication within a time period consistent with the use of the drug." This concept differs from medication nonadherence, which is defined as the "extent to which a patient does not take medications as prescribed." Validated algorithms to measure nonadherence using electronic claims data have been developed and are used frequently; however, validated algorithms do not exist for nonpersistence. Currently, one method used to measure nonpersistence dichotomously is to identify when the time between refills exceeds an "allowable gap." This method needed evaluation with respect to two major limitations: (1) the impact of the length of the allowable gap, and (2) the impact of including stockpiled medication in the calculation. The investigators in this study tested an algorithm for measuring nonpersistence to determine the impact of the length of the allowable gap with and without consideration of stockpiling. In the algorithm, the allowable gap was calculated as the last days' supply plus a grace period, with or without days of stockpiled medication. The grace period was varied to change the length of the gap: 30, 75, 100, 180, and 300 days. The investigators found that as long as the grace period in the allowable-gap calculation is at least as long as the days supplied, an accurate measure of nonpersistence can be obtained whether or not stockpiled medication is included. This research may be informative when determining exposure to medications using electronic claims data. The investigators suggest that this measure might inform epidemiologic research aiming to use nonpersistence as an exposure, effect modifier, or outcome.
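
A minimal sketch of the threshold rule described above, with the function name and the example values chosen for illustration: a refill gap is flagged as nonpersistent when it exceeds the last days' supply plus the grace period plus any stockpiled days carried into the interval.

```python
def is_nonpersistent(gap_days, last_days_supply, grace_period, stockpile_days=0):
    """Flag nonpersistence when the observed gap between refills exceeds the
    allowable gap: last days' supply + grace period + stockpiled days."""
    allowable_gap = last_days_supply + grace_period + stockpile_days
    return gap_days > allowable_gap

# Example: a 100-day fill, a 100-day grace period, and 15 stockpiled days give
# an allowable gap of 215 days, so a 230-day gap is flagged as nonpersistent.
print(is_nonpersistent(gap_days=230, last_days_supply=100,
                       grace_period=100, stockpile_days=15))   # True
```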

Author

Hua Shen

Author

Richard J. Cook

Volume

57

Issue

5

Pages

725-742

Publication

Biometrical Journal

Date

September 1, 2015

Abstract

Interval-censored recurrent event data arise when the event of interest is not readily observed but the cumulative event count can be recorded at periodic assessment times. In some settings, chronic disease processes may resolve, and individuals will cease to be at risk of events at the time of disease resolution. We develop an expectation-maximization algorithm for fitting a dynamic mover-stayer model to interval-censored recurrent event data under a Markov model with a piecewise-constant baseline rate function given a latent process. The model is motivated by settings in which the event times and the resolution time of the disease process are unobserved. The likelihood and algorithm are shown to yield estimators with small empirical bias in simulation studies. Data are analyzed on the cumulative number of damaged joints in patients with psoriatic arthritis where individuals experience disease remission.

  • Notes:
    • Bivariate failure time data are common in many medical studies where estimation of the strength of dependence is required. Right-censored bivariate failure time data are comparatively well understood; less is known about bivariate interval-censored data, where the continuous survival outcome is not observed exactly but is only known to lie within an interval. The authors present a nonparametric procedure for testing independence of bivariate interval-censored data. They modified Kendall's tau, a conventional non-parametric measure of rank correlation, for interval-censored data using expected numbers of concordant and discordant pairs. Through simulation studies, the authors concluded that their modification required less intensive computation and offered higher power than other procedures such as multiple imputation.

Author

Matteo Bottai

Publication

Statistical methods in medical research

Date

2015 Sep 18

Abstract

The occurrence of an event of interest over time is often summarized by the incidence rate, defined as the average number of events per person-time. This type of rate applies to events that may occur repeatedly over time on any given subject, such as infections, and Poisson regression represents a natural regression method for modelling the effect of covariates on it. However, for events that can occur only once, such as death, the geometric rate may be a better summary measure. The geometric rate has long been utilized in demography for studying the growth of populations and in finance to compute compound interest on capital. This type of rate, however, is virtually unknown to medical research. This may be partly a consequence of the lack of a regression method for it. This paper describes a regression method for modelling the effect of covariates on the geometric rate. The described method is based on applying quantile regression to a transform of the time-to-event variable. The proposed method is used to analyze mortality in a randomized clinical trial and in an observational epidemiological study.

  • Notes:
    • The new occurrence of an outcome of interest is often reported as an incidence rate. While this measure is simple to calculate and is an appropriate summary measure for an outcome that can occur several times to a study subject, the authors argue that the geometric rate is a better measure than the incidence rate for outcomes that a subject can only experience once (e.g. death), because constant rates (i.e. incidence rates) will underestimate risk over time. The authors apply quantile regression to a transform of the time-to-event variable in two contexts: (i) a randomized clinical trial evaluating the effect of two drugs on mortality in patients with cancer and (ii) an observational cohort of healthy Swedish men. Observed and expected survival under constant mortality rates and under geometric rates were calculated for both contexts and compared qualitatively. This study may be viewed as a proof-of-concept paper, where the two contexts are used as case studies that demonstrate the application of geometric rate models, including their interpretation, in medical research.
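
As a numerical illustration of the distinction, the sketch below contrasts a constant incidence rate with a geometric rate computed from the same survival proportion. The formulas S(t) = exp(-r*t) and S(t) = (1 - g)^t are the standard constant-rate and compounding definitions and are used here as assumptions rather than the paper's own notation.

    import math

    # Suppose 30% of a closed cohort has died by year 5, so S(5) = 0.70.
    S, t = 0.70, 5.0

    incidence_rate = -math.log(S) / t      # constant events per person-year: S(t) = exp(-r*t)
    geometric_rate = 1.0 - S ** (1.0 / t)  # average annual proportion dying: S(t) = (1-g)^t

    print(round(incidence_rate, 4))  # ~0.0713
    print(round(geometric_rate, 4))  # ~0.0689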

Author

Srijan Sengupta

Author

Stanislav Volgushev

Author

Xiaofeng Shao

Pages

0-0

Publication

Journal of the American Statistical Association

Date

September 1, 2015

  • Notes:
    • Computational efficiency and algorithm quality are important considerations when comparing bootstrapping procedures for massive data. This study proposes a new bootstrapping method for massive data that combines subsampling with the double bootstrap, which the authors term the subsampled double bootstrap (SDB). It can overcome the limitations associated with the bag of little bootstraps (BLB) method since it provides more comprehensive coverage of the data within given computational constraints. The SDB is also cheaper and faster than the fast double bootstrap (FDB). SDB is straightforward, does not require the selection of any tuning parameters, and can be applied to both independent data and stationary time series data. This approach offers an alternative method for resampling from 'big data' and may provide gains in computational efficiency. A schematic sketch of the resampling loop follows these notes.
    • doi: 10.1080/01621459.2015.1080709
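
A minimal sketch of that resampling loop under the simplifications described in the note (one full-size resample per subset, with deviations from the subset statistic used to approximate sampling variability); the function name and settings are illustrative, not the authors' implementation.

    import numpy as np

    def sdb_deviations(x, stat, b, reps, seed=None):
        """SDB-style loop: for each repetition, draw a small random subset of size b
        without replacement, then draw one full-size (n) resample with replacement
        from that subset; the deviation of the resample statistic from the subset
        statistic approximates the sampling variability of the full-sample statistic."""
        rng = np.random.default_rng(seed)
        n = len(x)
        devs = []
        for _ in range(reps):
            subset = rng.choice(x, size=b, replace=False)
            resample = rng.choice(subset, size=n, replace=True)
            devs.append(stat(resample) - stat(subset))
        return np.array(devs)

    x = np.random.default_rng(1).normal(size=100_000)
    devs = sdb_deviations(x, np.mean, b=1_000, reps=200, seed=2)
    print(devs.std())  # roughly sigma / sqrt(n) ~= 0.003, the SE of the sample mean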

Author

Lendle Samuel David

Author

Fireman Bruce

Author

van der Laan Mark J.

Volume

3

Issue

2

Pages

139

Publication

Journal of Causal Inference

Date

2015

  • Notes:
    • ANNOTATION - Lendle et al. Balancing score adjusted targeted minimum loss-based estimation

      This article provides a review of propensity score (PS) based estimators that include a balancing score property, describes a modified version of a targeted minimum loss-based estimator (TMLE) that includes a balancing score property, and evaluates its performance compared to traditional estimators. In observational studies, estimation of the causal effects of a treatment (i.e., average treatment effect [ATE], average treatment effect among the treated [ATT], etc.) often utilizes PS methods (e.g., stratification, PS matching, inverse probability of treatment weighting, etc.) that adjust for the PS in place of baseline covariates. These methods require consistent estimation of the PS. For nonparametric approaches, like matching and stratification, the criterion of consistency is met if the estimated PS converges at the balancing score, but misclassification and bias are possible. Some PS estimators have a balancing score property that improves the robustness of the estimate; however, these estimators are often inefficient. Other types of estimators, like the TMLE, are considered "efficient" and "doubly robust." Therefore, in this article the investigators aimed to develop a TMLE estimator with a balancing score property that would maintain the efficiency of the TMLE with the added robustness of the balancing score property. The new estimator, balancing score adjusted TMLE (BSA-TMLE), was tested in a number of simulations and compared with five other estimators: 1) Simple plug-in estimator; 2) Doubly robust, balancing score adjusted plug-in estimator (DR-BSA) without TMLE; 3) Non-doubly robust balancing score adjusted estimator (BSA); 4) Inverse probability of treatment weighted estimator (IPTW); and 5) Regular TMLE. Through simulations, the investigators demonstrated that a balancing score property improves the function of an estimator. Estimators without BSA are more biased. They also simulated situations where the PS estimator was not consistent and found that if the estimated PS is close enough to the balancing score then bias remains low. Overall, the BSA-TMLE performed well in simulation against other estimators. The BSA-TMLE may be used to improve the calculation/estimation of causal effects, such as ATE and ATT. This work advances efforts to develop efficient modes of PS estimation with reduced bias.

Author

Charles E Rose

Author

Yi Pan

Author

Andrew L Baughman

Volume

06

Issue

04

Publication

Journal of Biometrics & Biostatistics

Date

2015

  • Notes:
    • The log-binomial and Poisson models are commonly used to estimate the relative risk (RR), and logistic regression, which estimates an odds ratio (OR), is often used to approximate the RR when the outcome is rare. However, logistic regression may also be used to estimate the RR directly, though maximum likelihood estimation of the standard error of the RR can be a challenge.

The authors developed a Bayesian logistic regression  (BLR) model to estimate the RR, with credible intervals (CIBs). They applied the BLR model to a large hypothetical cross-sectional study with categorical variables and to a small hypothetical clinical trial with a continuous variable for which the log-binomial method did not converge. Results of the BLR model were compared to those from several commonly used RR modeling methods (stratified Mantel-Haenszel, logistic standardization by Marginal Method, Poisson, Robust Poisson, and log-binomial).

The authors demonstrated that the BLR model estimates adjusted RRs and 95% CIBs comparable to results from the other methods. They concluded that the BLR modeling approach compares favorably with commonly used modeling methods, regardless of outcome prevalence.
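
The sketch below shows one generic way to obtain an adjusted RR and credible interval from a Bayesian logistic model: a random-walk Metropolis sampler followed by marginal standardization over the posterior draws. The simulated data, vague prior, sampler settings, and standardization step are assumptions for illustration and are not necessarily the authors' exact formulation.

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulated cross-sectional data: binary exposure x, binary confounder z
    n = 2000
    z = rng.binomial(1, 0.4, n)
    x = rng.binomial(1, 0.3 + 0.2 * z)
    y = rng.binomial(1, 1 / (1 + np.exp(-(-1.5 + 0.7 * x + 0.5 * z))))
    X = np.column_stack([np.ones(n), x, z])

    def log_post(beta):
        eta = X @ beta
        # Bernoulli log-likelihood plus a vague normal(0, 10^2) prior
        return np.sum(y * eta - np.log1p(np.exp(eta))) - np.sum(beta ** 2) / 200.0

    # Random-walk Metropolis sampler
    beta = np.zeros(3)
    lp = log_post(beta)
    draws = []
    for i in range(20000):
        prop = beta + rng.normal(scale=0.08, size=3)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            beta, lp = prop, lp_prop
        if i >= 5000 and i % 10 == 0:
            draws.append(beta.copy())

    # Adjusted RR by marginal standardization for each retained posterior draw
    X1, X0 = X.copy(), X.copy()
    X1[:, 1], X0[:, 1] = 1, 0
    rr = [np.mean(1 / (1 + np.exp(-X1 @ b))) / np.mean(1 / (1 + np.exp(-X0 @ b)))
          for b in draws]
    print(np.median(rr), np.percentile(rr, [2.5, 97.5]))  # posterior median RR and 95% CIB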

Author

Renate Meyer

Author

Jose S. Romeo

Volume

57

Issue

6

Pages

982-1001

Publication

Biometrical Journal

Date

November 1, 2015

Abstract

The analysis of recurrent event data is of particular importance in medical statistics where patients suffering from chronic diseases often present with multiple recurring relapses or cancer patients experience several tumor recurrences. Whereas individual subjects can be assumed to be independent, the times between events of one subject are neither independent nor identically distributed. Apart from the marginal approach by Wei et al. (), the shared frailty model, see for example Duchateau and Janssen (), has been used extensively to analyze recurrent event data, where the correlation between sequential times is implicitly taken into account via a random effect. Oakes () and Romeo et al. () showed and exemplified the equivalence of frailty models for bivariate survival data to Archimedean copulas. Despite the fact that copula-based models have been used to model parallel survival data, their application to recurrent failure time data has only recently been suggested by Lawless and Yilmaz () for the bivariate case. Here, we extend this to more than two recurrent events and model the joint distribution of recurrent events explicitly using parametric copulas within a Bayesian framework. This framework allows for parametric as well as a nonparametric modeling of the marginal baseline hazards and models the influence of covariates on the marginals via a proportional hazards assumption. Furthermore, the parameters of the copula may also depend on the covariates. We illustrate the flexibility of this approach using data from an asthma prevention trial in young children.

  • Notes:
    • When analyzing recurrent time-to-event data (e.g., relapses in cancer), individual subjects can be assumed to be independent, but the times between events within a subject are not; therefore, traditional proportional hazards models cannot be applied directly. Previously, shared frailty models have been used to model dependent gap times between events; however, these models are not flexible in terms of marginal distributional forms and do not provide easy interpretation of covariate effects. This paper describes an extension of copula-based methods using a Bayesian approach for two or more recurrent events. A comparison of shared frailty models versus copula models using data from an asthma prevention trial showed similar results. Further, the copula-based approach provided several advantages in terms of making probability statements about any lower-dimensional marginal distribution or any conditional distribution of interest, extension to non-parametric modeling, assessing transient carryover effects, and straightforward interpretation of covariate effects on the marginal hazards. One-stage estimation of the marginals and the copula with WinBUGS, as described in this paper, can avoid the efficiency loss of previous frequentist two-stage methods. Since the copula-based method decouples the marginal distributions from the dependence structure, this approach is more generalizable and can be used in cure rate models and in modelling distinct terminal events such as death.
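
For intuition, the sketch below simulates two positively associated gap times through a Clayton (Archimedean) copula with exponential margins; the parameter values are arbitrary, and the paper's Bayesian estimation of the marginals and copula (e.g., in WinBUGS) is not reproduced.

    import numpy as np
    from scipy.stats import kendalltau

    rng = np.random.default_rng(0)
    theta = 2.0               # Clayton dependence parameter (larger = stronger association)
    lam1, lam2 = 0.10, 0.08   # exponential hazard rates for the two marginal gap times

    # Sample (U, V) from a Clayton copula by conditional inversion
    u = rng.uniform(size=2000)
    w = rng.uniform(size=2000)
    v = ((w ** (-theta / (1 + theta)) - 1) * u ** (-theta) + 1) ** (-1 / theta)

    gap1 = -np.log(1 - u) / lam1   # invert the exponential CDF for each margin
    gap2 = -np.log(1 - v) / lam2

    # Empirical Kendall's tau should be near theta / (theta + 2) = 0.5
    print(kendalltau(gap1, gap2)[0])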

Author

Tomohiro Shinozaki

Author

Yutaka Matsuyama

Volume

26

Issue

6

Pages

873-877

Publication

Epidemiology

Date

11/2015

  • Notes:
    • Standardization is one of several methods used to account for potential confounders. There are two model-based standardization methods for sets of confounding variables with too many combinations to be fully stratified: regression standardization and inverse probability of exposure weighted–reweighted estimators. Each of these requires modeling assumptions: regression standardization requires an outcome regression model conditional on exposure and confounders, while inverse probability weighted–reweighted estimators require a propensity score model.

The authors propose doubly robust estimators, which only require correct specification of either the outcome regression or the propensity score model but do not necessitate both. They demonstrate the method using an observational study on the effect of postoperative tamoxifen therapy on breast cancer recurrence and run multiple bootstrap simulations that vary the number of correct specifications.

The doubly robust estimator yielded minimally biased estimates when at least one of these model specifications was correct. The authors emphasize that their modification assumes that censoring and time-to-event are unconditionally independent; in practice, analysts assume that these are conditionally independent given exposure and confounders, which is typically more realistic.
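
A minimal sketch of a doubly robust (augmented inverse probability weighting) standardization for a binary outcome is shown below, using hypothetical simulated data and scikit-learn logistic models for the outcome and propensity score; the paper's time-to-event setting and censoring adjustments are not reproduced.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def dr_standardized_risks(X, a, y):
        """Doubly robust (AIPW) estimates of the standardized risk under exposure
        (a = 1) and no exposure (a = 0); consistent if either the outcome model
        or the propensity score model is correctly specified."""
        ps = LogisticRegression(max_iter=1000).fit(X, a).predict_proba(X)[:, 1]
        out = LogisticRegression(max_iter=1000).fit(np.column_stack([X, a]), y)
        m1 = out.predict_proba(np.column_stack([X, np.ones_like(a)]))[:, 1]
        m0 = out.predict_proba(np.column_stack([X, np.zeros_like(a)]))[:, 1]
        r1 = np.mean(a * y / ps - (a - ps) / ps * m1)
        r0 = np.mean((1 - a) * y / (1 - ps) + (a - ps) / (1 - ps) * m0)
        return r1, r0

    rng = np.random.default_rng(1)
    X = rng.normal(size=(5000, 2))
    a = rng.binomial(1, 1 / (1 + np.exp(-0.5 * X[:, 0])))
    y = rng.binomial(1, 1 / (1 + np.exp(-(-1 + 0.8 * a + 0.6 * X[:, 0] + 0.3 * X[:, 1]))))
    r1, r0 = dr_standardized_risks(X, a, y)
    print(r1, r0, r1 - r0)   # standardized risks and the risk difference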

Author

Goldfarb Rumyantzev AS

Author

Ning Dong

Volume

07

Issue

01

Publication

Journal of Biometrics & Biostatistics

Date

2015

Author

Ulrich K. Müller

Author

Andriy Norets

Pages

0-00

Publication

Journal of the American Statistical Association

Date

September 11, 2015

  • Notes:
    • doi: 10.1080/01621459.2015.1086654
    • Set estimators such as confidence or credible sets are used to describe uncertainty around model parameters. It is desirable for these estimators to provide both frequentist coverage and Bayesian credibility properties. For locally asymptotically normal (LAN) models, both approaches can deliver asymptotically equivalent set estimators, however, in case of nonstandard problems, this equivalence is not achieved. This paper discusses construction of set estimators which have both frequentist and Bayesian properties and shows that under a mild continuity condition, for a given type of credible set, a prior distribution exists that induces frequentist coverage of level 1 − α for the (1 − α)-credible set. The coverage inducing priors can be applied in many settings including prediction sets, settings with a general parameter space, hierarchical Bayesian priors, and weighted frequentist coverage.

Author

Abel N. Kho

Author

John P. Cashy

Author

Kathryn L. Jackson

Author

Satyender Goel

Volume

22

Issue

5

Pages

1072-1080

Publication

Journal of the American Medical Informatics Association

Date

2015/09/01

Abstract

Objective To design and implement a tool that creates a secure, privacy preserving linkage of electronic health record (EHR) data across multiple sites in a large metropolitan area in the United States (Chicago, IL), for use in clinical research. Methods The authors developed and distributed a software application that performs standardized data cleaning, preprocessing, and hashing of patient identifiers to remove all protected health information. The application creates seeded hash code combinations of patient identifiers using a Health Insurance Portability and Accountability Act compliant SHA-512 algorithm that minimizes re-identification risk. The authors subsequently linked individual records using a central honest broker with an algorithm that assigns weights to hash combinations in order to generate high specificity matches. Results The software application successfully linked and de-duplicated 7 million records across 6 institutions, resulting in a cohort of 5 million unique records. Using a manually reconciled set of 11 292 patients as a gold standard, the software achieved a sensitivity of 96% and a specificity of 100%, with a majority of the missed matches accounted for by patients with both a missing social security number and last name change. Using 3 disease examples, it is demonstrated that the software can reduce duplication of patient records across sites by as much as 28%. Conclusions Software that standardizes the assignment of a unique seeded hash identifier merged through an agreed upon third-party honest broker can enable large-scale secure linkage of EHR data for epidemiologic and public health research. The software algorithm can improve future epidemiologic research by providing more comprehensive data given that patients may make use of multiple healthcare systems.

  • Notes:
    • Integrating health records across care delivery sites is critical to developing more comprehensive and representative healthcare data. This paper describes the design and implementation of a privacy-preserving electronic health record (EHR) linkage tool in the Chicago area using a secure hashing algorithm. Each of the six participating sites in Chicago provided EHR data. Demographic information across linked records was reconciled to produce only one demographic record per patient. Using a HIPAA-compliant cryptographic hash function, hashes were created for each patient, and a simple deterministic algorithm was used to match records across institutions. To evaluate the record linkage tool, demographic features were compared between the study population and US Census data, and the number of patients diagnosed with three well-captured conditions was compared before and after deduplication. The results showed a significant reduction in the number of unique patients and the number of diagnosed patients after deduplication. Compared with 2010 US Census data, the combined data contained a much higher proportion of minority patients and a much lower proportion of younger patients, and race and ethnicity characteristics varied widely among participating sites. This variation demonstrates that using data from a single center could lead to biased results. The approach and software algorithm are potentially important as data are combined across institutions to avoid double-counting patients who use more than a single system, which was observed to be common in this study. The software is available for research purposes.
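
For illustration, the sketch below builds seeded SHA-512 hashes of a few identifier combinations so that only hashes, rather than identifiers, would be shared with an honest broker; the field choices, normalization, and seed handling are hypothetical and do not reflect the published tool's actual specification.

    import hashlib

    def hash_identifiers(first, last, dob, ssn4, seed):
        """Return seeded SHA-512 hashes of several identifier combinations."""
        def h(*fields):
            cleaned = "|".join(f.strip().upper() for f in fields)
            return hashlib.sha512((seed + cleaned).encode("utf-8")).hexdigest()
        return {
            "first_last_dob": h(first, last, dob),
            "last_dob_ssn4": h(last, dob, ssn4),
            "first_dob_ssn4": h(first, dob, ssn4),
        }

    hashes = hash_identifiers("Jane", "Doe", "1970-01-01", "1234", seed="site-shared-secret")
    print(hashes["first_last_dob"][:16])  # only hashes would leave the site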

Author

Yannick Girardeau

Author

Claire Trivin

Author

Pierre Durieux

Author

Christine Le Beller

Author

Lillo-Le Louet Agnes

Author

Antoine Neuraz

Author

Patrice Degoulet

Author

Paul Avillach

Volume

38

Issue

9

Pages

799-809

Publication

Drug Safety

Date

2015/06/21

Abstract

Background and Objective While risk of acute kidney injury (AKI) is a well documented adverse effect of some drugs, few studies have assessed the relationship between drug–drug interactions (DDIs) and AKI. Our objective was to develop an algorithm capable of detecting potential signals on this relationship by retrospectively mining data from electronic health records. Material and methods Data were extracted from the clinical data warehouse (CDW) of the Hôpital Européen Georges Pompidou (HEGP). AKI was defined as the first level of the RIFLE criteria, that is, an increase ≥50 % of creatinine basis. Algorithm accuracy was tested on 20 single drugs, 10 nephrotoxic and 10 non-nephrotoxic. We then tested 45 pairs of non-nephrotoxic drugs, among the most prescribed at our hospital and representing distinct pharmacological classes for DDIs. Results Sensitivity and specificity were 50 % [95 % confidence interval (CI) 23.66–76.34] and 90 % (95 % CI 59.58–98.21), respectively, for single drugs. Our algorithm confirmed a previously identified signal concerning clarithromycin and calcium-channel blockers (unadjusted odds ratio (ORu) 2.92; 95 % CI 1.11–7.69, p = 0.04). Among the 45 drug pairs investigated, we identified a signal concerning 55 patients in association with bromazepam and hydroxyzine (ORu 1.66; 95 % CI 1.23–2.23). This signal was not confirmed after a chart review. Even so, AKI and co-prescription were confirmed for 96 % (95 % CI 88–99) and 88 % (95 % CI 76–94) of these patients, respectively. Conclusion Data mining techniques on CDW can foster the detection of adverse drug reactions when drugs are used alone or in combination.

  • Notes:
    • Adverse drug reactions due to drug-drug interactions (DDIs) can be extremely difficult to identify. This paper explores a way to identify signals concerning acute kidney injury (AKI) and DDIs utilizing data from electronic health records (EHRs). EHR data from a French hospital were used with a set of algorithms to identify AKI, defined as an increase in creatinine of at least 50% from baseline. The algorithm was tested for accuracy using known nephrotoxic and non-nephrotoxic drugs. Among single drugs, the algorithm's sensitivity was 50% and specificity was 90% for detecting AKI based on the known properties of the medications. The algorithm was then applied to 45 pairs of non-nephrotoxic drugs to test for signals concerning AKI and DDIs. It identified AKI associated with concomitant use of clarithromycin and calcium channel blockers, which had been previously reported. The algorithm also identified an increased risk of AKI for bromazepam and hydroxyzine in combination; however, chart reviews for this combination indicated it was a false positive signal, as nearly all patients had another cause of AKI. Although this study did not identify new DDIs for the risk of AKI in hospitalized patients, it demonstrated the use of EHR data, specifically clinical data, and its potential utility for signal detection of DDIs. Full-scale application of this approach is limited by how often medication combinations are used frequently enough to provide sufficient sample sizes and by the ability to identify signals that are the result of the DDI and not another underlying cause. It will be important for future signal detection efforts to tease apart true signals from DDIs versus those due to other causes.
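
A minimal sketch of the creatinine screening rule described above (a 50% or greater increase over baseline); the published algorithm additionally handles baseline definition and the timing of measurements relative to drug exposure, which are omitted here.

    def flags_aki(creatinine_values, baseline):
        """Flag acute kidney injury if any creatinine measurement reaches
        at least 1.5 times the baseline value (a >= 50% increase)."""
        return any(c >= 1.5 * baseline for c in creatinine_values)

    # Example: baseline 0.9 mg/dL, later values 1.1 and 1.4 mg/dL
    print(flags_aki([1.1, 1.4], baseline=0.9))  # True, since 1.4 >= 1.35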

Author

Peter C. Austin

Author

Elizabeth A. Stuart

Publication

Statistical methods in medical research

Date

2015 Sep 1

Abstract

Many non-experimental studies use propensity-score methods to estimate causal effects by balancing treatment and control groups on a set of observed baseline covariates. Full matching on the propensity score has emerged as a particularly effective and flexible method for utilizing all available data, and creating well-balanced treatment and comparison groups. However, full matching has been used infrequently with binary outcomes, and relatively little work has investigated the performance of full matching when estimating effects on binary outcomes. This paper describes methods that can be used for estimating the effect of treatment on binary outcomes when using full matching. It then used Monte Carlo simulations to evaluate the performance of these methods based on full matching (with and without a caliper), and compared their performance with that of nearest neighbour matching (with and without a caliper) and inverse probability of treatment weighting. The simulations varied the prevalence of the treatment and the strength of association between the covariates and treatment assignment. Results indicated that all of the approaches work well when the strength of confounding is relatively weak. With stronger confounding, the relative performance of the methods varies, with nearest neighbour matching with a caliper showing consistently good performance across a wide range of settings. We illustrate the approaches using a study estimating the effect of inpatient smoking cessation counselling on survival following hospitalization for a heart attack.

  • Notes:
    •   Propensity score matching is an approach for balancing treatment and control groups in a quasi-experimental study. One-by-n nearest neighbor matching (either optimal or greedy matching) and inverse probability of treatment weighting (IPTW) methods have been widely used in pharmacoepidemiology studies to estimate treatment effects on binary outcomes. However, full matching, which is a synthesis of conventional pair matching on propensity score strata and weighting, has not been frequently used. This paper compared propensity-score methods, tested the performance of full matching on the propensity score, and implemented the method in a case study of the impact of smoking cessation on mortality following a hospitalization for myocardial infarction.

  Full matching attempts to use all available individuals in the data by grouping individuals into strata that include 1 treated and at least 1 control subject or 1 control subject and at least 1 treated subject. The strata are determined using an optimal matching strategy that minimizes the mean difference of the propensity score within the matched set. In estimating the average treatment effect among the treated, the weighting factor is one for the treated subjects and the ratio of treated subjects to control subjects in the matched set for the control subjects (a small code sketch of this weighting rule follows this note). Similar to other matching strategies, this matching approach can result in exclusion of individuals from the initial population. Specifically, subjects with a propensity score lower than the lowest propensity score in the treated group or higher than the highest score in the treated group may be excluded.

  The authors examined the relative performance of different propensity-score methods, specifically full matching, pair-matching, and IPTW, in estimating the marginal odds ratio, relative risk, and risk difference using simulated datasets. The results of the simulation indicated that the full matching with caliper method was equal or superior to the other matching schemes in minimizing bias when estimating the risk difference. When estimating relative risks or odds ratios, full matching performed similarly to IPTW; however, nearest neighbor matching with calipers had the lowest bias. Across all the approaches and all the risk estimators, the bias was smallest with a small degree of confounding and increased as the amount of confounding due to covariates increased.

  An important limitation of the full matching approach is that methods for estimating the standard error of the treatment effect are not available. The authors demonstrated that a bootstrap approach results in a modest overestimate of the standard deviation. The degree of overestimation was relatively modest and decreased as the treatment prevalence increased. The authors assert that a bootstrap approach is reasonable for estimating standard errors when utilizing the full matching method; however, this can be computationally intensive.

  For binary outcomes, full matching with a caliper restriction resulted in unbiased estimates when the magnitude of the effect of covariates on treatment selection was weak. When the magnitude of covariate effects on treatment selection was strong, the estimates using nearest neighbour matching with a caliper showed the lowest bias.
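
The sketch below implements only the weighting rule quoted in this note, computing ATT weights from already-formed full-matching strata; forming the strata themselves requires an optimal full-matching routine (available, for example, in R packages), and the column names and data are hypothetical.

    import numpy as np
    import pandas as pd

    def att_weights_from_full_match(df, stratum_col="stratum", treat_col="treated"):
        """ATT weights for full matching: 1 for treated subjects, and the ratio of
        treated to control subjects within the stratum for control subjects."""
        counts = df.groupby(stratum_col)[treat_col].agg(n_treated="sum", n_total="count")
        counts["n_control"] = counts["n_total"] - counts["n_treated"]
        ratio = (counts["n_treated"] / counts["n_control"]).reindex(df[stratum_col]).to_numpy()
        return np.where(df[treat_col] == 1, 1.0, ratio)

    df = pd.DataFrame({"stratum": [1, 1, 1, 2, 2], "treated": [1, 0, 0, 1, 0]})
    print(att_weights_from_full_match(df))  # [1.  0.5 0.5 1.  1. ]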

Author

Lisa G. Johnston

Author

Katherine R. McLaughlin

Author

Houssine El Rhilani

Author

Amina Latifi

Author

Abdalla Toufik

Author

Aziza Bennani

Author

Kamal Alami

Author

Boutaina Elomari

Author

Mark S. Handcock

Volume

26

Issue

6

Pages

846-852

Publication

Epidemiology

Date

11/2015

  • Notes:
    • ANNOTATION - Johnston et al. Estimating the size of hidden populations using respondent-driven sampling data

This research presents a new method for estimating population sizes in hard-to-reach groups (e.g., estimating the number of intravenous drug users in a country without a framework to capture that information in order to understand HIV risk). The approach described in this study is called successive sampling-population size estimation (SS-PSE). The investigators present the methods for conducting SS-PSE in the context of a case example and discuss the advantages of its use over other methods.

SS-PSE uses data captured through respondent-driven sampling, much like the alternative modes of population size estimation (i.e., multiplier methods). Respondent-driven sampling is a method to recruit patients using social networks. Through this sampling process, data is collected about each person’s “degree,” or social network size, and recruitment pattern, which can then be used in SS-PSE. However, contrary to other methods, SS-PSE does not require the use of a second data source. Instead, SS-PSE utilizes information on prior knowledge of the population size in the estimation. Using a Bayesian framework, SS-PSE estimates the distribution of a population size given prior knowledge and other observed data (i.e., the degrees and recruitment patterns from the samples). Specifically, a four-component Gibbs sampler is used to produce a posterior predictive distribution of population size. The population size estimate is obtained from the mean or median of this distribution.

The investigators acknowledge that there may be bias associated with using self-reported degree in SS-PSE; therefore, they present an imputation method that replaces “degree” with “visibility” (based on degree and recruitment success) to strengthen estimates of SS-PSE.

Overall, SS-PSE produced population size estimates in between those produced by other methods (i.e., multiplier method or the literature) and may be a more precise approach. Caution should be used when the sample proportion used in SS-PSE is < 10% due to risk of inaccurate estimates. Results of SS-PSE can be used in collaboration with other methods to strengthen assumptions about population size in epidemiology research.

Author

Erel Joffe

Author

Emily J. Pettigrew

Author

Jorge R. Herskovic

Author

Charles F. Bearden

Author

Elmer V. Bernstam

Volume

22

Issue

5

Pages

962-966

Publication

Journal of the American Medical Informatics Association

Date

2015/09/01

Abstract

Introduction Automatically identifying specific phenotypes in free-text clinical notes is critically important for the reuse of clinical data. In this study, the authors combine expert-guided feature (text) selection with one-class classification for text processing. Objectives To compare the performance of one-class classification to traditional binary classification; to evaluate the utility of feature selection based on expert-selected salient text (snippets); and to determine the robustness of these models with respects to irrelevant surrounding text. Methods The authors trained one-class support vector machines (1C-SVMs) and two-class SVMs (2C-SVMs) to identify notes discussing breast cancer. Manually annotated visit summary notes (88 positive and 88 negative for breast cancer) were used to compare the performance of models trained on whole notes labeled as positive or negative to models trained on expert-selected text sections (snippets) relevant to breast cancer status. Model performance was evaluated using a 70:30 split for 20 iterations and on a realistic dataset of 10 000 records with a breast cancer prevalence of 1.4%. Results When tested on a balanced experimental dataset, 1C-SVMs trained on snippets had comparable results to 2C-SVMs trained on whole notes (F = 0.92 for both approaches). When evaluated on a realistic imbalanced dataset, 1C-SVMs had a considerably superior performance (F = 0.61 vs. F = 0.17 for the best performing model) attributable mainly to improved precision (p = .88 vs. p = .09 for the best performing model). Conclusions 1C-SVMs trained on expert-selected relevant text sections perform better than 2C-SVMs classifiers trained on either snippets or whole notes when applied to realistically imbalanced data with low prevalence of the positive class.

  • Notes:
    • The ability to automatically identify specific phenotypes (diseases) in free-text clinical notes is of critical importance given the implications it may have for the reuse of clinical data for research and other purposes. While natural language processing (NLP) methods are available to train classifiers to identify relevant text, previous attempts have been marred by difficulties, such as building classifiers on noisy or irrelevant text. New methods that offer a possible solution include training a support vector machine (SVM) classifier to identify relevant text for one-class classification (1-CC) and two-class classification (2-CC) models (1C-SVM and 2C-SVM, respectively). The objective of this study was to evaluate a novel method to enable machine learning for text classification. More specifically, the study assesses expert-guided feature selection in the construction of NLP classifiers (snippets) versus classifiers trained on manually annotated whole notes, and evaluates the combination of a 1C-SVM model and expert-guided features. The study authors trained the 1C-SVM and 2C-SVM models to identify breast cancer-specific classifiers (as the sample phenotype). Model performance was evaluated on a random 70:30 split (classifiers trained on 70% and tested on 30% of a balanced dataset) for 20 iterations and on an imbalanced, realistic dataset. In the balanced dataset, the 1C-SVM model trained on snippets performed comparably to the 2C-SVM model trained on whole notes. However, in the realistic, imbalanced dataset, the 1C-SVM trained on snippets outperformed the 2C-SVM model trained on whole notes, largely because of improved precision. Given that most clinical databases have an imbalanced representation of many phenotypes (e.g., diseases), it is critical to evaluate methods that correctly identify specific clinical phenotypes in large datasets. In this context, the advantage of combining a 1-CC model with expert-guided feature selection over a 2-CC model built on manually identified key concepts becomes apparent. While free-text clinical notes and electronic health records may not be traditionally used for signal detection in pharmacoepidemiologic methods, the increased precision and performance demonstrated in automatically identifying specific phenotypes suggest this may be an added source to consider in signal detection research.
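
A minimal scikit-learn sketch of the one-class idea, training on a handful of made-up expert-selected snippets and scoring new notes; the study's preprocessing, features, and evaluation design are not reproduced.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import OneClassSVM

    # Hypothetical expert-selected snippets describing the phenotype of interest
    snippets = [
        "history of invasive ductal carcinoma of the left breast",
        "status post lumpectomy and radiation for breast cancer",
        "breast cancer on adjuvant tamoxifen therapy",
    ]

    vectorizer = TfidfVectorizer(ngram_range=(1, 2))
    X_train = vectorizer.fit_transform(snippets)

    # A one-class SVM learns a boundary around the positive class only
    clf = OneClassSVM(kernel="linear", nu=0.1).fit(X_train)

    notes = [
        "patient followed for breast cancer, tolerating tamoxifen well",
        "annual physical, no acute complaints, knee pain improving",
    ]
    print(clf.predict(vectorizer.transform(notes)))  # +1 = resembles the positive class, -1 = outlier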

Author

Lauren M. Kunz

Author

Sharon-Lise T. Normand

Author

Art Sedrakyan

Volume

34

Issue

21

Pages

2913-2925

Publication

Statistics in Medicine

Date

September 20, 2015

Abstract

Modeling events requires accounting for differential follow-up duration, especially when combining randomized and observational studies. Although events occur at any point over a follow-up period and censoring occurs throughout, most applied researchers use odds ratios as association measures, assuming follow-up duration is similar across treatment groups. We derive the bias of the rate ratio when incorrectly assuming equal follow-up duration in the single study binary treatment setting. Simulations illustrate bias, efficiency, and coverage and demonstrate that bias and coverage worsen rapidly as the ratio of follow-up duration between arms moves away from one. Combining study rate ratios with hierarchical Poisson regression models, we examine bias and coverage for the overall rate ratio via simulation in three cases: when average arm-specific follow-up duration is available for all studies, some studies, and no study. In the null case, bias and coverage are poor when the study average follow-up is used and improve even if some arm-specific follow-up information is available. As the rate ratio gets further from the null, bias and coverage remain poor. We investigate the effectiveness of cardiac resynchronization therapy devices compared with those with cardioverter-defibrillator capacity where three of eight studies report arm-specific follow-up duration. Copyright © 2015 John Wiley & Sons, Ltd.

  • Notes:
    • Conventional meta-analysis provides a summary statistic that is weighted by the number of subjects in each study. When the follow-up window is fixed for each subject, the number of events in the study population represents the probability of events. However, summarizing estimates across studies is not simple when there is differential follow-up across the studies and across treatment arms. This study used simulated data to illustrate the potential bias when failing to adequately account for differential follow-up between groups. The authors demonstrate that even in single studies, the use of the average follow-up time for the study population can lead to bias compared to the use of arm-specific follow-up time. Simulation of meta-analyses indicated that as the ratio of follow-up time between the treated and comparator arms moves away from one (i.e., unequal follow-up time), the amount of bias in the summary effect estimate increases when failing to account for the differential follow-up. Simulations that included partially reported follow-up times by treatment arm also resulted in biased summary effect estimates, whether the unreported follow-up times were missing at random or missing completely at random. The findings of these simulations indicate the importance of reporting and using arm-specific follow-up information in meta-analyses. The results of the empirical data analysis were consistent with the simulation findings: meta-analyses that fail to account for differential follow-up time can have substantial bias. The authors recommend that studies report arm-specific follow-up times as summarizing effect estimates across studies becomes more common. Moreover, the failure to account for differential follow-up time in summary effect estimates could be even more problematic in network meta-analyses.
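
A worked toy example of the bias (the numbers are hypothetical): with arm-specific person-time the rate ratio differs sharply from the value obtained when both arms are assigned the study-average follow-up.

    # 40 events among 1,000 treated patients followed 2.0 years on average;
    # 50 events among 1,000 control patients followed 1.0 year on average.
    events_t, n_t, fu_t = 40, 1000, 2.0
    events_c, n_c, fu_c = 50, 1000, 1.0

    # Rate ratio using arm-specific person-time
    rr_arm = (events_t / (n_t * fu_t)) / (events_c / (n_c * fu_c))

    # Rate ratio if both arms are (incorrectly) assigned the study-average follow-up
    fu_avg = (n_t * fu_t + n_c * fu_c) / (n_t + n_c)
    rr_avg = (events_t / (n_t * fu_avg)) / (events_c / (n_c * fu_avg))

    print(rr_arm)  # 0.4  (40 per 2,000 person-years vs 50 per 1,000 person-years)
    print(rr_avg)  # 0.8  (collapses to the ratio of event counts)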

Author

Chung-Wei Shen

Author

Yi-Hau Chen

Volume

16

Issue

4

Pages

740-753

Publication

Biostatistics (Oxford, England)

Date

Oct 2015

Abstract

Missing observations and covariate measurement error commonly arise in longitudinal data. However, existing methods for model selection in marginal regression analysis of longitudinal data fail to address the potential bias resulting from these issues. To tackle this problem, we propose a new model selection criterion, the Generalized Longitudinal Information Criterion, which is based on an approximately unbiased estimator for the expected quadratic error of a considered marginal model accounting for both data missingness and covariate measurement error. The simulation results reveal that the proposed method performs quite well in the presence of missing data and covariate measurement error. On the contrary, the naive procedures without taking care of such complexity in data may perform quite poorly. The proposed method is applied to data from the Taiwan Longitudinal Study on Aging to assess the relationship of depression with health and social status in the elderly, accommodating measurement error in the covariate as well as missing observations.

  • Notes:
    • Longitudinal studies that account for the time-varying nature of covariates utilize marginal regression models to determine the relationship between an outcome variable and a given set of covariates. Common challenges with repeated measures include missing data and error in the measurement of covariates (due to instrumentation or other issues). Yi and colleagues proposed a Generalized Method of Moments (GMM) approach, an extension of GEE, to estimate parameters in a marginal regression model where data can be missing or measured with error. Currently, there are three existing approaches to variable selection in marginal regression models: the Quasi-likelihood under the Independence model Criterion (QIC) method, the GCp method, and the Missing Longitudinal Information Criterion (MLIC) method. While the QIC and GCp methods only apply to well-measured and completely observed data, the MLIC method is applicable when there are missing data. However, the MLIC method does not allow for error in covariate measurement. This paper summarizes the development of a new model selection criterion called the Generalized Longitudinal Information Criterion (GLIC), which can produce an unbiased estimate of the error term in the presence of missing or poorly measured data. This criterion includes three terms: (1) a measure of goodness-of-fit (mean squared error between observed and fitted outcomes); (2) a penalty for overfitting (covariance between observed and fitted values); and (3) the trace of the true covariance. In developing this criterion, the authors made two assumptions: (1) data are missing at random (MAR); and (2) the variance of measurement error can be estimated from repeated measures. When implemented, a model with a smaller GLIC statistic has a better fit. The authors applied the GMM approach with GLIC and non-GLIC methods to simulated and real-world data and found that GMM+GLIC was more likely to select empirically significant covariates (α = 0.05). The authors have also extended their work to other regression models such as normal linear and log-binomial. This work might be of interest to pharmacoepidemiologists working with repeated measures data and time-varying confounding. There are conceptual and logistical barriers to practical application. Conceptually, the assumptions made during the development of this statistic (MAR, using repeated measures to estimate the variance of measurement error) might not be applicable in all settings. Logistically, the use of this statistic will require extensive statistical programming (R/SAS/STATA packages unavailable). Even if these barriers are overcome, it is important to note that empirical and clinical significance must be balanced while selecting variables in a model.

Author

Benjamin P. Chapman

Author

Alexander Weiss

Author

Kevin Fiscella

Author

Peter Muennig

Author

Ichiro Kawachi

Author

Paul Duberstein

Volume

53

Issue

11

Pages

909-915

Publication

Medical Care

Date

11/2015

  • Notes:
    • ANNOTATION - Chapman et al. Mortality risk prediction: Can comorbidity indices be improved with psychosocial data?

The objective of this research was to evaluate the change in predictive capability of the Charlson Comorbidity Index (CCI) for mortality with the addition of the selected psychosocial factors to the index. The investigators used data from the Health and Lifestyle Survey (HALS), a 1984 cohort study in a population of adults in the United Kingdom. The baseline data collected includes information about health behaviors and numerous psychosocial assessments. Mortality data was collected for the cohort at 5, 10, 15, 20, and 25-year time points. Select psychosocial factors were tested to identify those with significant added predictive value to the model, calculated as a prognostic utility score based on the bootstrap inclusion fraction, AUC improvement, classification calibration improvement, and net reclassification improvement (NRI). Ultimately, income, education, the trait “communalism,” the score from the EPI lie scale, and the type A scale were incorporated into the revised CCI-Psychosocial Risk (CCI-PSR) index. The CCI-PSR performed better than CCI in terms of discriminative ability. According to NRI results, prediction was improved by CCI-PSR in about 60% of people for early time points, but declined as the length of follow up increased (20 and 25 year time points). Similarly, calibration was improved with the new index for all time points except 25 years. Overall, this modified index demonstrated superior performance to the original CCI, except that there was slight decline in calibration and reclassification as the time horizon was extended, possibly a result of variation in psychosocial factors over time. Ultimately, this research does provide evidence to support consideration of these factors in prognostic models. The investigators suggest that these factors could feasibly be collected in health care records and would be a relatively low cost method of improving mortality predictive capability.

Author

Maria Konstantinou

Author

Stefanie Biedermann

Author

Alan C. Kimber

Volume

165

Pages

27-37

Publication

Journal of Statistical Planning and Inference

Date

October 2015

Abstract

Time-to-event data are often modelled through Cox’s proportional hazards model for which inference is based on the partial likelihood function. We derive a general expression for the asymptotic covariance matrix of Cox’s partial likelihood estimator for the covariate coefficients. Our approach is illustrated through an application to the special case of only one covariate, for which we construct minimum variance designs for different censoring mechanisms and both binary and interval design spaces. We compare these designs with the corresponding ones found using the full likelihood approach and demonstrate that the latter designs are highly efficient also for partial likelihood estimation.

  • Notes:
    • Time-to-event data are often modelled using Cox’s proportional hazards model, for which inference is based on the partial likelihood function. There is limited guidance in the published literature on the most efficient designs for time-to-event data with censoring; the majority of the available articles consider parametric models. The authors present a closed-form expression for the asymptotic covariance matrix of Cox’s partial likelihood estimator for the covariate coefficients and provide a necessary condition for the optimality of a design that can be used to screen out non-optimal designs. The authors summarize data issues in survival analysis studies and define design terminology. Their approach is illustrated in the special case of a single covariate, for which they construct minimum variance designs for different censoring proportions (0.1 to 0.9) in both binary and interval design spaces. They compare these designs with the corresponding designs found using a full likelihood approach and demonstrate that the latter remain highly efficient for partial likelihood estimation, with efficiency staying high across different degrees of censoring. The authors conclude by recommending the use of highly efficient c-optimal designs for partial likelihood estimation as well, arguing that there is little disadvantage in doing so in most situations and that a smaller sample size is required to obtain estimates.

Author

Tsung-Chiung Fu

Author

Deng-Huang Su

Author

Shu-Hui Chang

Publication

Biostatistics (Oxford, England)

Date

Sep 21, 2015

Abstract

Recurrent event data are frequently encountered in long-term follow-up studies. In many applications, the gap times between two successive recurrent events are natural outcomes of interest. Investigation on patterns of associations among recurrent gap times within subjects is an important inferential issue. In this paper, we introduce flexible functions of previous gap times to create a class of summary measures of serial associations for a sequence of recurrent gap times through Kendall's tau. Such a general class of serial association measures provides a useful tool to quantify the predictive abilities of event history with different aspects. Non-parametric estimators of the proposed measures of serial associations are developed by generalizing the existing estimator of Kendall's tau for two serial gap times, in which inverse probability of censoring weights is used to overcome the induced dependent censoring. Various tests are further constructed for testing the constancy of serial associations over different events. Our method is applied to Denmark schizophrenia data and the results show that association structures are different for distinct ages of onset of schizophrenia.

  • Notes:
    • Disease progression studies often involve outcomes that may occur multiple times. In this case, evaluating the gap times between events can be useful in predicting subsequent events. Two approaches used to model gap times are: (1) marginal methods; and (2) intensity-based methods. Typical assumptions made in intensity-based methods can ignore potentially complicated patterns of pairwise association between gap times. To this end, non-parametric association analysis for clustered failure times has been studied, including Kendall’s tau and the estimator of Lakhal-Chaieb. In this paper, the authors investigate serial association structures in a sequence of recurrent gap times. The authors describe non-parametric estimators for serial association measures, present tests of constancy, and then conduct simulation studies and an application to data from a Danish psychiatric case registry. These methods may provide a more efficient way of predicting future gap times using event history and may be useful in applied medical research attempting to elucidate the prognostic utility of short-term event history for long-term course.

Author

Yuneung Kim

Author

Johan Lim

Author

DoHwan Park

Volume

57

Issue

6

Pages

1131-1145

Publication

Biometrical Journal

Date

November 1, 2015

Abstract

In this paper, we study a nonparametric procedure to test independence of bivariate interval censored data; for both current status data (case 1 interval-censored data) and case 2 interval-censored data. To do it, we propose a score-based modification of the Kendall's tau statistic for bivariate interval-censored data. Our modification defines the Kendall's tau statistic with expected numbers of concordant and disconcordant pairs of data. The performance of the modified approach is illustrated by simulation studies and application to the AIDS study. We compare our method to alternative approaches such as the two-stage estimation method by Sun et al. (Scandinavian Journal of Statistics, 2006) and the multiple imputation method by Betensky and Finkelstein (Statistics in Medicine, 1999b).

  • Notes:
    • There is a paucity of methods for bivariate interval-censored data, where the continuous survival time is only known to lie in an interval rather than being observed directly. Several dependence measures, including Kendall’s tau, estimate bivariate joint distributions and are calculated based on imputed complete data. In this paper, the authors propose a score-based modification that estimates the tau statistic non-parametrically using conditionally expected numbers of concordant and discordant pairs given the interval-censored observations. After a brief review of the current state of the science around the Kendall’s tau statistic, the authors compared their score-based modification with two other existing estimators of tau; measures for this numerical comparison included mean square error and empirical power. Finally, they apply their procedure to two clinical datasets of AIDS blood and urine samples. The authors found that their non-parametric procedure was less computationally intensive and had higher power than the two alternative procedures.
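
As background for the modification, the sketch below computes the ordinary Kendall's tau from observed concordant and discordant pair counts with fully observed data; the authors' method replaces these observed counts with expected counts given the interval-censored observations, which is not implemented here.

    from itertools import combinations

    def kendalls_tau(x, y):
        """Plain Kendall's tau from concordant/discordant pair counts (no censoring)."""
        conc = disc = 0
        for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
            s = (xi - xj) * (yi - yj)
            if s > 0:
                conc += 1
            elif s < 0:
                disc += 1
        n_pairs = len(x) * (len(x) - 1) / 2
        return (conc - disc) / n_pairs

    print(kendalls_tau([1, 2, 3, 4], [1, 3, 2, 4]))  # 0.667 (5 concordant, 1 discordant of 6 pairs)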

Author

Talbot Denis

Author

Lefebvre Geneviève

Author

Atherton Juli

Volume

3

Issue

2

Pages

207

Publication

Journal of Causal Inference

Date

2015

  • Notes:
    • In observational studies, fully adjusted models may include unnecessary covariates, which increases the variance of the regression coefficient of interest. Classic model selection approaches such as Bayesian model averaging (BMA) perform poorly when the goal is causal effect estimation. This study introduces a novel method called Bayesian causal effect estimation (BCEE), which is based on the graphical framework of causal inference. This approach was shown to yield slightly lower variance than the fully adjusted, BMA, Bayesian adjustment for confounding (BAC), and two-stage BAC (TBAC) methods in some cases. The method can be used for assessing causal relationships between continuous exposures and continuous outcomes; application to binary exposures is straightforward, but future research is needed for other types of data. Computationally, this method can take longer to fit models, especially compared to BAC.

Author

Daniel Commenges

Author

Anne Gégout-Petit

Volume

21

Issue

4

Pages

561-578

Publication

Lifetime Data Analysis

Date

2015/02/11

Abstract

The problem of assessing the effect of a treatment on a marker in observational studies raises the difficulty that attribution of the treatment may depend on the observed marker values. As an example, we focus on the analysis of the effect of a HAART on CD4 counts, where attribution of the treatment may depend on the observed marker values. This problem has been treated using marginal structural models relying on the counterfactual/potential response formalism. Another approach to causality is based on dynamical models, and causal influence has been formalized in the framework of the Doob–Meyer decomposition of stochastic processes. Causal inference however needs assumptions that we detail in this paper and we call this approach to causality the “stochastic system” approach. First we treat this problem in discrete time, then in continuous time. This approach allows incorporating biological knowledge naturally. When working in continuous time, the mechanistic approach involves distinguishing the model for the system and the model for the observations. Indeed, biological systems live in continuous time, and mechanisms can be expressed in the form of a system of differential equations, while observations are taken at discrete times. Inference in mechanistic models is challenging, particularly from a numerical point of view, but these models can yield much richer and reliable results.

  • Notes:
    • In the case of dynamic treatment regimens, assessing causality can be challenging because changes in covariates over time can affect both treatment selection and the outcome of interest. Traditional regression analysis in this setting may be biased and can negatively affect the assessment of treatment effectiveness. This paper describes a new way to deal with this situation based on a dynamic approach to causality in which the outcome of interest is represented by stochastic processes. The authors describe the assumptions and methods needed to estimate causal effects for models of varying complexity in discrete as well as continuous time. They suggest that the continuous-time approach is more realistic and allows unequally spaced observations; however, it requires specific software for inference.
