Sensitivity and Specificity Adjustment for Verification Bias in the Validation of Electronic Phenotypes
Welcome to Northwestern’s webpage for evaluating computable phenotype performance. This resource was created by a multidisciplinary project team interested in improving computable phenotype validation and producing accurate performance measures. This page is hosted by the Center for Genetic Medicine because the analysis of genotyped biobank samples, which generally requires generating phenotypes from electronic health records (EHRs), is a critical element of many modern human genetics studies.
Sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), and accuracy are designed to assess how well a screening test detects disease in a population. These measures assume that a random sample of the population both receives the screening test and is classified by a “gold-standard” test of disease.
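As a point of reference, the five measures above can be computed from a 2×2 table of test result against gold-standard result. The following is a minimal sketch, assuming a random population sample; the function name and argument names are illustrative and are not taken from the tool on this page.

```python
def performance_measures(tp, fp, fn, tn):
    """Standard (unadjusted) measures from a 2x2 table.

    tp: test positive, gold-standard positive
    fp: test positive, gold-standard negative
    fn: test negative, gold-standard positive
    tn: test negative, gold-standard negative

    Assumes every denominator is non-zero, i.e. the sample contains
    both test results and both gold-standard results.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # true positives among the diseased
        "specificity": tn / (tn + fp),   # true negatives among the non-diseased
        "ppv": tp / (tp + fp),           # diseased among test positives
        "npv": tn / (tn + fn),           # non-diseased among test negatives
        "accuracy": (tp + tn) / total,   # all correct classifications
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in performance_measures(tp=90, fp=10, fn=5, tn=95).items():
        print(f"{name}: {value:.3f}")
```

These formulas are only interpretable as population estimates when the sample is random with respect to both the test result and the true disease status.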
When validating EHR algorithms, it is often not feasible to use a random sample of the population, so these calculations frequently need to be adjusted for the resulting values to be interpretable.
Sensitivity, specificity, NPV, PPV, and accuracy estimates must be corrected if patients whom the EHR algorithm classified as disease-positive were oversampled for chart review, that is, if they make up a larger share of the chart-review sample than of the cohort the algorithm was applied to. This will be the case if the algorithm-predicted disease prevalence in your cohort is less than 50% but you selected equal numbers of algorithm-positive and algorithm-negative patients for chart review validation.
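One common way to make this kind of correction is to weight each chart-reviewed patient by the inverse of the sampling fraction in their algorithm stratum, so that the reviewed sample is scaled back up to the full cohort. The sketch below illustrates that idea only; it is an assumption about how such an adjustment can be done, not a description of the exact method implemented in the tool on this page. All names (`adjusted_measures`, `n_alg_pos`, `n_alg_neg`) and the example counts are hypothetical.

```python
def adjusted_measures(n_alg_pos, n_alg_neg, tp, fp, fn, tn):
    """Inverse-sampling-fraction adjustment (illustrative sketch).

    n_alg_pos, n_alg_neg: algorithm-positive / algorithm-negative
        patients in the full cohort.
    tp, fp: chart-review results among reviewed algorithm-positives.
    fn, tn: chart-review results among reviewed algorithm-negatives.
    Assumes charts were sampled at random within each algorithm stratum.
    """
    reviewed_pos = tp + fp
    reviewed_neg = fn + tn
    if reviewed_pos == 0 or reviewed_neg == 0:
        raise ValueError("each algorithm stratum needs at least one reviewed chart")

    # Each reviewed chart stands in for this many cohort members.
    w_pos = n_alg_pos / reviewed_pos
    w_neg = n_alg_neg / reviewed_neg

    # Estimated cohort-level 2x2 table.
    est_tp, est_fp = tp * w_pos, fp * w_pos
    est_fn, est_tn = fn * w_neg, tn * w_neg
    total = est_tp + est_fp + est_fn + est_tn

    return {
        "sensitivity": est_tp / (est_tp + est_fn),
        "specificity": est_tn / (est_tn + est_fp),
        "ppv": est_tp / (est_tp + est_fp),
        "npv": est_tn / (est_tn + est_fn),
        "accuracy": (est_tp + est_tn) / total,
    }


if __name__ == "__main__":
    # Hypothetical cohort: 1,000 algorithm-positive and 9,000
    # algorithm-negative patients, with 100 charts reviewed in each group.
    result = adjusted_measures(n_alg_pos=1000, n_alg_neg=9000,
                               tp=90, fp=10, fn=5, tn=95)
    for name, value in result.items():
        print(f"{name}: {value:.3f}")
```

With these hypothetical counts, the unweighted sensitivity computed directly from the 200 reviewed charts would be 90/95 (about 0.95), while the weighted estimate is 900/1350 (about 0.67), because each reviewed algorithm-negative chart represents far more cohort members than each reviewed algorithm-positive chart. In this particular sketch, PPV and NPV equal their unweighted values, since each is computed within a single algorithm stratum and the stratum weight cancels.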
The tool below was created by Ajay Bhasin, MD, and Laura Rasmussen-Torvik, PhD, to facilitate these calculations. We hope researchers will use this tool with their sample data to accurately gauge model performance. The data you enter are not stored in any way, and the tool may be used free of charge. A paper describing the work can be viewed here.
Share Your Feedback
We are interested in hearing your feedback. If you have any comments or suggestions regarding this tool, please contact us.