Abstract
Since 1998, the US Food and Drug Administration (FDA) has been exploring new automated and rapid Bayesian data mining techniques. These techniques have been used to systematically screen the FDA’s huge MedWatch database of voluntary reports of adverse drug events for possible events of concern.
The data mining method currently in use is the Multi-Item Gamma Poisson Shrinker (MGPS) program, which replaced the Gamma Poisson Shrinker (GPS) program we originally used with the legacy database. The MGPS algorithm, the technical aspects of which are summarised in this paper, computes signal scores for pairs and for higher-order (e.g. triplet, quadruplet) combinations of drugs and events that are significantly more frequent than their pair-wise associations would predict. MGPS generates consistent, redundant, and replicable signals while minimising random patterns. Signals are generated without using external exposure data, adverse event background information, or medical information on adverse drug reactions. The MGPS interface streamlines multiple input-output processes that previously had to be integrated manually. The system, however, cannot distinguish between already-known associations and new associations, so reviewers must filter these events.
In addition to detecting possible serious single-drug adverse event problems, MGPS is currently being evaluated to detect possible synergistic interactions between drugs (drug interactions) and adverse events (syndromes), and to detect differences among subgroups defined by gender and by age, such as paediatrics and geriatrics.
In the current data, only 3.4% of the 1.2 million drug-event pairs ever reported (with frequencies ≥ 1) generate signals [defined as a lower 95% confidence interval limit of the adjusted ratio of observed to expected (O/E) counts (denoted EB05) of ≥ 2]. The reports contributing to these signals account for 23% (2.4 million) of the 10.4 million drug-event pair reports in total, which greatly facilitates a more focused follow-up and evaluation.
The algorithm provides an objective, systematic view of the data, alerting reviewers to critically important new safety signals. Studies of signals detected by current methods, of signals stored in the Center for Drug Evaluation and Research’s Monitoring Adverse Reports Tracking System, and of the signals regarding cerivastatin, a cholesterol-lowering drug voluntarily withdrawn from the market in August 2001, exemplify the potential of data mining to improve early signal detection. The operating characteristics of data mining in detecting early safety signals, exemplified by studying a drug recently well characterised by large clinical trials, confirm our experience that the signals generated by data mining have high enough specificity to deserve further investigation. The application of these tools may ultimately improve usage recommendations.
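The observed-over-expected (O/E) ratio that underlies the signal scores can be illustrated with a minimal Python sketch. This is not the MGPS implementation: it assumes an unstratified independence baseline for the expected count, and it replaces the empirical Bayes prior with a single fixed gamma prior whose parameters (`alpha`, `beta`) are hypothetical placeholders. It reports a posterior mean rather than the EB05 lower limit used in the abstract. The function names are illustrative only.

```python
from collections import Counter


def observed_expected(reports):
    """Return {(drug, event): (observed, expected)} for a list of pairs.

    The expected count assumes drug and event are reported independently:
    E = (drug total * event total) / grand total. Real systems typically
    also stratify (e.g. by age, sex, year); that is omitted here.
    """
    pair_counts = Counter(reports)
    drug_totals = Counter()
    event_totals = Counter()
    for (drug, event), n in pair_counts.items():
        drug_totals[drug] += n
        event_totals[event] += n
    total = sum(pair_counts.values())
    return {
        (drug, event): (n, drug_totals[drug] * event_totals[event] / total)
        for (drug, event), n in pair_counts.items()
    }


def shrunk_ratio(observed, expected, alpha=1.0, beta=1.0):
    """Posterior mean of the O/E ratio under an assumed Gamma(alpha, beta) prior.

    With N ~ Poisson(lambda * E) and lambda ~ Gamma(alpha, beta), the
    posterior is Gamma(alpha + N, beta + E), whose mean is returned.
    The prior values here are placeholders, not fitted to any data.
    """
    return (alpha + observed) / (beta + expected)


# Toy data: drug A is reported with event x more often than independence predicts.
reports = (
    [("A", "x")] * 8 + [("A", "y")] * 2 + [("B", "x")] * 2 + [("B", "y")] * 8
)
table = observed_expected(reports)
n, e = table[("A", "x")]
raw = n / e
shrunk = shrunk_ratio(n, e)
```

The point of the shrinkage step is that a pair with one report and a tiny expected count (say 0.01) has a raw ratio of 100, whereas the shrunk ratio under this assumed prior is about 2, so sparse cells are less likely to dominate a ranking.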
Acknowledgements
The data mining technology referred to in this article was developed with grants from the Office of Women’s Health and the Center for Drug Evaluation and Research of the Food and Drug Administration and from an ‘Unmet Needs’ Grant from the National Centers for Disease Control and Prevention, United States Department of Health & Human Services.
We thank William DuMouchel of AT&T for developing the empirical Bayes data mining algorithms that we are applying to frequency counts; David Fram of Lincoln Technologies, Inc., and Jeremy Pool, Ilya Yunus, and Ava-Robin Cohen of PPD Informatics™ for providing critical technical information, development, and implementation expertise; Diane Wysowski and Janos Bacsanyi from CDER for providing adverse event signals detected by current methods; and Susan Ellenberg, Miles Braun, and Manette Niu from CBER, FDA, and Henry Rolka from CDC for valuable feedback and collaboration. We thank Phillip Perucci and Stacey Nichols from FDA for very valuable technical support.
Cite this article
Szarfman, A., Machado, S.G. & O’Neill, R.T. Use of Screening Algorithms and Computer Systems to Efficiently Signal Higher-Than-Expected Combinations of Drugs and Events in the US FDA’s Spontaneous Reports Database. Drug Safety 25, 381–392 (2002). https://doi.org/10.2165/00002018-200225060-00001
DOI: https://doi.org/10.2165/00002018-200225060-00001