Medication-related morbidity and mortality in ambulatory care in the United States result in an estimated 100,000 deaths and $177 billion in spending annually. Currently, much of the information necessary for active drug safety surveillance is “locked” in the unstructured text of electronic records. Our long-term goal is to develop information technology to recognize and prevent adverse events related to drug therapy. Sophisticated natural language processing systems have been developed to find medical terms and their synonyms in unstructured text and use them to retrieve information. To monitor alarming trends in symptoms in medical records, we need mechanisms that allow not only accurate term and concept identification but also grouping of semantically related concepts that are not necessarily synonymous. Measures of semantic relatedness rely on existing ontologies of domain knowledge as well as large textual corpora to compute a numeric score indicating the strength of relatedness between two concepts. Our central hypothesis is that such measures will be able to make fine-grained distinctions among concepts in biomedical text and provide a foundation upon which to organize concepts into meaningful groups automatically. In particular, this proposal seeks to develop methods that leverage the medical knowledge contained within the Unified Medical Language System (UMLS) and corpora of clinical text. Our short-term goals are 1) to develop and validate a common open-source platform for developing and testing semantic relatedness measures; 2) to determine the validity of electronic medical records with respect to identification of symptoms associated with medication-related problems; and 3) to design a novel methodology to aggregate adverse reaction terms used to code spontaneous post-marketing drug safety surveillance reports. Our next step will be to develop and validate a generalizable active medication safety surveillance system that will automatically track medication exposure and alarming trends in signs and symptoms in outpatient and inpatient populations for a broad range of disease states.
Serguei VS Pakhomov, PhD (PI); Ted Pedersen, PhD (Co-PI); Terrence Adam, MD, PhD (Co-I); Brian Isetts, PharmD (Co-I); Robert Cipolle, PharmD (Consultant); Bridget McInnes, PhD (post-doc); Ying Liu, PhD (post-doc)
Semantic Similarity and Relatedness between Clinical Terms: An Experimental Study. Pakhomov S., McInnes B., Adam T., Liu Y., Pedersen T., and Melton G.B. To appear in: Proceedings of the Annual Symposium of the American Medical Informatics Association. Washington, D.C. November 2010.
UMLS-Interface and UMLS-Similarity: Open source software for measuring paths and semantic similarity. McInnes B.T., Pedersen T., and Pakhomov S.V. In: Proceedings of the Annual Symposium of the American Medical Informatics Association. San Francisco, CA. 2009;431-435.
Measures of semantic similarity and relatedness in the biomedical domain. Pedersen T., Pakhomov S.V.S., Patwardhan S., and Chute C.G. Journal of Biomedical Informatics. 2007;40(3):288-299.