Abstract

Objective—We introduce a structural-lexical approach for auditing SNOMED CT using a combination of non-lattice subgraphs of the underlying hierarchical relations and enriched lexical attributes of fully specified concept names. Our goal is to develop a scalable and effective approach that automatically identifies missing hierarchical IS-A relations.

Methods—Our approach involves three stages. In stage 1, all non-lattice subgraphs of SNOMED CT’s IS-A hierarchical relations are extracted. In stage 2, lexical attributes of the fully specified concept names in these non-lattice subgraphs are extracted. For each concept in a non-lattice subgraph, we enrich its set of attributes with attributes from its ancestor concepts within the same subgraph. In stage 3, subset inclusion relations between the lexical attribute sets of each pair of concepts in each non-lattice subgraph are compared to existing IS-A relations in SNOMED CT. For a concept pair within a non-lattice subgraph, if a subset relation is identified but no corresponding IS-A relation is present in the transitive closure of SNOMED CT’s IS-A relations, then a missing IS-A relation is reported. The September 2017 release of SNOMED CT (US edition) was used in this investigation.
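
Stages 2 and 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: lexical attributes are taken to be the lowercased words of a concept name, a suggestion is made only when the candidate parent's enriched attribute set is a proper subset of the candidate child's, and the subgraph is assumed to have been extracted already in stage 1. The concept identifiers, names, and function names are all hypothetical.

```python
from itertools import permutations


def transitive_closure(is_a):
    """Map every concept to the set of all its ancestors.

    is_a: dict mapping a concept to the set of its direct parents
    (assumed to be acyclic, as an IS-A hierarchy should be).
    """
    closure = {}

    def ancestors(concept):
        if concept not in closure:
            closure[concept] = set()
            for parent in is_a.get(concept, ()):
                closure[concept].add(parent)
                closure[concept] |= ancestors(parent)
        return closure[concept]

    for concept in list(is_a):
        ancestors(concept)
    return closure


def enriched_attributes(subgraph, names, closure):
    """Stage 2: words of each name, plus words of ancestors inside the subgraph."""
    # Assumption: a lexical attribute is simply a lowercased word of the name.
    base = {c: set(names[c].lower().split()) for c in subgraph}
    enriched = {}
    for c in subgraph:
        attrs = set(base[c])
        for ancestor in closure.get(c, set()) & subgraph:
            attrs |= base[ancestor]
        enriched[c] = attrs
    return enriched


def suggest_missing_is_a(subgraph, names, is_a):
    """Stage 3: report (child, parent) pairs with a subset relation but no IS-A."""
    closure = transitive_closure(is_a)
    attrs = enriched_attributes(subgraph, names, closure)
    suggestions = []
    for child, parent in permutations(sorted(subgraph), 2):
        # Assumption: the concept with the strictly larger attribute set
        # is the more specific one, so it is the candidate child.
        is_subset = attrs[parent] < attrs[child]
        already_related = parent in closure.get(child, set())
        if is_subset and not already_related:
            suggestions.append((child, parent))
    return suggestions


# Hypothetical toy data standing in for one extracted subgraph.
names = {
    "c1": "infection",
    "c2": "chronic infection",
    "c3": "ear infection",
    "c4": "chronic ear infection",
}
is_a = {"c2": {"c1"}, "c3": {"c1"}, "c4": {"c2"}}
subgraph = {"c1", "c2", "c3", "c4"}

print(suggest_missing_is_a(subgraph, names, is_a))
```

On this toy input the only pair reported is `("c4", "c3")`: the attributes of "ear infection" are contained in those of "chronic ear infection", yet no IS-A relation links the two in the transitive closure. Checking against the transitive closure rather than only the direct IS-A relations keeps the sketch from reporting pairs that are already related through an intermediate concept.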

Results—A total of 14,380 non-lattice subgraphs were extracted, from which we suggested a total of 41,357 missing IS-A relations. For evaluation purposes, 200 non-lattice subgraphs were randomly selected from 996 smaller subgraphs (of size 4, 5, or 6) within the “Clinical Finding” and “Procedure” sub-hierarchies. Two domain experts confirmed 185 (among 223) missing IS-A relations, a precision of 82.96%.

Conclusions—Our results demonstrate that analyzing the lexical features of concepts in non-lattice subgraphs is an effective approach for auditing SNOMED CT.

Document Type

Article

Publication Date

2-2018

Notes/Citation Information

Published in Journal of Biomedical Informatics, v. 78, p. 177-184.

© 2017 Elsevier Inc.

This manuscript version is made available under the CC‐BY‐NC‐ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/.

The document available for download is the author's post-peer-review final draft of the article.

Digital Object Identifier (DOI)

https://doi.org/10.1016/j.jbi.2017.12.010

Funding Information

This work was supported by the National Science Foundation through grants IIS-1657306 and ACI-1626364, and the National Institutes of Health (NIH) National Center for Advancing Translational Sciences through grant UL1TR001998.
