The MEDLINE database is publicly available through the National Library of Medicine’s PubMed, but the data file itself is also licensed to a number of vendors, who may offer their versions to institutions and other parties as part of a database platform. These vendors provide their own interfaces to the MEDLINE file and offer other technologies that attempt to make their versions useful to subscribers. However, little is known about how vendor platforms ingest and interact with MEDLINE data files, or how differences among platforms influence the construction of search queries and the results they produce. This poster presents a longitudinal study of five MEDLINE databases involving 29 sets of logically and semantically consistent search queries (five search queries for each set). The goal is to understand whether it is possible to reproduce search queries across databases by (a) analyzing search query syntax per database and (b) controlling for total search results. We also highlight the barriers to creating reproducible queries across MEDLINE databases.

Document Type

Conference Proceeding

Notes/Citation Information

Published in iConference 2019 Posters Proceedings, which is available online at https://www.ideals.illinois.edu/handle/2142/102119.

Copyright 2019 C. Sean Burns, Robert M. Shapiro II, Tyler Nix, and Jeffrey T. Huber

The copyright holders have granted permission to post the poster description here.
