In this paper we present Kratylos, at www.kratylos.org/, a web application that creates searchable multimedia corpora from data collections in diverse formats, including collections of interlinearized glossed text (IGT) and dictionaries. There exists a crucial lacuna in the electronic ecology that supports language documentation and linguistic research: vast amounts of IGT are produced in stand-alone programs without an easy way to share them publicly as dynamic databases. Solving this problem will not only unlock an enormous amount of linguistic information that can be shared easily across the web but will also improve accountability by allowing us to verify analyses across collections of primary data. We argue for a two-pronged approach to sharing language documentation, which involves a popular interface and a specialist interface. Finally, we briefly introduce the potential of regular expression queries for syntactic research.

Notes/Citation Information

Published in Language Documentation & Conservation, v. 12, p. 124-146.

Licensed under Creative Commons Attribution-NonCommercial 4.0 International.

Funding Information

The programming and field work reported here are supported by NSF DEL Grant #1500753 to Raphael Finkel and Daniel Kaufman.