Abstract
In this paper we present Kratylos, at www.kratylos.org/, a web application that creates searchable multimedia corpora from data collections in diverse formats, including collections of interlinearized glossed text (IGT) and dictionaries. There exists a crucial lacuna in the electronic ecology that supports language documentation and linguistic research: vast amounts of IGT are produced in stand-alone programs without an easy way to share them publicly as dynamic databases. Solving this problem will not only unlock an enormous amount of linguistic information that can be shared easily across the web but will also improve accountability by allowing us to verify analyses across collections of primary data. We argue for a two-pronged approach to sharing language documentation, which involves a popular interface and a specialist interface. Finally, we briefly introduce the potential of regular expression queries for syntactic research.
Document Type
Article
Publication Date
3-2018
Funding Information
The programming and field work reported here are supported by NSF DEL Grant #1500753 to Raphael Finkel and Daniel Kaufman.
Repository Citation
Kaufman, Daniel and Finkel, Raphael, "Kratylos: A Tool for Sharing Interlinearized and Lexical Data in Diverse Formats" (2018). Computer Science Faculty Publications. 17.
https://uknowledge.uky.edu/cs_facpub/17
Notes/Citation Information
Published in Language Documentation & Conservation, v. 12, p. 124-146.
Licensed under Creative Commons Attribution-NonCommercial 4.0 International.