Year of Publication


Degree Name

Doctor of Philosophy (PhD)

Document Type

Doctoral Dissertation




Computer Science

First Advisor

Dr. Jerzy Wl. Jaromczyk


This work presents new models and algorithms for creating, modifying, and controlling access to complex text. The digitization of texts opens new opportunities for preservation, access, and analysis, but at the same time raises questions regarding how to represent and collaboratively edit such texts. Two issues of particular interest are modelling the relationships of markup (annotations) in complex texts, and controlling the creation and modi cation of those texts. This work addresses and connects these issues, with emphasis on data modelling, algorithms, and computational complexity; and contributes new results in these areas of research.

Although hierarchical models of text and markup are common, complex texts often exhibit layers of overlapping structure that are best described by multihierarchical markup. We develop a new model of multihierarchical markup, the globally ordered GODDAG, that combines features of both graph- and range-based models of markup, allowing documents to be unambiguously serialized. We describe extensions to the XPath query language to support globally ordered GODDAGs, provide semantics for a set of update operations on this structure, and provide algorithms for converting between two di erent representations of the globally ordered GODDAG.

Managing the collaborative editing of documents can require restricting the types of changes di erent editors may make, while not altogether restricting their access to the document. Fine-grained access control allows precisely these kinds of restrictions on the operations that a user is or is not permitted to perform on a document. We describe a rule-based model of ne-grained access control for updates of hierarchical documents, and in this context analyze the document generation problem: determining whether a document could have been created without violating a particular access control policy. We show that this problem is undecidable in the general case and provide computational complexity bounds for a number of restricted variants of the problem.

Finally, we extend our ne-grained access control model from hierarchical to multihierarchical documents. We provide semantics for ne-grained access control policies that control splice-in, splice-out, and rename operations on globally ordered GODDAGs, and show that the multihierarchical version of the document generation problem remains undecidable.