Year of Publication


Degree Name

Doctor of Philosophy (PhD)

Document Type

Doctoral Dissertation




Computer Science

First Advisor

Dr. Jane Huffman Hayes


Developing complex software systems often involves interactions among multiple stakeholders and frequent requirements changes, all under time constraints and budget pressures. Such conditions can hide problems that surface when software modifications cause unexpected component interactions, which in turn can lead to catastrophic or fatal situations. A critical step in ensuring the success of software systems is to verify that all requirements can be traced to the design, source code, test cases, and any other software artifacts generated during the software development process. The focus of this research is to improve the trace matrix generation process and to study how human analysts create the final trace matrix using traceability information generated by automated methods.

This dissertation presents new results in the automated generation of traceability matrices (TMs) and in the analysis of analyst actions during a tracing task. The key contributions of this dissertation are as follows: (1) Development of a Proximity-based Vector Space Model for automated generation of TMs. (2) Use of Mean Average Precision (a ranked retrieval-based measure) and a 21-point interpolated precision-recall graph (a set-based measure) for statistical evaluation of automated methods. (3) Logging and visualization of analyst actions during a tracing task. (4) Study of human analyst tracing behavior with consideration of decisions made during the tracing task and analyst tracing strategies. (5) Use of potential recall, sensitivity, and effort distribution as analyst performance measures.

Results show that using both a ranked retrieval-based measure and a set-based measure with statistical rigor provides a framework for evaluating automated methods. Studying the human analyst provides insight into how analysts use traceability information to create the final trace matrix and identifies areas for improvement in the traceability process. Analyst performance measures can be used to identify analysts who perform the tracing task well and use effective tracing strategies to generate a high-quality final trace matrix.
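
As an illustration of the two evaluation measures named above, the following is a minimal sketch based on the standard information-retrieval definitions of Mean Average Precision and interpolated precision at 21 evenly spaced recall levels (0.00, 0.05, ..., 1.00). It is not taken from the dissertation and may differ from its exact formulation. The function names, the artifact identifiers, and the sample rankings are hypothetical and exist only for illustration.

```python
from typing import List, Sequence, Set


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision for one query (e.g., one requirement being traced)."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at each rank where a true link appears
    return total / len(relevant)


def mean_average_precision(rankings: Sequence[Sequence[str]],
                           truths: Sequence[Set[str]]) -> float:
    """Mean of the per-query average precision values."""
    scores = [average_precision(r, t) for r, t in zip(rankings, truths)]
    return sum(scores) / len(scores) if scores else 0.0


def interpolated_precision(ranked: Sequence[str], relevant: Set[str],
                           points: int = 21) -> List[float]:
    """Interpolated precision at `points` evenly spaced recall levels in [0, 1]."""
    if not relevant:
        return [0.0] * points
    curve = []  # (recall, precision) after each rank position
    hits = 0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
        curve.append((hits / len(relevant), hits / rank))
    levels = [i / (points - 1) for i in range(points)]
    # Interpolated precision at a level is the best precision achieved
    # at any recall greater than or equal to that level.
    return [max((p for r, p in curve if r >= level - 1e-12), default=0.0)
            for level in levels]


# Hypothetical candidate-link rankings for two requirements (assumed data).
rankings = [["d1", "d2", "d3", "d4"], ["d5", "d6"]]
truths = [{"d1", "d3"}, {"d6"}]

map_score = mean_average_precision(rankings, truths)
pr_points = interpolated_precision(rankings[0], truths[0])
```

Here each ranking stands for the candidate links an automated method returns for one requirement, ordered by similarity score, and each truth set holds the links in the answer set for that requirement.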