Year of Publication


Document Type





Computer Science

First Advisor

Jane Huffman Hayes


It is important to track how a requirement changes throughout the software lifecycle. Each requirement should be validated during and at the end of each phase of the software lifecycle. It is common to build traceability matrices to demonstrate that requirements are satisfied by the design. Traceability matrices are needed in various tasks in the software development process. Unfortunately, developers and designers do not always build traceability matrices or maintain traceability matrices to the proper level of detail. Therefore, traceability matrices are often built after-the-fact. The generation of traceability matrices is a time consuming, error prone, and mundane process. Most of the times, the traceability matrices are built manually. Consider the case where an analyst is tasked to trace a high level requirement document to a lower level requirement specification. The analyst may have to look through M x N elements, where M and N are the number of high and low level requirements, respectively. There are not many tools available to assist the analysts in tracing unstructured textual artifacts and the very few tools that are available require enormous pre-processing. The prime objective of this work was to dynamically generate traceability links for unstructured textual artifacts using information retrieval (IR) methods. Given a user query and a document collection, IR methods identify all the documents that match the query. A closer observation of the requirements tracing process reveals the fact that it can be stated as a recursive IR problem. The main goals of this work were to solve the requirements traceability problem using IR methods and to improve the accuracy of the traceability links generated while best utilizing the analysts time. This work looked into adopting different IR methods and using user feedback to improve the traceability links generated. It also applied wrinkles such as filtering to the original IR methods. It also analyzed using a voting mechanism to select the traceability links identified by different IR methods. Finally, the IR methods were evaluated using six datasets. The results showed that automating requirements tracing process using IR methods helped save analysts time and generate good quality traceability matrices.