Threats of Aggregating Software Repository Data

Authors - Martin Robillard, Mathieu Nassif, Shane McIntosh
Venue - International Conference on Software Maintenance and Evolution (ICSME), 2018, to appear

Related Tags - ICSME 2018, knowledge loss, software evolution

Abstract - Software repository mining techniques can provide insights about software systems and their development processes through the use of metrics that aim to capture a construct of interest. However, linking development history metrics with high-level constructs is fraught with threats to validity. We conducted a case study in which we performed a critical review of the underlying artifacts used to compute a metric of knowledge at risk in software projects, proposed in prior work. The case study revealed eight major threats to validity that have the potential to generalize to other software process metrics derived from repository data. In addition to a detailed description of each threat, we contribute a questionnaire to facilitate their assessment in past and future studies.



