Abstract - Background: Code review is a well-established software quality practice where developers critique each other's changes. A shift towards automated detection of low-level issues (e.g., integration with linters) has, in theory, freed reviewers to focus on higher-level issues, such as software design. Yet in practice, little is known about the extent to which design is discussed during code review.
Aim: To bridge this gap, we study the frequency and nature of design discussions in code reviews.
Method: We perform a multiple case study of the code reviews of the OpenStack Nova and Neutron projects. We manually classify 2,817 review comments from a randomly selected sample of 220 code reviews. We then train and evaluate classifiers to automatically label review comments as design-related or not. Finally, we apply the classifiers to a larger sample of 2,506,308 review comments to study the characteristics of reviews that include design discussions.
Results: Our manual analysis indicates that (1) design discussions are still quite rare, with only 9% and 14% of Nova and Neutron review comments being related to software design, respectively; and (2) design feedback is often constructive, with 73% of the design-related comments also providing suggestions to address the concerns. Furthermore, our classifiers achieve a precision of 59%–66% and a recall of 70%–78%, outperforming baselines like zeroR by 43 percentage points in terms of F1-measure. Finally, patches that receive design-related feedback have a statistically significantly higher rate of abandonment (Pearson χ² test, DF=1, p < 0.001).
Conclusion: Design-related discussion during code review is still rare. Since design discussion is a primary motivation for conducting code review, more may need to be done to encourage such discussions among contributors.
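As an illustration of the statistical test named in the Results, the following is a minimal sketch of a Pearson χ² test with DF=1 on a 2×2 contingency table (design-related feedback yes/no against patch abandoned/merged). The function name `chi2_2x2` and all counts are hypothetical and chosen only for illustration; they are not data from the study, and the sketch assumes no continuity correction.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value for a 2x2 table.

    Assumed layout (hypothetical, for illustration only):
        row 0: patches with design-related feedback -> (abandoned=a, merged=b)
        row 1: patches without such feedback        -> (abandoned=c, merged=d)
    No continuity correction is applied.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))

    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of feedback and outcome.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected

    # For DF=1, the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts, NOT taken from the paper.
stat, p_value = chi2_2x2(120, 380, 150, 850)
print(f"chi2 = {stat:.2f}, p < 0.001: {p_value < 0.001}")
```

With these made-up counts, the abandonment rate is 24% for patches with design-related feedback versus 15% without, and the test rejects independence at the 0.001 level. A significant result of this kind indicates an association only; it does not by itself show that design feedback causes abandonment.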