Master's Dissertation
Understanding the shape of feature code
Date
2015-07-31
Author
Rodrigo Barbosa de Queiroz
Institution
Abstract
Feature annotations (e.g., code fragments guarded by #ifdef C-preprocessor directives) are widely used to control code extensions related to features. Feature annotations have long been said to be undesirable. When maintaining features guarded by annotations, there is a high risk of ripple effects. Also, excessive use of feature annotations may lead to code clutter, hinder program comprehension, and make maintenance harder. To prevent such problems, developers should monitor the use of feature annotations, for example, by setting acceptable thresholds. However, little is known about how to extract thresholds in practice, or which values are representative for feature-related metrics. To address this issue, in this master's dissertation we analyze the statistical distribution of three feature-related metrics collected from a corpus of 20 well-known and long-lived C-preprocessor-based systems from different domains. The three metrics are the scattering degree of feature constants, the tangling degree of feature expressions, and the nesting depth of preprocessor annotations. Our findings show that feature scattering is highly skewed; in 14 systems (70%), the scattering distributions match a power law, which makes averages and standard deviations unreliable as thresholds. Regarding tangling and nesting, the values tend to follow a uniform distribution; although outliers exist, they have little impact on the mean, suggesting that measures of central tendency are reliable thresholds for tangling and nesting. Based on these findings, we propose thresholds derived from our benchmark data as a basis for further investigations. We also report the results of a systematic literature review, conducted to identify empirical findings and assumptions on the usage of #ifdefs as reported in the literature.
An inspection of these assumptions and findings shows that existing studies neither investigate which statistical distributions best describe feature-related metric values nor propose thresholds for such metrics.
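To make the three metrics concrete, the following is a minimal sketch of how they could be computed for a single C source file. It is not the tooling used in the dissertation. The function name `feature_metrics` and the metric definitions are assumptions made for illustration: scattering degree is taken as the number of conditional directives that mention a feature constant, tangling degree as the number of distinct feature constants in one directive's expression, and nesting depth as the maximum depth of nested conditional blocks. The sketch ignores comments, line continuations, and macro expansion.

```python
import re
from collections import Counter

# Conditional preprocessor directives and the expression that follows them.
DIRECTIVE = re.compile(r"^\s*#\s*(ifdef|ifndef|if|elif|else|endif)\b(.*)$")
# Identifiers inside a feature expression (numbers are skipped).
IDENT = re.compile(r"\b[A-Za-z_]\w*")


def feature_metrics(source):
    """Return (scattering, tangling, max_depth) for one C source text.

    scattering: Counter mapping each feature constant to the number of
                directives that mention it (assumed definition).
    tangling:   list with the number of distinct feature constants in
                each feature expression (assumed definition).
    max_depth:  maximum nesting depth of conditional blocks.
    """
    scattering = Counter()
    tangling = []
    depth = 0
    max_depth = 0
    for line in source.splitlines():
        match = DIRECTIVE.match(line)
        if not match:
            continue
        kind, expr = match.group(1), match.group(2)
        if kind in ("ifdef", "ifndef", "if"):
            depth += 1
            max_depth = max(max_depth, depth)
        elif kind == "endif":
            depth -= 1
        if kind in ("ifdef", "ifndef", "if", "elif"):
            # "defined" is an operator, not a feature constant.
            constants = set(IDENT.findall(expr)) - {"defined"}
            tangling.append(len(constants))
            scattering.update(constants)
    return scattering, tangling, max_depth
```

Aggregating the `scattering` counters over all files of a system gives the per-feature distribution whose shape (for example, power law versus roughly uniform) determines whether a mean-based threshold is meaningful.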