Weighted Content Similarity Feature for Software Architecture Anti-Patterns Prediction (Scientific Article, Ministry of Science)
As user needs change frequently over time, software systems must evolve, and the resulting growth in complexity inevitably leads to violations of software engineering principles. These violations are called anti-patterns; they differ from bugs and faults, can occur at various levels of abstraction, and ultimately reduce software quality. Anti-patterns can occur in many kinds of software, including web applications, and predicting them can help prevent their occurrence. At each level of abstraction, the anti-pattern prediction process relies on software features whose threshold values affect its accuracy. This study presents an improved component-level feature, called weighted content similarity, that detects component dependencies more accurately by minimizing the influence of common words that appear often in comments but carry no value for identifying relationships between components. To this end, the comment words are weighted using TF-IDF values. F-Measure values are calculated to compare the proposed weighted feature with structural, topological, and content similarity features in detecting dependencies between the components of an open-source system. Detecting these dependencies in turn makes it possible to predict component anti-patterns such as cyclic and hub-like dependencies. In OpenJPA 2.0.0, the average F-Measure is 0.73 for topological features, 0.76 for content similarity features, and 0.88 for weighted content similarity features. The F-Measure of the weighted content similarity feature is therefore 0.12 higher than that of the unweighted content similarity feature and 0.15 higher than that of the topological feature, which makes it more effective than these two features for predicting dependencies between components using machine learning algorithms.
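To make the idea of TF-IDF weighting concrete, the following is a minimal sketch, not the study's actual implementation. It assumes that each component is represented by the concatenated text of its comments, that similarity is measured with cosine similarity, and that IDF is computed as log(N / df); the component names and comment strings are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    """Map each component name to a {term: tf * idf} vector.

    Assumption: idf = log(N / df), so a word that occurs in the comments
    of every component receives weight 0.
    """
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))
    vectors = {}
    for name, words in tokens.items():
        term_freq = Counter(words)
        total = len(words) or 1
        vectors[name] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical component comments, for illustration only.
comments = {
    "QueryCache": "this class returns the cached query result",
    "QueryCompiler": "this class returns the compiled query plan",
    "LogWriter": "this class returns the log file handle",
}

weighted = tfidf_vectors(comments)
unweighted = {name: Counter(tokenize(text)) for name, text in comments.items()}

for a, b in [("QueryCache", "QueryCompiler"), ("QueryCache", "LogWriter")]:
    print(a, b,
          "unweighted:", round(cosine(unweighted[a], unweighted[b]), 3),
          "weighted:", round(cosine(weighted[a], weighted[b]), 3))
```

In this toy example, the unweighted similarity between QueryCache and LogWriter is nonzero only because both comments contain "this class returns the", whereas the weighted similarity drops to zero because those words occur in every component and so receive an IDF of zero. The pair that shares the more specific word "query" keeps a positive weighted similarity.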