Authors: Elyasi, Milad; Simitcioğlu, Muhammed Esad; Saydemir, Abdullah; Ekici, Ali; Özener, Okan Örsan; Sözer, Hasan
Dates: 2023-08-11; 2023-08-11; 2023-06-26
ISSN: 0928-8910
URI: http://hdl.handle.net/10679/8642
DOI: https://doi.org/10.1007/s10515-023-00384-y
Abstract: Large scale software systems must be decomposed into modular units to reduce maintenance efforts. Software Architecture Recovery (SAR) approaches have been introduced to analyze dependencies among software modules and automatically cluster them to achieve high modularity. These approaches employ various types of algorithms for clustering software modules. In this paper, we discuss design decisions and variations in existing genetic algorithms devised for SAR. We present a novel hybrid genetic algorithm that introduces three major differences with respect to these algorithms. First, it employs a greedy heuristic algorithm to automatically determine the number of clusters and enrich the initial population that is generated randomly. Second, it uses a different solution representation that facilitates an arithmetic crossover operator. Third, it is hybridized with a heuristic that improves solutions in each iteration. We present an empirical evaluation with seven real systems as experimental objects. We compare the effectiveness of our algorithm with respect to a baseline and state-of-the-art hybrid genetic algorithms. Our algorithm outperforms others in maximizing the modularity of the obtained clusters.
Language: eng
Access: restrictedAccess
Title: Genetic algorithms and heuristics hybridized for software architecture recovery
Type: article
Volume: 30
Issue: 2
WoS ID: 001016798700001
DOI (identifier): 10.1007/s10515-023-00384-y
Keywords: Genetic algorithms; Reverse engineering; Software architecture recovery; Software modularity; Software module clustering
Scopus ID: 2-s2.0-85163341762