Abstract:
The proliferation of smart devices has led to an exponential growth in malware, posing a significant
challenge to smart device users. These malicious programs are designed with advanced techniques to evade existing detection mechanisms,
infiltrate systems, and cause harm to any platform. One such platform is Android, the open-source smartphone operating system which
has experienced exponential growth since its inception. However, this progress has been accompanied by the growing threat of Android
malware, which exploits smartphones to carry out malicious acts. Such malware employs a variety of techniques to circumvent
detection systems, presenting novel obstacles to reliable detection. Currently, Android malware detection approaches can be broadly
classified into two categories: signature-based detection and machine learning-based detection. Signature-based detection relies on
patterns or signatures of malware to identify and block malicious software. However, this approach is limited, as it
cannot adequately detect novel or unknown malware variants. To address the limitations of signature-based detection, researchers and antimalware
firms have turned to machine learning-based detection techniques. These methods harness the power of machine learning
algorithms to analyze and categorize applications based on their behavioral patterns, intrinsic features, or other distinctive characteristics.
By learning from extensive datasets comprising known malware and legitimate applications, machine learning models can
detect previously unseen malware by recognizing similarities to known malicious behavior. This study aims to present the current
landscape of machine learning-based Android malware detection techniques and undertake a parametric comparison of their efficacy. The
objective is to examine a broad range of detection methods and elucidate prospective avenues in this domain. By scrutinizing and
contrasting these approaches, we can gain insight into the strengths and limitations of various machine learning techniques,
while identifying potential areas for further research and enhancement.