Abstract:
Drug development has traditionally been expensive and time-consuming. Computational approaches such as
machine learning have been widely applied to improve efficiency, yet interpreting prediction outcomes remains a
challenge. This study aims to improve the efficiency of Alzheimer's drug discovery by building a Quantitative
Structure-Activity Relationship (QSAR) model with Random Forest to predict the inhibitory potency (IC50 values) of
candidate Alzheimer's drug compounds. A total of 5779 compounds were collected from the ChEMBL and PubChem
databases. The QSAR model was built on features extracted as 1024-bit Morgan fingerprints, which represent the
substructures of each compound. SHapley Additive exPlanations (SHAP) are used to identify the locally and globally
important features behind the predictions of the developed model.
The QSAR model was evaluated with 10-fold cross-validation: the regression model achieved a mean absolute percentage
error (MAPE) of 11.10%, and the classification model achieved an AUC-ROC of 84.77%. Molecular docking was also
conducted to simulate how a drug binds to its target and to verify the effectiveness of the best molecules.
Additionally, a web-based application was developed to facilitate prediction of the bioactivity value of
acetylcholinesterase (AChE) inhibitors.
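
For reference, the two evaluation metrics quoted above can be computed as in the following minimal sketch. The
function names `mape` and `auc_roc` and the toy inputs in the usage note are illustrative assumptions; they are not
the study's code or data, and the sketch does not reproduce the reported scores.

```python
from typing import Sequence


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumes no true value is zero (the ratio is undefined there).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC via its ranking interpretation.

    Returns the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half.
    Labels are assumed to be 1 (active) or 0 (inactive).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

With toy inputs, `mape([10.0, 20.0], [9.0, 22.0])` averages two 10% errors, and
`auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` ranks three of the four positive-negative pairs correctly. The
pairwise loop is quadratic in the number of compounds, which is acceptable for a sketch; a sort-based implementation
would be preferable for larger datasets.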