University of Bahrain
Scientific Journals

Identifying Duplicate Bug Records Using Word2Vec Prediction with Software Risk Analysis


dc.contributor.author Mahfoodh, Hussain
dc.contributor.author Hammad, Mustafa
dc.date.accessioned 2022-02-09T20:12:50Z
dc.date.available 2022-02-09T20:12:50Z
dc.date.issued 2022-02-15
dc.identifier.issn 2210-142X
dc.identifier.uri https://journal.uob.edu.bh:443/handle/123456789/4580
dc.description.abstract Reporting duplicated bugs in bug reports has serious productivity consequences for software projects. The fewer duplicated bugs are reported, the better the software maturity processes established among internal software stakeholders. Automated identification of the duplicated category in bug reports could enhance risk identification approaches during the software life cycle. In this paper, we propose two different similarity measures to identify duplicated bugs using the word-embedding (Word2Vec) natural language processing technique through the TensorFlow tool. We conduct a comparison experiment on two related bug record descriptions from eight different software components of the Mozilla Core dataset. We choose different sentence types from the duplicated bug category records to compare and discuss each component’s accuracy results and to identify whether the proposed module is able to detect the related records. Building on an earlier work, this paper calculates software risk values from duplication records and from bug-fix time prediction for the components that were not identified as duplicated by the Word2Vec approach. The study results show a maximum precision accuracy of 99.89% for the components that were correctly identified as duplicated by the proposed approach. Additionally, we found that 66% of the software components that were excluded from the proposed bug duplication module showed an increase in software risk values. en_US
dc.language.iso en_US en_US
dc.publisher University Of Bahrain en_US
dc.subject Bug reports en_US
dc.subject duplicated bugs en_US
dc.subject bug-fix time en_US
dc.subject software risk estimation en_US
dc.subject bug-fix time prediction en_US
dc.subject software risk management en_US
dc.subject word embedding en_US
dc.subject natural language processing en_US
dc.subject machine learning en_US
dc.title Identifying Duplicate Bug Records Using Word2Vec Prediction with Software Risk Analysis en_US
dc.identifier.doi http://dx.doi.org/10.12785/ijcds/110162
dc.volume 11 en_US
dc.issue 1 en_US
dc.pagestart 763 en_US
dc.pageend 773 en_US
dc.contributor.authorcountry Bahrain en_US
dc.contributor.authoraffiliation Department of Computer Science, University of Bahrain en_US
dc.source.title International Journal of Computing and Digital Systems en_US
dc.abbreviatedsourcetitle IJCDS en_US

