Abstract:
Fingerprints are a type of physical evidence frequently encountered at crime scenes and are useful in revealing the identity of the
culprit. However, a poor-quality latent fingerprint collected from a crime scene seldom permits a reliable identification. In practice,
identification is accomplished by matching a known print to an unknown print according to the types and locations of their minutiae
features. When identification is infeasible, the forensic scientist can attempt to predict the sex of the donor of the latent
fingerprint in order to narrow the search for suspects. In the context of forensic science, sexual dimorphism in ridge
count has been studied for several decades. Meanwhile, gender classification based on fingerprint images has been regularly
reported in the field of computer science. Viewed from a practical perspective, extraction of salient fingerprint features depends on the
quality of the input fingerprint image, which can be very low at a real crime scene. Hence, the fingerprint data studied in this work,
i.e. diagonal ridge counts within a well-defined region of 25 square centimeters, were determined manually. First, the fingerprint
data were explored using the self-organizing maps method. Next, the Naïve Bayes (NB) and Classification and Regression Trees (CART)
algorithms were each used to construct predictive models for discriminating gender based on the fingerprint data. A multitude
of prediction models were constructed from ten-digit, five-digit and one-digit samples to predict gender for each of
three ethnic groups, i.e. Chinese, Indians and Malays, and for the combined population. Each model was validated using a bootstrapping
without replacement approach. Results showed that single-digit samples produced an accuracy rate slightly lower than that obtained
using five- or ten-digit samples. Compared with the global predictive model, the ethnicity-specific models for Indian and Malay subjects
showed a slight improvement in external accuracy rate. Moreover, when all five digits of a particular hand were used as input data, NB
tended to outperform CART. However, NB and CART were comparable when a one-digit sample was used as
input data. In conclusion, both NB and CART can be useful in predicting the gender of Malaysians based on fingerprint ridge counts.
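
The modelling and validation steps summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian form of Naïve Bayes, reads "bootstrapping without replacement" as repeated random train/test splits, and uses invented ridge-count values in place of the study data. The names fit_gaussian_nb, predict, repeated_holdout_accuracy and make_synthetic_sample are hypothetical. CART is omitted for brevity.

```python
import math
import random
from statistics import mean, pvariance


def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        cols = list(zip(*rows))  # one tuple of values per feature (digit)
        model[label] = {
            "prior": len(rows) / len(X),
            "means": [mean(c) for c in cols],
            # A small floor keeps every variance strictly positive.
            "vars": [pvariance(c) + 1e-6 for c in cols],
        }
    return model


def predict(model, x):
    """Return the class with the highest Gaussian log-posterior."""
    def log_posterior(params):
        total = math.log(params["prior"])
        for v, m, s2 in zip(x, params["means"], params["vars"]):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total
    return max(model, key=lambda label: log_posterior(model[label]))


def repeated_holdout_accuracy(X, y, n_rounds=50, test_fraction=0.3, seed=0):
    """Average accuracy over random splits drawn without replacement.

    Assumption: this is one reading of the validation scheme, not a
    confirmed description of the procedure used in the study.
    """
    rng = random.Random(seed)
    indices = list(range(len(X)))
    n_test = max(1, int(len(X) * test_fraction))
    scores = []
    for _ in range(n_rounds):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = fit_gaussian_nb([X[i] for i in train_idx],
                                [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / n_test)
    return mean(scores)


def make_synthetic_sample(n_per_class=100, n_digits=5, seed=1):
    """Generate placeholder ridge counts; the means and spread are invented."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in (("F", 14.0), ("M", 12.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.5) for _ in range(n_digits)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_sample()  # five-digit samples, one hand
    print(f"Mean hold-out accuracy: {repeated_holdout_accuracy(X, y):.3f}")
```

Setting n_digits to 1 or 10 in make_synthetic_sample mimics the one-digit and ten-digit settings, but any accuracy obtained this way reflects only the synthetic data.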