Abstract:
Artificial intelligence has tremendous potential to reinforce and revolutionize law enforcement. By automating the
tedious and time-consuming tasks of data collection and analysis, AI can help police departments become more effective. In this
paper, we provide an overview of deep learning models that law enforcement can use to reconstruct a suspect’s
face by generating realistic and sketch facial portraits. To accomplish this, we collect four types of data from the crime scene:
handwritten text and audio from an officer’s notes, images from a smartphone, and video from a surveillance camera. We then employ
two pre-trained models: ABM-CNN for attribute multi-label classification and Google Speech API for speech-to-text conversion.
In addition, we train three other models. The first, a handwriting recognition model trained on the IAM Handwriting dataset, reads
and digitizes handwritten notes with an accuracy of 76%, outperforming state-of-the-art results. Second, we train YOLOv5 on
the WIDER FACE dataset to detect one or more faces in images or videos with an average precision of 93%, a recall of 90%,
and a precision of 88%. Third, we adapt the Zero-Shot Text-to-Image Generation approach to generate realistic faces
and sketches. The resulting model outperforms existing models from the literature in image quality, and training reaches a loss of 37.4%.
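
As a reading aid for the detection metrics quoted above, the following minimal Python sketch shows how precision and recall are typically computed from detection counts at a single confidence threshold. The function name and the counts are illustrative assumptions, not values from this work. Average precision, which summarizes the precision-recall curve across thresholds, is not computed here.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive,
    and false-negative detection counts at one confidence threshold."""
    # Precision: fraction of predicted faces that are real faces.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of real faces that were detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts, for illustration only (not taken from the paper).
p, r = precision_recall(tp=9, fp=1, fn=3)
print(f"precision={p:.2f} recall={r:.2f}")
```

A detector can trade one metric against the other by moving its confidence threshold, which is why the two are usually reported together.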