Abstract:
Modeling and identifying people's emotions from facial cues is a complex problem in computer vision. A common approach is to identify Action Units (AUs), which have many applications in Human-Computer Interaction. Although Deep Learning approaches have demonstrated strong performance in recognizing AUs and emotions, they require large datasets of expert-labelled examples. In this article, we demonstrate that good deep features can be learnt in an unsupervised fashion using Deep Convolutional Generative Adversarial Networks (DCGANs), allowing a supervised classifier to be learned from a smaller labelled dataset. The
paper primarily focuses on two key aspects: firstly, the generation of facial expression images across a wide range of poses (including
frontal, multi-view, and unconstrained environments), and secondly, the analysis and classification of emotion categories and Action
Units. Applying this methodology across an extensive collection of datasets for feature learning and classification, we demonstrate strong generalization and improved results. Compared with state-of-the-art techniques, our proposed model performs particularly well on the Radboud dataset, achieving an overall accuracy of 98.57%.