Here, we develop an audiovisual deep residual network for multimodal apparent personality trait recognition. The network is trained end-to-end to predict the Big Five personality traits of people from their videos. That is, the network does not require any feature engineering or visual analysis such as face detection, face landmark alignment, or facial expression recognition. The network recently won third place in the ChaLearn First Impressions Challenge with a test accuracy of 0.9109.
Deep Impression: Audiovisual Deep Residual Networks for Multimodal Apparent Personality Trait Recognition
An audiovisual deep residual network predicts Big Five personality traits from videos without feature engineering or explicit visual analysis.
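The test accuracy of 0.9109 quoted in the abstract is easier to interpret with the metric written out. The sketch below assumes the challenge scores a submission as one minus the mean absolute error between predicted and annotated trait values (each in [0, 1]), averaged over the five traits. That definition is not stated in the abstract, and the function names and toy values here are hypothetical, chosen only for illustration.

```python
# Assumed metric: per-trait accuracy = 1 - mean absolute error,
# then averaged over the Big Five traits. Not taken from the abstract.
TRAITS = ("openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism")


def trait_accuracies(predictions, targets):
    """Return one accuracy per trait for equally long lists of 5-value rows."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("predictions and targets must be non-empty and aligned")
    n_videos = len(predictions)
    accuracies = []
    for i in range(len(TRAITS)):
        total_error = sum(abs(p[i] - t[i]) for p, t in zip(predictions, targets))
        accuracies.append(1.0 - total_error / n_videos)
    return accuracies


def mean_accuracy(predictions, targets):
    """Average the per-trait accuracies into a single score."""
    accuracies = trait_accuracies(predictions, targets)
    return sum(accuracies) / len(accuracies)


# Toy values for two videos (made up, not challenge data).
toy_predictions = [[0.5, 0.5, 0.5, 0.5, 0.5],
                   [0.75, 0.25, 0.5, 0.5, 0.5]]
toy_targets = [[0.5, 0.5, 0.5, 0.5, 0.25],
               [0.5, 0.25, 0.5, 0.5, 0.5]]

print(dict(zip(TRAITS, trait_accuracies(toy_predictions, toy_targets))))
print(mean_accuracy(toy_predictions, toy_targets))
```

Under this reading, a score of 0.9109 would correspond to an average absolute deviation of roughly 0.09 from the annotated trait values on a 0 to 1 scale.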
- Year: 2016
- Venue: arXiv 2016
- Authors: 4
Attribution
- Abstract & full text: arxiv.org/abs/1609.05119
- TL;DR: Semantic Scholar