Thursday April 9, 2026 3:00pm - 5:00pm GMT+07

Authors - Karn Na Sritha, Khang Tran Chi Nguyen, Dao Khanh Duy, Khanista Namee
Abstract - Multimodal affective computing systems (MACS) aim to improve affect prediction performance by fusing complementary cues in the visual and audio channels. While late fusion approaches are modular and can be flexibly deployed, they often rely on static modality weights, which presuppose a fixed reliability ranking among modalities. In practice, the visual stream can be corrupted by occlusion, illumination variation, and motion artifacts, while the audio stream can suffer from noise, reverberation, or channel mismatch. Moreover, domain shifts between datasets further contribute to inconsistent calibration across modalities, resulting in inaccurate fused predictions. In this paper, a reliability-aware late fusion model is proposed to enhance robustness in multimodal emotion recognition. Building on independently trained FER and SER branches, we conduct a theoretical variance-covariance stability analysis of linear late fusion under modality imbalance. We further investigate entropy-driven reliability estimation and calibration-aware weighting schemes. Experimental results from the original test report are incorporated into the theoretical framework, providing evidence that one modality's dominance is more related to entropy stability and calibration characteristics than to raw unimodal accuracy. Our results also indicate that reliability-aware weighting increases robustness under simulated degradation and missing modalities, without the need for retraining the unimodal models.
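The entropy-driven reliability weighting described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each branch (FER for visual, SER for audio) outputs a class-posterior vector, scores each modality's reliability as one minus its normalized predictive entropy, and fuses the posteriors with those reliabilities as linear late-fusion weights. All function and variable names here are hypothetical.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (natural log) of a probability vector."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0)
    return float(-np.sum(p * np.log(p)))

def entropy_weighted_fusion(p_visual, p_audio):
    """Fuse two modality posteriors with weights inversely related to
    their predictive entropy: the more confident (lower-entropy)
    branch receives the larger fusion weight."""
    n_classes = len(p_visual)
    h_max = np.log(n_classes)                  # maximum possible entropy
    r_v = 1.0 - entropy(p_visual) / h_max      # reliability in [0, 1]
    r_a = 1.0 - entropy(p_audio) / h_max
    z = r_v + r_a
    if z == 0.0:                               # both branches maximally uncertain
        w_v = w_a = 0.5
    else:
        w_v, w_a = r_v / z, r_a / z
    return w_v * np.asarray(p_visual) + w_a * np.asarray(p_audio)

# Example: a confident visual branch vs. a near-uniform (degraded) audio branch.
p_v = [0.80, 0.10, 0.05, 0.05]   # low entropy  -> high reliability
p_a = [0.30, 0.25, 0.25, 0.20]   # high entropy -> low reliability
fused = entropy_weighted_fusion(p_v, p_a)
```

Because the weights are computed per sample from the branch outputs alone, this scheme adapts to degraded or missing modalities at inference time without retraining either unimodal model, which is the property the abstract highlights. A missing modality can be handled by setting its reliability to zero.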
Paper Presenter
Virtual Room D Bangkok, Thailand

