Saturday April 11, 2026 3:00pm - 5:00pm GMT+07

Authors - Harita Venkatesan
Abstract - Fusion-based multimodal models typically assume full modality availability at inference, an assumption that often fails in real-world settings. When a modality is missing, common strategies such as zero-vector masking or unimodal fallback can lead to unstable predictions. We propose CORE, an embedding-level framework that completes multimodal representations by integrating original and cross-modally reconstructed embeddings in a fusion-consistent manner prior to fusion. CORE employs lightweight bidirectional cross-modal imagination networks with a cycle-consistency constraint to preserve shared semantic structure across modalities. The model is trained with stochastic modality dropout, enabling unified inference under complete and incomplete modality configurations. Experiments on a multimodal MRI–text classification task for lumbar spine analysis demonstrate that CORE yields more stable predictions than zero-vector masking under severe modality absence, while maintaining comparable performance when all modalities are present.
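The ideas in the abstract — cross-modal reconstruction with a cycle-consistency constraint, embedding-level completion before fusion, and stochastic modality dropout — can be sketched as follows. This is a minimal illustration only, not the paper's implementation: the "imagination" networks are reduced to linear maps, the fusion is plain concatenation, and all dimensions and names (`imagine_txt`, `imagine_img`, `complete`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy embedding dimensions for two modalities (MRI image, report text).
D_IMG, D_TXT = 4, 3

# Hypothetical bidirectional "imagination" networks, reduced to linear
# maps for this sketch: f maps image -> text space, g maps text -> image.
W_f = rng.normal(size=(D_TXT, D_IMG)) * 0.1
W_g = rng.normal(size=(D_IMG, D_TXT)) * 0.1

def imagine_txt(z_img):  # cross-modal reconstruction: image -> text
    return W_f @ z_img

def imagine_img(z_txt):  # cross-modal reconstruction: text -> image
    return W_g @ z_txt

def cycle_loss(z_img, z_txt):
    """Cycle-consistency penalty: mapping to the other modality and
    back should approximately recover the original embedding."""
    img_cycle = imagine_img(imagine_txt(z_img))
    txt_cycle = imagine_txt(imagine_img(z_txt))
    return np.mean((img_cycle - z_img) ** 2) + np.mean((txt_cycle - z_txt) ** 2)

def complete(z_img=None, z_txt=None):
    """Embedding-level completion prior to fusion: a missing modality's
    slot is filled with its cross-modal reconstruction rather than
    zeros, so the fused representation keeps a consistent shape."""
    if z_img is None:
        z_img = imagine_img(z_txt)
    if z_txt is None:
        z_txt = imagine_txt(z_img)
    return np.concatenate([z_img, z_txt])  # simple concatenation fusion

z_img = rng.normal(size=D_IMG)
z_txt = rng.normal(size=D_TXT)

# Stochastic modality dropout during training: randomly hide a modality
# so the same model handles complete and incomplete inputs at inference.
drop_img = rng.random() < 0.5
fused = complete(z_img=None if drop_img else z_img, z_txt=z_txt)
assert fused.shape == (D_IMG + D_TXT,)
```

In this sketch the cycle loss would be minimized jointly with the downstream classification loss, so that reconstructed embeddings stay semantically aligned with the real ones they replace.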
Paper Presenter
Virtual Room B Bangkok, Thailand
