Thursday April 9, 2026 3:00pm - 5:00pm GMT+07

Authors - Deepak T. Mane, Deepak R. More, Gopal D. Upadhye, Rucha C. Samant, Hemlata U. Karne, Suraksha Suryawanshi, Prem Borse
Abstract - Efficient vehicle type classification is vital for intelligent transportation systems, traffic monitoring, and urban mobility planning. This paper presents a Real-time Multimodal Vehicle Type Classification System that uses both visual and acoustic data to identify and categorize vehicles such as cars, buses, trucks, and motorcycles from live video streams. The proposed system integrates CNN-based and Transformer-based models for feature extraction across modalities, improving detection robustness under diverse lighting, weather, and traffic conditions. A lightweight preprocessing pipeline performs synchronized frame extraction, audio segmentation, and feature fusion while keeping latency low enough for real-time use. The architecture applies late fusion of visual and audio features so that classification remains reliable when one modality is degraded, for example by low visibility or occlusion. Experimental evaluations show that the proposed framework achieves a classification accuracy of 96.2% at 28 fps, outperforming unimodal baselines while maintaining real-time throughput. The system is deployable for intelligent traffic surveillance, automated tolling, and urban safety analytics.
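The abstract does not specify the fusion rule, so the following is only a minimal sketch of one common form of late fusion: each modality produces its own class scores, and the per-modality probabilities are combined by a weighted average. The function names, the `visual_weight` parameter, and the example scores are all assumptions made for illustration; they are not taken from the paper.

```python
import math

# Vehicle classes named in the abstract.
CLASSES = ["car", "bus", "truck", "motorcycle"]


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(visual_logits, audio_logits, visual_weight=0.5):
    """Hypothetical late fusion: weighted average of per-modality probabilities.

    visual_weight is an assumed knob; lowering it shifts trust toward audio,
    e.g. when the camera view is dark or occluded.
    """
    if not 0.0 <= visual_weight <= 1.0:
        raise ValueError("visual_weight must be in [0, 1]")
    if len(visual_logits) != len(CLASSES) or len(audio_logits) != len(CLASSES):
        raise ValueError("expected one score per class")
    p_visual = softmax(visual_logits)
    p_audio = softmax(audio_logits)
    fused = [
        visual_weight * v + (1.0 - visual_weight) * a
        for v, a in zip(p_visual, p_audio)
    ]
    best = max(range(len(fused)), key=fused.__getitem__)
    return CLASSES[best], fused


if __name__ == "__main__":
    # Made-up scores: the visual model is nearly uninformative (as it might be
    # under occlusion), while the audio model is confident about one class.
    label, probs = late_fusion(
        [0.1, 0.0, 0.2, 0.0], [0.0, 3.0, 0.0, 0.0], visual_weight=0.2
    )
    print(label, [round(p, 3) for p in probs])
```

Averaging probabilities rather than concatenating features keeps the two models independent, so either one can be down-weighted at run time without retraining the other; whether the authors make the same choice is not stated in the abstract.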
Virtual Room G Bangkok, Thailand
