Authors - Hemamalini Siranjeevi, Swaminathan Venkatraman, Dharshini V, Gayathri A, Sushma Sri R

Abstract - Urban environments generate massive volumes of video data from surveillance and mobile sensors, necessitating efficient and intelligent summarization for smart city and transportation systems. This paper proposes a multimodal video summarization framework that moves beyond object-centric analysis toward high-level urban scene understanding. Unlike traditional methods that rely on low-level visual features or isolated object detection, the proposed approach captures contextual relationships and temporal continuity through a multi-stage pipeline. The system integrates multimodal perception, combining deep learning-based object detection, multi-object tracking, and acoustic analysis to preserve entity identities and environmental context. We employ relational inference and motion heuristics to model spatial and semantic interactions, which are then structured into a Dynamic Knowledge Graph (DKG) representing entities, interactions, and temporal events. A semantic synthesis module, powered by a transformer-based language model, generates concise, coherent, and semantically meaningful summaries. This architecture enables scalable, context-aware video summarization adaptable to real-world urban applications.
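To make the pipeline concrete, the sketch below shows one plausible shape for the Dynamic Knowledge Graph intermediate representation and its linearization into input for the language model. All names here (Entity, Event, DynamicKnowledgeGraph, to_prompt) are illustrative assumptions, not the authors' implementation; in a full system, trained detection, tracking, and acoustic models would populate the graph.

```python
# Hypothetical sketch of the DKG stage of the pipeline described above.
# Class and function names are placeholders, not the paper's actual code;
# real detectors/trackers (e.g., a CNN detector plus a multi-object tracker)
# would supply the entities and inferred interactions.
from dataclasses import dataclass, field

@dataclass
class Entity:
    track_id: int       # identity preserved across frames by the tracker
    label: str          # e.g., "car", "pedestrian"

@dataclass
class Event:
    timestamp: float    # seconds into the video
    subject: int        # track_id of the acting entity
    predicate: str      # interaction inferred by relational/motion heuristics
    target: int         # track_id of the affected entity

@dataclass
class DynamicKnowledgeGraph:
    """Nodes are tracked entities; edges are timestamped interactions."""
    entities: dict = field(default_factory=dict)   # track_id -> Entity
    events: list = field(default_factory=list)     # Event records

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.track_id] = entity

    def add_event(self, event: Event) -> None:
        self.events.append(event)

    def to_prompt(self) -> str:
        """Linearize the graph so a language model can synthesize a summary."""
        lines = []
        for ev in sorted(self.events, key=lambda ev: ev.timestamp):
            subj = self.entities[ev.subject].label
            obj = self.entities[ev.target].label
            lines.append(
                f"t={ev.timestamp:.1f}s: {subj}#{ev.subject} "
                f"{ev.predicate} {obj}#{ev.target}"
            )
        return "\n".join(lines)

# Minimal usage: two tracked entities and one inferred interaction.
dkg = DynamicKnowledgeGraph()
dkg.add_entity(Entity(track_id=1, label="pedestrian"))
dkg.add_entity(Entity(track_id=2, label="car"))
dkg.add_event(Event(timestamp=12.4, subject=1,
                    predicate="crosses_in_front_of", target=2))

# A transformer-based language model would consume this linearized graph;
# here we simply print the structured intermediate form.
print(dkg.to_prompt())
```

Linearizing the graph into timestamped subject-predicate-object lines keeps the language-model input compact while preserving the entity identities and temporal ordering that the tracking and relational-inference stages establish.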