Authors - Soji Binu Mathew, A. Hepzibah Christinal Abstract - Permanent Magnet Synchronous Motors (PMSMs) are commonly utilized in electric vehicle (EV) traction systems because of their high efficiency, power density, and reliability. Conventional field-oriented control (FOC) schemes require accurate rotor position and speed information, typically obtained from mechanical sensors, which increase cost and reduce system reliability. Sensorless control techniques based on observer theory have therefore gained significant attention. Among them, sliding mode observers (SMOs) offer strong robustness against parameter variations and external disturbances but suffer from chattering and noise sensitivity. This paper presents an advanced sensorless FOC strategy for PMSM drives using a super-twisting SMO (ST-SMO) for rotor position and speed estimation. The proposed approach employs the super-twisting algorithm to achieve finite-time convergence while significantly reducing chattering effects. The observer is integrated into a standard FOC framework and evaluated under EV-relevant operating conditions, including low-speed operation and load transients. A comparative performance discussion demonstrates the suitability and effectiveness of the proposed method for high-efficiency EV traction.
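As a hedged illustration of the core mechanism (a scalar toy, not the authors' full observer), the sketch below simulates the super-twisting algorithm driving a sliding variable to zero under a bounded matched disturbance; the gains k1 and k2, the time step, and the disturbance value are illustrative assumptions.

```python
import math

def super_twisting_step(s, v, k1, k2, dt):
    # Super-twisting law: sqrt-modulated switching term plus the
    # integral term v, which absorbs the matched disturbance.
    u = -k1 * math.sqrt(abs(s)) * math.copysign(1.0, s) + v
    v_next = v - k2 * math.copysign(1.0, s) * dt
    return u, v_next

def simulate(s0=1.0, k1=3.0, k2=2.0, dt=1e-4, steps=20000, d=0.5):
    # Toy sliding-variable dynamics s' = u + d with a bounded disturbance d.
    s, v = s0, 0.0
    for _ in range(steps):
        u, v = super_twisting_step(s, v, k1, k2, dt)
        s += (u + d) * dt
    return s
```

With k2 chosen above the disturbance bound, s reaches a small neighbourhood of zero in finite time, and the control signal stays continuous, which is why the super-twisting variant avoids most of the chattering of a first-order SMO.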
Authors - Mohammed Mudassir, Irene Joseph, Jyothi Mandala, Sandeep J Abstract - This study introduces a Bidirectional Long Short-Term Memory based multichannel speech enhancement framework that operates in the short-time Fourier transform domain using time-varying complex spectral masking. The proposed approach predicts channel-specific complex masks, allowing adaptive frame-wise suppression of noise in reverberant and multi-noise environments. A comprehensive dataset was created using multiple noise sources, and experiments were carried out at different signal-to-noise ratios. The proposed method outperformed the Relative Transfer Matrix and Deep Multichannel Active Noise Control techniques in perceptual speech quality and intelligibility across all test conditions, indicating its potential for real-world speech enhancement applications.
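To make the masking operation concrete, here is a minimal sketch of frame-wise complex spectral masking, assuming one STFT frame represented as a list of complex bins. The oracle mask S/X shown here is a common training target in mask-based enhancement, not necessarily the mask the paper's BiLSTM predicts.

```python
def apply_complex_mask(noisy_frame, mask):
    # Element-wise complex multiplication: each STFT bin gets its own
    # time-varying complex gain (both a magnitude and a phase correction).
    return [m * x for m, x in zip(mask, noisy_frame)]

def ideal_complex_mask(clean_frame, noisy_frame, eps=1e-12):
    # Oracle mask S/X; bins with near-zero noisy energy are zeroed out.
    return [s / x if abs(x) > eps else 0j
            for s, x in zip(clean_frame, noisy_frame)]
```

Because the mask is complex rather than a real-valued gain, it can also undo phase distortion, which matters in reverberant conditions.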
Authors - Gauri P Nair, Vinaya V, Dona Sebastian, Kavitha K V Abstract - Reliable stock price forecasting remains challenging due to the noisy, nonlinear, and non-stationary characteristics of financial time-series data. Traditional statistical methods and deep learning models that rely solely on raw price data often struggle to capture short-term fluctuations and evolving market dynamics. To address these limitations, this study proposes a hybrid forecasting framework that integrates causal time-domain filtering, time–frequency feature extraction, and deep learning–based temporal modeling. The proposed approach employs Savitzky–Golay and Kalman filters to suppress high-frequency market noise while preserving important price trends in a causality-aware manner suitable for real-time forecasting. Localized spectral features representing transient and time-varying market behavior are then extracted using the Short-Time Fourier Transform (STFT). These enhanced time-domain and frequency-domain features are combined and modeled using a Long Short-Term Memory (LSTM) network, which effectively captures long-range dependencies and nonlinear temporal patterns in financial data. The framework is evaluated using standard performance metrics, including RMSE, MAPE, and R². Experimental results demonstrate that integrating causal filtering with STFT-based features significantly improves forecasting accuracy and robustness compared to baseline models, providing a reliable and practical solution for short-term and multi-step stock price prediction.
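As one hedged example of the causal filtering step (a scalar random-walk Kalman filter, not the authors' exact filter configuration), the process-noise and measurement-noise settings q and r below are illustrative assumptions.

```python
def kalman_1d(zs, q=1e-3, r=0.04):
    # Causal random-walk Kalman filter: each estimate uses only past and
    # present samples, so it is usable for real-time forecasting.
    x, p = zs[0], 1.0
    out = []
    for z in zs:
        p += q                  # predict: variance grows by process noise q
        k = p / (p + r)         # Kalman gain
        x += k * (z - x)        # update with the measurement z
        p *= 1.0 - k
        out.append(x)
    return out
```

Unlike a centered moving average, this filter never looks ahead, which is the "causality-aware" property the abstract emphasizes for live trading use.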
Authors - Zubair Zaland, Mumtaz Begum Mustafa, Miss Laiha Mat Kiah, Hua-Nong Ting, Zuraidah M Don, Saravanan Muthaiyah Abstract - As digital marketing expands in Oman, many organizations struggle to transform large volumes of customer data into actionable insights. This study presents an AI-driven marketing intelligence framework designed for non-technical users, combining automated customer segmentation, sentiment analysis, and personalized recommendations. The framework employs an autoencoder-based feature extraction approach to capture key behavioral patterns, followed by K-Means clustering to define meaningful customer segments (Berahmand et al., 2024). A fine-tuned BERT model analyzes multilingual feedback in Arabic and English to assess customer sentiment (Manias et al., 2023). The framework was evaluated using 12 months of campaign data from 450 customers across multiple Omani businesses. Analysis revealed four distinct customer groups and an overall positive sentiment of +0.55. Controlled A/B experiments demonstrated that AI-guided campaigns outperformed traditional methods, increasing conversion rates by 27%, improving retention by 15%, and generating a threefold return on marketing spend. These results indicate that accessible AI tools can deliver measurable marketing benefits in emerging markets and provide a scalable solution for Gulf-region businesses.
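A minimal sketch of the segmentation step, assuming small 2-D feature vectors stand in for the autoencoder's latent features: this is plain Lloyd's K-Means with deterministic farthest-point initialization, not the paper's exact configuration.

```python
def kmeans(points, k, iters=20):
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Farthest-point initialization: deterministic and spreads seeds apart.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(d2(p, c) for c in centroids)))

    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda j: d2(p, centroids[j]))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [tuple(sum(v) / len(c) for v in zip(*c)) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return centroids, clusters
```

In the framework described above, the input points would be autoencoder embeddings of customer behaviour, and each resulting cluster corresponds to one customer segment.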
Authors - B.Usha Rani, M.Sudhakar, A.Srivani, Y.Surya Praveen Abstract - The purpose of Diabetic Retinopathy Prediction is to use computer technology to identify early stages of retinal damage caused by diabetes. Since diabetic retinopathy can lead to blindness or permanent vision impairment if not treated in a timely manner, accurate and rapid diagnosis is vital. Current techniques for diagnosing diabetic retinopathy require an ophthalmologist to perform a manual examination of the eye's retina using fundus photography. This diagnostic process can be costly, time-consuming, and vary significantly from one examiner to another. A large percentage of diabetes patients live in rural areas, where it is difficult or impossible for them to have periodic screening by a diabetes specialist or receive healthcare services. To address these problems, the Diabetic Retinopathy Prediction System uses deep learning–based techniques to analyze retinal fundus images and produce predictions regarding diabetic retinopathy. Analysis of the retinal fundus images includes preprocessing, feature extraction using CNNs, and automated classification of diabetic retinopathy by grade and severity. This approach increases the accuracy and consistency of diabetic retinopathy diagnosis while minimizing the need for human input. The proposed system allows for early identification of diabetic retinopathy in resource-poor environments, supports large-scale screening programs, and aids clinical decision making by ophthalmologists. Additionally, the system has potential for integration into mobile health systems and tele-ophthalmology networks. Experimental results indicate the proposed system is capable of accurately detecting diabetic retinopathy with high specificity and sensitivity.
Authors - Bai B Mathura, Narra Dhanalakshmi Abstract - This paper presents a novel Reversible Data Hiding (RDH) method for dual images. First, secret data is converted into a binary sequence of equal length and then divided into shorter segments to control the amount of data embedded into each pixel. The embedding process uses two copies of the original image to distribute the data, reducing the impact on each image while maintaining overall image quality. During recovery, the original image is restored by averaging the pixel values at corresponding locations in the two stego images, while the embedded data is recovered through a reverse process. Experimental results on grayscale images demonstrate that the method maintains good image quality, achieving a high Peak Signal-to-Noise Ratio (PSNR) across different embedding levels while ensuring accurate recovery of both the secret data and the original image.
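One simple embedding rule consistent with the averaging-based recovery the abstract describes (not necessarily the authors' exact rule) splits each data segment between the two stego pixels; clipping at pixel-range boundaries is omitted for brevity.

```python
def embed(pixel, bits):
    # bits: one short binary segment of the secret data, read as integer d.
    d = int(bits, 2)
    s1 = pixel + (d + 1) // 2   # stego image 1 gets ceil(d/2)
    s2 = pixel - d // 2         # stego image 2 gets -floor(d/2)
    return s1, s2

def recover(s1, s2, seg_len):
    pixel = (s1 + s2) // 2      # averaging the two stego pixels restores p
    d = s1 - s2                 # their difference restores the segment
    return pixel, format(d, "0{}b".format(seg_len))
```

Because s1 + s2 = 2p + (d mod 2), the integer average is always exactly the original pixel, and s1 - s2 = d, so both the cover image and the payload are recovered losslessly, matching the reversibility claim.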
Authors - Priyanka Khalate, Satish S. Banait, Chandrakant Kokane, Dnyanada Shinde, Madhumati Pol, Pravinkumar M. Sonsare Abstract - The emerging use of digital deepfake technology is creating a myriad of obstacles to verifying the authenticity of digital media. Most of today's detection methods yield satisfactory results when applied to clean samples of content; however, they remain susceptible to adversarial perturbations specifically crafted to bypass detection. This paper introduces DC-DAFDN, a dual-stream architecture for detecting fraudulent digital content that fuses frequency-domain analysis using the Discrete Cosine Transform (DCT) with spatial attention mechanisms. The architecture uses adversarial training to develop more robust features. The proposed model uses EfficientNet-B4 as a backbone, augmented with Spatial Reduction Attention Blocks and Forged Features Attention Modules to detect manipulation artifacts in the spatial domain, while the parallel DCT stream analyzes inconsistencies in the frequency domain. Through an adversarial training procedure using Fast Gradient Sign Method (FGSM)-induced perturbations, the model learns robust feature sets that resist evasion attacks. When evaluated on the FaceForensics++ dataset, DC-DAFDN significantly improves upon the original Dual Attention for Deepfake Detection Network (DAFDN) in terms of adversarial robustness. Under large adversarial perturbations (FGSM with ϵ ranging from 0.1 to 0.25), DC-DAFDN achieved accuracy improvements from +2.74% up to +3.61%, for an average accuracy increase of +3.36% across all tested attack strengths. Our findings suggest that fusing frequency-domain analysis with adversarial training provides a measurable improvement in robustness to adversarial attacks while preserving the detection capabilities of the dual-attention method.
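To illustrate the FGSM perturbation used during adversarial training, here is a hedged sketch on a simple logistic model with an analytic input gradient; the model, weights, and ϵ value are illustrative assumptions, not the paper's network.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_loss(w, x, y):
    # y in {-1, +1}; linear score z = w.x; logistic loss log(1 + exp(-y*z)).
    z = sum(wi * xi for wi, xi in zip(w, x))
    return math.log(1.0 + math.exp(-y * z))

def fgsm(w, x, y, eps):
    # Input gradient of the loss is -y * sigmoid(-y*z) * w; FGSM moves
    # each feature by eps in the sign of that gradient to maximize loss.
    z = sum(wi * xi for wi, xi in zip(w, x))
    g = -y * sigmoid(-y * z)
    sign = lambda v: 1.0 if v > 0 else -1.0 if v < 0 else 0.0
    return [xi + eps * sign(g * wi) for xi, wi in zip(x, w)]
```

Adversarial training then simply mixes such perturbed samples into each training batch, so the learned features stay discriminative inside the ϵ-ball around every input.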
Authors - Cristian Castillo-Olea, Clemente Zuniga Gil, Angelica Huerta Abstract - Question paper preparation in educational institutions is conventionally manual and time-consuming, often producing question papers of uneven difficulty and limited diversity. This project addresses automatic question paper generation from voluminous academic content available in multiple formats. The motivation is to reduce human effort, enhance efficiency, and ensure fair and balanced assessment generation while supporting modern digital learning environments. Input content in the form of text documents, portable document files, presentation slides, images, audio recordings, and video lectures forms the basis of the proposed system; it is first preprocessed into a unified textual format through document parsing, optical character recognition, and speech-to-text techniques. Natural language processing steps such as sentence segmentation, tokenization, stop-word removal, and key-concept extraction are then applied to identify meaningful and relevant content. The system follows a hybrid approach built on the Transformer architecture: a classification model assesses sentence importance, concept relevance, and difficulty level, while a generation model produces question types such as multiple choice, short answer, long answer, case studies, reasoning, fill-in-the-blanks, and programming. The model is trained and fine-tuned on publicly available question-answer datasets and preprocessed textbook content. Experimental results demonstrate the efficiency of the proposed approach in generating accurate and diverse question papers with high relevance, supporting better question-paper quality and assessment outcomes.
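A minimal sketch of the preprocessing steps named above (sentence segmentation, tokenization, stop-word removal), using naive rule-based logic and an illustrative stop-word list rather than the system's actual NLP stack.

```python
import re

# Illustrative stop-word list; a production system would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "on", "for"}

def segment_sentences(text):
    # Naive splitter on sentence-final punctuation followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def key_terms(sentence):
    # Tokenize, lowercase, drop stop words; the remaining tokens
    # approximate the candidate key concepts fed to the classifier.
    tokens = re.findall(r"[a-zA-Z]+", sentence.lower())
    return [t for t in tokens if t not in STOPWORDS]
```

The classification model would then score each sentence's importance and difficulty from such terms before the generation model turns it into a question.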
Authors - Y. C. A. Padmanabha Reddy, Panigrahi Srikanth, Kavita Goura Abstract - Advances in Artificial Intelligence, Machine Learning, and Internet of Things technologies have enabled wearable devices to sense, process, and respond to human behaviour in real time. While most wearable devices today are used for health and fitness tracking, many people face communication challenges such as language barriers, difficulty understanding emotions or social cues, social anxiety, and accessibility issues for individuals with hearing or speech impairments. Existing systems often collect data but fail to provide meaningful, real-time assistance during actual human interactions. This paper presents a literature-based study of AI-powered wearable devices designed to support and enhance human communication. The reviewed work focuses on intelligent wearables that use multimodal inputs such as microphones, cameras, and other sensors. These systems apply AI techniques to interpret speech, gestures, facial expressions, and emotional signals in real time. The wearable devices considered include everyday consumer-oriented systems such as smart eyewear that provides audio-visual assistance and wrist-worn wearables that offer haptic feedback. The key focus of this study is to examine how such devices can deliver subtle, real-time support through visual prompts, audio cues, or vibrations to improve conversational awareness and user confidence. The expected outcome is to identify current capabilities, practical limitations, and design considerations for developing human-centric wearable technologies that move beyond passive tracking toward meaningful communication support.
Authors - Anisha Panja, Ranjita Kumari Dash, Biswajit Sahoo Abstract - Singer identification is a challenging task because of pitch and melodic variations, tempo, vibrato, and adaptive singing styles. This paper proposes a novel approach to singer identification and classification by adapting a model originally designed for speaker recognition. Specifically, this work utilizes vector representations extracted from a pretrained SpeechBrain Emphasized Channel Attention, Propagation and Aggregation in Time Delay Neural Network (ECAPA-TDNN) model. The research pipeline processes a custom curated dataset of four prominent Indian playback singers into fixed 8-second audio clips, mono-channel, sampled at 16 kHz, and exported as WAV files. The SpeechBrain ECAPA encoder transforms these labelled clips into fixed-length embeddings, which are unique vector representations of the voice characteristics of each audio clip. A suite of classical machine learning classifiers is trained on these embeddings. The study evaluates four of them, namely Logistic Regression, Support Vector Machines, Random Forests, and a Multi-Layer Perceptron (MLP). The MLP achieved the highest accuracy of 99.38% on held-out test data. Supporting this result, both confusion matrix analysis and t-SNE projection demonstrate clear cluster separation based on individual singer identities. These findings collectively validate that ECAPA embeddings carry sufficient identity-bearing structure for the singing voice. The analysis thus concludes that adapting speaker recognition models with appropriate classifiers is a highly effective and efficient approach for singer identification.
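As a hedged sketch of classification on fixed-length embeddings, here is a nearest-centroid cosine classifier, deliberately simpler than the four classifiers evaluated in the paper, with tiny toy vectors standing in for ECAPA embeddings.

```python
import math

def centroid(vectors):
    # Mean embedding per singer, computed from their enrollment clips.
    return [sum(c) / len(vectors) for c in zip(*vectors)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def predict(embedding, centroids):
    # Assign the clip to the singer whose centroid is most cosine-similar.
    return max(centroids, key=lambda name: cosine(embedding, centroids[name]))
```

The clean t-SNE cluster separation reported in the abstract is exactly the regime in which even this simple rule works well; the trained MLP squeezes out the remaining boundary cases.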