Professor, School of Computing and Electrical Engineering and Chairperson of the Centre for Human Computer Interaction, Indian Institute of Technology (IIT), India
Authors - Hector Rafael Morano Okuno Abstract - Mechatronics is an interdisciplinary field that draws on mechanics, electronics, and computer science. In recent years, the term biomechatronics has been used with increasing frequency; it is also a multidisciplinary field, one that involves the biological sciences and, therefore, bioinformatics. With the development of AI, bioinformatics provides data to biomechatronic systems, enabling applications ranging from agriculture to medicine. This article explores how biomechatronics and computational fluid dynamics (CFD) simulations can help monitor a person's health status. The objectives of this research were: 1) to determine whether, using biomarkers such as hemoglobin, fibrinogen, and low-density lipoprotein (LDL), among others, together with CFD simulations, it is possible to obtain blood flow velocity profiles; and 2) to investigate whether the information from CFD simulations can be used to feed a biomechatronic system that monitors a person's health conditions. The results indicate that two things are needed: models that relate the main biomarkers to a person's state of health, and suitable sensors to measure each variable according to the intended application, for example physical training or nutrition monitoring.
Authors - Radha Gawande, Supriya Nara Abstract - Given the complex nature of the intensive care unit (ICU), immediate and accurate decision-making is vital to patient survival. Healthcare providers struggle with information overload, slow decision-making, and human error, all driven by the growing amount and variety of patient information. Recent developments in artificial intelligence (AI) offer promising solutions because they facilitate effective data analysis, pattern detection, and predictive modelling, and they are changing the provision of critical care. This paper discusses the evolving application of AI in ICUs, covering its uses, merits and demerits, and technological basis. It also discusses AI methods such as machine learning (ML), deep learning (DL), natural language processing (NLP), and expert systems. Predictive analytics, early sepsis detection, clinical decision support systems, automated monitoring, and insight-based treatment supported by NLP-driven documentation are a few of the practical ways of applying AI. The advantages of automation and robotics for enhancing productivity and patient care, such as AI-based medication delivery systems and robotic helpers, are also discussed. Nonetheless, implementing AI in critical care units faces a number of challenges, including a lack of consensus, algorithmic bias, difficulty in understanding model decisions, and heterogeneous data. Personalized AI-driven care in the ICU, the integration of edge computing and the Internet of Medical Things (IoMT), and reinforcement learning for adaptive patient management are some of the future prospects [1].
Authors - Priyanka Patel, Ashvi Padshala, Moxa Patel Abstract - This paper surveys recent advances in the application of data analysis, machine learning, artificial intelligence, and big data techniques for climate pattern detection. It covers sources of climate data, analytical methods, computational architectures, key challenges, and emerging trends. The focus is on identifying how integrated data-driven methods enhance the understanding, prediction, and interpretability of climate phenomena.
Authors - Rohan Dafare, Supriya Narad Abstract - The rapid spread of big data and the rising need for real-time analytics have exposed the inherent limits of traditional relational database management systems (RDBMS). NoSQL ("Not SQL") databases offer schema-less design, horizontal scaling, and flexible data modeling, making them a better fit for handling unstructured and semi-structured data at scale. This paper examines the advantages of NoSQL over SQL systems in terms of key traits: data model flexibility, performance under high throughput, ease of horizontal scaling, and fit with cloud-native setups. Through a careful review of NoSQL teaching and use, we summarize real-world findings and suggest ways to choose the right database technology based on application needs. The discussion ends with a plan to help practitioners and educators understand when and why to use NoSQL solutions instead of, or alongside, classic SQL databases. Modern data-intensive workloads, driven by real-time analytics, large-scale user interactions, IoT streams, and unstructured content, demand storage systems capable of delivering high throughput, scalability, and flexible data models. Traditional SQL databases continue to offer strong consistency, ACID guarantees, and structured schema support, making them ideal for transactional applications and environments requiring strict data integrity. However, as data volume, variety, and velocity increase, NoSQL databases have emerged as a powerful alternative, providing horizontal scalability, schema-less design, and optimized performance for distributed and semi-structured data processing.
Authors - Anshuman Prajapati, Madhav Desai, Priyanka Patel Abstract - Analysis of facial skin conditions is essential for both dermatological and cosmetic evaluation; however, inter-class similarity and localized texture variations make multi-label classification of characteristics like wrinkles, dark circles, enlarged pores, hyperpigmentation, pimples, and fine lines difficult. The effectiveness of transfer learning for this task is examined in this paper, and an attention-enhanced framework based on EfficientNet-B0 is proposed. In order to highlight the importance of pre-trained feature representations, we first assess a bespoke convolutional neural network (CNN) as a baseline. We build upon this by using EfficientNet-B0 as the transfer learning backbone together with the Convolutional Block Attention Module (CBAM), which combines channel and spatial attention processes to enhance discriminative feature localization while maintaining computational efficiency. According to experimental data, our CBAM-augmented EfficientNet achieves better class-balanced performance in macro-F1 score than both the baseline EfficientNet and the bespoke CNN. Consistent increases are confirmed by per-class analysis and confusion matrices, even for difficult settings. Additionally, Grad-CAM visualizations show that by concentrating activation on pertinent facial regions, the attention mechanism improves interpretability. These results imply that attention-guided transfer learning is a promising avenue for multi-label dermatological image analysis.
Authors - Fredy Gavilanes-Sagnay, Edison Loza-Aguirre, Luis Castillo-Salinas, Narcisa de Jesus Salazar Alvarez Abstract - Ayurveda, India's ancient system of medicine, is full of interconnected knowledge about diseases, their symptoms, herbs, and formulations (compounds). However, texts such as the Charaka Samhita are mostly unstructured and cannot be readily analysed computationally. This work presents AyurKOSH, a machine-readable, high-quality Ayurvedic dataset designed as a Knowledge Graph (KG) to support Artificial Intelligence driven research. The dataset is represented as subject–predicate–object triplets, which enables semantic interoperability, graph traversal, and multi-hop inferencing across entities. The dataset follows a schema-driven ontology that standardizes relationships between node types such as diseases, symptoms, pharmacological attributes, and compound formulations. The database schema ensures consistency and computational tractability. AyurKOSH holds structured data on diseases and related symptoms, drug preparations, herbs, and the detailed pharmacological properties Rasa, Guna, Virya, Vipaka, and Karma. The graph structure shows real-world biomedical network characteristics such as high sparsity and low average degree, which makes it suitable for embedding-based learning, graph neural networks, and explainable AI frameworks. Moreover, botanical metadata and herb-substitution relationships are included to support synergy prediction and drug repurposing. The dataset facilitates applications in biomedical NLP, automated reasoning systems, clinical decision assistance, and pedagogy in integrative medicine. AyurKOSH is available for academic and non-commercial research under the CC BY-NC-SA 4.0 license.
Authors - W M I T Warnasooriya, T D Jayadeera, A M G S Adhikari, M A F Zumra, A J Vidanaralage, M Samaraweera Abstract - The integration of large language models (LLMs) into primary education remains limited in low-resource, diglossic languages like Sinhala. General-purpose models often produce grammatically inconsistent or cognitively overwhelming output for young learners. This paper introduces a grade-adaptive, constraint-driven framework for automated Sinhala story and quiz generation targeting Grades 1-5. Building upon an 8-billion-parameter Sinhala-adapted LLaMA 3 model, we apply Quantized Low-Rank Adaptation (QLoRA) using a curated multi-task educational dataset. The system enforces tier-specific linguistic constraints, separating conversational Sinhala for lower grades from formal written Sinhala for upper grades, while embedding strict structural rules such as controlled sentence counts (5-6 vs. 7-8) and validated multiple-choice formats (3 vs. 4 options). Evaluation on 100 structured prompts demonstrated substantial improvements over a zero-shot baseline: structural compliance increased from 64% to 93%, and hallucination-related failures decreased from 31% to 8%. Furthermore, evaluation against 50 unseen real-world classroom prompts yielded a 0.0% crash rate and 95% register adherence, confirming robust qualitative performance. Results demonstrate that diglossia-aware dataset engineering and constraint-aware fine-tuning enable reliable, pedagogically aligned deployment of LLMs in low-resource primary learning environments.
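The structural rules in this abstract (sentence counts and option counts per tier) lend themselves to a mechanical check on generated output. The following is a minimal sketch of such a check, not the authors' implementation. The abstract does not say which grades belong to which tier, so the boundary (`first_upper_grade`) is an assumption, as are the names `TierRules`, `rules_for_grade`, and `check_structure` and the shape of the quiz records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierRules:
    min_sentences: int
    max_sentences: int
    quiz_options: int


# Rule values come from the abstract (5-6 vs. 7-8 sentences, 3 vs. 4 options).
# Which tier gets which values is assumed from the order in which they are listed.
LOWER_TIER = TierRules(min_sentences=5, max_sentences=6, quiz_options=3)
UPPER_TIER = TierRules(min_sentences=7, max_sentences=8, quiz_options=4)


def rules_for_grade(grade: int, first_upper_grade: int = 3) -> TierRules:
    """Pick the rule set for a grade; the tier boundary is a hypothetical default."""
    if not 1 <= grade <= 5:
        raise ValueError("grade must be between 1 and 5")
    return UPPER_TIER if grade >= first_upper_grade else LOWER_TIER


def check_structure(sentences: list[str], quiz: list[dict], grade: int) -> list[str]:
    """Return a list of human-readable violations; an empty list means compliant."""
    rules = rules_for_grade(grade)
    problems = []
    if not rules.min_sentences <= len(sentences) <= rules.max_sentences:
        problems.append(
            f"story has {len(sentences)} sentences, expected "
            f"{rules.min_sentences}-{rules.max_sentences}"
        )
    for i, question in enumerate(quiz):
        options = question["options"]
        if len(options) != rules.quiz_options:
            problems.append(f"question {i}: expected {rules.quiz_options} options")
        if len(set(options)) != len(options):
            problems.append(f"question {i}: duplicate options")
        if question["answer"] not in options:
            problems.append(f"question {i}: answer is not one of the options")
    return problems
```

A check of this kind could sit after generation and reject or regenerate non-compliant output; register adherence (conversational vs. formal Sinhala) would need a separate linguistic check that this sketch does not attempt.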
Authors - S. M. Mizanoor Rahman Abstract - Removable USB storage devices are widely used in day-to-day computing, but they also introduce risks such as unauthorized data transfer and misuse of external media. Understanding how these devices are used on a system is important during forensic investigations, especially when analyzing potential data leakage incidents. On Windows systems, traces of USB activity are not stored in a single location. Instead, they are distributed across registry entries, system logs, and file system records. Examining these sources individually often makes it difficult to form a clear picture of events. This paper introduces a forensic framework that brings together USB-related artifacts from multiple system components and analyzes them in a unified manner. The method gathers data from sources such as registry entries, Plug-and-Play logs, and file system structures, and then aligns them based on their timestamps. A Python-based implementation is used to automate this process and to relate device connection events with file operations. Experiments conducted on a Windows setup show that the framework can identify device usage and reconstruct the sequence of related activities with clarity. By combining evidence into a single timeline, the approach helps simplify analysis and supports consistent interpretation of results.
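The timestamp alignment described here can be illustrated with a small sketch. It assumes the artifacts have already been parsed into records; the `Event` type, the `kind` labels, and the 10-minute correlation window are hypothetical choices for illustration, not details taken from the paper, and no actual registry or log parsing is shown.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Event:
    timestamp: datetime
    source: str  # e.g. "registry", "pnp_log", "filesystem" (illustrative labels)
    kind: str    # "device_connect" or "file_op" (illustrative labels)
    detail: str


def build_timeline(*sources: list[Event]) -> list[Event]:
    """Merge events from all artifact sources into one list ordered by time."""
    return sorted((e for src in sources for e in src), key=lambda e: e.timestamp)


def relate_file_ops(
    timeline: list[Event], window: timedelta = timedelta(minutes=10)
) -> list[tuple[Event, Event]]:
    """Pair each file operation with the most recent earlier device connection,
    provided the two are no more than `window` apart (an assumed heuristic)."""
    pairs = []
    last_connect = None
    for event in timeline:
        if event.kind == "device_connect":
            last_connect = event
        elif event.kind == "file_op" and last_connect is not None:
            if event.timestamp - last_connect.timestamp <= window:
                pairs.append((last_connect, event))
    return pairs
```

Normalizing every timestamp to UTC before merging matters in practice, because different Windows artifacts record times in different formats and time zones.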
Authors - Shamita Jagarlamudi, Soormayee Joshi, Aman Aditya, Anushka Gangwar, Pratvina Talele Abstract - Federated Learning (FL) is a privacy-preserving, distributed learning framework where models are trained locally on client devices, and only the trained parameters are shared with a central server. Nevertheless, FL encounters substantial obstacles in real-world applications due to data heterogeneity: non-IID distributions lead to local inconsistencies and client drift, thereby diminishing global model efficacy. To tackle these challenges, we propose Federated Prox Drift Correction (FedPDC), an effective and practical method designed to mitigate client drift and local overfitting through the use of drift correction and proximal terms. Comprehensive experiments conducted on public datasets demonstrate that FedPDC outperforms state-of-the-art methods.
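The abstract does not give FedPDC's update rule. As a rough illustration of what "drift correction and proximal terms" can look like in a local client step, the sketch below combines a FedProx-style proximal pull toward the global model with a control-variate correction in the style of SCAFFOLD. This combination, the function name `local_step`, and all parameters are assumptions made for illustration and should not be read as the authors' method.

```python
def local_step(
    w: list[float],         # current local weights
    grad: list[float],      # local loss gradient at w
    w_global: list[float],  # global model received at the start of the round
    c_local: list[float],   # client drift estimate (control variate)
    c_global: list[float],  # server drift estimate (control variate)
    lr: float = 0.1,
    mu: float = 0.01,
) -> list[float]:
    """One gradient step on the corrected, proximally regularized objective.

    Effective gradient: grad + mu * (w - w_global) - c_local + c_global
      - mu * (w - w_global) penalizes moving far from the global model,
        which limits local overfitting;
      - (c_global - c_local) nudges the step toward the average update
        direction across clients, counteracting client drift.
    """
    return [
        wi - lr * (gi + mu * (wi - wgi) - cli + cgi)
        for wi, gi, wgi, cli, cgi in zip(w, grad, w_global, c_local, c_global)
    ]
```

When the local weights equal the global weights and the two control variates agree, the step reduces to plain gradient descent, which is a useful sanity check for any variant of this idea.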
Authors - U. A. Walke, G. A. Kulkarni, Pranav Mungankar, Om Kale, Tejas Kadam Abstract - Digitizing damaged historical texts requires multiple processing steps that can propagate semantic noise through the workflow. Efforts have been made to improve the recognition, correction, and normalization steps of the pipeline, but few studies have quantified model-level effects in isolation under a controlled architecture setup. Here we present Probanza, an extensible staged evaluation framework that decouples preprocessing normalization from semantic modeling to facilitate clean comparisons between LLMs. We perform super-resolution, contextual correction, and historical normalization before English translation. We selected a total of 30 degraded pages from the Florentine Codex and digitized them with three LLM configurations: GPT-5, GPT-4o, and Gemini 3 Flash. Cosine similarity was computed between model predictions and archival baseline translations to measure semantic accuracy. A one-way repeated-measures ANOVA was conducted to examine differences across configurations. The analysis revealed a significant main effect of LLM configuration. Gemini 3 Flash produced the highest mean similarity (M = .881, SD = .075), while GPT-5 (M = .783, SD = .147) and GPT-4o (M = .769, SD = .135) were not significantly different from one another. Our results demonstrate that significant differences exist between LLM configurations for the task of digitizing damaged historical texts when preprocessing is held constant. Probanza allows model-level effects to be compared in isolation in LLM-based historical digitization workflows.
Authors - Kushall Pal Singh, Vijay Kumar, Monu Verma, Dinesh Kumar Tyagi, Santosh Kumar Vipparthi Abstract - Hybrid enterprise environments spanning on-premises systems and public cloud services increase exposure to credential abuse, lateral movement, and misconfiguration-driven attack paths, motivating continuous verification and policy enforcement beyond perimeter assumptions. This paper presents an Azure-native, AI-enhanced Zero Trust framework that integrates identity-first enforcement (Microsoft Entra Conditional Access, Continuous Access Evaluation, and Privileged Identity Management), telemetry centralization (Microsoft Sentinel with UEBA), and an Azure Machine Learning classifier that outputs a probability-derived 0–100 trust score. Because identity policy engines consume bounded native signals, the framework binds external scoring to enforcement using SOAR automation that updates policy-targeted identity group membership via Microsoft Graph. A controlled A/B evaluation compares a static baseline (non-adaptive enforcement) with an adaptive mode (ML-in-the-loop scoring and automated score-to-policy binding) using MITRE ATT&CK-aligned scenarios: impossible travel sign-in, privilege escalation attempts via privileged activation workflows, and lateral movement via remote access/filesharing pathways. Quantitative outcomes are reported using median (P50) and tail (P95) time-to-detect, decision latency, and false-positive rate. To technically validate the adaptive control loop, the paper also reports an instrumented latency decomposition (trigger delay, playbook runtime, ML scoring call duration, and score-to-policy execution time) to show which components dominate end-to-end delay.
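The score-to-policy binding described above can be pictured with a short sketch. It shows only the mapping step: a classifier probability becomes a 0-100 score, and the score selects a policy-targeted group. The direction of the mapping (higher risk probability gives lower trust), the cut-offs, and the group names are all hypothetical; the abstract does not specify them, and the Microsoft Graph call that would actually change group membership is deliberately left out.

```python
def trust_score(p_risky: float) -> int:
    """Map a risk probability in [0, 1] to an integer trust score in [0, 100].

    Assumes the classifier outputs the probability that a session is risky,
    so trust is the complement. This mapping is an illustrative choice.
    """
    if not 0.0 <= p_risky <= 1.0:
        raise ValueError("probability must be within [0, 1]")
    return round(100 * (1.0 - p_risky))


# Hypothetical score floors and group names, ordered from most to least trusted.
POLICY_GROUPS = [
    (80, "zt-trusted"),
    (50, "zt-step-up-mfa"),
    (0, "zt-restricted"),
]


def target_group(score: int) -> str:
    """Return the first group whose score floor the trust score meets."""
    for floor, group in POLICY_GROUPS:
        if score >= floor:
            return group
    raise ValueError("score must be non-negative")
```

In a setup like the one described, a SOAR playbook would call something like `target_group(trust_score(p))` and then move the identity into the returned group, so that the Conditional Access policy scoped to that group takes effect.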
Authors - Karuppasamy E, Krithika V, Harish P, Pravinbaalaa V, Satheeskumar Abstract - The large volume of online data contains duplicated and plagiarized content. Artificial Intelligence has made data generation very easy, but the generation process may not be ethical. Hence, there is a need to validate that data is free of plagiarism before it is used as authentic. In this research work, the authors focus on word-level plagiarism detection methods in Natural Language Processing. The proposed method uses a comparative analysis of cosine similarity, Euclidean distance, and Manhattan distance for word-level plagiarism detection across different n-gram sizes. Using larger n-gram sizes improved accuracy compared to unigram-based methods. In the experiments, the cosine similarity method outperforms the Euclidean and Manhattan distance methods, achieving average accuracy ranges of 88% to 92% for direct plagiarism and 75% to 80% for lightly paraphrased text. Future work is to identify reused images and visual content.
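The three measures compared in this abstract can be computed directly on word n-gram count vectors. The sketch below shows one straightforward way to do that with the standard library; the tokenization (lowercasing and whitespace splitting) and the function names are assumptions, since the abstract does not describe preprocessing.

```python
from collections import Counter
from math import sqrt


def word_ngrams(text: str, n: int) -> Counter:
    """Count word n-grams after lowercasing and splitting on whitespace."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two count vectors (1.0 means same direction)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def euclidean_distance(a: Counter, b: Counter) -> float:
    """Straight-line distance between count vectors (0.0 means identical)."""
    return sqrt(sum((a[k] - b[k]) ** 2 for k in a.keys() | b.keys()))


def manhattan_distance(a: Counter, b: Counter) -> float:
    """Sum of absolute count differences (0 means identical)."""
    return sum(abs(a[k] - b[k]) for k in a.keys() | b.keys())
```

Cosine similarity ignores vector length, so a short excerpt copied from a long document can still score highly, whereas the two distance measures grow with the difference in document length. That is one plausible reason for the gap the abstract reports, though the paper itself would need to confirm it.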
Authors - Nagaraj.M, V. Balamurugan, Matam Veera Chandra Kundan, M.J. Mathesh, V. Vijairam Abstract - Academic credential fraud is a global issue that undermines institutional trust. Although blockchain solutions provide immutability, they are generally reactive, securing documents only after potential errors or fraud have already occurred. This paper proposes a proactive approach to prevent inconsistencies before degree issuance. We introduce a hybrid model that integrates Digital Twins as a preventive validation layer and Multichain as an immutable ledger. The Digital Twin operates as a virtual sensor during the degree creation process at Universidad El Bosque, simulating and validating academic, financial, and national exam data (Saber Pro) in real time; if inconsistencies are detected, "red flags" are triggered prior to issuance. Once validated, the degree's hash is anchored to a Multichain network. A functional prototype developed in Python achieved a 100% detection rate of inconsistent records during testing. The proposed model transforms the academic certification process into a proactive, secure, and trustworthy ecosystem by combining preventive validation with blockchain immutability.
Authors - S. M. Mizanoor Rahman Abstract - Driver fatigue is a major cause of road accidents and poses serious safety risks for drivers and passengers alike. Real-time detection of driver fatigue can help avert accidents by warning the driver about impending lapses in attention. This paper proposes a real-time automated system that detects driver fatigue by observing eye blinks and yawns, which are major indicators of fatigue. The system combines deep learning models that detect a drowsy driver with high accuracy. Eye blinks are detected using a state-of-the-art object detection model trained to locate the open and closed states of the eyes through accurate coordinate mapping, achieving an accuracy of 96 percent. Yawning is detected using a combination of CNN and LSTM models that analyze both the spatial and the temporal information in video, achieving an accuracy of 98 percent. Both modules operate on real-time camera input, which enables continuous monitoring of the driver's alertness. Whenever the driver is found to be dozing off, as indicated by excessive blinking or yawning, the system issues a real-time auditory alert. The experimental results show that the combined system operates reliably with low-latency responses in real time. The study shows that a hybrid detection strategy with spatial and temporal analysis is effective in detecting a drowsy driver and can help improve road safety.
Authors - Kaniska D, Shreya J V, Srinidhi K, Sudhakar K S, Bagavathi Sivakumar P, Krishna Priya G Abstract - Language modeling of clinical text in healthcare requires sufficient context along with a high level of security for sensitive patient information. A few large language models have shown very good performance on clinical tasks such as documentation and summarization, and these models have been made freely available. However, these models can generate hallucinated or non-verifiable outputs. Retrieval-augmented approaches address the problem by limiting the answer to the retrieved evidence. However, the majority of existing systems rely on textual records only, and diagnostic imaging is not integrated systematically. In this paper, we put forward a retrieval-grounded multimodal clinical modeling framework that unifies structured clinical text with imaging-derived contextual features. A patient-specific vector indexing approach is used for isolated retrieval, and a modality-aware visual analytics approach turns imaging outputs into structured signals that feed language generation. The entire framework runs fully offline, thus supporting privacy-preserving deployment in resource-limited clinical settings. Experimental results show stable multimodal integration as well as semantic consistency between the retrieved evidence and the generated output.
Authors - Pratham Vasa, Amishi Desai, Chahel Gupta, Avani Bhuva, Mohini Reddy Abstract - Content Delivery Networks (CDNs) play an essential role in enhancing content delivery speed by caching frequently requested data in edge servers distributed across geographical regions. Traditional CDNs utilize rule-based policies and machine learning approaches for optimizing the cache. Machine learning is performed centrally, and the cache optimization is performed using the traffic logs collected by the central server. Although the use of central learning approaches is beneficial, it poses certain limitations, including data privacy and high communication cost. The central learning approach aggregates raw data, which poses data privacy issues. This paper proposes an architecture for secure federated learning, which is utilized for cache hit prediction in CDNs. The proposed architecture is evaluated using a synthetic dataset containing 1,30,548 records, and the features include temporal and network features. The proposed architecture is compared with the traditional central learning approach, and the results reveal that the secure federated learning model achieves an accuracy of 70.15%, which is comparable to the central learning approach. The proposed architecture is found to reduce data privacy exposure by 30%.
Authors - Syed Shanika Zaida, Kamineni Leela Tapaswi, Kilari Dhana Malikarjuna Rao, Adarapu Sandeep, Amar Jukuntla Abstract - Removable USB storage devices are widely used in day-to-day computing, but they also introduce risks such as unauthorized data transfer and misuse of external media. Understanding how these devices are used on a system is important during forensic investigations, especially when analyzing potential data leakage incidents. On Windows systems, traces of USB activity are not stored in a single location. Instead, they are distributed across registry entries, system logs, and file system records. Examining these sources individually often makes it difficult to form a clear picture of events. This paper introduces a forensic framework that brings together USB-related artifacts from multiple system components and analyzes them in a unified manner. The method gathers data from sources such as registry entries, Plug-and-Play logs, and file system structures, and then aligns them based on their timestamps. A Python-based implementation is used to automate this process and to relate device connection events with file operations. Experiments conducted on a Windows setup show that the framework can identify device usage and reconstruct the sequence of related activities with clarity. By combining evidence into a single timeline, the approach helps simplify analysis and supports consistent interpretation of results.
Authors - Sanchi Mahajan, Nandini Jain, Evangelin G, Jansi K R, Shivam Shivam Abstract - Efficient work planning in heterogeneous multi-cloud infrastructures remains an open issue due to scalability limitations, data privacy, and latency sensitivity. The conventional centralized scheduling approach requires data aggregation, which raises critical privacy challenges and communication cost. This work designs a privacy-preserving federated multi-cloud task scheduling framework for smart mobility applications to overcome these limitations. The framework employs a decentralized scheduler for each cloud region and a novel task abstraction approach that transforms real-time traffic data into task-scheduling forms. It eliminates the need to communicate raw traffic data by employing a federated learning based aggregation approach, which supports scalability, routing, and multi-cloud coordination while ensuring data locality. The framework is evaluated in experiments alongside Random, Rule-Based, and Local-ML approaches using a Smart Mobility dataset. The results show considerable reductions in communication overhead and privacy leakage while preserving competitive execution latency and SLA compliance. The communication scalability results indicate that the strategy scales well as the number of cloud regions increases. The main contribution of this work is its support for federated, scalable, and privacy-aware job scheduling for smart traffic systems without central data sharing.
Authors - Thota Neha, Napa. Sai Gopi, R. Aarthi Abstract - The increasing realism of deepfake media has raised significant concerns regarding the authenticity of digital content. Most existing detection methods rely on audio–visual fusion, which often introduces additional complexity and may degrade performance when one modality is unavailable or unreliable. This work presents a dual-stream deep learning framework that processes audio and video independently, avoiding explicit fusion. The audio stream employs a CNN–BiLSTM model on log-Mel spectrograms to capture temporal and spectral artifacts, while the video stream uses EfficientNet-B0 with BiLSTM to model spatial inconsistencies and temporal variations in facial sequences. Experiments conducted on multiple benchmark datasets, including ASVspoof 2019, WaveFake, LJSpeech, FaceForensics++, and Celeb-DF (v2), demonstrate that the proposed approach achieves competitive detection performance. In addition, the framework maintains robustness under missing modality conditions and offers improved interpretability compared to fusion-based methods. These results indicate that independent modality-specific learning provides a practical and effective alternative for deepfake detection in real-world scenarios.
Authors - Ankit Podder, Piyush Ranjan Das, Soham Acharya, Ayushmaan Singh, Soumitra Sasmal, Partho Mallick Abstract - Static perimeter-based security architectures are now ineffective in the current threat scenario. The ability of attackers to obtain legitimate credentials and the presence of zero-day exploits often cause real-time breaches of the network perimeter. An area of concern is the real-time monitoring of these systems. In the current scenario, security monitoring is performed in a segregated manner, where network analysts analyze time-stamped network logs and identity analysts analyze time-stamped login attempts, without cross-referencing in real time between these two domains. The proposed solution is a fusion platform capable of ingesting raw network transport data and real-time human element monitoring data. This is achieved through the integration of two different threat detection mechanisms using a FastAPI backend. The first threat detection system is the Network Threat Detector (NTD), implemented in Python and using the Scapy library to parse deep packet data in real time for flow analysis. The second threat detection system is a JavaScript tracker designed for monitoring digital behavioral indicators and calculating real-time metrics such as mouse velocities, accelerations, kinematic jerk, and typing speeds. Real-time monitoring is achieved through a machine learning framework with three different modules: inferring user intent using the Random Forest algorithm, detecting anomalous statistical patterns using the Isolation Forest algorithm, and detecting malicious plaintext syntax using Logistic Regression. The system has been tested in a lab scenario and has been able to classify user sessions into four different states (Engaged, Confused, Frustrated, and Suspicious) with accuracy exceeding 95%. These digital behavioral indicators are fed into the Network Threat Detector (NTD), allowing the computation of a real-time risk score.
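The mouse metrics named in this abstract (velocity, acceleration, kinematic jerk) are successive time derivatives of cursor movement, and can be estimated from sampled positions by finite differences. The sketch below is written in Python for consistency with the other examples here, although the abstract describes a JavaScript tracker; the sample format and function names are assumptions, and it works with scalar speed along the path rather than velocity vectors.

```python
from math import hypot


def _derivative(values: list[float], times: list[float]) -> tuple[list[float], list[float]]:
    """Finite-difference rate of change between consecutive samples.

    Returns the rates and the midpoint time of each interval, so the
    result can be differentiated again.
    """
    rates = [
        (values[i + 1] - values[i]) / (times[i + 1] - times[i])
        for i in range(len(values) - 1)
    ]
    mids = [(times[i] + times[i + 1]) / 2 for i in range(len(times) - 1)]
    return rates, mids


def mouse_kinematics(samples: list[tuple[float, float, float]]) -> dict[str, list[float]]:
    """Estimate speed, acceleration, and jerk from (t_seconds, x_px, y_px) samples.

    Timestamps must be strictly increasing. Each derivative order yields one
    fewer value than the one before, so jerk needs at least four samples.
    """
    times = [t for t, _, _ in samples]
    # Cumulative distance travelled along the cursor path.
    distance = [0.0]
    for (_, x0, y0), (_, x1, y1) in zip(samples, samples[1:]):
        distance.append(distance[-1] + hypot(x1 - x0, y1 - y0))
    speed, t_speed = _derivative(distance, times)
    acceleration, t_accel = _derivative(speed, t_speed)
    jerk, _ = _derivative(acceleration, t_accel)
    return {"speed": speed, "acceleration": acceleration, "jerk": jerk}
```

Summary statistics of these series (for example mean speed or peak jerk per session) would be the kind of feature a classifier such as a Random Forest could consume, though the abstract does not say which features the authors actually use.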
Authors - Lavu Uha Saranya, T.V.S.S. Reddy, I.V.M.K. Sarma, Dipesh Kumar Kushwaha, T.N.V.D. Sai Krishna Abstract - Digital forensic investigations have typically focused on identifying private browsing at the application layer using artifacts from memory and disk, even though modern browsers rely extensively on the operating system for fundamental capabilities such as rendering, input processing, and networking. This paper extends the forensic scope by demonstrating that data from private sessions remains in shared OS subsystems in volatile memory. In particular, it examines three primary components of the Linux desktop environment: the display compositor (GNOME Shell), the input pipeline (IBus daemon), and the network resolver (systemd-resolved). Using physical memory acquisitions via LiME on an Ubuntu 25.04 system, the study tracked the migration of high-entropy inputs across these subsystems. The results indicate that critical session data, including window metadata associated with Wayland sessions, plaintext keystroke data received through D-Bus, and fallback queries made via DNS-over-HTTPS, remained in OS-managed memory for extended periods after the private browsing session ended. The paper provides a reproducible framework for OS-level memory analysis and demonstrates that browser-based privacy controls are structurally insufficient to fully sanitize volatile memory.
Authors - Venkata Saikumar Thalupuru, Shubham Kumar, Santhoshini Pranathi Singaraju, Vishal Gupta Abstract - As online banking and digital payments have grown rapidly, institutions have become increasingly exposed to credit card fraud, which is now a major challenge for traditional banks and other financial institutions. Severe class imbalance in transaction datasets is one of the greatest challenges in fraud analytics, since rare fraudulent activity makes up only a tiny fraction of total transactions. Traditional machine learning models often achieve high overall accuracy but are poor at detecting the rare fraud cases. To overcome this limitation, this study proposes a cost-aware hybrid framework comprising Attention-based Long Short-Term Memory (Attention-LSTM) and ensemble-based machine learning. The method preprocesses the data, balances the classes using SMOTE, selects features based on mutual information, and leverages a soft-voting ensemble of Logistic Regression, Random Forest, and XGBoost models. Cost-aware learning is coupled with decision threshold tuning to minimize false negative predictions. Additionally, SHAP-based explainability is added for enhanced transparency and interpretability of the model. The experimental results show 99.3% accuracy, 0.905 precision, 0.892 recall, 0.898 F1-score, and 0.98 ROC-AUC, indicating that the proposed framework is effective in detecting genuine financial fraud.
Authors - Ismail Suleiman, Dinesh Reddy Vemula, Abhaya Kumar Pradhan Abstract - This paper presents the evaluation and demonstration phases of a Design Science Research Methodology (DSRM) study that produced the Organisational Security Culture Framework (OSCF) for Namibian Public Enterprises. An empirical needs assessment established a three-tier security culture maturity deficit: a 40% policy awareness gap; a widespread misconception among non-IT staff that cybersecurity is solely an IT responsibility; and a training gap in which 25% of staff had received no formal security training in the preceding year. The OSCF comprises five interrelated components: Risk Assessment, Security Policy and Enforcement, Security Compliance, Training and Awareness, and Ethical Conduct. Demonstration was executed across four staged phases: baseline assessment, component testing, pilot integration, and full-scale deployment. Evaluation employed a dual approach: expert panel review against eight criteria and Key Performance Indicator (KPI) measurement across five strategic objectives. Results confirm that the OSCF closed the 40% policy awareness gap, achieving 95% staff awareness post-implementation, and significantly reduced phishing susceptibility. Seven evidence-based refinements evolved the OSCF from a static policy model into a continuous security culture maturity loop. The framework's modular, tiered architecture supports long-term sustainability of behavioural change and scalable deployment across organisations of varying cybersecurity maturity, including federated multi-institutional environments.