Friday April 10, 2026 9:30am - 11:30am GMT+07

Authors - Gia Nghi Thoi, My An Tran, Tram Thi Tuyet Le, Nhat Van Hoang Nguyen, Long Hong Buu Nguyen, Dien Dinh
Abstract - Medical diagnosis using Small Language Models (SLMs) often suffers from hallucinations and knowledge inconsistency. While reinforcement learning (RL) from knowledge graph feedback offers a potential solution, pure reinforcement learning strategies often encounter challenges related to sample inefficiency and poor exploration. To address this, a hybrid training pipeline that combines supervised alignment with structural reinforcement is proposed. The method applies knowledge-guided supervised fine-tuning (SFT) with hard negatives to refine decision boundaries and employs a bipartite-specific reward model to capture interactions between symptoms and diseases. Experiments on multiple medical datasets, including DXY, GMD, and MED-D, demonstrate that this hybrid approach outperforms pure RL methods. By incorporating knowledge graph (KG) information as a structural regularizer, the model achieves improved accuracy, stronger cross-dataset generalization, and reduced overfitting while maintaining strict adherence to diagnostic output constraints.
Virtual Room F Bangkok, Thailand
