Authors - Komendra Sahu, Aayush Sahu, Aparajita Vaish, Kavita Jaiswal

Abstract - The AWARE framework (USENIX ATC ’23) applied meta-learning so that reinforcement learning (RL) agents could adapt more quickly to different workload patterns. However, this approach still assumes that workloads seen during deployment resemble those used during training. When this assumption breaks, system performance can degrade. In practice, workload behavior often changes due to traffic spikes, configuration updates, or shifts in resource demand. Under these conditions, a fixed meta-policy may no longer reflect the current environment, leading to unstable scaling decisions. To handle this, we introduce a Shift-Aware Meta-PPO framework. The system tracks workload behavior using KL divergence to detect changes in distribution. When a shift is detected, the meta-buffers are cleared and exploration resumes, allowing the RL agent to adapt its policy to the new workload. Experiments show that this approach remains stable during workload changes and avoids the sharp performance drops that standard meta-learning methods suffer under out-of-distribution (OOD) workloads.
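
The detection step described above can be made concrete with a small sketch. The Python snippet below is a minimal illustration rather than the paper's implementation: it summarizes each workload window as a histogram (e.g., over request types), measures its KL divergence against a reference window, and flags a shift when a threshold is crossed. The class name, the threshold value of 0.5, and the histogram representation are all assumptions made for illustration.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-8):
    # KL(P || Q) between two discrete workload histograms.
    # eps smoothing avoids division by zero for empty bins.
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

class ShiftDetector:
    # Tracks a reference workload distribution and flags a shift when the
    # divergence of the most recent window exceeds the threshold. On a
    # detected shift, the caller would clear the meta-buffers and re-enable
    # exploration (e.g., raise the PPO entropy bonus), as the abstract
    # describes; those steps are outside this sketch.
    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.reference = None

    def update(self, recent_hist):
        if self.reference is None:
            self.reference = recent_hist  # first window becomes the reference
            return False
        shifted = kl_divergence(recent_hist, self.reference) > self.threshold
        if shifted:
            self.reference = recent_hist  # re-anchor on the new workload
        return shifted

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    detector = ShiftDetector(threshold=0.5)
    steady = rng.multinomial(1000, [0.7, 0.2, 0.1])  # baseline request mix
    spiked = rng.multinomial(1000, [0.2, 0.3, 0.5])  # traffic spike shifts the mix
    print(detector.update(steady))  # False: first window sets the reference
    print(detector.update(steady))  # False: near-zero divergence
    print(detector.update(spiked))  # True: large divergence flags a shift
```

In this sketch the detector re-anchors its reference on the shifted window, so scaling decisions are judged against the new workload rather than the stale one; the reset of meta-buffers and exploration is left to the surrounding training loop.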