Authors - Ritesh Kumar Verma, Preethiya T

Abstract - Contemporary customer support systems must process a massive number of user queries with low latency and high semantic relevance. Rule-based systems fail to capture context, while fully LLM-based systems are computationally expensive and suffer from high latency. This paper introduces an adaptive AI-assisted customer support automation system built on an optimized Retrieval-Augmented Generation (RAG) model. The proposed system combines Azure OpenAI embeddings, FAISS-based vector search, selective Cross-Encoder re-ranking, and a Learning-to-Rank (LambdaMART) model for adaptive score fusion. Unlike vanilla RAG models, the proposed system adaptively re-ranks only the top-k retrieved candidates, trading ranking precision off against latency. Experiments were carried out on a 130,000-sample e-commerce customer support dataset of query-response pairs annotated with intent labels. Compared to rule-based retrieval, embedding+FAISS, and vanilla RAG baselines, the proposed hybrid system improved top-1 retrieval precision while also reducing end-to-end latency from 0.414 s to 0.365 s (≈11.8% relative improvement). The LambdaMART model adaptively learned fusion weights over the FAISS and Cross-Encoder scores, improving ranking robustness and eliminating misranked top responses. The system was implemented on Azure Machine Learning with a cloud-scale pipeline and an interactive Streamlit web interface, demonstrating the cost-effective inference enabled by selective re-ranking.
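The retrieve-then-selectively-re-rank pipeline described above can be sketched in miniature. This is a minimal illustration, not the paper's implementation: exact cosine search stands in for FAISS, a toy token-overlap score stands in for the Cross-Encoder, and a fixed linear fusion stands in for the learned LambdaMART weights; all function names and the fusion weights are hypothetical.

```python
import numpy as np

def retrieve_top_k(query_vec, doc_vecs, k=5):
    # Stand-in for FAISS vector search: exact cosine similarity over all docs.
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9)
    top = np.argsort(-sims)[:k]
    return top, sims[top]

def cross_encoder_score(query, docs):
    # Placeholder for a real Cross-Encoder model; a toy token-overlap score.
    # Only the k retrieved candidates reach this (expensive) stage.
    q = set(query.lower().split())
    return np.array([len(q & set(d.lower().split())) / (len(q) or 1)
                     for d in docs])

def fuse_and_rerank(faiss_scores, ce_scores, w_faiss=0.4, w_ce=0.6):
    # Stand-in for LambdaMART fusion: fixed linear weights instead of
    # weights learned from labeled query-response pairs.
    fused = w_faiss * faiss_scores + w_ce * ce_scores
    return np.argsort(-fused)

# Toy demo: 3 documents with hand-made 2-d "embeddings" (stand-ins for
# Azure OpenAI embedding vectors).
docs = ["how do I request a refund",
        "track my shipping status",
        "refund policy details"]
doc_vecs = np.array([[1.0, 0.1], [0.0, 1.0], [0.9, 0.2]])
query, query_vec = "refund request", np.array([1.0, 0.0])

idx, faiss_scores = retrieve_top_k(query_vec, doc_vecs, k=3)
ce_scores = cross_encoder_score(query, [docs[i] for i in idx])
order = fuse_and_rerank(faiss_scores, ce_scores)
best = docs[idx[order[0]]]  # top-1 response after fused re-ranking
```

The latency saving comes from the second stage: the quadratic-cost Cross-Encoder scores only k candidates rather than the full corpus, so its cost is bounded regardless of corpus size.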