Abstract: This paper presents a comprehensive review of the literature on AI-driven proactive monitoring and self-healing mechanisms for banking IT infrastructure [1], [2]. The study systematically examines how AI-enabled models (including supervised and unsupervised machine learning, deep neural networks, and multi-agent systems) are employed to enhance system reliability, reduce operational downtime, and ensure continuous service availability within mission-critical financial environments [3]–[6]. It further analyzes the application of predictive analytics for forecasting infrastructure failures and resource demands, alongside automated remediation systems designed for rapid fault recovery [7], [8]. Together, these advances mark a shift from reactive IT support toward increasingly proactive and autonomous operational management [1], [2].

The review critically assesses emerging AIOps frameworks that seek to unify discrete functions, such as monitoring, predictive analytics, root-cause analysis, and automated remediation, into a cohesive, intelligent pipeline [2]. A key finding is the pressing need for these integrated systems to deliver real-time intelligence, adaptive operational thresholds, and, critically, explainable AI (XAI) capabilities [9], [10]. Transparency and auditability are paramount in the heavily regulated financial sector, where justifying automated decisions to auditors and regulators is non-negotiable [9], [11]. This underscores a significant gap between advanced algorithmic potential and the practical, compliant deployment of fully autonomous systems.

Ultimately, the synthesis of current research reveals that while substantial progress has been made in individual domains, such as anomaly detection or automated scripting, existing solutions often remain fragmented, lacking the scalability, real-time adaptability, and holistic integration required for enterprise-wide banking ecosystems [1], [2]. Therefore, this review identifies a clear and urgent research trajectory toward the development of trustworthy, scalable, and fully autonomous self-healing architectures [5]. Such next-generation systems are essential to address the escalating complexity, security threats, and availability demands inherent in modern digital banking infrastructure [6].
Publication Date: 2026-06-01