Infrastructure and Device Management
3AI September 6, 2020
Leading infrastructure major
Problem Statement
- A leading infrastructure major was having trouble managing its IT operations. With multiple servers, services delayed by unplanned downtime, and failed transitions caused by delayed infrastructure support, it was losing out because of customer dissatisfaction
- Using multiple infrastructure log files and details from more than 500 different nodes, a pattern was established in the data to predict downtime and support root cause analysis (RCA) for unplanned failures
Analytics Led Approach
- The engagement called for a different approach:
- Multiple correlated data sources
- Consolidation and correlation were key
- Near-real-time data was a challenge
- False positives and true negatives were to be tracked
- Multiple models or scenarios, each countering different events, were developed to build a near-real-time trigger environment
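The tracking of false positives and true negatives mentioned above can be illustrated with a minimal sketch. It assumes each predicted downtime alert can later be paired with whether an outage actually occurred. The function names and the boolean encoding are hypothetical and are not taken from the engagement.

```python
from collections import Counter


def confusion_counts(predicted, actual):
    """Count TP, FP, TN, FN for paired alert/outcome flags.

    predicted[i] is True when a model raised a downtime alert (hypothetical
    encoding); actual[i] is True when an outage really happened.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    counts = Counter({"TP": 0, "FP": 0, "TN": 0, "FN": 0})
    for p, a in zip(predicted, actual):
        if p and a:
            counts["TP"] += 1      # alert raised, outage occurred
        elif p and not a:
            counts["FP"] += 1      # alert raised, no outage
        elif not p and a:
            counts["FN"] += 1      # no alert, outage occurred
        else:
            counts["TN"] += 1      # no alert, no outage
    return dict(counts)


def false_positive_rate(counts):
    """FP / (FP + TN); returns 0.0 when there were no quiet periods."""
    negatives = counts["FP"] + counts["TN"]
    return counts["FP"] / negatives if negatives else 0.0


if __name__ == "__main__":
    alerts = [True, True, False, False, True]
    outages = [True, False, False, True, False]
    counts = confusion_counts(alerts, outages)
    print(counts, false_positive_rate(counts))
```

Keeping all four counts per model or scenario makes it possible to compare how noisy each trigger is before it is allowed to send notifications.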
Critical Success Factors
- Use of multiple data sources: correlating server, node, and switch data to arrive at a collective understanding of the system
- Two levels of output: descriptive and predictive outputs and notifications
- Test on a limited set of systems, then scale up to the full environment in a phased approach
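As an illustration of the first success factor, the sketch below groups events from server, node, and switch logs into fixed time windows and flags windows in which several sources reported something. It is a simplified assumption about how such consolidation might look, not the method used in the engagement. The event tuple layout, the 60-second window, and all names are hypothetical.

```python
from collections import defaultdict


def consolidate(events, window_seconds=60):
    """Group (timestamp, source, message) events into fixed time windows.

    Returns {window_start: {source: [messages]}}. Timestamps are assumed
    to be seconds since some common reference point.
    """
    windows = defaultdict(lambda: defaultdict(list))
    for ts, source, message in events:
        bucket = int(ts // window_seconds) * window_seconds
        windows[bucket][source].append(message)
    return windows


def correlated_windows(events, window_seconds=60, min_sources=2):
    """Return start times of windows where at least min_sources
    distinct sources logged an event."""
    windows = consolidate(events, window_seconds)
    return sorted(
        start for start, by_source in windows.items()
        if len(by_source) >= min_sources
    )


if __name__ == "__main__":
    sample = [
        (5, "server", "cpu high"),
        (20, "switch", "port flap"),
        (70, "node", "disk warning"),
        (130, "server", "service restart"),
        (150, "node", "heartbeat missed"),
        (170, "switch", "link down"),
    ]
    print(correlated_windows(sample))
    print(correlated_windows(sample, min_sources=3))
```

Windows flagged this way could feed both the descriptive view (what happened together) and, as labeled history, the predictive models.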