Navigating IT Outages: Insights from Ketan Patel, Group Chief Information Officer at OCS

Ketan Patel

02 Sep, 2024

In today’s 24/7 digital landscape, managing IT updates and outages is more critical than ever. Ketan Patel, Group Chief Information Officer at OCS, shares his expert insights on the best practices for handling auto updates, the importance of thorough testing, and the systems and processes that can build resilience within organisations. Patel emphasises the need for strategic timing, robust testing, and the implementation of digital twins to ensure minimal disruption and maximum efficiency. Read on to discover how your organisation can navigate the complexities of IT updates and outages with confidence.

When is the best time to update?

Auto updates are a feature of most modern cloud platforms, and there are a number of golden rules that are easy to forget now that everything runs 24/7. The traditional one is that you should never do an update on a Friday going into the weekend. This goes back to the old tech principle of scheduling updates for the middle of the week, as that is when the impact will be smallest if things go wrong.
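The mid-week rule can be turned into a simple guard in a deployment script. The following is a minimal sketch, not a recommendation from the interview: the function name `is_update_day` and the choice of Tuesday to Thursday as the window are assumptions made purely for illustration.

```python
from datetime import date

# Assumed policy: Tuesday to Thursday count as "middle of the week".
# The article does not name specific days; this window is illustrative.
MIDWEEK = {1, 2, 3}  # date.weekday() numbers Monday as 0


def is_update_day(day: date) -> bool:
    """Return True if `day` falls inside the assumed mid-week window."""
    return day.weekday() in MIDWEEK
```

A scheduler could call `is_update_day(date.today())` before starting a rollout and defer the update if it returns `False`.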

How should you manage auto updates?

Auto updates can be tested in very sophisticated ways before they are implemented. Any update should be tested to the right level, and once that has been done there should be a simple rollback path so the changes can be reverted to an older version.

It is easier to get people back to an older version of the software than to try to fix what you’ve got, so make sure there’s a path backwards.
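One way to picture a "path backwards" is a deployment record that remembers previous versions and reverts automatically when a check fails. This is a hedged sketch under stated assumptions: the `Deployment` class and the `health_check` callback are hypothetical names, and a real platform would restore actual software rather than pop a string from a list.

```python
class Deployment:
    """Tracks deployed versions so the previous one can be restored."""

    def __init__(self, initial_version: str) -> None:
        self.history = [initial_version]

    @property
    def current(self) -> str:
        # The most recently recorded version is the live one.
        return self.history[-1]

    def deploy(self, version: str, health_check) -> str:
        """Record `version` as live; revert if `health_check` rejects it."""
        self.history.append(version)
        if not health_check(version):
            self.rollback()
        return self.current

    def rollback(self) -> str:
        """Return to the previous version, if there is one."""
        if len(self.history) > 1:
            self.history.pop()
        return self.current
```

The design choice here is that reverting is a single, cheap step that is decided before the update goes out, rather than something improvised during an outage.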

What are the systems and processes you should have in place?

When you release new software, it should be tested thoroughly with all the connected systems. Now that’s easier said than done. The release process needs to be robustly managed to ensure that what you’ve tested is exactly what goes live.
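A common way to check that the build going live is the same one that was tested is to compare cryptographic fingerprints. The sketch below uses SHA-256 from Python's standard `hashlib` module; the function names `fingerprint` and `is_tested_build` are illustrative assumptions, not part of any process described in the interview.

```python
import hashlib


def fingerprint(artifact: bytes) -> str:
    """Return the SHA-256 hex digest of a build artifact."""
    return hashlib.sha256(artifact).hexdigest()


def is_tested_build(candidate: bytes, tested_fingerprint: str) -> bool:
    """True only if `candidate` is byte-for-byte the build that was tested."""
    return fingerprint(candidate) == tested_fingerprint
```

In practice the fingerprint would be recorded when testing completes, and the release step would refuse to proceed if the candidate's fingerprint does not match it.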

What sort of tooling, hardware and software can build resilience into organisations should this sort of thing happen?

Companies should have their own rollback plan, and that comes back to the point of version control. When it comes to any major update, companies should think about implementing it in selected parts of the estate first, rather than across the whole business, to limit the impact if things go wrong.
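Updating selected parts of the estate first is often described as a staged or ring-based rollout. The following is a minimal sketch of that idea, assuming the estate has already been grouped into ordered rings; `staged_rollout`, `apply_update` and `is_healthy` are hypothetical names introduced for illustration.

```python
def staged_rollout(rings, apply_update, is_healthy):
    """Update one ring at a time and stop at the first unhealthy ring.

    `rings` is an ordered list of (name, hosts) pairs, smallest first.
    Returns the names of the rings that were updated and stayed healthy.
    """
    completed = []
    for name, hosts in rings:
        for host in hosts:
            apply_update(host)
        # Halt before touching later rings if any host in this ring fails.
        if not all(is_healthy(host) for host in hosts):
            break
        completed.append(name)
    return completed
```

For example, with rings such as a small pilot group, then one region, then everything else, a fault that shows up in the second ring would leave the rest of the business untouched.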

Additionally, we use digital twins, for example in facilities management. A digital twin is a virtual representation designed to accurately reflect a physical object or system. In a world where interdependent real-time systems are critical, we need to pay more attention to how we create replica technology and testing stacks, where tests can run under real-life conditions in a far more controlled manner, so you’re not impacting everyone at the same time. There are lessons here for technology teams running web servers and cloud-based businesses, so that updates can be released in a more controlled fashion.
