The House of Commons Treasury Committee published its report, ‘IT failures in the Financial Services Sector’, in October 2019. The report criticises both the banks, for an unacceptable level of service outages resulting from IT failures, and the regulators (BoE, FCA, PRA), for not properly enforcing accountability for these failures.

The criticism is understandable, given the risks that banking service outages pose to consumers and to the financial system overall. The banks recognise the need to make their IT systems as robust as possible, but many face an uphill task as they seek to build new digital services on top of ageing legacy IT systems. The resulting complexity of banks’ IT estates makes it difficult to anticipate service outages and to identify the specific IT components that cause them.

For this reason the PRA, in its discussion paper ‘Building the UK financial sector’s operational resilience’, sets out guidelines that banks should follow to ensure that their services can withstand unpredictable IT failures. A key element of the PRA’s advice is that banks should focus their attention and investment on ensuring that the performance of their critical business services remains within agreed impact tolerances, whatever happens to the underlying IT systems.

The first step in achieving this goal is to understand the precise nature of the connection between the quality of a business service – for example, a credit card payment – and the performance of the underlying IT systems and processes that execute it. The PRA recommends that banks create detailed maps to expose how the myriad technology components within their IT estate combine to support each ‘important’ service they deliver to customers. (It seems at this stage that the PRA is allowing banks to decide for themselves which of their services are important enough to warrant this attention. That may change in 2020 if, as expected, legislation is put forward to mandate compliance with these regulatory requirements.)

These service maps are necessarily complex, because the way in which different technology tiers interact to deliver a single customer service is itself complex. However, they are essential to understanding the relationship between a financial service and the technology components that deliver it. This diagram shows how a set of technology components may combine in different ways to support several services.

The diagram above illustrates how the technology components that support, in this case, a card payment, interact with other systems supporting other services. Detailed versions of this type of map serve to identify precisely how these interactions combine to influence the quality of each business service.
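To make the idea of a service map concrete, here is a minimal Python sketch. It treats a map as a simple lookup from each service to the components it depends on, then inverts it to show which services a failing component would affect. The service names, component names and function names are all hypothetical, chosen for illustration only. A real map would be far more detailed and would normally be produced by automated discovery tools rather than written by hand.

```python
from collections import defaultdict

# Hypothetical service map: each service lists the components it depends on.
# None of these names come from a real bank's estate.
SERVICE_MAP = {
    "card_payment": ["mobile_app", "api_gateway", "payments_engine", "core_ledger"],
    "balance_enquiry": ["mobile_app", "api_gateway", "core_ledger"],
    "loan_application": ["web_portal", "api_gateway", "credit_scoring"],
}


def services_by_component(service_map):
    """Invert the map: for each component, collect the services that rely on it."""
    inverted = defaultdict(set)
    for service, components in service_map.items():
        for component in components:
            inverted[component].add(service)
    return dict(inverted)


def impacted_services(service_map, failed_component):
    """Return the services (sorted by name) that depend on the failed component."""
    return sorted(services_by_component(service_map).get(failed_component, set()))


def shared_components(service_map):
    """Return components used by more than one service, with those services."""
    inverted = services_by_component(service_map)
    return {c: sorted(s) for c, s in inverted.items() if len(s) > 1}


if __name__ == "__main__":
    print(impacted_services(SERVICE_MAP, "core_ledger"))
    print(shared_components(SERVICE_MAP))
```

In this toy example a failure of the hypothetical `core_ledger` component would affect both the card payment and the balance enquiry services. This is the kind of cross-service interaction that a detailed map is meant to expose.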

Once detailed service maps have been created, generally using automated tools, specialist performance architects can apply the concept of ‘performance budgeting’ to assess which IT elements have the greatest influence on the performance of each end-to-end customer service (e.g. a card payment transaction).  Armed with this analysis, it is possible to see which technology components are most likely to be responsible for a service outage, and to engineer them to improve service performance and resilience.  When a service outage does occur, the service map makes it easier to pinpoint the IT systems responsible and to effect a speedy recovery – keeping the end-to-end service itself within the agreed impact tolerance.
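The article does not define ‘performance budgeting’ in detail, so the following Python sketch shows just one plausible reading. Each component on a service path is allocated a share of an end-to-end response-time target, and measured timings are compared against those allocations to show where the overrun lies. The component names, the figures and the `tolerance_ms` parameter are assumptions made for illustration. In particular, `tolerance_ms` is a simple latency target and is not meant to represent the regulators’ definition of an impact tolerance.

```python
from dataclasses import dataclass


@dataclass
class ComponentTiming:
    name: str
    budget_ms: float    # share of the end-to-end target allocated to this component
    observed_ms: float  # measured time spent in this component


def budget_report(timings, tolerance_ms):
    """Compare observed timings with budgets; worst overrun is listed first."""
    total_observed = sum(t.observed_ms for t in timings)
    rows = []
    for t in timings:
        rows.append({
            "component": t.name,
            # Fraction of the end-to-end time spent in this component.
            "share": t.observed_ms / total_observed if total_observed else 0.0,
            # Positive means the component exceeded its budget.
            "overrun_ms": t.observed_ms - t.budget_ms,
        })
    rows.sort(key=lambda r: r["overrun_ms"], reverse=True)
    return {
        "total_observed_ms": total_observed,
        "within_tolerance": total_observed <= tolerance_ms,
        "components": rows,
    }


# Hypothetical figures for a card payment path (illustrative only).
CARD_PAYMENT = [
    ComponentTiming("api_gateway", budget_ms=50, observed_ms=40),
    ComponentTiming("payments_engine", budget_ms=200, observed_ms=350),
    ComponentTiming("core_ledger", budget_ms=250, observed_ms=260),
]

report = budget_report(CARD_PAYMENT, tolerance_ms=500)

if __name__ == "__main__":
    print(report["within_tolerance"])
    for row in report["components"]:
        print(row["component"], row["overrun_ms"], round(row["share"], 2))
```

With these made-up numbers the end-to-end time exceeds the target, and the hypothetical `payments_engine` accounts for most of the overrun, so it would be the first candidate for engineering attention.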

It is clear that the financial regulators are under increasing pressure from government to adopt an uncompromising approach to UK financial services businesses. As mentioned above, legislation is likely to come before Parliament in 2020 that mandates specific measures to ensure the operational resilience of the UK financial services sector. Moreover, the regulators are being encouraged to hold individual board directors personally responsible for service outages, in an effort to focus the attention of senior executives on the need to invest in service resilience.

Detailed service maps are a major step towards decoupling the quality and continuity of financial services from the poor performance or failure of underlying IT systems. Creating the maps, and using expert performance architects to interpret them, looks like a sound move.