Blog provided by our Trusted Partner PGI
Many of our first conversations with our clients involve our cyber security consultants aiming to simplify things a bit. Cyber security shouldn’t be seen as hugely complicated or difficult.
One way we try to clarify client thinking is to start from the idea that cyber security is predominantly about risk management. Pretty much all of the things we do in our various organisations that are recognised as good cyber security practice are, after all, aimed at minimising the risk that something bad will happen to our networks, our infrastructure or our data (and therefore our business operations).
Of course we also codify many of these good practices into international standards and create compliance regimes – but these are also aimed at establishing a common standard of ‘good’ that is used to minimise the risk of the ‘bad’. There’s also a research angle; the nature of the cyber security threat changes and evolves, delivering more ‘bad’ that we have to find the right response to. Again though, we do this to understand the risk and manage it appropriately.
To illustrate this for a wider audience, we wanted to write up an example. We were inspired by the growing prevalence of ransomware attacks and this very useful article by Brian Krebs. So, we’re going to look at where the risk management approach comes into cyber security using data back-ups as an example.
Where Ransomware meets Recovery: Some classic risk management terms
Ransomware is the bogeyman du jour of current cyber security, and no wonder. A successful attack can wreak havoc on an organisation’s profitability and even threaten its very viability. Here’s a risk. How do we manage it?
…It’s fine – we’ve got backups!
So, your organisation has backups. Great! This is a key measure to insure yourself against the worst outcome from a multitude of threats potentially stemming from:
However, merely having a backup is like having a fire extinguisher that may or may not be out of date. Have you tested that your backup will actually be useful if it is required?
You’ve tested it, great! You’ve established that the backup works. But now another question:
How old is your most recent backup?
Let’s imagine it’s a backup of a customer order database, taken every Friday night. Due to a ransomware attack, you lose the database on a Tuesday night. What happens to the data from Monday and Tuesday? Is that data lost? What happens to those orders? These questions relate to a risk management term known as the Recovery Point Objective (RPO), i.e., the maximum age the recovered data can have without causing intolerable disruption. As a quick example, if your RPO is 60 minutes, your data will need to be backed up at least every 60 minutes.
In this example, hopefully the RPO is longer than the time between the last backup and the moment the data was lost. Otherwise, although we have successfully restored the database of customer orders from the backup, the missing orders mean that the restored database is no longer useful.
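The RPO check described above can be sketched in a few lines of Python. This is only an illustration, not a prescribed method: the function names `data_loss_window` and `meets_rpo` and the specific dates are assumptions made up for the example, chosen to match the Friday-night backup and Tuesday-night loss in the scenario.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, incident: datetime) -> timedelta:
    """Return the span of data that would be missing after a restore."""
    if incident < last_backup:
        raise ValueError("incident cannot precede the last backup")
    return incident - last_backup


def meets_rpo(last_backup: datetime, incident: datetime, rpo: timedelta) -> bool:
    """True if the data lost since the last backup is within the RPO."""
    return data_loss_window(last_backup, incident) <= rpo


# Hypothetical timestamps: backup on a Friday night, loss the following Tuesday night.
last_backup = datetime(2021, 6, 4, 23, 0)  # Friday
incident = datetime(2021, 6, 8, 23, 0)     # Tuesday

window = data_loss_window(last_backup, incident)  # four days of orders at risk
within_one_hour_rpo = meets_rpo(last_backup, incident, timedelta(minutes=60))
within_one_week_rpo = meets_rpo(last_backup, incident, timedelta(days=7))
```

With a weekly backup, a 60-minute RPO is missed by a wide margin, while a one-week RPO would be met. The point of the sketch is that the backup schedule, not the backup's mere existence, determines whether the objective can be achieved.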
Do you know how long restoring your organisation from backup could take?
The other important risk-related measurement is the Recovery Time Objective (RTO). This measurement focuses on how long it takes to recover; in this instance, that’s how long it takes to restore the database from the backup. Suppose Friday’s backup worked but was written to Ye Ole Faithful system, which is extremely slow, so that it takes 48 hours to recover the data back into a useful state (i.e., on the systems where users make use of it, rather than on the backup system). The question then arises: is 48 hours too long? If so, while your backup works, it may take too long to restore to be useful.
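The same comparison can be sketched for the RTO. Again, this is a minimal illustration under stated assumptions: `meets_rto` and `rto_shortfall` are hypothetical helper names, and the 24-hour RTO is an invented figure; only the 48-hour restore time comes from the example above.

```python
from datetime import timedelta


def meets_rto(estimated_restore: timedelta, rto: timedelta) -> bool:
    """True if the estimated restore time is within the RTO."""
    return estimated_restore <= rto


def rto_shortfall(estimated_restore: timedelta, rto: timedelta) -> timedelta:
    """Return how far the restore overruns the RTO (zero if it does not)."""
    return max(estimated_restore - rto, timedelta(0))


# 48 hours is the restore time from the example; the 24-hour RTO is assumed.
restore_time = timedelta(hours=48)
rto = timedelta(hours=24)

restore_is_acceptable = meets_rto(restore_time, rto)
overrun = rto_shortfall(restore_time, rto)
```

In practice the restore time should come from a timed test restore onto the systems where users actually work, rather than from an estimate.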
So, about the risk management…
In both situations, an incorrectly specified RPO or RTO can leave you with a backup system that works and can be restored from, but that doesn’t give the business the benefit it expected.
This is where a risk management approach to your cyber security comes in.
The RTO is usually a business-driven factor, based on the consequences of being unable to serve customers and of breaching contracts. Questions you may ask when defining your RTO include:
What is the cost of the lost functionality on an hourly / daily basis?
How will we cope without the system?
Is there a legal requirement that applies to the system?
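The first of these questions lends itself to a rough calculation. The sketch below assumes the cost of lost functionality is roughly constant per hour, which is a simplification, and the figure of 1,000 per hour is invented purely for illustration; `downtime_cost` is a hypothetical helper name.

```python
def downtime_cost(hourly_cost: float, outage_hours: float) -> float:
    """Estimate the cost of an outage, assuming a constant hourly cost."""
    if hourly_cost < 0 or outage_hours < 0:
        raise ValueError("hourly_cost and outage_hours must be non-negative")
    return hourly_cost * outage_hours


# Invented figure: lost functionality costs 1,000 (in any currency) per hour.
HOURLY_COST = 1_000

slow_restore_cost = downtime_cost(HOURLY_COST, 48)  # the 48-hour restore
fast_restore_cost = downtime_cost(HOURLY_COST, 4)   # an assumed 4-hour restore
extra_cost_of_slow_restore = slow_restore_cost - fast_restore_cost
```

Comparing the two figures gives a first estimate of how much it is worth spending on a faster restore path, which in turn helps set a realistic RTO.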
On the other hand, RPO tends to be driven by the database solution that underpins transactions with customers. The process to work it out is based on the severity of system impacts and a systems dependency tree. Questions you may ask when defining your RPO include:
Can we re-enter the data from another source?
What is the cost of re-entering the data?
What downstream processes use this data?
This is all then properly validated (or not) by a series of exercises, beginning with documentation and tabletop scenarios, before real exercises and full-scale site/system/network outages. As you can imagine, the pandemic has moved this to the top of the list for many organisations, who have needed to move quickly to address changes to their working environments.