A ransomware attack took down one of the world’s most trusted stores of information for months – what can we learn about avoiding the same mistakes?
When the British Library announced it had encountered “technical issues” then a “technology outage” in October 2023, it wasn’t immediately apparent what had happened. But in the months since, we’ve learned that the institution, which holds nearly 15 million physical books, as well as 170 million other documents and items, fell victim to a ransomware attack.
It took nearly three months for the library to start restoring access for researchers, albeit in a significantly different form than before. “There are many further steps ahead,” warned Sir Roly Keating, chief executive of the British Library, in a blog post. “The broader programme of full technical rebuild and recovery from the attack will take time, and we’re keen to listen to our users and the wider research community to ensure we get the priorities right in the months ahead.”
The library, which declined a request to participate in this story, has received support from the UK’s National Cyber Security Centre (NCSC) as it starts to rebuild access to its archive. Lessons will be learned, wrote Keating. But what are they? And how can the British Library’s issues help others design and maintain database projects?
“An opportunity to learn”
“There’s an opportunity to learn from the British Library’s experience,” says William Kilbride, executive director of the Digital Preservation Coalition (DPC), an industry body for those involved in preserving digital archives. It starts with thinking about how to build and structure any database in the first place. “When building a database, most often people think about efficiencies in structure, including things like normalization of tables and data, but not necessarily the continuity and resilience of the database itself,” says Sajeeb Lohani, global head of cybersecurity at Bugcrowd.
Paul Wheatley, head of research and practice at the DPC, says that engineers ought to look for “single points of failure” in their software design. Redundancy is often baked into the design of databases from first principles. “But there’ll be some level in the stack where you have a single point of failure, and you need to try and minimize and defend that as much as possible.” This could require either changes to the codebase or extra monitoring and defense resources directed at the weakness.
Lohani says it’s important to treat database security as a high priority, because databases hold an organization’s most treasured information. “This includes things like the ability to access the database from known hosts, automatic backup capabilities, and others,” he says.
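As a rough illustration of restricting access to known hosts, the sketch below checks a client address against an allowlist before a connection is accepted. The network ranges and the function name are hypothetical; in practice this rule would more likely live in a firewall, security group, or the database's own host-based access configuration.

```python
import ipaddress

# Hypothetical allowlist: the only networks permitted to reach the database.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.5.0/24"),      # assumed application subnet
    ipaddress.ip_network("192.168.10.7/32"),  # assumed single admin host
]


def is_known_host(client_ip: str) -> bool:
    """Return True only if client_ip parses and falls inside an allowed network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Anything that is not a valid IP address is rejected outright.
        return False
    return any(address in network for network in ALLOWED_NETWORKS)
```

Rejecting by default, and allowing only what is listed, means a forgotten entry causes an inconvenience rather than an exposure.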
Backup and backup (and plan for failure)
Any business-critical (or society-critical) database needs more than the automatic backups that come as standard. Lohani recommends supplementing them with daily snapshots from something like Amazon Web Services’ Relational Database Service (RDS), which should then be encrypted and stored in a secure location. This is offered for free on a low-usage basis, but becomes chargeable at scale.
Those snapshots could also be backed up for extra security – making it easier to quickly restore a database in the aftermath of a ransomware attack. “For engineers, using cloud resources can aid in a speedy recovery time, since these resources are already stored within the correct [secure] areas for faster recovery,” says Lohani.
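The daily-snapshot idea can be sketched locally with Python's built-in sqlite3 module standing in for a managed service such as RDS. This is an illustration under stated assumptions, not how RDS works internally: the function name, file layout, and checksum sidecar are all hypothetical, and encryption is left out because it would normally be handled by the storage layer or a key management service.

```python
import hashlib
import sqlite3
from datetime import date
from pathlib import Path


def take_snapshot(live_db: Path, snapshot_dir: Path, day: date) -> Path:
    """Copy the live database to a dated snapshot file and record its checksum."""
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    target = snapshot_dir / f"snapshot-{day.isoformat()}.db"

    source = sqlite3.connect(live_db)
    dest = sqlite3.connect(target)
    try:
        # Connection.backup makes a consistent copy even if the source is in use.
        source.backup(dest)
    finally:
        dest.close()
        source.close()

    # Store a SHA-256 digest next to the snapshot so later tampering
    # or corruption can be detected before a restore is attempted.
    digest = hashlib.sha256(target.read_bytes()).hexdigest()
    target.with_suffix(".sha256").write_text(digest)
    return target
```

Keeping the snapshots somewhere the application's own credentials cannot write to is what makes them useful after a ransomware attack.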
For any organization where data is critical, these are tactics that should form part of a robust disaster recovery plan. “Testing these recovery plans is also quite important to ensure that the correct steps and tools are used to perform the recovery,” says Lohani.
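One way to test a recovery plan is a scheduled restore drill: restore a snapshot to a scratch location and check that it is actually usable. The sketch below assumes a SQLite snapshot file and a hypothetical table of minimum expected row counts; a real drill would restore into the same kind of environment production runs in.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def restore_drill(snapshot: Path, expected_counts: dict[str, int]) -> list[str]:
    """Restore a snapshot to a scratch directory and return any problems found."""
    problems: list[str] = []
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / "restored.db"
        shutil.copy2(snapshot, restored)  # the "restore" step of the drill

        conn = sqlite3.connect(restored)
        try:
            status = conn.execute("PRAGMA integrity_check").fetchone()[0]
            if status != "ok":
                problems.append(f"integrity check failed: {status}")
            for table, minimum in expected_counts.items():
                # Table names come from our own drill config, not user input.
                count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
                if count < minimum:
                    problems.append(f"{table}: {count} rows, expected at least {minimum}")
        except sqlite3.Error as exc:
            # An unreadable file or missing table means the backup is not usable.
            problems.append(f"restore unusable: {exc}")
        finally:
            conn.close()
    return problems
```

An empty list means the drill passed; anything else is a finding to fix before a real incident forces the issue.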
Prevention is better than cure
Of course, preventing an attack is far better than having to recover from one. Engineers can start by wargaming how databases are accessed in the wild.
“To prevent the attack itself, defense in depth concepts need to be taken into account,” says Lohani. This defense strategy means building in multiple fail-safes, so that if one area of a system is breached, the remaining layers still protect it. “This includes areas such as identity and access management, application security, infrastructure security, and cloud security.”
The principle of least privilege, which restricts users’ access to only the data and resources they need to do their job, is important in this context, even if it can be difficult to undo access that has already been coded into a system. “To decrease the likelihood of such attacks, utilizing a database user in the general application context, which only has the ability to run select queries, is a very powerful tool,” Lohani says. “This is defending the resource in depth, which is essential in complex environments.”
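A select-only application connection can be illustrated with SQLite's read-only mode, used here as a stand-in for a server-side database user that has been granted nothing beyond SELECT. The helper name is hypothetical; on a server database the equivalent would be a dedicated role with only read permissions.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path: Path) -> sqlite3.Connection:
    """Open the database so this connection can read but never write."""
    # mode=ro is SQLite's URI flag for a read-only connection.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def try_write(conn: sqlite3.Connection, statement: str) -> bool:
    """Return True if the statement ran, False if the database refused it."""
    try:
        conn.execute(statement)
        return True
    except sqlite3.OperationalError:
        # Raised by SQLite when a write is attempted on a read-only connection.
        return False
```

With a connection like this, an attacker who compromises the application layer can still read data but cannot encrypt or delete it through that path.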
Careful consideration over knee-jerk reactions
However, when to drastically tamp down on user access to databases is up for debate. “When a data breach occurs, such as the one within the British Library, it can cause a knee-jerk reaction within organizations,” says Bart Koek, field chief technology officer for EMEA and Asia-Pacific and Japan at data security firm Immuta.
When an organization is in the public sector, and media scrutiny is high, the pressure to act quickly and drastically is even greater. “The most common knee-jerk reaction is to massively restrict the data that the organization shares externally and internally,” says Koek. But that shouldn’t always be the response. “This has huge hidden opportunity costs and debilitates decision making, hampering the benefits of building and maintaining large databases,” he says. In essence, an organization could stymie its own growth by being overly stringent on data access, harming the ability of those within the business to do their jobs well.
Instead, Koek believes that it’s important to carefully consider access long before, rather than after, a data breach or attack. “To mitigate data breaches, whether within the public sector or private sector, developers should place an emphasis on tighter access controls when building databases,” he says. “By employing fine-grained automated access controls with usage detection, businesses can still benefit from sharing data and getting the most out of larger databases, but with the confidence that access controls remain accurate, sensitive data is protected, and data is only used correctly.”
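To make “fine-grained automated access controls with usage detection” a little more concrete, here is a minimal sketch, not a description of any vendor's product. The roles, tables, columns, and daily limit are all hypothetical: each request is checked against a column-level policy, denials are recorded, and unusually heavy users can be listed for review.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical policy: role -> table -> columns that role may read.
POLICY = {
    "reading_room": {"catalogue": {"title", "shelfmark"}},
    "membership": {
        "catalogue": {"title", "shelfmark"},
        "readers": {"name", "email"},
    },
}


@dataclass
class AccessGate:
    daily_limit: int = 1000  # assumed threshold for "unusual" usage
    usage: Counter = field(default_factory=Counter)
    denied: list = field(default_factory=list)

    def check(self, user: str, role: str, table: str, columns: list[str]) -> bool:
        """Allow the request only if every column is permitted for the role."""
        allowed = POLICY.get(role, {}).get(table, set())
        extra = set(columns) - allowed
        if extra:
            # Record what was asked for so denials can be reviewed later.
            self.denied.append((user, table, tuple(sorted(extra))))
            return False
        self.usage[user] += 1
        return True

    def heavy_users(self) -> list[str]:
        """Users whose allowed requests exceed the daily limit."""
        return [user for user, count in self.usage.items() if count > self.daily_limit]
```

The point of the design is that access stays open where the policy permits it, while both refusals and unusual volumes leave a trail someone can act on.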
Given that the British Library’s mandate is to secure the world’s knowledge, that slower pace of bringing systems back online – which has been criticized by some – is also important, if you can stomach the negative press. “Disaster recovery for an organization doing digital preservation is different. If your access system goes down for a year, it’s not ideal and it doesn’t look great,” Wheatley says. However, “if you keep all the data you’re trying to preserve forever, that’s the critical thing.” Other, more commercial entities may not have this option. “They need to be up and running in the next day or the next week, otherwise, they go out of business,” he says.
As with many technical decisions, the approach here will depend on your business goals and priorities. Take the time and the effort to do it right – it could pay dividends when disaster strikes.
Pay the ransom?
There is one other option – a last resort that all authorities and experts recommend against, but which many organizations take anyway: pay the ransom the hackers demand. Paying up benefits the hackers to the detriment of victims: Sophos data shows the amount paid by victims of ransomware nearly doubled from 2022 to 2023, with an average payout of $1.5 million.
“It is just basic practice that you don’t pay money to criminal blackmailers,” the British Library chief executive Keating told the Financial Times. “It was important for us to articulate choices, to set a tone.”
There’s also no guarantee that paying up will result in your data being returned: ransomware gangs aren’t necessarily known for their trustworthiness. But it is a quicker solution, and one that many choose when the worst happens.