Staff training is an essential part of every business. The better trained your staff are, the more motivated and efficient they are likely to be. For your training to work, it has to be a realistic simulation of the actual work your employees do.
However, based on the ICO’s findings this week, it seems Durham University went a step too far in delivering realistic training:
From the ICO news release for 1 March 2012:
Durham University breached the Data Protection Act after disclosing personal information in training materials published on its website, the Information Commissioner’s Office (ICO) said today.
The personal data was contained in screenshots used to demonstrate the use of particular University systems and included details such as names, addresses and dates of birth of up to 177 former students and staff. The information – which had not been anonymised – was made available on the University’s website in February 2011. The University discovered the error in July 2011 and removed the material before reporting the matter to the ICO.
Durham University is not alone here, and a similar problem is faced by every organisation that wants to build Service Continuity or system testing platforms.
If you are processing sensitive data, whether personal or commercially sensitive, you must ensure that every environment you use is appropriate for that data. Hackers don't care if you are on a development system, and identity thieves don't care if it is training material; all that matters is whether they can access the data.
Where live data isn’t required (test systems, training material, etc.), a much better solution is to sanitise the data (sometimes referred to as obfuscation), which involves replacing some or all of the data with meaningless information while maintaining its structure. A common example is replacing names with randomised strings of letters, and dates of birth with fictional data such as 31 February.
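As a rough illustration of the idea, the following Python sketch sanitises a single record. The field names ("name", "date_of_birth", "course") and the helper functions are hypothetical, chosen only for this example; a real system would need to cover every field that could identify someone.

```python
import random
import string


def random_letters(length, rng):
    """Return a capitalised string of random letters of the given length."""
    letters = (rng.choice(string.ascii_lowercase) for _ in range(length))
    return "".join(letters).capitalize()


def sanitise_record(record, rng=None):
    """Return a copy of the record with name and date of birth replaced.

    The field names used here are assumptions made for this sketch.
    """
    if rng is None:
        rng = random.Random()
    clean = dict(record)  # copy, so the live record is never modified
    # Keep the number of words in the name, but not the words or their lengths.
    word_count = len(record["name"].split())
    clean["name"] = " ".join(
        random_letters(rng.randint(4, 10), rng) for _ in range(word_count)
    )
    # Use a date that cannot exist, so it is obviously fictional.
    clean["date_of_birth"] = f"31 February {rng.randint(1900, 1999)}"
    return clean


live = {"name": "Jane Smith", "date_of_birth": "2 January 1980", "course": "History"}
print(sanitise_record(live))
```

The record keeps the same fields, so screenshots and test systems still look realistic, but the name and date of birth no longer come from a real person. Fields left untouched (here, "course") should be reviewed as well, since a combination of harmless-looking values can still identify someone.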
Whenever you generate sanitised data, care must be taken to ensure it really is fictional. If you are generating personal data, for example, then should your system produce data which unintentionally matches a living person, you must treat it as live personal data.
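One partial safeguard is to compare generated records against the live data you hold and discard any that coincide. The sketch below assumes records are dictionaries with hypothetical "name" and "date_of_birth" fields; it can only detect matches with people already in your own data, so it reduces the risk rather than removing it.

```python
def normalise(record, keys):
    """Build a comparison key that ignores case and surrounding whitespace."""
    return tuple(str(record[k]).strip().lower() for k in keys)


def find_collisions(generated, live, keys=("name", "date_of_birth")):
    """Return generated records that match a live record on the given keys."""
    live_keys = {normalise(r, keys) for r in live}
    return [r for r in generated if normalise(r, keys) in live_keys]


live = [{"name": "Jane Smith", "date_of_birth": "1980-01-02"}]
generated = [
    {"name": "jane smith ", "date_of_birth": "1980-01-02"},  # accidental match
    {"name": "Xqvl Zrrtha", "date_of_birth": "1975-06-30"},
]
collisions = find_collisions(generated, live)
print(len(collisions), "generated record(s) must be regenerated")
```

Any record returned by `find_collisions` should be regenerated before the data set is used.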
Alternatively, if you really must use live data in your training materials or development and testing environments, you must ensure that the correct security controls are in place to prevent its loss or its unauthorised disclosure or modification.
Don't repeat the mistakes of Durham University.