Data Protection – Don't forget to sanitise!

Staff training is an essential part of every business. The better trained your staff are, the more motivated and efficient they are likely to be. For your training to work, it has to be a realistic simulation of the actual work your employees do.

However, based on the ICO’s findings this week, it seems Durham University went a step too far in delivering realistic training:

From the ICO news release for 1 March 2012:

Durham University breached the Data Protection Act after disclosing personal information in training materials published on its website, the Information Commissioner’s Office (ICO) said today.

The personal data was contained in screenshots used to demonstrate the use of particular University systems and included details such as names, addresses and dates of birth of up to 177 former students and staff. The information – which had not been anonymised – was made available on the University’s website in February 2011. The University discovered the error in July 2011 and removed the material before reporting the matter to the ICO.

Durham University is not alone here, and a similar problem is faced by every organisation that wants to build Service Continuity or system testing platforms.

If you are processing sensitive data, whether personal or commercially sensitive, you must ensure that every environment you use is appropriate. Hackers don't care if you are on a development system; identity thieves don't care if it is training material. All that matters is whether they can access the data.

Where live data isn't required (test systems, training material, etc.), a much better solution is to sanitise the data (sometimes referred to as obfuscation): replace some or all of the data with meaningless information while maintaining its structure. A common example is replacing names with randomised strings of letters, and dates of birth with fictional values such as 31 February.
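As a rough illustration of the idea, here is a minimal Python sketch. It assumes each record is a dictionary with "name" and "date_of_birth" fields; those field names, the ISO-style date format and the function names are hypothetical choices for this example, not taken from any particular system. Names become random letters with the same word lengths, and every date of birth becomes the impossible 31 February of a random year.

```python
import random
import string


def random_letters(length, rng):
    """Return a capitalised string of random letters of the given length."""
    letters = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return letters.capitalize()


def sanitise_record(record, rng=None):
    """Return a sanitised copy of a record, leaving the original untouched.

    Assumes (hypothetically) that the record has a "name" field and a
    "date_of_birth" field stored as a YYYY-MM-DD string.
    """
    rng = rng or random.Random()
    clean = dict(record)

    # Replace each part of the name with random letters of the same length,
    # so screen layouts and field widths still look realistic.
    clean["name"] = " ".join(
        random_letters(len(part), rng) for part in record["name"].split()
    )

    # Replace the date of birth with a date that cannot exist:
    # 31 February of a randomly chosen year.
    clean["date_of_birth"] = f"{rng.randint(1900, 1999)}-02-31"

    return clean
```

Calling sanitise_record({"name": "Jane Smith", "date_of_birth": "1985-06-14"}) returns a record with a two-word nonsense name and a date ending in -02-31. Note that keeping word lengths is a trade-off: it preserves the look of the data but also leaks a little information, so a stricter scheme may be preferable for very sensitive data. Any other identifying fields (addresses, student numbers and so on) would need the same treatment.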

Whenever you generate sanitised data, care must be taken to ensure it really is fictional. If you are generating personal data, for example, and your system produces data that unintentionally matches a living person, you must treat it as live personal data.
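One partial safeguard is to compare generated records against the live records you already hold before releasing them. The sketch below assumes the same hypothetical "name" and "date_of_birth" fields and uses illustrative function names. It can only detect matches against data you hold; it cannot prove that no living person anywhere matches, so it supplements careful generation rather than replacing it.

```python
def normalise(record):
    """Reduce a record to a comparable key: lower-cased name with
    collapsed whitespace, plus the date of birth as stored."""
    name = " ".join(record["name"].lower().split())
    return (name, record["date_of_birth"])


def find_collisions(generated, live):
    """Return the generated records whose name and date of birth
    match any record in the live data set."""
    live_keys = {normalise(r) for r in live}
    return [r for r in generated if normalise(r) in live_keys]
```

Any record returned by find_collisions should be regenerated or handled as live personal data. Using impossible values, such as a 31 February date of birth, makes an exact match far less likely in the first place.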

Alternatively, if you really must use live data in your training materials or in development and testing environments, you must ensure that the correct security controls are in place to prevent its loss or its unauthorised disclosure or modification.

Don't repeat the mistakes of Durham University.

Taz Wake - Halkyn Security

Certified Information Systems Security Professional with over 19 years' experience providing in-depth security risk management advice to government and private sector organisations. Experienced in assessing risks and producing mitigation plans worldwide, in both peaceful areas and war zones. Additionally, direct experience carrying out investigations into security lapses, producing evidential-standard reports and conducting detailed interviews to ascertain the details of the incident. Has a detailed understanding of the Security Policy Framework (SPF) and JSP440, as well as in-depth expertise in producing cost-effective solutions in accordance with legislative and regulatory guidelines. Experienced in accrediting establishments and networks as well as project managing the development of secure, compliant, workable business processes.