Making regular backups of data is probably the most important and, fortunately, one of the easiest tasks to manage.
Although most people are quite aware of the risk and cost of losing data through hard drive failure or accidental deletion, it is best to have a policy and schedule in place for maintaining data backups.
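A backup policy can be as simple as "copy the project folder on a schedule and keep the last few copies". The following is a minimal sketch of that idea in Python, not a recommended tool: the function name `make_backup`, the folder naming scheme, and the default of five retained copies are all assumptions made for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path


def make_backup(source_dir, backup_root, keep=5, now=None):
    """Copy source_dir into a timestamped folder and prune older copies.

    Hypothetical helper: `keep` is the number of most recent backups to
    retain, and `now` exists only so the timestamp can be fixed in tests.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    source = Path(source_dir)
    root = Path(backup_root)
    root.mkdir(parents=True, exist_ok=True)

    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    target = root / f"{source.name}-{stamp}"
    shutil.copytree(source, target)  # raises if the target already exists

    # Timestamped names sort chronologically, so the oldest come first.
    prefix = source.name + "-"
    backups = sorted(
        p for p in root.iterdir() if p.is_dir() and p.name.startswith(prefix)
    )
    for old in backups[:-keep]:
        shutil.rmtree(old)
    return target
```

Running such a function from the operating system's task scheduler gives the "schedule" part of the policy; the `keep` value is the retention part and should be chosen to suit the project.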
Backup security requires further mention. If the data is sensitive then it should not be stored on a computer that is connected to the internet, and preferably not connected to any network. If the data needs to be destroyed at the end of a project then consider what level of destruction is required: a hard drive will need to be fully overwritten, possibly several times, to ensure that no data can be recovered. Very high-level security institutions, such as Defence, require hard disks to be physically destroyed and optical discs to be shredded.
The lifetime of backups should also be considered. Burned optical discs have an average lifetime of two years, and five years if kept in a cool dark place.
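Because backup media degrade over time, it can help to record a checksum for every file when the backup is made and to re-check the copies periodically. The sketch below shows one way to do this with Python's standard `hashlib` module; the function names `file_sha256`, `build_manifest` and `find_changes` are hypothetical, and the choice of SHA-256 is an assumption rather than a requirement.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    folder = Path(folder)
    return {
        p.relative_to(folder).as_posix(): file_sha256(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def find_changes(folder, manifest):
    """List files from the manifest that are now missing or altered."""
    current = build_manifest(folder)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

A manifest built at backup time can be saved alongside the backup; an empty result from `find_changes` at a later date suggests the files can still be read back unchanged.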
If you are using a network drive then your data is probably already being backed up for you by IT staff. It is still a good idea to check with them to find out how often they back up, the maximum amount of data they can back up, and how long they keep old backups.
You may need to maintain your own backups if:
Your data will be used to obtain the results and conclusions of your research, so it is important to ensure its accuracy. Your data may also become an important dataset that is used by many others, so errors have the potential to hinder many research efforts.
It is therefore important to set up policies and practices to ensure the accuracy and authenticity of your data. This can include:
It is important to document the experimental or data gathering methods, because other researchers may question your results or want to repeat or extend your research. The sciences already have a culture of keeping good lab notes and the social sciences often record their survey methodology. This is often done in a notebook, but you should also consider recording this information digitally or converting it to digital form afterwards, for example by scanning. This is important as notebooks are easily lost or put into storage when an academic or postgraduate student leaves. This information is far more useful if it is archived with the data it refers to. Scanners are available in most ANU library buildings.
It is also valuable to document analytical methods. For example, if you write a script, macro or program to help analyze the dataset by producing graphs or statistics, document what it does and keep it with the data.
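A short, commented script can itself serve as a record of the analytical method. The following sketch is purely illustrative: the function name `summarise_column`, the column name `height_cm` in the usage line, and the decision to skip empty values are assumptions, not part of any particular project.

```python
import csv
import io
import statistics


def summarise_column(csv_text, column):
    """Return count, mean and sample standard deviation for one column.

    Method (documented here so others can repeat it):
    - rows with an empty or missing value in `column` are skipped;
    - the standard deviation is the sample (n - 1) form, and is
      reported as 0.0 when fewer than two values are present.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [
        float(row[column])
        for row in reader
        if (row[column] or "").strip()
    ]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


# Example use with a hypothetical column name:
# summary = summarise_column(open("survey.csv").read(), "height_cm")
```

The docstring states how missing values are treated and which form of standard deviation is used, which are exactly the details another researcher would need in order to reproduce the figures.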
Well-defined access controls help you comply with privacy and confidentiality policies and help maintain data authenticity by limiting who can modify data. The access controls may change throughout the life of the research project. Initially all data will usually be restricted to the research group; when the results are published, the data may then be made available to other researchers.
Access controls can be defined on a per-user or per-data basis. When the data is active and there are a small number of people using the data then you will probably use per-user access permissions:
As an example, the principal researcher would have Administrator permissions over all data and may be the only one with Read permission for the confidential survey data. Research collaborators would have no access to the confidential survey data, Read access to de-identified survey data, and Write access to data analyses and publications.
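The example above can be written down as a small permission table, which makes the policy easy to review and to check. This is only a sketch: the role and dataset names are hypothetical labels for the groups described above, and the ordering of levels (Administrator includes Write, which includes Read) is an assumption.

```python
# Assumed ordering: each level includes the ones below it.
LEVELS = {"none": 0, "read": 1, "write": 2, "admin": 3}

# Hypothetical labels for the roles and data described in the text.
PERMISSIONS = {
    "principal_researcher": {
        "confidential_survey": "admin",
        "deidentified_survey": "admin",
        "analyses_and_publications": "admin",
    },
    "collaborator": {
        "confidential_survey": "none",
        "deidentified_survey": "read",
        "analyses_and_publications": "write",
    },
}


def is_allowed(role, dataset, action):
    """Return True if `role` holds at least `action` level on `dataset`.

    Unknown roles or datasets are treated as having no access.
    """
    granted = PERMISSIONS.get(role, {}).get(dataset, "none")
    return LEVELS[granted] >= LEVELS[action]
```

For instance, `is_allowed("collaborator", "confidential_survey", "read")` evaluates to `False`, while the same call for `"deidentified_survey"` evaluates to `True`.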
Access permissions are usually set by right-clicking on a file or directory and editing the security properties.
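Permissions can also be set from a script, which is useful when many files need the same treatment. The sketch below uses Python's standard `os.chmod` to remove write permission from a file; the helper names `make_read_only` and `is_writable_by_anyone` are hypothetical. It assumes a Unix-like system: on Windows, `os.chmod` can only toggle the read-only flag, and finer-grained permissions must be set through the security properties dialog or other tools.

```python
import os
import stat

# All three write bits: owner, group and others.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(path):
    """Remove write permission for everyone, leaving other bits as-is."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~WRITE_BITS)


def is_writable_by_anyone(path):
    """Return True if any write bit is still set on the file."""
    return bool(os.stat(path).st_mode & WRITE_BITS)
```

Marking finalised data files read-only in this way guards against accidental modification, although it does not stop a user who has permission to change the file's mode back.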
It is important to consider the security of your own data to prevent:
Security of digital research data is part of the issue of Information Technology Security. The ANU has an extensive range of policies and information related to IT security.
The topic of IT Security is too large to cover here, but at the least you should install up-to-date antivirus software on your computer. ANU staff and students can install Sophos Anti-Virus on their office and home computers.
If you have sensitive data that is covered by privacy laws or confidentiality agreements, it is best to store it on a computer that is not connected to any network. If this is not possible then you can also consider encrypting your data. Encrypting data is a non-trivial exercise and there are currently no services at ANU to do this.
The final issue to consider is physical security. A computer that is not connected to a network is still vulnerable to someone removing the hard drive and installing it in their own computer, where they can bypass passwords and access restrictions. For highly sensitive data you can use an external hard drive and store it in a locked safe overnight.