GRAD 521, Research Data Management Winter 2014 – Lecture 7 Amanda L. Whitmire, Asst. Professor

Presentation Transcript


  1. Data storage, backup & security GRAD 521, Research Data Management Winter 2014 – Lecture 7 Amanda L. Whitmire, Asst. Professor

  2. Follow-up from last class • What is a reasonable timeline for DCP?

  3. Overview for today • Why? • Where to store data: local drive | network drive | cloud (consider capacity & access by co-workers) • Data backup: disaster recovery (research continuity) • Data security: corruption or loss (hardware failure or data deletion); confidentiality (personal or intellectual property)

  4. Why data storage, backup & security are important “Your data are the life blood of your research. If you lose your data, recovery could be slow, costly or even worse… it could be impossible.”

  5. Most common loss scenario: drive failure

  6. This happens a lot: physical theft & unintentional damage. Cute, but not a valid security plan.

  7. Rare, unexpected events happen: University of Southampton, School of Electronics and Computer Science, Southampton, UK, 2005

  8. It CAN happen to you

  9. Real-world lesson: Audit your backups…

  10. Data storage options • Personal computers (PCs) & laptops • External storage devices • Networked drives • Cloud servers

  11. Storage: PC/laptop • Advantages: convenient • Disadvantages: drive failure is common; laptops are susceptible to theft & unintentional damage; not replicated • Bottom line: do NOT use to store master copies of data; not a long-term storage solution; back up important data & files regularly

  12. Storage: external storage devices • Advantages: convenient, cheap & portable • Disadvantages: longevity not guaranteed (e.g. Zip disks); errors writing to CD/DVD are common; easily damaged, misplaced or lost (= security risk); may not be big enough to hold all data, so multiple drives are needed • Bottom line: do NOT use to store master copies of data; not recommended for long-term storage

  13. Storage: networked drives • Advantages: data in a single place, backed up regularly; replicated storage is not vulnerable to loss due to hardware failure; secure storage minimizes risk of loss, theft, unauthorized use; available as needed (assuming the network is available) • Disadvantages: cost may be prohibitive; export control • Bottom line: highly recommended for master copies of data; recommended for long-term storage (~5 years)

  14. Storage: cloud storage • Advantages: data in a single place, backed up regularly; replicated storage is not vulnerable to loss due to hardware failure; secure storage minimizes risk of loss, theft, unauthorized use • Disadvantages: cost may be prohibitive; upload/download bottleneck & fees; longevity?; export control • Bottom line: possibly recommended for master copies of data; not recommended for in-process data or large files

  15. Storage: Google Drive for OSU • Advantages: all the same advantages as network & cloud storage; file sharing & collaboration w/variable access levels; unlimited storage (GD), 30 GB non-GD; automatic version control on GD • Disadvantages: 30 GB may not be enough; upload/download bottleneck • Bottom line: possibly recommended for master copies of data; possibly not recommended for in-process data or large files

  16. ? ? ? ?

  17. Data backup “Keeping backups is probably your most important data management task.” -Everyone

  18. Data backup Best Practice: 3 Copies of datasets

  19. Backups: full • Advantages: data can be easily & fully restored from a recent full backup • Disadvantages: time consuming; takes up the most storage • Bottom line: recommended for master copies of data; frequency depends on data size & mutability

  20. Backups: differential • Advantages: data can be easily & fully restored from a full backup + 1 differential backup • Disadvantages: size of each differential backup increases each time; backup window increases each time • Bottom line: frequency depends on data size & mutability

  21. Backups: incremental • Advantages: smallest file size between backups (full or incremental); shortest backup window • Disadvantages: when you need to restore data, the full backup + all incremental backups are required = more difficult restore scenario • Bottom line: frequency depends on data size & mutability
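
The difference between the three backup types comes down to which files get copied on each run. The sketch below is only an illustration, not a real backup tool: the `FileInfo` record, the `select_files` function, and the timestamps are all hypothetical, and it assumes that "changed" can be judged from a file's last-modified time alone.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str
    mtime: float  # last-modified time (arbitrary units for this example)


def select_files(files, kind, last_full, last_backup):
    """Return the names of files a backup of the given kind would copy.

    last_full   -- time of the most recent full backup
    last_backup -- time of the most recent backup of any kind
    """
    if kind == "full":
        # A full backup copies everything, changed or not.
        return [f.name for f in files]
    if kind == "differential":
        cutoff = last_full      # everything changed since the last FULL backup
    elif kind == "incremental":
        cutoff = last_backup    # only what changed since the last backup of ANY kind
    else:
        raise ValueError(f"unknown backup kind: {kind}")
    return [f.name for f in files if f.mtime > cutoff]


# Hypothetical timeline: full backup at t=20, incremental backup at t=30.
files = [
    FileInfo("raw.csv", 10),
    FileInfo("clean.csv", 25),
    FileInfo("notes.txt", 35),
]
full = select_files(files, "full", last_full=20, last_backup=30)
diff = select_files(files, "differential", last_full=20, last_backup=30)
incr = select_files(files, "incremental", last_full=20, last_backup=30)
print(full)
print(diff)
print(incr)
```

In this toy timeline the differential run re-copies `clean.csv` even though an earlier backup already captured it, which is why differential backups grow over time, while the incremental run copies only `notes.txt` and therefore depends on every earlier backup being present at restore time.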

  22. Backups: bottom line • Pick a strategy • Be consistent • Test your approach!
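
One possible way to test a backup is to compare checksums of the working files against their backed-up copies. The following is a minimal sketch under stated assumptions: the backup is a plain directory that mirrors the source folder, and the function names `sha256_of` and `audit_backup` are invented for this example. Compressed or proprietary backup formats would need their own verification tools.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_backup(source_dir: Path, backup_dir: Path) -> list:
    """Return relative paths that are missing from, or differ in, the backup."""
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        copy = backup_dir / rel
        # Flag the file if the copy is absent or its contents do not match.
        if not copy.is_file() or sha256_of(copy) != sha256_of(src):
            problems.append(rel.as_posix())
    return problems
```

Calling `audit_backup(Path("project_data"), Path("/mnt/backup/project_data"))` (both paths are placeholders) returns an empty list when every file has an identical copy. Note that this only checks the direction source to backup; it does not report files that exist only in the backup.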

  23. Data security “Data security is the means of ensuring that research data are kept safe from corruption and that access is suitably controlled.”

  24. Data security • It is important to consider the security of your data to prevent: • Accidental or malicious damage/modification to data • Theft of valuable data • Breach of confidentiality agreements and privacy laws • Premature release of data, which can void intellectual property claims • Release before data have been checked for accuracy and authenticity

  25. Data security • There are different levels of security to consider for your research data: • Access: This refers to the mechanisms for limiting the availability of your data • Systems: This covers protecting your hardware and software systems • Data Integrity: This refers to the mechanisms for ensuring that your data is not manipulated in an unauthorized way

  26. Data security: access • Limit the availability of your data: • ID/Password: Step 1, for everyone really • Role-based access: limited privileges/permissions to data depending on user • Wireless devices: lack anti-virus software and firewalls; vulnerable to data theft & theft of the device; use a PIN and limit storage of sensitive data on the device
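
Role-based access can be pictured as a lookup from a user's role to the set of actions that role may perform. The roles, actions, and the `is_allowed` helper below are hypothetical and chosen only for illustration; in practice these permissions are usually configured in the storage system itself (network drive, cloud service) rather than in your own code.

```python
# Hypothetical roles for a small research group and the actions each may take.
ROLE_PERMISSIONS = {
    "pi": {"read", "write", "delete"},
    "student": {"read", "write"},
    "collaborator": {"read"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the given role may perform the given action."""
    # Unknown roles get an empty permission set, so access is denied by default.
    return action in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("student", "write"))
print(is_allowed("collaborator", "delete"))
```

Denying unknown roles by default is a deliberate choice here: a typo in a role name then results in no access instead of accidental access.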

  27. Data security: systems • Protect your hardware & software systems: • Anti-virus software: required of all OSU computers • OS & media software: keep them up to date • Firewalls: block unwanted network traffic from reaching your computer or server (e.g. typical home router) • Intrusion detection software: detects & alerts, does not prevent • Physical access: locked office; password on wake; cable lock for laptops

  28. Data security: data integrity • Protect the integrity of your data @ file-level: • Encryption: the process of converting data into an unreadable code. You must have access to a password or a secret encryption key to be able to read an encrypted file. Check with OSU Data Security team for advice (no “one size fits all” solution). • Electronic signatures: meant to ensure the authenticity of the signer and by extension, the document; now carry legal significance • Watermarking: embeds a digital marker for authorship verification and can alert someone of alterations made to data files; most often w/images & media
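
As a small illustration of detecting alterations to a data file, the sketch below uses a keyed hash (HMAC-SHA256) from the Python standard library. This is a stand-in technique, not one of the methods named on the slide: it is neither encryption nor a legally recognized electronic signature, and the function names and example key are assumptions made for this example.

```python
import hashlib
import hmac


def tag_for(data: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag for the data using a secret key."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def is_unaltered(data: bytes, key: bytes, expected_tag: str) -> bool:
    """Check the data against a previously recorded tag."""
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(tag_for(data, key), expected_tag)


secret_key = b"example-key-keep-this-private"  # placeholder, not a real key
original = b"site,temp_c\nA,12.5\nB,13.1\n"
recorded_tag = tag_for(original, secret_key)

print(is_unaltered(original, secret_key, recorded_tag))
print(is_unaltered(original.replace(b"12.5", b"21.5"), secret_key, recorded_tag))
```

Because the tag depends on a secret key, someone who modifies the file cannot produce a matching tag without that key. The contents remain fully readable, so confidentiality still requires encryption.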

  29. ? ? ? ?

  30. Exercise Complete the ‘Data Storage, Backup & Security Checklist’
