Data goodness
Mostly in black and white
By Dom
You must love your data!
• Lost data:
  • Current imaging data in BRIC cost ~£5.1M, just for scanning costs! (2011)
  • no research
  • no publications
  • no jobs
  • no PhDs!
• Sad Dom
• Look after your data!
  • It looks after you
• Happy Dom
Data Storage
• Home directories:
  • ISIS home, U Home
  • Not for large amounts of imaging data
• Projects directory
  • ISIS, V: Big stuff goes here
• If you require large amounts of space
  • E.g. > 50 GB
  • LET ME KNOW IN ADVANCE!
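One way to find out whether a dataset is over the 50 GB mark before asking for space is to total the file sizes under its folder. This is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes 1 GB = 1000³ bytes (the slide does not say which definition applies).

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def needs_advance_notice(size_bytes, threshold_gb=50):
    """True if the data is larger than the threshold (assumed decimal GB)."""
    return size_bytes > threshold_gb * 1000**3
```

Calling `needs_advance_notice(directory_size_bytes("my_study"))` on a hypothetical `my_study` folder gives a quick yes/no before the data is copied to the projects directory.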
Server goodness
• Why is the server a good place to store data?
  • Mirror and parity: if some errors occur, data can be easily recovered
  • BACKUPS:
    • Tape backups, daily, 1 month retention
    • if you have funding, processed data can be mirrored offsite
    • raw data is always mirrored offsite (ECDF) by default
• Desktop PCs
  • not reliable: no mirroring, no parity, so if some errors occur, data is lost (often all of it)
  • Network backups often fail
    • Machines turned off, network busy
  • moving to a new system when I get time!
Data love
• Curation: Do this as you work!
  • Plan your data use
  • Use meaningful folder names
  • Make 'README.txt' files with dates, names of students/employees involved, references to software, scripts and versions, purpose of experiment/processing.
  • Be tidy with your data - tidy up occasionally
    • Friday afternoon - quick tidy up
    • Big tidy up at end of experiment/project/phase/year
    • BE CAREFUL, don't rush
• Data, spreadsheets, databases
  • Anonymisation
  • *** Repatriation keys ***
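The 'README.txt' habit is easier to keep if the file is generated the same way every time. The sketch below is one possible layout, not a required one: the function name, the field labels and the example values are all hypothetical; only the list of things to record comes from the slide.

```python
from datetime import date
from pathlib import Path


def write_readme(folder, purpose, people, software, scripts=(), today=None):
    """Write README.txt into folder with the fields listed on the slide.

    software maps a program name to its version string.
    """
    today = today or date.today()
    lines = [
        f"Date: {today.isoformat()}",
        f"People: {', '.join(people)}",
        f"Purpose: {purpose}",
        "Software and versions:",
    ]
    lines += [f"  - {name} {version}" for name, version in software.items()]
    lines.append("Scripts:")
    lines += [f"  - {script}" for script in scripts]
    path = Path(folder) / "README.txt"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path
```

For example, `write_readme("pilot_scans", "Motion correction test", ["A. Student"], {"ExampleTool": "1.0"}, ["run_step1.py"])` would drop a dated README.txt into a (made-up) `pilot_scans` folder.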
Code and Scripts
• Coding:
  • Testing
    • Make sure that the software you are using does exactly what you think it does!
    • Check every step for every image!
  • Do not use hard-coded paths
  • Use versioning software (ECDF)
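A common way to avoid hard-coded paths is to pass the data location in from outside the script. The sketch below is only an illustration: the `--data-dir` option and the `PROJECT_DATA_DIR` environment variable are made-up names, not something already set up on the servers.

```python
import argparse
import os
from pathlib import Path


def resolve_data_dir(argv=None, env=None):
    """Return the data folder from --data-dir, else from PROJECT_DATA_DIR."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Process imaging data.")
    parser.add_argument(
        "--data-dir",
        default=env.get("PROJECT_DATA_DIR"),  # hypothetical variable name
        help="folder holding the input data",
    )
    args = parser.parse_args(argv)
    if args.data_dir is None:
        # Stop with a clear message instead of guessing a location.
        parser.error("give --data-dir or set PROJECT_DATA_DIR")
    return Path(args.data_dir)
```

The same script then runs unchanged on a desktop, on the server, or from someone else's account: only the argument or the environment variable changes.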