Working with large scientific data at the Australian Synchrotron and beyond.
The Australian Synchrotron generates terabytes of data daily from a range of scientific instruments. In the past, this crucial data often ended up on unlabelled DVDs in professors' offices and on hard drives just waiting to die. Sharing data usually required the help of Australia Post. How is it that cutting-edge scientific facilities so often have prehistoric data management practices?
To bring scientific data into the modern world, scientists and software engineers in Ashley Buckle's protein crystallography lab at Monash University teamed up to create a web application for storing, managing, annotating, and sharing large amounts of scientific data. MyTardis is a Python/Django application geared towards receiving automated 'feeds' of data direct from scientific instruments, allowing users to work with their data and share it 'in the cloud'. Data stored this way can be cited, and has been, in journals such as Science and Nature.
Since its creation and deployment at the Australian Synchrotron, MyTardis has been used to manage data from fields such as microscopy and microanalysis, particle physics, next-gen sequencing and medical imaging, in addition to data from all instruments at the Australian Synchrotron and the neutron facility ANSTO. More than 20 software developers from a wide range of institutions and publicly funded research projects have contributed to its codebase (github.com/mytardis/mytardis/). This presentation offers case studies in which MyTardis lends a helping hand, along with reflections on the experience of managing an open source codebase with many contributors from many origins, with many aims.
Steve Androulakis is a software engineer at the Monash e-Research Centre of Monash University. He and A/Prof Ashley Buckle invented MyTardis in 2007 and have seen its success grow across many scientific disciplines and uses.