Successfully undertaking a PhD in the 21st century requires researchers from all disciplines to have basic training in software development and in the management and stewardship of their data resources. However, for many scientists, the prospect of developing the skills needed to engage with increasingly diverse and complex e-infrastructure is daunting.
For this reason, NLeSC and its partners SURFsara, DTL and VU have set up a course for PhD students in the life sciences who are developing software or working with large data resources. It gives them the experience and knowledge they need to make effective use of data and to develop research software, and it will equip participants with the essential skills required to optimise their research now and in the coming years. Students completing this course will have the basic skills and knowledge required to manage large datasets, use databases, produce metadata and ensure the long-term stewardship of their data. The course will provide students with a first introduction to programming and will demonstrate that anyone can code.
Participants will also understand the basic requirements of sustainable software development, including the use of coding environments and the need for version control and testing. The course will teach key aspects of software carpentry, data carpentry and data stewardship.
Finally, the course will provide an introduction to e-infrastructure and show how this can be accessed and used to improve the speed and quality of research.
For the life sciences, the course will cover the following essentials:
- Dealing with Data: data provenance and persistent identifiers (PIDs), FAIR data stewardship, from Excel to databases, SQL and Python practicals
- Computation & Automation: introductions to the command line, programming (Python), GitHub, version control, unit testing, debugging and documentation
- e-Infrastructure: introduction and examples, moving from small local to large national compute and data storage services