Software and Data Carpentry

Software Carpentry (SWC) and Data Carpentry (DC) are international organisations that teach researchers computing skills. DTL/ELIXIR-NL, the Netherlands eScience Center, and SURFsara are actively aligning the Dutch SWC/DC initiatives.

 

The Software Carpentry Foundation and its sister organisation Data Carpentry are world-renowned for their efforts. Software Carpentry focuses on helping researchers develop foundational computational skills; Data Carpentry focuses on teaching researchers to work effectively with their data through the research data lifecycle. They have built up a curriculum that people worldwide contribute to and help improve. This curriculum is openly accessible on GitHub. Last but not least, they are building capacity worldwide: they train people to become certified instructors who can teach SWC/DC workshops. You can read more background about SWC/DC in the sections at the bottom of this page.

Source: http://carpentries.github.io/instructor-training/20-carpentries/

Certified trainers
Several DTL partners are very active in SWC/DC and are home to certified trainers (e.g., NLeSC, SURFsara, Erasmus MC, and UMC Utrecht). NLeSC, the DTL Learning programme, and ELIXIR-NL Training have joined forces to align the Dutch SWC/DC initiatives with each other and with the European SWC/DC activities. The sustainability of SWC/DC training efforts depends on the availability of active local instructors. At the moment, there are nine certified instructors in the Netherlands. At the recent instructor training event in Utrecht, 17 new instructors were trained and are now undergoing the certification process. At the beginning of 2018, up to 33 more instructors will be trained.

Examples of activities
The list below contains some examples of past DTL/ELIXIR-NL activities related to SWC/DC. Please consult the section at the bottom of this page for a more complete overview.

  • ELIXIR-NL was involved in the Pilot action, which aimed to train researchers in IT skills and introduce SWC/DC workshops across the ELIXIR nodes. Read more.
  • In 2017, DTL/ELIXIR-NL, the Netherlands eScience Center, University Medical Center Utrecht, and SURFsara organised one of the first Data Carpentry Genomics workshops in Europe. Read more.
  • In 2016, the first edition of the SURFacademy workshop ‘Essential skills in data-intensive research’ took place. It was a collaborative effort of the Netherlands eScience Center, SURFsara, DTL, VU University Amsterdam, and the Software and Data Carpentry foundations. Read more.
  • In 2015, UMC Utrecht (Philip Lijnzaad) hosted two SWC/DC events (a hackathon and a workshop) in the context of an ELIXIR Pilot. You can find a report about these events here.

The content of this page is based in part on the Software Carpentry, Data Carpentry, and Training pages.

Life scientists increasingly need computational skills to work efficiently with their large, complex data sets. There is a need for specialised training that covers topics such as shell scripting, basic programming skills in R or Python, data handling, reproducible computational research, and open science. Unfortunately, formal curricula usually lack such training. To fill this gap, the Software Carpentry (SWC) organisation was founded in 1998 and Data Carpentry (DC) was founded in 2013.

SWC and DC are communities of volunteer researchers, educators, and others who develop lessons and teach two-day workshops on basic computing and data skills for researchers. Software Carpentry focuses on helping researchers develop foundational computational skills; Data Carpentry focuses on teaching researchers to work effectively with their data through the research data lifecycle. The main goal of these organisations is not to teach specific skills per se (although those are covered), but rather to convey best practices that will enable researchers to be more productive and do better research.

SWC and DC both focus on computational skills and run two-day workshops taught by volunteer instructors, striving to fill gaps in current training for researchers.

Data Carpentry workshops

  • focus on best practices concerning data,
  • are aimed at pure novices (i.e., not people who want to learn about coding, but rather those who have a lot of data and don’t know what to do with it),
  • are domain-specific, and
  • present a full curriculum centered around a single data set.

Software Carpentry workshops are

  • intended for people who need to program more effectively to solve their computational challenges,
  • not domain-specific, and
  • modular—each Software Carpentry lesson is standalone.

Further reading

  1. Software Carpentry: Lessons Learned
  2. Data Carpentry: Workshops to Increase Data Literacy for Researchers.
  3. M. Kuzak et al. Towards sustainable training in essential computing research skills for Dutch life scientists
  4. Pawlik et al.
  5. C. van Gelder Shaping SWC/DC based training in ELIXIR: European and node perspective
  6. https://software-carpentry.org
  7. http://www.datacarpentry.org
  8. http://carpentries.github.io/instructor-training/

In parallel to SWC and DC, new community-driven projects are emerging. One that has recently grown very fast is Library Carpentry (LC). The project was started by James Baker while he was working at the British Library. LC grew thanks to the great community-building effort by Belinda Weaver. The focus of LC is to build training materials and teach research librarians and humanities scholars basic but necessary digital skills. The majority of the new LC materials were developed during two Mozilla Science Lab Global Sprints, one in 2016 and one in 2017. The Netherlands joined both sprints. The sprint in 2016 was hosted at the Netherlands eScience Center and the one in 2017 at the Koninklijke Bibliotheek, with participants from VU University Amsterdam and Utrecht University Libraries, the Rijksmuseum archive, the Koninklijke Bibliotheek digital archive, and the Netherlands eScience Center.

There is an increasingly high demand for SWC/DC workshops in the Netherlands and in other European countries. Unfortunately, most instructors are located in North America and the United Kingdom, and the number of instructors available in the Netherlands and neighbouring countries is low. All SWC/DC instructors are volunteers and workshop organisers need to cover instructors’ travel costs, which may consume a substantial part of the workshop budget. The key to making SWC/DC workshop organisation and execution sustainable is establishing a local community of active instructors.

DTL aims to establish a comprehensive, (inter)nationally acclaimed, and sustainable training and education course portfolio for life science research and technology, open to academic and industrial researchers. This is why DTL has a leading role in coordinating and supporting SWC/DC community initiatives in the Netherlands.

The involvement of DTL and ELIXIR-NL in SWC is part of the larger European effort led by ELIXIR. ELIXIR is also working with the SWC and DC foundations on a collaboration agreement, which will allow multiple workshops and instructor training events to be organised in ELIXIR nodes.

The initiative started in 2015 with the ELIXIR Software and Data Carpentry (SWC/DC) pilot project, which is described in detail in an F1000 publication (Pawlik et al. 2017). Shortly after that, the Netherlands eScience Center became an SWC/DC affiliate organisation, and since then, the first SWC/DC instructors have been certified. In January 2016, the first workshop, ‘Essential Skills in Data-Intensive Research’, took place in Utrecht.

During 2016, the initiative actively delivered training, built the instructor community, and strengthened the collaboration with the SWC/DC foundations.

Since the beginning of 2016, four SWC/DC workshops (January 2016, April 2016, October 2016, and March 2017), four tutorials during SURF Research Boot Camp events, and one tutorial during the National PhD day have been taught. These events not only helped to train numerous young researchers but also increased awareness of the SWC/DC training activities in the Netherlands.

In March 2017, the first Genomics-focused DC workshop was organised (blog post). This was the result of a collaboration between UMC Utrecht and the Netherlands eScience Center. A second Genomics workshop, a collaboration between DTL, UMC Utrecht, RIVM, and the University of Amsterdam, is planned for December 2017.

 

The Netherlands eScience Center continues to organise workshops based on the SWC/DC model under the eScience Academy Essential Skills umbrella.

Celia van Gelder

Mateusz Kuzak

 

Philip Lijnzaad

Anita Schürch

Jan Bot

Arthur Newton

Christine Steiger

Carlos Martinez Ortiz

Sven Warris

Steven Claeyssens

Martijn Kleppe

Lotte Wilms

Ben Companjen

Michiel Cock

Ruud Steltenpool