![[ Sponsor logos: Förderverein CampusSource e.V., Helmholtz Open Science Office, and de-RSE e.V. ]](/publikationen/csa2022/bilder/sponsoren.png)
campusSOURCE
presents in 2022, in cooperation with
the
Helmholtz Open Science Office
and de-RSE e.V.,
the
campusSOURCE Award 2022
Special Prize:
The CodeClub at the Max Planck Institute of Psychiatry
Jonas Hagenberg
jonas_hagenberg@psych.mpg.de
Linda Dieckmann
linda_dieckmann@psych.mpg.de
Dept. Translational Research in Psychiatry
MPI of Psychiatry
Kraepelinstr. 2-10
80804 Munich
Germany
Introduction
At the Max Planck Institute of Psychiatry (MPIP), people with different programming skills, from wet-lab scientists to bioinformaticians, work together. In addition, many PhD students who have not received formal training in computer science perform primarily computational work. We therefore revived and reorganized the CodeClub at the MPIP with monthly meetings, in which we give presentations on programming topics, discuss problems encountered during the past month, and find partners for code review.
Goals
Our main goal is to increase the quality of research software and scripts developed and used for our research. By increasing code quality, we aim to make our research more reproducible and reduce the number of errors.
Especially in light of the last two years, during which most people have been working from home, we want to foster collaboration between computational researchers at our institute and connect people across different groups and backgrounds.
To have an impact beyond the MPIP, we encourage Open Science practices so that our software and scripts can be used by a wider scientific community.
Lastly, we hope that a central organization for programming-related topics will help the importance of research software engineering to be better represented and recognized within our institute.
Organizational details
The CodeClub is organized by the two of us, both PhD students. The project has the support of our department head, which allowed us to fit the CodeClub into an existing schedule of rotating meetings in place of another meeting. We are also supported by a staff scientist, who will ensure a smooth transition once we leave the institute.
Our approach
We hold regular meetings with scheduled short presentations on programming topics, either specific to one programming language or language-agnostic, such as dependency management or how to obtain a DOI when publishing code. We use a peer-to-peer approach in which each researcher presents a topic they are experienced in, encouraging exchange and knowledge sharing between researchers. Often, one researcher knows a library or approach that solves a problem another researcher was not aware of.
One important industry standard for increasing code quality is code review. As it has not been widely adopted at the MPIP, we want to provide a low-threshold entry point. To this end, we prepared checklists for both code reviewers and code authors on what to look for when performing a code review, as well as an example GitHub repository that demonstrates the review functionality. Additionally, we created a central document for finding code reviewers and documenting past code reviews; in this way, researchers who participate in code review receive recognition.
Project status
The CodeClub was restarted in October 2021, and we have held monthly meetings since then. Our approach of having CodeClub members propose topics they are knowledgeable in works well, and the program for the coming months is already fixed. Researchers at the MPIP have also used the code review framework to perform code reviews.
Learnings
- For the presentations, it is important to include small tutorials. This makes the presented concepts easier to learn and increases engagement. It is especially important for teaching how to do code reviews with GitHub: not everyone at the MPIP has extensive experience working with GitHub, so a hands-on tutorial brings everyone onto the same page and increases confidence in working with code review tools.
- When preparing the exercises, keep them short. They usually need more time to complete than anticipated.
- The monthly schedule works well as it is integrated into the schedule of rotating meetings in our department and does not burden the researchers too much with another meeting.
- Opening the CodeClub to researchers from the whole institute allows people from different groups to get to know each other and leads to more interaction, even when most people work from home.
- Researchers can identify common problems and can figure them out together.
- Even with the tools and support provided, it is still difficult to get researchers to perform code review.
- Code review in research is quite different from code review in industry (more standalone scripts than classical software development in Git branches) and has its own challenges.
Outlook
We are preparing a central sharing point where the tutorial presentations are stored together with the materials of other presentations. This will give us a central repository where (new) lab members can look up the collected knowledge.
In the future, we also plan to invite external speakers to introduce topics that are not yet represented at our institute.
Guidelines for code review
For reviewers
General
- be nice!
- communicate which ideas you feel strongly about and which you don’t
- ask open-ended questions and ask for clarification
- offer and explain alternatives and workarounds
- stay empathetic and positive
- accept that many programming decisions are opinions
- ask questions, don’t make demands
- talk in person if there are many things to clarify
- don’t give strong, opinionated statements
- don’t criticize the author but the code
Areas to check
Code style / understandability
- Does the code follow the style guide?
- Is the overall readability ok?
- If multiple scripts are given, is their naming meaningful and indicative of their order?
- Are unnecessary duplications of code snippets avoided?
- Is the script organized so that it is easy to follow what the code does?
- Is the code split into small functions rather than one big block?
- Do the comments explain why something is done, not only what?
- Is it clear when and by whom the code was written?
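As a hypothetical illustration of several of these checklist items (meaningful names, avoiding duplicated snippets, and comments that explain why rather than what), consider the following Python sketch; the function and data are invented for this example:

```python
def counts_per_million(counts, library_sizes):
    """Scale raw counts to counts per million (CPM) per sample."""
    # Divide by the library size first: samples are sequenced to
    # different depths, so raw counts are not comparable across samples.
    return [
        [count / size * 1_000_000 for count in sample]
        for sample, size in zip(counts, library_sizes)
    ]

# One shared helper instead of copy-pasting the scaling loop
# into every analysis script.
cpm = counts_per_million([[10, 20], [5, 5]], [30, 10])
```

A reviewer applying the checklist would verify that the name says what the function computes, that the scaling logic is not duplicated elsewhere, and that the comment explains the reason for the normalization.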
Functionality
- Does the code do what the documentation says it should?
- Are there any major / obvious bugs?
Maintainability
- Is the code written so that it will require minimal effort to maintain in the future?
- Are up-to-date and reliable libraries and frameworks used?
Reproducibility
- Are the results easy to reproduce (e.g. by using a reproducible environment)?
- Are seeds used when necessary?
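A minimal Python sketch of the seeding point above: fixing the seed makes a stochastic analysis step exactly repeatable, so a reviewer can rerun it and compare results. The bootstrap function and the seed value are illustrative, not part of our framework:

```python
import random

def bootstrap_mean(values, n_resamples=100, seed=42):
    # Seed a local generator so the resampling is reproducible
    # without disturbing global random state elsewhere in the script.
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        resample = [rng.choice(values) for _ in values]
        means.append(sum(resample) / len(resample))
    return sum(means) / len(means)

# Same seed, same result: two runs are bitwise identical.
first = bootstrap_mean([1.0, 2.0, 3.0, 4.0])
second = bootstrap_mean([1.0, 2.0, 3.0, 4.0])
```

Using a dedicated `random.Random` instance rather than the module-level functions keeps the reproducibility local to this analysis step.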
Data protection (especially important for the phase before publishing)
- Do the scripts avoid containing any data that should not be published (e.g. outputs in Markdown documents, comments, or files stored in the directory)?
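One way to support this check is a small script that flags files which might contain data before a repository is made public. This is a sketch, not part of our checklist materials; the extension list and size threshold are arbitrary example values to adapt per project:

```python
from pathlib import Path

# Example values only; adjust to the data formats used in your project.
DATA_EXTENSIONS = {".csv", ".tsv", ".xlsx", ".rds", ".sav"}
MAX_SIZE_BYTES = 100_000  # unusually large files warrant a look too

def flag_possible_data_files(directory):
    """Return file paths that should be reviewed before publishing."""
    flagged = []
    for path in Path(directory).rglob("*"):
        if not path.is_file():
            continue
        if (path.suffix.lower() in DATA_EXTENSIONS
                or path.stat().st_size > MAX_SIZE_BYTES):
            flagged.append(path)
    return sorted(flagged)
```

The output is only a starting point for a manual review; it cannot catch sensitive values embedded in comments or rendered notebook output.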
For code authors
Prepare your code for the review
- check with the reviewer how they want the code
- should you just send the code files by email?
- best case: create a GitHub repository, invite the reviewer as a collaborator, and assign them as a reviewer on a pull request
- provide enough context to understand your code and the changes; you can link to an issue your pull request solves or provide additional material such as a paper draft
- prepare the code (e.g. try to follow the style guides)
- don’t submit too many lines of code for review; if it’s a larger script, mark which parts the first code review should cover