campusSOURCE Award 2022
SURESOFT: Towards Sustainable Research Software
Christopher Blech, Nils Dreyer, Björn Friebel, Christoph R. Jacob, Mostafa Shamil Jassim, Leander Jehl, Rüdiger Kapitza, Manfred Krafczyk, Thomas Kürner, Sabine C. Langer, Jan Linxweiler, Mohammad Mahhouk, Sven Marcus, Ines Messadi, Sören Peters, Jan-Marc Pilawa, Harikrishnan K. Sreekumar, Robert Strötgen, Katrin Stump, Arne Vogel, Mario Wolter
Technische Universität Braunschweig, Germany
Corresponding author: Jan Linxweiler, firstname.lastname@example.org
Researchers develop software to process or generate data to test their scientific hypotheses. Especially researchers from STEM disciplines (Science, Technology, Engineering, and Mathematics) validate their models through self-developed software prototypes. However, they do not necessarily have a programming or software engineering background. Moreover, scientific software development happens in a time-constrained environment focusing on the fast publication of results. In addition, as scientists usually have limited contracts, the software is developed with short-term goals in mind. Therefore, the resulting software is mostly rapidly developed and contains code artifacts that are hard to maintain, extend and even reproduce due to missing documentation or dependencies. These factors, among others, hinder researchers from progressing and further developing their software. Thus, research software sustainability is vital to the research process to ensure that software can be evolved and reused by the next generation of scientists. To foster sustainability, scientists can highly benefit from education in principles and practices of software engineering, which can further be supported by established methods, tools, and technologies. This paper introduces SURESOFT, a twofold approach to address challenges in the development of research software regarding sustainable science that combines tools and infrastructure with education in the form of workshops and training. Furthermore, we report our experience of applying the SURESOFT approach to five software projects from different fields and discuss challenges, such as common bad practices and applicability in diverse scenarios.
The increasing number of publications that base their findings on scientific software indicates its continuously growing importance. However, scientific software is usually developed by the researchers themselves, who seldom have the necessary education to ensure high software quality. At the same time, reproducibility is generally not a focus of the software development process. Part of the reason for these problems is the publish-or-perish mentality in research, combined with the still prevalent lack of reputation for software development. In the past, these factors have led to the production and publication of incorrect or irreproducible scientific results. The situation has reached such an extent that some scientists even speak of a reproducibility or credibility crisis.
According to a study by Collberg et al., the problems concerning reproducibility primarily manifest themselves in lacking documentation, unavailable environments, and missing packages, often ending in non-compilable systems. This implies that it is not enough to share the codebase to ensure reproducibility; it is also important to enable other researchers to run research software with minimal effort. It has also been observed that researchers are often reluctant to publish the code of their tools or neglect to publish the raw data. Indeed, authors often see reproducibility as extra effort without benefit for their submission because publishing reproducible work takes time and must be considered from the earliest stages. In contrast, to enable other researchers to reuse research software and reproduce results, the software should comply with the FAIR principles (Findable, Accessible, Interoperable and Reusable). Hence, to drive the scientific discovery process forward, scientists not only need to be able to reproduce prior results but also need to be able to build upon them to answer new scientific questions. In other words, the reproducibility of scientific results in itself is not sufficient. Instead, the results as well as the process to produce them must be accessible and adaptable. For scientific software, this means that it must not only be available and executable, but other scientists must also be able to fix bugs, add features, and port existing implementations to new environments later on. The reproducibility of results as well as the ability of software to endure and evolve over time can be summarized as software sustainability.
The question of how to design sustainable software systems is one of the grand challenges in the field of software engineering. Many decades of research and experience have made it clear that there is neither a magical tool nor any easy path to achieve it. However, there is agreement on certain fundamentals of software engineering such as cohesion and coupling, modularisation, abstraction, information hiding and separation of concerns as well as striving for simplicity. Various guidelines were proposed to achieve sustainability in general and reproducibility in particular, but their implementation remains difficult. This work focuses on supporting the sustainability of research software by providing practical methods and tools that help sustain essential and large software from different fields of study. We address the key concerns researchers face when looking to extend existing research software. This includes the effort required to build the software, the difficulty of reproducing the results, and the long-term maintenance of the software. In section III, this paper describes SURESOFT, a conceptual approach to developing sustainable software that is applicable across a wide range of projects.
To demonstrate the SURESOFT approach, we apply it to five research projects at Technische Universität Braunschweig (TU Braunschweig) from different scientific fields and discuss the common and distinct hurdles between them: PyADF, a software package from the branch of theoretical chemistry that is characterized by large amounts of numerical data and complex computational tasks; THEMIS, a fault-tolerant distributed framework of a kind that is often vulnerable to bugs and hard to maintain; elPaSo, a vibroacoustic simulation tool that utilizes popular numerical methods to provide acoustic and structural analyses of various complex material and element types; SiMoNe, a system-level simulator for simulating and modeling realistic mobile networks; and lastly VirtualFluids, a computational fluid dynamics research software that provides fast and reliable numerical solutions for various kinds of flow problems.
The rest of the paper is structured as follows: in section II, we identify a set of challenges that hinder the sustainability of research software. Subsequently, in section III, we present the core principles and the technical approach and methods of SURESOFT, followed by the education component, which we consider essential, and an illustration of the SURESOFT workflow. Thereafter, in section IV, we give a general overview of the application of the SURESOFT approach to each of the aforementioned research projects. Afterwards, in section V, we present related work that tackled some of these challenges and highlight the differences compared to SURESOFT. Finally, in section VI, we conclude by outlining our strategy to support the movement towards sustainable software in the scientific community.
As previously discussed, there is no general agreement on the definition of software sustainability. Still, we have identified reproducibility of results and the capability of software to endure and evolve over time to be the most important aspects from our perspective. However, the implementation of these criteria presents certain conceptual, technical as well as organizational challenges. Some of these challenges, especially in the context of reproducibility, have been addressed, e.g., by Boettiger. However, our discussion covers a wider context.
A. Availability & Accessibility
Publications in the field of scientific computing usually present models, methods and result data. However, the software itself is usually not published alongside the paper as it is considered nothing more than a highly advanced calculator and does not add to the scientist’s reputation. This poses an issue in terms of reproducibility. Without access to the source code of the software, it is impossible for other scientists to trace back the calculation of results and ensure that the implementation of the underlying models is correct. Furthermore, without access to the binaries and a compatible computing environment, results cannot be reproduced. To ensure long-term availability of the software and its source code, it needs to be archived on an appropriate platform. Another question raised in the context of availability and accessibility is that of the corresponding license. Frequently, software will be published using a free or open-source software license, but a proprietary software license might also be an option to make the software and its source code available.
B. Documentation
Providing thorough documentation is essential to enable others to reuse and extend the research software as well as reproduce results. However, the documentation process requires a commitment from the early development stages and also consumes more time than most researchers can afford. Since scientific software undergoes heavy modifications as part of the discovery and development process, writing the documentation is often pushed to the very end of the project, at which point there is not sufficient time to do it thoroughly. Thus, the documentation is either neglected or left in an incomplete state. Moreover, most of the time scientific software is not published, and therefore writing documentation to enable other scientists to understand, reuse and extend the software is often not even considered. As a consequence, even the installation or build processes become hard or even impossible to reproduce.
C. Software Quality & Design
The majority of research software is developed by scientists themselves rather than by professional software developers. The main reason for this is that in-depth domain knowledge is required to develop the software. Hence, being domain experts themselves, scientists usually strive to get the necessary software development knowledge through self-study or receive it from colleagues. Unfortunately, self-education only happens to the limited extent that is sufficient to achieve their primary short-term goal of building a scientific reputation. In other words, the software itself is of limited value and only serves as a tool to gain new insights as soon as possible. Therefore, scientists often follow a quick and dirty software development approach as opposed to focusing on high-quality, long-term sustainable software. Accordingly, they have little motivation to learn the corresponding skill set. Nevertheless, code that is easy to read, understand, change and reuse needs to be structured appropriately. Due to the lack of software engineering knowledge among scientists as well as the time pressure, scientific codes often suffer from bad quality and design. The result is software that is tightly coupled, unstructured, hard to understand and not well tested.
D. Collaboration & Versioning
Research software often starts out as a project of a single scientist but eventually ends up being used and maybe even developed further by their peers as well. These collaborators can either be located within the same institution or from the wider research community. As software development progresses, it becomes harder to track the changes in code and documentation over time. This is especially true if multiple developers collaborate while working on their own code copies. Moreover, the integration of these versions becomes a challenge in different aspects. On the one hand, they could contain conflicting changes. On the other hand, there is no guarantee that the software still works as intended after integrating. Also, untraceable changes complicate the reproduction of scientific results, as it is not clear how to match a version of the software with the corresponding result data.
E. Dependency Hell & Software Evolution
Many software systems use existing solutions in the form of third-party libraries. These libraries reduce the workload necessary to solve a problem. However, they also represent dependencies that must be available in order to run the software. For users, this can become a problem with respect to reproducibility if dependencies are either not documented, no longer available, or not compatible with certain platforms. Another aspect of this issue arises when the dependencies evolve. Further development of libraries either to implement new features or to fix bugs from previous versions may eventually result in different behavior, making it incompatible with older versions. As a result, data from older versions cannot be reproduced with the new version. In addition, evolving dependencies may even prevent developers from compiling and running the software, making it unusable.
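As an illustration of the undocumented and evolving dependencies described above, the following minimal Python sketch records the installed versions of a project's dependencies and compares two such records. It is not part of the SURESOFT toolchain; the function names `snapshot_dependencies` and `diff_snapshots` as well as the package names in the usage note are hypothetical.

```python
from importlib import metadata


def snapshot_dependencies(names):
    """Map each required package name to its installed version.

    Hypothetical helper for illustration: a missing dependency is
    recorded as None instead of raising an error.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # dependency is not available here
    return versions


def diff_snapshots(old, new):
    """Return {package: (old_version, new_version)} for every change."""
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes
```

Storing such a snapshot next to published results, for example `snapshot_dependencies(["numpy", "scipy"])`, documents the environment that produced them, and `diff_snapshots` shows which dependencies have evolved when a later run no longer matches.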
III. SURESOFT APPROACH
A. Core Principles
As outlined in section I, our definition of software sustainability is based on reproducibility of results and the capability of software to endure and evolve over time. In SURESOFT, we follow a twofold approach (Figure 1) to achieve sustainability, based on a combination of established tools and infrastructures that we provide for members of the TU Braunschweig on the one hand and supporting education and training on the other hand. More specifically, with our toolchain and infrastructure we aim to achieve reproducibility of results by providing ready-to-use runtime environments that include all necessary instructions, data and dependencies in order to run the software. Both the software and runtime environments will be made available through a dedicated publication and long-term archiving service that complies with the FAIR principles. During the publication and archiving process, an automated test pipeline will ensure that the software is executable within the runtime environment and that it produces the correct results. To allow for higher software quality, easier software evolution and reuse in the future, extensive education regarding software engineering principles and best practices is part of the SURESOFT approach. Talks and workshop topics cover essential concepts like modularization, abstraction and information hiding as well as design principles and patterns. Furthermore, a large focus is placed on testing practices ensuring that applications behave according to their specifications. These practices are supported by employing version control systems in combination with continuous integration (subsection III-B) to entirely automate the testing process and provide quick feedback to developers on every code integration.
B. Technical Approach & Methods
The software engineering community has developed multiple tools to support the software development process that have been established for quite a while. In recent years, some have also been adopted by the scientific software development world. Especially tools supporting automation are becoming more common since they can reduce human errors and accelerate the development process. In the following paragraphs, we introduce technologies and methods that we found helpful to support the sustainable development of research software. The tools and techniques described below are well established in the software industry. However, while individual techniques, like the use of version control, have become popular among scientists, widespread adoption is still not common. In many cases, scientists are not aware of modern technologies or are reluctant to use them because they often come with a steep learning curve.
- Figure 1: SURESOFT Approach for Sustainable Software
Addressed challenges: Availability & Accessibility, Documentation, Dependency Hell & Software Evolution
Container technologies provide operating system-level virtualization. In contrast to virtual machines, they are bound to the host operating system’s kernel. While this approach creates a dependency on the operating system, it has less overhead and therefore offers faster startup times and better performance compared to virtual machines. In the context of SURESOFT, we employ the specific container technologies Docker and Singularity. We use Docker especially in combination with continuous integration, while Singularity is used in high performance computing (HPC) environments, e.g., HPC clusters, due to its easier integration with the Message Passing Interface (MPI) and general purpose graphics processing units (GPGPUs). The two container technologies play an important role in the context of reproducibility. By introducing an image format that allows the definition of environment templates that contain all the required components to run the software, they provide a solution to the challenge of missing dependencies. Since these images can be exported to simple file archives, they can also be uploaded to appropriate archiving platforms, allowing other researchers to easily find and access the software. Moreover, the environment definitions for images are plain text files that can therefore also serve as a very basic form of documentation with regard to installation and execution of the software.
Addressed challenges: Availability & Accessibility, Documentation, Collaboration & Versioning
Version Control Systems allow developers to track and manage changes made to their source code files and commit them to a source code repository. With each commit the current state of the software is recorded and provided with a unique identifier, documenting the changes made to the source code over time and allowing developers to roll back to previous versions if something turns out to be wrong. In addition, collaborative development is supported by using a centralized source code repository as a means for synchronization. Popular choices for hosting repositories are platforms like GitLab or GitHub that can help make the software easily available and accessible for contributors and users. At TU Braunschweig, the IT center hosts our own GitLab instance.
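The unique commit identifier can also be used to link result data to the exact software version that produced it. The following minimal Python sketch illustrates this idea under the assumption that the software lives in a Git repository; the function names and the structure of the record are hypothetical and not prescribed by SURESOFT.

```python
import json
import subprocess
from datetime import datetime, timezone


def current_commit(repo_dir="."):
    """Return the Git commit hash of repo_dir, or None if unavailable.

    Uses the standard command `git rev-parse HEAD`.
    """
    try:
        completed = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # git is not installed or repo_dir is not a repository
    return completed.stdout.strip()


def provenance_record(results, commit, parameters):
    """Bundle results with the commit and input parameters (hypothetical layout)."""
    return {
        "commit": commit,
        "created": datetime.now(timezone.utc).isoformat(),
        "parameters": parameters,
        "results": results,
    }


def save_record(record, path):
    """Write the record as human-readable JSON next to the result data."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2, sort_keys=True)
```

A simulation script could call `save_record(provenance_record(results, current_commit(), parameters), "run.json")` at the end of each run, so that every result file can later be matched with a specific commit.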
Addressed challenges: Documentation, Software Quality & Design, Collaboration & Versioning
Continuous integration (CI) is a development practice that was introduced as part of Kent Beck’s Extreme Programming. As the name suggests, it aims at integrating newly developed code often and in short intervals into the main production code. With every integration, the entire system is tested by an extensive suite of automated tests that provide rapid feedback to the developers if an integration has compromised the application’s functionality. The short integration cycle plays an important role in this context. Keeping the cycle short ensures that the number of changes committed to the main code line is low. Therefore, if the test suite fails, locating the defect is easier. Of course, for this procedure to be really effective, the presence of the aforementioned extensive and reliable test suite is an essential requirement. In addition, the test suite also serves as low-level documentation. Moreover, a prerequisite for easy testability is good software quality. Therefore, continuous integration can indirectly foster good software design. Although not part of the initial definition of continuous integration, its usage nowadays implies the employment of some kind of continuous integration service that automates the build process and execution of the test suite on every integration.
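To make the notion of an automated test suite concrete, the following minimal Python sketch tests a small numerical routine with the standard `unittest` module. The routine `trapezoid` and the test names are hypothetical examples and are not taken from any of the projects discussed in this paper.

```python
import math
import unittest


def trapezoid(f, a, b, n):
    """Approximate the integral of f over [a, b] using n trapezoids."""
    if n < 1:
        raise ValueError("n must be at least 1")
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


class TrapezoidTest(unittest.TestCase):
    def test_linear_function_is_integrated_exactly(self):
        # The trapezoidal rule is exact for linear integrands.
        self.assertAlmostEqual(trapezoid(lambda x: 2 * x, 0.0, 1.0, 4), 1.0)

    def test_converges_for_sine(self):
        # The integral of sin over [0, pi] is 2.
        approx = trapezoid(math.sin, 0.0, math.pi, 1000)
        self.assertAlmostEqual(approx, 2.0, places=5)

    def test_rejects_invalid_resolution(self):
        with self.assertRaises(ValueError):
            trapezoid(math.sin, 0.0, 1.0, 0)
```

A continuous integration service would typically run such tests on every integration, for instance via `python -m unittest`, and report a failure back to the developer. The tests also document the expected behavior of the routine, which is the sense in which a test suite serves as low-level documentation.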
Addressed challenges: Availability & Accessibility, Dependency Hell & Software Evolution
The term continuous analysis was coined by Beaulieu-Jones and Greene. It combines containerization with the continuous integration approach with the goal of improving the reproducibility of research. In order to run the test suite of an application, the continuous integration pipeline must of course provide a runtime environment that is able to execute it. Continuous analysis suggests providing this environment in the form of a Docker container. When the continuous integration pipeline for a computational analysis completes successfully, the environment is extended by the compiled application and published as a new Docker image. Since this image contains all the required dependencies and configuration alongside the application, it serves as an executable package that can be distributed to other researchers who can use it to reproduce the computational analysis run in the test pipeline.
Addressed challenges: Availability & Accessibility, Dependency Hell & Software Evolution
Archiving refers to collecting and indexing digital copies of published papers or datasets in an accessible and usable format in public repositories. This ensures long-term availability and makes them citable as well as findable via meaningful metadata including a unique identifier (DOI), following the FAIR principles. Furthermore, to enable researchers to reproduce the computational results, it is also essential to archive the software assets including the input data and the research software itself. Ideally, archiving the entire runtime environment, for example in the form of the aforementioned containers, as a ready-to-use appliance will ease reproducibility and prevent any struggle with running the software.
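One simple building block for archiving software assets is a manifest that stores descriptive metadata together with a checksum for every archived file, so that the integrity of the archive can be verified later. The following minimal Python sketch illustrates this; the manifest layout and the function names are assumptions made for this example and do not describe the archiving service used in SURESOFT.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory, metadata):
    """Describe all files below directory (hypothetical manifest layout)."""
    root = Path(directory)
    files = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            files[path.relative_to(root).as_posix()] = sha256_of(path)
    return {"metadata": metadata, "files": files}


def verify_manifest(directory, manifest):
    """Return the relative paths that are missing or whose content changed."""
    root = Path(directory)
    mismatches = []
    for name, expected in manifest["files"].items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            mismatches.append(name)
    return mismatches
```

The `metadata` argument would carry the descriptive information that makes an archive findable, such as title, authors, and license, while `verify_manifest` can be run after retrieval to confirm that input data and software assets are unchanged.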
As discussed in section II-C, scientific software developers usually have little to no dedicated software engineering education. Consequently, research software often suffers from low quality. However, especially the central aspects of software evolution and reuse strongly correlate with the software design. Researchers lean more and more towards modern tools and methods (e.g., Git, continuous integration, object-oriented programming, etc.) to support their software development process. But without deeper knowledge of the underlying concepts, the use of these tools and methods alone cannot guarantee the sustainability of the developed software system. Hence, within SURESOFT we have a strong focus on education, not only on the usage of tools but also on the underlying concepts of software design. To fill the gaps in the software engineering knowledge of researchers, we regularly organize workshops to provide educational content about the following topics:
- Software Design Principles & Patterns
- Testing & Test Driven Development
- Version Control & Continuous Integration
- Research data management & Long-term archiving
- Software licensing
The SURESOFT workflow is based on utilizing well-established practices and tools from the software engineering community that are provided ready to use. In particular, we strive for automation wherever possible in order to speed up the development process and avoid human errors. In Figure 2, a high-level view of the workflow is given, which depicts how a set of interlinked stages aids the development of sustainable research software. The first stage depicts an automated development and deployment process, where the continuous integration platform is loosely coupled to a version control system and a bug reporting tool. To start with, academic software developers are responsible for frequently integrating their code <1>. Next, the continuous integration platform triggers an automated build process, and a suite of tests is executed to verify that the new changes did not break the software <2>. A second suite of predefined macro tests is executed on the cluster to evaluate the settings and how the software operates under specific conditions <3>. Detected issues are reported back to the developer <4>. Once the tests have passed and the software works properly, the platform creates a ready-to-use image including the software and its dependencies. Additionally, a developer edition with integrated tool support to extend the software is generated <5>. The latter lowers the entry barrier for external contributors and external evolution of the software. From the researcher perspective, this concept enables the (academic) developers to provide software that is ready to use and install <6>. In case bugs are identified, they can be reported via a dedicated service <7>. From then on, researchers (internal or external) can locally execute the software or deploy it on a suitable system, e.g., the cluster that is also used for testing the software.
To support research software sustainability, SURESOFT will provide researchers with a long-term preservation service to archive finished studies and the software associated with them. This is achieved by initiating the second stage of the approach. This stage enables researchers to archive their experiments and software artifacts on an archival platform <8>. A research data and software repository collects these artifacts once they are validated. The experiments of an archived software are re-executed initially and then periodically to ensure that reproducing them is still feasible <9>. The platform relaunches the tests and reports the results to the involved parties.
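The periodic re-execution step requires a criterion for deciding whether a re-run still reproduces the archived results. The following minimal Python sketch compares two sets of named numerical results within a tolerance. It is an illustration only; the function name, the tolerance values, and the assumption that results are scalar numbers keyed by name are hypothetical.

```python
import math


def compare_results(reference, rerun, rel_tol=1e-9, abs_tol=0.0):
    """List deviations between archived reference values and a re-run.

    Both arguments map result names to numbers (an assumption of this
    sketch). An empty list means the re-run reproduced the reference.
    """
    deviations = []
    for key in sorted(set(reference) | set(rerun)):
        if key not in rerun:
            deviations.append(f"{key}: missing in re-run")
        elif key not in reference:
            deviations.append(f"{key}: not present in archived reference")
        elif not math.isclose(reference[key], rerun[key],
                              rel_tol=rel_tol, abs_tol=abs_tol):
            deviations.append(
                f"{key}: archived {reference[key]!r}, re-run {rerun[key]!r}"
            )
    return deviations
```

A tolerance is used instead of exact equality because floating-point results may differ slightly across hardware or library versions without indicating a real reproducibility problem; suitable tolerances depend on the application.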
IV. SURESOFT APPLICATIONS
To illustrate how SURESOFT contributes to software sustainability, we present five scientific scenarios ranging from the system research community to disciplines that often rely on extensive computations (e.g., theoretical chemistry). In the following, we provide an overview of our experiences applying the SURESOFT approach to the mentioned projects with a focus on their demand for long-term sustainability.
A. SURESOFT over PyADF
PyADF is a Python-based scripting environment for designing and executing workflows in theoretical and computational chemistry. It is interfaced to several (commercial and open-source) external quantum-chemical program packages and allows researchers to build and execute complex workflows. PyADF was initially published in 2011 and has been constantly developed and extended by Jacob et al. at the Institute of Physical and Theoretical Chemistry (PCI), TU Braunschweig. Before PyADF joined the SURESOFT project, only a few of the challenges mentioned in section II had been solved by the developers. For example, although an extensive number of tests was available, these were carried out manually on arbitrary versions of the code. However, the largest challenges of PyADF lie in Availability & Accessibility. First of all, PyADF is able to orchestrate calculations of many different quantum-chemical program packages. Therefore, its deployment relies heavily on the environment and the availability of these external codes. Second, the inputs and outputs of these quantum-chemical calculations are quite diverse. Here, a strategy for storage and archival in suitable unified data formats will be needed. Within SURESOFT, we first tackled the challenge of Collaboration & Versioning. The extensive manual test suite of PyADF was converted to a continuous integration approach with automatic testing on an HPC setup using several computing nodes in parallel. In this setup, it is now possible for all (even external) developers to run the complete test suite on demand or automatically within merge requests in a short time frame. With this setup, it became possible to convert PyADF to Python 3 in a smooth transition without losing functionality. Currently, strategies for improving Availability & Accessibility are being developed. Here, PyADF will use a modularized approach of interfacing with different quantum-chemical software packages using container technologies.
As the availability of these external program packages will depend on the user’s environment as well as licensing terms, we aim to develop an interface, which allows PyADF to work with external code running outside or even within different containers. The last step within SURESOFT will be to develop a strategy for the archival of quantum-chemical research data. This requires not only the input and output data of PyADF itself but also of the individual tasks executed as parts of a workflow. For tasks performed with different external program packages, output data needs to be converted to common data formats and annotated with suitable metadata in order to allow for the reuse of this data. As PyADF provides a common interface to different external program packages, it is ideally suited to integrate the conversion, annotation, and archival of quantum-chemical research data. In summary, the SURESOFT approach will make PyADF easier to install, use, and develop, making PyADF itself more sustainable. Additionally, seamlessly integrating data formats and interfaces to archival solutions will make it easier to handle and publish research data generated with PyADF.
B. SURESOFT over THEMIS
THEMIS is a Rust-based Byzantine fault-tolerant (BFT) framework developed at the Institute of Operating Systems and Computer Networks (IBR), TU Braunschweig. In essence, BFT systems are designed to safeguard against arbitrary failures in a distributed setting, e.g., a node that sends corrupted responses. BFT implementations are typically complex and include thousands of lines of code. Despite the algorithms being formally verified, the implementation of these systems can be vulnerable to faults or attacks due to programming errors or memory corruption, which are especially prevalent when using unsafe languages. A defective implementation can invalidate the protocol's fault resilience. Therefore, tackling the challenge of Software Quality & Design is essential. For this, testing is an important pillar in the development process of THEMIS. Since Rust is a safe language that already gives memory and concurrency guarantees, more energy can be invested into testing the framework’s logic. We include a wide range of unit and integration tests that improve the development of THEMIS and allow us to gain more confidence in the correctness of the implemented prototype. We use unit tests to test the logic within the code and integration tests to validate the protocol steps and subprotocols (e.g., view change) that are independent and intended for any traditional BFT protocol. Additionally, we use Miri to detect undefined behavior in the Rust code. Further, the THEMIS project involves multiple authors with specific competencies who aim to extend the framework or implement optimizations, which may result in a Collaboration & Versioning challenge. For this and the challenge of Software Quality & Design, we rely on a self-hosted GitLab instance with a CI pipeline to run distributed tests on demand and once a large number of changes has accumulated on the main branch.
This allows us to accept changes from a diverse set of authors with different levels of experience, as errors can be easily identified and corrected. Meanwhile, the performance of the system is measured and recorded. This allows performance regressions to be identified and traced back to specific commits. Further, collaboration is supported by a bug reporting tool. With this tool, contributors and external users can easily create reproducible bug reports, as it automatically creates a report including inputs, logs, and information about the underlying infrastructure (e.g., Rust or the operating system). We emphasize that building and running a BFT system is often a time-consuming and error-prone process, especially when the system relies on different third-party libraries and various hardware requirements, leading to a Dependency Hell & Software Evolution issue. In this context, we apply each step of the SURESOFT approach to THEMIS to enable automated deployment, testing, and evaluation. Following the procedure of SURESOFT, the process of building and executing THEMIS is automated using Ansible scripts. The user also has the option to pull ready-to-use images from the GitLab image registry during development. Additionally, upon publishing and open sourcing of the software, a tested Docker image will be submitted to an archiving service to ensure the long-term usability of the software. The Documentation challenge is naturally addressed by code comments and supporting documentation like READMEs and an active GitLab wiki. Here, the SURESOFT principles offer an additional avenue for THEMIS. Through the Ansible and Docker build scripts, "living" documentation is created. They outline exactly how the project can be built and executed. As they are constantly executed in continuous integration, the developers are immediately notified once the system and the documentation diverge. Therefore, the documentation provided by them is always up to date.
Thus, the concepts of SURESOFT can validate the THEMIS implementation and improve the project’s quality and lifespan. More importantly, SURESOFT is suitable for providing sustainability as well as automated deployment and testing of distributed systems such as BFT systems.
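The commit-level tracking of performance regressions described for THEMIS can be pictured with a small sketch. It is not taken from the THEMIS pipeline: the record layout, the function name `find_regressions`, the commit hashes, and the 10% tolerance are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One benchmark result recorded by a CI run (hypothetical layout)."""
    commit: str          # commit hash the pipeline ran on
    throughput: float    # e.g. requests per second; higher is better


def find_regressions(history, tolerance=0.10):
    """Return commits whose throughput dropped by more than `tolerance`
    relative to the directly preceding measurement.

    `history` must be ordered from oldest to newest commit.
    """
    regressions = []
    for previous, current in zip(history, history[1:]):
        if previous.throughput <= 0:
            continue  # no meaningful baseline to compare against
        drop = (previous.throughput - current.throughput) / previous.throughput
        if drop > tolerance:
            regressions.append((current.commit, round(drop, 3)))
    return regressions


# Made-up measurements, oldest first.
history = [
    Measurement("a1f3", 1000.0),
    Measurement("b7c2", 990.0),   # 1 % slower: within tolerance
    Measurement("c9d4", 800.0),   # about 19 % slower than b7c2: flagged
    Measurement("d0e5", 810.0),
]
print(find_regressions(history))
```

Comparing each commit only with its direct predecessor keeps the blame local to a single change; a real pipeline would probably average several runs per commit to reduce measurement noise.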
C. SURESOFT over elPaSo
The research code "Elementary Parallel Solver (elPaSo)" is an in-house vibroacoustic simulation tool that has been developed continuously over 25 years at TU Braunschweig and is presently extended and maintained by the Institute for Acoustics (InA), TU Braunschweig. The capabilities of the tool include acoustic and structural analysis using the popular numerical methods FEM, BEM and SBFEM, supporting various complex material and element types. The code, written in C++, also facilitates efficient deployment on HPC clusters to support parallel computing of large-scale high-fidelity models, such as aircraft simulations. Other components of the software include model order reduction, uncertainty quantification and topology optimization. Prior to the adoption of the SURESOFT approach, the software development environment for elPaSo offered only fundamental services, namely the use of Git to enable Collaboration & Versioning. A sophisticated procedure to compile elPaSo, including the installation and linking of external dependencies, made the code harder to use and develop and demanded advanced software skills from users and developers. As a consequence, elPaSo was hard to maintain, which slowed down the development process. In addition, the lack of software testing resulted in past research contributions becoming incompatible with newer implementations. As a result, the sustainability of the code implementations became a recurring challenge. Hence, the main goals for elPaSo with SURESOFT are to improve the availability, maintainability and reliability of the software, thereby addressing the mentioned challenges of Availability & Accessibility and Dependency Hell & Software Evolution. In the frame of SURESOFT, the legacy code of elPaSo is adapted to ensure the guiding principles of "ready to use" and "ready to develop".
This has already shown remarkable improvements for researchers developing elPaSo in various research projects and for students using the code in their advanced academic courses and projects. Currently, elPaSo implements continuous analysis by combining a continuous integration pipeline with containerization using Docker and Singularity, which wrap the various code dependencies. While Docker is applied as the standard approach, Singularity is preferred for performance testing on HPC clusters. The complexity of the software environment stems from the numerous third-party libraries linked to elPaSo, for instance MPI (Open MPI/Intel MPI), PETSc, SLEPc, ARPACK, Intel MKL and HDF5. The resulting large size of the dependencies, and hence of the container images, is handled efficiently by the CONAN C++ package manager, which packs only the components required by elPaSo. Software testing concepts incorporating unit, integration and performance testing are also adopted in the development workflow. A set of vibroacoustic benchmarks has been identified to enable automated testing of code features on every commit, generating benchmark-specific technical reports with detailed error diagnosis and, if necessary, issue reporting. The issue reporting feature includes automated issue creation using python-gitlab. As a result, the responsible developer is notified on the issue board about the domain-specific error caused by their recent commit. In summary, the software ensures sustainability by complying with the guidelines defined by SURESOFT and provides a higher-standard software development environment for researchers. Remaining challenges and milestones include increasing the test coverage of the legacy code, enhancing the testing of different code modules in the context of parallel computing, and archiving and publishing the software’s stable releases along with the necessary documentation and test artifacts.
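The benchmark-driven issue reporting described for elPaSo can be illustrated as follows. This is a minimal sketch under stated assumptions, not the elPaSo implementation: the benchmark names, the values, the tolerance, and the helper `build_issue_payload` are hypothetical. The sketch only assembles a title and description; a dictionary of this shape is the kind of argument that python-gitlab's `project.issues.create(...)` accepts, but no GitLab call is made here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """Outcome of one benchmark run (hypothetical layout)."""
    name: str
    reference: float   # stored reference value, assumed to be nonzero
    computed: float    # value produced by the current commit


def relative_error(result):
    """Relative deviation of the computed value from the reference."""
    return abs(result.computed - result.reference) / abs(result.reference)


def build_issue_payload(commit, author, results, tolerance=1e-3):
    """Return a title/description dict for a failing run, or None if all pass."""
    failed = [r for r in results if relative_error(r) > tolerance]
    if not failed:
        return None
    lines = [f"Commit {commit} by {author} broke {len(failed)} benchmark(s):", ""]
    for r in failed:
        lines.append(
            f"- {r.name}: reference {r.reference:g}, computed {r.computed:g}, "
            f"relative error {relative_error(r):.2e}"
        )
    return {
        "title": f"Benchmark failure after commit {commit}",
        "description": "\n".join(lines),
    }


# Made-up results: the first benchmark passes, the second deviates by 5 %.
results = [
    BenchmarkResult("plate_eigenfrequency", 100.0, 100.00001),
    BenchmarkResult("cavity_pressure", 2.0, 2.1),
]
payload = build_issue_payload("3e9a1c", "jdoe", results)
print(payload["title"])
print(payload["description"])
```

Separating the construction of the report from the act of posting it keeps the comparison logic testable without access to a GitLab server.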
D. SURESOFT over SiMoNe
The Simulator for Mobile Networks (SiMoNe) framework, developed and maintained at the Institute for Communications Technology (IfN), TU Braunschweig, is a tool to realistically model mobile networks. It supports various mobile radio technologies such as the Global System for Mobile Communications (GSM) or Long Term Evolution (LTE) and offers several models to emulate the movement of pedestrians, vehicles, or people inside buses or trains. SiMoNe supports realistically large radio networks in urban environments covering thousands of base stations and subscribers. This makes it possible to evaluate the performance gain of new scheduling algorithms or network setups with respect to the network load and the user experience. SiMoNe stands out through its linkage to a Structured Query Language (SQL) database that holds terrain and building data as well as ray-optical channel predictions, which are computed in advance for complexity reasons. The framework is developed in C# on top of the .NET framework, which requires Microsoft Windows as the operating system. Over the last ten years, the SiMoNe software framework has been maintained and extended by a varying number of permanently employed researchers as well as students, mostly coming from the field of electrical engineering. On average, about ten people work on different parts of the software. Due to this large number of maintainers, the current build of SiMoNe fails repeatedly because of dependencies that were not pushed to the main repository or because of overlooked coding errors. As part of the SURESOFT approach, the source code was moved to GitLab and Continuous Integration (CI) principles were integrated into the SiMoNe development process, ensuring Collaboration & Versioning. This also guarantees a working state of the software at all times, which addresses the Availability & Accessibility issues. Moreover, most of the master’s, bachelor’s and Ph.D. theses make use of the simulation results of SiMoNe.
Therefore, it is of great interest to ensure the reproducibility of experiments for many years by means of the introduced archiving approaches. This requires strategies to store the executables and experiments as well as to download and provide the data of the SQL database for offline usage, as required by the archived experiments. By applying SURESOFT, software development of the SiMoNe framework shifted from an ego-perspective to a team-oriented process, supported by CI principles that guarantee a stable software version. The Documentation and archiving concept solves the problem of handling self-managed software that is developed by doctoral students and left at the institute when they finish their Ph.D. This not only makes it possible to repeat experiments years later but also transfers important knowledge to future generations.
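One way to make such an archive of executables, experiments and exported database content verifiable years later is to store a checksum manifest alongside it. The following sketch is an assumption about how this could look and is not SiMoNe's archiving tool; the file names, the manifest format, and the helper functions are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map every file below `root` (relative POSIX path) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the paths whose content is missing or no longer matches."""
    root = Path(root)
    return [
        name for name, checksum in manifest.items()
        if not (root / name).is_file() or sha256_of(root / name) != checksum
    ]


# Demonstration on a throwaway directory with made-up archive content.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp)
    (archive / "experiment").mkdir()
    (archive / "experiment" / "scenario.json").write_text('{"cells": 3}')
    (archive / "db_export.sql").write_text("-- offline copy of required tables\n")

    manifest = build_manifest(archive)
    print(json.dumps(manifest, indent=2))

    unchanged = verify(archive, manifest)          # nothing modified yet
    (archive / "db_export.sql").write_text("-- tampered\n")
    changed = verify(archive, manifest)            # now reports the SQL export
```

A manifest of this kind does not replace the archive itself, but it lets a later user confirm that the stored experiment inputs and the offline database export are still the ones the original results were produced with.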
E. SURESOFT over VirtualFluids
VirtualFluids is a Computational Fluid Dynamics research code developed at the Institute for Computational Modeling in Civil Engineering (IRMB) at TU Braunschweig. Based on the Lattice Boltzmann Method (LBM), VirtualFluids aims to provide fast and reliable numerical solutions for various kinds of flow problems, ranging from turbulent flow to multi-field problems such as fluid-structure interaction. Because flow simulations quickly become very expensive in terms of computing time, VirtualFluids supports both a GPU parallelization based on CUDA and a CPU parallelization based on MPI and OpenMP. An open source version of VirtualFluids is published under the GNU GPLv3. VirtualFluids has been developed for over 20 years as part of several research projects and is now at a critical stage where changes are very difficult to make and debugging times in particular continue to increase. Although a lot of effort has been put into Software Quality & Design in the past, VirtualFluids lacks an appropriate test base. In order to ensure functionality and quality standards, the code needs to be tested after each change. For testing, it will also be necessary to run the code on distributed machines. Another challenge in the context of HPC clusters is the application of containers when it comes to the direct usage of hardware resources such as GPUs and fast network interfaces. A special challenge for VirtualFluids will be to ensure parallel efficiency at the same time. As a consequence, in the context of SURESOFT, it is crucial for VirtualFluids to set up a continuous integration workflow targeting the challenges Collaboration & Versioning as well as Availability & Accessibility. First, this means enabling all developers to work together on a single code base with continuous code merges and code builds.
Second, to increase the code quality while decreasing debugging times, a test suite needs to be set up that verifies both algorithmic and physical correctness. To this end, we extended the SURESOFT approach with our own tool HPC Rocket, which supports deployment over SSH and, at the same time, execution within a scheduling system (e.g., hpc-rocket, Slurm). The common approach based on Docker is therefore extended with Singularity, which allows for native support of, e.g., MPI and CUDA, as similarly used by elPaSo. In summary, VirtualFluids will benefit a lot from the SURESOFT approach in the long term. The continuous integration pipeline enables scientists as well as students to collaborate further on VirtualFluids. The development of a test suite has just started and will further improve the overall quality while reducing the debugging time in the future. Furthermore, the containerized approach already brings significant advantages, as the barrier to using and developing VirtualFluids has decreased considerably.
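To make the combination of a scheduling system and Singularity more concrete, the sketch below renders a Slurm batch script that runs a containerized command. It does not reproduce the interface of HPC Rocket; the function `slurm_script`, the image name `virtualfluids.sif`, and the command `./run_tests.sh` are hypothetical. It uses only the common `sbatch` directives `--job-name`, `--nodes`, `--ntasks-per-node`, `--time` and `--gres`, and the `--nv` flag of `singularity exec` for NVIDIA GPU support.

```python
def slurm_script(job_name, image, command, nodes=1, tasks_per_node=1,
                 gpus_per_node=0, time_limit="00:30:00"):
    """Render a Slurm batch script that runs `command` inside a container."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={tasks_per_node}",
        f"#SBATCH --time={time_limit}",
    ]
    if gpus_per_node > 0:
        # Request GPUs as a generic resource on every allocated node.
        lines.append(f"#SBATCH --gres=gpu:{gpus_per_node}")
    # --nv makes the host's NVIDIA driver available inside the container.
    nv_flag = " --nv" if gpus_per_node > 0 else ""
    lines += ["", f"srun singularity exec{nv_flag} {image} {command}"]
    return "\n".join(lines) + "\n"


script = slurm_script(
    "vf-regression", "virtualfluids.sif", "./run_tests.sh",
    nodes=2, tasks_per_node=4, gpus_per_node=1,
)
print(script)
```

A CI job could copy such a script to the cluster over SSH, submit it with `sbatch`, and collect the results once the job has finished, which is the kind of workflow the text attributes to HPC Rocket.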
V. RELATED WORK
In the following, we highlight prominent works that address the challenges identified in the scope of this paper:
a) On Applying Continuous Integration to Research Projects: So far, little research has investigated the benefits of introducing continuous integration practices into the research software development process. An example is the "continuous analysis" process, which explored continuous integration for reproducing computational analyses. It combines containerized approaches with continuous integration to re-run the computational analysis. The solution is specifically designed for scientific workflows. The Conquaire project also relies on continuous integration to monitor the quality of research data, e.g., according to open format standards. It essentially tackles the reproducibility issues of research results. At its core, the Conquaire system is connected to a repository for tracking data and code versions. However, both works lack support for long-term maintenance and preservation of the code base and do not target an integrated approach like SURESOFT.
b) Archival of digital assets: There are various initiatives towards software archiving. A recent initiative is the Software Heritage project, which collects the most influential code bases from various scientific fields in order to create a long-lasting, accessible reference. Rapp et al. presented Software Archiving of Research Artefacts (SARA), a service for archiving research data and scientific tools. At its core, SARA relies on a GitLab instance (for read-only access to source code) and a customized DSpace repository (for metadata), ensuring citability through a DOI assignment to each repository. It features the archiving of the development history and data exploration. Nevertheless, the continuous integration aspect and the ease of deployment for end-users have been neglected so far. Currently, there is no project that provides an integrated approach and explores the expected synergy effects, as is the goal of SURESOFT. This was the aim of the CiTAR (Citing & Archiving Research) project. It focused on the long-term preservation of the research environment (e.g., configuration).
c) Reproducible Research: Reproducibility in computational science is a long-standing issue. A broad range of research has attempted to analyze and promote reproducible research papers. In 1995, WaveLab was the first effort towards reproducibility. It proposed a toolbox of Matlab functions to implement wavelet analysis algorithms and enable the reproducibility of results in computational science papers. Underlying elements such as datasets, simulations, and tables were implemented using the Matlab language and its computing environment and were then shared as a package. Previous work on reproducible research focused on reproducing influential papers. Clark et al. set out to reproduce the results of the Xen project by Barham et al., which required porting device drivers and adding some testing scripts. In the field of distributed systems, Howard et al. focused on reproducing the results of the Raft protocol paper. However, for both works, reproducibility was only possible with support from the original authors. More recently, with the advances of new technologies, Satyanarayanan et al. described Olive, a system that uses a virtual machine to encapsulate legacy software with its software dependencies to reproduce the environment and enable its longevity. In the discipline of computational science, the approach of Grüning et al. appears to be similar to SURESOFT in identifying virtualization as a crucial and convenient mechanism for reproducibility.
d) Education via workshops and trainings: Although researchers indicate an increasing need and interest, only a few universities offer courses specifically targeted at scientists to increase their knowledge of how to develop software. However, a flagship project of this kind is the Software Carpentry course series initiated by Greg Wilson. Other examples are community projects offering workshop-based training programs that promote the FAIR principles (e.g., CodeRefinery) or the use of HPC clusters in science.
VI. BROADER IMPACT OF SURESOFT AND LONG-TERM PERSPECTIVE
The SURESOFT project has been running since 2020. Its mission is to enable researchers to produce sustainable software. The project addresses the challenges highlighted in Section II, based on our twofold approach (Figure 1) consisting of technologies and methods in combination with supportive education, to achieve the goals we have identified as essential for software sustainability:
- Reproducibility of results by ensuring the long-term availability of the software and its environment
- Reuse of the software by ensuring its capability to endure and evolve over time
Within the SURESOFT project, TU Braunschweig provides educational resources in the form of training and workshops as well as services and infrastructure to support a workflow using the methods and technologies we have described in this paper. Moreover, there is a continuous dialogue with research software developers to identify their needs and further improve the SURESOFT approach. The long-term goal of SURESOFT is to generate awareness of the importance of software sustainability among scientists and to create the environment that is required to achieve these goals. In such an environment, the value of software and its greater impact on science is recognized by the entire scientific community, thereby enabling researchers to earn scientific reputation for their software development work. So far, a scientist’s intellectual capital is based solely on domain knowledge, whereas software development expertise is seen as just a trivial technical skill. In contrast, in an environment that recognizes the value of software, the development of scientific software will also contribute to a scientist’s career advancement and therefore raise the motivation to acquire the expertise necessary to produce sustainable software. Accordingly, an appropriate amount of time has to be reserved for education and the proper implementation of software. As time is a very limited resource in scientific work, education should start as early as possible in the scientific career. Starting from a dedicated group of researchers within the SURESOFT project at TU Braunschweig, we strive to expand our activities to the whole university and eventually beyond. To initiate the organizational change, we have started by establishing a policy at TU Braunschweig to encourage the development of sustainable research software and its recognition. Furthermore, we provide workshops and trainings as well as the described infrastructure on an organizational level.
Additionally, we will publish and share our educational resources, as well as the experience gained while applying the SURESOFT approach, with a wider audience. In doing so, we hope to contribute to the continuously growing movement within the scientific community towards sustainable research software beyond our university.
The SURESOFT project is funded by the German Research Foundation (DFG) as part of the "e-Research Technologies" funding programme under grants: EG 404/1-1, JA 2329/7-1, KA 3171/12-1, KU 2333/17-1, LA 1403/12-1, LI 2970/1-1 and STU 530/6-1.
 L. Joppa, G. Mcinerny, R. Harper, L. Salido, K. Takeda, K. O’Hara, D. Gavaghan, and S. Emmott, "Troubling trends in scientific software use," Science (New York, N.Y.), vol. 340, pp. 814–815, 05 2013.
 Z. Merali, "Computational science: Error, why scientific programming does not compute," Nature, vol. 467, no. 7317, pp. 775–777, Oct. 2010.
 M. De Rond and A. N. Miller, "Publish or perish: bane or boon of academic life?" Journal of management inquiry, vol. 14, no. 4, pp. 321–329, 2005.
 G. Miller, "A scientist’s nightmare: Software problem leads to five retractions," Science (New York, N.Y.), vol. 314, pp. 1856–7, 01 2007.
 A. Morin, J. Urban, P. D. Adams, I. Foster, A. Sali, D. Baker, and P. Sliz, "Shining light into black boxes," Science, vol. 336, no. 6078, pp. 159–160, 2012.
 M. Baker, "1,500 scientists lift the lid on reproducibility," Nature, vol. 533, no. 7604, pp. 452–454, May 2016.
 C. Collberg, T. Proebsting, G. Moraila, A. Shankaran, Z. Shi, and A. M. Warren, "Measuring reproducibility in computer systems research," Technical report, University of Arizona, Tech. Rep., 2014.
 N. Barnes, "Publish your computer code: it is good enough," Nature, vol. 467, no. 7317, pp. 753–753, 2010.
 T. Miyakawa, "No raw data, no science: another possible source of the reproducibility crisis," Molecular Brain, vol. 13, p. 24, 02 2020.
 M. D. Wilkinson, M. Dumontier, I. J. Aalbersberg, G. Appleton, M. Axton, A. Baak, N. Blomberg, J.-W. Boiten, L. B. da Silva Santos, P. E. Bourne et al., "The fair guiding principles for scientific data management and stewardship," Scientific data, vol. 3, no. 1, pp. 1–9, 2016.
 N. Manola, P. Mutschke, G. Scherp, K. Tochtermann, P. Wittenburg, K. Gregory, W. Hasselbring, K. den Heijer, P. Manghi, and D. V. Uytvanck, "Implementing FAIR Data Infrastructures (Dagstuhl Perspectives Workshop 18472)," Dagstuhl Manifestos, vol. 8, no. 1, pp. 1–34, 2020. [Online]. Available: https://drops.dagstuhl.de/opus/volltexte/2020/13237
 W. Hasselbring, L. Carr, S. Hettrick, H. Packer, and T. Tiropanis, "From fair research data toward fair and open research software," it - Information Technology, vol. 62, no. 1, pp. 39–47, 2020. [Online]. Available: https://doi.org/10.1515/itit-2019-0040
 J. C. Carver, I. A. Cosden, C. Hill, S. Gesing, and D. S. Katz, "Sustaining research software via research software engineers and professional associations," in 2021 IEEE/ACM International Workshop on Body of Knowledge for Software Sustainability (BoKSS), 2021, pp. 23–24.
 B. Penzenstadler, "Towards a definition of sustainability in and for software engineering," in Proceedings of the 28th Annual ACM Symposium on Applied Computing, ser. SAC ’13. New York, NY, USA: Association for Computing Machinery, 2013, pp. 1183–1185.
 C. Venters, R. Capilla, S. Betz, B. Penzenstadler, T. Crick, S. Crouch, E. Nakagawa, C. Becker, and C. Carrillo Sánchez, "Software sustainability: Research and practice from a software architecture viewpoint," Journal of Systems and Software, 2017.
 D. Parnas, "Structured programming: A minor part of software engineering," Inf. Process. Lett., vol. 88, pp. 53–58, 10 2003.
 F. P. Brooks, Jr., "No silver bullet: Essence and accidents of software engineering," Computer, vol. 20, no. 4, pp. 10–19, 1987.
 G. Booch, "The history of software engineering," IEEE Software, vol. 35, no. 5, pp. 108–114, 2018.
 N. Wirth, "A brief history of software engineering," IEEE Annals of the History of Computing, vol. 30, no. 03, pp. 32–39, jul 2008.
 D. L. Parnas, "On the criteria to be used in decomposing systems into modules," Communications of the ACM, vol. 15, no. 12, pp. 1053–1058, December 1972.
 B. Liskov and S. Zilles, "Programming with abstract data types," SIGPLAN Not., vol. 9, no. 4, pp. 50–59, Mar. 1974.
 E. Yourdon and L. L. Constantine, Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design, ser. Yourdon Press computing series. Upper Saddle River, NJ, USA: Prentice-Hall, Inc, 1979.
 D. Saucez and L. Iannone, "Thoughts and recommendations from the acm sigcomm 2017 reproducibility workshop," ACM SIGCOMM Computer Communication Review, vol. 48, no. 1, pp. 70–74, 2018.
 C. Boettiger, "An introduction to docker for reproducible research," ACM SIGOPS Operating Systems Review, vol. 49, no. 1, pp. 71–79, 2015.
 R. Sanders and D. Kelly, "Dealing with risk in scientific software development," IEEE Software, vol. 25, no. 4, pp. 21–28, 2008.
 J. E. Hannay, C. MacLeod, J. Singer, H. P. Langtangen, D. Pfahl, and G. Wilson, "How do scientists develop and use scientific software?" in 2009 ICSE Workshop on Software Engineering for Computational Science and Engineering. Ieee, 2009, pp. 1–8.
 J. Carver, D. Heaton, L. Hochstein, and R. Bartlett, "Self-perceptions about software engineering: A survey of scientists and engineers," Computing in Science Engineering, vol. 15, no. 1, pp. 7–11, 2013.
 V. R. Basili, J. C. Carver, D. Cruzes, L. M. Hochstein, J. K. Hollingsworth, F. Shull, and M. V. Zelkowitz, "Understanding the high-performance-computing community: A software engineer’s perspective," IEEE Software, vol. 25, no. 4, pp. 29–36, 2008.
 S. Faulk, E. Loh, M. L. V. D. Vanter, S. Squires, and L. G. Votta, "Scientific computing’s productivity gridlock: How software engineering can help," Computing in Science Engineering, vol. 11, no. 6, pp. 30–39, Nov 2009.
 G. Balaban, I. Grytten, K. D. Rand, L. Scheffer, and G. K. Sandve, "Ten simple rules for quick and dirty scientific programming," PLOS Computational Biology, vol. 17, no. 3, pp. 1–15, 03 2021. [Online]. Available: https://doi.org/10.1371/journal.pcbi.1008549
 B. Moseley and P. Marks, "Out of the tar pit," Software Practice Advancement (SPA), vol. 2006, 2006.
 B. K. Beaulieu-Jones and C. S. Greene, "Reproducibility of computational workflows is automated using continuous analysis," Nature biotechnology, vol. 35, no. 4, pp. 342–346, 2017.
 O. Mesnard and L. A. Barba, "Reproducible workflow on a public cloud for computational fluid dynamics," Computing in Science & Engineering, vol. 22, no. 1, pp. 102–116, 2019.
 D. F. Kelly, "A software chasm: Software engineering and scientific computing," IEEE Software, vol. 24, no. 6, pp. 120–119, Nov 2007.
 G. M. Kurtzer, V. Sochat, and M. W. Bauer, "Singularity: Scientific containers for mobility of compute," PLOS ONE, vol. 12, no. 5, pp. 1–20, 05 2017. [Online]. Available: https://doi.org/10.1371/journal.pone.0177459
 K. Beck, Extreme Programming Explained: Embrace Change. Addison-Wesley Professional, October 1999.
 C. R. Jacob, S. M. Beyhan, R. E. Bulo, A. S. P. Gomes, A. W. Götz, K. Kiewisch, J. Sikkema, and L. Visscher, "Pyadf—a scripting framework for multiscale quantum chemistry," Journal of computational chemistry, vol. 32, no. 10, pp. 2328–2338, 2011.
 S. Rüsch, K. Bleeke, and R. Kapitza, "Themis: An efficient and memory-safe bft framework in rust: Research statement," in Proceedings of the 3rd Workshop on Scalable and Resilient Infrastructures for Distributed Ledgers, 2019, pp. 9–10.
 L. Lamport, R. Shostak, and M. Pease, "The byzantine generals problem," in Concurrency: the Works of Leslie Lamport, 2019, pp. 203–226.
 M. Castro, B. Liskov et al., "A correctness proof for a practical byzantine-fault-tolerant replication algorithm," Technical Memo MIT/LCS/TM-590, MIT Laboratory for Computer Science, Tech. Rep., 1999.
 C. D. Scott Olson, "Miri: An interpreter for rust’s mid-level intermediate representation," 2016.
 J. Behl, T. Distler, and R. Kapitza, "Hybrids on steroids: Sgx-based high performance bft," in Proceedings of the Twelfth European Conference on Computer Systems, 2017, pp. 222–237.
 "Research code elpaso - elementary parallel solver." [Online]. Available: https://www.tu-braunschweig.de/en/ina/institute/ina-tech/research-code-elpaso
 M. Schauer, J. E. Roman, E. S. Quintana-Ortí, and S. Langer, "Parallel computation of 3-d soil-structure interaction in time domain with a coupled fem/sbfem approach," Journal of Scientific Computing, vol. 52, pp. 446–467, 2012. [Online]. Available: https://doi.org/10.1007/s10915-011-9551-x
 H. K. Sreekumar, C. Blech, and S. C. Langer, "Large-scale vibroacoustic simulations using parallel direct solvers for high-performance clusters," in Proceedings of the Annual Conference on Acoustics. DAGA, 2021, pp. 864–867.
 C. Blech, C. K. Appel, R. Ewert, J. W. Delfs, and S. C. Langer, "Numerical prediction of passenger cabin noise due to jet noise by an ultra–high–bypass ratio engine," Journal of Sound and Vibration, vol. 464, p. 114960, 2020. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0022460X1930522X
 D. M. Rose, J. Baumgarten, S. Hahn, and T. Kürner, "Simone - simulator for mobile networks: System-level simulations in the context of realistic scenarios," in 2015 IEEE 81st Vehicular Technology Conference (VTC Spring), 2015, pp. 1–7.
 S. Geller, C. Janßen, and M. Krafczyk, A Lattice Boltzmann Approach for Distributed Three-dimensional Fluid-Structure Interaction, 01 2013, pp. 199–216.
 D. Adekanye, M. Geier, and M. Schönherr, "Parallel lattice boltzmann method models of dilute gravity currents: Virtual fluids docker image," Dec 2021, this software belongs to the paper publication: Adekanye et al. (2021): Parallel Lattice Boltzmann Method Models of Dilute Gravity Currents. Computers and Fluids (in Review). [Online]. Available: https://publikationsserver.tu-braunschweig.de/receive/dbbs_mods_00069921
 V. Ayer, C. Pietsch, J. Vompras, J. Schirrwagen, C. Wiljes, N. Jahn, and P. Cimiano, "Conquaire: Towards an architecture supporting continuous quality control to ensure reproducibility of research," D-Lib Magazine, vol. 23, no. 1/2, 2017.
 R. Di Cosmo and S. Zacchiroli, "Software heritage: Why and how to preserve software source code," 2017.
 F. Rapp, S. Kombrink, V. Kushnarenko, M. Fratz, and D. Scharon, "Sara-dienst: Software langfristig verfügbar machen," o-bib: Das offene Bibliotheksjournal, vol. 5, no. 2, pp. 92–105, 2018.
 F. Bartusch, J. Kruger, K. Rechert, O. Zharkov, and K. Udod, "Citar-citing & archiving research," 2019.
 J. B. Buckheit and D. L. Donoho, "Wavelab and reproducible research," in Wavelets and statistics. Springer, 1995, pp. 55–81.
 P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, "Xen and the art of virtualization," ACM SIGOPS operating systems review, vol. 37, no. 5, pp. 164–177, 2003.
 B. Clark, T. Deshane, E. M. Dow, S. Evanchik, M. Finlayson, J. Herne, and J. N. Matthews, "Xen and the art of repeated research." in USENIX Annual Technical Conference, FREENIX Track, 2004, pp. 135–144.
 H. Howard, M. Schwarzkopf, A. Madhavapeddy, and J. Crowcroft, "Raft refloated: Do we have consensus?" ACM SIGOPS Operating Systems Review, vol. 49, no. 1, pp. 12–21, 2015.
 M. Satyanarayanan, G. S. Clair, B. Gilbert, J. Harkes, D. Ryan, E. Linke, and K. Webster, "Olive: Sustaining executable content over decades," 2014.
 B. Grüning, J. Chilton, J. Köster, R. Dale, N. Soranzo, M. van den Beek, J. Goecks, R. Backofen, A. Nekrutenko, and J. Taylor, "Practical computational reproducibility in the life sciences," Cell systems, vol. 6, no. 6, pp. 631–635, 2018.
 G. Wilson, "Software carpentry: Getting scientists to write better code by making them more productive," Computing in Science Engineering, vol. 8, no. 6, pp. 66–69, Nov 2006.
 P. Messina, "Gaining the broad expertise needed for high-end computational science and engineering research," Computing in Science Engineering, vol. 17, no. 2, pp. 89–90, 2015.