Research news

Why and how open source accelerates scientific progress

Photo by ThisisEngineering

Publish Date: 07.03.2022

Category: Our contribution to sustainable development goals

Sustainable development goals: 4 Quality education, 10 Reduced inequalities, 16 Peace, justice and strong institutions, 17 Partnerships for the goals

“It’s not even about open source; it’s all about the open-source community” is the thesis put forward by Prof Janko Slavič, PhD, from the University of Ljubljana’s Faculty of Mechanical Engineering. The thesis is supported by an article co-written with 110 authors and published in the renowned scientific journal Nature Methods. The article was selected as one of the most outstanding research achievements in the field of natural sciences and technology and received the designation “Excellent in Science 2021”. The selection is made every year by the Slovenian Research Agency.

In recent years, the scientific community has been intensively discussing open-access publishing, i.e. making publications of scientific research available to everyone. Scientific research is often based on numerical methods or computational procedures implemented in various types of software. If this program code is also made publicly available, the research becomes more reproducible and far less effort is wasted on re-developing similar code.

According to Matthew Rocklin (open-source researcher, Dask project manager, founder of Coiled), there are seven stages of open software. Stage 1 is publicly visible source code. Stage 2 is achieved once the code is licensed for reuse and can be used freely. In this context, free should be understood as in freedom, not as in free beer. Stage 3 is achieved when the project starts accepting contributions and additions from others. Stage 4 involves open development, meaning that communication is open to the public. Stage 5 is open decision-making. The penultimate stage, stage 6, is multi-institution engagement: several institutions take part in the development, so that development, communication and decision-making are distributed among them. And so we arrive at stage 7, when the original author of the code can retire knowing that the software will live on.

Prof Slavič collaborated on the article as part of the SciPy 1.0 Contributors group. SciPy is a software package whose development was started in 2001 by a group of researchers with no formal education in programming. They were guided by the aim of establishing an inclusive, open community with long-term goals. This group defined the structure of the package, which has remained largely intact to this day. The SciPy open-source community was soon noticed by other open-source communities (e.g. the community developing Python), major companies (such as Google) and the wider scientific community. As early as 2007, the IEEE journal Computing in Science and Engineering published a special issue devoted solely to the Python programming language in science. Other publications soon followed, focusing on specific scientific fields, such as the 2015 special issue of the journal Frontiers in Neuroinformatics, “Python in Neuroscience”.

Photo by Christina@woicentechchat.com

In 2014, when Prof Slavič first became involved, the SciPy community comprised fewer than 100 researchers. Today, it numbers more than 1,100. The SciPy package is used by more than 400,000 repositories on GitHub and records approximately 1 million downloads per day from pypi.org.

Prof Slavič stresses that open source enables greater reproducibility of scientific research, faster uptake and further scientific development. With regard to the latter, open source is merely the first step: actual success is possible only once the subsequent steps through the seven stages of open software have been taken. Prof Slavič views the publication in the prestigious scientific journal Nature Methods as recognition of the entire open-source community of scientists. Between its publication in March 2020 and 6 December 2021, the article was cited over 4,000 times in the Scopus database, which places it among the most frequently cited articles worldwide in the past year.

 
