Data sharing

Wisdom
Some thoughts on data sharing in science
Author

Wolfgang Huber

Published

2024-01-22

There was a discussion on the burden of data sharing over in the bad place. Here are some takeaways from me:

Data sharing¹ is hard work. The analogy “data is the new oil” is a cliché, but it does work here. You don’t just dig a hole in the desert and shovel the result directly into your car’s fuel tank; there is a trillion-dollar industry and infrastructure that makes such things work².

Data submission to repositories should be easy for the data-producing researchers, who are already busy enough being creative, productive, innovative, etc. Science funders and science-performing institutions need to become more serious about supporting this task through core services, automation, and submitter-friendly public data repositories. This needs many more research data curators and software engineers, both at research-performing and at repository-providing institutions, and sustainable career options for such individuals.

Having a data management plan that includes the final sharing of the data should be built into any research project as early as possible. This may sound idealistic, but we have become quite used to expecting that any research project ends with some sort of publication, including an introduction, a complete methods part, a logical chain of results, a conclusion, a discussion, and complete references. Getting these together for paper submission involves considerable planning and work that is best started as early as possible. Data management could be handled analogously, and it is actually far more predictable.

Of course you (as a researcher) can always cut corners. There seems to be a trade-off³: more papers, each of lower quality and on average less impact, versus fewer papers of higher quality.

Footnotes

  1. More precisely: making data FAIR (findable, accessible, interoperable, and reusable).

  2. Luckily, the world is now moving from fossil to renewable energy, but the idea remains the same.

  3. I’m not judging the two sides; e.g., there can be legitimate reasons to write many small papers, such as the need for trainees to each have “their” paper.