Preparing for the costs of data management and sharing plans

Data management and sharing plans are becoming a more common requirement from funders. The NSF requires a data management plan (DMP) for grant submissions, which includes a section describing how data will be shared. The NIH has released a draft of a data management and sharing policy that, when implemented, will require a description of how data will be maintained throughout the project and shared at the project’s conclusion. While accelerating discovery and making scientific research more transparent are among the goals of FAIR (Findable, Accessible, Interoperable, Reusable) data sharing efforts, the costs of managing and preserving increasingly large and complex datasets are a concern.

The main cost drivers of data management and sharing plans are personnel and infrastructure. Personnel costs include the effort required to make datasets meet the FAIR principles (e.g., creating data dictionaries and README files) and to prepare datasets for submission to public repositories. Infrastructure costs include data storage, archiving, and preservation, as well as fees associated with submitting datasets to public repositories. The good news for researchers is that both the NSF and NIH allow costs related to these tasks to be included in grant budgets. It can, however, be difficult to predict the magnitude of these costs.

To address this issue, the National Academies of Sciences, Engineering, and Medicine (NASEM) created a task force to develop a framework for forecasting the costs of preserving, archiving, and accessing data [1]. The task force published a report and recently presented a webinar detailing approaches researchers can use to forecast the costs of managing and sharing their own data. The webinar and report cover issues such as identifying the characteristics of the data, the personnel and infrastructure necessary to maintain the data, and the potential value of the data.

At Washington University, researchers have numerous options for data storage, archiving, and preservation, including the RIS Research Data Storage Platform, REDCap, and many other cloud- and server-based data storage platforms. Many of these options include an allocation of free storage, with additional storage available under tiered pricing models. For data sharing, the NIH supports many institute- and discipline-specific repositories that are often free for researchers whose data fit within the repository's scope. If your data do not fit into any of these repositories, Washington University researchers can submit their data to the Digital Research Materials Repository (DRMR), supported by Data Services at University Libraries. There are also many generalist repositories, many of which are free or low cost. A comparison of many generalist repositories including pricing can be found here.
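
As a rough illustration of how a free allocation plus tiered pricing can be turned into a budget line, the following Python sketch estimates an annual storage cost and a simple project total. Every number, tier boundary, and name in it (`Tier`, `annual_storage_cost`, `estimate_total`) is a hypothetical placeholder, not the pricing of RIS, REDCap, or any repository; substitute the rates published by the service you actually plan to use.

```python
import math
from dataclasses import dataclass


@dataclass
class Tier:
    """One pricing tier, measured in billable TB (after the free allocation)."""
    upper_tb: float           # upper bound of the tier; math.inf for the last tier
    price_per_tb_year: float  # hypothetical price per TB per year


def annual_storage_cost(size_tb, free_tb, tiers):
    """Annual cost of storing size_tb, given a free allocation and ordered tiers."""
    billable = max(0.0, size_tb - free_tb)
    cost = 0.0
    lower = 0.0  # lower bound of the current tier
    for tier in tiers:
        if billable <= lower:
            break
        upper = min(billable, tier.upper_tb)
        cost += (upper - lower) * tier.price_per_tb_year
        lower = upper
    return cost


def estimate_total(personnel_hours, hourly_rate, annual_storage, years, deposit_fee=0.0):
    """Personnel effort + storage over the project period + one-time repository fee."""
    return personnel_hours * hourly_rate + annual_storage * years + deposit_fee


# Hypothetical example: 12 TB dataset, 5 TB free, two made-up pricing tiers.
example_tiers = [Tier(upper_tb=5.0, price_per_tb_year=100.0),
                 Tier(upper_tb=math.inf, price_per_tb_year=60.0)]

if __name__ == "__main__":
    storage = annual_storage_cost(12.0, 5.0, example_tiers)
    total = estimate_total(personnel_hours=40, hourly_rate=50.0,
                           annual_storage=storage, years=3)
    print(f"Annual storage: ${storage:,.2f}; project total: ${total:,.2f}")
```

A sketch like this covers only the arithmetic. The harder work, as the NASEM report discusses, is deciding how large the data will be, how long they must be kept, and how much staff effort curation will take.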

If you need assistance developing a data management and sharing plan, choosing a data repository, or preparing a dataset for submission to a repository, please complete our data sharing help request form or email Chris Sorensen at sorensenc@wustl.edu. You can also find materials from previous workshops covering these topics on our Data Management and Sharing webpage.