If you have any budgetary power at your university, you need to contact whoever is in charge of overseeing the compliance of federally-funded research. If you are that person in the office of sponsored programs, you need to contact your libraries to identify who is involved in research data management (RDM). And finally, if you are a head librarian who supervises anyone involved in RDM, ask them to write up a full memo detailing the staffing and support necessary to run a full shop, and tell them not to be shy about it. University leadership, offices of sponsored programs, and libraries need to hire research data management librarians and specialists, and soon. Move a mountain and make it happen.
By order of the recent White House Office of Science and Technology Policy (OSTP) memo, by the end of 2025, faculty conducting federally-funded research will need to deposit their underlying research data, immediately and without embargo. Even the data that doesn’t lead to publication. While data deposit was part of the previous OSTP guidance from 2013, it only applied to federal agencies with more than $100M in R&D expenditures. The more recent 2022 Nelson Memo extends to all agencies, which are now working with “OSTP to update their public access and data sharing plans by mid-2023.”
It’s unclear how intensely federal agencies have scrutinized their researchers’ compliance with regard to data deposit since 2013, but given that the 2022 Memo mentions the word data 50 times compared to the 31 mentions in 2013, there seems to be a renewed emphasis on its importance.
DATA OR IT DIDN’T HAPPEN
As recently as 2020, journal editors observed that authors were not able to produce raw data when requested. A 2022 study investigating the follow-through on Data Availability Statements found that only 6.8% of authors actually provided data upon request. It seems as though the days of publishing the paper and kicking the data deposit down the road are gone. If your data availability statement says “upon reasonable request,” any editor worth their salt will see your funder info and ask you to revise it to include the link to the deposited data.
In a recent column, Dr. Nelson and two other members of the WHOSTP continued to emphasize the importance of data availability. They also spoke to the safety concerns that data sharing poses, noting that “[s]upporting public access also means working to prevent the misuse of research and data by actors that seek to do harm … [p]rivacy and security must be protected, even as federally funded research becomes more open.” If potentially a majority of researchers aren’t familiar with the steps to deposit their research data, what chance is there that a majority of them understand the FAIR Guiding Principles for scientific data management and stewardship or the CARE Principles for Indigenous Data Governance? The point here is that without appropriate metadata, data-sharing mandates are pointless.
Even if federally-funded researchers are aware of which repositories have met the “Desirable Characteristics of Data Repositories for Federally Funded Data,” will they have the ability or time to figure out how to deposit the data themselves? Maybe that sounds a bit silly, but the NIH’s public access policy has developed an entire “Method B” procedure that includes a list of journals and publishers that will arrange to deposit an author’s accepted manuscript into PubMed Central (the NIH’s own repository) for a fee.
IF YOU BUILD IT…
This is where your soon-to-be beefed-up teams of research data management librarians and specialists come in. Depositing a paper in a repository is not even that hard. Preparing research data for deposit? That takes a lot of labor and expertise. The labor-intensive aspect of bringing data up to FAIR standards is “something many researchers described as a serious barrier to participating in formal data sharing.” As for expertise, research data librarians and other information specialists (of which I am neither) have already accrued it and organized around it in one form or another.
The federal government is ready to help chip in for whatever seems the best option it has before it. It cannot choose your university’s research data management team unless that team exists. Once it does, research grants will support the necessary personnel and infrastructure that you have begun to put in place. (Be bold, by the way. Scared money doesn’t entice federal funding.) You can rely on this income stream because it is federal research funding, and if that were to dry up, the larger existential threats you would then face would reduce concerns about this venture to nil.
Expect to have some up-front costs, but remember: that’s what making an investment in your own institution feels like. If it’s not your institution that hosts this service, it will be a third-party vendor with a fiduciary responsibility to its investors, not to your budget. Besides, you are going to charge a fee for these services. As a starting point, for every dataset arising from federal funding that your institution processes, charge a fee of $2,000. Maybe more for larger datasets. This $2,000 amount is slightly less than 0.4% of $504,805, the average NIH research award between 2011 and 2021.
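The percentage quoted above is easy to check. Here is a minimal Python sketch using only the two figures from this piece; the function name `fee_share_percent` is a made-up helper for illustration, not part of any funder's or institution's tooling.

```python
def fee_share_percent(fee: float, award: float) -> float:
    """Return the fee expressed as a percentage of the award."""
    if award <= 0:
        raise ValueError("award must be positive")
    return fee / award * 100


# Both figures come from the paragraph above.
FLAT_FEE = 2_000             # proposed per-dataset fee, in dollars
AVERAGE_NIH_AWARD = 504_805  # average NIH research award, 2011-2021, in dollars

share = fee_share_percent(FLAT_FEE, AVERAGE_NIH_AWARD)
print(f"{share:.3f}% of the average award")  # prints "0.396% of the average award"
```

Swapping in your own institution's typical award size, or a higher fee for larger datasets, gives a quick sense of how small the line item would look in a grant budget.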
What’s my stake in this? I have no expertise in data nor aspirations toward gaining it. My normal area of concern is equitable and open access publishing and its infrastructure. Much of the ground has been ceded to the big commercial publishers whose business models are founded on making the reading and authoring of scholarly texts an exclusive activity.
But research data and its management is a new front for this fight, given the refreshed emphasis both from the grassroots library and open science communities as well as national funder mandates, including the most recent memo from the U.S. As I’ve written before, “over-reliant outsourcing to the big commercial vendors is always a mistake.”
University leadership does what it can to retain public funding. Those of us not in that position should advocate to ensure that the funding we do receive stays within the institution where it can continue to support the broader public mission.
As a roadmap, start with a few fee-based research projects. This will give your institution’s RDM team time to work out its workflow. It may become clear that further support is necessary; provide it. Have your sponsored programs coordinator be very clear in their conversations with federal agencies that this is in the works and that they should expect it as an option.
By mid-2025, you will have a year or two of experience under your belt, assuming you begin this process next year in 2023. The federal agencies your institution most often works with will also have a year or two of experience working with your team. Hopefully the experience will have been good on both sides, and those federal agencies can recommend your service, and similar services, and include your established services in their future compliance plans.
After 2025, become more ambitious in your aims. Apply for a five-year grant for the team itself. Offer services outside of your institution for a fee. Get a feel for the market. Build it up as large as you can sustain. Your outside clients will likely be institutions smaller than yours, because otherwise they would have built teams of their own. After your five-year grant runs out, apply for another.
You will know this experiment has succeeded when commenters (like me) begin to note that the practice of big institutions leeching federal funds from smaller institutions is extractive. It will be extractive. But the extraction could have been much worse, if you had left it to the billion-dollar commercial publishing industry. Defend yourself by pointing to the social infrastructure you helped grow for data specialists at smaller institutions to ramp up similar services more efficiently. Established teams (like yours) will be there to provide training. This will happen because it will have been librarians you hired to populate the core services. And service to the profession is simply what we librarians tend to do. And do well.
The author would like to acknowledge and thank Teresa Schultz, Scholarly Communications & Social Sciences Librarian at the University of Nevada, Reno, for feedback and input. All errors and omissions should be attributed to the author solely.