Promoting FAIR Data Through Community-driven Agile Design: the Open Data Commons for Spinal Cord Injury (odc-sci.org)

The past decade has seen accelerating movement from data protectionism in publishing toward open data sharing to improve reproducibility and translation of biomedical research. Developing data sharing infrastructures to meet these new demands remains a challenge. One model for data sharing involves simply attaching data, irrespective of its type, to publisher websites or general use repositories. However, some argue this creates a ‘data dump’ that does not promote the goals of making data Findable, Accessible, Interoperable and Reusable (FAIR). Specialized data sharing communities offer an alternative model where data are curated by domain experts to make it both open and FAIR. We report on our experiences developing one such data-sharing ecosystem focusing on ‘long-tail’ preclinical data, the Open Data Commons for Spinal Cord Injury (odc-sci.org). ODC-SCI was developed with community-based agile design requirements directly pulled from a series of workshops with multiple stakeholders (researchers, consumers, non-profit funders, governmental agencies, journals, and industry members). ODC-SCI focuses on heterogeneous tabular data collected by preclinical researchers including bio-behaviour, histopathology findings and molecular endpoints. This has led to an example of a specialized neurocommons that is well-embraced by the community it aims to serve. In the present paper, we provide a review of the community-based design template and describe the adoption by the community including a high-level review of current data assets, publicly released datasets, and web analytics. Although odc-sci.org is in its late beta stage of development, it represents a successful example of a specialized data commons that may serve as a model for other fields.
Keywords: Data sharing, FAIR, spinal cord injury, neurotrauma, data reuse, community repository

* Karim Fouad, kfouad@ualberta.ca
* Adam R. Ferguson, adam.ferguson@ucsf.edu

Weill Institute for Neurosciences, Brain and Spinal Injury Center, Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
Department of Neuroscience, University of California, San Diego, San Diego, CA, USA
Faculty of Rehabilitation Medicine and the Neuroscience and Mental Health Institute, University of Alberta, Edmonton, AB, Canada
Spinal Cord and Brain Injury Research Center, Department of Physiology, University of Kentucky College of Medicine, Lexington, KY, USA
San Francisco Veterans Affairs Health Care System, San Francisco, CA, USA

Neuroinform (2022) 20:203–219

Introduction

In the last decade, the need for more openness and transparency in scientific research has become apparent, with calls from funders, journals, lawmakers, researchers, and the general public to improve the self-correcting nature of the scientific literature. The fact that many experiments are not published, the lack of transparency of published studies, and the difficulty of accessing the source data underlying published results have been recognized as significant barriers to reproducibility (Chan et al., 2014). Data openness and sharing are potential solutions to deal with some of these barriers and increase the value of research by providing direct access to data for conducting replication and large-scale pooled studies at the subject level (Ferguson et al., 2014). Although the benefits of sharing research data have long been recognized within computational fields such as imaging neuroinformatics, ‘omics and clinical informatics (Kennedy, 2012; Piwowar et al., 2007; Pronk et al., 2015; Roundtable on Environmental Health Sciences et al., 2016), building the necessary social culture and infrastructure for data-sharing in other fields with non-standardized heterogeneous preclinical data of diverse types (‘long-tail’ data) remains a challenge (Borgman, 2012; Callahan et al., 2017; Ferguson et al., 2014; Roche et al., 2014; Tenopir et al., 2011). Relatively recent movements calling for accountability and transparency, such as open science, are challenging the traditional scientific dissemination establishment based on scientific narratives in papers. These movements have generated new forms of academic merit-based data sharing that contest the culture of data protectionism. At the same time, the lack of dedicated digital infrastructures has created a ‘semi-adoption’ of data sharing. For example, there has been an increase in the number of digital platforms for massive digital sharing, and some journals have started hosting data files. There is no doubt that these all-purpose solutions are better than non-sharing, but increasing evidence suggests that “data deposition” without curation and proper documentation might not be sufficient for achieving the goals of data sharing for reproducibility. In order to accommodate every possible need, all-purpose data repositories impose little or no requirements on file format, data structure, and documentation, making it difficult to reuse and integrate these data with other data resources. Additionally, such data repositories can create a false sense of accomplishment where researchers believe they have contributed to data sharing. However, if the data is not actually interpretable by humans, digitally interoperable by software systems, and reusable, the act is ultimately insufficient despite superficially appearing in line with the cultural movement towards open data and open science. Studies based on game theory suggest that data sharing might be beneficial if a collaborative approach is taken and data sharing is embraced as a community rather than by individuals (Pronk et al., 2015). Thus, there is a need for solutions that elevate the quality and value of shared data for reusability, which can be achieved through dedicated data services and collective efforts for specific research communities.

An important step forward is the cultural adoption of the ‘FAIR data principles’ (Wilkinson et al., 2016), a set of recommendations establishing a framework for data sharing stating that data should be Findable, Accessible, Interoperable and Reusable (FAIR). The first two are relatively easy to implement technically, although they do require a cultural shift for the research community to embrace data sharing. The all-purpose data sharing platforms do a great job of making shared data findable and accessible, offering solutions for most researchers and lowering the bar for cultural adoption of data sharing. However, increasing the utility of shared data and FAIRness requires the hosted data, and data-related resources, to be interoperable and reusable. Achieving these latter two principles requires overcoming additional engineering and data management challenges atop cultural adoption. Community-driven and problem-specific infrastructures can overcome both the sociocultural and the technical challenges to achieve FAIR data sharing. However, community acceptance and financial support are essential.

Here, we report on the adoption of FAIR data principles by the field of spinal cord injury (SCI) research, offering an example of sociocultural and technical embracement of data sharing and FAIR data principles by a specific research community. Preclinical SCI research produces diverse neuromotor recovery behavioral measures in rats, mice, nonhuman primates, and pooled de-identified human data. These neuro-behavioral data are often combined with histopathological ratings of postmortem tissue and a variety of molecular endpoints, with data often collected in an ad hoc fashion in the same individuals over time (e.g., Ferguson et al., 2013; Kyritsis et al., 2021). Both clinical and preclinical research have worked to promote the cultural adoption of data sharing and standardization in the SCI research community after many years of collective action. For example, the creation of clinical SCI data repositories such as the National Spinal Cord Injury Model System Database (www.nscisc.uab.edu) in 1973 (DeVivo et al., 2002) and the European Multicenter Study about Spinal Cord Injury (EMSCI - emsci.org) in 2004 (Curt et al., 2004) has been instrumental for the community to understand the value of data gathering and integration. More recently, the International Spinal Cord Society (ISCoS), the American Spinal Injury Association (ASIA), and the National Institute of Neurological Disorders and Stroke (NINDS) joined efforts to develop standards such as common data elements (CDEs) for the collection and reporting of clinical research data (Biering-Sørensen et al., 2015; Charlifue et al., 2016).

The pre-clinical SCI research community similarly gained valuable collective experience leading to the current stage of data sharing. The NINDS-funded projects Multicenter Animal Spinal Cord Injury Study (MASCIS) in the 1990s (Basso et al., 1995, 1996; Young, 2002) and Facilities of Research Excellence in Spinal Cord Injury (FORE-SCI) in the 2000s (Aguilar & Steward, 2010; Anderson et al., 2009; Steward et al., 2012) led to the development of standards and procedures for SCI research in current use across the globe. The events preceding FAIR sharing in pre-clinical SCI research have accelerated in the last decade, resulting in the development of minimal reporting expectations for preclinical SCI research (MIASCI) (Lemmon et al., 2014), a knowledge base and ontology for integration of SCI research data compatible with terminology standards (RegenBase) (Callahan et al., 2016), and the curation of the Visualized Syndromic Information and Outcomes for Neurotrauma (VISION-SCI) multicenter, multi-species SCI dataset (Nielson et al., 2014). It is noteworthy that these efforts tackle diverse data types beyond those covered under standardized imaging modalities supported by the Brain Imaging Data Structure (BIDS) (Gorgolewski et al., 2016), ‘omics data standards (Chervitz et al., 2011), clinical physiological data standards such as Neurodata Without Borders (NWB) (Rübel et al., 2019; Teeters et al., 2015), and health informatics such as Observational Medical Outcomes Partnership (OMOP) standards.

In parallel to these efforts, some important events have generated momentum for a cultural shift in biomedical research in general: the acknowledgment of a reproducibility crisis (Begley & Ioannidis, 2015; Ioannidis, 2005; Macleod et al., 2014; Schulz et al., 2016; Steward et al., 2012), a lack of translation of preclinical research into clinical care (Lammertse, 2013; Seyhan, 2019), and the growth of the open access and open science movements (Laakso et al., 2011). In the SCI community in particular, important events include: (1) the success of VISION-SCI in recovering and repurposing data for new discoveries (Nielson et al., 2015); (2) the development of FAIR data principles (Wilkinson et al., 2016); (3) the endorsement of FAIR by NIH and other funding agencies; (4) the Craig H. Neilsen Foundation awarding the project that seeded the ODC-SCI (“Open Data Commons for Spinal Cord Injury Research” in 2016 to ARF); (5) the generalized support of funders to the ODC-SCI effort (Wings for Life Foundation, International Spinal Research Trust, Rick Hansen Institute, the US Veterans Affairs, the Department of Defense Congressionally Directed Medical Research Program) (Fouad et al., 2020); and (6) the SCI 2020 meeting hosted by NINDS. These events have been key elements in bringing the SCI research community together, providing the cultural environment that has ultimately allowed for the development of FAIR data sharing in SCI research.

Based on this prior work, the community has directly embraced FAIR data sharing by developing and launching the Open Data Commons for SCI (ODC-SCI, odc-sci.org), a platform to share tabular data of research in the field of spinal cord injury. This included the development of a leadership plan with term limits, orderly leadership succession, and proactive change management (Callahan et al., 2017; Fouad et al., 2020). The ODC-SCI is a community-based data sharing infrastructure with the goal of democratizing SCI research data by allowing users to access existing data, contribute new data, and utilize and create user-friendly tools for analytics and SCI knowledge-discovery, all within FAIR guidelines. The goal of the present paper is to provide historical context and illustrate how members of research communities can work together toward the development of dedicated data sharing initiatives under the umbrella of FAIR. Our major conclusion is that the development and adoption of FAIR principles by a research community may require several years of collective effort by multiple stakeholders.

Methods

The process of bringing the SCI community together around FAIR is an ongoing continuum, but we have conceptualized the set of events in three stages of community involvement using agile design principles (Fig. 1): (1) bringing FAIR to the SCI community; (2) adapting FAIR to the specific challenges of SCI research; and (3) responding to community feedback. Moreover, a fourth stage of establishment, consolidation, and maturity has recently been ensured by new funding (“Facilitating SCI research, translation and transparency: Going Public with the Open Data Commons”) through a multi-agency funding mechanism (KF, contact PI) and the continuous support of multiple stakeholders (SCI foundations, SCI community organizers and advocates, publishers, governmental agencies, industry representatives, among others) (Fouad et al., 2020). Below we detail the stages and how they came about.

Fig. 1 Staged development. We have divided the process by which the ODC-SCI and the SCI data-sharing community have come together into 4 stages (A). The first three stages seeded the foundations for ODC-SCI, and stage 4, which has recently started, will bring ODC-SCI to maturity. During these stages the engagement with the SCI data-sharing community and the development of tools have occurred in parallel, in both cases using agile design principles (B). These consist of performing a requirement analysis (e.g., ask the community what data needs to be shared), followed by a period of design and development of tools and policies, and a period of feedback (testing) by the users and the community. When the implementation satisfies the requirements, the new functionalities can be incorporated into the ODC-SCI.

Stage One: Bringing FAIR to the Community

In 2016, the NINDS in collaboration with the ODC-SCI consortium hosted the “Developing a FAIR Share Community” workshop with different representatives of the SCI community to discuss data sharing in SCI (Callahan et al., 2017). The workshop was co-sponsored by the NINDS and the University of Alberta, with contributions from the International Spinal Research Trust, the Rick Hansen Institute, Wings for Life, and the Craig H. Neilsen Foundation. The goal was to have an open conversation with different SCI stakeholders about the readiness of the field for the challenge of data sharing and to develop a path toward adopting FAIR in the SCI community. The development and outcomes of that meeting are thoroughly discussed elsewhere (Callahan et al., 2017), though a summary of the conclusions is of interest here. The SCI community was receptive to data sharing at the time the workshop took place. This was demonstrated by polling responses that suggested the willingness of the participants to share data to some degree and by the collective efforts preceding the meeting as described above. However, there was disagreement on how open or restrictive sharing should be (i.e., available to the public or under access control). Major challenges and needs towards adopting FAIR were also identified. Community members voiced concerns about: (1) the added time required for sharing data, (2) the lack of specific funding mechanisms for researchers to practice data sharing, (3) the absence of dedicated infrastructures for sharing, discovery and reuse of SCI data, (4) the need for better standards allowing for augmented interoperability across the community, (5) the implementation of mechanisms to protect the intellectual property of the data owner, (6) the proper attribution allowing for data citation, and (7) the need to ensure the stability of the system.

The result of the 2016 meeting established a roadmap for FAIR adoption by the community. The development of the odc-sci.org platform moved forward to respond to the needs expressed by the community and establish the initial requirements to operationalize FAIR. At the same time, a steering committee was formed with individuals representing different sectors of the community to oversee the development of ODC-SCI. Moreover, broader community thoughts and ideas continued to be collected through outreach activities and presentations in scientific meetings about ODC-SCI and FAIR data sharing in SCI after the 2016 meeting.

Stage Two: Adapting FAIR to the Community

One year after the first meeting, the “FAIR SCI Ahead: the Evolution of the Open Data Commons for Spinal Cord Injury research” workshop took place as a satellite event during the 2017 SFN (Society for Neuroscience) meeting in Washington DC. This second meeting was co-sponsored by Wings for Life, International Spinal Research Trust, Craig H. Neilsen Foundation, and NINDS. The goal was to discuss fundamental questions with the SCI community about how to adopt FAIR data sharing and develop specific policies that govern sharing in the community. These questions were derived from the challenges the community expressed during the first meeting, with the intention to create a concrete list of actions toward FAIR data sharing. The detailed results of this second meeting are documented in Fouad et al. (2020). Briefly, the community-driven recommendations can be summarized as: (1) The data to be shared should be individual-subject data in tabular form that underlie analysis rather than the raw data (e.g., images in histological analysis), in order to balance the technical and sociocultural challenges. Such data could also include data from ‘failed’ experiments that are difficult to publish because of publication bias (Macleod et al., 2009; Watzlawick et al., 2014, 2019). (2) The user permissions and rights that describe who can view and use the data need a flexible policy, allowing for different members of the community to adapt the system to their specific needs. (3) Data curation and quality control processes should be put in place with enough flexibility to accommodate different goals that members of the community might have when sharing/accessing data. The creation of a ‘curation board’ formed by members of the community with expertise in SCI research was suggested. (4) The community felt that there is a need for ‘minimum information’ metadata allowing users to understand the shared data at a high level. Similarly, the adoption and augmentation of existing standards like MIASCI (Lemmon et al., 2014) and the RegenBase ontology (Callahan et al., 2016) would be useful. (5) Users should gain credit for data sharing efforts while ensuring that this does not hamper the utility of the data. A license that legally binds the data (re)-user to give appropriate credit to the data creator (e.g., creative commons CC-BY) was recommended by the community. (6) The use of digital object identifiers (DOIs) was approved as a viable mechanism to generate citable units that would credit researchers for sharing their data.

Stage Three: Community feedback and Testing of odc-sci.org

With the general agreements regarding issues such as models of data access, quality control, and licenses in place, the ODC-SCI team implemented features into the platform to accordingly realize the vision that the SCI community understood as FAIR and found acceptable for data sharing. After several months of internal testing, a beta release (a version to be tested by users outside of the developing team) was made available during 2018. During the first period of odc-sci.org testing with a small group of external users, it rapidly became apparent that guiding users through the structure, functionalities, and workflow of the odc-sci.org was not an easy task. There was a notable learning curve associated with understanding the process and navigating through the site (e.g., from registering an account to uploading data and applying for a DOI). Members of the development team had to dedicate time to explain the ODC-SCI portal to users individually, which quickly created a bottleneck for utilization. In order to reach a broader audience and encourage more members of the SCI community to join the FAIR share movement, a third workshop entitled ‘SCI Team Research Enabling Expansion and Translation of FAIR’ (STREET-FAIR) was held as a satellite event at the 2018 SFN Neuroscience meeting. The STREET-FAIR was supported by the International Neuroinformatics Coordinating Facility (INCF) with the goal of promoting FAIR data sharing principles to the SCI community by (1) providing an update on the ODC-SCI portal and its support for FAIR data sharing and (2) encouraging participation in the odc-sci.org while offering one-on-one guidance for the participants to explore the portal and progress on their way to sharing data. To make the session practical and interactive, participants were challenged to use their own data as a working example in a hackathon-style format.

Upon initiation of the workshop, we realized that the ODC-SCI system was not prepared to handle the volume of network traffic of all the participants at once. The problem was rapidly detected and corrected on-site but clearly highlighted the value of a large-scale demonstration for beta testing the site to reveal unforeseen problems. Other challenges that were found during the course included technical bugs. For example, one participant observed that the ‘0’s in a dataset were transformed to blank cells in the uploaded data as a result of the ODC’s upload process misinterpreting values of “0” as missing data. Others mentioned the difficulty when using specific web browsers, pointing towards software compatibility issues. These and other technical issues raised during the workshop were rapidly corrected in updates to the platform following the conclusion of the meeting. Moreover, participants pointed out the need for improving self-explanatory tutorials and help materials that would facilitate the learning experience for those who could not attend the workshop. Notably, beyond identifying points for improvement, the workshop provided opportunities to stress test more uncommon features. For instance, participants who did not bring their laptops instead accessed the ODC-SCI through smartphones and tablets and helped verify that the site was functional on mobile devices and browsers.

The organizers and participants of the meeting concluded that having educational hands-on workshops is an instrumental tool for bringing awareness of the FAIR data sharing efforts to the broader community. Equally important, getting direct and indirect feedback (i.e., gathering opinions or watching users interact with the system using their own data) from community members representing different users with different goals is essential for the success of the collective effort. It is important to stress that when working on their own data rather than test data, users are likely to be more engaged and may notice errors in the systems more readily. The impact of conducting the STREET-FAIR meeting is evident in both the increasing usability and robustness of the system, as well as in community adoption.

During the first quarter of 2020, a second spate of updates for the ODC-SCI platform took place to implement the basic functionalities in response to the needs of the community. Moving forward, we implemented a user-centered design approach to improve the user experience and usability of the workflows and the site. The ODC-SCI team engaged in focus meetings with fast iterations between workflow implementations and user feedback that generated a new design guided by the users. An updated ODC-SCI site was made available in April 2020 based on these sessions, with future updates planned as we are moving to the fourth stage of bringing together the SCI community around FAIR sharing.

Stage Four: Refining SCI FAIR Share and the Maturity of ODC-SCI

The culmination of these three stages reflects the completion in 2019 of the Craig H. Neilsen Foundation funding that seeded the development of the ODC-SCI. However, much work remains to be done. The current version of the odc-sci.org has implemented the basic functionalities that translate most of the needs and policies decided by the community. Nonetheless, advanced functionalities such as incorporating tools for increasing interoperability and reusability and more challenging policies and procedures such as curation or establishing a sustainability model are still works in progress. A new multi-agency award supported by Wings for Life and Craig H. Neilsen Foundation entitled “Facilitating SCI research, translation and transparency: Going Public with the Open Data Commons” ensured funding for the next 5 years. Moreover, we have in-kind support from International Spinal Research Trust and other funders supporting data sharing moving forward. The main goals for this new phase are to advance the development of the odc-sci.org, to continue outreach, and to consolidate the FAIR community effort that will help release the full potential of data sharing in SCI research. Specifically, the project plan foresees the implementation of quality control and curation processes, the development of tools for better data searching and discovery to improve data findability and reusability, and the mechanisms for continuing community outreach and education.

During the initial phase of this new grant, we formalized the governing structure of the ODC-SCI and divided the organizational structure into different teams or boards: a Leadership team to coordinate the development and operation of ODC-SCI, an Executive board to offer oversight and be involved in executive decisions, a Community board to engage with the community and to receive community feedback through workshops, and a Data Science team to be responsible for data curation, quality control, and revision. The constituents of the Executive and Community board and some of the Data Science team are members representing the broad and heterogeneous SCI stakeholder community with the commitment to serve a 3-year term. Setting term limits gives the opportunity to rotate between constituents, allowing for new ideas and visions from a rapidly changing community.

In addition to maturing the ODC-SCI community portal in this stage, we are formalizing the implementation of the odc-sci.org using user-centered design practices. This has created a first major update of the odc-sci.org website with a stream of updates that will be released as new processes and features are incorporated. The following section offers an overview of the current implementation of odc-sci.org.

Results

The odc-sci.org system has been designed and operationalized as a framework that aligns the necessities of the SCI data-sharing community to FAIR share principles through interconnected functionalities. The process by which the community was engaged is described in the "Methods" section.

ODC-SCI Data Spaces and Account Types

The SCI data-sharing community has designed an incremental process for releasing and publishing data where data is first shared on a limited basis and then made progressively more open before final release (i.e., publication). The platform is accordingly built with hierarchical spaces for datasets. Each space determines who can access the data (Fig. 2). When a dataset is initially uploaded into the personal space, it is only accessible to the uploader and the PI of the uploader’s lab (user roles are explained in the next section). The successive levels of sharing will finally reach a public space where, at the discretion of the PI, users can publish their dataset with a creative commons (CC) license and a digital object identifier (DOI, Fig. 2). Published data is then accessible to any registered user of the ODC-SCI and does not require the audience to be part of the ODC-SCI community.

Fig. 2 ODC-SCI data spaces and movement of data. Data on the ODC-SCI can live in different data spaces ordered by level of privacy. The Personal space is the most private space where users (Registered users who are part of a Lab) can upload data, share their uploaded data with their Lab (after PI/Lab manager approval) and explore and access data that is present in the user Personal space. Datasets at the Lab space can be explored and accessed by all users who are members of the same Lab. In addition, PIs and Lab managers can release the data to the ODC-SCI Community space or request a DOI for publication. Datasets that are released to the ODC-SCI Community space can be explored and accessed by any registered user who has a Community member account (either general members or members of a laboratory). From the Community space, datasets can also be published by requesting a DOI. This tiered system is hierarchical, since a dataset that, for instance, is released to the Community space is still present in and belongs to the original Lab space and uploaders.

The platform is designed to reflect community concerns and agreements reached in the above-described workshops about when data would be shared and with whom. Four different account types, built upon the Neuroscience Information Framework (NIF)/SciCrunch technology stack developed by the FAIR Data Informatics Laboratory at University of California, San Diego, define the actions a user can perform (see Fig. 3). Becoming a registered user requires signing in by providing a valid email (preferably an institutional one) and agreeing to the terms of use of the site. Registered users can request to become ODC-SCI Community members with further approval by the Leadership team. Two types of Community members are defined. The most permissive account type is that of an ODC-SCI Community member associated with an ODC-SCI lab, known as a Lab member. Community members can request to be part of an ODC-SCI lab and/or create their own Lab. The user obtains a Lab member account type upon approval of this request by the respective ODC-SCI lab PI or Leadership team. Lab members can perform all actions in ODC-SCI. Users approved for a Community member account but not associated with a specific ODC-SCI Lab can explore and access public datasets and share data peer-to-peer (feature in development). For Lab members, three possible permission levels are defined for each ODC-SCI Lab to which the user has access: regular lab member, lab manager, and principal investigator (PI). Regular lab members can only act on their own datasets or share their dataset to the private laboratory space. Lab managers have the same privileges as regular lab members but can also manage the laboratory space (i.e., accept new members, approve data to be shared to the laboratory and community spaces, or request a DOI for publication). The highest level of authority is the PI; PIs have full control of their laboratory space including managing lab users and sharing datasets beyond the laboratory with the entire ODC-SCI Community or publishing their data to the Public space. PIs also have the additional privilege of being able to assign the PI status to others or change the permissions of any members of their lab. In any given virtual laboratory, several members, managers, and PIs might coexist, even if they are from different research groups. This approach allows for multi-lab or multi-center data sharing in a private setting by creating a virtual lab on the ODC-SCI to manage the collaboration.

Fig. 3 ODC-SCI account types and functions. Access to different functions on the site is determined by the account types. Visitors to the platform with no account can only explore the metadata for published datasets but cannot see or download the data. Registered users who are not part of the ODC-SCI Community can explore and download published datasets. Registered users who become part of the ODC-SCI Community will be able to explore and download published datasets, as well as get private peer sharing (feature still under development). To gain access to the full suite of functions, users will have to be part of a Lab in the ODC-SCI.

ODC-SCI Data Format and Upload

The ODC-SCI incorporates various features contributing to the interoperability and reusability of data. First, the ODC-SCI standardizes the use of the comma separated value (*.csv) format for the data file upload process. We chose .csv because it is a widely used and open format that can be opened and edited in almost any data and text editor including spreadsheet/analytics software. This offers flexibility in terms of data organization and compatibility. These features allow for a balance between human and machine readability of the data, making it accessible for the research community while maintaining a level of machine interoperability and reusability (Fig. 4). Requirements for data formatting using the .csv files are easy to comply with: rows are observations, columns are variables, and the first row contains the headers or names of the columns. The ODC-SCI database is structured at the subject-data level, and therefore a unique identifier for each subject represented in the .csv file is required. Beyond that, current versions of the site do not impose further constraints on how to organize the data, giving flexibility for adapting to the user’s needs. During the process of uploading the .csv file to the site, a few automatic checks take place to ensure the minimal format requirements that would allow for integration of the data on the ODC-SCI database (Box 1). If a dataset does not pass this check, a notification is displayed pointing to the source of the issue. Once the user corrects the problem(s), the data can be uploaded to the site.

Box 1: ODC-SCI data formatting quality checks.

Source errors (Checked at upload): ODC-SCI cannot read in the data file.

Structure errors: These errors are caused by formatting issues on the dataset.
- Blank-header (Checked at upload): There is a blank variable name. All cells in the header row (first row) must have a value.
- Duplicate-header (Checked at upload): There are multiple columns with the same name. All column names must be unique.
- Blank-row (Checked at upload): Rows must have at least one non-blank cell.
- Duplicate-row (Checked for publication): Rows cannot be duplicated.

Schema errors: These errors reflect conflicts between the data dictionary and the dataset.
- Extra-header (Checked for publication): The dataset contains at least one variable name not defined in the data dictionary.
- Missing-header (Checked for publication): The dataset is missing at least one variable name defined in the data dictionary.
- Missing-definition (Checked for publication): The definition of a variable in the data dictionary is missing.
- Required-constraint (Checked at upload): A required field for the dataset contains no values or is not assigned on the dataset. Currently the only required value in the datasets is the subject identifier. As ODC-SCI develops additional data standards, it is possible that more variables will be required on all datasets.
- Value-constraint (under development): The values of a variable should be equal to one of the permitted values enumerated in the data dictionary, or within the limits of the permitted values.

When a dataset is first uploaded, the site assigns a persistent identifier to the dataset, and the user is required to provide a title and a narrative description of the content. This information is displayed in the lab space together with the name of the user who uploaded the data, which helps other Lab members find the data. If the dataset is released to the ODC-SCI Community space, these metadata elements are displayed in their respective landing pages with the addi-

Publishing Datasets at ODC-SCI

Publishing datasets through ODC-SCI means making data available to the general public (with a regular registered user account) under an open license (CC-BY 4.0) and with an associated DOI which generates a citable unit.
One of the tion of the number of observations and records contained goals of ODC-SCI is to promote FAIR data principles and in the dataset and the name of the lab where the data was reproducibility of the ODC-SCI data which requires a minimal uploaded. This allows members of the ODC-SCI standard for the datasets before it can be made public. To Community to find and identify datasets of interest. achieve such a standard, we have created a quality check Members of the SCI data-sharing community have and review process that is conducted by members of the expressed the need for these minimal metadata information ODC-SCI Data Science team. The refinement of this process about a dataset to be present in the ODC-SCI Community is ongoing, although the basic workflow and tools are in place. space, even when data remains private in the Lab space. As a requirement for publication, a data dictionary (i.e., ‘code- The ODC-SCI design team is contemplating this option for book’) must be provided. Specifically, the data dictionary future implementation which would help inform users on gives definitions, units, permitted values, and other valuable what other datasets might become available or allow users information necessary for understanding the collected data. to ask for private sharing while the data is limited to inter- This type of documentation is essential for the reusability of nal usage. Once a dataset is made public (see section be- the data but is often overlooked in general purpose reposito- low), a citation page is generated and a DOI issued. ries. Another important piece of documentation is the Neuroinform (2022) 20:203–219 211 Fig. 4 Machine vs. human readable tabular formats. How data is unique record, meaning that there are not two identical rows on the formatted into spreadsheets can affect the readability of it. As humans, dataset, and completely empty rows and columns are not allowed. 
The we benefit from visual clues such as blank spaces or colors and from ODC-SCI database is organized around the subject identification number complex data organizations that divide data into chunks (e.g., groups of and thus it must always be present in the dataset. This formatting can have subjects) (A). Although this formatting of the data can be self-explanatory different variations depending on the hierarchical relationships between for humans, the complexity and lack of a consistent structure across variables (such as in the case of repeated measures like time). For exam- researches make it challenging to generate standards that can be used ple, the same variables are collected at different timepoints, a time column by machines to process and understand data. The readability of a spread- can be specified, and subjects can be repeated in rows with records for sheet by a machine can be dramatically improved with simple rules each time point in different rows, known as semi-long format (C). (Broman & Woo, 2018) to organize the data in a structured manner (B Contrary, a new column can be created for every variable and every time to D). In ODC-SCI, data can be uploaded using spreadsheet type file (as point known as wide format (D), in which case each subject is only .csv file) where columns are variables (also known as fields), the first row present in one row. When possible, ODC-SCI recommends using semi- contain the variable names or headers and each consequent row is a long formats metadata information associated with the dataset. This is con- reviewed for potential inconsistencies in formatting and struc- stituted by an appropriate title, a structured abstract with a ture (Box 1) such as whether a variable is present in the dataset description of the study purpose, an overview of the data col- but is not defined in the data dictionary. 
Some of these checks lected and major conclusions of the study, a list of authors and are performed automatically during the dataset upload, and contributors, a list of identifiers and links to other resources others are done by members of the Data Curation team while such as an associated paper, and funding information. An reviewing the dataset for publication. As the ODC-SCI pro- editable webform for each uploaded dataset can be used to gresses, we will develop automated tools for conducting all provide this information directly on the odc-sci.org site. dataset and data dictionary formatting quality checks. The The dataset, data dictionary, and metadata undergo struc- second part of the review process is an editorial revision of tured quality checks to ensure minimal ODC-SCI standards metadata information to ensure that it contains enough infor- before publication. First, datasets and data dictionaries are mation to adhere to FAIR standards. Once the dataset, data 212 Neuroinform (2022) 20:203–219 dictionary, and metadata are approved by the Data team for the private lab space (Fig. 5D). A total of 6 datasets have been release, a DOI and citation page will be generated and made released to the community, and 4 DOIs have been generated public. It is important to keep in mind that based on commu- with the datasets available for the public (Ferguson et al., nity feedback this process has been put in place to ensure 2018; Liu et al., 2019;Puko&McTigue, 2020;Schmidt minimal quality of shared data for interoperability and reus- et al., 2019). At the time of publishing, 11 DOIs have been ability, and the review process increases the time to generate a requested and are being processed, reflecting the commitment DOI and make data public compared to general purpose re- of the community to sharing data to the public. Importantly, positories that may not provide curation services. 
the sequences of stages (private, to the lab space, to Once published, ODC-SCI adds searchable tags to a community, to public) is not necessary, and we have found dataset by marking up the pages with structured vocabulary that some authors choose to go directly from the laboratory such as the one provided by schema.org. This permits the user space into DOI release upon completion of data curation and to search for the dataset DOI, for the citation information, or quality checks. To date, this type of lab-to-public sharing has for a related article in a search engine resulting in web links to largely been in the context of authors being asked by journals the publicly shared dataset. Moreover, the ODC-SCI is part of to provide a dataset DOI to coincide with the release of peer- the SciCrunch ecosystem (Whetzel et al. 2015) and is indexed reviewed papers, suggesting that publishers are starting to as a resource (RRID: SCR_016673) that can be found by the enforce data sharing policies and users are seeing ODC-SCI. Neuroscience Information Framework (NIF, neuinfo.org). org as a viable option to meet the requirements. Community Adoption and Web Analytics on the odc- Global Visits to odc-sci.org sci.org The odc-sci.org has been registered with Google Analytics In order to evaluate the community adoption of the odc- since 2018, allowing us to measure usage and activity of the sci.org, we derived aggregated metrics from the registered website. These tools do not allow for individual-visitor iden- users (Fig. 5). From this information, we compiled descriptive tification but rather aggregate metrics of usage of the ODC- statistics of new registrations, data uploads, and the spaces/ SCI portal that can be used as indicators of community en- stages where datasets are in the data release cycle (e.g., private gagement to the site. 
One measure of global activity on the space, lab space, DOI requested) to summarize the current ODC-SCI is the number of returning and new visitors data landscape of the ODC-SCI (Table 1). These metrics serve (Fig. 6A-B ,Table 1). Fluctuations on the traffic of both as a surrogate to study the adoption of the odc-sci.org and as new and recurrent visits can be appreciated where peaks of an indicator of interest in data sharing by the community. visitors can be associated to periods during or right after out- As of July 2020, 248 users are registered on the site and 57 reach activities. After most peaks of activity, traffic seems to different laboratories across Canada, Europe and USA have return to an average baseline of around baseline level of been created. Some peaks of activity coincide with outreach around 2–3new and 2–3 recurrent users a day on average. and community workshops (Fig. 5B), where the maximum The average of new and returning visitors per day rap- number of new user registrations in a week happened during idly increased in the first and second quarter of 2020 the SFN Neuroscience meeting in 2018 where the STREET- with a particularly high traffic period during the I- FAIR workshop took place. Other activities such as the OSCIRS seminar. It is too early to see if the latest jump streaming at the International Online Spinal Cord Injury in traffic will return to a similar baseline activity or if it Research Seminars (I-OSCIRS; https://www.youtube.com/ will produce a new sustained base traffic with higher watch?v=LZ9DhxUUkeE&t=14 s) were also accompanied visitors on average per day. by an increase in the number of registered users. The Two potential proxies for the level of engagement of number of uploaded datasets also peaked during outreach visitors with the site is the time spent (Fig. 6C-D)and and workshop activities, although the increase in users is not the number of pages viewed (Fig. 6E-F) per session. 
As always accompanied with an increase in uploads. In terms of a general trend, returning visitors spend more time and the number of uploaded datasets per laboratory (excluding a visit more pages than new visitors, which is to be ex- test lab by the developing team), we observed a general trend pected since there is limited content that new visitors can of an increasing number of datasets from the same laboratory access until they register as users. Thus, these two mea- from 2018 to 2019–2020 (Fig. 5C). In addition, more sures likely convey different things about whether the laboratories were created in 2019–2020 with a visitor is new or recurrent. New visitors who become corresponding increase in the number of labs that are registered users may come back after closing their ses- uploading data to the ODC-SCI. Of the current active sion, and then be counted as returning visitor with more datasets that are not in the test laboratory, most are either in options on the platform. There are some peaks in the the private user space or internally approved to be shared in mean session time and number of pages viewed by Neuroinform (2022) 20:203–219 213 Fig. 5 ODC-SCI activity. We tracked the activity on when users during development, test sets during outreach activities, and datasets by registered to the site (A), when datasets got uploaded (B), the number users who include “test” or “practice” on the description. Most of those of uploaded datasets per Lab (C), and the status or data space where datasets have been subsequently deleted and only active datasets are datasets are set (D). A total of 234 datasets have been uploaded. An shown in D. 
Notice that although we have 11 requests for DOI at the estimated 38 % of the uploaded datasets are placeholder datasets created time of writing, there are 2 datasets in preparation for being uploaded, and to explore the functionality of the portal, including datasets uploaded therefore not reflected in D visitors associated with hands-on outreach activities such visitors indicate international traffic to the site (Fig. 7; as the STREET-FAIR workshop in 2018, but other ac- Table 1). tivities did not register the same pattern as the I-OSCIRS seminar. This is similar to the fluctuations in new regis- tered users and uploaded datasets, which suggest that Discussion and Conclusions different outreach activities produce different behaviors in visitors and users. The geographical locations of The present paper highlights the journey to-date that the SCI research community has undertaken to adopt the FAIR Table 1 ODC-SCI community Registered users 248 activity as of July 2020 Active datasets 103 with accumulated 1,379,988 rows and 18,359 Estimated individual subjects N > 10,000 Total visits 4799 New visits 2649from46countries Recurrent visits 2150 Returning visitors 502* from 20 countries *Note that the activity of the members of the development team are included in the returning visitors’ data, although the ODC-SCI team constitutes very few of those 502 visitors 214 Neuroinform (2022) 20:203–219 Fig. 6 Traffic of visitors to the odc-sci.org. Using Google analytics traffic period of time. Some of the important outreach activities are annotated on monitoring data we identified new and returning visitors over time (A-B), the graphs: SFN 2018 STREET-FAIR workshop, the SCI2020 meeting as well as the time spend per session in minutes (C-D) and the number of hosted by NINDS, the press release of the new multi-agency grant launch, pages viewed per visitor/session (E-F). 
A, C and F show the raw daily the SFN 2019 ODC-SCI stand as part of NIF and the IOSCIRS online metrics while B, D and F show 3 weeks moving average over the same workshop principles to promote data sharing and research transparency community and may be repurposed in other research commu- in the context of heterogeneous preclinical research data types nities. We would summarize the steps taken as: (1) developing (‘long-tail’ data)(Ferguson et al., 2014). The data covered by a history of collective and cooperative efforts around data the ODC-SCI enables FAIR sharing of data that falls outside collection and standards; (2) early assessment of the readiness of that covered by established standards such as clinical neu- of the community, the challenges, and the specific community roimaging (e.g., BIDS), physiology (e.g., NWB), health infor- needs while involving different parties to provide different matics (e.g., OMOP). The experience of direct community perspectives on the data life cycle; (3) adapting FAIR engagement and application of agile design principles pro- principles to the specific needs of the community; (4) vides a template for achieving FAIR data sharing in a research seeking community involvement and feedback, and Neuroinform (2022) 20:203–219 215 Fig. 7 Geographical origin of internet traffic to the odc-sci.org. New and Returning visitors have viewed odc-sci.org since we started monitoring traffic combining it with agile design principles for constant it- eration. Our experience to-date has led us to a set of To-date, this community-based, agile design framework fundamental principles for developing community-based has helped odc-sci.org meet the increasing requirements of FAIR technology (Box 2). journals and funders that published paper are accompanied by open and FAIR data that promote transparency and repro- Box 2: Principles behind community-based FAIR data sharing ducibility of research. By embracing community engagement, technology. 
our goal throughout has been to empower the community to - Emphasize community acceptance ahead of engineering perfection. meet such demands. The design philosophy has been to both - Recognize cultural change is slow and needs constant effort in parallel adapt the design to user demands as well as guide them to- to offering technological solutions through a portal. wards implementing FAIR principles by keeping a user-cen- - Multiple levels of research community engagement with multiple tric design that can accommodate different user types includ- stakeholders (researchers, consumers, funders, government, ing primary data creators, reviewers and funders, and data publishers) are essential for evolving a data publication culture and the data platform. consumers and meta-analysts. Overall, there has been an in- - Collaboration with funding agencies early on is essential and crease in the community usage of the odc-sci.org with impor- potentially the key for adoption of FAIR and open data sharing tant peaks of activity during major events, especially portals. STREET-FAIR and I-OSCIRS. ODC-SCI has seen not only a constant increase of community members but also of 216 Neuroinform (2022) 20:203–219 registered laboratories and movement of datasets from the commons through odc-sci.org and broader neurocommons private space to the community and publicly accessible space. efforts will promote ever increasing knowledge through This suggests that the community of SCI researchers are be- FAIR data sharing across individual researchers, laboratories, ginning to use ODC-SCI and adopt FAIR data sharing princi- species, and perhaps even disease domains. ples. It is noteworthy that there this steady increase in traffic based on grass-roots interest without a large scale, coordinated marketing effort. However, a campaign to increase user en- Information Sharing Statement gagement is planned as the portal moves into full production. 
In that regard, the metrics and numbers observed during the The ODC-SCI is freely accessible under registration and ap- reported period will serve as a baseline to benchmark future proval as described in this manuscript. development. The next steps will involve further user-driven develop- ment of data dictionaries and standards that improve interop- Acknowledgements In addition to the authors, the STREET-FAIR Wor k shop Part ici p ants i n cl udes: W a rren A lila in, erability. Currently, data dictionaries are optional for datasets warren.alilain@uky.edu; Mark Bacon, mark3bacon@gmail.com; in private lab spaces but are required files (in .csv format) for Nicholas Batty, nbatty@ualberta.ca; Michael Beattie, all datasets released to the public. As the ODC-SCI is further michael .beattie@uc sf.edu; J acquel ine Bresnahan, populated and data dictionaries are uploaded together with jacqueline.bresnahan@ uc sf .edu; Emily Burnside, emily.burnside@kcl.ac.uk; Sarah Busch, sbusch@athersys.com; datasets, we will be able to generate a list of variables or data Randall Carpenter, carpenter.794@osu.edu; Isaac Francos Quijorna, elements that are commonly collected by the community. The isaacfq87trabajo@gmail.com; Xiaohui Guo, x1guo@ucsd.edu; Agnes dictionaries will thus ultimately inform the generation of the Haggerty, agnes.haggerty@gmail.com; Sarah Haroon, ODC-SCI common data elements (CDEs), a set of standards haroon.sarah@yahoo.com; Jack Harris, jckhrrs1@gmail.com; Lyn Jakeman, lyn.jakeman@nih.gov; Linda Jones, latj@comcast.net; Naomi for variables that will help augment interoperability between Kleitman, naomi@chnfoundation.org; Timothy Kopper, data sources with the potential to include ‘translational inter- Timothy.Kopper@uky.edu; Michael Lane, mlane.neuro@gmail.com; operability’ of dataset across species. 
The effort would mirror Francisco Magana, javifran737@gmail.com; David Magnuson, the establishment of clinical CDEs for SCI by the NINDS dsmagn01@louisville.edu; Ines Maldonado, imaldonado@sralab.org; Verena May, verena.may@wingsforlife.com; Katelyn McFarlane, (Biering-Sørensen et al., 2015). Notably, while there are pre- kate lynmcf ar lane@uky .edu; Kazuhito Morioka, clinical standards being developed by NINDS workgroups for kazuhito.morioka@ucsf.edu; Martin Oudega, moudega@sralab.org; several disorders, there have yet to be directed efforts for pre- Philip Leo Pascual, plpascual18@gmail.com; Jean-Baptiste Poline, clinical SCI. jean-baptiste.poline@mcgill.ca; Ephron Rosenzweig, ephronr@gmail.com; Emma Schmidt, ekschmid@ualberta.ca; Wolfram The ODC-SCI developing team is planning to incorporate Tetz laff, tetzlaf f @icor d .or g ; Lana Zholude v a, functions to map the variables (i.e., data elements) present in lana.zholudeva@gladstone.ucsf.edu. the ODC-SCI data to existing data elements available through InterLex/NeuroLex (Larson & Martone, 2013), a dynamic Funding Supported by: National Institutes of Health (NS088475, lexicon of biomedical terms maintained by NIF. With the NS106899 to A.R.F.), the US Department of Veterans Affairs (I01RX002245, I01RX002787 to A.R.F.), Wings for Life (to A.R.F.) future creation of ODC-SCI CDEs and the integration of those and the Craig H. Neilsen Foundation (to A.R.F.). The STREET-FAIR through InterLex, the ODC-SCI will establish the tools for workshop was supported by a workshop grant from the International developing community standards and increasing interopera- Neuroinformatics Coordinating Facility (INCF)(to A.R.F.). ODC- bility in the SCI research field. We expect that these mapping SCI.org development is supported by a multi-funder grant (Wings for Life, Craig H. Neilsen Foundation, in-kind support from International functionalities, together with sufficient metadata and docu- Spinal Research Trust)(to K.F.) 
mentation, will provide a common data model with enough information for reusing the ODC-SCI datasets both by Data Availability No applicable. humans and machines. In time these may mature to the point that they can integrate with other, more mature clinical stan- Code Availability No applicable. dards such as BIDS, NWB, and OMOP, among others. As the SCI community has demonstrated, even in the ab- Declarations sence of tightly defined knowledge engineering, it may be possible to extract new knowledge from semi-structured data Conflicts of Interest/Competing interests J.-B.P. was partially funded if modern machine learning analytics are leveraged. Indeed, by National Institutes of Health (NIH) NIH-NIBIB P41 EB019936 (ReproNim) NIH-NIMH R01 MH083320 and NIH RF1 MH120021 Nielson et al. 2015 and Almeida et al. (2021; in the present (NIDM), NIMH Award Number R01MH096906 (Neurosynth), as well issue) demonstrates the utility of analyzing FAIR data even as the Canada First Research Excellence Fund, awarded to McGill from archival laboratory data (25 years ago) to develop and University for the Healthy Brains for Healthy Lives initiative. externally validate new predictors of long term neuromotor recovery. The continuing development of the SCI data Ethics Approval No applicable. Neuroinform (2022) 20:203–219 217 Consent to Participate No applicable. Broman,K.W., &Woo,K.H.(2018).Data Organizationin Spreadsheets. The American Statistician, 72(1), 2–10. https://doi. org/10.1080/00031305.2017.1375989. Consent for Publication No applicable. Callahan, A., Abeyruwan, S. W., Al-Ali, H., Sakurai, K., Ferguson, A. R., Popovich, P. G., Shah, N. H., Visser, U., Bixby, J. L., & Lemmon, Open Access This article is licensed under a Creative Commons V. P. (2016). RegenBase: A knowledge base of spinal cord injury Attribution 4.0 International License, which permits use, sharing, adap- biology for translational research. 
Database: The Journal of tation, distribution and reproduction in any medium or format, as long as Biological Databases and Curation, 2016. https://doi.org/10.1093/ you give appropriate credit to the original author(s) and the source, pro- database/baw040. vide a link to the Creative Commons licence, and indicate if changes were Callahan, A., Anderson, K. D., Beattie, M. S., Bixby, J. L., Ferguson, A. made. The images or other third party material in this article are included R., Fouad, K., Jakeman, L. B., Nielson, J. L., Popovich, P. G., in the article's Creative Commons licence, unless indicated otherwise in a Schwab, J. M., Lemmon, V. P. & FAIR Share Workshop credit line to the material. If material is not included in the article's Participants. (2017). Developing a data sharing community for spi- Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain nal cord injury research. Experimental Neurology, 295, 135–143. permission directly from the copyright holder. To view a copy of this https://doi.org/10.1016/j.expneurol.2017.05.012. licence, visit http://creativecommons.org/licenses/by/4.0/. Chan, A.-W., Song, F., Vickers, A., Jefferson, T., Dickersin, K., Gøtzsche, P. C., Krumholz, H. M., Ghersi, D., & van der Worp, H. B. (2014). Increasing value and reducing waste: Addressing in- accessible research. The Lancet, 383(9913), 257–266. https://doi. org/10.1016/S0140-6736(13)62296-5. Charlifue, S., Tate, D., Biering-Sorensen, F., Burns, S., Chen, Y., Chun, References S., Jakeman, L. B., Kowalski, R. G., Noonan, V. K., & Ullrich, P. (2016). Harmonization of Databases: A Step for Advancing the Aguilar, R. M., & Steward, O. (2010). A bilateral cervical contusion Knowledge About Spinal Cord Injury. Archives of Physical injury model in mice: Assessment of gripping strength as a measure Medicine and Rehabilitation, 97(10), 1805–1818. https://doi.org/ of forelimb motor function. 
Experimental Neurology, 221(1), 38–53. https://doi.org/10.1016/j.expneurol.2009.09.028.
10.1016/j.apmr.2016.03.030.
Almeida, C. A., Torres-Espin, A., Huie, J. R., Sun, D., Noble-Haeusslein, L. J., Young, W., Beattie, M. S., Bresnahan, J. C., Nielson, J. L., Ferguson, A. R. (2021). Excavating FAIR Data: the Case of the Multicenter Animal Spinal Cord Injury Study (MASCIS), Blood Pressure, and Neuro-Recovery. Neuroinformatics. https://doi.org/10.1007/s12021-021-09512-z.
Anderson, K. D., Sharp, K. G., Hofstadter, M., Irvine, K.-A., Murray, M., & Steward, O. (2009). Forelimb locomotor assessment scale (FLAS): Novel assessment of forelimb dysfunction after cervical spinal cord injury. Experimental Neurology, 220(1), 23–33. https://doi.org/10.1016/j.expneurol.2009.08.020.
Basso, D. M., Beattie, M. S., & Bresnahan, J. C. (1995). A sensitive and reliable locomotor rating scale for open field testing in rats. Journal of Neurotrauma, 12(1), 1–21. https://doi.org/10.1089/neu.1995.12.
Basso, D. M., Beattie, M. S., Bresnahan, J. C., Anderson, D. K., Faden, A. I., Gruner, J. A., Holford, T. R., Hsu, C. Y., Noble, L. J., Nockels, R., Perot, P. L., Salzman, S. K., & Young, W. (1996). MASCIS evaluation of open field locomotor scores: Effects of experience and teamwork on reliability. Multicenter Animal Spinal Cord Injury Study. Journal of Neurotrauma, 13(7), 343–359. https://doi.org/10.1089/neu.1996.13.343.
Begley, C. G., & Ioannidis, J. P. A. (2015). Reproducibility in science: Improving the standard for basic and preclinical research. Circulation Research, 116(1), 116–126. https://doi.org/10.1161/CIRCRESAHA.114.303819.
Biering-Sørensen, F., Alai, S., Anderson, K., Charlifue, S., Chen, Y., DeVivo, M., Flanders, A. E., Jones, L., Kleitman, N., Lans, A., Noonan, V. K., Odenkirchen, J., Steeves, J., Tansey, K., Widerström-Noga, E., & Jakeman, L. B. (2015). Common data elements for spinal cord injury clinical research: A National Institute for Neurological Disorders and Stroke project. Spinal Cord, 53(4), 265–277. https://doi.org/10.1038/sc.2014.246.
Borgman, C. L. (2012). The conundrum of sharing research data. Journal of the American Society for Information Science and Technology, 63(6), 1059–1078. https://doi.org/10.1002/asi.22634.
Chervitz, S. A., Deutsch, E. W., Field, D., Parkinson, H., Quackenbush, J., Rocca-Serra, P., Sansone, S.-A., Stoeckert, C. J., Taylor, C. F., Taylor, R., & Ball, C. A. (2011). Data standards for Omics data: The basis of data sharing and reuse. Methods in Molecular Biology (Clifton, N.J.), 719, 31–69. https://doi.org/10.1007/978-1-61779-027-0_2.
Curt, A., Schwab, M. E., & Dietz, V. (2004). Providing the clinical basis for new interventional therapies: Refined diagnosis and assessment of recovery after spinal cord injury. Spinal Cord, 42(1), 1–6. https://doi.org/10.1038/sj.sc.3101558.
DeVivo, M. J., Go, B. K., & Jackson, A. B. (2002). Overview of the national spinal cord injury statistical center database. The Journal of Spinal Cord Medicine, 25(4), 335–338. https://doi.org/10.1080/10790268.2002.11753637.
Ferguson, A. R., Irvine, K.-A., Gensel, J. C., Nielson, J. L., Lin, A., Ly, J., et al. (2013). Derivation of multivariate syndromic outcome metrics for consistent testing across multiple models of cervical spinal cord injury in rats. PLoS One, 8(3), e59712. https://doi.org/10.1371/journal.pone.0059712.
Ferguson, A. R., Irvine, K.-A., Gensel, J. C., Nielson, J. L., Lin, A., Ly, J., Segal, M. R., Ratan, R. R., Bresnahan, J. C., & Beattie, M. S. (2018). Cervical (C5), unilateral spinal cord injury with diverse injury modalities, multiple behavioral outcomes, and histopathology. Open Data Commons for Spinal Cord Injury [Text/csv,application/zip,application/x-zip-compressed]. In Open Data Common for Spinal Cord Injury (1.0, p. ODC-SCI:26). https://doi.org/10.7295/W9T72FMZ.
Ferguson, A. R., Nielson, J. L., Cragin, M. H., Bandrowski, A. E., & Martone, M. E. (2014). Big data from small data: Data-sharing in the “long tail” of neuroscience. Nature Neuroscience, 17, 1442–1447. https://doi.org/10.1038/nn.3838.
Fouad, K., Bixby, J. L., Callahan, A., Grethe, J. S., Jakeman, L. B., Lemmon, V. P., Magnuson, D. S. K., Martone, M. E., Nielson, J. L., Schwab, J. M., Taylor-Burds, C., Tetzlaff, W., Torres-Espin, A., Ferguson, A. R., the FAIR-SCI Ahead Workshop Participants, Alam, S., Bacon, M., Bambrick, L., Basso, M., … Rabchevsky, S. (2020). FAIR SCI Ahead: The Evolution of the Open Data Commons for Pre-Clinical Spinal Cord Injury Research. Journal of Neurotrauma, 37(6), 831–838. https://doi.org/10.1089/neu.2019.6674.
Gorgolewski, K. J., Auer, T., Calhoun, V. D., Craddock, R. C., Das, S., Duff, E. P., Flandin, G., Ghosh, S. S., Glatard, T., Halchenko, Y. O., Handwerker, D. A., Hanke, M., Keator, D., Li, X., Michael, Z., Maumet, C., Nichols, B. N., Nichols, T. E., Pellman, J., … Poldrack, R. A. (2016). The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Scientific Data, 3, 160044. https://doi.org/10.1038/sdata.2016.44.
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), 6.
Kennedy, D. N. (2012). The benefits of preparing data for sharing even when you don’t. Neuroinformatics, 10(3), 223–224. https://doi.org/10.1007/s12021-012-9154-1.
Kyritsis, N., Torres-Espín, A., Schupp, P. G., Huie, J. R., Chou, A., Duong-Fernandez, X., Thomas, L. H., Tsolinas, R. E., Hemmerle, D. D., Pascual, L. U., Singh, V., Pan, J. Z., Talbott, J. F., Whetstone, W. D., Burke, J. F., DiGiorgio, A. M., Weinstein, P. R., Manley, G. T., Dhall, S. S., … Beattie, M. S. (2021). Diagnostic blood RNA profiles for human acute spinal cord injury. The Journal of Experimental Medicine, 218(3). https://doi.org/10.1084/jem.20201795.
Laakso, M., Welling, P., Bukvova, H., Nyman, L., Björk, B.-C., & Hedlund, T. (2011). The development of open access journal publishing from 1993 to 2009. PLoS One, 6(6), e20961. https://doi.org/10.1371/journal.pone.0020961.
Lammertse, D. P. (2013). Clinical trials in spinal cord injury: Lessons learned on the path to translation. The 2011 International Spinal Cord Society Sir Ludwig Guttmann Lecture. Spinal Cord, 51(1), 2–9. https://doi.org/10.1038/sc.2012.137.
Larson, S. D., & Martone, M. E. (2013). NeuroLex.org: An online framework for neuroscience knowledge. Frontiers in Neuroinformatics, 7. https://doi.org/10.3389/fninf.2013.00018.
Lemmon, V. P., Ferguson, A. R., Popovich, P. G., Xu, X.-M., Snow, D. M., Igarashi, M., Beattie, C. E., Bixby, J. L., & MIASCI Consortium (2014). Minimum information about a spinal cord injury experiment: A proposed reporting standard for spinal cord injury experiments. Journal of Neurotrauma, 31(15), 1354–1361. https://doi.org/10.1089/neu.2014.3400.
Liu, Y., Wang, X., Li, W., Zhang, Q., Li, Y., Zhang, Z., Zhu, J., Chen, B., Williams, P. R., Zhang, Y., Yu, B., Gu, X., & He, Z. (2019). T10 lateral hemisection spinal cord injury with multiple histological and behavioral outcomes [Text/csv,application/zip,application/x-zip-compressed]. Open Data Commons for Spinal Cord Injury (1.0, p. ODC-SCI:212). https://doi.org/10.7295/W9HQ3X20.
Macleod, M. R., Fisher, M., O’Collins, V., Sena, E. S., Dirnagl, U., Bath, P. M. W., Buchan, A., van der Worp, H. B., Traystman, R., Minematsu, K., Donnan, G. A., & Howells, D. W. (2009). Good laboratory practice: Preventing introduction of bias at the bench. Stroke, 40(3), e50–e52. https://doi.org/10.1161/STROKEAHA.108.525386.
Macleod, M. R., Michie, S., Roberts, I., Dirnagl, U., Chalmers, I., Ioannidis, J. P. A., Salman, R. A.-S., Chan, A.-W., & Glasziou, P. (2014). Biomedical research: Increasing value, reducing waste. The Lancet, 383(9912), 101–104. https://doi.org/10.1016/S0140-6736(13)62329-6.
Nielson, J. L., Guandique, C. F., Liu, A. W., Burke, D. A., Lash, A. T., Moseanko, R., et al. (2014). Development of a database for translational spinal cord injury research. Journal of Neurotrauma, 31(21), 1789–1799. https://doi.org/10.1089/neu.2014.3399.
Nielson, J. L., Paquette, J., Liu, A. W., Guandique, C. F., Tovar, C. A., Inoue, T., Irvine, K.-A., Gensel, J. C., Kloke, J., Petrossian, T. C., Lum, P. Y., Carlsson, G. E., Manley, G. T., Young, W., Beattie, M. S., Bresnahan, J. C., & Ferguson, A. R. (2015). Topological data analysis for discovery in preclinical spinal cord injury and traumatic brain injury. Nature Communications, 6, 8581. https://doi.org/10.1038/ncomms9581.
Piwowar, H. A., Day, R. S., & Fridsma, D. B. (2007). Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLOS ONE, 2(3), e308. https://doi.org/10.1371/journal.pone.0000308.
Pronk, T. E., Wiersma, P. H., van Weerden, A., & Schieving, F. (2015). A game theoretic analysis of research data sharing. PeerJ, 3, e1242. https://doi.org/10.7717/peerj.1242.
Puko, N., & McTigue, D. M. (2020). Data for manuscript: Delayed short-term tamoxifen treatment does not promote remyelination or neuron sparing after spinal cord injury [Text/csv, application-zip,application/x-zip-compressed]. Open Data Commons for Spinal Cord Injury (1.0, p. ODC-SCI:419). https://doi.org/10.34945/F5QP4H.
Roche, D. G., Lanfear, R., Binning, S. A., Haff, T. M., Schwanz, L. E., Cain, K. E., et al. (2014). Troubleshooting public data archiving: suggestions to increase participation. PLOS Biology, 12(1), e1001779. https://doi.org/10.1371/journal.pbio.1001779.
Roundtable on Environmental Health Sciences, Practice, R., B. on P. H. and Division, P. H. H. and M., & National Academies of Sciences, E. (2016). The Benefits of Data Sharing. In Principles and Obstacles for Sharing Data from Environmental Health Research: Workshop Summary. National Academies Press, Washington, DC. https://www.ncbi.nlm.nih.gov/books/NBK362433/.
Rübel, O., Tritt, A., Dichter, B., Braun, T., Cain, N., Clack, N., Davidson, T. J., Dougherty, M., Fillion-Robin, J.-C., Graddis, N., Grauer, M., Kiggins, J. T., Niu, L., Ozturk, D., Schroeder, W., Soltesz, I., Sommer, F. T., Svoboda, K., Lydia, N., … Bouchard, K. (2019). NWB:N 2.0: An Accessible Data Standard for Neurophysiology. BioRxiv, 523035. https://doi.org/10.1101/523035.
Schmidt, E., Torres-Espin, A., Raposo, P., Madsen, K., Kigerl, K., Popovich, P., Fenrich, K. F., & Fouad, K. (2019). Data for the manuscript: Fecal transplant prevents gut dysbiosis and anxiety-like behaviour after spinal cord injury in rats [Text/csv,application/zip,application/x-zip-compressed]. In Open Data Commons for Spinal Cord Injury (1.0, p. ODC-SCI:262). https://doi.org/10.7295/W97942VQ.
Schulz, J. B., Cookson, M. R., & Hausmann, L. (2016). The impact of fraudulent and irreproducible data to the translational research crisis – Solutions and implementation. Journal of Neurochemistry, 139(Suppl 2), 253–270. https://doi.org/10.1111/jnc.13844.
Seyhan, A. A. (2019). Lost in translation: The valley of death across preclinical and clinical divide – identification of problems and overcoming obstacles. Translational Medicine Communications, 4(1), 1–19. https://doi.org/10.1186/s41231-019-0050-7.
Steward, O., Popovich, P. G., Dietrich, W. D., & Kleitman, N. (2012). Replication and reproducibility in spinal cord injury research. Experimental Neurology, 233(2), 597–605. https://doi.org/10.1016/j.expneurol.2011.06.017.
Teeters, J. L., Godfrey, K., Young, R., Dang, C., Friedsam, C., Wark, B., et al. (2015). Neurodata without borders: creating a common data format for neurophysiology. Neuron, 88(4), 629–634. https://doi.org/10.1016/j.neuron.2015.10.025.
Tenopir, C., Allard, S., Douglass, K., Aydinoglu, A. U., Wu, L., Read, E., et al. (2011). Data sharing by scientists: practices and perceptions. PLoS One, 6(6), e21101. https://doi.org/10.1371/journal.pone.0021101.
Watzlawick, R., Antonic, A., Sena, E. S., Kopp, M. A., Rind, J., Dirnagl, U., Macleod, M., Howells, D. W., & Schwab, J. M. (2019). Outcome heterogeneity and bias in acute experimental spinal cord injury: A meta-analysis. Neurology, 93(1), e40–e51. https://doi.org/10.1212/WNL.0000000000007718.
Watzlawick, R., Sena, E. S., Dirnagl, U., Brommer, B., Kopp, M. A., Macleod, M. R., Howells, D. W., & Schwab, J. M. (2014). Effect and reporting bias of RhoA/ROCK-blockade intervention on locomotor recovery after spinal cord injury: A systematic review and meta-analysis. JAMA Neurology, 71(1), 91–99. https://doi.org/10.1001/jamaneurol.2013.4684.
Whetzel, P. L., Grethe, J. S., Banks, D. E., & Martone, M. E. (2015). The NIDDK Information Network: A Community Portal for Finding Data, Materials, and Tools for Researchers Studying Diabetes, Digestive, and Kidney Diseases. PLoS One, 10(9), e0136206. https://doi.org/10.1371/journal.pone.0136206.
Wilkinson, M. D., Dumontier, M., Aalbersberg, Ij. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L. B., Bourne, P. E., Bouwman, J., Brookes, A. J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C. T., Finkers, R., … Mons, B. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018. https://doi.org/10.1038/sdata.2016.18.
Young, W. (2002). Spinal cord contusion models. Progress in Brain Research, 137, 231–255. https://doi.org/10.1016/s0079-6123(02)37019-5.

Neuroinform (2022) 20:203–219

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Neuroinformatics Springer Journals

Promoting FAIR Data Through Community-driven Agile Design: the Open Data Commons for Spinal Cord Injury (odc-sci.org)

Publisher
Springer Journals
Copyright
Copyright © The Author(s) 2021
ISSN
1539-2791
eISSN
1559-0089
DOI
10.1007/s12021-021-09533-8

Abstract

The past decade has seen accelerating movement from data protectionism in publishing toward open data sharing to improve reproducibility and translation of biomedical research. Developing data sharing infrastructures to meet these new demands remains a challenge. One model for data sharing involves simply attaching data, irrespective of its type, to publisher websites or general use repositories. However, some argue this creates a ‘data dump’ that does not promote the goals of making data Findable, Accessible, Interoperable and Reusable (FAIR). Specialized data sharing communities offer an alternative model where data are curated by domain experts to make them both open and FAIR. We report on our experiences developing one such data-sharing ecosystem focusing on ‘long-tail’ preclinical data, the Open Data Commons for Spinal Cord Injury (odc-sci.org). ODC-SCI was developed with community-based agile design requirements directly pulled from a series of workshops with multiple stakeholders (researchers, consumers, non-profit funders, governmental agencies, journals, and industry members). ODC-SCI focuses on heterogeneous tabular data collected by preclinical researchers including bio-behaviour, histopathology findings and molecular endpoints. This has led to an example of a specialized neurocommons that is well-embraced by the community it aims to serve. In the present paper, we provide a review of the community-based design template and describe the adoption by the community including a high-level review of current data assets, publicly released datasets, and web analytics. Although odc-sci.org is in its late beta stage of development, it represents a successful example of a specialized data commons that may serve as a model for other fields.
Keywords: Data sharing, FAIR, spinal cord injury, neurotrauma, data reuse, community repository

* Karim Fouad, kfouad@ualberta.ca
* Adam R. Ferguson, adam.ferguson@ucsf.edu

Weill Institute for Neurosciences, Brain and Spinal Injury Center, Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
Department of Neuroscience, University of California, San Diego, San Diego, CA, USA
Faculty of Rehabilitation Medicine and the Neuroscience and Mental Health Institute, University of Alberta, Edmonton, AB, Canada
Spinal Cord and Brain Injury Research Center, Department of Physiology, University of Kentucky College of Medicine, Lexington, KY, USA
San Francisco Veterans Affairs Health Care System, San Francisco, CA, USA

Introduction

In the last decade, the need for more openness and transparency in scientific research has become apparent, with calls from funders, journals, lawmakers, researchers, and the general public to improve the self-correcting nature of the scientific literature. The fact that many experiments are not published, the lack of transparency of published studies, and the difficulty of accessing the source data underlying published results have been recognized as significant barriers to reproducibility (Chan et al., 2014). Data openness and sharing are potential solutions to deal with some of these barriers and increase the value of research by providing direct access to data for conducting replication and large-scale pooled studies at the subject level (Ferguson et al., 2014). Although the benefits of sharing research data have long been recognized within computational fields such as imaging neuroinformatics, ‘omics and clinical informatics (Kennedy, 2012; Piwowar et al., 2007; Pronk et al., 2015; Roundtable on Environmental Health Sciences et al., 2016), building the necessary social culture and infrastructure for data-sharing in other fields with non-standardized heterogeneous preclinical data of diverse types (‘long-tail’ data) remains a challenge (Borgman, 2012; Callahan et al., 2017; Ferguson et al., 2014; Roche et al., 2014; Tenopir et al., 2011). Relatively recent movements calling for accountability and transparency, such as open science, are challenging the traditional scientific dissemination establishment based on scientific narratives in papers. These movements have generated new forms of academic merit-based data sharing that contest the culture of data protectionism. At the same time, the lack of dedicated digital infrastructures has created a ‘semi-adoption’ of data sharing. For example, there has been an increase in the number of digital platforms for massive digital sharing, and some journals have started hosting data files. There is no doubt that these all-purpose solutions are better than non-sharing, but increasing evidence suggests that “data deposition” without curation and proper documentation might not be sufficient for achieving the goals of data sharing for reproducibility. In order to accommodate every possible need, all-purpose data repositories impose little or no requirements on file format, data structure, and documentation, making it difficult to reuse and integrate these data with other data resources. Additionally, such data repositories can create a false sense of accomplishment where researchers believe they have contributed to data sharing. However, if the data are not actually interpretable by humans, digitally interoperable by software systems, and reusable, the act is ultimately insufficient despite superficially appearing in line with the cultural movement towards open data and open science. Studies based on game theory suggest that data sharing might be beneficial if a collaborative approach is taken and data sharing is embraced as a community rather than by individuals (Pronk et al., 2015). Thus, there is a need for solutions that elevate the quality and value of shared data for reusability, which can be achieved through dedicated data services and collective efforts for specific research communities.

An important step forward is the cultural adoption of the ‘FAIR data principles’ (Wilkinson et al., 2016), a set of recommendations establishing a framework for data sharing stating that data should be Findable, Accessible, Interoperable and Reusable (FAIR). The first two are relatively easy to implement technically, although they do require a cultural shift for the research community to embrace data sharing. The all-purpose data sharing platforms do a great job of making shared data findable and accessible, offering solutions for most researchers and lowering the bar for cultural adoption of data sharing. However, increasing the utility of shared data and FAIRness requires the hosted data, and data-related resources, to be interoperable and reusable. Achieving these latter two principles requires overcoming additional engineering and data management challenges atop cultural adoption. Community-driven and problem-specific infrastructures can overcome both the sociocultural and the technical challenges to achieve FAIR data sharing. However, community acceptance and financial support are essential.

Here, we report on the adoption of FAIR data principles by the field of spinal cord injury (SCI) research, offering an example of sociocultural and technical embrace of data sharing and FAIR data principles by a specific research community. Preclinical SCI research produces diverse neuromotor recovery behavioral measures in rats, mice, nonhuman primates, and pooled de-identified human data. These neuro-behavioral data are often combined with histopathological ratings of postmortem tissue and a variety of molecular endpoints, with data often collected in an ad hoc fashion in the same individuals over time (e.g., Ferguson et al., 2013; Kyritsis et al., 2021). Clinical and preclinical researchers alike have promoted the cultural adoption of data sharing and standardization in the SCI research community through many years of collective action. For example, the creation of clinical SCI data repositories such as the National Spinal Cord Injury Model System Database (www.nscisc.uab.edu) in 1973 (DeVivo et al., 2002) and the European Multicenter Study about Spinal Cord Injury (EMSCI - emsci.org) in 2004 (Curt et al., 2004) has been instrumental for the community to understand the value of data gathering and integration. More recently, the International Spinal Cord Society (ISCoS), the American Spinal Injury Association (ASIA), and the National Institute of Neurological Disorders and Stroke (NINDS) joined efforts to develop standards such as common data elements (CDEs) for the collection and reporting of clinical research data (Biering-Sørensen et al., 2015; Charlifue et al., 2016).

The pre-clinical SCI research community similarly gained valuable collective experience leading to the current stage of data sharing. The NINDS-funded projects Multicenter Animal Spinal Cord Injury Study (MASCIS) in the 1990s (Basso et al., 1995, 1996; Young, 2002) and Facilities of Research Excellence in Spinal Cord Injury (FORE-SCI) in the 2000s (Aguilar & Steward, 2010; Anderson et al., 2009; Steward et al., 2012) led to the development of standards and procedures for SCI research in current use across the globe. The events preceding FAIR sharing in pre-clinical SCI research have accelerated in the last decade, resulting in the development of minimal reporting expectations for preclinical SCI research (MIASCI) (Lemmon et al., 2014), a knowledge base and ontology for integration of SCI research data compatible with terminology standards (RegenBase) (Callahan et al., 2016), and the curation of the Visualized Syndromic Information and Outcomes for Neurotrauma (VISION-SCI) multicenter, multi-species SCI dataset (Nielson et al., 2014). It is noteworthy that these efforts tackle diverse data types beyond those covered under standardized imaging modalities supported by the Brain Imaging Data Structure (BIDS) (Gorgolewski et al., 2016), ‘omics data standards (Chervitz et al., 2011), clinical physiological data standards such as Neurodata Without Borders (NWB) (Rübel et al., 2019; Teeters et al., 2015), and health informatics standards such as the Observational Medical Outcomes Partnership (OMOP) standards.

In parallel to these efforts, some important events have generated momentum for a cultural shift in biomedical research in general: the acknowledgment of a reproducibility crisis (Begley & Ioannidis, 2015; Ioannidis, 2005; Macleod et al., 2014; Schulz et al., 2016; Steward et al., 2012), a lack of translation of preclinical research into clinical care (Lammertse, 2013; Seyhan, 2019), and the growth of the open access and open science movements (Laakso et al., 2011). In the SCI community in particular, important events include: (1) the success of VISION-SCI in recovering and repurposing data for new discoveries (Nielson et al., 2015); (2) the development of FAIR data principles (Wilkinson et al., 2016); (3) the endorsement of FAIR by NIH and other funding agencies; (4) the Craig H. Neilsen Foundation awarding the project that seeded the ODC-SCI (“Open Data Commons for Spinal Cord Injury Research” in 2016 to ARF); (5) the generalized support of funders to the ODC-SCI effort (Wings for Life Foundation, International Spinal Research Trust, Rick Hansen Institute, the US Veterans Affairs, the Department of Defense Congressionally Directed Medical Research Program) (Fouad et al., 2020); and (6) the SCI 2020 meeting hosted by NINDS. These events have been key elements in bringing the SCI research community together, providing the cultural environment that has ultimately allowed for the development of FAIR data sharing in SCI research.

Based on this prior work, the community has directly embraced FAIR data sharing by developing and launching the Open Data Commons for SCI (ODC-SCI, odc-sci.org), a platform to share tabular data of research in the field of spinal cord injury. This included the development of a leadership plan with term limits, orderly leadership succession, and proactive change management (Callahan et al., 2017; Fouad et al., 2020). The ODC-SCI is a community-based data sharing infrastructure with the goal of democratizing SCI research data by allowing users to access existing data, contribute new data, and utilize and create user-friendly tools for analytics and SCI knowledge-discovery, all within FAIR guidelines. The goal of the present paper is to provide historical context and illustrate how members of research communities can work together toward the development of dedicated data sharing initiatives under the umbrella of FAIR. Our major conclusion is that the development and adoption of FAIR principles by a research community may require several years of collective effort by multiple stakeholders.

Methods

The process of bringing the SCI community together around FAIR is an ongoing continuum, but we have conceptualized the set of events in three stages of community involvement using agile design principles (Fig. 1): (1) bringing FAIR to the SCI community; (2) adapting FAIR to the specific challenges of SCI research; and (3) responding to community feedback. Moreover, a fourth stage of establishment, consolidation, and maturity has recently been ensured by new funding (“Facilitating SCI research, translation and transparency: Going Public with the Open Data Commons”) through a multi-agency funding mechanism (KF, contact PI) and the continuous support of multiple stakeholders (SCI foundations, SCI community organizers and advocates, publishers, governmental agencies, industry representatives, among others) (Fouad et al., 2020). Below we detail the stages and how they came about.

Fig. 1 Staged development. We have divided the process by which ODC-SCI and the SCI data-sharing community have come together into 4 stages (A). The first three stages seeded the foundations for ODC-SCI, and stage 4, which has recently started, will bring ODC-SCI to maturity. During these stages the engagement with the SCI data-sharing community and the development of tools have occurred in parallel, in both cases using agile design principles (B). These consist of performing a requirement analysis (e.g., asking the community what data need to be shared), followed by a period of design and development of tools and policies, and a period of feedback (testing) by the users and the community. When the implementation satisfies the requirements, the new functionalities can be incorporated into the ODC-SCI.

Stage One: Bringing FAIR to the Community

In 2016, the NINDS in collaboration with the ODC-SCI consortium hosted the “Developing a FAIR Share Community” workshop with different representatives of the SCI community to discuss data sharing in SCI (Callahan et al., 2017). The workshop was co-sponsored by the NINDS and the University of Alberta, with contributions from the International Spinal Research Trust, the Rick Hansen Institute, Wings for Life, and the Craig H. Neilsen Foundation. The goal was to have an open conversation with different SCI stakeholders about the readiness of the field for the challenge of data sharing and to develop a path toward adopting FAIR in the SCI community. The development and outcomes of that meeting are thoroughly discussed elsewhere (Callahan et al., 2017), though a summary of the conclusions is of interest here. The SCI community was receptive to data sharing at the time the workshop took place. This was demonstrated by polling responses that suggested the willingness of the participants to share data to some degree and by the collective efforts preceding the meeting as described above. However, there was disagreement on how open or restrictive sharing should be (i.e., available to the public or under access control). Major challenges and needs towards adopting FAIR were also identified. Community members voiced concerns about: (1) the added time required for sharing data, (2) the lack of specific funding mechanisms for researchers to practice data sharing, (3) the absence of dedicated infrastructures for sharing, discovery and reuse of SCI data, (4) the need for better standards allowing for augmented interoperability across the community, (5) the implementation of mechanisms to protect the intellectual property of the data owner, (6) the proper attribution allowing for data citation, and (7) the need to ensure the stability of the system.

The result of the 2016 meeting established a roadmap for FAIR adoption by the community. The development of the odc-sci.org platform moved forward to respond to the needs expressed by the community and establish the initial requirements to operationalize FAIR. At the same time, a steering committee was formed with individuals representing different sectors of the community to oversee the development of ODC-SCI. Moreover, broader community thoughts and ideas continued to be collected through outreach activities and presentations in scientific meetings about ODC-SCI and FAIR data sharing in SCI after the 2016 meeting.

Stage Two: Adapting FAIR to the Community

One year after the first meeting, the “FAIR SCI Ahead: the Evolution of the Open Data Commons for Spinal Cord Injury research” workshop took place as a satellite event during the 2017 SFN (Society for Neuroscience) meeting in Washington DC. This second meeting was co-sponsored by Wings for Life, International Spinal Research Trust, Craig H. Neilsen Foundation, and NINDS. The goal was to discuss fundamental questions with the SCI community about how to adopt FAIR data sharing and develop specific policies that govern sharing in the community. These questions were derived from the challenges the community expressed during the first meeting, with the intention to create a concrete list of actions toward FAIR data sharing. The detailed results of this second meeting are documented in Fouad et al. (2020). Briefly, the community-driven recommendations can be summarized as: (1) The data to be shared should be individual-subject data in tabular form that underlie analysis rather than the raw data (e.g., images in histological analysis), in order to balance the technical and sociocultural challenges. Such data could also include data from ‘failed’ experiments that are difficult to publish because of publication bias (Macleod et al., 2009; Watzlawick et al., 2014, 2019). (2) The user permissions and rights that describe who can view and use the data need a flexible policy, allowing for different members of the community to adapt the system to their specific needs. (3) Data curation and quality control processes should be put in place with enough flexibility to accommodate different goals that members of the community might have when sharing/accessing data. The creation of a ‘curation board’ formed by members of the community with expertise in SCI research was suggested. (4) The community felt that there is a need for ‘minimum information’ metadata allowing users to understand the shared data at a high level. Similarly, the adoption and augmentation of existing standards like MIASCI (Lemmon et al., 2014) and the RegenBase ontology (Callahan et al., 2016) would be useful. (5) Users should gain credit for data sharing efforts while ensuring that this does not hamper the utility of the data. A license that legally binds the data (re)-user to give appropriate credit to the data creator (e.g., Creative Commons CC-BY) was recommended by the community. (6) The use of digital object identifiers (DOIs) was approved as a viable mechanism to generate citable units that would credit researchers for sharing their data.

Stage Three: Community feedback and Testing of odc-sci.org

With the general agreements regarding issues such as models of data access, quality control, and licenses in place, the ODC-SCI team implemented features in the platform to realize the vision that the SCI community understood as FAIR and found acceptable for data sharing. After several months of internal testing, a beta release (a version to be tested by users outside of the developing team) was made available during 2018. During the first period of odc-sci.org testing with a small group of external users, it rapidly became apparent that guiding users through the structure, functionalities, and workflow of odc-sci.org was not an easy task. There was a notable learning curve associated with understanding the process and navigating through the site (e.g., from registering an account to uploading data and applying for a DOI). Members of the development team had to dedicate time to explain the ODC-SCI portal to users individually, which quickly created a bottleneck for utilization. In order to reach a broader audience and encourage more members of the SCI community to join the FAIR share movement, a third workshop entitled ‘SCI Team Research Enabling Expansion and Translation of FAIR’ (STREET-FAIR) was held as a satellite event at the 2018 SFN Neuroscience meeting. STREET-FAIR was supported by the International Neuroinformatics Coordinating Facility (INCF) with the goal of promoting FAIR data sharing principles to the SCI community by (1) providing an update on the ODC-SCI portal and its support for FAIR data sharing and (2) encouraging participation in odc-sci.org while offering one-on-one guidance for the participants to explore the portal and progress on their way to sharing data. To make the session practical and interactive, participants were challenged to use their own data as a working example in a hackathon-style format.

Upon initiation of the workshop, we realized that the ODC-SCI system was not prepared to handle the volume of network traffic of all the participants at once. The problem was rapidly detected and corrected on-site but clearly highlighted the value of the massive demonstration/work for beta testing the site to reveal unforeseen problems. Other challenges that were found during the course included technical bugs. For example, one participant observed that the ‘0’s in a dataset were transformed to blank cells in the uploaded data as a result of the ODC’s upload process misinterpreting values of “0” as missing data. Others mentioned the difficulty when using specific web browsers, pointing towards software compatibility issues. These and other technical issues raised during the workshop were rapidly corrected in updates to the platform following the conclusion of the meeting. Moreover, participants pointed out the need for improving self-explanatory tutorials and help materials that would facilitate the learning experience for those who could not attend the workshop. Notably, beyond identifying points for improvement, the workshop provided opportunities to stress test more uncommon features. For instance, participants who did not bring their laptops instead accessed the ODC-SCI through smartphones and tablets and helped verify that the site was functional on mobile devices and browsers.

The organizers and participants of the meeting concluded that having educational hands-on workshops is an instrumental tool for bringing awareness of the FAIR data sharing efforts to the broader community. Equally important, getting direct and indirect feedback (i.e., gathering opinions or watching users interact with the system using their own data) from community members representing different users with different goals is essential for the success of the collective effort. It is important to stress that, when working on their own data rather than test data, users are likely to be more engaged and may notice errors in the systems more readily. The impact of conducting the STREET-FAIR meeting is evident in both the increasing usability and robustness of the system, as well as in community adoption.

During the first quarter of 2020, a second spate of updates for the ODC-SCI platform took place to implement the basic functionalities in response to the needs of the community. Moving forward, we implemented a user-centered design approach to improve the user experience and usability of the workflows and the site. The ODC-SCI team engaged in focus meetings with fast iterations between workflow implementations and user feedback that generated a new design guided by the users. An updated ODC-SCI site was made available in April 2020 based on these sessions, with future updates planned as we are moving to the fourth stage of bringing together the SCI community around FAIR sharing.

Stage Four: Refining SCI FAIR Share and the Maturity of ODC-SCI

The culmination of these three stages reflects the completion in 2019 of the Craig H. Neilsen Foundation funding that seeded the development of the ODC-SCI. However, much work remains to be done. The current version of odc-sci.org has implemented the basic functionalities that translate most of the needs and policies decided by the community. Nonetheless, advanced functionalities such as incorporating tools for increasing interoperability and reusability, and more challenging policies and procedures such as curation or establishing a sustainability model, are still works in progress. A new multi-agency award supported by Wings for Life and the Craig H. Neilsen Foundation, entitled “Facilitating SCI research, translation and transparency: Going Public with the Open Data Commons”, ensured funding for the next 5 years. Moreover, we have in-kind support from International Spinal Research Trust and other funders supporting data sharing moving forward. The main goals for this new phase are to advance the development of odc-sci.org, to continue outreach, and to consolidate the FAIR community effort that will help release the full potential of data sharing in SCI research. Specifically, the project plan foresees the implementation of quality control and curation processes, the development of tools for better data searching and discovery to improve data findability and reusability, and the mechanisms for continuing community outreach and education.

During the initial phase of this new grant, we formalized the governing structure of the ODC-SCI and divided the organizational structure into different teams or boards: a Leadership team to coordinate the development and operation of ODC-SCI, an Executive board to offer oversight and be involved in executive decisions, a Community board to engage with the community and to receive community feedback through workshops, and a Data Science team to be responsible for data curation, quality control, and revision. The constituents of the Executive and Community boards and some of the Data Science team are members representing the broad and heterogeneous SCI stakeholder community with the commitment to serve a 3-year term. Setting term limits gives the opportunity to rotate between constituents, allowing for new ideas and visions from a rapidly changing community.

In addition to maturing the ODC-SCI community portal in this stage, we are formalizing the implementation of odc-sci.org using user-centered design practices. This has created a

Community members are defined. The most permissive account type is that of an ODC-SCI Community member associated with an ODC-SCI lab, known as a Lab member. Community members can request to be part of an ODC-SCI lab and/or create their own Lab. The user obtains a Lab member account type upon approval of this request by the respective ODC-SCI lab PI or Leadership team. Lab members can perform all actions in ODC-SCI.
Users approved for a Community member account but not associated with a specific ODC-SCI Lab can explore and access public datasets and share data peer-to-peer (feature in development). For Lab members, three possible permission levels are defined for each ODC-SCI Lab to which the user has access: regular lab member, lab manager, and principal investigator (PI). Regular lab members can only act on their own datasets or share their dataset to the private laboratory space. Lab managers have the same privileges as regular lab members but can also manage the laboratory space (i.e., accept new members, approve data to be shared to the laboratory and community spaces, or request a DOI for publication). The highest level of authority is the PI; PI’s have full control of their laboratory space including managing lab users and sharing datasets beyond the laboratory with the entire ODC-SCI Community or publishing their data to the Public space. PI’s also have the additional privilege of being able to assign the PI status to others or change the permissions of any members of their lab. In any given virtual laboratory, several members, managers, and PIs might coexist, even if they are from different research groups. This approach allows for multi-lab or multi-center data sharing in a private setting by creating a virtual lab on the ODC-SCI to manage the collaboration.

ODC-SCI Data Format and Upload

The ODC-SCI incorporates various features contributing to the interoperability and reusability of data. First, the ODC-SCI standardizes the use of the comma separated value (*.csv) format for the data file upload process. We chose .csv because it is a widely used and open format that can be opened and edited in almost any data and text editor including spreadsheet/analytics software. This offers flexibility in terms of data organization and compatibility. These features allow for a balance between human and machine readability of the

first major update of the odc-sci.org website with a stream of updates that will be released as new processes and features are incorporated. The following section offers an overview of the current implementation of odc-sci.org.

Results

The odc-sci.org system has been designed and operationalized as a framework that aligns the necessities of the SCI data-sharing community to FAIR share principles through interconnected functionalities. The process by which the community was engaged is described in the "Methods" section.

ODC-SCI Data Spaces and Account Types

The SCI data-sharing community has designed an incremental process for releasing and publishing data where data is first shared on a limited basis and then made progressively more open before final release (i.e., publication). The platform is accordingly built with hierarchical spaces for datasets. Each space determines who can access the data (Fig. 2). When a dataset is initially uploaded into the personal space, it is only accessible to the uploader and the PI of the uploader’s lab (user roles are explained in the next section). The successive levels of sharing will finally reach a public space; where at the discretion of the PI, users can publish their dataset with a creative commons (CC) license and a digital object identifier (DOI, Fig. 2). Published data is then accessible to any registered user of the ODC-SCI and does not require the audience to be part of the ODC-SCI community.

The platform is designed to reflect community concerns and agreements reached in the above-described workshops about when data would be shared and with whom.
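As a rough illustration of the tiered data spaces and lab permission levels described in this section, the sketch below encodes them in Python. The names `Space`, `Role`, and `may_release` are hypothetical and not taken from the ODC-SCI codebase, and the approval steps carried out by lab managers, PIs, or the Data Science team are deliberately left out.

```python
from enum import IntEnum


class Space(IntEnum):
    """Data spaces, ordered from most private to most open."""
    PERSONAL = 0
    LAB = 1
    COMMUNITY = 2
    PUBLIC = 3


class Role(IntEnum):
    """Permission levels a user can hold inside one lab."""
    MEMBER = 0
    MANAGER = 1
    PI = 2


def may_release(role: Role, source: Space, target: Space) -> bool:
    """Return True if `role` may start moving a dataset from `source` to `target`."""
    if target <= source:
        return False  # datasets only move towards more open spaces
    if target is Space.LAB:
        return True  # any lab member may share to the lab (approval not modelled)
    # Assumption: releasing to the community or requesting a DOI
    # needs at least lab manager rights.
    return role >= Role.MANAGER


print(may_release(Role.MEMBER, Space.PERSONAL, Space.LAB))
print(may_release(Role.MEMBER, Space.LAB, Space.COMMUNITY))
print(may_release(Role.PI, Space.LAB, Space.PUBLIC))
```

Ordering the spaces as integers keeps the "progressively more open" rule to a single comparison, and it also permits a direct move from the lab space to public release without passing through the community space.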
Four dif- data, making it accessible for the research community while ferent account types define the actions a user can perform (see maintaining a level of machine interoperability and reusability Fig. 3) built upon the Neuroscience Information Framework (Fig. 4). Requirements for data formatting using the .csv files (NIF)/SciCrunch technology stack developed by the FAIR are easy to comply with: rows are observations, columns are Data Informatics Laboratory at University of California, San variables, and the first row contains the headers or names of Diego. Becoming a registered user requires signing in by pro- the columns. The ODC-SCI database is structured at the sub- viding a valid email (preferably an institutional one) and ject-data level, and therefore a unique identifier for each sub- agreeing to the terms of use of the site. Registered users can ject represented in the .csv file is required. Beyond that, cur- request to become ODC-SCI Community members with fur- rent versions of the site do not impose further constraints on ther approval by the Leadership team. Two types of how to organize the data, giving flexibility for adapting to the Neuroinform (2022) 20:203–219 209 Fig. 2 ODC-SCI data spaces and movement of data. Data on the ODC- Community space or request DOI for publication. Datasets that are re- SCI can live in different data spaces on increasing order of privacy. The leased to the ODC-SCI Community space can be explored and accessed Personal space is the most private space where users (Registered users by any registered user who has a Community member account (eighter who are part of a Lab) can upload data, share their uploaded data with general members or members of a laboratory). From the Community their Lab (after PI/Lab manager approval) and explore and access data space, datasets can also be published by requesting DOI. This tiered that is present in the user Personal space. 
Datasets at the Lab space can be system is hierarchical, since a dataset that for instance is released to the explored and accessed by all users who are members of the same Lab. In Community space, is still present in and belongs to the original Lab space addition, PIs and Lab managers can release the data to the ODC-SCI and uploaders - Duplicate-row (Checked for publication): Rows can not be duplicated. user’s needs. During the process of uploading the .csv file to Schema errors: These errors reflect conflicts between the data dictionary the site, a few automatic checks take place to ensure the min- and the dataset. imal format requirements that would allow for integration of - Extra-header (Checked for publication): The dataset contains at least the data on the ODC-SCI database (Box 1). If a dataset does one variable name not defined in the data dictionary. not pass this check, a notification is displayed pointing to the - Missing-header (Checked for publication): The dataset is missing at source of the issue. Once the user corrects the problem(s), the least one variable name defined in the data dictionary. data can be uploaded to the site. - Missing-definition (Checked for publication): The definition of a variable in the data dictionary is missing. Box 1: ODC-S 534 CI data formatting quality checks. - Required-constraint (Checked at upload): A required field for the dataset contains no values or is not assigned on the dataset. Currently Source errors (Checked at upload): ODC-SCI can read-in the data file. the only required value in the datasets is the subject identifier. As Structure errors: These errors are caused by formatting issues on the ODC-SCI develops additional data standards, it is possible that more dataset variables will be required on all datasets. - Blank-header (Checked at upload): There is a blank variable name. All - Value-constraint (under development): The values of a variable should cells in the header row (first row) must have a value. 
be equal to one of the permitted values enumerated in the data - Duplicate-header (Checked at upload): There are multiple columns dictionary, or within the limits of the permitted values. with the same name. All column names must be unique. - Blank-row (Checked at upload): Rows must have at least one When a dataset is first uploaded, the site assigns a persistent non-blank cell. identifier to the dataset, and the user is required to provide a 210 Neuroinform (2022) 20:203–219 Fig. 3 ODC-SCI account types and functions. Access to different functions on the site are determined by the account types. Visitors to the platform with no account can only explore the metadata for published datasets but can not see nor download the data. Registered users who are not part of the ODC-SCI Community can explore and download pub- lished datasets. Registered users who become part of the ODC-SCI Community will be able to ex- plore and download published datasets, as well as get private peer sharing (feature still under development). To gain access to all the full suite of functions users will have to be part of a Lab in the ODC-SCI title and a narrative description of the content. This informa- Publishing Datasets at ODC-SCI tion is displayed in the lab space together with the name of the user who uploaded the data which facilitates other Lab Publishing datasets through ODC-SCI means making data Members in finding the data. If the dataset is released to the available to the general public (with a regular registered user ODC-SCI Community space, these metadata elements are account) under an open source license (CC-BY 4.0) and with displayed in their respective landing pages with the addi- an associated DOI which generates a citable unit. 
One of the tion of the number of observations and records contained goals of ODC-SCI is to promote FAIR data principles and in the dataset and the name of the lab where the data was reproducibility of the ODC-SCI data which requires a minimal uploaded. This allows members of the ODC-SCI standard for the datasets before it can be made public. To Community to find and identify datasets of interest. achieve such a standard, we have created a quality check Members of the SCI data-sharing community have and review process that is conducted by members of the expressed the need for these minimal metadata information ODC-SCI Data Science team. The refinement of this process about a dataset to be present in the ODC-SCI Community is ongoing, although the basic workflow and tools are in place. space, even when data remains private in the Lab space. As a requirement for publication, a data dictionary (i.e., ‘code- The ODC-SCI design team is contemplating this option for book’) must be provided. Specifically, the data dictionary future implementation which would help inform users on gives definitions, units, permitted values, and other valuable what other datasets might become available or allow users information necessary for understanding the collected data. to ask for private sharing while the data is limited to inter- This type of documentation is essential for the reusability of nal usage. Once a dataset is made public (see section be- the data but is often overlooked in general purpose reposito- low), a citation page is generated and a DOI issued. ries. Another important piece of documentation is the Neuroinform (2022) 20:203–219 211 Fig. 4 Machine vs. human readable tabular formats. How data is unique record, meaning that there are not two identical rows on the formatted into spreadsheets can affect the readability of it. As humans, dataset, and completely empty rows and columns are not allowed. 
The we benefit from visual clues such as blank spaces or colors and from ODC-SCI database is organized around the subject identification number complex data organizations that divide data into chunks (e.g., groups of and thus it must always be present in the dataset. This formatting can have subjects) (A). Although this formatting of the data can be self-explanatory different variations depending on the hierarchical relationships between for humans, the complexity and lack of a consistent structure across variables (such as in the case of repeated measures like time). For exam- researches make it challenging to generate standards that can be used ple, the same variables are collected at different timepoints, a time column by machines to process and understand data. The readability of a spread- can be specified, and subjects can be repeated in rows with records for sheet by a machine can be dramatically improved with simple rules each time point in different rows, known as semi-long format (C). (Broman & Woo, 2018) to organize the data in a structured manner (B Contrary, a new column can be created for every variable and every time to D). In ODC-SCI, data can be uploaded using spreadsheet type file (as point known as wide format (D), in which case each subject is only .csv file) where columns are variables (also known as fields), the first row present in one row. When possible, ODC-SCI recommends using semi- contain the variable names or headers and each consequent row is a long formats metadata information associated with the dataset. This is con- reviewed for potential inconsistencies in formatting and struc- stituted by an appropriate title, a structured abstract with a ture (Box 1) such as whether a variable is present in the dataset description of the study purpose, an overview of the data col- but is not defined in the data dictionary. 
Some of these checks lected and major conclusions of the study, a list of authors and are performed automatically during the dataset upload, and contributors, a list of identifiers and links to other resources others are done by members of the Data Curation team while such as an associated paper, and funding information. An reviewing the dataset for publication. As the ODC-SCI pro- editable webform for each uploaded dataset can be used to gresses, we will develop automated tools for conducting all provide this information directly on the odc-sci.org site. dataset and data dictionary formatting quality checks. The The dataset, data dictionary, and metadata undergo struc- second part of the review process is an editorial revision of tured quality checks to ensure minimal ODC-SCI standards metadata information to ensure that it contains enough infor- before publication. First, datasets and data dictionaries are mation to adhere to FAIR standards. Once the dataset, data 212 Neuroinform (2022) 20:203–219 dictionary, and metadata are approved by the Data team for the private lab space (Fig. 5D). A total of 6 datasets have been release, a DOI and citation page will be generated and made released to the community, and 4 DOIs have been generated public. It is important to keep in mind that based on commu- with the datasets available for the public (Ferguson et al., nity feedback this process has been put in place to ensure 2018; Liu et al., 2019;Puko&McTigue, 2020;Schmidt minimal quality of shared data for interoperability and reus- et al., 2019). At the time of publishing, 11 DOIs have been ability, and the review process increases the time to generate a requested and are being processed, reflecting the commitment DOI and make data public compared to general purpose re- of the community to sharing data to the public. Importantly, positories that may not provide curation services. 
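The structure and schema checks listed in Box 1 could be approximated along the following lines. This is a minimal sketch under stated assumptions, not the ODC-SCI implementation: the function names, the error labels as strings, and the `subject_id` column name are hypothetical. The required-constraint is read literally as a missing or entirely empty column, and the value-constraint and source errors are omitted apart from the empty-file case.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text)))


def structure_errors(rows, required=("subject_id",)):
    """Return Box 1-style structure error names for a parsed table."""
    if not rows:
        return ["source-error"]  # nothing could be read from the file
    names = [name.strip() for name in rows[0]]
    body = rows[1:]
    errors = []
    if any(name == "" for name in names):
        errors.append("blank-header")
    if len(set(names)) != len(names):
        errors.append("duplicate-header")
    # Only empty text counts as blank, so a literal "0" is kept as a value.
    if any(all(cell.strip() == "" for cell in row) for row in body):
        errors.append("blank-row")
    if len({tuple(row) for row in body}) != len(body):
        errors.append("duplicate-row")
    for field in required:
        column = names.index(field) if field in names else None
        has_value = column is not None and any(
            len(row) > column and row[column].strip() != "" for row in body
        )
        if not has_value:
            errors.append("required-constraint")
    return errors


def schema_errors(names, dictionary):
    """Compare dataset headers with a data dictionary {variable: definition}."""
    errors = []
    if any(name not in dictionary for name in names):
        errors.append("extra-header")
    if any(variable not in names for variable in dictionary):
        errors.append("missing-header")
    if any(definition.strip() == "" for definition in dictionary.values()):
        errors.append("missing-definition")
    return errors


good = read_rows("subject_id,day,score\nr1,1,0\nr1,7,3\n")
bad = read_rows("subject_id,score,score\nr1,0,2\nr1,0,2\n,,\n")
dictionary = {"subject_id": "Unique animal identifier", "day": "", "group": "Treatment group"}

print(structure_errors(good))
print(structure_errors(bad))
print(schema_errors(good[0], dictionary))
```

Treating only empty text as blank is a deliberate choice here: it keeps a recorded score of "0" distinct from a missing value, which is the kind of confusion reported during the STREET-FAIR workshop.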
Once published, ODC-SCI adds searchable tags to a dataset by marking up the pages with structured vocabulary such as the one provided by schema.org. This permits the user to search for the dataset DOI, for the citation information, or for a related article in a search engine resulting in web links to the publicly shared dataset. Moreover, the ODC-SCI is part of the SciCrunch ecosystem (Whetzel et al. 2015) and is indexed as a resource (RRID: SCR_016673) that can be found by the Neuroscience Information Framework (NIF, neuinfo.org).

Community Adoption and Web Analytics on the odc-sci.org

In order to evaluate the community adoption of the odc-sci.org, we derived aggregated metrics from the registered users (Fig. 5). From this information, we compiled descriptive statistics of new registrations, data uploads, and the spaces/stages where datasets are in the data release cycle (e.g., private space, lab space, DOI requested) to summarize the current data landscape of the ODC-SCI (Table 1). These metrics serve as a surrogate to study the adoption of the odc-sci.org and as an indicator of interest in data sharing by the community.

As of July 2020, 248 users are registered on the site and 57 different laboratories across Canada, Europe and USA have been created. Some peaks of activity coincide with outreach and community workshops (Fig. 5B), where the maximum number of new user registrations in a week happened during the SFN Neuroscience meeting in 2018 where the STREET-FAIR workshop took place. Other activities such as the streaming at the International Online Spinal Cord Injury Research Seminars (I-OSCIRS; https://www.youtube.com/watch?v=LZ9DhxUUkeE&t=14s) were also accompanied by an increase in the number of registered users. The number of uploaded datasets also peaked during outreach and workshop activities, although the increase in users is not always accompanied with an increase in uploads. In terms of the number of uploaded datasets per laboratory (excluding a test lab by the developing team), we observed a general trend of an increasing number of datasets from the same laboratory from 2018 to 2019–2020 (Fig. 5C). In addition, more laboratories were created in 2019–2020 with a corresponding increase in the number of labs that are uploading data to the ODC-SCI. Of the current active datasets that are not in the test laboratory, most are either in the private user space or internally approved to be shared in

the sequences of stages (private, to the lab space, to community, to public) is not necessary, and we have found that some authors choose to go directly from the laboratory space into DOI release upon completion of data curation and quality checks. To date, this type of lab-to-public sharing has largely been in the context of authors being asked by journals to provide a dataset DOI to coincide with the release of peer-reviewed papers, suggesting that publishers are starting to enforce data sharing policies and users are seeing ODC-SCI.org as a viable option to meet the requirements.

Global Visits to odc-sci.org

The odc-sci.org has been registered with Google Analytics since 2018, allowing us to measure usage and activity of the website. These tools do not allow for individual-visitor identification but rather aggregate metrics of usage of the ODC-SCI portal that can be used as indicators of community engagement to the site. One measure of global activity on the ODC-SCI is the number of returning and new visitors (Fig. 6A-B, Table 1). Fluctuations on the traffic of both new and recurrent visits can be appreciated where peaks of visitors can be associated to periods during or right after outreach activities. After most peaks of activity, traffic seems to return to a baseline level of around 2–3 new and 2–3 recurrent users a day on average. The average of new and returning visitors per day rapidly increased in the first and second quarter of 2020 with a particularly high traffic period during the I-OSCIRS seminar. It is too early to see if the latest jump in traffic will return to a similar baseline activity or if it will produce a new sustained base traffic with higher visitors on average per day.

Two potential proxies for the level of engagement of visitors with the site are the time spent (Fig. 6C-D) and the number of pages viewed (Fig. 6E-F) per session. As a general trend, returning visitors spend more time and visit more pages than new visitors, which is to be expected since there is limited content that new visitors can access until they register as users. Thus, these two measures likely convey different things about whether the visitor is new or recurrent. New visitors who become registered users may come back after closing their session, and then be counted as returning visitor with more options on the platform. There are some peaks in the mean session time and number of pages viewed by visitors associated with hands-on outreach activities such as the STREET-FAIR workshop in 2018, but other activities did not register the same pattern as the I-OSCIRS seminar. This is similar to the fluctuations in new registered users and uploaded datasets, which suggest that different outreach activities produce different behaviors in visitors and users. The geographical locations of visitors indicate international traffic to the site (Fig. 7; Table 1).

Table 1 ODC-SCI community activity as of July 2020
Registered users: 248
Active datasets: 103 with accumulated 1,379,988 rows and 18,359
Estimated individual subjects: N > 10,000
Total visits: 4799
New visits: 2649 from 46 countries
Recurrent visits: 2150
Returning visitors: 502* from 20 countries
*Note that the activity of the members of the development team are included in the returning visitors’ data, although the ODC-SCI team constitutes very few of those 502 visitors

Fig. 5 ODC-SCI activity. We tracked the activity on when users registered to the site (A), when datasets got uploaded (B), the number of uploaded datasets per Lab (C), and the status or data space where datasets are set (D). A total of 234 datasets have been uploaded. An estimated 38 % of the uploaded datasets are placeholder datasets created to explore the functionality of the portal, including datasets uploaded during development, test sets during outreach activities, and datasets by users who include “test” or “practice” on the description. Most of those datasets have been subsequently deleted and only active datasets are shown in D. Notice that although we have 11 requests for DOI at the time of writing, there are 2 datasets in preparation for being uploaded, and therefore not reflected in D

Fig. 6 Traffic of visitors to the odc-sci.org. Using Google analytics traffic monitoring data we identified new and returning visitors over time (A-B), as well as the time spent per session in minutes (C-D) and the number of pages viewed per visitor/session (E-F). A, C and E show the raw daily metrics while B, D and F show 3 weeks moving average over the same period of time. Some of the important outreach activities are annotated on the graphs: SFN 2018 STREET-FAIR workshop, the SCI2020 meeting hosted by NINDS, the press release of the new multi-agency grant launch, the SFN 2019 ODC-SCI stand as part of NIF and the I-OSCIRS online workshop

Fig. 7 Geographical origin of internet traffic to the odc-sci.org. New and Returning visitors have viewed odc-sci.org since we started monitoring traffic

Discussion and Conclusions

The present paper highlights the journey to-date that the SCI research community has undertaken to adopt the FAIR principles to promote data sharing and research transparency in the context of heterogeneous preclinical research data types (‘long-tail’ data) (Ferguson et al., 2014). The data covered by the ODC-SCI enables FAIR sharing of data that falls outside of that covered by established standards such as clinical neuroimaging (e.g., BIDS), physiology (e.g., NWB), health informatics (e.g., OMOP). The experience of direct community engagement and application of agile design principles provides a template for achieving FAIR data sharing in a research community and may be repurposed in other research communities. We would summarize the steps taken as: (1) developing a history of collective and cooperative efforts around data collection and standards; (2) early assessment of the readiness of the community, the challenges, and the specific community needs while involving different parties to provide different perspectives on the data life cycle; (3) adapting FAIR principles to the specific needs of the community; (4) seeking community involvement and feedback, and combining it with agile design principles for constant iteration. Our experience to-date has led us to a set of fundamental principles for developing community-based FAIR technology (Box 2).

Box 2: Principles behind community-based FAIR data sharing technology.
- Emphasize community acceptance ahead of engineering perfection.
- Recognize cultural change is slow and needs constant effort in parallel to offering technological solutions through a portal.
- Multiple levels of research community engagement with multiple stakeholders (researchers, consumers, funders, government, publishers) are essential for evolving a data publication culture and the data platform.
- Collaboration with funding agencies early on is essential and potentially the key for adoption of FAIR and open data sharing portals.

To-date, this community-based, agile design framework has helped odc-sci.org meet the increasing requirements of journals and funders that published papers are accompanied by open and FAIR data that promote transparency and reproducibility of research. By embracing community engagement, our goal throughout has been to empower the community to meet such demands. The design philosophy has been to both adapt the design to user demands as well as guide them towards implementing FAIR principles by keeping a user-centric design that can accommodate different user types including primary data creators, reviewers and funders, and data consumers and meta-analysts. Overall, there has been an increase in the community usage of the odc-sci.org with important peaks of activity during major events, especially STREET-FAIR and I-OSCIRS. ODC-SCI has seen not only a constant increase of community members but also of registered laboratories and movement of datasets from the private space to the community and publicly accessible space. This suggests that the community of SCI researchers are beginning to use ODC-SCI and adopt FAIR data sharing principles. It is noteworthy that there is this steady increase in traffic based on grass-roots interest without a large scale, coordinated marketing effort. However, a campaign to increase user engagement is planned as the portal moves into full production.

commons through odc-sci.org and broader neurocommons efforts will promote ever increasing knowledge through FAIR data sharing across individual researchers, laboratories, species, and perhaps even disease domains.

Information Sharing Statement
The ODC-SCI is freely accessible under registration and approval as described in this manuscript.

Acknowledgements In addition to the authors, the STREET-FAIR Workshop Participants includes: Warren Alilain, warren.alilain@uky.edu; Mark Bacon, mark3bacon@gmail.com; Nicholas Batty, nbatty@ualberta.ca; Michael Beattie, michael.beattie@ucsf.edu; Jacqueline Bresnahan, jacqueline.bresnahan@ucsf.edu; Emily Burnside, emily.burnside@kcl.ac.uk; Sarah Busch, sbusch@athersys.com; Randall Carpenter, carpenter.794@osu.edu; Isaac Francos Quijorna, isaacfq87trabajo@gmail.com; Xiaohui Guo, x1guo@ucsd.edu; Agnes Haggerty, agnes.haggerty@gmail.com; Sarah Haroon, haroon.sarah@yahoo.com; Jack Harris, jckhrrs1@gmail.com; Lyn Jakeman, lyn.jakeman@nih.gov; Linda Jones, latj@comcast.net; Naomi Kleitman, naomi@chnfoundation.org; Timothy Kopper, Timothy.Kopper@uky.edu; Michael Lane, mlane.neuro@gmail.com;

In that regard, the metrics and numbers observed during the reported period will serve as a baseline to benchmark future development.

The next steps will involve further user-driven development of data dictionaries and standards that improve interoperability. Currently, data dictionaries are optional for datasets in private lab spaces but are required files (in .csv format) for all datasets released to the public. As the ODC-SCI is further populated and data dictionaries are uploaded together with datasets, we will be able to generate a list of variables or data elements that are commonly collected by the community. The dictionaries will thus ultimately inform the generation of the ODC-SCI common data elements (CDEs), a set of standards for variables that will help augment interoperability between data sources with the potential to include ‘translational interoperability’ of dataset across species.
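One simple way to obtain such a list of commonly collected variables is to count how many data dictionaries define each variable name. The sketch below assumes each dictionary has already been reduced to a list of variable names. The function name `common_variables`, the example variable names, and the `min_count` threshold are all hypothetical, and real dictionaries would need more careful harmonisation than lower-casing.

```python
from collections import Counter


def common_variables(dictionaries, min_count=2):
    """Return variable names that appear in at least `min_count` data dictionaries."""
    counts = Counter()
    for variables in dictionaries:
        # Normalise case and whitespace, and count each variable once per dictionary.
        counts.update({name.strip().lower() for name in variables})
    return sorted(name for name, n in counts.items() if n >= min_count)


# Hypothetical variable lists taken from three uploaded data dictionaries.
dictionaries = [
    ["Subject_ID", "injury_level", "bbb_score"],
    ["subject_id", "BBB_score", "weight_g"],
    ["subject_id", "injury_level", "lesion_volume"],
]

print(common_variables(dictionaries))
```

Variables that pass the threshold would only be candidates for CDEs; deciding on definitions, units, and permitted values would remain a community task.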
The effort would mirror the establishment of clinical CDEs for SCI by the NINDS (Biering-Sørensen et al., 2015). Notably, while there are preclinical standards being developed by NINDS workgroups for several disorders, there have yet to be directed efforts for preclinical SCI.

The ODC-SCI developing team is planning to incorporate functions to map the variables (i.e., data elements) present in the ODC-SCI data to existing data elements available through InterLex/NeuroLex (Larson & Martone, 2013), a dynamic lexicon of biomedical terms maintained by NIF. With the future creation of ODC-SCI CDEs and the integration of those through InterLex, the ODC-SCI will establish the tools for developing community standards and increasing interoperability in the SCI research field. We expect that these mapping functionalities, together with sufficient metadata and documentation, will provide a common data model with enough information for reusing the ODC-SCI datasets both by humans and machines. In time these may mature to the point that they can integrate with other, more mature clinical standards such as BIDS, NWB, and OMOP, among others.

As the SCI community has demonstrated, even in the absence of tightly defined knowledge engineering, it may be possible to extract new knowledge from semi-structured data if modern machine learning analytics are leveraged. Indeed, Nielson et al. 2015 and Almeida et al. (2021; in the present issue) demonstrate the utility of analyzing FAIR data even from archival laboratory data (25 years ago) to develop and externally validate new predictors of long term neuromotor recovery. The continuing development of the SCI data

Francisco Magana, javifran737@gmail.com; David Magnuson, dsmagn01@louisville.edu; Ines Maldonado, imaldonado@sralab.org; Verena May, verena.may@wingsforlife.com; Katelyn McFarlane, katelynmcfarlane@uky.edu; Kazuhito Morioka, kazuhito.morioka@ucsf.edu; Martin Oudega, moudega@sralab.org; Philip Leo Pascual, plpascual18@gmail.com; Jean-Baptiste Poline, jean-baptiste.poline@mcgill.ca; Ephron Rosenzweig, ephronr@gmail.com; Emma Schmidt, ekschmid@ualberta.ca; Wolfram Tetzlaff, tetzlaff@icord.org; Lana Zholudeva, lana.zholudeva@gladstone.ucsf.edu.

Funding Supported by: National Institutes of Health (NS088475, NS106899 to A.R.F.), the US Department of Veterans Affairs (I01RX002245, I01RX002787 to A.R.F.), Wings for Life (to A.R.F.) and the Craig H. Neilsen Foundation (to A.R.F.). The STREET-FAIR workshop was supported by a workshop grant from the International Neuroinformatics Coordinating Facility (INCF) (to A.R.F.). ODC-SCI.org development is supported by a multi-funder grant (Wings for Life, Craig H. Neilsen Foundation, in-kind support from International Spinal Research Trust) (to K.F.)

Data Availability Not applicable.

Code Availability Not applicable.

Declarations

Conflicts of Interest/Competing interests J.-B.P. was partially funded by National Institutes of Health (NIH) NIH-NIBIB P41 EB019936 (ReproNim) NIH-NIMH R01 MH083320 and NIH RF1 MH120021 (NIDM), NIMH Award Number R01MH096906 (Neurosynth), as well as the Canada First Research Excellence Fund, awarded to McGill University for the Healthy Brains for Healthy Lives initiative.

Ethics Approval Not applicable.

Consent to Participate Not applicable.

Consent for Publication Not applicable.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

Aguilar, R. M., & Steward, O. (2010). A bilateral cervical contusion injury model in mice: Assessment of gripping strength as a measure of forelimb motor function.

Broman, K. W., & Woo, K. H. (2018). Data Organization in Spreadsheets. The American Statistician, 72(1), 2–10. https://doi.org/10.1080/00031305.2017.1375989.

Callahan, A., Abeyruwan, S. W., Al-Ali, H., Sakurai, K., Ferguson, A. R., Popovich, P. G., Shah, N. H., Visser, U., Bixby, J. L., & Lemmon, V. P. (2016). RegenBase: A knowledge base of spinal cord injury biology for translational research. Database: The Journal of Biological Databases and Curation, 2016. https://doi.org/10.1093/database/baw040.

Callahan, A., Anderson, K. D., Beattie, M. S., Bixby, J. L., Ferguson, A. R., Fouad, K., Jakeman, L. B., Nielson, J. L., Popovich, P. G., Schwab, J. M., Lemmon, V. P. & FAIR Share Workshop Participants. (2017). Developing a data sharing community for spinal cord injury research. Experimental Neurology, 295, 135–143. https://doi.org/10.1016/j.expneurol.2017.05.012.

Chan, A.-W., Song, F., Vickers, A., Jefferson, T., Dickersin, K., Gøtzsche, P. C., Krumholz, H. M., Ghersi, D., & van der Worp, H. B. (2014). Increasing value and reducing waste: Addressing inaccessible research. The Lancet, 383(9913), 257–266. https://doi.org/10.1016/S0140-6736(13)62296-5.

Charlifue, S., Tate, D., Biering-Sorensen, F., Burns, S., Chen, Y., Chun, S., Jakeman, L. B., Kowalski, R. G., Noonan, V. K., & Ullrich, P. (2016). Harmonization of Databases: A Step for Advancing the Knowledge About Spinal Cord Injury. Archives of Physical Medicine and Rehabilitation, 97(10), 1805–1818. https://doi.org/
Experimental Neurology, 221(1), 38– 10.1016/j.apmr.2016.03.030. 53. https://doi.org/10.1016/j.expneurol.2009.09.028. Chervitz, S. A., Deutsch, E. W., Field, D., Parkinson, H., Quackenbush, Almeida, C. A., Torres-Espin, A., Huie, J. R., Sun, D., Noble-Haeusslein, J., Rocca-Serra, P., Sansone, S.-A., Stoeckert, C. J., Taylor, C. F., L. J., Young, W., Beattie, M. S., Bresnahan, J. C., Nielson, J. L., Taylor, R., & Ball, C. A. (2011). Data standards for Omics data: The Ferguson, A. R. (2021). Excavating FAIR Data: the Case of the basis of data sharing and reuse. Methods in Molecular Biology Multicenter Animal Spinal Cord Injury Study (MASCIS), Blood (Clifton, N.J.), 719,31–69. https://doi.org/10.1007/978-1-61779- Pressure, and Neuro-Recovery. Neuroinformatics. https://doi.org/ 027-0_2. 10.1007/s12021-021-09512-z. Curt, A., Schwab, M. E., & Dietz, V. (2004). Providing the clinical basis Anderson, K. D., Sharp, K. G., Hofstadter, M., Irvine, K.-A., Murray, M., for new interventional therapies: Refined diagnosis and assessment & Steward, O. (2009). Forelimb locomotor assessment scale of recovery after spinal cord injury. Spinal Cord, 42(1), 1–6. https:// (FLAS): Novel assessment of forelimb dysfunction after cervical doi.org/10.1038/sj.sc.3101558. spinal cord injury. Experimental Neurology, 220(1), 23–33. https:// DeVivo, M. J., Go, B. K., & Jackson, A. B. (2002). Overview of the doi.org/10.1016/j.expneurol.2009.08.020. national spinal cord injury statistical center database. The Journal Basso, D. M., Beattie, M. S., & Bresnahan, J. C. (1995). A sensitive and of Spinal Cord Medicine, 25(4), 335–338. https://doi.org/10.1080/ reliable locomotor rating scale for open field testing in rats. Journal 10790268.2002.11753637. of Neurotrauma, 12(1), 1–21. https://doi.org/10.1089/neu.1995.12. Ferguson, A. R., Irvine, K.-A., Gensel, J. C., Nielson, J. L., Lin, A., Ly, J., et al. (2013). Derivation of multivariate syndromic outcome metrics Basso, D. M., Beattie, M. S., Bresnahan, J. 
C., Anderson, D. K., Faden, for consistent testing across multiple models of cervical spinal cord A. I., Gruner, J. A., Holford, T. R., Hsu, C. Y., Noble, L. J., Nockels, injury in rats. PLoS One, 8(3), e59712. https://doi.org/10.1371/ R., Perot, P. L., Salzman, S. K., & Young, W. (1996). MASCIS journal.pone.0059712. evaluation of open field locomotor scores: Effects of experience Ferguson, A. R., Irvine, K.-A., Gensel, J. C., Nielson, J. L., Lin, A., Ly, J., and teamwork on reliability. Multicenter Animal Spinal Cord Segal, M. R., Ratan, R. R., Bresnahan, J. C., & Beattie, M. S. (2018). Injury Study. Journal of Neurotrauma, 13(7), 343–359. https://doi. Cervical (C5), unilateral spinal cord injury with diverse injury mo- org/10.1089/neu.1996.13.343. dalities, multiple behavioral outcomes, and histopathology. Open Begley, C. G., & Ioannidis, J. P. A. (2015). Reproducibility in science: Data Commons for Spinal Cord Injury [Text/csv,application/zip,ap- Improving the standard for basic and preclinical research. plication/x-zip-compressed]. In Open Data Common for Spinal Circulation Research, 116(1), 116–126. https://doi.org/10.1161/ Cord Injury (1.0, p. ODC-SCI:26). https://doi.org/10.7295/ CIRCRESAHA.114.303819. W9T72FMZ. Biering-Sørensen, F., Alai, S., Anderson, K., Charlifue, S., Chen, Y., Ferguson, A. R., Nielson, J. L., Cragin, M. H., Bandrowski, A. E., & DeVivo, M., Flanders, A. E., Jones, L., Kleitman, N., Lans, A., Martone, M. E. (2014). Big data from small data: Data-sharing in the Noonan, V. K., Odenkirchen, J., Steeves, J., Tansey, K., “long tail” of neuroscience. Nature Neuroscience, 17,1442–1447. Widerström-Noga, E., & Jakeman, L. B. (2015). Common data el- https://doi.org/10.1038/nn.3838. ements for spinal cord injury clinical research: A National Institute Fouad, K., Bixby, J. L., Callahan, A., Grethe, J. S., Jakeman, L. B., for Neurological Disorders and Stroke project. Spinal Cord, 53(4), Lemmon, V. P., Magnuson, D. S. K., Martone, M. E., Nielson, J. 
265–277. https://doi.org/10.1038/sc.2014.246. L., Schwab, J. M., Taylor-Burds, C., Tetzlaff, W., Torres-Espin, A., Borgman, C. L. (2012). The conundrum of sharing research data. Journal Ferguson, A. R., the FAIR-SCI Ahead Workshop Participants, Alam, S., Bacon, M., Bambrick, L., Basso, M., … Rabchevsky, S. of the American Society for Information Science and Technology, 63(6), 1059–1078. https://doi.org/10.1002/asi.22634. (2020). FAIR SCI Ahead: The Evolution of the Open Data 218 Neuroinform (2022) 20:203–219 Commons for Pre-Clinical Spinal Cord Injury Research. Journal of Lum, P. Y., Carlsson, G. E., Manley, G. T., Young, W., Beattie, M. Neurotrauma, 37(6), 831–838. https://doi.org/10.1089/neu.2019. S., Bresnahan, J. C., & Ferguson, A. R. (2015). Topological data 6674. analysis for discovery in preclinical spinal cord injury and traumatic brain injury. Nature Communications, 6,8581. https://doi.org/10. Gorgolewski, K. J., Auer, T., Calhoun, V. D., Craddock, R. C., Das, S., 1038/ncomms9581. Duff, E. P., Flandin, G., Ghosh, S. S., Glatard, T., Halchenko, Y. O., Handwerker, D. A., Hanke, M., Keator, D., Li, X., Michael, Z., Piwowar, H. A., Day, R. S., & Fridsma, D. B. (2007). Sharing Detailed Maumet,C.,Nichols,B.N.,Nichols,T.E.,Pellman,J., … Research Data Is Associated with Increased Citation Rate. PLOS Poldrack, R. A. (2016). The brain imaging data structure, a format ONE, 2(3), e308. https://doi.org/10.1371/journal.pone.0000308. for organizing and describing outputs of neuroimaging experiments. Pronk, T. E., Wiersma, P. H., van Weerden, A., & Schieving, F. (2015). A Scientific Data, 3, 160044. https://doi.org/10.1038/sdata.2016.44. game theoretic analysis of research data sharing. PeerJ, 3,e1242. Ioannidis, J. P. A. (2005). Why most published research findings are https://doi.org/10.7717/peerj.1242. false. PLoS Medicine, 2(8), 6. Puko, N., & McTigue, D. M. (2020). Data for manuscript: Delayed short- Kennedy, D. N. (2012). 
The benefits of preparing data for sharing even term tamoxifen treatment does not promote remyelination or neuron when you don’t. Neuroinformatics, 10(3), 223–224. https://doi.org/ sparing after spinal cord injury [Text/csv, application-zip,applica- 10.1007/s12021-012-9154-1. tion/x-zip-compressed]. Open Data Commons for Spinal Cord Kyritsis, N., Torres-Espín, A., Schupp, P. G., Huie, J. R., Chou, A., Injury (1.0, p. ODC-SCI:419). https://doi.org/10.34945/F5QP4H. Duong-Fernandez, X., Thomas, L. H., Tsolinas, R. E., Hemmerle, Roche, D. G., Lanfear, R., Binning, S. A., Haff, T. M., Schwanz, L. E., D. D., Pascual, L. U., Singh, V., Pan, J. Z., Talbott, J. F., Whetstone, Cain, K. E., et al. (2014). Troubleshooting public data archiving: W. D., Burke, J. F., DiGiorgio, A. M., Weinstein, P. R., Manley, G. suggestions to increase participation. PLOS Biology, 12(1), T.,Dhall,S. S., … Beattie, M. S. (2021). Diagnostic blood RNA e1001779. https://doi.org/10.1371/journal.pbio.1001779. profiles for human acute spinal cord injury. The Journal of Roundtable on Environmental Health Sciences, Practice, R., B. on P. H. Experimental Medicine, 218(3). https://doi.org/10.1084/jem. and Division, P. H. H. and M., & National Academies of Sciences, 20201795. E. (2016). The Benefits of Data Sharing. In Principles and Laakso, M., Welling, P., Bukvova, H., Nyman, L., Björk, B.-C., & Obstacles for Sharing Data from Environmental Health Research: Hedlund, T. (2011). The development of open access journal pub- Workshop Summary. National Academies Press, lishing from 1993 to 2009. PLoS One, 6(6), e20961. https://doi.org/  Washington, DC. https://www.ncbi.nlm.nih.gov/books/ 10.1371/journal.pone.0020961. NBK362433/. Lammertse, D. P. (2013). Clinical trials in spinal cord injury: Lessons Rübel, O., Tritt, A., Dichter, B., Braun, T., Cain, N., Clack, N., Davidson, learned on the path to translation. The 2011 International Spinal T. 
J., Dougherty, M., Fillion-Robin, J.-C., Graddis, N., Grauer, M., Cord Society Sir Ludwig Guttmann Lecture. Spinal Cord, 51(1), Kiggins, J. T., Niu, L., Ozturk, D., Schroeder, W., Soltesz, I., 2–9. https://doi.org/10.1038/sc.2012.137. Sommer, F. T., Svoboda, K., Lydia, N., … Bouchard, K. (2019). Larson, S. D., & Martone, M. E. (2013). NeuroLex.org: An online frame- NWB:N 2.0: An Accessible Data Standard for Neurophysiology. work for neuroscience knowledge. Frontiers in Neuroinformatics, BioRxiv, 523035. https://doi.org/10.1101/523035. 7. https://doi.org/10.3389/fninf.2013.00018. Schmidt, E., Torres-Espin, A., Raposo, P., Madsen, K., Kigerl, K., Popovich, P., Fenrich, K. F., & Fouad, K. (2019). Data for the Lemmon, V. P., Ferguson, A. R., Popovich, P. G., Xu, X.-M., Snow, D. manuscript: Fecal transplant prevents gut dysbiosis and anxiety-like M., Igarashi, M., Beattie, C. E., Bixby, J. L., & MIASCI Consortium behaviour after spinal cord injury in rats [Text/csv,application/zip, (2014). Minimum information about a spinal cord injury experi- application/x-zip-compressed]. In Open Data Commons for Spinal ment: A proposed reporting standard for spinal cord injury experi- Cord Injury (1.0, p. ODC-SCI:262). https://doi.org/10.7295/ ments. Journal of Neurotrauma, 31(15), 1354–1361. https://doi.org/ W97942VQ. 10.1089/neu.2014.3400. Liu, Y., Wang, X., Li, W., Zhang, Q., Li, Y., Zhang, Z., Zhu, J., Chen, B., Schulz, J. B., Cookson, M. R., & Hausmann, L. (2016). The impact of Williams, P. R., Zhang, Y., Yu, B., Gu, X., & He, Z. (2019). T10 fraudulent and irreproducible data to the translational research cri- lateral hemisection spinal cord injury with multiple histological and sis—Solutions and implementation. Journal of Neurochemistry, behavioral outcomes [Text/csv,application/zip,application/x-zip- 139(Suppl 2), 253–270. https://doi.org/10.1111/jnc.13844. compressed]. Open Data Commons for Spinal Cord Injury (1.0, p. Seyhan, A. A. (2019). 
Lost in translation: The valley of death across ODC-SCI:212). https://doi.org/10.7295/W9HQ3X20. preclinical and clinical divide – identification of problems and over- Macleod, M. R., Fisher, M., O’Collins, V., Sena, E. S., Dirnagl, U., Bath, coming obstacles. Translational Medicine Communications, 4(1), P. M. W., Buchan, A., van der Worp, H. B., Traystman, R., 1–19. https://doi.org/10.1186/s41231-019-0050-7. Minematsu, K., Donnan, G. A., & Howells, D. W. (2009). Good Steward, O., Popovich, P. G., Dietrich, W. D., & Kleitman, N. (2012). laboratory practice: Preventing introduction of bias at the bench. Replication and reproducibility in spinal cord injury research. Stroke, 40(3), e50–e52. https://doi.org/10.1161/STROKEAHA. Experimental Neurology, 233(2), 597–605. https://doi.org/10. 108.525386. 1016/j.expneurol.2011.06.017. Macleod, M. R., Michie, S., Roberts, I., Dirnagl, U., Chalmers, I., Teeters, J. L., Godfrey, K., Young, R., Dang, C., Friedsam, C., Wark, B., Ioannidis, J. P. A., Salman, R. A.-S., Chan, A.-W., & Glasziou, P. et al. (2015). Neurodata without borders: creating a common data (2014). Biomedical research: Increasing value, reducing waste. The format for neurophysiology. Neuron, 88(4), 629–634. https://doi. Lancet, 383(9912), 101–104. https://doi.org/10.1016/S0140- org/10.1016/j.neuron.2015.10.025. 6736(13)62329-6. Tenopir, C., Allard, S., Douglass, K., Aydinoglu, A. U., Wu, L., Read, E., Nielson, J. L., Guandique, C. F., Liu, A. W., Burke, D. A., Lash, A. T., et al. (2011). Data sharing by scientists: practices and perceptions. Moseanko, R., et al. (2014). Development of a database for transla- PLoS One, 6(6), e21101. https://doi.org/10.1371/journal.pone. tional spinal cord injury research. Journal of Neurotrauma, 31(21), 0021101. 1789–1799. https://doi.org/10.1089/neu.2014.3399. Watzlawick, R., Antonic, A., Sena, E. S., Kopp, M. A., Rind, J., Dirnagl, Nielson, J. L., Paquette, J., Liu, A. W., Guandique, C. F., Tovar, C. 
A., U., Macleod, M., Howells, D. W., & Schwab, J. M. (2019). Inoue, T., Irvine, K.-A., Gensel, J. C., Kloke, J., Petrossian, T. C., Outcome heterogeneity and bias in acute experimental spinal cord Neuroinform (2022) 20:203–219 219 injury: A meta-analysis. Neurology, 93(1), e40–e51. https://doi.org/ Wilkinson, M. D., Dumontier, M., Aalbersberg, Ij. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva 10.1212/WNL.0000000000007718. Santos, L. B., Bourne, P. E., Bouwman, J., Brookes, A. J., Clark, Watzlawick, R., Sena, E. S., Dirnagl, U., Brommer, B., Kopp, M. A., T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C. T., Macleod, M. R., Howells, D. W., & Schwab, J. M. (2014). Effect Finkers, R., … Mons, B. (2016). The FAIR Guiding Principles for and reporting bias of RhoA/ROCK-blockade intervention on loco- scientific data management and stewardship. Scientific Data, 3, motor recovery after spinal cord injury: A systematic review and 160018. https://doi.org/10.1038/sdata.2016.18. meta-analysis. JAMA Neurology, 71(1), 91–99. https://doi.org/10. Young, W. (2002). Spinal cord contusion models. Progress in Brain 1001/jamaneurol.2013.4684. Research, 137, 231–255. https://doi.org/10.1016/s0079-6123(02) Whetzel, P. L., Grethe, J. S., Banks, D. E., & Martone, M. E. (2015). The 37019-5. NIDDK Information Network: A Community Portal for Finding Data, Materials, and Tools for Researchers Studying Diabetes, Digestive, and Kidney Diseases. PLoS One., 10(9), e0136206. Publisher’sNote Springer Nature remains neutral with regard to jurisdic- https://doi.org/10.1371/journal.pone.0136206. tional claims in published maps and institutional affiliations.

Journal

Neuroinformatics, Springer Journals

Published: Jan 1, 2022

Keywords: Data sharing; FAIR; spinal cord injury; neurotrauma; data reuse; community repository

References