Deep Blue Data is part of a suite of services provided by the University of Michigan (U-M) Library to faculty, students, and others currently affiliated with U-M and their direct collaborators. Deep Blue Data services are offered in support of the Library's Mission Statement: "to support, enhance, and collaborate in the instructional, research, and service activities of the faculty, students, and staff, and contribute to the common good by collecting, organizing, preserving, communicating, and sharing the record of human knowledge."
You are solely responsible for the contents of your deposit and agree that the University of Michigan is not responsible for the contents of your deposit.
You represent and warrant that:
- the deposit is your original work, and/or that you have the authority to authorize the uses contained in the license you have selected for your deposit;
- if your deposit contains material that is not your original work, either you have the right to deposit the materials in this context or you have obtained the necessary permission to do so;
- any third-party material is clearly and appropriately identified and acknowledged in the content of the deposit;
- your deposit does not infringe upon anyone's rights (e.g., copyright, privacy, defamation, etc.), breach a contract, violate the law, or contain unlawful material.
You are responsible for ensuring that research data involving human subjects comply with the requirements of the University of Michigan's Institutional Review Board (IRB).
You understand and agree that Deep Blue services are provided "as is" without warranty of any kind, either express or implied, including without limitation, the implied warranties of merchantability or fitness for a particular purpose.
You understand and agree that if the Library determines that the terms of service have been violated, it may limit or remove access to the deposited material in question, leaving descriptive metadata and a notice to explain the reason for the removal. The descriptive metadata and the notice will be visible to those who have its persistent URL.
If you view or download information from Deep Blue repositories, you agree that Deep Blue services and content therein are provided "as is" without warranty of any kind, either express or implied, including without limitation, the implied warranties of merchantability or fitness for a particular purpose. Use of Deep Blue Data is at your own risk.
You agree that Deep Blue repositories and their administrator, the University of Michigan, shall have no liability for any consequential, indirect, punitive, special or incidental damages, whether foreseeable or unforeseeable (including, but not limited to, claims for defamation, errors, loss of data, or interruption in availability of data), arising out of or relating to your use of Deep Blue repositories or any resource that you access through Deep Blue repositories.
Deep Blue repositories host content from a number of authors. The statements and views of these authors are theirs alone, and do not reflect the stances or policies of the University of Michigan or their sponsors, nor does their posting imply the endorsement of the University of Michigan or their sponsors.
Deep Blue repositories collect usage data. The U-M Library's revised statement on privacy and confidentiality describes how these data are collected and used.
2. Submission & Deposit Policy
Who can deposit data
Deep Blue Data welcomes deposits from current University of Michigan (U-M) faculty and research staff. Submissions from multi-institutional collaborations are acceptable provided that they include at least one participant who is actively employed by the University of Michigan.
Faculty and research staff can deposit work on their own behalf. They can also permit others to act as their proxies to deposit their work, or serve as sponsors of those who can deposit additional work.
Proxies deposit work on behalf of others — i.e., the proxy is not necessarily an author or co-author of the work — and both the faculty/research staff member and the proxy will be depositors of record. Sponsored depositors can deposit their own work; the sponsoring faculty or research staff member does not have to be an author or co-author of work deposited by people they sponsor, but will be noted as a sponsor of that work (not displayed publicly).
The kind of work Deep Blue Data accepts
Deep Blue Data accepts data from all disciplines. Data submitted should be complete and must be ready for distribution and re-use by others. One aspect of re-usability is that data must include appropriate descriptive metadata.
Work Deep Blue Data cannot accept
Deep Blue Data is designed to make data publicly available. However, there are several types of data that cannot be accepted by Deep Blue Data, including:
- Data under the purview of U-M Research Compliance Programs that are subject to export controls, present a conflict of interest for the University or the Researcher(s), or whose distribution would otherwise constitute a violation of research ethics or compliance.
- Data that contain personally identifiable information (PII), such as data subject to HIPAA, FERPA, or other regulations (including IRB requirements) that would prohibit public access to the data.
- Unless they are being used for research purposes, administrative data will not be accepted without the prior consent and agreement of the U-M Library. This includes data that meet one or more of the following criteria:
- They are relevant to planning, managing, operating, controlling, or auditing administrative functions of an administrative or academic unit of the University;
- They are generally referenced or required for use by more than one organizational unit;
- They are included in an official University administrative report.
- Sensitive data will not be accepted. To determine whether data are considered sensitive, please consult the Sensitive Data Guide or contact the Research Ethics & Compliance group. For help creating a publicly shareable version of your sensitive data, contact the Center for Statistical Consultation and Research (CSCAR).
The Library reserves the right to review data and to refuse or remove any data that do not meet the criteria described above.
Access to material in Deep Blue Data
All data deposited in Deep Blue Data are immediately accessible worldwide. We will clearly identify depositors' name(s) as the author(s) or owner(s) of the submission, and will not make any alteration, other than as allowed by this license, to the submission.
Depositors must choose to apply one of several Creative Commons licenses to the data. Depositors authorize the Library to distribute the data under the terms of the license they have selected.
Metadata is the descriptive information provided by the depositor including the abstract as described below. Regardless of which license a depositor chooses for data distribution, metadata will always be distributed with the CC0 mark in order to facilitate the broadest possible use.
Deep Blue Data requires each deposited dataset to be accompanied by descriptive metadata. While some metadata is basic (title, author, etc.), it's important to acknowledge that data require more metadata than traditional scholarly works. Discoverability for publications like journal articles or book chapters is aided by full text search functionality; this is not the case for many other formats. A detailed description of a dataset's origins, purpose, and use is essential for both discoverability and re-use. The following are minimum requirements for the researcher-deposited metadata:
- title: a descriptive title for the data and any associated works you deposit along with it
- author(s): those responsible for creating the data
- creation date: the date the data was created, as determined by its author(s) - mm/dd/yyyy
- method: a description of method for collecting the dataset (this can include equipment used for sensing, survey methodology, etc.)
- description: a description of each dataset (including, for example, codebooks and descriptions of variables along with corresponding units, etc.)
- license: the researcher will choose one of the available Creative Commons licenses
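As an illustration only, a depositor could check a metadata record against the minimum requirements above before submitting. The sketch below is not the repository's own validation: the field names and the `missing_or_invalid` helper are hypothetical, and the record is assumed to be a plain dictionary with the creation date stored as text.

```python
from datetime import datetime

# Hypothetical field names standing in for the six minimum requirements;
# they are not the repository's actual schema.
REQUIRED_FIELDS = (
    "title",
    "authors",
    "creation_date",
    "method",
    "description",
    "license",
)


def missing_or_invalid(metadata: dict) -> list[str]:
    """Return a list of problems found in a metadata record (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        # A falsy value covers missing keys, empty strings, and empty lists.
        if not metadata.get(field):
            problems.append(f"missing: {field}")

    date_text = metadata.get("creation_date")
    if date_text:
        try:
            # The policy asks for mm/dd/yyyy.
            datetime.strptime(date_text, "%m/%d/%Y")
        except ValueError:
            problems.append("creation_date is not in mm/dd/yyyy form")
    return problems
```

An empty list means every required field is present and the date parses; anything else lists what to fix before deposit.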
Additionally, the system generates metadata automatically to help us maintain the integrity of your files and to provide a complete record of how the data and its presentation have changed over time. Generated metadata include:
- timestamp: time and date stamping for deposits and changes to deposits, including changes to metadata, to track modifications over time
- depositor: who made the deposit and any subsequent changes, in case we need to consult with them about issues that arise
- checksums: error detection mechanism to help ensure the integrity of the files
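For readers unfamiliar with these terms, the sketch below shows what a generated record of this kind might look like. It is an assumption-laden illustration, not the system's implementation: the policy does not name a checksum algorithm (SHA-256 is assumed here), and the `generated_metadata` function and its key names are hypothetical.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def generated_metadata(path: Path, depositor: str) -> dict:
    """Build a record with the three generated fields named in the policy.

    Key names and the choice of SHA-256 are assumptions for illustration.
    """
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "depositor": depositor,
        "checksum_sha256": sha256_of(path),
    }
```

Recording the checksum at deposit time is what later allows a file to be compared against its original state.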
Storage and costs
The Library does not foresee placing cumulative storage limits on Deep Blue Data deposits. There are some constraints on the size of deposits based on the following technical limits:
- Self-deposit: The maximum size of a deposit uploaded through the online interface is 5 GB. There is no limit on the number of uploads that can be submitted through self-service.
- Mediated deposit: Data deposits over 5 GB but under 1 TB need to be uploaded to the system through a separate tool. Researchers with individual files or aggregations of files that fall within these parameters should contact Research Data Services to facilitate deposits directly into the repository. There is no limit on the number of deposits that can be made through the mediated service.
- Large deposits: Single files or single aggregations of files exceeding 1 TB may pose challenges based on the underlying Deep Blue Data repository software. In cases where single files or single aggregations of files exceed 1 TB the Library will consult with depositors to assess available archiving and access options. The Library desires to be a partner in the archiving of research data no matter the size. Please contact Research Data Services.
Currently there is no cost to the researcher for using Deep Blue Data within the guidelines and technical limitations stated in this policy. The Deep Blue Data platform is currently focused on the management of small-to-medium sized data, as defined in the self-deposit and mediated deposit items above. As we scale the usage, capabilities, and capacities of the system, we may introduce cost-sharing or other cost models to support new services.
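The size thresholds in this section can be expressed as a small decision rule. The sketch below is illustrative only and rests on two assumptions the policy does not settle: sizes are treated as decimal units (1 GB = 10^9 bytes), and a deposit of exactly 1 TB is treated as a large deposit. The `deposit_route` name is hypothetical.

```python
GB = 10**9   # assumption: decimal gigabytes; the policy does not say GB vs. GiB
TB = 10**12  # assumption: decimal terabytes


def deposit_route(size_bytes: int) -> str:
    """Suggest a deposit path for a single file or aggregation of files."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes <= 5 * GB:
        return "self-deposit"
    if size_bytes < TB:
        return "mediated deposit"
    # Exactly 1 TB is not addressed by the policy; this sketch treats it as large.
    return "large deposit"
```

For example, `deposit_route(2 * GB)` returns `"self-deposit"`, while `deposit_route(200 * GB)` returns `"mediated deposit"`, which is the point at which to contact Research Data Services.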
Support for depositors
The Library offers consultation services and best practices documentation to help guide Deep Blue Data users. Contact us for assistance.
3. Collections & Content
Defining Research Data
For the purposes of Deep Blue Data, research data are defined as representations of observations, objects, or other entities used as evidence of phenomena for the purposes of research or scholarship. In practical terms, Deep Blue Data will accept data that were developed or used in the support of research activities of U-M faculty, students and staff.
As the intent of the Deep Blue Data repository is to make data as openly available as possible for discovery, understanding, and reuse, we strongly encourage the submission of data in formats that are open and nonproprietary.
If data cannot be converted to nonproprietary formats, we then encourage data submission in formats that are widely used.
Deep Blue Data will accept data in proprietary formats provided that these formats are appropriate for the research communities who are likely to have an interest in the data. However, it may not be possible to provide as high a level of preservation service for proprietary formats (see Preservation Policy).
Data submitted to Deep Blue Data will be reviewed after 10 years to determine whether each data set should be retained, and retained data will be subject to further periodic reviews thereafter. The goal of these reviews is to identify and possibly remove data that have reached the end of their use and reuse life cycle, or have become inaccessible (e.g. because of format obsolescence). The retention review will be conducted by the Data Curation Librarian, appropriate subject librarian(s), and, whenever possible, the depositor. The retention decision will be driven by a determination of the ongoing value to the research community. Long-term retention will also be determined by the file-format-based preservation levels assigned upon deposit. Any data removed from the repository will be returned to the depositor whenever possible and documented with a tombstone record, which is the metadata retained from a deleted record for the purposes of permanence.
Removing work from Deep Blue Data
Depositors can remove their work from Deep Blue Data with the assistance of and after consultation with staff if there is a mutual determination that the work is not appropriate for the service. Whenever work is removed, a tombstone record will remain.
If the depositor requests that the data be withdrawn from Deep Blue Data, the Library will take the following factors into consideration:
- If the data has been shown to contain inaccurate or faulty information
- If there is evidence of the data being used, cited, or downloaded
The Library also reserves the right to remove any deposit for reasons including:
- It was not appropriate for deposit (e.g. it contains sensitive information, viruses or other malware, or if we receive a verified complaint that it contains materials determined to be an infringement of copyright)
- It is no longer of active interest, as determined by the retention review described above
In such cases we will make reasonable attempts to contact the depositor so they can arrange for a new home for the data. A tombstone record will always remain for any deposit that is removed.
Copyright and Take-Down Notification
Please refer to the Library and University policy and procedures on copyright and take-down.
4. Preservation Policy
The University of Michigan Library is committed to providing preservation services for Deep Blue Data. The ongoing development of policies, procedures, and technical infrastructure facilitating these services is informed by community standards and best practice, such as the Open Archival Information System (OAIS) Reference Model (2012), the Trusted Digital Repository (TDR) Checklist, and practice at peer institutions. The Library is committed to the use of open content standards and facilitating the preservation of content independent of repository software. This policy is subject to change; the Library will make its best efforts to communicate these changes to depositors.
All content in the Deep Blue Data repository will receive services commonly referred to as "bit-level" preservation. These include quality control measures such as checksum creation and fixity checks, the creation of archival backups, and file format characterization. This level of preservation is intended to ensure data are as they were deposited for as long as that content is stored in the repository (ten years or more depending on the retention review) and protect content from preservation threats such as bit-rot and unintended changes/deletions. All content will be stored on an enterprise-grade infrastructure with appropriate disaster recovery capabilities, security, and media replacement schedules.
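To make "fixity check" concrete: it means recomputing each file's checksum and comparing it with the value recorded earlier. The sketch below is a minimal illustration, not the repository's tooling. It assumes SHA-256 checksums and a hypothetical manifest mapping relative file paths to expected checksums, and for brevity it reads each file into memory in one piece.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file (read whole, for brevity)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def fixity_failures(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose checksum has changed.

    `manifest` is a hypothetical mapping of relative path -> expected digest.
    """
    failures = []
    for relative_path, expected in sorted(manifest.items()):
        target = root / relative_path
        if not target.is_file() or file_sha256(target) != expected:
            failures.append(relative_path)
    return failures
```

An empty result means every listed file is present and unchanged; any entry in the list signals possible bit-rot or an unintended change or deletion.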
To facilitate basic preservation services, deposited content must not be encrypted. Compression (.zip, .gz, .tar.gz) may be used for deposits, though it is discouraged. ZIP or tar files are only as good as their contents; best practices discussed above and below still apply.
More robust services are necessary for longer-term preservation, especially in cases where data will be retained for longer than 10 years. These services include file format migration informed by ongoing file format monitoring, and the creation of preservation metadata documenting these transformative actions. Due to the volume and diversity of content, Deep Blue Data will not perform automatic normalizations or other format transformations upon deposit. Therefore, the level of preservation provided to content will be dependent on the file formats as deposited in the repository.
File Formats and Preservation Support
Deep Blue Data will accept any file format. However, as stated above, the application of certain preservation actions is dependent on the formats as deposited into the repository.
The Repository provides three levels of support for various submission file formats:
Level 1: The Repository will provide its highest level of preservation support, making its best effort to maintain the content, structure and functionality in the future. This service level is currently provided only for formats that are both publicly documented and widely used. These properties give us a high degree of confidence in our preservation commitment and make it more likely that tools will exist or be developed to undertake preservation actions, and that those actions will result in an understood and controlled transformation or migration. The content may also be migrated (transformed to another stable format) to provide additional assurance that the information content is preserved. Finally, the content will be preserved as originally deposited to ensure the original bitstream is always available. CSV (.csv) is an example of a Level 1-supported format: its specifications are publicly available, and it is well supported and widely deployed.
Level 2: The Repository will make limited efforts to maintain the usability of the file as well as preserving it as submitted (bit-level preservation). This level of support is generally applied to proprietary formats that are widely used and where there is substantial commercial interest in maintaining access to the format (e.g., Microsoft Excel). These factors increase the chances that tools will be available to migrate them to successor formats. However, because of the uncertainty inherent in tool development for proprietary formats, the Repository can only guarantee bit-level services and ongoing monitoring for Level 2 formats at this time.
Level 3: The Repository provides basic preservation of the file (bitstream) and associated metadata as-is with no active effort made to monitor the format and associated risks or to normalize, transform or migrate the file to another format. Files may be openable and/or readable by future applications, but there is no guarantee that the content, structure, or functionality will be preserved. This service level usually applies to files written in highly specialized, proprietary formats, often usable only in a single software environment, formats no longer widely utilized, and/or formats about which little information is publicly available. Shapefile (.shp) is an example of a format that would receive Level 3 support in the Repository. Any format not yet reviewed and evaluated by the Library will also receive Level 3 service on deposit, but a higher level may be assigned after format review takes place.
See the complete list of the Library's Registered Formats and Support Levels.
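As a rough illustration of how the support levels apply, the sketch below maps a file name to a level using only the three examples named in this policy (CSV, Microsoft Excel, Shapefile). It is not the Library's registry: the table is deliberately incomplete, the `.xlsx` and `.xls` extensions for Excel are an assumption, and the `support_level` helper is hypothetical. Consult the Registered Formats list for actual assignments.

```python
from pathlib import Path

# Only the examples named in the policy text; not the Library's full registry.
# The Excel extensions are an assumption made for this sketch.
EXAMPLE_LEVELS = {
    ".csv": 1,   # publicly documented and widely used
    ".xlsx": 2,  # widely used proprietary format
    ".xls": 2,
    ".shp": 3,   # named in the policy as a Level 3 example
}

# Formats not yet reviewed receive Level 3 service on deposit.
DEFAULT_LEVEL = 3


def support_level(filename: str) -> int:
    """Return the illustrative preservation support level for a file name."""
    suffix = Path(filename).suffix.lower()
    return EXAMPLE_LEVELS.get(suffix, DEFAULT_LEVEL)
```

Falling back to Level 3 for unknown extensions mirrors the policy's default for formats that have not yet been reviewed.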
It is important to note that the above file format considerations apply to the preservation of all material deposited into the repository. This includes uploaded supplemental material such as code books, "read me" documentation, etc.
All content withdrawn from the repository will conform to the policy in the "Removing Work from Deep Blue Data" section. In addition to the actions listed above, Deep Blue Data will retain a "tombstone" containing a subset of metadata for all content removed from the repository. This tombstone will include the date and reason for removal, and will be retained indefinitely. The metadata will be accessible to Deep Blue Data staff to facilitate future auditing and accounting requirements. The metadata will also be visible to those who already have its persistent URL, but will no longer be searchable and will be unavailable for harvesting by services such as Google.
The permanent identifier will remain active so that people who had previously cited the data can confirm its status. Where possible and applicable, that tombstone record will point to and/or resolve to the new location for the data.
In all cases where data are withdrawn from the repository, the Library will make its best effort to return data to the data producer or the designee of the data producer.
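For those who want a concrete picture of a tombstone, the sketch below models the elements this section names: a persistent identifier, the date and reason for removal, and an optional new location. It is a hypothetical data structure for illustration, not the repository's record format; the class and field names are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Tombstone:
    """Illustrative tombstone record; field names are assumptions."""

    persistent_url: str                 # stays active so citations resolve
    title: str                          # part of the retained metadata subset
    removed_on: date                    # date of removal
    reason: str                         # reason for removal
    new_location: Optional[str] = None  # where the data moved, if known
    searchable: bool = False            # no longer searchable or harvestable

    def notice(self) -> str:
        """Compose the notice shown to someone following the persistent URL."""
        text = f"{self.title} was removed on {self.removed_on.isoformat()}: {self.reason}."
        if self.new_location:
            text += f" The data are now available at {self.new_location}."
        return text
```

Making the record immutable (`frozen=True`) reflects the intent that a tombstone is retained indefinitely as a fixed account of the removal.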
Repository Succession Plan
The Library is committed to providing the required financial and technical resources for the long-term curation of content in Deep Blue Data. Funding for these curation activities is part of the base-funded responsibilities of the University of Michigan Library, and ongoing funding for the Library is provided to that end.
Should the funding or organizational imperatives of the University of Michigan Library change, the Library will strive to provide at least one year's notice, and devote resources to support the transition to another host institution and/or returning the data to the data producers. The Library commits to the further development of formal succession plans in conformance with Trusted Digital Repository (TDR) requirements. This planning will include documenting funding commitments and efforts to forge formal data transfer agreements.
This policy will be reviewed annually by Research Data Services.