
Comparing Costs for Cloud-based Data Archives

dc.contributor.author: Hemphill, Libby
dc.contributor.author: Xing, Junjie
dc.contributor.author: Fan, Lizhou
dc.date.accessioned: 2023-05-02T13:53:23Z
dc.date.available: 2023-05-02T13:53:23Z
dc.date.issued: 2023
dc.identifier.uri: https://hdl.handle.net/2027.42/176337 [en]
dc.description.abstract: Research data management is an expensive enterprise. Computing infrastructure for storing, retrieving, and preserving data is one area of expenses, and computing infrastructure costs grow as the size and number of datasets and demands for their retrieval grow. This paper compares the costs and performance of two database infrastructures, PostgreSQL and Elasticsearch, for digital data archives. We used benchmarking experiments and data from social media to estimate the costs of loading, indexing, and querying data from these two databases. The results show that traditional relational open-source databases can be effective for large social science data and run on relatively low-cost computing infrastructure, where PostgreSQL queries can be faster and less expensive than Elasticsearch. PostgreSQL required higher up front costs and time, and adding computing resources did not improve Elasticsearch’s query performance. These findings are useful for digital archives evaluating back-end storage systems. [en_US]
dc.description.sponsorship: MIDAS PODS [en_US]
dc.language.iso: en_US [en_US]
dc.rights: CC0 1.0 Universal [*]
dc.rights.uri: http://creativecommons.org/publicdomain/zero/1.0/ [*]
dc.subject: elasticsearch [en_US]
dc.subject: data curation [en_US]
dc.subject: postgres [en_US]
dc.title: Comparing Costs for Cloud-based Data Archives [en_US]
dc.type: Preprint [en_US]
dc.subject.hlbsecondlevel: Statistics and Numeric Data
dc.subject.hlbtoplevel: Social Sciences
dc.contributor.affiliationum: Inter-university Consortium for Political and Social Research [en_US]
dc.contributor.affiliationum: School of Information [en_US]
dc.contributor.affiliationum: Computer Science and Engineering [en_US]
dc.contributor.affiliationumcampus: Ann Arbor [en_US]
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/176337/1/Hemphill - Comparing Costs.pdf
dc.identifier.doi: https://dx.doi.org/10.7302/7187
dc.description.mapping: 4ae71d2a-01c0-4084-84c3-c32ce960e81c [en_US]
dc.description.mapping: 5836d8a9-776f-4cd5-ba6e-a0cfd10d555d [en_US]
dc.identifier.orcid: 0000-0002-3793-7281 [en_US]
dc.description.filedescription: Description of Hemphill - Comparing Costs.pdf : Main article
dc.description.depositor: SELF [en_US]
dc.identifier.name-orcid: Hemphill, Libby; 0000-0002-3793-7281 [en_US]
dc.working.doi: 10.7302/7187 [en_US]
dc.owningcollname: Inter-university Consortium for Political and Social Research (ICPSR)



Except where otherwise noted, this item's license is described as CC0 1.0 Universal
