Work Description

Title: High-frequency precipitation variance in high-resolution global coupled ocean-atmosphere models: Precipitation data and analysis code
  • The data in this work are precipitation fields output from the models we analyzed, contained in files representing the whole globe at one time stamp. We extracted this data into time series format (all time stamps for one location stored together) before processing it through the methods described in our paper (primarily involving the creation of power spectra).
  • The precipitation data itself is the output of the models/datasets that we analyze in our paper. Most of it is in .nc or .nc4 format, although we provide code to extract the data into time series .mat files. We used MATLAB to perform our analysis.
Funding agency
  • National Science Foundation (NSF)
Last modified
  • 10/16/2020
To Cite this Work:
Light, C., Arbic, B., Martin, P., Brodeau, L., Farrar, J., Griffies, S., Kirtman, B., Laurindo, L., Menemenlis, D., Molod, A., Nelson, A., Nyadjiro, E., O'Rourke, A., Shriver, J., Siqueira, L., Small, R., Strobach, U. (2020). High-frequency precipitation variance in high-resolution global coupled ocean-atmosphere models: Precipitation data and analysis code [Data set]. University of Michigan - Deep Blue.


Files (Count: 36; Size: 471 GB)

Research Overview

We analyzed the precipitation output of several high-resolution global coupled ocean-atmosphere
climate models to determine whether increasing the resolution of either component increases
precipitation variance, and compared the results of this analysis with more traditional methods
of studying precipitation in model output.
This data repository contains the precipitation output files generated by the original models,
along with the code run to generate all of the results included in the paper.

Models/datasets used (co-authors associated with each in parentheses):
-TRMM 3B42 (from NASA) and CMORPH v1.0 (from NOAA) as observational datasets (mainly satellite-derived).
Two separate datasets are used to account for possible weaknesses in either one.
-Rain gauge datasets (land clusters from NOAA, SPURS ocean sites (Farrar)) as observational data
derived from physical gauges instead of satellites.
-ERA5 (from ECMWF) and Navy ESPC (Shriver, Nyadjiro) as models that include data assimilation (reanalyses).
-EC-Earth models (Brodeau), which vary atmospheric resolution over a constant high-resolution
ocean component. The atmospheric core is IFS (on which ERA5 is also based).
-Community Earth/Climate System Model (CESM, CCSM) runs (Kirtman, Laurindo, Small, Siqueira), which vary
both atmospheric and oceanic resolution.
CESM runs receive greater focus because of their higher sampling frequency.
-GFDL models (Griffies) (CM2-1deg, CM2.5, CM2.6), which vary oceanic resolution over a
relatively coarse constant atmosphere component.
-GEOS/ECCO models (Menemenlis, Strobach, Molod), which have the highest resolution of all the models we analyzed.


Our main analysis involved computing power spectra from this precipitation output and comparing
the resulting precipitation variance across models, with a focus on how changes in model resolution
(both oceanic and atmospheric) affect that variance, and on how model output compares with
observational datasets. The analysis focused on geographical regions of interest, particularly
areas of warm ocean boundary currents, where increased model resolution resolves eddies when simulating
the precipitation process.
Conclusions of the results emphasize resolution changes within each model group, and how the change
in variance shifts towards
Other calculations involved a direct comparison of precipitation output time series among models,
generating and comparing cumulative distribution functions (CDFs) for 24-hour accumulation among models,
and creating maps of precipitation variance differences and CDF cross-sections for a sample of all
points around the globe.
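
As an illustration of how a power spectrum relates to the variance of a precipitation time series, the following is a minimal sketch in Python. The repository's own code is written in MATLAB; this is an independent illustration, not the authors' implementation. The function name periodogram, the parameter dt_hours, and the synthetic series are all assumptions made for the example. It uses a naive discrete Fourier transform, so it is only practical for short series.

```python
import cmath
import math


def periodogram(series, dt_hours):
    """One-sided periodogram of a real-valued time series (naive DFT).

    Returns (frequencies in cycles per hour, power per frequency bin).
    The mean is removed first, so the bin powers sum to the variance of
    the series normalized by n (Parseval's theorem).
    """
    n = len(series)
    mean = sum(series) / n
    anomalies = [x - mean for x in series]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coeff = sum(a * cmath.exp(-2j * math.pi * k * t / n)
                    for t, a in enumerate(anomalies))
        # For real input, positive and negative frequencies carry equal
        # power, except at the Nyquist frequency (present only if n is even).
        weight = 1.0 if (n % 2 == 0 and k == n // 2) else 2.0
        freqs.append(k / (n * dt_hours))
        power.append(weight * abs(coeff) ** 2 / n ** 2)
    return freqs, power


# Synthetic hourly series with a pure 24-hour cycle (values are made up).
rain = [1.0 + math.cos(2 * math.pi * t / 24) for t in range(48)]
freqs, power = periodogram(rain, dt_hours=1.0)
peak_frequency = freqs[power.index(max(power))]  # cycles per hour
variance = sum(power)
```

Summing the spectrum over a band of frequencies gives the variance contributed by that band, which is what makes spectra from different models directly comparable.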

File Inventory

The files in this repository can be divided into two sections, data and code.
Data contains the precipitation output itself (stored as global snapshots in time).
Code contains functions that extract the output into time series format and
calculate precipitation variance, along with the other results derived from those time series
presented in our paper.
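
The regrouping from global snapshots to per-point time series described above can be sketched as follows. This is a hypothetical Python illustration using nested lists; the actual extraction code in the repository is written in MATLAB and reads the stored output files directly. The function name snapshots_to_time_series and the toy grid are assumptions made for the example.

```python
def snapshots_to_time_series(snapshots):
    """Regroup a list of 2-D global snapshots into per-point time series.

    snapshots[t][i][j] is the precipitation at time step t, row i, column j.
    Returns a dict mapping (i, j) to the list of values over all time steps.
    """
    n_rows = len(snapshots[0])
    n_cols = len(snapshots[0][0])
    series = {(i, j): [] for i in range(n_rows) for j in range(n_cols)}
    for grid in snapshots:
        for i in range(n_rows):
            for j in range(n_cols):
                series[(i, j)].append(grid[i][j])
    return series


# Three time steps of a toy 2 x 2 grid (values are made up).
snapshots = [
    [[0.0, 1.0], [2.0, 3.0]],
    [[0.5, 1.5], [2.5, 3.5]],
    [[0.0, 0.0], [4.0, 1.0]],
]
series = snapshots_to_time_series(snapshots)
```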

Matlab Code
-A template for function calls to generatePowerSpectra, including some
variables used to pre-define region boundaries.
-The main function of our code; used to read in precipitation time series
and output power spectra from it.
coordinateConverter.m, rcc.m
-Convert between matrix cells for each dataset and latitude/longitude values,
in both directions. (helper functions to generatePowerSpectra)
The other functions are dependent on this.
-Generate cumulative distribution functions (CDFs) averaged over a region of interest.
-Calculate cross sections of CDFs for specified thresholds over a sample of all points
on the globe.
-All of these files extract time series from the precipitation output data stored
here, and store those time series as separate Matlab files for every point.
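
To make the CDF calculation concrete, here is a minimal Python sketch of an empirical cumulative distribution function of 24-hour accumulations at a single point. It is not the repository's MATLAB code. The function names, the use of non-overlapping 24-hour windows, and the sample values are assumptions made for the example, and averaging over a region is left out.

```python
def daily_accumulations(series, steps_per_day):
    """Sum consecutive non-overlapping blocks of steps_per_day samples."""
    n_days = len(series) // steps_per_day
    return [sum(series[d * steps_per_day:(d + 1) * steps_per_day])
            for d in range(n_days)]


def empirical_cdf(values, thresholds):
    """Fraction of values less than or equal to each threshold."""
    n = len(values)
    return [sum(1 for v in values if v <= thr) / n for thr in thresholds]


# Three days of made-up 3-hourly precipitation amounts (8 steps per day).
three_hourly = [0.0] * 8 + [1.0] * 8 + [0.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0, 0.0]
accumulations = daily_accumulations(three_hourly, steps_per_day=8)
cdf = empirical_cdf(accumulations, thresholds=[0.0, 5.0, 10.0])
```

Evaluating the CDF at one fixed threshold for every point in a sample gives a single number per point, which is the kind of cross-section that can be drawn as a map.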

Precipitation output for all models and datasets we analyzed.
The code I used to extract these into time series is included in the repository;
it is programmed to internally deal with the file structure inside these directories.
Subdirectories generally group individual days, months, and years of output together.

-TRMM 3B42, years 1998-2014, in folder TRMM3hr.
  -one file for each time stamp
-CMORPH v1.0, years 1998-2014, in folder CMORPH.
  -one file per full day of output, stored as raw text
-Rain gauge data (in folder rainGauges):
  -14 files containing 4 years of hourly precipitation recordings from NOAA's Local
   Climatological Data. The seven near the Atlantic Ocean and the seven in Hawaii are
   intended to be analyzed as two clusters.
  -Hourly and minute precipitation from the SPURS 1 and 2 projects (just over one year
   of recordings each).
-ERA5 reanalysis, years 1998-2002, in folder ERA5.
  -one file per month of output
-Navy ESPC model, Sept. 2012 - Aug. 2013, in folder NAVY_ESPC.
  -one file for all output
-EC-Earth higher-resolution model, 1990, in folder EC-Earth_T1279.
  -one file for each month
-EC-Earth lower-resolution model, 1990, in folder EC-Earth_T255.
  -one file for each month
-GFDL CM2.6, model years 111-130, in folder GFDLCM2.6.
  -8 files total; 4 blocks of time * 2 types of precipitation
-GFDL CM2.5, model years 111-130, in folder GFDLCM2.5.
  -8 files total; 4 blocks of time * 2 types of precipitation
-GFDL CM2-1deg, model years 111-130, in folder GFDLCM2-1deg.
  -8 files total; 4 blocks of time * 2 types of precipitation
-CESM higher-resolution model, model years 46-65, in folder CESM-NCAR_HR.
  -1 file per year
-CESM mixed-resolution model, first 20 full model years, in folder CESM-NCAR_MR.
  -stored in 41 files, 180 days each
-CESM low-resolution model, first 20 full model years, in folder CESM-NCAR_LR.
  -stored in 41 files, 180 days each
-CCSM high-resolution model, 30 model years, in folder RSMAS_HRC10.
  -stored in 360 files, 1 for each month
-CCSM low-resolution model, model years Dec. 256 to Dec. 264, in folder RSMAS_LRC08.
  -one file for every day that was sampled (every other day total)
-GEOS/ECCO high-resolution model, Apr. 13 - Jul. 5, 2012, in folder GEOS_HR.
  -one file for each hour of output
-GEOS/ECCO low-resolution model, Feb. 8, 2012 - Feb. 6, 2013, in folder GEOS_LR.
  -one file for each hour of output

Definition of Terms and Variables

Source names I used in code ("source" arguments in functions) to refer to each model:
(need to be consistent across all 3 functions referencing these)
TRMM 3B42 - "TRMM"
ERA5 - "ERA5"
Navy ESPC - "Navy"
EC-Earth T1279 - "ECEarthHighRes"
EC-Earth T255 - "ECEarthLowRes"
CESM high-res - "NCARHR"
CESM mixed-res - "NCARMR"
CESM low-res - "NCARLR"
GFDL CM2.6 - "GFDLCM2.6"
GFDL CM2.5 - "GFDLCM2.5"
GFDL CM2-1deg - "GFDLCM1deg"
GEOS/ECCO c1440-llc2160 - "GEOSHR"
GEOS/ECCO c720-llc1440 - "GEOSLR"
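
Because the same source string has to be passed to every function, one way to guard against typos is to keep the names in a single lookup table and check them before use. The sketch below is a hypothetical Python illustration; the dictionary and the check_source helper are assumptions and do not appear in the repository's MATLAB code. Only the source names themselves are taken from the list above.

```python
# Source names copied from the list above; the dictionary itself and the
# helper below are hypothetical and do not appear in the repository code.
SOURCE_NAMES = {
    "TRMM 3B42": "TRMM",
    "ERA5": "ERA5",
    "Navy ESPC": "Navy",
    "EC-Earth T1279": "ECEarthHighRes",
    "EC-Earth T255": "ECEarthLowRes",
    "CESM high-res": "NCARHR",
    "CESM mixed-res": "NCARMR",
    "CESM low-res": "NCARLR",
    "GFDL CM2.6": "GFDLCM2.6",
    "GFDL CM2.5": "GFDLCM2.5",
    "GFDL CM2-1deg": "GFDLCM1deg",
    "GEOS/ECCO c1440-llc2160": "GEOSHR",
    "GEOS/ECCO c720-llc1440": "GEOSLR",
}


def check_source(source):
    """Return source unchanged if it is a known name, else raise ValueError."""
    if source not in SOURCE_NAMES.values():
        raise ValueError(f"unknown source name: {source!r}")
    return source
```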

Use and Access

The data total 1.2 TB; use Globus to transfer larger parts of it.
The GEOS/ECCO models' output takes up more than half of this total.
The code is in Matlab and contains relative references to file storage locations; change any path names
to wherever the files are actually stored (relative to the Matlab working directory).
