Work Description

Title: Data and Methods for Magnetometer Station Correlation Calculations

Methodology
  • Source data were all raw magnetic field data for September 2017 and March 2023 from 13 stations operated by THEMIS, CARISMA, the University of Alaska Fairbanks, INTERMAGNET, and AUTUMNX (run by Athabasca University). Raw magnetic field data were processed by subtracting the median in day-long intervals. Correlations were calculated between magnetic field data from individual stations in 7.5-minute intervals, mainly over 1- or 2-day periods. The calculated correlations are all saved in Correlation_Distributions.csv.
Description
  • This study analyzes correlations between magnetic field data from closely spaced pairs of ground magnetometers to observe the spatial scale of ionospheric current signatures. Correlations were mainly calculated in 7.5-minute intervals for periods of multiple days. Distributions were built from these 7.5-minute intervals to identify the fraction of time during which the magnetometers were observing "similar" or "different" ionospheric signatures. The raw magnetometer data were taken from two geomagnetic storms: one on 7-8 September 2017 and the other on 23-24 March 2023. These periods were selected because they include both high and low geomagnetic activity. The final distributions calculated from this analysis are available in Correlation_Distributions.csv.
Creator
  • Gottesman, Ari
Creator ORCID
Depositor
  • arigott@umich.edu
Contact information
Discipline
Funding agency
  • National Science Foundation (NSF)
Keyword
Date coverage
  • 2017-09-07 to 2023-03-24
Resource type
Last modified
  • 05/23/2024
Published
  • 05/23/2024
Language
DOI
  • https://doi.org/10.7302/m5vm-jn47
License
To Cite this Work:
Gottesman, A. (2024). Data and Methods for Magnetometer Station Correlation Calculations [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/m5vm-jn47


Files (Count: 4; Size: 27.9 MB)

Date: May 10, 2022

Dataset Title:
Data and Methods for Magnetometer Station Correlation Calculations

Dataset Creators:
Gottesman, Ari

Dataset Contact:
arigott@umich.edu

Funding:
National Science Foundation (NSF)

Abstract:
Small-scale and intense ionospheric currents that can contribute to geomagnetically induced currents have recently
been discovered. A difficulty in their characterization is that their signatures are often only observed
at single widely spaced (typically 300 to 500 km) ground geomagnetic stations. These small-scale structures motivate the
examination of the maximum station separation required to fully characterize these small-scale signatures. We analyze
distributions of correlation coefficients between closely spaced mid-latitude and auroral zone ground magnetometer
stations spanning day to month long intervals to assess the separation distance at which geomagnetic signatures appear
in only one station. Distributions were analyzed using periods that included low and high geomagnetic activity. We used
data from pairs of magnetometer stations across North America within 200 km of each other, all of which were separated
primarily latitudinally. Results show that while measurements remain largely similar up to separations of 200 km, large
and frequent differences appear starting at around 130 km separation. Larger differences and lower correlations are
observed during high geomagnetic activity, while low geomagnetic activity leads to frequent high correlation even past
200 km separation. Small but identifiable differences can appear in magnetometer data from stations as close as 35 km
during high geomagnetic activity. Correlations are consistently higher in the north-south component when compared to the
east-west component, giving insight into current signatures driving the differences in observations during storms. With
one-second magnetometer stations now being standard practice, we recommend future magnetometer array deployment in the
auroral and sub-auroral zone to have separations of 100-150 km. This would enable the monitoring of large scale effects
of geomagnetic storms, better temporal and spatial resolution of substorms, and observations of small scale current
signatures.

Summary of data and processing:
Raw data was taken from 13 magnetometer stations and processed into a single data structure built for correlation calculations. Data from magnetometer stations from SuperMAG, THEMIS, CARISMA, the University of Alaska Fairbanks, INTERMAGNET, and AUTUMNX can all be read in using the Station data type in station_oss.py. Folders containing data from multiple magnetometer stations can also be read in using the StationSet data type from station_set_oss.py, which creates an individual Station object for each magnetometer station's data.

After data is put into Station objects, the correlation_histogram() function can be used to calculate and save data in the same format as Correlation_Distributions.csv. Instructions for running correlation_histogram() are in the function's docstring.

Data for this specific study was taken from two geomagnetic storms, one taking place on 7-8 September, 2017, and one taking place on 23-24 March, 2023.
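
The record's methodology notes that raw magnetic field data was processed by subtracting the median in day-long
intervals. The sketch below illustrates that step on plain Python lists. It is not the code in station_oss.py: the
function name subtract_daily_median and the assumption that times are given in seconds counted from a UTC midnight
are made up for this example.

```python
from statistics import median

SECONDS_PER_DAY = 86_400


def subtract_daily_median(times_s, values):
    """Subtract each day's median from the samples that fall in that day.

    times_s: sample times in seconds counted from a UTC midnight (assumption)
    values:  magnetic field samples, same length as times_s
    """
    # Group sample indices by day number.
    by_day = {}
    for i, t in enumerate(times_s):
        by_day.setdefault(int(t // SECONDS_PER_DAY), []).append(i)

    # Remove the per-day median so each day is centered on zero.
    out = list(values)
    for indices in by_day.values():
        day_median = median(values[i] for i in indices)
        for i in indices:
            out[i] = values[i] - day_median
    return out
```

Using the median rather than the mean keeps the baseline from being pulled by short, large storm-time excursions.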

ALL DATA MUST BE DOWNLOADED SEPARATELY:
THEMIS: http://themis.ssl.berkeley.edu/data/themis/thg/l2/mag/
CARISMA: https://carisma.ca/carisma-data-repository
UAF: https://www.gi.alaska.edu/monitors/magnetometer/archive
INTERMAGNET: https://imag-data.bgs.ac.uk/GIN_V1/GINForms2
AUTUMN: https://autumn.athabascau.ca/TBS_index.php

Methodology and Files Contained:

---------------------------------------------------------------------------------------------------
station_oss.py:
This file contains a data structure used to format magnetic field data from 6 databases used in this project. These
sources are: SuperMAG, THEMIS, UAF, INTERMAGNET, AUTUMN, and CARISMA (these are case-sensitive).
Initializing a Station object with the filename of the data, along with the source entered as specified in the docstring
should handle all the necessary steps to format the data correctly into a Station object. Certain data sources require
additional information: THEMIS requires station name, and both THEMIS and INTERMAGNET require latitude and longitude for
full functionality. Station objects can also be easily concatenated using + or += operators, though this does not
compare times, so ensure that the left side of the operator is the earlier dataset. Station objects also have the
ability to be compared with > and <, which compares the names of the stations based on order in the English alphabet.
str(Station) will return the IAGA code for the magnetometer station.
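
The operator behavior described above can be pictured with a small stand-in class. MiniStation is hypothetical and
far simpler than the real Station type; it only shows how +, +=, <, > and str() could behave as described, and the
codes "AAA" and "BBB" in the usage lines are placeholders, not real stations.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MiniStation:
    """Hypothetical stand-in for Station: an IAGA code plus samples."""
    iaga: str
    times: List[float] = field(default_factory=list)
    values: List[float] = field(default_factory=list)

    def __add__(self, other):
        # Appends other's samples after self's without comparing times,
        # so the earlier dataset has to be the left operand.
        return MiniStation(self.iaga,
                           self.times + other.times,
                           self.values + other.values)

    def __lt__(self, other):
        # Alphabetical comparison of station names.
        return self.iaga < other.iaga

    def __gt__(self, other):
        return self.iaga > other.iaga

    def __str__(self):
        return self.iaga


day1 = MiniStation("AAA", [0.0, 1.0], [5.0, 6.0])
day2 = MiniStation("AAA", [2.0], [7.0])
day1 += day2  # += falls back to __add__; day1 now holds all three samples
```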

Once data is stored in Station objects, data from every magnetometer array should be fully cross-compatible, though
for good results, data should be taken from the same time period. The functions below use
the data stored in station.db_horizontal, station.db_polar, or station.deltab, depending on the coordinate system
specified.

The following functions are methods of the Station data structure, meaning they should be called as Station.Function().
Currently available functionality includes:
Finding the distance between two stations, called as Station1.find_distance(Station2)
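
This README does not say how find_distance() computes the separation. One common choice for station separations of
tens to hundreds of km is the great-circle (haversine) distance between the two geodetic positions, sketched below.
The function name great_circle_km and the mean Earth radius of 6371 km are assumptions for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

With this radius, one degree of latitude corresponds to roughly 111 km, so the 35 to 200 km separations discussed in
the abstract span about 0.3 to 1.8 degrees of latitude.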

Finding the difference in magnetic or geodetic latitude or longitude, called as Station1.xxxx_difference(Station2),
where xxxx = mlat, mlon, glat, or glon. Note that not all data files come with both magnetic and geodetic coordinates,
so certain comparisons may not work correctly.

Calculating the correlation between two stations over an interval in Cartesian, Polar or DeltaB Coordinates. It is also
possible to calculate the correlation only for data above a certain value threshold of either the magnetometer data
itself or Sym-H, a product of NASA OMNI that measures geomagnetic activity. This function is called as
Station1.correlation_over_interval(Station2, start_of_interval, end_of_interval). More information about other possible
parameters and customizations can be found in the function's docstring in station_oss.py.
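
As a rough illustration of the idea, the sketch below computes a Pearson correlation between two series over a time
interval, optionally keeping only samples where a gating series (for example the station data itself or Sym-H)
reaches a threshold in absolute value. The signature, the gate and threshold parameter names, and the use of the
absolute value are assumptions for this example and do not reflect the actual parameters of
correlation_over_interval().

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; NaN if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def correlation_over_interval(times, a, b, start, end,
                              gate=None, threshold=None):
    """Correlate a and b for samples with start <= time < end.

    If gate and threshold are given, only samples where
    abs(gate[i]) >= threshold are kept (assumed gating rule).
    """
    xs, ys = [], []
    for i, t in enumerate(times):
        if not (start <= t < end):
            continue
        if gate is not None and threshold is not None:
            if abs(gate[i]) < threshold:
                continue
        xs.append(a[i])
        ys.append(b[i])
    if len(xs) < 2:
        return float("nan")  # not enough samples to correlate
    return pearson(xs, ys)
```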

Calculating the time lag between stations that maximizes the correlation. This can be used to find the propagation of an
event between stations. This function works by checking every time lag value from a range of offsets given by input
parameters. This means that using a larger window of time lags can drastically increase the run time required to
calculate the maximum correlation. Negative offsets will offset the first station instead of the second, and the
smallest (or most negative) number should be entered as the offset_start. This function is called as
Station1.max_offset(Station2, start_of_interval, length_of_interval, offset_start=minimum_time_lag_to_be_checked,
offset_end=maximum_time_lag_to_be_checked, offset_interval=interval_between_offsets_checked). Other parameters and
customizations can be found in the docstring of the function.
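
The brute-force search described above can be sketched as follows, assuming both series have the same length and one
sample per second so that a lag in samples equals a lag in seconds. The name best_lag and the exact sign convention
(a positive lag shifts the second series, a negative lag shifts the first) are assumptions for this example. The run
time grows linearly with the number of lags checked, which is why a wide offset window is slow.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; NaN if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def best_lag(a, b, offset_start, offset_end, offset_interval=1):
    """Return (lag, correlation) for the lag that maximizes correlation.

    For lag >= 0, a[i] is paired with b[i + lag];
    for lag < 0, a[i - lag] is paired with b[i].
    """
    n = len(a)  # assumes len(a) == len(b)
    best_offset, best_r = None, -math.inf
    for lag in range(offset_start, offset_end + 1, offset_interval):
        if n - abs(lag) < 2:
            continue  # overlap too short to correlate
        if lag >= 0:
            x, y = a[:n - lag], b[lag:]
        else:
            x, y = a[-lag:], b[:n + lag]
        r = pearson(x, y)
        if r > best_r:  # NaN never compares greater, so it is skipped
            best_offset, best_r = lag, r
    return best_offset, best_r
```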

Plotting the data of two stations on top of each other, along with the calculated correlation between the two stations
over that interval. This function can also take Sym-H data, and plot it along with the magnetometer data. This function
is called as Station1.plot_against_interval(Station2, start_of_interval, end_of_interval, time_lag), where time_lag is
an optional shift of the second station's data set by {time_lag} number of points/seconds. If no time_lag is required,
input 0.

Plotting a series of the plots described above, and saving them all to the same folder. These plots will all be
within the start and stop times, and will be {shift} seconds apart from each other (if the first plot starts at 0:00 UTC
and the shift is 600 seconds, the second plot will start at 0:10 UTC). This function can also optionally be given the
same time lag information as max_offset() described above, otherwise the default will be 0 (not using any offset).
This function is called as Station1.max_correlation_series(Station2, start_of_total_time_period,
end_of_total_time_period, shift_in_seconds, interval_of_individual_plots_in_seconds). Other parameters and
customizations can be found in the docstring of the function.

Creating a distribution of correlations and plotting a weighted histogram of the created distributions. This works
similarly to the function described directly above (max_correlation_series), but stores the correlation values instead
of plotting them with data. This function is called nearly identically to max_correlation_series, called as
Station1.correlation_histogram(Station2, start_of_total_time_period, end_of_total_time_period, shift_in_seconds,
interval_of_individual_plots_in_seconds). Other parameters and customizations can be found in the docstring of the
function. This function saves 4 files: a png of the histogram, a pickle file of the histogram data, and a csv for each
dimension of the distribution data.
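
A minimal sketch of the windowing and histogram steps is shown below. It is not the correlation_histogram() code:
the function names are made up, shift and interval are counted in samples (with one-second data, the 7.5-minute
intervals used in the study would be interval=450), and "weighted" is assumed to mean that each window contributes
1/N so the bars give the fraction of windows in each bin.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; NaN if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def correlation_distribution(a, b, shift, interval):
    """Correlate a and b in windows of `interval` samples, one every `shift`."""
    return [pearson(a[s:s + interval], b[s:s + interval])
            for s in range(0, len(a) - interval + 1, shift)]


def weighted_histogram(correlations, bin_edges):
    """Fraction of (non-NaN) correlations falling in each bin."""
    valid = [max(-1.0, min(1.0, r))  # clamp tiny rounding overshoots
             for r in correlations if not math.isnan(r)]
    counts = [0] * (len(bin_edges) - 1)
    for r in valid:
        for k in range(len(counts)):
            is_last = k == len(counts) - 1
            in_bin = bin_edges[k] <= r < bin_edges[k + 1]
            # The last bin also includes its right edge.
            if in_bin or (is_last and r == bin_edges[-1]):
                counts[k] += 1
                break
    return [c / len(valid) for c in counts] if valid else counts
```

Setting shift smaller than interval gives overlapping windows; setting them equal tiles the period without overlap.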

station_set_oss.py:
This file contains a data structure used to hold multiple Station objects and perform certain calculations using data
stored in Station objects. Currently, this only supports data from SuperMAG, THEMIS, and CARISMA. Data can be read in
by entering a filename for SuperMAG, or a foldername of a folder that contains multiple files for THEMIS and CARISMA,
along with the source the data is from: "SuperMAG", "THEMIS", or "CARISMA" (this is case-sensitive).

The main functionality currently available is creating a scatter plot of correlations calculated between all stations
within a StationSet object, and plotting them as a function of the separation in latitude between the stations. This
function behaves similarly to Station.correlation_histogram(), though it averages every correlation value, instead of
keeping them as a distribution. This does mean that error bars could optionally be added to the scatter plot if a large
distribution is used. This function is called as StationSet.lat_scatter(start_of_total_time_period,
end_of_total_time_period, shift_in_seconds, interval_of_individual_periods_in_seconds). Other parameters and
customizations can be found in the docstring of the function.
Scatters based on longitude and distance are currently deprecated, though they could easily be fixed by using the
lat_scatter code and replacing latitude with the other measures of separation.
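
The averaging step can be sketched as below: each station pair contributes one point at its latitude separation,
with the mean of its correlations as the value and the sample standard deviation as an optional error bar. The input
layout (a dict of latitudes and a dict of per-pair correlation lists) and the name lat_scatter_points are assumptions
for this example and are not taken from station_set_oss.py.

```python
from statistics import mean, stdev


def lat_scatter_points(station_lats, pair_correlations):
    """Build scatter points of mean correlation against latitude separation.

    station_lats:      {station_name: latitude_in_degrees}
    pair_correlations: {(name1, name2): [correlation, ...]}
    Returns a list of (lat_separation, mean_correlation, std_correlation),
    sorted by latitude separation.
    """
    points = []
    for (name1, name2), rs in pair_correlations.items():
        separation = abs(station_lats[name1] - station_lats[name2])
        # The spread is only defined when there are at least two windows.
        spread = stdev(rs) if len(rs) > 1 else 0.0
        points.append((separation, mean(rs), spread))
    return sorted(points)
```

Replacing the latitude difference with a longitude difference or a distance would give the other two kinds of scatter.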

StationSet objects can also be used to print the latitudes or longitudes of every station in the StationSet by calling
StationSet.print('lat') or StationSet.print('lon').

Correlation_Distributions.csv:
This file contains the distribution data of every histogram used in the associated paper, all of which were taken
from the Station.correlation_histogram() outputs.
