Data Services Newsletter

Volume 19 : No 3 : Winter 2017

Enhanced data selection: Research Ready Data Sets (RRDS)

Separating the raw data that are useful for a given study from everything else that is available is often a tedious step in a research project, particularly for first-order data quality problems such as broken sensors, incorrect response information, and non-continuous time series. With the ever-increasing amounts of data available to researchers, this chore becomes more and more time-consuming. To assist users in this pre-processing of data, the IRIS Data Management Center (DMC) has created a system called Research Ready Data Sets (RRDS). The RRDS system allows researchers to apply filters that constrain their data request using criteria related to signal quality, response correctness, and high-resolution data availability.

In addition to the traditional selection methods of stations at a geographic location for given time spans, RRDS will provide enhanced criteria for data selection based on many of the measurements available in the DMC’s MUSTANG quality control system. This means that data may be selected based on background noise (tolerance relative to high and low noise Earth models), signal-to-noise ratio for earthquake arrivals, signal RMS, correlation of the instrument-response-corrected signal with Earth tides, time tear (gap/overlap) counts, timing quality (when reported in the raw data by the datalogger), and more.

The new RRDS system will be available as a web service designed to operate as a request filter. A request is submitted containing the standard location and time selections, as well as data quality constraints. The service filters the request and returns either the resulting data selection, in a format ready to be submitted, or, optionally, a more detailed report including:

  1. the request that would subsequently be submitted to a data access service, e.g. the DMC’s fdsnws-dataselect service,
  2. a record of the quality criteria specified, and
  3. a record of the data rejected based on those criteria, including the relevant values.
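
The request-filter idea and the three-part report can be illustrated with a small local sketch. This is not the RRDS service or its interface, which this article does not specify: the `Selection` class, the `filter_request` function, the metric names, the threshold format, and all metric values below are hypothetical and chosen only for illustration. The sketch also assumes that a selection with no measurement for a requested metric is rejected, and it writes accepted selections as one "network station location channel start end" line each, as accepted by fdsnws-dataselect POST requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Selection:
    """One channel and time span in a data request (hypothetical helper)."""
    network: str
    station: str
    location: str
    channel: str
    start: str
    end: str

    def to_request_line(self) -> str:
        # fdsnws-dataselect writes an empty location code as "--".
        loc = self.location or "--"
        return (f"{self.network} {self.station} {loc} {self.channel} "
                f"{self.start} {self.end}")


def filter_request(selections, metrics, criteria):
    """Split selections into accepted and rejected using quality criteria.

    metrics:  {Selection: {metric_name: value}}
    criteria: {metric_name: (minimum, maximum)}, None meaning unbounded
    """
    accepted, rejected = [], []
    for sel in selections:
        failures = {}
        for name, (low, high) in criteria.items():
            value = metrics.get(sel, {}).get(name)
            # Assumption: a missing measurement counts as a failure.
            too_low = low is not None and (value is None or value < low)
            too_high = high is not None and (value is None or value > high)
            if too_low or too_high:
                failures[name] = value
        if failures:
            rejected.append((sel.to_request_line(), failures))
        else:
            accepted.append(sel.to_request_line())
    return {
        "request": accepted,    # 1. lines ready for a data access service
        "criteria": criteria,   # 2. the quality criteria that were applied
        "rejected": rejected,   # 3. rejected data with the relevant values
    }


# Made-up example values, for illustration only.
selections = [
    Selection("IU", "ANMO", "00", "BHZ",
              "2017-01-01T00:00:00", "2017-01-02T00:00:00"),
    Selection("IU", "COLA", "00", "BHZ",
              "2017-01-01T00:00:00", "2017-01-02T00:00:00"),
]
metrics = {
    selections[0]: {"num_gaps": 0, "sample_rms": 120.5},
    selections[1]: {"num_gaps": 7, "sample_rms": 98.0},
}
criteria = {"num_gaps": (None, 0), "sample_rms": (10.0, None)}

report = filter_request(selections, metrics, criteria)
print("\n".join(report["request"]))
for line, failures in report["rejected"]:
    print("rejected:", line, failures)
```

In this sketch the first selection passes both criteria and appears in the request, while the second is rejected because of its gap count, and the report records the value that caused the rejection.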

This service can be used either to filter a request before requesting the actual data or to explore which data match a set of enhanced criteria without downloading the data. We are optimistic that this capability will reduce the initial data culling steps most researchers go through. Additionally, use of this service should reduce the amount of data transmitted from the DMC, easing the load on our finite shared resources.

We plan to release this service before the end of 2017 and to add support for this service in our usual data access tools starting in 2018.

by Chad Trabant and Mick Van Fossen (IRIS Data Management Center)