Data Services Newsletter

Volume 16 : No 2 : Summer 2014

Comprehensive Data Analytics as a Service -- The MUSTANG Project

It began as a recognized need for an enhanced historical perspective on the state of health of seismic recording stations: the notion of having measurements to check data and metadata for signs that a deployed station was in need of attention, or that the data being provided by the DMC was noisy or flawed. Seeing such issues requires a comprehensive view of recorded data over long periods of time, and it is in the interest of every seismic network and principal investigator to know when the data they are providing to scientists lacks sufficient scientific integrity.

Fig 1. – MUSTANG PDF service showing a noise aggregate for two and a half years of data for IU.TRQA.00.BHZ

What has been conceived and developed is a project called MUSTANG (Modular Utility for STAtistical kNowledge Gathering system), which takes its inspiration from the oft-used QUACK data quality browser. Its overall goals are much the same as QUACK's in terms of providing scientists and network operators a clear view of the state of their data and their instrumentation. Where MUSTANG differs is in its attempt to address some very specific needs:

  • Comprehensive data coverage – can gather measurements from the data archive at any point in time.
  • Current and correct measurements – can reprocess metrics at will, most importantly when a change in data or metadata is detected.
  • Wide variety of metrics – MUSTANG now records more than 40 separate measurements, and this can be easily extended to accommodate new ones.
  • Lightweight, flexible data access – uses web services protocols to supply metrics data in various formats and accepts a number of filters to find specific measurements.
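
To make the last point concrete, here is a minimal sketch of how a client might assemble a filtered metrics request as a web service URL. It is an illustration only: the base URL, the parameter names (metric, network, station, location, channel, start, end, format), the metric name sample_mean, and the dates are all assumptions made for this example, not the documented MUSTANG interface. The MUSTANG Beta Homepage is the place to find the actual endpoints and options.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical base URL used only for illustration; this is NOT the real
# MUSTANG endpoint.
BASE_URL = "https://example.org/mustang/measurements/query"


def build_metrics_url(metric, network, station, location, channel,
                      start, end, fmt="text"):
    """Build a query URL for one metric on one channel.

    All parameter names are assumptions for this sketch; a real service
    may name its filters differently.
    """
    params = {
        "metric": metric,      # which measurement to fetch
        "network": network,    # network code, e.g. "IU"
        "station": station,    # station code, e.g. "TRQA"
        "location": location,  # location code, e.g. "00"
        "channel": channel,    # channel code, e.g. "BHZ"
        "start": start,        # start of the time window (illustrative date)
        "end": end,            # end of the time window (illustrative date)
        "format": fmt,         # requested output format
    }
    return BASE_URL + "?" + urlencode(params)


# Example: request a (hypothetically named) mean-amplitude metric for the
# channel shown in Fig 1, asking for XML output.
url = build_metrics_url("sample_mean", "IU", "TRQA", "00", "BHZ",
                        "2012-01-01", "2014-06-30", fmt="xml")
print(url)
```

Because the request is just a URL with filters, the same query could be issued from a browser, a shell script, or an analysis program, which is what keeps this style of data access lightweight.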

Fig 2. – Min/max/mean plots for IU.TAO

MUSTANG is currently building its metrics catalog for broadband sensors. A first pass of the Global Seismic Network (GSN) stations has already been completed (1988 to present), and we are expanding our reach to the major IRIS programs with OBSIP, Transportable Array (TA), and open PASSCAL experiments. We still have a lot of ground to cover, but we are continually improving our ability to process large amounts of data and hope to see the bulk of the data completed by the end of the year.

We are getting the word out about MUSTANG at workshops and have an open beta web page where users can explore MUSTANG further. We would like to hear from you about how we can improve your experience with this analytical tool.

More information is available at the MUSTANG Beta Homepage.

There are also a couple of prototype web tools for visualizing MUSTANG metrics, such as those shown in the figures above. Please try these out and send feedback to comments@iris.washington.edu.

1. The MUSTANG Databrowser
2. The LASSO station performance monitoring tool

Credits: The following designers and contributors have made MUSTANG possible:
Instrumental Software Technologies, Inc.
STW Software
Mazama Science
Members of the Quality Assurance Working Group
Dr. Tim Ahern, Mary Templeton, Bruce Weertman, Gillian Sharer, Sarah Ashmore, Rob Newman, and other great staff at IRIS DMC.

by Rob Casey (IRIS Data Management Center)