IRIS Services, Products, Quality Assurance Efforts, and Potential Links to High Performance Computing in the Era of BIG DATA
T. Ahern, M. Bahavar, R. Casey, C. Trabant, A. Clark, A. Hutko, R. Karstens, Y. Suleiman, B. Weertman
This keynote talk will focus on four fundamentally new areas currently being pursued by IRIS Data Services (DS): 1) new web services, 2) the development of higher-level products, 3) improved Quality Assurance initiatives, and 4) links to high performance computing.
IRIS DS has traditionally focused on ingesting, curating, and distributing data to the research and monitoring communities through a variety of mechanisms. More recently, the request services portfolio of IRIS DS has expanded to include a Service Oriented Architecture (SOA) design leveraging web services that greatly enhances programmatic access to the waveform and metadata holdings managed at the IRIS DMC. This talk will discuss the new web services available from the DMC. The DMC has also broadened the levels of products and information that it supports, and now offers a variety of products, developed by the DMC and/or the research community, that serve as stepping-stones to further research.
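As an illustration of what programmatic access through web services can look like, the following minimal Python sketch builds a request URL for a waveform query. It assumes an FDSN-style dataselect endpoint at service.iris.edu; the base URL, the parameter names, and the station and time values are illustrative assumptions rather than details given in this abstract, and should be checked against the current DMC documentation. The helper name build_waveform_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL of an FDSN-style dataselect service (not stated in the abstract).
BASE_URL = "http://service.iris.edu/fdsnws/dataselect/1/query"


def build_waveform_url(network, station, location, channel, start, end):
    """Return a query URL requesting waveform data for one channel and time window.

    Hypothetical helper: parameter names follow the assumed FDSN conventions.
    """
    params = {
        "net": network,
        "sta": station,
        "loc": location,
        "cha": channel,
        "start": start,
        "end": end,
    }
    # urlencode percent-escapes reserved characters such as ':' in the timestamps.
    return BASE_URL + "?" + urlencode(params)


# Illustrative values only: one hour of one broadband channel.
url = build_waveform_url(
    "IU", "ANMO", "00", "BHZ",
    "2012-01-01T00:00:00", "2012-01-01T01:00:00",
)
print(url)
```

The sketch only constructs the URL and performs no network request; fetching it with any HTTP client would be the next step in a script that retrieves data without manual intervention.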
As data volumes increase, it is clear that the manner in which the research community interacts with the data must change, and that automation in the assessment of data quality must improve. A new QA system called MUSTANG is now in beta release and is a major development intended to improve the quality of seismic data worldwide. It applies to all data, whether provided by IRIS-funded networks or by networks that contribute data to the IRIS DMC for further distribution to the monitoring and research communities. IRIS is optimistic that the automation of MUSTANG, coupled with tighter connections to network operators, will meet the objective of improved data quality.
IRIS is in the near-final stages of deploying an Auxiliary Data Center near one of the nation’s fastest computers at Lawrence Livermore National Laboratory. This architecture is intended not only to meet the IRIS DMC’s need to mitigate any failure at the IRIS DMC in Seattle, but also to take a step towards addressing the “cycles close to data” issue that currently limits seismologists’ ability to perform massive computations over all or significant portions of the DMC data holdings. While this should be considered a first step in this direction, it is nevertheless quite significant.
Finally, this talk will mention a project to assign Digital Object Identifiers (DOIs) as a convenient way to attribute data sets from permanent seismic networks, PI-driven temporary seismic networks such as those facilitated by PASSCAL, and products managed at the IRIS DMC.