Data Services Newsletter

Volume 3 : No 2 : June 2001

BUD: An On-line Buffer of Real Time Data

Accessing Real Time Data at the IRIS DMC

The IRIS Data Management Center has primarily dealt with delayed data. Until now, our only effort in near real time data has been our participation in the University of Washington SPYDER® system, which provides data from significant earthquakes within a few hours of their occurrence. During the past year we have begun developing systems to handle data in real time for distribution to end users.

Figure 1: Schematic diagram showing the primary data repositories at the IRIS DMC.

The figure above shows the primary data repositories at the DMC. The most familiar is, of course, the very large archive of continuous waveforms in the mass storage system. We currently have about 20 terabytes (20,000,000,000,000 bytes) of data available in the archive, although its total capacity is about 360 terabytes. The archive also holds a collection of active source data in SEG-Y format, older PASSCAL experimental data in miscellaneous formats, and a wide variety of other data in various formats (for instance, the Apollo Lunar Seismic Data Set), in what are termed Assembled Data Products.

The FARM is a collection of quality-controlled, event-oriented data sets assembled for larger events. Historically, the FARM data have been generated only for GSN data, but the new FARM is built for all networks from which the DMC receives data, including PASSCAL, FDSN networks and regional networks. The data in the FARM volumes come from the archive.

SPYDER® data have traditionally come from near real time links to the seismic stations, such as telephone lines, autoDRM methods or, in increasing numbers, Internet connections. The new SPYDER® system is now functioning in beta mode and receives its data directly from the new real time system at the DMC, which we call the Buffer of Uniform Data, or BUD.

The SPYDER®, FARM, and BUD data all reside on large disk systems at the DMC and are therefore well suited to providing very fast access to data. The mass storage system, by contrast, provides near-line robotic access to the entire archive but takes from a few minutes to a few hours to serve waveforms for user requests.

The BUD is the method being developed to offer access to real time data at the IRIS DMC. At present, data arrive in the BUD from one of two systems: the commercial Antelope system developed by Boulder Real Time Technologies, Inc. (BRTT) and the Earthworm Waveserver developed by the USGS in Golden, CO. The Antelope system supports the reception of data from 12 different networks, including the IRIS GSN (ASL and IDA), the PASSCAL Broadband Array, GEOFON data from Potsdam, data from Patrick Air Force Base, two central Asian networks and several regional networks in the United States. The Earthworm technique uses a waveserver client developed for the IRIS DMS by ISTI. Through it we are currently receiving data from the USNSN, the University of Utah, the University of Washington and the Montana Regional Network, and we plan to add several more regional networks using an Earthworm client in the near future. We are also receiving data from the German GEOFON network using a SEEDlink adapter for Antelope developed by ORFEUS. We have plans to receive data from the GERESS array in Germany as well.

Figure 2: Schematic diagram showing how data flow into the BUD system in near real time.

Whatever their source, all of these data flow into the IRIS DMC's BUD system in near real time. BUD is quite simply a well-organized file structure with a variety of tools that can manipulate the data within that structure. Our long-term goal is to keep approximately 6 months of data from each network on-line in BUD. Older data will then move from the BUD into the primary tape-based mass storage system.
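To make the idea of a well-organized file structure with a rolling on-line window more concrete, the following Python sketch may help. It is illustrative only: the `/bud` root, the directory layout, the file naming pattern and the 183-day retention figure are all assumptions made for this example, not a description of the actual BUD layout or tools.

```python
from datetime import date
from pathlib import PurePosixPath

# Assumption: "approximately 6 months" is taken here as 183 days.
RETENTION_DAYS = 183


def day_file_path(root, network, station, channel, day):
    """Build a hypothetical path for one station-channel-day file.

    The network/station directory nesting and the
    STATION.NETWORK.CHANNEL.YEAR.DAYOFYEAR file name are invented
    for illustration.
    """
    day_of_year = day.timetuple().tm_yday
    name = f"{station}.{network}.{channel}.{day.year}.{day_of_year:03d}"
    return PurePosixPath(root) / network / station / name


def is_due_for_archive(day, today, retention_days=RETENTION_DAYS):
    """Return True when a day of data is older than the on-line window."""
    return (today - day).days > retention_days
```

Under these assumptions, `day_file_path("/bud", "IU", "ANMO", "BHZ", date(2001, 6, 1))` yields `/bud/IU/ANMO/ANMO.IU.BHZ.2001.152`, and a day file from 1 November 2000 would be flagged for migration to mass storage when checked on 1 June 2001.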

An article in the next DMS Electronic Newsletter will provide greater detail on the BUD utility tools, which assist in viewing data availability, feed latency, data latency, GMT maps, waveform displays and other such features. The remainder of this article focuses on how the research community can gain access to the data in the BUD.

Figure 3: Schematic diagram showing the major methods by which data can be accessed out of the BUD system.

The figure above shows the major methods by which data can be accessed out of the BUD system. These include:

  • ORB to ORB for data in the Antelope System
  • LISS, the ASL-developed real time data delivery system
  • FTP, files containing station-channel-days
  • AutoDRM, the GSE/CTBT data access tool
  • Data Handling Interface, available later this year (see previous newsletter article)

At the present time users can access data from the BUD either by LISS or FTP. In the very near future, autoDRM access will be provided. Several clients are in the process of being developed for the Data Handling Interface and will begin appearing over the next few months.
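As a rough illustration of FTP access, the Python sketch below retrieves one station-channel-day file using the standard `ftplib` module. The host name, the anonymous login and the remote path are placeholders invented for this example; the actual server and directory layout should be taken from the on-line help pages or from the DMC.

```python
import io
from ftplib import FTP


def fetch_day_file(ftp, remote_path):
    """Download one file over an open FTP connection and return its bytes."""
    buffer = io.BytesIO()
    # retrbinary streams the file in blocks and passes each block to the callback.
    ftp.retrbinary(f"RETR {remote_path}", buffer.write)
    return buffer.getvalue()


def fetch_from_host(host, remote_path):
    """Open an FTP session, fetch one file, and close the session."""
    with FTP(host) as ftp:
        ftp.login()  # anonymous login; an assumption for this sketch
        return fetch_day_file(ftp, remote_path)


# Hypothetical usage (placeholder host and path, not real DMC values):
# data = fetch_from_host("ftp.example.org", "/bud/IU/ANMO/ANMO.IU.BHZ.2001.152")
```

Keeping the download step separate from the connection setup means the same function can be reused across many files in one session, which is the likely pattern when requesting several station-channel-days at once.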

For more information about the BUD see the on-line help pages or, for more specific access information, contact Sandy Stromme at the IRIS DMC.

by Tim Ahern (IRIS Data Management Center)
