Data Services Newsletter

Volume 21 : No 1 : Spring 2019

ROVER - A new tool to robustly access data sets

ROVER is a new data access tool designed to robustly collect data from an FDSN data center such as the DMC. While ROVER can be used to collect data sets of any size, it is primarily intended for power users who wish to collect large volumes of data.

ROVER is a command line program that downloads selected data and builds a local, indexed repository in miniSEED format. The following features support downloads of very large requests that run for extended time periods:

  • An embedded webserver provides a simple mechanism to see the current status of a download
  • An interrupted download can be restarted and will fetch only the data not yet downloaded
  • An email can be sent on completion of a download
  • A local data organization that is appropriate for an arbitrary volume of data
  • Built-in capability to query the data set

Inherent in ROVER’s design is the ability to compare the local data repository with a center’s data availability and subsequently fetch any data that is available but not yet downloaded. Not only does this allow ROVER to be restarted after an interrupted download, but ROVER can be run again after an initial download to remain synchronized with a data center, add a different selection of data, or even add data from another repository or center.
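The compare-then-fetch idea can be illustrated with a small interval calculation. The following is a minimal sketch, not ROVER's actual implementation: it assumes availability and local holdings are each expressed as lists of (start, end) time spans for one channel, and the function names merge_spans and missing_spans are hypothetical.

```python
from datetime import datetime
from typing import List, Tuple

Span = Tuple[datetime, datetime]


def merge_spans(spans: List[Span]) -> List[Span]:
    """Sort spans and merge any that overlap or touch."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def missing_spans(available: List[Span], local: List[Span]) -> List[Span]:
    """Return the parts of `available` that are not covered by `local`."""
    local_merged = merge_spans(local)
    missing: List[Span] = []
    for a_start, a_end in merge_spans(available):
        cursor = a_start  # earliest time not yet accounted for
        for l_start, l_end in local_merged:
            if l_end <= cursor:
                continue  # local span lies entirely before the cursor
            if l_start >= a_end:
                break  # local span lies entirely after this available span
            if l_start > cursor:
                missing.append((cursor, l_start))  # gap before local span
            cursor = max(cursor, l_end)
            if cursor >= a_end:
                break
        if cursor < a_end:
            missing.append((cursor, a_end))  # trailing gap
    return missing
```

Running the same calculation again after a partial download yields only the remaining gaps, which is why a restart or a later synchronization run does not need to repeat work already done.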

Organized and indexed data set

As data are downloaded, they are stored in miniSEED format in an organized structure on the user’s computer, ready for use. An index of the data is also created (by mseedindex) in an SQLite database.

Using the index, ROVER has built-in capability to summarize or search for specific data in the local repository. This capability is accessed with the list-summary and list-index commands.

While we do not expect most users to directly interact with the SQLite-based index, it can easily be used by other programs to identify available data, file names, sizes, etc. Direct access also allows users to translate the index to a different indexing scheme, such as CSS-based systems.
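For readers who want to explore the index from their own programs, the sketch below lists the tables and columns found in an SQLite file using Python's standard sqlite3 module. It deliberately makes no assumption about the table or column names that mseedindex creates; the function name describe_index is hypothetical.

```python
import os
import sqlite3
from typing import Dict, List


def describe_index(db_path: str) -> Dict[str, List[str]]:
    """Return {table name: [column names]} for every table in an SQLite file."""
    # sqlite3.connect() would silently create a missing file, so check first.
    if not os.path.exists(db_path):
        raise FileNotFoundError(db_path)
    conn = sqlite3.connect(db_path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema: Dict[str, List[str]] = {}
        for table in tables:
            # Each PRAGMA table_info row is (cid, name, type, notnull, default, pk).
            columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
            schema[table] = [column[1] for column in columns]
        return schema
    finally:
        conn.close()
```

Once the table and column names are known, ordinary SQL SELECT statements can be used to look up file names, sizes and time ranges, or to export the index to another scheme.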

Highlight: The indexed repository of data created by ROVER can be used by the DMC’s portable-fdsnws-dataselect to create your very own FDSN-standard data web service. Most tools that access data via the fdsnws-dataselect service, such as FetchData, ObsPy, irisFetch.m, etc. can be directed to access data from an indexed, local repository using portable-fdsnws-dataselect.
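As an illustration, a request to a locally running service could be built as follows. This is a sketch under assumptions: the base address http://localhost:8080 is hypothetical and depends on how the local service is configured, while the path and parameter names follow the FDSN fdsnws-dataselect convention. The function name dataselect_url is also hypothetical.

```python
from urllib.parse import urlencode


def dataselect_url(base_url: str, net: str, sta: str, loc: str, cha: str,
                   start: str, end: str) -> str:
    """Build an fdsnws-dataselect query URL for a single channel and time window."""
    params = {
        "net": net,
        "sta": sta,
        "loc": loc,
        "cha": cha,
        "starttime": start,  # ISO 8601, e.g. 2019-01-01T00:00:00
        "endtime": end,
    }
    return f"{base_url.rstrip('/')}/fdsnws/dataselect/1/query?{urlencode(params)}"


# Hypothetical local address; adjust to match the local service configuration.
url = dataselect_url("http://localhost:8080", "IU", "ANMO", "00", "BHZ",
                     "2019-01-01T00:00:00", "2019-01-01T01:00:00")
print(url)
```

The same URL pattern works against a data center's service by changing only the base address, which is what allows existing clients to be redirected to a local repository.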

All channels of data for a station are stored in a single file per day, according to the following pattern:

data/<NET>/<STA>/<STA>.<NET>.<YEAR>.<DayOfYear>

The SQLite file is stored, by default, at data/timeseries.sqlite.
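The file-naming pattern above can be reproduced with a few lines of Python, which may be handy when locating a particular station-day by hand. This is a minimal sketch: the zero-padding of the day of year to three digits is an assumption not stated in the pattern, and the function name repository_path is hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath


def repository_path(net: str, sta: str, day: date, root: str = "data") -> PurePosixPath:
    """Build <root>/<NET>/<STA>/<STA>.<NET>.<YEAR>.<DayOfYear> for one station-day."""
    day_of_year = day.timetuple().tm_yday
    # Assumption: the day of year is zero-padded to three digits (e.g. 058).
    file_name = f"{sta}.{net}.{day.year}.{day_of_year:03d}"
    return PurePosixPath(root) / net / sta / file_name


print(repository_path("IU", "ANMO", date(2019, 2, 27)))
```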

Other data centers and repositories

ROVER requires an fdsnws-dataselect web service, such as the DMC’s implementation, and a data availability service compatible with the DMC’s irisws-availability service. Any data center with these interfaces may be used with ROVER.

By default, ROVER accesses the DMC’s primary repository of miniSEED-based data. In the near future, ROVER will also be able to access data in the DMC’s PH5-based repository, which holds primarily active-source and mixed-mode data.

Installation and documentation

ROVER is written in Python and can be installed with pip. ROVER requires the mseedindex program, which is distributed as C-language source code, so a C compiler and a make program are needed as well.

Full installation instructions and documentation are available at the ROVER homepage:
https://iris-edu.github.io/rover/

The initial development of ROVER was performed by Instrumental Software Technologies, Inc. (ISTI).

by Chad Trabant, Timothy Ronan and Nick Falco (IRIS DMC)
