Thread: ROVER - A new tool to robustly access data sets

Started: 2019-05-21 16:19:08
Last activity: 2019-05-21 16:19:08
Topics: Web Services
Chad Trabant
2019-05-21 16:19:08

Introducing ROVER, a new data access tool designed to robustly collect data from the IRIS DMC. While ROVER can be used to collect data sets of any size, it is primarily intended for power users who wish to collect large volumes of data.

ROVER is a command line program that downloads selected data and builds a local, indexed repository in miniSEED format. The following features support downloads of very large requests that run for extended time periods:

A download can be restarted to continue an interrupted download
An embedded webserver provides a simple mechanism to see the current status of a download
An email can be sent on completion of a download
Downloaded data are organized locally in a structure suited to an arbitrary volume of data
Built-in capability to query the data set

Inherent in ROVER’s design is the ability to compare the local data repository with a center’s data availability and subsequently fetch any data that are available but not yet downloaded. Not only does this allow ROVER to be restarted after an interrupted download, but ROVER can also be run again after an initial download to remain synchronized with a data center, add a different selection of data, or even add data from another repository or center.
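
To make the compare-and-fetch idea concrete, here is a minimal Python sketch of the underlying interval arithmetic. It is not ROVER's actual implementation; the function names and the representation of time spans as (start, end) pairs of epoch seconds are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end) in epoch seconds, start < end.
Span = Tuple[float, float]


def merge_spans(spans: List[Span]) -> List[Span]:
    """Sort spans and merge any that overlap or touch."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Extend the previous span instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def missing_spans(available: List[Span], local: List[Span]) -> List[Span]:
    """Return the parts of `available` that are not covered by `local`."""
    have = merge_spans(local)
    missing: List[Span] = []
    for start, end in merge_spans(available):
        cursor = start  # earliest time in this span not yet accounted for
        for h_start, h_end in have:
            if h_end <= cursor:
                continue  # local span lies entirely before the cursor
            if h_start >= end:
                break  # local span lies entirely after this available span
            if h_start > cursor:
                missing.append((cursor, h_start))  # gap before local data
            cursor = max(cursor, h_end)
            if cursor >= end:
                break
        if cursor < end:
            missing.append((cursor, end))  # trailing gap
    return missing
```

Running the comparison again after the missing spans have been fetched returns an empty list, which is why a restart or a later re-run only downloads what is new.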

Organized and indexed data set

As data are downloaded, they are stored in an organized structure on the user’s computer in miniSEED format, ready for use. Furthermore, an index of the data is created by mseedindex in an SQLite database.

Using the index, ROVER has built-in capability to summarize or search for specific data in the local repository. This capability is accessed with the list-summary and list-index commands.

While we do not expect most users to directly interact with the SQLite-based index, it can easily be used by other programs to identify available data, file names, sizes, etc. Direct access also allows users to translate the index to a different indexing scheme, such as CSS-based systems.
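
For those who do want to read the index from another program, the following Python sketch lists the files and total indexed bytes for one station. The table name (tsindex) and the column names (network, station, filename, bytes) are assumptions based on the schema that mseedindex creates by default; please check them against your own database before relying on this.

```python
import sqlite3
from typing import List, Tuple


def list_files(db_path: str, network: str, station: str,
               table: str = "tsindex") -> List[Tuple[str, int]]:
    """Return (filename, total indexed bytes) for one network/station.

    The table and column names are assumed, not confirmed by this post.
    """
    # Table names cannot be bound as SQL parameters, so validate before use.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    query = (
        f"SELECT filename, SUM(bytes) FROM {table} "
        "WHERE network = ? AND station = ? "
        "GROUP BY filename ORDER BY filename"
    )
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(query, (network, station)).fetchall()
    finally:
        conn.close()
```

A call such as list_files("data/timeseries.sqlite", "IU", "ANMO") would use the default index location; the network and station codes here are only examples.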

Highlight: The indexed repository of data created by ROVER can be used by the DMC’s portable-fdsnws-dataselect to create your very own FDSN-standard data web service. Most tools that access data via the fdsnws-dataselect service, such as FetchData, ObsPy, irisFetch.m, etc. can be directed to access data from an indexed, local repository using portable-fdsnws-dataselect.
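
As a rough illustration of what directing a client at a local service involves, the Python sketch below builds a standard fdsnws-dataselect query URL. The path and parameter names follow the FDSN web service specification; the base URL (http://localhost:8080) is only an assumption about where a local portable-fdsnws-dataselect instance might be listening, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def dataselect_url(base_url: str, network: str, station: str,
                   location: str, channel: str,
                   start: str, end: str) -> str:
    """Build an fdsnws-dataselect query URL for one channel and time window."""
    params = {
        "net": network,
        "sta": station,
        "loc": location,
        "cha": channel,
        "start": start,  # ISO 8601, e.g. 2019-05-01T00:00:00
        "end": end,
    }
    return f"{base_url.rstrip('/')}/fdsnws/dataselect/1/query?{urlencode(params)}"


# Assumed local address; adjust to match your own service configuration.
url = dataselect_url("http://localhost:8080", "IU", "ANMO", "00", "BHZ",
                     "2019-05-01T00:00:00", "2019-05-02T00:00:00")
```

Only the base URL differs from a request to a data center's own service, which is why existing fdsnws-dataselect clients can be redirected without other changes.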

All channels of data for a station are stored in a single file per day, according to the following pattern:

The SQLite file is stored, by default, at data/timeseries.sqlite.

Other data centers and repositories

ROVER requires an fdsnws-dataselect web service, such as the DMC’s implementation, and a data availability service compatible with the DMC’s irisws-availability service. Any data center with these interfaces may be used with ROVER. In the future, ROVER will be updated to support the FDSN's standardized availability service.

By default, ROVER will access the DMC’s primary repository of miniSEED-based data. In the near future, ROVER will also be able to access data in the DMC’s PH5-based repository, which is primarily active source and mixed-mode data.

Installation and documentation

ROVER is written in Python and can be installed with pip. ROVER requires the mseedindex program, which is distributed as C-language source code; a C compiler and the make program are therefore needed as well.

Installation instructions and documentation are available at the ROVER homepage:
The initial development of ROVER was performed by Instrumental Software Technologies, Inc. (ISTI), with particular acknowledgment to Andrew Cooke.