Data Services Newsletter

Volume 3 : No 1 : March 2001

New Sync tools

Two new Sync file tools have been developed for internal use at the DMC which may be of use to DCCs.

getpercent – this Perl script reports the percentage of data missing from a Sync file on a channel-by-channel basis. The time range used to determine the percentage runs from the minimum to the maximum time found for each channel. The output is not pretty, but it is thorough: for each channel, the report gives the min and max time, the total seconds in the time range, the seconds of gap, and the percentage missing. A tolerance may be specified in seconds, so that gaps less than or equal to the tolerance are not counted.


cat syncfile | getpercent [tolerance]

Example output

IU|XMAS|01|BHE 1999,120,01:45:58 to 1999,365,23:59:39 (21248021 secs): gaps = 2605704 secs, 12.26% missing
IU|XMAS|01|BHN 1999,120,01:45:58 to 1999,365,23:59:39 (21248021 secs): gaps = 3469610 secs, 16.33% missing
IU|XMAS|01|BHZ 1999,120,01:45:58 to 1999,365,23:59:39 (21248021 secs): gaps = 3469679 secs, 16.33% missing
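
The percentage calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the getpercent script itself: it assumes the time segments for one channel have already been read into (start, end) pairs expressed in seconds, and the function name percent_missing is hypothetical.

```python
def percent_missing(segments, tolerance=0.0):
    """Return (total_secs, gap_secs, pct_missing) for one channel.

    segments  -- list of (start, end) times in seconds (assumed input form)
    tolerance -- gaps less than or equal to this many seconds are ignored
    """
    segments = sorted(segments)
    t_min = segments[0][0]
    t_max = max(end for _, end in segments)
    total = t_max - t_min

    gap_secs = 0.0
    covered_to = segments[0][1]          # latest end time seen so far
    for start, end in segments[1:]:
        gap = start - covered_to
        if gap > tolerance:              # only gaps above the tolerance count
            gap_secs += gap
        covered_to = max(covered_to, end)

    pct = 100.0 * gap_secs / total if total > 0 else 0.0
    return total, gap_secs, pct


if __name__ == "__main__":
    # Three segments with a 50 second gap and a 1 second gap.
    segs = [(0, 100), (150, 200), (201, 300)]
    print(percent_missing(segs, tolerance=1))
```

The same ratio applies to the example output: 2605704 seconds of gap out of 21248021 seconds in the time range is the 12.26% reported for BHE.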

goatreport – this Perl script reports the gaps, overlaps, and continuous time segments (selected with the arguments -g, -o, and -c) in a Sync file. Times are printed as month and day by default, or as Julian day if the argument -j is used. This script can be used for automated analysis of Sync files. It has a silent mode (-s) that generates no output; instead, the exit status indicates whether any of the specified gaps, overlaps, and/or continuous time segments were found. Silent mode may be useful when trying to trap infrequent problems, where a full report is not needed.


cat syncfile | goatreport [tolerance] [-j] [-g] [-o] [-c] [-s]

Example Unix shell script

This example Unix (csh) script traps overlaps greater than 5 seconds and moves each affected Sync file to a special area for later analysis:

foreach syncfile (`ls *.syncfile`)
    cat $syncfile | goatreport 5 -o -s
    if ($status != 0) then
        mv $syncfile bad_syncfile_dir
    else
        mv $syncfile good_syncfile_dir
    endif
end
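
For readers who want to see the idea behind the report, the following Python sketch classifies the time segments of one channel into gaps, overlaps, and continuous runs. It is not the goatreport code. It assumes segments are (start, end) pairs in seconds, that an overlap does not break a continuous run, and that a non-zero status means something was found (as the shell example suggests). The names classify and silent_status are hypothetical.

```python
def classify(segments, tolerance=0.0):
    """Split one channel's segments into gaps, overlaps, continuous runs.

    A jump forward of more than `tolerance` seconds is a gap; a jump
    backward of more than `tolerance` seconds is an overlap (assumed rule).
    """
    segments = sorted(segments)
    gaps, overlaps, continuous = [], [], []
    run_start, run_end = segments[0]
    for start, end in segments[1:]:
        delta = start - run_end
        if delta > tolerance:
            gaps.append((run_end, start))
            continuous.append((run_start, run_end))   # close the current run
            run_start, run_end = start, end
        else:
            if delta < -tolerance:
                overlaps.append((start, min(run_end, end)))
            run_end = max(run_end, end)               # extend the current run
    continuous.append((run_start, run_end))
    return gaps, overlaps, continuous


def silent_status(found):
    """Exit status for a silent mode: 1 if anything was found, else 0 (assumed)."""
    return 1 if found else 0


if __name__ == "__main__":
    segs = [(0, 100), (90, 200), (260, 300)]
    gaps, overlaps, continuous = classify(segs, tolerance=5)
    print("gaps:", gaps)
    print("overlaps:", overlaps)
    print("continuous:", continuous)
    print("status when checking overlaps only:", silent_status(overlaps))
```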

squash – this Perl script joins timeseries found in a Sync file to a specified tolerance. The tolerance specified is in seconds. Squash can be used to analyze data holdings to various time granularities. For example, if someone wants to view gaps in data which are greater than 1 day in length, the Sync file can be run through squash with a tolerance of 86400 seconds. Squash can also be used to join timeseries which are continuous, but where a time break was introduced due to the particulars of a Sync file writer. For example, if a Sync file writer always breaks timeseries on a day boundary, at time 23:59:59 and picks up the next day at time 00:00:00, this Sync file can be run through squash with a tolerance of 1 second to create a more compact and quickly analyzed Sync file.


cat syncfile | squash [tolerance]
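
The joining step can be illustrated with a short Python sketch. It is a minimal version of the idea rather than the squash script: it assumes the segments of a single channel are held as (start, end) pairs in seconds, and the function name squash_segments is hypothetical.

```python
def squash_segments(segments, tolerance=0.0):
    """Join segments whose separation is at most `tolerance` seconds."""
    merged = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= tolerance:
            # Close enough to the previous segment: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    # Two days of data broken at the day boundary (23:59:59 / 00:00:00).
    days = [(0, 86399), (86400, 172799)]
    print(squash_segments(days, tolerance=1))
```

With a tolerance of 1 second the two day-long segments collapse into one; with a tolerance of 86400 seconds only breaks longer than one day would remain.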


These scripts may be found in ... Edits may be required to point to the local Perl installation.

Contact Sandy Stromme with any questions or to be notified of updates to these scripts.

by Sandy Stromme (IRIS Data Management Center)
