Overview
This is a study of time series gaps in TA network data archived at the DMC from January 2005 through May 2008. For the initial investigation, only gaps greater than one second in duration on TA.BHZ channels are considered. A short discussion of gaps of one second or less follows the figures.
Several metrics are used in this study: number of gaps per calendar day, number of stations with gaps per calendar day, number of gaps per station-day, and mean time between gaps. Note that the sum of the number of gaps per day is not equivalent to the total number of gaps, because gaps spanning a day boundary are counted twice in the station-day and calendar-day metrics. For example, if a station has data on day 100 from 00:00-15:00 UTC and on day 101 from 02:00-23:00 UTC, then there are two station-days, each with one gap. Also, if a station has no data for a given day, that day is not counted as a station-day or as a gap.
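The counting rule above can be sketched in a few lines of Python. This is a minimal illustration and not the code used for the study: the function name, the tuple layout, and the station name STA1 are hypothetical, and the handling of multi-day gaps (only the first and last day are counted, since the days in between hold no data) is an assumption drawn from the description.

```python
from collections import Counter
from datetime import datetime, timedelta

# Threshold of the initial study: only gaps strictly longer than one second.
MIN_GAP = timedelta(seconds=1)

def gaps_per_station_day(gaps, min_gap=MIN_GAP):
    """Count gaps per (station, date) pair.

    `gaps` is an iterable of (station, gap_start, gap_end) tuples
    holding UTC datetimes (a hypothetical input layout).
    """
    counts = Counter()
    for station, start, end in gaps:
        if end - start <= min_gap:
            continue  # too short for the initial study
        # The day the gap starts and the day it ends each get one count.
        # Days strictly in between hold no data, so they are treated as
        # neither station-days nor gap days (assumption).
        for day in {start.date(), end.date()}:
            counts[(station, day)] += 1
    return counts

# Example from the text: data on day 100 from 00:00-15:00 and on day 101
# from 02:00-23:00. The year 2007 is arbitrary; day 100 is April 10.
example = [("STA1", datetime(2007, 4, 10, 15, 0), datetime(2007, 4, 11, 2, 0))]
counts = gaps_per_station_day(example)
total_gaps = len(example)            # one actual gap
summed_daily = sum(counts.values())  # the same gap is counted on both days
```

Here `summed_daily` exceeds `total_gaps`, which is why the per-day sums do not add up to the total number of gaps.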
Figure 1 illustrates gaps measured by calendar day. The top graph compares the number of stations with data and the number of stations with gaps from 2005 through 2008 day 140. The number of stations with data increased steadily as the TA network was installed and leveled out in the third quarter of 2007. The number of stations with gaps is highly variable, but days with near network-wide gaps stand out. These days were most common in 2007. The second graph shows the number of gaps per day. Most days have a small number of gaps. The largest number (10,156) occurred on 2007 day 265, when the ANF TA systems were undergoing emergency patching. The third graph shows the average number of gaps per station for each day. The highest averages were in 2005.
Figures 2-5 are histograms of gap count per station-day. The bins are groups with 0 gaps, 1-9 gaps, 10-99 gaps, and 100+ gaps per station-day, and their counts are plotted on a logarithmic scale. The total number of station-days varies from 11,248 in 2005 to 111,900 in 2007. The maximum number of gaps in a station-day is 3,645 (2007 day 141; X17A was having communications problems). Throughout the lifetime of the TA array, the number of station-days with zero gaps is more than ten times greater than the number of station-days with gaps.
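As a rough sketch of the binning behind these histograms, the following assigns each station-day to one of the four bins named above. The function names and the sample input are hypothetical; only the bin edges and the maximum of 3,645 gaps come from the text.

```python
from collections import Counter

BIN_LABELS = ("0", "1-9", "10-99", "100+")

def bin_label(n_gaps):
    """Return the histogram bin for one station-day's gap count."""
    if n_gaps == 0:
        return "0"
    if n_gaps <= 9:
        return "1-9"
    if n_gaps <= 99:
        return "10-99"
    return "100+"

def histogram(gap_counts):
    """Count station-days per bin, keeping empty bins so all four always appear."""
    tally = Counter(bin_label(n) for n in gap_counts)
    return {label: tally.get(label, 0) for label in BIN_LABELS}

# Hypothetical gap counts for five station-days; 3645 is the maximum
# reported in the text and falls in the open-ended top bin.
sample = histogram([0, 0, 3, 12, 3645])
```

A logarithmic axis suits these counts because the zero-gap bin is orders of magnitude larger than the 100+ bin, which would be invisible on a linear scale.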
The final metric is mean time between gaps. While this does give a measure of the length of data segments, we found it to be heavily skewed by a small number of days that have a large number of gaps. For 2008, the mean time between gaps is 3.4 days; for 2007 it is 0.8 days; for 2006 it is 13 days; for 2005 it is 0.3 days.
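The skew can be illustrated with a small sketch. The text does not give the exact formula, so this assumes the simplest definition, elapsed days divided by the total number of gaps, and uses made-up daily counts rather than TA data.

```python
def mean_time_between_gaps(daily_gap_counts):
    """Mean time between gaps in days, assuming elapsed days / total gaps."""
    total_gaps = sum(daily_gap_counts)
    if total_gaps == 0:
        return float("inf")  # no gaps at all in the period
    return len(daily_gap_counts) / total_gaps

# Hypothetical 100-day period with one gap every fifth day.
quiet = [1, 0, 0, 0, 0] * 20

# Same period, but the last day has a communications outage with 1000 gaps.
with_outage = quiet[:-1] + [1000]

mtbg_quiet = mean_time_between_gaps(quiet)         # 100 days / 20 gaps
mtbg_outage = mean_time_between_gaps(with_outage)  # 100 days / 1020 gaps
```

Under this assumed definition, a single bad day drops the mean from 5 days to under 0.1 days even though the other 99 days are unchanged.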
Figures from the initial gap study
Figure 1: Number of stations and gaps per calendar day from 2005 through 2008 day 140.
Figure 2: Gaps per station-day for 2008.
Figure 3: Gaps per station-day for 2007.
Figure 4: Gaps per station-day for 2006.
Figure 5: Gaps per station-day for 2005.
Follow-up results including gaps of 1 second or less
The decision to limit the initial study to gaps greater than one second was made for convenience, to limit computational time. To address the question of whether gaps of 1 second or less would significantly alter the results, we reran the numbers for 2007 and 2008. For 2008 (days 001-140) there are 241 gaps of 1 second duration and 3 gaps < 1 second; for 2007 there are 10,199 gaps of 1 second and 5 gaps < 1 second. As shown in the following tables, the addition of these gaps creates only minor differences in the gap count per station-day analysis. Regarding the 2008 data, the increased number of station-days with zero gaps is the result of gap-fill data that we received after the initial analysis was completed.
2008 gaps (days 001-140)
                 # station-days
# gaps/day    New results    Previous results
0             47,519         46,974
1-9           2,973          2,828
10-99         35             57
100-999       20             37
1000+         0              0
2007 gaps
                 # station-days
# gaps/day    New results    Previous results
0             100,965        101,783
1-9           9,642          8,833
10-99         1,175          1,170
100-999       86             86
1000+         28             28
In conclusion, the large majority of station-days are gap free and most calendar days have relatively few gaps. A small number of days have large gap counts, related either to communication failures at individual stations or to communications issues between the ANF and the DMC.
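To put a number on "large majority", the gap-free share can be computed directly from the "New results" columns of the two tables. The dictionary layout and function name below are illustrative choices; the counts are copied from the tables.

```python
# Station-day counts per bin, taken from the "New results" columns.
NEW_RESULTS = {
    "2008 (days 001-140)": {"0": 47519, "1-9": 2973, "10-99": 35, "100-999": 20, "1000+": 0},
    "2007": {"0": 100965, "1-9": 9642, "10-99": 1175, "100-999": 86, "1000+": 28},
}

def gap_free_fraction(bins):
    """Fraction of station-days that fall in the zero-gap bin."""
    total = sum(bins.values())
    return bins["0"] / total

for label, bins in NEW_RESULTS.items():
    total = sum(bins.values())
    print(f"{label}: {gap_free_fraction(bins):.1%} of {total:,} station-days are gap free")
```

With these counts, roughly 94% of station-days in 2008 (through day 140) and roughly 90% in 2007 have no gaps.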
