Hi.
I'm retrieving continuous data using bulkdataselect, one day at a time. A
typical request line might look like "TA X16A -- BHZ 2007-03-30T00:00:00
2007-03-30T23:59:59.999"
Using this method, I occasionally miss one or two samples at the day
boundaries. I'm under the impression that the DMC internals make it more
efficient to request on day boundaries. How do you recommend I do this to
keep the data continuous and not miss samples at the day boundaries?
Thanks!
-- John
-
One suggestion (which unfortunately changes the IRIS web service
time specification) is for time intervals to be half-open intervals,
represented in math notation as
[time1, time2)
This means the time interval where time t >= time1 and t < time2.
I believe that all IRIS services currently define a closed interval
[time1, time2], which means the time interval where
time t >= time1 and t <= time2.
Closed intervals make it very hard to issue a series of
requests whose results can be concatenated to generate a
contiguous timeseries with no overlap. For day requests,
two requests for:
2007-03-03T00:00:00.0000 to 2007-03-04T00:00:00.0000
2007-03-04T00:00:00.0000 to 2007-03-05T00:00:00.0000
will contain 2 copies of a sample whose timestamp is 2007-03-04T00:00:00.0000.
However, if the requests are half-open intervals, you will never miss a sample or
get a duplicate sample at a request boundary.
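The difference between the two conventions can be illustrated with a small sketch on synthetic timestamps. The helpers `select_closed` and `select_half_open` are made up for this illustration and are not part of any IRIS service; the 1 Hz sample times are likewise invented.

```python
from datetime import datetime, timedelta

def select_closed(samples, t1, t2):
    """Closed interval [t1, t2]: keep samples with t1 <= t <= t2."""
    return [t for t in samples if t1 <= t <= t2]

def select_half_open(samples, t1, t2):
    """Half-open interval [t1, t2): keep samples with t1 <= t < t2."""
    return [t for t in samples if t1 <= t < t2]

# Synthetic 1 Hz samples from 3 s before to 3 s after midnight (illustrative only).
midnight = datetime(2007, 3, 4)
samples = [midnight + timedelta(seconds=s) for s in range(-3, 4)]

day1 = (datetime(2007, 3, 3), midnight)
day2 = (midnight, datetime(2007, 3, 5))

# Concatenate the results of two consecutive day requests.
closed = select_closed(samples, *day1) + select_closed(samples, *day2)
half_open = select_half_open(samples, *day1) + select_half_open(samples, *day2)

print("samples:", len(samples))
print("closed windows return:", len(closed), "(midnight sample duplicated)")
print("half-open windows return:", len(half_open))
```

With closed windows the midnight sample is returned by both requests, while the half-open windows return each sample exactly once.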
If you are missing 1 sample at a day boundary, it could be that you are missing a
sample timestamped between 23:59:59.999 and 00:00:00.000. If you are missing
more than one sample, there is either a time tear (or gap) in the timeseries,
or there is a problem with the IRIS web service.
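As a rough check of that explanation, the sketch below tests whether a sample falls inside either of two consecutive closed day windows whose end times have only millisecond resolution. The sample time 23:59:59.999500 is a hypothetical value chosen for illustration, and `in_closed_window` is a made-up helper, not a service call.

```python
from datetime import datetime

def in_closed_window(t, start, end):
    """True if t lies in the closed interval [start, end]."""
    return start <= t <= end

# Hypothetical sample half a millisecond before midnight.
sample = datetime(2007, 3, 30, 23, 59, 59, 999500)

# Two consecutive day requests ending at 23:59:59.999 (millisecond resolution).
day1 = (datetime(2007, 3, 30), datetime(2007, 3, 30, 23, 59, 59, 999000))
day2 = (datetime(2007, 3, 31), datetime(2007, 3, 31, 23, 59, 59, 999000))

missed = not (in_closed_window(sample, *day1) or in_closed_window(sample, *day2))
print("sample missed by both requests:", missed)
```

Any sample in the final millisecond of the day is later than the first window's end and earlier than the second window's start, so neither request returns it.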
- Doug N
On 1/25/12 10:24 PM, John D. West wrote:
_______________________________________________
webservices mailing list
webservices<at>iris.washington.edu
http://www.iris.washington.edu/mailman/listinfo/webservices
Doug Neuhauser University of California, Berkeley
doug<at>seismo.berkeley.edu Berkeley Seismological Laboratory
Office: 510-642-0931 215 McCone Hall # 4760
Fax: 510-643-5811 Berkeley, CA 94720-4760
Remote: 530-752-5615 (Wed,Fri)
-
Thanks, Doug.
That's exactly correct; in one case I'm missing a sample which should have
been at 23:59:59.999998. I was trying to avoid crossing day boundaries because
I understood that was more intensive processing on the DMC end. My process
for stitching together traces can handle overlap, so the brute force method
would be to just request to 00:00:01 the next day. I'd like to know if
there is a more efficient way.
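For readers who want to try the overlap approach, here is a minimal sketch of stitching overlapping segments by dropping duplicate timestamps. It is not the actual process referred to above; `stitch` is a hypothetical helper, the data are synthetic, and it assumes duplicated samples carry exactly equal timestamps.

```python
from datetime import datetime, timedelta

def stitch(segments):
    """Merge (timestamp, value) segments, keeping the first copy of each timestamp."""
    merged = {}
    for segment in segments:
        for t, value in segment:
            merged.setdefault(t, value)  # ignore later duplicates of the same time
    return sorted(merged.items())

# Synthetic 1 Hz data around midnight (illustrative only).
t0 = datetime(2007, 3, 30, 23, 59, 58)
day1 = [(t0 + timedelta(seconds=i), i) for i in range(0, 4)]  # 23:59:58 to 00:00:01
day2 = [(t0 + timedelta(seconds=i), i) for i in range(2, 6)]  # 00:00:00 to 00:00:03

trace = stitch([day1, day2])
print(len(day1) + len(day2), "samples requested,", len(trace), "after stitching")
```

The two overlapping samples at 00:00:00 and 00:00:01 appear once in the stitched trace.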
-- John
On Thu, Jan 26, 2012 at 5:19 PM, Doug Neuhauser <doug<at>seismo.berkeley.edu> wrote:
-
Hi
Not sure if this is related, but back in November I identified a bug
where a request that asked for data till 12.999 seconds would only get
data up to 12.000, so there would be almost a second of data missing
within the request window.
Chad said that this would be addressed in the next release of the web
services, but I am not sure if that has happened or not. Chad, can you
let us know the status of this bug fix?
One thing I have done in the past to get continuous data is to make
the request for the next window begin at the end time of the data returned
by the previous request. So if you ask for a day and get data ending
at 23:59:51.234, then the request for the next day would start at
23:59:51.235. This obviously requires some bookkeeping, but it might be more
efficient than asking for a really small time window before moving to
the next day, and it also makes it pretty likely that you will not miss
data.
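The bookkeeping could look roughly like the sketch below. The function names `next_request` and `fmt_ms` are hypothetical, the 1 ms step matches the example above, and the sketch assumes the sample interval is longer than 1 ms so that no sample can fall inside the skipped millisecond.

```python
from datetime import date, datetime, time, timedelta

def fmt_ms(t):
    """Format a datetime with millisecond resolution, e.g. 2007-03-30T23:59:51.235."""
    return t.strftime("%Y-%m-%dT%H:%M:%S.") + f"{t.microsecond // 1000:03d}"

def next_request(last_sample_time, next_day):
    """Start 1 ms after the last sample received; end at the last ms of next_day."""
    start = last_sample_time + timedelta(milliseconds=1)
    end = datetime.combine(next_day, time(23, 59, 59, 999000))
    return start, end

# The previous request returned data ending at 23:59:51.234.
last_sample_time = datetime(2007, 3, 30, 23, 59, 51, 234000)
start, end = next_request(last_sample_time, date(2007, 3, 31))
print("TA X16A -- BHZ", fmt_ms(start), fmt_ms(end))
```

Each request then picks up where the previous response actually ended, regardless of where the day boundary falls.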
Philip
On Thu, Jan 26, 2012 at 5:01 AM, John D. West <john.d.west<at>asu.edu> wrote:
-
Hi John,
The times are inclusive, as you've figured out, such that any sample occurring on the time specified will be included in the output. We do not anticipate changing this logic.
Your request is exactly the right way to select a whole day; the problem is that the service currently only supports millisecond resolution. We will be updating the service to support microsecond resolution to match the resolution supported by the underlying miniSEED format. After that, you should be able to request, for example:
start: 2007-03-30T00:00:00.000000
end: 2007-03-30T23:59:59.999999
and always get the entire day. Thanks for bringing this to light. It might be a week or two before we roll out an update.
Requesting time ranges on the day boundaries would work, but this is not preferable because you must deal with any overlap and it adds a bit more load for the DMC request mechanism.
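As an illustration of the whole-day requests with microsecond end times described above, the following sketch builds request lines in the same layout as the example in the original question. The function `day_request_lines` is a hypothetical helper, and the sketch assumes the service accepts microsecond timestamps, which per this message is only true once the update is rolled out.

```python
from datetime import date, datetime, time, timedelta

def day_request_lines(net, sta, loc, cha, first_day, n_days):
    """Build one request line per day, from 00:00:00.000000 to 23:59:59.999999."""
    lines = []
    for i in range(n_days):
        day = first_day + timedelta(days=i)
        start = datetime.combine(day, time(0, 0, 0))
        end = datetime.combine(day, time(23, 59, 59, 999999))
        lines.append(
            f"{net} {sta} {loc} {cha} "
            f"{start.isoformat(timespec='microseconds')} "
            f"{end.isoformat(timespec='microseconds')}"
        )
    return lines

lines = day_request_lines("TA", "X16A", "--", "BHZ", date(2007, 3, 30), 2)
for line in lines:
    print(line)
```

Because the end of one day and the start of the next are one microsecond apart, consecutive requests neither overlap nor leave a gap at miniSEED time resolution.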
Chad
-
-