Thread: time order in event service

Started: 2013-03-14 20:44:47
Last activity: 2013-03-14 22:12:17
Topics: Web Services
Philip Crotwell
2013-03-14 20:44:47
Hi

So, the event web server returns events in "descending" time order. That
makes sense for a person looking at a listing. But for a system like SOD
that does processing on events, it is much more natural to do the
processing in natural time order, i.e. ascending. Obviously I can flip the
order on the client side, so maybe a minor quibble, but it is additional
work and it makes it harder to take advantage of the limit/offset
functionality to step through a large number of events. Just my $0.02.

On that point, if I am running a series of queries to the event server
using limit and offset and it just happens to bridge the insertion of new
data into your event database, will I have missing or duplicate events? In
other words, if I care very much about a consistent batch of events, should
I stay away from limit and offset and manually batch the queries by time
for example?

thanks
Philip

  • Chad Trabant
    2013-03-14 18:54:52

    Hi Philip,

    > So, the event web server returns events in "descending" time order. That makes sense for a person looking at a listing. But a system like SOD that does processing on events, it is much more natural to do the processing in natural time order, ie ascending. Obviously I can flip the order on the client side, so maybe a minor quibble, but it is additional work and it makes it harder to take advantage of the limit/offset functionality to step through a large number of events. Just my $0.02.

    Our soon-to-be-released FDSN event service (see a pattern here?) will support ordering the results by time and magnitude, both descending and ascending.

    > On that point, if I am running a series of queries to the event server using limit and offset and it just happens to bridge the insertion of new data into your event database, will I have missing or duplicate events?

    Yes, either is possible. Since these requests are stateless, there is no way for our service to know what any given client received in an earlier, count-limited request. In other words, there is no way to serve bite-size (count-limited) chunks of a larger data selection from an ever-changing repository of event information and guarantee that the results match what a single en-masse request would have returned. In practice it usually works, because the time needed to fetch the count-limited pieces is small compared to the time between updates of the event information.
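
    The failure mode can be sketched with a tiny client-side simulation (hypothetical code, not the real service): the listing is newest-first, and a new event arriving between two offset-based page requests shifts the listing, so the second page re-serves an event the client already has.

```python
# Hypothetical simulation: "page" stands in for one count-limited request
# against a newest-first event listing.
def page(events, limit, offset):
    """Return one limit/offset page of a descending-time listing."""
    ordered = sorted(events, reverse=True)
    return ordered[offset:offset + limit]

catalog = [100, 200, 300, 400]             # event origin times on the server

first = page(catalog, limit=2, offset=0)   # [400, 300]
catalog.append(500)                        # new event inserted server-side
second = page(catalog, limit=2, offset=2)  # listing shifted down by one

assert second == [300, 200]                # event 300 is served twice
```

    An event deleted between requests would shift the listing the other way and produce a missing event instead of a duplicate.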

    > In other words, if I care very much about a consistent batch of events, should I stay away from limit and offset and manually batch the queries by time for example?

    Yes, this will eliminate the possibility of duplicate or missing events in the batch relative to our current data set.

    As you probably know, you can keep any such data set in sync with future updates to the DMC's data set by using the 'updatedafter' parameter to search only for events that have changed.
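
    For illustration, a minimal sketch of that sync loop (hypothetical code: the fetch function stands in for a real 'updatedafter'-style query, and the field names are invented): poll for events whose update stamp is newer than the last sync, then upsert them into a local catalog keyed by event id, so revisions overwrite and new events are added.

```python
# Hypothetical sketch: "fetch_updated_after" stands in for a real query
# with an updatedafter-style filter; the dict fields are invented.
def fetch_updated_after(server_events, cutoff):
    """Return events whose update stamp is strictly after cutoff."""
    return [e for e in server_events if e["updated"] > cutoff]

server = [
    {"id": "evA", "mag": 5.1, "updated": 10},
    {"id": "evB", "mag": 4.2, "updated": 12},
]

# Initial download, keyed by event id.
local = {e["id"]: e for e in fetch_updated_after(server, 0)}
last_sync = max(e["updated"] for e in local.values())

# The server later revises evA's magnitude and adds a new event evC.
server[0] = {"id": "evA", "mag": 5.3, "updated": 20}
server.append({"id": "evC", "mag": 3.9, "updated": 21})

# One sync pass: upsert, so revisions overwrite and new events are added.
for e in fetch_updated_after(server, last_sync):
    local[e["id"]] = e

assert local["evA"]["mag"] == 5.3 and "evC" in local
```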

    Chad



    • Philip Crotwell
      2013-03-14 22:12:17
      Cool, thanks. Yes, pattern is observed! :)

      My strategy will likely be to "batch" queries using limit and then use the
      time of the last returned event as the start time of the next query. Even
      that might stumble, but it seems less likely to fail than stepping by a
      raw count from the beginning.
      I will also probably do this stepping only on the first pass through the
      time window of interest, and then once I am in "waiting for new events"
      mode, will make use of the updatedafter.
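
      That last-event-time stepping can be sketched as follows (hypothetical code; the query function stands in for a real event-service request with a start time and count limit). Note the overlap: each batch after the first begins with the event that ended the previous one, so the client drops events it has already seen.

```python
# Hypothetical sketch: "query" stands in for an event-service request
# with a start time and a count limit, returning ascending time order.
def query(events, starttime, limit):
    return sorted(t for t in events if t >= starttime)[:limit]

catalog = [10, 20, 30, 40, 50]   # event origin times (stand-ins for ids)

collected, start = [], 0
while True:
    batch = query(catalog, starttime=start, limit=2)
    # Each batch after the first starts with the previous batch's last
    # event, so drop anything already collected.  Real events would need
    # ids for this check, since origin times can repeat.
    new = [t for t in batch if t not in collected]
    if not new:
        break
    collected.extend(new)
    start = batch[-1]            # last event's time seeds the next query

assert collected == [10, 20, 30, 40, 50]
```

      The stumble case is when more events share one origin time than the count limit allows: such a batch makes no progress, which is where per-event ids would be needed.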

      BTW, updatedafter is a really, really nice feature! Thanks for making that
      available.

      thanks,
      Philip


      _______________________________________________
      webservices mailing list
      webservices<at>iris.washington.edu
      http://www.iris.washington.edu/mailman/listinfo/webservices

