Thread: caching in dataselect and tracedsp

Started: 2012-09-25 18:17:55
Last activity: 2012-09-25 19:05:55
Topics: Web Services
John West
2012-09-25 18:17:55
Hello.

I am working on code that retrieves corrected traces from the DMC using the
dataselect and tracedsp web services. I notice that a second retrieval of
the same data is much, much faster, presumably because the data is being
cached by the DMC web services. I am retrieving broadband 40Hz data in
60-hour increments.

Do these web services cache more data than I ask for? Can I speed up my
requests by either sequentially asking for data for the same channel (i.e.,
ask for 60 hours of BHZ for a station, then ask for the next 60 hours,
etc.) or by asking for multiple traces for the same time period (i.e., 60
hours of BHZ, then 60 hours of BHE & BHN for the same station & time span)?

Thanks!

-- John

  • Bruce Weertman
    2012-09-25 18:41:08
    John:

    Good to hear from you.

    A few points about caching:

    * Yes, ws-dataselect and ws-tracedsp do cache the data.

    * The cache is very specific to the request that you make.

    If you asked for some time period from some channel and then asked for exactly the same
    time period from the same channel, you would hit the cache and it should return much faster.
    If the second request's time range differed from the first's by even a fraction of a second, you would not
    hit the cache, and there would then be two objects in the cache. This is because the 'tokens' for the cached objects
    are generated from hashes of the request parameters. Changing a request in just the slightest
    way will generate a completely different hash.

    * ws-bulkdataselect does not cache data.

    * The underlying NFS filesystem, which holds the archive and everything else we do here at the DMC,
    does some caching of its own. Requesting datasets that are close to each other can therefore result in
    faster responses. Performance may vary depending on many different factors, including
    system load and how close to each other subsequent queries are.


    Hope that helps.

    Cheers,
    -Bruce
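
The exact-match behaviour described above can be illustrated with a small sketch. This is not the DMC's implementation: the hashing scheme (SHA-256 over sorted parameters), the parameter names (net, sta, loc, cha, start, end), the example station values, and the helper names cache_key and fixed_windows are all assumptions made for illustration. It shows why a request that differs by half a second maps to a different cache entry, and one way a client might generate its 60-hour windows from a fixed origin so that repeated runs issue identical requests.

```python
import hashlib
from datetime import datetime, timedelta
from urllib.parse import urlencode


def cache_key(params):
    """Hypothetical cache token: SHA-256 over the sorted request parameters."""
    canonical = urlencode(sorted(params.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def fixed_windows(first_start, hours, count):
    """Return (start, end) ISO strings for consecutive windows from a fixed origin."""
    step = timedelta(hours=hours)
    windows = []
    for i in range(count):
        start = first_start + i * step
        windows.append((start.isoformat(), (start + step).isoformat()))
    return windows


# Example values only; any network/station/channel would do.
base = {"net": "IU", "sta": "ANMO", "loc": "00", "cha": "BHZ"}
start, end = fixed_windows(datetime(2012, 1, 1), 60, 2)[0]

same_a = cache_key({**base, "start": start, "end": end})
same_b = cache_key({**base, "start": start, "end": end})
# Same request, but the start time is shifted by half a second.
shifted = cache_key({**base, "start": "2012-01-01T00:00:00.5", "end": end})

print("identical requests share a key:", same_a == same_b)
print("shifted request shares a key:", same_a == shifted)
```

Under these assumptions, deriving every window from the same origin and step keeps the request parameters byte-for-byte identical between runs, which is the condition for a cache hit described in the reply above.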



    _______________________________________________
    webservices mailing list
    webservices<at>iris.washington.edu
    http://www.iris.washington.edu/mailman/listinfo/webservices



    • John West
      2012-09-25 19:05:55
      That's very helpful and just what I needed to know. Thanks, Bruce!

      -- John




