Thread: 204/404 vs empty xml doc

Started: 2013-04-03 00:11:13
Last activity: 2013-04-03 14:43:06
Topics: Web Services
Philip Crotwell
2013-04-03 00:11:13
Hi

Following up on my inability to catch a 204, can you explain the
rationalization for using a 204 instead of returning an empty, but
structurally correct, quakeml document for a query that doesn't match
anything? For example, if I ask for a time window and magnitude range that
doesn't match any earthquake, send back this:

<q:quakeml xmlns:q="http://quakeml.org/xmlns/quakeml/1.2"
    xmlns="http://quakeml.org/xmlns/bed/1.2">
<eventParameters publicID="smi:service.iris.edu/fdsnws/event/1/query">
</eventParameters>
</q:quakeml>

To be honest, I would rather have an empty XML document returned in the
case where the query is well formed and valid, but it just so happens that
nothing in the database matched my query. HTTP error codes make it sound
like there was an error, but that is not really the case here. There wasn't
any data, and so an empty xml document is a fine thing to return.
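
As a minimal sketch of what a client would see with such an empty document, assuming Python's standard xml.etree.ElementTree and a hypothetical helper named count_events (neither is part of the service), the empty container parses cleanly and simply yields zero events:

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the example document above.
NS = {
    "q": "http://quakeml.org/xmlns/quakeml/1.2",
    "bed": "http://quakeml.org/xmlns/bed/1.2",
}

# The empty but structurally correct document proposed in the message.
EMPTY_QUAKEML = (
    '<q:quakeml xmlns:q="http://quakeml.org/xmlns/quakeml/1.2" '
    'xmlns="http://quakeml.org/xmlns/bed/1.2">'
    '<eventParameters publicID="smi:service.iris.edu/fdsnws/event/1/query">'
    '</eventParameters>'
    '</q:quakeml>'
)


def count_events(document: str) -> int:
    """Return the number of <event> elements under <eventParameters>."""
    root = ET.fromstring(document)
    return len(root.findall("bed:eventParameters/bed:event", NS))


print(count_events(EMPTY_QUAKEML))
```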

I had a quick read of the http spec, and it doesn't really sound to me like
a 204 is actually meant to mean, "sorry, no data", but rather is to be used
in cases where there are some communication efficiencies to be had by
avoiding an "entity body" and an update of the document view.

I guess I just don't see any advantage of 204 being the default for "no
data" for these xml web services, especially when it will almost certainly
cause confusion given the way browsers handle it, i.e. leaving the old page
content on the display.

$0.02
Philip

  • Chad Trabant
    2013-04-02 22:16:47

    Hi Philip,

    The rationalization goes like this: the FDSN services are all consistent, in that a properly formatted request that does not match any data will return a 204. It is not possible to return an empty container for all data types; for example, there is no empty miniSEED structure. The DMC's services also return simple text responses for which there is no obvious "empty" container. As for an empty response with nothing in the body (and a status code of 200), it is virtually identical to a 204; in fact, they are logically the same for our purposes.

    Clients should be checking the HTTP response codes for every request. Detecting a 204 allows the client to short-circuit the processing of a response and avoid sending it to the parser. In my opinion it is bad form to blindly send the response of a web service to a parser (XML or otherwise) without checking the HTTP response; otherwise you risk sending garbage (error messages, etc.) to the parser, and then there are security concerns. Feeding an effectively empty document to a parser so it can tell you that it is empty is unnecessary and can be avoided by checking for 204. There are some subtleties to this issue, such as non-obvious behavior in web browsers; more on that below.
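
The check described above might be sketched as follows. This is only an illustration under stated assumptions: parse_if_data is a hypothetical helper, and the parser argument stands in for whatever QuakeML, miniSEED or text parser the client actually uses.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def parse_if_data(status: int, body: bytes,
                  parser: Callable[[bytes], T]) -> Optional[T]:
    """Hand the body to the parser only when the status says there is data."""
    if status == 204:
        # Successful request, nothing matched: skip the parser entirely.
        return None
    if status != 200:
        # Anything else is treated as a failure; the body may be an error
        # message and should not reach the parser.
        raise RuntimeError(f"request failed with HTTP status {status}")
    return parser(body)
```

With this shape the caller distinguishes "no data" (None) from real results without the parser ever seeing an empty or garbage body.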

    HTTP status codes are for status in general and not just errors; the whole 2xx class of codes is distinctly not errors. As for our other services that return 404, yes, they look like errors, and yes, they can be confused with many other types of errors (bad URL, down for maintenance, etc.), which is why we are moving away from 404 as a default.

    There is some disagreement on when exactly to use 204s. The efficiencies are not what we were really targeting; they are minimal for what we are doing. The point is that sending an empty container is 1) not possible in all cases and 2) pointless when there is a code (that you should be checking) that is logically the same.

    Regarding browsers, I assume you mean manually-driven web browsers like Firefox, Chrome, etc. The vast, vast majority of data requests to our services do not come from web browsers. But there is a confusing interaction when you are using a browser with our services; in particular, we see this with our "URL Builders". So our services support a nodata parameter that can be set to 204 or 404. If you add "nodata=404" to your request, our FDSN services will return a 404 instead of a 204 in order to clearly show errors in a web browser.
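
A short sketch of adding the nodata parameter to a request URL follows. The base URL is an assumption inferred from the publicID in the example document, and build_query_url and the minmagnitude value are illustrative only.

```python
from urllib.parse import urlencode

# Assumed endpoint, inferred from the publicID in the QuakeML example.
BASE_URL = "http://service.iris.edu/fdsnws/event/1/query"


def build_query_url(params: dict, nodata: int = 204) -> str:
    """Build a query URL, selecting the status code used for 'no data'."""
    if nodata not in (204, 404):
        raise ValueError("nodata must be 204 or 404")
    query = dict(params)       # copy so the caller's dict is not modified
    query["nodata"] = nodata
    return BASE_URL + "?" + urlencode(query)


# A URL meant for pasting into a browser, where a 404 is easier to notice.
print(build_query_url({"minmagnitude": 9.5}, nodata=404))
```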

    Chad

    _______________________________________________
    webservices mailing list
    webservices<at>iris.washington.edu
    http://www.iris.washington.edu/mailman/listinfo/webservices


    • Philip Crotwell
      2013-04-03 06:32:14
      OK, I'll buy the error code vs empty xml. Thanks for the explanation. I had
      not thought of the empty miniseed or text file case.

      I will obviously check http result codes in my code; it is more the case of
      "hum, why didn't that work, I'll paste the url into my browser and..."
      where it gets weird. This case, although "no content" seems reasonable,
      ends up being unsettling to the user acting via a browser since the page
      doesn't change content (as mandated by the spec).

      Looking into things, I wonder if the 205 might be better than the 204. Both
      seem to mean "no content" but the 205 says "reset the document view". The
      distinction doesn't matter much in the case of client codes, but would be
      much more informative in the browser case. I understand that is a small
      minority of requests, but it is one that is used. I have not tested, so I am
      not sure if 205 would be any better.

      Bottom line I will probably just package nodata=404 on all my requests, if
      only to avoid confusing myself repeatedly.

      thanks,
      Philip






      • Chad Trabant
        2013-04-03 06:44:49

        Hi Philip,

        There are downsides to using a 404 to indicate "successful request, but no data", namely that a 404 is used for many legitimate error conditions. For example, if you mistype anything in the URL you'll get a 404; if we goof something on our end, such as a network path configuration error, you'll get a 404. A 404 can be generated by load balancers and other things that proxy connections, particularly during error conditions. In the end the client cannot tell the difference between an actual error and a "successful request, but no data".
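
The ambiguity can be made concrete with a small sketch. interpret_status is a hypothetical client-side helper, and the returned labels are illustrative, not anything the services define.

```python
def interpret_status(status: int, nodata: int = 204) -> str:
    """Classify a response status given the nodata code that was requested."""
    if status == 200:
        return "data"
    if status == 204:
        # 204 has a single meaning here: the request succeeded, nothing matched.
        return "no data"
    if status == 404 and nodata == 404:
        # Could be "no data", a mistyped URL, or a proxy or server problem;
        # the status code alone cannot distinguish these cases.
        return "no data or error (ambiguous)"
    return "error"
```

Under the default of 204 a 404 is always an error, while under nodata=404 the same code has to cover both outcomes.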

        The only real issue with a 204 is its use with a browser, which does not reset the view. A 205 has no mention of 'no content' that I read, so it doesn't seem like a good match for "successful request, but no data". This is an academic discussion though; the FDSN specification is cooked at this point. It can change in a future version of the spec, of course, but until it changes we will be sticking to a default of 204.

        Chad


        • Philip Crotwell
          2013-04-03 14:43:06
          Understood about overuse of 404. My prediction, though, is that some users
          (not to mention names, but me) will hit a troublesome URL, try pasting it
          into a browser, have the 204 non-effect, and complain to you. You will
          patiently explain the 204 logic and all will be happy. Then 6 months later,
          said user (not to mention names, but me) will hit another troublesome URL
          and, having forgotten the whole 204 conversation, will paste it into the
          browser, have the 204 non-effect and come back complaining to you. Rinse,
          lather, repeat. :)

          But, given the fdsn spec is basted, broiled and deep fat fried, if you are
          ok with hopelessly confused users whining on the email list, I will do my
          best to remember the 204.

          Philip

