[webservices] some comments on new StationXML
Philip Crotwell
crotwell at seis.sc.edu
Thu Oct 27 09:20:15 PDT 2011
Hi,
Time has gotten away from me, but I will try to read the new schema
more carefully soon and send in any further comments I have. I do have
three so far, though.
First is, please consider some sort of versioning in the schema. I
believe I posted a comment some time back about this, but for an xml
schema to be useful, you need to be able to pair an instance XML
document with the schema that it uses. My understanding is that this
is usually accomplished by adding a version of some kind to the
namespace url. Currently the namespace for stationxml is:
http://www.data.scec.org/xml/station/
and as far as I can tell has always been that in spite of several
revisions. This means that if I have two stationXML instances, one
from yesterday and one from the day after you release this new schema,
there will be no way for an application to decide which schema to use
to validate the document. You might want to change the namespace to be
something like:
http://www.data.scec.org/xml/station/2.0
or
http://www.data.scec.org/xml/station/2011
and then store the appropriate schema file within that directory on
the web server for easy access, i.e. at
http://www.data.scec.org/xml/station/2011/station.xsd
Along with that, old versions of the schema should be kept online so
that old xml instances can still be validated against their version of
the schema.
There are probably other ways of versioning, so perhaps look around,
but please, please use some type of version. There needs to be
something in both the stationxml schema and in a stationxml instance
document that gives the appropriate version of the schema for a
client to validate and parse against.
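To make the versioning point concrete, here is a minimal sketch of how a
client might pick a schema from the namespace of an instance document. It
assumes the hypothetical versioned namespace suggested above
(http://www.data.scec.org/xml/station/2011), which is not a published
namespace. The root element name StaMessage and the function names are
also assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: versioned namespace -> schema location.
# The entry mirrors the example URLs suggested in this message;
# it is not a namespace that has actually been published.
SCHEMA_BY_NAMESPACE = {
    "http://www.data.scec.org/xml/station/2011":
        "http://www.data.scec.org/xml/station/2011/station.xsd",
}


def root_namespace(xml_text):
    """Return the namespace URI of the root element, or None if it has none."""
    root = ET.fromstring(xml_text)
    # ElementTree spells qualified names as "{namespace}localname".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return None


def schema_for(xml_text):
    """Pick the schema URL to validate against, based on the root namespace."""
    ns = root_namespace(xml_text)
    if ns not in SCHEMA_BY_NAMESPACE:
        raise ValueError("no known schema for namespace %r" % (ns,))
    return SCHEMA_BY_NAMESPACE[ns]


# Root element name is assumed for the example.
doc = '<StaMessage xmlns="http://www.data.scec.org/xml/station/2011"/>'
print(schema_for(doc))
```

With the current unversioned namespace, every revision of the schema maps
to the same key, so a lookup like this cannot tell old instances from new
ones.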
Second, and this is just my opinion since I have not looked closely at
how you are using <any> elements in the schema, I would be very careful
about this. While the notion of an "any" is powerful and seems to
allow great flexibility, it has a real downside: the contents of the
"any" are no longer stationxml, and hence are harder to parse and
extract without additional information. It appears you are using the
any in a very limited manner, so you have probably already considered
this, but I just wanted to sound a warning.
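As a rough illustration of the parsing cost, the sketch below separates
the children a stationxml-aware client understands from whatever arrived
through an "any". The extension namespace (http://example.org/ext), the
Lat and Note element names, and the function name are assumptions made
up for the example.

```python
import xml.etree.ElementTree as ET

# Current stationxml namespace, as quoted in this message.
STATION_NS = "http://www.data.scec.org/xml/station/"


def split_children(xml_text):
    """Return (stationxml_tags, foreign_tags) for the root's direct children."""
    known, foreign = [], []
    prefix = "{" + STATION_NS + "}"
    for child in ET.fromstring(xml_text):
        if child.tag.startswith(prefix):
            known.append(child.tag)
        else:
            # Content admitted by an "any": the schema says nothing about
            # its structure, so the client needs outside knowledge to use it.
            foreign.append(child.tag)
    return known, foreign


# Hypothetical instance fragment with one extension element.
doc = (
    '<Station xmlns="http://www.data.scec.org/xml/station/" '
    'xmlns:ext="http://example.org/ext">'
    "<Lat>34.0</Lat><ext:Note>site visit pending</ext:Note></Station>"
)
known, foreign = split_children(doc)
print(known)
print(foreign)
```

Everything that ends up in the second list has to be handled by code
written for that particular extension, or skipped.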
My third comment is probably not likely to happen, but I will put it
out there anyway: consider using RELAX NG instead of XML Schema. RELAX NG
is so much nicer to read and has features that XML Schema lacks that are
really useful. For example, RELAX NG has the notion of "interleave"
elements, so you can specify that the contents of a Network element,
for example, have to contain a <startDate>, an <endDate> and a
<description>, but that the order does not matter. This is a more
natural way to think of the concept of a "network" than an XML Schema
<sequence>, where order is required for validity. Of course
order does matter sometimes, but for the bulk of data-centric xml,
the order requirement is simply irrelevant, and yet the schema
requires it to be matched for validation. This has come up recently
because the IRIS ws/station web service generates xml with
<SelectedNumberStations> coming after the <Station> elements. I can
deal with the elements being out of order, but it causes validation
errors that prevent me from validating the output of the station web
service to check for other, more serious validation issues. As I said,
I realize switching from XML Schema to RELAX NG would be a big change,
but thought I might as well toss the idea into the ring. I would be
willing to help with this should you choose to make the translation.
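The difference between the two content models can be sketched in a few
lines of Python. This is not RELAX NG or XML Schema validation, only a
stand-in that compares child element names. The function names and the
sample Network fragment are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The three children of the Network example, in "declared" order.
REQUIRED = ["startDate", "endDate", "description"]


def child_names(xml_text):
    """Return the tag names of the root's direct children, in document order."""
    return [child.tag for child in ET.fromstring(xml_text)]


def valid_as_sequence(xml_text):
    """Sequence-style check: children must appear in the declared order."""
    return child_names(xml_text) == REQUIRED


def valid_as_interleave(xml_text):
    """Interleave-style check: each child appears once, in any order."""
    return sorted(child_names(xml_text)) == sorted(REQUIRED)


# Same content as an in-order document, but with description first.
reordered = (
    "<Network>"
    "<description>example network</description>"
    "<startDate>2011-01-01</startDate>"
    "<endDate>2011-12-31</endDate>"
    "</Network>"
)
print(valid_as_sequence(reordered))    # False
print(valid_as_interleave(reordered))  # True
```

The reordered document carries exactly the same information, yet only
the interleave-style check accepts it.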
More info on relaxNG here:
relaxng.org
and
http://books.xmlschemata.org/relaxng/
thanks,
Philip
On Fri, Oct 21, 2011 at 11:55 AM, Ellen Yu <eyu at gps.caltech.edu> wrote:
> Phillip,
>
> We are hoping that we will only need to make minor revisions. We definitely
> would like any feedback as we realize there may be unforeseen issues that
> only will be shook out as people start to use the format.
>
> We are going to release a new version that has <any> elements to allow
> people to add additional information not included in StationXML. You can
> find it at
> http://www.data.scec.org/xml/station/20111019/station.xsd
>
>
> Regards,
>
> Ellen
>