[webservices] ws_station network identifier

John D. West john.d.west at asu.edu
Wed Jun 15 10:31:26 PDT 2011


Hi, Philip.

Thanks for all of the info. I'm working on a set of rules for handling such
updates and would like your thoughts on them when I'm done. It seems clear
that there will always be exceptions, so I think EMERALD should include a
way to automatically disseminate corrections when needed.

Incidentally, I'm a big believer in numeric surrogate primary keys on
database tables and use them throughout EMERALD.
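
As a rough sketch of that idea (the table, column, and function names below
are hypothetical, not EMERALD's actual schema), each network row gets a
meaningless integer id, and incoming records are matched to existing rows by
the "mostly unique" rule Philip describes: network code alone for permanent
networks, code plus begin year for temporary ones. The test for "temporary"
assumes those codes start with X, Y, Z, or a digit.

```python
import sqlite3


def is_temporary(code):
    # Assumption: temporary network codes start with X, Y, Z, or a digit.
    return code[0] in "XYZ0123456789"


def match_key(code, start_year):
    # Permanent networks match on code alone; temporary ones on code + begin year.
    return f"{code}{start_year}" if is_temporary(code) else code


def open_cache(path=":memory:"):
    conn = sqlite3.connect(path)
    # "id" is the surrogate key: it is never derived from the metadata itself,
    # so every other column can be corrected without changing a row's identity.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS network (
               id INTEGER PRIMARY KEY,
               match_key TEXT NOT NULL UNIQUE,
               code TEXT NOT NULL,
               start_date TEXT NOT NULL,
               end_date TEXT,
               description TEXT)"""
    )
    return conn


def upsert_network(conn, code, start_date, end_date, description):
    """Insert a new network or update the existing one; return its surrogate id."""
    key = match_key(code, int(start_date[:4]))
    row = conn.execute(
        "SELECT id FROM network WHERE match_key = ?", (key,)
    ).fetchone()
    if row is None:
        cur = conn.execute(
            "INSERT INTO network (match_key, code, start_date, end_date, description)"
            " VALUES (?, ?, ?, ?, ?)",
            (key, code, start_date, end_date, description),
        )
        return cur.lastrowid
    # Same network upstream, changed dates or description: update in place.
    conn.execute(
        "UPDATE network SET start_date = ?, end_date = ?, description = ? WHERE id = ?",
        (start_date, end_date, description, row[0]),
    )
    return row[0]
```

With this, an extended end date on XY 2007 updates the existing row instead
of registering a new network, while XY 2010 still gets its own id. It would
not handle a case like the AF split, where two distinct networks share one
permanent code; that still needs a manual correction.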

Thanks!

     -- John


On Tue, Jun 14, 2011 at 6:08 PM, Philip Crotwell <crotwell at seis.sc.edu> wrote:

> Hi John
>
> I have had more than a few headaches along the lines of what you are
> describing. There is good news and bad news from my experiences. The
> good news is that mostly you can use the network code alone for
> permanent networks and network code and begin year for temporary
> networks, ie BK and XA2007 are mostly unique and fixed. The bad news
> is that even this is only "mostly" a unique identifier. In general I
> think the permanent network codes are single and unique, and temporary
> network codes are issued for a given begin year. While they may be
> extended, i.e. the end date changes, it would be really weird for the
> begin date to change.
>
> You should NOT use the begin date as part of the key for permanent
> networks as those have changed over the years. At some point in the
> past the begin time for permanent networks was dynamically determined
> from the earliest data at the DMC, not sure if that is still the case.
> So some networks were in the database with some data and then later
> they sent in additional "old" data, causing the begin times to move
> backwards. For example BK used to start in the 80s I think, but now
> starts in the 30s?
>
> More bad news is that the AF network (I think I am remembering
> correctly), a single permanent network, at some point split into two
> networks due to issues related to some data being restricted and some
> not. So my software started having real problems because it was coded
> to assume that the 2 char network code was unique for permanent
> networks and suddenly there were 2 distinct networks (at least at the
> software level) with the same code. I think there is work at the DMC
> to redo the notion of restricted data so that this bifurcation of that
> network will no longer be an issue in the future, but just pointing it
> out as an example of how limited the options are for creating a unique
> ID based on any data "in" a network. Basically all fields are
> subject to change, meaning nothing can be assured to be a unique id.
> Big :(
>
> I think this is the argument given way back when people were creating
> database normalization theories and arguing for meaningless integer
> database ids, because any ID based on real world data is subject to
> change and so cannot be counted on for a good id.
>
> One more piece of bad news: the same problems that exist at the
> network level also exist at the station and channel level, except that
> they are even more likely to change.
>
> I should also say that this is not a fault of the DMC, they don't
> control when or how networks make changes to their metadata. But it is
> a problem nonetheless, as we simply do not have a globally unique,
> non-changing identifier for any of our metadata. You do the best you
> can and try to put code in to catch when things change. I have had
> very limited success and grumble with regularity about how hard it is
> to keep a metadata database in sync with the upstream one. It is just
> a really really hard problem with no good solutions as far as I can
> see. If you come up with a good answer please, please let me know.
>
> Good luck...
> Philip
>
> On Tue, Jun 14, 2011 at 8:45 PM, Chad Trabant <chad at iris.washington.edu>
> wrote:
> >
> > Got it. The network start/end dates don't change often but on occasion
> > they do.  I think the most common case is when a temporary network code
> > is extended to match an extended experiment time window.  The only other
> > useful identifier that I can think of is the network description
> > contained in the <Description> tags, although that is subject to change
> > as well but also doesn't change often.  Perhaps by checking the
> > description you can figure out when it's the same network versus
> > something new more often than not.
> > Chad
> > On Jun 14, 2011, at 5:04 PM, John D. West wrote:
> >
> > That was what I assumed from the output of the web service. The question
> > is: can a start date or end date EVER change? If an incorrect date is
> > entered and then later corrected, I end up with overlapping networks
> > because network code + start date + end date combine to form the unique
> > identifier.
> >      -- John
> >
> >
> > On Tue, Jun 14, 2011 at 4:58 PM, Chad Trabant <chad at iris.washington.edu>
> > wrote:
> >>
> >> Hello.
> >>
> >> In general, networks, like stations and channels, have the notion of a
> >> start time and an end time.  For permanent networks there are normally
> >> no breaks in the continuity.  For temporary networks there are often
> >> blocks of years allocated for specific experiments, for example XY
> >> 2005-2006, XY 2007-2009 and XY 2010-2010.  We would not consider those
> >> temporary networks to be modifications of an existing network, but
> >> instead to be logically different networks.  Essentially the network
> >> code combined with the start and end time uniquely identifies a
> >> "network"; when the dates change and the network code is recycled, it
> >> should be considered a new network.  Not sure I understood your
> >> question; did that help at all?
> >>
> >> Chad
> >>
> >> On Jun 14, 2011, at 2:00 PM, John D. West wrote:
> >>
> >> > Hello.
> >> >
> >> > I'm using the station webservice in EMERALD to maintain a local cache
> >> > of network, station, and component metadata. At the Network level,
> >> > reuse of network codes makes it difficult to differentiate between new
> >> > and modified networks, e.g., if a network EndDate changes, my system
> >> > registers it as a new usage of the network code instead of a
> >> > modification of an existing network.
> >> >
> >> > Is there some unique identifier for each network which can be included
> >> > in the web service?
> >> >
> >> > Thanks!
> >> >
> >> >      -- John
> >> > _______________________________________________
> >> > webservices mailing list
> >> > webservices at iris.washington.edu
> >> > http://www.iris.washington.edu/mailman/listinfo/webservices
> >>
> >
> >
> >
> >
> >
>