Thread: byte order and swapping

Started: 2005-11-20 00:20:48
Last activity: 2005-11-21 18:51:17
Topics: SAC Developers
Brian Savage
2005-11-20 00:20:48

I have done a bit of thinking about the byte-order situation and the problem
with using write header. I can confirm that using read header and write
header on a sac file in the opposite byte order does not work and
basically destroys the sac file. My thought for fixing this bug is as
follows:

On read:
    Add a variable which tracks if the file needed to be swapped on read or
    read header.

On write header:
    if file not swapped on read:
        write header
    if file was swapped on read:
        swap header and write

On write:
    if file not swapped on read:
        write header and data
    if file was swapped on read:
        swap header and data then write

The above tries to keep the file in the same byte order it was read in.
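
A minimal Python sketch of the scheme above, not the actual SAC code. The four-word integer header, the `Trace` class, and the function names are hypothetical, and the file's byte order is assumed to have been detected already and passed in as `file_order`.

```python
import struct
import sys

# Struct prefixes for this machine's byte order and the opposite one.
NATIVE = "<" if sys.byteorder == "little" else ">"
OTHER = ">" if NATIVE == "<" else "<"

HEADER_WORDS = 4  # toy header size (hypothetical; a real SAC header is larger)


class Trace:
    """Header and data held in native order, plus the proposed swap flag."""

    def __init__(self, header, data, swapped):
        self.header = header    # list of ints
        self.data = data        # list of floats
        self.swapped = swapped  # True if the file was byte-swapped on read


def read(raw, file_order):
    """Decode raw bytes whose byte order ('<' or '>') is already known."""
    header = list(struct.unpack_from(f"{file_order}{HEADER_WORDS}i", raw, 0))
    n_data = (len(raw) - 4 * HEADER_WORDS) // 4
    data = list(struct.unpack_from(f"{file_order}{n_data}f", raw, 4 * HEADER_WORDS))
    return Trace(header, data, swapped=(file_order != NATIVE))


def _order_for(trace):
    # Write back in the order the file was read in.
    return OTHER if trace.swapped else NATIVE


def write_header(trace):
    """Return header bytes, swapped back if the file was swapped on read."""
    return struct.pack(f"{_order_for(trace)}{HEADER_WORDS}i", *trace.header)


def write(trace):
    """Return header plus data bytes in the file's original byte order."""
    data = struct.pack(f"{_order_for(trace)}{len(trace.data)}f", *trace.data)
    return write_header(trace) + data
```

With the flag carried on the trace, a read header / write header sequence leaves the data untouched and rewrites the header in the order the data is already in, so header and data cannot end up in different byte orders.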

Any thoughts?

This of course could be extended to include a preferred byte order per
user (LittleEndian/BigEndian or Linux/Sun/Mac/Automatic). However, a read
header/write header sequence on a swapped byte order file would then most
likely require the data to be read in, swapped, and written out as well.

Cheers,
Brian

  • George Helffrich
    2005-11-20 23:16:11
    The byte-swapping/header rewriting issue highlights a problem that SAC
    still has, to wit:
    there is no specification as to what the byte order in SAC files is.

    By accident, there is a reliable way to tell the endianness of a SAC
    data file due to the known format of the data file header. However,
    there is no analogous way to tell what the endianness of a SAC SGF file
    is reliably. So the present situation provides a workable, yet
    complicated solution for SAC data, but none for SGF.
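
One way such a header-based check might look, as a hedged Python sketch. The thread does not say which field is used; the sketch assumes the integer header-version field (NVHDR) at byte offset 304 of a 632-byte header, holding a small positive value such as 6. The offset, the accepted range, and the function name are assumptions for illustration.

```python
import struct

NVHDR_OFFSET = 304                 # assumed byte offset of the header version
PLAUSIBLE_VERSIONS = range(1, 8)   # assumed range of valid version numbers


def sac_byte_order(header_bytes):
    """Return '<' or '>' for the order in which the version field reads
    as a plausible value, or None if neither order does."""
    for order in ("<", ">"):
        (version,) = struct.unpack_from(order + "i", header_bytes, NVHDR_OFFSET)
        if version in PLAUSIBLE_VERSIONS:
            return order
    return None
```

The check works only because a small integer read in the wrong order becomes a very large one (6 becomes 100663296), which is why it depends on a field with a known, narrow range.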

    If we settle this endianness issue as a community, then the issues
    with byte-swapped headers vs. data disappear. The only reason they
    exist is that we have never answered the endianness question.

    So why not specify it?

    Based on the historical fact that SAC was developed on Ridge and Sun
    machines -- both big-endian -- there is an argument for files being
    big-endian. However, both Macs and PCs are converging towards
    little-endian order, so a more practical choice might be little.

    Does anybody object to or disagree with the need to specify a byte
    order?


    George Helffrich


    • Arthur Snoke
      2005-11-21 18:01:41
      George,

      Just picking up on a detail in your note:

      By accident, there is a reliable way to tell the endianness of a SAC
      data file due to the known format of the data file header. However,
      there is no analogous way to tell what the endianness of a SAC SGF file
      is reliably. So the present situation provides a workable, yet
      complicated solution for SAC data, but none for SGF.

      As you probably recall, I have worked quite a bit with SGF format, having
      decided 10+ years ago to use that for my graphics format for just about
      everything. I don't agree with your statement that the byte order cannot
      be discerned easily for SGF files. The first "word" in an SGF file is a
      4-byte integer which (using od -t d2 on the Sun) reads 00000 00005 for a
      big-endian file and 01280 00000 for a file written on PC/Linux. Am I
      missing something?
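
The od values quoted above can be reproduced, and turned into a simple test, with a short Python sketch. It assumes that the first 4-byte word of an SGF file is always a small positive integer (5 in the example); the bound `MAX_PLAUSIBLE` and both function names are hypothetical.

```python
import struct

MAX_PLAUSIBLE = 0xFFFF  # assumed upper bound on the first word's value


def od_d2_big_endian(first4):
    """The two signed 16-bit values a big-endian machine's od would print."""
    return struct.unpack(">2h", first4)


def guess_sgf_order(first4):
    """Return '>' or '<' if exactly one reading of the first 4-byte word
    is small and positive, otherwise None."""
    plausible = [
        order for order in (">", "<")
        if 0 < struct.unpack(order + "i", first4)[0] <= MAX_PLAUSIBLE
    ]
    return plausible[0] if len(plausible) == 1 else None
```

For the value 5, `od_d2_big_endian` gives (0, 5) for big-endian bytes and (1280, 0) for little-endian bytes, matching the two od outputs in the message.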

      I have planned to write an SGF translator, but it is always next on my to-do
      list. Part of the problem is that it is a pain because everything is I*2
      except the first word in each command which is I*4. I know how to do it,
      but it is a bit tedious. I have written a set of subroutines which can
      write .sgf files without using any SAC library routines for both endians
      and my sgf2ps works on both platforms (and is a part of the SAC
      distribution). From the notes on your variant on SAC, you have many
      additions which I would like to see worked into the formal SAC
      distribution, and would be happy to work with you on that.

      This is a sideline on your main point: how to deal with endians in sac
      data files. You say that it is trivial to tell the endian of a sac data
      file. But part of the problem which Chuck brought up at our meeting at
      the IRIS workshop is that one could have a header with one endian but the
      data in the other by using READHDR/WRITEHDR on the platform with the
      opposite endian from which the file was created. Again, am I missing
      something?

      Arthur

    • Brian Savage
      2005-11-21 18:51:17
      There appear to be two camps concerning this. Either way, we need to
      make a decision about byte order and create the read and write routines
      to handle them.

      At present we still have a problem with the readhdr/writehdr routines
      and this needs to be fixed.


      My own personal opinion is this. When sac was developed it used the
      native format for Sun (big endian) and this was fine. Livermore had not
      decided whether big endian was the standard format, and thus the move to
      Linux (little-endian format) allowed sac files to be in another byte
      order. Allowing two different formats is the easiest and least
      intrusive thing to do. Not to be rude, but I think Livermore has
      inadvertently made our bed for us.

      Swapping massive amounts of data or rewriting software to conform to a
      standard introduced part way into the game is something none of us would
      like to do. Do not force others to do it.

      Brian


