Hi,
I found a problem with sacswap. Although nowadays with sac's ability
to read both big- and little-endian sac files, sacswap must be less
often used, some people I know are still using it since some of the
legacy software such as pssac [3] can only deal with big-endian sac
files.
PROBLEM:
I used sacswap on my linux system to convert a random sac file vel.sac
(little-endian) to vel.sac.swap (big-endian), and then from
vel.sac.swap (big-endian) to vel.sac.swap.swap (little-endian). By
doing a binary file comparison, it turns out that files vel.sac and
vel.sac.swap.swap are not the same. I read files vel.sac and
vel.sac.swap into matlab and found that 8 or so data points from them
were not the same, i.e. the differences were first introduced in the
first sacswap conversion. I've confirmed that all the file headers
from the two files are the same, and NPTS of vel.sac is more than
10,000, so the error rate is pretty small.
The sac file I used is here, in case you want to try it:
http://maxwell.phys.uconn.edu/~icrazy/sac/vel.sac
To narrow the problem down, I extracted one of the numbers that was
changed after doing sacswap conversions and made a small program to
repeat the problem. The program is here:
http://maxwell.phys.uconn.edu/~icrazy/sac/endian1.c
I found that the culprit is this function float_swap() below, which
according to [1] is not a good way to implement the byte swapping
feature.
float float_swap(char cbuf[]) {
union {
char cval[4];
float fval;
} f_union;
f_union.cval[3] = cbuf[0];
f_union.cval[2] = cbuf[1];
f_union.cval[1] = cbuf[2];
f_union.cval[0] = cbuf[3];
return(f_union.fval);
}
This program endian1.c takes four bytes (0xff, 0xae, 0x47, and 0xc5)
as input and then swap the bytes (1st with 4th, 2nd with 3rd). The
correct output should be:
ff ae 47 c5 --> c5 47 ae ff
However, in my system (Pentium 4 CPU + GCC 4.2.3), using different
optimization parameters with gcc, I got different results:
$ gcc endian1.c -o endian1 && ./endian1
ff ae 47 c5 --> c5 47 ee ff
(wrong, "ae" was mysteriously changed to "ee")
$ gcc -O2 endian1.c -o endian1 && ./endian1
ff ae 47 c5 --> c5 47 ee ff
(wrong, "ae" was mysteriously changed to "ee")
$ gcc -O3 endian1.c -o endian1 && ./endian1
ff ae 47 c5 --> c5 47 ae ff
(correct)
I also found that this behavior is system dependent. I think it might
only be dependent on the gcc version, but I'm listing the CPU as well.
Systems which only give correct byte swapping results when using gcc -O3:
- Intel(R) Pentium(R) 4 CPU 2.40GHz
gcc (GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7)
- AMD Athlon(tm) MP 2800+ (2133MHz)
gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-52)
Systems which give correct byte swapping results not matter what:
- Intel(R) Xeon(R) CPU E5320 @ 1.86GHz
gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
- Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
FIX:
Ref [2] (page 9) gives some example byte swapping macros, which in my
opinion are equivalent to the byteswap() function in sac's source code
src/ucf/byteswap.c . I took this function and adapted my program
endian1.c to endian_byteswap.c below:
http://maxwell.phys.uconn.edu/~icrazy/sac/endian_byteswap.c
and everything works!
I also did a complete update of sacswap:
http://maxwell.phys.uconn.edu/~icrazy/sac/sacswap.c
using byteswap(), while trying to keep as much original code intact as possible.
P.S. I don't know about the attachment policy of this mailing list. If
needed (maybe for archiving purposes), I can attach all the program
files mentioned here in a reply post.
[1] http://www.power.org/devcon/07/Session_Downloads/PADC07_McQuaid_PA_Wrobel_Heinz_20070910.pdf
[2] Endianness White Paper
http://www.intel.com/design/intarch/papers/endian.pdf
[3] pssac
http://www.eas.slu.edu/People/LZhu/downloads/pssac.tar
Best regards,
--
Kuang He
Department of Physics
University of Connecticut
Storrs, CT 06269-3046
Tel: +1.860.486.4919
Web: http://www.phys.uconn.edu/~he/
I found a problem with sacswap. Although nowadays with sac's ability
to read both big- and little-endian sac files, sacswap must be less
often used, some people I know are still using it since some of the
legacy software such as pssac [3] can only deal with big-endian sac
files.
PROBLEM:
I used sacswap on my linux system to convert a random sac file vel.sac
(little-endian) to vel.sac.swap (big-endian), and then from
vel.sac.swap (big-endian) to vel.sac.swap.swap (little-endian). By
doing a binary file comparison, it turns out that files vel.sac and
vel.sac.swap.swap are not the same. I read files vel.sac and
vel.sac.swap into matlab and found that 8 or so data points from them
were not the same, i.e. the differences were first introduced in the
first sacswap conversion. I've confirmed that all the file headers
from the two files are the same, and NPTS of vel.sac is more than
10,000, so the error rate is pretty small.
The sac file I used is here, in case you want to try it:
http://maxwell.phys.uconn.edu/~icrazy/sac/vel.sac
To narrow the problem down, I extracted one of the numbers that was
changed after doing sacswap conversions and made a small program to
repeat the problem. The program is here:
http://maxwell.phys.uconn.edu/~icrazy/sac/endian1.c
I found that the culprit is this function float_swap() below, which
according to [1] is not a good way to implement the byte swapping
feature.
float float_swap(char cbuf[]) {
union {
char cval[4];
float fval;
} f_union;
f_union.cval[3] = cbuf[0];
f_union.cval[2] = cbuf[1];
f_union.cval[1] = cbuf[2];
f_union.cval[0] = cbuf[3];
return(f_union.fval);
}
This program endian1.c takes four bytes (0xff, 0xae, 0x47, and 0xc5)
as input and then swap the bytes (1st with 4th, 2nd with 3rd). The
correct output should be:
ff ae 47 c5 --> c5 47 ae ff
However, in my system (Pentium 4 CPU + GCC 4.2.3), using different
optimization parameters with gcc, I got different results:
$ gcc endian1.c -o endian1 && ./endian1
ff ae 47 c5 --> c5 47 ee ff
(wrong, "ae" was mysteriously changed to "ee")
$ gcc -O2 endian1.c -o endian1 && ./endian1
ff ae 47 c5 --> c5 47 ee ff
(wrong, "ae" was mysteriously changed to "ee")
$ gcc -O3 endian1.c -o endian1 && ./endian1
ff ae 47 c5 --> c5 47 ae ff
(correct)
I also found that this behavior is system dependent. I think it might
only be dependent on the gcc version, but I'm listing the CPU as well.
Systems which only give correct byte swapping results when using gcc -O3:
- Intel(R) Pentium(R) 4 CPU 2.40GHz
gcc (GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7)
- AMD Athlon(tm) MP 2800+ (2133MHz)
gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-52)
Systems which give correct byte swapping results not matter what:
- Intel(R) Xeon(R) CPU E5320 @ 1.86GHz
gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
- Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
FIX:
Ref [2] (page 9) gives some example byte swapping macros, which in my
opinion are equivalent to the byteswap() function in sac's source code
src/ucf/byteswap.c . I took this function and adapted my program
endian1.c to endian_byteswap.c below:
http://maxwell.phys.uconn.edu/~icrazy/sac/endian_byteswap.c
and everything works!
I also did a complete update of sacswap:
http://maxwell.phys.uconn.edu/~icrazy/sac/sacswap.c
using byteswap(), while trying to keep as much original code intact as possible.
P.S. I don't know about the attachment policy of this mailing list. If
needed (maybe for archiving purposes), I can attach all the program
files mentioned here in a reply post.
[1] http://www.power.org/devcon/07/Session_Downloads/PADC07_McQuaid_PA_Wrobel_Heinz_20070910.pdf
[2] Endianness White Paper
http://www.intel.com/design/intarch/papers/endian.pdf
[3] pssac
http://www.eas.slu.edu/People/LZhu/downloads/pssac.tar
Best regards,
--
Kuang He
Department of Physics
University of Connecticut
Storrs, CT 06269-3046
Tel: +1.860.486.4919
Web: http://www.phys.uconn.edu/~he/
-
Dear Huang -
Good work tracking this down. What it shows is that you are running
into compiler problems! Even though your reference [1] says that using
unions to swap is bad, there is actually nothing wrong with it. It
might be inefficient on hardware, and, though [1] does not say why,
there are two possible problems that could arise: 1) loss of
efficiency in swapping due to sign extension; 2) alignment problems in
the union. I suspect that the problem you are having is due to botched
optimization and/or sign extension problems.
Can you make the following test? Change your routine like so, and run
your tests:
float float_swap(unsigned char cbuf[]) {
union {
unsigned char cval[4];
float fval;
} f_union;
f_union.cval[3] = cbuf[0];
f_union.cval[2] = cbuf[1];
f_union.cval[1] = cbuf[2];
f_union.cval[0] = cbuf[3];
return(f_union.fval);
}
That rules out sign extension as a contributor to the problems.
On 31 Aug 2008, at 02:33, Kuang He wrote:
Hi,
George Helffrich
I found a problem with sacswap. Although nowadays with sac's ability
to read both big- and little-endian sac files, sacswap must be less
often used, some people I know are still using it since some of the
legacy software such as pssac [3] can only deal with big-endian sac
files.
PROBLEM:
I used sacswap on my linux system to convert a random sac file vel.sac
(little-endian) to vel.sac.swap (big-endian), and then from
vel.sac.swap (big-endian) to vel.sac.swap.swap (little-endian). By
doing a binary file comparison, it turns out that files vel.sac and
vel.sac.swap.swap are not the same. I read files vel.sac and
vel.sac.swap into matlab and found that 8 or so data points from them
were not the same, i.e. the differences were first introduced in the
first sacswap conversion. I've confirmed that all the file headers
from the two files are the same, and NPTS of vel.sac is more than
10,000, so the error rate is pretty small.
The sac file I used is here, in case you want to try it:
http://maxwell.phys.uconn.edu/~icrazy/sac/vel.sac
To narrow the problem down, I extracted one of the numbers that was
changed after doing sacswap conversions and made a small program to
repeat the problem. The program is here:
http://maxwell.phys.uconn.edu/~icrazy/sac/endian1.c
I found that the culprit is this function float_swap() below, which
according to [1] is not a good way to implement the byte swapping
feature.
float float_swap(char cbuf[]) {
union {
char cval[4];
float fval;
} f_union;
f_union.cval[3] = cbuf[0];
f_union.cval[2] = cbuf[1];
f_union.cval[1] = cbuf[2];
f_union.cval[0] = cbuf[3];
return(f_union.fval);
}
This program endian1.c takes four bytes (0xff, 0xae, 0x47, and 0xc5)
as input and then swap the bytes (1st with 4th, 2nd with 3rd). The
correct output should be:
ff ae 47 c5 --> c5 47 ae ff
However, in my system (Pentium 4 CPU + GCC 4.2.3), using different
optimization parameters with gcc, I got different results:
$ gcc endian1.c -o endian1 && ./endian1
ff ae 47 c5 --> c5 47 ee ff
(wrong, "ae" was mysteriously changed to "ee")
$ gcc -O2 endian1.c -o endian1 && ./endian1
ff ae 47 c5 --> c5 47 ee ff
(wrong, "ae" was mysteriously changed to "ee")
$ gcc -O3 endian1.c -o endian1 && ./endian1
ff ae 47 c5 --> c5 47 ae ff
(correct)
I also found that this behavior is system dependent. I think it might
only be dependent on the gcc version, but I'm listing the CPU as well.
Systems which only give correct byte swapping results when using gcc
-O3:
- Intel(R) Pentium(R) 4 CPU 2.40GHz
gcc (GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7)
- AMD Athlon(tm) MP 2800+ (2133MHz)
gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-52)
Systems which give correct byte swapping results not matter what:
- Intel(R) Xeon(R) CPU E5320 @ 1.86GHz
gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
- Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
FIX:
Ref [2] (page 9) gives some example byte swapping macros, which in my
opinion are equivalent to the byteswap() function in sac's source code
src/ucf/byteswap.c . I took this function and adapted my program
endian1.c to endian_byteswap.c below:
http://maxwell.phys.uconn.edu/~icrazy/sac/endian_byteswap.c
and everything works!
I also did a complete update of sacswap:
http://maxwell.phys.uconn.edu/~icrazy/sac/sacswap.c
using byteswap(), while trying to keep as much original code intact as
possible.
P.S. I don't know about the attachment policy of this mailing list. If
needed (maybe for archiving purposes), I can attach all the program
files mentioned here in a reply post.
[1]
http://www.power.org/devcon/07/Session_Downloads/
PADC07_McQuaid_PA_Wrobel_Heinz_20070910.pdf
[2] Endianness White Paper
http://www.intel.com/design/intarch/papers/endian.pdf
[3] pssac
http://www.eas.slu.edu/People/LZhu/downloads/pssac.tar
Best regards,
--
Kuang He
Department of Physics
University of Connecticut
Storrs, CT 06269-3046
Tel: +1.860.486.4919
Web: http://www.phys.uconn.edu/~he/
_______________________________________________
sac-dev mailing list
sac-dev<at>iris.washington.edu
http://www.iris.washington.edu/mailman/listinfo/sac-dev
george<at>geology.bristol.ac.uk
-
On Sun, Aug 31, 2008 at 8:06 AM, George Helffrich
<george<at>geology.bristol.ac.uk> wrote:
Dear Huang -
Dear George,
Good work tracking this down. What it shows is that you are running
into compiler problems! Even though your reference [1] says that using
unions to swap is bad, there is actually nothing wrong with it. It might be
I also suspected that it could be a compiler problem. However, I
tested endian1.c on another Pentium 4 machine with Visual C 6.0, and
it always gave wrong answer. I know that maybe both gcc and VC6 have
problems with union implementations, but chances are small.
inefficient on hardware, and, though [1] does not say why, there are two
So I made the test, and nothing changed.
possible problems that could arise: 1) loss of efficiency in swapping due
to sign extension; 2) alignment problems in the union. I suspect that the
problem you are having is due to botched optimization and/or sign extension
problems.
Can you make the following test? Change your routine like so, and
run your tests:
float float_swap(unsigned char cbuf[]) {
union {
unsigned char cval[4];
float fval;
} f_union;
f_union.cval[3] = cbuf[0];
f_union.cval[2] = cbuf[1];
f_union.cval[1] = cbuf[2];
f_union.cval[0] = cbuf[3];
return(f_union.fval);
}
That rules out sign extension as a contributor to the problems.
$ diff -u endian1.c endian1-unsigned.c
--- endian1.c 2008-08-30 19:10:29.000000000 -0400
+++ endian1-unsigned.c 2008-08-31 10:00:45.000000000 -0400
@@ -1,9 +1,9 @@
#include <stdio.h>
#include <stdlib.h>
-float float_swap(char cbuf[]) {
+float float_swap(unsigned char cbuf[]) {
union {
- char cval[4];
+ unsigned char cval[4];
float fval;
} f_union;
@@ -17,7 +17,7 @@
int main()
{
float f;
- char cbuf[4];
+ unsigned char cbuf[4];
unsigned char *p;
cbuf[0] = 0xff;
$ gcc -Wall ./endian1-unsigned.c -o endian1-unsigned && ./endian1-unsigned
ff ae 47 c5 --> c5 47 ee ff
(wrong result)
$ gcc -O3 -Wall ./endian1-unsigned.c -o endian1-unsigned && ./endian1-unsigned
ff ae 47 c5 --> c5 47 ae ff
(correct result)
I also did a sizeof() of the above union, and it is 4 bytes on my system.
Best regards,
--
Kuang He
Department of Physics
University of Connecticut
Storrs, CT 06269-3046
Tel: +1.860.486.4919
Web: http://www.phys.uconn.edu/~he/
-
Dear Huang -
Thanks for making the check. Your test program is quite simple and
straightforward.
I was able to reproduce the problem on an Intel Mac mini (gcc 4.0.1).
It turns out that the code to swap is fine. If you change the floats
to ints, then the byte swapping works OK.
Here's what the problem is, however: your number is an IEEE NaN!
When it is returned by the function and stored, then the bit pattern
changes. The reason the pattern changes is that your number is a
signaling NaN. A processor is allowed to clear the flag when a
floating point number is stored. That's the reason that the bit
changes. Depending on whether a floating point store or a fixed point
store or move is used, the result can differ depending on processor
and/or code optimization level.
So there's no problem with the byte swapping code.
On 31 Aug 2008, at 15:45, Kuang He wrote:
On Sun, Aug 31, 2008 at 8:06 AM, George Helffrich
George Helffrich
<george<at>geology.bristol.ac.uk> wrote:
Dear Huang -
Dear George,
Good work tracking this down. What it shows is that you are
running
into compiler problems! Even though your reference [1] says that
using
unions to swap is bad, there is actually nothing wrong with it. It
might be
I also suspected that it could be a compiler problem. However, I
tested endian1.c on another Pentium 4 machine with Visual C 6.0, and
it always gave wrong answer. I know that maybe both gcc and VC6 have
problems with union implementations, but chances are small.
inefficient on hardware, and, though [1] does not say why, there are
So I made the test, and nothing changed.
two
possible problems that could arise: 1) loss of efficiency in
swapping due
to sign extension; 2) alignment problems in the union. I suspect
that the
problem you are having is due to botched optimization and/or sign
extension
problems.
Can you make the following test? Change your routine like so,
and
run your tests:
float float_swap(unsigned char cbuf[]) {
union {
unsigned char cval[4];
float fval;
} f_union;
f_union.cval[3] = cbuf[0];
f_union.cval[2] = cbuf[1];
f_union.cval[1] = cbuf[2];
f_union.cval[0] = cbuf[3];
return(f_union.fval);
}
That rules out sign extension as a contributor to the problems.
$ diff -u endian1.c endian1-unsigned.c
--- endian1.c 2008-08-30 19:10:29.000000000 -0400
+++ endian1-unsigned.c 2008-08-31 10:00:45.000000000 -0400
@@ -1,9 +1,9 @@
#include <stdio.h>
#include <stdlib.h>
-float float_swap(char cbuf[]) {
+float float_swap(unsigned char cbuf[]) {
union {
- char cval[4];
+ unsigned char cval[4];
float fval;
} f_union;
@@ -17,7 +17,7 @@
int main()
{
float f;
- char cbuf[4];
+ unsigned char cbuf[4];
unsigned char *p;
cbuf[0] = 0xff;
$ gcc -Wall ./endian1-unsigned.c -o endian1-unsigned &&
./endian1-unsigned
ff ae 47 c5 --> c5 47 ee ff
(wrong result)
$ gcc -O3 -Wall ./endian1-unsigned.c -o endian1-unsigned &&
./endian1-unsigned
ff ae 47 c5 --> c5 47 ae ff
(correct result)
I also did a sizeof() of the above union, and it is 4 bytes on my
system.
Best regards,
--
Kuang He
Department of Physics
University of Connecticut
Storrs, CT 06269-3046
Tel: +1.860.486.4919
Web: http://www.phys.uconn.edu/~he/
_______________________________________________
sac-dev mailing list
sac-dev<at>iris.washington.edu
http://www.iris.washington.edu/mailman/listinfo/sac-dev
george<at>geology.bristol.ac.uk
-
Dear George,
Thanks for the enlightenment, I did not think of NaN before and was
not aware of fact that the number NaN is allowed to be changed by the
processor.
Just to clarify, the my input number specified in endian1.c (listed
below) under little-endian systems is not NaN, and it actually
represents the number 0xC547AEFF. After byte swapping, this number
will become 0xFFAE7A54, and appear to be signalling NaN to a little
endian system.
cbuf[0] = 0xff;
cbuf[1] = 0xae;
cbuf[2] = 0x47;
cbuf[3] = 0xc5;
Still, I think the original sacswap program can lead to a very
undesirable effect, that is, after using it, the values from sac files
may change. A simple way to demonstrate it is again to use the sac
file I've provided:
http://maxwell.phys.uconn.edu/~icrazy/sac/vel.sac
First, we byte swap it:
$ sacswap vel.sac
sacswap: Writing vel.sac.swap with npts = 14054 native => non-native
Then we convert the sac files to ascii files in order to make a text comparison:
$ cat sac2ascii
CONVERT FROM SAC vel.sac TO ALPHA vel.sac.txt
CONVERT FROM SAC vel.sac.swap TO ALPHA vel.sac.swap.txt
quit
$ sac sac2ascii
If we have used the original version of sacswap in the above byte
swapping step, files vel.sac.txt and vel.sac.swap.txt would differ on
my system (and on some other systems as well, but not all of them).
$ diff -u vel.sac.txt vel.sac.swap.txt
--- vel.sac.txt 2008-08-31 13:06:19.000000000 -0400
+++ vel.sac.swap.txt 2008-08-31 13:06:19.000000000 -0400
@@ -9,7 +9,7 @@
-12345 -12345 -12345 -12345 -12345
-12345 -12345 -12345 -12345 -12345
2848.977 304.7868 107.2769 25.62435 -12345
- -12345 -711.2866 0 0 -12345
+ -12345 -711.2802 0 0 -12345
-12345 -12345 -12345 -12345 -12345
-12345 -12345 -12345 -12345 -12345
2008 133 6 30 54
@@ -1767,7 +1767,7 @@
-442.2065 -627.9093 -794.7451 -949.0807 -1086.678
-1217.899 -1345.818 -1483.826 -1629.344 -1787.957
-1955.329 -2136.53 -2321.403 -2509.167 -2689.645
- -2865.094 -3031.295 -3194.937 -3348.609 -3491.529
+ -2865.094 -3031.295 -3198.937 -3348.609 -3491.529
-3615.532 -3724.585 -3811.694 -3886.186 -3950.19
-4018.139 -4088.735 -4161.889 -4223.314 -4275.219
-4315.033 -4355.229 -4397.055 -4444.013 -4481.068
@@ -1781,7 +1781,7 @@
-2900.382 -2733.463 -2557.34 -2380.768 -2199.411
-2018.65 -1835.886 -1663.406 -1504.461 -1375.213
-1279.549 -1231.151 -1227.254 -1269.1 -1342.068
- -1444.266 -1564.187 -1702.966 -1845.73 -1985.935
+ -1444.266 -1566.187 -1702.966 -1845.73 -1985.935
-2108.854 -2215.427 -2302.428 -2382.116 -2451.169
-2512.34 -2555.602 -2583.942 -2590.573 -2583.463
-2562.272 -2535.475 -2494.733 -2445.612 -2387.887
@@ -1982,7 +1982,7 @@
-80449.42 -80341.84 -80221.92 -80085.12 -79943.02
-79792.08 -79642 -79486.1 -79331.84 -79174.38
-79021.8 -78862.32 -78700.26 -78527.55 -78352.4
- -78166.28 -77976.15 -77775.91 -77575.99 -77370.91
+ -78166.28 -77976.15 -77775.91 -77703.99 -77370.91
-77165.37 -76944.98 -76712.09 -76458.49 -76193.1
-75908.02 -75610.84 -75292.95 -74960.52 -74601.54
-74220.37 -73809.9 -73374.59 -72901.31 -72395.22
@@ -2074,8 +2074,8 @@
49460.56 49606.18 49739.93 49870.56 49992.12
50113.37 50231.81 50355.56 50468.37 50569.93
50647.43 50709.93 50754 50788.06 50804
- 50807.43 50786.81 50750.87 50695.25 50621.5
- 50518.37 50394.31 50243.06 50076.5 49892.75
+ 50807.43 50786.81 50750.87 50695.25 50685.5
+ 50518.37 50394.31 50243.06 50140.5 49892.75
49698.37 49488.37 49277.12 49060.56 48854.31
48656.5 48478.37 48318.68 48187.75 48076.5
47991.19 47920.56 47864 47810.25 47769.94
@@ -2084,12 +2084,12 @@
47841.5 47901.19 47964.31 48041.5 48120.25
48206.5 48290.25 48380.87 48463.53 48546.96
48624.78 48707.28 48787.9 48879.15 48974
- 49080.4 49190.4 49311.5 49432.28 49558.37
+ 49080.4 49190.4 49375.5 49432.28 49558.37
49682.12 49812.43 49940.4 50073.37 50203.21
50335.71 50457.28 50570.56 50662.43 50738.21
50786.5 50814.46 50813.06 50793.68 50747.2
50680.64 50579.23 50446.34 50269.93 50061.26
- 49816.5 49547.98 49246.26 48914.78 48539.07
+ 49880.5 49547.98 49246.26 48914.78 48539.07
48130.72 47686.66 47220.05 46724.55 46213.8
45684.16 45148.57 44598.93 44042.48 43467.42
42878.2 42263.68 41629.63 40966.04 40282.76
@@ -2655,7 +2655,7 @@
-14297.84 -14457.91 -14600.05 -14759.44 -14899.54
-15057.7 -15197.83 -15354.5 -15491.26 -15648.15
-15789.54 -15950.98 -16095.28 -16259.3 -16402.67
- -16560.03 -16695.9 -16844.77 -16969.5 -17109.72
+ -16560.03 -16695.9 -16844.77 -17001.5 -17109.72
-17225.52 -17350.89 -17453.12 -17570.5 -17664.18
-17769.41 -17850.37 -17944.39 -18012.98 -18092.75
-18143.65 -18201.23 -18230.68 -18269.39 -18278.61
One can of course argue that the difference in values of the changed
data are not great, but it is certainly not desirable for users to
have their data changed without telling them. However, if we have used
the updated version of sacswap I've provided before in the above byte
swapping step, files vel.sac.txt and vel.sac.swap.txt will always be
the same (at least it is like this according to my own tests).
Therefore, to eliminate the possible unwitting data changing effects,
I think we should update the original sacswap program to one that uses
byte_swap(). One can still argue that it might be easier to just add a
"-O3" parameter when compiling the original sacswap program, which
seems to eliminate the problem withing touching the source code, but
this trick can not be guaranteed to work on all the platforms.
Best regards,
--
Kuang He
Department of Physics
University of Connecticut
Storrs, CT 06269-3046
Tel: +1.860.486.4919
Web: http://www.phys.uconn.edu/~he/
On Sun, Aug 31, 2008 at 11:56 AM, George Helffrich
<george<at>gly.bris.ac.uk> wrote:
Dear Huang -
Thanks for making the check. Your test program is quite simple and
straightforward.
I was able to reproduce the problem on an Intel Mac mini (gcc 4.0.1).
It turns out that the code to swap is fine. If you change the floats to
ints, then the byte swapping works OK.
Here's what the problem is, however: your number is an IEEE NaN!
When it is returned by the function and stored, then the bit pattern
changes. The reason the pattern changes is that your number is a signaling
NaN. A processor is allowed to clear the flag when a floating point number
is stored. That's the reason that the bit changes. Depending on whether a
floating point store or a fixed point store or move is used, the result can
differ depending on processor and/or code optimization level.
So there's no problem with the byte swapping code.
-
Dear Kuang -
If you demand that type of compatibility, then the straightforward way
to change sacswap would be to change the type of float_swap (and the
type in the union) to int, and store the swapped value as an int (with
a suitable cast). That would ensure against any bit changes due to NaN
behavior on intermediate machine architectures. Can you provide a diff
to sacswap.c do that?
On 31 Aug 2008, at 18:41, Kuang He wrote:
Dear George,
George Helffrich
Thanks for the enlightenment, I did not think of NaN before and was
not aware of fact that the number NaN is allowed to be changed by the
processor.
Just to clarify, the my input number specified in endian1.c (listed
below) under little-endian systems is not NaN, and it actually
represents the number 0xC547AEFF. After byte swapping, this number
will become 0xFFAE7A54, and appear to be signalling NaN to a little
endian system.
cbuf[0] = 0xff;
cbuf[1] = 0xae;
cbuf[2] = 0x47;
cbuf[3] = 0xc5;
Still, I think the original sacswap program can lead to a very
undesirable effect, that is, after using it, the values from sac files
may change. A simple way to demonstrate it is again to use the sac
file I've provided:
http://maxwell.phys.uconn.edu/~icrazy/sac/vel.sac
First, we byte swap it:
$ sacswap vel.sac
sacswap: Writing vel.sac.swap with npts = 14054 native => non-native
Then we convert the sac files to ascii files in order to make a text
comparison:
$ cat sac2ascii
CONVERT FROM SAC vel.sac TO ALPHA vel.sac.txt
CONVERT FROM SAC vel.sac.swap TO ALPHA vel.sac.swap.txt
quit
$ sac sac2ascii
If we have used the original version of sacswap in the above byte
swapping step, files vel.sac.txt and vel.sac.swap.txt would differ on
my system (and on some other systems as well, but not all of them).
$ diff -u vel.sac.txt vel.sac.swap.txt
--- vel.sac.txt 2008-08-31 13:06:19.000000000 -0400
+++ vel.sac.swap.txt 2008-08-31 13:06:19.000000000 -0400
@@ -9,7 +9,7 @@
-12345 -12345 -12345 -12345
-12345
-12345 -12345 -12345 -12345
-12345
2848.977 304.7868 107.2769 25.62435
-12345
- -12345 -711.2866 0 0
-12345
+ -12345 -711.2802 0 0
-12345
-12345 -12345 -12345 -12345
-12345
-12345 -12345 -12345 -12345
-12345
2008 133 6 30 54
@@ -1767,7 +1767,7 @@
-442.2065 -627.9093 -794.7451 -949.0807
-1086.678
-1217.899 -1345.818 -1483.826 -1629.344
-1787.957
-1955.329 -2136.53 -2321.403 -2509.167
-2689.645
- -2865.094 -3031.295 -3194.937 -3348.609
-3491.529
+ -2865.094 -3031.295 -3198.937 -3348.609
-3491.529
-3615.532 -3724.585 -3811.694 -3886.186
-3950.19
-4018.139 -4088.735 -4161.889 -4223.314
-4275.219
-4315.033 -4355.229 -4397.055 -4444.013
-4481.068
@@ -1781,7 +1781,7 @@
-2900.382 -2733.463 -2557.34 -2380.768
-2199.411
-2018.65 -1835.886 -1663.406 -1504.461
-1375.213
-1279.549 -1231.151 -1227.254 -1269.1
-1342.068
- -1444.266 -1564.187 -1702.966 -1845.73
-1985.935
+ -1444.266 -1566.187 -1702.966 -1845.73
-1985.935
-2108.854 -2215.427 -2302.428 -2382.116
-2451.169
-2512.34 -2555.602 -2583.942 -2590.573
-2583.463
-2562.272 -2535.475 -2494.733 -2445.612
-2387.887
@@ -1982,7 +1982,7 @@
-80449.42 -80341.84 -80221.92 -80085.12
-79943.02
-79792.08 -79642 -79486.1 -79331.84
-79174.38
-79021.8 -78862.32 -78700.26 -78527.55
-78352.4
- -78166.28 -77976.15 -77775.91 -77575.99
-77370.91
+ -78166.28 -77976.15 -77775.91 -77703.99
-77370.91
-77165.37 -76944.98 -76712.09 -76458.49
-76193.1
-75908.02 -75610.84 -75292.95 -74960.52
-74601.54
-74220.37 -73809.9 -73374.59 -72901.31
-72395.22
@@ -2074,8 +2074,8 @@
49460.56 49606.18 49739.93 49870.56
49992.12
50113.37 50231.81 50355.56 50468.37
50569.93
50647.43 50709.93 50754 50788.06
50804
- 50807.43 50786.81 50750.87 50695.25
50621.5
- 50518.37 50394.31 50243.06 50076.5
49892.75
+ 50807.43 50786.81 50750.87 50695.25
50685.5
+ 50518.37 50394.31 50243.06 50140.5
49892.75
49698.37 49488.37 49277.12 49060.56
48854.31
48656.5 48478.37 48318.68 48187.75
48076.5
47991.19 47920.56 47864 47810.25
47769.94
@@ -2084,12 +2084,12 @@
47841.5 47901.19 47964.31 48041.5
48120.25
48206.5 48290.25 48380.87 48463.53
48546.96
48624.78 48707.28 48787.9 48879.15
48974
- 49080.4 49190.4 49311.5 49432.28
49558.37
+ 49080.4 49190.4 49375.5 49432.28
49558.37
49682.12 49812.43 49940.4 50073.37
50203.21
50335.71 50457.28 50570.56 50662.43
50738.21
50786.5 50814.46 50813.06 50793.68
50747.2
50680.64 50579.23 50446.34 50269.93
50061.26
- 49816.5 49547.98 49246.26 48914.78
48539.07
+ 49880.5 49547.98 49246.26 48914.78
48539.07
48130.72 47686.66 47220.05 46724.55
46213.8
45684.16 45148.57 44598.93 44042.48
43467.42
42878.2 42263.68 41629.63 40966.04
40282.76
@@ -2655,7 +2655,7 @@
-14297.84 -14457.91 -14600.05 -14759.44
-14899.54
-15057.7 -15197.83 -15354.5 -15491.26
-15648.15
-15789.54 -15950.98 -16095.28 -16259.3
-16402.67
- -16560.03 -16695.9 -16844.77 -16969.5
-17109.72
+ -16560.03 -16695.9 -16844.77 -17001.5
-17109.72
-17225.52 -17350.89 -17453.12 -17570.5
-17664.18
-17769.41 -17850.37 -17944.39 -18012.98
-18092.75
-18143.65 -18201.23 -18230.68 -18269.39
-18278.61
One can of course argue that the difference in values of the changed
data are not great, but it is certainly not desirable for users to
have their data changed without telling them. However, if we have used
the updated version of sacswap I've provided before in the above byte
swapping step, files vel.sac.txt and vel.sac.swap.txt will always be
the same (at least it is like this according to my own tests).
Therefore, to eliminate the possible unwitting data changing effects,
I think we should update the original sacswap program to one that uses
byte_swap(). One can still argue that it might be easier to just add a
"-O3" parameter when compiling the original sacswap program, which
seems to eliminate the problem withing touching the source code, but
this trick can not be guaranteed to work on all the platforms.
Best regards,
--
Kuang He
Department of Physics
University of Connecticut
Storrs, CT 06269-3046
Tel: +1.860.486.4919
Web: http://www.phys.uconn.edu/~he/
On Sun, Aug 31, 2008 at 11:56 AM, George Helffrich
<george<at>gly.bris.ac.uk> wrote:
Dear Huang -
_______________________________________________
Thanks for making the check. Your test program is quite
simple and
straightforward.
I was able to reproduce the problem on an Intel Mac mini (gcc
4.0.1).
It turns out that the code to swap is fine. If you change the
floats to
ints, then the byte swapping works OK.
Here's what the problem is, however: your number is an IEEE
NaN!
When it is returned by the function and stored, then the bit pattern
changes. The reason the pattern changes is that your number is a
signaling
NaN. A processor is allowed to clear the flag when a floating point
number
is stored. That's the reason that the bit changes. Depending on
whether a
floating point store or a fixed point store or move is used, the
result can
differ depending on processor and/or code optimization level.
So there's no problem with the byte swapping code.
sac-dev mailing list
sac-dev<at>iris.washington.edu
http://www.iris.washington.edu/mailman/listinfo/sac-dev
george<at>geology.bristol.ac.uk
-
Dear George,
Hmm, while I still think the previous way of doing it using
byte_swap() is correct, using your way certainly would generate a much
smaller diff file.
The new diff file is attached, and also can be found at:
http://maxwell.phys.uconn.edu/~icrazy/sac/sacswap.c.diff
The original diff file using byte_swap() is also put here for comparison:
http://maxwell.phys.uconn.edu/~icrazy/sac/sacswap.c.diff2
Best regards,
--
Kuang He
Department of Physics
University of Connecticut
Storrs, CT 06269-3046
Tel: +1.860.486.4919
Web: http://www.phys.uconn.edu/~he/
On Sun, Aug 31, 2008 at 2:07 PM, George Helffrich <george<at>gly.bris.ac.uk> wrote:
Dear Kuang -
If you demand that type of compatibility, then the straightforward
way to change sacswap would be to change the type of float_swap (and the
type in the union) to int, and store the swapped value as an int (with a
suitable cast). That would ensure against any bit changes due to NaN
behavior on intermediate machine architectures. Can you provide a diff to
sacswap.c do that?
On 31 Aug 2008, at 18:41, Kuang He wrote:
Dear George,
Thanks for the enlightenment, I did not think of NaN before and was
not aware of fact that the number NaN is allowed to be changed by the
processor.
Just to clarify, the my input number specified in endian1.c (listed
below) under little-endian systems is not NaN, and it actually
represents the number 0xC547AEFF. After byte swapping, this number
will become 0xFFAE7A54, and appear to be signalling NaN to a little
endian system.
cbuf[0] = 0xff;
cbuf[1] = 0xae;
cbuf[2] = 0x47;
cbuf[3] = 0xc5;
Still, I think the original sacswap program can lead to a very
undesirable effect, that is, after using it, the values from sac files
may change. A simple way to demonstrate it is again to use the sac
file I've provided:
[ ... snipped ...]
Therefore, to eliminate the possible unwitting data changing effects,
I think we should update the original sacswap program to one that uses
byte_swap(). One can still argue that it might be easier to just add a
"-O3" parameter when compiling the original sacswap program, which
seems to eliminate the problem withing touching the source code, but
this trick can not be guaranteed to work on all the platforms.
Best regards,
--
Kuang He
Attachments
-
-
-
-
-