Thread: Bugs found

Started: 2008-09-06 09:06:16
Last activity: 2008-09-08 17:40:39
Topics: SAC Developers
Kuang He
2008-09-06 09:06:16
Hi,

I'm using SAC v101.1 on a linux box (Ubuntu 8.04), and the glibc
version is 2.7 (2.7-10ubuntu3, to be exact).

$ uname -a
Linux ....... 2.6.24-19-generic #1 SMP Fri Jul 11 23:41:49 UTC 2008
i686 GNU/Linux

Bug 1: Trying to use Ctrl+D to quit SAC instead of using the command
`quit' will always cause a segmentation fault.

$ sac
SAC> [Press CTRL+D]
Segmentation fault

Bug 2: Putting a space after the comma in something like "&1,DIST"
will _sometimes_ cause SAC to abort suddenly, with a message from
glibc indicating a possible double free. Below are two examples: one
where the problem does not show up and one where it does.

$ sac
SAC> r vel.sac
SAC> evaluate to dist &1,dist
SAC> evaluate to dist &1, dist
ERROR interpreting command: evaluate to dist ' dist
ILLEGAL OPTION:

$ sac
SAC> r vel.sac
SAC> evaluate to dist1 &1,dist
SAC> message %dist1
2.84897$
SAC> evaluate to dist1 &1, dist
*** glibc detected *** /usr/local/sac/bin/sac: double free or
corruption (!prev): 0x0843f020 ***
======= Backtrace: =========
/lib/tls/i686/cmov/libc.so.6[0xb7c9ba85]
/lib/tls/i686/cmov/libc.so.6(cfree+0x90)[0xb7c9f4f0]
/usr/local/sac/bin/sac[0x805d749]
/usr/local/sac/bin/sac[0x80c81fe]
/usr/local/sac/bin/sac[0x804e60f]
/usr/local/sac/bin/sac[0x804b991]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe0)[0xb7c46450]
/usr/local/sac/bin/sac[0x804b5b1]
======= Memory map: ========
08048000-08197000 r-xp 00000000 08:01 1309634 /usr/local/sac/bin/sac
08197000-0819d000 rw-p 0014e000 08:01 1309634 /usr/local/sac/bin/sac
0819d000-08443000 rw-p 0819d000 00:00 0 [heap]
b7a00000-b7a21000 rw-p b7a00000 00:00 0
b7a21000-b7b00000 ---p b7a21000 00:00 0
b7c0c000-b7c0d000 rw-p b7c0c000 00:00 0
b7c0d000-b7c11000 r-xp 00000000 08:01 671763 /usr/lib/libXdmcp.so.6.0.0
b7c11000-b7c12000 rw-p 00003000 08:01 671763 /usr/lib/libXdmcp.so.6.0.0
b7c12000-b7c14000 r-xp 00000000 08:01 671752 /usr/lib/libXau.so.6.0.0
b7c14000-b7c15000 rw-p 00001000 08:01 671752 /usr/lib/libXau.so.6.0.0
b7c15000-b7c2c000 r-xp 00000000 08:01 671564 /usr/lib/libxcb.so.1.0.0
b7c2c000-b7c2d000 rw-p 00016000 08:01 671564 /usr/lib/libxcb.so.1.0.0
b7c2d000-b7c2e000 r-xp 00000000 08:01 671723 /usr/lib/libxcb-xlib.so.0.0.0
b7c2e000-b7c2f000 rw-p 00000000 08:01 671723 /usr/lib/libxcb-xlib.so.0.0.0
b7c2f000-b7c30000 rw-p b7c2f000 00:00 0
b7c30000-b7d79000 r-xp 00000000 08:01 2812605 /lib/tls/i686/cmov/libc-2.7.so
b7d79000-b7d7a000 r--p 00149000 08:01 2812605 /lib/tls/i686/cmov/libc-2.7.so
b7d7a000-b7d7c000 rw-p 0014a000 08:01 2812605 /lib/tls/i686/cmov/libc-2.7.so
b7d7c000-b7d7f000 rw-p b7d7c000 00:00 0
b7d7f000-b7dac000 r-xp 00000000 08:01 2779973 /lib/libncurses.so.5.6
b7dac000-b7daf000 rw-p 0002c000 08:01 2779973 /lib/libncurses.so.5.6
b7daf000-b7dd2000 r-xp 00000000 08:01 2812613 /lib/tls/i686/cmov/libm-2.7.so
b7dd2000-b7dd4000 rw-p 00023000 08:01 2812613 /lib/tls/i686/cmov/libm-2.7.so
b7dd4000-b7dd6000 r-xp 00000000 08:01 2812611 /lib/tls/i686/cmov/libdl-2.7.so
b7dd6000-b7dd8000 rw-p 00001000 08:01 2812611 /lib/tls/i686/cmov/libdl-2.7.so
b7ddb000-b7de5000 r-xp 00000000 08:01 2779880 /lib/libgcc_s.so.1
b7de5000-b7de6000 rw-p 0000a000 08:01 2779880 /lib/libgcc_s.so.1
b7de6000-b7de8000 rw-p b7de6000 00:00 0
b7de8000-b7ecc000 r-xp 00000000 08:01 672597 /usr/lib/libX11.so.6.2.0
b7ecc000-b7ecf000 rw-p 000e4000 08:01 672597 /usr/lib/libX11.so.6.2.0
b7ecf000-b7ed0000 rw-p b7ecf000 00:00 0
b7ed0000-b7ee5000 r-xp 00000000 08:01 670455 /usr/lib/libICE.so.6.3.0
b7ee5000-b7ee6000 rw-p 00014000 08:01 670455 /usr/lib/libICE.so.6.3.0
b7ee6000-b7ee8000 rw-p b7ee6000 00:00 0
b7ee8000-b7eef000 r-xp 00000000 08:01 671742 /usr/lib/libSM.so.6.0.0
b7eef000-b7ef0000 rw-p 00006000 08:01 671742 /usr/lib/libSM.so.6.0.0
b7ef0000-b7ef2000 rw-p b7ef0000 00:00 0
b7ef2000-b7ef3000 r-xp b7ef2000 00:00 0 [vdso]
b7ef3000-b7f0d000 r-xp 00000000 08:01 2779919 /lib/ld-2.7.so
b7f0d000-b7f0f000 rw-p 00019000 08:01 2779919 /lib/ld-2.7.so
bf86b000-bf880000 rw-p bffeb000 00:00 0 [stack]
Aborted

The file vel.sac used above can be found at:

http://maxwell.phys.uconn.edu/~icrazy/sac/vel.sac


Best regards,

--
Kuang He
Department of Physics
University of Connecticut
Storrs, CT 06269-3046

Tel: +1.860.486.4919
Web: http://www.phys.uconn.edu/~he/

  • Kuang He
    2008-09-06 09:22:34
    On Sat, Sep 6, 2008 at 2:06 AM, Kuang He <icrazy<at>gmail.com> wrote:
    [...]
    Bug 2: Putting a space after the comma in something like "&1,DIST"
    will _sometimes_ cause SAC to suddenly abort, with a message from
    glibc indicating possible double free. Below is an example of a case
    where this problem does not show up and another case where the problem
    does show up.

    $ sac
    SAC> r vel.sac
    SAC> evaluate to dist &1,dist
    SAC> evaluate to dist &1, dist
    ERROR interpreting command: evaluate to dist ' dist
    ILLEGAL OPTION:

    $ sac
    SAC> r vel.sac
    SAC> evaluate to dist1 &1,dist
    SAC> message %dist1
    2.84897$
    SAC> evaluate to dist1 &1, dist
    *** glibc detected *** /usr/local/sac/bin/sac: double free or
    corruption (!prev): 0x0843f020 ***
    ======= Backtrace: =========
    /lib/tls/i686/cmov/libc.so.6[0xb7c9ba85]
    [...]

    I googled on this error message, and someone suggested using

    $ export MALLOC_CHECK_=0

    before running SAC. This environment variable tells glibc to silently
    ignore the heap errors it detects, such as freeing the same block of
    memory more than once. That seems to have worked around the problem
    for now, but it is still possible that the problem is due to a coding
    error.

    Best regards,

    --
    Kuang He
    Department of Physics
    University of Connecticut
    Storrs, CT 06269-3046

    Tel: +1.860.486.4919
    Web: http://www.phys.uconn.edu/~he/

  • Kuang He
    2008-09-08 06:34:03
    On Sat, Sep 6, 2008 at 2:06 AM, Kuang He <icrazy<at>gmail.com> wrote:
    I'm using SAC v101.1 on a linux box (Ubuntu 8.04), and the glibc
    version is 2.7 (2.7-10ubuntu3, to be exact).

    $ uname -a
    Linux ....... 2.6.24-19-generic #1 SMP Fri Jul 11 23:41:49 UTC 2008
    i686 GNU/Linux

    Bug 1: Trying to use Ctrl+D to quit SAC instead of using the command
    `quit' will always cause a segmentation fault.

    $ sac
    SAC> [Press CTRL+D]
    Segmentation fault

    Here is a quick fix to Bug 1 (based on SAC v101.1):

    $ diff -u src/co/zgpmsg.c.old src/co/zgpmsg.c
    --- src/co/zgpmsg.c.old 2008-09-07 23:18:09.000000000 -0400
    +++ src/co/zgpmsg.c 2008-09-07 23:28:07.000000000 -0400
    @@ -96,6 +96,14 @@
     process_line(char *p) {
       int i;

    +  if (p == NULL) { /* user has pressed CTRL+D */
    +    if ((p = malloc(5)) == NULL) {
    +      printf("%s:%d: error allocating p.\n", __FILE__, __LINE__);
    +      exit(1);
    +    }
    +    strncpy(p, "quit", 4);
    +    p[4] = '\0';
    +  }
       select_loop_continue(SELECT_OFF); /* Turn off select loop */
       select_loop_message(p, SELECT_MSG_SET); /* Set the outgoing message */


    Probably there are better ways to do this.
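
One possible alternative, offered only as a sketch: it assumes, as the patch above implies, that process_line() is handed NULL when the line reader sees end-of-file, and that the rest of the code expects a heap-allocated command string. The helper name line_or_quit is hypothetical and does not exist in SAC. It sizes the buffer from the string itself instead of hard-coding 5 and 4, and leaves the error handling to the caller.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of SAC): return the command line to run,
 * substituting a heap-allocated "quit" when the input layer signals
 * end-of-file by passing NULL. Returns NULL only if allocation fails.
 */
char *line_or_quit(char *line)
{
    static const char quit_cmd[] = "quit";
    char *copy;

    if (line != NULL)
        return line;                     /* normal input: pass through */

    copy = malloc(sizeof quit_cmd);      /* sizeof counts the trailing '\0' */
    if (copy == NULL)
        return NULL;                     /* caller decides how to report it */
    memcpy(copy, quit_cmd, sizeof quit_cmd);
    return copy;
}
```

With a helper like this, the top of process_line() could start with p = line_or_quit(p) followed by a single NULL check, assuming the existing code is responsible for freeing p afterwards.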

    There is another file src/co/zgpmsg.c, which seems to be an old
    version of src/co/zgpmsg.c, and is still in the Makefile. Do I need to
    change this file as well? Please advise.

    Best regards,

    --
    Kuang He
    Department of Physics
    University of Connecticut
    Storrs, CT 06269-3046

    Tel: +1.860.486.4919
    Web: http://www.phys.uconn.edu/~he/

  • Kuang He
    2008-09-08 09:17:41
    On Sat, Sep 6, 2008 at 2:06 AM, Kuang He <icrazy<at>gmail.com> wrote:
    I'm using SAC v101.1 on a linux box (Ubuntu 8.04), and the glibc
    version is 2.7 (2.7-10ubuntu3, to be exact).

    $ uname -a
    Linux ....... 2.6.24-19-generic #1 SMP Fri Jul 11 23:41:49 UTC 2008
    i686 GNU/Linux
    .....
    Bug 2: Putting a space after the comma in something like "&1,DIST"
    will _sometimes_ cause SAC to suddenly abort, with a message from
    glibc indicating possible double free. Below is an example of a case
    where this problem does not show up and another case where the problem
    does show up.

    $ sac
    SAC> r vel.sac
    SAC> evaluate to dist1 &1,dist
    SAC> message %dist1
    2.84897$
    SAC> evaluate to dist1 &1, dist
    *** glibc detected *** /usr/local/sac/bin/sac: double free or
    corruption (!prev): 0x0843f020 ***
    ======= Backtrace: =========
    /lib/tls/i686/cmov/libc.so.6[0xb7c9ba85]
    /lib/tls/i686/cmov/libc.so.6(cfree+0x90)[0xb7c9f4f0]
    ....

    With Brian's help, I was able to locate the place in the source code
    that caused this problem (his OSX machines don't have this problem):
    line 93 of src/cpf/cfmt.c. The code there does not contain a double
    free(), but the snippets shown below do not make any sense to me: the
    temporary variable strtemp1 gets created and destroyed without doing
    anything useful at all. Commenting these out does solve the problem.

    $ diff -u src/cpf/cfmt.c.old src/cpf/cfmt.c
    --- src/cpf/cfmt.c.old 2008-09-08 00:07:33.000000000 -0400
    +++ src/cpf/cfmt.c 2008-09-08 01:48:27.000000000 -0400
    @@ -86,11 +86,14 @@
         iend_ = ibeg + nchar + 2;
         if( iend_ <= MCMSG ){
             kmsg[ibeg - 1] = kmcom.kcom[j - 1][0];
    +        /*
             strtemp1 = malloc(MCMSG+1-(ibeg+1));
             strncpy(strtemp1,kmsg+ibeg,MCMSG+1-(ibeg+1));
             strtemp1[MCMSG+1-(ibeg+1)] = '\0';
             copykc( (char*)kmcom.kcom[j + 1],9,
                     nchar, strtemp1);
             free(strtemp1);
    +        */
             kmsg[iend_ - 2] = kmcom.kcom[j - 1][0];
             kmsg[iend_ - 1] = ' ';
             if( j == cmcom.jcom )
    @@ -103,11 +106,13 @@
         nchar = (long)( Flnum[j + 1] + 0.1 );
         iend_ = ibeg + nchar;
         if( iend_ <= MCMSG ){
    +        /*
             strtemp1 = malloc(MCMSG+1-ibeg);
             strncpy(strtemp1,kmsg+ibeg-1,MCMSG+1-ibeg);
             strtemp1[MCMSG+1-ibeg] = '\0';
             copykc( (char*)kmcom.kcom[j + 1],9,
                     nchar, strtemp1);
             free(strtemp1);
    +        */
             kmsg[iend_ - 1] = ' ';
             if( j == cmcom.jcom )
                 iarrow = ibeg - ndiff;
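
A possible explanation, offered as a reading of the quoted snippet and not as a confirmed diagnosis: each commented-out block allocates N bytes (N is MCMSG+1-(ibeg+1) in the first and MCMSG+1-ibeg in the second) and then stores the terminator at index N, which is one byte past the end of the allocation. A one-byte overflow like that can damage the allocator's bookkeeping for the neighbouring block, and glibc reports such damage with the same "double free or corruption" message even when nothing is freed twice. Whether the stray byte lands on bookkeeping data depends on the requested size, which would fit the "sometimes" in the report. The sketch below shows the usual pattern with room for the terminator; copy_prefix is a hypothetical name, not a SAC function.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of SAC): copy at most n characters of src
 * into a fresh buffer and terminate it. The buffer holds n + 1 bytes, so
 * the terminator written at index n stays inside the allocation.
 */
char *copy_prefix(const char *src, size_t n)
{
    char *dst = malloc(n + 1);   /* the quoted code asks for only n bytes */
    if (dst == NULL)
        return NULL;
    strncpy(dst, src, n);        /* pads with '\0' when src is shorter than n */
    dst[n] = '\0';               /* in bounds: valid indices are 0..n */
    return dst;
}
```

If the temporary copy really is unused, as suggested above, removing it is simpler still; the sketch only illustrates the sizing rule.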


    By the way, I think replacing every call to free() with the FREE()
    macro shown below would be a good idea. The catch is that the code
    base is so big that changing all of them will take quite some time.

    #define FREE(ptr) do { if (ptr) free(ptr); } while (0)
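
A small variation on the macro above, sketched here for comparison: the C standard already defines free(NULL) as a no-op, so the NULL test alone does not prevent a second free() of a stale pointer. Resetting the pointer after freeing it does, at least for that one variable. The name FREE_AND_NULL is hypothetical.

```c
#include <stdlib.h>

/*
 * Free the block and clear the pointer variable. A second use on the same
 * variable then calls free(NULL), which is defined to do nothing.
 * The argument must be a modifiable pointer variable (an lvalue).
 */
#define FREE_AND_NULL(ptr) do { free(ptr); (ptr) = NULL; } while (0)
```

This does not help when two different variables hold the same address, so it reduces the risk of a double free without removing it.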


    Best regards,

    --
    Kuang He
    Department of Physics
    University of Connecticut
    Storrs, CT 06269-3046

    Tel: +1.860.486.4919
    Web: http://www.phys.uconn.edu/~he/

    • Brian Savage
      2008-09-08 17:40:39
      Kuang He,

      Good work in tracking down both bugs.
      I will look them over in a couple of days, after Wednesday probably.
      These fixes will have to wait until after 101.2.

      Cheers
      Brian

      _______________________________________________
      sac-dev mailing list
      sac-dev<at>iris.washington.edu
      http://www.iris.washington.edu/mailman/listinfo/sac-dev

