
20. Debugging

There are several ways to debug an S-Lang script. When the interpreter encounters an uncaught exception, it can generate a traceback report showing where the error occurred and the values of local variables in the function call stack frames at the time of the error. Often just knowing where the error occurs is all that is required to correct the problem. More subtle bugs may require a deeper analysis. While one can insert print statements in the code to get some idea of what is going on, it may be simpler to use the interactive debugger.
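
The print-statement approach can be sketched as follows. The function scale_values and its arguments are hypothetical, chosen only for illustration. The sketch assumes that vmessage accepts the %S format, which formats any object as a string.

```slang
% Hypothetical function instrumented with a temporary diagnostic message.
define scale_values (values, factor)
{
   % Report the types of the arguments each time the function is called.
   vmessage ("scale_values: values is %S, factor is %S",
             typeof (values), typeof (factor));
   return values * factor;
}

variable scaled = scale_values ([1:3], 2.0);
vmessage ("scaled = %S", scaled);
```

Such messages are quick to add, but they have to be removed again afterwards, which is one reason the traceback and debugger facilities described in this chapter are often more convenient.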

20.1 Tracebacks

When the value of the _traceback variable is non-zero, the interpreter will generate a traceback report when it encounters an error. This variable may be set by putting the line

    _traceback = 1;
at the top of the suspect file. If the script is running in slsh, then invoking slsh using the -g option will enable tracebacks:
    slsh -g myscript.sl

If _traceback is set to a positive value, the values of local variables will be printed in the traceback report. If set to a negative integer, the values of the local variables will be absent.
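
As a minimal sketch of this setting, the following file enables tracebacks with local variables and then triggers an uncaught exception. The function check_positive and its local variable label are hypothetical and exist only to give the report something to show.

```slang
% A positive value also prints local variables; a negative value omits them.
_traceback = 1;

% Hypothetical helper that fails for non-positive input.
define check_positive (x)
{
   variable label = "x";
   if (x <= 0)
     throw InvalidParmError, sprintf ("%s must be positive, got %d", label, x);
   return x;
}

() = check_positive (-3);   % uncaught exception: a traceback should be reported
```

Changing the first statement to _traceback = -1; should produce the same report without the values of the local variables.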

Here is an example of a traceback report:

    Traceback: error
    ***string***:1:verror:Run-Time Error
    /grandpa/d1/src/jed/lib/search.sl:78:search_generic_search:Run-Time Error
    Local Variables:
          String_Type prompt = "Search forward:"
          Integer_Type dir = 1
          Ref_Type line_ok_fun = &_function_return_1
          String_Type str = "ascascascasc"
          Char_Type not_found = 1
          Integer_Type cs = 0
    /grandpa/d1/src/jed/lib/search.sl:85:search_forward:Run-Time Error
There are several ways to read this report; perhaps the simplest is to read it from the bottom. The report says that on line 85 of search.sl the search_forward function called the search_generic_search function, which on line 78 called the verror function, which in turn called error. The search_generic_search function has 6 local variables, whose values at the time of the error are given in the traceback output. For example, the local variable not_found had a Char_Type value of 1 at the time of the error.

20.2 Using the sldb debugger

The interpreter contains a number of hooks that support a debugger. sldb consists of a set of functions that use these hooks to implement a simple debugger. Although written for slsh, the debugger may be used by other S-Lang interpreters that permit the loading of slsh library files. The examples presented here are given in the context of slsh.

In order to use the debugger, the code to be debugged must be loaded with debugging info enabled. This can be done in several ways, depending upon the application embedding the interpreter.

For applications that support a command line, the simplest way to access the debugger is to use the sldb function with the name of the file to be debugged:

   require ("sldb");
   sldb ("foo.sl");
When called without an argument, sldb will prompt for input. This can be useful for setting or removing breakpoints.

Another mechanism to access the debugger is to put

   require ("sldb");
   sldb_enable ();
at the top of the suspect file. Any files loaded by the file will also be compiled with debugging support, making it unnecessary to add this to all files.

If the file contains any top-level executable statements, the debugger will display the line to be executed and prompt for input. If the file does not contain any executable statements, the debugger will not be activated until one of the functions in the file is executed.

As a concrete example, consider the following contrived slsh script called buggy.sl:

    define divide (a, b, i)
    {
       return a[i] / b;
    }
    define slsh_main ()
    {
       variable x = [1:5];
       variable y = x*x;
       variable i;
       _for i (0, length(x), 1)
         {
            variable z = divide (x, y, i);
            () = fprintf (stdout, "%g/%g = %g", x[i], y[i], z);
         }
    }
Running this via
    slsh buggy.sl
yields
    Expecting Double_Type, found Array_Type
    ./buggy.sl:13:slsh_main:Type Mismatch
More information may be obtained by using slsh's -g option to cause a traceback report to be printed:
    slsh -g buggy.sl
    Expecting Double_Type, found Array_Type
    Traceback: fprintf
    ./buggy.sl:13:slsh_main:Type Mismatch
    Local variables for slsh_main:
         Array_Type x = Integer_Type[5]
         Array_Type y = Integer_Type[5]
         Integer_Type i = 0
         Array_Type z = Integer_Type[5]
    Error encountered while executing slsh_main
From this one can see that the problem is that z is an array and not a scalar as expected.

To run the program under debugger control, start up slsh and load the file using the sldb function:

    slsh> sldb ("./buggy.sl");
Note the use of "./" in the filename. This may be necessary if the file is not in the slsh search path.

The above command causes execution to stop with the following displayed:

    slsh_main at ./buggy.sl:9
    9    variable x = [1:5];
    (sldb)
This shows that the debugger has stopped the script at line 9 of buggy.sl and is waiting for input. The print command may be used to print the value of an expression or variable. Using it to display the value of x yields
    (sldb) print x
    Caught exception:Variable Uninitialized Error
    (sldb)
This is because x has not yet been assigned a value and will not be until line 9 has been executed. The next command may be used to execute the current line and stop at the next one:
    (sldb) next
    10    variable y = x*x;
    (sldb)
The step command functions almost the same as next, except when a function call is involved. In such a case, the next command will step over the function call but step will cause the debugger to enter the function and stop there.

Now the value of x may be displayed using the print command:

    (sldb) print x
    Integer_Type[5]
    (sldb) print x[0]
    1
    (sldb) print x[-1]
    5
    (sldb)

The list command may be used to get a list of the source code around the current line:

    (sldb) list
    5     return a[i] / b;
    6  }
    7  define slsh_main ()
    8  {
    9     variable x = [1:5];
    10    variable y = x*x;
    11    variable i;
    12    _for i (0, length(x), 1)
    13      {
    14      variable z = divide (x, y, i);
    15      () = fprintf (stdout, "%g/%g = %g", x[i], y[i], z);

The break command may be used to set a breakpoint. For example,

    (sldb) break 15
    breakpoint #1 set at ./buggy.sl:15
will set a breakpoint at line 15 of the current file.

The cont command may be used to continue execution until the next breakpoint:

    (sldb) cont
    Breakpoint 1, slsh_main
        at ./buggy.sl:15
    15      () = fprintf (stdout, "%g/%g = %g", x[i], y[i], z);
    (sldb)
Using the next command produces:
    Received Type Mismatch error.  Entering the debugger
    15      () = fprintf (stdout, "%g/%g = %g", x[i], y[i], z);
This shows that during the execution of line 15, a TypeMismatchError was generated. Let's see what caused it:
    (sldb) print x[i]
    1
    (sldb) print y[i]
    1
    (sldb) print z
    Integer_Type[5]
This shows that the problem was caused by z being an array and not a scalar, something that was already known from the traceback report. Now let's see why it is not a scalar. Start the program again and set a breakpoint in the divide function:
    slsh_main at ./buggy.sl:9
    9    variable x = [1:5];
    (sldb) break divide
    breakpoint #1 set at divide
    (sldb) cont
    Breakpoint 1, divide
    at ./buggy.sl:5
    5    return a[i] / b;
    (sldb)
The values of a[i]/b and b may be printed:
    (sldb) print a[i]/b
    Integer_Type[5]
    (sldb) print b
    Integer_Type[5]
From this it is easy to see that z is an array because b is an array. The fix for this is to change line 5 to
    return a[i] / b[i];
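
For reference, here is a sketch of the whole script with the fix applied to divide. It also stops the loop at length(x)-1, on the assumption that _for includes its upper bound, so the original limit of length(x) would index one element past the end of x. That second change is not part of the fix discussed above.

```slang
define divide (a, b, i)
{
   % Index both arrays so that the quotient is a scalar.
   return a[i] / b[i];
}
define slsh_main ()
{
   variable x = [1:5];
   variable y = x*x;
   variable i;
   % Assumption: _for includes its last value, so stop at the final index.
   _for i (0, length(x)-1, 1)
     {
        variable z = divide (x, y, i);
        () = fprintf (stdout, "%g/%g = %g", x[i], y[i], z);
     }
}
```

Since x and y are integer arrays, the quotient is computed with integer arithmetic. If fractional results are wanted, one option is to make one of the operands a double, for example by writing 1.0*a[i] / b[i].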

The debugger supports several other commands. For example, the up and down commands may be used to move up and down the stack frames, and the where command may be used to display the stack frames. These commands are useful for examining the variables in the other frames:

    (sldb) where
    #0 ./buggy.sl:5:divide
    #1 ./buggy.sl:14:slsh_main
    (sldb) up
    #1 ./buggy.sl:14:slsh_main
    14      variable z = divide (x, y, i);
    (sldb) print x
    Integer_Type[5]
    (sldb) down
    #0 ./buggy.sl:5:divide
    5    return a[i] / b;
    (sldb) print z
    Integer_Type[5]

On some operating systems, the debugger's watchfpu command may be used to help isolate floating point exceptions. Consider the following example:

     define solve_quadratic (a, b, c)
     {
        variable d = b^2 - 4.0*a*c;
        variable x = -b + sqrt (d);
        return x / (2.0*a);
     }
     define print_root (a, b, c)
     {
        vmessage ("%f %f %f %f\n", a, b, c, solve_quadratic (a,b,c));
     }
     print_root (1,2,3);
Running it via slsh produces:
    1.000000 2.000000 3.000000 nan
Now run it in the debugger:
     <top-level> at ./example.sl:12
     11 print_root (1,2,3);
     (sldb) watchfpu FE_INVALID
     (sldb) cont
     *** FPU exception bits set: FE_INVALID
     Entering the debugger.
     solve_quadratic at ./example.sl:4
     4    variable x = -b + sqrt (d);
This shows that the NaN was produced on line 4, where sqrt is applied to the negative value d = 2^2 - 4.0*1*3 = -8.

The watchfpu command may be used to watch for the occurrence of any combination of the following exceptions

     FE_DIVBYZERO
     FE_INEXACT
     FE_INVALID
     FE_OVERFLOW
     FE_UNDERFLOW
by combining the desired names with the bitwise-or operator. For instance, to track both FE_INVALID and FE_OVERFLOW, use:
   (sldb) watchfpu FE_INVALID | FE_OVERFLOW
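
Once the source of a NaN has been located, the function can be made to fail loudly instead of returning it. The following is only a sketch of one way to do this for the solve_quadratic example; the choice of DomainError and the wording of the message are assumptions, not part of the original example.

```slang
define solve_quadratic (a, b, c)
{
   variable d = b^2 - 4.0*a*c;
   % Refuse to take the square root of a negative discriminant.
   if (d < 0)
     throw DomainError, sprintf ("negative discriminant: %g", d);
   variable x = -b + sqrt (d);
   return x / (2.0*a);
}

variable e;
try (e)
  {
     vmessage ("%f", solve_quadratic (1, 2, 3));
  }
catch DomainError:
  {
     vmessage ("no real root: %s", e.message);
  }
```

With this guard in place, an uncaught call such as solve_quadratic (1,2,3) raises an exception at the point of the problem, so the traceback mechanism described in section 20.1 can report it directly.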

