There are essentially two classes of functions that may be called from the interpreter: intrinsic functions and S-Lang functions.
An intrinsic function is one that is implemented in C or some other
compiled language and is callable from the interpreter. Nearly all
of the built-in functions are of this variety. At the moment the
basic interpreter provides nearly 300 intrinsic functions. Examples
include the trigonometric functions sin and cos, string functions
such as strcat, etc. Dynamically loaded modules such as the png and
pcre modules add additional intrinsic functions.
The other type of function is written in S-Lang and is known simply as a ``S-Lang function''. Such a function may be thought of as a group of statements that work together to perform a computation. The specification of such functions is the main subject of this chapter.
Like variables, functions must be declared before they can be used. The
define keyword is used for this purpose. For example,
define factorial ();
is sufficient to declare a function named factorial. Unlike the
variable keyword used for declaring variables, the define keyword does
not accept a list of names.
Usually, the above form is used only for recursive functions. In most cases, the function name is followed by a parameter list and the body of the function:
define function-name (parameter-list)
{
   statement-list
}
The function-name is an identifier and must conform to the naming
scheme for identifiers discussed in the chapter on Identifiers. The
parameter-list is a comma-separated list of variable names that
represent parameters passed to the function, and may be empty if no
parameters are to be passed. The variables in the parameter-list are
implicitly declared, so there is no need to declare them via a
variable declaration statement. In fact any attempt to do so will
result in a syntax error.
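As a minimal sketch of where the declaration-only form is useful, consider two mutually recursive functions. The names is_even and is_odd are hypothetical and chosen only for illustration; the sketch assumes non-negative integer arguments.

```slang
define is_odd ();            % declaration only: is_odd is used before it is defined

define is_even (n)
{
   if (n == 0) return 1;
   return is_odd (n - 1);    % legal because is_odd was declared above
}

define is_odd (n)
{
   if (n == 0) return 0;
   return is_even (n - 1);
}

variable e = is_even (10);   % e is 1
variable o = is_odd (7);     % o is 1
```

Without the first line, the body of is_even would refer to a name that the interpreter has not yet seen.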
The body of the function is enclosed in braces and consists of zero or more statements (statement-list). While there are no imposed limits upon the number of statements that may occur within a S-Lang function, it is considered poor programming practice if a function contains many statements. This notion stems from the belief that a function should have a simple, well-defined purpose.
Parameters to a function are always passed by value and never by reference. To see what this means, consider
define add_10 (a)
{
   a = a + 10;
}
variable b = 0;
add_10 (b);
Here a function add_10 has been defined, which when executed, adds 10
to its parameter. A variable b has also been declared and initialized
to zero before being passed to add_10. What will be the value of b
after the call to add_10? If S-Lang were a language that passed
parameters by reference, the value of b would be changed to 10.
However, S-Lang always passes by value, which means that b will retain
its value during and after the function call.
S-Lang does provide a mechanism for simulating pass by reference via the reference operator. This is described in greater detail in the next section.
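Without using references, one simple way to get the modified value back to the caller is to return it. The following is a sketch of that idea; the variable c is introduced here only for illustration.

```slang
define add_10 (a)
{
   a = a + 10;               % changes only the function's local copy
   return a;
}

variable b = 0;
variable c = add_10 (b);     % b is still 0; c is 10
```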
If a function is called with a parameter in the parameter list
omitted, the corresponding variable in the function will be set to
NULL. To make this clear, consider the function
define add_two_numbers (a, b)
{
   if (a == NULL) a = 0;
   if (b == NULL) b = 0;
   return a + b;
}
This function must be called with two parameters. However, either
of them may be omitted by calling the function in one of the following
ways:
variable s = add_two_numbers (2,3);
variable s = add_two_numbers (2,);
variable s = add_two_numbers (,3);
variable s = add_two_numbers (,);
The first example calls the function using both parameters, but
at least one of the parameters was omitted in the other
examples. If the parser recognizes that a parameter has been
omitted by finding a comma or right-parenthesis where a value is
expected, it will substitute NULL for the missing value. This means
that the parser will convert the latter three statements in the
above example to:
variable s = add_two_numbers (2, NULL);
variable s = add_two_numbers (NULL, 3);
variable s = add_two_numbers (NULL, NULL);
It is important to note that this mechanism is available only for
function calls that specify more than one parameter. That is,
variable s = add_10 ();
is not equivalent to add_10(NULL). The reason for this is simple: the
parser can only tell whether or not NULL should be substituted by
looking at the position of the comma character in the parameter list,
and only function calls that indicate more than one parameter will use
a comma. A mechanism for handling single parameter function calls is
described later in this chapter.
The usual way to return values from a function is via the return
statement. This statement has the simple syntax
return expression-list ;
where expression-list is a comma separated list of expressions.
If a function does not return any values, the expression list
will be empty. A simple example of a function that can return
multiple values (two in this case) is:
define sum_and_diff (x, y)
{
   variable sum, diff;
   sum = x + y; diff = x - y;
   return sum, diff;
}
In the previous section an example of a function returning two values was given. That function can also be written somewhat more simply as:
define sum_and_diff (x, y)
{
   return x + y, x - y;
}
This function may be called using
(s, d) = sum_and_diff (12, 5);
After the above line is executed, s will have a value of 17 and the
value of d will be 7.
The most general form of the multiple assignment statement is
( var_1, var_2, ..., var_n ) = expression;
Here expression is an arbitrary expression that leaves n items on the
stack, and var_k represents an l-value object (permits assignment).
The assignment statement removes those values and assigns them to the
specified variables. Usually, expression is a call to a function that
returns multiple values, but it need not be. For example,
(s,d) = (x+y, x-y);
produces results that are equivalent to the call to the sum_and_diff
function. Another common use of the multiple assignment statement is
to swap values:
(x,y) = (y,x);
(a[i], a[j], a[k]) = (a[j], a[k], a[i]);
If an l-value is omitted from the list, then the corresponding value will be removed from the stack. For example,
(s, ) = sum_and_diff (9, 4);
assigns the sum of 9 and 4 to s and the difference (9-4) is removed
from the stack. Similarly,
() = fputs ("good luck", fp);
causes the return value of the fputs function to be discarded.
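The following sketch puts these forms together. The function min_max and the variables lo and hi are hypothetical names used only for illustration; min and max are the intrinsic functions that operate on arrays.

```slang
define min_max (a)
{
   return min (a), max (a);             % two values are left on the stack
}

variable lo, hi;
(lo, hi) = min_max ([3, 1, 4, 1, 5]);   % lo is 1, hi is 5
(lo, ) = min_max ([7, 2, 9]);           % lo is 2; the maximum is discarded
```

The values are assigned in the order in which they were returned, so the first l-value receives the first expression of the return statement.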
It is possible to create functions that return a variable number of values instead of a fixed number. Although such functions are discouraged, it is easy to cope with them. Usually, the value at the top of the stack will indicate the actual number of return values. For such functions, the multiple assignment statement cannot directly be used. To see how such functions can be dealt with, consider the following function:
define read_line (fp)
{
   variable line;
   if (-1 == fgets (&line, fp))
     return -1;
   return (line, 0);
}
This function returns either one or two values, depending upon the
return value of fgets. Such a function may be handled using:
status = read_line (fp);
if (status != -1)
{
   s = ();
   .
   .
}
In this example, the last value returned by read_line is assigned to
status and then tested. If it is not -1, the second return value is
assigned to s. In particular note the empty set of parentheses in the
assignment to s. This simply indicates that whatever is on the top of
the stack when the statement is executed will be assigned to s.
One can achieve the effect of passing by reference by using the
reference (&) and dereference (@) operators. Consider again the
add_10 function presented in the previous section. This time it is
written as:
define add_10 (a)
{
   @a = @a + 10;
}
variable b = 0;
add_10 (&b);
The expression &b creates a reference to the variable b and it is the
reference that gets passed to add_10. When the function add_10 is
called, the value of the local variable a will be a reference to the
variable b. It is only by dereferencing this value that b can be
accessed and changed. So, the statement @a=@a+10 should be read as
``add 10 to the value of the object that a references and assign the
result to the object that a references''.
The reader familiar with C will note the similarity between references in S-Lang and pointers in C.
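As a further sketch of the same technique, a function that exchanges the values of two variables can be written with references. The function name swap and the variables p and q are hypothetical and used only for illustration.

```slang
define swap (a, b)
{
   variable tmp = @a;        % value of the variable that a references
   @a = @b;
   @b = tmp;
}

variable p = 1, q = 2;
swap (&p, &q);               % now p is 2 and q is 1
```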
References are not limited to variables. A reference to a function may also be created and passed to other functions. As a simple example from elementary calculus, consider the following function which returns an approximation to the derivative of another function at a specified point:
define derivative (f, x)
{
   variable h = 1e-6;
   return ((@f)(x+h) - (@f)(x)) / h;
}
define x_squared (x)
{
   return x^2;
}
dydx = derivative (&x_squared, 3);
When the derivative function is called, the local variable f will be
a reference to the x_squared function. The x_squared function is
called with the specified parameters by dereferencing f with the
dereference operator.
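The same calling syntax works for any function that receives a function reference. A minimal sketch, using the hypothetical names apply_twice and add_3:

```slang
define apply_twice (f, x)
{
   return (@f)((@f)(x));     % call the referenced function on its own result
}

define add_3 (x)
{
   return x + 3;
}

variable y = apply_twice (&add_3, 4);   % y is 10
```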
S-Lang functions may be called with a variable number of arguments.
A natural example of such functions is the strcat function, which
takes one or more string arguments and returns the concatenated
result. An example of a different sort is the strtrim function, which
removes both leading and trailing whitespace from a string. In this
case, when called with one argument (the string to be ``trimmed''),
the characters that are considered to be whitespace are those in the
character set that have the whitespace property (space, tab, newline,
...). However, when called with two arguments, the second argument
may be used to specify the characters that are to be considered as
whitespace. The strtrim function exemplifies a class of variadic
functions where the additional arguments are used to pass optional
information to the function. Another more flexible and powerful way
of passing optional information is through the use of qualifiers,
which is the subject of the next section.
When a S-Lang function is called with parameters, those parameters
are placed on the run-time stack. The function accesses those
parameters by removing them from the stack and assigning them to the
variables in its parameter list. The details of this operation
are for the most part hidden from the programmer. But what happens
when the number of parameters in the parameter list is not equal to
the number of parameters passed to the function? If the number
passed to the function is less than what the function expects, a
StackUnderflow error could result as the function tries to remove
items from the stack. If the number passed is greater than the number
in the parameter list, then the extras will remain on the stack. The
latter feature makes it possible to write functions that take a
variable number of arguments.
Consider the add_10 example presented earlier. This time it is written
define add_10 ()
{
   variable x;
   x = ();
   return x + 10;
}
variable s = add_10 (12); % ==> s = 22;
For the uninitiated, this example looks as if it is destined for
disaster. The add_10 function appears to accept zero arguments, yet
it was called with a single argument. On top of that, the assignment
to x might look a bit strange. The truth is, the code presented in
this example makes perfect sense, once you realize what is happening.
First, consider what happens when add_10 is called with the parameter
12. Internally, 12 is pushed onto the stack and then the function
called. Now, consider the function add_10 itself. In it, x is a local
variable. The strange looking assignment `x=()' causes whatever is on
the top of the stack to be assigned to x. In other words, after this
statement, the value of x will be 12, since 12 is at the top of the
stack.
A generic function of the form
define function_name (x, y, ..., z)
{
   .
   .
}
is transformed internally by the parser to something akin to
define function_name ()
{
   variable x, y, ..., z;
   z = ();
   .
   .
   y = ();
   x = ();
   .
   .
}
before further parsing. (The add_10 function, as defined above, is
already in this form.) With this knowledge in hand, one can write a
function that accepts a variable number of arguments. Consider the
function:
define average_n (n)
{
   variable x, y;
   variable s;
   if (n == 1)
   {
      x = ();
      s = x;
   }
   else if (n == 2)
   {
      y = ();
      x = ();
      s = x + y;
   }
   else throw NotImplementedError;
   return s / n;
}
variable ave1 = average_n (3.0, 1); % ==> 3.0
variable ave2 = average_n (3.0, 5.0, 2); % ==> 4.0
Here, the last argument passed to average_n is an integer reflecting
the number of quantities to be averaged. Although this example works
fine, its principal limitation is obvious: it only supports one or
two values. Extending it to three or more values by adding more
else if constructs is rather straightforward but hardly worth the
effort. There must be a better way, and there is:
define average_n (n)
{
   variable s, x;
   s = 0;
   loop (n)
   {
      x = (); % get next value from stack
      s += x;
   }
   return s / n;
}
The principal limitation of this approach is that one must still
pass an integer that specifies how many values are to be averaged.
Fortunately, a special variable exists that is local to every function
and contains the number of values that were passed to the function.
That variable has the name _NARGS and may be used as follows:
define average_n ()
{
   variable x, s = 0;
   if (_NARGS == 0)
     usage ("ave = average_n (x, ...);");
   loop (_NARGS)
   {
      x = ();
      s += x;
   }
   return s / _NARGS;
}
Here, if no arguments are passed to the function, the usage function
will generate a UsageError exception along with a simple message
indicating how to use the function.
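The same variable can be used to give a function an optional trailing argument, in the style of strtrim. The following is a minimal sketch; the function greet and its arguments are hypothetical and serve only to illustrate the technique.

```slang
define greet ()
{
   variable name, greeting = "Hello";      % default for the optional argument

   if ((_NARGS < 1) or (_NARGS > 2))
     usage ("str = greet (name [, greeting]);");

   if (_NARGS == 2)
     greeting = ();                        % the last argument is on top of the stack
   name = ();

   return greeting + ", " + name;
}

variable g1 = greet ("World");             % "Hello, World"
variable g2 = greet ("World", "Goodbye");  % "Goodbye, World"
```

Because the arguments are removed from the stack in the reverse of the order in which they were passed, the optional argument is taken first.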
One way to pass optional information to a function is to do so using the variable arguments mechanism described in the previous section. However, a much more powerful mechanism is through the use of qualifiers, which were added in version 2.1.
To illustrate the use of qualifiers, consider a graphics application
that defines a function called plot that plots a set of (x,y) values
specified as 1-d arrays:
plot(x,y);
Suppose that when called in the above manner, the application will
plot the data as black points. But instead of black points, one
might want to plot the data using a red diamond as the plot symbol.
It would be silly to have a separate function such as
plot_red_diamond for this purpose. A much better way to achieve this
functionality is through the use of qualifiers:
plot(x,y ; color="red", symbol="diamond");
Here, a single semicolon is used to separate the argument-list
proper (x,y) from the list of qualifiers. In this case, the
qualifiers are ``color'' and ``symbol''. The order of the
qualifiers is unimportant; the function could just as well have been
called with the symbol qualifier listed first.
Now consider the implementation of the plot function:
define plot (x, y)
{
   variable color = qualifier ("color", "black");
   variable symbol = qualifier ("symbol", "point");
   variable symbol_size = qualifier ("size", 1.0);
   .
   .
}
Note that the qualifiers are not handled in the parameter list;
rather they are handled in the function body using the qualifier
function, which is used to obtain the value of the qualifier. The
second argument to the qualifier function specifies the default value
to be used if the function was not called with the specified
qualifier. Also note that the variable associated with the qualifier
need not have the same name as the qualifier.
A qualifier need not have a value; its mere presence may be used to enable or disable a feature or trigger some action. For example,
plot (x, y; connect_points);
specifies a qualifier called connect_points that indicates that a
line should be drawn between the data points. The presence of such a
qualifier can be detected using the qualifier_exists function:
define plot (x,y)
{
   .
   .
   variable connect_points = qualifier_exists ("connect_points");
   .
   .
}
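Both kinds of qualifier can be seen in a small runnable sketch. The function scale and its qualifiers factor and negate are hypothetical and are not part of any plotting library.

```slang
define scale (x)
{
   variable factor = qualifier ("factor", 2);   % 2 is used when no factor is given
   if (qualifier_exists ("negate"))             % a qualifier without a value
     factor = -factor;
   return factor * x;
}

variable a1 = scale (5);                        % 10
variable a2 = scale (5; factor=3);              % 15
variable a3 = scale (5; factor=3, negate);      % -15
```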
Sometimes it is useful for a function to pass the qualifiers that it
has received to other functions. Suppose that the plot function calls
draw_symbol to plot the specified symbol at a particular location and
that it requires the symbol attributes to be specified using
qualifiers. Then the plot function might look like:
define plot (x, y)
{
   variable color = qualifier ("color", "black");
   variable symbol = qualifier ("symbol", "point");
   variable symbol_size = qualifier ("size", 1.0);
   .
   .
   _for i (0, length(x)-1, 1)
     draw_symbol (x[i],y[i]
                  ;color=color, size=symbol_size, symbol=symbol);
   .
   .
}
The problem with this approach is that it does not scale well: the
plot function has to be aware of all the qualifiers that the
draw_symbol function takes and explicitly pass them. In many cases
this can be quite cumbersome and error prone. Rather it is better to
simply pass the qualifiers that were passed to the plot function on
to the draw_symbol function. This may be achieved using the
__qualifiers function. The __qualifiers function returns the list of
qualifiers in the form of a structure whose field names are the same
as the qualifier names. In fact, the use of this function can
simplify the implementation of the plot function, which may be coded
more simply as
define plot (x, y)
{
   variable i;
   _for i (0, length(x)-1, 1)
     draw_symbol (x[i],y[i] ;; __qualifiers());
}
Note the syntax is slightly different. The two semicolons indicate
that the qualifiers are specified not as name-value pairs, but as a
structure. Using a single semicolon would have created a qualifier
called __qualifiers, which is not what was desired.
As alluded to above, an added benefit of this approach is that the
plot function does not need to know nor care about the qualifiers
supported by draw_symbol. When called as
plot (x, y; symbol="square", size=2.0, fill=0.8);
the fill qualifier would get passed to the draw_symbol function to
specify the ``fill'' value to be used when creating the symbol.
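To experiment with qualifier forwarding without a graphics library, the two functions can be replaced by toy stand-ins that build strings instead of drawing. The bodies below are assumptions made purely for illustration, and the exact text produced depends on the sprintf formats chosen here.

```slang
% Toy stand-in: describes the symbol instead of drawing it.
define draw_symbol (x, y)
{
   variable symbol = qualifier ("symbol", "point");
   variable size = qualifier ("size", 1.0);
   return sprintf ("%s of size %g at (%g,%g)", symbol, size, x, y);
}

% Forwards whatever qualifiers it received, without inspecting them.
define plot (x, y)
{
   variable i, lines = String_Type[length (x)];
   _for i (0, length (x)-1, 1)
     lines[i] = draw_symbol (x[i], y[i] ;; __qualifiers ());
   return lines;
}

variable out = plot ([1.0, 2.0], [3.0, 4.0]; symbol="square", size=2.0);
% out[0] is "square of size 2 at (1,3)"
```

When plot is called without qualifiers, the stand-in draw_symbol simply falls back to its defaults.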
An exit-block is a set of statements that get executed when a
function returns. They are very useful for cleaning up when a
function returns via an explicit call to return from deep within a
function.
An exit-block is created by using the EXIT_BLOCK keyword according to
the syntax
EXIT_BLOCK { statement-list }
where statement-list represents the list of statements that
comprise the exit-block. The following example illustrates the use
of an exit-block:
define simple_demo ()
{
   variable n = 0;
   EXIT_BLOCK { message ("Exit block called."); }
   forever
   {
      if (n == 10) return;
      n++;
   }
}
Here, the function contains an exit-block and a forever loop. The
loop will terminate via the return statement when n is 10. Before it
returns, the exit-block will get executed.
A function can contain multiple exit-blocks, but only the last one encountered during execution will actually get used. For example,
define simple_demo (n)
{
   EXIT_BLOCK { return 1; }
   if (n != 1)
   {
      EXIT_BLOCK { return 2; }
   }
   return;
}
If 1 is passed to this function, the first exit-block will
get executed because the second one would not have been encountered
during the execution. However, if some other value is passed, the
second exit-block would get executed. This example also
illustrates that it is possible to explicitly return from an
exit-block, but nested exit-blocks are illegal.
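A typical cleanup use is closing a file no matter which return statement ends the function. The following is a sketch under the assumption that returning -1 is an acceptable way to signal that the file could not be opened; the function name count_lines is hypothetical.

```slang
define count_lines (file)
{
   variable n = 0, line;
   variable fp = fopen (file, "r");
   if (fp == NULL)
     return -1;                         % no exit-block encountered yet: nothing to close

   EXIT_BLOCK { () = fclose (fp); }     % runs on every return below this point

   while (-1 != fgets (&line, fp))
     n++;
   return n;
}
```

Since the exit-block is established only after fopen has succeeded, the early return does not attempt to close a file that was never opened.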
The most important rule to remember in calling a function is that if the function returns a value, the caller must do something with it. While this might sound like a trivial statement, it is the number one issue that trips up novice users of the language.
To elaborate on this point further, consider the fputs function, which writes a string to a file descriptor. This function can fail when, e.g., a disk is full, or the file is located on a network share and the network goes down, etc.
S-Lang supports two mechanisms that a function may use to report a failure: raising an exception or returning a status code. The latter mechanism is used by the S-Lang fputs function, i.e., it returns a value to indicate whether or not it was successful. Many users familiar with this function either seem to forget this fact, or assume that the function will succeed and do not bother handling the return value. While some languages silently remove such values from the stack, S-Lang regards the stack as a dynamic data structure that programs can utilize. As a result, the value will be left on the S-Lang stack and can cause problems later on.
There are a number of correct ways of ``doing something'' with the
return value from a function. Of course the recommended procedure
is to use the return value as it was meant to be used. In the case
of fputs, the proper thing to do is to check the return value, e.g.,
if (-1 == fputs ("good luck", fp))
{
   % Handle the error
}
Other acceptable ways to ``do something'' with the return value
include assigning it to a dummy variable,
dummy = fputs ("good luck", fp);
or simply ``popping'' it from the stack:
fputs ("good luck", fp); pop();
The latter mechanism can also be written as
() = fputs ("good luck", fp);
The last form is a special case of the multiple assignment
statement, which was discussed earlier. Since this
form is simpler than assigning the value to a dummy variable or
explicitly calling the pop function, it is recommended over
the other two mechanisms. Finally, this form has the
redeeming feature that it presents a visual reminder that the
function is returning a value that is not being used.
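The effect of an ignored return value can be observed with the _stkdepth intrinsic, which reports the number of objects currently on the stack. The following is a small sketch; the functions three and stack_demo are hypothetical names used only for illustration.

```slang
define three ()
{
   return 3;
}

define stack_demo ()
{
   variable before, leaked, clean;

   before = _stkdepth ();
   three ();                          % return value ignored: it stays on the stack
   leaked = _stkdepth () - before;
   pop ();                            % remove the stray value

   () = three ();                     % value discarded immediately
   clean = _stkdepth () - before;

   return leaked, clean;
}

variable leaked, clean;
(leaked, clean) = stack_demo ();      % leaked is 1, clean is 0
```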