The language for the compiler I've written is based on the Turtle Procedure Notation (hereafter TPN) described in the text Turtle Geometry by Harold Abelson and Andrea diSessa. In fact, the two languages are functionally identical, except that mine omits the EXECUTE command and what the authors call structure-directed operations. For now, I'll call my language TGL (Turtle Geometry Language).
However, the syntax of the two languages is quite different, owing to a lack of time and experience on my part. Basically, I modified the syntax to make the language easier to parse; at some point, I hope to bring it more in line with TPN or with standard Logo. As it stands right now, TGL is far from being a thing of beauty; the only positive thing that can be said about it is that there is a compiler for it.
The syntax I use resembles that of C or Java; if you've programmed in either of those languages then you'll have no problems understanding my language (although there are a few quirks; see below).
The following program draws a sideways 'L' on the screen:
main() {
    FORWARD 100;
    LEFT 90;
    FORWARD 30;
}
The statements refer to the screen "turtle", which is essentially an invisible cursor. The program starts out with a blank screen, on which the turtle traces its path. The above program tells the turtle to move 100 units (the turtle starts out in the middle of the screen, pointing to the right), then turn left ninety degrees, and then move 30 units straight (which is now up).
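To make the geometry concrete, here is a minimal sketch in Python (not TGL) that traces the same three commands. It assumes the turtle starts at the origin heading east, that LEFT turns counterclockwise, and that y points up; the function name `trace` and the tuple-based command format are hypothetical and have nothing to do with how the compiler actually works.

```python
import math

def trace(commands, x=0.0, y=0.0, heading=0.0):
    """Return the list of points visited by a simulated turtle.

    Assumptions (not taken from the TGL compiler): the start is the
    origin, heading 0 means east, and LEFT turns counterclockwise.
    """
    points = [(x, y)]
    for name, amount in commands:
        if name == "FORWARD":
            x += amount * math.cos(math.radians(heading))
            y += amount * math.sin(math.radians(heading))
            # Round away tiny floating point noise from the trig calls.
            points.append((round(x, 6), round(y, 6)))
        elif name == "LEFT":
            heading += amount
        elif name == "RIGHT":
            heading -= amount
        else:
            raise ValueError("unknown command: " + name)
    return points

path = trace([("FORWARD", 100), ("LEFT", 90), ("FORWARD", 30)])
print(path)
```

The path ends 100 units to the right of the start and 30 units up, which is the sideways 'L' described above.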
The rules for naming a variable are similar to those of C and Java: a variable name must start with a letter, followed by any combination of letters, digits, and underscores. A keyword cannot be used as a variable name.
The symbol '=' is used for variable assignment; for example,
x = 44;
Unlike in TPN, semicolons are required at the end of statements.
There are four basic data types: floating point numbers, strings, boolean values, and lists. The language is dynamically typed, i.e. you don't (and can't) declare the type of a variable.
You can perform the following operations on numeric variables: addition (+), subtraction (-), multiplication (*), and division (/). There is no exponentiation operator yet.
You can define strings using double quotes:
my_string = "hello there";
You can concatenate strings using the + operator. There is no facility as yet to convert a string to a number. Thus, the statements
x = "2";
y = 3;
z = x * y;
will cause an error. Note that there is no compile-time checking for types. If you try an operation with illegal arguments, you won't find out until you run your code.
However, you can add a string to any data type, in which case the compiler converts the other type to a string and concatenates the two.
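The run-time rules for + and * described above can be sketched roughly as follows. This is a Python model, not the compiler's code: the helper names `tgl_add`, `tgl_mul`, and `is_number` are made up for illustration, and Python's `str()` merely stands in for whatever string conversion TGL really performs.

```python
def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def tgl_add(a, b):
    """Model of '+': concatenate if either side is a string."""
    if isinstance(a, str) or isinstance(b, str):
        # Assumption: str() approximates TGL's own conversion to a string.
        return str(a) + str(b)
    if is_number(a) and is_number(b):
        return a + b
    # Anything else (e.g. a boolean plus a number) is a run-time error.
    raise TypeError("'+' needs two numbers or at least one string")

def tgl_mul(a, b):
    """Model of '*': numbers only, no string-to-number conversion."""
    if is_number(a) and is_number(b):
        return a * b
    raise TypeError("'*' needs two numbers")

print(tgl_add("hello ", "there"))
print(tgl_add(2, 3))
```

Under this model, `tgl_mul("2", 3)` raises an error only when it is actually called, which mirrors the lack of compile-time type checking.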
Boolean variables take the values of TRUE and FALSE. There is no conversion between floating point numbers and boolean values. You can convert a boolean to a string.
Boolean operations are AND, OR (inclusive or), and NOT. The binding of these operators should be as in C or Java: NOT binds the tightest, then AND, then OR. Use parentheses to group expressions.
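As a quick check of what that binding order means, here is a small Python sketch. Python's own `not`, `and`, and `or` follow the same ordering (NOT tightest, then AND, then OR), so they can stand in for the TGL operators, assuming TGL really does bind them as described. The function names are purely illustrative.

```python
from itertools import product

def implicit(a, b, c):
    # No parentheses: relies on NOT > AND > OR precedence.
    return not a and b or c

def explicit(a, b, c):
    # The grouping that precedence implies, written out in full.
    return ((not a) and b) or c

def regrouped(a, b, c):
    # A different grouping, to show that parentheses change the result.
    return not (a and (b or c))

mismatches = [
    (a, b, c)
    for a, b, c in product([False, True], repeat=3)
    if implicit(a, b, c) != explicit(a, b, c)
]
print(mismatches)
print(implicit(False, False, False), regrouped(False, False, False))
```

The first two functions agree on every input, while the third differs, which is why parentheses are worth adding whenever the intended grouping is anything other than the default.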
The operators >, <, and = (not ==) can be applied to floating point numbers; each comparison yields a boolean value. Note that = therefore serves both as the assignment operator and as the equality test.
Lists are created using bracket notation:
x = [1, 1, 2, 3, 5, 8];
Unlike in TPN, commas are required to delimit list entries. The entries of a list can be any expression that evaluates to one of the data types, including other lists.
Here are the operations on lists.
For example, if
y = [ [4, 2], 7, 13 ];
then FIRST(y) is [4, 2]; REST(FIRST(y)) is [4]; ITEM(y, 1) is 7; and SIZE(y) is 3.
There is no way to concatenate lists, or to dynamically grow them.
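For readers who think in terms of another language, the following Python sketch models three of the list operations. Only FIRST, ITEM, and SIZE are modeled, and their behavior is inferred purely from the example values above (in particular, ITEM is assumed to be zero-based because ITEM(y, 1) is 7); it is not taken from the compiler.

```python
def FIRST(lst):
    # The first entry, which may itself be a list.
    return lst[0]

def ITEM(lst, index):
    # Assumed zero-based, inferred from ITEM(y, 1) being 7.
    return lst[index]

def SIZE(lst):
    # Counts top-level entries only; a nested list counts as one entry.
    return len(lst)

y = [[4, 2], 7, 13]
print(FIRST(y), ITEM(y, 1), SIZE(y))
```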
Here are the commands to manipulate the turtle:
The default state of the turtle is to have the pen down. The turtle begins a program in the middle of the screen, pointing due east.
Unlike C or Java syntax (and the rest of TGL), the arguments of the turtle commands are not parenthesized.
The control statements work like their brethren in Java and C, with a few key exceptions. I made no attempt to match the syntax of TPN here. In particular, braces are always required around statement lists (TPN uses the indentation level).
While loop: if the expression is TRUE, execute the statement list, then re-evaluate the expression; keep executing the statement list until the expression becomes FALSE.
IF statement: if the expression is TRUE, execute the statement list. Note that braces are always required, even if the statement list is just a single statement. This requirement eases the parsing burden by removing the potential ambiguity over which IF an ELSE binds to.
IF ... ELSE statement: if the expression is TRUE, execute the first statement list. Otherwise, execute the second statement list.
Note that by these rules, the following syntax is (unfortunately) illegal:
IF ( expr ) {
    …
} ELSE IF ( expr ) {
    …
} ELSE {
    …
}
Instead, you'll need
IF ( expr ) {
    …
} ELSE {
    IF ( expr ) {
        …
    } ELSE {
        …
    }
}
This inadequate state of affairs will continue until I add an ELSEIF statement or drop the requirement that braces surround the statement list.
As expected, the FOR command executes the first assignment and then checks the boolean expression. If the expression is TRUE, the statement list is executed. Then the second assignment is executed, the boolean expression is re-checked, and the cycle repeats.
One unfortunate consequence of the way things are set up: an assignment statement requires a semicolon terminator, so there is a trailing semicolon before the closing parenthesis (unlike Java or C). For example:
FOR (i=0; i<10; i=i+1;) {
    FORWARD i;
    RIGHT i;
}
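The evaluation order of the FOR loop can be spelled out as an ordinary while loop. The Python sketch below is only a model of that order (first assignment, test, statement list, second assignment, test again); the helper `run_for`, the `env` dictionary, and the `commands` log are hypothetical names introduced for illustration, and the turtle commands are recorded rather than drawn.

```python
def run_for(init, condition, step, body):
    """Run a loop in FOR order: init, test, body, step, test, ..."""
    env = {}
    init(env)
    while condition(env):
        body(env)
        step(env)
    return env

commands = []  # records turtle commands instead of drawing them

def body(env):
    commands.append(("FORWARD", env["i"]))
    commands.append(("RIGHT", env["i"]))

env = run_for(
    init=lambda env: env.update(i=0),             # i = 0;
    condition=lambda env: env["i"] < 10,          # i < 10;
    step=lambda env: env.update(i=env["i"] + 1),  # i = i + 1;
    body=body,
)
print(len(commands), env["i"])
```

The statement list runs ten times, for i from 0 through 9, and the loop variable is left at 10 once the test fails.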
The syntax for a procedure definition is:
procedure_name ( argument list ) { statement list }
where an argument list is either empty or a comma-delimited sequence of variable names. For example, the following procedure draws an arc of a circle through a given angle.
arc( r, angle ) {
    FOR ( i=0; i<angle; i=i+1; ) {
        FORWARD r;
        RIGHT 1;
    }
}
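It may help to see what this procedure does geometrically. The Python sketch below mirrors the loop (a step of length r followed by a one-degree right turn, repeated `angle` times) and measures how far the turtle ends up from its start. It assumes a turtle starting at the origin heading east, with RIGHT meaning a clockwise turn, and a whole-number angle; none of this is taken from the compiler itself.

```python
import math

def arc(r, angle, x=0.0, y=0.0, heading=0.0):
    """Mirror the loop above: `angle` repetitions of FORWARD r; RIGHT 1."""
    for _ in range(angle):
        x += r * math.cos(math.radians(heading))
        y += r * math.sin(math.radians(heading))
        heading -= 1  # RIGHT is taken to mean a clockwise turn
    return x, y, heading

x, y, heading = arc(10, 180)
distance = math.hypot(x, y)

# Diameter of the circle through the corners of a regular 360-sided
# polygon whose sides have length 10.
expected = 10 / math.sin(math.radians(0.5))
print(round(distance, 2), heading)
```

In other words, r acts as a step length rather than a radius: with r = 10, a half turn carries the turtle roughly 1146 units from where it started.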
A procedure can be invoked as part of a statement or expression. For example, to draw a racetrack:
FORWARD 800;
arc( 10, 180 );
FORWARD 800;
arc( 10, 180 );
If a procedure is used in an expression, it has to return a value; this is done using the RETURN command:
RETURN expression;
For example,
sum (x, y, z) {
    RETURN x + y + z;
}
x = sum(1, 2, 3);
This is all very C- and Java-like. A major difference is that there is no compile-time checking of arguments or return values. For example, with the procedure sum above, there is nothing to stop you from invoking it with two arguments (which would cause a run-time crash) or four (the fourth argument would be ignored). Likewise, there is nothing to stop you from passing strings as arguments (which would concatenate the strings) or booleans (which would crash).
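The argument-handling behavior just described can be modeled in a few lines of Python. This is a sketch of the observable behavior only: the names `call_procedure` and `sum_body` are hypothetical, and whether the real run-time fails at the call itself or later, when the missing variable is first used, is an assumption here.

```python
def call_procedure(params, body, args):
    """Bind args to params by position, then run body on the bindings.

    Assumed behavior, following the description above: missing
    arguments fail at run time, surplus arguments are silently dropped.
    """
    if len(args) < len(params):
        raise RuntimeError("too few arguments")
    env = dict(zip(params, args))  # zip stops at the shorter sequence
    return body(env)

def sum_body(env):
    # Stand-in for: RETURN x + y + z;
    return env["x"] + env["y"] + env["z"]

params = ["x", "y", "z"]
print(call_procedure(params, sum_body, [1, 2, 3]))
print(call_procedure(params, sum_body, [1, 2, 3, 4]))
print(call_procedure(params, sum_body, ["a", "b", "c"]))
```

Calling it with only two arguments raises an error, but only once that call is actually executed.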
You can call procedures recursively.
You must have a procedure called "main"; this is the procedure that first executes when the program is launched. It doesn't matter whether you place other procedures before or after the main procedure.
You'll run into three types of errors when writing turtle programs with this compiler. First, there are syntax errors, discovered by the parser. These are triggered if your code does not follow the language syntax described above. You'll get a message that's only partially understandable (the message is generated by the JavaCC tool, FYI). When I get those, I just focus on the line number and try to figure out what's wrong by looking at the code.
Run-time errors occur if you apply an operation to invalid arguments, like trying to add a boolean and a number, or calling FORWARD on a string. You'll also get run-time errors for function parameter or return-type mismatches; for example, if you use a function in an expression but the function doesn't return a value. These errors are only detected when the offending code executes. You'll get an error message indicating the line number of the source code where the error occurred.
Finally, there are compiler errors, which are mistakes in the compiler code. Not much you can do here, except to notify me. Sorry.
©2000-2011 James D'Ambrosia