TkScript

reference guide | Scanner


 

1. Introduction

The scanner chops up source files into tokens. This is done as a preprocessing step before the resulting token array is handed to the parser.
 
A token is a single reserved keyword, operator, identifier or character sequence.
 
A pool of unique Strings (identifiers and string constants) is built during the scanner pass.
 
The scanner keeps track of line numbers so that tokens (and later parser tree nodes) can be mapped back to source lines and modules.
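
The pool of unique strings described above can be pictured with a small Python sketch. It is only an illustration of the idea, not TkScript's implementation; the class name `StringPool` and the method `intern` are hypothetical.

```python
class StringPool:
    """Hypothetical pool: each distinct string is stored once and gets an index."""

    def __init__(self):
        self._index = {}   # string -> position in self.strings
        self.strings = []  # unique strings in first-seen order

    def intern(self, text: str) -> int:
        """Return the pool index of `text`, adding it on first use."""
        idx = self._index.get(text)
        if idx is None:
            idx = len(self.strings)
            self._index[text] = idx
            self.strings.append(text)
        return idx


pool = StringPool()
first = pool.intern("hello")
second = pool.intern("world")
again = pool.intern("hello")  # same index as `first`, nothing new is stored
```

Tokens can then refer to identifiers and string constants by index instead of carrying their own copies.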
2. Charset
Script sources must use the 8bit / ASCII format.
2.1. Escape sequences
Sequence   Description
\\         backslash
\'         single quotation mark
\"         double quotation mark
\n         linefeed
\r         carriage return
\t         tabulator
\v         vertical tabulator
\f         form feed
\b         backspace
\a         alert (beep/flash screen)
\e         ESC character (decimal 27, octal 033)
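
As a rough illustration of what these sequences decode to, here is a minimal Python sketch. The mapping follows the table; the function `decode_escapes` and its handling of unknown sequences (kept verbatim) are assumptions, not documented TkScript behavior.

```python
# Escape letters from the table, mapped to the characters they stand for.
ESCAPES = {
    "\\": "\\",
    "'": "'",
    '"': '"',
    "n": "\n",
    "r": "\r",
    "t": "\t",
    "v": "\v",
    "f": "\f",
    "b": "\b",
    "a": "\a",
    "e": "\x1b",  # ESC, decimal 27
}


def decode_escapes(raw: str) -> str:
    """Replace each backslash sequence in `raw` with the character it names."""
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            # Assumption: unknown sequences are copied through unchanged.
            out.append(ESCAPES.get(nxt, ch + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, `decode_escapes(r"\e[2J")` yields the ESC character followed by `[2J`.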
2.1.1. Escaping embedded substrings
Strings can contain "quoted" substrings. Substrings should be surrounded by single or double quotation marks. Example:

String words1 = "\'hello\' \', \' \'world\'";
String words2 = '\"hello\" \", \" \"world\"';
trace "words1 = " + #(words1.splitSpace(true)); // true=scan for substrings
trace "words2 = " + #(words2.splitSpace(true));
2.1.2. Linefeed character in print/trace statements
The linefeed character '\n' at the end of a String printed using the print or trace statements can be omitted; it will be added automatically.
 
Example:

print "hello, world."; // print statement will add linefeed automatically

 
You can use stdout to print the string as-is (i.e. without any linefeed added). Example:

stdout "hello, world.\n"; // print string as-is
3. Identifiers, keywords, literals and constants
Identifiers are used to define unique names for variables, classes, functions and constants. Identifiers are case-sensitive which means that e.g. MyVariable and myvariable are clearly distinguished.
 
The first char of an identifier has to be a letter [a-zA-Z] or the underscore _; identifiers must not contain delimiters and operators; reserved keywords may also not be used as identifiers. The length should not exceed 128 characters.
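
The identifier rules can be summarized in a short Python sketch. It makes two assumptions the text does not state: characters after the first may also be digits, and the 128-character recommendation is treated as a hard limit. Only a handful of reserved keywords are listed, for illustration.

```python
import re

# First character: letter or underscore. Assumption: later characters may
# also be digits; delimiters and operators are excluded by the pattern.
IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
MAX_LEN = 128  # treated as a hard limit in this sketch

# Illustrative subset of the reserved keywords, not the full list.
SOME_KEYWORDS = {"int", "float", "class", "while", "print"}


def is_valid_identifier(name: str) -> bool:
    """Check `name` against the rules sketched above."""
    return (
        len(name) <= MAX_LEN
        and IDENT_RE.fullmatch(name) is not None
        and name not in SOME_KEYWORDS
    )
```

Because the check is case-sensitive, `MyVariable` and `myvariable` both pass and remain two different names.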
4. Delimiter chars
The source scanner uses the following delimiter chars when tokenizing a source script:
= > < == <= >= != , ! && || ++ -- + - * / & | ^ % << >> += -= *= /= 
&= |= ^= %= <<= >>= ( ) { } #[ [ ] ; 
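
To show how such a delimiter list can drive tokenization, here is a simplified Python sketch. It is not the TkScript scanner: the function `tokenize` is hypothetical, it ignores string literals, comments and the `$`/`#` number prefixes, and it assumes longer delimiters win over shorter ones.

```python
DELIMS = (
    "= > < == <= >= != , ! && || ++ -- + - * / & | ^ % << >> += -= *= /= "
    "&= |= ^= %= <<= >>= ( ) { } #[ [ ] ;"
).split()
# Try longer delimiters first so that "<<=" is not split into "<<" and "=".
DELIMS_BY_LENGTH = sorted(DELIMS, key=len, reverse=True)


def tokenize(src: str):
    """Return (line_number, text) pairs for words and delimiters in `src`."""
    tokens = []
    line = 1
    word = ""
    i = 0

    def flush():
        # Emit the word collected so far, tagged with the current line.
        nonlocal word
        if word:
            tokens.append((line, word))
            word = ""

    while i < len(src):
        ch = src[i]
        if ch.isspace():
            flush()
            if ch == "\n":
                line += 1
            i += 1
            continue
        delim = next((d for d in DELIMS_BY_LENGTH if src.startswith(d, i)), None)
        if delim is not None:
            flush()
            tokens.append((line, delim))
            i += len(delim)
        else:
            word += ch
            i += 1
    flush()
    return tokens
```

Each token carries the line it came from, matching the line tracking described in the introduction.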
5. Comments
5.1. Line comments

print "hello, world."; // This is a line comment
5.2. Block comments

print /* This is a block comment */ "hello, world.";

 
Note: Block comments should not be nested, since nesting might confuse the syntax highlighters of certain text editors.
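
The following Python sketch shows one way a scanner could drop both comment styles while leaving string literals untouched. It is an assumption-laden illustration: the function `strip_comments` is hypothetical, it does not attempt nested block comments, and it says nothing about how TkScript itself treats them.

```python
def strip_comments(src: str) -> str:
    """Remove // and /* */ comments that appear outside string literals."""
    out = []
    i, n = 0, len(src)
    while i < n:
        ch = src[i]
        if ch in "\"'":
            # Copy a quoted literal verbatim, stepping over backslash escapes.
            j = i + 1
            while j < n and src[j] != ch:
                j += 2 if src[j] == "\\" else 1
            out.append(src[i:j + 1])
            i = j + 1
        elif src.startswith("//", i):
            # Line comment: skip up to (not including) the newline.
            j = src.find("\n", i)
            i = n if j == -1 else j
        elif src.startswith("/*", i):
            # Block comment: skip to the first "*/" but keep its newlines,
            # so later tokens still map to the right source line.
            j = src.find("*/", i + 2)
            end = n if j == -1 else j + 2
            out.append("\n" * src.count("\n", i, end))
            i = end
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Keeping the newlines of a removed block comment preserves line numbers for error reporting.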
6. Reserved keywords
The following keywords may not be used as identifier names:
  • boolean
  • break
  • byte
  • case
  • catch
  • char
  • clamp
  • class
  • compile
  • constraint
  • default
  • #define
  • define
  • deref
  • delegate
  • dtrace
  • do
  • else
  • enum
  • exception
  • explain
  • extends
  • false
  • finally
  • float
  • for
  • function
  • if
  • instanceof
  • int
  • local
  • loop
  • method
  • module
  • namespace
  • null
  • Object
  • prepare
  • print
  • private
  • protected
  • return
  • returns
  • short
  • static
  • String
  • switch
  • tag
  • this
  • throw
  • trace
  • true
  • try
  • use
  • var
  • void
  • while
  • wrap
7. Number formats and literals

The following forms of representation can be used to write constant values:
7.1. Decimal integer

int i = 10;
7.2. 32bit floating point number

float f = 10.25;
float f2 = 10.25f;
7.3. Hexadecimal integer

int i = $fedcba98; // ASM-style hex literal
int htmlColor = #fedcba98; // ARGB32/HTML-style color literal
int chex = 0x900df00d; // C-style hex literal
7.4. Binary integer

int i = 0b1101001;
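
The integer notations above differ only in their prefix and base, which this Python sketch makes explicit. The function `parse_int_literal` is hypothetical, accepts only the lowercase prefixes shown in the examples, and returns the plain unsigned value; it does not model how TkScript stores the result in an int.

```python
def parse_int_literal(text: str) -> int:
    """Convert one integer literal (decimal, hex or binary) to a Python int."""
    if text.startswith(("$", "#")):
        return int(text[1:], 16)   # ASM-style or HTML-style hex
    if text.startswith("0x"):
        return int(text[2:], 16)   # C-style hex
    if text.startswith("0b"):
        return int(text[2:], 2)    # binary
    return int(text, 10)           # decimal
```

For instance, `parse_int_literal("0b1101001")` gives 105, and `$fedcba98` and `#fedcba98` parse to the same value.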
7.5. A single ASCII character

char c = '!';
char c2 = '\n'; // linefeed ASCII character
7.6. A sequence of one or more ASCII characters

String s = "a \'string\'\n";
7.7. An ANSI escape sequence (clear screen)

This example clears the screen if run within an ANSI compatible terminal emulator (e.g. Linux terminals):

print "\e[2J";
7.8. Common constant literals
7.8.1. true
This literal represents the integer/boolean value 1 (true).
 
Example:

boolean bPrintHello = true;
if(bPrintHello)
{
print "hello, world.";
}
7.8.2. false
This literal represents the integer/boolean value 0 (false).
Example:

boolean bDontPrintHello = false;
if(!bDontPrintHello)
{
print "hello, world.";
}
7.8.3. maybe
This literal represents the integer/tristate value -1 (maybe).
 
Example:

boolean bLastChoice = true;

boolean bPrintHello = maybe;
if(maybe == bPrintHello)
{
bPrintHello = bLastChoice;
}
if(bPrintHello)
{
print "hello, world.";
}
maybe is used quite rarely but can be useful when e.g. dealing with UI preferences dialogs that want certain elements to be pointed out as being unchanged.
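
The preferences-dialog use case can be sketched in Python as follows. The integer values come from the three sections above; the function `apply_dialog` and the preference names are hypothetical.

```python
TRUE, FALSE, MAYBE = 1, 0, -1  # values given for true, false and maybe


def apply_dialog(stored: dict, dialog: dict) -> dict:
    """Merge dialog results into stored preferences.

    An entry left at MAYBE (or missing) means "unchanged", so the stored
    value is kept.
    """
    return {
        key: stored[key] if dialog.get(key, MAYBE) == MAYBE else dialog[key]
        for key in stored
    }


prefs = apply_dialog(
    {"sound": TRUE, "fullscreen": FALSE},
    {"sound": MAYBE, "fullscreen": TRUE},
)
```

Here `sound` keeps its stored value because the dialog reported maybe, while `fullscreen` is overwritten.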
7.8.4. The "null" literal
null represents a pointer to no Object.

// Object references are deleted by (pointer-)assigning 'null'
String s <= new String;
s <= null; // Deletes "s" since the variable "owns" the pointer

 
If a pointer variable is initialized with null during its declaration, no initial object will be allocated:

String s <= null; // Do not allocate initial String
7.9. String-number conversions at runtime
The number format conversion is also performed at runtime when numbers are assigned to strings or vice versa.
 
Example:

int i = String(42);

 
The number parser is also useful to initialize number objects (see Number objects):

Double d <= Double.News("3.1415926535897932384626433832795");
8. User defined constants
8.1. Module constants
The #define and define keywords are used to define a constant value in a source module.
 
Please note that TkScript has no sophisticated preprocessor, so the constant value must be a single token.
Please also take a look at class constants, which allow complex initialization expressions.
8.1.1. Int constant example

#define NUMLOOPS 42
loop(NUMLOOPS) { /* ... */ }
8.1.2. String constant example

#define AUTHOR "Bastian Spiegel "
trace "This software was developed by " + AUTHOR ;
8.1.3. Enumeration example

enum { RED, GREEN, BLUE }; // => RED==0, GREEN==1, BLUE==2

enum { RED, GREEN=4, BLUE }; // => RED==0, GREEN==4, BLUE==5

 
Also see The define and enum statements.
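
The numbering rule visible in the two enum examples (start at 0, continue from the last explicit value) can be written down as a small Python sketch. The function `enum_values` and its input format are hypothetical.

```python
def enum_values(members):
    """Assign values to enum members.

    `members` is a list of names or (name, explicit_value) pairs. Counting
    starts at 0 and continues from the most recent explicit value.
    """
    values = {}
    next_value = 0
    for member in members:
        if isinstance(member, tuple):
            name, next_value = member
        else:
            name = member
        values[name] = next_value
        next_value += 1
    return values
```

Calling `enum_values(["RED", ("GREEN", 4), "BLUE"])` reproduces the second example: RED is 0, GREEN is 4 and BLUE is 5.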
8.2. Class constants
TkScript allows constants to be declared in the scope of a script class.
 
In contrast to module constants, class constants are strongly typed and can use complex initialization expressions.
 
The reason for this is that class constants are initialized in the first parser pass, not in the scanner.
 
8.2.1. A simple class constants example

class CConst {
define int A = 1 + 3*4 - 2*3;
define float B = PI * 0.5f;
define String C = "hello, world.";
}

print CConst.A;
print CConst.B;
print CConst.C;

 
Also see Constants.
8.3. Plugin constants
Native C++ plugin classes can export constants via the yac plugin interface.
 
For example, take a look at the tkopengl plugin which exports a lot of constants (see DisplayList and Texture).
9. System constants
The following is a list of pre-defined system constants:
1PI - 1 divided by PI
1SQRT2 - 1 divided by sqrt(2)
2PI - PI multiplied by 2 (6.2..)
BIG_ENDIAN - big endian byte order, msb first
default - 0 or 0.0 or false
E - the math constant E
false - 0
IOS_IN - input/output stream mode
IOS_INOUT - input/output stream mode
IOS_OUT - input/output stream mode
LITTLE_ENDIAN - little endian byte order, lsb first
LN10 - the math constant ln(10)
LOG10 - the math constant log(10)
maybe - -1
null - NULL object pointer
PI - the math constant PI
PI2 - PI divided by 2 (1.5..)
RAND_MAX - maximum value returned by sirnd ASM opcode
SEEK_BEG - stream seek mode
SEEK_CUR - stream seek mode
SEEK_END - stream seek mode
SQRT2 - the math constant sqrt(2)
true - 1
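
For orientation, the numeric values implied by the descriptions of the math constants can be computed with Python's math module. This is only a sketch: the values are derived from the descriptions, not read from TkScript, and LOG10 is left out because the description does not say which logarithm base is meant.

```python
import math

# Values implied by the descriptions above.
SYSTEM_CONSTANTS = {
    "1PI": 1 / math.pi,
    "1SQRT2": 1 / math.sqrt(2),
    "2PI": 2 * math.pi,
    "E": math.e,
    "LN10": math.log(10),  # natural logarithm of 10
    "PI": math.pi,
    "PI2": math.pi / 2,
    "SQRT2": math.sqrt(2),
}

for name, value in SYSTEM_CONSTANTS.items():
    print(f"{name:7} = {value:.6f}")
```

Note that PI2 is PI divided by 2, while 2PI is PI multiplied by 2; the two names are easy to mix up.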


auto-generated by "DOG", the TkScript document generator. Wed, 31/Dec/2008 15:53:35