The JIT compiler translates statement blocks, which have been opened using the compile keyword, to an internal bytecode (p-code) representation. This p-code is optimized by some post processing routines and then finally translated to native CPU code thus speeding up execution time by up to 50 times.
By using a CPUTable to emit the actual machine code, most of the platform-dependent code is removed from the JIT compiler at source level. The CPUTable contains the micro-programs for all p-code instructions. In order to add support for other CPU architectures, this table can be exchanged for a more suitable set of micro programs.
P-Code virtual machines have been around for some decades (read a paper saying 1950ies). Nevertheless I first became aware of VMs when I heard of Sun Microsystems Java language. Actually, I have never been much into Java programming (had to learn it for school once) and I did not look at the Java p-code instruction set or Java VM when I started the TKScript JIT (to be true: When I started optimizing the JIT later on, I used a Java decompiler and actually *had* a look at the Java VM (a bit). Guess the JIT design is quite obvious anyway since the parser tree is stack based, too..). The main sources of inspiration for the JIT were
Some funny thing I just noticed (as of 10-Apr-2004) is that the Forth language had "Integrated access to assembly language.". Originally, the TkScript JIT compiler also included an X86/X87 inline assembler which was removed in 2003 to ease portability.
Probably the most important reference for bytecode virtual machines is UCSD Pascal, read more about it here.
Please notice that TkScript is basically not a bytecode-interpreted language. The bytecode used here is only an intermediate step for the native code generator. It is generated only for compile{/*...*/}
statement sequences.
The native-code generator uses a peephole optimizer, which moves an optimization "window" over the generated code and collapses n instructions to a single one. Take a look at the difference between these two examples: sieve_dasm_opt and sieve_dasm_noopt.
The programmer must tag the statement blocks which are to be compiled to native code by opening them using the keyword compile.
Example:
int flags[8191]; flags.numElements=8191;
compile
{
int size= 8190;
int i, prime, k, count, iter;
trace "Eratosthenes Sieve prime number calculation";
trace "10 iterations<br>";
for (iter = 1; iter <= 10; iter++)
{
count = 0;
for (i = 0; i <= size; i++) flags[i] = true;
for (i = 0; i <= size; i++)
{
if (flags[i])
{
prime = i + i + 3;
k = i + prime;
while (k <= size)
{
flags[k] = false;
k += prime;
}
count ++;
}
}
}
trace count + " primes";
}
"sieve.tks" p-code disassembly:
(#0000)00000000: movvc size 8190 (0.000000f); (#0001)00000003: pushc 9659160 (0.000000f); (#0002)00000005: loadc 4370488 (0.000000f); (#0003)00000007: apicall; (#0004)00000008: incstp; (#0005)00000009: pushc 9659320 (0.000000f); (#0006)0000000b: loadc 4370488 (0.000000f); (#0007)0000000d: apicall; (#0008)0000000e: incstp; (#0009)0000000f: movvc iter 1 (0.000000f); (#0010)00000012: jivic iter > 10 005c; (#0011)00000016: movvc count 0 (0.000000f); (#0012)00000019: movvc i 0 (0.000000f); (#0013)0000001c: jiviv i > size 002b; (#0014)00000020: pushc 1 (0.000000f); (#0015)00000022: pushv i; (#0016)00000024: iasel 0; (#0017)00000026: siapopl; (#0018)00000027: inciv i; (#0019)00000029: bra 001c; (#0020)0000002b: movvc i 0 (0.000000f); (#0021)0000002e: jiviv i > size 0058; (#0022)00000032: pushv i; (#0023)00000034: siapushl; (#0024)00000035: sitestz 0054; (#0025)00000037: pushivaddiv i i; (#0026)0000003a: pushc 3 (0.000000f); (#0027)0000003c: siadd; (#0028)0000003d: popv prime; (#0029)0000003f: pushivaddiv i prime; (#0030)00000042: popv k; (#0031)00000044: jiviv k > size 0052; (#0032)00000048: pushc 0 (0.000000f); (#0033)0000004a: pushv k; (#0034)0000004c: siapopl; (#0035)0000004d: ivaddiv k prime; (#0036)00000050: bra 0044; (#0037)00000052: inciv count; (#0038)00000054: inciv i; (#0039)00000056: bra 002e; (#0040)00000058: inciv iter; (#0041)0000005a: bra 0012; (#0042)0000005c: pushc 9594240 (0.000000f); (#0043)0000005e: loadc 4370488 (0.000000f); (#0044)00000060: apicall; (#0045)00000061: incstp; (#0046)00000062: halt;
compiler output without optimization ("sieve.tks" disassembly page):
(#0000)00000000: pushc 8190 (0.000000f); (#0001)00000002: popv size; (#0002)00000004: pushc 9663256 (0.000000f); (#0003)00000006: loadc 4370488 (0.000000f); (#0004)00000008: apicall; (#0005)00000009: incstp; (#0006)0000000a: pushc 9663416 (0.000000f); (#0007)0000000c: loadc 4370488 (0.000000f); (#0008)0000000e: apicall; (#0009)0000000f: incstp; (#0010)00000010: pushc 1 (0.000000f); (#0011)00000012: popv iter; (#0012)00000014: pushv iter; (#0013)00000016: pushc 10 (0.000000f); (#0014)00000018: sicmpb <=; (#0015)00000019: sitestz 0075; (#0016)0000001b: pushc 0 (0.000000f); (#0017)0000001d: popv count; (#0018)0000001f: pushc 0 (0.000000f); (#0019)00000021: popv i; (#0020)00000023: pushv i; (#0021)00000025: pushv size; (#0022)00000027: sicmpb <=; (#0023)00000028: sitestz 0035; (#0024)0000002a: pushc 1 (0.000000f); (#0025)0000002c: pushv i; (#0026)0000002e: iasel 0; (#0027)00000030: siapopl; (#0028)00000031: inciv i; (#0029)00000033: bra 0023; (#0030)00000035: pushc 0 (0.000000f); (#0031)00000037: popv i; (#0032)00000039: pushv i; (#0033)0000003b: pushv size; (#0034)0000003d: sicmpb <=; (#0035)0000003e: sitestz 0071; (#0036)00000040: pushv i; (#0037)00000042: siapushl; (#0038)00000043: sitestz 006d; (#0039)00000045: pushv i; (#0040)00000047: pushv i; (#0041)00000049: siadd; (#0042)0000004a: pushc 3 (0.000000f); (#0043)0000004c: siadd; (#0044)0000004d: popv prime; (#0045)0000004f: pushv i; (#0046)00000051: pushv prime; (#0047)00000053: siadd; (#0048)00000054: popv k; (#0049)00000056: pushv k; (#0050)00000058: pushv size; (#0051)0000005a: sicmpb <=; (#0052)0000005b: sitestz 006b; (#0053)0000005d: pushc 0 (0.000000f); (#0054)0000005f: pushv k; (#0055)00000061: siapopl; (#0056)00000062: pushv k; (#0057)00000064: pushv prime; (#0058)00000066: siadd; (#0059)00000067: popv k; (#0060)00000069: bra 0056; (#0061)0000006b: inciv count; (#0062)0000006d: inciv i; (#0063)0000006f: bra 0039; (#0064)00000071: inciv iter; (#0065)00000073: bra 0014; (#0066)00000075: pushc 9598296 (0.000000f); (#0067)00000077: loadc 4370488 (0.000000f); (#0068)00000079: apicall; (#0069)0000007a: incstp; (#0070)0000007b: halt;
The following intermediate byte code instruction set is used by the JIT compiler to translate source code to native CPU code. Arguments are usually passed on the hardware stack. Script classes require a static array to keep track of nested class calls. The list may be a little outdated but basically nothing has gravely changed.
apicall <function_adr> - call API function (first argument last) bra <label> - branch to user defined label fvaddc <var> <const> fvaddfv <var1> <var2> fvdivc <var> <const> fvdivfv <var1> <var2> fvmulc <var> <const> fvmulfv <var1> <var2> fvsubc <var> <const> fvsubfv <var1> <var2> halt - terminate the VM. iasel <var> load int array variable var into array base register ivaddc <var> <const> ivaddiv <var1> <var2> ivandc <var> <const> ivandiv <var1> <var2> ivaslc <var> <const> ivasliv <var1> <var2> ivasrc <var> <const> ivasriv <var1> <var2> ivdivc <var> <const> ivdiviv <var1> <var2> iveorc <var> <const> iveoriv <var1> <var2> ivmodc <var> <const> ivmodiv <var1> <var2> ivmulc <var> <const> ivmuliv <var1> <var2> ivorc <var> <const> ivoriv <var1> <var2> ivsubc <var> <const> ivsubiv <var1> <var2> jeqivic <var1> <iconst> <label> jeqiviv <var1> <var2> <label> jgeivic <var1> <iconst> <label> jgeiviv <var1> <var2> <label> jgtivic <var1> <iconst> <label> jgtiviv <var1> <var2> <label> jltivic <var1> <iconst> <label> jltiviv <var1> <var2> <label> jleivic <var1> <iconst> <label> jleiviv <var1> <var2> <label> jneivic <var1> <iconst> <label> jneiviv <var1> <var2> <label> fasel <var> load float array variable var into array base register poplv <lvar> - pop local variable lvar from stack popv <var> -pop global/function variable var from stack pushc <const> - push constant value const onto stack pushfvaddc <var> <const> pushfvaddfv <var1> <var2> pushfvdivc <var> <const> pushfvdivfv <var1> <var2> pushfvmulc <var> <const> pushfvmulfv <var1> <var2> pushfvsubc <var> <const> pushfvsubfv <var1> <var2> pushivandc <var1> <const> pushivandiv <var1> <var_2> pushivorc <var1> <const> pushivoriv <var1> <var_2> pushiveorc <var1> <const> pushiveoriv <var1> <var_2> pushivmodc <var1> <const> pushivmodiv <var1> <var_2> pushivaddc <var1> <const> pushivaddiv <var1> <var_2> pushivsubc <var1> <const> pushivsubiv <var1> <var_2> pushivmulc <var1> <const> pushivmuliv <var1> <var_2> pushivdivc <var1> <const> pushivdiviv <var1> <var_2> pushivaslc <var1> <const> pushivasliv <var1> <var_2> pushivasrc <var1> <const> pushivasriv <var1> <var_2> pushs - push stackpointer onto stack pushv <var> - push global/function variable var onto stack pushdeciv <var> - decrement and push global/function int variable var onto stack pushinciv <var> - increment and push global/function int variable var onto stack pushivdec <var> - push global/function int variable var onto stack and decrement it pushivinc <var> - push global/function int variable var onto stack and increment it movlv <lvar> <lvar2> - move content of lvar2 to lvar movlvc <lvar> <const> - move constant value const to lvar movv <var> <var2> - move content of var2 to var movvc <var> <const> - move constant value const to var sfatan2 - calc arc tan of st[1]/st[0] (both float) sfabs - calc absolute value of stack float val sfadd - add stack float values sfcmp <relop> <label> - compare float values from stack and branch to label if comparison evaluates to 1 sfcos - calc cos of stack float val (rad) sfdiv sp[0]=sp[1]/sp[0] - sffrac - calc remainder of stack float val sfneg - change sign of stack float val sfmul - multiplicate stack float values sfpow - st[0]= st[0] raised to the power of st[1] sfround - round stack float val sfsin - calc sin of stack float val (rad) sfsub sp[0]=sp[1]-sp[0] - sfsqrt - calc square root of stack float val sftan - calc tan of stack float val (rad) sfquad - multiply stack float val with itself sfcmpb <relop> - compare stack float values and store result (1,0) siabs - calc absolute value of stack int val siadd - add stack int values siand - bitwise and stack int values siasl sp[0]=sp[1]<<sp[0] - siasr sp[0]=sp[1]>>sp[0] - sicmp <relop> <label> - compare int values from stack and branch to label if comparison evaluates to 1 sicmpbcompare stack int values and store result (1,0) sidiv sp[0]=sp[1]/sp[0]. - sieor - bitwise eor/xor stack int values siinv - bitwise not int val from stack siloop <label> - decrement int value from stack and branch to label if value>0 simod sp[0]=sp[1]%sp[0] - simul - multiplicate stack int values sineg - change sign of stack int val sinot - C-like logical not operator. pops int val from stack and pushes either 0 or 1 sior - bitwise or stack int values siapopl - read array element siapushl - store array element sipackargb - pop 4 (byte) ints from stack and push packed 32bit int sipop - pop int/float val from stack and discard it. sipush[0..3] - push constant int val [0..3] onto stack sirnd - push new random int val onto stack sisub sp[0]=sp[1]-sp[0] - siswap - swaps swap upper stack values siswapw - swap upper and lower word of int stack value sitestbz - check if stack int val >0 and set it to either 0 or 1 sitestbz2 - check if stack+1 int val >0 and set it to either 0 or 1 sitestz <label> - pop int from stack, compare with 0 and branch to label if 1 sitestzp <label> - compare int value from stack and branch to label if comparison evaluates to 1 siquad - multiply stack int val with itself siunpackargb - pop packed 32bit int from stack and push 4 (byte) ints sreti - grab int/object result value from last C/C++ call (usually in eax) sretf - grab float result value from last C/C++ call (usually in st0) stcif - typecast stack int val to float stcif2 - typecast stack+1 int val to float stcfi - typecast stack float val to int stcfi2 - typecast stack+1 float val to int
Variables, constants and jump/function call addresses in the native code are later replaced by their actual values with the help of some magic (32Bit) place holder symbols that are stored in the CPUTable:
"magic" symbol | instruction |
---|---|
0xC0DE1234 | halt |
0xC0DEC0DE | bra, sfcmp, sicmp, sitestz, sitestzp, siloop |
0xC0FFC0FE | movvc |
0xC0FFC4C4 | pushv, pushivinc, pushinciv, pushivdec, pushdeciv, popv, movv, movvc, inciv, |
0xC0FFC4C4 | deciv, pushv |
0xC0FFC0FF | pushc, loadc |
0xC0FFC4C5 | movv |
0xC0FFAAA1 | iasel, iaselbc |
0xC0FFABA1 | iaselbc |
0xC0FFAAA2 | fasel |
0xC0FFABA2 | faselbc |
0xC0FFC7C7 | incliv, decliv, siapushl, siapushlbc, siapopl, siapoplbc |