Jint Programming Language Manual

Introduction

Jint is a scripting language for Java-based systems. It is meant to be an extension of the Java(*) programming language geared towards scripting. (It is not quite so at the moment, but mostly because of bugs.) Jint has such traditional scripting-language features as "typeless" variables, regular expression support, and an interactive execution environment. It also has a number of other scripting-friendly features, such as variable numbers of arguments, syntax shortcuts, and a switch statement for strings. I will concentrate on this "scripting" part of the language, without relying on the reader's knowledge of Java, but also without retelling what the Java programming language or object-oriented programming is.

Jint used to mean Java Interpreter. It started as a simple interpreter for Java method calls, then for a more general subset of Java expressions, and then for a more general subset of the Java programming language. It used to be implemented as an interpreter. At some point, though, it became clear that an interpreter is too limited for Java environments, because it is almost impossible to do anything without subclassing, which cannot be implemented very well by an interpreter. At that point Jint became a compiled language. (It is compiled to JVM bytecode, of course. It is still possible to compile a single statement to bytecode, load it, and execute it right away, so it can still behave as an "interpreter".)

Jint Interactive Environment

The easiest way of trying out the Jint system is to run it interactively. Once you start the Jint "interpreter" (actually the compiler in interactive mode), you can execute most Jint statements just by typing them:
bash% JINTC
jint: 1 + 1;
2
jint: 100.0 * 67.5 + 0.98;
6750.98
jint: "Hello " + "world!";
Hello world!
jint:
In this example JINTC is the name of the Jint interpreter/compiler, and "jint:" is the Jint prompt indicating that the Jint system is ready for commands. The rest of each line that starts with "jint:" is user input. After every user input comes the Jint system's reply, which is just the value of the given expression statement. (An expression statement is an expression followed by a ';'. Don't forget the ';': a simple end-of-line is not taken as the end of a statement.) Other Jint statements are variable/function/class declarations (to give names to values, groups of operations, and object classes), control/loop statements (to describe the logic of the program), and the compound statement (to group several statements together).

An expression statement allows one not only to evaluate an expression, but also to assign a value to a variable using the assignment operator '='. Jint variables, just like in Java, are not prefixed by a dollar sign (or anything else). While variables can be introduced by just assigning to them, it is better to declare all variables. To declare a variable one uses a variable declaration statement (which does not print the result of the expression evaluation). This is an example:

jint: var q = 89 - 8;
jint: q/9;
9
(If you know Java, you might be surprised not to see a variable type in the declaration. One can use typed variables as well; typeless variables are one of the Jint extensions.) Besides variables, one can declare functions and classes. A function is a small program that performs a certain task and can be used many times. It defines something that "can be done". For example:
jint: square( n ) { var res = n*n; return res; }
jint: square(2);
4
jint: square(10);
100
jint: square(1.4142);
1.9999616399999998
The first line of this example is a function declaration. We defined a function named "square" that takes a single argument and returns the result of multiplying it by itself. Then we can use this function to calculate the squares of several numbers. A function can take many parameters (their names should be separated by commas), but it returns only one value (or none at all).

Each variable or function is defined only in a certain scope. One can use a variable or function name to refer to it while in the same scope; it is harder (or even impossible) to refer to a function or variable outside of the scope where it is defined. Initially one is working in the global scope, but every function introduces its own scope, so if one declares a variable inside a function body, it won't be visible outside the function. Function parameters (like 'n' in the above example) are considered to be variables defined in the function scope. Scopes are nested, so it is still possible to access functions and variables from the global scope while in a function scope (unless something in the function scope has the same name).

One can access the whole scope of a function using the 'self' variable, but be aware that this can hurt the efficiency of the compiled code.

Jint Type System

"Strong" and "Free" Types

The Jint type system is somewhat complicated because it is a mixture of the strong typing of Java and the run-time typing that is common in scripting languages. Every value in Jint has a certain "compile-time type", or simply "type". There are primitive types (int, short, byte, long, char, float, double, and boolean) and object types (defined by a Java/Jint class, such as Object or String); together these are called "strong types". There is also a "free type" (represented by the Jint pseudo-class kmy.jint.lang.Any), which basically defers typing to run time. If an operation involves just primitive and object (strong) types, a lot of errors are detected by the compiler and the resulting code is much more efficient. The "free type" is much more flexible, but this comes at the expense of compile-time error detection and efficiency. When both are present in a single operation, "free typing" rules are used. For example, consider the expression:
a + 1
If variable a is of a strong type, the compiler can check the legality of the operation (it is illegal if a is boolean). It can also figure out what kind of operation is meant here: integer, long, float, or double addition (those four are very different for a computer, so they must be distinguished). Under strong typing rules, this expression can handle values of only one type: the type of variable a. On the other hand, if variable a is of the free type, no checks are made during compilation. At run time, however, the type of the value that variable a holds is examined and the same checks are performed. This is more expensive, but also more generic, as the expression can now handle values of any numeric type.
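
The difference can be sketched in plain Java (not Jint). This is only an analogy: the class and method names are hypothetical, and Object stands in for the free type kmy.jint.lang.Any; how Jint actually performs the run-time dispatch is not described here.

```java
class FreeTypeSketch {
    // "Strong" typing: the operand type is known at compile time,
    // so integer addition is selected by the compiler.
    static int addOneStrong(int a) {
        return a + 1;
    }

    // "Free" typing, emulated by hand: the operand type is examined
    // at run time and the matching kind of addition is chosen then.
    static Object addOneFree(Object a) {
        if (a instanceof Integer) return ((Integer) a) + 1;
        if (a instanceof Long) return ((Long) a) + 1L;
        if (a instanceof Float) return ((Float) a) + 1f;
        if (a instanceof Double) return ((Double) a) + 1.0;
        // The check a compiler would have made for a strong type
        // (for example, boolean + 1 is illegal) happens here instead.
        throw new IllegalArgumentException("cannot add 1 to " + a);
    }

    public static void main(String[] args) {
        System.out.println(addOneStrong(41));
        System.out.println(addOneFree(1.5));
    }
}
```

The second method accepts any numeric value but pays for a type test on every call, which is the trade-off described above.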

Classes

Object types are defined by classes. There are a lot of predefined classes in Java, and all of them are available in Jint as well. Some examples are java.lang.String (which represents character strings), java.util.Hashtable (which represents a hash table), and java.io.InputStream (which represents a source of binary data, such as an open file).

Expressions

Here is only a brief summary of the different kinds of expressions. It is easy to learn expressions by example.

The simplest expression is a single value. Here is the list of what a value can be:

Jint Values

Kind              Example            Type                     Description
Identifier        foo                same as variable type    variable name
Integer           57                 int                      integer number
Real              3.14519            double                   real number
Quoted character  'Q'                char                     Unicode character
Quoted string     "Hello"            java.lang.String         Unicode string
Regex             /\d+/              kmy.regex.util.Regex     regular expression
Substitution      s`foo`bar`g        kmy.regex.util.Replacer  substitution object
Formatted string  `psi=${psi%6.4f}`  java.lang.String         Unicode string with variables

Regular expressions and formats are described in more detail later.

More complex expressions can be built from simple ones using the following constructions (EXPR means expression):

Kind                       Syntax                                 Description
Infix expression           EXPR op EXPR                           op is one of the infix operators
Prefix expression          op EXPR                                op is one of the prefix operators
Postfix expression         EXPR op                                op is one of the postfix operators
Bracketed expression       (EXPR)                                 brackets are used to alter order of execution
Array                      {EXPR, ...}                            Create array and fill it with the given values
Typed array                Clazz[] {EXPR, ...}                    Create array of type Clazz[] (Clazz must be a class name) and fill it with the given values
Instantiation              new Clazz(EXPR, ...)                   Create an object of class Clazz (Clazz must be a class name)
Subclassing instantiation  new Clazz(EXPR, ...){ statement ... }  Create an object of an in-place defined subclass of Clazz
Array creation             new Clazz[EXPR]                        Create array of EXPR elements (Clazz[EXPR][EXPR]... is allowed)
Block                      [statement ...]                        Creates a new kmy.jint.lang.Block object

An expression must not start with a '{', so if an array value must appear at the beginning of an expression (not sure that this ever makes sense), it must be bracketed.

Jint Operators

Operator        Priority  Kind            Description
.               13        infix           Method/field access; right operand must be a method/field name
(arg1,arg2...)  12        postfix         Method/function call; operand must be either a method name or a method access
[index]                   postfix         Array element access
++                        prefix/postfix  Increment by 1
--                        prefix/postfix  Decrement by 1
!                         prefix          Logical negation
~                         prefix          Bitwise negation
-                         prefix          Arithmetic negation
+                         prefix          Integer promotion
*               11        infix           Multiplication
/                         infix           Division
%                         infix           Modulo
+               10        infix           Addition for numbers, concatenation for Strings
-                         infix           Subtraction
>>              9         infix           Signed bit-shift to the right
>>>                       infix           Unsigned bit-shift to the right
<<                        infix           Bit-shift to the left
>               8         infix           Greater than
>=                        infix           Greater than or equal to
<                         infix           Less than
<=                        infix           Less than or equal to
instanceof                infix           Checks if left operand is an instance of the class named by the right operand (which must be a class name)
==              7         infix           Equal
!=                        infix           Not equal
~                         infix           Match or substitution
&               6         infix           Bitwise and
|                         infix           Bitwise or
^                         infix           Bitwise xor
&&              5         infix           Logical and
||              4         infix           Logical or
..?..:..        3         infix           Conditional operator
:                         infix           Property assignment (a.title: "uuu" is equivalent to a.setTitle("uuu")). For compatibility with Java, the second operand must not be an array expression {...}
=               2         infix           Assignment operator
+=                        infix           "Assignment with operation" operators (+= through <<=)
-=
*=
/=
%=
&=
|=
^=
>>=
>>>=
<<=
,               1         infix           Sequencing operator

Statements

Jint has the same set of statements as Java, except that things like variable or function (or method, or class) definitions are all considered to be Jint statements too.

Expression Statement

An expression statement is just an expression followed by ';'. The expression should have a side effect, such as a function call or an assignment, so that its execution actually does something (at present this is not checked by the Jint compiler).

Compound Statement

Syntax:
{
  statement1
  statement2
  ...
}
A compound statement allows one to pack several statements into one. Normally it is used together with the "if" statement or loop statements.

If Statement

Syntax:
if( expression )
  then-statement
or
if( expression )
  then-statement
else
  else-statement
If expression is true, then-statement is executed, otherwise else-statement is executed (if it is present).

While Statement

Syntax:
while( expression )
  body-statement
If expression evaluates to false, the statement is done. If expression evaluates to true, body-statement is executed and then this while-statement is executed again (and again, until either body-statement finishes through an exception or break, or expression evaluates to false).

Do Statement

Syntax:
do
  body-statement
while( expression );
First, body-statement is executed. Then expression is evaluated. If expression evaluates to false, the statement is done. If expression evaluates to true, this do-statement is executed again (and again, until either body-statement finishes through an exception or break, or expression evaluates to false).
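
The practical difference between the two loops is that a do-statement always runs its body at least once. A small plain-Java sketch (class and method names are hypothetical, and it assumes Jint loops behave like their Java counterparts):

```java
class LoopSketch {
    // Counts how many times a while-loop body runs when counting up to 3.
    static int whileCount(int start) {
        int runs = 0;
        int i = start;
        while (i < 3) {   // condition is tested before the first iteration
            i++;
            runs++;
        }
        return runs;
    }

    // Same loop written as do/while: the body runs before the first test.
    static int doCount(int start) {
        int runs = 0;
        int i = start;
        do {
            i++;
            runs++;
        } while (i < 3);
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(whileCount(5) + " vs " + doCount(5));
    }
}
```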

For Statement

Syntax:
for( init-statement cond-expression ; next-expression )
  body-statement

Switch statement

Syntax:
switch( expression )
{
  case case-expression :
    statement
    ...
  ...

  default:
    statement
    ...
}
First, expression is evaluated. Then the case-expressions are evaluated one by one, and each result is compared with the value of expression. When they are the same, the statements in the body of the switch statement are executed, starting with the case label that contained the matching expression. If no case-expression matches and a default label is present, statements are executed starting with the default label. The Jint switch statement can handle any type of expression (not only integer, as in Java). Objects of types java.lang.String and kmy.jint.lang.CharString are compared by converting both values to Strings (using the toString() method) and then using the equals(Object) method. If a case-expression is of type kmy.regex.util.Regex, Regex.searchOnce(String) or Regex.searchOnce(CharString) is used.
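
The matching rules can be approximated in plain Java as follows. This is a sketch under stated assumptions, not Jint's implementation: java.util.regex.Pattern stands in for kmy.regex.util.Regex, an unanchored find() stands in for Regex.searchOnce, CharSequence stands in for CharString, and comparing other objects with equals() is a guess. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class SwitchMatchSketch {
    // Decides whether one case label matches the switch value.
    static boolean caseMatches(Object value, Object caseLabel) {
        if (caseLabel instanceof Pattern) {
            // Assumption: a regex label matches if it is found anywhere in the text.
            return ((Pattern) caseLabel).matcher(value.toString()).find();
        }
        if (value instanceof CharSequence && caseLabel instanceof CharSequence) {
            // String-like values: convert both with toString(), then equals().
            return value.toString().equals(caseLabel.toString());
        }
        // Assumption: everything else is compared with equals().
        return value.equals(caseLabel);
    }

    // Returns the index of the first matching label, or -1 for "default".
    static int select(Object value, Object... caseLabels) {
        for (int i = 0; i < caseLabels.length; i++) {
            if (caseMatches(value, caseLabels[i])) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(select("file42.txt", "readme", Pattern.compile("\\d+")));
    }
}
```

Labels are tried in order, so the first one that matches wins, as in the description above.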

Try statement

Syntax:
try
  statement
catch( Clazz var-name )
  catch-statement
...
or
try
  statement
catch( Clazz var-name )
  catch-statement
...
finally
  finally-statement
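
The order in which the parts run can be illustrated in plain Java (hypothetical names; this assumes the Jint try statement follows Java semantics, which is not stated explicitly above):

```java
class TryFinallySketch {
    // Records which parts of the try statement were executed.
    static String run(boolean fail) {
        StringBuilder trace = new StringBuilder();
        try {
            trace.append("try ");
            if (fail) throw new IllegalStateException("boom");
        } catch (IllegalStateException e) {
            // Runs only when the guarded statement throws a matching exception.
            trace.append("catch ");
        } finally {
            // Runs in both cases, after the try (and catch) part.
            trace.append("finally");
        }
        return trace.toString();
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```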

Synchronized statement

Syntax:
synchronized( expression )
  body_statement
Expression is evaluated and its result (that must be an object) is locked. Then body_statement is evaluated, and then object that has been locked gets unlocked. Only one thread can hold an object's lock at the same time.
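
A plain-Java sketch of why the lock matters (hypothetical names; it assumes Jint's synchronized statement maps directly onto the Java one). Several threads increment a shared counter, and the lock keeps the increments from interleaving:

```java
class SynchronizedSketch {
    private static final Object lock = new Object();
    private static int counter = 0;

    // Starts the given number of threads, each incrementing the shared
    // counter under the lock, and returns the final counter value.
    static int runWorkers(int threads, int incrementsPerThread) {
        counter = 0;
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    synchronized (lock) {   // only one thread at a time gets past here
                        counter++;
                    }
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) t.join();   // wait for all workers to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 10000));
    }
}
```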

Labeled Statement

Syntax:
Label : statement
Label is an identifier. In Jint, statement can only be a compound, do, for, while, synchronized, try, or switch statement (although Java allows any statement to be labeled, there is not much sense in doing so for other kinds of statements, as there is no goto statement anyway).

Break Statement

Syntax:
break;
or
break Label;

Continue Statement

Syntax:
continue;
or
continue Label;
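
Labels are mostly useful with nested loops. The following plain-Java sketch (hypothetical names; it assumes Jint follows Java here) uses a label to continue and to break an outer loop from inside an inner one:

```java
class LabeledLoopSketch {
    // Returns the (i, j) pairs that were visited, as "ij" tokens.
    static String visit() {
        StringBuilder sb = new StringBuilder();
        outer:
        for (int i = 0; i < 3; i++) {
            for (int j = 0; j < 3; j++) {
                if (j == 1) continue outer;   // abandon the inner loop, go to the next i
                if (i == 2) break outer;      // leave both loops at once
                sb.append(i).append(j).append(' ');
            }
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(visit());
    }
}
```

Without the label, break and continue would affect only the innermost loop.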

Class Definition

Here and later, 'access' is either nothing or any number of the following keywords: public, private, package private (considered to be a single keyword), protected, synchronized, static, final. (Classes and interfaces cannot be private, though.)

Syntax:
access class name
  extends extended-name
  implements implemented-name,...
{
  statement
  ...
}

Interface Definition

Syntax:
access interface name
  extends extended-name,...
{
  statement
  ...
}

Method/Function Definition

Syntax:
access Clazz name( Clazz arg-name, ... )
  throws Clazz, ...
{
  statement
  ...
}
This defines a method name with the given parameter types and return value type, potentially throwing the given exception types. A parameter type can be omitted; the corresponding parameter can then take any value (it will be wrapped if needed), and its type is considered to be kmy.jint.lang.Any (a Jint-specific pseudo-class). The return type can also be omitted. The throws part can be omitted if the method never throws checked exceptions (Exceptions that are not RuntimeExceptions); the compiler will check that. If, however, both the return type and the "throws" part are omitted (script-style declaration), the method is considered to be throwing any exception (so this is equivalent to specifying 'throws java.lang.Exception'). If the last parameter is declared to have type kmy.jint.lang.ArgList, any number of actual parameters of any type (wrapped in the corresponding wrapper classes if needed) can be passed through it. They can be accessed using the [..] operator.
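
The closest plain-Java analogy to a trailing ArgList parameter is a varargs parameter of type Object. The sketch below uses hypothetical names and says nothing about how ArgList itself is implemented; it only shows primitives being wrapped and then accessed by index:

```java
class ArgListSketch {
    // Object... plays the role of a trailing kmy.jint.lang.ArgList parameter:
    // any number of arguments of any type, with primitives boxed automatically.
    static String describe(String label, Object... args) {
        StringBuilder sb = new StringBuilder(label).append(':');
        for (int i = 0; i < args.length; i++) {
            // args[i] mirrors access through the [..] operator.
            sb.append(' ').append(args[i].getClass().getSimpleName());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("args", 1, 2.5, "x"));
    }
}
```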

Field/Variable Declaration

Syntax:
Clazz name,...;
or
Clazz name=value,...;

Field/Variable Declaration With Subclassing

Syntax:
Clazz name {
 statement
 ...
}
This is roughly equivalent to
public class Clazz1 extends Clazz {
  statement
  ...
}

public Clazz1 name = new Clazz1();
except that the name Clazz1 is replaced with some special internal name. Also, the compiler might not do subclassing if the statements do not introduce new fields or methods (it always does subclassing now).
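
In plain Java the nearest construct is a field initialized with an anonymous subclass, where the compiler likewise invents an internal class name. A sketch with hypothetical names (Greeter, polite):

```java
class SubclassingDeclarationSketch {
    static class Greeter {
        String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Java analogue of the Jint declaration "Greeter polite { ... }":
    // the object belongs to an unnamed, in-place defined subclass of Greeter.
    static final Greeter polite = new Greeter() {
        @Override
        String greet(String name) {
            return super.greet(name) + ", pleased to meet you";
        }
    };

    public static void main(String[] args) {
        System.out.println(polite.greet("Ann"));
        System.out.println(polite.getClass() != Greeter.class);
    }
}
```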

Regex Class Definition

Syntax:
access class name = regex;
Compiles the regular expression regex into a class named name. The regular expression should have the "standalone" (S) flag.

Text Processing Constructions

Note: These constructions are Jint-specific; there are no language-level equivalents in Java.

There are 3 text processing constructions in Jint: regular expressions, formatted strings, and replacers.

We will discuss them one by one below.

Regular Expressions

A regular expression (regex) in Jint is an object of type kmy.regex.util.Regex. A regular expression defines a pattern that Strings (and CharStrings) can be matched against. This pattern can contain variables that are filled in during matching. Regexes can also be used together with kmy.jint.io.JintReader to parse character streams. Regular expressions in Jint are normally compiled into (fairly verbose) bytecode, either in-line or as separate classes. The Jint expression-level syntax for a regex is either m`regex-chars` or /regex-chars/. Jint also supports "globbing" expressions (used for filename matching on many OSes) as a special form of regular expression. Their syntax is p`globbing-chars`. A regex can also be used as the right-hand side of the infix ~ operator.

A regular expression "body" (the characters between the slashes) is built using the "building blocks" listed in the table (RE means any regular expression).

Regular Expression Syntax

Syntax             Example      Hungry  Matches
letter or digit    r            NA      Matches this letter or digit
\n, \r, \t, \f     \n           NA      Matches given special character
\cLetter           \cL          NA      Matches given "control" code, for example \cL matches ^L (\f).
\uXXXX             \u274F       NA      Matches given Unicode character (X is a hex digit)
\xXX               \xAD         NA      Matches character specified as a hex code (X is a hex digit)
\s                 \s           NA      Matches space character (' ','\t','\r','\n')
\S                 \S           NA      Matches non-space character.
\d                 \d           NA      Matches digit ('0'-'9')
\D                 \D           NA      Matches non-digit
\w                 \w           NA      Matches word character
\W                 \W           NA      Matches non-word character
\i                 \i           NA      Matches identifier character (word or digit)
\I                 \I           NA      Matches non-identifier character
\char              \*           NA      Matches char (it should not be a letter or digit).
.                  .            NA      Matches any character
[range range...]   [a-zA-Z]     NA      Range is either char1-char2 or just char (= char-char). Matches any char within any of the ranges given.
[^range range...]  [^a-z]       NA      Matches any char outside of all of the ranges given.
^                  ^            NA      Matches the beginning of the line
$                  $            NA      Matches the end of the string (only when it is either the last char in the regex or is followed by ')' or '|')
<                  <            NA      Matches an empty string between either a non-word character or the beginning/end of the string and a word character (beginning of a word)
>                  >            NA      Matches an empty string between a word character and either a non-word character or the beginning/end of the string (end of a word)
\b                 \b           NA      Matches either a beginning or an end of a word.
\B                 \B           NA      Matches where there is neither a beginning nor an end of a word.
RE|RE              blue|red     NA      Matches whatever any of the REs matches.
(RE RE ...)        (\d\.\d+)    NA      Matches what the first RE matches followed by what the second RE matches and so on. The string that was matched can be referenced later in the regular expression using a backreference.
${digit...}        ${7}         NA      Backreference. Matches the string that the corresponding (...) matched.
\digit...          \007         NA      Either a backreference or a character in octal notation. Backreferences cannot start with 0, are given using decimal numbers, and the corresponding (...) must already be defined. A character can be specified only using octal digits (0-7). If after applying these rules it is still not clear whether it is a backreference or a character in octal notation, it is treated as a backreference.
RE*                [a-z]*       yes     Matches any number (including 0) of whatever RE matches
RE*?               [a-z]*?      no      Matches any number (including 0) of whatever RE matches
RE+                (yes|no)+    yes     Matches any number (> 0) of whatever RE matches
RE+?               [a-z]+?      no      Matches any number (> 0) of whatever RE matches
RE?                [a-z]?       yes     Matches whatever RE matches or empty string
RE??               [a-z]??      no      Matches whatever RE matches or empty string
RE{n,m}            [a-z]{2,5}   yes     Matches whatever RE matches at least n but at most m times (n and m must be decimal numbers)
RE{n,m}?           [a-z]{2,5}?  no      Matches whatever RE matches at least n but at most m times (n and m must be decimal numbers)
${var} or $var     ${foo}       NA      Matches string that is equal to the given variable value.
@var(RE)           @a(\d+)      NA      Matches whatever RE matches, storing the result into the given variable.
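
The "Hungry" column distinguishes quantifiers that take as much text as possible from those that take as little as possible. The same distinction exists in java.util.regex, so it can be demonstrated in plain Java. This sketch uses the standard Java regex engine, not kmy.regex, and the helper name firstMatch is hypothetical; Jint-specific syntax such as <, >, and @var(RE) is not supported by java.util.regex and is not used here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HungrySketch {
    // Returns the text of the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Hungry: takes the longest run of letters it can.
        System.out.println("[" + firstMatch("[a-z]+", "abc123") + "]");
        // Not hungry: stops as soon as the minimum is satisfied.
        System.out.println("[" + firstMatch("[a-z]+?", "abc123") + "]");
        System.out.println("[" + firstMatch("[a-z]{2,5}", "abcdefg") + "]");
        System.out.println("[" + firstMatch("[a-z]{2,5}?", "abcdefg") + "]");
    }
}
```

Both forms accept the same set of strings; they differ only in how much of the input a successful match consumes.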

The meaning of a regular expression can be modified by specifying any number of the following characters right after the regex. Lowercase letters alter the way the regex matches; uppercase letters tell the compiler how the regex should be compiled, where to take variables from, etc.

Regular Expression and Replacer Modifiers

g  Global                (For replacer only) Replace all occurrences.
i  Ignore case           Ignore letter case.
m  Multi-line            Treat . as any character except '\n' and '\r'
s  Single-line           Treat . as any character
B  Buffer only           RegexRefiller cannot be set for such regular expression, so it cannot be used with JintReader (it can be used with character buffers only)
D  Declare               This regex declares all its "pick" variables (which are not already declared) as CharString variables
O  Offline               Disables regex inlining
R  Retain all            Retain values for all variables, not only for those that are named or referenced
S  Standalone or Static  Make regex class self-contained. All variables declared as CharString fields.

Formatted Strings

A formatted string is a string with embedded variables. When such a construction is executed, the variables are replaced with their values; the conversion of a variable's value to a string can be controlled by specifying a format.

The formatted string syntax is `abc..`. Variables are embedded with the $ character: `g = $g` or `length = ${length}in`. Formats are specified using a % character that immediately follows the variable name. After the %, several format modifiers can be inserted. A format character concludes the format specification. Examples: `hex = 0x${hex%08x}` or `hex = 0x$hex%08x`. Some formats can also take a java.text format specification (which should go in the [ ] modifier).

Format Characters

Char  Type             Meaning                                            java.text
d, i  integer          Decimal integer                                    DecimalFormat
c     character        value of type char
e     number           Real number in XX.XXe+XX notation                  DecimalFormat
E     number           Real number in XX.XXE+XX notation                  DecimalFormat
f     number           Real number in XX.XX notation                      DecimalFormat
g     number           Real number in either XX.XX or XX.XXe+XX notation  DecimalFormat
G     number           Real number in either XX.XX or XX.XXE+XX notation  DecimalFormat
o     integer          Octal integer
n     none             System-dependent newline
s     anything         String representation
t     int, long, Date  Shorter date&time representation                   SimpleDateFormat
T     int, long, Date  Longer date&time representation                    SimpleDateFormat
x     int, long        Hexadecimal integer in lower case
X     int, long        Hexadecimal integer in upper case

Format Modifiers

number   Min. field width; if starts with 0 - print leading zeros
.number  Number of digits after '.' for real numbers
=        Use max. field width = min. field width
+        Always print sign
-        Align to the left
^        Align to the center
[...]    java.text format (either DecimalFormat or SimpleDateFormat)
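
Several of these format characters and modifiers look like those of Java's String.format, which makes for a convenient comparison. The sketch below is plain Java with hypothetical method names; it shows Java's behaviour only, and whether Jint produces identical output in every case is an assumption. The ^ (centering) modifier has no counterpart flag in java.util.Formatter.

```java
import java.util.Locale;

class FormatSketch {
    // Java analogue of the Jint formatted string `hex = 0x${hex%08x}`:
    // width 8, leading zeros, lower-case hexadecimal.
    static String hexLine(int hex) {
        return String.format(Locale.ROOT, "hex = 0x%08x", hex);
    }

    // Java analogue of `psi=${psi%6.4f}`: min. width 6, 4 digits after '.'.
    static String psiLine(double psi) {
        return String.format(Locale.ROOT, "psi=%6.4f", psi);
    }

    // The '-' modifier aligns the value to the left within the field.
    static String leftAligned(String s) {
        return String.format(Locale.ROOT, "[%-6s]", s);
    }

    // The '+' modifier always prints the sign.
    static String signed(int n) {
        return String.format(Locale.ROOT, "%+d", n);
    }

    public static void main(String[] args) {
        System.out.println(hexLine(255));
        System.out.println(psiLine(3.14159265));
        System.out.println(leftAligned("ab"));
        System.out.println(signed(5));
    }
}
```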

Replacers

A replacer is a way to replace certain parts of a string with something else. Conceptually, a replacer consists of a regex and a formatted string bundled together. The replacer syntax is s`regex-part`formatted-string-part`. Examples: s`<peter>`Peter` will replace the first word peter with Peter; s`<peter>`Peter`g will replace all words peter with Peter; s`@n(\d+)`0x$n%x`g will replace all decimal integers with their hexadecimal representation. A replacer is an object of type kmy.regex.util.Replacer. A replacer can also be used as the right-hand side of the infix ~ and ~= operators.
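
For comparison, the replacer examples above can be approximated with java.util.regex in plain Java. This is an analogy with hypothetical method names, not the kmy.regex.util.Replacer API: \b stands in for the Jint word anchors < and >, and the per-match formatting that `0x$n%x` expresses is written out by hand.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplacerSketch {
    // Analogue of s`<peter>`Peter` (first occurrence) and s`<peter>`Peter`g (all).
    static String capitalizePeter(String text, boolean global) {
        Matcher m = Pattern.compile("\\bpeter\\b").matcher(text);
        return global ? m.replaceAll("Peter") : m.replaceFirst("Peter");
    }

    // Analogue of s`@n(\d+)`0x$n%x`g: every decimal integer becomes hexadecimal.
    static String decimalsToHex(String text) {
        Matcher m = Pattern.compile("\\d+").matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String hex = "0x" + Long.toHexString(Long.parseLong(m.group()));
            // quoteReplacement keeps the replacement text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(hex));
        }
        m.appendTail(out);   // copy whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalizePeter("peter met peter", false));
        System.out.println(decimalsToHex("a 255 b 16"));
    }
}
```

The Jint form bundles the pattern and the formatted replacement into one literal, where Java needs an explicit loop for any replacement that depends on the matched text.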


(*) Java is a registered trademark of Sun Microsystems, Inc.