Jint Regex Library

The Jint text processing library (regular expressions and formatting) can be used standalone as a general-purpose library. Its source files can be downloaded separately, or you can get them as part of the Jint download.

Java files in kmy/regex/tools can be used as sample source code. Examples:

java kmy.regex.tools.Grep '<(\w+)>.*\1'
will find all the lines in which some word occurs at least twice.
java kmy.regex.tools.Subst '\(@n(\d+)\)' '[${n%x}]'
will replace every decimal integer in parentheses with the same number in hexadecimal, enclosed in square brackets.
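
For readers without the kmy.regex sources at hand, the two examples above can be approximated with the standard java.util.regex package. This is only a sketch and does not use the kmy.regex API. It assumes that '<' and '>' in the first pattern act as word boundaries and that '${n%x}' formats the captured number n as lowercase hexadecimal. The class and method names (RegexSketch, hasRepeatedWord, decimalToHex) are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: standard-library approximation, not the kmy.regex API.
class RegexSketch {
    // Assumed equivalent of '<(\w+)>.*\1': a whole word whose text
    // appears again later on the same line.
    static final Pattern REPEATED_WORD = Pattern.compile("\\b(\\w+)\\b.*\\1");

    // Assumed equivalent of '\(@n(\d+)\)': a decimal integer in parentheses.
    static final Pattern PAREN_INT = Pattern.compile("\\((\\d+)\\)");

    // Returns true if the line contains a word that occurs again later.
    static boolean hasRepeatedWord(String line) {
        return REPEATED_WORD.matcher(line).find();
    }

    // Rewrites each "(123)" as "[7b]", i.e. the number in hexadecimal.
    // Numbers too large for a long would throw NumberFormatException.
    static String decimalToHex(String line) {
        Matcher m = PAREN_INT.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String hex = Long.toHexString(Long.parseLong(m.group(1)));
            m.appendReplacement(out, Matcher.quoteReplacement("[" + hex + "]"));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(hasRepeatedWord("the cat saw the dog"));
        System.out.println(decimalToHex("see (255) and (16)"));
    }
}
```

The replacement is built in a loop with appendReplacement rather than replaceAll because the output depends on a computed value (the hexadecimal form) and not just on the captured text.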

Here are some speed comparisons. Note that they cannot really be used to benchmark a JVM, as the regex compiler generates a fairly unusual (but legal) sequence of bytecodes. These results were obtained using 'retest -cpR', 'retest -pR' and 'regex.perl -p'.

Platform                     JVM                        kmy-regex    kmy-regex       perl
                                                        (compiled)   (interpreted)   (for comparison)

Linux, Pentium Pro, 200MHz   IBM JDK 1.1.6 (JIT)        [crashed]    7.41            2.45
                             IBM JDK 1.1.6 (no JIT)     6.21         52.27
                             Sun JDK 1.2.2 (no JIT)     6.28         61.47
                             kaffe 1.05                 [failed]     26.94

Sun SPARC, 300MHz            classic JVM 1.2 (JIT)      4.23         12.38           1.98
                             classic JVM 1.2 (no JIT)   4.63         38.96

NT 4.0, Pentium II, 400MHz   Sun JVM 1.2 (JIT)          0.68         5.42            1.22
                             Microsoft, jview, (JIT)    4.95         4.45