Java : StreamTokenizer with Examples
StreamTokenizer (Java SE 19 & JDK 19) API Examples.
You will find code examples for most StreamTokenizer methods.
Summary
The StreamTokenizer class takes an input stream and parses it into "tokens", allowing the tokens to be read one at a time. The parsing process is controlled by a table and a number of flags that can be set to various states. The stream tokenizer can recognize identifiers, numbers, quoted strings, and various comment styles.
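// The snippets on this page assume imports of java.io.StreamTokenizer and java.io.StringReader,
// and an enclosing method that declares "throws IOException" (nextToken can throw it).
final var s = """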
abcd 1234
"X Y Z"
""";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_NUMBER); // true
System.out.println(tokenizer.nval); // 1234.0
System.out.println(tokenizer.nextToken() == '"'); // true
System.out.println(tokenizer.sval); // X Y Z
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
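A common way to consume a whole input is to loop until nextToken() returns TT_EOF and branch on the returned token type. The following is a minimal sketch of that loop, not part of the original examples; the class name TokenDump and the helper describe are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class TokenDump {

    // Hypothetical helper: returns one description per token in the input.
    static List<String> describe(String input) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            int type;
            while ((type = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
                switch (type) {
                    case StreamTokenizer.TT_WORD -> result.add("word:" + tokenizer.sval);
                    case StreamTokenizer.TT_NUMBER -> result.add("number:" + tokenizer.nval);
                    // For a quoted string, the token type is the quote character itself.
                    case '"', '\'' -> result.add("string:" + tokenizer.sval);
                    // For an ordinary character, the token type is the character value.
                    default -> result.add("char:" + (char) type);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        describe("abcd 1234 \"X Y Z\" +").forEach(System.out::println);
    }
}
```

With the default syntax table, '+' is an ordinary character, so it comes back as a single-character token.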
Fields
double nval
If the current token is a number, this field contains the value of that number.
Please see nextToken().
String sval
If the current token is a word token, this field contains a string giving the characters of the word token. If the current token is a quoted string token, this field contains the body of the string.
Please see nextToken().
static final int TT_EOF
A constant indicating that the end of the stream has been read.
final var s = "abcd\nXYZ";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.eolIsSignificant(true);
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOL); // true
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // XYZ
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
static final int TT_EOL
A constant indicating that the end of the line has been read.
Please see TT_EOF.
static final int TT_NUMBER
A constant indicating that a number token has been read.
Please see nextToken().
static final int TT_WORD
A constant indicating that a word token has been read.
Please see nextToken().
int ttype
After a call to the nextToken method, this field contains the type of the token just read.
final var s = "abcd 1234 XYZ";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.nextToken();
System.out.println(tokenizer.ttype == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
tokenizer.nextToken();
System.out.println(tokenizer.ttype == StreamTokenizer.TT_NUMBER); // true
System.out.println(tokenizer.nval); // 1234.0
tokenizer.nextToken();
System.out.println(tokenizer.ttype == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // XYZ
tokenizer.nextToken();
System.out.println(tokenizer.ttype == StreamTokenizer.TT_EOF); // true
}
Constructors
StreamTokenizer (InputStream is)
Deprecated. As of JDK version 1.1, the preferred way to tokenize an input stream is to convert it into a character stream and pass that to StreamTokenizer(Reader).
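The conversion recommended for the deprecated constructor can be sketched as follows: wrap the InputStream in an InputStreamReader (optionally buffered) and use the Reader constructor. The class name FromInputStream, the helper firstWord, and the choice of UTF-8 are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StreamTokenizer;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class FromInputStream {

    // Hypothetical helper: returns the first token if it is a word, otherwise null.
    static String firstWord(InputStream is) {
        // Assumption: the bytes are UTF-8 encoded text.
        try (final var reader = new BufferedReader(
                new InputStreamReader(is, StandardCharsets.UTF_8))) {
            final var tokenizer = new StreamTokenizer(reader);
            if (tokenizer.nextToken() == StreamTokenizer.TT_WORD) {
                return tokenizer.sval;
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        final var bytes = "abcd XYZ".getBytes(StandardCharsets.UTF_8);
        System.out.println(firstWord(new ByteArrayInputStream(bytes))); // abcd
    }
}
```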
StreamTokenizer (Reader r)
Create a tokenizer that parses the given character stream.
final var s = "abcd XYZ";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // XYZ
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
Methods
void commentChar (int ch)
Specifies that the character argument starts a single-line comment.
final var s = """
abcd
# comment
XYZ
""";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.commentChar('#');
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // XYZ
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
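Note that a newly created tokenizer already treats '/' as a comment character. The sketch below, with the hypothetical class DefaultSlashComment and helper tokens, compares the default behavior with a tokenizer on which ordinaryChar('/') has been called.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class DefaultSlashComment {

    // Hypothetical helper: collects word tokens and single-character tokens as strings.
    static List<String> tokens(String input, boolean makeSlashOrdinary) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            if (makeSlashOrdinary) {
                tokenizer.ordinaryChar('/');
            }
            int type;
            while ((type = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type == StreamTokenizer.TT_WORD) {
                    result.add(tokenizer.sval);
                } else if (type == StreamTokenizer.TT_NUMBER) {
                    result.add(String.valueOf(tokenizer.nval));
                } else {
                    result.add(String.valueOf((char) type));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("abcd / XYZ", false)); // [abcd]
        System.out.println(tokens("abcd / XYZ", true));  // [abcd, /, XYZ]
    }
}
```

In the default case everything from '/' to the end of the line is skipped, which is why XYZ is missing from the first result.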
void eolIsSignificant (boolean flag)
Determines whether or not ends of line are treated as tokens.
Please see TT_EOF.
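Treating ends of line as tokens is useful for line-oriented input. The following sketch groups tokens by line; the class name LineRecords and the helper rows are hypothetical, and numbers are stored as their string form for simplicity.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class LineRecords {

    // Hypothetical helper: returns the word and number tokens of each line.
    static List<List<String>> rows(String input) {
        final var rows = new ArrayList<List<String>>();
        var current = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            tokenizer.eolIsSignificant(true);
            int type;
            while ((type = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type == StreamTokenizer.TT_EOL) {
                    // A line ended: store the row and start a new one.
                    rows.add(current);
                    current = new ArrayList<>();
                } else if (type == StreamTokenizer.TT_WORD) {
                    current.add(tokenizer.sval);
                } else if (type == StreamTokenizer.TT_NUMBER) {
                    current.add(String.valueOf(tokenizer.nval));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Keep a last line that has no trailing line terminator.
        if (!current.isEmpty()) {
            rows.add(current);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(rows("name alice\nage 30\n")); // [[name, alice], [age, 30.0]]
    }
}
```

In this sketch a blank line produces an empty row, because every TT_EOL closes the current row.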
int lineno ()
Returns the current line number.
final var s = """
abcd 1234
XYZ
""";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
System.out.println(tokenizer.lineno()); // 1
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_NUMBER); // true
System.out.println(tokenizer.nval); // 1234.0
System.out.println(tokenizer.lineno()); // 1
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // XYZ
System.out.println(tokenizer.lineno()); // 2
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
System.out.println(tokenizer.lineno()); // 3
}
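One use of lineno() is reporting where unexpected input occurs. The sketch below assumes an input that should contain only numbers and returns the line of the first token that is not one; the class name LineNumberReport and the helper firstNonNumberLine are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class LineNumberReport {

    // Hypothetical helper: line number of the first non-number token, or -1 if there is none.
    static int firstNonNumberLine(String input) {
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            int type;
            while ((type = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type != StreamTokenizer.TT_NUMBER) {
                    return tokenizer.lineno();
                }
            }
            return -1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstNonNumberLine("1 2\n3 x\n4")); // 2
        System.out.println(firstNonNumberLine("1 2 3"));       // -1
    }
}
```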
void lowerCaseMode (boolean fl)
Determines whether or not word tokens are automatically lowercased.
final var s = "abcd XYZ";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.lowerCaseMode(true);
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // xyz
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
int nextToken ()
Parses the next token from the input stream of this tokenizer.
final var s = "abcd 1234 XYZ";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_NUMBER); // true
System.out.println(tokenizer.nval); // 1234.0
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // XYZ
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
void ordinaryChar (int ch)
Specifies that the character argument is "ordinary" in this tokenizer.
final var s = "abcQxyz";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.ordinaryChar('Q');
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abc
System.out.println(tokenizer.nextToken() == 'Q'); // true
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // xyz
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcQxyz
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
void ordinaryChars (int low, int hi)
Specifies that all characters c in the range low <= c <= hi are "ordinary" in this tokenizer.
Please see also ordinaryChar(int ch).
final var s = "AAAxBBByCCCzDDD";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.ordinaryChars('x', 'z');
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // AAA
System.out.println(tokenizer.nextToken() == 'x'); // true
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // BBB
System.out.println(tokenizer.nextToken() == 'y'); // true
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // CCC
System.out.println(tokenizer.nextToken() == 'z'); // true
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // DDD
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
void parseNumbers ()
Specifies that numbers should be parsed by this tokenizer.
final var s = "123";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.resetSyntax();
System.out.println(tokenizer.nextToken() == '1'); // true
System.out.println(tokenizer.nextToken() == '2'); // true
System.out.println(tokenizer.nextToken() == '3'); // true
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.resetSyntax();
tokenizer.parseNumbers();
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_NUMBER); // true
System.out.println(tokenizer.nval); // 123.0
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
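Number parsing covers digits, a decimal point, and a leading minus sign, but not exponent notation. The sketch below illustrates this with the default syntax table (which has number parsing enabled); the class name NumberFormats and the helper describe are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class NumberFormats {

    // Hypothetical helper: labels each token as a number, a word, or a single character.
    static List<String> describe(String input) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            int type;
            while ((type = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type == StreamTokenizer.TT_NUMBER) {
                    result.add("number:" + tokenizer.nval);
                } else if (type == StreamTokenizer.TT_WORD) {
                    result.add("word:" + tokenizer.sval);
                } else {
                    result.add("char:" + (char) type);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // "1e3" is not read as 1000.0: the number ends at 'e', and "e3" becomes a word.
        System.out.println(describe("-12.5 1e3")); // [number:-12.5, number:1.0, word:e3]
    }
}
```

If exponent notation matters, one option is to turn number parsing off and convert word tokens with Double.parseDouble instead.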
void pushBack ()
Causes the next call to the nextToken method of this tokenizer to return the current value in the ttype field, and not to modify the value in the nval or sval field.
final var s = "abcd 1234";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
tokenizer.pushBack();
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_NUMBER); // true
System.out.println(tokenizer.nval); // 1234.0
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
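pushBack() gives one token of lookahead. The sketch below reads words that are optionally followed by a number and pushes the token back when it is not a number; the class name OptionalValues, the helper parse, and the default value 0.0 are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class OptionalValues {

    // Hypothetical helper: maps each word to the number that follows it, or to 0.0 if none follows.
    static Map<String, Double> parse(String input) {
        final var result = new LinkedHashMap<String, Double>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            while (tokenizer.nextToken() == StreamTokenizer.TT_WORD) {
                final var key = tokenizer.sval;
                double value = 0.0;
                if (tokenizer.nextToken() == StreamTokenizer.TT_NUMBER) {
                    value = tokenizer.nval;
                } else {
                    // Not a number: let the loop condition see this token again.
                    tokenizer.pushBack();
                }
                result.put(key, value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("a 1 b c 3")); // {a=1.0, b=0.0, c=3.0}
    }
}
```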
void quoteChar (int ch)
Specifies that matching pairs of this character delimit string constants in this tokenizer.
final var s = """
abcd
=1234 XYZ=
""";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.quoteChar('=');
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
System.out.println(tokenizer.nextToken() == '='); // true
System.out.println(tokenizer.sval); // 1234 XYZ
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
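In a new tokenizer both '"' and '\'' are already quote characters, and the usual escape sequences such as \t and \n are converted to single characters while the string is parsed. A minimal sketch, with the hypothetical class QuotedStrings and helper strings:

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class QuotedStrings {

    // Hypothetical helper: returns the bodies of all quoted string tokens.
    static List<String> strings(String input) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            int type;
            while ((type = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type == '"' || type == '\'') {
                    result.add(tokenizer.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // The input text is: "a\tb" 'c d'  (a backslash followed by t, not a real tab).
        final var result = strings("\"a\\tb\" 'c d'");
        System.out.println(result.get(0).equals("a\tb")); // true
        System.out.println(result.get(1)); // c d
    }
}
```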
void resetSyntax ()
Resets this tokenizer's syntax table so that all characters are "ordinary."
Please see parseNumbers().
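resetSyntax() is also a starting point for building a custom syntax table. The sketch below splits comma-separated fields by making printable ASCII characters word constituents and then turning ',' into white space; the class name CommaSplitter and the helper fields are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class CommaSplitter {

    // Hypothetical helper: splits the input on commas and white space.
    static List<String> fields(String input) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            tokenizer.resetSyntax();               // start with every character "ordinary"
            tokenizer.wordChars('!', '~');         // printable ASCII forms words
            tokenizer.whitespaceChars(0, ' ');     // control characters and space separate tokens
            tokenizer.whitespaceChars(',', ',');   // so does the comma
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                    result.add(tokenizer.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(fields("a-1,b.2, 3c")); // [a-1, b.2, 3c]
    }
}
```

Because number parsing is not re-enabled, digits are plain word characters here. Consecutive separators collapse, so this approach does not preserve empty fields.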
void slashSlashComments (boolean flag)
Determines whether or not the tokenizer recognizes C++-style comments.
final var s = """
abcd // comment
XYZ
""";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.ordinaryChar('/');
tokenizer.slashSlashComments(true);
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // XYZ
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.ordinaryChar('/');
tokenizer.slashSlashComments(false);
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
System.out.println(tokenizer.nextToken() == '/'); // true
System.out.println(tokenizer.nextToken() == '/'); // true
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // comment
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // XYZ
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
void slashStarComments (boolean flag)
Determines whether or not the tokenizer recognizes C-style comments.
final var s = "abcd /* comment */ XYZ";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.ordinaryChar('/');
tokenizer.slashStarComments(true);
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // XYZ
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.ordinaryChar('/');
tokenizer.slashStarComments(false);
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abcd
System.out.println(tokenizer.nextToken() == '/'); // true
System.out.println(tokenizer.nextToken() == '*'); // true
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // comment
System.out.println(tokenizer.nextToken() == '*'); // true
System.out.println(tokenizer.nextToken() == '/'); // true
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // XYZ
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
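Both comment styles can be enabled together. In the sketch below, '/' is made ordinary first so that a single slash is still returned as a token while // and /* */ comments are skipped; the class name BothCommentStyles and the helper tokens are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class BothCommentStyles {

    // Hypothetical helper: collects word tokens and single-character tokens as strings.
    static List<String> tokens(String input) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            tokenizer.ordinaryChar('/');
            tokenizer.slashSlashComments(true);
            tokenizer.slashStarComments(true);
            int type;
            while ((type = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type == StreamTokenizer.TT_WORD) {
                    result.add(tokenizer.sval);
                } else if (type == StreamTokenizer.TT_NUMBER) {
                    result.add(String.valueOf(tokenizer.nval));
                } else {
                    result.add(String.valueOf((char) type));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("a / b // c\nd /* e */ f")); // [a, /, b, d, f]
    }
}
```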
String toString ()
Returns the string representation of the current stream token and the line number it occurs on.
final var s = """
abcd
1234
""";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.nextToken();
final var str1 = tokenizer.toString();
System.out.println(str1); // Token[abcd], line 1
tokenizer.nextToken();
final var str2 = tokenizer.toString();
System.out.println(str2); // Token[n=1234.0], line 2
tokenizer.nextToken();
final var str3 = tokenizer.toString();
System.out.println(str3); // Token[EOF], line 3
}
void whitespaceChars (int low, int hi)
Specifies that all characters c in the range low <= c <= hi are white space characters.
final var s = "AAAxBBByCCCzDDD";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.whitespaceChars('x', 'z');
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // AAA
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // BBB
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // CCC
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // DDD
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
void wordChars (int low, int hi)
Specifies that all characters c in the range low <= c <= hi are word constituents.
final var s = "abcdef";
try (final var reader = new StringReader(s)) {
final var tokenizer = new StreamTokenizer(reader);
tokenizer.resetSyntax();
tokenizer.wordChars('a', 'c');
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
System.out.println(tokenizer.sval); // abc
System.out.println(tokenizer.nextToken() == 'd'); // true
System.out.println(tokenizer.nextToken() == 'e'); // true
System.out.println(tokenizer.nextToken() == 'f'); // true
System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
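wordChars can also extend the default syntax table instead of replacing it. By default '_' is an ordinary character, so an identifier such as foo_bar is split; the sketch below adds '_' as a word constituent. The class name UnderscoreWords and the helper firstWord are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class UnderscoreWords {

    // Hypothetical helper: returns the first token if it is a word, otherwise null.
    static String firstWord(String input, boolean underscoreIsWordChar) {
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            if (underscoreIsWordChar) {
                tokenizer.wordChars('_', '_');
            }
            if (tokenizer.nextToken() == StreamTokenizer.TT_WORD) {
                return tokenizer.sval;
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstWord("foo_bar", false)); // foo
        System.out.println(firstWord("foo_bar", true));  // foo_bar
    }
}
```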