Java : StreamTokenizer with Examples

StreamTokenizer (Java SE 19 & JDK 19) API examples.
This page provides code examples for most StreamTokenizer methods.


Summary

The StreamTokenizer class takes an input stream and parses it into "tokens", allowing the tokens to be read one at a time. The parsing process is controlled by a table and a number of flags that can be set to various states. The stream tokenizer can recognize identifiers, numbers, quoted strings, and various comment styles.

final var s = """
        abcd 1234
        "X  Y  Z"
        """;
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_NUMBER); // true
    System.out.println(tokenizer.nval); // 1234.0

    System.out.println(tokenizer.nextToken() == '"'); // true
    System.out.println(tokenizer.sval); // X  Y  Z

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}

Fields

double nval

If the current token is a number, this field contains the value of that number.

Please see nextToken().
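As a rough sketch of how nval behaves with the default syntax: a leading minus sign and a decimal point are read as part of a number, but exponent notation does not appear to be supported, so 1e3 comes back as the number 1.0 followed by the word e3. The class NvalDemo and the helper describe are names made up for this illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class NvalDemo {

    // Hypothetical helper: renders every token of the input as text.
    static List<String> describe(String input) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
                    // nval is always a double, even for integer-looking input.
                    result.add("number " + tokenizer.nval);
                } else if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                    result.add("word " + tokenizer.sval);
                } else {
                    result.add("char " + (char) tokenizer.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(describe("-12.5 1e3")); // [number -12.5, number 1.0, word e3]
    }
}
```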

String sval

If the current token is a word token, this field contains a string giving the characters of the word token.

Please see nextToken().
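The sketch below records sval after every token. It assumes the default syntax, where sval is also filled in for quoted strings and is null when the current token is a number. SvalDemo and svalPerToken are hypothetical names used only here.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class SvalDemo {

    // Hypothetical helper: collects the value of sval after each token.
    static List<String> svalPerToken(String input) {
        final var values = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                values.add(tokenizer.sval); // null for number tokens
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(svalPerToken("abc 42 \"quoted text\"")); // [abc, null, quoted text]
    }
}
```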

static final int TT_EOF

A constant indicating that the end of the stream has been read.

final var s = "abcd\nXYZ";
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);
    tokenizer.eolIsSignificant(true);

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOL); // true

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // XYZ

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}

static final int TT_EOL

A constant indicating that the end of the line has been read.

Please see TT_EOF.

static final int TT_NUMBER

A constant indicating that a number token has been read.

Please see nextToken().

static final int TT_WORD

A constant indicating that a word token has been read.

Please see nextToken().

int ttype

After a call to the nextToken method, this field contains the type of the token just read.

final var s = "abcd 1234 XYZ";
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);

    tokenizer.nextToken();
    System.out.println(tokenizer.ttype == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd

    tokenizer.nextToken();
    System.out.println(tokenizer.ttype == StreamTokenizer.TT_NUMBER); // true
    System.out.println(tokenizer.nval); // 1234.0

    tokenizer.nextToken();
    System.out.println(tokenizer.ttype == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // XYZ

    tokenizer.nextToken();
    System.out.println(tokenizer.ttype == StreamTokenizer.TT_EOF); // true
}
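Because ttype is either one of the TT_ constants or the character code of an ordinary or quote character, a switch on it is a common way to dispatch on tokens. The following is only a sketch; TtypeDemo, kind and kinds are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class TtypeDemo {

    // Hypothetical helper: turns the current token into a readable label.
    static String kind(StreamTokenizer tokenizer) {
        switch (tokenizer.ttype) {
            case StreamTokenizer.TT_WORD:
                return "WORD(" + tokenizer.sval + ")";
            case StreamTokenizer.TT_NUMBER:
                return "NUMBER(" + tokenizer.nval + ")";
            case StreamTokenizer.TT_EOL:
                return "EOL";
            case StreamTokenizer.TT_EOF:
                return "EOF";
            case '"':
            case '\'':
                // For quoted strings, ttype holds the quote character.
                return "QUOTED(" + tokenizer.sval + ")";
            default:
                // Any other value is the code of an ordinary character.
                return "CHAR(" + (char) tokenizer.ttype + ")";
        }
    }

    static List<String> kinds(String input) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                result.add(kind(tokenizer));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(kinds("x = 3.5 + \"text\""));
        // [WORD(x), CHAR(=), NUMBER(3.5), CHAR(+), QUOTED(text)]
    }
}
```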

Constructors

StreamTokenizer (InputStream is)

Deprecated. As of JDK version 1.1, the preferred way to tokenize an input stream is to convert it into a character stream and pass that Reader to the StreamTokenizer(Reader) constructor.
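The conversion recommended by the deprecation note might look like the following sketch, which wraps the InputStream in an InputStreamReader and a BufferedReader. The UTF-8 charset, the class ReaderWrapDemo and the helper words are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StreamTokenizer;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ReaderWrapDemo {

    // Hypothetical helper: reads all word tokens from a byte stream.
    static List<String> words(InputStream in) {
        final var result = new ArrayList<String>();
        // Convert the byte stream into a character stream first.
        // Closing the reader also closes the underlying stream.
        try (final var reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            final var tokenizer = new StreamTokenizer(reader);
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                    result.add(tokenizer.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        final var bytes = "abcd XYZ".getBytes(StandardCharsets.UTF_8);
        System.out.println(words(new ByteArrayInputStream(bytes))); // [abcd, XYZ]
    }
}
```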

StreamTokenizer (Reader r)

Create a tokenizer that parses the given character stream.

final var s = "abcd XYZ";
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // XYZ

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}

Methods

void commentChar (int ch)

Specifies that the character argument starts a single-line comment.

final var s = """
        abcd
        # comment
        XYZ
        """;
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);
    tokenizer.commentChar('#');

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // XYZ

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
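One detail worth knowing: a newly constructed tokenizer already treats '/' as a comment character, so everything from a single slash to the end of the line is skipped unless that is turned off with ordinaryChar('/'). The sketch below compares both settings; CommentCharDemo and tokens are hypothetical names, and the helper only expects words and single characters in its input.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class CommentCharDemo {

    // Hypothetical helper: lists words and ordinary characters as strings.
    static List<String> tokens(String input, boolean slashIsOrdinary) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            if (slashIsOrdinary) {
                // Removes the default comment meaning of '/'.
                tokenizer.ordinaryChar('/');
            }
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                    result.add(tokenizer.sval);
                } else {
                    result.add(String.valueOf((char) tokenizer.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        final var s = "abcd / note\nXYZ";
        System.out.println(tokens(s, false)); // [abcd, XYZ]
        System.out.println(tokens(s, true)); // [abcd, /, note, XYZ]
    }
}
```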

void eolIsSignificant (boolean flag)

Determines whether or not ends of line are treated as tokens.

Please see TT_EOF.
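A typical use of eolIsSignificant(true) is line-oriented parsing. The following sketch groups word tokens by line, starting a new group whenever TT_EOL is returned. EolDemo and wordsPerLine are hypothetical names, and non-word tokens are simply ignored here.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class EolDemo {

    // Hypothetical helper: returns the word tokens of each line.
    static List<List<String>> wordsPerLine(String input) {
        final var lines = new ArrayList<List<String>>();
        var current = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            tokenizer.eolIsSignificant(true);
            int type;
            while ((type = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type == StreamTokenizer.TT_EOL) {
                    // End of line: close the current group.
                    lines.add(current);
                    current = new ArrayList<>();
                } else if (type == StreamTokenizer.TT_WORD) {
                    current.add(tokenizer.sval);
                }
            }
            // Keep a last line that has no trailing line break.
            if (!current.isEmpty()) {
                lines.add(current);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(wordsPerLine("a b\nc d e\n")); // [[a, b], [c, d, e]]
    }
}
```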

int lineno ()

Returns the current line number.

final var s = """
        abcd 1234
        XYZ
        """;
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd
    System.out.println(tokenizer.lineno()); // 1

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_NUMBER); // true
    System.out.println(tokenizer.nval); // 1234.0
    System.out.println(tokenizer.lineno()); // 1

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // XYZ
    System.out.println(tokenizer.lineno()); // 2

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
    System.out.println(tokenizer.lineno()); // 3
}
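lineno() is mostly useful for diagnostics, for example to report where a token was found. The sketch below maps each word to the line it appears on; blank lines are counted too. LinenoDemo and lineOfEachWord are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

class LinenoDemo {

    // Hypothetical helper: maps every word token to its line number.
    static Map<String, Integer> lineOfEachWord(String input) {
        final var result = new LinkedHashMap<String, Integer>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                    // Ask for the line right after reading the token.
                    result.put(tokenizer.sval, tokenizer.lineno());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(lineOfEachWord("a b\nc\n\nd")); // {a=1, b=1, c=2, d=4}
    }
}
```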

void lowerCaseMode (boolean fl)

Determines whether or not word tokens are automatically lowercased.

final var s = "abcd XYZ";
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);
    tokenizer.lowerCaseMode(true);

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // xyz

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
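Lowercasing applies to word tokens. As far as I can tell, quoted strings keep their original case, which the following sketch illustrates. LowerCaseDemo and texts are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LowerCaseDemo {

    // Hypothetical helper: collects sval of word and quoted-string tokens.
    static List<String> texts(String input) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            tokenizer.lowerCaseMode(true);
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD
                        || tokenizer.ttype == '"') {
                    result.add(tokenizer.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(texts("ABC \"DEF\"")); // [abc, DEF]
    }
}
```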

int nextToken ()

Parses the next token from the input stream of this tokenizer.

final var s = "abcd 1234 XYZ";
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_NUMBER); // true
    System.out.println(tokenizer.nval); // 1234.0

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // XYZ

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
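In practice nextToken() is usually called in a loop until it returns TT_EOF. Here is a small sketch that sums every number in the input and skips all other tokens; SumDemo and sumNumbers are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class SumDemo {

    // Hypothetical helper: adds up all number tokens in the input.
    static double sumNumbers(String input) {
        double sum = 0;
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            // Standard read loop: stop when the end of the stream is reached.
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
                    sum += tokenizer.nval;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumNumbers("10 apples 20 pears 12.5")); // 42.5
    }
}
```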

void ordinaryChar (int ch)

Specifies that the character argument is "ordinary" in this tokenizer.

final var s = "abcQxyz";
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);
    tokenizer.ordinaryChar('Q');

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abc

    System.out.println(tokenizer.nextToken() == 'Q'); // true

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // xyz

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}

try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcQxyz

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}

void ordinaryChars (int low, int hi)

Specifies that all characters c in the range low <= c <= high are "ordinary" in this tokenizer.

Please see also: ordinaryChar(int ch)

final var s = "AAAxBBByCCCzDDD";
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);
    tokenizer.ordinaryChars('x', 'z');

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // AAA

    System.out.println(tokenizer.nextToken() == 'x'); // true

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // BBB

    System.out.println(tokenizer.nextToken() == 'y'); // true

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // CCC

    System.out.println(tokenizer.nextToken() == 'z'); // true

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // DDD

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
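ordinaryChars is also handy for switching off number parsing. In the sketch below, the digits are first made ordinary (which clears their numeric meaning) and then registered as word characters, so 123 is returned as a word instead of a number. This two-step approach is my understanding of how the syntax table behaves; DigitsAsWordsDemo and kinds are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class DigitsAsWordsDemo {

    // Hypothetical helper: labels each token as a word or a number.
    static List<String> kinds(String input, boolean digitsAsWords) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            if (digitsAsWords) {
                tokenizer.ordinaryChars('0', '9'); // drop the numeric meaning
                tokenizer.wordChars('0', '9'); // then treat digits as letters
            }
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                    result.add("word " + tokenizer.sval);
                } else if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
                    result.add("number " + tokenizer.nval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(kinds("abc 123", false)); // [word abc, number 123.0]
        System.out.println(kinds("abc 123", true)); // [word abc, word 123]
    }
}
```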

void parseNumbers ()

Specifies that numbers should be parsed by this tokenizer.

final var s = "123";
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);
    tokenizer.resetSyntax();

    System.out.println(tokenizer.nextToken() == '1'); // true
    System.out.println(tokenizer.nextToken() == '2'); // true
    System.out.println(tokenizer.nextToken() == '3'); // true
    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}

try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);
    tokenizer.resetSyntax();
    tokenizer.parseNumbers();

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_NUMBER); // true
    System.out.println(tokenizer.nval); // 123.0

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}

void pushBack ()

Causes the next call to the nextToken method of this tokenizer to return the current value in the ttype field, and not to modify the value in the nval or sval field.

final var s = "abcd 1234";
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd

    tokenizer.pushBack();

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_NUMBER); // true
    System.out.println(tokenizer.nval); // 1234.0

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
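pushBack() makes one-token lookahead possible. The sketch below uses a small peek helper to read numbers that may or may not be followed by a unit word. Note that peeking overwrites nval and sval, so the number is saved first. PushBackDemo, peek and quantities are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class PushBackDemo {

    // Hypothetical helper: returns the type of the next token without consuming it.
    static int peek(StreamTokenizer tokenizer) throws IOException {
        final int type = tokenizer.nextToken();
        tokenizer.pushBack();
        return type;
    }

    // Hypothetical helper: reads numbers with an optional unit word after them.
    static List<String> quantities(String input) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype != StreamTokenizer.TT_NUMBER) {
                    continue; // ignore anything that does not start a quantity
                }
                final double value = tokenizer.nval; // save before peeking
                if (peek(tokenizer) == StreamTokenizer.TT_WORD) {
                    tokenizer.nextToken(); // consume the pushed-back word
                    result.add(value + " " + tokenizer.sval);
                } else {
                    result.add(String.valueOf(value));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(quantities("5 kg 7 3 m")); // [5.0 kg, 7.0, 3.0 m]
    }
}
```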

void quoteChar (int ch)

Specifies that matching pairs of this character delimit string constants in this tokenizer.

final var s = """
        abcd
        =1234 XYZ=
        """;
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);
    tokenizer.quoteChar('=');

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd

    System.out.println(tokenizer.nextToken() == '='); // true
    System.out.println(tokenizer.sval); // 1234 XYZ

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
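By default both the double quote and the single quote are quote characters, and the usual backslash escapes such as \t and \n are translated inside quoted strings. The following sketch shows both points; QuoteEscapeDemo and quoted are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class QuoteEscapeDemo {

    // Hypothetical helper: collects the contents of all quoted strings.
    static List<String> quoted(String input) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                // For a quoted string, ttype is the quote character itself.
                if (tokenizer.ttype == '"' || tokenizer.ttype == '\'') {
                    result.add(tokenizer.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // The input text is: 'single' "tab\there" (backslash followed by t).
        final var values = quoted("'single' \"tab\\there\"");
        System.out.println(values.get(0)); // single
        System.out.println(values.get(1).equals("tab\there")); // true
    }
}
```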

void resetSyntax ()

Resets this tokenizer's syntax table so that all characters are "ordinary."

Please see parseNumbers().
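resetSyntax() is a good starting point for a fully custom syntax. As an illustration, the sketch below rebuilds a plain whitespace splitter: every visible character is a word character, so numbers stay strings and quotes or comments get no special treatment. ResetSyntaxDemo and split are hypothetical names, and the character ranges are choices made for this example.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ResetSyntaxDemo {

    // Hypothetical helper: splits the input on whitespace only.
    static List<String> split(String input) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            tokenizer.resetSyntax(); // start from a blank syntax table
            tokenizer.wordChars(33, 255); // everything after the space is a word character
            tokenizer.whitespaceChars(0, ' '); // control characters and space separate tokens
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                    result.add(tokenizer.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(split("abc 123 x-1")); // [abc, 123, x-1]
    }
}
```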

void slashSlashComments (boolean flag)

Determines whether or not the tokenizer recognizes C++-style comments.

final var s = """
        abcd // comment
        XYZ
        """;
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);
    tokenizer.ordinaryChar('/');
    tokenizer.slashSlashComments(true);

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // XYZ

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}

try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);
    tokenizer.ordinaryChar('/');
    tokenizer.slashSlashComments(false);

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd

    System.out.println(tokenizer.nextToken() == '/'); // true
    System.out.println(tokenizer.nextToken() == '/'); // true

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // comment

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // XYZ

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}

void slashStarComments (boolean flag)

Determines whether or not the tokenizer recognizes C-style comments.

final var s = "abcd /* comment */ XYZ";
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);
    tokenizer.ordinaryChar('/');
    tokenizer.slashStarComments(true);

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // XYZ

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}

try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);
    tokenizer.ordinaryChar('/');
    tokenizer.slashStarComments(false);

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abcd

    System.out.println(tokenizer.nextToken() == '/'); // true
    System.out.println(tokenizer.nextToken() == '*'); // true

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // comment

    System.out.println(tokenizer.nextToken() == '*'); // true
    System.out.println(tokenizer.nextToken() == '/'); // true

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // XYZ

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}

String toString ()

Returns the string representation of the current stream token and the line number it occurs on.

final var s = """
        abcd
        1234
        """;
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);

    tokenizer.nextToken();
    final var str1 = tokenizer.toString();
    System.out.println(str1); // Token[abcd], line 1

    tokenizer.nextToken();
    final var str2 = tokenizer.toString();
    System.out.println(str2); // Token[n=1234.0], line 2

    tokenizer.nextToken();
    final var str3 = tokenizer.toString();
    System.out.println(str3); // Token[EOF], line 3
}

void whitespaceChars (int low, int hi)

Specifies that all characters c in the range low <= c <= high are white space characters.

final var s = "AAAxBBByCCCzDDD";
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);
    tokenizer.whitespaceChars('x', 'z');

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // AAA

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // BBB

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // CCC

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // DDD

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
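A practical use is to treat a separator such as the comma as white space, so a comma-separated list of numbers can be read directly. This is only a sketch; WhitespaceCharsDemo and numbers are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class WhitespaceCharsDemo {

    // Hypothetical helper: reads comma-separated numbers.
    static List<Double> numbers(String input) {
        final var result = new ArrayList<Double>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            // A single-character range: the comma now only separates tokens.
            tokenizer.whitespaceChars(',', ',');
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
                    result.add(tokenizer.nval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(numbers("1, 2.5,3")); // [1.0, 2.5, 3.0]
    }
}
```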

void wordChars (int low, int hi)

Specifies that all characters c in the range low <= c <= high are word constituents.

final var s = "abcdef";
try (final var reader = new StringReader(s)) {
    final var tokenizer = new StreamTokenizer(reader);
    tokenizer.resetSyntax();
    tokenizer.wordChars('a', 'c');

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_WORD); // true
    System.out.println(tokenizer.sval); // abc

    System.out.println(tokenizer.nextToken() == 'd'); // true
    System.out.println(tokenizer.nextToken() == 'e'); // true
    System.out.println(tokenizer.nextToken() == 'f'); // true

    System.out.println(tokenizer.nextToken() == StreamTokenizer.TT_EOF); // true
}
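wordChars can also extend the default syntax instead of replacing it. With the defaults the underscore is an ordinary character, so snake_case is split; adding '_' as a word character keeps it together. WordCharsDemo and words are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class WordCharsDemo {

    // Hypothetical helper: collects word tokens, optionally allowing '_' in words.
    static List<String> words(String input, boolean underscoreIsWordChar) {
        final var result = new ArrayList<String>();
        try (final var reader = new StringReader(input)) {
            final var tokenizer = new StreamTokenizer(reader);
            if (underscoreIsWordChar) {
                tokenizer.wordChars('_', '_');
            }
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                    result.add(tokenizer.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("snake_case", false)); // [snake, case]
        System.out.println(words("snake_case", true)); // [snake_case]
    }
}
```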
