public abstract class AbstractTokenizer<T> extends Object implements Tokenizer<T>
An abstract tokenizer. Tokenizers extending AbstractTokenizer need only implement the getNext() method. This implementation does not allow null tokens, since null is used in the protected nextToken field to signify that no more tokens are available.

Modifier and Type | Field and Description |
---|---|
static String | NEWLINE_TOKEN: For tokenizing carriage returns. |
protected T | nextToken |
Constructor and Description |
---|
AbstractTokenizer() |
Modifier and Type | Method and Description |
---|---|
protected abstract T | getNext(): Internally fetches the next token. |
boolean | hasNext(): Returns true if this Tokenizer has more elements. |
T | next(): Returns the next token from this Tokenizer. |
T | peek(): This is an optional operation, by default supported. |
void | remove(): This is an optional operation, by default not supported. |
List<T> | tokenize(): Returns text as a List of tokens. |
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Iterator: forEachRemaining
public static final String NEWLINE_TOKEN
For tokenizing carriage returns. Returned as a representation of newlines when a tokenizer has the option tokenizeNLs = true. It is assumed that no tokenizer allows *NL* as a token. This is certainly true for PTBTokenizer-derived tokenizers, where the asterisks would become separate tokens.

protected T nextToken

protected abstract T getNext()
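
The contract described above, where getNext() returns null to signal exhaustion and nextToken buffers a single token, can be sketched as follows. This is a minimal stand-in under stated assumptions, not the library's source: LookaheadTokenizer and WhitespaceSketchTokenizer are hypothetical names, and the hasNext() and next() bodies show only one plausible way to build on getNext().

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the documented class. It mirrors the described
// contract (a null nextToken means "no more tokens") but is not library code.
abstract class LookaheadTokenizer<T> implements Iterator<T> {
    // One-token lookahead buffer; null means nothing is buffered.
    protected T nextToken;

    // Subclasses return the next token, or null when the input is exhausted.
    protected abstract T getNext();

    @Override
    public boolean hasNext() {
        if (nextToken == null) {
            nextToken = getNext();
        }
        return nextToken != null;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T result = nextToken;
        nextToken = null; // consume the buffered token
        return result;
    }
}

// Hypothetical subclass: the only method it has to supply is getNext().
class WhitespaceSketchTokenizer extends LookaheadTokenizer<String> {
    private final String[] words;
    private int position = 0;

    WhitespaceSketchTokenizer(String text) {
        String trimmed = text.trim();
        this.words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    @Override
    protected String getNext() {
        // Returning null tells the base class that the stream has ended.
        return position < words.length ? words[position++] : null;
    }
}
```

Because null is reserved as the end-of-stream signal, a subclass written this way cannot emit null as a real token.
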
public T next()
Specified by: next in interface Iterator<T>
Throws: NoSuchElementException - if the token stream has no more tokens.

public boolean hasNext()
Returns: true if this Tokenizer has more elements.

public void remove()

public T peek()
Specified by: peek in interface Tokenizer<T>
Throws: NoSuchElementException - if the token stream has no more tokens.
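
The difference between peek() and next(), and the NoSuchElementException both are documented to throw on an exhausted stream, may be easier to see in a small self-contained sketch. PeekableSketch is a hypothetical list-backed stand-in, not the library class. Its tokenize() method simply drains the remaining tokens into a List, which is an assumption about how such a method could be written.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in, not the library class: a list-backed tokenizer
// with the peek()/next() behaviour described above. Null tokens are not
// supported, since null marks an empty lookahead buffer.
class PeekableSketch<T> implements Iterator<T> {
    private final Iterator<T> source;
    private T nextToken; // null means no token is buffered

    PeekableSketch(List<T> tokens) {
        this.source = tokens.iterator();
    }

    @Override
    public boolean hasNext() {
        if (nextToken == null && source.hasNext()) {
            nextToken = source.next();
        }
        return nextToken != null;
    }

    // Returns the upcoming token without consuming it.
    public T peek() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return nextToken;
    }

    // Returns the upcoming token and consumes it.
    @Override
    public T next() {
        T result = peek();
        nextToken = null;
        return result;
    }

    // Drains whatever tokens remain into a list.
    public List<T> tokenize() {
        List<T> result = new ArrayList<>();
        while (hasNext()) {
            result.add(next());
        }
        return result;
    }
}
```

Calling peek() repeatedly returns the same token each time, while every call to next() advances the stream. Once the stream is empty, both throw NoSuchElementException in this sketch.
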