public static class SpanishTokenizer.SpanishTokenizerFactory<T extends HasWord> extends Object implements TokenizerFactory<T>
Modifier and Type | Field and Description |
---|---|
protected LexedTokenFactory<T> | factory |
protected Properties | lexerProperties |
protected boolean | splitCompoundOption |
protected boolean | splitContractionOption |
protected boolean | splitVerbOption |
Modifier and Type | Method and Description |
---|---|
Iterator<T> | getIterator(Reader r): Return an iterator over the contents read from r. |
Tokenizer<T> | getTokenizer(Reader r): Get a tokenizer for this reader. |
Tokenizer<T> | getTokenizer(Reader r, String extraOptions): Get a tokenizer for this reader. |
static TokenizerFactory<CoreLabel> | newCoreLabelTokenizerFactory() |
static <T extends HasWord> SpanishTokenizer.SpanishTokenizerFactory<T> | newSpanishTokenizerFactory(LexedTokenFactory<T> factory, String options): Constructs a new SpanishTokenizer that returns T objects and uses the options passed in. |
void | setOptions(String options): Set underlying tokenizer options. |
protected final LexedTokenFactory<T extends HasWord> factory
protected Properties lexerProperties
protected boolean splitCompoundOption
protected boolean splitVerbOption
protected boolean splitContractionOption
public static TokenizerFactory<CoreLabel> newCoreLabelTokenizerFactory()
public static <T extends HasWord> SpanishTokenizer.SpanishTokenizerFactory<T> newSpanishTokenizerFactory(LexedTokenFactory<T> factory, String options)
Parameters:
options - a String of options, separated by commas
factory - a factory for the token type that the tokenizer will return
public Iterator<T> getIterator(Reader r)
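Description copied from interface: IteratorFromReaderFactory
Return an iterator over the contents read from r.
Specified by: getIterator in interface IteratorFromReaderFactory<T extends HasWord>
Parameters:
r - Where to read objects from
public Tokenizer<T> getTokenizer(Reader r)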
Description copied from interface: TokenizerFactory
Get a tokenizer for this reader.
Specified by: getTokenizer in interface TokenizerFactory<T extends HasWord>
Parameters:
r - A Reader (which is assumed to already be buffered, if appropriate)
public void setOptions(String options)
Specified by: setOptions in interface TokenizerFactory<T extends HasWord>
Parameters:
options - A comma-separated list of options
public Tokenizer<T> getTokenizer(Reader r, String extraOptions)
Description copied from interface: TokenizerFactory
Get a tokenizer for this reader.
Specified by: getTokenizer in interface TokenizerFactory<T extends HasWord>
Parameters:
r - A Reader (which is assumed to already be buffered, if appropriate)
extraOptions - Options for how this tokenizer should behave
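
The `options` and `extraOptions` arguments are documented here only as comma-separated strings, and the factory keeps a `Properties` object named `lexerProperties`. The following is a minimal sketch of how such a string could be turned into `Properties`. It is not the SpanishTokenizer implementation: the class `OptionStringSketch`, the method `parseOptions`, the option names `optionA` and `optionB`, and the rule that a bare name means `true` are all assumptions made for illustration.

```java
import java.util.Properties;

/** Illustrative sketch only; not part of the documented class. */
final class OptionStringSketch {

    /**
     * Parses a comma-separated option string into Properties.
     * Assumption: a bare name ("optionA") is stored as "true",
     * and "name=value" is stored as given.
     */
    static Properties parseOptions(String options) {
        Properties props = new Properties();
        if (options == null || options.trim().isEmpty()) {
            return props; // nothing to set
        }
        for (String option : options.split(",")) {
            String trimmed = option.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty entries such as "a,,b"
            }
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                props.setProperty(trimmed, "true");
            } else {
                String key = trimmed.substring(0, eq).trim();
                String value = trimmed.substring(eq + 1).trim();
                props.setProperty(key, value);
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical option names, chosen only for the example.
        Properties p = parseOptions("optionA, optionB=false");
        System.out.println(p.getProperty("optionA"));
        System.out.println(p.getProperty("optionB"));
    }
}
```

Skipping null, blank, and empty entries keeps the sketch tolerant of trailing commas. Which option names the real tokenizer accepts, and how it reacts to unknown ones, is not stated in this page.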