public interface Tokenizer
Modifier and Type | Method and Description |
---|---|
String | getErrorDescription(): If hasErrors returns true, returns a description of the error encountered. |
Token | getNextToken(): Returns the next token. |
boolean | hasErrors(): Returns true if there were errors while reading tokens. |
boolean | hasMoreTokens(): Returns true if there are more tokens, false otherwise. |
boolean | isBreak(): Determines if the current token should start a new sentence. |
void | setInputReader(Reader reader): Sets the input reader. |
void | setInputText(String textToTokenize): Sets the text to be tokenized by this tokenizer. |
void | setPostpunctuationSymbols(String symbols): Sets the postpunctuation symbols of this Tokenizer to the given symbols. |
void | setPrepunctuationSymbols(String symbols): Sets the prepunctuation symbols of this Tokenizer to the given symbols. |
void | setSingleCharSymbols(String symbols): Sets the single character symbols of this Tokenizer to the given symbols. |
void | setWhitespaceSymbols(String symbols): Sets the whitespace symbols of this Tokenizer to the given symbols. |
void setInputText(String textToTokenize)
textToTokenize - the text to tokenize
void setInputReader(Reader reader)
reader - the input source
Token getNextToken()
boolean hasMoreTokens()
boolean hasErrors()
String getErrorDescription()
void setWhitespaceSymbols(String symbols)
symbols - the whitespace symbols
void setSingleCharSymbols(String symbols)
symbols - the single character symbols
void setPrepunctuationSymbols(String symbols)
symbols - the prepunctuation symbols
void setPostpunctuationSymbols(String symbols)
symbols - the postpunctuation symbols
boolean isBreak()
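
The methods listed above suggest a read loop: set the input, call getNextToken() while hasMoreTokens() returns true, then check hasErrors(). The following is a minimal sketch of that loop, not the library's implementation. To keep it self-contained, it declares its own copy of a subset of the interface (signatures taken from this page), a hypothetical Token stand-in with an assumed getWord() accessor, and a hypothetical WhitespaceTokenizer that splits on whitespace only. Checking hasErrors() once after the loop is also an assumption; a real implementation may need the check after each token.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenizerLoopSketch {

    /** Hypothetical stand-in for the library's Token type; only the word is modeled. */
    static final class Token {
        private final String word;
        Token(String word) { this.word = word; }
        String getWord() { return word; } // assumed accessor, not taken from this page
    }

    /** Subset of the Tokenizer interface, with signatures as documented on this page. */
    interface Tokenizer {
        void setInputText(String textToTokenize);
        void setWhitespaceSymbols(String symbols);
        boolean hasMoreTokens();
        Token getNextToken();
        boolean hasErrors();
        String getErrorDescription();
    }

    /** Hypothetical implementation that splits on whitespace symbols only. */
    static final class WhitespaceTokenizer implements Tokenizer {
        private String text = "";
        private String whitespace = " \t\n\r"; // assumed default, for illustration
        private int pos = 0;

        public void setInputText(String textToTokenize) {
            text = (textToTokenize == null) ? "" : textToTokenize;
            pos = 0;
        }

        public void setWhitespaceSymbols(String symbols) { whitespace = symbols; }

        public boolean hasMoreTokens() {
            skipWhitespace();
            return pos < text.length();
        }

        public Token getNextToken() {
            skipWhitespace();
            int start = pos;
            // Consume characters until the next whitespace symbol or end of input.
            while (pos < text.length() && whitespace.indexOf(text.charAt(pos)) < 0) {
                pos++;
            }
            return new Token(text.substring(start, pos));
        }

        // Reading from an in-memory String cannot fail in this sketch.
        public boolean hasErrors() { return false; }
        public String getErrorDescription() { return null; }

        private void skipWhitespace() {
            while (pos < text.length() && whitespace.indexOf(text.charAt(pos)) >= 0) {
                pos++;
            }
        }
    }

    /** Drains the tokenizer and returns the word of each token in order. */
    static List<String> readAllWords(Tokenizer tokenizer, String text) {
        tokenizer.setInputText(text);
        List<String> words = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            words.add(tokenizer.getNextToken().getWord());
        }
        if (tokenizer.hasErrors()) {
            throw new IllegalStateException(tokenizer.getErrorDescription());
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [Hello, brave, new, world]
        System.out.println(readAllWords(new WhitespaceTokenizer(), "Hello  brave new world"));
    }
}
```

With a real implementation of the interface, only readAllWords would carry over; the stand-in types exist so the sketch compiles on its own.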
WebARTS Library licensed under the GNU General Public License. Other libraries are licensed under their respective open source licenses.