public class Rule extends Object
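Rules have a pattern, a left context, a right context, an output phoneme, a set of languages to which they apply, and a logical flag indicating whether all of those languages must be in play. A rule matches if:
- the pattern matches at the current position;
- the string up to the beginning of the pattern matches the left context;
- the string from the end of the pattern matches the right context; and
- the logical flag is ALL and all languages are in scope, or the flag has any other value and at least one language is in scope.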
Rules are typically generated by parsing rules resources. In normal use, there will be no need for the user to explicitly construct their own.
Rules are immutable and thread-safe.
Rules resources
Rules are typically loaded from resource files. These are UTF-8 encoded text files. They are systematically named following the pattern:
org/apache/commons/codec/language/bm/${NameType#getName}_${RuleType#getName}_${language}.txt
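As an illustration of the naming pattern, the following minimal sketch builds a resource name from three plain strings. The class `RuleResourceNames` and its method are hypothetical and are not the library's private `createResourceName`; the values `"gen"`, `"approx"` and `"french"` are assumed examples of what `NameType#getName`, `RuleType#getName` and the language might yield.

```java
// Sketch only: mirrors the documented naming pattern with plain strings.
final class RuleResourceNames {
    private RuleResourceNames() {
    }

    /** Fills the three placeholders of the documented resource-name pattern. */
    static String resourceName(String nameType, String ruleType, String language) {
        return String.format("org/apache/commons/codec/language/bm/%s_%s_%s.txt",
                nameType, ruleType, language);
    }

    public static void main(String[] args) {
        // The argument values here are assumed examples, not taken from this page.
        System.out.println(resourceName("gen", "approx", "french"));
    }
}
```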
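The format of these resources is the following:
- Rules: whitespace-separated, double-quoted strings. There should be 4 columns in each row, interpreted in order as pattern, left context, right context and phoneme.
- End-of-line comments: any occurrence of '//' causes all text following it on that line to be discarded as a comment.
- Multi-line comments: any line starting with '/*' starts multi-line commenting mode, which skips all content until a line ending in '*/' is found.
- Blank lines: all blank lines are skipped.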
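A minimal sketch of reading one rule line, assuming each rule is written as four whitespace-separated, double-quoted columns (pattern, left context, right context, phoneme) and that `//` starts an end-of-line comment. The class `RuleLineSketch` and its methods are hypothetical, multi-line comments are not handled, and a `//` inside a quoted column would be misread as a comment.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: not the library's private parseRules or stripQuotes methods.
final class RuleLineSketch {
    private RuleLineSketch() {
    }

    /** Removes one leading and one trailing double quote, if present. */
    static String stripQuotes(String str) {
        String result = str;
        if (result.startsWith("\"")) {
            result = result.substring(1);
        }
        if (result.endsWith("\"")) {
            result = result.substring(0, result.length() - 1);
        }
        return result;
    }

    /** Returns the four columns of a rule line, or an empty list for blank or comment-only lines. */
    static List<String> parseLine(String rawLine) {
        String line = rawLine;
        int comment = line.indexOf("//");
        if (comment >= 0) {
            line = line.substring(0, comment); // drop the end-of-line comment
        }
        line = line.trim();
        List<String> columns = new ArrayList<>();
        if (line.isEmpty()) {
            return columns;
        }
        String[] parts = line.split("\\s+");
        if (parts.length != 4) {
            throw new IllegalArgumentException(
                    "Expected 4 columns but found " + parts.length + ": " + rawLine);
        }
        for (String part : parts) {
            columns.add(stripQuotes(part));
        }
        return columns;
    }
}
```

Throwing on a malformed row rather than skipping it is a design choice of this sketch; it makes a broken resource fail at load time instead of silently losing rules.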
Modifier and Type | Class and Description |
---|---|
static class | Rule.Phoneme |
static interface | Rule.PhonemeExpr |
static class | Rule.PhonemeList |
static interface | Rule.RPattern: A minimal wrapper around the functionality of Pattern that we use, to allow for alternate implementations. |
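The idea of a minimal matching wrapper with interchangeable implementations can be sketched as follows. The interface `RPatternSketch`, its single method `isMatch`, and the factory methods are hypothetical names chosen for illustration; this page does not list the members of `Rule.RPattern`.

```java
import java.util.regex.Pattern;

// Sketch only: a single-method matcher abstraction with several backing implementations.
interface RPatternSketch {
    boolean isMatch(CharSequence input);
}

final class RPatternDemo {
    /** A matcher that accepts every input, in the spirit of a constant like ALL_STRINGS_RMATCHER. */
    static final RPatternSketch ALL_STRINGS = input -> true;

    private RPatternDemo() {
    }

    /** Backed by java.util.regex: true if the regex is found anywhere in the input. */
    static RPatternSketch regex(String regex) {
        Pattern compiled = Pattern.compile(regex);
        return input -> compiled.matcher(input).find();
    }

    /** Backed by a plain string comparison, with no regex engine involved. */
    static RPatternSketch exact(String literal) {
        return input -> literal.contentEquals(input);
    }
}
```

Because callers only see `isMatch`, a cheap string comparison can stand in for a compiled regular expression wherever the pattern is simple enough.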
Modifier and Type | Field and Description |
---|---|
static String | ALL |
static Rule.RPattern | ALL_STRINGS_RMATCHER |
private static String | DOUBLE_QUOTE |
private static String | HASH_INCLUDE |
private Rule.RPattern | lContext |
private String | pattern |
private Rule.PhonemeExpr | phoneme |
private Rule.RPattern | rContext |
private static Map<NameType,Map<RuleType,Map<String,Map<String,List<Rule>>>>> | RULES |
Constructor and Description |
---|
Rule(String pattern, String lContext, String rContext, Rule.PhonemeExpr phoneme): Creates a new rule. |
Modifier and Type | Method and Description |
---|---|
private static boolean | contains(CharSequence chars, char input) |
private static String | createResourceName(NameType nameType, RuleType rt, String lang) |
private static Scanner | createScanner(NameType nameType, RuleType rt, String lang) |
private static Scanner | createScanner(String lang) |
private static boolean | endsWith(CharSequence input, CharSequence suffix) |
static List<Rule> | getInstance(NameType nameType, RuleType rt, Languages.LanguageSet langs): Gets rules for a combination of name type, rule type and languages. |
static List<Rule> | getInstance(NameType nameType, RuleType rt, String lang): Gets rules for a combination of name type, rule type and a single language. |
static Map<String,List<Rule>> | getInstanceMap(NameType nameType, RuleType rt, Languages.LanguageSet langs): Gets rules for a combination of name type, rule type and languages. |
static Map<String,List<Rule>> | getInstanceMap(NameType nameType, RuleType rt, String lang): Gets rules for a combination of name type, rule type and a single language. |
Rule.RPattern | getLContext(): Gets the left context. |
String | getPattern(): Gets the pattern. |
Rule.PhonemeExpr | getPhoneme(): Gets the phoneme. |
Rule.RPattern | getRContext(): Gets the right context. |
private static Rule.Phoneme | parsePhoneme(String ph) |
private static Rule.PhonemeExpr | parsePhonemeExpr(String ph) |
private static Map<String,List<Rule>> | parseRules(Scanner scanner, String location) |
private static Rule.RPattern | pattern(String regex): Attempts to compile the regex into direct string ops, falling back to Pattern and Matcher in the worst case. |
boolean | patternAndContextMatches(CharSequence input, int i): Decides if the pattern and context match the input starting at a position. |
private static boolean | startsWith(CharSequence input, CharSequence prefix) |
private static String | stripQuotes(String str) |
public static final Rule.RPattern ALL_STRINGS_RMATCHER
public static final String ALL
private static final String DOUBLE_QUOTE
private static final String HASH_INCLUDE
private final Rule.RPattern lContext
private final Rule.PhonemeExpr phoneme
private final Rule.RPattern rContext
public Rule(String pattern, String lContext, String rContext, Rule.PhonemeExpr phoneme)
pattern - the pattern
lContext - the left context
rContext - the right context
phoneme - the resulting phoneme
private static boolean contains(CharSequence chars, char input)
private static String createResourceName(NameType nameType, RuleType rt, String lang)
private static Scanner createScanner(NameType nameType, RuleType rt, String lang)
private static Scanner createScanner(String lang)
private static boolean endsWith(CharSequence input, CharSequence suffix)
public static List<Rule> getInstance(NameType nameType, RuleType rt, Languages.LanguageSet langs)
nameType - the NameType to consider
rt - the RuleType to consider
langs - the set of languages to consider
public static List<Rule> getInstance(NameType nameType, RuleType rt, String lang)
nameType - the NameType to consider
rt - the RuleType to consider
lang - the language to consider
public static Map<String,List<Rule>> getInstanceMap(NameType nameType, RuleType rt, Languages.LanguageSet langs)
nameType - the NameType to consider
rt - the RuleType to consider
langs - the set of languages to consider
public static Map<String,List<Rule>> getInstanceMap(NameType nameType, RuleType rt, String lang)
nameType - the NameType to consider
rt - the RuleType to consider
lang - the language to consider
private static Rule.Phoneme parsePhoneme(String ph)
private static Rule.PhonemeExpr parsePhonemeExpr(String ph)
private static Rule.RPattern pattern(String regex)
regex - the regular expression to compile
private static boolean startsWith(CharSequence input, CharSequence prefix)
private static String stripQuotes(String str)
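The summary for pattern(String regex) describes compiling a regex into direct string operations, with Pattern and Matcher as the fallback, and the private helpers startsWith, endsWith and contains hint at what those operations are. The sketch below shows one way this could work, under the assumption that only purely alphanumeric bodies with optional `^` and `$` anchors are treated as literals. The class `PatternCompilerSketch` and the use of `Predicate<CharSequence>` are hypothetical and not the library's implementation.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Sketch only: picks a cheap string operation when the regex is really a literal.
final class PatternCompilerSketch {
    private PatternCompilerSketch() {
    }

    static Predicate<CharSequence> compile(String regex) {
        boolean anchoredStart = regex.startsWith("^");
        boolean anchoredEnd = regex.endsWith("$");
        String body = regex.substring(anchoredStart ? 1 : 0,
                anchoredEnd ? regex.length() - 1 : regex.length());
        if (isLiteral(body)) {
            if (anchoredStart && anchoredEnd) {
                return input -> body.contentEquals(input); // ^abc$ : whole input equals abc
            }
            if (anchoredStart) {
                return input -> startsWith(input, body);   // ^abc  : prefix test
            }
            if (anchoredEnd) {
                return input -> endsWith(input, body);     // abc$  : suffix test
            }
            return input -> input.toString().contains(body); // abc : substring test
        }
        // Worst case: anything with metacharacters goes through java.util.regex.
        Pattern compiled = Pattern.compile(regex);
        return input -> compiled.matcher(input).find();
    }

    /** True if every character is a letter or digit, so the text has no regex metacharacters. */
    private static boolean isLiteral(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (!Character.isLetterOrDigit(text.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    static boolean startsWith(CharSequence input, CharSequence prefix) {
        if (prefix.length() > input.length()) {
            return false;
        }
        for (int i = 0; i < prefix.length(); i++) {
            if (input.charAt(i) != prefix.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    static boolean endsWith(CharSequence input, CharSequence suffix) {
        int offset = input.length() - suffix.length();
        if (offset < 0) {
            return false;
        }
        for (int i = 0; i < suffix.length(); i++) {
            if (input.charAt(offset + i) != suffix.charAt(i)) {
                return false;
            }
        }
        return true;
    }
}
```

The literal shortcuts agree with `Matcher.find()` for inputs without line terminators; a `$` anchor in java.util.regex also matches before a trailing line terminator, which the suffix test in this sketch does not reproduce.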
public Rule.RPattern getLContext()
public String getPattern()
public Rule.PhonemeExpr getPhoneme()
public Rule.RPattern getRContext()
public boolean patternAndContextMatches(CharSequence input, int i)
It is a match if lContext matches input up to i, pattern matches at i, and rContext matches from the end of the match of pattern to the end of input.
input - the input String
i - the int position within the input
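The three conditions checked by patternAndContextMatches can be sketched with java.util.regex as below. This is a minimal illustration under stated assumptions, not the library's code: the class `RuleMatchSketch` is hypothetical, the pattern is treated as a literal string, the left context is assumed to be anchored at the end of the text before position i, and the right context at the start of the text after the pattern.

```java
import java.util.regex.Pattern;

// Sketch only: literal pattern plus regex contexts on either side of it.
final class RuleMatchSketch {
    private final String pattern;
    private final Pattern lContext;
    private final Pattern rContext;

    RuleMatchSketch(String pattern, String lContext, String rContext) {
        this.pattern = pattern;
        // Assumption: contexts are regexes; the group keeps alternations such as a|b anchored as a whole.
        this.lContext = Pattern.compile("(?:" + lContext + ")$");
        this.rContext = Pattern.compile("^(?:" + rContext + ")");
    }

    boolean patternAndContextMatches(CharSequence input, int i) {
        if (i < 0) {
            throw new IndexOutOfBoundsException("Cannot match pattern at negative index " + i);
        }
        int end = i + pattern.length();
        if (end > input.length()) {
            return false; // not enough input left for the pattern
        }
        if (!pattern.contentEquals(input.subSequence(i, end))) {
            return false; // the pattern itself does not occur at position i
        }
        // Left context is tested against input[0, i), right context against input[end, length).
        return lContext.matcher(input.subSequence(0, i)).find()
                && rContext.matcher(input.subSequence(end, input.length())).find();
    }
}
```

For example, a rule with pattern `sch`, an empty left context and right context `[ei]` matches `schein` at position 0 but not `schule`, because the text after the pattern starts with `u`.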