public class Nysiis extends Object implements StringEncoder
NYSIIS features an accuracy increase of 2.7% over the traditional Soundex algorithm.
Algorithm description:
1. Transcode first characters of name
   1a. MAC -> MCC
   1b. KN -> NN
   1c. K -> C
   1d. PH -> FF
   1e. PF -> FF
   1f. SCH -> SSS
2. Transcode last characters of name
   2a. EE, IE -> Y
   2b. DT, RT, RD, NT, ND -> D
3. First character of key = first character of name
4. Transcode remaining characters by following these rules, incrementing by one character each time
   4a. EV -> AF else A, E, I, O, U -> A
   4b. Q -> G
   4c. Z -> S
   4d. M -> N
   4e. KN -> N else K -> C
   4f. SCH -> SSS
   4g. PH -> FF
   4h. H -> If previous or next is nonvowel, previous
   4i. W -> If previous is vowel, previous
   4j. Add current to key if current != last key character
5. If last character is S, remove it
6. If last characters are AY, replace with Y
7. If last character is A, remove it
8. Collapse all strings of repeated characters
9. Add original first character of name as first character of key
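The per-character rules 4a to 4i can be sketched as a single decision function. This is a minimal illustration written from the rule list above, not the class's actual source: the class name `TranscodeRemainingSketch` and the method name `transcode` are hypothetical, the input is assumed to be upper-case, and a space is assumed to stand in for a missing neighbour at either end of the name.

```java
// Sketch only: names are hypothetical and the rules are taken from the list above.
final class TranscodeRemainingSketch {

    /** Returns true for the five upper-case vowels A, E, I, O, U. */
    static boolean isVowel(char c) {
        return c == 'A' || c == 'E' || c == 'I' || c == 'O' || c == 'U';
    }

    /**
     * Applies rules 4a to 4i to the character at one position.
     * prev, next and aNext are the neighbours of curr; a space is
     * assumed to be passed where no neighbour exists.
     */
    static String transcode(char prev, char curr, char next, char aNext) {
        if (curr == 'E' && next == 'V') return "AF";               // 4a: EV -> AF
        if (isVowel(curr)) return "A";                              // 4a: any other vowel -> A
        if (curr == 'Q') return "G";                                // 4b
        if (curr == 'Z') return "S";                                // 4c
        if (curr == 'M') return "N";                                // 4d
        if (curr == 'K') return next == 'N' ? "N" : "C";            // 4e, as written in the list
        if (curr == 'S' && next == 'C' && aNext == 'H') return "SSS"; // 4f
        if (curr == 'P' && next == 'H') return "FF";                // 4g
        if (curr == 'H' && (!isVowel(prev) || !isVowel(next))) {
            return String.valueOf(prev);                            // 4h
        }
        if (curr == 'W' && isVowel(prev)) {
            return String.valueOf(prev);                            // 4i
        }
        return String.valueOf(curr);                                // no rule applies
    }
}
```

A caller would walk the name one position at a time and append each result to the key only when it differs from the key's last character (rule 4j).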
This class is immutable and thread-safe.
See Also: Soundex
Modifier and Type | Field and Description |
---|---|
private static char[] | CHARS_A |
private static char[] | CHARS_AF |
private static char[] | CHARS_C |
private static char[] | CHARS_FF |
private static char[] | CHARS_G |
private static char[] | CHARS_N |
private static char[] | CHARS_NN |
private static char[] | CHARS_S |
private static char[] | CHARS_SSS |
private static Pattern | PAT_DT_ETC |
private static Pattern | PAT_EE_IE |
private static Pattern | PAT_K |
private static Pattern | PAT_KN |
private static Pattern | PAT_MAC |
private static Pattern | PAT_PH_PF |
private static Pattern | PAT_SCH |
private static char | SPACE |
private boolean | strict: Indicates the strict mode. |
private static int | TRUE_LENGTH |
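The `PAT_*` field names suggest that steps 1 and 2 of the algorithm are carried out with anchored regular expressions. The following sketch shows one way that could look. The regular expressions, the class name `NysiisAffixSketch` and the method name `transcodeAffixes` are assumptions for illustration, since this page does not show the field values, and the input is assumed to be an upper-case name.

```java
import java.util.regex.Pattern;

// Sketch only: the pattern values below are guesses based on the field names.
final class NysiisAffixSketch {

    // Step 1: patterns anchored at the start of the name.
    private static final Pattern MAC = Pattern.compile("^MAC");
    private static final Pattern KN = Pattern.compile("^KN");
    private static final Pattern K = Pattern.compile("^K");
    private static final Pattern PH_PF = Pattern.compile("^(PH|PF)");
    private static final Pattern SCH = Pattern.compile("^SCH");

    // Step 2: patterns anchored at the end of the name.
    private static final Pattern EE_IE = Pattern.compile("(EE|IE)$");
    private static final Pattern DT_ETC = Pattern.compile("(DT|RT|RD|NT|ND)$");

    /** Applies the prefix rules 1a to 1f, then the suffix rules 2a and 2b. */
    static String transcodeAffixes(String name) {
        String s = MAC.matcher(name).replaceFirst("MCC");
        s = KN.matcher(s).replaceFirst("NN");   // must run before the plain K rule
        s = K.matcher(s).replaceFirst("C");
        s = PH_PF.matcher(s).replaceFirst("FF");
        s = SCH.matcher(s).replaceFirst("SSS");
        s = EE_IE.matcher(s).replaceFirst("Y");
        s = DT_ETC.matcher(s).replaceFirst("D");
        return s;
    }
}
```

Compiling each pattern once into a static final field keeps the per-call cost low and is safe to share between threads, which fits the class's stated immutability.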
Constructor and Description |
---|
Nysiis(): Creates an instance of the Nysiis encoder with strict mode (original form), i.e. encoded strings have a maximum length of 6. |
Nysiis(boolean strict): Creates an instance of the Nysiis encoder with the specified strict mode. true: encoded strings have a maximum length of 6; false: encoded strings may have arbitrary length. |
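The effect of the strict flag described above amounts to an optional truncation of the finished key. A minimal sketch, assuming the limit of 6 stated in the constructor descriptions; the class name `StrictModeSketch`, the constant `MAX_STRICT_LENGTH` and the method `applyStrict` are hypothetical and only illustrate the behaviour.

```java
// Sketch only: illustrates the documented length limit, not the class's source.
final class StrictModeSketch {

    // Maximum key length in strict mode, as stated in the constructor descriptions.
    private static final int MAX_STRICT_LENGTH = 6;

    /** Truncates the key to six characters in strict mode, otherwise returns it unchanged. */
    static String applyStrict(String key, boolean strict) {
        if (strict && key.length() > MAX_STRICT_LENGTH) {
            return key.substring(0, MAX_STRICT_LENGTH);
        }
        return key;
    }
}
```

With these assumptions, an eight-character key such as `MCDANALD` would be cut to `MCDANA` in strict mode and returned whole otherwise.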
Modifier and Type | Method and Description |
---|---|
Object | encode(Object obj): Encodes an Object using the NYSIIS algorithm. |
String | encode(String str): Encodes a String using the NYSIIS algorithm. |
boolean | isStrict(): Indicates the strict mode for this Nysiis encoder. |
private static boolean | isVowel(char c): Tests if the given character is a vowel. |
String | nysiis(String str): Retrieves the NYSIIS code for a given String object. |
private static char[] | transcodeRemaining(char prev, char curr, char next, char aNext): Transcodes the remaining parts of the String. |
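The two `encode` overloads follow a common pattern: the `Object` variant checks the runtime type and then hands the work to the `String` variant. The sketch below illustrates that contract only. `ObjectEncodeSketch`, `SketchEncoderException` and the upper-casing `PLACEHOLDER` encoder are hypothetical stand-ins, so the example runs without the library on the classpath.

```java
import java.util.Locale;
import java.util.function.UnaryOperator;

// Sketch only: shows the type check and delegation, not the real encoder.
final class ObjectEncodeSketch {

    /** Hypothetical stand-in for the library's checked EncoderException. */
    static final class SketchEncoderException extends Exception {
        SketchEncoderException(String message) {
            super(message);
        }
    }

    /** Placeholder for the String encoder; it only upper-cases its input. */
    static final UnaryOperator<String> PLACEHOLDER = s -> s.toUpperCase(Locale.ROOT);

    /** Rejects non-String arguments, otherwise delegates to the String encoder. */
    static Object encode(Object obj) throws SketchEncoderException {
        if (!(obj instanceof String)) {
            throw new SketchEncoderException("Parameter supplied is not of type String");
        }
        return PLACEHOLDER.apply((String) obj);
    }
}
```

Callers that already hold a `String` would normally use the `String` overload directly and avoid the checked exception.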
private static final char[] CHARS_A
private static final char[] CHARS_AF
private static final char[] CHARS_C
private static final char[] CHARS_FF
private static final char[] CHARS_G
private static final char[] CHARS_N
private static final char[] CHARS_NN
private static final char[] CHARS_S
private static final char[] CHARS_SSS
private static final Pattern PAT_DT_ETC
private static final char SPACE
private static final int TRUE_LENGTH
private final boolean strict
public Nysiis()
Creates an instance of the Nysiis encoder with strict mode (original form), i.e. encoded strings have a maximum length of 6.

private static boolean isVowel(char c)
Parameters:
c - the character to test
Returns:
true if the character is a vowel, false otherwise

private static char[] transcodeRemaining(char prev, char curr, char next, char aNext)
Parameters:
prev - the previous character
curr - the current character
next - the next character
aNext - the after next character

public Object encode(Object obj) throws EncoderException
Throws an EncoderException if the supplied object is not of type String.
Specified by:
encode in interface Encoder
Parameters:
obj - Object to encode
Returns:
An object (of type String) containing the NYSIIS code which corresponds to the given String.
Throws:
EncoderException - if the parameter supplied is not a String
IllegalArgumentException - if a character is not mapped

public String encode(String str)
Specified by:
encode in interface StringEncoder
Parameters:
str - A String object to encode
Throws:
IllegalArgumentException - if a character is not mapped

public boolean isStrict()
Indicates the strict mode for this Nysiis encoder.
Returns:
true if the encoder is configured for strict mode, false otherwise

WebARTS Library Licensed Under the GNU - General Public License. Other Libraries licensed under their respective Open Source Licenses