Java : Normalizer with Examples
Normalizer (Java SE 22 & JDK 22) with Examples.
This page gives code examples for most Normalizer methods.
Summary
This class provides the normalize method, which transforms Unicode text into an equivalent composed or decomposed form, making text easier to sort and search. The normalize method supports the standard normalization forms described in Unicode Standard Annex #15, "Unicode Normalization Forms".
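// Imports needed: java.text.Normalizer, java.util.Arrays, java.util.List
final var values = Normalizer.Form.values();
System.out.println(Arrays.toString(values)); // [NFD, NFC, NFKD, NFKC]

// "\u212B" is ANGSTROM SIGN. It looks like "Å" (U+00C5) but is not normalized in any form.
final var sources = List.of("\u212B", "¼", "⑩", "㌀");
for (final var src : sources) {
    System.out.println("----------");
    System.out.println("src : " + src);
    for (final var form : Normalizer.Form.values()) {
        // Print only the forms in which src is not already normalized.
        if (!Normalizer.isNormalized(src, form)) {
            System.out.println(" " + form + " : " + Normalizer.normalize(src, form));
        }
    }
}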
// Result
// ↓
//----------
//src : Å
// NFD : Å
// NFC : Å
// NFKD : Å
// NFKC : Å
//----------
//src : ¼
// NFKD : 1⁄4
// NFKC : 1⁄4
//----------
//src : ⑩
// NFKD : 10
// NFKC : 10
//----------
//src : ㌀
// NFKD : アパート
// NFKC : アパート
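The four results printed for Å look the same on screen, so it can help to print the underlying code points. The following is a minimal sketch, assuming the source character is U+212B (ANGSTROM SIGN). The class name CodePointDump and the helper codePoints are hypothetical names introduced for illustration.

```java
import java.text.Normalizer;
import java.util.stream.Collectors;

public class CodePointDump {
    // Hypothetical helper: formats each code point of s as U+XXXX, separated by spaces.
    static String codePoints(CharSequence s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        final var src = "\u212B"; // ANGSTROM SIGN (assumed source character)
        System.out.println("src : " + codePoints(src)); // src : U+212B
        for (final var form : Normalizer.Form.values()) {
            System.out.println(form + " : " + codePoints(Normalizer.normalize(src, form)));
        }
        // NFD : U+0041 U+030A
        // NFC : U+00C5
        // NFKD : U+0041 U+030A
        // NFKC : U+00C5
    }
}
```

The decomposed forms (NFD, NFKD) split the character into a base letter plus a combining mark, while the composed forms (NFC, NFKC) produce a single precomposed code point.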
Methods
static boolean isNormalized (CharSequence src, Normalizer.Form form)
Determines if the given sequence of char values is normalized.
final var src = "⑩";
final var form = Normalizer.Form.NFKC;
System.out.println(Normalizer.isNormalized(src, form)); // false
final var normalized = Normalizer.normalize(src, form);
System.out.println(normalized); // 10
System.out.println(Normalizer.isNormalized(normalized, form)); // true
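One possible use of isNormalized is as a guard that returns text unchanged when it is already in the target form. The sketch below assumes a hypothetical helper named toNfc, which is not part of the JDK. Whether the guard saves any work depends on the JDK implementation.

```java
import java.text.Normalizer;

public class NormalizeIfNeeded {
    // Hypothetical helper: returns text in NFC, calling normalize only when needed.
    static String toNfc(String text) {
        return Normalizer.isNormalized(text, Normalizer.Form.NFC)
                ? text
                : Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        final var composed = "\u00E9";    // é as a single code point
        final var decomposed = "e\u0301"; // e followed by COMBINING ACUTE ACCENT

        System.out.println(Normalizer.isNormalized(composed, Normalizer.Form.NFC));   // true
        System.out.println(Normalizer.isNormalized(decomposed, Normalizer.Form.NFC)); // false

        System.out.println(toNfc(composed) == composed);        // true: the same instance is returned
        System.out.println(toNfc(decomposed).equals(composed)); // true
    }
}
```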
static String normalize (CharSequence src, Normalizer.Form form)
Normalizes a sequence of char values.
final var src = "⑩";
final var form = Normalizer.Form.NFKC;
System.out.println(Normalizer.isNormalized(src, form)); // false
final var normalized = Normalizer.normalize(src, form);
System.out.println(normalized); // 10
System.out.println(Normalizer.isNormalized(normalized, form)); // true
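Because the same visible text can be stored as different code point sequences, a plain equals call can report a mismatch for strings that look identical. The following sketch normalizes both sides to NFC before comparing. The helper name equalsNfc is hypothetical, and NFC is only one reasonable choice of form.

```java
import java.text.Normalizer;

public class NormalizedEquals {
    // Hypothetical helper: compares two strings after converting both to NFC.
    static boolean equalsNfc(String a, String b) {
        final var form = Normalizer.Form.NFC;
        return Normalizer.normalize(a, form).equals(Normalizer.normalize(b, form));
    }

    public static void main(String[] args) {
        final var composed = "caf\u00E9";    // "café" with é as U+00E9
        final var decomposed = "cafe\u0301"; // "café" with e followed by U+0301

        System.out.println(composed.length() + " " + decomposed.length()); // 4 5
        System.out.println(composed.equals(decomposed));                   // false
        System.out.println(equalsNfc(composed, decomposed));               // true
    }
}
```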