Java : Normalizer with Examples

Normalizer (Java SE 22 & JDK 22) with examples.
You will find code examples for most Normalizer methods.


Summary

This class provides the method normalize which transforms Unicode text into an equivalent composed or decomposed form, allowing for easier sorting and searching of text. The normalize method supports the standard normalization forms described in Unicode Standard Annex #15 — Unicode Normalization Forms.


final var values = Normalizer.Form.values();
System.out.println(Arrays.toString(values)); // [NFD, NFC, NFKD, NFKC]

final var sources = List.of("Å", "¼", "⑩", "㌀");

for (final var src : sources) {
    System.out.println("----------");
    System.out.println("src : " + src);

    for (final var form : Normalizer.Form.values()) {
        if (!Normalizer.isNormalized(src, form)) {
            System.out.println("  " + form + " : " + Normalizer.normalize(src, form));
        }
    }
}

// Result
// ↓
//----------
//src : Å
//  NFD : Å
//  NFC : Å
//  NFKD : Å
//  NFKC : Å
//----------
//src : ¼
//  NFKD : 1⁄4
//  NFKC : 1⁄4
//----------
//src : ⑩
//  NFKD : 10
//  NFKC : 10
//----------
//src : ㌀
//  NFKD : アパート
//  NFKC : アパート
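
The point about easier searching and comparison can be sketched as follows. The class CanonicalCompare and the helper equalsNfc are hypothetical names used only for illustration; they are not part of the JDK. The sketch assumes that normalizing both sides to NFC is an acceptable way to compare them.

```java
import java.text.Normalizer;

public class CanonicalCompare {

    // Hypothetical helper (not a JDK method): compares two sequences
    // after bringing both into NFC.
    public static boolean equalsNfc(CharSequence a, CharSequence b) {
        final var na = Normalizer.normalize(a, Normalizer.Form.NFC);
        final var nb = Normalizer.normalize(b, Normalizer.Form.NFC);
        return na.equals(nb);
    }

    public static void main(String[] args) {
        final var composed = "\u00C5";    // LATIN CAPITAL LETTER A WITH RING ABOVE
        final var decomposed = "A\u030A"; // A followed by COMBINING RING ABOVE

        // The two strings render the same but differ as char sequences.
        System.out.println(composed.equals(decomposed)); // false
        System.out.println(equalsNfc(composed, decomposed)); // true
    }
}
```

Normalizing both operands, rather than only one, keeps the comparison symmetric when neither input is known to be in a particular form.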

Methods

static boolean isNormalized (CharSequence src, Normalizer.Form form)

Determines if the given sequence of char values is normalized.

final var src = "⑩";
final var form = Normalizer.Form.NFKC;

System.out.println(Normalizer.isNormalized(src, form)); // false

final var normalized = Normalizer.normalize(src, form);
System.out.println(normalized); // 10
System.out.println(Normalizer.isNormalized(normalized, form)); // true
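
One possible use of isNormalized is as a guard that skips normalization when the input is already in the target form. The following is a minimal sketch under that assumption; NormalizeIfNeeded and toNfc are hypothetical names, not part of the JDK.

```java
import java.text.Normalizer;

public class NormalizeIfNeeded {

    // Hypothetical helper: returns the input unchanged when it is already
    // in NFC, and a normalized copy otherwise.
    public static String toNfc(String src) {
        return Normalizer.isNormalized(src, Normalizer.Form.NFC)
                ? src
                : Normalizer.normalize(src, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        final var plain = "abc";
        System.out.println(toNfc(plain) == plain); // true (same instance is returned)

        final var decomposed = "A\u030A"; // A followed by COMBINING RING ABOVE
        System.out.println(decomposed.length()); // 2
        System.out.println(toNfc(decomposed).length()); // 1
    }
}
```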

static String normalize (CharSequence src, Normalizer.Form form)

Normalizes a sequence of char values according to the specified normalization form.

final var src = "⑩";
final var form = Normalizer.Form.NFKC;

System.out.println(Normalizer.isNormalized(src, form)); // false

final var normalized = Normalizer.normalize(src, form);
System.out.println(normalized); // 10
System.out.println(Normalizer.isNormalized(normalized, form)); // true
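
A common application of normalize, shown here only as a sketch, is removing accents: decompose the text with NFD, then delete the combining marks with a regular expression. StripAccents and stripAccents are hypothetical names, and the approach assumes that dropping every character in Unicode category M is acceptable for the text at hand, which may not hold for all scripts.

```java
import java.text.Normalizer;

public class StripAccents {

    // Hypothetical helper: decomposes to NFD, then removes combining marks
    // (Unicode general category M) left behind by the decomposition.
    public static String stripAccents(String src) {
        final var decomposed = Normalizer.normalize(src, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        final var src = "Cr\u00E8me Br\u00FBl\u00E9e"; // Crème Brûlée
        System.out.println(stripAccents(src)); // Creme Brulee
    }
}
```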
