Java : DocumentBuilderFactory (XML) with Examples

DocumentBuilderFactory (Java SE 22 & JDK 22) in Java with examples.
You will find code examples for most of the DocumentBuilderFactory methods.

Summary

Defines a factory API that enables applications to obtain a parser that produces DOM object trees from XML documents.

final var xml = """
        <root>
            <child-a>AAA</child-a>
            <child-b>BBB</child-b>
        </root>
        """;

final var factory = DocumentBuilderFactory.newInstance();
final var builder = factory.newDocumentBuilder();

final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

final var childA = document.getElementsByTagName("child-a").item(0);
System.out.println(childA); // [child-a: null]
System.out.println(childA.getTextContent()); // AAA

final var childB = document.getElementsByTagName("child-b").item(0);
System.out.println(childB); // [child-b: null]
System.out.println(childB.getTextContent()); // BBB

XML processing can expose applications to certain vulnerabilities. Among the most prominent and well-known attacks are the XML External Entity (XXE) injection attack and the exponential entity expansion attack, also known as the XML bomb or billion laughs attack.
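
As a rough illustration of how a factory might be restricted against the attacks mentioned above, here is a minimal sketch. It only uses settings that also appear elsewhere on this page (FEATURE_SECURE_PROCESSING, ACCESS_EXTERNAL_DTD, setXIncludeAware) plus XMLConstants.ACCESS_EXTERNAL_SCHEMA. The class name HardenedFactorySketch, the helper names, and the DTD path are hypothetical, and the right set of restrictions depends on your application.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

public class HardenedFactorySketch {

    // Hypothetical helper: one possible restrictive configuration, not the only one.
    static DocumentBuilderFactory newHardenedFactory() throws ParserConfigurationException {
        final var factory = DocumentBuilderFactory.newDefaultInstance();

        // Keeps the JDK processing limits (such as the entity expansion limit) active.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);

        // An empty string denies every protocol for external DTDs and schemas.
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");

        // XInclude processing is off by default; it is set explicitly here for clarity.
        factory.setXIncludeAware(false);

        return factory;
    }

    // Returns true when the parser refuses the document with a SAXException.
    static boolean isRejected(String xml) throws Exception {
        final var builder = newHardenedFactory().newDocumentBuilder();
        try {
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // A document without a DOCTYPE.
        System.out.println(isRejected("<root>ok</root>"));

        // A document that points at an external DTD (the path is made up).
        System.out.println(isRejected("""
                <!DOCTYPE root SYSTEM "file:///no-such-dir/sample.dtd">
                <root/>
                """));
    }
}
```

With this configuration the plain document should parse normally, while the document that references an external DTD should be refused before the DTD is read.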


Constructors

DocumentBuilderFactory ()

Protected constructor to prevent instantiation.

This constructor is protected. Creating a subclass of this class is rare, so the code example is omitted.

Methods

abstract Object getAttribute (String name)

Allows the user to retrieve specific attributes on the underlying implementation.

final var dtdFile = Path.of("R:", "java-work", "sample.dtd");
System.out.println(dtdFile); // R:\java-work\sample.dtd

Files.writeString(dtdFile, """
        <!ENTITY aaa "bbb">
        """);

final var xml = """
        <!DOCTYPE root SYSTEM "file:///R:/java-work/sample.dtd">
        <root>&aaa;</root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

{
    factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "all");

    final var ret = factory.getAttribute(XMLConstants.ACCESS_EXTERNAL_DTD);
    System.out.println(ret); // "all"

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]
    System.out.println(root.getTextContent()); // bbb
}

{
    factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");

    final var ret = factory.getAttribute(XMLConstants.ACCESS_EXTERNAL_DTD);
    System.out.println(ret); // ""

    final var builder = factory.newDocumentBuilder();

    try {
        final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));
    } catch (SAXException e) {
        System.out.println(e);
    }

    // Result
    // ↓
    //org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 57; External DTD:
    // Failed to read external DTD 'sample.dtd', because 'file' access is not allowed due to
    // restriction set by the accessExternalDTD property.
}

abstract boolean getFeature (String name)

Gets the state of the named feature.

// An example of the exponential entity expansion attack.
final var xml = """
        <!DOCTYPE root[
            <!ENTITY x100 "X">
            <!ENTITY x99 "&x100;&x100;">
            <!ENTITY x98 "&x99;&x99;">
            ...
            (omitted)
            ...
            <!ENTITY x3 "&x4;&x4;">
            <!ENTITY x2 "&x3;&x3;">
            <!ENTITY x1 "&x2;&x2;">
        ]>
        <root>&x1;</root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

{
    final var ret = factory.getFeature(XMLConstants.FEATURE_SECURE_PROCESSING);
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();

    try {
        final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));
    } catch (SAXException e) {
        System.out.println(e);
    }

    // Result
    // ↓
    //org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; JAXP00010001:
    // The parser has encountered more than "64000" entity expansions in this document;
    // this is the limit imposed by the JDK.
}

factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, false);

{
    final var ret = factory.getFeature(XMLConstants.FEATURE_SECURE_PROCESSING);
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();

    // Warning! Entities are growing exponentially, so parsing it takes a very long time.
    //final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));
}
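
The XML in the example above is abbreviated. As a self-contained variant, the following sketch generates a smaller chain of doubling entities in code and checks whether the parser refuses it while FEATURE_SECURE_PROCESSING is enabled. The class name EntityExpansionSketch, the helper names, and the level counts (3 and 20) are assumptions chosen for illustration; they are not the values used in the example above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;

public class EntityExpansionSketch {

    // Hypothetical helper: builds a DOCTYPE in which every entity
    // references the next one twice, so the expansion doubles per level.
    static String buildXml(int levels) {
        final var sb = new StringBuilder("<!DOCTYPE root [\n");
        sb.append("<!ENTITY x").append(levels).append(" \"X\">\n");
        for (int i = levels - 1; i >= 1; i--) {
            final var ref = "&x" + (i + 1) + ";";
            sb.append("<!ENTITY x").append(i)
                    .append(" \"").append(ref).append(ref).append("\">\n");
        }
        sb.append("]>\n<root>&x1;</root>\n");
        return sb.toString();
    }

    // Returns true when parsing fails with a SAXException.
    static boolean isRejected(String xml) throws Exception {
        final var factory = DocumentBuilderFactory.newDefaultInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);

        final var builder = factory.newDocumentBuilder();
        try {
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // 3 levels expand to only 4 characters.
        System.out.println(isRejected(buildXml(3)));

        // 20 levels need far more expansions than the 64000 limit
        // shown in the error message above.
        System.out.println(isRejected(buildXml(20)));
    }
}
```

The small document should parse, and the 20-level document should be stopped by the JDK limit instead of being fully expanded.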

Schema getSchema ()

Gets the Schema object specified through the setSchema(Schema schema) method.

final var xsd = """
        <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
            <xsd:element name="root" type="xsd:string"/>
        </xsd:schema>
        """;

final var schemaFactory = SchemaFactory.newDefaultInstance();
final var schema = schemaFactory.newSchema(
        new StreamSource(new ByteArrayInputStream(xsd.getBytes())));

final var factory = DocumentBuilderFactory.newInstance();

System.out.println(factory.getSchema()); // null

factory.setSchema(schema);
System.out.println(factory.getSchema().equals(schema)); // true

final var errorHandler = new DefaultHandler() {
    @Override
    public void error(SAXParseException e) {
        System.out.println("-- ErrorHandler error --");
        System.out.println(e);
    }
};

{
    final var xml = """
            <root>abcd</root>
            """;

    final var builder = factory.newDocumentBuilder();
    builder.setErrorHandler(errorHandler);

    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]
    System.out.println(root.getTextContent()); // abcd
}

{
    final var xml = """
            <root><child>abcd</child></root>
            """;

    final var builder = factory.newDocumentBuilder();
    builder.setErrorHandler(errorHandler);

    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    // Result
    // ↓
    //-- ErrorHandler error --
    //org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 33; cvc-type.3.1.2:
    // Element 'root' is a simple type, so it must have no element information item [children].
}

boolean isCoalescing ()

Indicates whether or not the factory is configured to produce parsers which convert CDATA nodes to Text nodes and append them to the adjacent (if any) Text node.

final var xml = """
        <root>aaa<![CDATA[<&>]]></root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

{
    final var ret = factory.isCoalescing();
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]

    final var nodes = root.getChildNodes();
    System.out.println("-- nodes --");
    for (int i = 0; i < nodes.getLength(); i++) {
        System.out.println(nodes.item(i));
    }

    // Result
    // ↓
    //-- nodes --
    //[#text: aaa]
    //[#cdata-section: <&>]
}

factory.setCoalescing(true);

{
    final var ret = factory.isCoalescing();
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]

    final var nodes = root.getChildNodes();
    System.out.println("-- nodes --");
    for (int i = 0; i < nodes.getLength(); i++) {
        System.out.println(nodes.item(i));
    }

    // Result
    // ↓
    //-- nodes --
    //[#text: aaa<&>]
}

boolean isExpandEntityReferences ()

Indicates whether or not the factory is configured to produce parsers which expand entity reference nodes.

final var xml = """
        <!DOCTYPE root [
            <!ENTITY aaa "bbb">
        ]>
        <root>&aaa;</root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

{
    final var ret = factory.isExpandEntityReferences();
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]

    final var child = root.getFirstChild();
    System.out.println(child); // [#text: bbb]
}

factory.setExpandEntityReferences(false);

{
    final var ret = factory.isExpandEntityReferences();
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]

    if (root.getFirstChild() instanceof EntityReference entityReference) {
        System.out.println(entityReference); // [aaa: null]
    }
}

boolean isIgnoringComments ()

Indicates whether or not the factory is configured to produce parsers which ignore comments.

final var xml = """
        <root>aaa<!--bbb--></root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

{
    final var ret = factory.isIgnoringComments();
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]

    final var nodes = root.getChildNodes();
    System.out.println("-- nodes --");
    for (int i = 0; i < nodes.getLength(); i++) {
        System.out.println(nodes.item(i));
    }

    // Result
    // ↓
    //-- nodes --
    //[#text: aaa]
    //[#comment: bbb]
}

factory.setIgnoringComments(true);

{
    final var ret = factory.isIgnoringComments();
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]

    final var nodes = root.getChildNodes();
    System.out.println("-- nodes --");
    for (int i = 0; i < nodes.getLength(); i++) {
        System.out.println(nodes.item(i));
    }

    // Result
    // ↓
    //-- nodes --
    //[#text: aaa]
}

boolean isIgnoringElementContentWhitespace ()

Indicates whether or not the factory is configured to produce parsers which ignore ignorable whitespace in element content.

final var xml = """
        <!DOCTYPE root [
            <!ELEMENT child-a (dummy?)>
            <!ELEMENT child-b (#PCDATA)>
        ]>
        <root>
            <child-a> </child-a>
            <child-b> </child-b>
        </root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

{
    final var ret = factory.isIgnoringElementContentWhitespace();
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var childA = document.getElementsByTagName("child-a").item(0);
    System.out.println(childA); // [child-a: null]

    final var childB = document.getElementsByTagName("child-b").item(0);
    System.out.println(childB); // [child-b: null]

    System.out.println(childA.getFirstChild()); // [#text:  ]
    System.out.println(childB.getFirstChild()); // [#text:  ]
}

factory.setIgnoringElementContentWhitespace(true);

{
    final var ret = factory.isIgnoringElementContentWhitespace();
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var childA = document.getElementsByTagName("child-a").item(0);
    System.out.println(childA); // [child-a: null]

    final var childB = document.getElementsByTagName("child-b").item(0);
    System.out.println(childB); // [child-b: null]

    System.out.println(childA.getFirstChild()); // null
    System.out.println(childB.getFirstChild()); // [#text:  ]
}

boolean isNamespaceAware ()

Indicates whether or not the factory is configured to produce parsers which are namespace aware.

final var xml = """
        <ns:root xmlns:ns="sample">
            <ns:child/>
        </ns:root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

{
    final var ret = factory.isNamespaceAware();
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var child = document.getElementsByTagNameNS("sample", "child").item(0);
    System.out.println(child); // null
}

factory.setNamespaceAware(true);

{
    final var ret = factory.isNamespaceAware();
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var child = document.getElementsByTagNameNS("sample", "child").item(0);
    System.out.println(child); // [ns:child: null]
}
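
Namespace awareness also changes what the DOM nodes themselves report, not only what getElementsByTagNameNS finds. The sketch below parses the same XML with both settings and prints getNodeName, getLocalName and getNamespaceURI of the root element. The class name NamespaceAwareSketch and the helper parseRoot are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class NamespaceAwareSketch {

    // Hypothetical helper: parses the XML and returns its root element.
    static Element parseRoot(String xml, boolean namespaceAware) throws Exception {
        final var factory = DocumentBuilderFactory.newDefaultInstance();
        factory.setNamespaceAware(namespaceAware);

        final var builder = factory.newDocumentBuilder();
        final var document = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return document.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        final var xml = """
                <ns:root xmlns:ns="sample">
                    <ns:child/>
                </ns:root>
                """;

        // Not namespace aware: the prefix is just part of the name.
        final var plain = parseRoot(xml, false);
        System.out.println(plain.getNodeName()); // ns:root
        System.out.println(plain.getLocalName()); // null
        System.out.println(plain.getNamespaceURI()); // null

        // Namespace aware: local name and namespace URI are available.
        final var aware = parseRoot(xml, true);
        System.out.println(aware.getNodeName()); // ns:root
        System.out.println(aware.getLocalName()); // root
        System.out.println(aware.getNamespaceURI()); // sample
    }
}
```

Code that relies on getLocalName or getNamespaceURI therefore needs a namespace-aware factory, for example one obtained from newNSInstance.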

boolean isValidating ()

Indicates whether or not the factory is configured to produce parsers which validate the XML content during parse.

// The XML document intentionally does not match the DTD.
final var xml = """
        <!DOCTYPE root [
            <!ELEMENT root (child-a)>
        ]>
        <root><child-z/></root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

final var errorHandler = new DefaultHandler() {
    @Override
    public void error(SAXParseException e) {
        System.out.println("-- ErrorHandler error --");
        System.out.println(e);
    }
};

{
    final var ret = factory.isValidating();
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();
    builder.setErrorHandler(errorHandler);

    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var childZ = document.getElementsByTagName("child-z").item(0);
    System.out.println(childZ); // [child-z: null]
}

factory.setValidating(true);

{
    final var ret = factory.isValidating();
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();
    builder.setErrorHandler(errorHandler);

    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    // Result
    // ↓
    //-- ErrorHandler error --
    //org.xml.sax.SAXParseException; lineNumber: 4; columnNumber: 17;
    // Element type "child-z" must be declared.
    //-- ErrorHandler error --
    //org.xml.sax.SAXParseException; lineNumber: 4; columnNumber: 24;
    // The content of element type "root" must match "(child-a)".
}

boolean isXIncludeAware ()

Gets the state of XInclude processing.

final var sampleFile = Path.of("R:", "java-work", "sample.xml");
System.out.println(sampleFile); // R:\java-work\sample.xml

Files.writeString(sampleFile, """
        <child>abcd</child>
        """);

final var xml = """
        <root xmlns:xi="http://www.w3.org/2001/XInclude">
            <xi:include href="file:///R:/java-work/sample.xml" parse="xml" />
        </root>
        """;

final var factory = DocumentBuilderFactory.newNSInstance();

{
    final var ret = factory.isXIncludeAware();
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var nodes = document.getElementsByTagName("child");
    System.out.println(nodes.getLength()); // 0
}

factory.setXIncludeAware(true);

{
    final var ret = factory.isXIncludeAware();
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var child = document.getElementsByTagName("child").item(0);
    System.out.println(child); // [child: null]
    System.out.println(child.getTextContent()); // abcd
}

static DocumentBuilderFactory newDefaultInstance ()

Creates a new instance of the DocumentBuilderFactory builtin system-default implementation.

final var xml = """
        <root>
            <child-a>AAA</child-a>
            <child-b>BBB</child-b>
        </root>
        """;

final var factory = DocumentBuilderFactory.newDefaultInstance();
final var builder = factory.newDocumentBuilder();

final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

final var childA = document.getElementsByTagName("child-a").item(0);
System.out.println(childA); // [child-a: null]
System.out.println(childA.getTextContent()); // AAA

final var childB = document.getElementsByTagName("child-b").item(0);
System.out.println(childB); // [child-b: null]
System.out.println(childB.getTextContent()); // BBB

static DocumentBuilderFactory newDefaultNSInstance ()

Creates a new NamespaceAware instance of the DocumentBuilderFactory builtin system-default implementation.

final var nsFactory = DocumentBuilderFactory.newDefaultNSInstance();
System.out.println(nsFactory.isNamespaceAware()); // true

final var factory = DocumentBuilderFactory.newDefaultInstance();
System.out.println(factory.isNamespaceAware()); // false

abstract DocumentBuilder newDocumentBuilder ()

Creates a new instance of a DocumentBuilder using the currently configured parameters.

final var xml = """
        <root>
            <child-a>AAA</child-a>
            <child-b>BBB</child-b>
        </root>
        """;

final var factory = DocumentBuilderFactory.newInstance();
final var builder = factory.newDocumentBuilder();

final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

final var childA = document.getElementsByTagName("child-a").item(0);
System.out.println(childA); // [child-a: null]
System.out.println(childA.getTextContent()); // AAA

final var childB = document.getElementsByTagName("child-b").item(0);
System.out.println(childB); // [child-b: null]
System.out.println(childB.getTextContent()); // BBB

static DocumentBuilderFactory newInstance ()

Obtains a new instance of a DocumentBuilderFactory.

final var xml = """
        <root>
            <child-a>AAA</child-a>
            <child-b>BBB</child-b>
        </root>
        """;

final var factory = DocumentBuilderFactory.newInstance();
final var builder = factory.newDocumentBuilder();

final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

final var childA = document.getElementsByTagName("child-a").item(0);
System.out.println(childA); // [child-a: null]
System.out.println(childA.getTextContent()); // AAA

final var childB = document.getElementsByTagName("child-b").item(0);
System.out.println(childB); // [child-b: null]
System.out.println(childB.getTextContent()); // BBB

static DocumentBuilderFactory newInstance (String factoryClassName, ClassLoader classLoader)

Obtains a new instance of a DocumentBuilderFactory from a class name.

This method is probably intended for third-party libraries, so the code example is omitted.

static DocumentBuilderFactory newNSInstance ()

Creates a new NamespaceAware instance of a DocumentBuilderFactory.

final var nsFactory = DocumentBuilderFactory.newNSInstance();
System.out.println(nsFactory.isNamespaceAware()); // true

final var factory = DocumentBuilderFactory.newInstance();
System.out.println(factory.isNamespaceAware()); // false

static DocumentBuilderFactory newNSInstance (String factoryClassName, ClassLoader classLoader)

Creates a new NamespaceAware instance of a DocumentBuilderFactory from a class name.

This method is probably intended for third-party libraries, so the code example is omitted.

abstract void setAttribute (String name, Object value)

Allows the user to set specific attributes on the underlying implementation.

final var dtdFile = Path.of("R:", "java-work", "sample.dtd");
System.out.println(dtdFile); // R:\java-work\sample.dtd

Files.writeString(dtdFile, """
        <!ENTITY aaa "bbb">
        """);

final var xml = """
        <!DOCTYPE root SYSTEM "file:///R:/java-work/sample.dtd">
        <root>&aaa;</root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

{
    factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "all");

    final var ret = factory.getAttribute(XMLConstants.ACCESS_EXTERNAL_DTD);
    System.out.println(ret); // "all"

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]
    System.out.println(root.getTextContent()); // bbb
}

{
    factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");

    final var ret = factory.getAttribute(XMLConstants.ACCESS_EXTERNAL_DTD);
    System.out.println(ret); // ""

    final var builder = factory.newDocumentBuilder();

    try {
        final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));
    } catch (SAXException e) {
        System.out.println(e);
    }

    // Result
    // ↓
    //org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 57; External DTD:
    // Failed to read external DTD 'sample.dtd', because 'file' access is not allowed due to
    // restriction set by the accessExternalDTD property.
}

void setCoalescing (boolean coalescing)

Specifies that the parser produced by this code will convert CDATA nodes to Text nodes and append them to the adjacent (if any) Text node.

final var xml = """
        <root>aaa<![CDATA[<&>]]></root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

{
    final var ret = factory.isCoalescing();
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]

    final var nodes = root.getChildNodes();
    System.out.println("-- nodes --");
    for (int i = 0; i < nodes.getLength(); i++) {
        System.out.println(nodes.item(i));
    }

    // Result
    // ↓
    //-- nodes --
    //[#text: aaa]
    //[#cdata-section: <&>]
}

factory.setCoalescing(true);

{
    final var ret = factory.isCoalescing();
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]

    final var nodes = root.getChildNodes();
    System.out.println("-- nodes --");
    for (int i = 0; i < nodes.getLength(); i++) {
        System.out.println(nodes.item(i));
    }

    // Result
    // ↓
    //-- nodes --
    //[#text: aaa<&>]
}

void setExpandEntityReferences (boolean expandEntityRef)

Specifies that the parser produced by this code will expand entity reference nodes.

final var xml = """
        <!DOCTYPE root [
            <!ENTITY aaa "bbb">
        ]>
        <root>&aaa;</root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

{
    final var ret = factory.isExpandEntityReferences();
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]

    final var child = root.getFirstChild();
    System.out.println(child); // [#text: bbb]
}

factory.setExpandEntityReferences(false);

{
    final var ret = factory.isExpandEntityReferences();
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]

    if (root.getFirstChild() instanceof EntityReference entityReference) {
        System.out.println(entityReference); // [aaa: null]
    }
}

abstract void setFeature (String name, boolean value)

Sets a feature for this DocumentBuilderFactory and the DocumentBuilders created by this factory.

// An example of the exponential entity expansion attack.
final var xml = """
        <!DOCTYPE root[
            <!ENTITY x100 "X">
            <!ENTITY x99 "&x100;&x100;">
            <!ENTITY x98 "&x99;&x99;">
            ...
            (omitted)
            ...
            <!ENTITY x3 "&x4;&x4;">
            <!ENTITY x2 "&x3;&x3;">
            <!ENTITY x1 "&x2;&x2;">
        ]>
        <root>&x1;</root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

{
    final var ret = factory.getFeature(XMLConstants.FEATURE_SECURE_PROCESSING);
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();

    try {
        final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));
    } catch (SAXException e) {
        System.out.println(e);
    }

    // Result
    // ↓
    //org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; JAXP00010001:
    // The parser has encountered more than "64000" entity expansions in this document;
    // this is the limit imposed by the JDK.
}

factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, false);

{
    final var ret = factory.getFeature(XMLConstants.FEATURE_SECURE_PROCESSING);
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();

    // Warning! Entities are growing exponentially, so parsing it takes a very long time.
    //final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));
}

void setIgnoringComments (boolean ignoreComments)

Specifies that the parser produced by this code will ignore comments.

final var xml = """
        <root>aaa<!--bbb--></root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

{
    final var ret = factory.isIgnoringComments();
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]

    final var nodes = root.getChildNodes();
    System.out.println("-- nodes --");
    for (int i = 0; i < nodes.getLength(); i++) {
        System.out.println(nodes.item(i));
    }

    // Result
    // ↓
    //-- nodes --
    //[#text: aaa]
    //[#comment: bbb]
}

factory.setIgnoringComments(true);

{
    final var ret = factory.isIgnoringComments();
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]

    final var nodes = root.getChildNodes();
    System.out.println("-- nodes --");
    for (int i = 0; i < nodes.getLength(); i++) {
        System.out.println(nodes.item(i));
    }

    // Result
    // ↓
    //-- nodes --
    //[#text: aaa]
}

void setIgnoringElementContentWhitespace (boolean whitespace)

Specifies that the parsers created by this factory must eliminate whitespace in element content (sometimes known loosely as "ignorable whitespace") when parsing XML documents (see XML Rec 2.10).

final var xml = """
        <!DOCTYPE root [
            <!ELEMENT child-a (dummy?)>
            <!ELEMENT child-b (#PCDATA)>
        ]>
        <root>
            <child-a> </child-a>
            <child-b> </child-b>
        </root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

{
    final var ret = factory.isIgnoringElementContentWhitespace();
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var childA = document.getElementsByTagName("child-a").item(0);
    System.out.println(childA); // [child-a: null]

    final var childB = document.getElementsByTagName("child-b").item(0);
    System.out.println(childB); // [child-b: null]

    System.out.println(childA.getFirstChild()); // [#text:  ]
    System.out.println(childB.getFirstChild()); // [#text:  ]
}

factory.setIgnoringElementContentWhitespace(true);

{
    final var ret = factory.isIgnoringElementContentWhitespace();
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var childA = document.getElementsByTagName("child-a").item(0);
    System.out.println(childA); // [child-a: null]

    final var childB = document.getElementsByTagName("child-b").item(0);
    System.out.println(childB); // [child-b: null]

    System.out.println(childA.getFirstChild()); // null
    System.out.println(childB.getFirstChild()); // [#text:  ]
}

void setNamespaceAware (boolean awareness)

Specifies that the parser produced by this code will provide support for XML namespaces.

final var xml = """
        <ns:root xmlns:ns="sample">
            <ns:child/>
        </ns:root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

{
    final var ret = factory.isNamespaceAware();
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var child = document.getElementsByTagNameNS("sample", "child").item(0);
    System.out.println(child); // null
}

factory.setNamespaceAware(true);

{
    final var ret = factory.isNamespaceAware();
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var child = document.getElementsByTagNameNS("sample", "child").item(0);
    System.out.println(child); // [ns:child: null]
}

void setSchema (Schema schema)

Sets the Schema to be used by parsers created from this factory.

final var xsd = """
        <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
            <xsd:element name="root" type="xsd:string"/>
        </xsd:schema>
        """;

final var schemaFactory = SchemaFactory.newDefaultInstance();
final var schema = schemaFactory.newSchema(
        new StreamSource(new ByteArrayInputStream(xsd.getBytes())));

final var factory = DocumentBuilderFactory.newInstance();

System.out.println(factory.getSchema()); // null

factory.setSchema(schema);
System.out.println(factory.getSchema().equals(schema)); // true

final var errorHandler = new DefaultHandler() {
    @Override
    public void error(SAXParseException e) {
        System.out.println("-- ErrorHandler error --");
        System.out.println(e);
    }
};

{
    final var xml = """
            <root>abcd</root>
            """;

    final var builder = factory.newDocumentBuilder();
    builder.setErrorHandler(errorHandler);

    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var root = document.getDocumentElement();
    System.out.println(root); // [root: null]
    System.out.println(root.getTextContent()); // abcd
}

{
    final var xml = """
            <root><child>abcd</child></root>
            """;

    final var builder = factory.newDocumentBuilder();
    builder.setErrorHandler(errorHandler);

    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    // Result
    // ↓
    //-- ErrorHandler error --
    //org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 33; cvc-type.3.1.2:
    // Element 'root' is a simple type, so it must have no element information item [children].
}
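
In the example above, schema violations reach the ErrorHandler while parse() still returns a document, so a caller has to track the errors itself. One possible way to do that is sketched below: the handler collects the messages into a list and the caller checks whether it is empty. The class SchemaValidationSketch and the method validate are hypothetical names. Enabling namespace awareness here is an assumption made for safety, not something the example above requires.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class for illustration; not part of the JDK.
class SchemaValidationSketch {

    static final String XSD = """
            <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
                <xsd:element name="root" type="xsd:string"/>
            </xsd:schema>
            """;

    // Parses the XML against XSD and returns the validation error messages.
    static List<String> validate(String xml) {
        try {
            final var schema = SchemaFactory.newDefaultInstance()
                    .newSchema(new StreamSource(new StringReader(XSD)));

            final var factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // assumption: chosen for safety
            factory.setSchema(schema);

            final var errors = new ArrayList<String>();
            final var builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    // Recoverable validation errors are recorded, not thrown.
                    errors.add(e.getMessage());
                }
            });

            builder.parse(new InputSource(new StringReader(xml)));
            return errors;
        } catch (Exception e) {
            // Malformed XML or I/O problems end up here.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(validate("<root>abcd</root>"));
        System.out.println(validate("<root><child>abcd</child></root>"));
    }
}
```

An empty list means the document passed schema validation. Any entry means the returned DOM tree should not be trusted to match the schema.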

void setValidating (boolean validating)

Specifies that the parser produced by this code will validate documents as they are parsed.

// The XML document intentionally does not match the DTD.
final var xml = """
        <!DOCTYPE root [
            <!ELEMENT root (child-a)>
        ]>
        <root><child-z/></root>
        """;

final var factory = DocumentBuilderFactory.newInstance();

final var errorHandler = new DefaultHandler() {
    @Override
    public void error(SAXParseException e) {
        System.out.println("-- ErrorHandler error --");
        System.out.println(e);
    }
};

{
    final var ret = factory.isValidating();
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();
    builder.setErrorHandler(errorHandler);

    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var childZ = document.getElementsByTagName("child-z").item(0);
    System.out.println(childZ); // [child-z: null]
}

factory.setValidating(true);

{
    final var ret = factory.isValidating();
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();
    builder.setErrorHandler(errorHandler);

    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    // Result
    // ↓
    //-- ErrorHandler error --
    //org.xml.sax.SAXParseException; lineNumber: 4; columnNumber: 17;
    // Element type "child-z" must be declared.
    //-- ErrorHandler error --
    //org.xml.sax.SAXParseException; lineNumber: 4; columnNumber: 24;
    // The content of element type "root" must match "(child-a)".
}
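
Validation here means DTD validation as defined by the XML recommendation, and the errors it reports are recoverable: in the example above parse() still completes after both messages are printed. If an invalid document should be rejected outright, one option is to rethrow from the handler. The following is a minimal sketch of that approach. The class DtdValidationSketch, the method isValid, and the two sample constants are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class for illustration; not part of the JDK.
class DtdValidationSketch {

    static final String VALID = """
            <!DOCTYPE root [
                <!ELEMENT root (child-a)>
                <!ELEMENT child-a EMPTY>
            ]>
            <root><child-a/></root>
            """;

    // Same document as in the example above: child-z is not declared.
    static final String INVALID = """
            <!DOCTYPE root [
                <!ELEMENT root (child-a)>
            ]>
            <root><child-z/></root>
            """;

    // Returns true only if the document is well-formed and valid against its DTD.
    static boolean isValid(String xml) {
        try {
            final var factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);

            final var builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXParseException {
                    // Turn the recoverable validation error into a parse failure.
                    throw e;
                }
            });

            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Validation errors and well-formedness errors both land here.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(VALID));
        System.out.println(isValid(INVALID));
    }
}
```

Note that this sketch does not distinguish an invalid document from a malformed one. A caller that needs that distinction would have to record which handler method fired.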

void setXIncludeAware (boolean state)

Sets the state of XInclude processing.

final var sampleFile = Path.of("R:", "java-work", "sample.xml");
System.out.println(sampleFile); // R:\java-work\sample.xml

Files.writeString(sampleFile, """
        <child>abcd</child>
        """);

final var xml = """
        <root xmlns:xi="http://www.w3.org/2001/XInclude">
            <xi:include href="file:///R:/java-work/sample.xml" parse="xml" />
        </root>
        """;

final var factory = DocumentBuilderFactory.newNSInstance();

{
    final var ret = factory.isXIncludeAware();
    System.out.println(ret); // false

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var nodes = document.getElementsByTagName("child");
    System.out.println(nodes.getLength()); // 0
}

factory.setXIncludeAware(true);

{
    final var ret = factory.isXIncludeAware();
    System.out.println(ret); // true

    final var builder = factory.newDocumentBuilder();
    final var document = builder.parse(new ByteArrayInputStream(xml.getBytes()));

    final var child = document.getElementsByTagName("child").item(0);
    System.out.println(child); // [child: null]
    System.out.println(child.getTextContent()); // abcd
}
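
The example above depends on a fixed Windows path (R:\java-work). A more portable variant, sketched below under the assumption that the platform's temporary directory is writable, creates the included file with Files.createTempFile and builds the href from its URI. The class XIncludeSketch and the method countIncludedChildren are hypothetical names introduced for illustration.

```java
import java.io.StringReader;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration; not part of the JDK.
class XIncludeSketch {

    // Parses a document containing xi:include and counts the <child> elements.
    static int countIncludedChildren(boolean xincludeAware) {
        try {
            // Assumption: the temporary directory is writable.
            final var file = Files.createTempFile("sample", ".xml");
            try {
                Files.writeString(file, "<child>abcd</child>");

                final var xml = """
                        <root xmlns:xi="http://www.w3.org/2001/XInclude">
                            <xi:include href="%s" parse="xml"/>
                        </root>
                        """.formatted(file.toUri());

                // XInclude relies on namespaces, hence newNSInstance().
                final var factory = DocumentBuilderFactory.newNSInstance();
                factory.setXIncludeAware(xincludeAware);

                final var builder = factory.newDocumentBuilder();
                final var document = builder.parse(
                        new InputSource(new StringReader(xml)));

                return document.getElementsByTagName("child").getLength();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countIncludedChildren(false));
        System.out.println(countIncludedChildren(true));
    }
}
```

Because XInclude makes the parser read a resource named inside the document, it is safer to enable it only for input from a trusted source.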
