XML::LibXML::Document - DOM Document Class
use XML::LibXML;
$dom = XML::LibXML::Document->new( $version, $encoding );
$dom = XML::LibXML::Document->createDocument( $version, $encoding );
$strEncoding = $doc->encoding();
$doc->setEncoding($new_encoding);
$strVersion = $doc->version();
$doc->standalone;
$doc->setStandalone($numvalue);
my $compression = $doc->compression;
$doc->setCompression($ziplevel);
$docstring = $dom->toString($format);
$state = $doc->toFile($filename, $format);
$state = $doc->toFH($fh, $format);
$document->toStringHTML();
$bool = $dom->is_valid();
$dom->validate();
$root = $dom->documentElement();
$dom->setDocumentElement( $root );
$element = $dom->createElement( $nodename );
$element = $dom->createElementNS( $namespaceURI, $qname );
$text = $dom->createTextNode( $content_text );
$comment = $dom->createComment( $comment_text );
$attrnode = $doc->createAttribute($name [,$value]);
$fragment = $doc->createDocumentFragment();
$attrnode = $doc->createAttributeNS( $namespaceURI, $name [,$value] );
$cdata = $dom->createCDATASection( $cdata_content );
my $pi = $doc->createProcessingInstruction( $target, $data );
my $entref = $doc->createEntityReference($refname);
$document->importNode( $node );
$document->adoptNode( $node );
my $dtd = $doc->externalSubset;
my $dtd = $doc->internalSubset;
$doc->setExternalSubset($dtd);
$doc->setInternalSubset($dtd);
my $dtd = $doc->removeExternalSubset();
my $dtd = $doc->removeInternalSubset();
The Document class is in most cases the result of a parsing process, but sometimes it is necessary to create a document from scratch. The DOM Document class provides functions that conform to the DOM Core naming style.
It inherits all functions from XML::LibXML::Node as specified in the DOM specification. This makes it possible to access the nodes besides the root element at document level, a DTD for example. Support for these nodes is limited at the moment.
While nodes are generally bound to a document in the DOM concept, it is suggested that one should always create a node not bound to any document. There is no need to actually include the node in the document, but once the node is bound to a document, it is quite safe to assume that all strings have the correct encoding. If an unbound text node with an ISO encoded string is created (e.g. with $CLASS->new()), the toString function may not return the expected result.
All this may seem like a limitation as long as UTF-8 encoding is assured. If ISO encoded strings come into play, it is much safer to use the node creation functions of XML::LibXML::Document.
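As a minimal sketch of building a document from scratch with the node creation functions of XML::LibXML::Document, the following may help. The element names and text ("catalog", "item", "first entry") are made up for illustration and are not part of the library.

```perl
use strict;
use warnings;
use XML::LibXML;

# Create an empty document; version and encoding are both optional.
my $doc = XML::LibXML::Document->createDocument( "1.0", "UTF-8" );

# Nodes created through the document are bound to it, so their
# strings are handled in the document's encoding.
my $root = $doc->createElement("catalog");    # hypothetical element name
$doc->setDocumentElement($root);

my $item = $doc->createElement("item");       # hypothetical element name
$item->appendChild( $doc->createTextNode("first entry") );
$root->appendChild($item);
$root->appendChild( $doc->createComment(" built from scratch ") );

# Serialize with indentation (see the $format parameter of toString).
print $doc->toString(1);
```

Creating every node through $doc rather than through the node classes' own constructors keeps the encoding handling in one place.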
createDocument()
<?xml version="your version" encoding="your encoding"?>
Both parameters are optional. The default value for $version is 1.0. If the $encoding parameter is not set, the encoding will be left unset, which means UTF-8 is implied.
Calling createDocument without any parameters results in the following declaration:
<?xml version="1.0"?>
Alternatively, one can call this constructor directly from the XML::LibXML class level to save some typing. This has no effect on the class of the returned instance, which is always XML::LibXML::Document.
my $document = XML::LibXML->createDocument( "1.0", "UTF8" );
is therefore a shortcut for
my $document = XML::LibXML::Document->createDocument( "1.0", "UTF8" );
my $doc = XML::LibXML->createDocument( "1.0", "ISO-8859-15" );
print $doc->encoding; # prints ISO-8859-15
Optionally, encoding() can also be accessed as actualEncoding or getEncoding.
Note that setEncoding() has to be used very carefully: one encoding cannot simply be converted into any other, because some (or even all) characters may not exist in the new encoding. XML::LibXML will not test whether the operation is allowed or possible for the given document. The only switch assured to work is to UTF-8.
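The one switch described as safe, changing the document encoding to UTF-8, could look roughly like this. The starting encoding ISO-8859-1 is just an assumed example value.

```perl
use strict;
use warnings;
use XML::LibXML;

# Start with a document declared as ISO-8859-1 (an arbitrary example).
my $doc = XML::LibXML::Document->createDocument( "1.0", "ISO-8859-1" );
my $before = $doc->encoding;

# Switching to UTF-8 is the only change assured to work, since every
# character of the old encoding also exists in UTF-8.
$doc->setEncoding("UTF-8");
my $after = $doc->encoding;

print "$before -> $after\n";
```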
getVersion() is an alternative form of version().
Note that compression will only work if libxml2 is compiled with zlib support.
The optional $format parameter of toString() controls the indentation of the output. It is expected to be an integer value. If it is used, it can take three different values:
If $format is 0, then the document is dumped as it was originally parsed.
If $format is 1, libxml2 will add ignorable whitespace so that the node content is easier to read. Existing text nodes will not be altered.
If $format is 2 (or higher), libxml2 will act as with $format == 1, but it adds a leading and a trailing line break to each text node.
libxml2 uses a hardcoded indentation of 2 space characters per indentation level. This value cannot be altered at runtime.
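To get a feel for the $format values, a small sketch such as the following can be run. The sample XML is invented for the example, and the exact whitespace produced depends on the installed libxml2.

```perl
use strict;
use warnings;
use XML::LibXML;

# A tiny sample document (hypothetical content) without any whitespace.
my $doc = XML::LibXML->new->parse_string('<a><b>text</b><c/></a>');

my $compact = $doc->toString(0);    # dumped as parsed
my $pretty  = $doc->toString(1);    # libxml2 adds ignorable whitespace

print "format 0:\n$compact\n";
print "format 1:\n$pretty\n";
```

Comparing the two outputs side by side shows where libxml2 inserts line breaks and indentation.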
NOTE: XML::LibXML::Document::toString returns the data in the document encoding rather than UTF8!
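The following sketch illustrates the note above under the assumption of an ISO-8859-1 document. The element name "name" and the text are made up, and the behaviour shown should be checked against the installed version.

```perl
use strict;
use warnings;
use XML::LibXML;

# Document declared as ISO-8859-1 (an arbitrary example encoding).
my $doc  = XML::LibXML::Document->createDocument( "1.0", "ISO-8859-1" );
my $root = $doc->createElement("name");    # hypothetical element name
$doc->setDocumentElement($root);
$root->appendText("caf\xE9");              # "cafe" with e-acute (0xE9)

# toString() serializes in the document encoding, so the e-acute is
# expected as the single byte 0xE9 rather than a UTF-8 sequence.
my $bytes = $doc->toString();

printf "%d bytes, declaration: %s\n", length($bytes),
    ( $bytes =~ /^(<\?xml[^>]*\?>)/ )[0];
```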
For toFile() and toFH(), the format parameter has the same behaviour as in toString().
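A hedged sketch of writing a document with toFH() and toFile() follows. It uses the core module File::Temp to obtain a scratch file, and the sample XML is invented for the example.

```perl
use strict;
use warnings;
use XML::LibXML;
use File::Temp qw(tempfile);

my $doc = XML::LibXML->new->parse_string('<log><entry>one</entry></log>');

# Scratch file that is removed again when the program exits.
my ( $fh, $filename ) = tempfile( SUFFIX => ".xml", UNLINK => 1 );

# Write through an already opened filehandle, with indentation.
$doc->toFH( $fh, 1 );
close $fh or die "close failed: $!";
my $fh_size = -s $filename;

# Write to the same path by name; the format value works as in toString().
my $state     = $doc->toFile( $filename, 1 );
my $file_size = -s $filename;

print "toFH wrote $fh_size bytes, toFile wrote $file_size bytes\n";
```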
You may also pass an XML::LibXML::Dtd object to is_valid(), to validate against an external DTD:
if (!$dom->is_valid($dtd)) { warn("document is not valid!"); }
Again, you may pass a DTD object to validate().
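The difference between the two validation calls can be sketched as follows. The DTD and both sample documents are invented for the example; it assumes that validate() raises an exception on an invalid document, whereas is_valid() only returns a boolean.

```perl
use strict;
use warnings;
use XML::LibXML;

# A small external DTD built from a string (hypothetical content model).
my $dtd = XML::LibXML::Dtd->parse_string(<<'DTD');
<!ELEMENT note (to, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT body (#PCDATA)>
DTD

my $parser = XML::LibXML->new;
my $good   = $parser->parse_string('<note><to>Ann</to><body>Hi</body></note>');
my $bad    = $parser->parse_string('<note><body>Hi</body></note>');

# is_valid() just answers yes or no.
my $good_ok = $good->is_valid($dtd) ? 1 : 0;
my $bad_ok  = $bad->is_valid($dtd)  ? 1 : 0;

# validate() is wrapped in eval so the error can be inspected.
my $validated = eval { $bad->validate($dtd); 1 };
my $error     = $@;

print "good: $good_ok, bad: $bad_ok\n";
print "validate() failed: $error\n" unless $validated;
```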
Optionally, one can use getDocumentElement() instead of documentElement().
Since the name createProcessingInstruction() is quite long, one may use its short form createPI().
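Both spellings can be used interchangeably, as this short sketch suggests. The targets and data strings are arbitrary example values.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML::Document->createDocument( "1.0", "UTF-8" );

# Long form; target and data are example values only.
my $long = $doc->createProcessingInstruction( "xml-stylesheet",
    'type="text/xsl" href="style.xsl"' );

# Short form with a hypothetical target.
my $short = $doc->createPI( "render", "mode=fast" );

# The node name of a processing instruction is its target.
print $long->nodeName,  "\n";
print $short->nodeName, "\n";
```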
An entity reference is unique to a document and cannot be passed to other documents as other nodes can.
NOTE: Text content containing something that looks like an entity reference will not be expanded to a real entity reference unless it is a predefined entity.
my $string = "&foo;";
$some_element->appendText( $string );
print $some_element->textContent; # prints "&foo;"
NOTE: Don't try to use importNode() to import subtrees that contain an entity reference - even if the entity reference is the root node of the subtree. This will cause serious problems to your program. This is a limitation of libxml2 and not of XML::LibXML itself.
After a document has adopted a node, the node, its attributes and all its descendants belong to the new document. Because the node no longer belongs to the old document, it will be unlinked from its old location first.
NOTE: Don't try to use adoptNode() to adopt subtrees that contain entity references - even if the entity reference is the root node of the subtree. This will cause serious problems to your program. This is a limitation of libxml2 and not of XML::LibXML itself.
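The contrast between importNode() and adoptNode() can be sketched as follows. It assumes that importNode() hands back a copy and leaves the source untouched, while adoptNode() moves the node itself. The documents and element names are invented for the example and contain no entity references.

```perl
use strict;
use warnings;
use XML::LibXML;

# Source document with one child element (hypothetical content).
my $src = XML::LibXML->new->parse_string('<src><item id="1">x</item></src>');

# Target document built from scratch.
my $dst = XML::LibXML::Document->createDocument( "1.0", "UTF-8" );
$dst->setDocumentElement( $dst->createElement("dst") );

my $item = $src->documentElement->firstChild;

# importNode(): the returned node is a copy owned by $dst.
my $copy = $dst->importNode($item);
$dst->documentElement->appendChild($copy);
my $src_has_child_after_import = $src->documentElement->hasChildNodes ? 1 : 0;

# adoptNode(): the node itself is unlinked from $src and now belongs to $dst.
$dst->adoptNode($item);
$dst->documentElement->appendChild($item);
my $src_has_child_after_adopt = $src->documentElement->hasChildNodes ? 1 : 0;

my @items = $dst->documentElement->childNodes;
print scalar(@items), " item nodes in target\n";
```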
NOTE: For both externalSubset and internalSubset, keep in mind that Dtd nodes are not ordinary nodes in libxml2. The support for these nodes in XML::LibXML is still limited. In particular, one may not want to use common node functions on doctype declaration nodes!
setExternalSubset() sets a DTD node as the external subset of the given document.
setInternalSubset() sets a DTD node as the internal subset of the given document.
If a document has an external subset defined, it can be removed from the document with removeExternalSubset(). The removed DTD node will be returned.
If a document has an internal subset defined, it can be removed from the document with removeInternalSubset(). The removed DTD node will be returned.
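A rough sketch of removing an internal subset and putting it back is shown below. The sample document is invented for the example. Given the limited support for Dtd nodes, the sketch deliberately avoids calling general node functions on the returned DTD node.

```perl
use strict;
use warnings;
use XML::LibXML;

# Sample document with an internal subset (hypothetical content).
my $doc = XML::LibXML->new->parse_string(<<'XML');
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
]>
<note>hello</note>
XML

my $had_dtd = defined $doc->internalSubset ? 1 : 0;

# Remove the internal subset; the removed DTD node is returned.
my $dtd     = $doc->removeInternalSubset();
my $without = $doc->toString();

# Attach the same DTD node again as the internal subset.
$doc->setInternalSubset($dtd);
my $with = $doc->toString();

print "without subset:\n$without\nwith subset:\n$with\n";
```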
Matt Sergeant, Christian Glahn
XML::LibXML, XML::LibXML::DOM, XML::LibXML::Element, XML::LibXML::Text, XML::LibXML::Attr, XML::LibXML::Comment, XML::LibXML::DocumentFragment, XML::LibXML::DTD
1.53