XSLT, generate DTD, declaration
1. | How do I generate a reference to a DTD |
    <xsl:text disable-output-escaping="yes">
      <![CDATA[
        <!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml">
      ]]>
    </xsl:text>

Mike Brown notes:

As Mike Kay pointed out, I didn't notice that later in the Output part of the spec, it says that using disable-output-escaping lifts any well-formedness restrictions. This is just one of those things where the spec is carefully worded for clarity from the "here are the results of using this instruction" point of view, with few or no examples given for "if you're trying to achieve a certain result, here's how to use this instruction to do it"... I think more informative examples would help, but the wording of the normative sections is probably fine.

I am using XSLT for just that very purpose, and I test the starting element to find out which doctype I want to insert. All you have to do is output CDATA:

    <!-- MATCH ROOT NODE, GENERATE DOCTYPE BASED ON STARTING ELEMENT: -->
    <xsl:template match="/">
      <xsl:choose>
        <xsl:when test="bva.grp">
          <xsl:text disable-output-escaping="yes"><![CDATA[<!DOCTYPE VALUE-ADD.GROUP PUBLIC "-//Brooker's//DTD Brooker's Legislation Value-Add Group//EN">]]></xsl:text>
        </xsl:when>
        <xsl:when test="act">
          <xsl:text disable-output-escaping="yes"><![CDATA[<!DOCTYPE ACT PUBLIC "-//Brooker's//DTD Brooker's Act//EN">]]></xsl:text>
        </xsl:when>
        ... etc
      </xsl:choose>
      <xsl:apply-templates />
    </xsl:template>

And obviously you can put entity references inside the CDATA section as well, although in this particular case I haven't needed to.
| |
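When one fixed doctype is enough, the standard xsl:output attributes doctype-public and doctype-system can emit it without disable-output-escaping. The following is only a sketch under assumptions: the result's root element is taken to be wml, and the card id and paragraph content are placeholders invented for illustration. Because these attributes are fixed when the stylesheet is written, this form cannot choose between doctypes at run time the way the xsl:choose example does.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The serializer writes the DOCTYPE, using the name of the
       first output element (wml here) as the document type name. -->
  <xsl:output method="xml"
      doctype-public="-//WAPFORUM//DTD WML 1.1//EN"
      doctype-system="http://www.wapforum.org/DTD/wml_1.1.xml"/>

  <xsl:template match="/">
    <wml>
      <!-- Placeholder card; id and content are hypothetical -->
      <card id="main">
        <p><xsl:value-of select="/*"/></p>
      </card>
    </wml>
  </xsl:template>

</xsl:stylesheet>
```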
2. | How to copy the DOCTYPE value |
If you preprocess a document with:

    <!DOCTYPE xxx SYSTEM "yyy">
    <xxx>
      <foo/>
    </xxx>

into:

    <!DOCTYPE xxx SYSTEM "yyy">
    <!-- DOCTYPE xxx SYSTEM "yyy" -->
    <xxx>
      <foo/>
    </xxx>

This doesn't alter the validity of the document in any way, but does add a "comment item" into the document's infoset that XSLT/XPath can address. Then you can use an XPath expression like:

    //comment()[contains(.,'DOCTYPE')][1]

to refer to the first comment containing DOCTYPE, and then use a combination of normalize-space(), substring-after(), and substring-before() to get out the URI for the DTD of the document.

Since you cannot set the doctype-system="" property of <xsl:output> dynamically, you then have to resort to <xsl:value-of disable-output-escaping="yes"/> and concat() to literally print the <!DOCTYPE into the result tree.

Given the preprocessed source document above, the following XSLT transform produces the output:

    <!DOCTYPE xxx SYSTEM "yyy">
    <xxx>
      <foo/>
    </xxx>

    <xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output indent="yes"/>
      <xsl:template match="/">
        <!--
         | Output the Doctype in the result based on
         | the DOCTYPE comment we preprocessed into the document
         +-->
        <!-- For convenience, get a literal quote sign in a variable -->
        <xsl:variable name="q">"</xsl:variable>
        <!-- Get the DOCTYPE comment in a variable -->
        <xsl:variable name="d" select="//comment()[contains(.,'DOCTYPE')][1]"/>
        <!-- Get the "uri" part of the doctype comment -->
        <xsl:variable name="e" select="substring-after(normalize-space($d), 'SYSTEM ')"/>
        <!-- Strip off the quotes from the "uri" -->
        <xsl:variable name="f" select="substring-before(substring-after($e,$q),$q)"/>
        <!-- Output the <!DOCTYPE -->
        <xsl:value-of disable-output-escaping="yes"
            select="concat('&lt;!DOCTYPE ',name(/*[1]), ' SYSTEM ',$q,$f,$q,'> ')"/>
        <xsl:apply-templates select="@*|*|processing-instruction()|comment()"/>
      </xsl:template>
      <!--
       | Identity Transformation.
       | XT doesn't seem to support the
       | more terse "@*|node()" at present, so this is the long form.
       +-->
      <xsl:template match="@*|*|text()|processing-instruction()|comment()">
        <xsl:copy>
          <xsl:apply-templates select="@*|*|text()|processing-instruction()|comment()"/>
        </xsl:copy>
      </xsl:template>
      <!-- Suppress printing our little trick in the output -->
      <xsl:template match="//comment() [contains(.,'DOCTYPE')][1]"/>
    </xsl:transform>

Mike Brown cautions:

Anyone using this should note that this will only work if the string '-->' does not occur in the internal DTD subset. The following would throw it, for example:

    <!DOCTYPE xxx SYSTEM "yyy" [
      <!-- a comment in the internal subset -->
      <!ENTITY foo "bar">
    ]>
| |
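To see how the string functions peel the system identifier out of the comment text, here is a small stand-alone sketch. It is an illustration only: the comment text is hard-coded in the variable d rather than read from a source document, and text output is used so that the result is just the extracted value. With the sample comment used above, the steps are: normalize-space() yields DOCTYPE xxx SYSTEM "yyy", substring-after(..., 'SYSTEM ') yields "yyy" with its quotes, and the final pair of calls strips the quotes to leave yyy.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- A literal quote sign -->
    <xsl:variable name="q">"</xsl:variable>
    <!-- Stand-in for the text of the preprocessed DOCTYPE comment -->
    <xsl:variable name="d"> DOCTYPE xxx SYSTEM "yyy" </xsl:variable>
    <!-- Everything after 'SYSTEM ', i.e. the quoted identifier -->
    <xsl:variable name="e"
        select="substring-after(normalize-space($d), 'SYSTEM ')"/>
    <!-- Drop the opening quote, then keep what precedes the closing quote -->
    <xsl:value-of select="substring-before(substring-after($e, $q), $q)"/>
  </xsl:template>

</xsl:stylesheet>
```

Run against any well-formed input, this writes yyy.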
3. | Can I specify a DOCTYPE in my stylesheet |
Yes:

    <!DOCTYPE xsl:stylesheet [
      <!ENTITY API "this" >
    ]>
    ....
    <xsl:template match="/index">
      <general>
        <title>&API; Index</title>

but then &API; will be expanded by the XML parser as it parses the stylesheet, so the XSL engine will see

    <title>this Index</title>

In which case there may or may not be any point in having the entity; you could have just written "this Index" in place.

You can probably more usefully share the DTD with your original document, if that already has a definition of the entity:

    <!DOCTYPE xsl:stylesheet SYSTEM "..\..\docs\dtds\general.dtd" >

This only works if your XSL system uses a non-validating parser that does read external entities. (They are not required to be read according to the XML spec.)

Note that both of these solutions put "this" into your output. If you really want &API; in the output, then you don't need a doctype in your stylesheet at all; just use

    <title>
      <xsl:text disable-output-escaping="yes">&amp;API; Index</xsl:text>
    </title>

Note however that this last solution only works if you know the output tree is going to be linearised into a file and then reparsed as XML. If instead the result tree is being stuffed straight to a renderer or other XML application as an XML tree, then the receiver will get the characters & A P I ; not an entity reference.
| |
4. | Is there an XSLT DTD |
There's an appendix to the XSLT spec, "DTD Fragment for XSLT Stylesheets (Non-Normative)", at W3C.

David Carlisle adds:

    > Thank you but it doesn't help as it is not a complete DTD.

No, you have to read the instructions on how to complete it for your stylesheet. It is _impossible_ to write a DTD that covers every XSL stylesheet, as they may include arbitrary elements from the target DTD. So you have to define the result-elements entity to list any result elements that may appear inside an xsl element, and then add all possible result elements to the DTD. Normally it isn't worth the bother; no one uses validating parsers to read XSL stylesheets, do they?

John Simpson adds:

If you're transforming to HTML, then there's one "valid XSLT DTD" with one definition of result-elements. If you're transforming to MathML, there's a completely different result-elements. In fact, there are as many definitions of result-elements as there are possible XML vocabularies in the universe. Hence there can *be* no single result-elements, and that's why the appendix to the XSLT Rec is both a fragment and non-normative.
| |
5. | DOCTYPE in output |
    <!DOCTYPE DLmeta SYSTEM "http://www.dlmeta.de/dlmeta/2000/DLmeta.dtd" [
      <!ENTITY % LocalInclude SYSTEM "http://www.dlmeta.de/local/2000/ariadne/ariadne_local.dtd">
      %LocalInclude;
    ]>

How would I declare this in my style sheet?

As others have commented, you can't directly, but do you _have_ to use a local subset? The above is equivalent to

    <!DOCTYPE DLmeta SYSTEM "local-DLmeta.dtd" >

where local-DLmeta.dtd is

    <!ENTITY % LocalInclude SYSTEM "http://www.dlmeta.de/local/2000/ariadne/ariadne_local.dtd">
    %LocalInclude;
    <!ENTITY % main SYSTEM "http://www.dlmeta.de/dlmeta/2000/DLmeta.dtd" >
    %main;

and you can produce the doctype in this form using the standard

    <xsl:output doctype-system="local-DLmeta.dtd"/>
| |
6. | Internal DTD-subset and CDATA-section |
    > what could I write in my XSLT to output the following
    > as part of the result:
    >
    > <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 20001102//EN"
    >   "http://www.w3.org/TR/2000/CR-SVG-20001102/DTD/svg-20001102.dtd"
    > [
    >   <!ENTITY fast-slow "0 0 .5 1">
    >   <!ENTITY slow-fast ".5 0 1 1">
    > ]>
    > <svg
    >   xmlns="http://www.w3.org/Graphics/SVG/SVG-19990812.dtd"
    >   xmlns:xlink="http://www.w3.org/1999/xlink" viewBox="0
    >   0 800 600">
    > <style type="text/css">
    > <![CDATA[
    > .balls {font: 30pt arial}
    > ]]></style>

You can set the DOCTYPE declaration and the fact that you want to use a CDATA section (although as David C. said, there's no point in having one with the example as shown) using the xsl:output element:

    <xsl:output doctype-public="-//W3C//DTD SVG 20001102//EN"
                doctype-system="http://www.w3.org/TR/2000/CR-SVG-20001102/DTD/svg-20001102.dtd"
                cdata-section-elements="style" />

However, this doesn't allow you to define an internal subset in the way that you have. Using pure XSLT, you have to do this using disable-output-escaping:

    <xsl:text disable-output-escaping="yes"><![CDATA[
    <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 20001102//EN"
      "http://www.w3.org/TR/2000/CR-SVG-20001102/DTD/svg-20001102.dtd"
    [
      <!ENTITY fast-slow "0 0 .5 1">
      <!ENTITY slow-fast ".5 0 1 1">
    ]>
    ]]></xsl:text>

(Saxon has some support for creating internal DTD subsets.)

There's not much point in defining entities unless you're going to use them, so presumably you'll also want to create entity references within your output. Again, you have to disable output escaping to ensure that the entities are used:

    <xsl:text disable-output-escaping="yes">&amp;fast-slow;</xsl:text>

(Or you can use saxon:entity-ref.)

Note that you cannot use disable-output-escaping to put the entity reference in as an attribute value. There is no way of doing that in XSLT.

Having said that, you should be careful using disable-output-escaping. You cannot guarantee that a processor will understand it or use it. Generally, you should not care about using entities in your output; you should just generate the text that you want.

To create the namespace declarations, you just have to have the svg element created somewhere where the namespace declarations are in scope. For example:

    <svg xmlns="http://www.w3.org/Graphics/SVG/SVG-19990812.dtd"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         viewBox="0 0 800 600">
      ...
    </svg>
| |
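Putting the pure xsl:output pieces together, a complete stylesheet might look like the sketch below. It is not the original poster's stylesheet: the template body is a minimal stand-in, and the internal subset with its entities is deliberately left out, since xsl:output cannot produce it. One detail worth knowing is that names in cdata-section-elements are expanded using the namespace declarations in scope on xsl:output, including the default namespace, so declaring the SVG namespace as the default on xsl:stylesheet lets the unprefixed name style refer to the namespaced style element.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/Graphics/SVG/SVG-19990812.dtd"
    xmlns:xlink="http://www.w3.org/1999/xlink">

  <!-- DOCTYPE and CDATA handling are left to the serializer -->
  <xsl:output method="xml"
      doctype-public="-//W3C//DTD SVG 20001102//EN"
      doctype-system="http://www.w3.org/TR/2000/CR-SVG-20001102/DTD/svg-20001102.dtd"
      cdata-section-elements="style"/>

  <xsl:template match="/">
    <!-- Both namespace declarations are in scope here, so they
         are copied onto the svg element in the result -->
    <svg viewBox="0 0 800 600">
      <style type="text/css">.balls {font: 30pt arial}</style>
    </svg>
  </xsl:template>

</xsl:stylesheet>
```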
7. | Testing if current document equal to xxx.xml |
<xsl:if test="generate-id(/) = generate-id(document('doc.xml'))"> Yes it is </xsl:if> | |
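In context, the test might sit in a stylesheet like the sketch below, which is an illustration rather than a tested recipe. Two assumptions to keep in mind: the relative URI 'doc.xml' is resolved against the stylesheet's base URI, not the source document's, and the comparison only succeeds if the processor treats the source document and the document loaded through document() as the same tree, which is worth checking on your processor.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:choose>
      <!-- Two root nodes are the same node exactly when their
           generated ids are equal -->
      <xsl:when test="generate-id(/) = generate-id(document('doc.xml'))">
        <xsl:text>The source document is doc.xml</xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <xsl:text>The source document is some other document</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```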
8. | Match on an element when Doctype is present |
This is a FAQ. To match in the XHTML namespace (which is defaulted by the XHTML DTD even if you don't make it explicit in the document), you need to declare the XHTML namespace in your stylesheet with something like

    xmlns:h="http://www.w3.org/1999/xhtml"

then use h:html rather than html, and likewise for every other element name in your match patterns and select expressions.
|
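As a minimal sketch of the idea, the stylesheet below binds the prefix h to the XHTML namespace and prints the document title as text. The prefix h is arbitrary, and the choice of extracting the title is only for illustration. Note that every step of the path needs the prefix, not just the first.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="http://www.w3.org/1999/xhtml">

  <xsl:output method="text"/>

  <!-- match="/html" would not match here: an unprefixed name in a
       pattern means "no namespace", and the XHTML elements are in
       the XHTML namespace -->
  <xsl:template match="/h:html">
    <xsl:value-of select="h:head/h:title"/>
  </xsl:template>

</xsl:stylesheet>
```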