1. XSLT 2.0 examples
During the ongoing discussion about XSLT 2 today and considering/not considering the switch, I thought: why isn't there a nice all-encompassing example that shows the merits of XSLT 2 over XSLT 1? It'll come as no surprise that there's no trivial answer to that, so I figured: why not ask the masses (some people claim that the masses are always right, though that is debatable). The target audience is: XSLT 1 users that would like to know more / understand more of XSLT 2 and are considering the switch. Before we start discussing the template itself, let me throw in some rules:
Apart from (6) and (7), I think the rules are fairly trivial. I put in (7) so that people who want to see other tools in action (Saxon, Gestalt, Altova) can do so (it follows that this means no SA behavior, sorry). I put in (6) to keep the example rather trivial. Would users of the mentioned and/or missing tools be so kind as to answer with a one-liner that calls the template below from a command line?

About the details that should be in the example: I thought of a nice example that can possibly be made infinitely better. Consider it a first draft, and I invite everyone to shoot at it (you may even blast it away ;) This is what I put in so far:
The example is rather trivial (it should be, I believe). It takes a list of users of XSLT products, and groups them per product:

<james-johnsson>Saxon, c, xslt 2</james-johnsson>
<super-troopers>xsltproc, nc, xslt 1</super-troopers>

Here, the node name is the user. Then follows a CSV string. The first part is the processor, the second says Compliant or NonCompliant, the third says the language (I am pretty sure the input is not correct, sorry about my lack of knowledge of the compliancy level). The output groups per processor as follows, where some processing is done on the strings and the users are comma-concatenated under <users>:

<processor name="Xsltproc">
  <level>processor is non-compliant</level>
  <language>XSLT 1</language>
  <users>George Williams Geraldson, Super Troopers</users>
</processor>

I understand this is a rather superficial example. If anybody can make it better or clearer, input is welcome. Keep in mind that it should be kept easy as well as showing the power of XSLT 2 (so it can be used as a showcase).
Here's the XSLT so far. Any ideas, complaints, suggestions, rewrites, opinions etc. are welcome:

<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:my"
    version="2.0"
    exclude-result-prefixes="#all">

  <xsl:output indent="yes" />

  <!-- the input, just in a variable ready to use, easy for testing,
       no need for exslt:node-set() -->
  <xsl:variable name="preferences">
    <james-johnsson>Saxon, c, xslt 2</james-johnsson>
    <george-williams-geraldson>xsltproc, nc, xslt 1</george-williams-geraldson>
    <super-troopers>xsltproc, nc, xslt 1</super-troopers>
    <merry-mirriams>libxslt, nc, xslt 1</merry-mirriams>
    <john-ronald-reuel-tolkien>saxon, c, xslt 2</john-ronald-reuel-tolkien>
    <sir-tomald-richards>gestAlt, nc, XSLT 2</sir-tomald-richards>
    <agatha-kirsten>saxon, c, xslt 2</agatha-kirsten>
    <mollie-jollie>saxon, c, xslt 2</mollie-jollie>
  </xsl:variable>

  <xsl:template match="/" name="main">
    <xsl:variable name="micro-pipeline">
      <xsl:apply-templates select="$preferences/*" />
    </xsl:variable>

    <!-- group by processor -->
    <xsl:for-each-group select="$micro-pipeline/processor"
                        group-by="token[1]/upper-case(text())">
      <processor name="{my:camel-case(token[1])}">
        <xsl:apply-templates select="token[position() = 2 to 3]" />
        <users>
          <!-- join the users in one string and camel case their names -->
          <xsl:value-of select="string-join(my:camel-case(current-group()/user), ', ')" />
        </users>
      </processor>
    </xsl:for-each-group>
  </xsl:template>

  <!-- matches for $preferences nodes -->
  <xsl:template match="*" priority="0">
    <processor>
      <xsl:next-match />
    </processor>
  </xsl:template>

  <xsl:template match="*">
    <user><xsl:value-of select="local-name(.)" /></user>
    <xsl:next-match />
  </xsl:template>

  <xsl:template match="text()">
    <xsl:for-each select="tokenize(., ',')">
      <token><xsl:value-of select="normalize-space(.)" /></token>
    </xsl:for-each>
  </xsl:template>

  <!-- what follows: matches for micro pipeline
       all matches are case-insensitive, with no need for translate()
       and trouble with more complex characters -->
  <xsl:template match="token[matches(., '^c$', 'i')]">
    <level>processor is compliant</level>
  </xsl:template>

  <xsl:template match="token[matches(., '^nc$', 'i')]">
    <level>processor is non-compliant</level>
  </xsl:template>

  <xsl:template match="token[matches(., '^xslt', 'i')]">
    <language><xsl:value-of select="upper-case(.)" /></language>
  </xsl:template>

  <!-- put the nasty bit aside in a function:
       it camel-cases a dashed or space delimited string -->
  <xsl:function name="my:camel-case" as="xs:string*">
    <xsl:param name="string" as="xs:string*"/>
    <xsl:sequence select="
      for $s in $string
      return string-join(
        for $word in tokenize($s, '-| ')
        return concat(upper-case(substring($word, 1, 1)), substring($word, 2)),
        ' ')" />
  </xsl:function>

</xsl:stylesheet>
2. Forward and backwards compatibility
This text is essentially unchanged from the 1.0 spec. It's actually a brilliant bit of future-proofing. Suppose that XSLT 3.0 has just been published, and it includes a new <xsl:perform-magic> instruction, which is implemented in Saxon version 19.2, but not yet in MSXML6. You want to invoke this instruction when your stylesheet is running under Saxon, but when running under MSXML6, you just want to leave out that part of the output. So you write:

<xsl:template match="thing" version="3.0">
  <xsl:perform-magic select="magic-dust">
    <xsl:fallback>Sorry, Microsoft don't do magic</xsl:fallback>
  </xsl:perform-magic>
</xsl:template>

Specifying version="3.0" means that the Microsoft processor (or any XSLT 1.0 or 2.0 processor) is obliged to execute the xsl:fallback instruction. If you had said version="1.0" or version="2.0", then the processor would instead have thrown a static error saying that there is no such instruction as xsl:perform-magic.
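The same mechanism can be tried out today with a real 2.0 instruction instead of an imaginary 3.0 one. What follows is only a minimal sketch: the input vocabulary (an <items> root with <item type="..."> children) and the output element names are made up for illustration. A 2.0 processor performs the grouping and ignores xsl:fallback; a conforming 1.0 processor, running in forwards-compatible mode because of version="2.0", should execute only the xsl:fallback content.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical input: <items><item type="a"/><item type="b"/></items> -->
  <xsl:template match="/items">
    <groups>
      <!-- Known to a 2.0 processor; an unknown instruction to a 1.0
           processor, which is in forwards-compatible mode here -->
      <xsl:for-each-group select="item" group-by="@type">
        <group type="{current-grouping-key()}"
               size="{count(current-group())}"/>
        <!-- Ignored by a 2.0 processor; executed by a 1.0 processor -->
        <xsl:fallback>
          <xsl:comment>grouping skipped: XSLT 2.0 processor required</xsl:comment>
        </xsl:fallback>
      </xsl:for-each-group>
    </groups>
  </xsl:template>

</xsl:stylesheet>
```

Not every 1.0 processor implements forwards-compatible mode faithfully, so it is worth testing this against the processors you actually need to support.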
3. 1.0 and 2.0 differences
The expression substring-before(//foo/foo2, '-') is a valid XPath 1.0 expression. In XPath 2.0 it is valid only if //foo/foo2 selects at most one node. The Saxon error message implies that it is selecting more than one node. If you want to select only the first node, use substring-before((//foo/foo2)[1], '-'). If you want to process all the nodes (in XPath 2.0), use for $x in //foo/foo2 return substring-before($x, '-').
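Both fixes can be seen side by side in a small stylesheet. This is just a sketch; the sample input in the comment is invented for the purpose.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical input:
       <doc><foo><foo2>ab-1</foo2><foo2>cd-2</foo2></foo></doc> -->
  <xsl:template match="/">
    <!-- First node only: the [1] filter leaves at most one item -->
    <xsl:value-of select="substring-before((//foo/foo2)[1], '-')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Every node: the for expression calls the function once per node
         and returns a sequence of strings -->
    <xsl:value-of
        select="for $x in //foo/foo2 return substring-before($x, '-')"
        separator=", "/>
  </xsl:template>

</xsl:stylesheet>
```

For the sample input this writes ab on the first line and ab, cd on the second.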
4. Wildcards in namespace
XPath 2.0 allows constructs of the form *:local which will match local in any namespace, but there's no way of matching a set of namespaces. I would recommend using a first transformation pass to normalize the namespace URI, to keep this separate from the "real" transformation logic. This is just a variant on the identity template:

<xsl:template match="one-uri:*">
  <xsl:element name="{local-name()}" namespace="two-uri">
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>
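In XSLT 2.0 the two passes can live in one stylesheet, with the first pass building a temporary tree in its own mode. The following is a sketch under assumptions: the namespace URIs urn:example:one and urn:example:two, the mode name normalize, and the item, report and entry element names are all placeholders, not anything prescribed above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:one="urn:example:one"
    xmlns:two="urn:example:two"
    exclude-result-prefixes="one two">

  <!-- Pass 1: a temporary tree in which every element from
       urn:example:one has been moved to urn:example:two -->
  <xsl:variable name="normalized">
    <xsl:apply-templates select="/" mode="normalize"/>
  </xsl:variable>

  <xsl:template match="one:*" mode="normalize">
    <xsl:element name="{local-name()}" namespace="urn:example:two">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates mode="normalize"/>
    </xsl:element>
  </xsl:template>

  <!-- Everything else is copied unchanged (identity template) -->
  <xsl:template match="@*|node()" mode="normalize">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="normalize"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 2: the "real" logic only has to know about urn:example:two -->
  <xsl:template match="/">
    <report>
      <xsl:apply-templates select="$normalized//two:item"/>
    </report>
  </xsl:template>

  <xsl:template match="two:item">
    <entry><xsl:value-of select="."/></entry>
  </xsl:template>

</xsl:stylesheet>
```

With more than two source namespaces, one extra template per namespace in the normalize mode would be enough; the second pass stays untouched.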
5. Boolean tests in 2.0
Beware, though, that this will get you burned again when you start using XSLT 2 scented water. xsl:value-of returns a text node whose string value is the string value of the expression. This is subtly, or not so subtly, different from a string. It doesn't make much difference in XSLT 1, as the only way to carry strings around is to put them in text nodes, but in XPath 2 you can have sequences of strings and sequences of text nodes (and sequences that contain both strings and text nodes), and the rules for the two cases (in particular whether spaces are automatically inserted between adjacent items) are different.

Yes, but I think the 2.0 way makes more sense. Just to make sure we are talking about the same thing (and to help cement my knowledge), consider:

<root>
  <node>foo</node>
  <node>bar</node>
</root>

In 1.0:

<xsl:template match="root">
  <xsl:value-of select="node"/>
</xsl:template>

returns 'foo', because in XSLT 1.0 'first item semantics' apply when a value-of is performed on a sequence. In 2.0 the same template would return 'foo bar', that is, all items in the sequence with a single space as a separator. In order to remove or control the space, we can use the separator attribute on value-of:

<xsl:value-of select="node" separator=""/>

which would produce 'foobar'. For me, that's much more intuitive than just picking the first one. Another plus for 2.0 :)

Of course, if there is another sequence-related area to get burned on, please post an example. It's good to know the gotchas up front.
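The strings-versus-text-nodes difference mentioned at the start of this answer can be made visible with a small stylesheet. This is a sketch with invented variable and element names; it relies on the 2.0 rule that adjacent text nodes are merged before the separator is applied, whereas adjacent atomic values are kept apart by it.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output indent="yes"/>

  <!-- A sequence of two atomic strings -->
  <xsl:variable name="strings" as="xs:string*" select="'foo', 'bar'"/>

  <!-- A sequence of two parentless text nodes; the as attribute stops
       them being wrapped (and merged) in a document node -->
  <xsl:variable name="text-nodes" as="text()*">
    <xsl:value-of select="'foo'"/>
    <xsl:value-of select="'bar'"/>
  </xsl:variable>

  <xsl:template match="/" name="main">
    <out>
      <!-- Strings are joined with the default separator: foo bar -->
      <strings><xsl:value-of select="$strings"/></strings>
      <!-- Adjacent text nodes are merged first: foobar -->
      <text-nodes><xsl:value-of select="$text-nodes"/></text-nodes>
    </out>
  </xsl:template>

</xsl:stylesheet>
```

The same effect explains why select="node" and select="node/text()" can give different results against the sample document above: the first selects elements, which are atomized into separate strings; the second selects text nodes, which are merged.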
6. xsl expressions return type
To its caller. For example,

<xsl:variable name="x" as="xs:boolean">
  <xsl:apply-templates select="chap" mode="doc:has-footnotes"/>
</xsl:variable>

If the call on apply-templates returns a boolean, that boolean will be the value of variable $x.

We all tend to think in terms of the 1.0 model where XPath expressions read the source document and XSLT instructions write to the result tree. Thanks largely to Jeni Tennison's intervention half way through the 2.0 design process, that's no longer the processing model: instructions and expressions now both return results to their caller, and the result can be any sequence of atomic values or nodes.

There are basically two ways of getting the result of an XSLT instruction back into the XPath world to make it available for further processing: you can assign it to a variable, as above, or you can return it as the result of a function, as in my earlier example:

<xsl:function name="doc:has-footnotes" as="xs:boolean">
  <xsl:param name="chap" as="element()"/>
  <xsl:apply-templates select="$chap"/>
</xsl:function>
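To make the round trip concrete, here is one way the pieces might fit together. It is a sketch, not the original poster's stylesheet: the doc namespace URI, the book, chap, title and footnote element names, and the use of the doc:has-footnotes mode inside the function are all assumptions made for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:doc="urn:example:doc"
    exclude-result-prefixes="xs doc">

  <!-- The function returns whatever the template rule returns -->
  <xsl:function name="doc:has-footnotes" as="xs:boolean">
    <xsl:param name="chap" as="element()"/>
    <xsl:apply-templates select="$chap" mode="doc:has-footnotes"/>
  </xsl:function>

  <!-- xsl:sequence hands back the boolean itself, not a text node -->
  <xsl:template match="chap" mode="doc:has-footnotes" as="xs:boolean">
    <xsl:sequence select="exists(.//footnote)"/>
  </xsl:template>

  <!-- The result is back in the XPath world: usable in a predicate -->
  <xsl:template match="/book">
    <chapters-with-footnotes>
      <xsl:for-each select="chap[doc:has-footnotes(.)]">
        <title><xsl:value-of select="title"/></title>
      </xsl:for-each>
    </chapters-with-footnotes>
  </xsl:template>

</xsl:stylesheet>
```

If the function were called on an element with no matching rule in that mode, the built-in template would return text nodes rather than a boolean, and the as="xs:boolean" declaration would turn that into a type error instead of a silently wrong answer.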
7. Is XSLT 2.0 Turing complete
There is a proof that XSLT 1.0 is Turing complete at unidex. The proof clearly applies equally to XSLT 2.0, since the Universal Turing Machine used in the proof is a legal XSLT 2.0 stylesheet.
8. Copy, without namespaces
In XSLT 2.0 you can do xsl:copy without copying namespaces by adding the attribute copy-namespaces="no" to the xsl:copy or xsl:copy-of element.
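A short sketch of the attribute in use follows; the envelope and payload vocabulary and both namespace URIs are invented for the example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input:
       <envelope xmlns:env="urn:example:env" xmlns:app="urn:example:app">
         <app:payload><app:value>42</app:value></app:payload>
       </envelope> -->
  <xsl:template match="/">
    <!-- Copy the first child of the root element without the namespace
         nodes it merely inherits from its ancestors -->
    <xsl:copy-of select="/*/*[1]" copy-namespaces="no"/>
  </xsl:template>

</xsl:stylesheet>
```

Only unused namespaces are dropped. Here the copied element keeps its declaration for the app prefix, because its own name needs it, while the env declaration should disappear from the output.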
9. Start processing at a named template
Yes. FXSL for XSLT 2.0 has all the initial templates declared as:

<xsl:template match="/" name="initial">

if a source document is not required. That way, Dimitre can specify a dummy source document (as that is the way he prefers to work), and I can specify an initial template (as I like to work from the command line).

Yes. Basically, you have nothing for '/' to match against. When you have no source document, there is no initial context node.
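A stylesheet written this way might look like the following sketch. The body of the template is arbitrary; the one thing that matters is that it never refers to the context item, so it behaves the same whether it is entered by matching '/' in a dummy document or by name.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/" name="initial">
    <!-- No reference to the context item anywhere in this template -->
    <xsl:value-of select="for $i in 1 to 5 return $i * $i"
                  separator=", "/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

This writes 1, 4, 9, 16, 25. How the initial template is named on the command line depends on the processor (Saxon, for instance, has an -it option for this), so check your processor's documentation for the exact syntax.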
10. Wrong version
The stack trace showed that he was using Xalan/XSLTC. If you specify version="2.0" on a stylesheet and submit it to a 1.0 processor, it runs in forwards-compatibility mode. This means that the <xsl:function> element will be ignored; but if there's an XPath expression that calls my:function without declaring the prefix my, it's quite likely this will give a compile-time error.
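One common defensive idiom is to have the stylesheet check the processor version itself and stop with a readable message. The sketch below assumes the check is placed in the template for the document node; the message wording is made up.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- system-property('xsl:version') reports the version of XSLT
         that the processor implements -->
    <xsl:if test="number(system-property('xsl:version')) &lt; 2.0">
      <xsl:message terminate="yes">This stylesheet requires an XSLT 2.0 processor.</xsl:message>
    </xsl:if>
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

This only helps when the 1.0 processor gets as far as running the stylesheet. If it rejects the stylesheet at compile time, as described above, the message is never reached.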