1. Data types
The only documentation currently available for these features is the XPath 2.0 working draft at W3C. XPath 2.0 changes the type system. Whereas XPath 1.0 allows values of four types (number, string, boolean, or node-set), in XPath 2.0 the value of every expression is a sequence of items (a single item is treated as a sequence of length one). The items in a sequence may be nodes or atomic values, and the atomic values may be of any of the atomic types of XML Schema, for example boolean, string, decimal, double, date, QName, or anyURI. Node-sets in XPath 1.0 are replaced by node sequences; the difference is that a node sequence may be in any order, not necessarily document order, and may contain the same node more than once.
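A minimal sketch of the sequence model described above. It assumes an XSLT 2.0 processor and can be run against any input document; the variable name $seq is only illustrative and does not come from the original answer.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- A sequence may mix atomic values of different types -->
    <xsl:variable name="seq" select="(1, 'two', 3.0, true())"/>
    <xsl:value-of select="count($seq)"/>                 <!-- 4 -->
    <xsl:text>&#10;</xsl:text>
    <!-- Nested sequences are flattened; () contributes nothing -->
    <xsl:value-of select="count((1, (2, 3), ()))"/>      <!-- 3 -->
    <xsl:text>&#10;</xsl:text>
    <!-- A single item is also a sequence of length one -->
    <xsl:value-of select="5 instance of xs:integer+"/>   <!-- true -->
  </xsl:template>
</xsl:stylesheet>
```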
2. XSD Builtin simple datatypes
Ednote. Although this question relates directly to Saxon, the answers shed light on these datatypes in general. Saxon-SA is the schema-aware version of Saxon; Saxon-B is the non-schema-aware XSLT processor.
The XSLT 2.0 specification states that a "Basic XSLT Processor" supports only the primitive types plus xs:integer. (There's been some confusion over this, but that's the current situation.) It's my intention that Saxon-B "out-of-the-box" should conform at this conformance level. This is why I put the warnings into the current release to advise users when they are relying on facilities that are only available with the "schema-aware" conformance level.

The actual code for the derived types will remain present in the open source product (it's needed for XQuery, and once code is released as open source, all subsequent modifications have to be published). I may therefore decide to offer a switch that enables these types in Saxon-B. There's an escape clause in the conformance rules that probably permits this: "An atomic value may also belong to an implementation-defined type that has been added to the context for use with extension functions or extension instructions." But this is not really within the spirit of the rules, so the derived types will (at some stage) be disabled by default, in the interests of interoperability. This is assuming, of course, that there are no further changes to the spec.

Types such as xs:NMTOKENS and xs:IDREFS are list types, and as such they can be used only as type annotations on nodes, not as the type of an XPath value. (If a node is annotated as having type xs:NMTOKENS, then the result of atomizing the node is a value of type xs:NMTOKEN*.) Saxon-B doesn't allow type annotations on nodes, so it doesn't support these types.

The distinction between "schema types" and "sequence types" is not always well understood, even by WG members, and it is confusingly explained in the specs. There are basically two type hierarchies.

* Schema types are types as defined in XML Schema: they divide into simple types and complex types, and simple types further divide into union types, list types, and atomic types. Schema types appear in XSLT/XPath in the form of type annotations on nodes.
* Sequence types are the types of XPath values: a sequence type consists (usually) of an item type and a cardinality. The item types divide into atomic types and node types.

Atomic types are common to the two hierarchies, and the node types element(N,T) and attribute(N,T) may reference a schema type to identify the type annotation appearing on the node. So a type such as xs:IDREFS may appear in XSLT/XPath only in a construct such as attribute(*, xs:IDREFS). For a longer explanation, see the "stylesheets and schemas" chapter of my book.
3. Rules for < have changed
In the current draft XPath 2.0 specs, the rules for "<" and ">" have changed when the operands are strings (or untyped nodes). In XPath 1.0, both operands were converted to numbers and compared numerically. Note that this is different from "=", where they are compared as strings: in XPath 1.0, "1" = "1.0" is false, but "1" <= "1.0" is true. In the XPath 2.0 WD, a numeric comparison happens if one or both operands is a number, but if both are strings (or untyped nodes), you get an alphabetic comparison (using the default collating sequence). This means that "10" < "2" (both operands strings) will be true. The rules are also generalized to allow comparison of other types such as dates and times. So you need to rewrite the expression by wrapping one or both operands in the number() function. Or better, initialize $lev so that its type is numeric.
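The string-versus-number behaviour described above can be checked with a small stylesheet. This is a sketch that assumes a processor implementing XSLT 2.0 without backwards-compatibility mode (version="2.0"); it runs against any input document, and the literal values are chosen only for illustration.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Both operands are strings: compared alphabetically, '1' sorts before '2' -->
    <xsl:value-of select="'10' &lt; '2'"/>                    <!-- true -->
    <xsl:text>&#10;</xsl:text>
    <!-- Wrapping the operands in number() forces a numeric comparison -->
    <xsl:value-of select="number('10') &lt; number('2')"/>    <!-- false -->
    <xsl:text>&#10;</xsl:text>
    <!-- The same operator also works on dates -->
    <xsl:value-of select="xs:date('2003-01-31') &lt; xs:date('2003-02-01')"/>  <!-- true -->
  </xsl:template>
</xsl:stylesheet>
```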
4. Type safety in XSLT 2
Here's an example that illustrates what happens when you multiply two values under XSLT 2.0. Say you had the following unvalidated (and hence untyped) XML:

  <problem risk="3" severity="4">...</problem>

and you wanted to create a number of exclamation marks equal to the value of @risk * @severity. @risk and @severity are both always integers, but the XML is untyped so the XSLT processor doesn't know this. Since neither @risk nor @severity is typed, an XSLT 2.0 processor will assume that, since you want to multiply them together, they must be of type xs:double. When you multiply two doubles, the result is a value of type xs:double. If you try to use a double as an argument or operand to a function or operator that expects an integer (or indeed most other types), you will get a type error. For example, if you try to do:

  <xsl:value-of select="string-pad('!', @risk * @severity)" />

or:

  <xsl:for-each select="1 to @risk * @severity">!</xsl:for-each>

then you will get errors, because string-pad() expects an integer as its second argument and the 'to' operator expects integers for its operands. You have to do one of two things: use an explicit cast, or create a variable that holds the value.

Explicit casts look like:

  <xsl:value-of select="string-pad('!', xs:integer(@risk * @severity))" />

Creating the variable looks like:

  <xsl:variable name="danger">
    <xsl:value-of select="@risk * @severity" />
  </xsl:variable>
  <xsl:value-of select="string-pad('!', $danger)" />

This latter works because the $danger variable holds the document node of a tree that contains the value of @risk * @severity. The typed value of the document node is the value 12 of the type xdt:untypedAtomic. Since the type is xdt:untypedAtomic, the value is cast automatically to the required type of xs:integer when the variable is used.

On a related subject (how numeric literals and untyped values are added and serialised):
Well, if you had the literals 4.5 and 5.5 they would both be interpreted as values of the type xs:decimal. If you add two decimals together, you get another decimal as a result, so the result of the expression "4.5 + 5.5" is the decimal value 10.0. On the other hand, if you had two untyped attributes, one with the value "4.5" and the other with the value "5.5", then when you added them together they'd be converted to doubles and the result would be the double value 10E0.

If you use <xsl:value-of> to convert either a decimal or a double to a string then the way it gets serialised now depends on the value of the decimal or double. This isn't yet in the public drafts, but in Saxon's implementation the idea is that if there aren't any significant digits after the decimal point then it is serialised as an integer, so that you get "10". (Also, I think that if it's a fairly small or very large double then an appropriate exponent will be used; I couldn't find these large/small numbers in testing with Saxon 7.4 so either Mike hasn't implemented that or I'm misremembering.)

Setting the XPath 1.0 compatibility mode on using version="1.0" should mean that any XPath 1.0 expression will give the same results as it used to. The places where it doesn't are listed in Appendix F of the XPath 2.0 draft. If you find something that isn't listed there, you should let the WG know by writing to public-qt-comments@w3.org.

Using version="1.0" with XPath 2.0 expressions gives you:

- First item semantics when passing a sequence to a function that expects a single item. For example, if you pass a sequence of five nodes to a function that expects a single node then it will pick the first one, whereas under XPath 2.0 it will give an error.
- Automatic conversion to a string when passing a value to a function that expects a string. For example, if you pass substring() the current-dateTime() as the first argument then it will be converted to a string in backwards compatible mode, whereas under XPath 2.0 it will give an error.
- Automatic conversion to a double when passing a value to a function that expects a double.
- Automatic conversion to a double of operands in arithmetic expressions. For example, if you try to subtract one xs:date from another then you'll get the double NaN under the backwards compatibility rules, whereas under XPath 2.0 you will get the xdt:dayTimeDuration between the two dates.
- Automatic conversion to a double of the items in operands in general comparisons when either operand sequence contains a numeric value. For example, 1 = '1' should, I think, be true in backwards compatibility mode as it is in XPath 1.0, whereas in XPath 2.0 it will give an error because integers cannot be compared to strings.

Using version="1.0" with an XSLT 2.0 processor does not prevent you from using XSLT 2.0 and XPath 2.0 constructs such as conditional expressions and <xsl:for-each-group>. Note that this means that if you want to make sure that a stylesheet works under XSLT 1.0, you have to test it with an XSLT 1.0 processor rather than an XSLT 2.0 processor.
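Returning to the risk/severity example earlier in this section, here is a sketch of the explicit-cast approach. It assumes a non-schema-aware XSLT 2.0 processor and the unvalidated <problem risk="3" severity="4"> document shown above as input. Because string-pad() was a function in the drafts of the time, the sketch avoids it and builds the string with a 'for' expression instead; that substitution is mine, not part of the original answer.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="problem">
    <!-- Untyped attributes are converted to xs:double for arithmetic -->
    <xsl:value-of select="(@risk * @severity) instance of xs:double"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Decimal literals stay decimal when added -->
    <xsl:value-of select="(4.5 + 5.5) instance of xs:decimal"/>
    <xsl:text>&#10;</xsl:text>
    <!-- 'to' needs integers, so cast the double product explicitly -->
    <xsl:variable name="danger" as="xs:integer"
                  select="xs:integer(@risk * @severity)"/>
    <!-- One '!' per unit of danger, joined without a separator -->
    <xsl:value-of select="for $i in 1 to $danger return '!'" separator=""/>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input, both type tests should report true and the last line should contain twelve exclamation marks.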
5. Two functions, same number of parameters
This prevents you from defining two functions with the same name and number of arguments. If you have a stylesheet with the two function definitions as above, you will get an error. What you *can* do is have the function accept a very general type and then have internal tests that determine the behaviour based on the type of the argument. For example:

  <xsl:function name="ol:func">
    <xsl:param name="arg1" as="xdt:anyAtomicType" />
    <xsl:choose>
      <xsl:when test="$arg1 instance of xs:integer">
        <h1>This is an integer</h1>
      </xsl:when>
      <xsl:when test="$arg1 instance of xs:string">
        <h1>This is a string</h1>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message terminate="yes">
          ol:func() expects an integer or a string
        </xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

Of course this way you won't get any static error checking to tell you that the type of the argument you're passing to the function is wrong...
6. Default parameter type
No, the default type is "item()*", which accepts anything (any sequence of items). The default value of this param is a zero-length string, but the supplied value can be anything you like.
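A small sketch of this behaviour, assuming an XSLT 2.0 processor; the template name "describe" and the parameter name "p" are hypothetical and chosen only for the illustration. The stylesheet runs against any input document.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- No value supplied: the default (a zero-length string) is used -->
    <xsl:call-template name="describe"/>
    <!-- A sequence of three integers is accepted because the type is item()* -->
    <xsl:call-template name="describe">
      <xsl:with-param name="p" select="(1, 2, 3)"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="describe">
    <!-- No 'as' attribute, no select, no content -->
    <xsl:param name="p"/>
    <!-- Prints the item count and whether the value is a single string -->
    <xsl:value-of select="count($p), $p instance of xs:string" separator=" "/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The first call should print "1 true" and the second "3 false".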
7. On XSLT 2.0 types
Yes, or at least partly. If you have an 'as' attribute on <xsl:variable> then it indicates the static type of the variable, but the variable itself can be of a more specific type. The "static type" is the type that the XSLT processor knows about when it first goes through the stylesheet, or that an XSLT editor might be able to use: even without having an XML document to work on, the 'as' attribute tells you what type of value the variable must hold. For example, if I have:

  <xsl:variable name="foo" as="item()*">...</xsl:variable>

then the static type, indicated by the 'as' attribute, is any number of any kind of item; in other words, the variable could hold anything at all. If I have:

  <xsl:variable name="foo" as="xs:decimal" select="..." />

then the static type, indicated by the 'as' attribute, is a decimal number.

Contrast this with the "dynamic type", which is the type of the actual value of the variable when you actually do the transformation on a particular XML document. For example, if you have:

  <xsl:variable name="foo" as="item()*">
    <a /><b /><c />
  </xsl:variable>

then the static type is any number of any item but the value of the variable is a sequence of three elements. Similarly, if you have:

  <xsl:variable name="foo" as="xs:decimal" select="2" />

then the static type is an xs:decimal but the type of the variable's value is an xs:integer. As long as the dynamic type (the type of the value that the variable gets set to) is a subset of the static type (the type of the variable as declared by the 'as' attribute), you're OK.
The SequenceType syntax is specified in the XPath 2.0 spec at W3C. It's impossible to give a complete list of sequence types because user-defined data types can be imported into a stylesheet from a schema, and of course there are infinite numbers of element and attribute names. I am trying to group the possible sequence types for explanatory purposes, simply to make the list easier to understand. If you'd prefer a flat list, just look at the things in quotes. However, to summarize, a SequenceType can be:
Just because document-node() comes earlier in the list than element() doesn't mean that you can provide an element where a document-node() is expected. For a full list of the built-in atomic data types, look in the F&O spec at W3C. As I said, these atomic data types can be augmented with ones that you define yourself, within a schema, so it's not possible to generate an exhaustive list.

> Yes, you're probably right Jeni... but I'm not surprised,

There's certainly more to learn in XSLT 2.0 than there was in XSLT 1.0, which isn't surprising considering that it can do that much more. The 'as' attribute sets limits on what the content or select attribute of the <xsl:variable> must evaluate to, but as long as evaluating the content (or select attribute) is within those limits, it can vary.
Yes. The value returned from evaluating the XPath expression "2" is an xs:integer with the value 2. The 'as' attribute of the <xsl:variable> element says that the type of the variable is an xs:decimal. So the static type is xs:decimal, the dynamic type is xs:integer.
The value of the variable is an xs:integer, and it will remain so: the xs:integer isn't cast to an xs:decimal, because an xs:integer can always be used where an xs:decimal is expected, xs:integer being a subtype of xs:decimal.
Well, "matches" would be the correct terminology, I guess. I should rephrase to: "As long as the value the variable gets set to matches the static type, you're OK." Whether a value "matches" a particular SequenceType is determined according to the rules in XPath 2.0 at W3C. "Can be cast to" isn't the correct terminology because the only casting that's supported in XPath is the casting of a single atomic value to a different atomic type. So for example "a sequence of three elements" can't be "cast to" the SequenceType "element()+", but it *matches* the SequenceType "element()+".

>> The SequenceType syntax is specified in the XPath 2.0 spec at:

The term "data type" is usually used to refer to atomic data types such as xs:decimal, xs:date and so on. A "sequence type" refers to the type of a sequence, such as "one or more elements". So yes, there is a meaningful difference.
In basic XSLT processors, we might well only require support for atomic values with a subset of the data types, probably:
When declaring the type of a variable (i.e. in a SequenceType), you will also be able to refer to the more general type xdt:anyAtomicType. If you're using a full schema-aware XSLT processor, then you'll be able to use all the built-in types from XML Schema and XPath 2.0, as well as the ones that you import from a schema yourself.
No. There is never a need to cast a value from a subtype to a supertype, because a value of the subtype is *by*definition* a value of the supertype already. For example, the xs:integer 2 is an xs:decimal value. The implicit casting is mainly used when casting from an untyped value (of a node) to a particular type, for example casting the value of the 'dob' attribute to an xs:date. In this case, the type of the value of the node (xdt:untypedAtomic) is not a subtype of the type to which it's being cast (xs:date). (The other time it's used is when promoting values from xs:decimal to xs:float or xs:double, and from xs:float to xs:double.)
Ah. Basically, "element(Name, Type)" only matches elements that are called "Name" and have a type "Type". For example, "element(Start, xs:dateTime)" will match elements called <Start> with a type of xs:dateTime. (It's a bit more complicated than that, because of element substitution groups, but I won't go into that because you've said you don't care about schema-aware processing.) The type is assigned when the element is validated against a schema (which it might be when it's generated using XSLT). In basic XSLT, all elements have the type xdt:untypedAny (xs:anyType in the current specs, but I think that's going to change), so if you're using basic XSLT then you don't have to worry about this kind of test. The only element node tests you will be interested in are: - element() which match all elements, and all elements with a particular name, respectively.
If you wanted to say that the variable holds an element called "elementName" whose type is xs:integer. For example, when setting the variable with:

  <xsl:variable name="fred" as="element(elementName, xs:integer)">
    <elementName xsl:type="xs:integer">4</elementName>
  </xsl:variable>

Note that SequenceTypes are used in other places as well as in the 'as' attribute on variable-binding elements. For example, you might want to say that a function returns a sequence of <value> elements of type xs:integer:

  <xsl:function name="my:get-values" as="element(value, xs:integer)*">
    ...
  </xsl:function>

And note that this currently only applies in schema-aware XSLT processors. If you try to use a SequenceType like this in a basic XSLT processor, you will get an error.

Mike Kay also responded:

"as" on xsl:variable is an assertion about the type. For example, if you say

  <xsl:variable name="x" as="xs:integer" select="my:prime-number()"/>

then you are asserting that the function my:prime-number() will return an integer, or a value that can be (loosely speaking) treated as an integer. If it returns an xs:unsignedInt, your assertion is correct, because xs:unsignedInt is a subtype of xs:integer. If it returns an attribute node that contains an integer, or that contains an untyped value that can be read as an integer, then you're also OK. But if the function returns a string or a date, then you'll get an error. The system can report this error at compile time if it can detect it then, otherwise it will be a run-time error. With a simpler case such as
one would hope that most systems will report the error at compile time, but the rules don't require this. (This is because we haven't tried to define the concept of a "constant expression".)

> Similarly, if you have:

You don't get casting here, only the weaker kind of conversion allowed in function call and assignment contexts. This allows (a) extraction of the content of a node, (b) numeric promotion (e.g. integer to decimal, but not decimal to integer), and (c) casting of untyped values only. The static type of $foo is xs:decimal, but the dynamic type of its value is xs:integer.

Static types aren't likely to affect XSLT processors significantly; they are much more important for some XQuery processors (such as Microsoft's) which are proposing to implement "conservative" static typing, which means that the static type of an expression has to be right for the context where it is used, not just the dynamic type of its value. This might sound complicated but it's exactly what happens in many programming languages like Java: it's an error to write

  Node n = xyz.getElement();
  abc.setElement(n);

if setElement expects an Element; you have to write

  abc.setElement((Element)n);

This kind of cast is written "treat as" in XPath/XQuery (because SQL uses "cast" to mean something different).
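The difference between the declared (static) type and the dynamic type of the value can be observed directly. The following is a sketch, assuming an XSLT 2.0 processor and any input document; the variable name $bar is an addition for contrast and does not appear in the answers above.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Declared as xs:decimal, but the value 2 is (and stays) an xs:integer -->
    <xsl:variable name="foo" as="xs:decimal" select="2"/>
    <xsl:value-of select="$foo instance of xs:integer"/>   <!-- true -->
    <xsl:text>&#10;</xsl:text>

    <!-- xs:integer is not a subtype of xs:double, so numeric promotion converts it -->
    <xsl:variable name="bar" as="xs:double" select="2"/>
    <xsl:value-of select="$bar instance of xs:integer"/>   <!-- false -->
    <xsl:text>&#10;</xsl:text>

    <!-- 'treat as' asserts the dynamic type without converting the value -->
    <xsl:value-of select="($foo treat as xs:integer) + 1"/> <!-- 3 -->
  </xsl:template>
</xsl:stylesheet>
```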
8. Data typing can be useful
In XSLT, as in other programming languages, typing can be very useful. I personally have already benefited from XSLT 2.0 typing. In particular, it allows me to eliminate code that was necessary in XSLT 1.0, e.g. to check whether a node-set passed as a parameter is empty and then issue an error message. Instead of:

  <xsl:template name="foldl1">
    <xsl:param name="pFunc" select="/.."/>
    <xsl:param name="pList" select="/.."/>
    <xsl:choose>
      <xsl:when test="not($pList)">
        <xsl:message terminate="yes">Some strong words!!!</xsl:message>
      </xsl:when>
      <xsl:otherwise>useful code here</xsl:otherwise>
    </xsl:choose>
  </xsl:template>

I now can simply write:

  <xsl:function name="f:foldl1">
    <xsl:param name="pFunc" as="element()"/>
    <xsl:param name="pList" as="item()+"/>
    <!-- useful code here -->
  </xsl:function>

This is a significant reduction in the complexity of the code and in the total number of lines. As a result the code is simpler and easier to write and understand, and programmer productivity is increased.
9. data() type
In the XPath data model true() is both a boolean and a sequence containing a single boolean. There is no distinction between an item and a sequence of length one containing that item. This reflects the way list-valued attributes work in XML Schema (and in DTDs): you wouldn't expect the attribute value "red" to behave differently when you change the type from NMTOKEN to NMTOKENS. data(.) forces atomization (i.e. extracting the value of a node). If X is an element then it cannot be a boolean, but its content can be a boolean. Many operators such as "+" and "=" force atomization of their operands, but some, like count() and "instance of", do not. For example with an NMTOKENS attribute a="red green blue", count(@a) is 1 but count(data(@a)) is 3.
There is a thing called the "context item" which is either a single item (=a sequence of one item) or is undefined (loosely, null).
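A sketch of atomization in a non-schema-aware XSLT 2.0 processor. It assumes a hypothetical input document whose root element carries an attribute a="red green blue", for example <paint a="red green blue"/>. Without schema validation the attribute is untyped, so data(@a) is a single untyped value rather than three NMTOKENs; tokenize() is used here only to approximate what a schema-aware processor would deliver for an NMTOKENS attribute.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/*">
    <!-- @a is a node, not an atomic value -->
    <xsl:value-of select="@a instance of attribute()"/>
    <xsl:text>&#10;</xsl:text>
    <!-- data() atomizes the node; with no schema the result is untyped -->
    <xsl:value-of select="data(@a) instance of xs:untypedAtomic"/>
    <xsl:text>&#10;</xsl:text>
    <!-- count() does not atomize: one attribute node -->
    <xsl:value-of select="count(@a)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Splitting on whitespace mimics the list of three tokens -->
    <xsl:value-of select="count(tokenize(@a, '\s+'))"/>
  </xsl:template>
</xsl:stylesheet>
```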
10. Understanding the Relationship of Nodes, Sequences, and Trees
I have been having some excellent exchanges with Michael Kay and have learned a lot. I thought that I would summarize what I learned, so that others can benefit as well.
To understand these rules, let's consider an example. Below is the XML document that my stylesheet operates upon:

  <?xml version="1.0"?>
  <FitnessCenter>
    <Member> <Name>Jeff</Name> </Member>
    <Member> <Name>David</Name> </Member>
    <Member> <Name>Roger</Name> </Member>
  </FitnessCenter>

In my stylesheet I have created this variable:

  <xsl:variable name="members" as="element()+">
    <xsl:sequence select="/FitnessCenter/Member[2]"/>
    <Member> <Name>Sally</Name> </Member>
    <Member> <Name>Linda</Name> </Member>
  </xsl:variable>

Note that this variable contains a mix of elements: the first element (the David Member) comes from the FitnessCenter. The second and third elements (Sally and Linda) are defined within the variable itself. Further, note that this sequence does not have a parent node (due to the presence of as="element()+"). A characteristic of xsl:sequence when used in a sequence that does not have a parent node is that it does not create a copy of the node that it references; instead, it uses the original node. Thus, $members[1] is referencing the original node:

  <Member> <Name>David</Name> </Member>

from the FitnessCenter. Now let's consider the above rules in the context of this example.

(1) A node can belong to only one tree.

$members[1] references the David Member. This node belongs to the FitnessCenter tree. $members[2] references this node:

  <Member> <Name>Sally</Name> </Member>

This node belongs to the $members sequence. ($members does not create a tree. It is only creating a sequence of nodes.)

(2) A node may belong to any number of sequences.

The David Member belongs both to the sequence selected by /FitnessCenter/Member and to the $members sequence.

(3) Axes always apply to the tree that the node is in. Axes never apply to the sequence that the node is in.

Consider this XSLT statement which uses the preceding-sibling axis:

  <xsl:copy-of select="$members[1]/preceding-sibling::*[1]"/>

$members[1] references the David node, which is in the FitnessCenter tree. Therefore, it is referencing David's preceding-sibling in the FitnessCenter tree:

  <Member> <Name>Jeff</Name> </Member>

Likewise, this is referencing David's following-sibling in the FitnessCenter tree:

  <xsl:copy-of select="$members[1]/following-sibling::*[1]"/>

Output:

  <Member> <Name>Roger</Name> </Member>

Note that preceding-sibling and following-sibling select nothing when applied to $members[2] or $members[3], because these axes only apply to nodes that share a parent in a tree. The Sally Member and Linda Member have no parent; they are only in a sequence.

(4) When xsl:sequence is used in a sequence which has no parent node then the sequence contains the original node referenced by xsl:sequence and not a copy.

Consider again the variable declaration at the start of this example, with as="element()+". This variable is comprised of a sequence of nodes. The sequence does not have a parent node. Therefore, the sequence is comprised of the original node.

(5) When xsl:sequence is used in a sequence which has a parent node then the element that is referenced by xsl:sequence is copied. Thus, the sequence is comprised of a copy and not the original.

Consider this variable declaration:

  <xsl:variable name="members">
    <xsl:sequence select="/FitnessCenter/Member[2]"/>
    <Member> <Name>Sally</Name> </Member>
    <Member> <Name>Linda</Name> </Member>
  </xsl:variable>

Note the absence of as="element()+". Thus, this Member sequence has a document node as its parent, and $members is that document node. Consequently, a *copy* of /FitnessCenter/Member[2] is made and used in the sequence. So this statement produces an empty output, because the copy of David is the first child of the document node and has no preceding sibling:

  <xsl:copy-of select="$members/Member[1]/preceding-sibling::*[1]"/>

(6) The preceding-sibling and following-sibling axes can only be used in a tree. That is, they cannot be used in a sequence that does not have a parent node. (Nodes are "siblings" iff they have a common parent.)

Therefore, for example, with the as="element()+" declaration from the start of this example you cannot use following-sibling to get the member that follows Sally. This statement produces an empty output:

  <xsl:copy-of select="$members[2]/following-sibling::*[1]"/>

However, in the version without the 'as' attribute shown under rule (5), the sequence does have a parent (document) node, so you can use following-sibling to retrieve the member that follows Sally. Because $members is then the document node, the members are reached as its children. This statement:

  <xsl:copy-of select="$members/Member[2]/following-sibling::*[1]"/>

yields this output:

  <Member> <Name>Linda</Name> </Member>

An alternate form that will also work is this:

  <xsl:variable name="members" as="element()+">
    <Members>
      <xsl:sequence select="/FitnessCenter/Member[2]"/>
      <Member> <Name>Sally</Name> </Member>
      <Member> <Name>Linda</Name> </Member>
    </Members>
  </xsl:variable>

Note that the member sequence has a parent node (<Members>). Therefore, the following-sibling and preceding-sibling axes can be used on the member sequence. For example, this statement:

  <xsl:copy-of select="$members/Member[2]/following-sibling::*[1]"/>

yields this output:

  <Member> <Name>Linda</Name> </Member>
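Whether xsl:sequence delivered the original node or a copy can be tested with the XPath 2.0 "is" operator, which compares node identity. This is a sketch, assuming an XSLT 2.0 processor and the FitnessCenter document above as input; the variable names $byReference and $byCopy are hypothetical.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- With 'as', no parent is built: the original David node is held -->
    <xsl:variable name="byReference" as="element()+">
      <xsl:sequence select="/FitnessCenter/Member[2]"/>
    </xsl:variable>

    <!-- Without 'as', a document node is built and David is copied under it -->
    <xsl:variable name="byCopy">
      <xsl:sequence select="/FitnessCenter/Member[2]"/>
    </xsl:variable>

    <!-- Same node identity as the source node -->
    <xsl:value-of select="$byReference[1] is /FitnessCenter/Member[2]"/>    <!-- true -->
    <xsl:text>&#10;</xsl:text>
    <!-- The copy is a different node, even though its content is equal -->
    <xsl:value-of select="$byCopy/Member[1] is /FitnessCenter/Member[2]"/>  <!-- false -->
  </xsl:template>
</xsl:stylesheet>
```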
11. Typed match
Yes:
element(*, my:postal-address-type) means that it must be an element that has been validated by a schema processor as conforming to the global schema-defined type my:postal-address-type, which must be defined in a schema that has been imported using <xsl:import-schema>. This can be either a simple type or a complex type. Some other variations:

declares a variable that will always contain an element named X

  match="element(billing-address, my:postal-address-type)"

matches an element whose name is "billing-address" that has been validated against a particular schema type

  match="schema-element(my:billing-address)"

matches an element that's been validated against a global element declaration called "my:billing-address" in an imported schema; the element needn't actually have this name, it could be a member of the substitution group.
12. getting node type in xsl
An alternative that would work is to pre-process your schema using XSLT to produce an XSLT file that "knows" the element types. If your schema is highly complex this might be difficult, but for many schema documents it is relatively easy to write an XSLT file that inputs a schema and (say) collects all the element names that have type xsd:integer and outputs

  <xsl:template mode="type" match="elem1|elem2|....|last-element">
    <xsl:text>integer</xsl:text>
  </xsl:template>

Then, having produced this XSL file, you can import it into your main XSLT file, and whenever you need to know the type of an element just apply this type mode to get the type name of the current element.

  <xsl:variable name="type">
    <xsl:apply-templates mode="type" select="."/>
  </xsl:variable>
  <xsl:if test="$type='integer'"> do something about integers...
13. Test For Numeric Values?
> I want to test a node value to see whether it is numeric (". castable as xs:decimal", or xs:integer, xs:double etc. if preferred)
There is such a construct: e.g. ($x instance of xs:decimal). But in this case, I interpreted the requirement as being not to test whether the attribute had a numeric type (i.e. was defined as a number in the schema), but whether its value had the lexical form of a number. The XPath 2.0 construct for this is ($x castable as xs:decimal).
14. testing for string and number
It's retained for backwards compatibility with XSLT 1.0; the "native" way of doing this in 2.0 would be <xsl:sort select="xs:double(.)"/> if they are doubles, or more likely <xsl:sort select="xs:integer(.)"/> if they are integers.
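A sketch of the "native" 2.0 sort, assuming an XSLT 2.0 processor and a hypothetical untyped input document such as <list><item>10</item><item>9</item><item>100</item></list> (the element names are made up for the illustration).

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/list">
    <xsl:for-each select="item">
      <!-- Casting the sort key makes the comparison numeric rather than textual -->
      <xsl:sort select="xs:integer(.)"/>
      <xsl:value-of select="."/>
      <!-- Comma between items, but not after the last one -->
      <xsl:if test="position() != last()">, </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input this should print "9, 10, 100"; without the cast, the untyped values would be sorted as strings, which puts "10" and "100" before "9".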
15.
I don't know what your expectations are but these results are correct according to the spec. If you don't validate the input document against a schema, then the "typed value" of its nodes is untypedAtomic. If you test ($y instance of xdt:untypedAtomic) you will get the answer true. untypedAtomic behaves essentially like XSLT 1.0 - if you use the value where a string is expected, it's treated as a string, if you use it where an integer is expected, it's converted to an integer.
You need to distinguish "instance of" and "castable as". The "instance of" operator is useful if you write a function that can accept arguments of several different types and you want to test which type you have been given (just like "instanceof" in Java). The "castable as" operator is useful when you are given untyped data and you want to see whether its lexical form makes it suitable for casting to a particular type such as xs:integer or xs:date, which is where this thread started.
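The two operators can be compared side by side. This is a sketch, assuming an XSLT 2.0 processor and any input document; the literal values are chosen only for illustration.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- 'instance of' looks at the type the value already has -->
    <xsl:value-of select="42 instance of xs:integer"/>            <!-- true -->
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="'42' instance of xs:integer"/>          <!-- false: it is a string -->
    <xsl:text>&#10;</xsl:text>
    <!-- 'castable as' looks at whether a cast would succeed -->
    <xsl:value-of select="'42' castable as xs:integer"/>          <!-- true -->
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="'abc' castable as xs:integer"/>         <!-- false -->
    <xsl:text>&#10;</xsl:text>
    <!-- Lexically date-shaped, but not a valid calendar date -->
    <xsl:value-of select="'2003-02-30' castable as xs:date"/>     <!-- false -->
  </xsl:template>
</xsl:stylesheet>
```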
16. Check for integer
My stylesheet uses this statement:

  <xsl:value-of select="data(flt:Aircraft/flt:Altitude) instance of xsd:integer"/>

The output I get is "false". (The output I seek is "true", as the Altitude element does have an integer value.) Can someone tell me the correct way to do this?

Answer. If there's no schema, then the Altitude element is untyped, so applying data() to it gives an instance of xs:untypedAtomic, not an integer. The expression you want is

  flt:Aircraft/flt:Altitude castable as xsd:integer

which tests not whether the value is an integer, but whether conversion to an integer would succeed. You could also test this with a regular expression:

  matches(flt:Aircraft/flt:Altitude, '[0-9]+')

Abel points out:

Be aware that this does not check for an xs:integer; it only checks for the existence of one or more digits inside an item. It also matches '123ABC', 'ABC123', 'ABC1ZYX' etc. To match only digits, you must supply it with start/end anchors, like so:

  matches(flt:Aircraft/flt:Altitude, '^\s*[+-]?\d+\s*$')

Furthermore, it does not do the same as 'castable as'. A string like '1E10' is a valid xs:double, and that double is castable as xs:integer. To make matters worse, the xs:string containing '1E10' cannot be cast to xs:integer directly (meaning 'castable as' would return false); it must first be converted to xs:double. Since you can only use matches() on strings, behaviour like this cannot be mimicked with it.

What about numeric values expressed in exponential notation? These are (with a side step to xs:double) easily castable as integer. Of course, that should only apply to values that have no decimals (not sure of the requirements). If you need it, you can expand your expression like so:

  matches(flt:Aircraft/flt:Altitude, '^\s*[+-]?\d+([eE]\+?\d+)?\s*$')

Abel Braaksma
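The difference between the unanchored and the anchored pattern, and the '1E10' case, can be tried out as follows. This is a sketch, assuming an XSLT 2.0 processor and any input document; the variable name $intPattern and the test strings are hypothetical.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Anchored pattern: optional sign, digits only, surrounding whitespace allowed -->
    <xsl:variable name="intPattern" select="'^\s*[+-]?\d+\s*$'"/>

    <!-- Iterate over test strings; '.' is the current string -->
    <xsl:for-each select="('123', ' -42 ', '123ABC', '1E10')">
      <xsl:value-of select="concat('[', ., '] unanchored=', matches(., '[0-9]+'),
                                   ' anchored=', matches(., $intPattern))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>

    <!-- The string '1E10' is not a valid integer lexical form... -->
    <xsl:value-of select="'1E10' castable as xs:integer"/>
    <xsl:text>&#10;</xsl:text>
    <!-- ...but the double it denotes can be cast to an integer -->
    <xsl:value-of select="xs:double('1E10') castable as xs:integer"/>
  </xsl:template>
</xsl:stylesheet>
```

The unanchored pattern should accept all four strings, while the anchored one should reject '123ABC' and '1E10'; the last two lines should print false and then true.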
17. | Co-constraints | ||||||
Schematron + XPath 2.0 is extremely powerful. In fact, one could argue that it can do everything that XML Schema (or RelaxNG) can do, plus a lot more. For example, below is an XML document showing information about an aircraft and vertical obstructions on its flight path. One critical operational constraint is: "Check that the aircraft's altitude is at least 500 feet above all the vertical obstructions". This "co-constraint" cannot be expressed using XML Schemas (or RelaxNG). But with Schematron + XPath 2.0 the co-constraint can be expressed using this XPath: every $j in flt:VerticalObstruction satisfies if ($j/flt:Height) then number(flt:Aircraft/flt:Altitude) gt number($j/flt:Height + $j/flt:Elevation + 500) else number(flt:Aircraft/flt:Altitude) gt number($j/flt:Elevation + 500) As best I can tell, the functionality of Schematron + XPath 2.0 is a superset of XML Schemas (and RelaxNG). However, I am still researching this. The findings in this discussion will be incorporated into a paper I am writing. I appreciate all your input.
/Roger
<?xml version="1.0"?>
<Flight xmlns="http://www.aviation.org">
  <Aircraft type="Boeing 747">
    <Altitude units="feet" reference="MSL">3300</Altitude>
    <Location>
      <Latitude>42.371</Latitude>
      <Longitude>-71.000</Longitude>
    </Location>
  </Aircraft>
  <VerticalObstruction type="tower">
    <!-- The top of the tower is 1500 feet -->
    <Elevation units="feet">1000</Elevation>
    <Height units="feet">500</Height>
    <Location>
      <Latitude>42.371</Latitude>
      <Longitude>-71.025</Longitude>
    </Location>
  </VerticalObstruction>
  <VerticalObstruction type="mountain">
    <Elevation units="feet">2600</Elevation>
    <Location>
      <Latitude>42.371</Latitude>
      <Longitude>-71.155</Longitude>
    </Location>
  </VerticalObstruction>
  <VerticalObstruction type="building">
    <!-- The top of the building is 700 feet -->
    <Elevation units="feet">500</Elevation>
    <Height units="feet">200</Height>
    <Location>
      <Latitude>42.371</Latitude>
      <Longitude>-71.299</Longitude>
    </Location>
  </VerticalObstruction>
</Flight> | |||||||
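To try the co-constraint outside Schematron, the same XPath can be dropped into a small XSLT 2.0 stylesheet. This is only a sketch: the output messages are invented for illustration, and the `flt` prefix is assumed to be bound to the namespace of the sample document. The expression is evaluated with the Flight element as context item, which is what its relative paths require.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:flt="http://www.aviation.org">

  <xsl:output method="text"/>

  <!-- The context item is Flight, so flt:Aircraft and
       flt:VerticalObstruction resolve as its children -->
  <xsl:template match="/flt:Flight">
    <xsl:choose>
      <xsl:when test="every $j in flt:VerticalObstruction satisfies
                        (if ($j/flt:Height)
                         then number(flt:Aircraft/flt:Altitude)
                              gt number($j/flt:Height + $j/flt:Elevation + 500)
                         else number(flt:Aircraft/flt:Altitude)
                              gt number($j/flt:Elevation + 500))">
        <!-- Hypothetical message text -->
        <xsl:text>Altitude constraint satisfied</xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <xsl:text>Altitude constraint violated</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

For the sample document the three thresholds work out to 2000, 3100 and 1200 feet, all below the altitude of 3300. Note that the expression as posted uses `gt`, so an aircraft exactly 500 feet above an obstruction would be reported as a violation; `ge` would match the wording "at least 500 feet".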
18. | Convert to a number | ||||||
You mean, I think, xs:integer(@number-att), which is indeed possible. It will fail with an error if @number-att contains any [^0-9.+-] (with some exceptions). However, there are several ways to prevent this (unrecoverable) error from being raised: (: number() itself never fails, but it returns NaN for non-numeric input, and casting NaN to xs:integer still raises an error :) xs:integer(number(@number-att)) (: more cleanly, gives you more control :) if (@number-att castable as xs:integer) then xs:integer(@number-att) else 0 | |||||||
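If the guarded cast is needed in more than one place, it can be wrapped in a stylesheet function. The sketch below is one possible way to do that; the function name `f:to-integer`, its namespace URI and the `default` parameter are hypothetical and not taken from the thread.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.com/local-functions"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: returns the value as an xs:integer,
       or $default when the cast would fail (including when the
       value is the empty sequence) -->
  <xsl:function name="f:to-integer" as="xs:integer">
    <xsl:param name="value" as="item()?"/>
    <xsl:param name="default" as="xs:integer"/>
    <xsl:sequence select="if ($value castable as xs:integer)
                          then xs:integer($value)
                          else $default"/>
  </xsl:function>

  <!-- Usage: fall back to 0 for anything that is not an integer -->
  <xsl:template match="*[@number-att]">
    <xsl:value-of select="f:to-integer(@number-att, 0)"/>
  </xsl:template>

</xsl:stylesheet>
```

Testing with "castable as" before casting keeps the error from ever being raised, which matters because XSLT 2.0 offers no way to catch it afterwards.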
19. | Types and variables | ||||||
If you use an as attribute, the variable is bound to the sequence constructed, so in your case $test is an element node with name one (this is an element node with no parent, something that cannot exist in XSLT 1.0). So $test is the element one, and $test/two selects its child element named two.
If you do not use an as attribute and use content rather than a select attribute, xsl:variable works as in XSLT 1.0: it always generates a single document node / and any generated content is made a child of that node (by copying). So in the second case $test is /, $test/one is its child and $test/one/two is its child in turn.
> Also could you advise what type I should be using for this kind of > task?
It doesn't make much difference in your case with a single constructed element (except that it changes the way you access it, as you found), but consider <xsl:variable name="test" as="element()*"> <a/> <b/> </xsl:variable> That's a sequence of two parentless elements; having no parent, they are not siblings, so $test/self::a/following-sibling::b is empty. <xsl:variable name="test"> <a/> <b/> </xsl:variable> is a / node with a and b children, so $test/a/following-sibling::b is the b node.
So if you think you might want to wander around via axis paths, parentless nodes can be confusing. But there is sometimes a big, big win in using as="element()*". If you have <xsl:variable name="test" as="element()*"> <xsl:sequence select="foo/bar"/> </xsl:variable> then it's like <xsl:variable name="test" select="foo/bar"/>: it selects all the foo/bar elements, but it selects the existing nodes, so a path that climbs out of them, such as $test[1]/../../x/y, may well work and select something in the original tree. <xsl:variable name="test"> <xsl:sequence select="foo/bar"/> </xsl:variable> on the other hand generates a new / node and creates children of this node by _copying_ the nodes, so now $test/bar[1]/../../x/y will definitely be empty, as going up from the bar elements only gets you to the / at the top of this new tree.
Obviously you don't want to copy whole document trees when you don't need to, but often the system won't really copy them anyway (if I understand MK correctly). Still, using as= makes you less reliant on the optimiser spotting that it can reuse nodes without actually copying them. | |||||||
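The a/b example above can be run as a small self-contained XSLT 2.0 stylesheet. This is a sketch for experimenting, assuming the processor is started at the named template `main`; the variable names `orphans` and `tree` are chosen here for illustration.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- With "as": a sequence of two parentless elements -->
  <xsl:variable name="orphans" as="element()*">
    <a/>
    <b/>
  </xsl:variable>

  <!-- Without "as": a document node with a and b as children -->
  <xsl:variable name="tree">
    <a/>
    <b/>
  </xsl:variable>

  <!-- Assumed entry point: run with "main" as the initial template -->
  <xsl:template name="main">
    <xsl:value-of select="concat(
        'orphans: ', count($orphans/self::a/following-sibling::b), '&#10;',
        'tree: ', count($tree/a/following-sibling::b), '&#10;')"/>
  </xsl:template>

</xsl:stylesheet>
```

The first count should be 0, because the parentless elements are not siblings, and the second should be 1, because the document node gives a and b a common parent.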
20. | Variables, siblings or orphans | ||||||
<xsl:variable name="v" as="element()+"> <a/> <a/> <a/> </xsl:variable> The elements in the resulting sequence are not siblings. They are parentless, whereas siblings always share a parent. To make them siblings you need to add a document node, which you can do simply by leaving out the "as" attribute: <xsl:variable name="v"> <a/> <a/> <a/> </xsl:variable> |