MK, Andrew Welch and DC
Andrew Welch asked:
Consider the following code:
<xsl:variable name="foo" select="nothing" as="xs:string?"/>
<xsl:choose>
<xsl:when test="$foo != ''">A</xsl:when>
<xsl:when test="$foo = ''">B</xsl:when>
<xsl:when test="not($foo != '')">C</xsl:when>
</xsl:choose>
When there isn't a <nothing> element, the output is C. That is:
$foo != '' is false
and
$foo = '' also is false
Which is strange. If I do "$foo is empty" then Saxon tells me $foo is a string and not a nodeset. After adding the explicit cast, the test passes:
string($foo) = ''
Which suggests that $foo isn't a string (so which is it?). It is almost as if the empty nodeset doesn't get implicitly cast like a 'populated' nodeset, and the "as" attribute is ignored. Is there a difference between the way the two are handled?
Also, is using "!= ''" a bad way of checking if the variable has content when the variable type is 'xs:string?' (i.e. optional)?
dc made the point:
But I think it's true to say that as="xs:string" does _not_ force an empty sequence to coerce to an empty string, isn't it?
(ednote: Making the point that a cast does not occur)
mk replied:
The variable was defined as:
<xsl:variable name="foo" select="nothing" as="xs:string?"/>
The select expression yields a node-sequence, the @as expression requires atomic values, so XSLT invokes atomization. The result of atomizing an empty node-sequence is an empty sequence of strings.
The only conversions forced by the "as" attribute are atomization and numeric promotion (e.g. int to double). It doesn't cause a cast. If the "as" attribute had said "xs:string" rather than "xs:string?", a type error would be reported.
(ednote: From the XSLT 2.0 WD. [ERR XT0570] It is a type error if the supplied value of a variable cannot be converted to the required type.)
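The declaration rules described above can be sketched as follows. This is a minimal illustration, not code from the thread: the variable names $optional and $mandatory are made up, and the input is assumed to be any document whose root element has no <nothing> child.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Assumption: the root element has no <nothing> child. -->
  <xsl:template match="/*">
    <!-- Atomizing the empty node-sequence gives the empty sequence,
         which the type xs:string? accepts. No cast takes place. -->
    <xsl:variable name="optional" select="nothing" as="xs:string?"/>

    <!-- string() is called explicitly here, so the empty sequence
         becomes "" before the type check against xs:string. -->
    <xsl:variable name="mandatory" select="string(nothing)" as="xs:string"/>

    <!-- Writing select="nothing" with as="xs:string" instead would be
         reported as a type error when <nothing> is absent. -->

    <!-- Number of items held by each variable. -->
    <xsl:value-of select="count($optional), count($mandatory)"/>
  </xsl:template>
</xsl:stylesheet>
```

With no <nothing> element in the input, $optional holds no items and $mandatory holds one zero-length string, so the two counts differ.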
I asked just what was the meaning of the '?'.
dc replied:
It should be read the way ? is read in regex or DTD syntax, as 0-or-1. You could also use + or * there, again with their regex or DTD meanings of 1-or-more or 0-or-more.
A type of xs:string requires a value that is a string.
A type of xs:string? requires a value that is a sequence of 0 or 1 strings. (as always though, there is no difference between a single string and a sequence of length 1 that contains a string)
(ednote: This (for me) is subtle. The emphasis is on *zero* or one strings. Hence an empty sequence is a valid value.)
mk expressed this differently. The ? is part of the type. It means that the value (after atomization) must either be a string, or nothing (an empty sequence).
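One way to see the occurrence indicators at work is to test a few literal sequences with "instance of". The stylesheet below is only a sketch written for this page; it ignores its input document.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Prints one boolean per line. Expected, in order:
       false  (no indicator: exactly one string is required)
       true   (? allows zero strings)
       true   (? allows one string)
       false  (? does not allow two strings)
       true   (+ allows two strings)
       false  (+ does not allow zero strings)
       true   (* allows zero strings) -->
  <xsl:template match="/">
    <xsl:value-of separator="&#10;" select="
        () instance of xs:string,
        () instance of xs:string?,
        'a' instance of xs:string?,
        ('a', 'b') instance of xs:string?,
        ('a', 'b') instance of xs:string+,
        () instance of xs:string+,
        () instance of xs:string*"/>
  </xsl:template>
</xsl:stylesheet>
```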
Finally, dc answered Andrew's question:
q. So, what is the difference between the atomization process when the node <abc/> is present but empty, and when it's not there?
a. In one case you get an empty string "" (for which $abc = '' is true)
In the other you get an empty sequence () (for which the test $abc='' is false, as no item in the sequence is equal to "")
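The two cases can be put side by side in a small stylesheet. This is a sketch under stated assumptions: the input is taken to be <doc><abc/></doc> processed without a schema, and the variable names $present and $absent are invented for the illustration.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Assumed input: <doc><abc/></doc>, no schema. -->
  <xsl:template match="/doc">
    <!-- <abc/> exists but is empty: the variable holds a zero-length string. -->
    <xsl:variable name="present" select="abc" as="xs:string?"/>
    <!-- <nothing> does not exist: the variable holds the empty sequence. -->
    <xsl:variable name="absent" select="nothing" as="xs:string?"/>

    <!-- true: the single item "" equals "" -->
    <xsl:text>$present = ''        : </xsl:text>
    <xsl:value-of select="$present = ''"/>
    <xsl:text>&#10;</xsl:text>

    <!-- false: there is no item to compare with "" -->
    <xsl:text>$absent = ''         : </xsl:text>
    <xsl:value-of select="$absent = ''"/>
    <xsl:text>&#10;</xsl:text>

    <!-- false, for the same reason -->
    <xsl:text>$absent != ''        : </xsl:text>
    <xsl:value-of select="$absent != ''"/>
    <xsl:text>&#10;</xsl:text>

    <!-- true: string() maps the empty sequence to "" -->
    <xsl:text>string($absent) = '' : </xsl:text>
    <xsl:value-of select="string($absent) = ''"/>
    <xsl:text>&#10;</xsl:text>

    <!-- true: tests for the empty sequence directly -->
    <xsl:text>empty($absent)       : </xsl:text>
    <xsl:value-of select="empty($absent)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For an optional string, string($foo) = '' treats "absent" and "empty" alike, while empty($foo) picks out the absent case only.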
And mk came back with:
A world of difference. The typed value (i.e. the atomized value) of an empty element <abc/> actually depends on how it's described in the schema. If there's no schema, the typed value is a zero-length untypedAtomic, which compares equal to the string "". If there's a schema that describes <abc/> as having a simple type of string, then the typed value is a single zero-length string. However, if the schema says that the type is xs:NMTOKENS, then the typed value is an empty sequence. The empty sequence contains no value that's equal to "", so abc="" returns false.
If abc is defined in the schema as a complex type that doesn't allow mixed content, then atomizing <abc/> is an error.
These rules might seem arbitrary but they reflect the fact that the meaning of an empty element actually depends on what might have been there if it weren't empty.
I found that last sentence almost philosophical.
Thank you gentlemen for an enlightening thread. If I can get past the DC math filter without reprimand,
sorry not today...
I think his point that the {} empty set is a member of any set was the Ah-ha moment for me. That's what allows the as="xs:string?" to succeed rather than report the error.
The empty set isn't a _member_ of every set.
{} isn't a member of {1,2,3} for example. It is a member of {{}} though (the set with one member, the empty set). It is a _subset_ of every set.
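For readers who prefer notation, the member/subset distinction above can be written out as below. This only restates the three claims just made, with S standing for an arbitrary set.

```latex
\[
\emptyset \subseteq S \ \text{for every set } S, \qquad
\emptyset \notin \{1, 2, 3\}, \qquad
\emptyset \in \{\emptyset\}.
\]
```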