Jeni Tennison
Given:
<description>Here's a Google search for a famous phrase, &lt;a
href="http://www.google.com/search?q=to+be+or+not+to+be"&gt;to be or not to
be&lt;/a&gt;; give it a try and see what happens.
... </description>
1. It's UTF-8 encoded.
2. The output is to be HTML.
Jeni responded:
As usual, the easiest recourse here is to use DOE, that is,
disable-output-escaping (this is what it's designed for):
<xsl:template match="description">
  <xsl:value-of select="." disable-output-escaping="yes" />
</xsl:template>
> Is this stopping a 'double' escaping of the entity?
The text within the <description> element is just text. If you have:
<description>blah &lt;bold>blah&lt;/bold> blah</description>
then the value of the <description> element is the string:
blah <bold>blah</bold> blah
Note that this is a *string*, not a tree. The XSLT processor can't
recognise escaped markup in text automatically. The '<bold>' is just
the characters '<', 'b', 'o', 'l', 'd', '>' as far as the XSLT
processor is concerned, not a tag.
When an XSLT processor serialises a string as XML (or HTML) then it
escapes any significant characters (e.g. < and &) using the normal
escapes. So if you did:
<xsl:value-of select="description" />
and you were generating XML or HTML then the XSLT processor would
escape the < characters that it sees in the string value of the
<description> element, and the output that you'd see would be:
blah &lt;bold>blah&lt;/bold> blah
What disable-output-escaping does is to disable this output escaping
-- it stops the processor from escaping the characters that are
significant in XML. So you'd get:
blah <bold>blah</bold> blah
in the output. A browser will then recognise the <bold> as a tag and
behave accordingly.
Some processors have a parsing extension function (e.g. check out
saxon:parse()) that would allow you to interpret the text inside the
<description> element as XML, which would then allow you to do:
<xsl:copy-of select="saxon:parse(description)" />
but note that the content has to be well-formed to do that.
In XSLT 2.0, you should use character maps rather than DOE.
>> Or complain to the feed until they give you XML rather than this
>> hybrid.
>
> It is XML... strictly speaking? But agreed it's a mess.
What I meant was that the content of the <description> element is just
text, not XML -- it doesn't have any structure that's recognisable to
an XML parser, but you want it to have.
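For the XSLT 2.0 route mentioned above, character maps rather than DOE, a minimal sketch might look like the following. The map name "raw-markup" and the three private-use characters (U+E000 to U+E002) are placeholders chosen for illustration, not anything from the original thread, and the sketch assumes those characters never occur in the feed itself. The idea is to translate the markup-significant characters in the string value of <description> into placeholders, and let the character map write them back unescaped at serialisation time.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical map name. Each private-use placeholder is written
       out as the literal character given in @string, without escaping. -->
  <xsl:character-map name="raw-markup">
    <xsl:output-character character="&#xE000;" string="&lt;"/>
    <xsl:output-character character="&#xE001;" string="&gt;"/>
    <xsl:output-character character="&#xE002;" string="&amp;"/>
  </xsl:character-map>

  <!-- Apply the map when serialising the result as HTML. -->
  <xsl:output method="html" encoding="UTF-8"
      use-character-maps="raw-markup"/>

  <!-- Swap the significant characters in the text of description for
       the placeholders; everything else is serialised as usual. -->
  <xsl:template match="description">
    <xsl:value-of
        select="translate(., '&lt;&gt;&amp;', '&#xE000;&#xE001;&#xE002;')"/>
  </xsl:template>

</xsl:stylesheet>
```

As with DOE, this only affects serialisation: the result tree still holds a plain string, so the markup becomes visible as tags only once the output is written and read by a browser. Unlike DOE, character maps are part of the serialisation specification, so a conforming XSLT 2.0 processor should honour them whenever it does the serialising itself.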