Jeni Tennison

Given:
<description>Here's a Google search for a famous phrase, &lt;a
href="http://www.google.com/search?q=to+be+or+not+to+be"&gt;to be or not to
be&lt;/a&gt;; give it a try and see what happens.
... </description>
1. It's UTF-8 encoded.
2. The output is to be HTML.
Jeni responded 
As usual, the easiest recourse here is to use DOE
(disable-output-escaping; this is what it's designed for):

  <xsl:template match="description">
    <xsl:value-of select="." disable-output-escaping="yes" />
  </xsl:template>
 > Is this stopping a 'double' escaping of the entity? 
 The text within the <description> element is just text. If you have: 
  <description>blah &lt;bold&gt;blah&lt;/bold&gt; blah</description>
 then the value of the <description> element is the string: 
  blah <bold>blah</bold> blah
 Note that this is a *string*, not a tree. The XSLT processor can't
recognise escaped markup in text automatically. The '<bold>' is just
the characters '<', 'b', 'o', 'l', 'd', '>' as far as the XSLT
processor is concerned, not a tag. 
When an XSLT processor serialises a string as XML (or HTML) then it
escapes any significant characters (e.g. < and &) using the normal
escapes. So if you did: 
  <xsl:value-of select="description" />
 and you were generating XML or HTML then the XSLT processor would
escape the < characters that it sees in the string value of the
<description> element, and the output that you'd see would be: 
  blah &lt;bold&gt;blah&lt;/bold&gt; blah
 What disable-output-escaping does is to disable this output escaping
-- it stops the processor from escaping the characters that are
significant in XML. So you'd get: 
  blah <bold>blah</bold> blah
 in the output. A browser will then recognise the <bold> as a tag and
behave accordingly. 
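
Putting the DOE approach together, a complete stylesheet might look like the sketch below. The html/body wrapper, the div and the "//description" path are assumptions made for illustration, since the thread only shows the <description> element itself. DOE is an optional feature in XSLT 1.0 and only takes effect when the processor serialises the result itself, so this will not help if the result tree is handed on as a tree (to a DOM, for instance).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The question asks for HTML output from a UTF-8 source. -->
  <xsl:output method="html" encoding="UTF-8" />

  <!-- Illustrative wrapper only: the real page structure is not given. -->
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="//description" />
      </body>
    </html>
  </xsl:template>

  <!-- Write the string value without escaping, so that the
       markup-as-text in the feed reaches the browser as real tags. -->
  <xsl:template match="description">
    <div>
      <xsl:value-of select="." disable-output-escaping="yes" />
    </div>
  </xsl:template>

</xsl:stylesheet>
```

With a plain <xsl:value-of select="." /> in the second template, the same input would come out with the markup characters escaped and the browser would display the tags as literal text.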
Some processors have a parsing extension function (e.g. check out
saxon:parse()) that would allow you to interpret the text inside the
<description> element as XML, which would then allow you to do: 
  <xsl:copy-of select="saxon:parse(description)" />
 but note that the content has to be well-formed to do that. In XSLT
2.0, you should use character maps rather than DOE.

>> Or complain to the feed until they give you XML rather than this
>> hybrid.
>
> It is XML... strictly speaking? But agreed it's a mess.
 What I meant was that the content of the <description> element is just
text, not XML -- it doesn't have any structure that's recognisable to
an XML parser, but you want it to have.
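
For the XSLT 2.0 route mentioned above, a character map could be used roughly as follows. This is a sketch rather than a tested solution: the map name "unescape" and the three private-use characters are arbitrary choices for illustration, and it assumes those characters never occur in the feed itself.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Apply the character map when the result is serialised as HTML. -->
  <xsl:output method="html" encoding="UTF-8"
              use-character-maps="unescape" />

  <!-- Each placeholder character is written out as the given string,
       with no further escaping applied to that string. -->
  <xsl:character-map name="unescape">
    <xsl:output-character character="&#xE001;" string="&lt;" />
    <xsl:output-character character="&#xE002;" string="&gt;" />
    <xsl:output-character character="&#xE003;" string="&amp;" />
  </xsl:character-map>

  <!-- Swap the markup characters in the description text for the
       placeholders; the serialiser then swaps them back unescaped. -->
  <xsl:template match="description">
    <xsl:value-of
      select="translate(., '&lt;&gt;&amp;', '&#xE001;&#xE002;&#xE003;')" />
  </xsl:template>

</xsl:stylesheet>
```

Only the text that goes through translate() is affected, so a < or & coming from anywhere else in the source is still escaped in the normal way. Like DOE, this does nothing to check that the embedded markup is well-formed.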