Jeni Tennison

I have the following question. Assume that I have the following XML file
("........" represents some removed irrelevant lines):

<CPU partNum="1345">
.............
<CPU partNum="15678">
............
<CPU partNum="1345">
..............
<CPU partNum="11111">
..............
<CPU partNum="11111">
............
<CPU partNum="1345">
..........
<CPU partNum="11111">
.........
and I want to write a stylesheet such that after parsing this XML file, it
counts the number
of those elements which have the same partNum and report that in some way,
say in a table
(in HTML) like the following:

partNum   Qty
1345      3
15678     1
11111     3

This is a grouping problem: you want to group the parts together, and count
the number of items in each group. Grouping problems are currently (and
using basic XSLT without any processor extensions) best solved using the
Muenchian method, which involves defining a key that does the grouping
quickly for you.

When you design a key to help you group things together, you have three
attributes to set:

* name - a name for the key, anything you like: 'parts' in this case
* match - the things that you want to group: elements with 'partNum'
  attributes in this case
* use - the thing that defines the groups: the number of the part in this case

So, the key that you want looks like:

<xsl:key name="parts" match="*[@partNum]" use="@partNum" />

This element is a top-level element: it goes right underneath the
xsl:stylesheet element. Note that I have assumed within the 'match'
expression that different elements, with different names, might all have
'partNum' attributes, so as well as:

<CPU partNum="1345" />

you might have:
<HDD partNum="5437" />
If the elements that you're interested in are *all* CPUs, then you could use:
<xsl:key name="CPUs" match="CPU" use="@partNum" />
instead. Indeed, you could define a separate key for each of the
elements that have partNums, if you know those in advance.

Retrieving the groups involves using the key() function. The first
argument is the name of the key (so 'parts' in this case) and the second
argument is the value of the thing that was used to group the things you're
grouping, so a part number in this case, something like:

key('parts', '1345')

Neither argument has to be a literal: you can compute the part number (or
even the key name) at run time.

Getting a list of the part numbers is the slightly tricky bit. You need to
identify one member of each of the groups that you've defined in the
key. You do this by comparing a node (a 'CPU' element) with the first node
that you get when you use the key value for that node to index into the
key. So, if the first element that you get when you use the value '1345'
to index into your 'parts' key is the same as the current element, then you
know that it's the first one to appear in the list. To compare nodes, you
use the generate-id() function; when it is given a set of nodes, it returns
the ID of the first one in document order, which is exactly the node you
want to compare against. So, a template matching on the parent of
the CPU elements should look something like:

<xsl:template match="parts">
  <table>
    <tr><th>partNum</th><th>Qty</th></tr>
    <xsl:apply-templates
      select="*[@partNum and
                generate-id(.)=generate-id(key('parts', @partNum))]" />
  </table>
</xsl:template>

This guarantees that the only parts that have templates applied to them are
those that occur first in the list of parts with that particular part
number. You can then have a template that matches on them and outputs the
rows of your table. To count the number of parts with that particular part
number, you count the number of elements that are retrieved when you use
that part number to index into the key that you've used to group your
elements:

<xsl:template match="*[@partNum]">
  <tr>
    <!-- first column is the value of the partNum attribute -->
    <td><xsl:value-of select="@partNum" /></td>
    <!-- second column is the number of parts with that partNum -->
    <td><xsl:value-of select="count(key('parts', @partNum))" /></td>
  </tr>
</xsl:template>

This is tested and works in SAXON. The important things to take out of
this example are how to use the key to group the elements that you're
interested in:

<xsl:key name="parts" match="*[@partNum]" use="@partNum" />

how to retrieve them and count them:

count(key('parts', @partNum))

and how to retrieve the unique elements so that you can identify what part
numbers are used in your file:

*[@partNum and
  generate-id(.)=generate-id(key('parts', @partNum))]

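Putting those three pieces together, a complete stylesheet might look something like the sketch below. It only assembles the key and the two templates from this message; the xsl:output element and the sample input shown in the comment are my assumptions (in particular, that the part elements sit directly under a document element called 'parts', and that they are written as empty elements), so adjust them to your real file.

```xml
<?xml version="1.0"?>
<!-- Assumed sample input (element content left out for brevity):

     <parts>
       <CPU partNum="1345" />
       <CPU partNum="15678" />
       <CPU partNum="1345" />
       <CPU partNum="11111" />
       <CPU partNum="11111" />
       <CPU partNum="1345" />
       <CPU partNum="11111" />
     </parts>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: HTML output, as the question asks for an HTML table -->
  <xsl:output method="html" indent="yes" />

  <!-- Group every element that has a partNum attribute by that attribute -->
  <xsl:key name="parts" match="*[@partNum]" use="@partNum" />

  <!-- Assumption: 'parts' is the parent of the part elements -->
  <xsl:template match="parts">
    <table>
      <tr><th>partNum</th><th>Qty</th></tr>
      <!-- Only the first element of each group gets a template applied -->
      <xsl:apply-templates
        select="*[@partNum and
                  generate-id(.) = generate-id(key('parts', @partNum))]" />
    </table>
  </xsl:template>

  <!-- One row per group: the part number and the size of its group -->
  <xsl:template match="*[@partNum]">
    <tr>
      <td><xsl:value-of select="@partNum" /></td>
      <td><xsl:value-of select="count(key('parts', @partNum))" /></td>
    </tr>
  </xsl:template>

</xsl:stylesheet>
```

With the sample input from the comment, this should give one row each for 1345 (3), 15678 (1) and 11111 (3), in the order in which the part numbers first appear in the file.
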
>The difficulty that I have is the following: I have different XML files
>(with similar structure)
>in which, these partNum are different , and may even change in the future,
I'm not sure whether this means you want to summarise the content of all
these different files at the same time. If you do, you should take note
that the key() function only retrieves nodes from the document that contains
the current node. You can switch to another document using the
document() function, but that's another question :)
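
For what it's worth, here is a minimal sketch of that idea, in case it helps. The file name 'more-parts.xml' is hypothetical, as is the choice of part number '1345'; I'm assuming the second file has the same structure as the first. The point it illustrates is that you have to make a node in the other document the current node (here with xsl:for-each) before key() will look in that document.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" />

  <xsl:key name="parts" match="*[@partNum]" use="@partNum" />

  <!-- Assumption: 'parts' is the document element of the main file -->
  <xsl:template match="parts">
    <!-- Count in the document being transformed -->
    <xsl:variable name="here" select="count(key('parts', '1345'))" />

    <!-- Count in a second document. 'more-parts.xml' is a hypothetical
         name; a relative URI given as a string is resolved relative to
         the stylesheet. xsl:for-each changes the current node to the
         root of that document, so key() searches there instead. -->
    <xsl:variable name="there">
      <xsl:for-each select="document('more-parts.xml')">
        <xsl:value-of select="count(key('parts', '1345'))" />
      </xsl:for-each>
    </xsl:variable>

    <p>
      <xsl:text>Part 1345 across both files: </xsl:text>
      <xsl:value-of select="$here + $there" />
    </p>
  </xsl:template>

</xsl:stylesheet>
```

Getting a combined list of the distinct part numbers across several files takes more work than this, because the generate-id() comparison only finds the first element within each document separately.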