Dimitre Novatchev
> I'm looking at graph processing problems as a testbed for this, and
> came across a problem that I haven't been able to solve elegantly. The
> problem is to find "linker" vertexes that a pair of verteces from a
> pre-defined set. For example, if the graph verteces represent cities
> and edges represent flights between them, then given a list of cities,
> find all intermediate cities that you would stop in via a "connecting
> flight".
>
> For example, given the following simple graph:
>
> V1 -> V2 -> V3 -> V4
> \<- V5 ->/
>
> (V5 points to both V2 and V4), and its XML serialization:
>
> <graph>
> <vertex id="V1"/>
> <vertex id="V2" type="anchor"/>
> <vertex id="V3"/>
> <vertex id="V4" type="anchor"/>
> <vertex id="V5"/>
> <edge source="V1" target="V2"/>
> <edge source="V2" target="V3"/>
> <edge source="V3" target="V4"/>
> <edge source="V5" target="V2"/>
> <edge source="V5" target="V4"/>
> </graph>
>
> I would like to transform this into a second graph where all vertexes
> that "link" two anchor distinct vertexes are flagged as link nodes. In
> this case, there are two anchor vertexes V2 and V4, and I can link
> them through V3 (V2 -> V3 -> V4) and V5 (V2 <- V5 -> V4). Note that
> linked verteces must be distinct, so traversing the V2 <- V1 -> V2
> path should not yield V1 as a link node. So I'd like to see something
> like this:
>
> <graph>
> <vertex id="V1"/>
> <vertex id="V2" type="anchor"/>
> <vertex id="V3" linker="true"/>
> <vertex id="V4" type="anchor"/>
> <vertex id="V5" linker="true"/>
> <edge source="V1" target="V2"/>
> <edge source="V2" target="V3"/>
> <edge source="V3" target="V4"/>
> <edge source="V5" target="V2"/>
> <edge source="V5" target="V4"/>
> </graph>
>
> It would be ideal to come up with a generalized solution that would
> let you use 1, 2, .. N intermediate linking nodes.
Here's the general solution and it is quite simple and
straightforward. I use a recursive template based on the following
rule: Paths(/.., X, Z) = Union( {X -> Xi} . Paths(X, Xi, Z) )
Where the function Paths(Eset, X, Z) returns all paths from X to Z,
that do not include any vertices belonging to the exclusion-set Eset.
Here {X -> Xi} are all arcs from X.
Informally,
{X -> Xi} . Paths(X, Xi, Z)
is the Cartesian product of the arc {X -> Xi} with the set of paths Paths(X,
Xi, Z), giving a new set of paths.
As you can see the code of the transformation is 71 lines long, but it
should be noted that for the purpose of readability I write some XPath
expressions on several lines and also almost half of these 71 lines are just
closing tags.
Here's the code. This transformation: <xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
exclude-result-prefixes="exsl"
>
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:key name="kNeighbors" match="vertex"
use="../edge[@target = current()/@id]/@source"/>
<xsl:key name="kNeighbors" match="vertex"
use="../edge[@source = current()/@id]/@target"/>
<xsl:template match="/">
<xsl:call-template name="getPaths">
<xsl:with-param name="pNode1"
select="/*/vertex[@type='anchor'][1]"/>
<xsl:with-param name="pNode2"
select="/*/vertex[@type='anchor'][2]"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="getPaths">
<xsl:param name="pNode1" select="/.."/>
<xsl:param name="pNode2" select="/.."/>
<xsl:param name="pExcluded" select="/.."/>
<xsl:for-each select=
"key('kNeighbors', $pNode1/@id)
[not(@id = $pExcluded/@id)]">
<xsl:choose>
<xsl:when test="@id = $pNode2/@id">
<path>
<xsl:copy-of
select="/*/edge[$pNode1/@id = @*
and
$pNode2/@id = @*
]"/>
</path>
</xsl:when>
<xsl:otherwise>
<xsl:variable name="vrtfTail">
<xsl:call-template name="getPaths">
<xsl:with-param name="pNode1"
select="."/>
<xsl:with-param name="pNode2"
select="$pNode2"/>
<xsl:with-param name="pExcluded"
select="$pExcluded | $pNode1"/>
</xsl:call-template>
</xsl:variable>
<xsl:variable name="vTail"
select="exsl:node-set($vrtfTail)/*"/>
<xsl:if test="$vTail">
<path>
<xsl:copy-of
select="/*/edge[$pNode1/@id = @*
and
current()/@id = @*
]"/>
<xsl:copy-of select="$vTail/*"/>
</path>
</xsl:if>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
when applied on this source.xml: <graph>
<vertex id="V1"/>
<vertex id="V2" type="anchor"/>
<vertex id="V3"/>
<vertex id="V4" type="anchor"/>
<vertex id="V5"/>
<edge source="V1" target="V2"/>
<edge source="V1" target="V3"/>
<edge source="V2" target="V3"/>
<edge source="V3" target="V4"/>
<edge source="V5" target="V2"/>
<edge source="V5" target="V4"/>
</graph>
produces the wanted result: <path>
<edge source="V1" target="V2"/>
<edge source="V1" target="V3"/>
<edge source="V3" target="V4"/>
</path>
<path>
<edge source="V2" target="V3"/>
<edge source="V3" target="V4"/>
</path>
<path>
<edge source="V5" target="V2"/>
<edge source="V5" target="V4"/>
</path>
These are all three different paths (with all nodes in the path only
once) from V2 to V4 in the graph described by the xml above. The first solution: is 3 edges long.
The other two are two edges long each (Note that I added to your original
graph structure a new arc from V1 to V3 in order to make it more "general").
I hope this helped in this specific problem and also to answer
positively your questions about the expressiveness of XPath/XSLT and
the appropriateness of XSLT as a tool for solving this type of
problems.
Sjoerd Visscher adds > It would be ideal to come up with a generalized solution that would
> let you use 1, 2, .. N intermediate linking nodes. I've been able to
> get this working with nested loops, but it isn't particularly
> declarative or speedy, and is certainly more verbose than I'd like,
> so I'm wondering if anyone here has insights into how to do this
> elegantly and in XSLT/XPath style. For example, is it possible to
> write a single XPath expression that will select <vertex>
> elements that obey the above criteria? If not, does anyone have any
> suggestions for how to code this effectively and efficiently with
> XSLT?
The following XSL transformation does what you want: <?xml version="1.0"?>
<xsl:transform version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:key name="a" match="/graph/vertex[@type='anchor']" use="@id" />
<xsl:key name="e1" match="/graph/edge/@target" use="../@source" />
<xsl:key name="e2" match="/graph/edge/@source" use="../@target" />
<!-- identity template -->
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="vertex">
<xsl:copy>
<xsl:apply-templates select="@* | node()" />
<!-- more than one edge to an anchor vertex? -->
<xsl:if test="count(key('a',key('e1',@id)|key('e2',@id)))>1">
<xsl:attribute name="linker">true</xsl:attribute>
</xsl:if>
</xsl:copy>
</xsl:template>
</xsl:transform>
I've used the key function here as both a look-up and as a filter shortcut. The neat thing about the key function is that it returns a set of distinct nodes. F.e. if you'd have two edges from V1 to V2, the expression key('e1', @id) returns 2 nodes, but when put through the key function again to find the anchor vertices, there's only one result: the V2 vertex. Extending this to a generalized solution is still hard though... |