Nikolai Grigoriev

A good hyphenation file should mark
permitted break points, and disable breaks that are undesirable.
Do you mean that your patterns produce spurious hyphenations
that don't happen in TeX only because the line-breaking algorithm
always finds a better alternative?
Liang's algorithm uses priorities in hyphenation patterns: these
priorities are certainly respected. XEP's hyphenator finds all
breaks permitted by Liang's algorithm plus additional constraints
from hyphenation-{push|remain}-character-count, and only those breaks.
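
To make the pattern step concrete, here is a minimal Python sketch of Liang-style matching. It is an illustration only, not XEP's code: the function names and the `remain`/`push` parameters (standing in for hyphenation-remain-character-count and hyphenation-push-character-count) are my own assumptions, and the pattern list is a tiny hand-picked subset in the style of the English TeX patterns.

```python
import re

def parse_pattern(pattern):
    """Split a pattern such as 'hy3ph' into its letters and gap priorities."""
    letters = re.sub(r"\d", "", pattern)
    values = [0] * (len(letters) + 1)   # values[j] = priority of the gap before letters[j]
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            values[pos] = int(ch)
        else:
            pos += 1
    return letters, values

def hyphenation_points(word, patterns, remain=2, push=2):
    """Return indices i such that word[:i] + '-' + word[i:] is a permitted break."""
    table = dict(parse_pattern(p) for p in patterns)
    padded = "." + word.lower() + "."   # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)    # scores[k] = priority of the gap before padded[k]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            values = table.get(padded[start:end])
            if values:
                # The highest priority seen at each gap wins.
                for offset, value in enumerate(values):
                    scores[start + offset] = max(scores[start + offset], value)
    # Odd priority permits a break, even priority forbids it.
    # remain/push keep a minimum number of characters before/after the hyphen.
    return [i for i in range(remain, len(word) - push + 1)
            if scores[i + 1] % 2 == 1]

# Toy subset of patterns (assumption: enough to cover this one word only).
PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]

print(hyphenation_points("hyphenation", PATTERNS))   # [2, 6], i.e. hy-phen-ation
```

Note that the priorities are fully used here to decide which gaps are permitted; only the resulting list of break points is handed on to line breaking.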
What distinguishes XEP from TeX is the line-breaking algorithm that triggers
hyphenation. We don't use global optimization - our approach considers
only single lines. It is at this point that we drop all priorities - all
hyphenation points permitted by Liang's patterns are considered
equivalent. But my impression is that FOP does the same (I may be wrong).
Line breaking is not Liang's part, and the algorithm of pattern processing
in XEP is exactly Liang's, with no omissions.

Anyhow, XEP does not hyphenate until it actually has to: if it can get through
by slightly adjusting inter-character or inter-word spaces, it does. (You may
notice that XEP almost never hyphenates long lines; and if it has to, it tends
to split words in the middle.) So, for ordinary text, the penalty for
treating all hyphenation points as equivalent is actually negligible.
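
A rough Python sketch of such a line-at-a-time strategy follows. It is a hypothetical illustration, not XEP's implementation: it assumes a monospaced font, measures stretch as extra characters per inter-word gap, and the names `break_paragraph`, `points_for` and `max_stretch` are invented for the example. Choosing the largest fragment that fits is also my assumption; the only point carried over from the text is that all permitted break points are treated as equal and hyphenation is a last resort.

```python
def break_paragraph(words, width, points_for, max_stretch=0.5):
    """Greedy single-line breaking; hyphenate only when stretching is not enough."""
    lines, current = [], []
    queue = list(words)
    while queue:
        word = queue.pop(0)
        if len(" ".join(current + [word])) <= width:
            current.append(word)            # the word fits, keep filling the line
            continue
        used = len(" ".join(current))
        gaps = max(len(current) - 1, 1)
        slack = width - used
        if current and slack / gaps <= max_stretch:
            # Small slack: the line can be justified by adjusting spaces alone.
            lines.append(" ".join(current))
            current = [word]
            continue
        # Room left for a word fragment, after a space (if any) and the hyphen.
        room = width - used - (1 if current else 0) - 1
        fitting = [p for p in points_for(word) if 0 < p <= room]
        if fitting:
            cut = max(fitting)              # all permitted points count as equal
            lines.append(" ".join(current + [word[:cut] + "-"]))
            current = []
            queue.insert(0, word[cut:])     # the remainder starts the next line
        elif current:
            lines.append(" ".join(current))
            current = [word]
        else:
            lines.append(word)              # overlong word with no usable break
    if current:
        lines.append(" ".join(current))
    return lines

# Stand-in for a pattern-based hyphenator (assumption: fixed answers for the demo).
def demo_points(word):
    return [2, 6] if word == "hyphenation" else []

print(break_paragraph(["some", "words", "hyphenation"], 14, demo_points))
# ['some words hy-', 'phenation']
print(break_paragraph(["some", "words", "hyphenation"], 14, demo_points, max_stretch=4))
# ['some words', 'hyphenation']
```

With a generous `max_stretch` the second call never consults the hyphenator at all, which is the behaviour described above for lines that can be fixed by spacing.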
(I'd like to stress once again that we don't produce hyphenations that are
not permitted by patterns).

> I can't understand how such a simplified algorithm could perform well,
> unless it discards most of the valid hyphenation points. Have you tested
> XEP's hyphenation algorithm with a bunch of hard-to-hyphenate words?
> (For example, several words with tricky combinations of consonants
> and vowels around?).
Certainly, there are some words that don't hyphenate well (e.g. "names-pace"
for English :-)). You have to put them into \hyphenation {}.
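
For illustration, an exception list of this kind can be modelled as a lookup that takes precedence over the patterns. The Python sketch below is only an assumption about how such a mechanism might look (the helper names are made up, and the lambda merely imitates a pattern set that yields "names-pace"):

```python
def parse_exception(entry):
    """Turn an entry like 'name-space' into ('namespace', [4])."""
    points, plain = [], ""
    for ch in entry:
        if ch == "-":
            points.append(len(plain))   # a break is allowed before the next letter
        else:
            plain += ch
    return plain.lower(), points

def make_hyphenator(pattern_based, exceptions):
    """Wrap a pattern-based hyphenator so that listed exceptions win."""
    table = dict(parse_exception(e) for e in exceptions)
    def hyphenate(word):
        key = word.lower()
        if key in table:
            return table[key]           # the exception list overrides the patterns
        return pattern_based(word)
    return hyphenate

# Stand-in pattern hyphenator that breaks after the fifth letter ("names-pace").
hyphenate = make_hyphenator(lambda word: [5], ["name-space"])
print(hyphenate("namespace"))   # [4], i.e. name-space
```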