Hyphen - hyphenation library to use converted TeX hyphenation patterns

(C) 1998 Raph Levien
(C) 2001 ALTLinux, Moscow
(C) 2006, 2007, 2008 László Németh

This was part of the libHnj library by Raph Levien.

Peter Novodvorsky from ALTLinux cut the hyphenation part from libHnj
to use it in OpenOffice.org.

Compound word and non-standard hyphenation support by László Németh.

License is the original LibHnj license:
LibHnj is dual licensed under LGPL and MPL (see also README.libhnj).

Because LGPL allows GPL relicensing, COPYING now contains an
LGPL/GPL/MPL tri-license for explicit Mozilla source compatibility.

The original Libhnj source with OOo's patches is maintained by
Rene Engelhard and Chris Halls at Debian:

http://packages.debian.org/stable/libdevel/libhnj-dev
and http://packages.debian.org/unstable/source/libhnj


OTHER FILES

This distribution is also the source of the en_US hyphenation patterns
"hyph_en_US.dic". See README_hyph_en_US.txt.

Source files of hyph_en_US.dic in the distribution:

hyphen.tex (en_US hyphenation patterns from plain TeX)

  Source: http://tug.ctan.org/text-archive/macros/plain/base/hyphen.tex

tbhyphext.tex: hyphenation exception log from the TUGboat archive

  Source of the hyphenation exception list:
  http://www.ctan.org/tex-archive/info/digests/tugboat/tb0hyf.tex

  Generated with the hyphenex script
  (http://www.ctan.org/tex-archive/info/digests/tugboat/hyphenex.sh)

  sh hyphenex.sh <tb0hyf.tex >tbhyphext.tex


INSTALLATION

./configure
make
make install

UNIT TESTS (WITH VALGRIND DEBUGGER)

make check
VALGRIND=memcheck make check

USAGE

./example hyph_en_US.dic mywords.txt

or (under Linux)

echo example | ./example hyph_en_US.dic /dev/stdin

NOTE: In the case of Unicode encoded input, convert your words
to lowercase before hyphenation (under UTF-8 console environment):

cat mywords.txt | awk '{print tolower($0)}' >mywordslow.txt

DEVELOPMENT

See README.hyphen for the hyphenation algorithm, README.nonstandard
and doc/tb87nemeth.pdf for non-standard hyphenation,
README.compound for compound word hyphenation, and tests/*.

Description of the dictionary format:

First line contains the character encoding (ISO8859-x, UTF-8).

Possible options in the following lines:

LEFTHYPHENMIN num           minimal hyphenation distance from the left word end
RIGHTHYPHENMIN num          minimal hyphenation distance from the right word end
COMPOUNDLEFTHYPHENMIN num   min. hyph. dist. from the left compound word boundary
COMPOUNDRIGHTHYPHENMIN num  min. hyph. dist. from the right compound word boundary

hyphenation patterns        see README.* files

NEXTWORD                    separates the two compound sets (see README.compound)

Default values:
Without explicit declarations, the hyphenmin fields of the dict struct
are zero, but in this case lefthyphenmin and righthyphenmin
default to 2 during hyphenation (for backward compatibility).

Comments

Use a percent sign at the beginning of a line to add comments to your
hyphenation patterns (after the character encoding in the first line):

% comment

*****************************************************************************
* Warning! Correct working of Libhnj *needs* prepared hyphenation patterns. *

For example, generating hyph_en_US.dic from "hyphen.us" TeX patterns:

perl substrings.pl hyphen.us hyph_en_US.dic ISO8859-1

or, specifying default LEFTHYPHENMIN and RIGHTHYPHENMIN values:

perl substrings.pl hyphen.us hyph_en_US.dic ISO8859-1 2 3
perl substrings.pl hyphen.gb hyph_en_GB.dic ISO8859-1 3 3
*****************************************************************************

OTHERS

Java hyphenation: Peter B. West (Folio project) implements a hyphenator with
non-standard hyphenation facilities based on extended Libhnj. The HyFo module
is released in binary form as jar files and in source form as zip files.
See http://sourceforge.net/project/showfiles.php?group_id=119136

László Németh
<nemeth (at) openoffice (dot) org>
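
EXAMPLE: READING A DICTIONARY HEADER

The dictionary format described under DEVELOPMENT can be illustrated with a
short Python sketch. It is not part of Libhnj and does not show how the
library itself parses dictionaries. The function names read_dictionary and
effective_hyphenmin are hypothetical, and the sample patterns "a1b" and "c1d"
are placeholders, not prepared patterns. The sketch assumes only what this
README states: the first line is the encoding, option lines have the form
"NAME num", lines starting with a percent sign are comments, NEXTWORD
separates the two compound sets, and undeclared hyphenmin values fall back
to 2 at hyphenation time.

```python
# Illustrative sketch only: a toy reader for the dictionary layout
# described in this README. It is NOT the parser used by Libhnj.

OPTION_NAMES = (
    "LEFTHYPHENMIN",
    "RIGHTHYPHENMIN",
    "COMPOUNDLEFTHYPHENMIN",
    "COMPOUNDRIGHTHYPHENMIN",
)


def read_dictionary(lines):
    """Return (encoding, options, pattern_sets) from dictionary lines."""
    it = iter(lines)
    # First line: character encoding (e.g. ISO8859-1, UTF-8).
    encoding = next(it).strip()
    # Undeclared options stay at zero, as the README says of the dict struct.
    options = {name: 0 for name in OPTION_NAMES}
    # One pattern list per compound set; NEXTWORD starts a new one.
    pattern_sets = [[]]
    for raw in it:
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # blank line or comment
        fields = line.split()
        if fields[0] in OPTION_NAMES and len(fields) == 2:
            options[fields[0]] = int(fields[1])
        elif fields[0] == "NEXTWORD":
            pattern_sets.append([])
        else:
            pattern_sets[-1].append(line)  # kept verbatim, not interpreted
    return encoding, options, pattern_sets


def effective_hyphenmin(options):
    """Apply the backward-compatible default of 2 to zero-valued fields."""
    left = options["LEFTHYPHENMIN"] or 2
    right = options["RIGHTHYPHENMIN"] or 2
    return left, right


# Hypothetical sample; "a1b" and "c1d" are placeholder patterns.
SAMPLE = """UTF-8
LEFTHYPHENMIN 2
RIGHTHYPHENMIN 3
% a comment line
a1b
NEXTWORD
c1d
"""

encoding, options, pattern_sets = read_dictionary(SAMPLE.splitlines())
print(encoding, effective_hyphenmin(options), pattern_sets)
```

A real dictionary should still be generated with substrings.pl as described
in the warning above; hand-written pattern lines like the ones in SAMPLE are
only there to show where each kind of line sits in the file.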