Hyphen - hyphenation library to use converted TeX hyphenation patterns

(C) 1998 Raph Levien
(C) 2001 ALTLinux, Moscow
(C) 2006, 2007, 2008 László Németh

This was part of the libHnj library by Raph Levien.

Peter Novodvorsky from ALTLinux extracted the hyphenation part from libHnj
to use it in OpenOffice.org.

Compound word and non-standard hyphenation support by László Németh.

License is the original LibHnj license:
LibHnj is dual licensed under LGPL and MPL (see also README.libhnj).

Because LGPL allows GPL relicensing, COPYING now contains an
LGPL/GPL/MPL tri-license for explicit Mozilla source compatibility.

The original Libhnj source with OOo's patches is managed by Rene Engelhard
and Chris Halls at Debian:

http://packages.debian.org/stable/libdevel/libhnj-dev
and http://packages.debian.org/unstable/source/libhnj


OTHER FILES

This distribution is also the source of the en_US hyphenation patterns
"hyph_en_US.dic". See README_hyph_en_US.txt.

Source files of hyph_en_US.dic in the distribution:

hyphen.tex (en_US hyphenation patterns from plain TeX)

  Source: http://tug.ctan.org/text-archive/macros/plain/base/hyphen.tex

tbhyphext.tex: hyphenation exception log from the TUGboat archive

  Source of the hyphenation exception list:
  http://www.ctan.org/tex-archive/info/digests/tugboat/tb0hyf.tex

  Generated with the hyphenex script
  (http://www.ctan.org/tex-archive/info/digests/tugboat/hyphenex.sh)

  sh hyphenex.sh <tb0hyf.tex >tbhyphext.tex


INSTALLATION

./configure
make
make install

UNIT TESTS (WITH VALGRIND DEBUGGER)

make check
VALGRIND=memcheck make check

USAGE

./example hyph_en_US.dic mywords.txt

or (under Linux)

echo example | ./example hyph_en_US.dic /dev/stdin

NOTE: For Unicode-encoded input, convert your words to lowercase
before hyphenation (in a UTF-8 console environment):

cat mywords.txt | awk '{print tolower($0)}' >mywordslow.txt
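
The lowercasing step can also be sketched in Python, whose str.lower()
works on Unicode text independently of the console locale. This is only
an illustration: the function name lowercase_file is an assumption and
is not part of the Hyphen distribution.

```python
def lowercase_file(src_path, dst_path, encoding="utf-8"):
    """Write a lowercased copy of src_path to dst_path, line by line."""
    with open(src_path, encoding=encoding) as src, \
         open(dst_path, "w", encoding=encoding) as dst:
        for line in src:
            # str.lower() applies Unicode case mapping to the whole line.
            dst.write(line.lower())
```

Calling lowercase_file("mywords.txt", "mywordslow.txt") should give a
word list comparable to the output of the awk command above.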

DEVELOPMENT

See README.hyphen for the hyphenation algorithm, README.nonstandard
and doc/tb87nemeth.pdf for non-standard hyphenation,
README.compound for compound word hyphenation, and tests/*.

Description of the dictionary format:

First line contains the character encoding (ISO8859-x, UTF-8).

Possible options in the following lines:

LEFTHYPHENMIN num          minimal hyphenation distance from the left word end
RIGHTHYPHENMIN num         minimal hyphenation distance from the right word end
COMPOUNDLEFTHYPHENMIN num  min. hyph. dist. from the left compound word boundary
COMPOUNDRIGHTHYPHENMIN num min. hyph. dist. from the right comp. word boundary

hyphenation patterns       see README.* files

NEXTWORD                   separate the two compound sets (see README.compound)

Default values:
Without explicit declarations, the hyphenmin fields of the dict struct
are zeroes, but in this case lefthyphenmin and righthyphenmin
default to 2 during hyphenation (for backward compatibility).
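
The effect of the two minima and of the default value can be sketched in
Python. This is a minimal illustration, not the library's code: it
assumes the distances are counted in characters (as with TeX's
\lefthyphenmin and \righthyphenmin), and both function names are
hypothetical. It only lists the positions where a break would be
permitted; whether a break is actually made there depends on the
patterns.

```python
def effective_hyphenmin(declared, default=2):
    """A zero (undeclared) field falls back to the default of 2."""
    return declared if declared > 0 else default

def allowed_break_positions(word, lefthyphenmin=0, righthyphenmin=0):
    """Indices i where word[:i] + '-' + word[i:] respects both minima."""
    left = effective_hyphenmin(lefthyphenmin)
    right = effective_hyphenmin(righthyphenmin)
    return [i for i in range(1, len(word))
            if i >= left and len(word) - i >= right]
```

For a seven-letter word with no declared values, this leaves positions
2 to 5 open; with RIGHTHYPHENMIN 3 the last permitted position is 4.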

Comments

Use a percent sign at the beginning of a line to add comments to your
hyphenation patterns (after the character encoding in the first line):

% comment
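
The format described above (encoding line, options, comments, NEXTWORD)
can be read with a short Python sketch. It is a simplified reading of
this description, not the library's loader: parse_dictionary and OPTIONS
are hypothetical names, the input is assumed to be already decoded text,
and options are collected into one table regardless of which pattern set
they appear in.

```python
OPTIONS = ("LEFTHYPHENMIN", "RIGHTHYPHENMIN",
           "COMPOUNDLEFTHYPHENMIN", "COMPOUNDRIGHTHYPHENMIN")

def parse_dictionary(text):
    """Split dictionary text into encoding, numeric options, pattern sets."""
    lines = text.splitlines()
    if not lines:
        raise ValueError("missing character encoding line")
    encoding = lines[0].strip()              # first line: character encoding
    options = {name: 0 for name in OPTIONS}  # zero means "not declared"
    pattern_sets = [[]]                      # NEXTWORD opens a second set
    for line in lines[1:]:
        line = line.rstrip()
        if not line or line.startswith("%"):  # blank line or comment
            continue
        fields = line.split()
        if fields[0] in options and len(fields) == 2 and fields[1].isdigit():
            options[fields[0]] = int(fields[1])
        elif fields[0] == "NEXTWORD":
            pattern_sets.append([])
        else:
            pattern_sets[-1].append(line)     # anything else: a pattern line
    return encoding, options, pattern_sets
```

Options left at zero are the undeclared ones described under "Default
values".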

*****************************************************************************
* Warning! Correct working of Libhnj *needs* prepared hyphenation patterns. *

For example, generating hyph_en_US.dic from "hyphen.us" TeX patterns:

perl substrings.pl hyphen.us hyph_en_US.dic ISO8859-1

or with default LEFTHYPHENMIN and RIGHTHYPHENMIN values:

perl substrings.pl hyphen.us hyph_en_US.dic ISO8859-1 2 3
perl substrings.pl hyphen.gb hyph_en_GB.dic ISO8859-1 3 3
*****************************************************************************

OTHERS

Java hyphenation: Peter B. West (Folio project) implements a hyphenator with
non-standard hyphenation facilities based on extended Libhnj. The HyFo module
is released in binary form as jar files and in source form as zip files.
See http://sourceforge.net/project/showfiles.php?group_id=119136

László Németh
<nemeth (at) openoffice (dot) org>