<chapter id="what-is-harfbuzz">
  <title>What is HarfBuzz?</title>
  <para>
    HarfBuzz is a <emphasis>text shaping engine</emphasis>. It solves
    the problem of selecting and positioning glyphs from a font given a
    Unicode string.
  </para>
  <section id="why-do-i-need-it">
    <title>Why do I need it?</title>
    <para>
      Text shaping is an integral part of preparing text for display. It
      is a fairly low-level operation; HarfBuzz is used directly by
      graphics rendering libraries such as Pango, and by the layout
      engines in Firefox, LibreOffice and Chromium. Unless you are
      <emphasis>writing</emphasis> one of these layout engines yourself,
      you will probably not need to use HarfBuzz directly: normally,
      higher-level libraries will turn text into glyphs for you.
    </para>
    <para>
      However, if you <emphasis>are</emphasis> writing a layout engine
      or graphics library yourself, you will need to perform text
      shaping, and this is where HarfBuzz can help you. Here are some
      reasons why you need it:
    </para>
    <itemizedlist>
      <listitem>
        <para>
          OpenType fonts contain a set of glyphs, indexed by glyph ID.
          The glyph ID within the font does not necessarily relate to a
          Unicode codepoint. For instance, some fonts have the letter
          &quot;a&quot; as glyph ID 1. To pull the right glyph out of
          the font in order to display it, you need to consult a table
          within the font (the &quot;cmap&quot; table) which maps
          Unicode codepoints to glyph IDs. Text shaping turns codepoints
          into glyph IDs.
        </para>
      </listitem>
      <listitem>
        <para>
          Many OpenType fonts contain ligatures: combinations of
          characters which are rendered together as a single glyph. For
          instance, it's common for the <literal>fi</literal>
          combination to appear in print as the single ligature
          &quot;ﬁ&quot;. Whether you should render text as
          <literal>fi</literal> or &quot;ﬁ&quot; does not depend on the
          input text, but on the capabilities of the font and the level
          of ligature application you wish to perform. Text shaping
          involves querying the font's ligature tables and determining
          what substitutions should be made.
        </para>
      </listitem>
      <listitem>
        <para>
          While ligatures like &quot;ﬁ&quot; are typographic
          refinements, some languages <emphasis>require</emphasis> such
          substitutions to be made in order to display text correctly.
          In Tamil, when the letter &quot;TTA&quot; (ட) is followed by
          the vowel sign &quot;U&quot; (ு), the combination should
          appear as the single glyph &quot;டு&quot;. The sequence of
          Unicode characters U+0B9F U+0BC1 needs to be rendered as a
          single glyph from the font; text shaping chooses the correct
          glyph from the sequence of characters provided.
        </para>
      </listitem>
      <listitem>
        <para>
          Similarly, Arabic letters take different forms depending on
          their position in a word: within a font, there will be glyphs
          for the initial, medial, final, and isolated forms of a
          letter (some letters have only a subset of these forms).
          Unicode text normally uses only one codepoint per letter, and
          so a Unicode string will not tell you which glyph to use.
          Text shaping chooses the correct form of the letter and
          returns the correct glyph from the font that you need to
          render.
        </para>
      </listitem>
      <listitem>
        <para>
          Other languages have marks and accents which need to be
          rendered in certain positions around a base character. For
          instance, the Moldovan language has the Cyrillic letter
          &quot;zhe&quot; (ж) with a breve accent, like so: ӂ. Some
          fonts will contain this character as an individual glyph,
          whereas other fonts will not contain a zhe-with-breve glyph
          but expect the rendering engine to form the character by
          overlaying two glyphs: ж and the combining breve (U+0306).
          Where you should draw the combining breve depends on the
          height of the preceding glyph. Likewise, for Arabic, the
          correct positioning of vowel marks depends on the height of
          the character on which you are placing the mark. Text shaping
          tells you whether you have a precomposed glyph within your
          font or if you need to compose a glyph yourself out of
          combining marks, and if so, where to position those marks.
        </para>
      </listitem>
    </itemizedlist>
    <para>
      If this is something that you need to do, then you need a text
      shaping engine: you could use Uniscribe if you are using Windows;
      you could use CoreText on OS X; or you could use HarfBuzz. In the
      rest of this manual, we are going to assume that you are the
      implementor of a text layout engine.
    </para>
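    <para>
      As a rough illustration of the first two items in the list above,
      the following Python sketch maps codepoints to glyph IDs through a
      made-up &quot;cmap&quot; and then applies a made-up ligature
      table. It is not HarfBuzz code and does not use the HarfBuzz API:
      the names <literal>CMAP</literal>, <literal>LIGATURES</literal>
      and <literal>toy_shape</literal>, and all of the glyph IDs, are
      invented for this example. A real shaper reads these tables from
      the font file and applies many more rules.
    </para>
    <programlisting language="python"><![CDATA[
```python
# Toy illustration only: every table and glyph ID below is invented.

# Hypothetical "cmap": Unicode codepoint -> glyph ID.
CMAP = {ord("a"): 1, ord("f"): 2, ord("i"): 3}

# Hypothetical ligature table: pair of glyph IDs -> single ligature glyph ID.
LIGATURES = {(2, 3): 4}  # glyphs for "f" + "i" -> glyph for the "fi" ligature

NOTDEF = 0  # glyph ID 0 is conventionally the "missing glyph"


def map_codepoints(text):
    """Step 1: turn each codepoint into a glyph ID using the cmap."""
    return [CMAP.get(ord(ch), NOTDEF) for ch in text]


def apply_ligatures(glyphs, enable=True):
    """Step 2: replace glyph pairs that the font defines as ligatures."""
    if not enable:
        return list(glyphs)
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])  # two input glyphs become one
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def toy_shape(text, ligatures=True):
    """Run both steps: codepoints -> glyph IDs -> ligature substitution."""
    return apply_ligatures(map_codepoints(text), enable=ligatures)
```
]]></programlisting>
    <para>
      With these invented tables, <literal>toy_shape("fia")</literal>
      returns <literal>[4, 1]</literal>, while
      <literal>toy_shape("fia", ligatures=False)</literal> returns
      <literal>[2, 3, 1]</literal>: the same input text produces
      different glyph sequences depending on the font's tables and the
      level of ligature application requested.
    </para>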
  </section>
  <section id="why-is-it-called-harfbuzz">
    <title>Why is it called HarfBuzz?</title>
    <para>
      HarfBuzz began its life as text shaping code within the FreeType
      project (and you will see references to the FreeType authors
      within the source code copyright declarations), but was then
      abstracted out to its own project. This project is maintained by
      Behdad Esfahbod, and named HarfBuzz. Originally, it was a shaping
      engine for OpenType fonts: &quot;HarfBuzz&quot; is a
      transliteration of the Persian for &quot;open type&quot;.
    </para>
  </section>
</chapter>