# This is a sample wordlist that can be converted to a binary dictionary
# for use by the Latin IME.
# The file is essentially a CSV file, with indent level denoting nesting.
#
# The file starts with a single CSV line with the header attributes. Whatever
# the content, these are included as is in the binary file. The first attribute
# of the file should be `dictionary'. Usual fields are `locale', `description',
# `date', `version', `options'.
#
# Each word has a `word' entry and at least an `f' attribute denoting its
# probability, as an integer between 0 and 255 on a logarithmic scale, with
# 255 meaning 1 and each decrement of 1 dividing the probability by 1.15.
# As a special case, a weight of 0 marks profanity: words that should not
# be treated as typos, but that should never be suggested explicitly.
# An entry can be marked as not a word by adding a `not_a_word' field with
# a value of `true'. The main reason for putting such entries into the
# dictionary is to attach shortcut targets and possibly a whitelist
# replacement.
#
# Each word may have any number of shortcut target lines, each starting
# with a `shortcut' entry and having at least an `f' frequency value
# between 0 and 14, or the special value `whitelist'. `whitelist' is stored
# as 15 and marks the target as the whitelist target of this word.
#
# Each word may also have any number of bigram lines, each starting with a
# `bigram' entry naming the following word. The bigram's frequency overrides
# that following word's unigram frequency when it comes right after the word
# this bigram belongs to.
#
dictionary=main:en,locale=en,description=Sample wordlist,date=1351495318,version=1
 word=sample,f=200
  bigram=wordlist,f=243
 word=wordlist,f=180
 word=shortcut,f=176
  shortcut=target,f=10
 word=witelisted,f=10,not_a_word=true
  shortcut=whitelisted,f=whitelist
 word=profanity,f=0
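#
# The format described above can be read with a few lines of Python. What
# follows is only an illustrative sketch, not the converter the Latin IME
# actually uses. The names parse_fields, parse_wordlist and
# weight_to_probability are hypothetical, and the sketch assumes that field
# values never contain `,' or a second `=' separator, that comment lines
# start with `#', and that nesting is exactly 0, 1 or 2 leading spaces.

```python
SAMPLE = """\
# comment lines start with '#'
dictionary=main:en,locale=en,description=Sample wordlist,date=1351495318,version=1
 word=sample,f=200
  bigram=wordlist,f=243
 word=wordlist,f=180
 word=shortcut,f=176
  shortcut=target,f=10
 word=witelisted,f=10,not_a_word=true
  shortcut=whitelisted,f=whitelist
 word=profanity,f=0
"""


def parse_fields(line):
    """Split 'k1=v1,k2=v2' into a dict (assumes no ',' inside values)."""
    return dict(item.split("=", 1) for item in line.strip().split(","))


def weight_to_probability(f):
    """Map a weight to a probability: 255 -> 1, each step down divides by 1.15.

    A weight of 0 is the special profanity marker, so callers should check
    for it before treating the result as a real probability.
    """
    return 1.15 ** (f - 255)


def parse_wordlist(text):
    """Return (header, words); each word carries its bigrams and shortcuts."""
    header, words = None, []
    for raw in text.splitlines():
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        indent = len(raw) - len(raw.lstrip(" "))
        fields = parse_fields(raw)
        if indent == 0:
            header = fields  # the single header line
        elif indent == 1:
            fields["f"] = int(fields["f"])
            fields["bigrams"], fields["shortcuts"] = [], []
            words.append(fields)
        else:
            # `whitelist' is stored as 15, per the description above
            fields["f"] = 15 if fields["f"] == "whitelist" else int(fields["f"])
            key = "bigrams" if "bigram" in fields else "shortcuts"
            words[-1][key].append(fields)  # attach to the preceding word
    return header, words


header, words = parse_wordlist(SAMPLE)
for w in words:
    print(w["word"], w["f"], weight_to_probability(w["f"]))
```

# Keeping bigram and shortcut lines attached to the most recent word mirrors
# the "indent level denotes nesting" rule; a stricter reader would reject
# indent levels it does not recognize instead of treating them as level 2.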