Home | History | Annotate | Download | only in marisa-trie
      1 - Project name
      2 
      3 marisa-trie
      4 http://code.google.com/p/marisa-trie/
      5 
      6 - Project summary
      7 
      8 Matching Algorithms with Recursively Implemented StorAge
      9 
     10 - Version
     11 
     12 0.1.4
     13 
     14 - Description
     15 
     16 This project *marisa-trie* provides a C++ library *libmarisa* and command line tools *`marisa-*`* for building and operating nesting patricia tries. The brand-new dictionary structure is designed to be static and space efficient. Also, *marisa-trie* enables not only simple lookups but also prefix searches and predictive searches.
     17 
     18  * Prefix searches are to find keys from prefixes of a given query.
     19  * Predictive searches are to find keys starting with a given query.
     20 
     21 The biggest advantage of *marisa-trie* is that it can build a considerably compact dictionary. See below for the size of dictionaries built with various trie implementations.
     22 
     23  * Input
     24   * Source: enwiki-20110115-all-titles-in-ns0.gz
     25   * Contents: all page titles of English Wikipedia (Jan. 2011)
     26   * Number of keys: 8,279,325
     27   * Total size: 167,223,035 bytes (plain) / 46,344,646 bytes (gzipped)
     28 
     29 || *Implementation* || *Size (bytes)* || *Remarks*                   ||
     30 ||  darts-clone     ||   316,065,792  || Compacted double-array trie ||
     31 ||  tx-trie         ||   107,119,864  || LOUDS-based trie            ||
     32 || *marisa-trie*    ||   *42,688,271* || Nesting patricia trie       ||
     33 
     34  * Documentation (in Japanese)
     35   * HowTo
     36   * ListOfTools
     37   * LibraryInterface
     38   * BenchmarkResults
     39 
     40 - Version control system
     41 
     42 Subversion
     43 
     44 - Source code license
     45 
     46 New BSD License
     47 
     48 - Project labels
     49 
     50 Nesting
     51 Patricia
     52 Trie
     53 Static
     54 Dictionary
     55 CPlusPlus
     56