Home | History | Annotate | Download | only in libdivsufsort
      1 # libdivsufsort
      2 
      3 libdivsufsort is a software library that implements a lightweight suffix array construction algorithm.
      4 
      5 ## News
      6 * 2015-03-21: The project has moved from [Google Code](http://code.google.com/p/libdivsufsort/) to [GitHub](https://github.com/y-256/libdivsufsort)
      7 
      8 ## Introduction
      9 This library provides a simple and an efficient C API to construct a suffix array and a Burrows-Wheeler transformed string from a given string over a constant-size alphabet.
     10 The algorithm runs in O(n log n) worst-case time using only 5n+O(1) bytes of memory space, where n is the length of
     11 the string.
     12 
     13 ## Build requirements
     14 * An ANSI C Compiler (e.g. GNU GCC)
     15 * [CMake](http://www.cmake.org/ "CMake") version 2.4.2 or newer
     16 * CMake-supported build tool
     17 
     18 ## Building on GNU/Linux
     19 1. Get the source code from GitHub. You can either
     20     * use git to clone the repository
     21     ```
     22     git clone https://github.com/y-256/libdivsufsort.git
     23     ```
     24     * or download a [zip file](../../archive/master.zip) directly
     25 2. Create a `build` directory in the package source directory.
     26 ```shell
     27 $ cd libdivsufsort
     28 $ mkdir build
     29 $ cd build
     30 ```
     31 3. Configure the package for your system.
     32 If you want to install to a different location,  change the -DCMAKE_INSTALL_PREFIX option.
     33 ```shell
     34 $ cmake -DCMAKE_BUILD_TYPE="Release" \
     35 -DCMAKE_INSTALL_PREFIX="/usr/local" ..
     36 ```
     37 4. Compile the package.
     38 ```shell
     39 $ make
     40 ```
     41 5. (Optional) Install the library and header files.
     42 ```shell
     43 $ sudo make install
     44 ```
     45 
     46 ## API
     47 ```c
     48 /* Data types */
     49 typedef int32_t saint_t;
     50 typedef int32_t saidx_t;
     51 typedef uint8_t sauchar_t;
     52 
     53 /*
     54  * Constructs the suffix array of a given string.
     55  * @param T[0..n-1] The input string.
     56  * @param SA[0..n-1] The output array or suffixes.
     57  * @param n The length of the given string.
     58  * @return 0 if no error occurred, -1 or -2 otherwise.
     59  */
     60 saint_t
     61 divsufsort(const sauchar_t *T, saidx_t *SA, saidx_t n);
     62 
     63 /*
     64  * Constructs the burrows-wheeler transformed string of a given string.
     65  * @param T[0..n-1] The input string.
     66  * @param U[0..n-1] The output string. (can be T)
     67  * @param A[0..n-1] The temporary array. (can be NULL)
     68  * @param n The length of the given string.
     69  * @return The primary index if no error occurred, -1 or -2 otherwise.
     70  */
     71 saidx_t
     72 divbwt(const sauchar_t *T, sauchar_t *U, saidx_t *A, saidx_t n);
     73 ```
     74 
     75 ## Example Usage
     76 ```c
     77 #include <stdio.h>
     78 #include <stdlib.h>
     79 #include <string.h>
     80 
     81 #include <divsufsort.h>
     82 
     83 int main() {
     84     // intput data
     85     char *Text = "abracadabra";
     86     int n = strlen(Text);
     87     int i, j;
     88 
     89     // allocate
     90     int *SA = (int *)malloc(n * sizeof(int));
     91 
     92     // sort
     93     divsufsort((unsigned char *)Text, SA, n);
     94 
     95     // output
     96     for(i = 0; i < n; ++i) {
     97         printf("SA[%2d] = %2d: ", i, SA[i]);
     98         for(j = SA[i]; j < n; ++j) {
     99             printf("%c", Text[j]);
    100         }
    101         printf("$\n");
    102     }
    103 
    104     // deallocate
    105     free(SA);
    106 
    107     return 0;
    108 }
    109 ```
    110 See the [examples](examples) directory for a few other examples.
    111 
    112 ## Benchmarks
    113 See [Benchmarks](https://github.com/y-256/libdivsufsort/blob/wiki/SACA_Benchmarks.md) page for details.
    114 
    115 ## License
    116 libdivsufsort is released under the [MIT license](LICENSE "MIT license").
    117 > The MIT License (MIT)
    118 >
    119 > Copyright (c) 2003 Yuta Mori All rights reserved.
    120 >
    121 > Permission is hereby granted, free of charge, to any person obtaining a copy
    122 > of this software and associated documentation files (the "Software"), to deal
    123 > in the Software without restriction, including without limitation the rights
    124 > to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
    125 > copies of the Software, and to permit persons to whom the Software is
    126 > furnished to do so, subject to the following conditions:
    127 >
    128 > The above copyright notice and this permission notice shall be included in all
    129 > copies or substantial portions of the Software.
    130 >
    131 > THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
    132 > IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
    133 > FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
    134 > AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
    135 > LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
    136 > OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
    137 > SOFTWARE.
    138 
    139 ## Author
    140 * Yuta Mori
    141