1 # libdivsufsort 2 3 libdivsufsort is a software library that implements a lightweight suffix array construction algorithm. 4 5 ## News 6 * 2015-03-21: The project has moved from [Google Code](http://code.google.com/p/libdivsufsort/) to [GitHub](https://github.com/y-256/libdivsufsort) 7 8 ## Introduction 9 This library provides a simple and an efficient C API to construct a suffix array and a Burrows-Wheeler transformed string from a given string over a constant-size alphabet. 10 The algorithm runs in O(n log n) worst-case time using only 5n+O(1) bytes of memory space, where n is the length of 11 the string. 12 13 ## Build requirements 14 * An ANSI C Compiler (e.g. GNU GCC) 15 * [CMake](http://www.cmake.org/ "CMake") version 2.4.2 or newer 16 * CMake-supported build tool 17 18 ## Building on GNU/Linux 19 1. Get the source code from GitHub. You can either 20 * use git to clone the repository 21 ``` 22 git clone https://github.com/y-256/libdivsufsort.git 23 ``` 24 * or download a [zip file](../../archive/master.zip) directly 25 2. Create a `build` directory in the package source directory. 26 ```shell 27 $ cd libdivsufsort 28 $ mkdir build 29 $ cd build 30 ``` 31 3. Configure the package for your system. 32 If you want to install to a different location, change the -DCMAKE_INSTALL_PREFIX option. 33 ```shell 34 $ cmake -DCMAKE_BUILD_TYPE="Release" \ 35 -DCMAKE_INSTALL_PREFIX="/usr/local" .. 36 ``` 37 4. Compile the package. 38 ```shell 39 $ make 40 ``` 41 5. (Optional) Install the library and header files. 42 ```shell 43 $ sudo make install 44 ``` 45 46 ## API 47 ```c 48 /* Data types */ 49 typedef int32_t saint_t; 50 typedef int32_t saidx_t; 51 typedef uint8_t sauchar_t; 52 53 /* 54 * Constructs the suffix array of a given string. 55 * @param T[0..n-1] The input string. 56 * @param SA[0..n-1] The output array or suffixes. 57 * @param n The length of the given string. 58 * @return 0 if no error occurred, -1 or -2 otherwise. 59 */ 60 saint_t 61 divsufsort(const sauchar_t *T, saidx_t *SA, saidx_t n); 62 63 /* 64 * Constructs the burrows-wheeler transformed string of a given string. 65 * @param T[0..n-1] The input string. 66 * @param U[0..n-1] The output string. (can be T) 67 * @param A[0..n-1] The temporary array. (can be NULL) 68 * @param n The length of the given string. 69 * @return The primary index if no error occurred, -1 or -2 otherwise. 70 */ 71 saidx_t 72 divbwt(const sauchar_t *T, sauchar_t *U, saidx_t *A, saidx_t n); 73 ``` 74 75 ## Example Usage 76 ```c 77 #include <stdio.h> 78 #include <stdlib.h> 79 #include <string.h> 80 81 #include <divsufsort.h> 82 83 int main() { 84 // intput data 85 char *Text = "abracadabra"; 86 int n = strlen(Text); 87 int i, j; 88 89 // allocate 90 int *SA = (int *)malloc(n * sizeof(int)); 91 92 // sort 93 divsufsort((unsigned char *)Text, SA, n); 94 95 // output 96 for(i = 0; i < n; ++i) { 97 printf("SA[%2d] = %2d: ", i, SA[i]); 98 for(j = SA[i]; j < n; ++j) { 99 printf("%c", Text[j]); 100 } 101 printf("$\n"); 102 } 103 104 // deallocate 105 free(SA); 106 107 return 0; 108 } 109 ``` 110 See the [examples](examples) directory for a few other examples. 111 112 ## Benchmarks 113 See [Benchmarks](https://github.com/y-256/libdivsufsort/blob/wiki/SACA_Benchmarks.md) page for details. 114 115 ## License 116 libdivsufsort is released under the [MIT license](LICENSE "MIT license"). 117 > The MIT License (MIT) 118 > 119 > Copyright (c) 2003 Yuta Mori All rights reserved. 120 > 121 > Permission is hereby granted, free of charge, to any person obtaining a copy 122 > of this software and associated documentation files (the "Software"), to deal 123 > in the Software without restriction, including without limitation the rights 124 > to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 125 > copies of the Software, and to permit persons to whom the Software is 126 > furnished to do so, subject to the following conditions: 127 > 128 > The above copyright notice and this permission notice shall be included in all 129 > copies or substantial portions of the Software. 130 > 131 > THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 132 > IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 133 > FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 134 > AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 135 > LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 136 > OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 137 > SOFTWARE. 138 139 ## Author 140 * Yuta Mori 141