README.md
1 # libdivsufsort
2
3 libdivsufsort is a software library that implements a lightweight suffix array construction algorithm.
4
5 ## News
6 * 2015-03-21: The project has moved from [Google Code](http://code.google.com/p/libdivsufsort/) to [GitHub](https://github.com/y-256/libdivsufsort)
7
8 ## Introduction
9 This library provides a simple and an efficient C API to construct a suffix array and a Burrows-Wheeler transformed string from a given string over a constant-size alphabet.
10 The algorithm runs in O(n log n) worst-case time using only 5n+O(1) bytes of memory space, where n is the length of
11 the string.
12
13 ## Build requirements
14 * An ANSI C Compiler (e.g. GNU GCC)
15 * [CMake](http://www.cmake.org/ "CMake") version 2.4.2 or newer
16 * CMake-supported build tool
17
18 ## Building on GNU/Linux
19 1. Get the source code from GitHub. You can either
20 * use git to clone the repository
21 ```
22 git clone https://github.com/y-256/libdivsufsort.git
23 ```
24 * or download a [zip file](../../archive/master.zip) directly
25 2. Create a `build` directory in the package source directory.
26 ```shell
27 $ cd libdivsufsort
28 $ mkdir build
29 $ cd build
30 ```
31 3. Configure the package for your system.
32 If you want to install to a different location, change the -DCMAKE_INSTALL_PREFIX option.
33 ```shell
34 $ cmake -DCMAKE_BUILD_TYPE="Release" \
35 -DCMAKE_INSTALL_PREFIX="/usr/local" ..
36 ```
37 4. Compile the package.
38 ```shell
39 $ make
40 ```
41 5. (Optional) Install the library and header files.
42 ```shell
43 $ sudo make install
44 ```
45
46 ## API
47 ```c
48 /* Data types */
49 typedef int32_t saint_t;
50 typedef int32_t saidx_t;
51 typedef uint8_t sauchar_t;
52
53 /*
54 * Constructs the suffix array of a given string.
55 * @param T[0..n-1] The input string.
56 * @param SA[0..n-1] The output array or suffixes.
57 * @param n The length of the given string.
58 * @return 0 if no error occurred, -1 or -2 otherwise.
59 */
60 saint_t
61 divsufsort(const sauchar_t *T, saidx_t *SA, saidx_t n);
62
63 /*
64 * Constructs the burrows-wheeler transformed string of a given string.
65 * @param T[0..n-1] The input string.
66 * @param U[0..n-1] The output string. (can be T)
67 * @param A[0..n-1] The temporary array. (can be NULL)
68 * @param n The length of the given string.
69 * @return The primary index if no error occurred, -1 or -2 otherwise.
70 */
71 saidx_t
72 divbwt(const sauchar_t *T, sauchar_t *U, saidx_t *A, saidx_t n);
73 ```
74
75 ## Example Usage
76 ```c
77 #include <stdio.h>
78 #include <stdlib.h>
79 #include <string.h>
80
81 #include <divsufsort.h>
82
83 int main() {
84 // intput data
85 char *Text = "abracadabra";
86 int n = strlen(Text);
87 int i, j;
88
89 // allocate
90 int *SA = (int *)malloc(n * sizeof(int));
91
92 // sort
93 divsufsort((unsigned char *)Text, SA, n);
94
95 // output
96 for(i = 0; i < n; ++i) {
97 printf("SA[%2d] = %2d: ", i, SA[i]);
98 for(j = SA[i]; j < n; ++j) {
99 printf("%c", Text[j]);
100 }
101 printf("$\n");
102 }
103
104 // deallocate
105 free(SA);
106
107 return 0;
108 }
109 ```
110 See the [examples](examples) directory for a few other examples.
111
112 ## Benchmarks
113 See [Benchmarks](https://github.com/y-256/libdivsufsort/blob/wiki/SACA_Benchmarks.md) page for details.
114
115 ## License
116 libdivsufsort is released under the [MIT license](LICENSE "MIT license").
117 > The MIT License (MIT)
118 >
119 > Copyright (c) 2003 Yuta Mori All rights reserved.
120 >
121 > Permission is hereby granted, free of charge, to any person obtaining a copy
122 > of this software and associated documentation files (the "Software"), to deal
123 > in the Software without restriction, including without limitation the rights
124 > to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
125 > copies of the Software, and to permit persons to whom the Software is
126 > furnished to do so, subject to the following conditions:
127 >
128 > The above copyright notice and this permission notice shall be included in all
129 > copies or substantial portions of the Software.
130 >
131 > THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
132 > IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
133 > FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
134 > AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
135 > LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
136 > OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
137 > SOFTWARE.
138
139 ## Author
140 * Yuta Mori
141