README  2007/05/31

Oniguruma  ----   (C) K.Kosako <sndgk393 AT ybb DOT ne DOT jp>

http://www.geocities.jp/kosako3/oniguruma/

Oniguruma is a regular expression library.
Its distinguishing feature is that a different character encoding
can be specified for each regular expression object.

Supported character encodings:

  ASCII, UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE,
  EUC-JP, EUC-TW, EUC-KR, EUC-CN,
  Shift_JIS, Big5, GB18030, KOI8-R, CP1251,
  ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5,
  ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-10,
  ISO-8859-11, ISO-8859-13, ISO-8859-14, ISO-8859-15, ISO-8859-16

* GB18030: contributed by KUBO Takehiro
* CP1251:  contributed by Byte
------------------------------------------------------------

License

   BSD license.


Install

 Case 1: Unix and Cygwin platform

   1. ./configure
   2. make
   3. make install

   * uninstall

     make uninstall

   * test (ASCII/EUC-JP)

     make atest

   * configuration check

     onig-config --cflags
     onig-config --libs
     onig-config --prefix
     onig-config --exec-prefix



 Case 2: Win32 platform (VC++)

   1. copy win32\Makefile Makefile
   2. copy win32\config.h config.h
   3. nmake

      onig_s.lib:  static link library
      onig.dll:    dynamic link library

   * test (ASCII/Shift_JIS)
      4. copy win32\testc.c testc.c
      5. nmake ctest



Regular Expressions

  See doc/RE (or doc/RE.ja for Japanese).


Usage

  Include oniguruma.h in your program. (Oniguruma API)
  See doc/API for Oniguruma API.

  To disable the definition of the UChar type (== unsigned char)
  in oniguruma.h, define ONIG_ESCAPE_UCHAR_COLLISION before
  including oniguruma.h.

  To disable the definition of the regex_t type in oniguruma.h,
  define ONIG_ESCAPE_REGEX_T_COLLISION before including oniguruma.h.

  Example compile/link command line on Unix or Cygwin
  (when prefix == /usr/local):

    cc sample.c -L/usr/local/lib -lonig


  To use the static link library (onig_s.lib) on Win32,
  add the option -DONIG_EXTERN=extern to the C compiler command line.



Sample Programs

  sample/simple.c   minimal example (Oniguruma API)
  sample/names.c    example of the named group callback.
  sample/encode.c   example of several encodings.
  sample/listcap.c  example of the capture history.
  sample/posix.c    POSIX API sample.
  sample/sql.c      example of the variable meta characters.
                    (SQL-like pattern matching)

 Test Programs
  sample/syntax.c   Perl, Java and ASIS syntax test.
  sample/crnl.c     --enable-crnl-as-line-terminator test


Source Files

  oniguruma.h        Oniguruma API header file. (public)
  onig-config.in     configuration check program template.

  regenc.h           character encodings framework header file.
  regint.h           internal definitions
  regparse.h         internal definitions for regparse.c and regcomp.c
  regcomp.c          compiling and optimization functions
  regenc.c           character encodings framework.
  regerror.c         error message function
  regext.c           extended API functions. (deluxe version API)
  regexec.c          search and match functions
  regparse.c         parsing functions.
  regsyntax.c        pattern syntax functions and built-in syntax definitions.
  regtrav.c          capture history tree data traverse functions.
  regversion.c       version info function.
  st.h               hash table functions header file
  st.c               hash table functions

  oniggnu.h          GNU regex API header file. (public)
  reggnu.c           GNU regex API functions

  onigposix.h        POSIX API header file. (public)
  regposerr.c        POSIX error message function.
  regposix.c         POSIX API functions.

  enc/mktable.c      character type table generator.
  enc/ascii.c        ASCII encoding.
  enc/euc_jp.c       EUC-JP encoding.
  enc/euc_tw.c       EUC-TW encoding.
  enc/euc_kr.c       EUC-KR, EUC-CN encoding.
  enc/sjis.c         Shift_JIS encoding.
  enc/big5.c         Big5 encoding.
  enc/gb18030.c      GB18030 encoding.
  enc/koi8.c         KOI8 encoding.
  enc/koi8_r.c       KOI8-R encoding.
  enc/cp1251.c       CP1251 encoding.
  enc/iso8859_1.c    ISO-8859-1 encoding. (Latin-1)
  enc/iso8859_2.c    ISO-8859-2 encoding. (Latin-2)
  enc/iso8859_3.c    ISO-8859-3 encoding. (Latin-3)
  enc/iso8859_4.c    ISO-8859-4 encoding. (Latin-4)
  enc/iso8859_5.c    ISO-8859-5 encoding. (Cyrillic)
  enc/iso8859_6.c    ISO-8859-6 encoding. (Arabic)
  enc/iso8859_7.c    ISO-8859-7 encoding. (Greek)
  enc/iso8859_8.c    ISO-8859-8 encoding. (Hebrew)
  enc/iso8859_9.c    ISO-8859-9 encoding. (Latin-5 or Turkish)
  enc/iso8859_10.c   ISO-8859-10 encoding. (Latin-6 or Nordic)
  enc/iso8859_11.c   ISO-8859-11 encoding. (Thai)
  enc/iso8859_13.c   ISO-8859-13 encoding. (Latin-7 or Baltic Rim)
  enc/iso8859_14.c   ISO-8859-14 encoding. (Latin-8 or Celtic)
  enc/iso8859_15.c   ISO-8859-15 encoding. (Latin-9 or West European with Euro)
  enc/iso8859_16.c   ISO-8859-16 encoding.
                     (Latin-10 or South-Eastern European with Euro)
  enc/utf8.c         UTF-8 encoding.
  enc/utf16_be.c     UTF-16BE encoding.
  enc/utf16_le.c     UTF-16LE encoding.
  enc/utf32_be.c     UTF-32BE encoding.
  enc/utf32_le.c     UTF-32LE encoding.
  enc/unicode.c      Unicode information data.

  win32/Makefile     Makefile for Win32 (VC++)
  win32/config.h     config.h for Win32



ToDo

  ? case fold flag: Katakana <-> Hiragana.
  ? add ONIG_OPTION_NOTBOS/NOTEOS. (\A, \z, \Z)
 ?? \X (== \PM\pM*)
 ?? implement syntax behavior ONIG_SYN_CONTEXT_INDEP_ANCHORS.
 ?? transmission stopper. (return ONIG_STOP from match_at())

I am also thankful to Akinori MUSHA.


Mail Address: K.Kosako <sndgk393 AT ybb DOT ne DOT jp>