README.txt

      1 README file for PCRE (Perl-compatible regular expression library)
      2 -----------------------------------------------------------------
      3 
      4 The latest release of PCRE is always available in three alternative formats
      5 from:
      6 
      7   ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz
      8   ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.bz2
      9   ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.zip
     10 
     11 There is a mailing list for discussion about the development of PCRE at
     12 pcre-dev@exim.org. You can access the archives and subscribe or manage your
     13 subscription here:
     14 
     15    https://lists.exim.org/mailman/listinfo/pcre-dev
     16 
     17 Please read the NEWS file if you are upgrading from a previous release.
     18 The contents of this README file are:
     19 
     20   The PCRE APIs
     21   Documentation for PCRE
     22   Contributions by users of PCRE
     23   Building PCRE on non-Unix-like systems
     24   Building PCRE without using autotools
     25   Building PCRE using autotools
     26   Retrieving configuration information
     27   Shared libraries
     28   Cross-compiling using autotools
     29   Using HP's ANSI C++ compiler (aCC)
     30   Compiling in Tru64 using native compilers
     31   Using Sun's compilers for Solaris
     32   Using PCRE from MySQL
     33   Making new tarballs
     34   Testing PCRE
     35   Character tables
     36   File manifest
     37 
     38 
     39 The PCRE APIs
     40 -------------
     41 
     42 PCRE is written in C, and it has its own API. There are three sets of
     43 functions, one for the 8-bit library, which processes strings of bytes, one for
     44 the 16-bit library, which processes strings of 16-bit values, and one for the
     45 32-bit library, which processes strings of 32-bit values. The distribution also
     46 includes a set of C++ wrapper functions (see the pcrecpp man page for details),
     47 courtesy of Google Inc., which can be used to call the 8-bit PCRE library from
     48 C++. Other C++ wrappers have been created from time to time. See, for example:
     49 https://github.com/YasserAsmi/regexp, which aims to be simple and similar in
     50 style to the C API.
     51 
     52 The distribution also contains a set of C wrapper functions (again, just for
     53 the 8-bit library) that are based on the POSIX regular expression API (see the
     54 pcreposix man page). These end up in the library called libpcreposix. Note that
     55 this just provides a POSIX calling interface to PCRE; the regular expressions
     56 themselves still follow Perl syntax and semantics. The POSIX API is restricted,
     57 and does not give full access to all of PCRE's facilities.
     58 
     59 The header file for the POSIX-style functions is called pcreposix.h. The
     60 official POSIX name is regex.h, but I did not want to risk possible problems
     61 with existing files of that name by distributing it that way. To use PCRE with
     62 an existing program that uses the POSIX API, pcreposix.h will have to be
     63 renamed or pointed at by a link.
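
The "pointed at by a link" approach can be sketched as follows. This is only an
illustration and not part of the PCRE distribution: the function name and the
private directory are made up, and it assumes that pcreposix.h has already been
installed somewhere on the system.

```shell
# Sketch: expose pcreposix.h under the name regex.h in a private directory,
# so that "#include <regex.h>" can resolve to PCRE's header when that
# directory is given to the compiler with -I. All names are illustrative.
link_pcreposix_as_regex() {
    pcre_include_dir=$1     # absolute path of the directory holding pcreposix.h
    compat_dir=$2           # private directory that will hold the regex.h link

    if [ ! -f "$pcre_include_dir/pcreposix.h" ]; then
        echo "pcreposix.h not found in $pcre_include_dir" >&2
        return 1
    fi
    # Use an absolute source path: a relative symlink target would be
    # resolved relative to the directory that contains the link.
    mkdir -p "$compat_dir" &&
    ln -sf "$pcre_include_dir/pcreposix.h" "$compat_dir/regex.h"
}
```

For example, after running
link_pcreposix_as_regex /usr/local/include "$PWD/compat-include"
a program could be compiled with -I"$PWD/compat-include" and linked with
-lpcreposix -lpcre. The system's own regex.h is left untouched.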
     64 
     65 If you are using the POSIX interface to PCRE and there is already a POSIX regex
     66 library installed on your system, as well as worrying about the regex.h header
     67 file (as mentioned above), you must also take care when linking programs to
     68 ensure that they link with PCRE's libpcreposix library. Otherwise they may pick
     69 up the POSIX functions of the same name from the other library.
     70 
     71 One way of avoiding this confusion is to compile PCRE with the addition of
     72 -Dregcomp=PCREregcomp (and similarly for the other POSIX functions) to the
     73 compiler flags (CFLAGS if you are using "configure" -- see below). This has the
     74 effect of renaming the functions so that the names no longer clash. Of course,
     75 you have to do the same thing for your applications, or write them using the
     76 new names.
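
As a rough sketch of the renaming described above, the flags for all four POSIX
wrapper functions can be generated in one go. The helper name is made up, and
the "PCRE" prefix for the functions other than regcomp is assumed by analogy
with the -Dregcomp=PCREregcomp example.

```shell
# Sketch: build the -D flags that rename PCRE's POSIX wrapper functions.
# regcomp, regexec, regerror and regfree are the POSIX regex functions;
# the PCRE prefix follows the -Dregcomp=PCREregcomp example in the text.
posix_rename_flags() {
    flags=
    for fn in regcomp regexec regerror regfree; do
        # Append a space separator only when flags is already non-empty.
        flags="$flags${flags:+ }-D$fn=PCRE$fn"
    done
    printf '%s\n' "$flags"
}
```

The output can be passed both to "configure" and to the application build, for
example CFLAGS="-O2 $(posix_rename_flags)" ./configure. Note that setting
CFLAGS replaces the default compiler flags, which is why -O2 is repeated here.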
     77 
     78 
     79 Documentation for PCRE
     80 ----------------------
     81 
     82 If you install PCRE in the normal way on a Unix-like system, you will end up
     83 with a set of man pages whose names all start with "pcre". The one that is just
     84 called "pcre" lists all the others. In addition to these man pages, the PCRE
     85 documentation is supplied in two other forms:
     86 
     87   1. There are files called doc/pcre.txt, doc/pcregrep.txt, and
     88      doc/pcretest.txt in the source distribution. The first of these is a
     89      concatenation of the text forms of all the section 3 man pages except
     90      the listing of pcredemo.c and those that summarize individual functions.
     91      The other two are the text forms of the section 1 man pages for the
     92      pcregrep and pcretest commands. These text forms are provided for ease of
     93      scanning with text editors or similar tools. They are installed in
     94      <prefix>/share/doc/pcre, where <prefix> is the installation prefix
     95      (defaulting to /usr/local).
     96 
     97   2. A set of files containing all the documentation in HTML form, hyperlinked
     98      in various ways, and rooted in a file called index.html, is distributed in
     99      doc/html and installed in <prefix>/share/doc/pcre/html.
    100 
    101 Users of PCRE have contributed files containing the documentation for various
    102 releases in CHM format. These can be found in the Contrib directory of the FTP
    103 site (see next section).
    104 
    105 
    106 Contributions by users of PCRE
    107 ------------------------------
    108 
    109 You can find contributions from PCRE users in the directory
    110 
    111   ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/Contrib
    112 
    113 There is a README file giving brief descriptions of what they are. Some are
    114 complete in themselves; others are pointers to URLs containing relevant files.
    115 Some of this material is likely to be well out-of-date. Several of the earlier
    116 contributions provided support for compiling PCRE on various flavours of
    117 Windows (I myself do not use Windows). Nowadays there is more Windows support
    118 in the standard distribution, so these contributions have been archived.
    119 
    120 A PCRE user maintains downloadable Windows binaries of the pcregrep and
    121 pcretest programs here:
    122 
    123   http://www.rexegg.com/pcregrep-pcretest.html
    124 
    125 
    126 Building PCRE on non-Unix-like systems
    127 --------------------------------------
    128 
    129 For a non-Unix-like system, please read the comments in the file
    130 NON-AUTOTOOLS-BUILD, though if your system supports the use of "configure" and
    131 "make" you may be able to build PCRE using autotools in the same way as for
    132 many Unix-like systems.
    133 
    134 PCRE can also be configured using the GUI facility provided by CMake's
    135 cmake-gui command. This creates Makefiles, solution files, etc. The file
    136 NON-AUTOTOOLS-BUILD has information about CMake.
    137 
    138 PCRE has been compiled on many different operating systems. It should be
    139 straightforward to build PCRE on any system that has a Standard C compiler and
    140 library, because it uses only Standard C functions.
    141 
    142 
    143 Building PCRE without using autotools
    144 -------------------------------------
    145 
    146 The use of autotools (in particular, libtool) is problematic in some
    147 environments, even some that are Unix or Unix-like. See the NON-AUTOTOOLS-BUILD
    148 file for ways of building PCRE without using autotools.
    149 
    150 
    151 Building PCRE using autotools
    152 -----------------------------
    153 
    154 If you are using HP's ANSI C++ compiler (aCC), please see the special note
    155 in the section entitled "Using HP's ANSI C++ compiler (aCC)" below.
    156 
    157 The following instructions assume the use of the widely used "configure; make;
    158 make install" (autotools) process.
    159 
    160 To build PCRE on a system that supports autotools, first run the "configure"
    161 command from the PCRE distribution directory, with your current directory set
    162 to the directory where you want the files to be created. This command is a
    163 standard GNU "autoconf" configuration script, for which generic instructions
    164 are supplied in the file INSTALL.
    165 
    166 Most commonly, people build PCRE within its own distribution directory, and in
    167 this case, on many systems, just running "./configure" is sufficient. However,
    168 the usual methods of changing standard defaults are available. For example:
    169 
    170 CFLAGS='-O2 -Wall' ./configure --prefix=/opt/local
    171 
    172 This command specifies that the C compiler should be run with the flags '-O2
    173 -Wall' instead of the default, and that "make install" should install PCRE
    174 under /opt/local instead of the default /usr/local.
    175 
    176 If you want to build in a different directory, just run "configure" with that
    177 directory as current. For example, suppose you have unpacked the PCRE source
    178 into /source/pcre/pcre-xxx, but you want to build it in /build/pcre/pcre-xxx:
    179 
    180 cd /build/pcre/pcre-xxx
    181 /source/pcre/pcre-xxx/configure
    182 
    183 PCRE is written in C and is normally compiled as a C library. However, it is
    184 possible to build it as a C++ library, though the provided building apparatus
    185 does not have any features to support this.
    186 
    187 There are some optional features that can be included or omitted from the PCRE
    188 library. They are also documented in the pcrebuild man page.
    189 
    190 . By default, both shared and static libraries are built. You can change this
    191   by adding one of these options to the "configure" command:
    192 
    193   --disable-shared
    194   --disable-static
    195 
    196   (See also "Shared libraries" below.)
    197 
    198 . By default, only the 8-bit library is built. If you add --enable-pcre16 to
    199   the "configure" command, the 16-bit library is also built. If you add
    200   --enable-pcre32 to the "configure" command, the 32-bit library is also built.
    201   If you want only the 16-bit or 32-bit library, use --disable-pcre8 to disable
    202   building the 8-bit library.
    203 
    204 . If you are building the 8-bit library and want to suppress the building of
    205   the C++ wrapper library, you can add --disable-cpp to the "configure"
    206   command. Otherwise, when "configure" is run without --disable-pcre8, it will
    207   try to find a C++ compiler and C++ header files, and if it succeeds, it will
    208   try to build the C++ wrapper.
    209 
    210 . If you want to include support for just-in-time compiling, which can give
    211   large performance improvements on certain platforms, add --enable-jit to the
    212   "configure" command. This support is available only for certain hardware
    213   architectures. If you try to enable it on an unsupported architecture, there
    214   will be a compile time error.
    215 
    216 . When JIT support is enabled, pcregrep automatically makes use of it, unless
    217   you add --disable-pcregrep-jit to the "configure" command.
    218 
    219 . If you want to make use of the support for UTF-8 Unicode character strings in
    220   the 8-bit library, or UTF-16 Unicode character strings in the 16-bit library,
    221   or UTF-32 Unicode character strings in the 32-bit library, you must add
    222   --enable-utf to the "configure" command. Without it, the code for handling
    223   UTF-8, UTF-16 and UTF-32 is not included in the relevant library. Even
    224   when --enable-utf is included, the use of a UTF encoding still has to be
    225   enabled by an option at run time. When PCRE is compiled with this option, its
    226   input can only be either ASCII or UTF-8/16/32, even when running on EBCDIC
    227   platforms. It is not possible to use both --enable-utf and --enable-ebcdic at
    228   the same time.
    229 
    230 . There are no separate options for enabling UTF-8, UTF-16 and UTF-32
    231   independently because that would allow ridiculous settings such as requesting
    232   UTF-16 support while building only the 8-bit library. However, the option
    233   --enable-utf8 is retained for backwards compatibility with earlier releases
    234   that did not support 16-bit or 32-bit character strings. It is synonymous with
    235   --enable-utf. It is not possible to configure one library with UTF support
    236   and the other without in the same configuration.
    237 
    238 . If, in addition to support for UTF-8/16/32 character strings, you want to
    239   include support for the \P, \p, and \X sequences that recognize Unicode
    240   character properties, you must add --enable-unicode-properties to the
    241   "configure" command. This adds about 30K to the size of the library (in the
    242   form of a property table); only the basic two-letter properties such as Lu
    243   are supported.
    244 
    245 . You can build PCRE to recognize either CR or LF or the sequence CRLF or any
    246   of the preceding, or any of the Unicode newline sequences as indicating the
    247   end of a line. Whatever you specify at build time is the default; the caller
    248   of PCRE can change the selection at run time. The default newline indicator
    249   is a single LF character (the Unix standard). You can specify the default
    250   newline indicator by adding --enable-newline-is-cr or --enable-newline-is-lf
    251   or --enable-newline-is-crlf or --enable-newline-is-anycrlf or
    252   --enable-newline-is-any to the "configure" command, respectively.
    253 
    254   If you specify --enable-newline-is-cr or --enable-newline-is-crlf, some of
    255   the standard tests will fail, because the lines in the test files end with
    256   LF. Even if the files are edited to change the line endings, there are likely
    257   to be some failures. With --enable-newline-is-anycrlf or
    258   --enable-newline-is-any, many tests should succeed, but there may be some
    259   failures.
    260 
    261 . By default, the sequence \R in a pattern matches any Unicode line ending
    262   sequence. This is independent of the option specifying what PCRE considers to
    263   be the end of a line (see above). However, the caller of PCRE can restrict \R
    264   to match only CR, LF, or CRLF. You can make this the default by adding
    265   --enable-bsr-anycrlf to the "configure" command (bsr = "backslash R").
    266 
    267 . When called via the POSIX interface, PCRE uses malloc() to get additional
    268   storage for processing capturing parentheses if there are more than 10 of
    269   them in a pattern. You can increase this threshold by setting, for example,
    270 
    271   --with-posix-malloc-threshold=20
    272 
    273   on the "configure" command.
    274 
    275 . PCRE has a counter that limits the depth of nesting of parentheses in a
    276   pattern. This limits the amount of system stack that a pattern uses when it
    277   is compiled. The default is 250, but you can change it by setting, for
    278   example,
    279 
    280   --with-parens-nest-limit=500
    281 
    282 . PCRE has a counter that can be set to limit the amount of resources it uses
    283   when matching a pattern. If the limit is exceeded during a match, the match
    284   fails. The default is ten million. You can change the default by setting, for
    285   example,
    286 
    287   --with-match-limit=500000
    288 
    289   on the "configure" command. This is just the default; individual calls to
    290   pcre_exec() can supply their own value. There is more discussion on the
    291   pcreapi man page.
    292 
    293 . There is a separate counter that limits the depth of recursive function calls
    294   during a matching process. This also has a default of ten million, which is
    295   essentially "unlimited". You can change the default by setting, for example,
    296 
    297   --with-match-limit-recursion=500000
    298 
    299   Recursive function calls use up the runtime stack; running out of stack can
    300   cause programs to crash in strange ways. There is a discussion about stack
    301   sizes in the pcrestack man page.
    302 
    303 . The default maximum compiled pattern size is around 64K. You can increase
    304   this by adding --with-link-size=3 to the "configure" command. In the 8-bit
    305   library, PCRE then uses three bytes instead of two for offsets to different
    306   parts of the compiled pattern. In the 16-bit library, --with-link-size=3 is
    307   the same as --with-link-size=4, which (in both libraries) uses four-byte
    308   offsets. Increasing the internal link size reduces performance. In the 32-bit
    309   library, the only supported link size is 4.
    310 
    311 . You can build PCRE so that its internal match() function that is called from
    312   pcre_exec() does not call itself recursively. Instead, it uses memory blocks
    313   obtained from the heap via the special functions pcre_stack_malloc() and
    314   pcre_stack_free() to save data that would otherwise be saved on the stack. To
    315   build PCRE like this, use
    316 
    317   --disable-stack-for-recursion
    318 
    319   on the "configure" command. PCRE runs more slowly in this mode, but it may be
    320   necessary in environments with limited stack sizes. This applies only to the
    321   normal execution of the pcre_exec() function; if JIT support is being
    322   successfully used, it is not relevant. Equally, it does not apply to
    323   pcre_dfa_exec(), which does not use deeply nested recursion. There is a
    324   discussion about stack sizes in the pcrestack man page.
    325 
    326 . For speed, PCRE uses four tables for manipulating and identifying characters
    327   whose code point values are less than 256. By default, it uses a set of
    328   tables for ASCII encoding that is part of the distribution. If you specify
    329 
    330   --enable-rebuild-chartables
    331 
    332   a program called dftables is compiled and run in the default C locale when
    333   you run "make". It builds a source file called pcre_chartables.c. If you do
    334   not specify this option, pcre_chartables.c is created as a copy of
    335   pcre_chartables.c.dist. See "Character tables" below for further information.
    336 
    337 . It is possible to compile PCRE for use on systems that use EBCDIC as their
    338   character code (as opposed to ASCII/Unicode) by specifying
    339 
    340   --enable-ebcdic
    341 
    342   This automatically implies --enable-rebuild-chartables (see above). However,
    343   when PCRE is built this way, it always operates in EBCDIC. It cannot support
    344   both EBCDIC and UTF-8/16/32. There is a second option, --enable-ebcdic-nl25,
    345   which specifies that the code value for the EBCDIC NL character is 0x25
    346   instead of the default 0x15.
    347 
    348 . In environments where valgrind is installed, if you specify
    349 
    350   --enable-valgrind
    351 
    352   PCRE will use valgrind annotations to mark certain memory regions as
    353   unaddressable. This allows it to detect invalid memory accesses, and is
    354   mostly useful for debugging PCRE itself.
    355 
    356 . In environments where the gcc compiler is used and lcov version 1.6 or above
    357   is installed, if you specify
    358 
    359   --enable-coverage
    360 
    361   the build process implements a code coverage report for the test suite. The
    362   report is generated by running "make coverage". If ccache is installed on
    363   your system, it must be disabled when building PCRE for coverage reporting.
    364   You can do this by setting the environment variable CCACHE_DISABLE=1 before
    365   running "make" to build PCRE. There is more information about coverage
    366   reporting in the "pcrebuild" documentation.
    367 
    368 . The pcregrep program currently supports only 8-bit data files, and so
    369   requires the 8-bit PCRE library. It is possible to compile pcregrep to use
    370   libz and/or libbz2, in order to read .gz and .bz2 files (respectively), by
    371   specifying one or both of
    372 
    373   --enable-pcregrep-libz
    374   --enable-pcregrep-libbz2
    375 
    376   Of course, the relevant libraries must be installed on your system.
    377 
    378 . The default size (in bytes) of the internal buffer used by pcregrep can be
    379   set by, for example:
    380 
    381   --with-pcregrep-bufsize=51200
    382 
    383   The value must be a plain integer. The default is 20480.
    384 
    385 . It is possible to compile pcretest so that it links with the libreadline
    386   or libedit libraries, by specifying, respectively,
    387 
    388   --enable-pcretest-libreadline or --enable-pcretest-libedit
    389 
    390   If this is done, when pcretest's input is from a terminal, it reads it using
    391   the readline() function. This provides line-editing and history facilities.
    392   Note that libreadline is GPL-licenced, so if you distribute a binary of
    393   pcretest linked in this way, there may be licensing issues. These can be
    394   avoided by linking with libedit (which has a BSD licence) instead.
    395 
    396   Enabling libreadline causes the -lreadline option to be added to the pcretest
    397   build. In many operating environments with a system-installed readline
    398   library this is sufficient. However, in some environments (e.g. if an
    399   unmodified distribution version of readline is in use), it may be necessary
    400   to specify something like LIBS="-lncurses" as well. This is because, to quote
    401   the readline INSTALL, "Readline uses the termcap functions, but does not link
    402   with the termcap or curses library itself, allowing applications which link
    403   with readline the to choose an appropriate library." If you get error
    404   messages about missing functions tgetstr, tgetent, tputs, tgetflag, or tgoto,
    405   this is the problem, and linking with the ncurses library should fix it.
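
Several of the options above are often combined. The following is a minimal
sketch of a helper that assembles a "configure" option list and rejects the
one combination that the text rules out (--enable-utf with --enable-ebcdic).
The function name, its arguments and the short keywords are made up for this
illustration; only option names that appear in this README are emitted.

```shell
# Sketch: assemble "configure" options from a few settings.
#   $1  comma-separated library widths, e.g. "8,16"
#   $2  default newline: cr, lf, crlf, anycrlf or any
#   $3  space-separated extras: utf, ucp, jit, ebcdic (may be empty)
pcre_configure_args() {
    widths=$1
    newline=$2
    extras=$3
    args=

    case ",$widths," in
        *,8,*|*,16,*|*,32,*) ;;
        *) echo "no library width selected" >&2; return 1 ;;
    esac
    # The 8-bit library is built by default, so it only needs a flag
    # when it is NOT wanted.
    case ",$widths," in *,8,*)  ;; *) args="$args --disable-pcre8" ;; esac
    case ",$widths," in *,16,*) args="$args --enable-pcre16" ;; esac
    case ",$widths," in *,32,*) args="$args --enable-pcre32" ;; esac

    case $newline in
        cr|lf|crlf|anycrlf|any) args="$args --enable-newline-is-$newline" ;;
        *) echo "unknown newline setting: $newline" >&2; return 1 ;;
    esac

    # UTF support and EBCDIC cannot be enabled at the same time.
    case " $extras " in
        *" utf "*)
            case " $extras " in
                *" ebcdic "*)
                    echo "--enable-utf and --enable-ebcdic cannot be combined" >&2
                    return 1 ;;
            esac ;;
    esac

    for word in $extras; do
        case $word in
            utf)    args="$args --enable-utf" ;;
            ucp)    args="$args --enable-unicode-properties" ;;
            jit)    args="$args --enable-jit" ;;
            ebcdic) args="$args --enable-ebcdic" ;;
            *) echo "unknown extra: $word" >&2; return 1 ;;
        esac
    done

    printf '%s\n' "${args# }"    # strip the leading separator space
}
```

A possible use is ./configure $(pcre_configure_args 8,16 lf "utf jit"), which
expands to ./configure --enable-pcre16 --enable-newline-is-lf --enable-utf
--enable-jit.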
    406 
    407 The "configure" script builds the following files for the basic C library:
    408 
    409 . Makefile             the makefile that builds the library
    410 . config.h             build-time configuration options for the library
    411 . pcre.h               the public PCRE header file
    412 . pcre-config          script that shows the building settings such as CFLAGS
    413                          that were set for "configure"
    414 . libpcre.pc         ) data for the pkg-config command
    415 . libpcre16.pc       )
    416 . libpcre32.pc       )
    417 . libpcreposix.pc    )
    418 . libtool              script that builds shared and/or static libraries
    419 
    420 Versions of config.h and pcre.h are distributed in the PCRE tarballs under the
    421 names config.h.generic and pcre.h.generic. These are provided for those who
    422 have to build PCRE without using "configure" or CMake. If you use "configure"
    423 or CMake, the .generic versions are not used.
    424 
    425 When building the 8-bit library, if a C++ compiler is found, the following
    426 files are also built:
    427 
    428 . libpcrecpp.pc        data for the pkg-config command
    429 . pcrecpparg.h         header file for calling PCRE via the C++ wrapper
    430 . pcre_stringpiece.h   header for the C++ "stringpiece" functions
    431 
    432 The "configure" script also creates config.status, which is an executable
    433 script that can be run to recreate the configuration, and config.log, which
    434 contains compiler output from tests that "configure" runs.
    435 
    436 Once "configure" has run, you can run "make". This builds the libraries
    437 libpcre, libpcre16 and/or libpcre32, and a test program called pcretest. If you
    438 enabled JIT support with --enable-jit, a test program called pcre_jit_test is
    439 built as well.
    440 
    441 If the 8-bit library is built, libpcreposix and the pcregrep command are also
    442 built, and if a C++ compiler was found on your system, and you did not disable
    443 it with --disable-cpp, "make" builds the C++ wrapper library, which is called
    444 libpcrecpp, as well as some test programs called pcrecpp_unittest,
    445 pcre_scanner_unittest, and pcre_stringpiece_unittest.
    446 
    447 The command "make check" runs all the appropriate tests. Details of the PCRE
    448 tests are given below in a separate section of this document.
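
The steps described so far can be strung together as in the sketch below. It is
a dry run: each command is printed instead of executed, so that the sequence
can be inspected first. The helper names "run" and "build_pcre" are made up for
this illustration.

```shell
# Dry-run sketch of the configure / make / make check / make install
# sequence. Change the body of run() to "$@" to execute the steps for real.
run() { echo "+ $*"; }

build_pcre() {
    prefix=${1:-/usr/local}    # installation prefix; /usr/local is the default
    # Each step runs only if the previous one succeeded.
    run ./configure --prefix="$prefix" &&
    run make &&
    run make check &&
    run make install
}
```

Running the tests before "make install" means that a build that fails its
tests does not get installed.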
    449 
    450 You can use "make install" to install PCRE into live directories on your
    451 system. The following are installed (file names are all relative to the
    452 <prefix> that is set when "configure" is run):
    453 
    454   Commands (bin):
    455     pcretest
    456     pcregrep (if 8-bit support is enabled)
    457     pcre-config
    458 
    459   Libraries (lib):
    460     libpcre16     (if 16-bit support is enabled)
    461     libpcre32     (if 32-bit support is enabled)
    462     libpcre       (if 8-bit support is enabled)
    463     libpcreposix  (if 8-bit support is enabled)
    464     libpcrecpp    (if 8-bit and C++ support is enabled)
    465 
    466   Configuration information (lib/pkgconfig):
    467     libpcre16.pc
    468     libpcre32.pc
    469     libpcre.pc
    470     libpcreposix.pc
    471     libpcrecpp.pc (if C++ support is enabled)
    472 
    473   Header files (include):
    474     pcre.h
    475     pcreposix.h
    476     pcre_scanner.h      )
    477     pcre_stringpiece.h  ) if C++ support is enabled
    478     pcrecpp.h           )
    479     pcrecpparg.h        )
    480 
    481   Man pages (share/man/man{1,3}):
    482     pcregrep.1
    483     pcretest.1
    484     pcre-config.1
    485     pcre.3
    486     pcre*.3 (lots more pages, all starting "pcre")
    487 
    488   HTML documentation (share/doc/pcre/html):
    489     index.html
    490     *.html (lots more pages, hyperlinked from index.html)
    491 
    492   Text file documentation (share/doc/pcre):
    493     AUTHORS
    494     COPYING
    495     ChangeLog
    496     LICENCE
    497     NEWS
    498     README
    499     pcre.txt         (a concatenation of the man(3) pages)
    500     pcretest.txt     the pcretest man page
    501     pcregrep.txt     the pcregrep man page
    502     pcre-config.txt  the pcre-config man page
    503 
    504 If you want to remove PCRE from your system, you can run "make uninstall".
    505 This removes all the files that "make install" installed. However, it does not
    506 remove any directories, because these are often shared with other programs.
    507 
    508 
    509 Retrieving configuration information
    510 ------------------------------------
    511 
    512 Running "make install" installs the command pcre-config, which can be used to
    513 recall information about the PCRE configuration and installation. For example:
    514 
    515   pcre-config --version
    516 
    517 prints the version number, and
    518 
    519   pcre-config --libs
    520 
    521 outputs information about where the library is installed. This command can be
    522 included in makefiles for programs that use PCRE, saving the programmer from
    523 having to remember too many details.
    524 
    525 The pkg-config command is another system for saving and retrieving information
    526 about installed libraries. Instead of separate commands for each library, a
    527 single command is used. For example:
    528 
    529   pkg-config --cflags libpcre
    530 
    531 The data is held in *.pc files that are installed in a directory called
    532 <prefix>/lib/pkgconfig.
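
As a sketch of how a build script might use these commands, the helper below
prints the compiler and linker flags for the 8-bit library, trying pcre-config
first and pkg-config second. The function name is made up; the pkg-config
module name "libpcre" is taken from the name of the installed libpcre.pc file.

```shell
# Sketch: print compile and link flags for the 8-bit PCRE library.
pcre_build_flags() {
    if command -v pcre-config >/dev/null 2>&1; then
        # pcre-config reports the two sets of flags separately.
        printf '%s %s\n' "$(pcre-config --cflags)" "$(pcre-config --libs)"
    elif command -v pkg-config >/dev/null 2>&1; then
        pkg-config --cflags --libs libpcre
    else
        echo "neither pcre-config nor pkg-config was found" >&2
        return 1
    fi
}
```

A program could then be built with something like
cc -o demo demo.c $(pcre_build_flags), where demo.c stands for any source file
that includes pcre.h.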
    533 
    534 
    535 Shared libraries
    536 ----------------
    537 
    538 The default distribution builds PCRE as shared libraries and static libraries,
    539 as long as the operating system supports shared libraries. Shared library
    540 support relies on the "libtool" script which is built as part of the
    541 "configure" process.
    542 
    543 The libtool script is used to compile and link both shared and static
    544 libraries. They are placed in a subdirectory called .libs when they are newly
    545 built. The programs pcretest and pcregrep are built to use these uninstalled
    546 libraries (by means of wrapper scripts in the case of shared libraries). When
    547 you use "make install" to install shared libraries, pcregrep and pcretest are
    548 automatically re-built to use the newly installed shared libraries before being
    549 installed themselves. However, the versions left in the build directory still
    550 use the uninstalled libraries.
    551 
    552 To build PCRE using static libraries only you must use --disable-shared when
    553 configuring it. For example:
    554 
    555 ./configure --prefix=/usr/gnu --disable-shared
    556 
    557 Then run "make" in the usual way. Similarly, you can use --disable-static to
    558 build only shared libraries.


Cross-compiling using autotools
-------------------------------

You can specify CC and CFLAGS in the normal way to the "configure" command, in
order to cross-compile PCRE for some other host. However, you should NOT
specify --enable-rebuild-chartables, because if you do, the dftables.c source
file is compiled and run on the local host, in order to generate the inbuilt
character tables (the pcre_chartables.c file). This will probably not work,
because dftables.c needs to be compiled with the local compiler, not the cross
compiler.

When --enable-rebuild-chartables is not specified, pcre_chartables.c is created
by making a copy of pcre_chartables.c.dist, which is a default set of tables
that assumes ASCII code. Cross-compiling with the default tables should not be
a problem.

If you need to modify the character tables when cross-compiling, you should
move pcre_chartables.c.dist out of the way, then compile dftables.c by hand and
run it on the local host to make a new version of pcre_chartables.c.dist.
Then when you cross-compile PCRE this new version of the tables will be used.


Using HP's ANSI C++ compiler (aCC)
----------------------------------

Unless C++ support is disabled by specifying the "--disable-cpp" option of the
"configure" script, you must include the "-AA" option in the CXXFLAGS
environment variable in order for the C++ components to compile correctly.

Also, note that the aCC compiler on PA-RISC platforms may have a defect whereby
needed libraries fail to get included when specifying the "-AA" compiler
option. If you experience unresolved symbols when linking the C++ programs,
use the workaround of specifying the following environment variable prior to
running the "configure" script:

  CXXLDFLAGS="-lstd_v2 -lCsup_v2"


Compiling in Tru64 using native compilers
-----------------------------------------

The following error may occur when compiling with native compilers in the Tru64
operating system:

  CXX    libpcrecpp_la-pcrecpp.lo
cxx: Error: /usr/lib/cmplrs/cxx/V7.1-006/include/cxx/iosfwd, line 58: #error
          directive: "cannot include iosfwd -- define __USE_STD_IOSTREAM to
          override default - see section 7.1.2 of the C++ Using Guide"
#error "cannot include iosfwd -- define __USE_STD_IOSTREAM to override default
- see section 7.1.2 of the C++ Using Guide"

This may be followed by other errors, complaining that 'namespace "std" has no
member'. The solution to this is to add the line

  #define __USE_STD_IOSTREAM 1

to the config.h file.


Using Sun's compilers for Solaris
---------------------------------

A user reports that the following configurations work on Solaris 9 sparcv9 and
Solaris 9 x86 (32-bit):

  Solaris 9 sparcv9: ./configure --disable-cpp CC=/bin/cc CFLAGS="-m64 -g"
  Solaris 9 x86:     ./configure --disable-cpp CC=/bin/cc CFLAGS="-g"


Using PCRE from MySQL
---------------------

On systems where both PCRE and MySQL are installed, it is possible to make use
of PCRE from within MySQL, as an alternative to the built-in pattern matching.
There is a web page that tells you how to do this:

  http://www.mysqludf.org/lib_mysqludf_preg/index.php


Making new tarballs
-------------------

The command "make dist" creates three PCRE tarballs, in tar.gz, tar.bz2, and
zip formats. The command "make distcheck" does the same, but then does a trial
build of the new distribution to ensure that it works.

If you have modified any of the man page sources in the doc directory, you
should first run the PrepareRelease script before making a distribution. This
script creates the .txt and HTML forms of the documentation from the man pages.


Testing PCRE
------------

To test the basic PCRE library on a Unix-like system, run the RunTest script.
There is another script called RunGrepTest that tests the options of the
pcregrep command. If the C++ wrapper library is built, three test programs
called pcrecpp_unittest, pcre_scanner_unittest, and pcre_stringpiece_unittest
are also built. When JIT support is enabled, another test program called
pcre_jit_test is built.

Both the scripts and all the program tests are run when you invoke "make check"
or "make test". For other environments, see the instructions in
NON-AUTOTOOLS-BUILD.

The RunTest script runs the pcretest test program (which is documented in its
own man page) on each of the relevant testinput files in the testdata
directory, and compares the output with the contents of the corresponding
testoutput files. RunTest uses a file called testtry to hold the main output
from pcretest. Other files whose names begin with "test" are used as working
files in some tests.

Some tests are relevant only when certain build-time options were selected. For
example, the tests for UTF-8/16/32 support are run only if --enable-utf was
used. RunTest outputs a comment when it skips a test.

Many of the tests that are not skipped are run up to three times. The second
run forces pcre_study() to be called for all patterns except for a few in some
tests that are marked "never study" (see the pcretest program for how this is
done). If JIT support is available, the non-DFA tests are run a third time,
this time with a forced pcre_study() with the PCRE_STUDY_JIT_COMPILE option.
This testing can be suppressed by putting "nojit" on the RunTest command line.

The entire set of tests is run once for each of the 8-bit, 16-bit and 32-bit
libraries that are enabled. If you want to run just one set of tests, call
RunTest with either the -8, -16 or -32 option.

If valgrind is installed, you can run the tests under it by putting "valgrind"
on the RunTest command line. To run pcretest on just one or more specific test
files, give their numbers as arguments to RunTest, for example:

  RunTest 2 7 11

You can also specify ranges of tests such as 3-6 or 3- (meaning 3 to the
end), or a number preceded by ~ to exclude a test. For example:

  RunTest 3-15 ~10

This runs tests 3 to 15, excluding test 10. Giving just ~13 runs all the tests
except test 13. Whatever order the arguments are in, the tests are always run
in numerical order.
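
The selection rules above can be sketched in C. This is only an illustration of
the argument forms described here, not the code that RunTest itself uses
(RunTest is a shell script). The function name select_tests and the limit
MAX_TEST are assumptions made for the example; 26 is taken from the highest
test number described in this file.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TEST 26   /* assumption: tests are numbered 1..26 */

/* Set selected[n] to 1 for every test chosen by the arguments.
   Accepted forms: "N", "N-M", "N-" (N to the end) and "~N" (exclude N).
   If no test is explicitly included, all tests are selected before the
   exclusions are applied. Returns 0 on success, -1 on a malformed argument. */
int select_tests(int argc, const char *const *argv,
                 int selected[MAX_TEST + 1])
{
    int excluded[MAX_TEST + 1] = {0};
    int any_included = 0;

    assert(selected != NULL);
    memset(selected, 0, (MAX_TEST + 1) * sizeof selected[0]);

    for (int i = 0; i < argc; i++) {
        const char *arg = argv[i];
        char *end;
        long lo, hi;

        if (arg[0] == '~') {                  /* exclusion: ~N */
            lo = strtol(arg + 1, &end, 10);
            if (end == arg + 1 || *end != '\0' || lo < 1 || lo > MAX_TEST)
                return -1;
            excluded[lo] = 1;
            continue;
        }

        lo = strtol(arg, &end, 10);           /* single test or range start */
        if (end == arg || lo < 1 || lo > MAX_TEST)
            return -1;
        hi = lo;
        if (*end == '-') {
            const char *rest = end + 1;
            if (*rest == '\0') {
                hi = MAX_TEST;                /* open range: N- */
            } else {
                hi = strtol(rest, &end, 10);
                if (end == rest || *end != '\0' || hi < lo || hi > MAX_TEST)
                    return -1;
            }
        } else if (*end != '\0') {
            return -1;
        }
        for (long n = lo; n <= hi; n++)
            selected[n] = 1;
        any_included = 1;
    }

    for (int n = 1; n <= MAX_TEST; n++) {
        if (!any_included) selected[n] = 1;   /* only exclusions were given */
        if (excluded[n]) selected[n] = 0;
    }
    return 0;
}
```

Because the result is an array indexed by test number, walking it from 1 to
MAX_TEST visits the selected tests in numerical order, whatever order the
arguments were given in.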

You can also call RunTest with the single argument "list" to cause it to output
a list of tests.

The first test file can be fed directly into the perltest.pl script to check
that Perl gives the same results. The only difference you should see is in the
first few lines, where the Perl version is given instead of the PCRE version.

The second set of tests checks pcre_fullinfo(), pcre_study(),
pcre_copy_substring(), pcre_get_substring(), pcre_get_substring_list(), error
detection, and run-time flags that are specific to PCRE, as well as the POSIX
wrapper API. It also uses the debugging flags to check some of the internals of
pcre_compile().

If you build PCRE with a locale setting that is not the standard C locale, the
character tables may be different (see next paragraph). In some cases, this may
cause failures in the second set of tests. For example, in a locale where the
isprint() function yields TRUE for characters in the range 128-255, the use of
[:ascii:] inside a character class defines a different set of characters, and
this shows up in this test as a difference in the compiled code, which is being
listed for checking. Where the comparison test output contains [\x00-\x7f], the
actual output will contain [\x00-\xff], and similarly in some other cases. This
is not a bug in PCRE.
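
Whether a given locale behaves in the way just described can be checked with a
few lines of C. The following is a minimal sketch: the function name
count_high_printable is hypothetical, and which locale names are accepted
depends on the system.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>

/* Count the code points in the range 128-255 for which isprint() is true
   under the named LC_CTYPE locale. Returns -1 if the locale cannot be set.
   Note that LC_CTYPE is left set to locale_name on return. */
int count_high_printable(const char *locale_name)
{
    int count = 0;

    assert(locale_name != NULL);
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    for (int c = 128; c < 256; c++) {
        if (isprint(c))
            count++;
    }
    return count;
}
```

Passing "C" selects the standard C locale, for which a result of zero is
expected. Passing "" selects the locale given by the LC_xxx environment
variables; a non-zero result there suggests that the second set of tests may
show the differences described above.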

The third set of tests checks pcre_maketables(), the facility for building a
set of character tables for a specific locale and using them instead of the
default tables. The tests make use of the "fr_FR" (French) locale. Before
running the test, the script checks for the presence of this locale by running
the "locale" command. If that command fails, or if it doesn't include "fr_FR"
in the list of available locales, the third test cannot be run, and a comment
is output to say why. If running this test produces instances of the error

  ** Failed to set locale "fr_FR"

in the comparison output, it means that locale is not available on your system,
despite being listed by "locale". This does not mean that PCRE is broken.

[If you are trying to run this test on Windows, you may be able to get it to
work by changing "fr_FR" to "french" everywhere it occurs. Alternatively, use
RunTest.bat. The version of RunTest.bat included with PCRE 7.4 and above uses
Windows versions of test 2. More info on using RunTest.bat is included in the
document entitled NON-UNIX-USE.]

The fourth test checks the UTF-8/16/32 support, and the fifth test checks error
handling and internal UTF features of PCRE that are not relevant to Perl. The
sixth and seventh tests do the same for Unicode character properties support.

The eighth, ninth, and tenth tests check the pcre_dfa_exec() alternative
matching function, in non-UTF-8/16/32 mode, UTF-8/16/32 mode, and UTF-8/16/32
mode with Unicode property support, respectively.

The eleventh test checks some internal offsets and code size features; it is
run only when the default "link size" of 2 is set (in other cases the sizes
change) and when Unicode property support is enabled.

The twelfth test is run only when JIT support is available, and the thirteenth
test is run only when JIT support is not available. They test some JIT-specific
features such as information output from pcretest about JIT compilation.

The fourteenth, fifteenth, and sixteenth tests are run only in 8-bit mode, and
the seventeenth, eighteenth, and nineteenth tests are run only in 16/32-bit
mode. These are tests that generate different output in the two modes. They are
for general cases, UTF-8/16/32 support, and Unicode property support,
respectively.

The twentieth test is run only in 16/32-bit mode. It tests some specific
16/32-bit features of the DFA matching engine.

The twenty-first and twenty-second tests are run only in 16/32-bit mode, when
the link size is set to 2 for the 16-bit library. They test reloading
pre-compiled patterns.

The twenty-third and twenty-fourth tests are run only in 16-bit mode. They are
for general cases, and UTF-16 support, respectively.

The twenty-fifth and twenty-sixth tests are run only in 32-bit mode. They are
for general cases, and UTF-32 support, respectively.


Character tables
----------------

For speed, PCRE uses four tables for manipulating and identifying characters
whose code point values are less than 256. The final argument of the
pcre_compile() function is a pointer to a block of memory containing the
concatenated tables. A call to pcre_maketables() can be used to generate a set
of tables in the current locale. If the final argument for pcre_compile() is
passed as NULL, a set of default tables that is built into the binary is used.

The source file called pcre_chartables.c contains the default set of tables. By
default, this is created as a copy of pcre_chartables.c.dist, which contains
tables for ASCII coding. However, if --enable-rebuild-chartables is specified
for ./configure, a different version of pcre_chartables.c is built by the
program dftables (compiled from dftables.c), which uses the ANSI C character
handling functions such as isalnum(), isalpha(), isupper(), islower(), etc. to
build the table sources. This means that the default C locale which is set for
your system will control the contents of these default tables. You can change
the default tables by editing pcre_chartables.c and then re-building PCRE. If
you do this, you should take care to ensure that the file does not get
automatically re-generated. The best way to do this is to move
pcre_chartables.c.dist out of the way and replace it with your customized
tables.

When the dftables program is run as a result of --enable-rebuild-chartables,
it uses the default C locale that is set on your system. It does not pay
attention to the LC_xxx environment variables. In other words, it uses the
system's default locale rather than whatever the compiling user happens to have
set. If you really do want to build a source set of character tables in a
locale that is specified by the LC_xxx variables, you can run the dftables
program by hand with the -L option. For example:

  ./dftables -L pcre_chartables.c.special

The first two 256-byte tables provide lower casing and case flipping functions,
respectively. The next table consists of three 32-byte bit maps which identify
digits, "word" characters, and white space, respectively. These are used when
building 32-byte bit maps that represent character classes for code points less
than 256.
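
As an illustration of the layout just described, the following C sketch builds
equivalent tables from the <ctype.h> functions. The structure and function
names are hypothetical, and the bit order within each 32-byte map (least
significant bit first) is an assumption; the tables that dftables and
pcre_maketables() generate may differ in detail.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical container; PCRE itself stores the tables concatenated. */
typedef struct {
    unsigned char lower[256];   /* lower-casing table */
    unsigned char flip[256];    /* case-flipping table */
    unsigned char digit[32];    /* bit map: decimal digits */
    unsigned char word[32];     /* bit map: "word" characters */
    unsigned char space[32];    /* bit map: white space */
} sketch_tables;

/* Set the bit for code point c in a 32-byte (256-bit) map. */
static void set_bit(unsigned char map[32], int c)
{
    map[c / 8] |= (unsigned char)(1u << (c % 8));
}

/* Return 1 if the bit for code point c is set in the map, else 0. */
int has_bit(const unsigned char map[32], int c)
{
    return (map[c / 8] >> (c % 8)) & 1;
}

/* Fill all the tables using the character functions of the current locale. */
void build_tables(sketch_tables *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof *t);
    for (int c = 0; c < 256; c++) {
        t->lower[c] = (unsigned char)tolower(c);
        t->flip[c] = (unsigned char)(islower(c) ? toupper(c) : tolower(c));
        if (isdigit(c)) set_bit(t->digit, c);
        if (isalnum(c) || c == '_') set_bit(t->word, c);
        if (isspace(c)) set_bit(t->space, c);
    }
}
```

A character class such as [0-9_] could then be represented by OR-ing the digit
map into an empty 32-byte map and setting the single bit for '_'.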

The final 256-byte table has bits indicating various character types, as
follows:

    1   white space character
    2   letter
    4   decimal digit
    8   hexadecimal digit
   16   alphanumeric or '_'
  128   regular expression metacharacter or binary zero

You should not alter the set of characters that contain the 128 bit, as that
will cause PCRE to malfunction.
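
The type bits listed above could be computed along the following lines. This is
a sketch, not the dftables source: the CT_ names are invented for the example,
and the string of metacharacters is an assumption about which characters carry
the 128 bit.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical names for the bit values listed in the table above. */
enum {
    CT_SPACE  = 1,     /* white space character */
    CT_LETTER = 2,     /* letter */
    CT_DIGIT  = 4,     /* decimal digit */
    CT_XDIGIT = 8,     /* hexadecimal digit */
    CT_WORD   = 16,    /* alphanumeric or '_' */
    CT_META   = 128    /* regular expression metacharacter or binary zero */
};

/* Fill a 256-byte type table using the current locale's <ctype.h> functions. */
void build_type_table(unsigned char types[256])
{
    /* Assumption: an illustrative metacharacter set for this sketch. */
    static const char meta[] = "\\*+?{^.$|()[";

    assert(types != NULL);
    for (int c = 0; c < 256; c++) {
        unsigned char x = 0;
        if (isspace(c)) x |= CT_SPACE;
        if (isalpha(c)) x |= CT_LETTER;
        if (isdigit(c)) x |= CT_DIGIT;
        if (isxdigit(c)) x |= CT_XDIGIT;
        if (isalnum(c) || c == '_') x |= CT_WORD;
        if (c == 0 || strchr(meta, c) != NULL) x |= CT_META;
        types[c] = x;
    }
}
```

With such a table, a test like (types[c] & CT_DIGIT) != 0 replaces a call to
isdigit() with a single array lookup, which is the point of keeping the tables.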


File manifest
-------------

The distribution should contain the files listed below. Where a file name is
given as pcre[16|32]_xxx it means that there are three files, one with the name
pcre_xxx, one with the name pcre16_xxx, and a third with the name pcre32_xxx.

(A) Source files of the PCRE library functions and their headers:

  dftables.c              auxiliary program for building pcre_chartables.c
                          when --enable-rebuild-chartables is specified

  pcre_chartables.c.dist  a default set of character tables that assume ASCII
                          coding; used, unless --enable-rebuild-chartables is
                          specified, by copying to pcre_chartables.c

  pcreposix.c                )
  pcre[16|32]_byte_order.c   )
  pcre[16|32]_compile.c      )
  pcre[16|32]_config.c       )
  pcre[16|32]_dfa_exec.c     )
  pcre[16|32]_exec.c         )
  pcre[16|32]_fullinfo.c     )
  pcre[16|32]_get.c          ) sources for the functions in the library,
  pcre[16|32]_globals.c      )   and some internal functions that they use
  pcre[16|32]_jit_compile.c  )
  pcre[16|32]_maketables.c   )
  pcre[16|32]_newline.c      )
  pcre[16|32]_refcount.c     )
  pcre[16|32]_string_utils.c )
  pcre[16|32]_study.c        )
  pcre[16|32]_tables.c       )
  pcre[16|32]_ucd.c          )
  pcre[16|32]_version.c      )
  pcre[16|32]_xclass.c       )
  pcre_ord2utf8.c            )
  pcre_valid_utf8.c          )
  pcre16_ord2utf16.c         )
  pcre16_utf16_utils.c       )
  pcre16_valid_utf16.c       )
  pcre32_utf32_utils.c       )
  pcre32_valid_utf32.c       )

  pcre[16|32]_printint.c     ) debugging function that is used by pcretest,
                             )   and can also be #included in pcre_compile()

  pcre.h.in               template for pcre.h when built by "configure"
  pcreposix.h             header for the external POSIX wrapper API
  pcre_internal.h         header for internal use
  sljit/*                 16 files that make up the JIT compiler
  ucp.h                   header for Unicode property handling

  config.h.in             template for config.h, which is built by "configure"

  pcrecpp.h               public header file for the C++ wrapper
  pcrecpparg.h.in         template for another C++ header file
  pcre_scanner.h          public header file for C++ scanner functions
  pcrecpp.cc              )
  pcre_scanner.cc         ) source for the C++ wrapper library

  pcre_stringpiece.h.in   template for pcre_stringpiece.h, the header for the
                            C++ stringpiece functions
  pcre_stringpiece.cc     source for the C++ stringpiece functions

(B) Source files for programs that use PCRE:

  pcredemo.c              simple demonstration of coding calls to PCRE
  pcregrep.c              source of a grep utility that uses PCRE
  pcretest.c              comprehensive test program

(C) Auxiliary files:

  132html                 script to turn "man" pages into HTML
  AUTHORS                 information about the author of PCRE
  ChangeLog               log of changes to the code
  CleanTxt                script to clean nroff output for txt man pages
  Detrail                 script to remove trailing spaces
  HACKING                 some notes about the internals of PCRE
  INSTALL                 generic installation instructions
  LICENCE                 conditions for the use of PCRE
  COPYING                 the same, using GNU's standard name
  Makefile.in             ) template for Unix Makefile, which is built by
                          )   "configure"
  Makefile.am             ) the automake input that was used to create
                          )   Makefile.in
  NEWS                    important changes in this release
  NON-UNIX-USE            the previous name for NON-AUTOTOOLS-BUILD
  NON-AUTOTOOLS-BUILD     notes on building PCRE without using autotools
  PrepareRelease          script to make preparations for "make dist"
  README                  this file
  RunTest                 a Unix shell script for running tests
  RunGrepTest             a Unix shell script for pcregrep tests
  aclocal.m4              m4 macros (generated by "aclocal")
  config.guess            ) files used by libtool,
  config.sub              )   used only when building a shared library
  configure               a configuring shell script (built by autoconf)
  configure.ac            ) the autoconf input that was used to build
                          )   "configure" and config.h
  depcomp                 ) script to find program dependencies, generated by
                          )   automake
  doc/*.3                 man page sources for PCRE
  doc/*.1                 man page sources for pcregrep and pcretest
  doc/index.html.src      the base HTML page
  doc/html/*              HTML documentation
  doc/pcre.txt            plain text version of the man pages
  doc/pcretest.txt        plain text documentation of test program
  doc/perltest.txt        plain text documentation of Perl test program
  install-sh              a shell script for installing files
  libpcre16.pc.in         template for libpcre16.pc for pkg-config
  libpcre32.pc.in         template for libpcre32.pc for pkg-config
  libpcre.pc.in           template for libpcre.pc for pkg-config
  libpcreposix.pc.in      template for libpcreposix.pc for pkg-config
  libpcrecpp.pc.in        template for libpcrecpp.pc for pkg-config
  ltmain.sh               file used to build a libtool script
  missing                 ) common stub for a few missing GNU programs while
                          )   installing, generated by automake
  mkinstalldirs           script for making install directories
  perltest.pl             Perl test program
  pcre-config.in          source of script which retains PCRE information
  pcre_jit_test.c         test program for the JIT compiler
  pcrecpp_unittest.cc          )
  pcre_scanner_unittest.cc     ) test programs for the C++ wrapper
  pcre_stringpiece_unittest.cc )
  testdata/testinput*     test data for main library tests
  testdata/testoutput*    expected test results
  testdata/grep*          input and output for pcregrep tests
  testdata/*              other supporting test files

(D) Auxiliary files for cmake support

  cmake/COPYING-CMAKE-SCRIPTS
  cmake/FindPackageHandleStandardArgs.cmake
  cmake/FindEditline.cmake
  cmake/FindReadline.cmake
  CMakeLists.txt
  config-cmake.h.in

(E) Auxiliary files for VPASCAL

  makevp.bat
  makevp_c.txt
  makevp_l.txt
  pcregexp.pas

(F) Auxiliary files for building PCRE "by hand"

  pcre.h.generic          ) a version of the public PCRE header file
                          )   for use in non-"configure" environments
  config.h.generic        ) a version of config.h for use in non-"configure"
                          )   environments

(G) Miscellaneous

  RunTest.bat             a script for running tests under Windows

Philip Hazel
Email local part: ph10
Email domain: cam.ac.uk
Last updated: 24 October 2014