Building PCRE2 without using autotools
--------------------------------------

This document has been converted from the PCRE1 document. I have removed a
number of sections about building in various environments, as they applied only
to PCRE1 and are probably out of date.

This document contains the following sections:

  General
  Generic instructions for the PCRE2 C library
  Stack size in Windows environments
  Linking programs in Windows environments
  Calling conventions in Windows environments
  Comments about Win32 builds
  Building PCRE2 on Windows with CMake
  Testing with RunTest.bat
  Building PCRE2 on native z/OS and z/VM


GENERAL

The basic PCRE2 library consists entirely of code written in Standard C, and so
should compile successfully on any system that has a Standard C compiler and
library.

The PCRE2 distribution includes a "configure" file for use by the
configure/make (autotools) build system, as found in many Unix-like
environments. The README file contains information about the options for
"configure".

There is also support for CMake, which some users prefer, especially in Windows
environments, though it can also be run in Unix-like environments. See the
section entitled "Building PCRE2 on Windows with CMake" below.

Versions of src/config.h and src/pcre2.h are distributed in the PCRE2 tarballs
under the names src/config.h.generic and src/pcre2.h.generic. These are
provided for those who build PCRE2 without using "configure" or CMake. If you
use "configure" or CMake, the .generic versions are not used.


GENERIC INSTRUCTIONS FOR THE PCRE2 C LIBRARY

The following are generic instructions for building the PCRE2 C library "by
hand". If you are going to use CMake, this section does not apply to you; you
can skip ahead to the CMake section.

 (1) Copy or rename the file src/config.h.generic as src/config.h, and edit the
     macro settings that it contains to whatever is appropriate for your
     environment. In particular, you can alter the definition of the
     NEWLINE_DEFAULT macro to specify what character(s) you want to be
     interpreted as line terminators.

     When you compile any of the PCRE2 modules, you must specify
     -DHAVE_CONFIG_H to your compiler so that src/config.h is included in the
     sources.

     An alternative approach is not to edit src/config.h, but to use -D on the
     compiler command line to make any changes that you need to the
     configuration options. In this case -DHAVE_CONFIG_H must not be set.

     NOTE: There have been occasions when the way in which certain parameters
     in src/config.h are used has changed between releases. (In the
     configure/make world, this is handled automatically.) When upgrading to a
     new release, you are strongly advised to review src/config.h.generic
     before re-using what you had previously.

 (2) Copy or rename the file src/pcre2.h.generic as src/pcre2.h.

 (3) EITHER:
       Copy or rename file src/pcre2_chartables.c.dist as
       src/pcre2_chartables.c.

     OR:
       Compile src/dftables.c as a stand-alone program (using -DHAVE_CONFIG_H
       if you have set up src/config.h), and then run it with the single
       argument "src/pcre2_chartables.c". This generates a set of standard
       character tables and writes them to that file. The tables are generated
       using the default C locale for your system. If you want to use a locale
       that is specified by LC_xxx environment variables, add the -L option to
       the dftables command. You must use this method if you are building on a
       system that uses EBCDIC code.

     The tables in src/pcre2_chartables.c are defaults. The caller of PCRE2 can
     specify alternative tables at run time.

 (4) For an 8-bit library, compile the following source files from the src
     directory, setting -DPCRE2_CODE_UNIT_WIDTH=8 as a compiler option. Also
     set -DHAVE_CONFIG_H if you have set up src/config.h with your
     configuration, or else use other -D settings to change the configuration
     as required.

       pcre2_auto_possess.c
       pcre2_chartables.c
       pcre2_compile.c
       pcre2_config.c
       pcre2_context.c
       pcre2_dfa_match.c
       pcre2_error.c
       pcre2_find_bracket.c
       pcre2_jit_compile.c
       pcre2_maketables.c
       pcre2_match.c
       pcre2_match_data.c
       pcre2_newline.c
       pcre2_ord2utf.c
       pcre2_pattern_info.c
       pcre2_serialize.c
       pcre2_string_utils.c
       pcre2_study.c
       pcre2_substitute.c
       pcre2_substring.c
       pcre2_tables.c
       pcre2_ucd.c
       pcre2_valid_utf.c
       pcre2_xclass.c

     Make sure that you include -I. in the compiler command (or equivalent for
     an unusual compiler) so that all included PCRE2 header files are first
     sought in the src directory under the current directory. Otherwise you run
     the risk of picking up a previously-installed file from somewhere else.

     Note that you must compile pcre2_jit_compile.c, even if you have not
     defined SUPPORT_JIT in src/config.h, because when JIT support is not
     configured, dummy functions are compiled. When JIT support IS configured,
     pcre2_jit_compile.c #includes other files from the sljit subdirectory,
     where there should be 16 files, all of whose names begin with "sljit". It
     also #includes src/pcre2_jit_match.c and src/pcre2_jit_misc.c, so you
     should not compile these yourself.

 (5) Now link all the compiled code into an object library in whichever form
     your system keeps such libraries. This is the basic PCRE2 C 8-bit library.
     If your system has static and shared libraries, you may have to do this
     once for each type.

 (6) If you want to build a 16-bit library or 32-bit library (as well as, or
     instead of, the 8-bit library) just supply 16 or 32 as the value of
     -DPCRE2_CODE_UNIT_WIDTH when you are compiling.

 (7) If you want to build the POSIX wrapper functions (which apply only to the
     8-bit library), ensure that you have the src/pcre2posix.h file and then
     compile src/pcre2posix.c. Link the result (on its own) as the pcre2posix
     library.

 (8) The pcre2test program can be linked with any combination of the 8-bit,
     16-bit and 32-bit libraries (depending on what you selected in
     src/config.h). Compile src/pcre2test.c; don't forget -DHAVE_CONFIG_H if
     necessary, but do NOT define PCRE2_CODE_UNIT_WIDTH. Then link with the
     appropriate library or libraries. If you compiled an 8-bit library,
     pcre2test also needs the pcre2posix wrapper library.

 (9) Run pcre2test on the testinput files in the testdata directory, and check
     that the output matches the corresponding testoutput files. There are
     comments about what each test does in the section entitled "Testing PCRE2"
     in the README file. If you compiled more than one of the 8-bit, 16-bit and
     32-bit libraries, you need to run pcre2test with the -16 option to do
     16-bit tests and with the -32 option to do 32-bit tests.

     Some tests are relevant only when certain build-time options are selected.
     For example, test 4 is for Unicode support, and will not run if you have
     built PCRE2 without it. See the comments at the start of each testinput
     file. If you have a suitable Unix-like shell, the RunTest script will run
     the appropriate tests for you. The command "RunTest list" will output a
     list of all the tests.

     Note that the supplied files are in Unix format, with just LF characters
     as line terminators. You may need to edit them to change this if your
     system uses a different convention.

(10) If you have built PCRE2 with SUPPORT_JIT, the JIT features can be tested
     by running pcre2test with the -jit option. This is done automatically by
     the RunTest script. You might also like to build and run the freestanding
     JIT test program, src/pcre2_jit_test.c.

(11) If you want to use the pcre2grep command, compile and link
     src/pcre2grep.c; it uses only the basic 8-bit PCRE2 library (it does not
     need the pcre2posix library).
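
As an illustration only, steps (1) to (5) for an 8-bit static library might be
sketched as the shell functions below. This is not part of the distribution. It
assumes a Unix-like shell, a compiler called "cc", the "ar" archiver, and that
the unedited config.h.generic suits your system. The names run, PCRE2_MODULES,
build_pcre2_8 and DRY_RUN are invented for this sketch. It is meant to be run
from inside the src directory, so that -I. refers to that directory.

```shell
# Print each command; execute it unless DRY_RUN=1 (a hypothetical switch).
run() {
    echo "$*"
    if [ "${DRY_RUN:-0}" != 1 ]; then "$@"; fi
}

# The source files listed in step (4), without the pcre2_ prefix and .c suffix.
PCRE2_MODULES="auto_possess chartables compile config context dfa_match error
find_bracket jit_compile maketables match match_data newline ord2utf
pattern_info serialize string_utils study substitute substring tables ucd
valid_utf xclass"

# Call this from inside the src directory of the PCRE2 source tree.
build_pcre2_8() {
    run cp config.h.generic config.h || return 1                   # step (1)
    run cp pcre2.h.generic pcre2.h || return 1                     # step (2)
    run cp pcre2_chartables.c.dist pcre2_chartables.c || return 1  # step (3)
    objects=""
    for m in $PCRE2_MODULES; do                                    # step (4)
        run cc -c -I. -DHAVE_CONFIG_H -DPCRE2_CODE_UNIT_WIDTH=8 \
            "pcre2_$m.c" -o "pcre2_$m.o" || return 1
        objects="$objects pcre2_$m.o"
    done
    run ar rcs libpcre2-8.a $objects                               # step (5)
}
```

Setting DRY_RUN=1 before calling build_pcre2_8 prints the commands without
running them, which is a convenient way to check the list before a real build.
For a 16-bit or 32-bit library, the same loop applies with a different value of
-DPCRE2_CODE_UNIT_WIDTH and a different library name.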


STACK SIZE IN WINDOWS ENVIRONMENTS

The default processor stack size of 1MB in some Windows environments is too
small for matching patterns that need much recursion. In particular, test 2 may
fail because of this. Normally, running out of stack causes a crash, but there
have been cases where the test program has just died silently. See your linker
documentation for how to increase stack size if you experience problems. If you
are using CMake (see "BUILDING PCRE2 ON WINDOWS WITH CMAKE" below) and the gcc
compiler, you can increase the stack size for pcre2test and pcre2grep by
setting the CMAKE_EXE_LINKER_FLAGS variable to "-Wl,--stack,8388608" (for
example). The Linux default of 8MB is a reasonable choice for the stack, though
even that can be too small for some pattern/subject combinations.
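
The value 8388608 in that example is 8 x 1024 x 1024 bytes. As a small sketch,
the shell fragment below builds the flag from a size given in those units. The
variable names are invented for illustration, and the cmake command in the
final comment assumes a gcc toolchain and a build directory directly below the
source directory.

```shell
# Stack size for pcre2test and pcre2grep, in units of 1024*1024 bytes.
stack_mb=8
stack_bytes=$((stack_mb * 1024 * 1024))

# The -Wl prefix makes the gcc driver pass --stack and its value to the linker.
stack_flag="-Wl,--stack,${stack_bytes}"
echo "$stack_flag"    # prints -Wl,--stack,8388608

# Then, from the build directory (not run here):
#   cmake -DCMAKE_EXE_LINKER_FLAGS="$stack_flag" ..
```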

PCRE2 has a compile configuration option to disable the use of stack for
recursion so that heap is used instead. However, pattern matching is
significantly slower when this is done. There is more about stack usage in the
"pcre2stack" documentation.


LINKING PROGRAMS IN WINDOWS ENVIRONMENTS

If you want to statically link a program against a PCRE2 library in the form of
a non-dll .a file, you must define PCRE2_STATIC before including src/pcre2.h.
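
Defining the macro on the compiler command line has the same effect as a
#define placed before the #include. The sketch below only prints such a
command; it assumes a MinGW-style gcc, and the function name static_link_cmd,
the program myprog.c and the library file name are hypothetical. An application
must also define PCRE2_CODE_UNIT_WIDTH before including pcre2.h, so that is set
here as well.

```shell
# $1: directory containing pcre2.h   $2: application source   $3: static library
static_link_cmd() {
    echo "gcc -DPCRE2_STATIC -DPCRE2_CODE_UNIT_WIDTH=8 -I$1 $2 $3 -o myprog"
}

# Show the command for a hypothetical program linked against the 8-bit library.
static_link_cmd src myprog.c libpcre2-8.a
```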


CALLING CONVENTIONS IN WINDOWS ENVIRONMENTS

It is possible to compile programs to use different calling conventions using
MSVC. Search the web for "calling conventions" for more information. To make it
easier to change the calling convention for the exported functions in the
PCRE2 library, the macro PCRE2_CALL_CONVENTION is present in all the external
definitions. It can be set externally when compiling (e.g. in CFLAGS). If it is
not set, it defaults to empty; the default calling convention is then used
(which is what is wanted most of the time).
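
For example, in a shell-based build that honours CFLAGS, the macro could be
appended as follows. This is a sketch only: __stdcall is just one of the
conventions that MSVC supports and is used here purely as an example value.

```shell
# Append the definition to any CFLAGS that are already set.
CFLAGS="${CFLAGS:+$CFLAGS }-DPCRE2_CALL_CONVENTION=__stdcall"
export CFLAGS
echo "$CFLAGS"
```

The same value should be used when compiling both the library and the programs
that call it, since the macro appears in the declarations in pcre2.h.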


COMMENTS ABOUT WIN32 BUILDS (see also "BUILDING PCRE2 ON WINDOWS WITH CMAKE")

There are two ways of building PCRE2 using the "configure, make, make install"
paradigm on Windows systems: using MinGW or using Cygwin. These are completely
different from each other. There is also support for building using CMake,
which some users find a more straightforward way of building PCRE2 under
Windows.

The MinGW home page (http://www.mingw.org/) says this:

  MinGW: A collection of freely available and freely distributable Windows
  specific header files and import libraries combined with GNU toolsets that
  allow one to produce native Windows programs that do not rely on any
  3rd-party C runtime DLLs.

The Cygwin home page (http://www.cygwin.com/) says this:

  Cygwin is a Linux-like environment for Windows. It consists of two parts:

  . A DLL (cygwin1.dll) which acts as a Linux API emulation layer providing
    substantial Linux API functionality

  . A collection of tools which provide Linux look and feel.

On both MinGW and Cygwin, PCRE2 should build correctly using:

  ./configure && make && make install

This should create two libraries called libpcre2-8 and libpcre2-posix. These
are separate libraries: when you link with libpcre2-posix you must also link
with libpcre2-8, which contains the basic functions.

Using Cygwin's compiler generates libraries and executables that depend on
cygwin1.dll. If a library that is generated this way is distributed,
cygwin1.dll has to be distributed as well. Since cygwin1.dll is under the GPL
licence, this forces not only PCRE2 to be under the GPL, but also the entire
application. A distributor who wants to keep their own code proprietary must
purchase an appropriate Cygwin licence.

MinGW has no such restrictions. The MinGW compiler generates a library or
executable that can run standalone on Windows without any third-party DLL or
licensing issues.

But there is a further complication:

If a Cygwin user uses the -mno-cygwin Cygwin gcc flag, what that really does is
to tell Cygwin's gcc to use the MinGW gcc. Cygwin's gcc is only acting as a
front end to MinGW's gcc (if you install Cygwin's gcc, you get both Cygwin's
gcc and MinGW's gcc). So, a user can:

. Build native binaries by using MinGW or by getting Cygwin and using
  -mno-cygwin.

. Build binaries that depend on cygwin1.dll by using Cygwin with the normal
  compiler flags.

The test files that are supplied with PCRE2 are in UNIX format, with LF
characters as line terminators. Unless your PCRE2 library uses a default
newline option that includes LF as a valid newline, it may be necessary to
change the line terminators in the test files to get some of the tests to work.
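
If a conversion to CRLF terminators is needed, one possible way to do it in an
MSYS or Cygwin shell is sketched below. It assumes that awk is available; the
function name lf_to_crlf is invented for this example, and the file names in
the final comment are only illustrative.

```shell
# Read text on standard input and write it with CRLF line terminators.
# A CR that is already present at the end of a line is not doubled.
lf_to_crlf() {
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}

# Example use on one test file (not run here):
#   lf_to_crlf < testdata/testinput1 > testinput1.crlf
```

Writing to a new file, as in the comment, leaves the supplied file untouched so
that the original can still be used for comparison.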


BUILDING PCRE2 ON WINDOWS WITH CMAKE

CMake is an alternative configuration facility that can be used instead of
"configure". CMake creates project files (make files, solution files, etc.)
tailored to numerous development environments, including Visual Studio,
Borland, Msys, MinGW, NMake, and Unix. If possible, use short paths with no
spaces in the names for your CMake installation and your PCRE2 source and build
directories.

The following instructions were contributed by a PCRE1 user, but they should
also work for PCRE2. If they are not followed exactly, errors may occur. In the
event that errors do occur, it is recommended that you delete the CMake cache
before attempting to repeat the CMake build process. In the CMake GUI, the
cache can be deleted by selecting "File > Delete Cache".

1.  Install the latest CMake version available from http://www.cmake.org/, and
    ensure that cmake\bin is on your path.

2.  Unzip (retaining folder structure) the PCRE2 source tree into a source
    directory such as C:\pcre2. You should ensure your local date and time
    is not earlier than the file dates in your source dir if the release is
    very new.

3.  Create a new, empty build directory, preferably a subdirectory of the
    source dir. For example, C:\pcre2\pcre2-xx\build.

4.  Run cmake-gui from the shell environment of your build tool, for example,
    Msys for Msys/MinGW or Visual Studio Command Prompt for VC/VC++. Do not try
    to start CMake from the Windows Start menu, as this can lead to errors.

5.  Enter C:\pcre2\pcre2-xx and C:\pcre2\pcre2-xx\build for the source and
    build directories, respectively.

6.  Hit the "Configure" button.

7.  Select the particular IDE / build tool that you are using (Visual
    Studio, MSYS makefiles, MinGW makefiles, etc.)

8.  The GUI will then list several configuration options. This is where
    you can disable Unicode support or select other PCRE2 optional features.

9.  Hit "Configure" again. The adjacent "Generate" button should now be
    active.

10. Hit "Generate".

11. The build directory should now contain a usable build system, be it a
    solution file for Visual Studio, makefiles for MinGW, etc. Exit from
    cmake-gui and use the generated build system with your compiler or IDE.
    E.g., for MinGW you can run "make", or for Visual Studio, open the PCRE2
    solution, select the desired configuration (Debug, or Release, etc.) and
    build the ALL_BUILD project.

12. If during configuration with cmake-gui you've elected to build the test
    programs, you can execute them by building the test project. E.g., for
    MinGW: "make test"; for Visual Studio build the RUN_TESTS project. The
    most recent build configuration is targeted by the tests. A summary of
    test results is presented. Complete test output is subsequently
    available for review in Testing\Temporary under your build dir.
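
The same configure, generate, build and test sequence can also be driven from
a command shell instead of cmake-gui. The following is a minimal sketch and not
part of the contributed instructions. It assumes a CMake version that supports
"cmake --build" and a build directory directly below the source directory. The
names run, pcre2_cmake_build and DRY_RUN are invented for this sketch.

```shell
# Print each command; execute it unless DRY_RUN=1 (a hypothetical switch).
# The printed form drops the quotes around arguments that contain spaces.
run() {
    echo "$*"
    if [ "${DRY_RUN:-0}" != 1 ]; then "$@"; fi
}

# $1: CMake generator name, for example "MinGW Makefiles".
# Call this from the new, empty build directory.
pcre2_cmake_build() {
    run cmake -G "$1" .. || return 1     # configure and generate
    run cmake --build . || return 1      # build with the generated system
    run ctest                            # run the tests, if they were enabled
}
```

Optional features can be selected by adding -D settings to the first cmake
command; the option names are the ones that cmake-gui lists in step 8.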


TESTING WITH RUNTEST.BAT

If configured with CMake, building the test project ("make test" or building
RUN_TESTS in Visual Studio) creates (and runs) pcre2_test.bat (and depending
on your configuration options, possibly other test programs) in the build
directory. The pcre2_test.bat script runs RunTest.bat with correct source and
exe paths.

For manual testing with RunTest.bat, provided the build dir is a subdirectory
of the source directory: open a command shell window, chdir to the location
of your pcre2test.exe and pcre2grep.exe programs, and call RunTest.bat with
"..\RunTest.bat" or "..\..\RunTest.bat" as appropriate.

To run only a particular test with RunTest.bat, provide a test number argument.

Otherwise:

1. Copy RunTest.bat into the directory where pcre2test.exe and pcre2grep.exe
   have been created.

2. Edit RunTest.bat to identify the full or relative location of
   the pcre2 source (in which the testdata folder resides), e.g.:

   set srcdir=C:\pcre2\pcre2-10.00

3. In a Windows command environment, chdir to the location of your bat and
   exe programs.

4. Run RunTest.bat. Test outputs will automatically be compared to expected
   results, and discrepancies will be identified in the console output.

To independently test the just-in-time compiler, run pcre2_jit_test.exe.


BUILDING PCRE2 ON NATIVE Z/OS AND Z/VM

z/OS and z/VM are operating systems for mainframe computers, produced by IBM.
The character code used is EBCDIC, not ASCII or Unicode. In z/OS, UNIX APIs and
applications can be supported through UNIX System Services, and in such an
environment PCRE2 can be built in the same way as in other systems. However, in
native z/OS (without UNIX System Services) and in z/VM, special ports are
required. For details, please see this web site:

  http://www.zaconsultants.net

The site currently has ports for PCRE1 releases, but PCRE2 should follow in due
course.

You may also download PCRE1 from www.cbttape.org, file 882. Everything, source
and executable, is in EBCDIC and native z/OS file formats, and this is the
recommended download site.

=============================
Last Updated: 16 July 2015