*******************************************************************************
**     Background
*******************************************************************************

libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2,
NEON) to accelerate baseline JPEG compression and decompression on x86, x86-64,
and ARM systems.  On such systems, libjpeg-turbo is generally 2-4x as fast as
libjpeg, all else being equal.  On other types of systems, libjpeg-turbo can
still outperform libjpeg by a significant amount, by virtue of its
highly-optimized Huffman coding routines.  In many cases, the performance of
libjpeg-turbo rivals that of proprietary high-speed JPEG codecs.

libjpeg-turbo implements both the traditional libjpeg API and the less powerful
but more straightforward TurboJPEG API.  libjpeg-turbo also features
colorspace extensions that allow it to compress from/decompress to 32-bit and
big-endian pixel buffers (RGBX, XBGR, etc.), as well as a full-featured Java
interface.

libjpeg-turbo was originally based on libjpeg/SIMD, an MMX-accelerated
derivative of libjpeg v6b developed by Miyasaka Masaru.  The TigerVNC and
VirtualGL projects made numerous enhancements to the codec in 2009, and in
early 2010, libjpeg-turbo spun off into an independent project, with the goal
of making high-speed JPEG compression/decompression technology available to a
broader range of users and developers.


*******************************************************************************
**     License
*******************************************************************************

libjpeg-turbo is covered by three compatible BSD-style open source licenses.
Refer to LICENSE.txt for a roll-up of license terms.


*******************************************************************************
**     Using libjpeg-turbo
*******************************************************************************

libjpeg-turbo includes two APIs that can be used to compress and decompress
JPEG images:

  TurboJPEG API:  This API provides an easy-to-use interface for compressing
  and decompressing JPEG images in memory.  It also provides some functionality
  that would not be straightforward to achieve using the underlying libjpeg
  API, such as generating planar YUV images and performing multiple
  simultaneous lossless transforms on an image.  The Java interface for
  libjpeg-turbo is written on top of the TurboJPEG API.

  libjpeg API:  This is the de facto industry-standard API for compressing and
  decompressing JPEG images.  It is more difficult to use than the TurboJPEG
  API but also more powerful.  The libjpeg API implementation in libjpeg-turbo
  is both API/ABI-compatible and mathematically compatible with libjpeg v6b.
  It can also optionally be configured to be API/ABI-compatible with libjpeg v7
  and v8 (see below).

There is no significant performance advantage to either API when both are used
to perform similar operations.

=====================
Colorspace Extensions
=====================

libjpeg-turbo includes extensions that allow JPEG images to be compressed
directly from (and decompressed directly to) buffers that use BGR, BGRX,
RGBX, XBGR, and XRGB pixel ordering.  This is implemented with ten new
colorspace constants:

  JCS_EXT_RGB   /* red/green/blue */
  JCS_EXT_RGBX  /* red/green/blue/x */
  JCS_EXT_BGR   /* blue/green/red */
  JCS_EXT_BGRX  /* blue/green/red/x */
  JCS_EXT_XBGR  /* x/blue/green/red */
  JCS_EXT_XRGB  /* x/red/green/blue */
  JCS_EXT_RGBA  /* red/green/blue/alpha */
  JCS_EXT_BGRA  /* blue/green/red/alpha */
  JCS_EXT_ABGR  /* alpha/blue/green/red */
  JCS_EXT_ARGB  /* alpha/red/green/blue */

Setting cinfo.in_color_space (compression) or cinfo.out_color_space
(decompression) to one of these values will cause libjpeg-turbo to read the
red, green, and blue values from (or write them to) the appropriate position in
the pixel when compressing from/decompressing to an RGB buffer.
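
As a rough illustration of what "the appropriate position in the pixel" means,
the sketch below tabulates the byte offsets implied by the names of the
constants above.  The pixel_layout type, the layouts[] table, get_layout(), and
read_rgb() are hypothetical helpers written for this example.  They are not
part of the libjpeg or libjpeg-turbo API, and the library's internal tables may
be organized differently.  The alpha variants (RGBA, BGRA, ABGR, ARGB) would
use the same offsets as the corresponding X variants.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one extended pixel layout.  Offsets are byte
   positions within a pixel; x is -1 when the layout has no fourth byte. */
typedef struct {
  const char *name;   /* suffix of the corresponding JCS_EXT_* constant */
  int size;           /* bytes per pixel */
  int r, g, b, x;     /* byte offset of each component */
} pixel_layout;

static const pixel_layout layouts[] = {
  { "RGB",  3, 0, 1, 2, -1 },
  { "RGBX", 4, 0, 1, 2,  3 },
  { "BGR",  3, 2, 1, 0, -1 },
  { "BGRX", 4, 2, 1, 0,  3 },
  { "XBGR", 4, 3, 2, 1,  0 },
  { "XRGB", 4, 1, 2, 3,  0 },
};

/* Look up a layout by name.  Returns NULL if the name is unknown. */
static const pixel_layout *get_layout(const char *name)
{
  size_t i;

  assert(name != NULL);
  for (i = 0; i < sizeof(layouts) / sizeof(layouts[0]); i++) {
    if (strcmp(layouts[i].name, name) == 0)
      return &layouts[i];
  }
  return NULL;
}

/* Read the red, green, and blue samples of pixel number col in one row. */
static void read_rgb(const unsigned char *row, const pixel_layout *pl,
                     size_t col, unsigned char rgb[3])
{
  const unsigned char *p = row + col * (size_t)pl->size;

  rgb[0] = p[pl->r];
  rgb[1] = p[pl->g];
  rgb[2] = p[pl->b];
}
```

For example, get_layout("BGRX") describes a 4-byte pixel in which blue comes
first and red third, so read_rgb() on such a row returns the samples in
red/green/blue order regardless of how they are stored.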

Your application can check for the existence of these extensions at compile
time with:

  #ifdef JCS_EXTENSIONS

At run time, attempting to use these extensions with a libjpeg implementation
that does not support them will result in a "Bogus input colorspace" error.
Applications can trap this error in order to test whether run-time support is
available for the colorspace extensions.

When using the RGBX, BGRX, XBGR, and XRGB colorspaces during decompression, the
X byte is undefined, and in order to ensure the best performance, libjpeg-turbo
can set that byte to whatever value it wishes.  If an application expects the X
byte to be used as an alpha channel, then it should specify JCS_EXT_RGBA,
JCS_EXT_BGRA, JCS_EXT_ABGR, or JCS_EXT_ARGB.  When these colorspace constants
are used, the X byte is guaranteed to be 0xFF, which is interpreted as opaque.
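
If only the X variants are available, an application that needs an opaque
alpha channel would have to fill the fourth byte itself after decompression.
A minimal sketch of that fallback, assuming a 4-byte-per-pixel buffer, follows.
fill_x_byte() and its parameters are hypothetical names chosen for this
example, not a libjpeg-turbo function.

```c
#include <assert.h>
#include <stddef.h>

/* Set the X byte of every pixel in a 4-byte-per-pixel image.
   x_offset is the position of the X byte within a pixel: 3 for RGBX and BGRX,
   0 for XBGR and XRGB.  pitch is the number of bytes per row, which may be
   larger than width * 4 if rows are padded. */
static void fill_x_byte(unsigned char *buf, size_t width, size_t height,
                        size_t pitch, size_t x_offset, unsigned char value)
{
  size_t row, col;

  assert(buf != NULL && x_offset < 4 && pitch >= width * 4);
  for (row = 0; row < height; row++) {
    unsigned char *p = buf + row * pitch;

    for (col = 0; col < width; col++)
      p[col * 4 + x_offset] = value;
  }
}
```

Calling fill_x_byte(buf, width, height, pitch, 3, 0xFF) on an RGBX buffer
would give the same opaque result that the text describes for JCS_EXT_RGBA, at
the cost of an extra pass over the image.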

Your application can check for the existence of the alpha channel colorspace
extensions at compile time with:

  #ifdef JCS_ALPHA_EXTENSIONS

jcstest.c, located in the libjpeg-turbo source tree, demonstrates how to check
for the existence of the colorspace extensions at compile time and run time.

===================================
libjpeg v7 and v8 API/ABI Emulation
===================================

With libjpeg v7 and v8, new features were added that necessitated extending the
compression and decompression structures.  Unfortunately, due to the exposed
nature of those structures, extending them also necessitated breaking backward
ABI compatibility with previous libjpeg releases.  Thus, programs that were
built to use libjpeg v7 or v8 did not work with libjpeg-turbo, since it is
based on the libjpeg v6b code base.  Although libjpeg v7 and v8 are not
as widely used as v6b, enough programs (including a few Linux distros) made
the switch that there was a demand to emulate the libjpeg v7 and v8 ABIs
in libjpeg-turbo.  It should be noted, however, that this feature was added
primarily so that applications that had already been compiled to use libjpeg
v7+ could take advantage of accelerated baseline JPEG encoding/decoding
without recompiling.  libjpeg-turbo does not claim to support all of the
libjpeg v7+ features, nor to produce identical output to libjpeg v7+ in all
cases (see below).

By passing an argument of --with-jpeg7 or --with-jpeg8 to configure, or an
argument of -DWITH_JPEG7=1 or -DWITH_JPEG8=1 to cmake, you can build a version
of libjpeg-turbo that emulates the libjpeg v7 or v8 ABI, so that programs
that are built against libjpeg v7 or v8 can be run with libjpeg-turbo.  The
following section describes which libjpeg v7+ features are supported and which
aren't.

Support for libjpeg v7 and v8 Features:
---------------------------------------

Fully supported:

-- libjpeg: IDCT scaling extensions in decompressor
   libjpeg-turbo supports IDCT scaling with scaling factors of 1/8, 1/4, 3/8,
   1/2, 5/8, 3/4, 7/8, 9/8, 5/4, 11/8, 3/2, 13/8, 7/4, 15/8, and 2/1 (only 1/4
   and 1/2 are SIMD-accelerated).

-- libjpeg: arithmetic coding

-- libjpeg: In-memory source and destination managers
   See notes below.

-- cjpeg: Separate quality settings for luminance and chrominance
   Note that the libjpeg v7+ API was extended to accommodate this feature only
   for convenience purposes.  It has always been possible to implement this
   feature with libjpeg v6b (see rdswitch.c for an example).

-- cjpeg: 32-bit BMP support

-- cjpeg: -rgb option

-- jpegtran: lossless cropping

-- jpegtran: -perfect option

-- jpegtran: forcing width/height when performing lossless crop

-- rdjpgcom: -raw option

-- rdjpgcom: locale awareness
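
As an aside on the IDCT scaling factors listed under "Fully supported" above:
every one of them has the form N/8 for an integer N from 1 to 16 (N = 8 being
the unscaled case).  The sketch below shows how an application might validate
a requested factor and predict the scaled image size.  It assumes that the
scaled dimension is the original dimension times the factor, rounded up, which
is the arithmetic used by the TJSCALED() macro in turbojpeg.h.
is_supported_scale() and scaled_dim() are hypothetical helper names.

```c
#include <assert.h>

/* Returns 1 if num/denom equals N/8 for some integer N in 1..16. */
static int is_supported_scale(unsigned long num, unsigned long denom)
{
  unsigned long n;

  assert(denom != 0);
  if ((num * 8) % denom != 0)
    return 0;
  n = (num * 8) / denom;
  return n >= 1 && n <= 16;
}

/* Size of one image dimension after scaling by num/denom, rounded up. */
static unsigned long scaled_dim(unsigned long dim, unsigned long num,
                                unsigned long denom)
{
  assert(denom != 0);
  return (dim * num + denom - 1) / denom;
}
```

With these helpers, a 1000-pixel-wide image scaled by 3/8 would be reported as
375 pixels wide, and a factor such as 1/3 would be rejected.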


Not supported:

NOTE:  As of this writing, extensive research has been conducted into the
usefulness of DCT scaling as a means of data reduction and SmartScale as a
means of quality improvement.  The reader is invited to peruse the research at
http://www.libjpeg-turbo.org/About/SmartScale and draw their own conclusions,
but it is the general belief of our project that these features have not
demonstrated sufficient usefulness to justify inclusion in libjpeg-turbo.

-- libjpeg: DCT scaling in compressor
   cinfo.scale_num and cinfo.scale_denom are silently ignored.
   There is no technical reason why DCT scaling could not be supported when
   emulating the libjpeg v7+ API/ABI, but without the SmartScale extension (see
   below), only scaling factors of 1/2, 8/15, 4/7, 8/13, 2/3, 8/11, 4/5, and
   8/9 would be available, which is of limited usefulness.

-- libjpeg: SmartScale
   cinfo.block_size is silently ignored.
   SmartScale is an extension to the JPEG format that allows for DCT block
   sizes other than 8x8.  Providing support for this new format would be
   feasible (particularly without full acceleration).  However, until/unless
   the format becomes either an official industry standard or, at minimum, an
   accepted solution in the community, we are hesitant to implement it, as
   there is no sense of whether or how it might change in the future.  It is
   our belief that SmartScale has not demonstrated sufficient usefulness either
   as a lossless format or as a means of quality enhancement, and thus, our
   primary interest in providing this feature would be as a means of supporting
   additional DCT scaling factors.

-- libjpeg: Fancy downsampling in compressor
   cinfo.do_fancy_downsampling is silently ignored.
   This requires the DCT scaling feature, which is not supported.

-- jpegtran: Scaling
   This requires both the DCT scaling and SmartScale features, which are not
   supported.

-- Lossless RGB JPEG files
   This requires the SmartScale feature, which is not supported.

What About libjpeg v9?
----------------------

libjpeg v9 introduced yet another field to the JPEG compression structure
(color_transform), thus making the ABI backward incompatible with that of
libjpeg v8.  This new field was introduced solely for the purpose of supporting
lossless SmartScale encoding.  Further, there was actually no reason to extend
the API in this manner, as the color transform could have just as easily been
activated by way of a new JPEG colorspace constant, thus preserving backward
ABI compatibility.

Our research (see link above) has shown that lossless SmartScale does not
generally accomplish anything that can't already be accomplished better with
existing, standard lossless formats.  Thus, at this time, it is our belief that
there is not sufficient technical justification for software to upgrade from
libjpeg v8 to libjpeg v9, and therefore, not sufficient technical justification
for us to emulate the libjpeg v9 ABI.

=====================================
In-Memory Source/Destination Managers
=====================================

By default, libjpeg-turbo 1.3 and later includes the jpeg_mem_src() and
jpeg_mem_dest() functions, even when not emulating the libjpeg v8 API/ABI.
Previously, it was necessary to build libjpeg-turbo from source with libjpeg v8
API/ABI emulation in order to use the in-memory source/destination managers,
but several projects requested that those functions be included when emulating
the libjpeg v6b API/ABI as well.  This allows the use of those functions by
programs that need them without breaking ABI compatibility for programs that
don't, and it allows those functions to be provided in the "official"
libjpeg-turbo binaries.

Those who are concerned about maintaining strict conformance with the libjpeg
v6b or v7 API can pass an argument of --without-mem-srcdst to configure or
an argument of -DWITH_MEM_SRCDST=0 to CMake prior to building libjpeg-turbo.
This will restore the pre-1.3 behavior, in which jpeg_mem_src() and
jpeg_mem_dest() are only included when emulating the libjpeg v8 API/ABI.

On Un*x systems, including the in-memory source/destination managers changes
the dynamic library version from 62.0.0 to 62.1.0 if using libjpeg v6b API/ABI
emulation and from 7.0.0 to 7.1.0 if using libjpeg v7 API/ABI emulation.
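
The rules in the preceding paragraphs are small enough to write down as a
lookup.  In the sketch below, mem_srcdst_available() and so_version() are
hypothetical helpers that simply restate the text; they do not query an
installed library.  The integer codes 62, 70, and 80 follow the values that
jpeglib.h assigns to JPEG_LIB_VERSION for the v6b, v7, and v8 APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Are jpeg_mem_src() and jpeg_mem_dest() included in the library?
   api is 62, 70, or 80.  with_mem_srcdst is 0 if the library was configured
   with --without-mem-srcdst (or -DWITH_MEM_SRCDST=0) and 1 otherwise. */
static int mem_srcdst_available(int api, int with_mem_srcdst)
{
  assert(api == 62 || api == 70 || api == 80);
  return api == 80 || with_mem_srcdst != 0;
}

/* Dynamic library version on Un*x for v6b and v7 emulation, as stated in the
   text.  Returns NULL for v8 emulation, which the text does not cover. */
static const char *so_version(int api, int with_mem_srcdst)
{
  assert(api == 62 || api == 70 || api == 80);
  if (api == 62)
    return with_mem_srcdst ? "62.1.0" : "62.0.0";
  if (api == 70)
    return with_mem_srcdst ? "7.1.0" : "7.0.0";
  return NULL;
}
```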

Note that, on most Un*x systems, the dynamic linker will not look for a
function in a library until that function is actually used.  Thus, if a program
is built against libjpeg-turbo 1.3+ and uses jpeg_mem_src() or jpeg_mem_dest(),
that program will not fail if run against an older version of libjpeg-turbo or
against libjpeg v7 or earlier until the program actually tries to call
jpeg_mem_src() or jpeg_mem_dest().  Such is not the case on Windows.  If a
program is built against the libjpeg-turbo 1.3+ DLL and uses jpeg_mem_src() or
jpeg_mem_dest(), then it must use the libjpeg-turbo 1.3+ DLL at run time.

Both cjpeg and djpeg have been extended to allow testing the in-memory
source/destination manager functions.  See their respective man pages for more
details.


*******************************************************************************
**     Mathematical Compatibility
*******************************************************************************

For the most part, libjpeg-turbo should produce identical output to libjpeg
v6b.  The one exception to this is when using the floating point DCT/IDCT, in
which case the outputs of libjpeg v6b and libjpeg-turbo can differ for the
following reasons:

-- The SSE/SSE2 floating point DCT implementation in libjpeg-turbo is ever so
   slightly more accurate than the implementation in libjpeg v6b, but not by
   any amount perceptible to human vision (generally in the range of 0.01 to
   0.08 dB gain in PSNR).
-- When not using the SIMD extensions, libjpeg-turbo uses the more accurate
   (and slightly faster) floating point IDCT algorithm introduced in libjpeg
   v8a as opposed to the algorithm used in libjpeg v6b.  It should be noted,
   however, that this algorithm basically brings the accuracy of the floating
   point IDCT in line with the accuracy of the slow integer IDCT.  The floating
   point DCT/IDCT algorithms are mainly a legacy feature, and they do not
   produce significantly more accuracy than the slow integer algorithms (to put
   numbers on this, the typical difference in PSNR between the two algorithms
   is less than 0.10 dB, whereas changing the quality level by 1 in the upper
   range of the quality scale is typically more like a 1.0 dB difference).
-- If the floating point algorithms in libjpeg-turbo are not implemented using
   SIMD instructions on a particular platform, then the accuracy of the
   floating point DCT/IDCT can depend on the compiler settings.
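
To put the PSNR figures quoted above in perspective, it may help to recall the
usual definition of PSNR for 8-bit samples.  This definition is standard, but
it is not spelled out in the original text, so treat the worked numbers as an
illustration rather than as measurements:

```latex
\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{255^2}{\mathrm{MSE}}\right)\ \mathrm{dB}
\quad\Longrightarrow\quad
\frac{\mathrm{MSE}_1}{\mathrm{MSE}_2} = 10^{\Delta/10},
\qquad \Delta = \mathrm{PSNR}_2 - \mathrm{PSNR}_1

\Delta = 0.10\ \mathrm{dB}: \quad 10^{0.01} \approx 1.023
\qquad\qquad
\Delta = 1.0\ \mathrm{dB}: \quad 10^{0.1} \approx 1.259
```

In other words, a 0.10 dB difference corresponds to mean squared errors that
differ by a factor of about 1.02, whereas a 1.0 dB difference corresponds to a
factor of about 1.26.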

While libjpeg-turbo does emulate the libjpeg v8 API/ABI, under the hood, it is
still using the same algorithms as libjpeg v6b, so there are several specific
cases in which libjpeg-turbo cannot be expected to produce the same output as
libjpeg v8:

-- When decompressing using scaling factors of 1/2 and 1/4, because libjpeg v8
   implements those scaling algorithms differently than libjpeg v6b does, and
   libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.

-- When using chrominance subsampling, because libjpeg v8 implements this
   with its DCT/IDCT scaling algorithms rather than with a separate
   downsampling/upsampling algorithm.  In our testing, the subsampled/upsampled
   output of libjpeg v8 is less accurate than that of libjpeg v6b for this
   reason.

-- When decompressing using a scaling factor > 1 and merged (AKA "non-fancy" or
   "non-smooth") chrominance upsampling, because libjpeg v8 does not support
   merged upsampling with scaling factors > 1.


*******************************************************************************
**     Performance Pitfalls
*******************************************************************************

===============
Restart Markers
===============

The optimized Huffman decoder in libjpeg-turbo does not handle restart markers
in a way that makes the rest of the libjpeg infrastructure happy, so it is
necessary to use the slow Huffman decoder when decompressing a JPEG image that
has restart markers.  This can cause the decompression performance to drop by
as much as 20%, but the performance will still be much greater than that of
libjpeg.  Many consumer packages, such as Photoshop, use restart markers when
generating JPEG images, so images generated by those programs will experience
this issue.
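
A JPEG file that uses restart markers declares a nonzero restart interval in a
DRI (Define Restart Interval, 0xFFDD) segment, so an application can find out
ahead of time whether a file will take the slower path.  The following is a
minimal sketch of such a check.  jpeg_restart_interval() is a hypothetical
helper written for this example, not a libjpeg-turbo function, and it only
inspects the headers that precede the first scan.

```c
#include <assert.h>
#include <stddef.h>

/* Scan the headers of an in-memory JPEG stream for a DRI segment.
   Returns the restart interval in MCUs (0 means that restart markers are not
   in use), or -1 if the data does not look like a well-formed JPEG header.
   Scanning stops at the first SOS (start of scan) marker. */
static long jpeg_restart_interval(const unsigned char *buf, size_t len)
{
  size_t pos = 2;
  long interval = 0;

  assert(buf != NULL);
  if (len < 2 || buf[0] != 0xFF || buf[1] != 0xD8)   /* SOI */
    return -1;

  while (pos + 1 < len) {
    unsigned char marker;
    size_t seglen;

    if (buf[pos] != 0xFF)
      return -1;
    while (pos + 1 < len && buf[pos + 1] == 0xFF)    /* skip fill bytes */
      pos++;
    if (pos + 1 >= len)
      return -1;
    marker = buf[pos + 1];
    pos += 2;

    if (marker == 0xDA || marker == 0xD9)            /* SOS or EOI */
      return interval;
    if (marker == 0x01 || (marker >= 0xD0 && marker <= 0xD7))
      continue;                                      /* no length field */
    if (marker == 0x00)
      return -1;

    if (pos + 2 > len)
      return -1;
    seglen = ((size_t)buf[pos] << 8) | buf[pos + 1];
    if (seglen < 2 || pos + seglen > len)
      return -1;

    if (marker == 0xDD) {                            /* DRI */
      if (seglen != 4)
        return -1;
      interval = ((long)buf[pos + 2] << 8) | buf[pos + 3];
    }
    pos += seglen;
  }
  return -1;   /* ran out of data before reaching SOS */
}
```

A nonzero return value suggests that decompression of that file will use the
slow Huffman decoder described above.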

===============================================
Fast Integer Forward DCT at High Quality Levels
===============================================

The algorithm used by the SIMD-accelerated quantization function cannot produce
correct results whenever the fast integer forward DCT is used along with a JPEG
quality of 98-100.  Thus, libjpeg-turbo must use the non-SIMD quantization
function in those cases.  This causes performance to drop by as much as 40%.
It is therefore strongly advised that you use the slow integer forward DCT
whenever encoding images with a JPEG quality of 98 or higher.
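
One way to follow this advice is to centralize the choice of forward DCT in a
small helper.  In the sketch below, the fdct_choice enum and choose_fdct() are
hypothetical names; the two enum values stand in for libjpeg's JDCT_ISLOW and
JDCT_IFAST so that the example compiles without jpeglib.h.

```c
#include <assert.h>

/* Hypothetical stand-ins for libjpeg's JDCT_ISLOW and JDCT_IFAST. */
typedef enum { DCT_SLOW_INT, DCT_FAST_INT } fdct_choice;

/* Highest quality at which the fast integer forward DCT is still advisable,
   per the text above (qualities 98-100 fall back to non-SIMD quantization). */
#define FAST_FDCT_MAX_QUALITY 97

/* Pick a forward DCT for the given JPEG quality (1-100).  prefer_fast is
   nonzero if the caller would like the fast integer DCT when it is safe. */
static fdct_choice choose_fdct(int quality, int prefer_fast)
{
  assert(quality >= 1 && quality <= 100);
  if (prefer_fast && quality <= FAST_FDCT_MAX_QUALITY)
    return DCT_FAST_INT;
  return DCT_SLOW_INT;
}
```

In a program that uses the libjpeg API, the equivalent decision would be
applied by setting cinfo.dct_method after calling jpeg_set_defaults().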