Background
==========

libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2,
NEON, AltiVec) to accelerate baseline JPEG compression and decompression on
x86, x86-64, ARM, and PowerPC systems.  On such systems, libjpeg-turbo is
generally 2-6x as fast as libjpeg, all else being equal.  On other types of
systems, libjpeg-turbo can still outperform libjpeg by a significant amount, by
virtue of its highly optimized Huffman coding routines.  In many cases, the
performance of libjpeg-turbo rivals that of proprietary high-speed JPEG codecs.

libjpeg-turbo implements both the traditional libjpeg API and the less
powerful but more straightforward TurboJPEG API.  libjpeg-turbo also features
colorspace extensions that allow it to compress from/decompress to 32-bit and
big-endian pixel buffers (RGBX, XBGR, etc.), as well as a full-featured Java
interface.

libjpeg-turbo was originally based on libjpeg/SIMD, an MMX-accelerated
derivative of libjpeg v6b developed by Miyasaka Masaru.  The TigerVNC and
VirtualGL projects made numerous enhancements to the codec in 2009, and in
early 2010, libjpeg-turbo spun off into an independent project, with the goal
of making high-speed JPEG compression/decompression technology available to a
broader range of users and developers.


License
=======

libjpeg-turbo is covered by three compatible BSD-style open source licenses.
Refer to [LICENSE.md](LICENSE.md) for a roll-up of license terms.


Building libjpeg-turbo
======================

Refer to [BUILDING.md](BUILDING.md) for complete instructions.


Using libjpeg-turbo
===================

libjpeg-turbo includes two APIs that can be used to compress and decompress
JPEG images:

- **TurboJPEG API**<br>
  This API provides an easy-to-use interface for compressing and decompressing
  JPEG images in memory.  It also provides some functionality that would not be
  straightforward to achieve using the underlying libjpeg API, such as
  generating planar YUV images and performing multiple simultaneous lossless
  transforms on an image.  The Java interface for libjpeg-turbo is written on
  top of the TurboJPEG API.

- **libjpeg API**<br>
  This is the de facto industry-standard API for compressing and decompressing
  JPEG images.  It is more difficult to use than the TurboJPEG API but also
  more powerful.  The libjpeg API implementation in libjpeg-turbo is both
  API/ABI-compatible and mathematically compatible with libjpeg v6b.  It can
  also optionally be configured to be API/ABI-compatible with libjpeg v7 and v8
  (see below).

There is no significant performance advantage to either API when both are used
to perform similar operations.

Colorspace Extensions
---------------------

libjpeg-turbo includes extensions that allow JPEG images to be compressed
directly from (and decompressed directly to) buffers that use BGR, BGRX,
RGBX, XBGR, and XRGB pixel ordering.  This is implemented with ten new
colorspace constants:

    JCS_EXT_RGB   /* red/green/blue */
    JCS_EXT_RGBX  /* red/green/blue/x */
    JCS_EXT_BGR   /* blue/green/red */
    JCS_EXT_BGRX  /* blue/green/red/x */
    JCS_EXT_XBGR  /* x/blue/green/red */
    JCS_EXT_XRGB  /* x/red/green/blue */
    JCS_EXT_RGBA  /* red/green/blue/alpha */
    JCS_EXT_BGRA  /* blue/green/red/alpha */
    JCS_EXT_ABGR  /* alpha/blue/green/red */
    JCS_EXT_ARGB  /* alpha/red/green/blue */

Setting `cinfo.in_color_space` (compression) or `cinfo.out_color_space`
(decompression) to one of these values will cause libjpeg-turbo to read the
red, green, and blue values from (or write them to) the appropriate position in
the pixel when compressing from/decompressing to an RGB buffer.
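
To make "the appropriate position in the pixel" concrete, the sketch below
tabulates the byte offsets implied by the six non-alpha orderings, read off the
comments in the list of constants.  It does not call libjpeg-turbo:
`pixel_order`, `pixel_layout`, `layouts`, and `store_pixel()` are hypothetical
names invented for this illustration, not library identifiers.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the non-alpha JCS_EXT_* constants.  These are
 * NOT the values defined by the library; they only index the table below. */
enum pixel_order {
    ORDER_RGB, ORDER_RGBX, ORDER_BGR, ORDER_BGRX, ORDER_XBGR, ORDER_XRGB,
    ORDER_COUNT
};

/* Bytes per pixel and the byte offset of each color channel. */
struct pixel_layout {
    int size;
    int r, g, b;
};

static const struct pixel_layout layouts[ORDER_COUNT] = {
    [ORDER_RGB]  = { 3, 0, 1, 2 },  /* red/green/blue */
    [ORDER_RGBX] = { 4, 0, 1, 2 },  /* red/green/blue/x */
    [ORDER_BGR]  = { 3, 2, 1, 0 },  /* blue/green/red */
    [ORDER_BGRX] = { 4, 2, 1, 0 },  /* blue/green/red/x */
    [ORDER_XBGR] = { 4, 3, 2, 1 },  /* x/blue/green/red */
    [ORDER_XRGB] = { 4, 1, 2, 3 },  /* x/red/green/blue */
};

/* Store one pixel at column `col` of `row`, leaving any X byte untouched. */
void store_pixel(enum pixel_order order, unsigned char *row, size_t col,
                 unsigned char r, unsigned char g, unsigned char b)
{
    const struct pixel_layout *layout = &layouts[order];
    unsigned char *p = row + col * (size_t)layout->size;

    p[layout->r] = r;
    p[layout->g] = g;
    p[layout->b] = b;
}
```

For example, `store_pixel(ORDER_XBGR, row, 0, 10, 20, 30)` writes blue to byte
1, green to byte 2, and red to byte 3, and does not touch byte 0.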

Your application can check for the existence of these extensions at compile
time with:

    #ifdef JCS_EXTENSIONS

At run time, attempting to use these extensions with a libjpeg implementation
that does not support them will result in a "Bogus input colorspace" error.
Applications can trap this error in order to test whether run-time support is
available for the colorspace extensions.

When using the RGBX, BGRX, XBGR, and XRGB colorspaces during decompression, the
X byte is undefined, and in order to ensure the best performance, libjpeg-turbo
can set that byte to whatever value it wishes.  If an application expects the X
byte to be used as an alpha channel, then it should specify `JCS_EXT_RGBA`,
`JCS_EXT_BGRA`, `JCS_EXT_ABGR`, or `JCS_EXT_ARGB`.  When these colorspace
constants are used, the X byte is guaranteed to be 0xFF, which is interpreted
as opaque.
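
If the alpha-channel constants are not available in the library being used,
one possible workaround is to decompress to a 4-byte ordering and then
overwrite the undefined X byte in the application.  The sketch below is plain C
and not part of libjpeg-turbo; the helper name `force_opaque()` and its
parameters are assumptions made for this example.

```c
#include <stddef.h>

/* Set the X byte of every 4-byte pixel to 0xFF (opaque).
 *
 * pixels:   start of the image buffer
 * width:    pixels per row
 * height:   number of rows
 * pitch:    bytes per row (may exceed width * 4 if rows are padded)
 * x_offset: position of the X byte within a pixel, e.g. 3 for RGBX/BGRX
 *           and 0 for XBGR/XRGB */
void force_opaque(unsigned char *pixels, size_t width, size_t height,
                  size_t pitch, size_t x_offset)
{
    for (size_t row = 0; row < height; row++) {
        unsigned char *p = pixels + row * pitch;

        for (size_t col = 0; col < width; col++)
            p[col * 4 + x_offset] = 0xFF;
    }
}
```

This costs an extra pass over the buffer, so the alpha-channel constants remain
the better choice wherever they are supported.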

Your application can check for the existence of the alpha channel colorspace
extensions at compile time with:

    #ifdef JCS_ALPHA_EXTENSIONS

[jcstest.c](jcstest.c), located in the libjpeg-turbo source tree, demonstrates
how to check for the existence of the colorspace extensions at compile time and
run time.

libjpeg v7 and v8 API/ABI Emulation
-----------------------------------

With libjpeg v7 and v8, new features were added that necessitated extending the
compression and decompression structures.  Unfortunately, due to the exposed
nature of those structures, extending them also necessitated breaking backward
ABI compatibility with previous libjpeg releases.  Thus, programs that were
built to use libjpeg v7 or v8 did not work with libjpeg-turbo, since it is
based on the libjpeg v6b code base.  Although libjpeg v7 and v8 are not
as widely used as v6b, enough programs (including a few Linux distros) made
the switch that there was a demand to emulate the libjpeg v7 and v8 ABIs
in libjpeg-turbo.  It should be noted, however, that this feature was added
primarily so that applications that had already been compiled to use libjpeg
v7+ could take advantage of accelerated baseline JPEG encoding/decoding
without recompiling.  libjpeg-turbo does not claim to support all of the
libjpeg v7+ features, nor to produce identical output to libjpeg v7+ in all
cases (see below).

By passing an argument of `--with-jpeg7` or `--with-jpeg8` to `configure`, or
an argument of `-DWITH_JPEG7=1` or `-DWITH_JPEG8=1` to `cmake`, you can build a
version of libjpeg-turbo that emulates the libjpeg v7 or v8 ABI, so that
programs that are built against libjpeg v7 or v8 can be run with libjpeg-turbo.
The following section describes which libjpeg v7+ features are supported and
which aren't.

### Support for libjpeg v7 and v8 Features

#### Fully supported

- **libjpeg: IDCT scaling extensions in decompressor**<br>
  libjpeg-turbo supports IDCT scaling with scaling factors of 1/8, 1/4, 3/8,
  1/2, 5/8, 3/4, 7/8, 9/8, 5/4, 11/8, 3/2, 13/8, 7/4, 15/8, and 2/1 (only 1/4
  and 1/2 are SIMD-accelerated).

- **libjpeg: Arithmetic coding**

- **libjpeg: In-memory source and destination managers**<br>
  See notes below.

- **cjpeg: Separate quality settings for luminance and chrominance**<br>
  Note that the libjpeg v7+ API was extended to accommodate this feature only
  for convenience purposes.  It has always been possible to implement this
  feature with libjpeg v6b (see rdswitch.c for an example).

- **cjpeg: 32-bit BMP support**

- **cjpeg: `-rgb` option**

- **jpegtran: Lossless cropping**

- **jpegtran: `-perfect` option**

- **jpegtran: Forcing width/height when performing lossless crop**

- **rdjpgcom: `-raw` option**

- **rdjpgcom: Locale awareness**
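
The IDCT scaling factors listed in the first item above are all of the form
N/8.  As a rough illustration of what such a factor does to image dimensions,
the sketch below computes a scaled size under the assumption that the result is
rounded up to the next whole pixel.  `scaled_dimension()` is a hypothetical
helper written for this example, not a libjpeg-turbo function.

```c
/* Ceiling division for a positive divisor. */
static unsigned long div_round_up(unsigned long a, unsigned long b)
{
    return (a + b - 1) / b;
}

/* Size of one image dimension after scaling by n/8, assuming the result is
 * rounded up.  n must be in the range 1..16 (1/8 through 2/1); any other
 * value returns 0 to signal an unsupported factor. */
unsigned long scaled_dimension(unsigned long size, unsigned int n)
{
    if (n < 1 || n > 16)
        return 0;
    return div_round_up(size * n, 8);
}
```

Under that assumption, a 1921-pixel-wide image decoded at 1/2 (n = 4) would be
961 pixels wide, and a 100-pixel-wide image decoded at 3/8 (n = 3) would be 38
pixels wide.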


#### Not supported

NOTE:  As of this writing, extensive research has been conducted into the
usefulness of DCT scaling as a means of data reduction and SmartScale as a
means of quality improvement.  The reader is invited to peruse the research at
<http://www.libjpeg-turbo.org/About/SmartScale> and draw his/her own conclusions,
but it is the general belief of our project that these features have not
demonstrated sufficient usefulness to justify inclusion in libjpeg-turbo.

- **libjpeg: DCT scaling in compressor**<br>
  `cinfo.scale_num` and `cinfo.scale_denom` are silently ignored.
  There is no technical reason why DCT scaling could not be supported when
  emulating the libjpeg v7+ API/ABI, but without the SmartScale extension (see
  below), only scaling factors of 1/2, 8/15, 4/7, 8/13, 2/3, 8/11, 4/5, and
  8/9 would be available, which is of limited usefulness.

- **libjpeg: SmartScale**<br>
  `cinfo.block_size` is silently ignored.
  SmartScale is an extension to the JPEG format that allows for DCT block
  sizes other than 8x8.  Providing support for this new format would be
  feasible (particularly without full acceleration).  However, until/unless
  the format becomes either an official industry standard or, at minimum, an
  accepted solution in the community, we are hesitant to implement it, as
  there is no sense of whether or how it might change in the future.  It is
  our belief that SmartScale has not demonstrated sufficient usefulness as a
  lossless format nor as a means of quality enhancement, and thus our primary
  interest in providing this feature would be as a means of supporting
  additional DCT scaling factors.

- **libjpeg: Fancy downsampling in compressor**<br>
  `cinfo.do_fancy_downsampling` is silently ignored.
  This requires the DCT scaling feature, which is not supported.

- **jpegtran: Scaling**<br>
  This requires both the DCT scaling and SmartScale features, which are not
  supported.

- **Lossless RGB JPEG files**<br>
  This requires the SmartScale feature, which is not supported.

### What About libjpeg v9?

libjpeg v9 introduced yet another field to the JPEG compression structure
(`color_transform`), thus making the ABI backward incompatible with that of
libjpeg v8.  This new field was introduced solely for the purpose of supporting
lossless SmartScale encoding.  Furthermore, there was actually no reason to
extend the API in this manner, as the color transform could have just as easily
been activated by way of a new JPEG colorspace constant, thus preserving
backward ABI compatibility.

Our research (see link above) has shown that lossless SmartScale does not
generally accomplish anything that can't already be accomplished better with
existing, standard lossless formats.  Therefore, at this time it is our belief
that there is not sufficient technical justification for software projects to
upgrade from libjpeg v8 to libjpeg v9, and thus there is not sufficient
technical justification for us to emulate the libjpeg v9 ABI.

In-Memory Source/Destination Managers
-------------------------------------

By default, libjpeg-turbo 1.3 and later include the `jpeg_mem_src()` and
`jpeg_mem_dest()` functions, even when not emulating the libjpeg v8 API/ABI.
Previously, it was necessary to build libjpeg-turbo from source with libjpeg v8
API/ABI emulation in order to use the in-memory source/destination managers,
but several projects requested that those functions be included when emulating
the libjpeg v6b API/ABI as well.  This allows the use of those functions by
programs that need them, without breaking ABI compatibility for programs that
don't, and it allows those functions to be provided in the "official"
libjpeg-turbo binaries.

Those who are concerned about maintaining strict conformance with the libjpeg
v6b or v7 API can pass an argument of `--without-mem-srcdst` to `configure` or
an argument of `-DWITH_MEM_SRCDST=0` to `cmake` prior to building
libjpeg-turbo.  This will restore the pre-1.3 behavior, in which
`jpeg_mem_src()` and `jpeg_mem_dest()` are only included when emulating the
libjpeg v8 API/ABI.

On Un*x systems, including the in-memory source/destination managers changes
the dynamic library version from 62.1.0 to 62.2.0 if using libjpeg v6b API/ABI
emulation and from 7.1.0 to 7.2.0 if using libjpeg v7 API/ABI emulation.

Note that, on most Un*x systems, the dynamic linker will not look for a
function in a library until that function is actually used.  Thus, if a program
is built against libjpeg-turbo 1.3+ and uses `jpeg_mem_src()` or
`jpeg_mem_dest()`, that program will not fail if run against an older version
of libjpeg-turbo or against libjpeg v7- until the program actually tries to
call `jpeg_mem_src()` or `jpeg_mem_dest()`.  Such is not the case on Windows.
If a program is built against the libjpeg-turbo 1.3+ DLL and uses
`jpeg_mem_src()` or `jpeg_mem_dest()`, then it must use the libjpeg-turbo 1.3+
DLL at run time.

Both cjpeg and djpeg have been extended to allow testing the in-memory
source/destination manager functions.  See their respective man pages for more
details.


Mathematical Compatibility
==========================

For the most part, libjpeg-turbo should produce identical output to libjpeg
v6b.  The one exception to this is when using the floating point DCT/IDCT, in
which case the outputs of libjpeg v6b and libjpeg-turbo can differ for the
following reasons:

- The SSE/SSE2 floating point DCT implementation in libjpeg-turbo is ever so
  slightly more accurate than the implementation in libjpeg v6b, but not by
  any amount perceptible to human vision (generally in the range of 0.01 to
  0.08 dB gain in PSNR).

- When not using the SIMD extensions, libjpeg-turbo uses the more accurate
  (and slightly faster) floating point IDCT algorithm introduced in libjpeg
  v8a as opposed to the algorithm used in libjpeg v6b.  It should be noted,
  however, that this algorithm basically brings the accuracy of the floating
  point IDCT in line with the accuracy of the slow integer IDCT.  The floating
  point DCT/IDCT algorithms are mainly a legacy feature, and they do not
  produce significantly more accuracy than the slow integer algorithms (to put
  numbers on this, the typical difference in PSNR between the two algorithms
  is less than 0.10 dB, whereas changing the quality level by 1 in the upper
  range of the quality scale is typically more like a 1.0 dB difference).

- If the floating point algorithms in libjpeg-turbo are not implemented using
  SIMD instructions on a particular platform, then the accuracy of the
  floating point DCT/IDCT can depend on the compiler settings.

While libjpeg-turbo does emulate the libjpeg v8 API/ABI, under the hood it is
still using the same algorithms as libjpeg v6b, so there are several specific
cases in which libjpeg-turbo cannot be expected to produce the same output as
libjpeg v8:

- When decompressing using scaling factors of 1/2 and 1/4, because libjpeg v8
  implements those scaling algorithms differently than libjpeg v6b does, and
  libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.

- When using chrominance subsampling, because libjpeg v8 implements this
  with its DCT/IDCT scaling algorithms rather than with a separate
  downsampling/upsampling algorithm.  In our testing, the subsampled/upsampled
  output of libjpeg v8 is less accurate than that of libjpeg v6b for this
  reason.

- When decompressing using a scaling factor > 1 and merged (AKA "non-fancy" or
  "non-smooth") chrominance upsampling, because libjpeg v8 does not support
  merged upsampling with scaling factors > 1.


Performance Pitfalls
====================

Restart Markers
---------------

The optimized Huffman decoder in libjpeg-turbo does not handle restart markers
in a way that makes the rest of the libjpeg infrastructure happy, so it is
necessary to use the slow Huffman decoder when decompressing a JPEG image that
has restart markers.  This can cause the decompression performance to drop by
as much as 20%, but the performance will still be much greater than that of
libjpeg.  Many consumer packages, such as Photoshop, use restart markers when
generating JPEG images, so images generated by those programs will experience
this issue.
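
To find out ahead of time whether a given file is likely to hit this slower
path, an application could look for a DRI (Define Restart Interval, 0xFFDD)
segment in the JPEG headers.  The sketch below is a standalone header scan
rather than a libjpeg-turbo API: `restart_interval()` is a hypothetical helper,
and it only inspects segments that precede the first scan, so it is a heuristic
and not a full JPEG parser.  A nonzero return value means restart markers
should be expected in the entropy-coded data.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the restart interval (in MCUs) declared before the first scan, or 0
 * if no DRI segment is found or the data is not a well-formed JPEG header. */
unsigned int restart_interval(const uint8_t *buf, size_t len)
{
    unsigned int interval = 0;
    size_t pos = 2;

    /* Every JPEG stream starts with the SOI marker, 0xFFD8. */
    if (len < 4 || buf[0] != 0xFF || buf[1] != 0xD8)
        return 0;

    while (pos + 4 <= len) {
        uint8_t marker;
        size_t seglen;

        if (buf[pos] != 0xFF)
            return 0;                     /* lost marker synchronization */
        marker = buf[pos + 1];
        if (marker == 0xFF) {             /* fill byte before a marker */
            pos++;
            continue;
        }
        if (marker == 0xDA || marker == 0xD9)
            return interval;              /* SOS or EOI: headers are done */

        /* The 2-byte big-endian length counts itself but not the marker. */
        seglen = ((size_t)buf[pos + 2] << 8) | buf[pos + 3];
        if (seglen < 2 || pos + 2 + seglen > len)
            return 0;                     /* truncated or corrupt segment */
        if (marker == 0xDD && seglen == 4)
            interval = ((unsigned int)buf[pos + 4] << 8) | buf[pos + 5];
        pos += 2 + seglen;
    }
    return interval;
}
```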

Fast Integer Forward DCT at High Quality Levels
-----------------------------------------------

The algorithm used by the SIMD-accelerated quantization function cannot produce
correct results whenever the fast integer forward DCT is used along with a JPEG
quality of 98-100.  Thus, libjpeg-turbo must use the non-SIMD quantization
function in those cases.  This causes performance to drop by as much as 40%.
It is therefore strongly advised that you use the slow integer forward DCT
whenever encoding images with a JPEG quality of 98 or higher.
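
One way to follow this advice is to centralize the decision in a small helper.
The sketch below is only an illustration: the `dct_choice` enum and
`choose_forward_dct()` are hypothetical names, kept free of libjpeg headers so
that the logic can be read on its own.

```c
/* Hypothetical result type; an application using the libjpeg API would map
 * these to JDCT_ISLOW and JDCT_IFAST. */
enum dct_choice { DCT_SLOW_INTEGER, DCT_FAST_INTEGER };

/* Pick a forward DCT for the given JPEG quality (1-100).  The fast integer
 * DCT is used only if the caller prefers it and the quality is below 98. */
enum dct_choice choose_forward_dct(int quality, int prefer_fast)
{
    if (quality >= 98)
        return DCT_SLOW_INTEGER;
    return prefer_fast ? DCT_FAST_INTEGER : DCT_SLOW_INTEGER;
}
```

With the libjpeg API, the result would typically be translated to `JDCT_ISLOW`
or `JDCT_IFAST` and stored in `cinfo.dct_method` before compression starts.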
    342