# FAQ

[TOC]

## General

1. What is RapidJSON?

   RapidJSON is a C++ library for parsing and generating JSON. See [features](doc/features.md) for the full list of what it offers.

2. Why is RapidJSON named so?

   It is inspired by [RapidXML](http://rapidxml.sourceforge.net/), which is a fast XML DOM parser.

3. Is RapidJSON similar to RapidXML?

   RapidJSON borrowed some design ideas from RapidXML, including *in situ* parsing and being a header-only library. However, the two APIs are completely different, and RapidJSON provides many features that are not in RapidXML.

4. Is RapidJSON free?

   Yes, it is free under the MIT license. It can be used in commercial applications. Please check the details in [license.txt](https://github.com/miloyip/rapidjson/blob/master/license.txt).

5. Is RapidJSON small? What are its dependencies?

   Yes. A simple executable which parses a JSON document and prints its statistics is less than 30KB on Windows.

   RapidJSON depends on the C++ standard library only.

6. How do I install RapidJSON?

   Check the [Installation section](https://miloyip.github.io/rapidjson/).

7. Can RapidJSON run on my platform?

   RapidJSON has been tested by the community on many combinations of operating systems, compilers and CPU architectures. However, we cannot guarantee that it runs on your particular platform. Building and running the unit test suite will give you the answer.

8. Does RapidJSON support C++03? C++11?

   RapidJSON was first implemented for C++03. Optional support for some C++11 features (e.g., move constructor, `noexcept`) was added later. RapidJSON should be compatible with C++03 or C++11 compliant compilers.

9. Does RapidJSON really work in real applications?

   Yes. It is deployed in real-world client and server applications. A community member reported that RapidJSON in their system parses 50 million JSON documents daily.

10. How is RapidJSON tested?

   RapidJSON contains a unit test suite for automatic testing. [Travis](https://travis-ci.org/miloyip/rapidjson/) (for Linux) and [AppVeyor](https://ci.appveyor.com/project/miloyip/rapidjson/) (for Windows) compile and run the unit test suite for all modifications. The test process also uses Valgrind (on Linux) to detect memory leaks.

11. Is RapidJSON well documented?

   RapidJSON provides a user guide and API documentation.

12. Are there alternatives?

   Yes, there are many alternatives. For example, [nativejson-benchmark](https://github.com/miloyip/nativejson-benchmark) has a listing of open-source C/C++ JSON libraries. [json.org](http://www.json.org/) also has a list.

## JSON

1. What is JSON?

   JSON (JavaScript Object Notation) is a lightweight data-interchange format. It uses a human-readable text format. See [RFC7159](http://www.ietf.org/rfc/rfc7159.txt) and [ECMA-404](http://www.ecma-international.org/publications/standards/Ecma-404.htm) for details.

2. What are applications of JSON?

   JSON is commonly used in web applications for transferring structured data. It is also used as a file format for data persistence.

3. Does RapidJSON conform to the JSON standard?

   Yes. RapidJSON is fully compliant with [RFC7159](http://www.ietf.org/rfc/rfc7159.txt) and [ECMA-404](http://www.ecma-international.org/publications/standards/Ecma-404.htm). It can handle corner cases, such as null characters and surrogate pairs in JSON strings.

4. Does RapidJSON support relaxed syntax?

   Currently no. RapidJSON only supports the strict standardized format. Support for relaxed syntax is under discussion in this [issue](https://github.com/miloyip/rapidjson/issues/36).

## DOM and SAX

1. What is the DOM style API?

   Document Object Model (DOM) is an in-memory representation of JSON for query and manipulation.

2. What is the SAX style API?

   SAX is an event-driven API for parsing and generation.

3. Should I choose DOM or SAX?

   DOM is easy to use for query and manipulation. SAX is very fast and memory-saving, but is often harder to apply.

4. What is *in situ* parsing?

   *In situ* parsing decodes the JSON strings in place, directly inside the input buffer. This is an optimization which can reduce memory consumption and improve performance, but the input JSON will be modified. Check [in-situ parsing](doc/dom.md) for details.

5. When does parsing generate an error?

   The parser generates an error when the input JSON contains invalid syntax, when a value cannot be represented (e.g., a number is too big), or when the handler terminates the parsing. Check [parse error](doc/dom.md) for details.

6. What error information is provided?

   The error is stored in `ParseResult`, which includes the error code and the offset (number of characters from the beginning of the JSON). The error code can be translated into a human-readable error message.

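   A minimal sketch of reading that information back from a `Document`, assuming the RapidJSON headers are on the include path. `describeParseError()` is a hypothetical helper written for this example; `HasParseError()`, `GetParseError()`, `GetErrorOffset()` and `GetParseError_En()` are the library calls it wraps.

   ```cpp
   #include <cassert>
   #include <string>
   #include "rapidjson/document.h"
   #include "rapidjson/error/en.h"  // GetParseError_En()

   // Hypothetical helper: returns an empty string when `json` parses cleanly,
   // otherwise an English message followed by the error offset.
   std::string describeParseError(const char* json) {
       assert(json != 0);
       rapidjson::Document d;
       d.Parse(json);
       if (!d.HasParseError())
           return std::string();
       std::string message = rapidjson::GetParseError_En(d.GetParseError());
       message += " (offset ";
       message += std::to_string(d.GetErrorOffset());
       message += ")";
       return message;
   }
   ```

   Calling `describeParseError("{\"a\":")` should give a non-empty message, because the input ends before the member value.
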
7. Why not just use `double` to represent JSON numbers?

   Some applications use 64-bit unsigned/signed integers, and these integers cannot always be converted into `double` without loss of precision. So the parser detects whether a JSON number is convertible to the different integer types and/or `double`.

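   The precision loss can be seen without RapidJSON at all. The sketch below assumes IEEE 754 binary64 `double`s; `survivesDouble()` is a hypothetical helper for illustration only.

   ```cpp
   #include <cassert>
   #include <cstdint>

   // Hypothetical helper: reports whether `n` keeps its exact value after a
   // round trip through double.
   bool survivesDouble(std::uint64_t n) {
       // Stay at or below 2^63 so the cast back to uint64_t is always in range.
       assert(n <= (UINT64_C(1) << 63));
       // volatile forces a real 64-bit double even where registers are wider.
       volatile double d = static_cast<double>(n);
       return static_cast<std::uint64_t>(d) == n;
   }
   ```

   With a 53-bit mantissa, `survivesDouble(9007199254740992)` (2^53) holds, while the next integer, 2^53 + 1, does not survive the round trip.
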
8. How do I clear-and-minimize a document or value?

   Call one of the `SetXXX()` methods. They call the destructor, which deallocates the DOM data:

   ```
   Document d;
   ...
   d.SetObject();  // clear and minimize
   ```

   Alternatively, use the equivalent of the [C++ swap with temporary idiom](https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Clear-and-minimize):
   ```
   Value(kObjectType).Swap(d);
   ```
   or the equivalent, but slightly longer to type:
   ```
   d.Swap(Value(kObjectType).Move());
   ```

9. How do I insert a document node into another document?

   Let's take the following two DOM trees represented as JSON documents:
   ```
   Document person;
   person.Parse("{\"person\":{\"name\":{\"first\":\"Adam\",\"last\":\"Thomas\"}}}");

   Document address;
   address.Parse("{\"address\":{\"city\":\"Moscow\",\"street\":\"Quiet\"}}");
   ```
   Let's assume we want to merge them in such a way that the whole `address` document becomes a node of `person`:
   ```
   { "person": {
      "name": { "first": "Adam", "last": "Thomas" },
      "address": { "city": "Moscow", "street": "Quiet" }
      }
   }
   ```

   The most important requirement is to take care of the document and value life cycles, and to manage memory consistently by using the right allocator during the value transfer.

   A simple yet very efficient way to achieve that is to modify the `address` definition above so that it is initialized with the allocator of the `person` document. Then we just add the root member of the value:
   ```
   Document address(&person.GetAllocator());  // the constructor takes an allocator pointer
   ...
   person["person"].AddMember("address", address["address"], person.GetAllocator());
   ```
   Alternatively, if we don't want to explicitly refer to the root value of `address` by name, we can refer to it via an iterator:
   ```
   auto addressRoot = address.MemberBegin();
   person["person"].AddMember(addressRoot->name, addressRoot->value, person.GetAllocator());
   ```

   The second way is to deep-clone the value from the `address` document:
   ```
   // Deep copy into memory owned by the allocator of `person`
   Value addressValue(address["address"], person.GetAllocator());
   person["person"].AddMember("address", addressValue, person.GetAllocator());
   ```

## Document/Value (DOM)

1. What is move semantics? Why?

   Instead of copy semantics, move semantics is used in `Value`. That means, when assigning a source value to a target value, the ownership of the source value is moved to the target value.

   Since moving is faster than copying, this design decision forces the user to be aware of the copying overhead.

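   A short sketch of what this looks like in practice, assuming the RapidJSON headers are available. `assignmentLeavesSourceNull()` is a hypothetical demo function, not a library API.

   ```cpp
   #include <cassert>
   #include "rapidjson/document.h"

   // Hypothetical demo: assigns one Value to another and reports whether the
   // source was left as a JSON null.
   bool assignmentLeavesSourceNull() {
       rapidjson::Value source(123);
       rapidjson::Value target(456);
       target = source;                 // ownership moves; nothing is deep-copied
       assert(target.GetInt() == 123);  // the target now holds the payload
       return source.IsNull();
   }
   ```

   The function is expected to return `true`: after the assignment the source no longer owns any data.
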
2. How do I copy a value?

   There are two APIs: the constructor with an allocator, and `CopyFrom()`. See [Deep Copy Value](doc/tutorial.md) for an example.

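   Both APIs can be sketched side by side as follows, assuming the RapidJSON headers are available. `deepCopyKeepsOriginal()` is a hypothetical demo function.

   ```cpp
   #include <cassert>
   #include <cstring>
   #include "rapidjson/document.h"

   // Hypothetical demo: returns true when both copies hold "foo".
   bool deepCopyKeepsOriginal() {
       rapidjson::Document d;
       rapidjson::Document::AllocatorType& allocator = d.GetAllocator();

       rapidjson::Value original("foo");
       rapidjson::Value viaConstructor(original, allocator);  // constructor with allocator
       rapidjson::Value viaCopyFrom;                          // starts as null
       viaCopyFrom.CopyFrom(original, allocator);

       assert(original.IsString());  // copying did not move the source
       return std::strcmp(viaConstructor.GetString(), "foo") == 0
           && std::strcmp(viaCopyFrom.GetString(), "foo") == 0;
   }
   ```

   The allocator passed to either call owns the memory of the copy, so it should normally be the allocator of the document that will hold the copied value.
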
3. Why do I need to provide the length of a string?

   Since a C string is null-terminated, its length needs to be computed via `strlen()`, which has linear runtime complexity. This incurs an unnecessary overhead for many operations if the user already knows the length of the string.

   Also, RapidJSON can handle `\u0000` (null character) within a string. If a string contains null characters, `strlen()` cannot return its true length. In such cases the user must provide the length of the string explicitly.

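   The second point can be shown with the standard library alone. Both helpers below are hypothetical and exist only for this illustration.

   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <cstring>

   // Length as a C string function sees it: scanning stops at the first
   // null character.
   std::size_t lengthByScan(const char* s) {
       assert(s != 0);
       return std::strlen(s);
   }

   // Length of a character array literal, known at compile time
   // (the terminating null is excluded).
   template <std::size_t N>
   std::size_t lengthByType(const char (&)[N]) {
       return N - 1;
   }
   ```

   For the three-character payload `"a\0b"`, `lengthByScan()` reports 1 while `lengthByType()` reports 3, so only an explicit length preserves the whole string.
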
4. Why do I need to provide an allocator parameter in many DOM manipulation APIs?

   Since the APIs are member functions of `Value`, we do not want to save an allocator pointer in every `Value`.

5. Does it convert between numerical types?

   When using `GetInt()`, `GetUint()`, ... conversion may occur. An integer-to-integer conversion is only performed when it is safe (otherwise it will assert). However, converting a 64-bit signed/unsigned integer to `double` always succeeds, so be aware that it may lose precision. A number with a fraction, or an integer larger than 64-bit, can only be obtained by `GetDouble()`.

## Reader/Writer (SAX)

1. Why don't we just `printf` a JSON? Why do we need a `Writer`?

   Most importantly, `Writer` ensures the output JSON is well-formed. Calling SAX events incorrectly (e.g. `StartObject()` paired with `EndArray()`) will assert. Besides, `Writer` escapes strings (e.g., `\n`). Finally, the numeric output of `printf()` may not be a valid JSON number, especially in locales with digit delimiters. The number-to-string conversion in `Writer` is also implemented with very fast algorithms, which outperform `printf()` and `iostream`.

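   A minimal sketch of the escaping and well-formedness guarantees, assuming the RapidJSON headers are available. `writeSample()` is a hypothetical function for this example; the member names `text` and `count` are made up.

   ```cpp
   #include <cassert>
   #include <string>
   #include "rapidjson/stringbuffer.h"
   #include "rapidjson/writer.h"

   // Hypothetical demo: builds a small object through SAX-style events and
   // returns the generated JSON text.
   std::string writeSample() {
       rapidjson::StringBuffer buffer;
       rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);

       writer.StartObject();
       writer.Key("text");
       writer.String("line1\nline2");  // the raw newline is escaped on output
       writer.Key("count");
       writer.Int(42);
       writer.EndObject();

       assert(writer.IsComplete());    // every Start event has been closed
       return buffer.GetString();
   }
   ```

   The returned text is `{"text":"line1\nline2","count":42}`, with the newline written as the two-character escape `\n`.
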
2. Can I pause the parsing process and resume it later?

   This is not directly supported in the current version for performance reasons. However, if the execution environment supports multi-threading, the user can parse a JSON document in a separate thread and pause it by blocking in the input stream.

## Unicode

1. Does it support UTF-8, UTF-16 and other formats?

   Yes. It fully supports UTF-8, UTF-16 (LE/BE), UTF-32 (LE/BE) and ASCII.

2. Can it validate the encoding?

   Yes, just pass `kParseValidateEncodingFlag` to `Parse()`. If there is an invalid encoding in the stream, it will generate a `kParseErrorStringInvalidEncoding` error.

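   A minimal sketch of passing the flag, assuming the RapidJSON headers are available. `parseWithValidation()` is a hypothetical helper; the flag is supplied as a template argument of `Parse()`.

   ```cpp
   #include <cassert>
   #include "rapidjson/document.h"

   // Hypothetical helper: parses `json` with encoding validation enabled and
   // returns the resulting error code (kParseErrorNone on success).
   rapidjson::ParseErrorCode parseWithValidation(const char* json) {
       assert(json != 0);
       rapidjson::Document d;
       d.Parse<rapidjson::kParseValidateEncodingFlag>(json);
       return d.GetParseError();
   }
   ```

   Since the byte `0xFF` can never appear in well-formed UTF-8, an input such as `"[\"\xFF\"]"` is expected to yield `kParseErrorStringInvalidEncoding`.
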
3. What is a surrogate pair? Does RapidJSON support it?

   JSON uses UTF-16 code units when escaping Unicode characters, e.g. `\u5927` represents the Chinese character for "big". To handle characters outside the basic multilingual plane (BMP), UTF-16 encodes those characters with two 16-bit values, which are called a UTF-16 surrogate pair. For example, the Emoji character U+1F602 can be encoded as `\uD83D\uDE02` in JSON.

   RapidJSON fully supports parsing/generating UTF-16 surrogates.

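   The arithmetic behind the pair can be sketched with the standard library alone. `toSurrogateEscape()` is a hypothetical helper and not a RapidJSON API.

   ```cpp
   #include <cassert>
   #include <cstdio>
   #include <string>

   // Hypothetical helper: formats a code point above the BMP as the
   // \uXXXX\uXXXX escape pair used in JSON text.
   std::string toSurrogateEscape(unsigned long codePoint) {
       assert(codePoint > 0xFFFF && codePoint <= 0x10FFFF);
       const unsigned long v = codePoint - 0x10000;                      // 20 bits remain
       const unsigned high = 0xD800 + static_cast<unsigned>(v >> 10);    // top 10 bits
       const unsigned low  = 0xDC00 + static_cast<unsigned>(v & 0x3FF);  // bottom 10 bits
       char buf[13];  // 12 characters plus the terminating null
       std::snprintf(buf, sizeof(buf), "\\u%04X\\u%04X", high, low);
       return buf;
   }
   ```

   `toSurrogateEscape(0x1F602)` returns `\uD83D\uDE02`, matching the example above.
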
4. Can it handle `\u0000` (null character) in a JSON string?

   Yes. RapidJSON fully supports null characters in JSON strings. However, the user needs to be aware of this and use `GetStringLength()` and related APIs to obtain the true length of a string.

5. Can I output `\uxxxx` for all non-ASCII characters?

   Yes. Using `ASCII<>` as the output encoding template parameter of `Writer` enforces escaping of those characters.

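   A minimal sketch of such a `Writer`, assuming the RapidJSON headers are available and the source strings are UTF-8. `escapeToAscii()` is a hypothetical helper; it wraps the string in an array only to keep the example independent of root-value rules.

   ```cpp
   #include <cassert>
   #include <string>
   #include "rapidjson/encodings.h"
   #include "rapidjson/stringbuffer.h"
   #include "rapidjson/writer.h"

   // Hypothetical helper: writes one UTF-8 string inside a JSON array,
   // with the output restricted to ASCII.
   std::string escapeToAscii(const char* utf8) {
       assert(utf8 != 0);
       rapidjson::StringBuffer buffer;
       rapidjson::Writer<rapidjson::StringBuffer,
                         rapidjson::UTF8<>,    // source encoding
                         rapidjson::ASCII<> >  // target encoding
           writer(buffer);
       writer.StartArray();
       writer.String(utf8);
       writer.EndArray();
       return buffer.GetString();
   }
   ```

   Passing the UTF-8 bytes of U+5927 (`"\xE5\xA4\xA7"`) is expected to produce `["\u5927"]`.
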
## Stream

1. I have a big JSON file. Should I load the whole file into memory?

   The user can use `FileReadStream` to read the file chunk-by-chunk. But for *in situ* parsing, the whole file must be loaded.

2. Can I parse JSON while it is streamed from the network?

   Yes. The user can implement a custom stream for this. Please refer to the implementation of `FileReadStream`.

3. I don't know what encoding the JSON will be in. How do I handle it?

   You may use `AutoUTFInputStream`, which detects the encoding of the input stream automatically. However, it will incur some performance overhead.

4. What is BOM? How does RapidJSON handle it?

   A [byte order mark (BOM)](http://en.wikipedia.org/wiki/Byte_order_mark) sometimes resides at the beginning of a file/stream to indicate its UTF encoding type.

   RapidJSON's `EncodedInputStream` can detect/consume a BOM. `EncodedOutputStream` can optionally write a BOM. See [Encoded Streams](doc/stream.md) for an example.

5. Why is little/big endian relevant?

   The byte order (little or big endian) of a stream matters for UTF-16 and UTF-32 streams, whose code units span multiple bytes, but not for UTF-8 streams.

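   To illustrate, the same two bytes give different UTF-16 code units depending on the byte order. Both helpers are hypothetical and use only the standard library.

   ```cpp
   #include <cassert>

   // Hypothetical helpers: combine two bytes of a UTF-16 stream into one
   // 16-bit code unit, for little-endian and big-endian streams respectively.
   unsigned decodeUtf16LE(const unsigned char* bytes) {
       assert(bytes != 0);
       return static_cast<unsigned>(bytes[0]) | (static_cast<unsigned>(bytes[1]) << 8);
   }

   unsigned decodeUtf16BE(const unsigned char* bytes) {
       assert(bytes != 0);
       return (static_cast<unsigned>(bytes[0]) << 8) | static_cast<unsigned>(bytes[1]);
   }
   ```

   For the bytes `0x27 0x59`, the little-endian reading is U+5927 while the big-endian reading is U+2759, which is why the stream must declare or detect its byte order.
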
## Performance

1. Is RapidJSON really fast?

   Yes. It may be the fastest open source JSON library. There is a [benchmark](https://github.com/miloyip/nativejson-benchmark) for evaluating the performance of C/C++ JSON libraries.

2. Why is it fast?

   Many design decisions of RapidJSON are aimed at time/space performance. These may reduce the user-friendliness of the APIs. Besides, it also employs low-level optimizations (intrinsics, SIMD) and special algorithms (custom double-to-string and string-to-double conversions).

3. What is SIMD? How is it applied in RapidJSON?

   [SIMD](http://en.wikipedia.org/wiki/SIMD) instructions can perform parallel computation in modern CPUs. RapidJSON supports Intel's SSE2/SSE4.2 to accelerate whitespace skipping. This improves the performance of parsing indent-formatted JSON. Define the `RAPIDJSON_SSE2` or `RAPIDJSON_SSE42` macro to enable this feature. However, running the executable on a machine without support for that instruction set will make it crash.

4. Does it consume a lot of memory?

   The design of RapidJSON aims at reducing memory footprint.

   In the SAX API, `Reader` consumes memory proportional to the maximum depth of the JSON tree, plus the maximum length of a JSON string.

   In the DOM API, each `Value` consumes exactly 16/24 bytes on 32/64-bit architectures respectively. RapidJSON also uses a special memory allocator to minimize the overhead of allocations.

5. What is the purpose of being high performance?

   Some applications need to process very large JSON files. Some server-side applications need to process a huge number of JSON documents. Being high performance can improve both latency and throughput. In a broad sense, it will also save energy.

## Gossip

1. Who are the developers of RapidJSON?

   Milo Yip ([miloyip](https://github.com/miloyip)) is the original author of RapidJSON. Many contributors from around the world have improved RapidJSON. Philipp A. Hartmann ([pah](https://github.com/pah)) has implemented a lot of improvements, set up automatic testing, and is involved in many community discussions. Don Ding ([thebusytypist](https://github.com/thebusytypist)) implemented the iterative parser. Andrii Senkovych ([jollyroger](https://github.com/jollyroger)) completed the CMake migration. Kosta ([Kosta-Github](https://github.com/Kosta-Github)) provided a very neat short-string optimization. Thanks to all other contributors and community members as well.

2. Why do you develop RapidJSON?

   It was just a hobby project initially in 2011. Milo Yip is a game programmer who had just learned about JSON at that time and wanted to apply JSON in future projects. As JSON seemed very simple, he wanted to write a fast, header-only library.

3. Why is there a long empty period of development?

   It is basically due to personal issues, such as getting new family members. Also, Milo Yip has spent a lot of spare time on translating "Game Engine Architecture" by Jason Gregory into Chinese.

4. Why did the repository move from Google Code to GitHub?

   This is the trend, and GitHub is much more powerful and convenient.
