# Stream

In RapidJSON, `rapidjson::Stream` is a concept for reading and writing JSON. This section first shows how to use the streams provided, and then how to create a custom stream.

[TOC]

# Memory Streams {#MemoryStreams}

Memory streams store JSON in memory.

## StringStream (Input) {#StringStream}

`StringStream` is the most basic input stream. It represents a complete, read-only JSON stored in memory. It is defined in `rapidjson/rapidjson.h`.

~~~~~~~~~~cpp
#include "rapidjson/document.h" // will include "rapidjson/rapidjson.h"

using namespace rapidjson;

// ...
const char json[] = "[1, 2, 3, 4]";
StringStream s(json);

Document d;
d.ParseStream(s);
~~~~~~~~~~

Since this usage is very common, `Document::Parse(const char*)` is provided to do exactly the same as above:

~~~~~~~~~~cpp
// ...
const char json[] = "[1, 2, 3, 4]";
Document d;
d.Parse(json);
~~~~~~~~~~

Note that `StringStream` is a typedef of `GenericStringStream<UTF8<> >`. Users may use other encodings to represent the character set of the stream.

## StringBuffer (Output) {#StringBuffer}

`StringBuffer` is a simple output stream. It allocates a memory buffer for writing the whole JSON. Use `GetString()` to obtain the buffer.

~~~~~~~~~~cpp
#include "rapidjson/stringbuffer.h"
#include "rapidjson/writer.h" // Writer

StringBuffer buffer;
Writer<StringBuffer> writer(buffer);
d.Accept(writer);

const char* output = buffer.GetString();
~~~~~~~~~~

When the buffer is full, it increases its capacity automatically. The default capacity is 256 characters (256 bytes for UTF8, 512 bytes for UTF16, etc.). Users can provide an allocator and an initial capacity.

~~~~~~~~~~cpp
StringBuffer buffer1(0, 1024); // Use its allocator, initial size = 1024
StringBuffer buffer2(allocator, 1024); // Use a user-supplied allocator (passed as a pointer)
~~~~~~~~~~

By default, `StringBuffer` will instantiate an internal allocator.

Similarly, `StringBuffer` is a typedef of `GenericStringBuffer<UTF8<> >`.

# File Streams {#FileStreams}

When parsing JSON from a file, you may read the whole JSON into memory and use `StringStream` as above.

However, if the JSON is big or memory is limited, you can use `FileReadStream`. It reads only a part of the JSON from the file into a buffer and lets that part be parsed. When the buffer runs out of characters, it reads the next part from the file.

## FileReadStream (Input) {#FileReadStream}

`FileReadStream` reads the file via a `FILE` pointer. The user needs to provide a buffer.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filereadstream.h"
#include <cstdio>

using namespace rapidjson;

FILE* fp = fopen("big.json", "rb"); // non-Windows use "r"

char readBuffer[65536];
FileReadStream is(fp, readBuffer, sizeof(readBuffer));

Document d;
d.ParseStream(is);

fclose(fp);
~~~~~~~~~~

Unlike string streams, `FileReadStream` is a byte stream. It does not handle encodings. If the file is not UTF-8, the byte stream can be wrapped in an `EncodedInputStream`, which is discussed in the Encoded Streams section.

Apart from reading files, users can also use `FileReadStream` to read `stdin`.

## FileWriteStream (Output) {#FileWriteStream}

`FileWriteStream` is a buffered output stream. Its usage is very similar to `FileReadStream`.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filewritestream.h"
#include "rapidjson/writer.h" // Writer
#include <cstdio>

using namespace rapidjson;

Document d;
d.Parse(json);
// ...

FILE* fp = fopen("output.json", "wb"); // non-Windows use "w"

char writeBuffer[65536];
FileWriteStream os(fp, writeBuffer, sizeof(writeBuffer));

Writer<FileWriteStream> writer(os);
d.Accept(writer);

fclose(fp);
~~~~~~~~~~

It can also direct the output to `stdout`.

# Encoded Streams {#EncodedStreams}

Encoded streams do not contain JSON themselves. They wrap byte streams to provide basic encoding and decoding functions.

As mentioned above, UTF-8 byte streams can be read directly. However, UTF-16 and UTF-32 have endianness issues. To handle endianness correctly, an encoded stream converts bytes into characters (e.g. `wchar_t` for UTF-16) while reading, and characters into bytes while writing.

In addition, it needs to handle the [byte order mark (BOM)](http://en.wikipedia.org/wiki/Byte_order_mark). When reading from a byte stream, it must detect, or simply consume, the BOM if one exists. When writing to a byte stream, it can optionally write a BOM.

If the encoding of the stream is known at compile time, you may use `EncodedInputStream` and `EncodedOutputStream`. If the stream can be UTF-8, UTF-16LE, UTF-16BE, UTF-32LE or UTF-32BE JSON, and the encoding is only known at runtime, you may use `AutoUTFInputStream` and `AutoUTFOutputStream`. These streams are defined in `rapidjson/encodedstream.h`.

Note that these encoded streams can be applied to streams other than files. For example, a file in memory or a custom byte stream can be wrapped in an encoded stream.

## EncodedInputStream {#EncodedInputStream}

`EncodedInputStream` has two template parameters. The first one is an `Encoding` class, such as `UTF8` or `UTF16LE`, defined in `rapidjson/encodings.h`. The second one is the class of the stream to be wrapped.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filereadstream.h"   // FileReadStream
#include "rapidjson/encodedstream.h"    // EncodedInputStream
#include <cstdio>

using namespace rapidjson;

FILE* fp = fopen("utf16le.json", "rb"); // non-Windows use "r"

char readBuffer[256];
FileReadStream bis(fp, readBuffer, sizeof(readBuffer));

EncodedInputStream<UTF16LE<>, FileReadStream> eis(bis);  // wraps bis into eis

Document d; // Document is GenericDocument<UTF8<> >
d.ParseStream<0, UTF16LE<> >(eis);  // Parses UTF-16LE file into UTF-8 in memory

fclose(fp);
~~~~~~~~~~

## EncodedOutputStream {#EncodedOutputStream}

`EncodedOutputStream` is similar, but it has a `bool putBOM` parameter in the constructor, which controls whether to write a BOM into the output byte stream.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filewritestream.h"  // FileWriteStream
#include "rapidjson/encodedstream.h"    // EncodedOutputStream
#include "rapidjson/writer.h"           // Writer
#include <cstdio>

using namespace rapidjson;

Document d;         // Document is GenericDocument<UTF8<> >
// ...

FILE* fp = fopen("output_utf32le.json", "wb"); // non-Windows use "w"

char writeBuffer[256];
FileWriteStream bos(fp, writeBuffer, sizeof(writeBuffer));

typedef EncodedOutputStream<UTF32LE<>, FileWriteStream> OutputStream;
OutputStream eos(bos, true);   // Write BOM

// Template arguments: output stream, source encoding (in memory), target encoding (in file)
Writer<OutputStream, UTF8<>, UTF32LE<> > writer(eos);
d.Accept(writer);   // This generates a UTF-32LE file from UTF-8 in memory

fclose(fp);
~~~~~~~~~~

## AutoUTFInputStream {#AutoUTFInputStream}

Sometimes an application may want to handle all supported JSON encodings. `AutoUTFInputStream` first tries to detect the encoding from the BOM. If no BOM is available, it uses characteristics of valid JSON to detect the encoding. If neither method succeeds, it falls back to the UTF type provided in the constructor.

Since the characters (code units) may be 8-bit, 16-bit or 32-bit, `AutoUTFInputStream` requires a character type which can hold at least 32 bits. We may use `unsigned` as the template parameter:

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filereadstream.h"   // FileReadStream
#include "rapidjson/encodedstream.h"    // AutoUTFInputStream
#include <cstdio>

using namespace rapidjson;

FILE* fp = fopen("any.json", "rb"); // non-Windows use "r"

char readBuffer[256];
FileReadStream bis(fp, readBuffer, sizeof(readBuffer));

AutoUTFInputStream<unsigned, FileReadStream> eis(bis);  // wraps bis into eis

Document d;         // Document is GenericDocument<UTF8<> >
d.ParseStream<0, AutoUTF<unsigned> >(eis); // This parses any UTF file into UTF-8 in memory

fclose(fp);
~~~~~~~~~~

When specifying the encoding of the stream, use `AutoUTF<CharType>` as in `ParseStream()` above.

You can obtain the UTF type via `UTFType GetType()`, and check whether a BOM was found via `HasBOM()`.

## AutoUTFOutputStream {#AutoUTFOutputStream}

Similarly, to choose the output encoding at runtime, we can use `AutoUTFOutputStream`. This class is not automatic *per se*. You need to specify the UTF type and whether to write a BOM at runtime.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filewritestream.h"  // FileWriteStream
#include "rapidjson/encodedstream.h"    // AutoUTFOutputStream
#include "rapidjson/writer.h"           // Writer
#include <cstdio>

using namespace rapidjson;

void WriteJSONFile(FILE* fp, UTFType type, bool putBOM, const Document& d) {
    char writeBuffer[256];
    FileWriteStream bos(fp, writeBuffer, sizeof(writeBuffer));

    typedef AutoUTFOutputStream<unsigned, FileWriteStream> OutputStream;
    OutputStream eos(bos, type, putBOM);

    // The character type of AutoUTF must match the one used by the stream
    Writer<OutputStream, UTF8<>, AutoUTF<unsigned> > writer(eos);
    d.Accept(writer);
}
~~~~~~~~~~

`AutoUTFInputStream` and `AutoUTFOutputStream` are more convenient than `EncodedInputStream` and `EncodedOutputStream`. They only incur a small runtime overhead.
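
To make the BOM handling described in this section concrete, here is a minimal standalone sketch of BOM detection on a raw byte buffer. It is not RapidJSON's code: the enum `BomKind` and the function `DetectBom` are hypothetical names chosen for illustration (RapidJSON's own type is `UTFType`), and the sketch covers only the BOM step, not the fallback that inspects the JSON content.

```cpp
#include <cstddef>

// Hypothetical names for illustration only; RapidJSON's own enum is UTFType.
enum BomKind { kNoBom, kBomUtf8, kBomUtf16LE, kBomUtf16BE, kBomUtf32LE, kBomUtf32BE };

// Returns the kind of BOM found at the start of data and stores its
// length in bytes in *bomLength (0 when no BOM is present).
inline BomKind DetectBom(const unsigned char* data, std::size_t size, std::size_t* bomLength) {
    *bomLength = 0;
    // UTF-32 is tested before UTF-16 because the UTF-32LE BOM (FF FE 00 00)
    // starts with the UTF-16LE BOM (FF FE).
    if (size >= 4 && data[0] == 0xFF && data[1] == 0xFE && data[2] == 0x00 && data[3] == 0x00) {
        *bomLength = 4;
        return kBomUtf32LE;
    }
    if (size >= 4 && data[0] == 0x00 && data[1] == 0x00 && data[2] == 0xFE && data[3] == 0xFF) {
        *bomLength = 4;
        return kBomUtf32BE;
    }
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF) {
        *bomLength = 3;
        return kBomUtf8;
    }
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE) {
        *bomLength = 2;
        return kBomUtf16LE;
    }
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF) {
        *bomLength = 2;
        return kBomUtf16BE;
    }
    return kNoBom;
}
```

A reader built on this would skip `*bomLength` bytes before decoding. The ordering of the checks relies on the fact that valid JSON cannot begin with a U+0000 character, so `FF FE 00 00` is read as a UTF-32LE BOM and not as a UTF-16LE BOM followed by a null character.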

# Custom Stream {#CustomStream}

In addition to memory and file streams, users can create their own stream classes that fit RapidJSON's API. For example, you may create a network stream, a stream from a compressed file, etc.

RapidJSON combines different types using templates. Any class that provides the required interface can be a stream. The Stream interface is defined in comments of `rapidjson/rapidjson.h`:

~~~~~~~~~~cpp
concept Stream {
    typename Ch;    //!< Character type of the stream.

    //! Read the current character from stream without moving the read cursor.
    Ch Peek() const;

    //! Read the current character from stream and moving the read cursor to next character.
    Ch Take();

    //! Get the current read cursor.
    //! \return Number of characters read from start.
    size_t Tell();

    //! Begin writing operation at the current read pointer.
    //! \return The begin writer pointer.
    Ch* PutBegin();

    //! Write a character.
    void Put(Ch c);

    //! Flush the buffer.
    void Flush();

    //! End the writing operation.
    //! \param begin The begin write pointer returned by PutBegin().
    //! \return Number of characters written.
    size_t PutEnd(Ch* begin);
}
~~~~~~~~~~

Input streams must implement `Peek()`, `Take()` and `Tell()`.
Output streams must implement `Put()` and `Flush()`.
There are two special functions, `PutBegin()` and `PutEnd()`, which are only used for *in situ* parsing. Normal streams do not implement them. However, even if a function is not needed for a particular stream, the stream still needs to provide a dummy implementation of it. Otherwise a compilation error is generated.

## Example: istream wrapper {#ExampleIStreamWrapper}

The following example is a wrapper of `std::istream`, which implements only the three input functions.

~~~~~~~~~~cpp
#include <cassert>
#include <istream>

class IStreamWrapper {
public:
    typedef char Ch;

    IStreamWrapper(std::istream& is) : is_(is) {
    }

    Ch Peek() const { // 1
        int c = is_.peek();
        return c == std::char_traits<char>::eof() ? '\0' : (Ch)c;
    }

    Ch Take() { // 2
        int c = is_.get();
        return c == std::char_traits<char>::eof() ? '\0' : (Ch)c;
    }

    size_t Tell() const { return (size_t)is_.tellg(); } // 3

    // Dummy implementations of the output functions
    Ch* PutBegin() { assert(false); return 0; }
    void Put(Ch) { assert(false); }
    void Flush() { assert(false); }
    size_t PutEnd(Ch*) { assert(false); return 0; }

private:
    IStreamWrapper(const IStreamWrapper&);
    IStreamWrapper& operator=(const IStreamWrapper&);

    std::istream& is_;
};
~~~~~~~~~~

Users can use it to wrap instances of `std::stringstream` or `std::ifstream`.

~~~~~~~~~~cpp
const char* json = "[1,2,3,4]";
std::stringstream ss(json);
IStreamWrapper is(ss);

Document d;
d.ParseStream(is);
~~~~~~~~~~

Note that this implementation may not be as efficient as RapidJSON's memory or file streams, due to internal overheads of the standard library.

## Example: ostream wrapper {#ExampleOStreamWrapper}

The following example is a wrapper of `std::ostream`, which implements only the two output functions.

~~~~~~~~~~cpp
#include <cassert>
#include <ostream>

class OStreamWrapper {
public:
    typedef char Ch;

    OStreamWrapper(std::ostream& os) : os_(os) {
    }

    // Dummy implementations of the input functions
    Ch Peek() const { assert(false); return '\0'; }
    Ch Take() { assert(false); return '\0'; }
    size_t Tell() const { assert(false); return 0; }

    Ch* PutBegin() { assert(false); return 0; }
    void Put(Ch c) { os_.put(c); }                  // 1
    void Flush() { os_.flush(); }                   // 2
    size_t PutEnd(Ch*) { assert(false); return 0; }

private:
    OStreamWrapper(const OStreamWrapper&);
    OStreamWrapper& operator=(const OStreamWrapper&);

    std::ostream& os_;
};
~~~~~~~~~~

Users can use it to wrap instances of `std::stringstream` or `std::ofstream`.

~~~~~~~~~~cpp
Document d;
// ...

std::stringstream ss;
OStreamWrapper os(ss);

Writer<OStreamWrapper> writer(os);
d.Accept(writer);
~~~~~~~~~~

Note that this implementation may not be as efficient as RapidJSON's memory or file streams, due to internal overheads of the standard library.

# Summary {#Summary}

This section describes the stream classes available in RapidJSON. Memory streams are simple. File streams can reduce the memory required during JSON parsing and generation if the JSON is stored in a file system. Encoded streams convert between byte streams and character streams. Finally, users may create custom streams using a simple interface.
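
As a closing illustration of why a class with the right member functions is enough to act as a stream, the following standalone sketch shows a template function that works with any type offering `Ch`, `Peek()` and `Take()`. Both `CStringInput` and `SkipWhitespaceSketch` are hypothetical names made up for this example; they are not part of RapidJSON.

```cpp
#include <cstddef>

// Hypothetical minimal input stream over a null-terminated string.
class CStringInput {
public:
    typedef char Ch;

    explicit CStringInput(const char* s) : begin_(s), current_(s) {}

    // Current character without advancing; '\0' marks the end.
    Ch Peek() const { return *current_; }

    // Return the current character and advance, but never move past the terminator.
    Ch Take() { return *current_ == '\0' ? '\0' : *current_++; }

    // Number of characters consumed so far.
    std::size_t Tell() const { return static_cast<std::size_t>(current_ - begin_); }

private:
    const char* begin_;
    const char* current_;
};

// Hypothetical consumer: skips JSON whitespace on any type that provides
// Ch, Peek() and Take(). No common base class is needed, since the
// template only requires that the member functions exist.
template <typename InputStream>
void SkipWhitespaceSketch(InputStream& is) {
    for (;;) {
        typename InputStream::Ch c = is.Peek();
        if (c == ' ' || c == '\n' || c == '\r' || c == '\t')
            is.Take();
        else
            break;
    }
}
```

Because the function is resolved at compile time for each stream type, a custom stream written this way pays no virtual-call cost, which is the design trade-off behind using a concept instead of an abstract base class.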