======================
Nanopb: Basic concepts
======================

.. include :: menu.rst

The things outlined here are the underlying concepts of the nanopb design.

.. contents::

Proto files
===========
All Protocol Buffers implementations use .proto files to describe the message
format. The point of these files is to be a portable interface description
language.

Compiling .proto files for nanopb
---------------------------------
Nanopb uses Google's protoc compiler to parse the .proto file, and then a
Python script to generate the C header and source code from it::

    user@host:~$ protoc -omessage.pb message.proto
    user@host:~$ python ../generator/nanopb_generator.py message.pb
    Writing to message.h and message.c
    user@host:~$

Modifying generator behaviour
-----------------------------
Using generator options, you can set maximum sizes for fields in order to
allocate them statically. The preferred way to do this is to create an .options
file with the same name as your .proto file::

    # Foo.proto
    message Foo {
        required string name = 1;
    }

::

    # Foo.options
    Foo.name max_size:16

For more information on this, see the `Proto file options`_ section in the
reference manual.

.. _`Proto file options`: reference.html#proto-file-options

Streams
=======

Nanopb uses streams for accessing the data in encoded format.
The stream abstraction is very lightweight, and consists of a structure (*pb_ostream_t* or *pb_istream_t*) which contains a pointer to a callback function.

There are a few generic rules for callback functions:

#) Return false on IO errors. The encoding or decoding process will abort immediately.
#) Use state to store your own data, such as a file descriptor.
#) *bytes_written* and *bytes_left* are updated by pb_write and pb_read.
#) Your callback may be used with substreams. In this case *bytes_left*, *bytes_written* and *max_size* have smaller values than the original stream. Don't use these values to calculate pointers.
#) Always read or write the full requested length of data. For example, POSIX *recv()* needs the *MSG_WAITALL* parameter to accomplish this.

Output streams
--------------

::

    struct _pb_ostream_t
    {
        bool (*callback)(pb_ostream_t *stream, const uint8_t *buf, size_t count);
        void *state;
        size_t max_size;
        size_t bytes_written;
    };

The *callback* for an output stream may be NULL, in which case the stream simply counts the number of bytes written. In this case, *max_size* is ignored.

Otherwise, if *bytes_written* + bytes_to_be_written is larger than *max_size*, pb_write returns false before doing anything else. If you don't want to limit the size of the stream, pass SIZE_MAX.

**Example 1:**

This is the way to get the size of the message without storing it anywhere::

    Person myperson = ...;
    pb_ostream_t sizestream = {0};
    pb_encode(&sizestream, Person_fields, &myperson);
    printf("Encoded size is %zu\n", sizestream.bytes_written);

**Example 2:**

Writing to stdout::

    bool callback(pb_ostream_t *stream, const uint8_t *buf, size_t count)
    {
        FILE *file = (FILE*) stream->state;
        return fwrite(buf, 1, count, file) == count;
    }

    pb_ostream_t stdoutstream = {&callback, stdout, SIZE_MAX, 0};

Input streams
-------------
For input streams, there is one extra rule:

#) You don't need to know the length of the message in advance. After getting an EOF error when reading, set *bytes_left* to 0 and return false. *pb_decode* will detect this, and if the EOF was in a proper position, it will return true.
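
As a hedged illustration of this EOF rule, the sketch below reads from a memory buffer instead of a file. It is self-contained and does not include the nanopb headers: *demo_istream_t*, *demo_buffer_t* and *demo_read* are hypothetical names, and the stream structure is only a stand-in that mirrors the three fields of *pb_istream_t*.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in mirroring the fields of pb_istream_t. */
typedef struct demo_istream_s demo_istream_t;
struct demo_istream_s
{
    bool (*callback)(demo_istream_t *stream, uint8_t *buf, size_t count);
    void *state;
    size_t bytes_left;
};

/* Hypothetical state object: a read cursor over a fixed memory buffer. */
typedef struct
{
    const uint8_t *data;
    size_t size;
    size_t pos;
} demo_buffer_t;

/* Delivers exactly count bytes, or reports EOF by setting bytes_left
 * to 0 and returning false. It never delivers a partial read. */
bool demo_read(demo_istream_t *stream, uint8_t *buf, size_t count)
{
    demo_buffer_t *src = (demo_buffer_t*)stream->state;

    if (count > src->size - src->pos)
    {
        stream->bytes_left = 0;   /* signal EOF to the caller */
        return false;
    }

    if (buf != NULL)              /* assumption: NULL buf means "skip" */
        memcpy(buf, src->data + src->pos, count);

    src->pos += count;
    return true;
}
```

The callback does not decrement *bytes_left* on success, because that bookkeeping belongs to the library's read function. It only touches the field to report EOF, which is why such a stream can be created with *bytes_left* set to SIZE_MAX.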

Here is the structure::

    struct _pb_istream_t
    {
        bool (*callback)(pb_istream_t *stream, uint8_t *buf, size_t count);
        void *state;
        size_t bytes_left;
    };

The *callback* must always be a function pointer. *bytes_left* is an upper limit on the number of bytes that will be read. You can use SIZE_MAX if your callback handles EOF as described above.

**Example:**

This function binds an input stream to stdin:

::

    bool callback(pb_istream_t *stream, uint8_t *buf, size_t count)
    {
        FILE *file = (FILE*)stream->state;
        bool status;

        if (buf == NULL)
        {
            /* Skip mode: discard up to count bytes, stopping at EOF. */
            while (count > 0 && fgetc(file) != EOF)
                count--;
            return count == 0;
        }

        status = (fread(buf, 1, count, file) == count);

        if (feof(file))
            stream->bytes_left = 0;

        return status;
    }

    pb_istream_t stdinstream = {&callback, stdin, SIZE_MAX};

Data types
==========

Most Protocol Buffers datatypes have directly corresponding C datatypes: for example, int32 maps to int32_t, float to float and bool to bool. However, the variable-length datatypes are more complex:

1) Strings, bytes and repeated fields of any type map to callback functions by default.
2) If there is a special option *(nanopb).max_size* specified in the .proto file, string maps to a null-terminated char array and bytes map to a structure containing a byte array and a size field.
3) If there is a special option *(nanopb).max_count* specified on a repeated field, it maps to an array of whatever type is being repeated. Another field will be created for the actual number of entries stored.

=============================================================================== =======================
field in .proto                                                                 autogenerated in .h
=============================================================================== =======================
required string name = 1;                                                       pb_callback_t name;
required string name = 1 [(nanopb).max_size = 40];                              char name[40];
repeated string name = 1 [(nanopb).max_size = 40];                              pb_callback_t name;
repeated string name = 1 [(nanopb).max_size = 40, (nanopb).max_count = 5];      | size_t name_count;
                                                                                | char name[5][40];
required bytes data = 1 [(nanopb).max_size = 40];                               | typedef struct {
                                                                                |    size_t size;
                                                                                |    uint8_t bytes[40];
                                                                                | } Person_data_t;
                                                                                | Person_data_t data;
=============================================================================== =======================

The maximum lengths are checked at runtime. If a string, bytes or array field exceeds the allocated length, *pb_decode* will return false.

Note: for the *bytes* datatype, the field length checking may not be exact.
The compiler may add some padding to the *pb_bytes_t* structure, and the nanopb runtime doesn't know how much of the structure size is padding. Therefore it uses the whole length of the structure for storing data, which is not very smart but shouldn't cause problems. In practice, this means that if you specify *(nanopb).max_size=5* on a *bytes* field, you may be able to store 6 bytes there. For the *string* field type, the length limit is exact.

Field callbacks
===============
When a field has dynamic length, nanopb cannot statically allocate storage for it. Instead, it allows you to handle the field in whatever way you want, using a callback function.

The `pb_callback_t`_ structure contains a function pointer and a *void* pointer called *arg* you can use for passing data to the callback. If the function pointer is NULL, the field will be skipped.
A pointer to *arg* is passed to the function, so that the function can modify it and retrieve the value.

The actual behavior of the callback function is different in encoding and decoding modes. In encoding mode, the callback is called once and should write out everything, including field tags. In decoding mode, the callback is called repeatedly for every data item.

.. _`pb_callback_t`: reference.html#pb-callback-t

Encoding callbacks
------------------
::

    bool (*encode)(pb_ostream_t *stream, const pb_field_t *field, void * const *arg);

When encoding, the callback should write out complete fields, including the wire type and field number tag. It can write as many or as few fields as it likes. For example, if you want to write out an array as a *repeated* field, you should do it all in a single call.

Usually you can use `pb_encode_tag_for_field`_ to encode the wire type and tag number of the field. However, if you want to encode a repeated field as a packed array, you must call `pb_encode_tag`_ instead to specify a wire type of *PB_WT_STRING*.

If the callback is used in a submessage, it will be called multiple times during a single call to `pb_encode`_. In this case, it must produce the same amount of data every time. If the callback is directly in the main message, it is called only once.

.. _`pb_encode`: reference.html#pb-encode
.. _`pb_encode_tag_for_field`: reference.html#pb-encode-tag-for-field
.. _`pb_encode_tag`: reference.html#pb-encode-tag

This callback writes out a dynamically sized string::

    bool write_string(pb_ostream_t *stream, const pb_field_t *field, void * const *arg)
    {
        char *str = get_string_from_somewhere();
        if (!pb_encode_tag_for_field(stream, field))
            return false;

        return pb_encode_string(stream, (uint8_t*)str, strlen(str));
    }

Decoding callbacks
------------------
::

    bool (*decode)(pb_istream_t *stream, const pb_field_t *field, void **arg);

When decoding, the callback receives a length-limited substream that reads the contents of a single field. The field tag has already been read. For *string* and *bytes*, the length value has already been parsed, and is available at *stream->bytes_left*.

The callback will be called multiple times for repeated fields. For packed fields, you can either read multiple values until the stream ends, or leave it to `pb_decode`_ to call your function over and over until all values have been read.

.. _`pb_decode`: reference.html#pb-decode

This callback reads multiple integers and prints them::

    bool read_ints(pb_istream_t *stream, const pb_field_t *field, void **arg)
    {
        while (stream->bytes_left)
        {
            uint64_t value;
            if (!pb_decode_varint(stream, &value))
                return false;
            /* Cast so the argument matches the format on every platform. */
            printf("%llu\n", (unsigned long long)value);
        }
        return true;
    }

Field description array
=======================

To use the *pb_encode* and *pb_decode* functions, you need an array of pb_field_t constants describing the structure you wish to encode. This description is usually autogenerated from the .proto file.

For example, this submessage in the Person.proto file::

    message Person {
        message PhoneNumber {
            required string number = 1 [(nanopb).max_size = 40];
            optional PhoneType type = 2 [default = HOME];
        }
    }

generates this field description array for the structure *Person_PhoneNumber*::

    const pb_field_t Person_PhoneNumber_fields[3] = {
        PB_FIELD(  1, STRING  , REQUIRED, STATIC, Person_PhoneNumber, number, number, 0),
        PB_FIELD(  2, ENUM    , OPTIONAL, STATIC, Person_PhoneNumber, type, number, &Person_PhoneNumber_type_default),
        PB_LAST_FIELD
    };


Extension fields
================
Protocol Buffers supports a concept of `extension fields`_, which are
additional fields to a message, but defined outside the actual message.
The definition can even be in a completely separate .proto file.

The base message is declared as extensible by the keyword *extensions* in
the .proto file::

    message MyMessage {
        .. fields ..
        extensions 100 to 199;
    }

For each extensible message, *nanopb_generator.py* declares an additional
callback field called *extensions*. The field and the associated datatype
*pb_extension_t* form a linked list of handlers. When an unknown field is
encountered, the decoder calls each handler in turn until either one of them
handles the field, or the list is exhausted.

The actual extensions are declared using the *extend* keyword in the .proto,
and are in the global namespace::

    extend MyMessage {
        optional int32 myextension = 100;
    }

For each extension, *nanopb_generator.py* creates a constant of type
*pb_extension_type_t*. To link together the base message and the extension,
you have to:

1. Allocate storage for your field, matching the datatype in the .proto.
   For example, for an *int32* field, you need an *int32_t* variable to store
   the value.
2. Create a *pb_extension_t* constant, with pointers to your variable and
   to the generated *pb_extension_type_t*.
3. Set the *message.extensions* pointer to point to the *pb_extension_t*.

An example of this is available in *tests/test_encode_extensions.c* and
*tests/test_decode_extensions.c*.

.. _`extension fields`: https://developers.google.com/protocol-buffers/docs/proto#extensions


Return values and error handling
================================

Most functions in nanopb return bool: *true* means success, *false* means failure. There is also some support for error messages for debugging purposes: the error messages go in *stream->errmsg*.

The error messages help in narrowing down the underlying cause of the error. The most common error conditions are:

1) Running out of memory, i.e. stack overflow.
2) Invalid field descriptors (would usually mean a bug in the generator).
3) IO errors in your own stream callbacks.
4) Errors that happen in your callback functions.
5) Exceeding the max_size or bytes_left of a stream.
6) Exceeding the max_size of a string or array field.
7) Invalid protocol buffers binary message.
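
The convention above can be sketched in a few lines. The example is self-contained and does not use the nanopb headers: *demo_stream_t*, *demo_consume* and *demo_error* are hypothetical stand-ins, assuming only that a stream carries an *errmsg* pointer that stays NULL until a failure is recorded.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a stream that carries an error message. */
typedef struct
{
    size_t bytes_left;
    const char *errmsg;   /* NULL until a failure is recorded */
} demo_stream_t;

/* Consumes count bytes. On failure, records the reason in errmsg
 * and returns false, leaving bytes_left unchanged. */
bool demo_consume(demo_stream_t *stream, size_t count)
{
    if (count > stream->bytes_left)
    {
        if (stream->errmsg == NULL)        /* keep the first message */
            stream->errmsg = "end-of-stream";
        return false;
    }

    stream->bytes_left -= count;
    return true;
}

/* Returns the recorded message, or a placeholder if there is none. */
const char *demo_error(const demo_stream_t *stream)
{
    return stream->errmsg ? stream->errmsg : "(none)";
}
```

A caller would test the bool result first and consult the message only on failure, for example ``if (!demo_consume(&s, n)) fprintf(stderr, "%s\n", demo_error(&s));``. Keeping the first recorded message, rather than the last, is a design choice in this sketch so that the innermost cause is the one reported.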