<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % BOOK_ENTITIES SYSTEM "Wayland.ent">
%BOOK_ENTITIES;
]>
<chapter id="chap-Protocol">
  <title>Wayland Protocol and Model of Operation</title>
  <section id="sect-Protocol-Basic-Principles">
    <title>Basic Principles</title>
    <para>
      The Wayland protocol is an asynchronous, object-oriented protocol.  All
      requests are method invocations on some object.  The requests include
      an object ID that uniquely identifies an object on the server.  Each
      object implements an interface and the requests include an opcode that
      identifies which method in the interface to invoke.
    </para>
    <para>
      The protocol is message-based.  A message sent by a client to the server
      is called a request.  A message from the server to a client is called an
      event.  A message has a number of arguments, each of which has a certain
      type (see <xref linkend="sect-Protocol-Wire-Format"/> for a list of
      argument types).
    </para>
    <para>
      Additionally, the protocol can specify <type>enum</type>s which associate
      names with specific numeric enumeration values.  These are primarily
      descriptive in nature: at the wire format level enums are just integers.
      They also serve a secondary purpose: to enhance type safety or
      otherwise add context for use in language bindings or other such code.
      This latter usage is only supported so long as code written before these
      attributes were introduced still works afterwards; in other words, adding
      an enum should not break API, otherwise it puts backwards compatibility
      at risk.
    </para>
    <para>
      <type>enum</type>s can be defined as just a set of integers, or as
      bitfields.  This is specified via the <type>bitfield</type> boolean
      attribute in the <type>enum</type> definition.  If this attribute is true,
      the enum is intended to be accessed primarily using bitwise operations,
      for example when arbitrarily many choices of the enum can be ORed
      together; if it is false, or the attribute is omitted, then the enum
      arguments are just a sequence of numerical values.
    </para>
    <para>
      The <type>enum</type> attribute can be used on either <type>uint</type>
      or <type>int</type> arguments; however, if the <type>enum</type> is
      defined as a <type>bitfield</type>, it can only be used on
      <type>uint</type> arguments.
    </para>
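    <para>
      As a minimal sketch of what a bitfield enum means for calling code, the
      C fragment below treats a <type>uint</type> argument as a set of flags.
      The enum and the helper are hypothetical and exist only for this
      example; the values are merely modelled on seat capability bits.
    </para>
<programlisting language="c"><![CDATA[
```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bitfield enum: each entry occupies its own bit. */
enum seat_capability {
    SEAT_CAPABILITY_POINTER  = 1,
    SEAT_CAPABILITY_KEYBOARD = 2,
    SEAT_CAPABILITY_TOUCH    = 4
};

/* On the wire the argument is a plain uint, so entries are combined with
 * bitwise OR and tested with bitwise AND. */
static bool seat_has(uint32_t caps, enum seat_capability cap)
{
    return (caps & (uint32_t)cap) != 0;
}
```
]]></programlisting>
    <para>
      A non-bitfield enum would instead be compared for equality against a
      single entry, since its values are not meant to be combined.
    </para>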
    <para>
      The server sends events back to the client; each event is emitted from
      an object.  Events can be error conditions.  The event includes the
      object ID and the event opcode, from which the client can determine
      the type of event.  Events are generated either in response to requests
      (in which case the request and the event constitute a round trip) or
      spontaneously when the server state changes.
    </para>
    <para>
      <itemizedlist>
        <listitem>
          <para>
            State is broadcast on connect, and events are sent
            out when state changes. Clients must listen for
            these changes and cache the state.
            There is no need (or mechanism) to query server state.
          </para>
        </listitem>
        <listitem>
          <para>
            The server will broadcast the presence of a number of global objects,
            which in turn will broadcast their current state.
          </para>
        </listitem>
      </itemizedlist>
    </para>
  </section>
  <section id="sect-Protocol-Code-Generation">
    <title>Code Generation</title>
    <para>
      The interfaces, requests and events are defined in
      <filename>protocol/wayland.xml</filename>.
      This XML file is used to generate the function prototypes that can be
      used by clients and compositors.
    </para>
    <para>
      The protocol entry points are generated as inline functions which just
      wrap the <function>wl_proxy_*</function> functions.  The inline functions
      aren't part of the library ABI, and language bindings should generate
      their own stubs for the protocol entry points from the XML.
    </para>
  </section>
  <section id="sect-Protocol-Wire-Format">
    <title>Wire Format</title>
    <para>
      The protocol is sent over a UNIX domain stream socket, where the endpoint
      usually is named <systemitem class="service">wayland-0</systemitem>
      (although it can be changed via <emphasis>WAYLAND_DISPLAY</emphasis>
      in the environment).
    </para>
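    <para>
      The endpoint name selection can be sketched as follows.  This is an
      illustration only: the helper name is hypothetical, and the idea that
      the socket lives in a per-user runtime directory (typically taken from
      <emphasis>XDG_RUNTIME_DIR</emphasis>) is an assumption that this chapter
      does not state.
    </para>
<programlisting language="c"><![CDATA[
```c
#include <stddef.h>
#include <stdio.h>

/* Writes "<runtime_dir>/<display>" into buf and returns its length, or -1
 * if it does not fit.  A NULL display stands for an unset WAYLAND_DISPLAY.
 * runtime_dir must not be NULL. */
static int socket_path(char *buf, size_t len,
                       const char *runtime_dir, const char *display)
{
    if (display == NULL)
        display = "wayland-0";   /* the usual endpoint name */

    int n = snprintf(buf, len, "%s/%s", runtime_dir, display);
    return (n >= 0 && (size_t)n < len) ? n : -1;
}
```
]]></programlisting>
    <para>
      A caller would typically pass the results of
      <function>getenv</function> for both variables, after checking that the
      runtime directory is actually set.
    </para>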
    <para>
      Every message is structured as 32-bit words; values are represented in the
      host's byte-order.  The message header has 2 words in it:
      <itemizedlist>
        <listitem>
          <para>
            The first word is the sender's object ID (32-bit).
          </para>
        </listitem>
        <listitem>
          <para>
            The second word has two 16-bit parts.  The upper 16 bits are the
            message size in bytes, starting at the header (i.e. it has a
            minimum value of 8).  The lower 16 bits are the request/event
            opcode.
          </para>
        </listitem>
      </itemizedlist>
      The payload describes the request/event arguments.  Every argument is always
      aligned to 32 bits. Where padding is required, the value of padding bytes is
      undefined. There is no prefix that describes the type of an argument;
      it is inferred implicitly from the XML specification.
    </para>
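    <para>
      The header layout described above can be sketched in C as follows.  The
      struct and function names are hypothetical; the words are kept in host
      byte order, as the text specifies.
    </para>
<programlisting language="c"><![CDATA[
```c
#include <stdint.h>

/* Hypothetical in-memory view of the two header words. */
struct msg_header {
    uint32_t object_id;  /* first word */
    uint16_t size;       /* upper 16 bits of the second word, in bytes */
    uint16_t opcode;     /* lower 16 bits of the second word */
};

static void header_pack(uint32_t words[2], struct msg_header h)
{
    words[0] = h.object_id;
    words[1] = ((uint32_t)h.size << 16) | h.opcode;
}

static struct msg_header header_unpack(const uint32_t words[2])
{
    struct msg_header h;
    h.object_id = words[0];
    h.size = (uint16_t)(words[1] >> 16);
    h.opcode = (uint16_t)(words[1] & 0xffffu);
    return h;
}
```
]]></programlisting>
    <para>
      For instance, a 12-byte message with opcode 1 yields a second header
      word of 0x000c0001.
    </para>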
    <para>
      The representation of argument types is as follows:
      <variablelist>
        <varlistentry>
          <term>int</term>
          <term>uint</term>
          <listitem>
            <para>
              The value is the 32-bit value of the signed/unsigned
              int.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>fixed</term>
          <listitem>
            <para>
              Signed 24.8 fixed-point numbers. It is a signed type which
              offers a sign bit, 23 bits of integer precision and 8 bits of
              fractional precision. This is exposed as an opaque struct with
              conversion helpers to and from double and int on the C API side.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>string</term>
          <listitem>
            <para>
              Starts with an unsigned 32-bit length, followed by the
              string contents, including terminating null byte, then padding
              to a 32-bit boundary.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>object</term>
          <listitem>
            <para>
              32-bit object ID.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>new_id</term>
          <listitem>
            <para>
              The 32-bit object ID.  On requests, the client
              decides the ID.  The only events with <type>new_id</type> are
              advertisements of globals, and the server will use IDs below
              0x10000.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>array</term>
          <listitem>
            <para>
              Starts with 32-bit array size in bytes, followed by the array
              contents verbatim, and finally padding to a 32-bit boundary.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>fd</term>
          <listitem>
            <para>
              The file descriptor is not stored in the message buffer, but in
              the ancillary data of the UNIX domain socket message (msg_control).
            </para>
          </listitem>
        </varlistentry>
      </variablelist>
    </para>
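    <para>
      Two of the representations above involve a little arithmetic: the 24.8
      fixed-point encoding and the padding of strings and arrays.  The sketch
      below illustrates both.  All names are hypothetical stand-ins, not the
      conversion helpers of the C API, and the truncating conversion from
      double is a simplification chosen for this example.
    </para>
<programlisting language="c"><![CDATA[
```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 24.8 value: the low 8 bits hold the fraction,
 * so one unit equals 1/256. */
typedef int32_t fixed_t;

static fixed_t fixed_from_int(int i)       { return i * 256; }
static int     fixed_to_int(fixed_t f)     { return f / 256; }
static double  fixed_to_double(fixed_t f)  { return f / 256.0; }
static fixed_t fixed_from_double(double d) { return (fixed_t)(d * 256.0); }

/* Rounds a byte count up to the next 32-bit boundary. */
static uint32_t pad4(uint32_t n)
{
    return (n + 3u) & ~3u;
}

/* Bytes used by a string argument: the length word, then the contents
 * including the terminating null byte, then padding. */
static uint32_t string_wire_size(const char *s)
{
    return 4u + pad4((uint32_t)strlen(s) + 1u);
}

/* Bytes used by an array argument: the size word, the contents, padding. */
static uint32_t array_wire_size(uint32_t nbytes)
{
    return 4u + pad4(nbytes);
}
```
]]></programlisting>
    <para>
      For example, the value 1.5 is encoded as 384, and the string
      <literal>hi</literal> occupies 8 bytes: 4 for the length word, 3 for
      the contents with the null byte, and 1 byte of padding.
    </para>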
  </section>
  <xi:include href="ProtocolInterfaces.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
  <section id="sect-Protocol-Versioning">
    <title>Versioning</title>
    <para>
      Every interface is versioned and every protocol object implements a
      particular version of its interface.  For global objects, the maximum
      version supported by the server is advertised with the global and the
      actual version of the created protocol object is determined by the
      version argument passed to wl_registry.bind().  For objects that are
      not globals, their version is inferred from the object that created
      them.
    </para>
    <para>
      In order to keep things sane, this has a few implications for
      interface versions:
      <itemizedlist>
        <listitem>
          <para>
            The object creation hierarchy must be a tree.  Otherwise,
            inferring object versions from the parent object becomes much
            more difficult to track properly.
          </para>
        </listitem>
        <listitem>
          <para>
            When the version of an interface increases, so does the version
            of its parent (recursively until you get to a global interface).
          </para>
        </listitem>
        <listitem>
          <para>
            A global interface's version number acts like a counter for all
            of its child interfaces.  Whenever a child interface gets
            modified, the global parent's interface version number also
            increases (see above).  The child interface then takes on the
            same version number as the new version of its parent global
            interface.
          </para>
        </listitem>
      </itemizedlist>
    </para>
    <para>
      To illustrate the above, consider the wl_compositor interface.  It
      has two children, wl_surface and wl_region.  As of Wayland version
      1.2, wl_surface and wl_compositor are both at version 3.  If
      something is added to the wl_region interface, both wl_region and
      wl_compositor will get bumped to version 4.  If, afterwards,
      wl_surface is changed, both wl_compositor and wl_surface will be at
      version 5.  In this way the global interface version is used as a
      sort of "counter" for all of its child interfaces.  This makes it
      very simple to know the version of the child given the version of its
      parent.  The child is at the highest possible interface version that
      is less than or equal to its parent's version.
    </para>
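    <para>
      The inference rule in the last sentence can be sketched in C.  The
      function name is hypothetical, and the version histories used in the
      usage note are assumptions that extend the wl_compositor illustration
      above; they are not taken from the protocol definition.
    </para>
<programlisting language="c"><![CDATA[
```c
#include <stddef.h>
#include <stdint.h>

/* Given every version at which a child interface changed, returns the
 * highest one that does not exceed the parent's version, or 0 if the child
 * does not exist yet at that parent version. */
static uint32_t child_version(const uint32_t *versions, size_t count,
                              uint32_t parent_version)
{
    uint32_t best = 0;

    for (size_t i = 0; i < count; i++) {
        if (versions[i] <= parent_version && versions[i] > best)
            best = versions[i];
    }
    return best;
}
```
]]></programlisting>
    <para>
      Assuming wl_surface changed at versions 1, 2, 3 and 5 and wl_region at
      versions 1 and 4, a wl_compositor bound at version 4 creates wl_surface
      objects at version 3 and wl_region objects at version 4.
    </para>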
    <para>
      It is worth noting a particular exception to the above versioning
      scheme.  The wl_display (and, by extension, wl_registry) interface
      cannot change because it is the core protocol object and its version
      is never advertised nor is there a mechanism to request a different
      version.
    </para>
  </section>
  <section id="sect-Protocol-Connect-Time">
    <title>Connect Time</title>
    <para>
      There is no fixed connection setup information; instead, the server
      emits multiple events at connect time to indicate the presence and
      properties of global objects: outputs, compositor, input devices.
    </para>
  </section>
  <section id="sect-Protocol-Security-and-Authentication">
    <title>Security and Authentication</title>
    <para>
      <itemizedlist>
        <listitem>
          <para>
            This is mostly about access to the underlying buffers.  A new
            DRM authentication mechanism is needed (the grant-to ioctl
            idea); whether the command stream also needs to be checked is
            an open question.
          </para>
        </listitem>
        <listitem>
          <para>
            How a client gets the server socket depends on the compositor
            type: it could be a system-wide name, it could be passed as a
            file descriptor over the session D-Bus, or the client could be
            forked by the compositor with the file descriptor already open.
          </para>
        </listitem>
      </itemizedlist>
    </para>
  </section>
  <section id="sect-Protocol-Creating-Objects">
    <title>Creating Objects</title>
    <para>
      Each object has a unique ID.  The IDs are allocated by the entity
      creating the object (either client or server).  IDs allocated by the
      client are in the range [1, 0xfeffffff] while IDs allocated by the
      server are in the range [0xff000000, 0xffffffff].  The 0 ID is
      reserved to represent a null or nonexistent object.
    </para>
    <para>
      For efficiency purposes, the IDs are densely packed in the sense that
      the ID N will not be used until N-1 has been used.  Any ID allocation
      algorithm that does not maintain this property is incompatible with
      the implementation in libwayland.
    </para>
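    <para>
      A minimal sketch of the two ID ranges and of dense allocation follows.
      The type and function names are hypothetical, and reuse of IDs after an
      object is destroyed is deliberately left out.
    </para>
<programlisting language="c"><![CDATA[
```c
#include <stdint.h>

enum id_owner { ID_NULL, ID_CLIENT, ID_SERVER };

/* First ID of the server-allocated range given in the text. */
#define SERVER_ID_START 0xff000000u

/* Tells which side allocated an ID, purely from its value. */
static enum id_owner id_owner(uint32_t id)
{
    if (id == 0)
        return ID_NULL;
    return id >= SERVER_ID_START ? ID_SERVER : ID_CLIENT;
}

/* Dense allocator: ID N is only handed out after N-1 has been. */
struct id_allocator {
    uint32_t next;  /* next unused ID, 0 once the range has wrapped */
    uint32_t last;  /* highest ID of the range */
};

/* Returns the next ID, or 0 (the null ID) when the range is exhausted. */
static uint32_t id_alloc(struct id_allocator *a)
{
    if (a->next == 0 || a->next > a->last)
        return 0;
    return a->next++;
}
```
]]></programlisting>
    <para>
      A client-side allocator in this sketch would be initialised with
      <literal>{ 1, 0xfeffffff }</literal> and a server-side one with
      <literal>{ 0xff000000, 0xffffffff }</literal>.
    </para>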
  </section>
  <section id="sect-Protocol-Compositor">
    <title>Compositor</title>
    <para>
      The compositor is a global object, advertised at connect time.
    </para>
    <para>
      See <xref linkend="protocol-spec-wl_compositor"/> for the
      protocol description.
    </para>
  </section>
  <section id="sect-Protocol-Surface">
    <title>Surfaces</title>
    <para>
      A surface manages a rectangular grid of pixels that clients create
      for displaying their content to the screen.  Clients don't know
      the global position of their surfaces, and cannot access other
      clients' surfaces.
    </para>
    <para>
      Once the client has finished writing pixels, it 'commits' the
      buffer; this permits the compositor to access the buffer and read
      the pixels.  When the compositor is finished, it releases the
      buffer back to the client.
    </para>
    <para>
      See <xref linkend="protocol-spec-wl_surface"/> for the protocol
      description.
    </para>
  </section>
  <section id="sect-Protocol-Input">
    <title>Input</title>
    <para>
      A seat represents a group of input devices including mice,
      keyboards and touchscreens. It has a keyboard and pointer
      focus. Seats are global objects. Pointer events are delivered
      in surface-local coordinates.
    </para>
    <para>
      The compositor maintains an implicit grab when a button is
      pressed, to ensure that the corresponding button release
      event gets delivered to the same surface. But there is no way
      for clients to take an explicit grab. Instead, surfaces can
      be mapped as 'popup', which combines transient window semantics
      with a pointer grab.
    </para>
    <para>
      To avoid race conditions, input events that are likely to
      trigger further requests (such as button presses, key events,
      pointer motions) carry serial numbers, and requests such as
      wl_shell_surface.set_popup require that the serial number of the
      triggering event is specified. The server maintains a
      monotonically increasing counter for these serial numbers.
    </para>
    <para>
      Input events also carry timestamps with millisecond granularity.
      Their base is undefined, so they can't be compared against
      system time (as obtained with clock_gettime or gettimeofday).
      They can be compared with each other though, and for instance
      be used to identify sequences of button presses as double
      or triple clicks.
    </para>
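    <para>
      As an illustration of comparing event timestamps with each other, the
      sketch below decides whether two button presses form a double click.
      The function name and the idea of a caller-supplied maximum gap are
      assumptions made for this example, as is the treatment of timestamps as
      unsigned 32-bit millisecond counts.
    </para>
<programlisting language="c"><![CDATA[
```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the second press follows the first within max_gap_ms.
 * Unsigned subtraction keeps the result correct even if the 32-bit
 * millisecond counter wraps around between the two events. */
static bool is_double_click(uint32_t first_ms, uint32_t second_ms,
                            uint32_t max_gap_ms)
{
    return (uint32_t)(second_ms - first_ms) <= max_gap_ms;
}
```
]]></programlisting>
    <para>
      Only the difference between the two values is used, which is all that
      an undefined time base allows.
    </para>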
    <para>
      See <xref linkend="protocol-spec-wl_seat"/> for the
      protocol description.
    </para>
    <para>
      Talk about:

      <itemizedlist>
        <listitem>
          <para>
            keyboard map, change events
          </para>
        </listitem>
        <listitem>
          <para>
            xkb on Wayland
          </para>
        </listitem>
        <listitem>
          <para>
            multi pointer Wayland
          </para>
        </listitem>
      </itemizedlist>
    </para>
    <para>
      A surface can change the pointer image when the surface is the pointer
      focus of the input device.  Wayland doesn't automatically change the
      pointer image when a pointer enters a surface, but expects the
      application to set the cursor it wants in response to the pointer
      focus and motion events.  The rationale is that a client has to manage
      changing pointer images for UI elements within the surface in response
      to motion events anyway, so we'll make that the only mechanism for
      setting or changing the pointer image.  If the server receives a request
      to set the pointer image after the surface loses pointer focus, the
      request is ignored.  To the client this will look like it successfully
      set the pointer image.
    </para>
    <para>
      The compositor will revert the pointer image back to a default image
      when no surface has the pointer focus for that device.  Clients can
      revert the pointer image back to the default image by setting a NULL
      image.
    </para>
    <para>
      What if the pointer moves from one window which has set a special
      pointer image to a surface that doesn't set an image in response to
      the motion event?  The new surface will be stuck with the special
      pointer image.  We can't just revert the pointer image on leaving a
      surface, since if we immediately enter a surface that sets a different
      image, the image will flicker.  Broken app, I suppose.
    </para>
  </section>
  <section id="sect-Protocol-Output">
    <title>Output</title>
    <para>
      An output is a global object, advertised at connect time or as it
      comes and goes.
    </para>
    <para>
      See <xref linkend="protocol-spec-wl_output"/> for the protocol
      description.
    </para>
    <itemizedlist>
      <listitem>
        <para>
          laid out in a big (compositor) coordinate system
        </para>
      </listitem>
      <listitem>
        <para>
          basically xrandr over Wayland
        </para>
      </listitem>
      <listitem>
        <para>
          geometry needs position in compositor coordinate system
        </para>
      </listitem>
      <listitem>
        <para>
          events to advertise available modes, requests to move and change
          modes
        </para>
      </listitem>
    </itemizedlist>
  </section>
  <section id="sect-Protocol-data-sharing">
    <title>Data sharing between clients</title>
    <para>
      The Wayland protocol provides clients a mechanism for sharing
      data that allows the implementation of copy-paste and
      drag-and-drop. The client providing the data creates a
      <function>wl_data_source</function> object and the clients
      obtaining the data will see it as a <function>wl_data_offer</function>
      object. This interface allows the clients to agree on a mutually
      supported mime type and transfer the data via a file descriptor
      that is passed through the protocol.
    </para>
    <para>
      The next section explains the negotiation between data source and
      data offer objects. <xref linkend="sect-Protocol-data-sharing-devices"/>
      explains how these objects are created and passed to different
      clients using the <function>wl_data_device</function> interface
      that implements copy-paste and drag-and-drop support.
    </para>
    <para>
      See <xref linkend="protocol-spec-wl_data_offer"/>,
      <xref linkend="protocol-spec-wl_data_source"/>,
      <xref linkend="protocol-spec-wl_data_device"/> and
      <xref linkend="protocol-spec-wl_data_device_manager"/> for
      protocol descriptions.
    </para>
    <para>
      MIME is defined in RFCs 2045-2049. A
      <ulink url="ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/">
      registry of MIME types</ulink> is maintained by the Internet Assigned
      Numbers Authority (IANA).
    </para>
    <section>
      <title>Data negotiation</title>
      <para>
        A client providing data to other clients will create a <function>wl_data_source</function>
        object and advertise the mime types for the formats it supports for
        that data through the <function>wl_data_source.offer</function>
        request. On the receiving end, the data offer object will generate one
        <function>wl_data_offer.offer</function> event for each supported mime
        type.
      </para>
      <para>
        The actual data transfer happens when the receiving client sends a
        <function>wl_data_offer.receive</function> request. This request takes
        a mime type and a file descriptor as arguments. This request will generate a
        <function>wl_data_source.send</function> event on the sending client
        with the same arguments, and the latter client is expected to write its
        data to the given file descriptor using the chosen mime type.
      </para>
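      <para>
        The file descriptor hand-off can be illustrated without any protocol
        code.  The sketch below plays both roles in one process using a plain
        POSIX pipe: the receiving side keeps the read end, and the write end
        is the descriptor that would travel with the receive request and the
        send event.  The function name is hypothetical, and because both ends
        run in the same thread the sketch is only suitable for short payloads
        that fit in the pipe buffer.
      </para>
<programlisting language="c"><![CDATA[
```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Copies text through a pipe and returns the number of bytes received,
 * or -1 on error. */
static ssize_t transfer_text(const char *text, char *out, size_t cap)
{
    int fds[2];
    size_t len = strlen(text);
    size_t total = 0;

    if (pipe(fds) != 0)
        return -1;

    /* Sending side: write the data to the descriptor it was given, then
     * close it so that the reader sees end-of-file. */
    ssize_t written = write(fds[1], text, len);
    close(fds[1]);
    if (written != (ssize_t)len) {
        close(fds[0]);
        return -1;
    }

    /* Receiving side: read from the other end until end-of-file. */
    while (total < cap) {
        ssize_t n = read(fds[0], out + total, cap - total);
        if (n < 0) {
            close(fds[0]);
            return -1;
        }
        if (n == 0)
            break;
        total += (size_t)n;
    }
    close(fds[0]);
    return (ssize_t)total;
}
```
]]></programlisting>
      <para>
        In a real exchange the two sides are separate clients, so the
        receiver would close its copy of the write end after sending the
        request and read while the source writes.
      </para>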
    </section>
    <section id="sect-Protocol-data-sharing-devices">
      <title>Data devices</title>
      <para>
        Data devices glue data sources and offers together. A data device is
        associated with a <function>wl_seat</function> and is obtained by the clients using the
        <function>wl_data_device_manager</function> factory object, which is also responsible for
        creating data sources.
      </para>
      <para>
        Clients are informed of new data offers through the
        <function>wl_data_device.data_offer</function> event. After this
        event is generated the data offer will advertise the available mime
        types. New data offers are introduced prior to their use for
        copy-paste or drag-and-drop.
      </para>
      <section>
        <title>Selection</title>
        <para>
          Each data device has a selection data source. Clients create a data
          source object using the device manager and may set it as the
          current selection for a given data device. Whenever the current
          selection changes, the client with keyboard focus receives a
          <function>wl_data_device.selection</function> event. This event is
          also generated on a client immediately before it receives keyboard
          focus.
        </para>
        <para>
          The data offer is introduced with the
          <function>wl_data_device.data_offer</function> event before the
          selection event.
        </para>
      </section>
      <section>
        <title>Drag and Drop</title>
        <para>
          A drag-and-drop operation is started using the
          <function>wl_data_device.start_drag</function> request. This
          request causes a pointer grab that will generate enter, motion and
          leave events on the data device. A data source is supplied as an
          argument to start_drag, and data offers associated with it are
          supplied to client surfaces under the pointer in the
          <function>wl_data_device.enter</function> event. The data offer
          is introduced to the client prior to the enter event with the
          <function>wl_data_device.data_offer</function> event.
        </para>
        <para>
          The receiving client is expected to provide feedback to the data
          sending client by issuing the <function>wl_data_offer.accept</function>
          request with a mime type it accepts. If none of the advertised mime
          types is supported by the receiving client, it should supply NULL to
          the accept request. The accept request causes the sending client to
          receive a <function>wl_data_source.target</function> event with the
          chosen mime type.
        </para>
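        <para>
          How a receiving client might pick the argument for the accept
          request can be sketched as follows.  The function is hypothetical
          and the preference order (the receiver's own list, first match
          wins) is an assumption of this example; the protocol only requires
          a mime type or NULL.
        </para>
<programlisting language="c"><![CDATA[
```c
#include <stddef.h>
#include <string.h>

/* Returns the first entry of supported (in the receiver's preference
 * order) that the source also offered, or NULL when there is no common
 * mime type. */
static const char *choose_mime(const char *const *offered, size_t n_offered,
                               const char *const *supported,
                               size_t n_supported)
{
    for (size_t i = 0; i < n_supported; i++) {
        for (size_t j = 0; j < n_offered; j++) {
            if (strcmp(supported[i], offered[j]) == 0)
                return offered[j];
        }
    }
    return NULL;
}
```
]]></programlisting>
        <para>
          The returned pointer, NULL included, is what would be passed on to
          the accept request.
        </para>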
        <para>
          When the drag ends, the receiving client receives a
          <function>wl_data_device.drop</function> event, at which point it
          is expected to transfer the data using the
          <function>wl_data_offer.receive</function> request.
        </para>
      </section>
    </section>
  </section>
</chapter>