.. SPDX-License-Identifier: GPL-2.0+

.. |ssh_ptl| replace:: :c:type:`struct ssh_ptl <ssh_ptl>`
.. |ssh_ptl_submit| replace:: :c:func:`ssh_ptl_submit`
.. |ssh_ptl_cancel| replace:: :c:func:`ssh_ptl_cancel`
.. |ssh_ptl_shutdown| replace:: :c:func:`ssh_ptl_shutdown`
.. |ssh_ptl_rx_rcvbuf| replace:: :c:func:`ssh_ptl_rx_rcvbuf`
.. |ssh_rtl| replace:: :c:type:`struct ssh_rtl <ssh_rtl>`
.. |ssh_rtl_submit| replace:: :c:func:`ssh_rtl_submit`
.. |ssh_rtl_cancel| replace:: :c:func:`ssh_rtl_cancel`
.. |ssh_rtl_shutdown| replace:: :c:func:`ssh_rtl_shutdown`
.. |ssh_packet| replace:: :c:type:`struct ssh_packet <ssh_packet>`
.. |ssh_packet_get| replace:: :c:func:`ssh_packet_get`
.. |ssh_packet_put| replace:: :c:func:`ssh_packet_put`
.. |ssh_packet_ops| replace:: :c:type:`struct ssh_packet_ops <ssh_packet_ops>`
.. |ssh_packet_base_priority| replace:: :c:type:`enum ssh_packet_base_priority <ssh_packet_base_priority>`
.. |ssh_packet_flags| replace:: :c:type:`enum ssh_packet_flags <ssh_packet_flags>`
.. |SSH_PACKET_PRIORITY| replace:: :c:func:`SSH_PACKET_PRIORITY`
.. |ssh_frame| replace:: :c:type:`struct ssh_frame <ssh_frame>`
.. |ssh_command| replace:: :c:type:`struct ssh_command <ssh_command>`
.. |ssh_request| replace:: :c:type:`struct ssh_request <ssh_request>`
.. |ssh_request_get| replace:: :c:func:`ssh_request_get`
.. |ssh_request_put| replace:: :c:func:`ssh_request_put`
.. |ssh_request_ops| replace:: :c:type:`struct ssh_request_ops <ssh_request_ops>`
.. |ssh_request_init| replace:: :c:func:`ssh_request_init`
.. |ssh_request_flags| replace:: :c:type:`enum ssh_request_flags <ssh_request_flags>`
.. |ssam_controller| replace:: :c:type:`struct ssam_controller <ssam_controller>`
.. |ssam_device| replace:: :c:type:`struct ssam_device <ssam_device>`
.. |ssam_device_driver| replace:: :c:type:`struct ssam_device_driver <ssam_device_driver>`
.. |ssam_client_bind| replace:: :c:func:`ssam_client_bind`
.. |ssam_client_link| replace:: :c:func:`ssam_client_link`
.. |ssam_request_sync| replace:: :c:type:`struct ssam_request_sync <ssam_request_sync>`
.. |ssam_event_registry| replace:: :c:type:`struct ssam_event_registry <ssam_event_registry>`
.. |ssam_event_id| replace:: :c:type:`struct ssam_event_id <ssam_event_id>`
.. |ssam_nf| replace:: :c:type:`struct ssam_nf <ssam_nf>`
.. |ssam_nf_refcount_inc| replace:: :c:func:`ssam_nf_refcount_inc`
.. |ssam_nf_refcount_dec| replace:: :c:func:`ssam_nf_refcount_dec`
.. |ssam_notifier_register| replace:: :c:func:`ssam_notifier_register`
.. |ssam_notifier_unregister| replace:: :c:func:`ssam_notifier_unregister`
.. |ssam_cplt| replace:: :c:type:`struct ssam_cplt <ssam_cplt>`
.. |ssam_event_queue| replace:: :c:type:`struct ssam_event_queue <ssam_event_queue>`
.. |ssam_request_sync_submit| replace:: :c:func:`ssam_request_sync_submit`

=====================
Core Driver Internals
=====================

Architectural overview of the Surface System Aggregator Module (SSAM) core
and Surface Serial Hub (SSH) driver. For the API documentation, refer to:

.. toctree::
   :maxdepth: 2

   internal-api


Overview
========

The SSAM core implementation is structured in layers, somewhat following the
SSH protocol structure:

Lower-level packet transport is implemented in the *packet transport layer
(PTL)*, directly building on top of the serial device (serdev)
infrastructure of the kernel. As the name indicates, this layer deals with
the packet transport logic and handles things like packet validation, packet
acknowledgment (ACKing), packet (retransmission) timeouts, and relaying
packet payloads to higher-level layers.

Above this sits the *request transport layer (RTL)*. This layer is centered
around command-type packet payloads, i.e. requests (sent from host to EC),
responses of the EC to those requests, and events (sent from EC to host).
Specifically, it distinguishes events from request responses, matches
responses to their corresponding requests, and implements request timeouts.

The *controller* layer builds on top of this and essentially decides how
request responses and, especially, events are dealt with. It provides an
event notifier system, handles event activation/deactivation, provides a
workqueue for event and asynchronous request completion, and also manages
the message counters required for building command messages (``SEQ``,
``RQID``). This layer basically provides a fundamental interface to the SAM
EC for use in other kernel drivers.

While the controller layer already provides an interface for other kernel
drivers, the client *bus* extends this interface to provide support for
native SSAM devices, i.e. devices that are not defined in ACPI and not
implemented as platform devices. It does so via |ssam_device| and
|ssam_device_driver|, which simplify management of client devices and client
drivers.

Refer to Documentation/driver-api/surface_aggregator/client.rst for
documentation regarding the client device/driver API and interface options
for other kernel drivers. It is recommended to familiarize oneself with
that chapter and with Documentation/driver-api/surface_aggregator/ssh.rst
before continuing with the architectural overview below.


Packet Transport Layer
======================

The packet transport layer is represented via |ssh_ptl| and is structured
around the following key concepts:

Packets
-------

Packets are the fundamental transmission unit of the SSH protocol. They are
managed by the packet transport layer, which is essentially the lowest layer
of the driver and is built upon by other components of the SSAM core.
Packets to be transmitted by the SSAM core are represented via |ssh_packet|
(in contrast, packets received by the core do not have any specific
structure and are managed entirely via the raw |ssh_frame|).

This structure contains the required fields to manage the packet inside the
transport layer, as well as a reference to the buffer containing the data to
be transmitted (i.e. the message wrapped in |ssh_frame|). Most notably, it
contains an internal reference count, which is used for managing its
lifetime (accessible via |ssh_packet_get| and |ssh_packet_put|). When this
counter reaches zero, the ``release()`` callback provided to the packet via
its |ssh_packet_ops| reference is executed, which may then deallocate the
packet or its enclosing structure (e.g. |ssh_request|).

In addition to the ``release()`` callback, the |ssh_packet_ops| reference
also provides a ``complete()`` callback, which is run once the packet has
been completed and provides the status of this completion, i.e. zero on
success or a negative errno value in case of an error. Once the packet has
been submitted to the packet transport layer, the ``complete()`` callback is
always guaranteed to be executed before the ``release()`` callback, i.e. the
packet will always be completed, either successfully, with an error, or due
to cancellation, before it will be released.

The state of a packet is managed via its ``state`` flags
(|ssh_packet_flags|), which also contain the packet type. In particular,
the following bits are noteworthy:

* ``SSH_PACKET_SF_LOCKED_BIT``: This bit is set when completion, either
  through error or success, is imminent. It indicates that no further
  references to the packet should be taken and any existing references
  should be dropped as soon as possible. The process setting this bit is
  responsible for removing any references to this packet from the packet
  queue and pending set.

* ``SSH_PACKET_SF_COMPLETED_BIT``: This bit is set by the process running the
  ``complete()`` callback and is used to ensure that this callback only runs
  once.

* ``SSH_PACKET_SF_QUEUED_BIT``: This bit is set when the packet is queued on
  the packet queue and cleared when it is dequeued.

* ``SSH_PACKET_SF_PENDING_BIT``: This bit is set when the packet is added to
  the pending set and cleared when it is removed from it.
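
The lifetime rules above can be sketched in plain user-space C. This is a
minimal, single-threaded illustration and not the driver code: all ``demo_*``
names are hypothetical, the reference count is a plain integer, and the
completed flag is a ``bool`` where the driver uses an atomic bit operation on
``SSH_PACKET_SF_COMPLETED_BIT``.

.. code-block:: c

   #include <stdbool.h>

   struct demo_packet;

   struct demo_packet_ops {
       void (*release)(struct demo_packet *p);
       void (*complete)(struct demo_packet *p, int status);
   };

   struct demo_packet {
       unsigned int refcount;   /* plain counter, assumed for illustration */
       bool completed;          /* models SSH_PACKET_SF_COMPLETED_BIT */
       const struct demo_packet_ops *ops;
   };

   /* Records the order of callbacks: 'C' for complete, 'R' for release. */
   static char demo_log[8];
   static int demo_log_len;

   static void demo_complete(struct demo_packet *p, int status)
   {
       (void)p;
       (void)status;
       demo_log[demo_log_len++] = 'C';
   }

   static void demo_release(struct demo_packet *p)
   {
       (void)p;
       demo_log[demo_log_len++] = 'R';
   }

   static const struct demo_packet_ops demo_ops = {
       .release = demo_release,
       .complete = demo_complete,
   };

   static void demo_packet_get(struct demo_packet *p)
   {
       p->refcount++;
   }

   /* Drop one reference and run release() once the last one is gone. */
   static void demo_packet_put(struct demo_packet *p)
   {
       if (--p->refcount == 0)
           p->ops->release(p);
   }

   /* Run complete() at most once; later attempts are ignored. */
   static bool demo_packet_complete(struct demo_packet *p, int status)
   {
       if (p->completed)
           return false;

       p->completed = true;
       p->ops->complete(p, status);
       return true;
   }

As long as the completing path holds its own reference while it calls
``demo_packet_complete()``, the release callback cannot run first, which is
the ordering guarantee described above.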

Packet Queue
------------

The packet queue is the first of the two fundamental collections in the
packet transport layer. It is a priority queue, with priority of the
respective packets based on the packet type (major) and number of tries
(minor). See |SSH_PACKET_PRIORITY| for more details on the priority value.

All packets to be transmitted by the transport layer must be submitted to
this queue via |ssh_ptl_submit|. Note that this includes control packets
sent by the transport layer itself. Internally, data packets can be
re-submitted to this queue due to timeouts or NAK packets sent by the EC.
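
One way to combine a major and a minor component into a single comparable
value is to place the base priority in the upper bits and the number of tries
in the lower bits. The sketch below assumes a 4-bit field for the tries and
illustrative base values in which control packets outrank data packets; all
``demo_*`` names are hypothetical, and |SSH_PACKET_PRIORITY| remains the
authoritative definition.

.. code-block:: c

   #include <stdint.h>

   /* Illustrative base priorities (assumed): control packets outrank data. */
   enum demo_base_priority {
       DEMO_PRIORITY_DATA = 0,
       DEMO_PRIORITY_NAK  = 1,
       DEMO_PRIORITY_ACK  = 2,
   };

   /* Base priority in the high nibble (major), tries in the low one (minor). */
   static uint8_t demo_priority(enum demo_base_priority base, unsigned int tries)
   {
       return (uint8_t)(((unsigned int)base << 4) | (tries & 0x0fu));
   }

With this encoding, a data packet on a later try sorts ahead of a fresh data
packet, while any packet with a higher base priority sorts ahead of all data
packets regardless of their number of tries.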

Pending Set
-----------

The pending set is the second of the two fundamental collections in the
packet transport layer. It stores references to packets that have already
been transmitted, but are still waiting for acknowledgment (e.g. the
corresponding ACK packet) by the EC.

Note that a packet may both be pending and queued if it has been
re-submitted due to a packet acknowledgment timeout or NAK. On such a
re-submission, packets are not removed from the pending set.

Transmitter Thread
------------------

The transmitter thread is responsible for most of the actual work regarding
packet transmission. In each iteration, it (waits for and) checks if the
next packet on the queue (if any) can be transmitted and, if so, removes it
from the queue and increments its counter for the number of transmission
attempts, i.e. tries. If the packet is sequenced, i.e. requires an ACK by
the EC, the packet is added to the pending set. Next, the packet's data is
submitted to the serdev subsystem. In case of an error or timeout during
this submission, the packet is completed by the transmitter thread with the
status value of the callback set accordingly. In case the packet is
unsequenced, i.e. does not require an ACK by the EC, the packet is completed
with success on the transmitter thread.

Transmission of sequenced packets is limited by the number of concurrently
pending packets, i.e. a limit on how many packets may be waiting for an ACK
from the EC in parallel. This limit is currently set to one (see
Documentation/driver-api/surface_aggregator/ssh.rst for the reasoning behind
this). Control packets (i.e. ACK and NAK) can always be transmitted.
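
The gating rule described above can be condensed into a small predicate. This
is a sketch under simplifying assumptions: the names are hypothetical, only
the distinction between sequenced packets and everything else is modeled, and
control packets are treated as unsequenced.

.. code-block:: c

   #include <stdbool.h>

   #define DEMO_MAX_PENDING 1u  /* limit of concurrently pending packets */

   /*
    * Sequenced packets must wait for a free slot in the pending set. Packets
    * that do not need an ACK are never held back by that limit.
    */
   static bool demo_can_transmit(bool sequenced, unsigned int num_pending)
   {
       if (!sequenced)
           return true;

       return num_pending < DEMO_MAX_PENDING;
   }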

Receiver Thread
---------------

Any data received from the EC is put into a FIFO buffer for further
processing. This processing happens on the receiver thread. The receiver
thread parses and validates the received message into its |ssh_frame| and
corresponding payload. It prepares and submits the necessary ACK (and, on
validation error or invalid data, NAK) packets for the received messages.

This thread also handles further processing, such as matching ACK messages
to the corresponding pending packet (via sequence ID) and completing it, as
well as initiating re-submission of all currently pending packets on
receipt of a NAK message (re-submission in case of a NAK is similar to
re-submission due to timeout, see below for more details on that). Note that
the successful completion of a sequenced packet will always run on the
receiver thread (whereas any failure-indicating completion will run on the
process where the failure occurred).

Any payload data is forwarded via a callback to the next upper layer, i.e.
the request transport layer.

Timeout Reaper
--------------

The packet acknowledgment timeout is a per-packet timeout for sequenced
packets, started when the respective packet begins (re-)transmission (i.e.
this timeout is armed once per transmission attempt on the transmitter
thread). It is used to trigger re-submission or, when the number of tries
has been exceeded, cancellation of the packet in question.

This timeout is handled via a dedicated reaper task, which is essentially a
work item (re-)scheduled to run when the next packet is set to time out. The
work item then checks the set of pending packets for any packets that have
exceeded the timeout and, if there are any remaining packets, re-schedules
itself to the next appropriate point in time.

If a timeout has been detected by the reaper, the packet will either be
re-submitted if it still has some remaining tries left, or completed with
``-ETIMEDOUT`` as status if not. Note that re-submission, both in this case
and when triggered by receipt of a NAK, means that the packet is added to
the queue with a now incremented number of tries, yielding a higher
priority. The timeout for the packet will be disabled until the next
transmission attempt and the packet remains on the pending set.

Note that due to transmission and packet acknowledgment timeouts, the packet
transport layer is always guaranteed to make progress, if only through
timing out packets, and will never fully block.
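
The per-packet decision taken by the reaper can be sketched as a pure
function. The try limit and the timeout value below are placeholders chosen
for illustration, not the driver's constants, and all ``demo_*`` names are
hypothetical.

.. code-block:: c

   #include <errno.h>
   #include <stdint.h>

   #define DEMO_MAX_TRIES      3u     /* assumed limit on attempts */
   #define DEMO_ACK_TIMEOUT_MS 1000u  /* assumed ACK timeout */

   enum demo_action { DEMO_KEEP_WAITING, DEMO_RESUBMIT, DEMO_FAIL };

   struct demo_pending {
       uint64_t sent_at_ms;   /* time of the last transmission attempt */
       unsigned int tries;    /* transmission attempts so far */
   };

   /* Decide what to do with one pending packet at time now_ms. */
   static enum demo_action demo_reap(const struct demo_pending *p,
                                     uint64_t now_ms, int *status)
   {
       *status = 0;

       if (now_ms - p->sent_at_ms < DEMO_ACK_TIMEOUT_MS)
           return DEMO_KEEP_WAITING;

       if (p->tries < DEMO_MAX_TRIES)
           return DEMO_RESUBMIT;  /* re-queue; stays on the pending set */

       *status = -ETIMEDOUT;      /* complete the packet with an error */
       return DEMO_FAIL;
   }

Because every pending packet eventually reaches either ``DEMO_RESUBMIT`` or
``DEMO_FAIL``, no packet can remain on the pending set indefinitely, which
mirrors the progress guarantee stated above.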

Concurrency and Locking
-----------------------

There are two main locks in the packet transport layer: One guarding access
to the packet queue and one guarding access to the pending set. These
collections may only be accessed and modified under the respective lock. If
access to both collections is needed, the pending lock must be acquired
before the queue lock to avoid deadlocks.

In addition to guarding the collections, after initial packet submission
certain packet fields may only be accessed under one of the locks.
Specifically, the packet priority must only be accessed while holding the
queue lock and the packet timestamp must only be accessed while holding the
pending lock.

Other parts of the packet transport layer are guarded independently. State
flags are managed by atomic bit operations and, if necessary, memory
barriers. Modifications to the timeout reaper work item and expiration date
are guarded by their own lock.

The reference of the packet to the packet transport layer (``ptl``) is
somewhat special. It is either set when the upper layer request is submitted
or, if there is none, when the packet is first submitted. After it is set,
it will not change its value. Functions that may run concurrently with
submission, i.e. cancellation, cannot rely on the ``ptl`` reference being
set. Access to it in these functions is guarded by ``READ_ONCE()``, whereas
setting ``ptl`` is equally guarded with ``WRITE_ONCE()`` for symmetry.

Some packet fields may be read outside of the respective locks guarding
them, specifically priority and state for tracing. In those cases, proper
access is ensured by employing ``WRITE_ONCE()`` and ``READ_ONCE()``. Such
read-only access is only allowed when stale values are not critical.

With respect to the interface for higher layers, packet submission
(|ssh_ptl_submit|), packet cancellation (|ssh_ptl_cancel|), data reception
(|ssh_ptl_rx_rcvbuf|), and layer shutdown (|ssh_ptl_shutdown|) may always be
executed concurrently with respect to each other. Note that packet
submission may not run concurrently with itself for the same packet.
Equally, shutdown and data reception may also not run concurrently with
themselves (but may run concurrently with each other).


Request Transport Layer
=======================

The request transport layer is represented via |ssh_rtl| and builds on top
of the packet transport layer. It deals with requests, i.e. SSH packets sent
by the host containing a |ssh_command| as frame payload. This layer
separates responses to requests from events, which are also sent by the EC
via a |ssh_command| payload. While responses are handled in this layer,
events are relayed to the next upper layer, i.e. the controller layer, via
the corresponding callback. The request transport layer is structured around
the following key concepts:

Request
-------

Requests are packets with a command-type payload, sent from host to EC to
query data from or trigger an action on it (or both simultaneously). They
are represented by |ssh_request|, wrapping the underlying |ssh_packet|
storing its message data (i.e. SSH frame with command payload). Note that
all top-level representations, e.g. |ssam_request_sync|, are built upon this
struct.

As |ssh_request| extends |ssh_packet|, its lifetime is also managed by the
reference counter inside the packet struct (which can be accessed via
|ssh_request_get| and |ssh_request_put|). Once the counter reaches zero, the
``release()`` callback of the |ssh_request_ops| reference of the request is
called.

Requests can have an optional response that is equally sent via an SSH
message with command-type payload (from EC to host). The party constructing
the request must know if a response is expected and mark this in the request
flags provided to |ssh_request_init|, so that the request transport layer
can wait for this response.

Similar to |ssh_packet|, |ssh_request| also has a ``complete()`` callback
provided via its request ops reference and is guaranteed to be completed
before it is released once it has been submitted to the request transport
layer via |ssh_rtl_submit|. For a request without a response, successful
completion will occur once the underlying packet has been successfully
transmitted by the packet transport layer (i.e. from within the packet
completion callback). For a request with response, successful completion
will occur once the response has been received and matched to the request
via its request ID (which happens on the packet layer's data-received
callback running on the receiver thread). If the request is completed with
an error, the status value will be set to the corresponding (negative) errno
value.

The state of a request is again managed via its ``state`` flags
(|ssh_request_flags|), which also encode the request type. In particular,
the following bits are noteworthy:

* ``SSH_REQUEST_SF_LOCKED_BIT``: This bit is set when completion, either
  through error or success, is imminent. It indicates that no further
  references to the request should be taken and any existing references
  should be dropped as soon as possible. The process setting this bit is
  responsible for removing any references to this request from the request
  queue and pending set.

* ``SSH_REQUEST_SF_COMPLETED_BIT``: This bit is set by the process running the
  ``complete()`` callback and is used to ensure that this callback only runs
  once.

* ``SSH_REQUEST_SF_QUEUED_BIT``: This bit is set when the request is queued on
  the request queue and cleared when it is dequeued.

* ``SSH_REQUEST_SF_PENDING_BIT``: This bit is set when the request is added to
  the pending set and cleared when it is removed from it.

Request Queue
-------------

The request queue is the first of the two fundamental collections in the
request transport layer. In contrast to the packet queue of the packet
transport layer, it is not a priority queue and the simple first-come,
first-served principle applies.

All requests to be transmitted by the request transport layer must be
submitted to this queue via |ssh_rtl_submit|. Once submitted, requests may
not be re-submitted, and will not be re-submitted automatically on timeout.
Instead, the request is completed with a timeout error. If desired, the
caller can create and submit a new request for another try, but it must not
submit the same request again.

Pending Set
-----------

The pending set is the second of the two fundamental collections in the
request transport layer. This collection stores references to all pending
requests, i.e. requests awaiting a response from the EC (similar to what the
pending set of the packet transport layer does for packets).

Transmitter Task
----------------

The transmitter task is scheduled when a new request is available for
transmission. It checks if the next request on the request queue can be
transmitted and, if so, submits its underlying packet to the packet
transport layer. This check ensures that only a limited number of
requests can be pending, i.e. waiting for a response, at the same time. If
the request requires a response, the request is added to the pending set
before its packet is submitted.

Packet Completion Callback
--------------------------

The packet completion callback is executed once the underlying packet of a
request has been completed. In case of an error completion, the
corresponding request is completed with the error value provided in this
callback.

On successful packet completion, further processing depends on the request.
If the request expects a response, it is marked as transmitted and the
request timeout is started. If the request does not expect a response, it is
completed with success.

Data-Received Callback
----------------------

The data-received callback notifies the request transport layer of data
being received by the underlying packet transport layer via a data-type
frame. In general, this is expected to be a command-type payload.

If the request ID of the command is one of the request IDs reserved for
events (one to ``SSH_NUM_EVENTS``, inclusive), it is forwarded to the
event callback registered in the request transport layer. If the request ID
indicates a response to a request, the respective request is looked up in
the pending set and, if found and marked as transmitted, completed with
success.
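
The split between events and responses reduces to a range check on the
request ID. The sketch below uses a hypothetical helper name and a placeholder
value for the number of reserved event IDs; the value of ``SSH_NUM_EVENTS`` in
the driver may differ.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   #define DEMO_NUM_EVENTS 38u  /* placeholder for SSH_NUM_EVENTS */

   /*
    * Request IDs 1..DEMO_NUM_EVENTS (inclusive) are reserved for events.
    * Subtracting one first lets a single comparison reject zero as well:
    * zero wraps around to 0xffff, which is above the limit.
    */
   static bool demo_rqid_is_event(uint16_t rqid)
   {
       return (uint16_t)(rqid - 1u) < DEMO_NUM_EVENTS;
   }

A command for which ``demo_rqid_is_event()`` returns true would be handed to
the event callback, while any other request ID would be looked up in the
pending set.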

Timeout Reaper
--------------

The request response timeout is a per-request timeout for requests expecting
a response. It is used to ensure that a request does not wait indefinitely
on a response from the EC and is started after the underlying packet has
been successfully completed.

This timeout is, similar to the packet acknowledgment timeout on the packet
transport layer, handled via a dedicated reaper task. This task is
essentially a work item (re-)scheduled to run when the next request is set
to time out. The work item then scans the set of pending requests for any
requests that have timed out and completes them with ``-ETIMEDOUT`` as
status. Requests will not be re-submitted automatically. Instead, the issuer
of the request must construct and submit a new request, if so desired.

Note that this timeout, in combination with packet transmission and
acknowledgment timeouts, guarantees that the request layer will always make
progress, even if only through timing out packets, and never fully block.

Concurrency and Locking
-----------------------

Similar to the packet transport layer, there are two main locks in the
request transport layer: One guarding access to the request queue and one
guarding access to the pending set. These collections may only be accessed
and modified under the respective lock.

Other parts of the request transport layer are guarded independently. State
flags are (again) managed by atomic bit operations and, if necessary, memory
barriers. Modifications to the timeout reaper work item and expiration date
are guarded by their own lock.

Some request fields may be read outside of the respective locks guarding
them, specifically the state for tracing. In those cases, proper access is
ensured by employing ``WRITE_ONCE()`` and ``READ_ONCE()``. Such read-only
access is only allowed when stale values are not critical.

With respect to the interface for higher layers, request submission
(|ssh_rtl_submit|), request cancellation (|ssh_rtl_cancel|), and layer
shutdown (|ssh_rtl_shutdown|) may always be executed concurrently with
respect to each other. Note that request submission may not run concurrently
with itself for the same request (and also may only be called once per
request). Equally, shutdown may also not run concurrently with itself.


Controller Layer
================

The controller layer builds on the request transport layer to provide an
easy-to-use interface for client drivers. It is represented by
|ssam_controller| and the SSH driver. While the lower level transport layers
take care of transmitting and handling packets and requests, the controller
layer takes on more of a management role. Specifically, it handles device
initialization, power management, and event handling, including event
delivery and registration via the (event) completion system (|ssam_cplt|).

Event Registration
------------------

In general, an event (or rather a class of events) has to be explicitly
requested by the host before the EC will send it (HID input events seem to
be the exception). This is done via an event-enable request (similarly,
events should be disabled via an event-disable request once no longer
desired).

The specific request used to enable (or disable) an event is given via an
event registry, i.e. the governing authority of this event (so to speak),
represented by |ssam_event_registry|. As parameters to this request, the
target category and, depending on the event registry, instance ID of the
event to be enabled must be provided. This (optional) instance ID must be
zero if the registry does not use it. Together, target category and instance
ID form the event ID, represented by |ssam_event_id|. In short, both event
registry and event ID are required to uniquely identify a respective class
of events.

Note that a further *request ID* parameter must be provided for the
enable-event request. This parameter does not influence the class of events
being enabled, but instead is set as the request ID (RQID) on each event of
this class sent by the EC. It is used to identify events (as a limited
number of request IDs is reserved for use in events only, specifically one
to ``SSH_NUM_EVENTS`` inclusive) and also map events to their specific
class. Currently, the controller always sets this parameter to the target
category specified in |ssam_event_id|.

As multiple client drivers may rely on the same (or overlapping) classes of
events and enable/disable calls are strictly binary (i.e. on/off), the
controller has to manage access to these events. It does so via reference
counting, storing the counter inside an RB-tree based mapping with event
registry and ID as key (there is no known list of valid event registry and
event ID combinations). See |ssam_nf|, |ssam_nf_refcount_inc|, and
|ssam_nf_refcount_dec| for details.
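
The reference-counting scheme can be illustrated with a much simpler
container. The sketch below replaces the RB-tree with a small fixed-size
table and collapses the event registry into a single byte; these
simplifications and all ``demo_*`` names are assumptions made for
illustration and do not reflect the driver's data layout.

.. code-block:: c

   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   #define DEMO_MAX_ENTRIES 8

   /* Hypothetical key: event registry plus event ID. */
   struct demo_event_key {
       uint8_t registry;         /* stands in for the event registry */
       uint8_t target_category;  /* event ID, first part */
       uint8_t instance;         /* event ID, second part */
   };

   struct demo_entry {
       struct demo_event_key key;
       unsigned int refcount;    /* zero marks an unused slot */
   };

   static struct demo_entry demo_table[DEMO_MAX_ENTRIES];

   static bool demo_key_eq(const struct demo_event_key *a,
                           const struct demo_event_key *b)
   {
       return a->registry == b->registry &&
              a->target_category == b->target_category &&
              a->instance == b->instance;
   }

   /* Returns the new count, or -1 if the table is full. */
   static int demo_refcount_inc(struct demo_event_key key)
   {
       struct demo_entry *slot = NULL;
       size_t i;

       for (i = 0; i < DEMO_MAX_ENTRIES; i++) {
           struct demo_entry *e = &demo_table[i];

           if (e->refcount && demo_key_eq(&e->key, &key))
               return (int)++e->refcount;
           if (!e->refcount && !slot)
               slot = e;
       }

       if (!slot)
           return -1;

       slot->key = key;
       slot->refcount = 1;
       return 1;
   }

   /* Returns the new count, or -1 if the key is not registered. */
   static int demo_refcount_dec(struct demo_event_key key)
   {
       size_t i;

       for (i = 0; i < DEMO_MAX_ENTRIES; i++) {
           struct demo_entry *e = &demo_table[i];

           if (e->refcount && demo_key_eq(&e->key, &key))
               return (int)--e->refcount;
       }

       return -1;
   }

In such a scheme, a caller would send the event-enable request only when the
increment returns one and the event-disable request only when the decrement
returns zero, so that the binary on/off state of the EC follows the first and
last user of an event class.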

This management is done together with notifier registration (described in
the next section) via the top-level |ssam_notifier_register| and
|ssam_notifier_unregister| functions.

Event Delivery
--------------

To receive events, a client driver has to register an event notifier via
|ssam_notifier_register|. This increments the reference counter for that
specific class of events (as detailed in the previous section), enables the
class on the EC (if it has not been enabled already), and installs the
provided notifier callback.

Notifier callbacks are stored in lists, with one (RCU) list per target
category (provided via the event ID; NB: there is a fixed known number of
target categories). There is no known association from the combination of
event registry and event ID to the command data (target ID, target category,
command ID, and instance ID) that can be provided by an event class, apart
from target category and instance ID given via the event ID.

Note that due to the way notifiers are (or rather have to be) stored, client
drivers may receive events that they have not requested and need to account
for them. Specifically, they will, by default, receive all events from the
same target category. To simplify dealing with this, filtering of events by
target ID (provided via the event registry) and instance ID (provided via
the event ID) can be requested when registering a notifier. This filtering
is applied when iterating over the notifiers at the time they are executed.
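
The optional filtering can be pictured as a per-notifier match function that
is evaluated for each event while walking the list of a target category. The
mask flags, structures, and function below are hypothetical names used for
this sketch only.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Hypothetical filter flags chosen at notifier registration. */
   enum {
       DEMO_MASK_NONE     = 0,
       DEMO_MASK_TARGET   = 1 << 0,  /* also compare the target ID */
       DEMO_MASK_INSTANCE = 1 << 1,  /* also compare the instance ID */
   };

   struct demo_event {
       uint8_t tc;   /* target category */
       uint8_t tid;  /* target ID */
       uint8_t iid;  /* instance ID */
   };

   struct demo_notifier {
       uint8_t tc;          /* from the event ID */
       uint8_t tid;         /* from the event registry */
       uint8_t iid;         /* from the event ID */
       unsigned int mask;   /* combination of DEMO_MASK_* flags */
   };

   /* The target category always has to match; the rest is optional. */
   static bool demo_matches(const struct demo_notifier *n,
                            const struct demo_event *e)
   {
       if (n->tc != e->tc)
           return false;
       if ((n->mask & DEMO_MASK_TARGET) && n->tid != e->tid)
           return false;
       if ((n->mask & DEMO_MASK_INSTANCE) && n->iid != e->iid)
           return false;

       return true;
   }

Without any mask, such a notifier sees every event of its target category,
which corresponds to the default behavior described above.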

All notifier callbacks are executed on a dedicated workqueue, the so-called
completion workqueue. After an event has been received via the callback
installed in the request layer (running on the receiver thread of the packet
transport layer), it will be put on its respective event queue
(|ssam_event_queue|). From this event queue the completion work item of that
queue (running on the completion workqueue) will pick up the event and
execute the notifier callback. This is done to avoid blocking on the
receiver thread.

There is one event queue per combination of target ID and target category.
This is done to ensure that notifier callbacks are executed in sequence for
events of the same target ID and target category. Callbacks can be executed
in parallel for events with a different combination of target ID and target
category.

Concurrency and Locking
-----------------------

Most of the concurrency-related safety guarantees of the controller are
provided by the lower-level request transport layer. In addition to this,
event (un-)registration is guarded by its own lock.

Access to the controller state is guarded by the state lock. This lock is a
read/write semaphore. The reader part can be used to ensure that the state
does not change while functions that depend on the state staying the same
(e.g. |ssam_notifier_register|, |ssam_notifier_unregister|,
|ssam_request_sync_submit|, and derivatives) are executed and this guarantee
is not already provided otherwise (e.g. through |ssam_client_bind| or
|ssam_client_link|). The writer part guards any transitions that will change
the state, i.e. initialization, destruction, suspension, and resumption.

The controller state may be accessed (read-only) outside the state lock for
smoke-testing against invalid API usage (e.g. in |ssam_request_sync_submit|).
Note that such checks are not supposed to (and will not) protect against all
invalid usages, but rather aim to help catch them. In those cases, proper
variable access is ensured by employing ``WRITE_ONCE()`` and ``READ_ONCE()``.

Assuming any preconditions on the state not changing have been satisfied,
all non-initialization and non-shutdown functions may run concurrently with
each other. This includes |ssam_notifier_register|, |ssam_notifier_unregister|,
|ssam_request_sync_submit|, as well as all functions building on top of those.