======================
Userspace verbs access
======================

  The ib_uverbs module, built by enabling CONFIG_INFINIBAND_USER_VERBS,
  enables direct userspace access to IB hardware via "verbs," as
  described in chapter 11 of the InfiniBand Architecture Specification.

  To use the verbs, the libibverbs library, available from
  https://github.com/linux-rdma/rdma-core, is required. libibverbs contains a
  device-independent API for using the ib_uverbs interface.
  libibverbs also requires the appropriate device-dependent kernel and
  userspace drivers for your InfiniBand hardware. For example, to use
  a Mellanox HCA, you will need the ib_mthca kernel module and the
  libmthca userspace driver to be installed.

User-kernel communication
=========================

  Userspace communicates with the kernel for slow path, resource
  management operations via the /dev/infiniband/uverbsN character
  devices. Fast path operations are typically performed by writing
  directly to hardware registers mmap()ed into userspace, with no
  system call or context switch into the kernel.

  Commands are sent to the kernel via write()s on these device files.
  The ABI is defined in include/uapi/rdma/ib_user_verbs.h.
  The structs for commands that require a response from the kernel
  contain a 64-bit field used to pass a pointer to an output buffer.
  Status is returned to userspace as the return value of the write()
  system call.
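
As an illustration of the write() interface described above, the following C
sketch builds a command that carries a 64-bit response pointer. The struct and
function names (demo_cmd_hdr, demo_query_cmd, demo_query_resp,
demo_build_query) and the convention of expressing sizes in 32-bit words are
assumptions made for this example. The authoritative layouts are in the ABI
header, and applications would normally go through libibverbs rather than
build commands by hand.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative command header (hypothetical, not the kernel ABI). */
struct demo_cmd_hdr {
    uint32_t command;   /* which operation the kernel should perform */
    uint16_t in_words;  /* size of the whole command, in 32-bit words */
    uint16_t out_words; /* size of the response buffer, in 32-bit words */
};

/* Illustrative command that expects a response from the kernel. */
struct demo_query_cmd {
    struct demo_cmd_hdr hdr;
    uint64_t response;  /* userspace address of the output buffer */
};

/* Illustrative response layout. */
struct demo_query_resp {
    uint32_t value;
    uint32_t reserved;
};

/*
 * Fill in a command so that it could be handed to write() on a uverbs
 * device file. The pointer is widened to 64 bits so the layout is the
 * same for 32-bit and 64-bit userspace.
 */
static void demo_build_query(struct demo_query_cmd *cmd, uint32_t opcode,
                             struct demo_query_resp *resp)
{
    memset(cmd, 0, sizeof(*cmd));
    cmd->hdr.command = opcode;
    cmd->hdr.in_words = (uint16_t)(sizeof(*cmd) / 4);
    cmd->hdr.out_words = (uint16_t)(sizeof(*resp) / 4);
    cmd->response = (uint64_t)(uintptr_t)resp;
}
```

A caller would pass the filled-in struct to write() on an open uverbs file
descriptor, check the return value for status, and then read the reply from
the response buffer.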

Resource management
===================

  Since creation and destruction of all IB resources are done by
  commands passed through a file descriptor, the kernel can keep track
  of which resources are attached to a given userspace context. The
  ib_uverbs module maintains idr tables that are used to translate
  between kernel pointers and opaque userspace handles, so that kernel
  pointers are never exposed to userspace and userspace cannot trick
  the kernel into following a bogus pointer.

  This also allows the kernel to clean up when a process exits and
  to prevent one process from touching another process's resources.
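
The handle translation described above can be pictured with the following C
sketch. It is a simplified stand-in that uses a small fixed array instead of
the kernel's idr, and every name in it (demo_handle_table, demo_handle_alloc,
demo_handle_lookup, demo_handle_cleanup, the integer owner id) is
hypothetical.

```c
#include <stddef.h>

#define DEMO_MAX_HANDLES 16

/* The table must be zero-initialized before first use. */
struct demo_handle_table {
    void *object[DEMO_MAX_HANDLES]; /* kernel-side pointer, never shown to userspace */
    int owner[DEMO_MAX_HANDLES];    /* context that created the object */
};

/* Store obj and return an opaque handle, or -1 if the table is full. */
static int demo_handle_alloc(struct demo_handle_table *t, void *obj, int owner)
{
    for (int i = 0; i < DEMO_MAX_HANDLES; i++) {
        if (t->object[i] == NULL) {
            t->object[i] = obj;
            t->owner[i] = owner;
            return i;
        }
    }
    return -1;
}

/* Translate a handle back to a pointer; reject bogus or foreign handles. */
static void *demo_handle_lookup(const struct demo_handle_table *t,
                                int handle, int owner)
{
    if (handle < 0 || handle >= DEMO_MAX_HANDLES)
        return NULL;
    if (t->object[handle] == NULL || t->owner[handle] != owner)
        return NULL;
    return t->object[handle];
}

/* Drop every object created by owner, as on process exit; returns the count. */
static int demo_handle_cleanup(struct demo_handle_table *t, int owner)
{
    int freed = 0;

    for (int i = 0; i < DEMO_MAX_HANDLES; i++) {
        if (t->object[i] != NULL && t->owner[i] == owner) {
            t->object[i] = NULL;
            freed++;
        }
    }
    return freed;
}
```

Because userspace only ever sees the small integer, an out-of-range or stale
value fails the lookup instead of being dereferenced.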

Memory pinning
==============

  Direct userspace I/O requires that memory regions that are potential
  I/O targets be kept resident at the same physical address. The
  ib_uverbs module manages pinning and unpinning memory regions via
  get_user_pages() and put_page() calls. It also accounts for the
  amount of memory pinned in the process's pinned_vm, and checks that
  unprivileged processes do not exceed their RLIMIT_MEMLOCK limit.

  Pages that are pinned multiple times are counted each time they are
  pinned, so the value of pinned_vm may be an overestimate of the
  number of pages pinned by a process.
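
The accounting rule above can be modeled in a few lines of C. This is only a
sketch of the bookkeeping, not the kernel code: the demo_mm struct, its
fields, and the demo_account_pin() and demo_account_unpin() helpers are
hypothetical, and the limit is assumed to be expressed in pages.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-process accounting state. */
struct demo_mm {
    uint64_t pinned_vm;  /* pages currently accounted as pinned */
    uint64_t lock_limit; /* RLIMIT_MEMLOCK, expressed in pages */
    bool privileged;     /* privileged processes skip the limit check */
};

/*
 * Account for npages newly pinned pages. Returns 0 on success, or -1 if an
 * unprivileged process would exceed its limit. The count is not
 * de-duplicated, so pinning the same pages twice is charged twice.
 */
static int demo_account_pin(struct demo_mm *mm, uint64_t npages)
{
    uint64_t new_total = mm->pinned_vm + npages;

    if (!mm->privileged && new_total > mm->lock_limit)
        return -1;
    mm->pinned_vm = new_total;
    return 0;
}

/* Undo the accounting when a region is unpinned. */
static void demo_account_unpin(struct demo_mm *mm, uint64_t npages)
{
    if (npages > mm->pinned_vm)
        npages = mm->pinned_vm;
    mm->pinned_vm -= npages;
}
```

With a limit of 16 pages, pinning the same 8-page region twice brings the
counter to 16 even though only 8 distinct pages are resident, which is the
overestimate described above.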

/dev files
==========

  To create the appropriate character device files automatically with
  udev, a rule like::

    KERNEL=="uverbs*", NAME="infiniband/%k"

  can be used. This will create device nodes named::

    /dev/infiniband/uverbs0

  and so on. Since the InfiniBand userspace verbs should be safe for
  use by non-privileged processes, it may be useful to add an
  appropriate MODE or GROUP to the udev rule.
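
One way to check that the resulting permissions suit a non-privileged process
is to try opening the node from that process. The C sketch below assumes the
/dev/infiniband/uverbsN naming shown above; the helper names demo_uverbs_path
and demo_uverbs_open are hypothetical.

```c
#include <fcntl.h>
#include <stdio.h>

/* Format the node name for device index n; returns snprintf()'s result. */
static int demo_uverbs_path(char *buf, size_t len, unsigned int n)
{
    return snprintf(buf, len, "/dev/infiniband/uverbs%u", n);
}

/*
 * Try to open the node read/write. Returns a file descriptor on success,
 * or -1 if the path does not fit or open() fails (errno is then set by
 * open(), for example EACCES when MODE or GROUP is too restrictive).
 */
static int demo_uverbs_open(unsigned int n)
{
    char path[64];
    int len = demo_uverbs_path(path, sizeof(path), n);

    if (len < 0 || (size_t)len >= sizeof(path))
        return -1;
    return open(path, O_RDWR);
}
```

On a machine without InfiniBand hardware the open simply fails, so the
result is only meaningful where the ib_uverbs module is loaded.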