.. SPDX-License-Identifier: GPL-2.0

====================
Kernel Testing Guide
====================


There are a number of different tools for testing the Linux kernel, so knowing
when to use each of them can be a challenge. This document provides a rough
overview of their differences, and how they fit together.


Writing and Running Tests
=========================

The bulk of kernel tests are written using either the kselftest or KUnit
frameworks. These both provide infrastructure to help make running tests and
groups of tests easier, as well as providing helpers to aid in writing new
tests.

If you're looking to verify the behaviour of the kernel, and in particular of
specific parts of the kernel, then you'll want to use KUnit or kselftest.


The Difference Between KUnit and kselftest
------------------------------------------

KUnit (Documentation/dev-tools/kunit/index.rst) is an entirely in-kernel system
for "white box" testing: because test code is part of the kernel, it can access
internal structures and functions which aren't exposed to userspace.

KUnit tests therefore are best written against small, self-contained parts
of the kernel, which can be tested in isolation. This aligns well with the
concept of 'unit' testing.

For example, a KUnit test might test an individual kernel function (or even a
single codepath through a function, such as an error handling case), rather
than a feature as a whole.
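
A minimal KUnit test along these lines might look something like the following
sketch, where ``example_add()`` is a hypothetical stand-in for whatever small
piece of kernel code is being tested:

.. code-block:: c

    #include <kunit/test.h>
    #include <linux/module.h>

    /* Hypothetical function under test. */
    static int example_add(int a, int b)
    {
            return a + b;
    }

    static void example_add_test(struct kunit *test)
    {
            KUNIT_EXPECT_EQ(test, 3, example_add(1, 2));
            KUNIT_EXPECT_EQ(test, 0, example_add(-1, 1));
    }

    static struct kunit_case example_test_cases[] = {
            KUNIT_CASE(example_add_test),
            {}
    };

    static struct kunit_suite example_test_suite = {
            .name = "example",
            .test_cases = example_test_cases,
    };
    kunit_test_suite(example_test_suite);

    MODULE_LICENSE("GPL");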

Because they are small and self-contained, KUnit tests are also very fast to
build and run, allowing them to be run frequently as part of the development
process.

There is a KUnit test style guide in Documentation/dev-tools/kunit/style.rst
which may give further pointers.


kselftest (Documentation/dev-tools/kselftest.rst), on the other hand, is
largely implemented in userspace, and tests are normal userspace scripts or
programs.

This makes it easier to write more complicated tests, or tests which need to
manipulate the overall system state more (e.g., spawning processes, etc.).
However, it's not possible to call kernel functions directly from kselftest.
This means that only kernel functionality which is exposed to userspace somehow
(e.g. by a syscall, device, filesystem, etc.) can be tested with kselftest. To
work around this, some tests include a companion kernel module which exposes
more information or functionality. If a test runs mostly or entirely within the
kernel, however, KUnit may be the more appropriate tool.

kselftest is therefore well suited to testing whole features, as these expose
an interface to userspace which can be tested, but not to testing
implementation details. This aligns well with 'system' or 'end-to-end' testing.

For example, all new system calls should be accompanied by kselftest tests.
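
As a rough sketch, a kselftest written in C using the shared kselftest harness
(tools/testing/selftests/kselftest_harness.h) might look like the following.
The test name and the ``getpid()`` call are only illustrative placeholders for
the interface actually under test, and the include path assumes the test lives
in a subdirectory of tools/testing/selftests/:

.. code-block:: c

    #include <unistd.h>

    #include "../kselftest_harness.h"

    /*
     * Illustrative placeholder: a real test would exercise the new syscall
     * or feature through its userspace interface.
     */
    TEST(getpid_returns_positive)
    {
            ASSERT_GT(getpid(), 0);
    }

    TEST_HARNESS_MAIN

Tests may equally well be shell scripts; either way they are wired into the
framework via the subdirectory's Makefile.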

Code Coverage Tools
===================

The Linux kernel supports two different code coverage measurement tools. These
can be used to verify that a test is executing particular functions or lines
of code. This is useful for determining how much of the kernel is being tested,
and for finding corner cases which are not covered by suitable tests.

Documentation/dev-tools/gcov.rst is GCC's coverage testing tool, which can be
used with the kernel to get global or per-module coverage. Unlike KCOV, it
does not record per-task coverage. Coverage data can be read from debugfs,
and interpreted using the usual gcov tooling.

Documentation/dev-tools/kcov.rst is a feature which can be built into the
kernel to allow capturing coverage on a per-task level. It's therefore useful
for fuzzing and other situations where information about code executed during,
for example, a single syscall is useful.
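
As a sketch of how that per-task interface is used (closely following the
example in Documentation/dev-tools/kcov.rst, with error handling trimmed for
brevity), a userspace program enables KCOV on its own thread, performs the
syscall of interest, and reads the collected program counters back from a
shared buffer:

.. code-block:: c

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <linux/kcov.h>

    #define COVER_SIZE (64 << 10)

    int main(void)
    {
            unsigned long *cover, n, i;
            int fd;

            /* One file descriptor collects coverage for one thread. */
            fd = open("/sys/kernel/debug/kcov", O_RDWR);
            ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE);

            /* Buffer shared between kernel- and user-space. */
            cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
                         PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

            /* Enable coverage collection on the current thread. */
            ioctl(fd, KCOV_ENABLE, KCOV_TRACE_PC);
            __atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED);

            /* The syscall whose coverage we want to capture. */
            read(-1, NULL, 0);

            /* cover[0] holds the number of PCs collected. */
            n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
            for (i = 0; i < n; i++)
                    printf("0x%lx\n", cover[i + 1]);

            ioctl(fd, KCOV_DISABLE, 0);
            munmap(cover, COVER_SIZE * sizeof(unsigned long));
            close(fd);
            return 0;
    }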


Dynamic Analysis Tools
======================

The kernel also supports a number of dynamic analysis tools, which attempt to
detect classes of issues when they occur in a running kernel. These typically
each look for a different class of bugs, such as invalid memory accesses,
concurrency issues such as data races, or other undefined behaviour like
integer overflows.

Some of these tools are listed below:

* kmemleak detects possible memory leaks. See
  Documentation/dev-tools/kmemleak.rst
* KASAN detects invalid memory accesses such as out-of-bounds and
  use-after-free errors; a sketch of such a bug follows this list. See
  Documentation/dev-tools/kasan.rst
* UBSAN detects behaviour that is undefined by the C standard, like integer
  overflows. See Documentation/dev-tools/ubsan.rst
* KCSAN detects data races. See Documentation/dev-tools/kcsan.rst
* KFENCE is a low-overhead detector of memory issues, which is much faster than
  KASAN and can be used in production. See Documentation/dev-tools/kfence.rst
* lockdep is a locking correctness validator. See
  Documentation/locking/lockdep-design.rst
* Runtime Verification (RV) supports checking specific behaviours for a given
  subsystem. See Documentation/trace/rv/runtime-verification.rst
* There are several other pieces of debug instrumentation in the kernel, many
  of which can be found in lib/Kconfig.debug
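
To make the kind of report these tools produce more concrete, below is a
sketch of a deliberately buggy, entirely hypothetical function: with
CONFIG_KASAN enabled, the out-of-bounds write would be reported at runtime,
complete with allocation and access stack traces, whenever this code happened
to run during a test.

.. code-block:: c

    #include <linux/slab.h>

    /* Hypothetical bug, for illustration only. */
    static void kasan_example(void)
    {
            char *buf = kmalloc(8, GFP_KERNEL);

            if (!buf)
                    return;

            buf[8] = 'x';   /* out-of-bounds: one byte past the allocation */
            kfree(buf);
    }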

These tools tend to test the kernel as a whole, and do not "pass" like
kselftest or KUnit tests. They can be combined with KUnit or kselftest by
running tests on a kernel with these tools enabled: you can then be sure
that none of these errors are occurring during the test.

Some of these tools integrate with KUnit or kselftest and will
automatically fail tests if an issue is detected.

Static Analysis Tools
=====================

In addition to testing a running kernel, one can also analyze kernel source code
directly (**at compile time**) using **static analysis** tools. The tools
commonly used in the kernel allow one to inspect the whole source tree or just
specific files within it. They make it easier to detect and fix problems during
the development process.

Sparse can help test the kernel by performing type-checking, lock checking,
value range checking, in addition to reporting various errors and warnings while
examining the code. See the Documentation/dev-tools/sparse.rst documentation
page for details on how to use it.

Smatch extends Sparse and provides additional checks for programming logic
mistakes such as missing breaks in switch statements, unused return values on
error checking, forgetting to set an error code in the return of an error path,
etc. Smatch also has tests against more serious issues such as integer
overflows, null pointer dereferences, and memory leaks. See the project page at
http://smatch.sourceforge.net/.
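
For instance, the "forgotten error code" case mentioned above looks something
like the hypothetical function below; Smatch's flow analysis can notice that
the failure path returns ``ret`` while it still holds 0:

.. code-block:: c

    #include <linux/slab.h>

    /* Hypothetical example of a pattern Smatch can flag. */
    static int smatch_example(void **out)
    {
            void *buf;
            int ret = 0;

            buf = kzalloc(16, GFP_KERNEL);
            if (!buf)
                    goto err;       /* bug: ret should be set to -ENOMEM */

            *out = buf;
            return 0;

    err:
            return ret;             /* reported as a missing error code */
    }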

Coccinelle is another static analyzer at our disposal. Coccinelle is often used
to aid refactoring and collateral evolution of source code, but it can also help
to avoid certain bugs that occur in common code patterns. The types of tests
available include API tests, tests for correct usage of kernel iterators, checks
for the soundness of free operations, analysis of locking behavior, and further
tests known to help keep consistent kernel usage. See the
Documentation/dev-tools/coccinelle.rst documentation page for details.

Beware, though, that static analysis tools suffer from **false positives**.
Errors and warnings need to be evaluated carefully before attempting to fix
them.

When to use Sparse and Smatch
-----------------------------

Sparse does type checking, such as verifying that annotated variables do not
cause endianness bugs, detecting places that use ``__user`` pointers improperly,
and analyzing the compatibility of symbol initializers.
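
For example, when the kernel is built with ``make C=1``, Sparse will typically
complain about both of the (hypothetical) mistakes below: dereferencing a
``__user`` pointer directly instead of going through ``copy_from_user()``, and
letting a ``__le32`` value silently degrade to a plain integer:

.. code-block:: c

    #include <linux/types.h>

    /* Hypothetical function containing two Sparse-visible mistakes. */
    static int sparse_example(int __user *uptr)
    {
            __le32 wire = 0;
            int val;

            val = *uptr;    /* direct dereference of a __user pointer */
            val += wire;    /* restricted __le32 used as a plain integer */

            return val;
    }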

Smatch does flow analysis and, if allowed to build the function database, it
also does cross function analysis. Smatch tries to answer questions like where
is this buffer allocated? How big is it? Can this index be controlled by the
user? Is this variable larger than that variable?

It's generally easier to write checks in Smatch than it is to write checks in
Sparse. Nevertheless, there are some overlaps between Sparse and Smatch checks.

Strong points of Smatch and Coccinelle
--------------------------------------

Coccinelle is probably the easiest for writing checks. It works before the
pre-processor so it's easier to check for bugs in macros using Coccinelle.
Coccinelle also creates patches for you, which no other tool does.

For example, with Coccinelle you can do a mass conversion from
``kmalloc(x * size, GFP_KERNEL)`` to ``kmalloc_array(x, size, GFP_KERNEL)``, and
that's really useful. If you just created a Smatch warning and tried to push the
work of converting onto the maintainers, they would be annoyed. You'd have to
argue about whether each warning can really overflow or not.
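
In C terms, that conversion is the difference between the two hypothetical
helpers below: the open-coded multiplication can overflow and under-allocate,
whereas ``kmalloc_array()`` performs the same allocation with an overflow
check. A Coccinelle semantic patch can generate exactly this kind of change
across the whole tree:

.. code-block:: c

    #include <linux/slab.h>

    /* Before: open-coded multiplication, which can overflow. */
    static void *alloc_before(size_t n, size_t size)
    {
            return kmalloc(n * size, GFP_KERNEL);
    }

    /* After: what the Coccinelle-generated patch would switch to. */
    static void *alloc_after(size_t n, size_t size)
    {
            return kmalloc_array(n, size, GFP_KERNEL);
    }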

Coccinelle does no analysis of variable values, which is the strong point of
Smatch. On the other hand, Coccinelle allows you to do simple things in a simple
way.