=====================
Split page table lock
=====================

Originally, the mm->page_table_lock spinlock protected all page tables of
the mm_struct. But this approach leads to poor page fault scalability of
multi-threaded applications due to high contention on the lock. To improve
scalability, split page table lock was introduced.

With split page table lock we have a separate per-table lock to serialize
access to the table. At the moment we use split lock for PTE and PMD
tables. Access to higher level tables is protected by mm->page_table_lock.

There are helpers to lock/unlock a table and other accessor functions:

 - pte_offset_map_lock()
        maps PTE and takes PTE table lock, returns pointer to PTE with
        pointer to its PTE table lock, or returns NULL if no PTE table;
 - pte_offset_map_nolock()
        maps PTE, returns pointer to PTE with pointer to its PTE table
        lock (not taken), or returns NULL if no PTE table;
 - pte_offset_map()
        maps PTE, returns pointer to PTE, or returns NULL if no PTE table;
 - pte_unmap()
        unmaps PTE table;
 - pte_unmap_unlock()
        unlocks and unmaps PTE table;
 - pte_alloc_map_lock()
        allocates PTE table if needed and takes its lock, returns pointer to
        PTE with pointer to its lock, or returns NULL if allocation failed;
 - pmd_lock()
        takes PMD table lock, returns pointer to taken lock;
 - pmd_lockptr()
        returns pointer to PMD table lock;
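
The calling convention described by the list above (map and lock, check for
NULL, operate, then unlock and unmap) can be sketched in userspace as
follows. This is not kernel code: toy_table, toy_pmd, toy_offset_map_lock(),
toy_unmap_unlock() and toy_set_entry() are hypothetical stand-ins, and a C11
atomic_flag spinlock stands in for spinlock_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_ENTRIES 8                     /* hypothetical table size */

/* Minimal test-and-set spinlock standing in for the kernel's spinlock_t. */
typedef struct { atomic_flag locked; } toy_spinlock_t;

static void toy_spin_lock(toy_spinlock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;                                 /* spin until the flag was clear */
}

static void toy_spin_unlock(toy_spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Hypothetical stand-in for a PTE table that carries its own lock. */
struct toy_table {
    toy_spinlock_t ptl;                   /* per-table ("split") lock */
    unsigned long entry[TOY_ENTRIES];
};

/* Hypothetical stand-in for a PMD entry: points at a table, or NULL. */
struct toy_pmd { struct toy_table *table; };

static void toy_table_init(struct toy_table *t)
{
    atomic_flag_clear(&t->ptl.locked);
    for (size_t i = 0; i < TOY_ENTRIES; i++)
        t->entry[i] = 0;
}

/*
 * Modeled on the description of pte_offset_map_lock(): returns a pointer
 * to the entry with the table lock held and *ptlp set, or NULL if there
 * is no table (in which case no lock is taken).
 */
static unsigned long *toy_offset_map_lock(struct toy_pmd *pmd, size_t idx,
                                          toy_spinlock_t **ptlp)
{
    struct toy_table *t = pmd->table;

    if (!t || idx >= TOY_ENTRIES)
        return NULL;
    toy_spin_lock(&t->ptl);
    *ptlp = &t->ptl;
    return &t->entry[idx];
}

/* Modeled on pte_unmap_unlock(): drops the lock taken above. */
static void toy_unmap_unlock(unsigned long *entry, toy_spinlock_t *ptl)
{
    assert(entry != NULL && ptl != NULL);
    toy_spin_unlock(ptl);                 /* nothing to unmap in this model */
}

/* Caller pattern: the NULL return is handled before touching the entry. */
static int toy_set_entry(struct toy_pmd *pmd, size_t idx, unsigned long val)
{
    toy_spinlock_t *ptl;
    unsigned long *e = toy_offset_map_lock(pmd, idx, &ptl);

    if (!e)
        return -1;                        /* no table: caller bails out */
    *e = val;
    toy_unmap_unlock(e, ptl);
    return 0;
}
```

The point of the sketch is the shape of the caller: the lock pointer comes
back from the same call that maps the entry, and the NULL case takes no lock,
so there is nothing to release on that path.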

Split page table lock for PTE tables is enabled at compile time if
CONFIG_SPLIT_PTLOCK_CPUS (usually 4) is less than or equal to NR_CPUS.
If split lock is disabled, all tables are guarded by mm->page_table_lock.

Split page table lock for PMD tables is enabled if it is enabled for PTE
tables and the architecture supports it (see below).
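
The two compile-time conditions above can be written down as preprocessor
logic. The following is a standalone sketch, not the kernel's headers: the
TOY_* macro names, the example values for NR_CPUS and
CONFIG_SPLIT_PTLOCK_CPUS, and the ARCH_ENABLE_SPLIT_PMD_PTLOCK stand-in are
assumptions chosen for illustration.

```c
#include <assert.h>

/* Hypothetical example values; a real build takes these from Kconfig. */
#define NR_CPUS                      8
#define CONFIG_SPLIT_PTLOCK_CPUS     4
#define ARCH_ENABLE_SPLIT_PMD_PTLOCK 1    /* stand-in for the arch option */

static_assert(CONFIG_SPLIT_PTLOCK_CPUS > 0, "threshold must be positive");

/* PTE split lock: enabled when CONFIG_SPLIT_PTLOCK_CPUS <= NR_CPUS. */
#define TOY_USE_SPLIT_PTE_PTLOCKS (NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS)

/* PMD split lock: needs the PTE split lock plus architecture support. */
#define TOY_USE_SPLIT_PMD_PTLOCKS \
    (TOY_USE_SPLIT_PTE_PTLOCKS && ARCH_ENABLE_SPLIT_PMD_PTLOCK)

/* Reports which lock guards PTE tables under the values above. */
static const char *toy_pte_lock_kind(void)
{
#if TOY_USE_SPLIT_PTE_PTLOCKS
    return "per-table";
#else
    return "mm->page_table_lock";
#endif
}
```

With NR_CPUS lowered below the threshold in this sketch, both macros
evaluate to 0 and every table falls back to the single per-mm lock.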

Hugetlb and split page table lock
=================================

Hugetlb can support several page sizes. We use split lock only for PMD
level, but not for PUD.

Hugetlb-specific helpers:

 - huge_pte_lock()
        takes pmd split lock for PMD_SIZE page, mm->page_table_lock
        otherwise;
 - huge_pte_lockptr()
        returns pointer to table lock;
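
The lock selection rule stated for huge_pte_lock() can be modeled in a few
lines. This is a userspace sketch under stated assumptions: the toy_* types,
toy_huge_pte_lockptr() and the 2 MiB / 1 GiB sizes are hypothetical and only
mirror the rule "PMD split lock for PMD_SIZE pages, mm->page_table_lock
otherwise".

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PMD_SIZE (1UL << 21)          /* hypothetical: 2 MiB */
#define TOY_PUD_SIZE (1UL << 30)          /* hypothetical: 1 GiB */

struct toy_lock { int id; };              /* placeholder for a spinlock */

struct toy_mm { struct toy_lock page_table_lock; };
struct toy_pmd_table { struct toy_lock ptl; };

/*
 * Picks the lock for a huge page of the given size: the PMD table's own
 * lock for PMD-sized pages, the per-mm lock for everything else.
 */
static struct toy_lock *toy_huge_pte_lockptr(unsigned long huge_page_size,
                                             struct toy_mm *mm,
                                             struct toy_pmd_table *pmd_table)
{
    assert(mm != NULL && pmd_table != NULL);
    if (huge_page_size == TOY_PMD_SIZE)
        return &pmd_table->ptl;
    return &mm->page_table_lock;
}
```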

Support of split page table lock by an architecture
===================================================

No special enabling is needed for PTE split page table lock: everything
required is done by pagetable_pte_ctor() and pagetable_pte_dtor(), which
must be called on PTE table allocation / freeing.

Make sure the architecture doesn't use slab allocator for page table
allocation: slab uses page->slab_cache for its pages.
This field shares storage with page->ptl.

PMD split lock only makes sense if you have more than two page table
levels.

Enabling PMD split lock requires a pagetable_pmd_ctor() call on PMD table
allocation and a pagetable_pmd_dtor() call on freeing.

Allocation usually happens in pmd_alloc_one(), freeing in pmd_free() and
pmd_free_tlb(), but make sure you cover all PMD table allocation / freeing
paths: e.g. X86_PAE preallocates a few PMDs in pgd_alloc().

With everything in place you can set CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK.

NOTE: pagetable_pte_ctor() and pagetable_pmd_ctor() can fail -- failure
must be handled properly.
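
What handling a constructor failure looks like on an allocation path can be
shown with a userspace model. Nothing here is the kernel API: toy_table,
toy_table_ctor(), toy_table_dtor(), toy_table_alloc(), toy_table_free() and
the toy_fail_ctor test hook are hypothetical names, and malloc() stands in
for the page allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a page table whose lock is set up by a ctor. */
struct toy_table {
    void *lock;                           /* dynamically allocated lock */
    unsigned long entry[8];
};

/* Test hook (hypothetical): forces the constructor to fail. */
static bool toy_fail_ctor;

/* Constructor: sets up the lock; returns false if that is not possible. */
static bool toy_table_ctor(struct toy_table *t)
{
    if (toy_fail_ctor)
        return false;
    t->lock = malloc(sizeof(int));
    return t->lock != NULL;
}

/* Destructor: releases whatever the constructor set up. */
static void toy_table_dtor(struct toy_table *t)
{
    assert(t != NULL);
    free(t->lock);
    t->lock = NULL;
}

/* Allocation path: a failed ctor means the table itself is given back. */
static struct toy_table *toy_table_alloc(void)
{
    struct toy_table *t = calloc(1, sizeof(*t));

    if (!t)
        return NULL;
    if (!toy_table_ctor(t)) {
        free(t);                          /* undo the allocation */
        return NULL;
    }
    return t;
}

/* Freeing path: the dtor always runs before the table is released. */
static void toy_table_free(struct toy_table *t)
{
    if (!t)
        return;
    toy_table_dtor(t);
    free(t);
}
```

The pairing is the design point: every path that hands out a table runs the
constructor and checks its result, and every path that releases one runs the
destructor first.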

page->ptl
=========

page->ptl is used to access split page table lock, where 'page' is the
struct page of the page containing the table. It shares storage with
page->private (and a few other fields in the union).

To avoid increasing the size of struct page and to get the best
performance, we use a trick:

 - if spinlock_t fits into long, we use page->ptl as the spinlock itself,
   so we can avoid indirect access and save a cache line;
 - if the size of spinlock_t is bigger than the size of long, we use
   page->ptl as a pointer to spinlock_t and allocate it dynamically. This
   allows using split lock with DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC
   enabled, but costs one more cache line for indirect access.

The spinlock_t is allocated in pagetable_pte_ctor() for PTE tables and in
pagetable_pmd_ctor() for PMD tables.
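
The embed-or-allocate trick can be sketched in userspace as follows. All
names are hypothetical (toy_spinlock_t, toy_page, toy_ptlock_init(),
toy_ptlock_ptr(), toy_ptlock_free()), and the TOY_DEBUG_LOCK switch is an
assumption standing in for debug options that enlarge the lock; the real
struct page layout is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical lock type; TOY_DEBUG_LOCK makes it larger than a long. */
typedef struct {
    int locked;
#ifdef TOY_DEBUG_LOCK
    void *owner;
    unsigned long magic;
#endif
} toy_spinlock_t;

/* Hypothetical stand-in for struct page: ptl shares storage with private. */
struct toy_page {
    union {
        unsigned long private;
#ifdef TOY_DEBUG_LOCK
        toy_spinlock_t *ptl;              /* lock does not fit: pointer */
#else
        toy_spinlock_t ptl;               /* lock fits: embedded */
#endif
    };
};

#ifndef TOY_DEBUG_LOCK
static_assert(sizeof(toy_spinlock_t) <= sizeof(long),
              "embedded lock must fit into long");
#endif

/* Sets up the lock; only the pointer variant can fail. */
static bool toy_ptlock_init(struct toy_page *page)
{
#ifdef TOY_DEBUG_LOCK
    page->ptl = calloc(1, sizeof(*page->ptl));
    return page->ptl != NULL;
#else
    page->ptl.locked = 0;
    return true;
#endif
}

/* Accessor that hides which variant is in use from callers. */
static toy_spinlock_t *toy_ptlock_ptr(struct toy_page *page)
{
#ifdef TOY_DEBUG_LOCK
    return page->ptl;
#else
    return &page->ptl;
#endif
}

/* Releases the dynamically allocated lock, if there is one. */
static void toy_ptlock_free(struct toy_page *page)
{
#ifdef TOY_DEBUG_LOCK
    free(page->ptl);
    page->ptl = NULL;
#else
    (void)page;                           /* nothing to free when embedded */
#endif
}
```

Because callers only ever go through the accessor, the choice between the
two layouts stays a build-time detail.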

Please never access page->ptl directly -- use the appropriate helper.