| The Linux kernel supports the following overcommit handling modes |
| |
| 0 - Heuristic overcommit handling. Obvious overcommits of |
| address space are refused. Used for a typical system. It |
| ensures a seriously wild allocation fails while allowing |
| overcommit to reduce swap usage. root is allowed to |
| allocate slightly more memory in this mode. This is the |
| default. |
| |
| 1 - Always overcommit. Appropriate for some scientific |
| applications. Classic example is code using sparse arrays |
| and just relying on the virtual memory consisting almost |
| entirely of zero pages. |
| |
| 2 - Don't overcommit. The total address space commit |
| for the system is not permitted to exceed swap + a |
| configurable percentage (default is 50) of physical RAM. |
| Depending on the percentage you use, in most situations |
| this means a process will not be killed while accessing |
| pages but will receive errors on memory allocation as |
| appropriate. |
| |
| Useful for applications that want to guarantee their |
| memory allocations will be available in the future |
| without having to initialize every page. |
| |
| The overcommit policy is set via the sysctl `vm.overcommit_memory'. |
| |
| The overcommit percentage is set via `vm.overcommit_ratio'. |
| |
| The current overcommit limit and amount committed are viewable in |
| /proc/meminfo as CommitLimit and Committed_AS respectively. |
| |
| Gotchas |
| ------- |
| |
| The C language stack growth does an implicit mremap. If you want absolute |
| guarantees and run close to the edge you MUST mmap your stack for the |
| largest size you think you will need. For typical stack usage this does |
| not matter much but it's a corner case if you really really care |
| |
| In mode 2 the MAP_NORESERVE flag is ignored. |
| |
| |
| How It Works |
| ------------ |
| |
| The overcommit is based on the following rules |
| |
| For a file backed map |
| SHARED or READ-only - 0 cost (the file is the map not swap) |
| PRIVATE WRITABLE - size of mapping per instance |
| |
| For an anonymous or /dev/zero map |
| SHARED - size of mapping |
| PRIVATE READ-only - 0 cost (but of little use) |
| PRIVATE WRITABLE - size of mapping per instance |
| |
| Additional accounting |
| Pages made writable copies by mmap |
| shmfs memory drawn from the same pool |
| |
| Status |
| ------ |
| |
| o We account mmap memory mappings |
| o We account mprotect changes in commit |
| o We account mremap changes in size |
| o We account brk |
| o We account munmap |
| o We report the commit status in /proc |
| o Account and check on fork |
| o Review stack handling/building on exec |
| o SHMfs accounting |
| o Implement actual limit enforcement |
| |
| To Do |
| ----- |
| o Account ptrace pages (this is hard) |