sys_mlockall  [mm/mlock.c]


Locks all of the calling process's virtual address space into RAM, preventing that memory from being paged to the swap area.

All mapped pages are guaranteed to be resident in RAM when the call returns successfully, and they are guaranteed to stay in RAM until they are later unlocked.

Memory locking has two main applications: real-time algorithms and high-security data processing. Real-time applications require deterministic timing, and paging, like scheduling, is a major cause of unexpected execution delays. Real-time applications will usually also switch to a real-time scheduler with sys_sched_setscheduler. Cryptographic security software often keeps critical bytes such as passwords or secret keys in its data structures. As a result of paging, these secrets could be written to a persistent swap medium, where they might be accessible to an attacker long after the security software has erased the secrets in RAM and terminated. (But be aware that the suspend mode on laptops and some desktop computers will save a copy of the system's RAM to disk, regardless of memory locks.)

Real-time processes that are using sys_mlockall to prevent delays on page faults should reserve enough locked stack pages before entering the time-critical section, so that no page fault can be caused by function calls. This can be achieved by calling a function that allocates a sufficiently large automatic variable (an array) and writes to the memory occupied by this array in order to touch these stack pages. This way, enough pages will be mapped for the stack and can be locked into RAM. The dummy writes ensure that not even copy-on-write page faults can occur in the critical section.
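
The stack pre-faulting technique described above can be sketched in C roughly as follows. This is a minimal sketch, not a definitive implementation: the names MAX_SAFE_STACK and stack_prefault are illustrative, and the 64 KiB size is an assumed placeholder that a real program would replace with the stack depth its critical section actually needs.

```c
#include <stddef.h>
#include <unistd.h>

/* Assumed placeholder: must cover the deepest call chain of the
 * time-critical section. */
#define MAX_SAFE_STACK (64 * 1024)

/* Write to one byte in every page of a large automatic array so that the
 * stack pages are mapped and private before the critical section starts.
 * Returns the number of pages touched by the loop. */
size_t stack_prefault(void)
{
    /* volatile keeps the compiler from optimizing the dummy writes away */
    volatile unsigned char dummy[MAX_SAFE_STACK];
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t touched = 0;

    for (size_t i = 0; i < MAX_SAFE_STACK; i += page) {
        dummy[i] = 0;
        touched++;
    }
    dummy[MAX_SAFE_STACK - 1] = 0;  /* also touch the last byte */
    return touched;
}
```

A process would typically call stack_prefault() once, after locking its memory with MCL_CURRENT | MCL_FUTURE and before entering the time-critical section.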

Memory locks are not inherited by a child created via sys_fork and are automatically removed (unlocked) during a sys_execve or when the process terminates.

The memory lock on an address range is automatically removed if the address range is unmapped via sys_munmap.

Memory locks do not stack, i.e., pages which have been locked several times will be unlocked by a single call to sys_munlockall. Pages which are mapped to several locations or by several processes stay locked into RAM as long as they are locked at least at one location or by at least one process.
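
The non-stacking behavior can be observed with a small experiment. The sketch below is an illustration under stated assumptions: it locks a single page with the sibling call mlock (sys_mlock), which is not described in this section, and it assumes that /proc/self/status exposes a VmLck field as on typical Linux systems. The function names are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Read the VmLck field (locked memory in kB) from /proc/self/status.
 * Returns -1 if the file or the field is unavailable. */
static long locked_kb(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long kb = -1;

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (sscanf(line, "VmLck: %ld kB", &kb) == 1)
            break;
    }
    fclose(f);
    return kb;
}

/* Lock the same page twice, unlock everything once, and report how much
 * memory is still locked. Returns -1 if locking is not permitted or the
 * information cannot be read. */
long lock_twice_unlock_once(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf;
    long after;

    if (page <= 0)
        return -1;
    buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* Two locks on the same range do not accumulate. */
    if (mlock(buf, (size_t)page) != 0 || mlock(buf, (size_t)page) != 0) {
        munmap(buf, (size_t)page);
        return -1;
    }
    /* A single unlock call removes the lock. */
    if (munlockall() != 0) {
        munmap(buf, (size_t)page);
        return -1;
    }
    after = locked_kb();
    munmap(buf, (size_t)page);
    return after;
}
```

When locking is permitted and nothing else in the process is locked, the function is expected to report that no memory remains locked after the single unlock call.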

Arguments

eax 152
ebx Flags:
MCL_CURRENT Lock all pages which are currently mapped into the address space of the process.
MCL_FUTURE Lock all pages which will become mapped into the address space of the process in the future. These could be for instance new pages required by a growing heap and stack as well as new memory mapped files or shared memory regions.
If MCL_FUTURE has been specified, then a later system call (e.g., sys_mmap) may fail if it would cause the number of locked bytes to exceed the permitted maximum (see below). In the same circumstances, stack growth may likewise fail: the kernel will deny stack expansion and deliver a SIGSEGV signal to the process.
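
From C, the call can be issued through the generic syscall() wrapper. The following is a minimal sketch: lock_all_memory is a hypothetical helper name, and SYS_mlockall from <sys/syscall.h> is used instead of a hard-coded number because the value 152 listed above applies to 32-bit x86 and other architectures use different numbers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke the system call with the given MCL_* flags.
 * Returns 0 on success or a negative errno value, mirroring the raw
 * kernel convention used in the "Return values" section. */
long lock_all_memory(int flags)
{
    long ret = syscall(SYS_mlockall, flags);

    return ret == 0 ? 0 : -(long)errno;
}
```

A typical real-time program would call lock_all_memory(MCL_CURRENT | MCL_FUTURE) once during start-up, so that both existing and later mappings are locked.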

Return values

If the system call succeeds, the return value is 0.
If the system call fails, the return value is one of the following negative errno values:

-ENOMEM (Linux 2.6.9 and later) The caller had a non-zero RLIMIT_MEMLOCK soft resource limit, but tried to lock more memory than the limit permitted. This limit is not enforced if the process is privileged (CAP_IPC_LOCK).
-EPERM (Linux 2.6.9 and later) The caller was not privileged (CAP_IPC_LOCK) and its RLIMIT_MEMLOCK soft resource limit was 0.
-EINVAL Unknown flags were specified.
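
A caller might translate these error codes into diagnostics along the following lines. This is only a sketch: mlockall_error_hint and lock_or_report are hypothetical names, and the code goes through the C library's mlockall() wrapper, which returns -1 and stores the positive error code in errno rather than returning the negative value directly.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Map a positive errno value to the meaning documented above. */
const char *mlockall_error_hint(int err)
{
    switch (err) {
    case 0:
        return "success";
    case ENOMEM:
        return "would exceed the RLIMIT_MEMLOCK soft limit";
    case EPERM:
        return "not privileged and RLIMIT_MEMLOCK soft limit is 0";
    case EINVAL:
        return "unknown flags";
    default:
        return strerror(err);  /* anything not listed in this section */
    }
}

/* Try to lock memory and print a hint on failure.
 * Returns 0 on success or a negative errno value. */
int lock_or_report(int flags)
{
    int err;

    if (mlockall(flags) == 0)
        return 0;
    err = errno;
    fprintf(stderr, "mlockall failed: %s\n", mlockall_error_hint(err));
    return -err;
}
```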

Remarks

In Linux 2.6.8 and earlier, a process must be privileged (CAP_IPC_LOCK) in order to lock memory and the RLIMIT_MEMLOCK soft resource limit defines a limit on how much memory the process may lock.

Since Linux 2.6.9, no limits are placed on the amount of memory that a privileged process can lock and the RLIMIT_MEMLOCK soft resource limit instead defines a limit on how much memory an unprivileged process may lock.
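
Before attempting to lock memory, an unprivileged process can inspect the limit that applies to it. The sketch below uses the standard getrlimit() call; the helper name memlock_soft_limit and its return convention are assumptions made for illustration.

```c
#include <sys/resource.h>

/* Query the RLIMIT_MEMLOCK soft limit of the calling process.
 * Returns 1 if the limit is unlimited, 0 if it is finite (the value in
 * bytes is stored in *bytes), or -1 if the limit cannot be read. */
int memlock_soft_limit(unsigned long long *bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    *bytes = (unsigned long long)rl.rlim_cur;
    return 0;
}
```

A finite result of 0 bytes corresponds to the -EPERM case for unprivileged callers, while a small non-zero value makes -ENOMEM likely when the whole address space is locked.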

Since Linux 2.6.9, if a privileged process calls sys_mlockall with the MCL_FUTURE flag set and later drops privileges (loses the CAP_IPC_LOCK capability by, for example, setting its effective UID to a non-zero value), then subsequent memory allocations (e.g., sys_mmap, sys_brk) will fail if the RLIMIT_MEMLOCK resource limit is encountered.

Compatibility

n/a