Most Linux distributions allow processes to request more memory than is actually available in the system. The rationale is that allocated memory is generally not used immediately, and that processes rarely use all of the memory they request over their lifetime. Over-committing thus lets the system make fuller use of its memory, at the risk of out-of-memory (OOM) situations.
The purpose of the OOM Killer is to find the best process to kill when memory runs critically short. The process is selected on the basis of a badness score, which is determined by the following properties:
- original memory size of the process – the more memory a process uses, the higher its score
- its CPU time – the more CPU time a process has consumed, the lower its score
- its run time – the longer a process has been alive, the lower its score
- oom_adj value – /proc/<pid>/oom_adj can be set to a value between -17 and +15. The higher the value, the more likely the process is to be selected as the sacrificial lamb. Setting this value to -17 instructs the OOM Killer never to kill the process.
- Half of each child’s memory size is added to the parent’s score.
- If the task has a nice value above zero, the score doubles.
- Superuser tasks and tasks with direct hardware access have their scores divided by 4.
- Depending on oom_adj, the score is adjusted as follows:
  - if oom_adj > 0, score <<= oom_adj
  - if oom_adj < 0, score >>= -(oom_adj)
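The oom_adj adjustment at the end of the list is plain bit shifting, so each step up doubles the score and each step down halves it. A minimal user-space sketch, assuming a hypothetical helper named apply_oom_adj() (it is not a kernel function) and a score that has already been computed:

```c
#include <assert.h>

/* Range of /proc/<pid>/oom_adj as given in the list above. */
#define OOM_ADJ_MIN (-17)
#define OOM_ADJ_MAX 15

/*
 * Hypothetical helper, for illustration only: apply the oom_adj
 * shift to an already computed badness score. How -17 exempts a
 * process from killing is outside the scope of this helper.
 */
unsigned long apply_oom_adj(unsigned long points, int oom_adj)
{
    assert(oom_adj >= OOM_ADJ_MIN && oom_adj <= OOM_ADJ_MAX);

    if (oom_adj > 0)
        points <<= oom_adj;     /* each +1 doubles the score */
    else if (oom_adj < 0)
        points >>= -oom_adj;    /* each -1 halves the score */
    return points;
}
```

Starting from a score of 1000, apply_oom_adj(1000, 3) yields 8000 and apply_oom_adj(1000, -2) yields 250, so even small oom_adj values move a process a long way up or down the ranking.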
The principles on which the OOM Killer operates are:
The system should lose the minimum amount of work done, recover a large amount of memory, avoid killing processes that are innocent of eating tons of memory, and kill the minimum number of processes (ideally just one).
The task with the highest badness score is selected and all its children are killed. If the process has no children, the process itself is killed.
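Picking the victim amounts to a maximum search over the scored tasks, skipping any task whose oom_adj is -17. A rough user-space sketch, assuming a hypothetical struct candidate and function select_worst() (neither exists in the kernel, and the scores are taken as already computed):

```c
#include <assert.h>
#include <stddef.h>

#define OOM_DISABLE (-17)   /* oom_adj value that exempts a process */

/* Hypothetical, simplified stand-in for a kernel task. */
struct candidate {
    int pid;
    int oom_adj;
    unsigned long points;   /* badness score, computed elsewhere */
};

/*
 * Return the candidate with the highest score, or NULL if every
 * candidate is exempt or has a score of zero.
 */
const struct candidate *select_worst(const struct candidate *c, size_t n)
{
    const struct candidate *worst = NULL;
    unsigned long max_points = 0;

    assert(c != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (c[i].oom_adj == OOM_DISABLE)
            continue;                   /* never selected */
        if (c[i].points > max_points) {
            max_points = c[i].points;
            worst = &c[i];
        }
    }
    return worst;
}
```

Given three candidates scored 5000, 90000 (with oom_adj set to -17) and 12000, the sketch returns the one scored 12000, because the exempt process is skipped no matter how high its score is.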
Adding more info from OOM_KILLER.
The function that performs the badness score computation described above is called badness(). It is reached through the following call chain:
__alloc_pages() -> out_of_memory() -> select_bad_process() -> badness()
badness() accumulates points for each process and returns them to select_bad_process(). The scoring of a process starts with its total virtual memory size (total_vm):
    /*
     * The memory size of the process is the basis for the badness.
     */
    points = p->mm->total_vm;

The memory size of each child that has its own mm is added to the process (this snippet adds the full size, whereas later kernels add half of it):
    /*
     * Processes which fork a lot of child processes are likely
     * a good choice. We add the vmsize of the childs if they
     * have an own mm. This prevents forking servers to flood the
     * machine with an endless amount of childs
     */
    ...
    if (chld->mm != p->mm && chld->mm)
        points += chld->mm->total_vm;

Processes with a nice value above zero have their score increased, and processes that have used a lot of CPU time or have been running for a long time have theirs decreased:
    s = int_sqrt(cpu_time);
    if (s)
        points /= s;
    s = int_sqrt(int_sqrt(run_time));
    if (s)
        points /= s;

    /*
     * Niced processes are most likely less important, so double
     * their badness points.
     */
    if (task_nice(p) > 0)
        points *= 2;

Superuser processes and direct hardware access tasks have their scores reduced:
    /*
     * Superuser processes are usually more important, so we make it
     * less likely that we kill those.
     */
    if (cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_ADMIN) ||
            p->uid == 0 || p->euid == 0)
        points /= 4;

    /*
     * We don't want to kill a process with direct hardware access.
     * Not only could that mess up the hardware, but usually users
     * tend to only have this flag set on applications they think
     * of as important.
     */
    if (cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_RAWIO))
        points /= 4;
Finally, honour the oom_adj setting:
    /*
     * Adjust the score by oomkilladj.
     */
    if (p->oomkilladj) {
        if (p->oomkilladj > 0)
            points <<= p->oomkilladj;
        else
            points >>= -(p->oomkilladj);
    }

Thus the ideal candidate will be:
A recently started, non-privileged process that, together with its children, uses a lot of memory, has been nice’d, and does no raw I/O. Something like a nohup’d parallel kernel build (which is not a bad choice, since all results are saved to disk and very little work is lost when a make is terminated).
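To see how the individual steps combine, the heuristic can be modelled in user space. This is only a sketch under stated assumptions: struct task_info, isqrt() and badness_model() are hypothetical names, cpu_time and run_time are taken as already scaled to whatever units the kernel uses, and children are counted at full size as in the snippet quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the values the heuristic looks at. */
struct task_info {
    unsigned long total_vm;     /* pages mapped by the process itself */
    unsigned long children_vm;  /* pages of children with their own mm */
    unsigned long cpu_time;     /* CPU time, assumed already scaled */
    unsigned long run_time;     /* run time, assumed already scaled */
    int nice;
    bool superuser;             /* uid/euid 0 or CAP_SYS_ADMIN */
    bool raw_io;                /* CAP_SYS_RAWIO */
    int oom_adj;                /* -17..15 */
};

/* Simple integer square root; stands in for the kernel's int_sqrt(). */
static unsigned long isqrt(unsigned long x)
{
    unsigned long r = 0;

    while (r + 1 <= x / (r + 1))    /* (r + 1)^2 <= x, without overflow */
        r++;
    return r;
}

/* User-space model of the scoring steps walked through above. */
unsigned long badness_model(const struct task_info *t)
{
    unsigned long points, s;

    assert(t->oom_adj >= -17 && t->oom_adj <= 15);

    points = t->total_vm + t->children_vm;

    s = isqrt(t->cpu_time);
    if (s)
        points /= s;
    s = isqrt(isqrt(t->run_time));
    if (s)
        points /= s;

    if (t->nice > 0)
        points *= 2;
    if (t->superuser)
        points /= 4;
    if (t->raw_io)
        points /= 4;

    if (t->oom_adj > 0)
        points <<= t->oom_adj;
    else if (t->oom_adj < 0)
        points >>= -t->oom_adj;
    return points;
}
```

With made-up numbers, a freshly started niced build (total_vm 20000, children_vm 80000, no accumulated CPU or run time, nice 10) scores 200000 in this model, while a long-running root daemon (total_vm 100000, cpu_time 100, run_time 256) scores only 625, which matches the description of the ideal candidate.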
From the SDE Tip – Amazon