- 04 Feb, 2015 24 commits
-
-
Morten Rasmussen authored
With energy-aware scheduling enabled, nohz_kick_needed() generates many nohz idle-balance kicks which lead to nothing when multiple tasks get packed on a single cpu to save energy. This causes unnecessary wake-ups and hence wastes energy. Make these conditions depend on !energy_aware() for now until the energy-aware nohz story gets sorted out. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com>
-
Morten Rasmussen authored
Add an extra criterion to need_active_balance() to kick off active load balance if the source cpu is overutilized and has lower capacity than the destination cpus. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com>
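A rough stand-alone C sketch of that extra criterion, using simplified toy types; capacity_of()-style fields, the over-utilization test and the 1280/1024 margin are assumptions for illustration, not the kernel's actual helpers:

#include <stdbool.h>

/* Toy model of the extra need_active_balance() criterion: kick active
 * balancing when the source cpu is overutilized and has less original
 * capacity than the destination cpu. All names below are stand-ins. */
struct toy_cpu {
    unsigned long capacity_orig;  /* max capacity at highest OPP */
    unsigned long usage;          /* current cpu usage */
};

static bool cpu_overutilized(const struct toy_cpu *cpu)
{
    /* assumed margin: overutilized when usage exceeds ~80% of capacity */
    return cpu->usage * 1280 > cpu->capacity_orig * 1024;
}

static bool kick_active_balance(const struct toy_cpu *src,
                                const struct toy_cpu *dst)
{
    return cpu_overutilized(src) &&
           src->capacity_orig < dst->capacity_orig;
}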
-
We do not want to miss out on the ability to do energy-aware idle load balancing if the system is only partially loaded, since the operational range of energy-aware scheduling corresponds to a partially loaded system. We might want to pull a single remaining task from a potential src cpu towards an idle destination cpu if the energy model tells us this is worth doing to save energy. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com>
-
Skip cpu as a potential src (costliest) in case it has only one task running and its original capacity is greater than or equal to the original capacity of the dst cpu. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com>
-
Energy-aware load balancing is based on cpu usage, so the upper bound of its operational range is a fully utilized cpu. Above this tipping point it makes more sense to use weighted_cpuload to preserve smp_nice. This patch implements the tipping point detection in update_sg_lb_stats: if one cpu is over-utilized, the current energy-aware load balance operation falls back into the conventional weighted-load-based one. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com>
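A rough sketch of the tipping-point detection during statistics gathering, reusing the toy_cpu type and cpu_overutilized() test from the sketch a few entries above; the env flag and function names are illustrative, not the kernel's:

/* One over-utilized cpu in the scanned group flips the whole load-balance
 * operation back to the conventional weighted-load path. */
struct toy_lb_env {
    bool use_ea;    /* still in energy-aware mode? */
};

static void toy_update_sg_lb_stats(struct toy_lb_env *env,
                                   const struct toy_cpu *cpus, int nr)
{
    for (int i = 0; i < nr; i++) {
        if (cpu_overutilized(&cpus[i]))
            env->use_ea = false;    /* tipping point reached */
    }
}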
-
Energy-aware load balancing does not rely on env->imbalance but instead it evaluates the system-wide energy difference for each task on the src rq by potentially moving it to the dst rq. If this energy difference is less than zero the task is actually moved from src to dst rq. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com>
-
If, after the gathering of sched domain statistics, the current load balancing operation is still in energy-aware mode and a least efficient sched group has been found, detect the least efficient cpu by comparing the cpu efficiency (the ratio between cpu usage and cpu energy consumption) among all cpus of the least efficient sched group. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com>
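A minimal sketch of that per-cpu efficiency comparison; the energy estimate, scaling factor and names are made up for illustration, and the scaled integer ratio simply stands in for what kernel code would do without floating point:

/* Pick the least efficient cpu of a group: lowest usage-per-energy ratio. */
struct toy_cpu_stat {
    unsigned long usage;    /* cpu usage */
    unsigned long energy;   /* estimated energy consumption, assumed > 0 */
};

static int find_costliest_cpu(const struct toy_cpu_stat *stat, int nr)
{
    unsigned long min_eff = ~0UL;
    int costliest = -1;

    for (int i = 0; i < nr; i++) {
        /* efficiency = usage / energy, scaled to stay in integers */
        unsigned long eff = (stat[i].usage << 10) / stat[i].energy;

        if (eff < min_eff) {
            min_eff = eff;
            costliest = i;
        }
    }
    return costliest;
}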
-
If, after the gathering of sched domain statistics, the current load balancing operation is still in energy-aware mode, just return the least efficient (costliest) reference. This implies that the system is considered to be balanced in case no least efficient sched group was found. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com>
-
Energy-aware load balancing has to work alongside the conventional load based functionality. This includes the tipping point feature, i.e. being able to fall back from energy-aware to the conventional load based functionality during an ongoing load balancing action. That is why this patch introduces an additional reference to hold the least efficient sched group (costliest) as well as its statistics in the form of an extra sg_lb_stats structure (costliest_stat). The function update_sd_pick_costliest is used to assign the least efficient sched group in parallel to the existing update_sd_pick_busiest. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com>
-
To be able to identify the least efficient (costliest) sched group, introduce group_eff, the efficiency of the sched group, into sg_lb_stats. The group efficiency is defined as the ratio between the group usage and the group energy consumption. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com>
-
Energy-aware load balancing should only happen if the ENERGY_AWARE feature is turned on and the sched domain on which the load balancing is performed contains energy data. There is also a need during a load balance action to be able to query whether we should continue to load balance energy-aware or whether we have reached the tipping point which forces us to fall back to the conventional load balancing functionality. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com>
-
Morten Rasmussen authored
To estimate the energy consumption of a sched_group in sched_group_energy() it is necessary to know which idle-state the group is in when it is idle. For now, it is assumed that this is the current idle-state (though it might be wrong). Based on the individual cpu idle-states, group_idle_state() finds the group idle-state. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com>
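A stand-alone sketch of one plausible rule (an assumption, not necessarily the kernel's exact behaviour): a group can only be in an idle-state as deep as its shallowest member, so take the minimum of the per-cpu idle-state indices:

#include <limits.h>

/* cpu_idle_idx[i]: idle-state index of cpu i; 0 is assumed to mean not idle
 * or the shallowest state. Illustrative only. */
static int group_idle_state_sketch(const int *cpu_idle_idx, int nr_cpus)
{
    int state = INT_MAX;

    for (int i = 0; i < nr_cpus; i++)
        if (cpu_idle_idx[i] < state)
            state = cpu_idle_idx[i];

    return state;
}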
-
Morten Rasmussen authored
Make wake-ups of new tasks (find_idlest_group) aware of any differences in cpu compute capacity so new tasks don't get handed off to cpus with lower capacity. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com>
-
Morten Rasmussen authored
Let available compute capacity and estimated energy impact select the wake-up target cpu when energy-aware scheduling is enabled. energy_aware_wake_cpu() attempts to find a group of cpus with sufficient compute capacity to accommodate the task, and to find a cpu with enough spare capacity to handle the task within that group. Preference is given to cpus with enough spare capacity at the current OPP. Finally, the energy impact of the new target and the previous task cpu is compared to select the wake-up target cpu. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com>
-
Morten Rasmussen authored
Adds a generic energy-aware helper function, energy_diff(), that calculates the energy impact of adding, removing, and migrating utilization in the system. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com>
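A toy model of what such a helper computes, under a simple per-cpu busy/idle power model; the structure, power fields and API are assumptions for illustration only, not the kernel's energy model:

/* Toy energy model: energy(cpu) ~ busy_power * util/capacity
 *                                + idle_power * (1 - util/capacity).
 * energy_diff = energy(after change) - energy(before change);
 * a negative result means the change saves energy. */
struct toy_em_cpu {
    unsigned long capacity;     /* capacity at current OPP */
    unsigned long busy_power;   /* power when fully busy */
    unsigned long idle_power;   /* power when idle */
    unsigned long util;         /* current utilization */
};

static long cpu_energy(const struct toy_em_cpu *c, long delta_util)
{
    long cap = (long)c->capacity;
    long util = (long)c->util + delta_util;

    if (util < 0)
        util = 0;
    if (util > cap)
        util = cap;

    return ((long)c->busy_power * util +
            (long)c->idle_power * (cap - util)) / cap;
}

/* Energy impact of moving @delta_util of utilization from @src to @dst. */
static long energy_diff_sketch(const struct toy_em_cpu *src,
                               const struct toy_em_cpu *dst,
                               unsigned long delta_util)
{
    long before = cpu_energy(src, 0) + cpu_energy(dst, 0);
    long after  = cpu_energy(src, -(long)delta_util) +
                  cpu_energy(dst,  (long)delta_util);

    return after - before;
}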
-
Morten Rasmussen authored
Extended sched_group_energy() to support energy prediction with usage (tasks) added/removed from a specific cpu or migrated between a pair of cpus. Useful for load-balancing decision making. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com>
-
Morten Rasmussen authored
For energy-aware load-balancing decisions it is necessary to know the energy consumption estimates of groups of cpus. This patch introduces a basic function, sched_group_energy(), which estimates the energy consumption of the cpus in the group and any resources shared by the members of the group. NOTE: The function has five levels of indentation and breaks the 80 character limit. Refactoring is necessary. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com>
-
Morten Rasmussen authored
With scale-invariant usage tracking, get_cpu_usage() should never return a usage above the current compute capacity of the cpu (capacity_curr). The scaling of the utilization tracking contributions should generally cause the cpu utilization to saturate at capacity_curr, but it may temporarily exceed this value in certain situations. This patch changes the cap from capacity_orig to capacity_curr. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com>
-
Morten Rasmussen authored
Move get_cpu_usage() to an earlier position in fair.c. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com>
-
Morten Rasmussen authored
capacity_orig_of() returns the max available compute capacity of a cpu. For scale-invariant utilization tracking and energy-aware scheduling decisions it is useful to know the compute capacity available at the current OPP of a cpu. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com>
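A minimal sketch of the relation between the two capacity values, assuming arch_scale_freq_capacity() reports the current/max frequency ratio in SCHED_CAPACITY_SCALE units (1024); the helper name is illustrative:

#define SCHED_CAPACITY_SHIFT    10
#define SCHED_CAPACITY_SCALE    (1UL << SCHED_CAPACITY_SHIFT)

/* Capacity available at the current OPP is the max capacity scaled by the
 * current frequency ratio. Illustrative only. */
static unsigned long capacity_curr_sketch(unsigned long capacity_orig,
                                          unsigned long freq_scale /* 0..1024 */)
{
    return (capacity_orig * freq_scale) >> SCHED_CAPACITY_SHIFT;
}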
-
Morten Rasmussen authored
This patch introduces the ENERGY_AWARE sched feature, which is implemented using jump labels when SCHED_DEBUG is defined. It is statically set false when SCHED_DEBUG is not defined. Hence this doesn't allow energy awareness to be enabled without SCHED_DEBUG. This sched_feature knob will be replaced later with a more appropriate control knob when things have matured a bit. ENERGY_AWARE is based on per-entity load-tracking, hence FAIR_GROUP_SCHED must be enabled. This dependency isn't checked at compile time yet. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com>
-
Morten Rasmussen authored
Add the blocked utilization contribution to group sched_entity utilization (se->avg.utilization_avg_contrib) and to get_cpu_usage(). With this change cpu usage now includes recent usage by currently non-runnable tasks, hence it provides a more stable view of the cpu usage. It does, however, also mean that the meaning of usage is changed: A cpu may be momentarily idle while usage > 0. It can no longer be assumed that cpu usage > 0 implies runnable tasks on the rq. cfs_rq->utilization_load_avg or nr_running should be used instead to get the current rq status. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com>
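A sketch of the changed usage calculation in simplified stand-alone form, with the blocked term added and the result still capped at the cpu's capacity; the parameter names are stand-ins for the kernel's per-rq fields:

/* cpu usage = running utilization + blocked utilization, capped at the
 * cpu's capacity. Illustrative only. */
static unsigned long get_cpu_usage_sketch(unsigned long utilization_load_avg,
                                          unsigned long utilization_blocked_avg,
                                          unsigned long capacity)
{
    unsigned long usage = utilization_load_avg + utilization_blocked_avg;

    return usage < capacity ? usage : capacity;
}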
-
Morten Rasmussen authored
Introduces the blocked utilization, the utilization counterpart to cfs_rq->utilization_load_avg. It is the sum of sched_entity utilization contributions of entities that were recently on the cfs_rq and are currently blocked. Combined with the sum of contributions of entities currently on the cfs_rq or currently running (cfs_rq->utilization_load_avg) this can provide a more stable average view of the cpu usage. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com>
-
Since cfs_rq::utilization_load_avg is now not only frequency invariant but also cpu (uarch plus max system frequency) invariant, both frequency and cpu scaling happen as part of the load tracking. So cfs_rq::utilization_load_avg does not have to be scaled by the original capacity of the cpu again. Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com>
-
- 02 Feb, 2015 11 commits
-
-
Besides the existing frequency scale-invariance correction factor, apply the cpu scale-invariance correction factor to usage tracking. Cpu scale-invariance takes into consideration cpu performance deviations due to micro-architectural differences (i.e. instructions per second) between cpus in HMP systems (e.g. big.LITTLE) and differences in the frequency value of the highest OPP between cpus in SMP systems. Each segment of the sched_avg::running_avg_sum geometric series is now scaled by the cpu performance factor too, so the sched_avg::utilization_avg_contrib of each entity will be invariant with respect to the particular cpu of the HMP/SMP system it is gathered on. As a result, the usage level that is returned by get_cpu_usage stays relative to the max cpu performance of the system. Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com>
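A sketch of how each accumulated segment could be scaled by both correction factors, reusing the SCHED_CAPACITY_* constants from the capacity_curr sketch above; modelling arch_scale_freq_capacity()/arch_scale_cpu_capacity() as plain 0..1024 ratios is an assumption:

/* Each delta of running time added to running_avg_sum is scaled by the
 * frequency ratio and by the cpu (uarch/max-frequency) ratio, both
 * expressed against SCHED_CAPACITY_SCALE. Illustrative, not kernel code. */
static unsigned long scale_running_delta(unsigned long delta,
                                         unsigned long freq_scale, /* 0..1024 */
                                         unsigned long cpu_scale)  /* 0..1024 */
{
    delta = (delta * freq_scale) >> SCHED_CAPACITY_SHIFT;
    delta = (delta * cpu_scale)  >> SCHED_CAPACITY_SHIFT;
    return delta;
}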
-
Apply the frequency scale-invariance correction factor to load tracking. Each segment of the sched_avg::runnable_avg_sum geometric series is now scaled by the current frequency so the sched_avg::load_avg_contrib of each entity will be invariant with frequency scaling. As a result, cfs_rq::runnable_load_avg, which is the sum of sched_avg::load_avg_contrib, becomes invariant too. So the load level that is returned by weighted_cpuload stays relative to the max frequency of the cpu. Then, we want to keep the load tracking values in a 32-bit type, which implies that the max value of sched_avg::{runnable|running}_avg_sum must be lower than 2^32/88761 = 48388 (88761 is the max weight of a task). As LOAD_AVG_MAX = 47742, arch_scale_freq_capacity must return a value less than (48388/47742) << SCHED_CAPACITY_SHIFT = 1037 (SCHED_CAPACITY_SCALE = 1024). So we define the range to [0..SCHED_CAPACITY_SCALE] in order to avoid overflow. Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by:
Vincent Guittot <vincent.guittot@linaro.org>
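The quoted overflow bound can be checked with a small stand-alone program (the constants come straight from the message above):

#include <stdio.h>

int main(void)
{
    unsigned long long max_weight   = 88761;  /* max task weight */
    unsigned long long load_avg_max = 47742;  /* LOAD_AVG_MAX */
    /* largest avg_sum that still fits a 32-bit accumulator */
    unsigned long long limit    = (1ULL << 32) / max_weight;      /* 48388 */
    /* largest value arch_scale_freq_capacity may return */
    unsigned long long freq_cap = (limit << 10) / load_avg_max;   /* 1037 */

    printf("avg_sum limit = %llu, max arch_scale_freq_capacity = %llu\n",
           limit, freq_cap);
    return 0;
}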
-
When a CPU is used to handle a lot of IRQs or some RT tasks, the remaining capacity for CFS tasks can be significantly reduced. Once we detect such a situation by comparing cpu_capacity_orig and cpu_capacity, we trigger an idle load balance to check if it's worth moving its tasks to an idle CPU. Once the idle load_balance has selected the busiest CPU, it will look for an active load balance in only two cases: there is only 1 task on the busiest CPU, or we haven't been able to move a task off the busiest rq. A CPU with a reduced capacity is included in the 1st case, and it's worth actively migrating its task if the idle CPU has got full capacity. This test has been added in need_active_balance. As a sidenote, this will not generate more spurious ilb because we already trigger an ilb if there is more than 1 busy cpu. If this cpu is the only one that has a task, we will trigger the ilb once for migrating the task. The nohz_kick_needed function has been cleaned up a bit while adding the new test. env.src_cpu and env.src_rq must be set unconditionally because they are used in need_active_balance, which is called even if busiest->nr_running equals 1. Signed-off-by:
Vincent Guittot <vincent.guittot@linaro.org>
-
The scheduler tries to compute how many tasks a group of CPUs can handle by assuming that a task's load is SCHED_LOAD_SCALE and a CPU's capacity is SCHED_CAPACITY_SCALE. group_capacity_factor divides the capacity of the group by SCHED_LOAD_SCALE to estimate how many tasks can run in the group. Then, it compares this value with the sum of nr_running to decide if the group is overloaded or not. But group_capacity_factor hardly works for SMT systems; it sometimes works for big cores but fails to do the right thing for little cores. Below are two examples to illustrate the problem that this patch solves: 1 - If the original capacity of a CPU is less than SCHED_CAPACITY_SCALE (640 as an example), a group of 3 CPUs will have a max capacity_factor of 2 (div_round_closest(3x640/1024) = 2), which means that it will be seen as overloaded even if we have only one task per CPU. 2 - If the original capacity of a CPU is greater than SCHED_CAPACITY_SCALE (1512 as an example), a group of 4 CPUs will have a capacity_factor of 4 (at max, and thanks to the fix [0] for SMT systems that prevents the appearance of ghost CPUs), but if one CPU is fully used by rt tasks (and its capacity is reduced to nearly nothing), the capacity factor of the group will still be 4 (div_round_closest(3*1512/1024) = 5, which is capped to 4 by [0]). So, this patch tries to solve this issue by removing capacity_factor and replacing it with the following two metrics: - the available CPU capacity for CFS tasks, which is already used by load_balance; - the usage of the CPU by the CFS tasks. For the latter, utilization_avg_contrib has been re-introduced to compute the usage of a CPU by CFS tasks. group_capacity_factor and group_has_free_capacity have been removed and replaced by group_no_capacity. We compare the number of tasks with the number of CPUs and we evaluate the level of utilization of the CPUs to define if a group is overloaded or if a group has capacity to handle more tasks. For SD_PREFER_SIBLING, a group is tagged overloaded if it has more than 1 task so it will be selected in priority (among the overloaded groups). Since [1], SD_PREFER_SIBLING is no longer involved in the computation of load_above_capacity because local is not overloaded. Finally, sched_group->sched_group_capacity->capacity_orig has been removed because it is no longer used during load balance. [1] https://lkml.org/lkml/2014/8/12/295 Signed-off-by:
Vincent Guittot <vincent.guittot@linaro.org> [Fixed merge conflict on v3.19-rc6: Morten Rasmussen <morten.rasmussen@arm.com>]
-
Monitor the usage level of each group of each sched_domain level. The usage is the portion of cpu_capacity_orig that is currently used on a CPU or group of CPUs. We use the utilization_load_avg to evaluate the usage level of each group. The utilization_load_avg only takes into account the running time of the CFS tasks on a CPU, with a maximum value of SCHED_LOAD_SCALE when the CPU is fully utilized. Nevertheless, we must cap utilization_load_avg, which can be temporarily greater than SCHED_LOAD_SCALE after the migration of a task to this CPU and until the metrics are stabilized. The utilization_load_avg is in the range [0..SCHED_LOAD_SCALE] to reflect the running load on the CPU, whereas the available capacity for the CFS tasks is in the range [0..cpu_capacity_orig]. In order to test if a CPU is fully utilized by CFS tasks, we have to scale the utilization into the cpu_capacity_orig range of the CPU to get its usage. The usage can then be compared with the available capacity (ie cpu_capacity) to deduce the usage level of a CPU. The frequency scaling invariance of the usage is not taken into account in this patch; it will be solved in another patch which will deal with frequency scaling invariance on the running_load_avg. Signed-off-by:
Vincent Guittot <vincent.guittot@linaro.org> Acked-by:
Morten Rasmussen <morten.rasmussen@arm.com>
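A sketch of the scaling described above, in simplified stand-alone form; taking SCHED_LOAD_SCALE equal to SCHED_CAPACITY_SCALE (1024) and capping the result to absorb the temporary overshoot after a migration are assumptions for illustration:

#define SCHED_LOAD_SCALE    1024UL

/* Scale utilization (0..SCHED_LOAD_SCALE) into the cpu's own capacity
 * range (0..capacity_orig) and cap it, so it can be compared directly
 * with cpu_capacity. Illustrative only. */
static unsigned long get_cpu_usage_scaled(unsigned long utilization_load_avg,
                                          unsigned long capacity_orig)
{
    unsigned long usage =
        (utilization_load_avg * capacity_orig) / SCHED_LOAD_SCALE;

    return usage < capacity_orig ? usage : capacity_orig;
}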
-
This new field cpu_capacity_orig reflects the original capacity of a CPU before being altered by rt tasks and/or IRQ. The cpu_capacity_orig will be used: - to detect when the capacity of a CPU has been noticeably reduced so we can trigger load balance to look for a CPU with better capacity. As an example, we can detect when a CPU handles a significant amount of irq (with CONFIG_IRQ_TIME_ACCOUNTING) but this CPU is seen as an idle CPU by the scheduler whereas CPUs which are really idle are available. - to evaluate the available capacity for CFS tasks. Signed-off-by:
Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by:
Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Acked-by:
Morten Rasmussen <morten.rasmussen@arm.com>
-
The average running time of RT tasks is used to estimate the remaining compute capacity for CFS tasks. This remaining capacity is the original capacity scaled down by a factor (aka scale_rt_capacity). This estimation of available capacity must also be invariant with frequency scaling. A frequency scaling factor is applied on the running time of the RT tasks for computing scale_rt_capacity. In sched_rt_avg_update, we scale the RT execution time like below: rq->rt_avg += rt_delta * arch_scale_freq_capacity() >> SCHED_CAPACITY_SHIFT. Then, scale_rt_capacity can be summarized as: scale_rt_capacity = SCHED_CAPACITY_SCALE - ((rq->rt_avg << SCHED_CAPACITY_SHIFT) / period). We can optimize by removing the right and left shifts in the computation of rq->rt_avg and scale_rt_capacity. The call to arch_scale_freq_capacity in the rt scheduling path might be a concern for RT folks because I'm not sure whether we can rely on arch_scale_freq_capacity to be short and efficient. Signed-off-by:
Vincent Guittot <vincent.guittot@linaro.org> Acked-by:
Morten Rasmussen <morten.rasmussen@arm.com>
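A small stand-alone sketch of the two equivalent forms, to show why the shifts cancel out; it reuses the SCHED_CAPACITY_* constants from the sketch further up and again models arch_scale_freq_capacity() as a 0..1024 ratio:

/* Original form: rt_avg accumulates (rt_delta * freq_scale) >> SHIFT and is
 * shifted back up when computing the scale factor. Optimized form: keep
 * rt_avg un-shifted and drop both shifts; the results match modulo rounding.
 * (A real implementation would also guard against the rt time exceeding
 * the period.) Illustrative only. */
static unsigned long scale_rt_capacity_orig(unsigned long rt_delta,
                                            unsigned long freq_scale,
                                            unsigned long period)
{
    unsigned long rt_avg = (rt_delta * freq_scale) >> SCHED_CAPACITY_SHIFT;

    return SCHED_CAPACITY_SCALE - ((rt_avg << SCHED_CAPACITY_SHIFT) / period);
}

static unsigned long scale_rt_capacity_opt(unsigned long rt_delta,
                                           unsigned long freq_scale,
                                           unsigned long period)
{
    unsigned long rt_avg = rt_delta * freq_scale;   /* no right shift */

    return SCHED_CAPACITY_SCALE - (rt_avg / period);    /* no left shift */
}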
-
Morten Rasmussen authored
Apply the frequency scale-invariance correction factor to usage tracking. Each segment of the running_load_avg geometric series is now scaled by the current frequency so the utilization_avg_contrib of each entity will be invariant with frequency scaling. As a result, utilization_load_avg, which is the sum of utilization_avg_contrib, becomes invariant too. So the usage level that is returned by get_cpu_usage stays relative to the max frequency, as does the cpu_capacity which it is compared against. Then, we want to keep the load tracking values in a 32-bit type, which implies that the max value of {runnable|running}_avg_sum must be lower than 2^32/88761 = 48388 (88761 is the max weight of a task). As LOAD_AVG_MAX = 47742, arch_scale_freq_capacity must return a value less than (48388/47742) << SCHED_CAPACITY_SHIFT = 1037 (SCHED_CAPACITY_SCALE = 1024). So we define the range to [0..SCHED_CAPACITY_SCALE] in order to avoid overflow. cc: Paul Turner <pjt@google.com> cc: Ben Segall <bsegall@google.com> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by:
Vincent Guittot <vincent.guittot@linaro.org>
-
Now that arch_scale_cpu_capacity has been introduced to scale the original capacity, arch_scale_freq_capacity is no longer used (it was previously used by the ARM arch). Remove arch_scale_freq_capacity from the computation of cpu_capacity. The frequency invariance will be handled in the load tracking and not in the CPU capacity. arch_scale_freq_capacity will be revisited for scaling load with the current frequency of the CPUs in a later patch. Signed-off-by:
Vincent Guittot <vincent.guittot@linaro.org> Acked-by:
Morten Rasmussen <morten.rasmussen@arm.com>
-
Morten Rasmussen authored
Adds usage contribution tracking for group entities. Unlike se->avg.load_avg_contrib, se->avg.utilization_avg_contrib for group entities is the sum of se->avg.utilization_avg_contrib for all entities on the group runqueue. It is _not_ influenced in any way by the task group h_load. Hence it is representing the actual cpu usage of the group, not its intended load contribution which may differ significantly from the utilization on lightly utilized systems. cc: Paul Turner <pjt@google.com> cc: Ben Segall <bsegall@google.com> Signed-off-by:
Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by:
Vincent Guittot <vincent.guittot@linaro.org>
-
Add new statistics which reflect the average time a task is running on the CPU and the sum of these running times of the tasks on a runqueue. The latter is named utilization_load_avg. This patch is based on the usage metric that was proposed in the 1st versions of the per-entity load tracking patchset by Paul Turner <pjt@google.com> but that was removed afterwards. This version differs from the original one in the sense that it's not linked to task_group. The rq's utilization_load_avg will be used to check if a rq is overloaded or not, instead of trying to compute how many tasks a group of CPUs can handle. Rename runnable_avg_period into avg_period as it is now used with both runnable_avg_sum and running_avg_sum. Add some descriptions of the variables to explain their differences. cc: Paul Turner <pjt@google.com> cc: Ben Segall <bsegall@google.com> Signed-off-by:
Vincent Guittot <vincent.guittot@linaro.org> Acked-by:
Morten Rasmussen <morten.rasmussen@arm.com>
-
- 09 Jan, 2015 2 commits
-
-
Tetsuo Handa authored
When alloc_fair_sched_group() in sched_create_group() fails, free_sched_group() is called, and free_fair_sched_group() is called by free_sched_group(). Since destroy_cfs_bandwidth() is called by free_fair_sched_group() without calling init_cfs_bandwidth(), RCU stall occurs at hrtimer_cancel(): INFO: rcu_sched self-detected stall on CPU { 1} (t=60000 jiffies g=13074 c=13073 q=0) Task dump for CPU 1: (fprintd) R running task 0 6249 1 0x00000088 ... Call Trace: <IRQ> [<ffffffff81094988>] sched_show_task+0xa8/0x110 [<ffffffff81097acd>] dump_cpu_task+0x3d/0x50 [<ffffffff810c3a80>] rcu_dump_cpu_stacks+0x90/0xd0 [<ffffffff810c7751>] rcu_check_callbacks+0x491/0x700 [<ffffffff810cbf2b>] update_process_times+0x4b/0x80 [<ffffffff810db046>] tick_sched_handle.isra.20+0x36/0x50 [<ffffffff810db0a2>] tick_sched_timer+0x42/0x70 [<ffffffff810ccb19>] __run_hrtimer+0x69/0x1a0 [<ffffffff810db060>] ? tick_sched_handle.isra.20+0x50/0x50 [<ffffffff810ccedf>] hrtimer_interrupt+0xef/0x230 [<ffffffff810452cb>] local_apic_timer_interrupt+0x3b/0x70 [<ffffffff8164a465>] smp_apic_timer_interrupt+0x45/0x60 [<ffffffff816485bd>] apic_timer_interrupt+0x6d/0x80 <EOI> [<ffffffff810cc588>] ? lock_hrtimer_base.isra.23+0x18/0x50 [<ffffffff81193cf1>] ? __kmalloc+0x211/0x230 [<ffffffff810cc9d2>] hrtimer_try_to_cancel+0x22/0xd0 [<ffffffff81193cf1>] ? __kmalloc+0x211/0x230 [<ffffffff810ccaa2>] hrtimer_cancel+0x22/0x30 [<ffffffff810a3cb5>] free_fair_sched_group+0x25/0xd0 [<ffffffff8108df46>] free_sched_group+0x16/0x40 [<ffffffff810971bb>] sched_create_group+0x4b/0x80 [<ffffffff810aa383>] sched_autogroup_create_attach+0x43/0x1c0 [<ffffffff8107dc9c>] sys_setsid+0x7c/0x110 [<ffffffff81647729>] system_call_fastpath+0x12/0x17 Check whether init_cfs_bandwidth() was called before calling destroy_cfs_bandwidth(). Signed-off-by:
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> [ Move the check into destroy_cfs_bandwidth() to aid compilability. ] Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Ben Segall <bsegall@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/201412252210.GCC30204.SOMVFFOtQJFLOH@I-love.SAKURA.ne.jp Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Yuyang Du authored
In effective_load, we have (long w * unsigned long tg->shares) / long W; when w is negative, it is cast to unsigned long and hence the product is insanely large. Fix this by casting tg->shares to long. Reported-by:
Sasha Levin <sasha.levin@oracle.com> Signed-off-by:
Yuyang Du <yuyang.du@intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Dave Jones <davej@redhat.com> Cc: Andrey Ryabinin <a.ryabinin@samsung.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20141219002956.GA25405@intel.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
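A stand-alone demonstration of the signed/unsigned pitfall the fix addresses; the variable names mirror the description above, not the actual kernel code, and the values are arbitrary:

#include <stdio.h>

int main(void)
{
    long w = -512;              /* negative load delta */
    unsigned long shares = 1024;    /* stands in for tg->shares */
    long W = 4096;

    /* Broken: w is converted to unsigned long, so the product (and the
     * division) happen in unsigned arithmetic and yield a huge value. */
    long broken = w * shares / W;
    /* Fixed: cast shares to long so the arithmetic stays signed. */
    long fixed  = w * (long)shares / W;

    printf("broken=%ld fixed=%ld\n", broken, fixed);
    return 0;
}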
-
- 16 Nov, 2014 3 commits
-
-
Wanpeng Li authored
Commit caeb178c ("sched/fair: Make update_sd_pick_busiest() return 'true' on a busier sd") changes groups to be ranked in the order of overloaded > imbalance > other, and the busiest group is picked according to this order. sgs->group_capacity_factor is used to check if the group is overloaded. When the child domain prefers tasks to go to siblings first, the sgs->group_capacity_factor will be set lower than one in order to move all the excess tasks away. However, group overloaded status is not updated when sgs->group_capacity_factor is set to lower than one, which leads to us failing to find the busiest group. This patch fixes it by updating group overloaded status when sg capacity factor is set to one, in order to find the busiest group accurately. Signed-off-by:
Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Kirill Tkhai <ktkhai@parallels.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1415144690-25196-1-git-send-email-wanpeng.li@linux.intel.com [ Fixed the changelog. ] Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Wanpeng Li authored
Move the p->nr_cpus_allowed check into kernel/sched/core.c: select_task_rq(). This change will make fair.c, rt.c, and deadline.c all start with the same logic. Suggested-and-Acked-by:
Steven Rostedt <rostedt@goodmis.org> Signed-off-by:
Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "pang.xunlei" <pang.xunlei@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1415150077-59053-1-git-send-email-wanpeng.li@linux.intel.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Kirill Tkhai authored
Nobody iterates over numa_group::task_list; it just confuses the readers. Signed-off-by:
Kirill Tkhai <ktkhai@parallels.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1415358456.28592.17.camel@tkhai Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-