Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please notice that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.
Dietmar Eggemann authored
To speed up the cpu- and frequency-invariant accounting of the task
scheduler, make sure that the CIE (topology_get_cpu_scale()) and FIE
(topology_get_freq_scale()) get completely inlined into the task
scheduler consumer functions (e.g. __update_load_avg_se()).
This patch-set changes the interface for CIE and FIE from:
  drivers/base/arch_topology.c:

    static DEFINE_PER_CPU(unsigned long, item);

    unsigned long topology_get_item_scale(...)
    {
            return per_cpu(item, cpu);
    }

  include/linux/arch_topology.h:

    unsigned long topology_get_item_scale(...);

to:

  drivers/base/arch_topology.c:

    DEFINE_PER_CPU(unsigned long, item);

  include/linux/arch_topology.h:

    DECLARE_PER_CPU(unsigned long, item);

    static inline
    unsigned long topology_get_item_scale(...)
    {
            return per_cpu(item, cpu);
    }
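The effect of moving the accessor into the header can be modelled in plain
userspace C. This is only a sketch: the array stands in for the per-CPU
variable, and NR_CPUS, cpu_scale, model_get_cpu_scale(), model_scale_delta()
and the capacity values are hypothetical names and numbers chosen for
illustration, not kernel code.

```c
#include <assert.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

/*
 * Userspace stand-in for DECLARE_PER_CPU()/DEFINE_PER_CPU(): one slot per
 * CPU in a plain array. The kernel macros place the variable in a per-CPU
 * section instead.
 */
extern unsigned long cpu_scale[NR_CPUS];	/* declaration: header side */
unsigned long cpu_scale[NR_CPUS] = { 446, 446, 1024, 1024 }; /* definition: .c side */

/*
 * Header-side accessor. Because the body is visible to every caller, the
 * compiler can fold the load into the caller instead of emitting a call.
 */
static inline unsigned long model_get_cpu_scale(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu_scale[cpu];
}

/* Hypothetical consumer, standing in for a scheduler function. */
unsigned long model_scale_delta(unsigned long delta, int cpu)
{
	/* Capacity is expressed relative to 1024, hence the shift by 10. */
	return (delta * model_get_cpu_scale(cpu)) >> 10;
}
```

With the illustrative values above, model_scale_delta(1024, 0) yields 446 and
model_scale_delta(512, 2) yields 512. The point of the layout is that the
variable must be visible outside its defining file (no "static") so that the
inline accessor in the header can reference it.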
An uplift in performance could be detected running the kernel with the
following test patch on top (on JUNO R0 (arm64)):
@@ -2812,10 +2812,18 @@ accumulate_sum(u64 delta, int cpu, struct sched_avg *sa,
 	unsigned long scale_freq, scale_cpu;
 	u32 contrib = (u32)delta; /* p == 0 -> delta < 1024 */
 	u64 periods;
+	u64 t1, t2;
+
+	t1 = sched_clock_cpu(cpu);
 	scale_freq = arch_scale_freq_capacity(NULL, cpu);
 	scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
+	t2 = sched_clock_cpu(cpu);
+
+	trace_printk("cpu=%d t1=%llu t2=%llu diff=%llu\n",
+		     cpu, t1, t2, t2 - t1);
+
 	delta += sa->period_contrib;
 	periods = delta / 1024; /* A period is 1024us (~1ms) */
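The same measurement pattern (timestamp, code under test, timestamp, print the
difference) can be tried outside the kernel. The sketch below is a userspace
analogue only: it uses C11 timespec_get(), which reads the wall clock, in place
of sched_clock_cpu(), and ts_diff_ns(), work_under_test() and measure_once()
are hypothetical helpers introduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Nanoseconds elapsed from *t1 to *t2; t2 must not be earlier than t1. */
unsigned long long ts_diff_ns(const struct timespec *t1,
			      const struct timespec *t2)
{
	long long sec = (long long)t2->tv_sec - (long long)t1->tv_sec;
	long long nsec = (long long)t2->tv_nsec - (long long)t1->tv_nsec;
	long long diff = sec * 1000000000LL + nsec;

	assert(diff >= 0);
	return (unsigned long long)diff;
}

/* Hypothetical stand-in for the two accessor calls being timed. */
unsigned long work_under_test(int cpu)
{
	static const unsigned long scale[2] = { 446, 1024 };

	return scale[cpu & 1];
}

/* Take a timestamp before and after the call and report the difference. */
unsigned long long measure_once(int cpu)
{
	struct timespec t1, t2;
	volatile unsigned long sink;	/* keeps the call from being dropped */
	unsigned long long diff;

	timespec_get(&t1, TIME_UTC);
	sink = work_under_test(cpu);
	timespec_get(&t2, TIME_UTC);
	(void)sink;

	diff = ts_diff_ns(&t1, &t2);
	printf("cpu=%d diff=%llu\n", cpu, diff);
	return diff;
}
```

A single sample mostly reflects the cost of taking the timestamps, so
measure_once() would normally be called many times and the results summarised
as mean, max and min.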
The following test results (3 test runs each) have been obtained by
tracing the diff value of this trace_printk() on Cortex A-53 (LITTLE)
and Cortex A-57 (big) cpus with (inline) and without (non-inline) this
patch. The diff values are in nanoseconds, the unit of sched_clock_cpu().
                     mean     max    min

A-57 inline:        119.6     300     60
                     96.8     280     60
                    110.2     660     60

A-57 non-inline:    142.8     460     80
                    157.6     680     80
                    153.4     720     80

A-53 inline:        141.6     360    100
                    118.8     500    100
                    148.6     380    100

A-53 non-inline:    293       840    120
                    253.2     840    120
                    299.6    1060    140
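To put a single number on the uplift, the three per-run means of each
configuration can be averaged and compared. The sketch below does that with
the mean column of the table; mean_of() and reduction_pct() are hypothetical
helpers, and averaging the run means assumes the runs are comparable in
length, which the table does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Per-run mean values copied from the mean column of the results table. */
const double a57_inline[3]     = { 119.6,  96.8, 110.2 };
const double a57_non_inline[3] = { 142.8, 157.6, 153.4 };
const double a53_inline[3]     = { 141.6, 118.8, 148.6 };
const double a53_non_inline[3] = { 293.0, 253.2, 299.6 };

/* Arithmetic mean of n values. */
double mean_of(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Reduction of with_patch relative to without_patch, in percent. */
double reduction_pct(double without_patch, double with_patch)
{
	assert(without_patch > 0.0);
	return 100.0 * (without_patch - with_patch) / without_patch;
}
```

Applied to the table, this gives averages of about 108.9 (inline) versus 151.3
(non-inline) on the A-57, a reduction of roughly 28%, and about 136.3 versus
281.9 on the A-53, a reduction of roughly 52%.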
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>