1. 25 Mar, 2016 11 commits
    • mm, kasan: add GFP flags to KASAN API · 505f5dcb
      Alexander Potapenko authored
      
      
      Add GFP flags to KASAN hooks for future patches to use.
      
      This patch is based on the "mm: kasan: unified support for SLUB and SLAB
      allocators" patch originally prepared by Dmitry Chernenkov.
      
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, kasan: SLAB support · 7ed2f9e6
      Alexander Potapenko authored
      
      
      Add KASAN hooks to SLAB allocator.
      
      This patch is based on the "mm: kasan: unified support for SLUB and SLAB
      allocators" patch originally prepared by Dmitry Chernenkov.
      
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/page_alloc: prevent merging between isolated and other pageblocks · d9dddbf5
      Vlastimil Babka authored
      Hanjun Guo has reported that a CMA stress test causes broken accounting of
      CMA and free pages:
      
      > Before the test, I got:
      > -bash-4.3# cat /proc/meminfo | grep Cma
      > CmaTotal:         204800 kB
      > CmaFree:          195044 kB
      >
      >
      > After running the test:
      > -bash-4.3# cat /proc/meminfo | grep Cma
      > CmaTotal:         204800 kB
      > CmaFree:         6602584 kB
      >
      > So the freed CMA memory is more than total..
      >
      > Also the MemFree is more than mem total:
      >
      > -bash-4.3# cat /proc/meminfo
      > MemTotal:       16342016 kB
      > MemFree:        22367268 kB
      > MemAvailable:   22370528 kB
      
      Laura Abbott has confirmed the issue and suspected the freepage accounting
      rewrite around 3.18/4.0 by Joonsoo Kim.  Joonsoo had a theory that this is
      caused by unexpected merging between MIGRATE_ISOLATE and MIGRATE_CMA
      pageblocks:
      
      > CMA isolates MAX_ORDER aligned blocks, but, during the process,
      > partially isolated block exists. If MAX_ORDER is 11 and
      > pageblock_order is 9, two pageblocks make up MAX_ORDER
      > aligned block and I can think following scenario because pageblock
      > (un)isolation would be done one by one.
      >
      > (each character means one pageblock. 'C', 'I' means MIGRATE_CMA,
      > MIGRATE_ISOLATE, respectively.)
      >
      > CC -> IC -> II (Isolation)
      > II -> CI -> CC (Un-isolation)
      >
      > If some pages are freed at this intermediate state such as IC or CI,
      > that page could be merged to the other page that is resident on
      > different type of pageblock and it will cause wrong freepage count.
      
      This was supposed to be prevented by CMA operating on MAX_ORDER blocks,
      but since it doesn't hold the zone->lock between pageblocks, a race
      window does exist.
      
      It's also likely that unexpected merging can occur between
      MIGRATE_ISOLATE and non-CMA pageblocks.  This should be prevented in
      __free_one_page() since commit 3c605096 ("mm/page_alloc: restrict
      max order of merging on isolated pageblock").  However, we only check
      the migratetype of the pageblock where buddy merging has been initiated,
      not the migratetype of the buddy pageblock (or group of pageblocks)
      which can be MIGRATE_ISOLATE.
      
      Joonsoo has suggested checking for buddy migratetype as part of
      page_is_buddy(), but that would add extra checks in allocator hotpath
      and bloat-o-meter has shown significant code bloat (the function is
      inline).
      
      This patch reduces the bloat at the expense of somewhat more
      complicated code.  The buddy-merging while-loop in __free_one_page()
      is initially bounded to pageblock_order and performs no migratetype
      checks.  The checks are placed outside of it: if merging past the
      pageblock boundary is allowed, max_order is bumped and a goto (a
      statement which can't possibly be considered harmful here) returns
      control to the while-loop.
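      A condensed sketch of the reworked loop (helper names as in
      mm/page_alloc.c of this period; free-list bookkeeping elided):

        continue_merging:
                while (order < max_order - 1) {
                        buddy_idx = __find_buddy_index(page_idx, order);
                        buddy = page + (buddy_idx - page_idx);
                        if (!page_is_buddy(page, buddy, order))
                                goto done_merging;
                        /* ... take buddy off its free list and merge ... */
                        order++;
                }
                if (max_order < MAX_ORDER) {
                        /* crossing a pageblock boundary: recheck types */
                        if (unlikely(has_isolate_pageblock(zone))) {
                                int buddy_mt;

                                buddy_idx = __find_buddy_index(page_idx, order);
                                buddy = page + (buddy_idx - page_idx);
                                buddy_mt = get_pageblock_migratetype(buddy);

                                if (migratetype != buddy_mt &&
                                    (is_migrate_isolate(migratetype) ||
                                     is_migrate_isolate(buddy_mt)))
                                        goto done_merging;
                        }
                        max_order++;
                        goto continue_merging;
                }
        done_merging:
                /* ... set the final order and link into the free list ... */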
      
      This fixes the accounting bug and also removes the arguably weird state
      in the original commit 3c605096 where buddies could be left
      unmerged.
      
      Fixes: 3c605096 ("mm/page_alloc: restrict max order of merging on isolated pageblock")
      Link: https://lkml.org/lkml/2016/3/2/280
      
      
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Reported-by: Hanjun Guo <guohanjun@huawei.com>
      Tested-by: Hanjun Guo <guohanjun@huawei.com>
      Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Debugged-by: Laura Abbott <labbott@redhat.com>
      Debugged-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: <stable@vger.kernel.org>	[3.18+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • oom, oom_reaper: protect oom_reaper_list using simpler way · bb29902a
      Tetsuo Handa authored
      
      
      "oom, oom_reaper: disable oom_reaper for oom_kill_allocating_task" tried
      to protect oom_reaper_list using MMF_OOM_KILLED flag.  But we can do it
      by simply checking tsk->oom_reaper_list != NULL.
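      A sketch of the resulting check in wake_oom_reaper() (shape assumes
      the singly linked oom_reaper_list from the patch below):

        static void wake_oom_reaper(struct task_struct *tsk)
        {
                if (!oom_reaper_th)
                        return;

                /* tsk is already queued? */
                if (tsk == oom_reaper_list || tsk->oom_reaper_list)
                        return;

                get_task_struct(tsk);

                spin_lock(&oom_reaper_lock);
                tsk->oom_reaper_list = oom_reaper_list;
                oom_reaper_list = tsk;
                spin_unlock(&oom_reaper_lock);
                wake_up(&oom_reaper_wait);
        }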
      
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • oom: make oom_reaper freezable · e2679606
      Michal Hocko authored
      
      
      After "oom: clear TIF_MEMDIE after oom_reaper managed to unmap the
      address space" oom_reaper will call exit_oom_victim on the target task
      after it is done.  This might however race with the PM freezer:
      
      CPU0				CPU1				CPU2
      freeze_processes
        try_to_freeze_tasks
        				# Allocation request
      				out_of_memory
        oom_killer_disable
      				  wake_oom_reaper(P1)
      				  				__oom_reap_task
      								  exit_oom_victim(P1)
          wait_event(oom_victims==0)
      [...]
          				do_exit(P1)
      				  perform IO/interfere with the freezer
      
      which breaks the oom_killer_disable semantic.  We no longer have a
      guarantee that the oom victim won't interfere with the freezer because
      it might be anywhere on the way to do_exit while the freezer thinks the
      task has already terminated.  It might trigger IO or touch devices which
      are frozen already.
      
      In order to close this race, make the oom_reaper thread freezable.
      This works because
      	a) an already running oom_reaper will block the freezer from
      	   entering the quiescent state
      	b) wake_oom_reaper will not wake up the reaper after it has been
      	   frozen
      	c) the only way to call exit_oom_victim after try_to_freeze_tasks
      	   is from the oom victim's context, when we know that further
      	   interference shouldn't be possible
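      A minimal sketch of the change (the reaper loop is simplified; the
      essential additions are set_freezable() and the freezer-aware wait):

        static int oom_reaper(void *unused)
        {
                set_freezable();        /* opt this kthread into the freezer */

                while (true) {
                        struct task_struct *tsk = NULL;

                        /* won't be woken up while the system is frozen */
                        wait_event_freezable(oom_reaper_wait,
                                             oom_reaper_list != NULL);
                        /* ... dequeue tsk and reap its address space ... */
                }
                return 0;
        }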
      
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • oom: make oom_reaper_list single linked · 29c696e1
      Vladimir Davydov authored
      
      
      Entries are only added to and removed from the head of
      oom_reaper_list, so we can use a singly linked list and hence save a
      word in task_struct.
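      The resulting queue manipulation, as a sketch (task_struct gains a
      single next pointer, also named oom_reaper_list):

        static struct task_struct *oom_reaper_list;     /* queue head */
        static DEFINE_SPINLOCK(oom_reaper_lock);

        /* enqueue at head, in wake_oom_reaper() */
        spin_lock(&oom_reaper_lock);
        tsk->oom_reaper_list = oom_reaper_list;
        oom_reaper_list = tsk;
        spin_unlock(&oom_reaper_lock);

        /* dequeue from head, in the oom_reaper thread */
        spin_lock(&oom_reaper_lock);
        if (oom_reaper_list != NULL) {
                tsk = oom_reaper_list;
                oom_reaper_list = tsk->oom_reaper_list;
        }
        spin_unlock(&oom_reaper_lock);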
      
      Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • oom, oom_reaper: disable oom_reaper for oom_kill_allocating_task · 855b0183
      Michal Hocko authored
      
      
      Tetsuo has reported that oom_kill_allocating_task=1 will cause
      oom_reaper_list corruption, because oom_kill_process doesn't follow
      the standard OOM exclusion (i.e. it ignores TIF_MEMDIE) and so
      allows the same task to be enqueued multiple times - e.g. by
      sacrificing the same child multiple times.

      This patch fixes the issue by introducing a new MMF_OOM_KILLED mm
      flag which oom_kill_process sets atomically; the oom reaper is
      skipped if the flag was already set.
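      A hedged sketch of the idea in oom_kill_process() (exact placement
      is an approximation; test_and_set_bit makes the check-and-set
      atomic):

        /* only the first killer of this mm may hand it to the reaper */
        if (can_oom_reap && test_and_set_bit(MMF_OOM_KILLED, &mm->flags))
                can_oom_reap = false;

        if (can_oom_reap)
                wake_oom_reaper(victim);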
      
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, oom_reaper: implement OOM victims queuing · 03049269
      Michal Hocko authored
      
      
      wake_oom_reaper has so far allowed only a single oom victim to be
      queued.  The main reason for that was simplicity; other solutions
      would require some way of queuing.  The current approach is racy,
      and that was deemed sufficient, as the oom_reaper is considered a
      best-effort approach to help with oom handling when the OOM victim
      cannot terminate in a reasonable time.  The race could lead to
      missing an oom victim which can get stuck:
      
      out_of_memory
        wake_oom_reaper
          cmpxchg // OK
          			oom_reaper
      			  oom_reap_task
      			    __oom_reap_task
      oom_victim terminates
      			      atomic_inc_not_zero // fail
      out_of_memory
        wake_oom_reaper
          cmpxchg // fails
      			  task_to_reap = NULL
      
      This race requires 2 OOM invocations in a short time period, which
      is not very likely but certainly not impossible.  E.g. the original
      victim might not have released much memory for some reason.
      
      The situation would improve considerably if wake_oom_reaper used a
      more robust queuing, which is what this patch implements.  This
      means adding an oom_reaper_list list_head into task_struct (it fills
      a hole before the embedded thread_struct) and an oom_reaper_lock
      spinlock for queuing synchronization.  wake_oom_reaper adds the task
      to the queue and oom_reaper dequeues it.
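      A sketch of the queued wake-up path per this patch (oom_reaper_list
      is the new list_head in task_struct):

        static LIST_HEAD(oom_reaper_list);
        static DEFINE_SPINLOCK(oom_reaper_lock);

        static void wake_oom_reaper(struct task_struct *tsk)
        {
                if (!oom_reaper_th)
                        return;

                get_task_struct(tsk);

                spin_lock(&oom_reaper_lock);
                list_add(&tsk->oom_reaper_list, &oom_reaper_list);
                spin_unlock(&oom_reaper_lock);
                wake_up(&oom_reaper_wait);
        }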
      
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
      Cc: Andrea Arcangeli <andrea@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, oom_reaper: report success/failure · bc448e89
      Michal Hocko authored
      
      
      Report successful and failed oom_reaper attempts, and dump all the
      held locks to tell us more about who is blocking the progress.
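      A sketch of the reporting (the exact message format is an
      assumption; the point is the pr_info on both outcomes plus
      debug_show_all_locks() on failure):

        if (reaped) {
                pr_info("oom_reaper: reaped process %d (%s)\n",
                        task_pid_nr(tsk), tsk->comm);
        } else {
                pr_info("oom_reaper: unable to reap process %d (%s)\n",
                        task_pid_nr(tsk), tsk->comm);
                debug_show_all_locks();  /* who is blocking the progress? */
        }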
      
      [akpm@linux-foundation.org: fix CONFIG_MMU=n build]
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Cc: Andrea Arcangeli <andrea@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • oom: clear TIF_MEMDIE after oom_reaper managed to unmap the address space · 36324a99
      Michal Hocko authored
      
      
      When the oom_reaper manages to unmap all the eligible vmas, there
      shouldn't be much freeable memory left held by the oom victim, so it
      makes sense to clear the TIF_MEMDIE flag for the victim and allow
      the OOM killer to select another task.

      The lack of TIF_MEMDIE also means that the victim cannot access
      memory reserves anymore, but that shouldn't be a problem, because it
      would get access again if it needs to allocate and hits the OOM
      killer again due to the fatal_signal_pending or PF_EXITING check.
      We can safely hide the task from the OOM killer because it is
      clearly not a good candidate anymore, as everything reclaimable has
      been torn down already.
      
      This patch allows capping the time an OOM victim can keep
      TIF_MEMDIE, and thus hold off further global OOM killer actions,
      granted the oom reaper is able to take mmap_sem for the associated
      mm struct.  That is not guaranteed now, but further steps should
      make sure that mmap_sem write waiters block killably, which will
      help to reduce such lock contention.  This is not done by this
      patch.
      
      Note that exit_oom_victim might now be called on a remote task from
      __oom_reap_task, so we have to check and clear the flag atomically;
      otherwise we might race and underflow oom_victims or wake up waiters
      too early.
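      A sketch of the atomic variant (exit_oom_victim now takes the task;
      test_and_clear ensures only one of the racing callers decrements
      oom_victims):

        void exit_oom_victim(struct task_struct *tsk)
        {
                if (!test_and_clear_tsk_thread_flag(tsk, TIF_MEMDIE))
                        return;

                if (!atomic_dec_return(&oom_victims))
                        wake_up_all(&oom_victims_wait);
        }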
      
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
      Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Andrea Arcangeli <andrea@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, oom: introduce oom reaper · aac45363
      Michal Hocko authored
      
      
      This patch (of 5):
      
      This is based on the idea from Mel Gorman discussed during LSFMM 2015
      and independently brought up by Oleg Nesterov.
      
      The OOM killer currently kills only a single task, in the hope that
      the task will terminate in a reasonable time and free up its memory.
      Such a task (the oom victim) gets access to memory reserves via
      mark_oom_victim to allow forward progress should it need additional
      memory during the exit path.
      
      It has been shown (e.g.  by Tetsuo Handa) that it is not that hard to
      construct workloads which break the core assumption mentioned above and
      the OOM victim might take unbounded amount of time to exit because it
      might be blocked in the uninterruptible state waiting for an event (e.g.
      lock) which is blocked by another task looping in the page allocator.
      
      This patch reduces the probability of such a lockup by introducing a
      specialized kernel thread (oom_reaper) which tries to reclaim
      additional memory by preemptively reaping the anonymous or
      swapped-out memory owned by the oom victim, under the assumption
      that such memory won't be needed when its owner is killed and kicked
      from userspace anyway.  There is one notable exception, though: if
      the OOM victim was in the process of coredumping, the result would
      be incomplete.  This is considered a reasonable constraint, because
      the overall system health is more important than the debuggability
      of a particular application.
      
      A kernel thread has been chosen because we need a reliable way of
      invocation; workqueue context is not appropriate because all the
      workers might be busy (e.g. allocating memory).  Kswapd, which
      sounds like another good fit, is not appropriate either, because it
      might get blocked on locks during reclaim.
      
      oom_reaper has to take mmap_sem of the target task for reading, so
      the solution is not 100% reliable, because the semaphore might be
      held or blocked for write; but the probability is reduced
      considerably compared to basically any lock blocking forward
      progress, as described above.  In order to prevent blocking on the
      lock without any forward progress, we use only a trylock and retry
      10 times with a short sleep in between.  Users of mmap_sem which
      need it for write should be carefully reviewed to use _killable
      waiting as much as possible, and to reduce allocation requests done
      with the lock held to an absolute minimum, to reduce the risk even
      further.
      
      The API between the oom killer and the oom reaper is quite trivial.
      wake_oom_reaper updates mm_to_reap with cmpxchg to guarantee only a
      NULL->mm transition, and oom_reaper clears it atomically once it is
      done with the work.  This means that only a single mm_struct can be
      reaped at a time.  As the operation is potentially disruptive, we
      try to limit it to the necessary minimum, and the reaper blocks any
      updates while it operates on an mm.  mm_struct is pinned by mm_count
      to allow a parallel exit_mmap, and a race is detected by
      atomic_inc_not_zero(mm_users).
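      A sketch of that handover (close in shape to the patch; mm_count is
      pinned via atomic_inc, as mmgrab did not exist yet):

        static struct mm_struct *mm_to_reap;

        static void wake_oom_reaper(struct mm_struct *mm)
        {
                struct mm_struct *old_mm;

                if (!oom_reaper_th)
                        return;

                /* pin the mm so a parallel exit_mmap cannot free it */
                atomic_inc(&mm->mm_count);

                old_mm = cmpxchg(&mm_to_reap, NULL, mm);
                if (!old_mm)
                        wake_up(&oom_reaper_wait);
                else
                        mmdrop(mm);     /* a reap is already in flight */
        }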
      
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Suggested-by: Oleg Nesterov <oleg@redhat.com>
      Suggested-by: Mel Gorman <mgorman@suse.de>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Andrea Arcangeli <andrea@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 22 Mar, 2016 3 commits
    • mm/mprotect.c: don't imply PROT_EXEC on non-exec fs · f138556d
      Piotr Kwapulinski authored
      
      
      mprotect(PROT_READ) fails when called by a READ_IMPLIES_EXEC binary
      on a memory-mapped file located on a non-exec fs, because mprotect
      does not check whether the fs is _executable_ or not; the PROT_EXEC
      flag is set automatically even if the memory-mapped file is located
      on a non-exec fs.  Fix it by checking whether the memory-mapped file
      is located on a non-exec fs: if so, PROT_EXEC is not implied by
      PROT_READ.  The implementation uses the VM_MAYEXEC flag, which mmap
      already sets properly, so mprotect is now consistent with mmap.
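      The core of the fix in mm/mprotect.c, as a sketch (the rier variable
      name follows the patch; the surrounding vma loop is elided):

        const bool rier = (current->personality & READ_IMPLIES_EXEC) &&
                          (prot & PROT_READ);

        /* ... for each vma in the affected range ... */

        /* imply PROT_EXEC only where the mapping may be executable at
         * all, i.e. not on a noexec fs (mmap sets VM_MAYEXEC there) */
        if (rier && (vma->vm_flags & VM_MAYEXEC))
                prot |= PROT_EXEC;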
      
      I did the isolated tests (PT_GNU_STACK X/NX, multiple VMAs, X/NX fs).  I
      also patched the official 3.19.0-47-generic Ubuntu 14.04 kernel and it
      seems to work.
      
      Signed-off-by: Piotr Kwapulinski <kwapulinski.piotr@gmail.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel: add kcov code coverage · 5c9a8750
      Dmitry Vyukov authored
      kcov provides code coverage collection for coverage-guided fuzzing
      (randomized testing).  Coverage-guided fuzzing is a testing technique
      that uses coverage feedback to determine new interesting inputs to a
      system.  A notable user-space example is AFL
      (http://lcamtuf.coredump.cx/afl/).  However, this technique is not
      widely used for kernel testing due to missing compiler and kernel
      support.
      
      kcov does not aim to collect as much coverage as possible.  It aims
      to collect more or less stable coverage that is a function of
      syscall inputs.  To achieve this goal it does not collect coverage
      in soft/hard interrupts, and instrumentation of some inherently
      non-deterministic or non-interesting parts of the kernel is disabled
      (e.g. scheduler, locking).
      
      Currently there is a single coverage collection mode (tracing), but the
      API anticipates additional collection modes.  Initially I also
      implemented a second mode which exposes coverage in a fixed-size hash
      table of counters (what Quentin used in his original patch).  I've
      dropped the second mode for simplicity.
      
      This patch adds the necessary support on the kernel side.  The
      complementary compiler support was added in gcc revision 231296.
      
      We've used this support to build the syzkaller system call fuzzer,
      which has found 90 kernel bugs in just 2 months:

        https://github.com/google/syzkaller/wiki/Found-Bugs

      We've also found 30+ bugs in our internal systems with syzkaller.
      Another (yet unexplored) direction where kcov coverage would greatly
      help is more traditional "blob mutation": for example, mounting a
      random blob as a filesystem, or receiving a random blob over the
      wire.
      
      Why not gcov?  A typical fuzzing loop looks as follows: (1) reset
      coverage, (2) execute a bit of code, (3) collect coverage, repeat.
      A typical coverage can be just a dozen basic blocks (e.g. an invalid
      input).  In such a context gcov becomes prohibitively expensive, as
      the reset/collect coverage steps depend on the total number of basic
      blocks/edges in the program (about 2M in the case of the kernel).
      The cost of kcov depends only on the number of executed basic
      blocks/edges.  On top of that, the kernel requires per-thread
      coverage, because there are always background threads and unrelated
      processes that also produce coverage.  With inlined gcov
      instrumentation, per-thread coverage is not possible.
      
      kcov exposes kernel PCs and control flow to user space, which is
      insecure.  But debugfs should not be mapped as user accessible.
      
      Based on a patch by Quentin Casasnovas.
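      A condensed user-space usage sketch, following the documentation
      added by this patch (Documentation/kcov.txt; error handling
      omitted):

        #include <fcntl.h>
        #include <stdio.h>
        #include <sys/ioctl.h>
        #include <sys/mman.h>
        #include <unistd.h>

        #define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
        #define KCOV_ENABLE     _IO('c', 100)
        #define KCOV_DISABLE    _IO('c', 101)
        #define COVER_SIZE      (64 << 10)

        int main(void)
        {
                int fd = open("/sys/kernel/debug/kcov", O_RDWR);
                unsigned long *cover, n, i;

                /* size the per-task buffer, map it, start tracing */
                ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE);
                cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
                             PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
                ioctl(fd, KCOV_ENABLE, 0);

                cover[0] = 0;           /* reset coverage */
                read(-1, NULL, 0);      /* the syscall under test */
                n = cover[0];           /* number of PCs recorded */
                for (i = 0; i < n; i++)
                        printf("0x%lx\n", cover[i + 1]);

                ioctl(fd, KCOV_DISABLE, 0);
                return 0;
        }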
      
      [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
      [akpm@linux-foundation.org: unbreak allmodconfig]
      [akpm@linux-foundation.org: follow x86 Makefile layout standards]
      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: Vegard Nossum <vegard.nossum@oracle.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Tavis Ormandy <taviso@google.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: David Drysdale <drysdale@google.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zram: revive swap_slot_free_notify · 3f2b1a04
      Minchan Kim authored
      Commit b430e9d1 ("remove compressed copy from zram in-memory") added
      a swap_slot_free_notify call in *end_swap_bio_read* to remove the
      memory duplicated between zram and main memory.

      However, with the introduction of rw_page in zram, commit 8c7f0102
      ("zram: implement rw_page operation of zram"), it became void
      because rw_page doesn't need a bio.

      Memory footprint is really important on embedded platforms which
      have small memory (for example, 512M), because LMK or similar memory
      management modules could start to kill processes if the memory
      footprint exceeds some threshold.
      
      This patch restores the function for rw_page, thereby eliminating this
      duplication.
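      A simplified sketch of the restored path in swap_readpage()
      (mm/page_io.c; exact surroundings are approximated):

        ret = bdev_read_page(sis->bdev, swap_page_sector(page), page);
        if (!ret) {
                /* synchronous read is done: let zram free its copy now */
                swap_slot_free_notify(page);
                count_vm_event(PSWPIN);
                return 0;
        }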
      
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: karam.lee <karam.lee@lge.com>
      Cc: <sangseok.lee@lge.com>
      Cc: Chan Jeong <chan.jeong@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 17 Mar, 2016 26 commits