1. 13 Jun, 2016 3 commits
    • Heiko Carstens's avatar
      s390: add proper __ro_after_init support · d07a980c
      Heiko Carstens authored
      
      
      On s390 __ro_after_init is currently mapped to __read_mostly which
      means that data marked as __ro_after_init will not be protected.
      
      Reason for this is that the common code __ro_after_init implementation
      is x86 centric: the ro_after_init data section was added to rodata,
      since x86 enables write protection to kernel text and rodata very
      late. On s390 we have write protection for these sections enabled with
      the initial page tables. So adding the ro_after_init data section to
      rodata does not work on s390.
      
      In order to make __ro_after_init work properly on s390 move the
      ro_after_init data, right behind rodata. Unlike the rodata section it
      will be marked read-only later after all init calls happened.
      
      This s390 specific implementation adds new __start_ro_after_init and
      __end_ro_after_init labels. Everything in between will be marked
      read-only after the init calls happened. In addition to the
      __ro_after_init data move also the exception table there, since from a
      practical point of view it fits the __ro_after_init requirements.
      
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Reviewed-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      d07a980c
    • Martin Schwidefsky's avatar
      s390/mm: simplify the TLB flushing code · 64f31d58
      Martin Schwidefsky authored
      
      
      ptep_flush_lazy and pmdp_flush_lazy use mm->context.attach_count to
      decide between a lazy TLB flush vs an immediate TLB flush. The field
      contains two 16-bit counters, the number of CPUs that have the mm
      attached and can create TLB entries for it and the number of CPUs in
      the middle of a page table update.
      
      The __tlb_flush_asce, ptep_flush_direct and pmdp_flush_direct functions
      use the attach counter and a mask check with mm_cpumask(mm) to decide
      between a local flush local of the current CPU and a global flush.
      
      For all these functions the decision between lazy vs immediate and
      local vs global TLB flush can be based on CPU masks. There are two
      masks:  the mm->context.cpu_attach_mask with the CPUs that are actively
      using the mm, and the mm_cpumask(mm) with the CPUs that have used the
      mm since the last full flush. The decision between lazy vs immediate
      flush is based on the mm->context.cpu_attach_mask, to decide between
      local vs global flush the mm_cpumask(mm) is used.
      
      With this patch all checks will use the CPU masks, the old counter
      mm->context.attach_count with its two 16-bit values is turned into a
      single counter mm->context.flush_count that keeps track of the number
      of CPUs with incomplete page table updates. The sole user of this
      counter is finish_arch_post_lock_switch() which waits for the end of
      all page table updates.
      
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      64f31d58
    • Heiko Carstens's avatar
      s390/mm: align swapper_pg_dir to 16k · 0ccb32c9
      Heiko Carstens authored
      
      
      The segment/region table that is part of the kernel image must be
      properly aligned to 16k in order to make the crdte inline assembly
      work.
      Otherwise it will calculate a wrong segment/region table start address
      and access incorrect memory locations if the swapper_pg_dir is not
      aligned to 16k.
      
      Therefore define BSS_FIRST_SECTIONS in order to put the swapper_pg_dir
      at the beginning of the bss section and also align the bss section to
      16k just like other architectures did.
      
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      0ccb32c9
  2. 21 Apr, 2016 1 commit
    • Gerald Schaefer's avatar
      s390/mm: fix asce_bits handling with dynamic pagetable levels · 723cacbd
      Gerald Schaefer authored
      
      
      There is a race with multi-threaded applications between context switch and
      pagetable upgrade. In switch_mm() a new user_asce is built from mm->pgd and
      mm->context.asce_bits, w/o holding any locks. A concurrent mmap with a
      pagetable upgrade on another thread in crst_table_upgrade() could already
      have set new asce_bits, but not yet the new mm->pgd. This would result in a
      corrupt user_asce in switch_mm(), and eventually in a kernel panic from a
      translation exception.
      
      Fix this by storing the complete asce instead of just the asce_bits, which
      can then be read atomically from switch_mm(), so that it either sees the
      old value or the new value, but no mixture. Both cases are OK. Having the
      old value would result in a page fault on access to the higher level memory,
      but the fault handler would see the new mm->pgd, if it was a valid access
      after the mmap on the other thread has completed. So as worst-case scenario
      we would have a page fault loop for the racing thread until the next time
      slice.
      
      Also remove dead code and simplify the upgrade/downgrade path, there are no
      upgrades from 2 levels, and only downgrades from 3 levels for compat tasks.
      There are also no concurrent upgrades, because the mmap_sem is held with
      down_write() in do_mmap, so the flush and table checks during upgrade can
      be removed.
      
      Reported-by: default avatarMichael Munday <munday@ca.ibm.com>
      Reviewed-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarGerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      723cacbd
  3. 17 Mar, 2016 1 commit
    • Heiko Carstens's avatar
      s390: add DEBUG_RODATA support · 91d37211
      Heiko Carstens authored
      git commit d2aa1aca
      
       ("mm/init: Add 'rodata=off' boot cmdline
      parameter to disable read-only kernel mappings") adds a bogus warning
      to the console which states that s390 does not support kernel memory
      protection.
      
      This however is not true. We do support that since a couple of years
      however in a different way than the author of the above named patch
      expected.
      
      To get rid of the misleading message implement the mark_rodata_ro
      function and emit a message which states the amount of memory which
      was write protected already earlier.
      
      This is the same what parisc currently does.
      
      We currently do not support the kernel parameter "rodata=off" which
      would allow to write to the rodata section again. However since we
      have this feature since years without any problems there is no reason
      to add support for this.
      
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      91d37211
  4. 19 Jan, 2016 1 commit
  5. 16 Nov, 2015 1 commit
  6. 27 Aug, 2015 1 commit
    • Dan Williams's avatar
      mm: ZONE_DEVICE for "device memory" · 033fbae9
      Dan Williams authored
      
      
      While pmem is usable as a block device or via DAX mappings to userspace
      there are several usage scenarios that can not target pmem due to its
      lack of struct page coverage. In preparation for "hot plugging" pmem
      into the vmemmap add ZONE_DEVICE as a new zone to tag these pages
      separately from the ones that are subject to standard page allocations.
      Importantly "device memory" can be removed at will by userspace
      unbinding the driver of the device.
      
      Having a separate zone prevents allocation and otherwise marks these
      pages that are distinct from typical uniform memory.  Device memory has
      different lifetime and performance characteristics than RAM.  However,
      since we have run out of ZONES_SHIFT bits this functionality currently
      depends on sacrificing ZONE_DMA.
      
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Jerome Glisse <j.glisse@gmail.com>
      [hch: various simplifications in the arch interface]
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      033fbae9
  7. 03 Aug, 2015 2 commits
  8. 13 May, 2015 1 commit
  9. 25 Mar, 2015 1 commit
    • Heiko Carstens's avatar
      s390: remove 31 bit support · 5a79859a
      Heiko Carstens authored
      Remove the 31 bit support in order to reduce maintenance cost and
      effectively remove dead code. Since a couple of years there is no
      distribution left that comes with a 31 bit kernel.
      
      The 31 bit kernel also has been broken since more than a year before
      anybody noticed. In addition I added a removal warning to the kernel
      shown at ipl for 5 minutes: a960062e
      
       ("s390: add 31 bit warning
      message") which let everybody know about the plan to remove 31 bit
      code. We didn't get any response.
      
      Given that the last 31 bit only machine was introduced in 1999 let's
      remove the code.
      Anybody with 31 bit user space code can still use the compat mode.
      
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      5a79859a
  10. 22 Jan, 2015 1 commit
  11. 14 Sep, 2014 1 commit
  12. 03 Apr, 2014 1 commit
    • Martin Schwidefsky's avatar
      s390/mm,tlb: optimize TLB flushing for zEC12 · 1b948d6c
      Martin Schwidefsky authored
      
      
      The zEC12 machines introduced the local-clearing control for the IDTE
      and IPTE instruction. If the control is set only the TLB of the local
      CPU is cleared of entries, either all entries of a single address space
      for IDTE, or the entry for a single page-table entry for IPTE.
      Without the local-clearing control the TLB flush is broadcasted to all
      CPUs in the configuration, which is expensive.
      
      The reset of the bit mask of the CPUs that need flushing after a
      non-local IDTE is tricky. As TLB entries for an address space remain
      in the TLB even if the address space is detached a new bit field is
      required to keep track of attached CPUs vs. CPUs in the need of a
      flush. After a non-local flush with IDTE the bit-field of attached CPUs
      is copied to the bit-field of CPUs in need of a flush. The ordering
      of operations on cpu_attach_mask, attach_count and mm_cpumask(mm) is
      such that an underindication in mm_cpumask(mm) is prevented but an
      overindication in mm_cpumask(mm) is possible.
      
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      1b948d6c
  13. 26 Jul, 2013 1 commit
  14. 03 Jul, 2013 4 commits
    • Jiang Liu's avatar
      mm/s390: prepare for removing num_physpages and simplify mem_init() · a18d0e2d
      Jiang Liu authored
      
      
      Prepare for removing num_physpages and simplify mem_init().
      
      Signed-off-by: default avatarJiang Liu <jiang.liu@huawei.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a18d0e2d
    • Jiang Liu's avatar
      mm: concentrate modification of totalram_pages into the mm core · 0c988534
      Jiang Liu authored
      
      
      Concentrate code to modify totalram_pages into the mm core, so the arch
      memory initialized code doesn't need to take care of it.  With these
      changes applied, only following functions from mm core modify global
      variable totalram_pages: free_bootmem_late(), free_all_bootmem(),
      free_all_bootmem_node(), adjust_managed_page_count().
      
      With this patch applied, it will be much more easier for us to keep
      totalram_pages and zone->managed_pages in consistence.
      
      Signed-off-by: default avatarJiang Liu <jiang.liu@huawei.com>
      Acked-by: default avatarDavid Howells <dhowells@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: <sworddragon2@aol.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0c988534
    • Jiang Liu's avatar
      mm: enhance free_reserved_area() to support poisoning memory with zero · dbe67df4
      Jiang Liu authored
      
      
      Address more review comments from last round of code review.
      1) Enhance free_reserved_area() to support poisoning freed memory with
         pattern '0'. This could be used to get rid of poison_init_mem()
         on ARM64.
      2) A previous patch has disabled memory poison for initmem on s390
         by mistake, so restore to the original behavior.
      3) Remove redundant PAGE_ALIGN() when calling free_reserved_area().
      
      Signed-off-by: default avatarJiang Liu <jiang.liu@huawei.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: <sworddragon2@aol.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dbe67df4
    • Jiang Liu's avatar
      mm: change signature of free_reserved_area() to fix building warnings · 11199692
      Jiang Liu authored
      
      
      Change signature of free_reserved_area() according to Russell King's
      suggestion to fix following build warnings:
      
        arch/arm/mm/init.c: In function 'mem_init':
        arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default]
          free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
          ^
        In file included from include/linux/mman.h:4:0,
                         from arch/arm/mm/init.c:15:
        include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *'
         extern unsigned long free_reserved_area(unsigned long start, unsigned long end,
      
         mm/page_alloc.c: In function 'free_reserved_area':
      >> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default]
         In file included from arch/mips/include/asm/page.h:49:0,
                          from include/linux/mmzone.h:20,
                          from include/linux/gfp.h:4,
                          from include/linux/mm.h:8,
                          from mm/page_alloc.c:18:
         arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int'
         mm/page_alloc.c: In function 'free_area_init_nodes':
         mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds]
      
      Also address some minor code review comments.
      
      Signed-off-by: default avatarJiang Liu <jiang.liu@huawei.com>
      Reported-by: default avatarArnd Bergmann <arnd@arndb.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: <sworddragon2@aol.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      11199692
  15. 02 May, 2013 1 commit
  16. 29 Apr, 2013 1 commit
  17. 17 Apr, 2013 1 commit
  18. 24 Feb, 2013 1 commit
  19. 23 Nov, 2012 2 commits
  20. 26 Sep, 2012 1 commit
  21. 20 Jul, 2012 1 commit
    • Heiko Carstens's avatar
      s390/comments: unify copyright messages and remove file names · a53c8fab
      Heiko Carstens authored
      
      
      Remove the file name from the comment at top of many files. In most
      cases the file name was wrong anyway, so it's rather pointless.
      
      Also unify the IBM copyright statement. We did have a lot of sightly
      different statements and wanted to change them one after another
      whenever a file gets touched. However that never happened. Instead
      people start to take the old/"wrong" statements to use as a template
      for new files.
      So unify all of them in one go.
      
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      a53c8fab
  22. 28 Mar, 2012 1 commit
  23. 24 Feb, 2012 1 commit
  24. 27 Dec, 2011 1 commit
    • Martin Schwidefsky's avatar
      [S390] add support for physical memory > 4TB · 14045ebf
      Martin Schwidefsky authored
      
      
      The kernel address space of a 64 bit kernel currently uses a three level
      page table and the vmemmap array has a fixed address and a fixed maximum
      size. A three level page table is good enough for systems with less than
      3.8TB of memory, for bigger systems four page table levels need to be
      used. Each page table level costs a bit of performance, use 3 levels for
      normal systems and 4 levels only for the really big systems.
      To avoid bloating sparse.o too much set MAX_PHYSMEM_BITS to 46 for a
      maximum of 64TB of memory.
      
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      14045ebf
  25. 31 Oct, 2011 1 commit
  26. 26 May, 2011 1 commit
  27. 23 May, 2011 1 commit
    • Martin Schwidefsky's avatar
      [S390] refactor page table functions for better pgste support · b2fa47e6
      Martin Schwidefsky authored
      
      
      Rework the architecture page table functions to access the bits in the
      page table extension array (pgste). There are a number of changes:
      1) Fix missing pgste update if the attach_count for the mm is <= 1.
      2) For every operation that affects the invalid bit in the pte or the
         rcp byte in the pgste the pcl lock needs to be acquired. The function
         pgste_get_lock gets the pcl lock and returns the current pgste value
         for a pte pointer. The function pgste_set_unlock stores the pgste
         and releases the lock. Between these two calls the bits in the pgste
         can be shuffled.
      3) Define two software bits in the pte _PAGE_SWR and _PAGE_SWC to avoid
         calling SetPageDirty and SetPageReferenced from pgtable.h. If the
         host reference backup bit or the host change backup bit has been
         set the dirty/referenced state is transfered to the pte. The common
         code will pick up the state from the pte.
      4) Add ptep_modify_prot_start and ptep_modify_prot_commit for mprotect.
      5) Remove pgd_populate_kernel, pud_populate_kernel, pmd_populate_kernel
         pgd_clear_kernel, pud_clear_kernel, pmd_clear_kernel and ptep_invalidate.
      6) Rename kvm_s390_test_and_clear_page_dirty to
         ptep_test_and_clear_user_dirty and add ptep_test_and_clear_user_young.
      7) Define mm_exclusive() and mm_has_pgste() helper to improve readability.
      
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      b2fa47e6
  28. 25 Oct, 2010 3 commits
  29. 07 Oct, 2010 1 commit
    • David Howells's avatar
      Fix IRQ flag handling naming · df9ee292
      David Howells authored
      
      
      Fix the IRQ flag handling naming.  In linux/irqflags.h under one configuration,
      it maps:
      
      	local_irq_enable() -> raw_local_irq_enable()
      	local_irq_disable() -> raw_local_irq_disable()
      	local_irq_save() -> raw_local_irq_save()
      	...
      
      and under the other configuration, it maps:
      
      	raw_local_irq_enable() -> local_irq_enable()
      	raw_local_irq_disable() -> local_irq_disable()
      	raw_local_irq_save() -> local_irq_save()
      	...
      
      This is quite confusing.  There should be one set of names expected of the
      arch, and this should be wrapped to give another set of names that are expected
      by users of this facility.
      
      Change this to have the arch provide:
      
      	flags = arch_local_save_flags()
      	flags = arch_local_irq_save()
      	arch_local_irq_restore(flags)
      	arch_local_irq_disable()
      	arch_local_irq_enable()
      	arch_irqs_disabled_flags(flags)
      	arch_irqs_disabled()
      	arch_safe_halt()
      
      Then linux/irqflags.h wraps these to provide:
      
      	raw_local_save_flags(flags)
      	raw_local_irq_save(flags)
      	raw_local_irq_restore(flags)
      	raw_local_irq_disable()
      	raw_local_irq_enable()
      	raw_irqs_disabled_flags(flags)
      	raw_irqs_disabled()
      	raw_safe_halt()
      
      with type checking on the flags 'arguments', and then wraps those to provide:
      
      	local_save_flags(flags)
      	local_irq_save(flags)
      	local_irq_restore(flags)
      	local_irq_disable()
      	local_irq_enable()
      	irqs_disabled_flags(flags)
      	irqs_disabled()
      	safe_halt()
      
      with tracing included if enabled.
      
      The arch functions can now all be inline functions rather than some of them
      having to be macros.
      
      Signed-off-by: David Howells <dhowells@redhat.com> [X86, FRV, MN10300]
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> [Tile]
      Signed-off-by: Michal Simek <monstr@monstr.eu> [Microblaze]
      Tested-by: Catalin Marinas <catalin.marinas@arm.com> [ARM]
      Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> [AVR]
      Acked-by: Tony Luck <tony.luck@intel.com> [IA-64]
      Acked-by: Hirokazu Takata <takata@linux-m32r.org> [M32R]
      Acked-by: Greg Ungerer <gerg@uclinux.org> [M68K/M68KNOMMU]
      Acked-by: Ralf Baechle <ralf@linux-mips.org> [MIPS]
      Acked-by: Kyle McMartin <kyle@mcmartin.ca> [PA-RISC]
      Acked-by: Paul Mackerras <paulus@samba.org> [PowerPC]
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [S390]
      Acked-by: Chen Liqin <liqin.chen@sunplusct.com> [Score]
      Acked-by: Matt Fleming <matt@console-pimps.org> [SH]
      Acked-by: David S. Miller <davem@davemloft.net> [Sparc]
      Acked-by: Chris Zankel <chris@zankel.net> [Xtensa]
      Reviewed-by: Richard Henderson <rth@twiddle.net> [Alpha]
      Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp> [H8300]
      Cc: starvik@axis.com [CRIS]
      Cc: jesper.nilsson@axis.com [CRIS]
      Cc: linux-cris-kernel@axis.com
      df9ee292
  30. 24 Aug, 2010 1 commit
    • Martin Schwidefsky's avatar
      [S390] fix tlb flushing vs. concurrent /proc accesses · 050eef36
      Martin Schwidefsky authored
      
      
      The tlb flushing code uses the mm_users field of the mm_struct to
      decide if each page table entry needs to be flushed individually with
      IPTE or if a global flush for the mm_struct is sufficient after all page
      table updates have been done. The comment for mm_users says "How many
      users with user space?" but the /proc code increases mm_users after it
      found the process structure by pid without creating a new user process.
      Which makes mm_users useless for the decision between the two tlb
      flusing methods. The current code can be confused to not flush tlb
      entries by a concurrent access to /proc files if e.g. a fork is in
      progres. The solution for this problem is to make the tlb flushing
      logic independent from the mm_users field.
      
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      050eef36
  31. 30 Mar, 2010 1 commit
    • Tejun Heo's avatar
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo authored
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Guess-its-ok-by: default avatarChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6