Skip to content
  • Ingo Molnar's avatar
    sched/core: Fix uclamp ABI bug, clean up and robustify sched_read_attr() ABI logic and code · 1251201c
    Ingo Molnar authored
    Thadeu Lima de Souza Cascardo reported that 'chrt' broke on recent kernels:
      $ chrt -p $$
      chrt: failed to get pid 26306's policy: Argument list too long
    and he has root-caused the bug to the following commit increasing sched_attr
    size and breaking sched_read_attr() into returning -EFBIG:
     ("sched/uclamp: Extend sched_setattr() to support utilization clamping")
    The other, bigger bug is that the whole sched_getattr() and sched_read_attr()
    logic of checking non-zero bits in new ABI components is arguably broken,
    and pretty much any extension of the ABI will spuriously break the ABI.
    That's way too fragile.
    Instead implement the perf syscall's extensible ABI instead, which we
    already implement on the sched_setattr() side:
     - if user-attributes have the same size as kernel attributes then the
       logic is unchanged.
     - if user-attributes are larger than the kernel knows about then simply
       skip the extra bits, but set attr->size to the (smaller) kernel size
       so that tooling can (in principle) handle older kernel as well.
     - if user-attributes are smaller than the kernel knows about then just
       copy whatever user-space can accept.
    Also clean up the whole logic:
     - Simplify the code flow - there's no need for 'ret' for example.
     - Standardize on 'kattr/uattr' and 'ksize/usize' naming to make sure we
       always know which side we are dealing with.
     - Why is it called 'read' when what it does is to copy to user? This
       code is so far away from VFS read() semantics that the naming is
       actively confusing. Name it sched_attr_copy_to_user() instead, which
       mirrors other copy_to_user() functionality.
     - Move the attr->size assignment from the head of sched_getattr() to the
       sched_attr_copy_to_user() function. Nothing else within the kernel
       should care about the size of the structure.
    With these fixes the sched_getattr() syscall now nicely supports an
    extensible ABI in both a forward and backward compatible fashion, and
    will also fix the chrt bug.
    As an added bonus the bogus -EFBIG return is removed as well, which as
    Thadeu noted should have been -E2BIG to begin with.
    Reported-by: default avatarThadeu Lima de Souza Cascardo <>
    Tested-by: default avatarDietmar Eggemann <>
    Tested-by: default avatarThadeu Lima de Souza Cascardo <>
    Acked-by: default avatarThadeu Lima de Souza Cascardo <>
    Cc: Arnaldo Carvalho de Melo <>
    Cc: Jiri Olsa <>
    Cc: Linus Torvalds <>
    Cc: Patrick Bellasi <>
    Cc: Peter Zijlstra <>
    Cc: Thomas Gleixner <>
    Fixes: a509a7cd ("sched/uclamp: Extend sched_setattr() to support utilization clamping")
    Signed-off-by: default avatarIngo Molnar <>