Skip to content
  • Frank Rowand's avatar
    of: cache phandle nodes to reduce cost of of_find_node_by_phandle() · 0b3ce78e
    Frank Rowand authored
    
    
    Create a cache of the nodes that contain a phandle property.  Use this
    cache to find the node for a given phandle value instead of scanning
    the devicetree to find the node.  If the phandle value is not found
    in the cache, of_find_node_by_phandle() will fall back to the tree
    scan algorithm.
    
    The cache is initialized in of_core_init().
    
    The cache is freed via a late_initcall_sync() if modules are not
    enabled.
    
    If the devicetree is created by the dtc compiler, with all phandle
    property values auto generated, then the size required by the cache
    could be 4 * (1 + number of phandles) bytes.  This results in an O(1)
    node lookup cost for a given phandle value.  Due to a concern that the
    phandle property values might not be consistent with what is generated
    by the dtc compiler, a mask has been added to the cache lookup algorithm.
    To maintain the O(1) node lookup cost, the size of the cache has been
    increased by rounding the number of entries up to the next power of
    two.
    
    The overhead of finding the devicetree node containing a given phandle
    value has been noted by several people in the recent past, in some cases
    with a patch to add a hashed index of devicetree nodes, based on the
    phandle value of the node.  One concern with this approach is the extra
    space added to each node.  This patch takes advantage of the phandle
    property values auto generated by the dtc compiler, which begin with
    one and monotonically increase by one, resulting in a range of 1..n
    for n phandle values.  This implementation should also provide a good
    reduction of overhead for any range of phandle values that are mostly
    in a monotonic range.
    
    Performance measurements by Chintan Pandya <cpandya@codeaurora.org>
    of several implementations of patches that are similar to this one
    suggest an expected reduction of boot time by ~400ms for his test
    system.  If the cache size was decreased to 64 entries, the boot
    time was reduced by ~340 ms.  The measurements were on a 4.9.73 kernel
    for arch/arm64/boot/dts/qcom/sda670-mtp.dts, contains 2371 nodes and
    814 phandle values.
    
    Reported-by: default avatarChintan Pandya <cpandya@codeaurora.org>
    Signed-off-by: default avatarFrank Rowand <frank.rowand@sony.com>
    Signed-off-by: default avatarRob Herring <robh@kernel.org>
    0b3ce78e