    perf stat: Don't report a null stalled cycles per insn metric · 80cc7bb6
    Kim Phillips authored
    For data collected on machines with front end stalled cycles supported,
    such as found on modern AMD CPU families, commit 146540fb ("perf
    stat: Always separate stalled cycles per insn") introduces a new line in
    CSV output with a leading comma that upsets some automated scripts.
    Scripts have to use "-e ex_ret_instr" to work around this issue, after
    upgrading to a version of perf with that commit.
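
To illustrate the script breakage, here is a minimal Python sketch of a CSV consumer. The sample rows and their simplified field layout are assumptions made for illustration, not captured perf output, and read_counts is a hypothetical helper. A naive int() on the first field fails on the metric-only row that starts with a comma; skipping rows whose first field is empty avoids that.

```python
import csv
import io

# Hypothetical sample modeled on the commit description (NOT captured output).
# The field layout is simplified: count, unit, event, metric value, metric name.
# The second row starts with a comma because it carries only a metric.
SAMPLE = (
    "181110981,,instructions,0.58,insn per cycle\n"
    ",,,0.00,stalled cycles per insn\n"
    "309876469,,cycles,,\n"
)


def read_counts(text):
    """Map event name -> count, skipping rows that have no counter value."""
    counts = {}
    for row in csv.reader(io.StringIO(text)):
        if not row or not row[0]:
            continue  # metric-only row: empty first field, nothing to count
        counts[row[2]] = int(row[0])
    return counts


print(read_counts(SAMPLE))
```

A script that restricts the event list, as in the workaround above, never sees the extra row in the first place; the guard only makes the parser tolerant of it.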
    
    We could add "if (have_frontend_stalled && !config->csv_sep)" to the
    else clause of the "(total && avg)" check, on the grounds that CSV users
    are usually scripts written to do only what is needed, i.e., they
    wouldn't typically invoke "perf stat" without specifying an explicit
    event list.
    
    But, CSV output aside, why should users now tolerate an extra line that
    constantly reports 0 in regular terminal output?
    
    BEFORE:
    
    $ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1
    
     Performance counter stats for 'system wide':
    
           181,110,981      instructions              #    0.58  insn per cycle
                                                      #    0.00  stalled cycles per insn
           309,876,469      cycles
    
           1.002202582 seconds time elapsed
    
    Users would rather not see the now permanent:
    
      "0.00  stalled cycles per insn"
    
    line, since it conveys no useful information.
    
    So this patch removes the printing of the zeroed stalled cycles line
    altogether, almost reverting the original commit fb4605ba
    ("perf stat: Check for frontend stalled for metrics"), which seems to
    have been written to normalize --metric-only column output on the common
    Intel machines of the time. As far as I can tell, modern Intel machines
    no longer support the generic frontend stalled metrics.
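
The resulting print decision can be modeled as follows. This is a simplified Python sketch, not the actual perf C code: ipc_metric_lines is a hypothetical helper, and it assumes the stalled metric is stalled cycles divided by instructions, which is consistent with the figures in the example outputs of this message.

```python
def ipc_metric_lines(instructions, cycles, stalled_cycles):
    """Return the metric annotations printed next to the 'instructions' row.

    Simplified model (assumption): the stalled line is emitted only when
    both the stalled-cycle total and the instruction count are non-zero,
    so a zeroed "stalled cycles per insn" line is never produced.
    """
    lines = []
    if cycles:
        lines.append(f"{instructions / cycles:.2f}  insn per cycle")
    if stalled_cycles and instructions:
        ratio = stalled_cycles / instructions
        lines.append(f"{ratio:.2f}  stalled cycles per insn")
    return lines


# Counts taken from the example outputs in this commit message.
print(ipc_metric_lines(244_071_432, 355_353_490, 0))
print(ipc_metric_lines(247_227_799, 394_745_636, 63_194_485))
```

With no stalled-cycle count available, only the "insn per cycle" annotation is returned; when the count is present, both annotations are returned as before.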
    
    AFTER:
    
    $ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1
    
     Performance counter stats for 'system wide':
    
           244,071,432      instructions              #    0.69  insn per cycle
           355,353,490      cycles
    
           1.001862516 seconds time elapsed
    
    Output when stalled cycles are actually measured is unaffected
    (BEFORE == AFTER):
    
    $ sudo perf stat --all-cpus -einstructions,cycles,stalled-cycles-frontend -- sleep 1
    
     Performance counter stats for 'system wide':
    
           247,227,799      instructions              #    0.63  insn per cycle
                                                      #    0.26  stalled cycles per insn
           394,745,636      cycles
            63,194,485      stalled-cycles-frontend   #   16.01% frontend cycles idle
    
           1.002079770 seconds time elapsed
    
    Fixes: 146540fb ("perf stat: Always separate stalled cycles per insn")
    Signed-off-by: Kim Phillips <kim.phillips@amd.com>
    Acked-by: Andi Kleen <ak@linux.intel.com>
    Acked-by: Jiri Olsa <jolsa@redhat.com>
    Acked-by: Song Liu <songliubraving@fb.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Cong Wang <xiyou.wangcong@gmail.com>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Cc: Jin Yao <yao.jin@linux.intel.com>
    Cc: Kan Liang <kan.liang@linux.intel.com>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Link: http://lore.kernel.org/lkml/20200207230613.26709-1-kim.phillips@amd.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>