This program summarizes runqueue statistics, that is, the time each task
spends on-CPU before being descheduled. Results can be shown per CPU,
per PID, or as a histogram.

There are two major categories of parameters:
1. Parameters that control the sampling source, such as
-c[--cpu] and -p[--pid].
2. Parameters that control the display format,
such as -C[--cpus], -P[--pids] and -D[--dist].

These two categories of parameters are orthogonal.


Example 1.

By default, the program samples all processes on the runqueues
of all CPUs. Statistics for all processes on the same CPU are
aggregated and displayed per CPU.

In this case, "alirunqinfo" is the same as "alirunqinfo -C".

# alirunqinfo
Tracing runqueue statistics... Hit Ctrl-C to end.
^C
       CPU      Count TOTAL_usecs
         0         77        207
         1          9         27
         2          3         14
         3          2         15
         4         45        122
         6         41        380
         7          6         25
         9         18         61
        10         19         50
        12          2         19
        14          2        189
        16          1         13
        17         18         49
        20         36         73
        21         18         35
        22         22        109
        24         25         92
        28         21         68
        30          4         95


Example 2.

Use the "-P" parameter to display the output per process.
In this case, all processes on all CPUs are still sampled, but the
statistics of each process are aggregated regardless of which CPU it
ran on.

# alirunqinfo -P
Tracing runqueue statistics... Hit Ctrl-C to end.
^C
       PID                                   Thread      Count TOTAL_usecs
        10                                rcu_sched        122        264
       382                kworker/14:1-mm_percpu_wq          1         20
      2926                                     rngd          1          2
      3327                kworker/16:2-mm_percpu_wq          1         15
      3384                             in:imjournal          2          8
      3428                          AliYunDunUpdate         20         60
      3429                          AliYunDunUpdate         20         40
      3442                           aliyun-service         20         98
      3449                                AliYunDun         20        137
      3450                                AliYunDun         19         58
      3451                                AliYunDun         20         56
      3462                                AliYunDun         19         46
      3463                                AliYunDun         19        100
      3464                                AliYunDun          2        220
      3467                                AliYunDun          2          6
      3468                                AliYunDun         26         61
      3469                                AliYunDun         39         78
      3470                                AliYunDun         39        125
      3471                                AliYunDun          2         10
      3473                                AliYunDun          2          4
      3474                                AliYunDun          1          4
      3475                                AliYunDun          2         11
      3476                                AliYunDun         10         30
      3533                                     sshd          5        193
      4573                kworker/27:2-mm_percpu_wq          2         13
      5016                kworker/24:0-mm_percpu_wq          1          5
      5281       kworker/4:0-events_power_efficient          9         19
      5999       kworker/0:2-events_power_efficient          2         34
      6429             kworker/u64:0-events_unbound          3         14
      6468             kworker/u64:1-events_unbound          2         37


Example 3.

You can specify "-P" and "-C" at the same time, in which case the
output is displayed per CPU and per process.

# alirunqinfo -PC
Tracing runqueue statistics... Hit Ctrl-C to end.
^C

CPU0:
       PID                                   Thread      Count TOTAL_usecs
        10                                rcu_sched         72        147
      3467                                AliYunDun          1          6

CPU1:
       PID                                   Thread      Count TOTAL_usecs
      3476                                AliYunDun          5         26

CPU2:
       PID                                   Thread      Count TOTAL_usecs
      3384                             in:imjournal          2         12

CPU3:
       PID                                   Thread      Count TOTAL_usecs
      2926                                     rngd          2          7
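
The three groupings shown in Examples 1 to 3 differ only in the key
used to aggregate samples. The sketch below illustrates that idea in
plain Python. It is not the tool's implementation: the aggregate()
helper, the SAMPLES list and its values are all hypothetical and exist
only for illustration.

```python
from collections import defaultdict

# Hypothetical samples: (cpu, pid, comm, on_cpu_usecs). Values are made up.
SAMPLES = [
    (0, 10, "rcu_sched", 2),
    (0, 10, "rcu_sched", 3),
    (1, 10, "rcu_sched", 1),
    (1, 3476, "AliYunDun", 5),
]


def aggregate(samples, per_cpu=False, per_pid=False):
    """Return {key: [count, total_usecs]} grouped by the chosen fields."""
    stats = defaultdict(lambda: [0, 0])
    for cpu, pid, comm, usecs in samples:
        key = ()
        if per_cpu:
            key += (cpu,)          # group by CPU, as with -C
        if per_pid:
            key += (pid, comm)     # group by process, as with -P
        entry = stats[key]
        entry[0] += 1              # Count column
        entry[1] += usecs          # TOTAL_usecs column
    return dict(stats)


if __name__ == "__main__":
    for key, (count, total) in sorted(aggregate(SAMPLES, per_cpu=True).items()):
        print(key, count, total)
```

Passing both per_cpu=True and per_pid=True yields one entry per
(CPU, process) pair, which corresponds to the "-PC" layout above.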


Example 4.

Sample only the specified process with "-p". "-P" is added implicitly
when "-p" is specified.

# alirunqinfo -p 3429
Tracing runqueue statistics... Hit Ctrl-C to end.
^C
       PID                                   Thread      Count TOTAL_usecs
      3429                          AliYunDunUpdate         14         40


Example 5.

When both "-p" and "-C" are specified, the output is displayed
per CPU and per process.

# alirunqinfo -p 3429 -C
Tracing runqueue statistics... Hit Ctrl-C to end.
^C

CPU22:
       PID                                   Thread      Count TOTAL_usecs
      3429                          AliYunDunUpdate         29         67


Example 6.

Sample only processes on the specified CPU with "-c". "-C" is added
implicitly when "-c" is specified.

# alirunqinfo -c 22
Tracing runqueue statistics... Hit Ctrl-C to end.
^C
       CPU      Count TOTAL_usecs
        22         24         90
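
The rules described so far (no display flag behaves like "-C", "-p"
implies "-P", "-c" implies "-C") can be sketched with argparse as
follows. This is an illustration under stated assumptions, not the
tool's actual source: build_parser() and resolve_display() are
hypothetical names, and the fallback to per-CPU output when "-D" is
given alone is an assumption that this document does not confirm.

```python
import argparse


def build_parser():
    # Only the flags relevant to sampling source and display format.
    parser = argparse.ArgumentParser(prog="alirunqinfo")
    parser.add_argument("-C", "--cpus", action="store_true")
    parser.add_argument("-P", "--pids", action="store_true")
    parser.add_argument("-D", "--dist", action="store_true")
    parser.add_argument("-c", "--cpu", type=int)
    parser.add_argument("-p", "--pid", type=int)
    return parser


def resolve_display(argv):
    """Map command-line flags to the effective display settings."""
    args = build_parser().parse_args(argv)
    per_pid = args.pids or args.pid is not None   # "-p" implies "-P"
    per_cpu = args.cpus or args.cpu is not None   # "-c" implies "-C"
    if not per_cpu and not per_pid:
        per_cpu = True                            # default is per-CPU (assumed for "-D" alone too)
    return {
        "per_cpu": per_cpu,
        "per_pid": per_pid,
        "dist": args.dist,
        "cpu": args.cpu,   # sampling filter, independent of display
        "pid": args.pid,   # sampling filter, independent of display
    }


if __name__ == "__main__":
    print(resolve_display(["-p", "3429", "-C"]))
```

Keeping the filters ("cpu", "pid") separate from the display booleans
reflects the orthogonality of the two parameter categories.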


Example 7.

Display the output as a histogram when "-D" is specified.

# alirunqinfo -p 3429 -D
Tracing runqueue statistics... Hit Ctrl-C to end.
^C

Thread = 3429 (AliYunDunUpdate)
     time_usecs          : count     distribution
         0 -> 1          : 22       |****************************************|
         2 -> 3          : 12       |*********************                   |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 1        |*                                       |
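
The bucket boundaries above (0 -> 1, 2 -> 3, 4 -> 7, ...) are powers of
two. The sketch below shows one way to compute such buckets and the
length of the bar for each row. It is an illustration only: the helper
names are hypothetical, and the truncating bar scaling is an assumption
that appears consistent with the output above, not a statement about
how the tool renders it.

```python
def bucket_index(value):
    """Index of the power-of-two bucket holding a non-negative integer."""
    return 0 if value <= 1 else value.bit_length() - 1


def bucket_range(index):
    """Inclusive (low, high) bounds of a bucket, e.g. 2 -> (4, 7)."""
    low = 0 if index == 0 else 1 << index
    high = (1 << (index + 1)) - 1
    return low, high


def stars(count, max_count, width=40):
    """Bar length scaled to the largest bucket, truncated toward zero."""
    if max_count == 0:
        return 0
    return count * width // max_count


def log2_hist(values):
    """Count how many values fall into each power-of-two bucket."""
    counts = {}
    for value in values:
        idx = bucket_index(value)
        counts[idx] = counts.get(idx, 0) + 1
    return counts


if __name__ == "__main__":
    hist = log2_hist([0, 1, 1, 2, 3, 40])   # made-up durations in usecs
    peak = max(hist.values())
    for idx in range(max(hist) + 1):
        low, high = bucket_range(idx)
        count = hist.get(idx, 0)
        print("%10d -> %-10d: %-8d |%-40s|" % (low, high, count, "*" * stars(count, peak)))
```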



USAGE message:

# alirunqinfo -h
usage: alirunqinfo.py [-h] [-T] [-N] [-C] [-P] [-D] [-c CPU] [-p PID]
                   [interval] [outputs]

Summarize runqueue statistics, including total count, time, etc.

positional arguments:
  interval           output interval, in seconds
  outputs            number of outputs

optional arguments:
  -h, --help         show this help message and exit
  -T, --timestamp    include timestamp on output
  -N, --nanoseconds  output in nanoseconds
  -C, --cpus         show per-CPU runqueue statistics
  -P, --pids         show per-thread runqueue statistics
  -D, --dist         show distributions as histograms
  -c CPU, --cpu CPU  show runqueue statistics on specific CPU only
  -p PID, --pid PID  show runqueue statistics of specific PID only

examples:
    ./alirunqinfo            # sum runqueue statistics
    ./alirunqinfo -C         # show per-CPU runqueue statistics
    ./alirunqinfo -P         # show per-thread runqueue statistics
    ./alirunqinfo -D         # show runqueue statistics as histograms
    ./alirunqinfo -c 0       # show runqueue statistics of CPU 0 only
    ./alirunqinfo -p 25      # show runqueue statistics of PID 25 only
    ./alirunqinfo 1 10       # print 1 second summaries, 10 times
    ./alirunqinfo -NT 1      # 1s summaries, nanoseconds, and timestamps
