Sysjitter results worse on kernel 6.12 than on 6.6

Environment:

  • Intel(R) Core™ i9-10940X CPU @ 3.30GHz (14 cores)
  • Hyper-threading off
  • 32 GB memory
  • Clear Linux 42770

Kernel command line in /etc/kernel/cmdline.d/hpe.conf, isolating CPU cores 1-13:

intel_idle.max_cstate=0
processor.max_cstate=1
idle=poll
pcie_aspm=performance
mce=ignore_ce
ipmi_si.force_kipmi=0
nmi_watchdog=0
hpet=disabled
noht
nohz=on
nohalt
nosoftlockup
pti=off
spectre_v2=off
l1tf=off
nospec_store_bypass_disable
no_stf_barrier
mds=off
irqaffinity=0
selinux=0
audit=0
tsc=reliable
mitigations=off
intel_pstate=disable
skew_tick=1
transparent_hugepage=never
iommu=off
intel_iommu=off
rcu_nocb_poll
pcie_port_pm=off
clocksource=tsc
acpi_irq_nobalance
module.sig_unenforce
isolcpus=1-13
nohz_full=1-13
rcu_nocbs=1-13

Running sysjitter: ./sysjitter --runtime 10 200 | column -t

On kernel org.clearlinux.ltscurrent.6.6.64-1447 I got the expected result: the isolated CPU cores see no interruptions (int_n_per_sec) during the benchmark.

core_i:          0            1            2            3            4            5            6            7            8            9            10           11           12           13
threshold(ns):   200          200          200          200          200          200          200          200          200          200          200          200          200          200
cpu_mhz:         3311         3311         3311         3311         3311         3311         3311         3311         3311         3311         3311         3311         3311         3311
runtime(ns):     10002823577  10002822834  10002822971  10002824374  10002823325  10002822947  10002822961  10002823062  10002822743  10002823198  10002823117  10002823660  10002823478  10002825002
runtime(s):      10.003       10.003       10.003       10.003       10.003       10.003       10.003       10.003       10.003       10.003       10.003       10.003       10.003       10.003
int_n:           10126        0            0            0            0            0            0            0            0            0            0            0            0            4
int_n_per_sec:   1012.314     0.000        0.000        0.000        0.000        0.000        0.000        0.000        0.000        0.000        0.000        0.000        0.000        0.400
int_min(ns):     555          0            0            0            0            0            0            0            0            0            0            0            0            1237
int_median(ns):  3504         0            0            0            0            0            0            0            0            0            0            0            0            1362
int_mean(ns):    2763         0            0            0            0            0            0            0            0            0            0            0            0            1510
int_90(ns):      3759         0            0            0            0            0            0            0            0            0            0            0            0            2201
int_99(ns):      5678         0            0            0            0            0            0            0            0            0            0            0            0            2201
int_999(ns):     73101        0            0            0            0            0            0            0            0            0            0            0            0            2201
int_9999(ns):    158149       0            0            0            0            0            0            0            0            0            0            0            0            2201
int_99999(ns):   172710       0            0            0            0            0            0            0            0            0            0            0            0            2201
int_max(ns):     172710       0            0            0            0            0            0            0            0            0            0            0            0            2201
int_total(ns):   27989402     0            0            0            0            0            0            0            0            0            0            0            0            6043
int_total(%):    0.280        0.000        0.000        0.000        0.000        0.000        0.000        0.000        0.000        0.000        0.000        0.000        0.000        0.000

On kernel org.clearlinux.native.6.12.4-1518 I got an unexpected result: the isolated CPU cores see about 2 interruptions per second (int_n_per_sec) during the benchmark.

core_i:          0            1           2            3            4            5            6            7            8            9            10           11           12           13
threshold(ns):   200          200         200          200          200          200          200          200          200          200          200          200          200          200
cpu_mhz:         3311         3322        3311         3311         3311         3311         3311         3311         3311         3311         3311         3311         3311         3311
runtime(ns):     10003023405  9969901169  10003023673  10003023434  10003023686  10003024300  10003023266  10003023555  10003024161  10003023876  10003023949  10003023414  10003023461  10003025463
runtime(s):      10.003       9.970       10.003       10.003       10.003       10.003       10.003       10.003       10.003       10.003       10.003       10.003       10.003       10.003
int_n:           10847        20          26           22           24           26           24           21           26           26           24           20           18           18
int_n_per_sec:   1084.372     2.006       2.599        2.199        2.399        2.599        2.399        2.099        2.599        2.599        2.399        1.999        1.799        1.799
int_min(ns):     553          984         800          829          826          736          804          811          869          826          851          846          1713         1642
int_median(ns):  3458         2042        1713         1870         1813         1813         1803         1852         1832         1835         1884         2019         2955         2942
int_mean(ns):    4773         2387        1990         2252         2047         2012         2079         2274         2120         2115         2134         2425         2558         2533
int_90(ns):      3814         3585        3241         3361         3132         3238         3279         3443         3368         3410         3630         3566         3516         3247
int_99(ns):      33648        4474        4140         4353         4138         3891         4195         4061         5646         4916         3780         4007         3883         4291
int_999(ns):     122059       4474        4140         4353         4138         3891         4195         4061         5646         4916         3780         4007         3883         4291
int_9999(ns):    157194       4474        4140         4353         4138         3891         4195         4061         5646         4916         3780         4007         3883         4291
int_99999(ns):   2758484      4474        4140         4353         4138         3891         4195         4061         5646         4916         3780         4007         3883         4291
int_max(ns):     2758484      4474        4140         4353         4138         3891         4195         4061         5646         4916         3780         4007         3883         4291
int_total(ns):   51784231     47754       51752        49557        49147        52318        49917        47775        55130        55010        51220        48517        46061        45604
int_total(%):    0.518        0.000       0.001        0.000        0.000        0.001        0.000        0.000        0.001        0.001        0.001        0.000        0.000        0.000

Does anyone have an idea why?

It seems something changed with the ability to set IRQ affinity in kernel v6.12.

Typical output looks something like:

irqbalance[848]: Cannot change IRQ 63 affinity: Success
irqbalance[848]: IRQ 63 affinity is now unmanaged
irqbalance[848]: Cannot change IRQ 61 affinity: Success
irqbalance[848]: IRQ 61 affinity is now unmanaged
irqbalance[848]: Cannot change IRQ 68 affinity: Permission denied
irqbalance[848]: IRQ 68 affinity is now unmanaged
irqbalance[848]: Cannot change IRQ 58 affinity: Permission denied
irqbalance[848]: IRQ 58 affinity is now unmanaged
irqbalance[848]: Cannot change IRQ 66 affinity: Permission denied
irqbalance[848]: IRQ 66 affinity is now unmanaged 

Thanks for connecting the dots @damentz
@william.douglas - heads-up!

Chris

With the boot parameters in /etc/kernel/cmdline.d/hpe.conf, the IRQ affinity after boot is as expected: all IRQs are assigned to CPU core 0.

# cat /proc/irq/default_smp_affinity 
0001

# cat /proc/irq/*/smp_affinity_list
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0

Running dynticks-testing on isolated core 5, I got:

# tracer: nop
#
# entries-in-buffer/entries-written: 24421/36078   #P:14
#
#                                _-----=> irqs-off/BH-disabled
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| / _-=> migrate-disable
#                              |||| /     delay
#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
          <idle>-0       [005] d..2.    32.504692: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=taskset next_pid=832 next_prio=120
       user_loop-832     [005] d..1.    32.504801: workqueue_queue_work: work struct=0000000039ab7807 function=blk_mq_run_work_fn workqueue=kblockd req_cpu=512 cpu=5
       user_loop-832     [005] d....    32.504805: tick_stop: success=0 dependency=SCHED
       user_loop-832     [005] d..2.    32.504809: sched_switch: prev_comm=taskset prev_pid=832 prev_prio=120 prev_state=D ==> next_comm=kworker/5:0H next_pid=59 next_prio=100
    kworker/5:0H-59      [005] .....    32.504810: workqueue_execute_start: work struct 0000000039ab7807: function blk_mq_run_work_fn
    kworker/5:0H-59      [005] d..2.    32.504824: sched_switch: prev_comm=kworker/5:0H prev_pid=59 prev_prio=100 prev_state=I ==> next_comm=swapper/5 next_pid=0 next_prio=120
          <idle>-0       [005] d..1.    32.504954: tick_stop: success=1 dependency=NONE
          <idle>-0       [005] d..2.    32.505801: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=taskset next_pid=832 next_prio=120
       user_loop-832     [005] dn...    33.911275: tick_stop: success=0 dependency=SCHED
       user_loop-832     [005] d.h..    33.911441: hrtimer_expire_entry: hrtimer=00000000b7a3bd2e function=tick_nohz_handler now=33853178916
       user_loop-832     [005] d....    33.911444: tick_stop: success=1 dependency=NONE
       user_loop-832     [005] dn...    34.935273: tick_stop: success=0 dependency=SCHED
       user_loop-832     [005] d.h..    34.935441: hrtimer_expire_entry: hrtimer=00000000b7a3bd2e function=tick_nohz_handler now=34877178787
       user_loop-832     [005] d....    34.935442: tick_stop: success=1 dependency=NONE
       user_loop-832     [005] dn...    35.959271: tick_stop: success=0 dependency=SCHED
       user_loop-832     [005] d.h..    35.959441: hrtimer_expire_entry: hrtimer=00000000b7a3bd2e function=tick_nohz_handler now=35901178815
       user_loop-832     [005] d....    35.959442: tick_stop: success=1 dependency=NONE
       user_loop-832     [005] dn...    36.983275: tick_stop: success=0 dependency=SCHED
       user_loop-832     [005] d.h..    36.983441: hrtimer_expire_entry: hrtimer=00000000b7a3bd2e function=tick_nohz_handler now=36925178810
       user_loop-832     [005] d....    36.983442: tick_stop: success=1 dependency=NONE
       user_loop-832     [005] dn...    38.007271: tick_stop: success=0 dependency=SCHED
       user_loop-832     [005] d.h..    38.007441: hrtimer_expire_entry: hrtimer=00000000b7a3bd2e function=tick_nohz_handler now=37949178807
       user_loop-832     [005] d....    38.007442: tick_stop: success=1 dependency=NONE
       user_loop-832     [005] dn...    39.031271: tick_stop: success=0 dependency=SCHED
       user_loop-832     [005] d.h..    39.031441: hrtimer_expire_entry: hrtimer=00000000b7a3bd2e function=tick_nohz_handler now=38973178791
       user_loop-832     [005] d....    39.031442: tick_stop: success=1 dependency=NONE
       user_loop-832     [005] dn...    40.055269: tick_stop: success=0 dependency=SCHED
       user_loop-832     [005] d.h..    40.055440: hrtimer_expire_entry: hrtimer=00000000b7a3bd2e function=tick_nohz_handler now=39997178822
       user_loop-832     [005] d....    40.055442: tick_stop: success=1 dependency=NONE
       user_loop-832     [005] dn...    41.079270: tick_stop: success=0 dependency=SCHED
       user_loop-832     [005] d.h..    41.079440: hrtimer_expire_entry: hrtimer=00000000b7a3bd2e function=tick_nohz_handler now=41021178813
       user_loop-832     [005] d....    41.079441: tick_stop: success=1 dependency=NONE
       user_loop-832     [005] dn...    42.103269: tick_stop: success=0 dependency=SCHED
       user_loop-832     [005] d.h..    42.103440: hrtimer_expire_entry: hrtimer=00000000b7a3bd2e function=tick_nohz_handler now=42045178802
       user_loop-832     [005] d....    42.103441: tick_stop: success=1 dependency=NONE
       user_loop-832     [005] d..2.    42.518341: workqueue_queue_work: work struct=00000000cfc01b61 function=ioc_release_fn workqueue=events_power_efficient req_cpu=512 cpu=-1
       user_loop-832     [005] d..2.    42.518346: sched_switch: prev_comm=user_loop prev_pid=832 prev_prio=120 prev_state=X ==> next_comm=swapper/5 next_pid=0 next_prio=120
          <idle>-0       [005] d.h1.    42.518347: hrtimer_expire_entry: hrtimer=000000003797a589 function=inactive_task_timer now=42460085785

It seems there are two tick_stop events per second: a success=0 (dependency=SCHED) followed by a success=1, roughly once every second.

Is one hrtimer_expire_entry (tick_nohz_handler) per second expected?