Discussion:
localed stuck in recent 3.18 git in copy_net_ns?
Kevin Fenzi
2014-10-20 20:15:15 UTC
Permalink
Greetings.

I'm seeing suspend/resume failures with recent 3.18 git kernels.

Full dmesg at: http://paste.fedoraproject.org/143615/83287914/

The possibly interesting parts:

[ 78.373144] PM: Syncing filesystems ... done.
[ 78.411180] PM: Preparing system for mem sleep
[ 78.411995] Freezing user space processes ...
[ 98.429955] Freezing of tasks failed after 20.001 seconds (1 tasks refusing to freeze, wq_busy=0):
[ 98.429971] (-localed) D ffff88025f214c80 0 1866 1 0x00000084
[ 98.429975] ffff88024e777df8 0000000000000086 ffff88009b4444b0 0000000000014c80
[ 98.429978] ffff88024e777fd8 0000000000014c80 ffff880250ffb110 ffff88009b4444b0
[ 98.429981] 0000000000000000 ffffffff81cec1a0 ffffffff81cec1a4 ffff88009b4444b0
[ 98.429983] Call Trace:
[ 98.429991] [<ffffffff8175d619>] schedule_preempt_disabled+0x29/0x70
[ 98.429994] [<ffffffff8175f433>] __mutex_lock_slowpath+0xb3/0x120
[ 98.429997] [<ffffffff8175f4c3>] mutex_lock+0x23/0x40
[ 98.430001] [<ffffffff8163e325>] copy_net_ns+0x75/0x140
[ 98.430005] [<ffffffff810b8c2d>] create_new_namespaces+0xfd/0x1a0
[ 98.430008] [<ffffffff810b8e5a>] unshare_nsproxy_namespaces+0x5a/0xc0
[ 98.430012] [<ffffffff81098813>] SyS_unshare+0x193/0x340
[ 98.430015] [<ffffffff817617a9>] system_call_fastpath+0x12/0x17

[ 98.430032] Restarting tasks ... done.
[ 98.480361] PM: Syncing filesystems ... done.
[ 98.571645] PM: Preparing system for freeze sleep
[ 98.571779] Freezing user space processes ...
[ 118.592086] Freezing of tasks failed after 20.003 seconds (1 tasks refusing to freeze, wq_busy=0):
[ 118.592102] (-localed) D ffff88025f214c80 0 1866 1 0x00000084
[ 118.592106] ffff88024e777df8 0000000000000086 ffff88009b4444b0 0000000000014c80
[ 118.592109] ffff88024e777fd8 0000000000014c80 ffff880250ffb110 ffff88009b4444b0
[ 118.592111] 0000000000000000 ffffffff81cec1a0 ffffffff81cec1a4 ffff88009b4444b0
[ 118.592114] Call Trace:
[ 118.592121] [<ffffffff8175d619>] schedule_preempt_disabled+0x29/0x70
[ 118.592125] [<ffffffff8175f433>] __mutex_lock_slowpath+0xb3/0x120
[ 118.592127] [<ffffffff8175f4c3>] mutex_lock+0x23/0x40
[ 118.592132] [<ffffffff8163e325>] copy_net_ns+0x75/0x140
[ 118.592136] [<ffffffff810b8c2d>] create_new_namespaces+0xfd/0x1a0
[ 118.592139] [<ffffffff810b8e5a>] unshare_nsproxy_namespaces+0x5a/0xc0
[ 118.592143] [<ffffffff81098813>] SyS_unshare+0x193/0x340
[ 118.592146] [<ffffffff817617a9>] system_call_fastpath+0x12/0x17

[ 118.592163] Restarting tasks ... done.

root 6 0.0 0.0 0 0 ? D 13:49 0:00 [kworker/u16:0]
root 1876 0.0 0.0 41460 5784 ? Ds 13:49 0:00 (-localed)

I'll try and bisect this, but perhaps it rings bells already for folks.

kevin
Dave Jones
2014-10-20 20:43:26 UTC
Permalink
Post by Kevin Fenzi
I'm seeing suspend/resume failures with recent 3.18 git kernels.
Full dmesg at: http://paste.fedoraproject.org/143615/83287914/
[ 78.373144] PM: Syncing filesystems ... done.
[ 78.411180] PM: Preparing system for mem sleep
[ 78.411995] Freezing user space processes ...
[ 98.429971] (-localed) D ffff88025f214c80 0 1866 1 0x00000084
[ 98.429975] ffff88024e777df8 0000000000000086 ffff88009b4444b0 0000000000014c80
[ 98.429978] ffff88024e777fd8 0000000000014c80 ffff880250ffb110 ffff88009b4444b0
[ 98.429981] 0000000000000000 ffffffff81cec1a0 ffffffff81cec1a4 ffff88009b4444b0
[ 98.429991] [<ffffffff8175d619>] schedule_preempt_disabled+0x29/0x70
[ 98.429994] [<ffffffff8175f433>] __mutex_lock_slowpath+0xb3/0x120
[ 98.429997] [<ffffffff8175f4c3>] mutex_lock+0x23/0x40
[ 98.430001] [<ffffffff8163e325>] copy_net_ns+0x75/0x140
[ 98.430005] [<ffffffff810b8c2d>] create_new_namespaces+0xfd/0x1a0
[ 98.430008] [<ffffffff810b8e5a>] unshare_nsproxy_namespaces+0x5a/0xc0
[ 98.430012] [<ffffffff81098813>] SyS_unshare+0x193/0x340
[ 98.430015] [<ffffffff817617a9>] system_call_fastpath+0x12/0x17
I've seen similar soft lockup traces from the sys_unshare path when running my
fuzz tester. It seems that if you create enough network namespaces,
it can take a huge amount of time for them to be iterated.
(Running trinity with '-c unshare' you can see the slow down happen. In
some cases, it takes so long that the watchdog process kills it --
though the SIGKILL won't get delivered until the unshare() completes)
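
A minimal standalone loop along those lines -- just a sketch for anyone
who wants to poke at this without trinity; it needs root (CAP_SYS_ADMIN)
and the iteration count and threshold are arbitrary -- would be something
like:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <time.h>
#include <err.h>

int main(void)
{
	struct timespec t0, t1;
	double ms;
	int i;

	for (i = 0; i < 10000; i++) {
		clock_gettime(CLOCK_MONOTONIC, &t0);
		/* same kernel path as the copy_net_ns trace above */
		if (unshare(CLONE_NEWNET) < 0)
			err(1, "unshare(CLONE_NEWNET)");
		clock_gettime(CLOCK_MONOTONIC, &t1);

		ms = (t1.tv_sec - t0.tv_sec) * 1e3 +
		     (t1.tv_nsec - t0.tv_nsec) / 1e6;
		if (ms > 100.0)
			printf("iteration %d: unshare took %.1f ms\n", i, ms);
	}
	return 0;
}

Each iteration orphans the previous namespace, so the old ones pile up
behind the netns cleanup work.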

Any idea what this machine had been doing prior to this that may have
involved creating lots of namespaces ?

Dave
Kevin Fenzi
2014-10-20 20:53:59 UTC
Permalink
On Mon, 20 Oct 2014 16:43:26 -0400
Post by Dave Jones
I've seen similar soft lockup traces from the sys_unshare path when
running my fuzz tester. It seems that if you create enough network
namespaces, it can take a huge amount of time for them to be iterated.
(Running trinity with '-c unshare' you can see the slow down happen.
In some cases, it takes so long that the watchdog process kills it --
though the SIGKILL won't get delivered until the unshare() completes)
Any idea what this machine had been doing prior to this that may have
involved creating lots of namespaces ?
That was right after boot. ;)

This is my main rawhide running laptop.

A 'ip netns list' shows nothing.

kevin
Kevin Fenzi
2014-10-21 21:12:25 UTC
Permalink
On Mon, 20 Oct 2014 14:53:59 -0600
Post by Kevin Fenzi
On Mon, 20 Oct 2014 16:43:26 -0400
Post by Dave Jones
I've seen similar soft lockup traces from the sys_unshare path when
running my fuzz tester. It seems that if you create enough network
namespaces, it can take a huge amount of time for them to be
iterated. (Running trinity with '-c unshare' you can see the slow
down happen. In some cases, it takes so long that the watchdog
process kills it -- though the SIGKILL won't get delivered until
the unshare() completes)
Any idea what this machine had been doing prior to this that may
have involved creating lots of namespaces ?
That was right after boot. ;)
This is my main rawhide running laptop.
A 'ip netns list' shows nothing.
Some more information:

The problem started between:

v3.17-7872-g5ff0b9e1a1da and v3.17-8307-gf1d0d14120a8

(I can try and do a bisect, but have to head out on a trip tomorrow)

In all the kernels with the problem, there is a kworker process in D.

sysrq-t says:
Showing all locks held in the system:
Oct 21 15:06:31 voldemort.scrye.com kernel: 4 locks held by kworker/u16:0/6:
Oct 21 15:06:31 voldemort.scrye.com kernel: #0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccbff>] process_one_work+0x17f/0x850
Oct 21 15:06:31 voldemort.scrye.com kernel: #1: (net_cleanup_work){+.+.+.}, at: [<ffffffff810ccbff>] process_one_work+0x17f/0x850
Oct 21 15:06:31 voldemort.scrye.com kernel: #2: (net_mutex){+.+.+.}, at: [<ffffffff817069fc>] cleanup_net+0x8c/0x1f0
Oct 21 15:06:31 voldemort.scrye.com kernel: #3: (rcu_sched_state.barrier_mutex){+.+...}, at: [<ffffffff8112a395>] _rcu_barrier+0x35/0x200

On first running any of the systemd units that use PrivateNetwork, they
run ok, but they are also set to time out after a minute. On successive
runs they hang in D also.

kevin
Josh Boyer
2014-10-22 17:12:25 UTC
Permalink
Post by Kevin Fenzi
On Mon, 20 Oct 2014 14:53:59 -0600
Post by Kevin Fenzi
On Mon, 20 Oct 2014 16:43:26 -0400
Post by Dave Jones
I've seen similar soft lockup traces from the sys_unshare path when
running my fuzz tester. It seems that if you create enough network
namespaces, it can take a huge amount of time for them to be
iterated. (Running trinity with '-c unshare' you can see the slow
down happen. In some cases, it takes so long that the watchdog
process kills it -- though the SIGKILL won't get delivered until
the unshare() completes)
Any idea what this machine had been doing prior to this that may
have involved creating lots of namespaces ?
That was right after boot. ;)
This is my main rawhide running laptop.
A 'ip netns list' shows nothing.
v3.17-7872-g5ff0b9e1a1da and v3.17-8307-gf1d0d14120a8
(I can try and do a bisect, but have to head out on a trip tomorrow)
In all the kernels with the problem, there is a kworker process in D.
Oct 21 15:06:31 voldemort.scrye.com kernel: #0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccbff>] process_one_work+0x17f/0x850
Oct 21 15:06:31 voldemort.scrye.com kernel: #1: (net_cleanup_work){+.+.+.}, at: [<ffffffff810ccbff>] process_one_work+0x17f/0x850
Oct 21 15:06:31 voldemort.scrye.com kernel: #2: (net_mutex){+.+.+.}, at: [<ffffffff817069fc>] cleanup_net+0x8c/0x1f0
(rcu_sched_state.barrier_mutex){+.+...}, at: [<ffffffff8112a395>]
_rcu_barrier+0x35/0x200
On first running any of the systemd units that use PrivateNetwork, they
run ok, but they are also set to time out after a minute. On successive
runs they hang in D also.
Someone else is seeing this when they try and modprobe ppp_generic:

[ 240.599195] INFO: task kworker/u16:5:100 blocked for more than 120 seconds.
[ 240.599338] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.599446] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.599583] kworker/u16:5 D ffff8802202db480 12400 100 2 0x00000000
[ 240.599744] Workqueue: netns cleanup_net
[ 240.599823] ffff8802202eb9e8 0000000000000096 ffff8802202db480 00000000001d5f00
[ 240.600066] ffff8802202ebfd8 00000000001d5f00 ffff8800368c3480 ffff8802202db480
[ 240.600228] ffffffff81ee2690 7fffffffffffffff ffffffff81ee2698 ffffffff81ee2690
[ 240.600386] Call Trace:
[ 240.600445] [<ffffffff8185e239>] schedule+0x29/0x70
[ 240.600541] [<ffffffff8186345c>] schedule_timeout+0x26c/0x410
[ 240.600651] [<ffffffff81865ef7>] ? retint_restore_args+0x13/0x13
[ 240.600765] [<ffffffff818644e4>] ? _raw_spin_unlock_irq+0x34/0x50
[ 240.600879] [<ffffffff8185fc6c>] wait_for_completion+0x10c/0x150
[ 240.601025] [<ffffffff810e53e0>] ? wake_up_state+0x20/0x20
[ 240.601133] [<ffffffff8112a749>] _rcu_barrier+0x159/0x200
[ 240.601237] [<ffffffff8112a845>] rcu_barrier+0x15/0x20
[ 240.601335] [<ffffffff81718ebf>] netdev_run_todo+0x6f/0x310
[ 240.601442] [<ffffffff8170da85>] ? rollback_registered_many+0x265/0x2e0
[ 240.601564] [<ffffffff81725f2e>] rtnl_unlock+0xe/0x10
[ 240.601660] [<ffffffff8170f8e6>] default_device_exit_batch+0x156/0x180
[ 240.601781] [<ffffffff810fd8a0>] ? abort_exclusive_wait+0xb0/0xb0
[ 240.601895] [<ffffffff81707993>] ops_exit_list.isra.1+0x53/0x60
[ 240.602028] [<ffffffff81708540>] cleanup_net+0x100/0x1f0
[ 240.602131] [<ffffffff810ccfa8>] process_one_work+0x218/0x850
[ 240.602241] [<ffffffff810ccf0f>] ? process_one_work+0x17f/0x850
[ 240.602350] [<ffffffff810cd6c7>] ? worker_thread+0xe7/0x4a0
[ 240.602454] [<ffffffff810cd64b>] worker_thread+0x6b/0x4a0
[ 240.602555] [<ffffffff810cd5e0>] ? process_one_work+0x850/0x850
[ 240.602665] [<ffffffff810d399b>] kthread+0x10b/0x130
[ 240.602762] [<ffffffff81028cc9>] ? sched_clock+0x9/0x10
[ 240.602862] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603004] [<ffffffff818651fc>] ret_from_fork+0x7c/0xb0
[ 240.603106] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603224] 4 locks held by kworker/u16:5/100:
[ 240.603304] #0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccf0f>] process_one_work+0x17f/0x850
[ 240.603495] #1: (net_cleanup_work){+.+.+.}, at: [<ffffffff810ccf0f>] process_one_work+0x17f/0x850
[ 240.603691] #2: (net_mutex){+.+.+.}, at: [<ffffffff817084cc>] cleanup_net+0x8c/0x1f0
[ 240.603869] #3: (rcu_sched_state.barrier_mutex){+.+...}, at: [<ffffffff8112a625>] _rcu_barrier+0x35/0x200
[ 240.604211] INFO: task modprobe:1387 blocked for more than 120 seconds.
[ 240.604329] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.604434] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.604570] modprobe D ffff8800cb4f1a40 13112 1387 1386 0x00000080
[ 240.604719] ffff8800cafbbbe8 0000000000000096 ffff8800cb4f1a40 00000000001d5f00
[ 240.604878] ffff8800cafbbfd8 00000000001d5f00 ffff880223280000 ffff8800cb4f1a40
[ 240.605068] ffff8800cb4f1a40 ffffffff81f8fb48 0000000000000246 ffff8800cb4f1a40
[ 240.605228] Call Trace:
[ 240.605283] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.605400] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.605510] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605626] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605757] [<ffffffffa0701000>] ? 0xffffffffa0701000
[ 240.605854] [<ffffffff8170835f>] register_pernet_subsys+0x1f/0x50
[ 240.606005] [<ffffffffa0701048>] br_init+0x48/0xd3 [bridge]
[ 240.606112] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.606224] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.606327] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.606433] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.606557] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.606664] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.606773] 1 lock held by modprobe/1387:
[ 240.606845] #0: (net_mutex){+.+.+.}, at: [<ffffffff8170835f>] register_pernet_subsys+0x1f/0x50
[ 240.607114] INFO: task modprobe:1466 blocked for more than 120 seconds.
[ 240.607231] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.607337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.607473] modprobe D ffff88020fbab480 13096 1466 1399 0x00000084
[ 240.607622] ffff88020d1bbbe8 0000000000000096 ffff88020fbab480 00000000001d5f00
[ 240.607791] ffff88020d1bbfd8 00000000001d5f00 ffffffff81e1b580 ffff88020fbab480
[ 240.607949] ffff88020fbab480 ffffffff81f8fb48 0000000000000246 ffff88020fbab480
[ 240.608138] Call Trace:
[ 240.608193] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.608316] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.608425] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608542] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608662] [<ffffffffa071d000>] ? 0xffffffffa071d000
[ 240.608759] [<ffffffff817083ad>] register_pernet_device+0x1d/0x70
[ 240.608881] [<ffffffffa071d020>] ppp_init+0x20/0x1000 [ppp_generic]
[ 240.609021] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.609131] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.609235] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.609339] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.609462] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.609568] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.609677] 1 lock held by modprobe/1466:
[ 240.609749] #0: (net_mutex){+.+.+.}, at: [<ffffffff817083ad>] register_pernet_device+0x1d/0x70

Looks like contention on net_mutex or something, but I honestly have
no idea yet. I can't recreate it myself at the moment or I would
bisect.
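
FWIW the traces are at least consistent with that: everything that shows
up in D state is either holding net_mutex or waiting for it. Roughly --
paraphrasing the 3.17-era net/core/net_namespace.c from memory, so treat
the details as a sketch rather than the literal code -- the three paths
in play are:

/* 1. the netns cleanup worker (kworker/u16:* above), which holds the
 *    mutex while it waits in rcu_barrier(): */
static void cleanup_net(struct work_struct *work)
{
	mutex_lock(&net_mutex);
	/* ops_exit_list() -> default_device_exit_batch() ->
	 * rtnl_unlock() -> netdev_run_todo() -> rcu_barrier() */
	mutex_unlock(&net_mutex);
}

/* 2. anything creating a new netns -- the unshare(CLONE_NEWNET) that
 *    localed and other PrivateNetwork= units do at startup: */
struct net *copy_net_ns(unsigned long flags,
			struct user_namespace *user_ns, struct net *old_net)
{
	mutex_lock(&net_mutex);		/* blocks behind cleanup_net */
	/* setup_net() ... */
	mutex_unlock(&net_mutex);
}

/* 3. module init registering pernet ops -- bridge and ppp_generic in
 *    the traces; register_pernet_subsys() is the same pattern: */
int register_pernet_device(struct pernet_operations *ops)
{
	mutex_lock(&net_mutex);		/* where modprobe blocks */
	/* register_pernet_operations() ... */
	mutex_unlock(&net_mutex);
	return 0;
}

So nothing on that list makes progress until the rcu_barrier() inside
cleanup_net() returns.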

Has nobody else run into this with the pre-3.18 kernels? Fedora isn't
carrying any patches in this area.

josh
Cong Wang
2014-10-22 17:37:53 UTC
Permalink
(Adding Paul and Eric in Cc)

I am not aware of any change in net/core/dev.c related here,
so I guess it's a bug in rcu_barrier().

Thanks.
Post by Josh Boyer
[ 240.599195] INFO: task kworker/u16:5:100 blocked for more than 120 seconds.
[ 240.599338] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.599446] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.599583] kworker/u16:5 D ffff8802202db480 12400 100 2 0x00000000
[ 240.599744] Workqueue: netns cleanup_net
[ 240.599823] ffff8802202eb9e8 0000000000000096 ffff8802202db480
00000000001d5f00
[ 240.600066] ffff8802202ebfd8 00000000001d5f00 ffff8800368c3480
ffff8802202db480
[ 240.600228] ffffffff81ee2690 7fffffffffffffff ffffffff81ee2698
ffffffff81ee2690
[ 240.600445] [<ffffffff8185e239>] schedule+0x29/0x70
[ 240.600541] [<ffffffff8186345c>] schedule_timeout+0x26c/0x410
[ 240.600651] [<ffffffff81865ef7>] ? retint_restore_args+0x13/0x13
[ 240.600765] [<ffffffff818644e4>] ? _raw_spin_unlock_irq+0x34/0x50
[ 240.600879] [<ffffffff8185fc6c>] wait_for_completion+0x10c/0x150
[ 240.601025] [<ffffffff810e53e0>] ? wake_up_state+0x20/0x20
[ 240.601133] [<ffffffff8112a749>] _rcu_barrier+0x159/0x200
[ 240.601237] [<ffffffff8112a845>] rcu_barrier+0x15/0x20
[ 240.601335] [<ffffffff81718ebf>] netdev_run_todo+0x6f/0x310
[ 240.601442] [<ffffffff8170da85>] ? rollback_registered_many+0x265/0x2e0
[ 240.601564] [<ffffffff81725f2e>] rtnl_unlock+0xe/0x10
[ 240.601660] [<ffffffff8170f8e6>] default_device_exit_batch+0x156/0x180
[ 240.601781] [<ffffffff810fd8a0>] ? abort_exclusive_wait+0xb0/0xb0
[ 240.601895] [<ffffffff81707993>] ops_exit_list.isra.1+0x53/0x60
[ 240.602028] [<ffffffff81708540>] cleanup_net+0x100/0x1f0
[ 240.602131] [<ffffffff810ccfa8>] process_one_work+0x218/0x850
[ 240.602241] [<ffffffff810ccf0f>] ? process_one_work+0x17f/0x850
[ 240.602350] [<ffffffff810cd6c7>] ? worker_thread+0xe7/0x4a0
[ 240.602454] [<ffffffff810cd64b>] worker_thread+0x6b/0x4a0
[ 240.602555] [<ffffffff810cd5e0>] ? process_one_work+0x850/0x850
[ 240.602665] [<ffffffff810d399b>] kthread+0x10b/0x130
[ 240.602762] [<ffffffff81028cc9>] ? sched_clock+0x9/0x10
[ 240.602862] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603004] [<ffffffff818651fc>] ret_from_fork+0x7c/0xb0
[ 240.603106] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603304] #0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccf0f>]
process_one_work+0x17f/0x850
[<ffffffff810ccf0f>] process_one_work+0x17f/0x850
[ 240.603691] #2: (net_mutex){+.+.+.}, at: [<ffffffff817084cc>]
cleanup_net+0x8c/0x1f0
[<ffffffff8112a625>] _rcu_barrier+0x35/0x200
[ 240.604211] INFO: task modprobe:1387 blocked for more than 120 seconds.
[ 240.604329] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.604434] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.604570] modprobe D ffff8800cb4f1a40 13112 1387 1386 0x00000080
[ 240.604719] ffff8800cafbbbe8 0000000000000096 ffff8800cb4f1a40
00000000001d5f00
[ 240.604878] ffff8800cafbbfd8 00000000001d5f00 ffff880223280000
ffff8800cb4f1a40
[ 240.605068] ffff8800cb4f1a40 ffffffff81f8fb48 0000000000000246
ffff8800cb4f1a40
[ 240.605283] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.605400] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.605510] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605626] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605757] [<ffffffffa0701000>] ? 0xffffffffa0701000
[ 240.605854] [<ffffffff8170835f>] register_pernet_subsys+0x1f/0x50
[ 240.606005] [<ffffffffa0701048>] br_init+0x48/0xd3 [bridge]
[ 240.606112] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.606224] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.606327] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.606433] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.606557] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.606664] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.606845] #0: (net_mutex){+.+.+.}, at: [<ffffffff8170835f>]
register_pernet_subsys+0x1f/0x50
[ 240.607114] INFO: task modprobe:1466 blocked for more than 120 seconds.
[ 240.607231] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.607337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.607473] modprobe D ffff88020fbab480 13096 1466 1399 0x00000084
[ 240.607622] ffff88020d1bbbe8 0000000000000096 ffff88020fbab480
00000000001d5f00
[ 240.607791] ffff88020d1bbfd8 00000000001d5f00 ffffffff81e1b580
ffff88020fbab480
[ 240.607949] ffff88020fbab480 ffffffff81f8fb48 0000000000000246
ffff88020fbab480
[ 240.608193] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.608316] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.608425] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608542] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608662] [<ffffffffa071d000>] ? 0xffffffffa071d000
[ 240.608759] [<ffffffff817083ad>] register_pernet_device+0x1d/0x70
[ 240.608881] [<ffffffffa071d020>] ppp_init+0x20/0x1000 [ppp_generic]
[ 240.609021] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.609131] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.609235] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.609339] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.609462] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.609568] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.609749] #0: (net_mutex){+.+.+.}, at: [<ffffffff817083ad>]
register_pernet_device+0x1d/0x70
Looks like contention on net_mutex or something, but I honestly have
no idea yet. I can't recreate it myself at the moment or I would
bisect.
Has nobody else run into this with the pre-3.18 kernels? Fedora isn't
carrying any patches in this area.
josh
Josh Boyer
2014-10-22 17:49:38 UTC
Permalink
Post by Cong Wang
(Adding Paul and Eric in Cc)
I am not aware of any change in net/core/dev.c related here,
so I guess it's a bug in rcu_barrier().
Possibly. The person that reported the issue below said it showed up
between Linux v3.17-7872-g5ff0b9e1a1da and Linux
v3.17-8307-gf1d0d14120a8 for them, which is a slightly older window
than the one that Kevin reported. I haven't had a chance to dig
through the commits yet.

josh
Post by Cong Wang
Post by Josh Boyer
[ 240.599195] INFO: task kworker/u16:5:100 blocked for more than 120 seconds.
[ 240.599338] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.599446] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.599583] kworker/u16:5 D ffff8802202db480 12400 100 2 0x00000000
[ 240.599744] Workqueue: netns cleanup_net
[ 240.599823] ffff8802202eb9e8 0000000000000096 ffff8802202db480
00000000001d5f00
[ 240.600066] ffff8802202ebfd8 00000000001d5f00 ffff8800368c3480
ffff8802202db480
[ 240.600228] ffffffff81ee2690 7fffffffffffffff ffffffff81ee2698
ffffffff81ee2690
[ 240.600445] [<ffffffff8185e239>] schedule+0x29/0x70
[ 240.600541] [<ffffffff8186345c>] schedule_timeout+0x26c/0x410
[ 240.600651] [<ffffffff81865ef7>] ? retint_restore_args+0x13/0x13
[ 240.600765] [<ffffffff818644e4>] ? _raw_spin_unlock_irq+0x34/0x50
[ 240.600879] [<ffffffff8185fc6c>] wait_for_completion+0x10c/0x150
[ 240.601025] [<ffffffff810e53e0>] ? wake_up_state+0x20/0x20
[ 240.601133] [<ffffffff8112a749>] _rcu_barrier+0x159/0x200
[ 240.601237] [<ffffffff8112a845>] rcu_barrier+0x15/0x20
[ 240.601335] [<ffffffff81718ebf>] netdev_run_todo+0x6f/0x310
[ 240.601442] [<ffffffff8170da85>] ? rollback_registered_many+0x265/0x2e0
[ 240.601564] [<ffffffff81725f2e>] rtnl_unlock+0xe/0x10
[ 240.601660] [<ffffffff8170f8e6>] default_device_exit_batch+0x156/0x180
[ 240.601781] [<ffffffff810fd8a0>] ? abort_exclusive_wait+0xb0/0xb0
[ 240.601895] [<ffffffff81707993>] ops_exit_list.isra.1+0x53/0x60
[ 240.602028] [<ffffffff81708540>] cleanup_net+0x100/0x1f0
[ 240.602131] [<ffffffff810ccfa8>] process_one_work+0x218/0x850
[ 240.602241] [<ffffffff810ccf0f>] ? process_one_work+0x17f/0x850
[ 240.602350] [<ffffffff810cd6c7>] ? worker_thread+0xe7/0x4a0
[ 240.602454] [<ffffffff810cd64b>] worker_thread+0x6b/0x4a0
[ 240.602555] [<ffffffff810cd5e0>] ? process_one_work+0x850/0x850
[ 240.602665] [<ffffffff810d399b>] kthread+0x10b/0x130
[ 240.602762] [<ffffffff81028cc9>] ? sched_clock+0x9/0x10
[ 240.602862] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603004] [<ffffffff818651fc>] ret_from_fork+0x7c/0xb0
[ 240.603106] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603304] #0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccf0f>]
process_one_work+0x17f/0x850
[<ffffffff810ccf0f>] process_one_work+0x17f/0x850
[ 240.603691] #2: (net_mutex){+.+.+.}, at: [<ffffffff817084cc>]
cleanup_net+0x8c/0x1f0
[<ffffffff8112a625>] _rcu_barrier+0x35/0x200
[ 240.604211] INFO: task modprobe:1387 blocked for more than 120 seconds.
[ 240.604329] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.604434] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.604570] modprobe D ffff8800cb4f1a40 13112 1387 1386 0x00000080
[ 240.604719] ffff8800cafbbbe8 0000000000000096 ffff8800cb4f1a40
00000000001d5f00
[ 240.604878] ffff8800cafbbfd8 00000000001d5f00 ffff880223280000
ffff8800cb4f1a40
[ 240.605068] ffff8800cb4f1a40 ffffffff81f8fb48 0000000000000246
ffff8800cb4f1a40
[ 240.605283] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.605400] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.605510] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605626] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605757] [<ffffffffa0701000>] ? 0xffffffffa0701000
[ 240.605854] [<ffffffff8170835f>] register_pernet_subsys+0x1f/0x50
[ 240.606005] [<ffffffffa0701048>] br_init+0x48/0xd3 [bridge]
[ 240.606112] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.606224] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.606327] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.606433] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.606557] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.606664] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.606845] #0: (net_mutex){+.+.+.}, at: [<ffffffff8170835f>]
register_pernet_subsys+0x1f/0x50
[ 240.607114] INFO: task modprobe:1466 blocked for more than 120 seconds.
[ 240.607231] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.607337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.607473] modprobe D ffff88020fbab480 13096 1466 1399 0x00000084
[ 240.607622] ffff88020d1bbbe8 0000000000000096 ffff88020fbab480
00000000001d5f00
[ 240.607791] ffff88020d1bbfd8 00000000001d5f00 ffffffff81e1b580
ffff88020fbab480
[ 240.607949] ffff88020fbab480 ffffffff81f8fb48 0000000000000246
ffff88020fbab480
[ 240.608193] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.608316] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.608425] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608542] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608662] [<ffffffffa071d000>] ? 0xffffffffa071d000
[ 240.608759] [<ffffffff817083ad>] register_pernet_device+0x1d/0x70
[ 240.608881] [<ffffffffa071d020>] ppp_init+0x20/0x1000 [ppp_generic]
[ 240.609021] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.609131] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.609235] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.609339] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.609462] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.609568] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.609749] #0: (net_mutex){+.+.+.}, at: [<ffffffff817083ad>]
register_pernet_device+0x1d/0x70
Looks like contention on net_mutex or something, but I honestly have
no idea yet. I can't recreate it myself at the moment or I would
bisect.
Has nobody else run into this with the pre-3.18 kernels? Fedora isn't
carrying any patches in this area.
josh
Eric W. Biederman
2014-10-22 17:53:24 UTC
Permalink
Post by Cong Wang
(Adding Paul and Eric in Cc)
Post by Josh Boyer
[ 240.599195] INFO: task kworker/u16:5:100 blocked for more than 120 seconds.
[ 240.599338] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.599446] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.599583] kworker/u16:5 D ffff8802202db480 12400 100 2 0x00000000
[ 240.599744] Workqueue: netns cleanup_net
[ 240.599823] ffff8802202eb9e8 0000000000000096 ffff8802202db480
00000000001d5f00
[ 240.600066] ffff8802202ebfd8 00000000001d5f00 ffff8800368c3480
ffff8802202db480
[ 240.600228] ffffffff81ee2690 7fffffffffffffff ffffffff81ee2698
ffffffff81ee2690
[ 240.600445] [<ffffffff8185e239>] schedule+0x29/0x70
[ 240.600541] [<ffffffff8186345c>] schedule_timeout+0x26c/0x410
[ 240.600651] [<ffffffff81865ef7>] ? retint_restore_args+0x13/0x13
[ 240.600765] [<ffffffff818644e4>] ? _raw_spin_unlock_irq+0x34/0x50
[ 240.600879] [<ffffffff8185fc6c>] wait_for_completion+0x10c/0x150
[ 240.601025] [<ffffffff810e53e0>] ? wake_up_state+0x20/0x20
[ 240.601133] [<ffffffff8112a749>] _rcu_barrier+0x159/0x200
[ 240.601237] [<ffffffff8112a845>] rcu_barrier+0x15/0x20
[ 240.601335] [<ffffffff81718ebf>] netdev_run_todo+0x6f/0x310
[ 240.601442] [<ffffffff8170da85>] ? rollback_registered_many+0x265/0x2e0
[ 240.601564] [<ffffffff81725f2e>] rtnl_unlock+0xe/0x10
[ 240.601660] [<ffffffff8170f8e6>] default_device_exit_batch+0x156/0x180
[ 240.601781] [<ffffffff810fd8a0>] ? abort_exclusive_wait+0xb0/0xb0
[ 240.601895] [<ffffffff81707993>] ops_exit_list.isra.1+0x53/0x60
[ 240.602028] [<ffffffff81708540>] cleanup_net+0x100/0x1f0
[ 240.602131] [<ffffffff810ccfa8>] process_one_work+0x218/0x850
[ 240.602241] [<ffffffff810ccf0f>] ? process_one_work+0x17f/0x850
[ 240.602350] [<ffffffff810cd6c7>] ? worker_thread+0xe7/0x4a0
[ 240.602454] [<ffffffff810cd64b>] worker_thread+0x6b/0x4a0
[ 240.602555] [<ffffffff810cd5e0>] ? process_one_work+0x850/0x850
[ 240.602665] [<ffffffff810d399b>] kthread+0x10b/0x130
[ 240.602762] [<ffffffff81028cc9>] ? sched_clock+0x9/0x10
[ 240.602862] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603004] [<ffffffff818651fc>] ret_from_fork+0x7c/0xb0
[ 240.603106] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603304] #0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccf0f>]
process_one_work+0x17f/0x850
[<ffffffff810ccf0f>] process_one_work+0x17f/0x850
[ 240.603691] #2: (net_mutex){+.+.+.}, at: [<ffffffff817084cc>]
cleanup_net+0x8c/0x1f0
[<ffffffff8112a625>] _rcu_barrier+0x35/0x200
[ 240.604211] INFO: task modprobe:1387 blocked for more than 120 seconds.
[ 240.604329] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.604434] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.604570] modprobe D ffff8800cb4f1a40 13112 1387 1386 0x00000080
[ 240.604719] ffff8800cafbbbe8 0000000000000096 ffff8800cb4f1a40
00000000001d5f00
[ 240.604878] ffff8800cafbbfd8 00000000001d5f00 ffff880223280000
ffff8800cb4f1a40
[ 240.605068] ffff8800cb4f1a40 ffffffff81f8fb48 0000000000000246
ffff8800cb4f1a40
[ 240.605283] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.605400] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.605510] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605626] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605757] [<ffffffffa0701000>] ? 0xffffffffa0701000
[ 240.605854] [<ffffffff8170835f>] register_pernet_subsys+0x1f/0x50
[ 240.606005] [<ffffffffa0701048>] br_init+0x48/0xd3 [bridge]
[ 240.606112] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.606224] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.606327] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.606433] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.606557] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.606664] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.606845] #0: (net_mutex){+.+.+.}, at: [<ffffffff8170835f>]
register_pernet_subsys+0x1f/0x50
[ 240.607114] INFO: task modprobe:1466 blocked for more than 120 seconds.
[ 240.607231] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.607337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.607473] modprobe D ffff88020fbab480 13096 1466 1399 0x00000084
[ 240.607622] ffff88020d1bbbe8 0000000000000096 ffff88020fbab480
00000000001d5f00
[ 240.607791] ffff88020d1bbfd8 00000000001d5f00 ffffffff81e1b580
ffff88020fbab480
[ 240.607949] ffff88020fbab480 ffffffff81f8fb48 0000000000000246
ffff88020fbab480
[ 240.608193] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.608316] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.608425] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608542] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608662] [<ffffffffa071d000>] ? 0xffffffffa071d000
[ 240.608759] [<ffffffff817083ad>] register_pernet_device+0x1d/0x70
[ 240.608881] [<ffffffffa071d020>] ppp_init+0x20/0x1000 [ppp_generic]
[ 240.609021] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.609131] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.609235] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.609339] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.609462] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.609568] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.609749] #0: (net_mutex){+.+.+.}, at: [<ffffffff817083ad>]
register_pernet_device+0x1d/0x70
Looks like contention on net_mutex or something, but I honestly have
no idea yet. I can't recreate it myself at the moment or I would
bisect.
Has nobody else run into this with the pre-3.18 kernels? Fedora isn't
carrying any patches in this area.
I am not aware of any change in net/core/dev.c related here,
so I guess it's a bug in rcu_barrier().
From the limited trace data I see in this email I have to agree.
It looks like for some reason rcu_barrier is taking forever
while the rtnl_lock is held in cleanup_net. Because the
rtnl_lock is held modprobe of the ppp driver is getting stuck.

Is it possible we have an AB BA deadlock between the rtnl_lock
and rcu, with something the module loading code assumes?

Eric
Paul E. McKenney
2014-10-22 18:11:35 UTC
Permalink
Post by Eric W. Biederman
Post by Cong Wang
(Adding Paul and Eric in Cc)
Post by Josh Boyer
[ 240.599195] INFO: task kworker/u16:5:100 blocked for more than 120 seconds.
[ 240.599338] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.599446] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.599583] kworker/u16:5 D ffff8802202db480 12400 100 2 0x00000000
[ 240.599744] Workqueue: netns cleanup_net
[ 240.599823] ffff8802202eb9e8 0000000000000096 ffff8802202db480
00000000001d5f00
[ 240.600066] ffff8802202ebfd8 00000000001d5f00 ffff8800368c3480
ffff8802202db480
[ 240.600228] ffffffff81ee2690 7fffffffffffffff ffffffff81ee2698
ffffffff81ee2690
[ 240.600445] [<ffffffff8185e239>] schedule+0x29/0x70
[ 240.600541] [<ffffffff8186345c>] schedule_timeout+0x26c/0x410
[ 240.600651] [<ffffffff81865ef7>] ? retint_restore_args+0x13/0x13
[ 240.600765] [<ffffffff818644e4>] ? _raw_spin_unlock_irq+0x34/0x50
[ 240.600879] [<ffffffff8185fc6c>] wait_for_completion+0x10c/0x150
[ 240.601025] [<ffffffff810e53e0>] ? wake_up_state+0x20/0x20
[ 240.601133] [<ffffffff8112a749>] _rcu_barrier+0x159/0x200
[ 240.601237] [<ffffffff8112a845>] rcu_barrier+0x15/0x20
[ 240.601335] [<ffffffff81718ebf>] netdev_run_todo+0x6f/0x310
[ 240.601442] [<ffffffff8170da85>] ? rollback_registered_many+0x265/0x2e0
[ 240.601564] [<ffffffff81725f2e>] rtnl_unlock+0xe/0x10
[ 240.601660] [<ffffffff8170f8e6>] default_device_exit_batch+0x156/0x180
[ 240.601781] [<ffffffff810fd8a0>] ? abort_exclusive_wait+0xb0/0xb0
[ 240.601895] [<ffffffff81707993>] ops_exit_list.isra.1+0x53/0x60
[ 240.602028] [<ffffffff81708540>] cleanup_net+0x100/0x1f0
[ 240.602131] [<ffffffff810ccfa8>] process_one_work+0x218/0x850
[ 240.602241] [<ffffffff810ccf0f>] ? process_one_work+0x17f/0x850
[ 240.602350] [<ffffffff810cd6c7>] ? worker_thread+0xe7/0x4a0
[ 240.602454] [<ffffffff810cd64b>] worker_thread+0x6b/0x4a0
[ 240.602555] [<ffffffff810cd5e0>] ? process_one_work+0x850/0x850
[ 240.602665] [<ffffffff810d399b>] kthread+0x10b/0x130
[ 240.602762] [<ffffffff81028cc9>] ? sched_clock+0x9/0x10
[ 240.602862] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603004] [<ffffffff818651fc>] ret_from_fork+0x7c/0xb0
[ 240.603106] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603304] #0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccf0f>]
process_one_work+0x17f/0x850
[<ffffffff810ccf0f>] process_one_work+0x17f/0x850
[ 240.603691] #2: (net_mutex){+.+.+.}, at: [<ffffffff817084cc>]
cleanup_net+0x8c/0x1f0
[<ffffffff8112a625>] _rcu_barrier+0x35/0x200
[ 240.604211] INFO: task modprobe:1387 blocked for more than 120 seconds.
[ 240.604329] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.604434] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.604570] modprobe D ffff8800cb4f1a40 13112 1387 1386 0x00000080
[ 240.604719] ffff8800cafbbbe8 0000000000000096 ffff8800cb4f1a40
00000000001d5f00
[ 240.604878] ffff8800cafbbfd8 00000000001d5f00 ffff880223280000
ffff8800cb4f1a40
[ 240.605068] ffff8800cb4f1a40 ffffffff81f8fb48 0000000000000246
ffff8800cb4f1a40
[ 240.605283] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.605400] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.605510] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605626] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605757] [<ffffffffa0701000>] ? 0xffffffffa0701000
[ 240.605854] [<ffffffff8170835f>] register_pernet_subsys+0x1f/0x50
[ 240.606005] [<ffffffffa0701048>] br_init+0x48/0xd3 [bridge]
[ 240.606112] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.606224] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.606327] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.606433] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.606557] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.606664] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.606845] #0: (net_mutex){+.+.+.}, at: [<ffffffff8170835f>]
register_pernet_subsys+0x1f/0x50
[ 240.607114] INFO: task modprobe:1466 blocked for more than 120 seconds.
[ 240.607231] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.607337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.607473] modprobe D ffff88020fbab480 13096 1466 1399 0x00000084
[ 240.607622] ffff88020d1bbbe8 0000000000000096 ffff88020fbab480
00000000001d5f00
[ 240.607791] ffff88020d1bbfd8 00000000001d5f00 ffffffff81e1b580
ffff88020fbab480
[ 240.607949] ffff88020fbab480 ffffffff81f8fb48 0000000000000246
ffff88020fbab480
[ 240.608193] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.608316] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.608425] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608542] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608662] [<ffffffffa071d000>] ? 0xffffffffa071d000
[ 240.608759] [<ffffffff817083ad>] register_pernet_device+0x1d/0x70
[ 240.608881] [<ffffffffa071d020>] ppp_init+0x20/0x1000 [ppp_generic]
[ 240.609021] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.609131] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.609235] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.609339] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.609462] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.609568] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.609749] #0: (net_mutex){+.+.+.}, at: [<ffffffff817083ad>]
register_pernet_device+0x1d/0x70
Looks like contention on net_mutex or something, but I honestly have
no idea yet. I can't recreate it myself at the moment or I would
bisect.
Has nobody else run into this with the pre-3.18 kernels? Fedora isn't
carrying any patches in this area.
I am not aware of any change in net/core/dev.c related here,
so I guess it's a bug in rcu_barrier().
From the limited trace data I see in this email I have to agree.
It looks like for some reason rcu_barrier is taking forever
while the rtnl_lock is held in cleanup_net. Because the
rtnl_lock is held modprobe of the ppp driver is getting stuck.
Is it possible we have an AB BA deadlock between the rtnl_lock
and rcu. With something the module loading code assumes?
I am not aware of RCU ever acquiring rtnl_lock, not directly, anyway.

Thanx, Paul
Eric W. Biederman
2014-10-22 18:25:37 UTC
Permalink
Post by Paul E. McKenney
Post by Eric W. Biederman
Post by Cong Wang
(Adding Paul and Eric in Cc)
Post by Josh Boyer
[ 240.599195] INFO: task kworker/u16:5:100 blocked for more than 120 seconds.
[ 240.599338] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.599446] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.599583] kworker/u16:5 D ffff8802202db480 12400 100 2 0x00000000
[ 240.599744] Workqueue: netns cleanup_net
[ 240.599823] ffff8802202eb9e8 0000000000000096 ffff8802202db480
00000000001d5f00
[ 240.600066] ffff8802202ebfd8 00000000001d5f00 ffff8800368c3480
ffff8802202db480
[ 240.600228] ffffffff81ee2690 7fffffffffffffff ffffffff81ee2698
ffffffff81ee2690
[ 240.600445] [<ffffffff8185e239>] schedule+0x29/0x70
[ 240.600541] [<ffffffff8186345c>] schedule_timeout+0x26c/0x410
[ 240.600651] [<ffffffff81865ef7>] ? retint_restore_args+0x13/0x13
[ 240.600765] [<ffffffff818644e4>] ? _raw_spin_unlock_irq+0x34/0x50
[ 240.600879] [<ffffffff8185fc6c>] wait_for_completion+0x10c/0x150
[ 240.601025] [<ffffffff810e53e0>] ? wake_up_state+0x20/0x20
[ 240.601133] [<ffffffff8112a749>] _rcu_barrier+0x159/0x200
[ 240.601237] [<ffffffff8112a845>] rcu_barrier+0x15/0x20
[ 240.601335] [<ffffffff81718ebf>] netdev_run_todo+0x6f/0x310
[ 240.601442] [<ffffffff8170da85>] ? rollback_registered_many+0x265/0x2e0
[ 240.601564] [<ffffffff81725f2e>] rtnl_unlock+0xe/0x10
[ 240.601660] [<ffffffff8170f8e6>] default_device_exit_batch+0x156/0x180
[ 240.601781] [<ffffffff810fd8a0>] ? abort_exclusive_wait+0xb0/0xb0
[ 240.601895] [<ffffffff81707993>] ops_exit_list.isra.1+0x53/0x60
[ 240.602028] [<ffffffff81708540>] cleanup_net+0x100/0x1f0
[ 240.602131] [<ffffffff810ccfa8>] process_one_work+0x218/0x850
[ 240.602241] [<ffffffff810ccf0f>] ? process_one_work+0x17f/0x850
[ 240.602350] [<ffffffff810cd6c7>] ? worker_thread+0xe7/0x4a0
[ 240.602454] [<ffffffff810cd64b>] worker_thread+0x6b/0x4a0
[ 240.602555] [<ffffffff810cd5e0>] ? process_one_work+0x850/0x850
[ 240.602665] [<ffffffff810d399b>] kthread+0x10b/0x130
[ 240.602762] [<ffffffff81028cc9>] ? sched_clock+0x9/0x10
[ 240.602862] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603004] [<ffffffff818651fc>] ret_from_fork+0x7c/0xb0
[ 240.603106] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603304] #0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccf0f>]
process_one_work+0x17f/0x850
[<ffffffff810ccf0f>] process_one_work+0x17f/0x850
[ 240.603691] #2: (net_mutex){+.+.+.}, at: [<ffffffff817084cc>]
cleanup_net+0x8c/0x1f0
[<ffffffff8112a625>] _rcu_barrier+0x35/0x200
[ 240.604211] INFO: task modprobe:1387 blocked for more than 120 seconds.
[ 240.604329] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.604434] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.604570] modprobe D ffff8800cb4f1a40 13112 1387 1386 0x00000080
[ 240.604719] ffff8800cafbbbe8 0000000000000096 ffff8800cb4f1a40
00000000001d5f00
[ 240.604878] ffff8800cafbbfd8 00000000001d5f00 ffff880223280000
ffff8800cb4f1a40
[ 240.605068] ffff8800cb4f1a40 ffffffff81f8fb48 0000000000000246
ffff8800cb4f1a40
[ 240.605283] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.605400] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.605510] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605626] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605757] [<ffffffffa0701000>] ? 0xffffffffa0701000
[ 240.605854] [<ffffffff8170835f>] register_pernet_subsys+0x1f/0x50
[ 240.606005] [<ffffffffa0701048>] br_init+0x48/0xd3 [bridge]
[ 240.606112] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.606224] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.606327] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.606433] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.606557] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.606664] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.606845] #0: (net_mutex){+.+.+.}, at: [<ffffffff8170835f>]
register_pernet_subsys+0x1f/0x50
[ 240.607114] INFO: task modprobe:1466 blocked for more than 120 seconds.
[ 240.607231] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.607337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.607473] modprobe D ffff88020fbab480 13096 1466 1399 0x00000084
[ 240.607622] ffff88020d1bbbe8 0000000000000096 ffff88020fbab480
00000000001d5f00
[ 240.607791] ffff88020d1bbfd8 00000000001d5f00 ffffffff81e1b580
ffff88020fbab480
[ 240.607949] ffff88020fbab480 ffffffff81f8fb48 0000000000000246
ffff88020fbab480
[ 240.608193] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.608316] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.608425] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608542] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608662] [<ffffffffa071d000>] ? 0xffffffffa071d000
[ 240.608759] [<ffffffff817083ad>] register_pernet_device+0x1d/0x70
[ 240.608881] [<ffffffffa071d020>] ppp_init+0x20/0x1000 [ppp_generic]
[ 240.609021] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.609131] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.609235] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.609339] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.609462] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.609568] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.609749] #0: (net_mutex){+.+.+.}, at: [<ffffffff817083ad>]
register_pernet_device+0x1d/0x70
Looks like contention on net_mutex or something, but I honestly have
no idea yet. I can't recreate it myself at the moment or I would
bisect.
Has nobody else run into this with the pre-3.18 kernels? Fedora isn't
carrying any patches in this area.
I am not aware of any change in net/core/dev.c related here,
so I guess it's a bug in rcu_barrier().
From the limited trace data I see in this email I have to agree.
It looks like for some reason rcu_barrier is taking forever
while the rtnl_lock is held in cleanup_net. Because the
rtnl_lock is held modprobe of the ppp driver is getting stuck.
Is it possible we have an AB BA deadlock between the rtnl_lock
and rcu. With something the module loading code assumes?
I am not aware of RCU ever acquiring rtnl_lock, not directly, anyway.
Does the module loading code do something strange with rcu? Perhaps
blocking an rcu grace period until the module loading completes?

If the module loading somehow blocks an rcu grace period that would
create an AB deadlock because loading the ppp module grabs the
rtnl_lock. And elsewhere we have the rtnl_lock waiting for an rcu grace
period.

I would think trying and failing to get the rtnl_lock would sleep and
thus let any rcu grace period happen but shrug.

It looks like something is holding up the rcu grace period, and causing
this. Although it is possible that something is causing cleanup_net
to run slowly and we are just seeing that slowness show up in
rcu_barrier as that is one of the slower bits. With a single trace I
can't definitely say that the rcu barrier is getting stuck but it
certainly looks that way.

Eric
Paul E. McKenney
2014-10-22 18:55:11 UTC
Permalink
Post by Eric W. Biederman
Post by Paul E. McKenney
Post by Eric W. Biederman
Post by Cong Wang
(Adding Paul and Eric in Cc)
Post by Josh Boyer
[ 240.599195] INFO: task kworker/u16:5:100 blocked for more than 120 seconds.
[ 240.599338] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.599446] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.599583] kworker/u16:5 D ffff8802202db480 12400 100 2 0x00000000
[ 240.599744] Workqueue: netns cleanup_net
[ 240.599823] ffff8802202eb9e8 0000000000000096 ffff8802202db480
00000000001d5f00
[ 240.600066] ffff8802202ebfd8 00000000001d5f00 ffff8800368c3480
ffff8802202db480
[ 240.600228] ffffffff81ee2690 7fffffffffffffff ffffffff81ee2698
ffffffff81ee2690
[ 240.600445] [<ffffffff8185e239>] schedule+0x29/0x70
[ 240.600541] [<ffffffff8186345c>] schedule_timeout+0x26c/0x410
[ 240.600651] [<ffffffff81865ef7>] ? retint_restore_args+0x13/0x13
[ 240.600765] [<ffffffff818644e4>] ? _raw_spin_unlock_irq+0x34/0x50
[ 240.600879] [<ffffffff8185fc6c>] wait_for_completion+0x10c/0x150
[ 240.601025] [<ffffffff810e53e0>] ? wake_up_state+0x20/0x20
[ 240.601133] [<ffffffff8112a749>] _rcu_barrier+0x159/0x200
[ 240.601237] [<ffffffff8112a845>] rcu_barrier+0x15/0x20
[ 240.601335] [<ffffffff81718ebf>] netdev_run_todo+0x6f/0x310
[ 240.601442] [<ffffffff8170da85>] ? rollback_registered_many+0x265/0x2e0
[ 240.601564] [<ffffffff81725f2e>] rtnl_unlock+0xe/0x10
[ 240.601660] [<ffffffff8170f8e6>] default_device_exit_batch+0x156/0x180
[ 240.601781] [<ffffffff810fd8a0>] ? abort_exclusive_wait+0xb0/0xb0
[ 240.601895] [<ffffffff81707993>] ops_exit_list.isra.1+0x53/0x60
[ 240.602028] [<ffffffff81708540>] cleanup_net+0x100/0x1f0
[ 240.602131] [<ffffffff810ccfa8>] process_one_work+0x218/0x850
[ 240.602241] [<ffffffff810ccf0f>] ? process_one_work+0x17f/0x850
[ 240.602350] [<ffffffff810cd6c7>] ? worker_thread+0xe7/0x4a0
[ 240.602454] [<ffffffff810cd64b>] worker_thread+0x6b/0x4a0
[ 240.602555] [<ffffffff810cd5e0>] ? process_one_work+0x850/0x850
[ 240.602665] [<ffffffff810d399b>] kthread+0x10b/0x130
[ 240.602762] [<ffffffff81028cc9>] ? sched_clock+0x9/0x10
[ 240.602862] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603004] [<ffffffff818651fc>] ret_from_fork+0x7c/0xb0
[ 240.603106] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603304] #0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccf0f>]
process_one_work+0x17f/0x850
[<ffffffff810ccf0f>] process_one_work+0x17f/0x850
[ 240.603691] #2: (net_mutex){+.+.+.}, at: [<ffffffff817084cc>]
cleanup_net+0x8c/0x1f0
[<ffffffff8112a625>] _rcu_barrier+0x35/0x200
[ 240.604211] INFO: task modprobe:1387 blocked for more than 120 seconds.
[ 240.604329] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.604434] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.604570] modprobe D ffff8800cb4f1a40 13112 1387 1386 0x00000080
[ 240.604719] ffff8800cafbbbe8 0000000000000096 ffff8800cb4f1a40
00000000001d5f00
[ 240.604878] ffff8800cafbbfd8 00000000001d5f00 ffff880223280000
ffff8800cb4f1a40
[ 240.605068] ffff8800cb4f1a40 ffffffff81f8fb48 0000000000000246
ffff8800cb4f1a40
[ 240.605283] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.605400] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.605510] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605626] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605757] [<ffffffffa0701000>] ? 0xffffffffa0701000
[ 240.605854] [<ffffffff8170835f>] register_pernet_subsys+0x1f/0x50
[ 240.606005] [<ffffffffa0701048>] br_init+0x48/0xd3 [bridge]
[ 240.606112] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.606224] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.606327] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.606433] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.606557] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.606664] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.606845] #0: (net_mutex){+.+.+.}, at: [<ffffffff8170835f>]
register_pernet_subsys+0x1f/0x50
[ 240.607114] INFO: task modprobe:1466 blocked for more than 120 seconds.
[ 240.607231] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.607337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.607473] modprobe D ffff88020fbab480 13096 1466 1399 0x00000084
[ 240.607622] ffff88020d1bbbe8 0000000000000096 ffff88020fbab480
00000000001d5f00
[ 240.607791] ffff88020d1bbfd8 00000000001d5f00 ffffffff81e1b580
ffff88020fbab480
[ 240.607949] ffff88020fbab480 ffffffff81f8fb48 0000000000000246
ffff88020fbab480
[ 240.608193] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.608316] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.608425] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608542] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608662] [<ffffffffa071d000>] ? 0xffffffffa071d000
[ 240.608759] [<ffffffff817083ad>] register_pernet_device+0x1d/0x70
[ 240.608881] [<ffffffffa071d020>] ppp_init+0x20/0x1000 [ppp_generic]
[ 240.609021] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.609131] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.609235] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.609339] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.609462] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.609568] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.609749] #0: (net_mutex){+.+.+.}, at: [<ffffffff817083ad>]
register_pernet_device+0x1d/0x70
Looks like contention on net_mutex or something, but I honestly have
no idea yet. I can't recreate it myself at the moment or I would
bisect.
Has nobody else run into this with the pre-3.18 kernels? Fedora isn't
carrying any patches in this area.
I am not aware of any change in net/core/dev.c related here,
so I guess it's a bug in rcu_barrier().
From the limited trace data I see in this email I have to agree.
It looks like for some reason rcu_barrier is taking forever
while the rtnl_lock is held in cleanup_net. Because the
rtnl_lock is held modprobe of the ppp driver is getting stuck.
Is it possible we have an AB BA deadlock between the rtnl_lock
and rcu. With something the module loading code assumes?
I am not aware of RCU ever acquiring rtnl_lock, not directly, anyway.
Does the module loading code do something strange with rcu? Perhaps
blocking an rcu grace period until the module loading completes?
If the module loading somehow blocks an rcu grace period that would
create an AB deadlock because loading the ppp module grabs the
rtnl_lock. And elsewhere we have the rtnl_lock waiting for an rcu grace
period.
I would think trying and failing to get the rtnl_lock would sleep and
thus let any rcu grace period happen but shrug.
It looks like something is holding up the rcu grace period, and causing
this. Although it is possible that something is causing cleanup_net
to run slowly and we are just seeing that slowness show up in
rcu_barrier as that is one of the slower bits. With a single trace I
can't definitely say that the rcu barrier is getting stuck but it
certainly looks that way.
Don't get me wrong -- the fact that this kthread appears to have
blocked within rcu_barrier() for 120 seconds means that something is
most definitely wrong here. I am surprised that there are no RCU CPU
stall warnings, but perhaps the blockage is in the callback execution
rather than grace-period completion. Or something is preventing this
kthread from starting up after the wake-up callback executes. Or...
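
For anyone following along, the distinction matters because rcu_barrier()
waits for callback *invocation*, not just for a grace period: it returns
only once every callback previously queued with call_rcu() has actually
run. A sketch of the usual idiom it exists for (my_entry/my_free are
made-up names, purely for illustration):

#include <linux/kernel.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct my_entry {
	struct rcu_head rcu;
	int payload;
};

static void my_free(struct rcu_head *head)
{
	kfree(container_of(head, struct my_entry, rcu));
}

static void my_remove(struct my_entry *e)
{
	/* free the entry once a grace period has elapsed */
	call_rcu(&e->rcu, my_free);
}

static void my_teardown(void)
{
	/*
	 * Returns only after every my_free() queued above has been
	 * invoked, so it needs both grace-period completion and
	 * callback execution to make progress -- a stall in either
	 * half shows up as a hang like the one in cleanup_net() above.
	 */
	rcu_barrier();
}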

Is this thing reproducible?

Thanx, Paul
Josh Boyer
2014-10-22 19:33:35 UTC
Permalink
On Wed, Oct 22, 2014 at 2:55 PM, Paul E. McKenney
Post by Paul E. McKenney
Post by Eric W. Biederman
Post by Paul E. McKenney
Post by Eric W. Biederman
Post by Cong Wang
(Adding Paul and Eric in Cc)
Post by Josh Boyer
[ 240.599195] INFO: task kworker/u16:5:100 blocked for more than 120 seconds.
[ 240.599338] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.599446] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.599583] kworker/u16:5 D ffff8802202db480 12400 100 2 0x00000000
[ 240.599744] Workqueue: netns cleanup_net
[ 240.599823] ffff8802202eb9e8 0000000000000096 ffff8802202db480
00000000001d5f00
[ 240.600066] ffff8802202ebfd8 00000000001d5f00 ffff8800368c3480
ffff8802202db480
[ 240.600228] ffffffff81ee2690 7fffffffffffffff ffffffff81ee2698
ffffffff81ee2690
[ 240.600445] [<ffffffff8185e239>] schedule+0x29/0x70
[ 240.600541] [<ffffffff8186345c>] schedule_timeout+0x26c/0x410
[ 240.600651] [<ffffffff81865ef7>] ? retint_restore_args+0x13/0x13
[ 240.600765] [<ffffffff818644e4>] ? _raw_spin_unlock_irq+0x34/0x50
[ 240.600879] [<ffffffff8185fc6c>] wait_for_completion+0x10c/0x150
[ 240.601025] [<ffffffff810e53e0>] ? wake_up_state+0x20/0x20
[ 240.601133] [<ffffffff8112a749>] _rcu_barrier+0x159/0x200
[ 240.601237] [<ffffffff8112a845>] rcu_barrier+0x15/0x20
[ 240.601335] [<ffffffff81718ebf>] netdev_run_todo+0x6f/0x310
[ 240.601442] [<ffffffff8170da85>] ? rollback_registered_many+0x265/0x2e0
[ 240.601564] [<ffffffff81725f2e>] rtnl_unlock+0xe/0x10
[ 240.601660] [<ffffffff8170f8e6>] default_device_exit_batch+0x156/0x180
[ 240.601781] [<ffffffff810fd8a0>] ? abort_exclusive_wait+0xb0/0xb0
[ 240.601895] [<ffffffff81707993>] ops_exit_list.isra.1+0x53/0x60
[ 240.602028] [<ffffffff81708540>] cleanup_net+0x100/0x1f0
[ 240.602131] [<ffffffff810ccfa8>] process_one_work+0x218/0x850
[ 240.602241] [<ffffffff810ccf0f>] ? process_one_work+0x17f/0x850
[ 240.602350] [<ffffffff810cd6c7>] ? worker_thread+0xe7/0x4a0
[ 240.602454] [<ffffffff810cd64b>] worker_thread+0x6b/0x4a0
[ 240.602555] [<ffffffff810cd5e0>] ? process_one_work+0x850/0x850
[ 240.602665] [<ffffffff810d399b>] kthread+0x10b/0x130
[ 240.602762] [<ffffffff81028cc9>] ? sched_clock+0x9/0x10
[ 240.602862] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603004] [<ffffffff818651fc>] ret_from_fork+0x7c/0xb0
[ 240.603106] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603304] #0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccf0f>]
process_one_work+0x17f/0x850
[<ffffffff810ccf0f>] process_one_work+0x17f/0x850
[ 240.603691] #2: (net_mutex){+.+.+.}, at: [<ffffffff817084cc>]
cleanup_net+0x8c/0x1f0
[<ffffffff8112a625>] _rcu_barrier+0x35/0x200
[ 240.604211] INFO: task modprobe:1387 blocked for more than 120 seconds.
[ 240.604329] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.604434] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.604570] modprobe D ffff8800cb4f1a40 13112 1387 1386 0x00000080
[ 240.604719] ffff8800cafbbbe8 0000000000000096 ffff8800cb4f1a40
00000000001d5f00
[ 240.604878] ffff8800cafbbfd8 00000000001d5f00 ffff880223280000
ffff8800cb4f1a40
[ 240.605068] ffff8800cb4f1a40 ffffffff81f8fb48 0000000000000246
ffff8800cb4f1a40
[ 240.605283] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.605400] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.605510] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605626] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605757] [<ffffffffa0701000>] ? 0xffffffffa0701000
[ 240.605854] [<ffffffff8170835f>] register_pernet_subsys+0x1f/0x50
[ 240.606005] [<ffffffffa0701048>] br_init+0x48/0xd3 [bridge]
[ 240.606112] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.606224] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.606327] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.606433] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.606557] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.606664] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.606845] #0: (net_mutex){+.+.+.}, at: [<ffffffff8170835f>]
register_pernet_subsys+0x1f/0x50
[ 240.607114] INFO: task modprobe:1466 blocked for more than 120 seconds.
[ 240.607231] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.607337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.607473] modprobe D ffff88020fbab480 13096 1466 1399 0x00000084
[ 240.607622] ffff88020d1bbbe8 0000000000000096 ffff88020fbab480
00000000001d5f00
[ 240.607791] ffff88020d1bbfd8 00000000001d5f00 ffffffff81e1b580
ffff88020fbab480
[ 240.607949] ffff88020fbab480 ffffffff81f8fb48 0000000000000246
ffff88020fbab480
[ 240.608193] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.608316] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.608425] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608542] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608662] [<ffffffffa071d000>] ? 0xffffffffa071d000
[ 240.608759] [<ffffffff817083ad>] register_pernet_device+0x1d/0x70
[ 240.608881] [<ffffffffa071d020>] ppp_init+0x20/0x1000 [ppp_generic]
[ 240.609021] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.609131] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.609235] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.609339] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.609462] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.609568] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.609749] #0: (net_mutex){+.+.+.}, at: [<ffffffff817083ad>]
register_pernet_device+0x1d/0x70
Looks like contention on net_mutex or something, but I honestly have
no idea yet. I can't recreate it myself at the moment or I would
bisect.
Has nobody else run into this with the pre-3.18 kernels? Fedora isn't
carrying any patches in this area.
I am not aware of any change in net/core/dev.c related here,
so I guess it's a bug in rcu_barrier().
From the limited trace data I see in this email I have to agree.
It looks like for some reason rcu_barrier is taking forever
while the rtnl_lock is held in cleanup_net. Because the
rtnl_lock is held, modprobe of the ppp driver is getting stuck.
Is it possible we have an AB-BA deadlock between the rtnl_lock
and rcu, with something the module loading code assumes?
I am not aware of RCU ever acquiring rtnl_lock, not directly, anyway.
Does the module loading code do something strange with rcu? Perhaps
blocking an rcu grace period until the module loading completes?
If the module loading somehow blocks an rcu grace period that would
create an AB deadlock because loading the ppp module grabs the
rtnl_lock. And elsewhere we have the rtnl_lock waiting for an rcu grace
period.
I would think trying and failing to get the rtnl_lock would sleep and
thus let any rcu grace period happen but shrug.
It looks like something is holding up the rcu grace period, and causing
this. Although it is possible that something is causing cleanup_net
to run slowly and we are just seeing that slowness show up in
rcu_barrier as that is one of the slower bits. With a single trace I
can't definitively say that the rcu barrier is getting stuck, but it
certainly looks that way.
Don't get me wrong -- the fact that this kthread appears to have
blocked within rcu_barrier() for 120 seconds means that something is
most definitely wrong here. I am surprised that there are no RCU CPU
stall warnings, but perhaps the blockage is in the callback execution
rather than grace-period completion. Or something is preventing this
kthread from starting up after the wake-up callback executes. Or...
Is this thing reproducible?
I've added Yanko on CC, who reported the backtrace above and can
recreate it reliably. Apparently reverting the RCU merge commit
(d6dd50e) and rebuilding the latest after that does not show the
issue. I'll let Yanko explain more and answer any questions you have.

josh
Yanko Kaneti
2014-10-22 22:40:32 UTC
Permalink
Post by Josh Boyer
On Wed, Oct 22, 2014 at 2:55 PM, Paul E. McKenney
Post by Paul E. McKenney
Post by Eric W. Biederman
Post by Paul E. McKenney
Post by Eric W. Biederman
Post by Cong Wang
(Adding Paul and Eric in Cc)
Post by Josh Boyer
[ 240.599195] INFO: task kworker/u16:5:100 blocked for more than 120 seconds.
[ 240.599338] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.599446] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.599583] kworker/u16:5 D ffff8802202db480 12400 100 2 0x00000000
[ 240.599744] Workqueue: netns cleanup_net
[ 240.599823] ffff8802202eb9e8 0000000000000096 ffff8802202db480
00000000001d5f00
[ 240.600066] ffff8802202ebfd8 00000000001d5f00 ffff8800368c3480
ffff8802202db480
[ 240.600228] ffffffff81ee2690 7fffffffffffffff ffffffff81ee2698
ffffffff81ee2690
[ 240.600445] [<ffffffff8185e239>] schedule+0x29/0x70
[ 240.600541] [<ffffffff8186345c>] schedule_timeout+0x26c/0x410
[ 240.600651] [<ffffffff81865ef7>] ? retint_restore_args+0x13/0x13
[ 240.600765] [<ffffffff818644e4>] ? _raw_spin_unlock_irq+0x34/0x50
[ 240.600879] [<ffffffff8185fc6c>] wait_for_completion+0x10c/0x150
[ 240.601025] [<ffffffff810e53e0>] ? wake_up_state+0x20/0x20
[ 240.601133] [<ffffffff8112a749>] _rcu_barrier+0x159/0x200
[ 240.601237] [<ffffffff8112a845>] rcu_barrier+0x15/0x20
[ 240.601335] [<ffffffff81718ebf>] netdev_run_todo+0x6f/0x310
[ 240.601442] [<ffffffff8170da85>] ? rollback_registered_many+0x265/0x2e0
[ 240.601564] [<ffffffff81725f2e>] rtnl_unlock+0xe/0x10
[ 240.601660] [<ffffffff8170f8e6>] default_device_exit_batch+0x156/0x180
[ 240.601781] [<ffffffff810fd8a0>] ? abort_exclusive_wait+0xb0/0xb0
[ 240.601895] [<ffffffff81707993>] ops_exit_list.isra.1+0x53/0x60
[ 240.602028] [<ffffffff81708540>] cleanup_net+0x100/0x1f0
[ 240.602131] [<ffffffff810ccfa8>] process_one_work+0x218/0x850
[ 240.602241] [<ffffffff810ccf0f>] ? process_one_work+0x17f/0x850
[ 240.602350] [<ffffffff810cd6c7>] ? worker_thread+0xe7/0x4a0
[ 240.602454] [<ffffffff810cd64b>] worker_thread+0x6b/0x4a0
[ 240.602555] [<ffffffff810cd5e0>] ? process_one_work+0x850/0x850
[ 240.602665] [<ffffffff810d399b>] kthread+0x10b/0x130
[ 240.602762] [<ffffffff81028cc9>] ? sched_clock+0x9/0x10
[ 240.602862] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603004] [<ffffffff818651fc>] ret_from_fork+0x7c/0xb0
[ 240.603106] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603304] #0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccf0f>]
process_one_work+0x17f/0x850
[<ffffffff810ccf0f>] process_one_work+0x17f/0x850
[ 240.603691] #2: (net_mutex){+.+.+.}, at: [<ffffffff817084cc>]
cleanup_net+0x8c/0x1f0
[<ffffffff8112a625>] _rcu_barrier+0x35/0x200
[ 240.604211] INFO: task modprobe:1387 blocked for more than 120 seconds.
[ 240.604329] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.604434] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.604570] modprobe D ffff8800cb4f1a40 13112 1387 1386 0x00000080
[ 240.604719] ffff8800cafbbbe8 0000000000000096 ffff8800cb4f1a40
00000000001d5f00
[ 240.604878] ffff8800cafbbfd8 00000000001d5f00 ffff880223280000
ffff8800cb4f1a40
[ 240.605068] ffff8800cb4f1a40 ffffffff81f8fb48 0000000000000246
ffff8800cb4f1a40
[ 240.605283] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.605400] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.605510] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605626] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605757] [<ffffffffa0701000>] ? 0xffffffffa0701000
[ 240.605854] [<ffffffff8170835f>] register_pernet_subsys+0x1f/0x50
[ 240.606005] [<ffffffffa0701048>] br_init+0x48/0xd3 [bridge]
[ 240.606112] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.606224] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.606327] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.606433] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.606557] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.606664] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.606845] #0: (net_mutex){+.+.+.}, at: [<ffffffff8170835f>]
register_pernet_subsys+0x1f/0x50
[ 240.607114] INFO: task modprobe:1466 blocked for more than 120 seconds.
[ 240.607231] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.607337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.607473] modprobe D ffff88020fbab480 13096 1466 1399 0x00000084
[ 240.607622] ffff88020d1bbbe8 0000000000000096 ffff88020fbab480
00000000001d5f00
[ 240.607791] ffff88020d1bbfd8 00000000001d5f00 ffffffff81e1b580
ffff88020fbab480
[ 240.607949] ffff88020fbab480 ffffffff81f8fb48 0000000000000246
ffff88020fbab480
[ 240.608193] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.608316] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.608425] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608542] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608662] [<ffffffffa071d000>] ? 0xffffffffa071d000
[ 240.608759] [<ffffffff817083ad>] register_pernet_device+0x1d/0x70
[ 240.608881] [<ffffffffa071d020>] ppp_init+0x20/0x1000 [ppp_generic]
[ 240.609021] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.609131] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.609235] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.609339] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.609462] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.609568] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.609749] #0: (net_mutex){+.+.+.}, at: [<ffffffff817083ad>]
register_pernet_device+0x1d/0x70
Looks like contention on net_mutex or something, but I honestly have
no idea yet. I can't recreate it myself at the moment or I would
bisect.
Has nobody else run into this with the pre-3.18 kernels? Fedora isn't
carrying any patches in this area.
I am not aware of any change in net/core/dev.c related here,
so I guess it's a bug in rcu_barrier().
From the limited trace data I see in this email I have to agree.
It looks like for some reason rcu_barrier is taking forever
while the rtnl_lock is held in cleanup_net. Because the
rtnl_lock is held, modprobe of the ppp driver is getting stuck.
Is it possible we have an AB-BA deadlock between the rtnl_lock
and rcu, with something the module loading code assumes?
I am not aware of RCU ever acquiring rtnl_lock, not directly, anyway.
Does the module loading code do something strange with rcu? Perhaps
blocking an rcu grace period until the module loading completes?
If the module loading somehow blocks an rcu grace period that would
create an AB deadlock because loading the ppp module grabs the
rtnl_lock. And elsewhere we have the rtnl_lock waiting for an rcu grace
period.
I would think trying and failing to get the rtnl_lock would sleep and
thus let any rcu grace period happen but shrug.
It looks like something is holding up the rcu grace period, and causing
this. Although it is possible that something is causing cleanup_net
to run slowly and we are just seeing that slowness show up in
rcu_barrier as that is one of the slower bits. With a single trace I
can't definitively say that the rcu barrier is getting stuck, but it
certainly looks that way.
Don't get me wrong -- the fact that this kthread appears to have
blocked within rcu_barrier() for 120 seconds means that something is
most definitely wrong here. I am surprised that there are no RCU CPU
stall warnings, but perhaps the blockage is in the callback execution
rather than grace-period completion. Or something is preventing this
kthread from starting up after the wake-up callback executes. Or...
Is this thing reproducible?
I've added Yanko on CC, who reported the backtrace above and can
recreate it reliably. Apparently reverting the RCU merge commit
(d6dd50e) and rebuilding the latest after that does not show the
issue. I'll let Yanko explain more and answer any questions you have.
- It is reproducible
- I've done another build here to double check and it's definitely the rcu merge
that's causing it.

Don't think I'll be able to dig deeper, but I can do testing if needed.

--Yanko
Paul E. McKenney
2014-10-22 23:24:21 UTC
Permalink
Post by Yanko Kaneti
Post by Josh Boyer
On Wed, Oct 22, 2014 at 2:55 PM, Paul E. McKenney
[ . . . ]
Post by Yanko Kaneti
Post by Josh Boyer
Post by Paul E. McKenney
Don't get me wrong -- the fact that this kthread appears to have
blocked within rcu_barrier() for 120 seconds means that something is
most definitely wrong here. I am surprised that there are no RCU CPU
stall warnings, but perhaps the blockage is in the callback execution
rather than grace-period completion. Or something is preventing this
kthread from starting up after the wake-up callback executes. Or...
Is this thing reproducible?
I've added Yanko on CC, who reported the backtrace above and can
recreate it reliably. Apparently reverting the RCU merge commit
(d6dd50e) and rebuilding the latest after that does not show the
issue. I'll let Yanko explain more and answer any questions you have.
- It is reproducible
- I've done another build here to double check and it's definitely the rcu merge
that's causing it.
Don't think I'll be able to dig deeper, but I can do testing if needed.
Please! Does the following patch help?

Thanx, Paul

------------------------------------------------------------------------

rcu: More on deadlock between CPU hotplug and expedited grace periods

Commit dd56af42bd82 (rcu: Eliminate deadlock between CPU hotplug and
expedited grace periods) was incomplete. Although it did eliminate
deadlocks involving synchronize_sched_expedited()'s acquisition of
cpu_hotplug.lock via get_online_cpus(), it did nothing about the similar
deadlock involving acquisition of this same lock via put_online_cpus().
This deadlock became apparent with testing involving hibernation.

This commit therefore changes put_online_cpus() acquisition of this lock
to be conditional, and increments a new cpu_hotplug.puts_pending field
in case of acquisition failure. Then cpu_hotplug_begin() checks for this
new field being non-zero, and applies any changes to cpu_hotplug.refcount.

Reported-by: Jiri Kosina <***@suse.cz>
Signed-off-by: Paul E. McKenney <***@linux.vnet.ibm.com>
Tested-by: Jiri Kosina <***@suse.cz>

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 356450f09c1f..90a3d017b90c 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -64,6 +64,8 @@ static struct {
* an ongoing cpu hotplug operation.
*/
int refcount;
+ /* And allows lockless put_online_cpus(). */
+ atomic_t puts_pending;

#ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map dep_map;
@@ -113,7 +115,11 @@ void put_online_cpus(void)
{
if (cpu_hotplug.active_writer == current)
return;
- mutex_lock(&cpu_hotplug.lock);
+ if (!mutex_trylock(&cpu_hotplug.lock)) {
+ atomic_inc(&cpu_hotplug.puts_pending);
+ cpuhp_lock_release();
+ return;
+ }

if (WARN_ON(!cpu_hotplug.refcount))
cpu_hotplug.refcount++; /* try to fix things up */
@@ -155,6 +161,12 @@ void cpu_hotplug_begin(void)
cpuhp_lock_acquire();
for (;;) {
mutex_lock(&cpu_hotplug.lock);
+ if (atomic_read(&cpu_hotplug.puts_pending)) {
+ int delta;
+
+ delta = atomic_xchg(&cpu_hotplug.puts_pending, 0);
+ cpu_hotplug.refcount -= delta;
+ }
if (likely(!cpu_hotplug.refcount))
break;
__set_current_state(TASK_UNINTERRUPTIBLE);
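As an aside, here is a minimal userspace sketch of the pattern this patch
introduces (pthreads plus C11 atomics; the names are hypothetical and this is
only an illustration of the idea, not the kernel implementation): a
reader-side put that cannot take the lock records itself in an atomic
counter, and the writer folds any pending puts back into the refcount the
next time it holds the lock.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int refcount = 1;		/* protected by lock */
static atomic_int puts_pending;		/* lockless fallback counter */

/* Reader-side put: never blocks behind the writer. */
static void put_ref(void)
{
	if (pthread_mutex_trylock(&lock) != 0) {
		atomic_fetch_add(&puts_pending, 1);	/* defer the decrement */
		return;
	}
	refcount--;
	pthread_mutex_unlock(&lock);
}

/* Writer side: reconcile deferred puts before looking at the refcount. */
static int no_readers_left(void)
{
	int idle;

	pthread_mutex_lock(&lock);
	refcount -= atomic_exchange(&puts_pending, 0);
	idle = (refcount == 0);
	pthread_mutex_unlock(&lock);
	return idle;
}

int main(void)
{
	put_ref();
	printf("no readers left: %d\n", no_readers_left());
	return 0;
}

The point of the pattern is that the put path can no longer block behind a
writer that is itself waiting for readers to drain, which is the deadlock
shape the commit message above describes.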
Yanko Kaneti
2014-10-23 06:09:26 UTC
Permalink
Post by Paul E. McKenney
Post by Yanko Kaneti
Post by Josh Boyer
On Wed, Oct 22, 2014 at 2:55 PM, Paul E. McKenney
[ . . . ]
Post by Yanko Kaneti
Post by Josh Boyer
Post by Paul E. McKenney
Don't get me wrong -- the fact that this kthread appears to have
blocked within rcu_barrier() for 120 seconds means that something is
most definitely wrong here. I am surprised that there are no RCU CPU
stall warnings, but perhaps the blockage is in the callback execution
rather than grace-period completion. Or something is preventing this
kthread from starting up after the wake-up callback executes. Or...
Is this thing reproducible?
I've added Yanko on CC, who reported the backtrace above and can
recreate it reliably. Apparently reverting the RCU merge commit
(d6dd50e) and rebuilding the latest after that does not show the
issue. I'll let Yanko explain more and answer any questions you have.
- It is reproducible
- I've done another build here to double check and it's definitely the rcu merge
that's causing it.
Don't think I'll be able to dig deeper, but I can do testing if needed.
Please! Does the following patch help?
Nope, doesn't seem to make a difference to the modprobe ppp_generic
test


INFO: task kworker/u16:6:101 blocked for more than 120 seconds.
Not tainted 3.18.0-0.rc1.git2.3.fc22.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
kworker/u16:6 D ffff88022067cec0 11680 101 2 0x00000000
Workqueue: netns cleanup_net
ffff8802206939e8 0000000000000096 ffff88022067cec0 00000000001d5f00
ffff880220693fd8 00000000001d5f00 ffff880223263480 ffff88022067cec0
ffffffff82c51d60 7fffffffffffffff ffffffff81ee2698 ffffffff81ee2690
Call Trace:
[<ffffffff8185e289>] schedule+0x29/0x70
[<ffffffff818634ac>] schedule_timeout+0x26c/0x410
[<ffffffff81028c4a>] ? native_sched_clock+0x2a/0xa0
[<ffffffff81107afc>] ? mark_held_locks+0x7c/0xb0
[<ffffffff81864530>] ? _raw_spin_unlock_irq+0x30/0x50
[<ffffffff81107c8d>] ? trace_hardirqs_on_caller+0x15d/0x200
[<ffffffff8185fcbc>] wait_for_completion+0x10c/0x150
[<ffffffff810e5430>] ? wake_up_state+0x20/0x20
[<ffffffff8112a799>] _rcu_barrier+0x159/0x200
[<ffffffff8112a895>] rcu_barrier+0x15/0x20
[<ffffffff81718f0f>] netdev_run_todo+0x6f/0x310
[<ffffffff8170dad5>] ? rollback_registered_many+0x265/0x2e0
[<ffffffff81725f7e>] rtnl_unlock+0xe/0x10
[<ffffffff8170f936>] default_device_exit_batch+0x156/0x180
[<ffffffff810fd8f0>] ? abort_exclusive_wait+0xb0/0xb0
[<ffffffff817079e3>] ops_exit_list.isra.1+0x53/0x60
[<ffffffff81708590>] cleanup_net+0x100/0x1f0
[<ffffffff810ccff8>] process_one_work+0x218/0x850
[<ffffffff810ccf5f>] ? process_one_work+0x17f/0x850
[<ffffffff810cd717>] ? worker_thread+0xe7/0x4a0
[<ffffffff810cd69b>] worker_thread+0x6b/0x4a0
[<ffffffff810cd630>] ? process_one_work+0x850/0x850
[<ffffffff810d39eb>] kthread+0x10b/0x130
[<ffffffff81028cc9>] ? sched_clock+0x9/0x10
[<ffffffff810d38e0>] ? kthread_create_on_node+0x250/0x250
[<ffffffff8186527c>] ret_from_fork+0x7c/0xb0
[<ffffffff810d38e0>] ? kthread_create_on_node+0x250/0x250
4 locks held by kworker/u16:6/101:
#0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccf5f>] process_one_work+0x17f/0x850
#1: (net_cleanup_work){+.+.+.}, at: [<ffffffff810ccf5f>] process_one_work+0x17f/0x850
#2: (net_mutex){+.+.+.}, at: [<ffffffff8170851c>] cleanup_net+0x8c/0x1f0
#3: (rcu_sched_state.barrier_mutex){+.+...}, at: [<ffffffff8112a675>] _rcu_barrier+0x35/0x200
INFO: task modprobe:1139 blocked for more than 120 seconds.
Not tainted 3.18.0-0.rc1.git2.3.fc22.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
modprobe D ffff880213ac1a40 13112 1139 1138 0x00000080
ffff880036ab3be8 0000000000000096 ffff880213ac1a40 00000000001d5f00
ffff880036ab3fd8 00000000001d5f00 ffff880223264ec0 ffff880213ac1a40
ffff880213ac1a40 ffffffff81f8fb48 0000000000000246 ffff880213ac1a40
Call Trace:
[<ffffffff8185e831>] schedule_preempt_disabled+0x31/0x80
[<ffffffff81860083>] mutex_lock_nested+0x183/0x440
[<ffffffff817083af>] ? register_pernet_subsys+0x1f/0x50
[<ffffffff817083af>] ? register_pernet_subsys+0x1f/0x50
[<ffffffffa06f3000>] ? 0xffffffffa06f3000
[<ffffffff817083af>] register_pernet_subsys+0x1f/0x50
[<ffffffffa06f3048>] br_init+0x48/0xd3 [bridge]
[<ffffffff81002148>] do_one_initcall+0xd8/0x210
[<ffffffff81153c52>] load_module+0x20c2/0x2870
[<ffffffff8114ec30>] ? store_uevent+0x70/0x70
[<ffffffff8110ac76>] ? lock_release_non_nested+0x3c6/0x3d0
[<ffffffff811544e7>] SyS_init_module+0xe7/0x140
[<ffffffff81865329>] system_call_fastpath+0x12/0x17
1 lock held by modprobe/1139:
#0: (net_mutex){+.+.+.}, at: [<ffffffff817083af>]
register_pernet_subsys+0x1f/0x50
INFO: task modprobe:1209 blocked for more than 120 seconds.
Not tainted 3.18.0-0.rc1.git2.3.fc22.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
modprobe D ffff8800c5324ec0 13368 1209 1151 0x00000080
ffff88020d14bbe8 0000000000000096 ffff8800c5324ec0 00000000001d5f00
ffff88020d14bfd8 00000000001d5f00 ffff880223280000 ffff8800c5324ec0
ffff8800c5324ec0 ffffffff81f8fb48 0000000000000246 ffff8800c5324ec0
Call Trace:
[<ffffffff8185e831>] schedule_preempt_disabled+0x31/0x80
[<ffffffff81860083>] mutex_lock_nested+0x183/0x440
[<ffffffff817083fd>] ? register_pernet_device+0x1d/0x70
[<ffffffff817083fd>] ? register_pernet_device+0x1d/0x70
[<ffffffffa070f000>] ? 0xffffffffa070f000
[<ffffffff817083fd>] register_pernet_device+0x1d/0x70
[<ffffffffa070f020>] ppp_init+0x20/0x1000 [ppp_generic]
[<ffffffff81002148>] do_one_initcall+0xd8/0x210
[<ffffffff81153c52>] load_module+0x20c2/0x2870
[<ffffffff8114ec30>] ? store_uevent+0x70/0x70
[<ffffffff8110ac76>] ? lock_release_non_nested+0x3c6/0x3d0
[<ffffffff811544e7>] SyS_init_module+0xe7/0x140
[<ffffffff81865329>] system_call_fastpath+0x12/0x17
1 lock held by modprobe/1209:
#0: (net_mutex){+.+.+.}, at: [<ffffffff817083fd>] register_pernet_device+0x1d/0x70
Post by Paul E. McKenney
Thanx, Paul
---------------------------------------------------------------------
---
rcu: More on deadlock between CPU hotplug and expedited grace periods
Commit dd56af42bd82 (rcu: Eliminate deadlock between CPU hotplug and
expedited grace periods) was incomplete. Although it did eliminate
deadlocks involving synchronize_sched_expedited()'s acquisition of
cpu_hotplug.lock via get_online_cpus(), it did nothing about the similar
deadlock involving acquisition of this same lock via
put_online_cpus().
This deadlock became apparent with testing involving hibernation.
This commit therefore changes put_online_cpus() acquisition of this lock
to be conditional, and increments a new cpu_hotplug.puts_pending field
in case of acquisition failure. Then cpu_hotplug_begin() checks for this
new field being non-zero, and applies any changes to
cpu_hotplug.refcount.
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 356450f09c1f..90a3d017b90c 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -64,6 +64,8 @@ static struct {
* an ongoing cpu hotplug operation.
*/
int refcount;
+ /* And allows lockless put_online_cpus(). */
+ atomic_t puts_pending;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map dep_map;
@@ -113,7 +115,11 @@ void put_online_cpus(void)
{
if (cpu_hotplug.active_writer == current)
return;
- mutex_lock(&cpu_hotplug.lock);
+ if (!mutex_trylock(&cpu_hotplug.lock)) {
+ atomic_inc(&cpu_hotplug.puts_pending);
+ cpuhp_lock_release();
+ return;
+ }
if (WARN_ON(!cpu_hotplug.refcount))
cpu_hotplug.refcount++; /* try to fix things up */
@@ -155,6 +161,12 @@ void cpu_hotplug_begin(void)
cpuhp_lock_acquire();
for (;;) {
mutex_lock(&cpu_hotplug.lock);
+ if (atomic_read(&cpu_hotplug.puts_pending)) {
+ int delta;
+
+ delta = atomic_xchg(&cpu_hotplug.puts_pending, 0);
+ cpu_hotplug.refcount -= delta;
+ }
if (likely(!cpu_hotplug.refcount))
break;
__set_current_state(TASK_UNINTERRUPTIBLE);
Paul E. McKenney
2014-10-23 12:27:50 UTC
Permalink
Post by Yanko Kaneti
Post by Paul E. McKenney
Post by Yanko Kaneti
Post by Josh Boyer
On Wed, Oct 22, 2014 at 2:55 PM, Paul E. McKenney
[ . . . ]
Post by Yanko Kaneti
Post by Josh Boyer
Post by Paul E. McKenney
Don't get me wrong -- the fact that this kthread appears to have
blocked within rcu_barrier() for 120 seconds means that something is
most definitely wrong here. I am surprised that there are no RCU CPU
stall warnings, but perhaps the blockage is in the callback execution
rather than grace-period completion. Or something is
preventing this
kthread from starting up after the wake-up callback executes.
Or...
Is this thing reproducible?
I've added Yanko on CC, who reported the backtrace above and can
recreate it reliably. Apparently reverting the RCU merge commit
(d6dd50e) and rebuilding the latest after that does not show the
issue. I'll let Yanko explain more and answer any questions you have.
- It is reproducible
- I've done another build here to double check and it's definitely the rcu merge
that's causing it.
Don't think I'll be able to dig deeper, but I can do testing if needed.
Please! Does the following patch help?
Nope, doesn't seem to make a difference to the modprobe ppp_generic
test
Well, I was hoping. I will take a closer look at the RCU merge commit
and see what suggests itself. I am likely to ask you to revert specific
commits, if that works for you.

Thanx, Paul
Post by Yanko Kaneti
INFO: task kworker/u16:6:101 blocked for more than 120 seconds.
Not tainted 3.18.0-0.rc1.git2.3.fc22.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
kworker/u16:6 D ffff88022067cec0 11680 101 2 0x00000000
Workqueue: netns cleanup_net
ffff8802206939e8 0000000000000096 ffff88022067cec0 00000000001d5f00
ffff880220693fd8 00000000001d5f00 ffff880223263480 ffff88022067cec0
ffffffff82c51d60 7fffffffffffffff ffffffff81ee2698 ffffffff81ee2690
[<ffffffff8185e289>] schedule+0x29/0x70
[<ffffffff818634ac>] schedule_timeout+0x26c/0x410
[<ffffffff81028c4a>] ? native_sched_clock+0x2a/0xa0
[<ffffffff81107afc>] ? mark_held_locks+0x7c/0xb0
[<ffffffff81864530>] ? _raw_spin_unlock_irq+0x30/0x50
[<ffffffff81107c8d>] ? trace_hardirqs_on_caller+0x15d/0x200
[<ffffffff8185fcbc>] wait_for_completion+0x10c/0x150
[<ffffffff810e5430>] ? wake_up_state+0x20/0x20
[<ffffffff8112a799>] _rcu_barrier+0x159/0x200
[<ffffffff8112a895>] rcu_barrier+0x15/0x20
[<ffffffff81718f0f>] netdev_run_todo+0x6f/0x310
[<ffffffff8170dad5>] ? rollback_registered_many+0x265/0x2e0
[<ffffffff81725f7e>] rtnl_unlock+0xe/0x10
[<ffffffff8170f936>] default_device_exit_batch+0x156/0x180
[<ffffffff810fd8f0>] ? abort_exclusive_wait+0xb0/0xb0
[<ffffffff817079e3>] ops_exit_list.isra.1+0x53/0x60
[<ffffffff81708590>] cleanup_net+0x100/0x1f0
[<ffffffff810ccff8>] process_one_work+0x218/0x850
[<ffffffff810ccf5f>] ? process_one_work+0x17f/0x850
[<ffffffff810cd717>] ? worker_thread+0xe7/0x4a0
[<ffffffff810cd69b>] worker_thread+0x6b/0x4a0
[<ffffffff810cd630>] ? process_one_work+0x850/0x850
[<ffffffff810d39eb>] kthread+0x10b/0x130
[<ffffffff81028cc9>] ? sched_clock+0x9/0x10
[<ffffffff810d38e0>] ? kthread_create_on_node+0x250/0x250
[<ffffffff8186527c>] ret_from_fork+0x7c/0xb0
[<ffffffff810d38e0>] ? kthread_create_on_node+0x250/0x250
#0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccf5f>] process_one_work+0x17f/0x850
#1: (net_cleanup_work){+.+.+.}, at: [<ffffffff810ccf5f>] process_one_work+0x17f/0x850
#2: (net_mutex){+.+.+.}, at: [<ffffffff8170851c>] cleanup_net+0x8c/0x1f0
#3: (rcu_sched_state.barrier_mutex){+.+...}, at: [<ffffffff8112a675>] _rcu_barrier+0x35/0x200
INFO: task modprobe:1139 blocked for more than 120 seconds.
Not tainted 3.18.0-0.rc1.git2.3.fc22.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
modprobe D ffff880213ac1a40 13112 1139 1138 0x00000080
ffff880036ab3be8 0000000000000096 ffff880213ac1a40 00000000001d5f00
ffff880036ab3fd8 00000000001d5f00 ffff880223264ec0 ffff880213ac1a40
ffff880213ac1a40 ffffffff81f8fb48 0000000000000246 ffff880213ac1a40
[<ffffffff8185e831>] schedule_preempt_disabled+0x31/0x80
[<ffffffff81860083>] mutex_lock_nested+0x183/0x440
[<ffffffff817083af>] ? register_pernet_subsys+0x1f/0x50
[<ffffffff817083af>] ? register_pernet_subsys+0x1f/0x50
[<ffffffffa06f3000>] ? 0xffffffffa06f3000
[<ffffffff817083af>] register_pernet_subsys+0x1f/0x50
[<ffffffffa06f3048>] br_init+0x48/0xd3 [bridge]
[<ffffffff81002148>] do_one_initcall+0xd8/0x210
[<ffffffff81153c52>] load_module+0x20c2/0x2870
[<ffffffff8114ec30>] ? store_uevent+0x70/0x70
[<ffffffff8110ac76>] ? lock_release_non_nested+0x3c6/0x3d0
[<ffffffff811544e7>] SyS_init_module+0xe7/0x140
[<ffffffff81865329>] system_call_fastpath+0x12/0x17
#0: (net_mutex){+.+.+.}, at: [<ffffffff817083af>]
register_pernet_subsys+0x1f/0x50
INFO: task modprobe:1209 blocked for more than 120 seconds.
Not tainted 3.18.0-0.rc1.git2.3.fc22.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
modprobe D ffff8800c5324ec0 13368 1209 1151 0x00000080
ffff88020d14bbe8 0000000000000096 ffff8800c5324ec0 00000000001d5f00
ffff88020d14bfd8 00000000001d5f00 ffff880223280000 ffff8800c5324ec0
ffff8800c5324ec0 ffffffff81f8fb48 0000000000000246 ffff8800c5324ec0
[<ffffffff8185e831>] schedule_preempt_disabled+0x31/0x80
[<ffffffff81860083>] mutex_lock_nested+0x183/0x440
[<ffffffff817083fd>] ? register_pernet_device+0x1d/0x70
[<ffffffff817083fd>] ? register_pernet_device+0x1d/0x70
[<ffffffffa070f000>] ? 0xffffffffa070f000
[<ffffffff817083fd>] register_pernet_device+0x1d/0x70
[<ffffffffa070f020>] ppp_init+0x20/0x1000 [ppp_generic]
[<ffffffff81002148>] do_one_initcall+0xd8/0x210
[<ffffffff81153c52>] load_module+0x20c2/0x2870
[<ffffffff8114ec30>] ? store_uevent+0x70/0x70
[<ffffffff8110ac76>] ? lock_release_non_nested+0x3c6/0x3d0
[<ffffffff811544e7>] SyS_init_module+0xe7/0x140
[<ffffffff81865329>] system_call_fastpath+0x12/0x17
#0: (net_mutex){+.+.+.}, at: [<ffffffff817083fd>] register_pernet_device+0x1d/0x70
Post by Paul E. McKenney
Thanx, Paul
---------------------------------------------------------------------
---
rcu: More on deadlock between CPU hotplug and expedited grace periods
Commit dd56af42bd82 (rcu: Eliminate deadlock between CPU hotplug and
expedited grace periods) was incomplete. Although it did eliminate
deadlocks involving synchronize_sched_expedited()'s acquisition of
cpu_hotplug.lock via get_online_cpus(), it did nothing about the similar
deadlock involving acquisition of this same lock via
put_online_cpus().
This deadlock became apparent with testing involving hibernation.
This commit therefore changes put_online_cpus() acquisition of this lock
to be conditional, and increments a new cpu_hotplug.puts_pending field
in case of acquisition failure. Then cpu_hotplug_begin() checks for this
new field being non-zero, and applies any changes to
cpu_hotplug.refcount.
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 356450f09c1f..90a3d017b90c 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -64,6 +64,8 @@ static struct {
* an ongoing cpu hotplug operation.
*/
int refcount;
+ /* And allows lockless put_online_cpus(). */
+ atomic_t puts_pending;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map dep_map;
@@ -113,7 +115,11 @@ void put_online_cpus(void)
{
if (cpu_hotplug.active_writer == current)
return;
- mutex_lock(&cpu_hotplug.lock);
+ if (!mutex_trylock(&cpu_hotplug.lock)) {
+ atomic_inc(&cpu_hotplug.puts_pending);
+ cpuhp_lock_release();
+ return;
+ }
if (WARN_ON(!cpu_hotplug.refcount))
cpu_hotplug.refcount++; /* try to fix things up */
@@ -155,6 +161,12 @@ void cpu_hotplug_begin(void)
cpuhp_lock_acquire();
for (;;) {
mutex_lock(&cpu_hotplug.lock);
+ if (atomic_read(&cpu_hotplug.puts_pending)) {
+ int delta;
+
+ delta = atomic_xchg(&cpu_hotplug.puts_pending, 0);
+ cpu_hotplug.refcount -= delta;
+ }
if (likely(!cpu_hotplug.refcount))
break;
__set_current_state(TASK_UNINTERRUPTIBLE);
Paul E. McKenney
2014-10-23 15:33:33 UTC
Permalink
Post by Paul E. McKenney
Post by Yanko Kaneti
Post by Paul E. McKenney
Post by Yanko Kaneti
Post by Josh Boyer
On Wed, Oct 22, 2014 at 2:55 PM, Paul E. McKenney
[ . . . ]
Post by Yanko Kaneti
Post by Josh Boyer
Post by Paul E. McKenney
Don't get me wrong -- the fact that this kthread appears to have
blocked within rcu_barrier() for 120 seconds means that something is
most definitely wrong here. I am surprised that there are no RCU CPU
stall warnings, but perhaps the blockage is in the callback execution
rather than grace-period completion. Or something is preventing this
kthread from starting up after the wake-up callback executes.
Or...
Is this thing reproducible?
I've added Yanko on CC, who reported the backtrace above and can
recreate it reliably. Apparently reverting the RCU merge commit
(d6dd50e) and rebuilding the latest after that does not show the
issue. I'll let Yanko explain more and answer any questions you have.
- It is reproducible
- I've done another build here to double check and it's definitely
that's causing it.
Don't think I'll be able to dig deeper, but I can do testing if needed.
Please! Does the following patch help?
Nope, doesn't seem to make a difference to the modprobe ppp_generic
test
Well, I was hoping. I will take a closer look at the RCU merge commit
and see what suggests itself. I am likely to ask you to revert specific
commits, if that works for you.
Well, rather than reverting commits, could you please try testing the
following commits?

11ed7f934cb8 (rcu: Make nocb leader kthreads process pending callbacks after spawning)

73a860cd58a1 (rcu: Replace flush_signals() with WARN_ON(signal_pending()))

c847f14217d5 (rcu: Avoid misordering in nocb_leader_wait())

For whatever it is worth, I am guessing this one.

a53dd6a65668 (rcutorture: Add RCU-tasks tests to default rcutorture list)

If any of the above fail, this one should also fail.

Also, could you please send along your .config?

Thanx, Paul
Paul E. McKenney
2014-10-23 16:28:48 UTC
Permalink
Post by Paul E. McKenney
Post by Paul E. McKenney
Post by Paul E. McKenney
Post by Yanko Kaneti
Post by Paul E. McKenney
Post by Yanko Kaneti
Post by Josh Boyer
On Wed, Oct 22, 2014 at 2:55 PM, Paul E. McKenney
[ . . . ]
Post by Yanko Kaneti
Post by Josh Boyer
Post by Paul E. McKenney
Don't get me wrong -- the fact that this kthread appears to have
blocked within rcu_barrier() for 120 seconds means that something is
most definitely wrong here. I am surprised that there are no RCU CPU
stall warnings, but perhaps the blockage is in the callback execution
rather than grace-period completion. Or something is preventing this
kthread from starting up after the wake-up callback executes. Or...
Is this thing reproducible?
I've added Yanko on CC, who reported the backtrace above and can
recreate it reliably. Apparently reverting the RCU merge commit
(d6dd50e) and rebuilding the latest after that does not show the
issue. I'll let Yanko explain more and answer any questions you
have.
- It is reproducible
- I've done another build here to double check and it's definitely
the rcu merge
that's causing it.
Don't think I'll be able to dig deeper, but I can do testing if needed.
Please! Does the following patch help?
Nope, doesn't seem to make a difference to the modprobe ppp_generic
test
Well, I was hoping. I will take a closer look at the RCU merge commit
and see what suggests itself. I am likely to ask you to revert specific
commits, if that works for you.
Well, rather than reverting commits, could you please try testing the
following commits?
11ed7f934cb8 (rcu: Make nocb leader kthreads process pending callbacks
after spawning)
Post by Paul E. McKenney
73a860cd58a1 (rcu: Replace flush_signals() with WARN_ON(signal_pending()))
c847f14217d5 (rcu: Avoid misordering in nocb_leader_wait())
For whatever it is worth, I am guessing this one.
a53dd6a65668 (rcutorture: Add RCU-tasks tests to default rcutorture list)
If any of the above fail, this one should also fail.
Also, could you please send along your .config?
Which tree are those in?
They are all in Linus's tree. They are topic branches of the RCU merge
commit (d6dd50e), and the test results will hopefully give me more of a
clue where to look. As would the .config file. ;-)

Thanx, Paul
Yanko Kaneti
2014-10-23 19:51:59 UTC
Permalink
Post by Paul E. McKenney
Post by Paul E. McKenney
Post by Yanko Kaneti
Post by Paul E. McKenney
Post by Yanko Kaneti
Post by Josh Boyer
On Wed, Oct 22, 2014 at 2:55 PM, Paul E. McKenney
[ . . . ]
Post by Yanko Kaneti
Post by Josh Boyer
Post by Paul E. McKenney
Don't get me wrong -- the fact that this kthread appears to have
blocked within rcu_barrier() for 120 seconds means that something is
most definitely wrong here. I am surprised that there are no RCU CPU
stall warnings, but perhaps the blockage is in the callback execution
rather than grace-period completion. Or something is preventing this
kthread from starting up after the wake-up callback executes. Or...
Is this thing reproducible?
I've added Yanko on CC, who reported the backtrace above and can
recreate it reliably. Apparently reverting the RCU merge commit
(d6dd50e) and rebuilding the latest after that does not show the
issue. I'll let Yanko explain more and answer any questions you have.
- It is reproducible
- I've done another build here to double check and it's definitely
the rcu merge
that's causing it.
Don't think I'll be able to dig deeper, but I can do testing if needed.
Please! Does the following patch help?
Nope, doesn't seem to make a difference to the modprobe ppp_generic
test
Well, I was hoping. I will take a closer look at the RCU merge commit
and see what suggests itself. I am likely to ask you to revert specific
commits, if that works for you.
Well, rather than reverting commits, could you please try testing the
following commits?
11ed7f934cb8 (rcu: Make nocb leader kthreads process pending callbacks after spawning)
73a860cd58a1 (rcu: Replace flush_signals() with WARN_ON(signal_pending()))
c847f14217d5 (rcu: Avoid misordering in nocb_leader_wait())
For whatever it is worth, I am guessing this one.
Indeed, c847f14217d5 it is.

Much to my embarrassment I just noticed that, in addition to the
rcu merge, triggering the bug "requires" my specific Fedora rawhide network
setup. Booting in single-user mode and modprobing ppp_generic is fine. The bug
appears when starting with my regular Fedora network setup, which in my case
includes 3 ethernet adapters and a libvirt bridge+NAT setup.

Hope that helps.

I am attaching the config.
Paul E. McKenney
2014-10-23 20:05:07 UTC
Permalink
Post by Yanko Kaneti
Post by Paul E. McKenney
Post by Paul E. McKenney
Post by Yanko Kaneti
Post by Paul E. McKenney
Post by Yanko Kaneti
Post by Josh Boyer
On Wed, Oct 22, 2014 at 2:55 PM, Paul E. McKenney
[ . . . ]
Post by Yanko Kaneti
Post by Josh Boyer
Post by Paul E. McKenney
Don't get me wrong -- the fact that this kthread appears to have
blocked within rcu_barrier() for 120 seconds means that something is
most definitely wrong here. I am surprised that there are no RCU CPU
stall warnings, but perhaps the blockage is in the callback execution
rather than grace-period completion. Or something is preventing this
kthread from starting up after the wake-up callback executes. Or...
Is this thing reproducible?
I've added Yanko on CC, who reported the backtrace above and can
recreate it reliably. Apparently reverting the RCU merge commit
(d6dd50e) and rebuilding the latest after that does not show the
issue. I'll let Yanko explain more and answer any questions you
have.
- It is reproducible
- I've done another build here to double check and it's definitely
the rcu merge
that's causing it.
Don't think I'll be able to dig deeper, but I can do testing if needed.
Please! Does the following patch help?
Nope, doesn't seem to make a difference to the modprobe ppp_generic
test
Well, I was hoping. I will take a closer look at the RCU merge commit
and see what suggests itself. I am likely to ask you to revert specific
commits, if that works for you.
Well, rather than reverting commits, could you please try testing the
following commits?
11ed7f934cb8 (rcu: Make nocb leader kthreads process pending callbacks after spawning)
73a860cd58a1 (rcu: Replace flush_signals() with WARN_ON(signal_pending()))
c847f14217d5 (rcu: Avoid misordering in nocb_leader_wait())
For whatever it is worth, I am guessing this one.
Indeed, c847f14217d5 it is.
Much to my embarrassment I just noticed that, in addition to the
rcu merge, triggering the bug "requires" my specific Fedora rawhide network
setup. Booting in single-user mode and modprobing ppp_generic is fine. The bug
appears when starting with my regular Fedora network setup, which in my case
includes 3 ethernet adapters and a libvirt bridge+NAT setup.
Hope that helps.
I am attaching the config.
It does help a lot, thank you!!!

The following patch is a bit of a shot in the dark, and assumes that
commit 1772947bd012 (rcu: Handle NOCB callbacks from irq-disabled idle
code) introduced the problem. Does this patch fix things up?

Thanx, Paul

------------------------------------------------------------------------

rcu: Kick rcuo kthreads after their CPU goes offline

If a no-CBs CPU were to post an RCU callback with interrupts disabled
after it entered the idle loop for the last time, there might be no
deferred wakeup for the corresponding rcuo kthreads. This commit
therefore adds a set of calls to do_nocb_deferred_wakeup() after the
CPU has gone completely offline.

Signed-off-by: Paul E. McKenney <***@linux.vnet.ibm.com>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 84b41b3c6ebd..4f3d25a58786 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3493,8 +3493,10 @@ static int rcu_cpu_notify(struct notifier_block *self,
case CPU_DEAD_FROZEN:
case CPU_UP_CANCELED:
case CPU_UP_CANCELED_FROZEN:
- for_each_rcu_flavor(rsp)
+ for_each_rcu_flavor(rsp) {
rcu_cleanup_dead_cpu(cpu, rsp);
+ do_nocb_deferred_wakeup(this_cpu_ptr(rsp->rda));
+ }
break;
default:
break;
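As a rough illustration of the failure mode this aims at, here is a userspace
model with made-up names (not the kernel implementation): a per-CPU "deferred
wakeup pending" flag is normally acted on the next time the CPU runs non-idle
code, but if the CPU goes offline first, somebody else has to perform the
deferred wakeup, or the rcuo kthread never wakes and rcu_barrier() waits
forever. The hunk above drains that state from the hotplug notifier.

#include <stdbool.h>
#include <stdio.h>

/* Userspace model only. */
struct cpu_model {
	bool online;
	bool nocb_defer_wakeup;	/* wakeup of the rcuo kthread was deferred */
};

/* Normal case: the deferred wakeup happens when the CPU next runs. */
static void cpu_runs(struct cpu_model *cpu)
{
	if (cpu->nocb_defer_wakeup) {
		cpu->nocb_defer_wakeup = false;
		puts("rcuo kthread woken, callbacks get invoked");
	}
}

/* The fix: also drain the deferred wakeup when the CPU goes offline. */
static void cpu_goes_offline(struct cpu_model *cpu)
{
	cpu->online = false;
	if (cpu->nocb_defer_wakeup) {	/* models do_nocb_deferred_wakeup() */
		cpu->nocb_defer_wakeup = false;
		puts("deferred wakeup handled at offline time, no hang");
	}
}

int main(void)
{
	struct cpu_model cpu = { .online = true };

	/* Callback posted with irqs disabled on the way into idle. */
	cpu.nocb_defer_wakeup = true;
	cpu_goes_offline(&cpu);	/* without this drain, nothing wakes rcuo */
	(void)cpu_runs;		/* normal path, unused in this scenario */
	return 0;
}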

Paul E. McKenney
2014-10-22 17:59:09 UTC
Permalink
Post by Cong Wang
(Adding Paul and Eric in Cc)
I am not aware of any change in net/core/dev.c related here,
so I guess it's a bug in rcu_barrier().
Thanks.
Do commits 789cbbeca4e (workqueue: Add quiescent state between work items)
and 3e28e3772 (workqueue: Use cond_resched_rcu_qs macro) help with this?
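For context, those two commits make the workqueue worker report an RCU
quiescent state between work items, so a long run of back-to-back work items
cannot hold off a grace period indefinitely. A very rough userspace model of
the idea (made-up names, no claim to match the kernel code):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only: the "grace period" can end once the worker has
 * passed through a quiescent point since the grace period began. */
static atomic_bool passed_quiescent_state;

static void quiescent_state(void)	/* models cond_resched_rcu_qs() */
{
	atomic_store(&passed_quiescent_state, true);
}

static void process_one_item(int item)
{
	printf("processing work item %d\n", item);
}

int main(void)
{
	atomic_store(&passed_quiescent_state, false);	/* grace period starts */

	/*
	 * Without the quiescent_state() call between items, a long enough
	 * chain of back-to-back work items would keep the grace period from
	 * completing; with it, the grace period can end after every item.
	 */
	for (int item = 0; item < 3; item++) {
		process_one_item(item);
		quiescent_state();
	}

	printf("grace period may now complete: %d\n",
	       (int)atomic_load(&passed_quiescent_state));
	return 0;
}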

Thanx, Paul
Post by Cong Wang
Post by Josh Boyer
[ 240.599195] INFO: task kworker/u16:5:100 blocked for more than 120 seconds.
[ 240.599338] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.599446] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.599583] kworker/u16:5 D ffff8802202db480 12400 100 2 0x00000000
[ 240.599744] Workqueue: netns cleanup_net
[ 240.599823] ffff8802202eb9e8 0000000000000096 ffff8802202db480
00000000001d5f00
[ 240.600066] ffff8802202ebfd8 00000000001d5f00 ffff8800368c3480
ffff8802202db480
[ 240.600228] ffffffff81ee2690 7fffffffffffffff ffffffff81ee2698
ffffffff81ee2690
[ 240.600445] [<ffffffff8185e239>] schedule+0x29/0x70
[ 240.600541] [<ffffffff8186345c>] schedule_timeout+0x26c/0x410
[ 240.600651] [<ffffffff81865ef7>] ? retint_restore_args+0x13/0x13
[ 240.600765] [<ffffffff818644e4>] ? _raw_spin_unlock_irq+0x34/0x50
[ 240.600879] [<ffffffff8185fc6c>] wait_for_completion+0x10c/0x150
[ 240.601025] [<ffffffff810e53e0>] ? wake_up_state+0x20/0x20
[ 240.601133] [<ffffffff8112a749>] _rcu_barrier+0x159/0x200
[ 240.601237] [<ffffffff8112a845>] rcu_barrier+0x15/0x20
[ 240.601335] [<ffffffff81718ebf>] netdev_run_todo+0x6f/0x310
[ 240.601442] [<ffffffff8170da85>] ? rollback_registered_many+0x265/0x2e0
[ 240.601564] [<ffffffff81725f2e>] rtnl_unlock+0xe/0x10
[ 240.601660] [<ffffffff8170f8e6>] default_device_exit_batch+0x156/0x180
[ 240.601781] [<ffffffff810fd8a0>] ? abort_exclusive_wait+0xb0/0xb0
[ 240.601895] [<ffffffff81707993>] ops_exit_list.isra.1+0x53/0x60
[ 240.602028] [<ffffffff81708540>] cleanup_net+0x100/0x1f0
[ 240.602131] [<ffffffff810ccfa8>] process_one_work+0x218/0x850
[ 240.602241] [<ffffffff810ccf0f>] ? process_one_work+0x17f/0x850
[ 240.602350] [<ffffffff810cd6c7>] ? worker_thread+0xe7/0x4a0
[ 240.602454] [<ffffffff810cd64b>] worker_thread+0x6b/0x4a0
[ 240.602555] [<ffffffff810cd5e0>] ? process_one_work+0x850/0x850
[ 240.602665] [<ffffffff810d399b>] kthread+0x10b/0x130
[ 240.602762] [<ffffffff81028cc9>] ? sched_clock+0x9/0x10
[ 240.602862] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603004] [<ffffffff818651fc>] ret_from_fork+0x7c/0xb0
[ 240.603106] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603304] #0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccf0f>]
process_one_work+0x17f/0x850
[<ffffffff810ccf0f>] process_one_work+0x17f/0x850
[ 240.603691] #2: (net_mutex){+.+.+.}, at: [<ffffffff817084cc>]
cleanup_net+0x8c/0x1f0
[<ffffffff8112a625>] _rcu_barrier+0x35/0x200
[ 240.604211] INFO: task modprobe:1387 blocked for more than 120 seconds.
[ 240.604329] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.604434] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.604570] modprobe D ffff8800cb4f1a40 13112 1387 1386 0x00000080
[ 240.604719] ffff8800cafbbbe8 0000000000000096 ffff8800cb4f1a40
00000000001d5f00
[ 240.604878] ffff8800cafbbfd8 00000000001d5f00 ffff880223280000
ffff8800cb4f1a40
[ 240.605068] ffff8800cb4f1a40 ffffffff81f8fb48 0000000000000246
ffff8800cb4f1a40
[ 240.605283] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.605400] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.605510] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605626] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605757] [<ffffffffa0701000>] ? 0xffffffffa0701000
[ 240.605854] [<ffffffff8170835f>] register_pernet_subsys+0x1f/0x50
[ 240.606005] [<ffffffffa0701048>] br_init+0x48/0xd3 [bridge]
[ 240.606112] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.606224] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.606327] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.606433] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.606557] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.606664] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.606845] #0: (net_mutex){+.+.+.}, at: [<ffffffff8170835f>]
register_pernet_subsys+0x1f/0x50
[ 240.607114] INFO: task modprobe:1466 blocked for more than 120 seconds.
[ 240.607231] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.607337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.607473] modprobe D ffff88020fbab480 13096 1466 1399 0x00000084
[ 240.607622] ffff88020d1bbbe8 0000000000000096 ffff88020fbab480
00000000001d5f00
[ 240.607791] ffff88020d1bbfd8 00000000001d5f00 ffffffff81e1b580
ffff88020fbab480
[ 240.607949] ffff88020fbab480 ffffffff81f8fb48 0000000000000246
ffff88020fbab480
[ 240.608193] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.608316] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.608425] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608542] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608662] [<ffffffffa071d000>] ? 0xffffffffa071d000
[ 240.608759] [<ffffffff817083ad>] register_pernet_device+0x1d/0x70
[ 240.608881] [<ffffffffa071d020>] ppp_init+0x20/0x1000 [ppp_generic]
[ 240.609021] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.609131] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.609235] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.609339] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.609462] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.609568] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.609749] #0: (net_mutex){+.+.+.}, at: [<ffffffff817083ad>]
register_pernet_device+0x1d/0x70
Looks like contention on net_mutex or something, but I honestly have
no idea yet. I can't recreate it myself at the moment or I would
bisect.
Has nobody else run into this with the pre-3.18 kernels? Fedora isn't
carrying any patches in this area.
josh
Josh Boyer
2014-10-22 18:03:11 UTC
Permalink
On Wed, Oct 22, 2014 at 1:59 PM, Paul E. McKenney
Post by Paul E. McKenney
Post by Cong Wang
(Adding Paul and Eric in Cc)
I am not aware of any change in net/core/dev.c related here,
so I guess it's a bug in rcu_barrier().
Thanks.
Do commits 789cbbeca4e (workqueue: Add quiescent state between work items)
and 3e28e3772 (workqueue: Use cond_resched_rcu_qs macro) help with this?
I don't believe so. The output below is from a post 3.18-rc1 kernel
(Linux v3.18-rc1-221-gc3351dfabf5c to be exact), and both of those
commits are included in that if I'm reading the git output correctly.

josh
Post by Paul E. McKenney
Post by Cong Wang
Post by Josh Boyer
[ 240.599195] INFO: task kworker/u16:5:100 blocked for more than 120 seconds.
[ 240.599338] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.599446] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.599583] kworker/u16:5 D ffff8802202db480 12400 100 2 0x00000000
[ 240.599744] Workqueue: netns cleanup_net
[ 240.599823] ffff8802202eb9e8 0000000000000096 ffff8802202db480
00000000001d5f00
[ 240.600066] ffff8802202ebfd8 00000000001d5f00 ffff8800368c3480
ffff8802202db480
[ 240.600228] ffffffff81ee2690 7fffffffffffffff ffffffff81ee2698
ffffffff81ee2690
[ 240.600445] [<ffffffff8185e239>] schedule+0x29/0x70
[ 240.600541] [<ffffffff8186345c>] schedule_timeout+0x26c/0x410
[ 240.600651] [<ffffffff81865ef7>] ? retint_restore_args+0x13/0x13
[ 240.600765] [<ffffffff818644e4>] ? _raw_spin_unlock_irq+0x34/0x50
[ 240.600879] [<ffffffff8185fc6c>] wait_for_completion+0x10c/0x150
[ 240.601025] [<ffffffff810e53e0>] ? wake_up_state+0x20/0x20
[ 240.601133] [<ffffffff8112a749>] _rcu_barrier+0x159/0x200
[ 240.601237] [<ffffffff8112a845>] rcu_barrier+0x15/0x20
[ 240.601335] [<ffffffff81718ebf>] netdev_run_todo+0x6f/0x310
[ 240.601442] [<ffffffff8170da85>] ? rollback_registered_many+0x265/0x2e0
[ 240.601564] [<ffffffff81725f2e>] rtnl_unlock+0xe/0x10
[ 240.601660] [<ffffffff8170f8e6>] default_device_exit_batch+0x156/0x180
[ 240.601781] [<ffffffff810fd8a0>] ? abort_exclusive_wait+0xb0/0xb0
[ 240.601895] [<ffffffff81707993>] ops_exit_list.isra.1+0x53/0x60
[ 240.602028] [<ffffffff81708540>] cleanup_net+0x100/0x1f0
[ 240.602131] [<ffffffff810ccfa8>] process_one_work+0x218/0x850
[ 240.602241] [<ffffffff810ccf0f>] ? process_one_work+0x17f/0x850
[ 240.602350] [<ffffffff810cd6c7>] ? worker_thread+0xe7/0x4a0
[ 240.602454] [<ffffffff810cd64b>] worker_thread+0x6b/0x4a0
[ 240.602555] [<ffffffff810cd5e0>] ? process_one_work+0x850/0x850
[ 240.602665] [<ffffffff810d399b>] kthread+0x10b/0x130
[ 240.602762] [<ffffffff81028cc9>] ? sched_clock+0x9/0x10
[ 240.602862] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603004] [<ffffffff818651fc>] ret_from_fork+0x7c/0xb0
[ 240.603106] [<ffffffff810d3890>] ? kthread_create_on_node+0x250/0x250
[ 240.603304] #0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccf0f>]
process_one_work+0x17f/0x850
[<ffffffff810ccf0f>] process_one_work+0x17f/0x850
[ 240.603691] #2: (net_mutex){+.+.+.}, at: [<ffffffff817084cc>]
cleanup_net+0x8c/0x1f0
[<ffffffff8112a625>] _rcu_barrier+0x35/0x200
[ 240.604211] INFO: task modprobe:1387 blocked for more than 120 seconds.
[ 240.604329] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.604434] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.604570] modprobe D ffff8800cb4f1a40 13112 1387 1386 0x00000080
[ 240.604719] ffff8800cafbbbe8 0000000000000096 ffff8800cb4f1a40
00000000001d5f00
[ 240.604878] ffff8800cafbbfd8 00000000001d5f00 ffff880223280000
ffff8800cb4f1a40
[ 240.605068] ffff8800cb4f1a40 ffffffff81f8fb48 0000000000000246
ffff8800cb4f1a40
[ 240.605283] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.605400] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.605510] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605626] [<ffffffff8170835f>] ? register_pernet_subsys+0x1f/0x50
[ 240.605757] [<ffffffffa0701000>] ? 0xffffffffa0701000
[ 240.605854] [<ffffffff8170835f>] register_pernet_subsys+0x1f/0x50
[ 240.606005] [<ffffffffa0701048>] br_init+0x48/0xd3 [bridge]
[ 240.606112] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.606224] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.606327] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.606433] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.606557] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.606664] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.606845] #0: (net_mutex){+.+.+.}, at: [<ffffffff8170835f>]
register_pernet_subsys+0x1f/0x50
[ 240.607114] INFO: task modprobe:1466 blocked for more than 120 seconds.
[ 240.607231] Not tainted 3.18.0-0.rc1.git2.1.fc22.x86_64 #1
[ 240.607337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.607473] modprobe D ffff88020fbab480 13096 1466 1399 0x00000084
[ 240.607622] ffff88020d1bbbe8 0000000000000096 ffff88020fbab480
00000000001d5f00
[ 240.607791] ffff88020d1bbfd8 00000000001d5f00 ffffffff81e1b580
ffff88020fbab480
[ 240.607949] ffff88020fbab480 ffffffff81f8fb48 0000000000000246
ffff88020fbab480
[ 240.608193] [<ffffffff8185e7e1>] schedule_preempt_disabled+0x31/0x80
[ 240.608316] [<ffffffff81860033>] mutex_lock_nested+0x183/0x440
[ 240.608425] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608542] [<ffffffff817083ad>] ? register_pernet_device+0x1d/0x70
[ 240.608662] [<ffffffffa071d000>] ? 0xffffffffa071d000
[ 240.608759] [<ffffffff817083ad>] register_pernet_device+0x1d/0x70
[ 240.608881] [<ffffffffa071d020>] ppp_init+0x20/0x1000 [ppp_generic]
[ 240.609021] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 240.609131] [<ffffffff81153c02>] load_module+0x20c2/0x2870
[ 240.609235] [<ffffffff8114ebe0>] ? store_uevent+0x70/0x70
[ 240.609339] [<ffffffff8110ac26>] ? lock_release_non_nested+0x3c6/0x3d0
[ 240.609462] [<ffffffff81154497>] SyS_init_module+0xe7/0x140
[ 240.609568] [<ffffffff818652a9>] system_call_fastpath+0x12/0x17
[ 240.609749] #0: (net_mutex){+.+.+.}, at: [<ffffffff817083ad>]
register_pernet_device+0x1d/0x70
Looks like contention on net_mutex or something, but I honestly have
no idea yet. I can't recreate it myself at the moment or I would
bisect.
Has nobody else run into this with the pre-3.18 kernels? Fedora isn't
carrying any patches in this area.
josh