Commit fb8f9b2

Baolin Wang authored and gregkh committed
blk-cgroup: Use cond_resched() when destroy blkgs
[ Upstream commit 6c635ca ]

On a !PREEMPT kernel, we can get the softlockup below when stress testing repeatedly creates and destroys block cgroups. The reason is that it may take a long time to acquire the queue's lock in the loop of blkcg_destroy_blkgs(), or the system can accumulate a huge number of blkgs in pathological cases. Add a need_resched() check on each loop iteration and, if a reschedule is needed, release the locks and call cond_resched(); this is safe because blkcg_destroy_blkgs() is not called from atomic contexts.

[ 4757.010308] watchdog: BUG: soft lockup - CPU#11 stuck for 94s!
[ 4757.010698] Call trace:
[ 4757.010700]  blkcg_destroy_blkgs+0x68/0x150
[ 4757.010701]  cgwb_release_workfn+0x104/0x158
[ 4757.010702]  process_one_work+0x1bc/0x3f0
[ 4757.010704]  worker_thread+0x164/0x468
[ 4757.010705]  kthread+0x108/0x138

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
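To see the shape of the fix outside the kernel, here is a minimal userspace sketch of the same back-off pattern. Everything in it is hypothetical illustration: drain_all(), struct item, list_lock, and inner are invented names, and sched_yield() is only a rough stand-in for cond_resched() (userspace has no direct need_resched() equivalent). A thread drains a shared list; when an item's inner lock is contended, it drops the outer lock and yields instead of busy-waiting.

#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

struct item {
        struct item *next;
        pthread_mutex_t *inner;         /* per-item lock, like q->queue_lock */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct item *head;               /* shared list, like blkcg->blkg_list */

static void drain_all(void)
{
        pthread_mutex_lock(&list_lock);

        while (head) {
                struct item *it = head;

                /* Inner lock contended: drop the outer lock, yield, retry. */
                if (pthread_mutex_trylock(it->inner) != 0) {
                        pthread_mutex_unlock(&list_lock);
                        sched_yield();  /* stand-in for cond_resched() */
                        pthread_mutex_lock(&list_lock);
                        continue;       /* re-read head; it may have changed */
                }

                head = it->next;        /* unlink and free, like blkg_destroy() */
                pthread_mutex_unlock(it->inner);
                free(it);
        }

        pthread_mutex_unlock(&list_lock);
}

int main(void)
{
        drain_all();    /* list is trivially empty here; real use would populate it */
        return 0;
}

The continue after reacquiring the outer lock mirrors the patch below: the loop re-reads the list head rather than assuming the previously seen entry is still valid.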
1 parent: 4d00f1b · commit: fb8f9b2

1 file changed

File tree

block/blk-cgroup.c

Lines changed: 13 additions & 5 deletions
@@ -1017,21 +1017,29 @@ static void blkcg_css_offline(struct cgroup_subsys_state *css)
  */
 void blkcg_destroy_blkgs(struct blkcg *blkcg)
 {
+        might_sleep();
+
         spin_lock_irq(&blkcg->lock);

         while (!hlist_empty(&blkcg->blkg_list)) {
                 struct blkcg_gq *blkg = hlist_entry(blkcg->blkg_list.first,
                                                 struct blkcg_gq, blkcg_node);
                 struct request_queue *q = blkg->q;

-                if (spin_trylock(&q->queue_lock)) {
-                        blkg_destroy(blkg);
-                        spin_unlock(&q->queue_lock);
-                } else {
+                if (need_resched() || !spin_trylock(&q->queue_lock)) {
+                        /*
+                         * Given that the system can accumulate a huge number
+                         * of blkgs in pathological cases, check to see if we
+                         * need to rescheduling to avoid softlockup.
+                         */
                         spin_unlock_irq(&blkcg->lock);
-                        cpu_relax();
+                        cond_resched();
                         spin_lock_irq(&blkcg->lock);
+                        continue;
                 }
+
+                blkg_destroy(blkg);
+                spin_unlock(&q->queue_lock);
         }

         spin_unlock_irq(&blkcg->lock);
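
Two details of the fix are worth noting. cpu_relax() is only a busy-wait hint to the CPU; it never enters the scheduler, so on a !PREEMPT kernel the retry loop could hold the CPU long enough to trip the soft-lockup watchdog. cond_resched() is an explicit voluntary preemption point, legal here only because both spinlocks are dropped first; the new might_sleep() documents that requirement and, with CONFIG_DEBUG_ATOMIC_SLEEP enabled, warns if a future caller reaches this function from atomic context. Checking need_resched() even when the trylock would succeed also lets the loop yield periodically while destroying a huge number of blkgs, not just under lock contention.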
