Commit 4b5d1e4

QQXanadu authored and akpm00 committed
zsmalloc: fix races between modifications of fullness and isolated
We encountered many kernel exceptions of VM_BUG_ON(zspage->isolated == 0) in dec_zspage_isolation() and BUG_ON(!pages[1]) in zs_unmap_object() lately. This issue only occurs when migration and reclamation occur at the same time. With our memory stress test, we can reproduce this issue several times a day. We have no idea why no one else encountered this issue. BTW, we switched to the new kernel version with this defect a few months ago.

Since fullness and isolated share the same unsigned int, modifications of them should be protected by the same lock.

[andrew.yang@mediatek.com: move comment]
Link: https://lkml.kernel.org/r/20230727062910.6337-1-andrew.yang@mediatek.com
Link: https://lkml.kernel.org/r/20230721063705.11455-1-andrew.yang@mediatek.com
Fixes: c4549b8 ("zsmalloc: remove zspage isolation for migration")
Signed-off-by: Andrew Yang <andrew.yang@mediatek.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
1 parent 5d0c230 commit 4b5d1e4

1 file changed: mm/zsmalloc.c (9 additions, 5 deletions)
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1798,6 +1798,7 @@ static void replace_sub_page(struct size_class *class, struct zspage *zspage,
 
 static bool zs_page_isolate(struct page *page, isolate_mode_t mode)
 {
+	struct zs_pool *pool;
 	struct zspage *zspage;
 
 	/*
@@ -1807,9 +1808,10 @@ static bool zs_page_isolate(struct page *page, isolate_mode_t mode)
 	VM_BUG_ON_PAGE(PageIsolated(page), page);
 
 	zspage = get_zspage(page);
-	migrate_write_lock(zspage);
+	pool = zspage->pool;
+	spin_lock(&pool->lock);
 	inc_zspage_isolation(zspage);
-	migrate_write_unlock(zspage);
+	spin_unlock(&pool->lock);
 
 	return true;
 }
@@ -1875,12 +1877,12 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 	kunmap_atomic(s_addr);
 
 	replace_sub_page(class, zspage, newpage, page);
+	dec_zspage_isolation(zspage);
 	/*
 	 * Since we complete the data copy and set up new zspage structure,
 	 * it's okay to release the pool's lock.
 	 */
 	spin_unlock(&pool->lock);
-	dec_zspage_isolation(zspage);
 	migrate_write_unlock(zspage);
 
 	get_page(newpage);
@@ -1897,14 +1899,16 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 
 static void zs_page_putback(struct page *page)
 {
+	struct zs_pool *pool;
 	struct zspage *zspage;
 
 	VM_BUG_ON_PAGE(!PageIsolated(page), page);
 
 	zspage = get_zspage(page);
-	migrate_write_lock(zspage);
+	pool = zspage->pool;
+	spin_lock(&pool->lock);
 	dec_zspage_isolation(zspage);
-	migrate_write_unlock(zspage);
+	spin_unlock(&pool->lock);
 }
 
 static const struct movable_operations zsmalloc_mops = {
