Fix pagerank convergence threshold for large sparse graphs (#1575) #1576
Open
pathway wants to merge 1 commit into Qiskit:main from
The previous convergence check `norm < n * tol` scaled the L1 tolerance by graph size, which made it a useless threshold once N > 2/tol (since L1 distance between probability vectors is bounded by 2). On large sparse graphs, the first power-iteration step's L1 diff from the uniform starting vector could trivially fall below `n * tol`, causing `pagerank` to return the initial uniform 1/N distribution without any indication of failure. Minimal reproduction: a 2000-node graph with 2 edges (path 0->1->2) returns `pr[2] = 0.0005` (uniform) instead of the correct `pr[2] = 0.00128` (2.6x above uniform). This patch changes the check to `norm < tol` (absolute L1 tolerance, matching the docstring semantics) and adds a regression test. Fixes Qiskit#1575
Summary
Fixes #1575: a silent bug where `pagerank()` returns the initial uniform `1/N` distribution on large sparse graphs instead of the actual PageRank.
Root Cause
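The convergence check `norm < (n as f64) * tol` scales the L1 tolerance by graph size. Since the L1 distance between probability vectors is bounded by 2, this threshold becomes vacuous once `N > 2/tol` (N > 2,000,000 with the default `tol=1e-6`), and it is already loose enough to misfire far below that size, as the 2000-node repro shows. The first power-iteration step's L1 diff from the uniform starting vector can trivially fall below `n * tol`, causing the algorithm to report convergence and return the initial uniform vector, with no error or warning.
Minimal repro (included as regression test): a 2000-node graph with 2 edges (path 0->1->2) returns `pr[2] = 0.0005` (uniform) instead of the correct `pr[2] = 0.00128` (2.6x above uniform).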
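To illustrate the failure mode without depending on rustworkx, here is a minimal pure-Python power-iteration sketch. It is not the library's implementation or the PR's test code: the function name `pagerank_sketch`, the `scale_tol_by_n` flag, and the choice to return the previous iterate once the check passes are assumptions made to mirror the behavior described above.

```python
def pagerank_sketch(n, edges, scale_tol_by_n, alpha=0.85, tol=1e-6, max_iter=100):
    """Hypothetical power-iteration PageRank on nodes 0..n-1.

    scale_tol_by_n=True  mimics the old check: norm < n * tol
    scale_tol_by_n=False mimics the new check: norm < tol
    Returns (rank_vector, iterations_run).
    """
    out_deg = [0] * n
    for src, _ in edges:
        out_deg[src] += 1

    rank = [1.0 / n] * n  # uniform starting vector
    threshold = n * tol if scale_tol_by_n else tol

    for iteration in range(1, max_iter + 1):
        # Mass held by nodes without out-edges is spread uniformly.
        dangling = sum(r for r, d in zip(rank, out_deg) if d == 0)
        base = (alpha * dangling + (1.0 - alpha)) / n
        new_rank = [base] * n
        for src, dst in edges:
            new_rank[dst] += alpha * rank[src] / out_deg[src]

        # L1 distance between successive iterates.
        norm = sum(abs(a - b) for a, b in zip(new_rank, rank))
        if norm < threshold:
            # Assumption: stop without adopting new_rank, so the
            # previous iterate is what gets returned.
            return rank, iteration
        rank = new_rank

    raise RuntimeError("power iteration did not converge")


n = 2000
edges = [(0, 1), (1, 2)]  # path 0 -> 1 -> 2, all other nodes isolated

old_rank, old_iters = pagerank_sketch(n, edges, scale_tol_by_n=True)
new_rank, new_iters = pagerank_sketch(n, edges, scale_tol_by_n=False)

print("old check:", old_iters, "iteration(s), pr[2] =", old_rank[2])
print("new check:", new_iters, "iteration(s), pr[2] =", new_rank[2])
```

In this sketch the first step's L1 diff is roughly 0.0017, which is below `n * tol = 0.002`, so the scaled check stops immediately and hands back the uniform vector, while the absolute check keeps iterating until `pr[2]` rises well above uniform.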
Change
One-line fix: `norm < (n as f64) * tol` → `norm < tol`. The `tol` parameter is now an absolute L1 tolerance, matching the docstring semantics ("error tolerance used when checking for convergence in the power method").
Testing
Added `test_sparse_large_graph_does_not_return_uniform` in `tests/digraph/test_pagerank.py`. It builds the 2000-node, 2-edge graph from the repro and asserts:
- `pr[2]` is at least 2x uniform (mass accumulates at the sink of the path)
- `pr[1]` is at least 1.5x uniform (mass flows through it)
- `pr[0]` is close to uniform (source node with no incoming edges)

All existing pagerank tests should continue to pass because the new threshold is strictly tighter: anywhere the old check converged correctly, the new check converges at the same iteration or later, never earlier.
Release Note
Added under `releasenotes/notes/fix-pagerank-convergence-threshold-1575.yaml`.
Notes
NetworkX's `pagerank` uses the same `err < N * tol` formula. We haven't verified whether NetworkX exhibits the same failure mode (our environment was missing `_bz2`, so we couldn't run a side-by-side comparison), but if it does, that's worth reporting upstream there too. This rustworkx fix stands on its own either way.
`rx.pagerank()` silently returned uniform 1/N for all nodes.