
Fix Markdown blockquote parsing#1627

Open
st0012 wants to merge 2 commits into master from fix-blockquote-lazy-continuation

Conversation

Member

@st0012 st0012 commented Feb 28, 2026

Summary

Two fixes to the BlockQuoteRaw rule in the PEG grammar:

  • Separate blockquotes (separated by an unquoted blank line) were merged into one. Now only blank lines prefixed with > continue the blockquote, so an unquoted blank line ends it.
  • Lazy continuation was consuming block-level elements (headings, list markers, code fences) that should break out of the blockquote. Added negative lookaheads for these elements.
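
The first fix can be illustrated with a small line-based model. This is only a sketch under stated assumptions: `blockquote_groups` is a hypothetical helper, not part of RDoc or the kpeg-generated parser, and it deliberately leaves out the new negative lookaheads so that it shows only how an unquoted blank line ends a blockquote.

```ruby
# Hypothetical, simplified model of the blank-line rule described above.
# It is NOT RDoc's parser: it only groups lines into blockquotes and
# ignores the heading/list/code-fence lookaheads.
def blockquote_groups(text)
  groups = []
  in_quote = false
  text.each_line(chomp: true) do |line|
    if line.start_with?(">")
      # A ">"-prefixed line (blank or not) starts or continues a blockquote.
      groups << [] unless in_quote
      in_quote = true
      groups.last << line.sub(/\A> ?/, "")
    elsif in_quote && !line.strip.empty?
      # Lazy continuation: an unquoted, non-blank line stays in the quote.
      groups.last << line
    else
      # An unquoted blank line (or any line outside a quote) ends the quote.
      in_quote = false
    end
  end
  groups
end
```

In this model, `"> a\n\n> b\n"` yields two groups, while `"> a\n>\n> b\n"` yields a single group whose middle entry is an empty string.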

Ref: #1550 (comment)

Collaborator

matzbot commented Feb 28, 2026

🚀 Preview deployment available at: https://31657d5e.rdoc-6cd.pages.dev (commit: 2d74d02)

@st0012 st0012 force-pushed the fix-blockquote-lazy-continuation branch from 1959fd1 to d658309 Compare February 28, 2026 21:23
@st0012 st0012 added the bug label Feb 28, 2026
@st0012 st0012 force-pushed the fix-blockquote-lazy-continuation branch 4 times, most recently from 6f54b65 to c168d42 Compare February 28, 2026 22:33
@st0012 st0012 changed the title Fix blockquote lazy continuation and preserve Markdown newlines Fix blockquote parsing and soft line break rendering to match GFM Feb 28, 2026
@st0012 st0012 changed the title Fix blockquote parsing and soft line break rendering to match GFM Improve Markdown blockquote rendering Feb 28, 2026
@st0012 st0012 force-pushed the fix-blockquote-lazy-continuation branch 2 times, most recently from 3d620eb to 0706cd1 Compare March 1, 2026 12:14
@st0012 st0012 changed the title Improve Markdown blockquote rendering Fix Markdown blockquote parsing Mar 1, 2026
@st0012 st0012 marked this pull request as ready for review March 1, 2026 12:15
@st0012 st0012 force-pushed the fix-blockquote-lazy-continuation branch from 0706cd1 to 652ed49 Compare April 4, 2026 14:14
@kou kou requested a review from Copilot April 8, 2026 14:28

Copilot AI left a comment


Pull request overview

Fixes blockquote parsing so that (1) unquoted blank lines terminate a blockquote instead of being absorbed, and (2) “lazy continuation” does not incorrectly consume certain block-level constructs that should break out of a blockquote.

Changes:

  • Update BlockQuoteRaw grammar to require >-prefixed blank lines for blockquote continuation.
  • Add negative lookaheads to prevent lazy continuation from consuming headings/list markers/code fences.
  • Add regression tests for separate blockquotes and lazy-continuation breakouts.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| test/rdoc/rdoc_markdown_test.rb | Adds tests covering separated blockquotes and ensuring list/heading/fence starters break out of lazy continuation. |
| lib/rdoc/markdown.rb | Updates generated parser implementation of BlockQuoteRaw to match the revised grammar. |
| lib/rdoc/markdown.kpeg | Updates the PEG grammar for BlockQuoteRaw to correctly terminate on unquoted blank lines and block certain lazy continuations. |



assert_equal expected, doc
end


Copilot AI Apr 8, 2026


There are tests toggling @parser.github = false for fenced code blocks, but the new blockquote/code-fence interaction isn’t covered in that mode. Add a regression test that sets @parser.github = false and verifies that a ``` line does not force a blockquote to terminate (since fences aren’t block-level without the GitHub extension).

Suggested change
def test_parse_block_quote_no_lazy_continuation_for_code_fence_non_github
  @parser.github = false
  doc = parse <<~BLOCK_QUOTE
    > foo
    ```
    code
    ```
  BLOCK_QUOTE
  expected =
    doc(
      block(
        para("foo\n```\ncode\n```")))
  assert_equal expected, doc
end

(( ">" " "? Line:l { a << l } )
( !">" !@BlankLine Line:c { a << c } )*
( @BlankLine:n { a << n } )*
( !">" !@BlankLine !AtxStart !Bullet !Enumerator !Ticks3 Line:c { a << c } )*
Member


Does this work with the following cases?

> x
#no-space-heading

x
#no-space-heading

> x
|header|
|--|
|content|

x
|header|
|--|
|content|

Member Author


  1. Kind of, yes. It's displayed that way, but the # is stripped. That's a bug I'll address separately in "Preserve # prefix for unresolved cross-references" (#1676).
  2. The table is included on the same line as x in the quote block. I tried to fix it, but it's a much more complex fix, so I think it deserves a separate PR.

Member


It makes sense.

We may need to add more !XXX eventually. (XXX will come from

rdoc/lib/rdoc/markdown.kpeg

Lines 566 to 581 in 799e551

Block = @BlankLine*
        ( BlockQuote
        | Verbatim
        | CodeFence
        | Table
        | Note
        | Reference
        | HorizontalRule
        | Heading
        | OrderedList
        | BulletList
        | DefinitionList
        | HtmlBlock
        | StyleBlock
        | Para
        | Plain )
.)

st0012 added 2 commits April 9, 2026 16:43
- Headings (`# foo`), bullet lists, ordered lists, and code fences now
  break out of a blockquote instead of being lazily continued.
- Unquoted blank lines end the blockquote; only `>`-prefixed blank
  lines continue it.
- Heading lookahead uses `!(AtxStart @Spacechar)` so `#no-space` text
  (not a valid heading) is correctly kept inside the blockquote.
- Code fence lookahead is gated behind `&{ github? }` so it only
  applies when the GitHub extension is enabled, matching `CodeFence`.

Ref: #1627
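
The lookahead conditions in this commit message can be approximated by a single predicate. The sketch below is an assumption-laden illustration: `breaks_lazy_continuation?` and the three regex constants are hypothetical names, and the patterns only approximate the grammar's `AtxStart`, `Bullet`, `Enumerator`, and `Ticks3` rules rather than reproduce them.

```ruby
# Hypothetical approximation of the negative lookaheads; not RDoc's grammar.
ATX_HEADING = /\A[#]{1,6}[ \t]/                   # roughly !(AtxStart @Spacechar)
LIST_MARKER = /\A {0,3}(?:[*+-]|\d+\.)[ \t]/      # roughly !Bullet / !Enumerator
CODE_FENCE  = /\A```/                             # roughly !Ticks3

# Returns true when an unquoted line should end the blockquote instead of
# being lazily continued. The code-fence check applies only when the
# GitHub extension is enabled, mirroring the `&{ github? }` gate.
def breaks_lazy_continuation?(line, github: true)
  return true if line.match?(ATX_HEADING)
  return true if line.match?(LIST_MARKER)
  github && line.match?(CODE_FENCE)
end
```

Under these assumptions, `"# foo"` breaks out of the quote while `"#no-space"` does not, and a line of three backticks breaks out only when `github:` is true.
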
When `#name` doesn't resolve to a method, the cross-reference handler
was stripping the `#` and returning just the name. Now the original
text including `#` is restored when the lookup fails.

This fixes rendering of text like `#no-space-heading` in Markdown
paragraphs, where the `#` was silently dropped in the final HTML.
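
The fallback described in this commit message can be sketched as follows. Everything here is hypothetical: `KNOWN_METHODS` and `link_for` are stand-ins invented for illustration and do not reflect RDoc's actual cross-reference API.

```ruby
# Hypothetical stand-in for a cross-reference lookup table; not RDoc's API.
KNOWN_METHODS = { "parse" => "Parser.html#method-i-parse" }.freeze

# Resolves text such as "#parse" to a link. When the lookup fails, the
# original text is returned unchanged, so the leading "#" is preserved.
def link_for(text)
  name = text.delete_prefix("#")
  target = KNOWN_METHODS[name]
  return text unless target
  %(<a href="#{target}">#{name}</a>)
end
```

With this sketch, `link_for("#no-space-heading")` returns the input as-is because no method of that name is registered.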

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.



Member

@kou kou left a comment


+1
