Skip to content

Use pcre2 regex & enable auto async#6991

Open
dranikpg wants to merge 2 commits intodragonflydb:mainfrom
dranikpg:pcre2-regex
Open

Use pcre2 regex & enable auto async#6991
dranikpg wants to merge 2 commits intodragonflydb:mainfrom
dranikpg:pcre2-regex

Conversation

@dranikpg
Copy link
Copy Markdown
Contributor

No description provided.

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
@dranikpg
Copy link
Copy Markdown
Contributor Author

augment reviw

@dranikpg
Copy link
Copy Markdown
Contributor Author

augment review

@augmentcode
Copy link
Copy Markdown

augmentcode Bot commented Apr 20, 2026

🤖 Augment PR Summary

Summary: This PR replaces Lua script async-call detection from std::regex to PCRE2 to avoid stack overflows on 64KB fiber stacks and to enable automatic async-call rewriting.

Changes:

  • Adds PCRE2 as a bundled third-party dependency and links dfly_core / core tests against TRDP::pcre2
  • Removes optional system PCRE2 discovery logic from src/core/CMakeLists.txt
  • Reimplements Interpreter::DetectPossibleAsyncCalls using PCRE2 match iteration and match-data management
  • Preserves the existing rewrite behavior by inserting 'a' before call/pcall tokens (turning them into acall/apcall)

Technical Notes: PCRE2 is built statically with tests/grep disabled and JIT support enabled.

🤖 Was this summary useful? React with 👍 or 👎

Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 1 suggestion posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread src/core/interpreter.cc

sregex_iterator it{body.begin(), body.end(), kRegex};
sregex_iterator end{};
pcre2_match_data* match_data = pcre2_match_data_create_from_pattern(kRe, nullptr);
Copy link
Copy Markdown

@augmentcode augmentcode Bot Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/core/interpreter.cc:850 — pcre2_match_data_create_from_pattern(kRe, nullptr) can return nullptr (e.g., allocation failure), but match_data is used unconditionally in pcre2_match, which could crash. Consider handling a null match_data before entering the match loop.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

@dranikpg dranikpg marked this pull request as ready for review April 20, 2026 18:33
Copilot AI review requested due to automatic review settings April 20, 2026 18:33
Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 2 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread src/core/CMakeLists.txt
cxx_link(dash_bench dfly_core redis_test_lib)

helio_cxx_test(dfly_core_test dfly_core TRDP::fast_float ${PCRE2_LIB} ${RE2_LIB} LABELS DFLY)
helio_cxx_test(dfly_core_test dfly_core TRDP::fast_float TRDP::pcre2 ${RE2_LIB} LABELS DFLY)
Copy link
Copy Markdown

@augmentcode augmentcode Bot Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/core/CMakeLists.txt:34 — This file still has an if(LIB_PCRE2) block (later in the file) that sets USE_PCRE2, but LIB_PCRE2 is no longer discovered here, so that branch will never run and PCRE2-gated code/benchmarks in dfly_core_test will stay disabled unintentionally.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Comment thread src/core/interpreter.cc
static pcre2_code* kRe = [] {
int errnum;
PCRE2_SIZE erroffset;
// PCRE2_MULTILINE makes ^ match at line starts, PCRE2_DOTALL makes . match \n
Copy link
Copy Markdown

@augmentcode augmentcode Bot Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/core/interpreter.cc:819 — The comment mentions PCRE2_DOTALL but the compile flags only pass PCRE2_MULTILINE, which makes the comment misleading about what options are actually enabled.

Severity: low

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR switches Lua script “auto async” detection from std::regex to PCRE2 and updates the build to always provide/link PCRE2.

Changes:

  • Add PCRE2 as a bundled third-party dependency (static build, JIT enabled).
  • Replace std::regex-based matching in Interpreter::DetectPossibleAsyncCalls with PCRE2 matching to avoid fiber stack overflows.
  • Update core build/test linking to use TRDP::pcre2 instead of a system find_library result.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
src/external_libs.cmake Adds PCRE2 as a third-party dependency to ensure availability.
src/core/interpreter.cc Migrates async-call detection to PCRE2 to reduce stack usage on fibers.
src/core/CMakeLists.txt Removes system PCRE2 detection and links against the bundled TRDP::pcre2.

Comment thread src/core/CMakeLists.txt
Comment on lines 1 to 3
find_library(LIB_RE2 NAMES re2)
if(LIB_RE2)
set(RE2_LIB ${LIB_RE2})
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LIB_PCRE2 is no longer set (the find_library block was removed), but it’s still used later for if(LIB_PCRE2) to enable USE_PCRE2. As a result the define will never be set; switch the condition to something that’s actually defined now (e.g. if(TARGET TRDP::pcre2)), or just set USE_PCRE2 unconditionally since TRDP::pcre2 is always linked.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants