Background
The parser currently recognises a single placeholder form, ? (positional, sequential). Most SQL dialects that use this parser as a frontend also accept one or both of these in production queries:
- PostgreSQL uses $N with N a positive integer, e.g. WHERE a = $1 AND b = $2. Order of appearance does not determine binding; the explicit number does.
- SQLite supports :name, e.g. WHERE a = :user AND b = :id. Bound by name.
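To make the two binding contracts concrete, here is a minimal driver-side sketch. The helpers bindDollar and bindNamed are hypothetical names used for illustration only; they are not part of hyrise/sql-parser or of any existing driver.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical driver-side helpers, not part of hyrise/sql-parser.

// $N is 1-based: $1 always selects args[0], wherever it appears in the query.
const std::string& bindDollar(const std::vector<std::string>& args, std::int64_t n) {
  if (n < 1 || static_cast<std::size_t>(n) > args.size()) {
    throw std::out_of_range("no argument for $" + std::to_string(n));
  }
  return args[static_cast<std::size_t>(n - 1)];
}

// :name is resolved by identifier; position in the query is irrelevant.
const std::string& bindNamed(const std::map<std::string, std::string>& args,
                             const std::string& name) {
  return args.at(name);  // throws std::out_of_range if the name is unbound
}
```

With args = {"alice", "42"}, the query WHERE b = $2 AND a = $1 still binds $1 to "alice" and $2 to "42", even though $2 appears first.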
Anyone embedding hyrise/sql-parser into a real database driver currently has to either pre-translate queries before parsing or work around the limitation downstream. Adding native support inside the parser is small, isolated, and unblocks those use cases without changing existing ? behaviour.
Proposed grammar
Two additional rules in the lexer, two additional alternatives in param_expr:
# flex_lexer.l (before the punctuation catch-all)
\$[0-9]+ -> DOLLAR_PARAM (ival = atoll(yytext + 1))
:[A-Za-z][A-Za-z0-9_]* -> NAMED_PARAM (sval = strdup(yytext + 1))
# bison_parser.y
param_expr
: '?' { /* existing */ }
| DOLLAR_PARAM { if ($1 < 1) YYERROR; // reject $0
$$ = Expr::makeDollarParameter($1); ... }
| NAMED_PARAM { $$ = Expr::makeNamedParameter($1); ... }
;
The identifier pattern matches the existing SQL_IDENTIFIER rule so :foo, :abc_123 parse and :_x does not — same surface as everywhere else in the grammar.
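For reviewers who want to check the accepted surface without building the lexer, the two patterns can be approximated with std::regex. This is a sketch under assumptions: parseDollarParam and parseNamedParam are hypothetical helpers, they test one isolated token rather than scanning a stream as flex does, and the $0 rejection that the grammar action performs is folded into the first helper.

```cpp
#include <cstdint>
#include <regex>
#include <string>

// Stand-in for the DOLLAR_PARAM rule plus the grammar's "$1 < 1" check.
// On success stores N in *n. Assumes the digits fit in a long long.
bool parseDollarParam(const std::string& token, std::int64_t* n) {
  static const std::regex pattern(R"(\$[0-9]+)");
  if (!std::regex_match(token, pattern)) return false;     // e.g. a lone "$"
  const std::int64_t value = std::stoll(token.substr(1));  // skip '$', like atoll(yytext + 1)
  if (value < 1) return false;                             // reject $0
  *n = value;
  return true;
}

// Stand-in for the NAMED_PARAM rule. On success stores the identifier in *name.
bool parseNamedParam(const std::string& token, std::string* name) {
  static const std::regex pattern(R"(:[A-Za-z][A-Za-z0-9_]*)");
  if (!std::regex_match(token, pattern)) return false;  // e.g. ":_x" or ":1"
  *name = token.substr(1);                              // skip ':', like strdup(yytext + 1)
  return true;
}
```

Under this sketch $12 and :abc_123 are accepted, while $0, a lone $ and :_x are rejected.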
Conflict analysis
$ is not currently a lexer token. No collision.
: is in the punctuation character class but never referenced by any grammar rule (zero matches for ':' in bison_parser.y); the standalone : token remains defined but unreachable. There is no :: cast syntax in the grammar, so :identifier does not interfere with anything.
bison -v reports no new shift/reduce or reduce/reduce conflicts on the resulting grammar (verified locally).
Proposed AST
Three ExprType values rather than one with a discriminator field:
enum ExprType {
...
kExprParameter, // ?
kExprParameterDollar, // $N — ival holds N (1-based, preserves user intent)
kExprParameterNamed, // :name — name holds the identifier
...
};
static Expr* Expr::makeDollarParameter(int64_t n);
static Expr* Expr::makeNamedParameter(char* name);
The top-level input rule's renumber loop only touches kExprParameter so ? retains its current 0-based sequential ival semantics; $N keeps its explicit N (consistent with PostgreSQL's contract); :name is bound by name and its ival is not meaningful.
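The renumbering rule can be illustrated with a simplified model. ParamKind, Param and renumberPositional below are hypothetical stand-ins, not the real hsql::Expr or the actual loop in the input rule; the sketch only shows which nodes get a new ival and which are left alone.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified model of the three placeholder node kinds.
enum ParamKind { kParameter, kParameterDollar, kParameterNamed };

struct Param {
  ParamKind kind;
  std::int64_t ival;  // ?: sequential index; $N: explicit N; :name: not meaningful
  std::string name;   // only set for :name
};

// Assigns 0-based sequential indices to ? placeholders in order of appearance.
// $N and :name nodes are skipped, so their values survive unchanged.
void renumberPositional(std::vector<Param>& params) {
  std::int64_t next = 0;
  for (Param& p : params) {
    if (p.kind == kParameter) p.ival = next++;
  }
}
```

For the placeholder sequence ?, $2, ?, :user the two ? nodes end up with ival 0 and 1, $2 keeps ival 2, and :user keeps its name.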
The reason for three distinct enum values rather than overloading one: future consumers (binders, AST printers, query rewriters) get an explicit switch dispatch instead of having to remember to check whether name is null. It also makes round-trip printing through sqlhelper straightforward.
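As a rough illustration of the switch-dispatch argument, a printer over a simplified node model could look like the following. PlaceholderKind, Placeholder and printPlaceholder are hypothetical names for this sketch; the real printing code lives in sqlhelper and operates on hsql::Expr.

```cpp
#include <cstdint>
#include <string>

// Hypothetical, simplified model; the real node type is hsql::Expr.
enum PlaceholderKind { kPlain, kDollar, kNamed };

struct Placeholder {
  PlaceholderKind kind;
  std::int64_t ival;  // N for $N; sequential index for ?
  std::string name;   // identifier for :name
};

// Renders a placeholder back to its source form. Each kind has its own case,
// so no branch has to inspect whether name is empty to decide what it is.
std::string printPlaceholder(const Placeholder& p) {
  switch (p.kind) {
    case kPlain:
      return "?";
    case kDollar:
      return "$" + std::to_string(p.ival);
    case kNamed:
      return ":" + p.name;
  }
  return "";  // not reached for the three kinds above
}
```

With most compilers, warnings for unhandled enumerators in a switch also flag a consumer that forgets one of the kinds, which a single type with a nullable name would not give us.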
Backward compatibility
- The ? path is unchanged at every layer: lexer rule, parser action, AST, SQLParserResult::parameters() order.
- Existing tests under test/prepare_tests.cpp keep passing untouched.
- The two new ExprType values are appended after kExprParameter, so no ordinal change.
- Mixing styles in one statement (e.g. WHERE a = ? AND b = $1) parses successfully. We took the position that policing the mix belongs in the driver, not the parser.
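A driver that does want to reject mixed styles could do so with a check along these lines. This is only a sketch: Style and usesSingleStyle are hypothetical names, and how a driver collects the style of each placeholder from the parse result is left out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical classification of each placeholder found in one statement.
enum class Style { Positional, Dollar, Named };

// Returns true when every placeholder uses the same style.
// A statement without placeholders trivially passes.
bool usesSingleStyle(const std::vector<Style>& styles) {
  for (std::size_t i = 1; i < styles.size(); ++i) {
    if (styles[i] != styles[0]) return false;
  }
  return true;
}
```

For WHERE a = ? AND b = $1 the input would be {Style::Positional, Style::Dollar} and the check fails, which is the point where a driver could raise its own error.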
Tests
The change comes with:
- Three new cases in test/prepare_tests.cpp covering $N, $N declared out of order, and :name.
- Three new good queries and two new bad queries ($0, lone $) in test/queries/.
make test passes all three checks (SQL tests, valgrind, grammar conflict).