python driver: preserve trailing quotes in agtype string values #2425
SAY-5 wants to merge 2 commits into
@SAY-5 Please rebase :)
| # that are part of the actual data when the value starts or ends | ||
| # with an escaped quote, e.g. '"foo \\"bar\\""' -> 'foo \\"bar\\', | ||
| # so trim exactly the first and last character instead. | ||
| return ctx.STRING().getText()[1:-1] |
Centralized into a small _stripStringDelimiters helper.
| # See visitStringValue() for why we slice instead of using strip('"'). | ||
| return (strNode.getText()[1:-1] , agValNode) |
Both call sites now use _stripStringDelimiters.
| raise AGTypeError(ctx.getText(), "Missing value in object pair") | ||
| return (strNode.getText().strip('"') , agValNode) | ||
| # See visitStringValue() for why we slice instead of using strip('"'). | ||
| return (strNode.getText()[1:-1] , agValNode) |
Removed the stray space.
| """Issue #2418: visitStringValue must remove only the outer quote | ||
| delimiters, not every '"' on either side, otherwise values that end | ||
| with an escaped quote (e.g. '"foo \\"bar\\""') lose data.""" |
Reformatted to summary + body + closing quotes on their own line.
str.strip('"') in visitStringValue() and visitPair() removes every '"'
on either side of the token, not just the outer delimiters, so a value
ending in an escaped quote (e.g. '"foo \"bar\""') loses its trailing
backslash-escaped '"' character. The Agtype grammar guarantees STRING
tokens always carry exactly one delimiter on each side, so slice with
[1:-1] to strip them precisely.
Fixes apache#2418
Signed-off-by: SAY-5 <saiasish.cnp@gmail.com>
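The difference described in the commit message can be reproduced with plain Python strings. This is a minimal sketch that uses a hand-written token in place of the text the lexer would return; the variable names are illustrative and are not taken from `builder.py`.

```python
# Raw token text: one '"' delimiter on each side, escaped quotes in the body.
token = r'"foo \"bar\""'

# str.strip('"') keeps removing '"' characters until it meets something else,
# so the escaped quote at the end of the body is consumed with the delimiter.
stripped = token.strip('"')

# Slicing removes exactly one character from each side.
sliced = token[1:-1]

print(stripped)  # foo \"bar\
print(sliced)    # foo \"bar\"
```

Note that the backslash survives in both results; only the quote character after it is lost when `strip('"')` is used.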
Force-pushed from 91e2085 to 033d81a.
Rebased onto master.
@SAY-5 Can you address Copilot's comments above? Please address them in each comment :) It makes it easier to verify.
@jrgemignani Addressed all four Copilot comments in e265425: centralized the trim logic into _stripStringDelimiters and reused it in both visitStringValue/visitPair, removed the stray space before the comma, and reformatted the test docstring. Replied on each inline thread.
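For readers following the thread without the diff open, a centralized helper of this kind might look roughly like the sketch below. The body of `_stripStringDelimiters` is not shown in this conversation, so `strip_string_delimiters`, `string_value`, and `pair` are hypothetical stand-ins, and the defensive check is an assumption rather than something the PR is known to include.

```python
def strip_string_delimiters(token_text: str) -> str:
    """Hypothetical stand-in for the PR's _stripStringDelimiters helper."""
    # Assumed guard: refuse anything that is not wrapped in one '"' per side.
    if len(token_text) < 2 or token_text[0] != '"' or token_text[-1] != '"':
        raise ValueError(f"not a delimited STRING token: {token_text!r}")
    # Drop exactly the first and last character; the body is left untouched.
    return token_text[1:-1]


def string_value(token_text: str) -> str:
    # Stand-in for the visitStringValue() call site.
    return strip_string_delimiters(token_text)


def pair(key_token_text: str, value):
    # Stand-in for the visitPair() call site: only the key is a STRING token.
    return (strip_string_delimiters(key_token_text), value)
```

Routing both call sites through one function keeps the slicing rule in a single place, which is the point of the review comment.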
Fixes #2418.

`ResultVisitor.visitStringValue()` and `visitPair()` in `drivers/python/age/builder.py` use `str.strip('"')` to remove the surrounding delimiters from agtype STRING tokens. `str.strip()` removes all matching characters from both ends, so when a property value or object key starts or ends with an escaped quote, the parser drops the data character along with the delimiter:

| Input | Expected | Actual |
| --- | --- | --- |
| `"foo \"bar\""` | `foo \"bar\"` | `foo \"bar\` |
| `"\"leading"` | `\"leading` | `leading` |
| `"trailing\""` | `trailing\"` | `trailing\` |

The Agtype grammar (`drivers/Agtype.g4`) guarantees `STRING : '"' (ESC | SAFECODEPOINT)* '"'`, so the token always has exactly one `"` delimiter on each side. Slicing with `[1:-1]` removes them precisely without touching the body.

Test added: `test_string_value_preserves_inner_quotes` covers leading/trailing/embedded escaped quotes plus the `visitPair()` path for object keys, and exercises the empty-string `""` edge case.
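The grammar argument can be checked without the ANTLR parser. The sketch below approximates the STRING rule with a regular expression and confirms that `[1:-1]` yields the expected body for each shape of input the test is said to cover. The regex is an assumption: it treats ESC as a backslash followed by any one character and SAFECODEPOINT as anything other than `"` or `\`, which may be looser than the real rules in `drivers/Agtype.g4`. This is not the test added by the PR.

```python
import re

# Assumed approximation of STRING : '"' (ESC | SAFECODEPOINT)* '"'.
STRING_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"')

# token text -> body expected after removing the two delimiters
cases = {
    r'"foo \"bar\""': r'foo \"bar\"',  # ends with an escaped quote
    r'"\"leading"': r'\"leading',      # starts with an escaped quote
    r'"trailing\""': r'trailing\"',    # trailing escaped quote only
    r'"a \"b\" c"': r'a \"b\" c',      # embedded escaped quotes
    '""': '',                          # empty string
}

results = {}
for token, expected in cases.items():
    # Every well-formed token has at least the two delimiter characters,
    # so slicing never reaches into an index that does not exist.
    assert STRING_TOKEN.fullmatch(token) is not None
    results[token] = token[1:-1]
    assert results[token] == expected
```

Because the shortest possible match is `""`, the slice is always applied to a string of length two or more, and the empty string comes out as `''`.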