Runtime-Extensible SQL Parsers Using PEG

(duckdb.org)

92 points | by todsacerdoti 5 days ago

7 comments

  • xrd 4 days ago

    The incredible book Janet for Mortals by Ian Henry was my first exposure to PEGs. It's very interesting and changed my thinking about programming in a big way. It's a free book.

    https://janet.guide/pegular-expressions/

    • cryptonector 4 days ago

      Reminds me very much of the Icon programming language.

  • EuAndreh 4 days ago

    There is nothing wrong with using PEGs for SQL parsing, but this article (I didn't read the paper) presents flawed arguments:

    - tech $X is from the 60s, therefore it is bad and/or outdated: one doesn't need to "disrupt" or innovate in everything to become modern. There are plenty of things from the 60s that still don't have a better replacement, and it's OK to keep using them.

    - "YACC-style parsers" clumps together parsers that are generated at compile-time, from declarative grammars, using LALR(1). But that's not inherent to the technique or algorithm: a parser can be LALR(1) from a declarative grammar and still be extensible at run-time, or provide LL(1) alongside, or be built from statements instead of a grammar. There's nothing wrong with using PEGs over "YACC-style" parsers, but not for these distorted reasons.

    • ttfkam 3 days ago

      I'm not sure that was their position. They're not saying tech from the 60s was inherently bad. They specifically mentioned that we today are not constrained by the same hardware restrictions that gave rise to the software in the 1960s. Those are two very different positions.

      For example, I like Rust. But if Rust had been introduced as-is fifty years ago, no one would have used it, because the hardware requirements to make Rust compilation practical simply didn't exist yet. Taking a week just to compile "hello world" would have been a nonstarter. Not because Rust is bad but because hardware requirements at the time ruled something like it out.

      2024 is not 1964, however, and it's always good to re-examine old assumptions.

  • lovasoa 4 days ago

    From a practical standpoint, for anyone who needs to parse SQL today, I can recommend DataFusion's sqlparser-rs. This is what we use in http://sql-page.com, and I regularly contribute to it. I don't know anything else that matches its level of support for all the crazy little-known syntax particularities of the various SQL dialects.

    In particular, Microsoft SQL Server seems to do everything just a little bit differently, and sqlparser-rs does support its idiosyncrasies most of the time.

  • kristianp 4 days ago

    > parsers should be rewritten using modern abstractions like Parser Expression Grammars (PEG), which allow dynamic changes to the accepted query syntax and better error recovery.
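
For readers wondering what "dynamic changes to the accepted query syntax" could mean in practice, here is a minimal sketch, not DuckDB's implementation. It assumes a toy PEG-style recognizer in which each rule is an ordered list of alternatives, so a new statement form can be registered while the program is running. The `Grammar` class, its methods, and the simplified `UNPIVOT <name>` statement are all hypothetical names made up for this illustration.

```python
import re

# Hypothetical toy recognizer; IDENT does not exclude keywords, to keep it short.
IDENT_RE = re.compile(r"[A-Za-z_]\w*")


class Grammar:
    """PEG-style rules: each rule name maps to an ordered list of alternatives."""

    def __init__(self):
        self.rules = {}  # rule name -> list of symbol sequences

    def add_alternative(self, name, sequence):
        """Register a new alternative at run time (tried after existing ones)."""
        self.rules.setdefault(name, []).append(list(sequence))

    def _match_symbol(self, symbol, tokens, pos):
        if symbol in self.rules:  # nonterminal: recurse into the named rule
            return self._match_rule(symbol, tokens, pos)
        if pos >= len(tokens):
            return None
        token = tokens[pos]
        if symbol == "IDENT":
            return pos + 1 if IDENT_RE.fullmatch(token) else None
        # Anything else is treated as a case-insensitive keyword.
        return pos + 1 if token.upper() == symbol else None

    def _match_rule(self, name, tokens, pos):
        # Ordered choice: the first alternative that matches wins.
        for sequence in self.rules[name]:
            p = pos
            for symbol in sequence:
                p = self._match_symbol(symbol, tokens, p)
                if p is None:
                    break
            else:
                return p
        return None

    def accepts(self, text, start="statement"):
        tokens = text.split()
        return self._match_rule(start, tokens, 0) == len(tokens)


grammar = Grammar()
grammar.add_alternative("statement", ["SELECT", "IDENT", "FROM", "IDENT"])
print(grammar.accepts("SELECT a FROM t"))  # True
print(grammar.accepts("UNPIVOT t"))        # False: not in the grammar yet

# Extend the accepted syntax without regenerating or recompiling anything.
grammar.add_alternative("statement", ["UNPIVOT", "IDENT"])
print(grammar.accepts("UNPIVOT t"))        # True
```

The point of the sketch is only that the grammar is ordinary data consulted at parse time, which is what makes this kind of extension cheap; a generated table-driven parser would typically need its tables rebuilt first.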

  • gigatexal 4 days ago

    The DuckDB project continues to impress me every day.