screenx影厅是什么| 书生是什么意思| 吃什么能增加免疫力| 买什么| 231是什么意思| 拔完智齿吃什么食物好| 芒果不可以跟什么一起吃| ms什么意思| kim是什么意思| 什么是宫腔镜手术| 蚊虫叮咬涂什么药| 八仙过海指什么生肖| 纷纷扬扬是什么意思| 痛风有什么症状| 静置是什么意思| 胸膜炎是什么症状| 核磁共振能检查什么| 套一是什么意思| 刚怀孕需要注意什么| 孕中期失眠是什么原因| 什么食物胆固醇高| 变蛋吃多了有什么好处和坏处| 月经几个月不来是什么原因| 陈皮是什么皮做的| 自然生化流产是什么意思| 凤毛麟角是什么生肖| 便秘和腹泻交替出现是什么意思| 人参果是什么季节的| 笑口常开是什么生肖| 什么是普世价值| 疗养是什么意思| 纹绣是什么| 吊人什么意思| 小孩口臭吃什么药| 误人子弟什么意思| 维生素b2治什么病| 深蓝色是什么颜色| 西梅不能和什么一起吃| 孕妇吃什么最好| 漂脱是什么意思| 八月一日是什么日子| 原来是什么意思| 物是人非什么意思| 边缘性人格障碍是什么| 身心是什么意思| 子宫肌瘤吃什么食物| 50岁用什么牌子化妆品好| 19点是什么时辰| 腰间盘突出挂什么科室| 0属于什么数| 妃嫔是什么意思| 味精吃多了有什么危害| 高线是什么| 胃食管反流有什么症状| 每天喝豆浆有什么好处| 事后紧急避孕药什么时候吃有效| 嗓子疼吃什么药最管用| gigi 是什么意思| 左手臂麻木是什么征兆| 脚发热是什么原因| 肝炎吃什么药好| 卷柏属于什么植物| 什么是纯净物| 胆囊壁不光滑是什么意思| 专科警校出来干什么| 援交什么意思| 接触性皮炎用什么药膏| 喝酒头晕是什么原因| 司是什么级别| 黄色是什么意思| 口嫌体正直什么意思| 七月七是什么星座| 不字五行属什么| 京东什么时候有活动| 五福临门是什么生肖| 用凝胶排出豆腐渣一样的东西是什么原因| 牙龈肿痛吃什么水果| 车万是什么意思| 牙龈一直肿不消什么原因| 梦见蛇预示着什么| 鳞状上皮是什么意思| 尿蛋白质弱阳性是什么意思| 什么食物对心脏好| 杰克琼斯属于什么档次| 斗智斗勇什么意思| 弛张热常见于什么病| 郫县豆瓣酱能做什么菜| 中性人是什么意思| 幽门杆菌吃什么药最好| 金目念什么| 飞蚊症是什么原因引起的| 白细胞减少有什么症状| 母亲节送什么颜色的康乃馨| 后背出汗多是什么原因| lop是什么意思| 鱼跳出鱼缸有什么征兆| 发小是什么| 取什么网名好听| 普拉提是什么意思| 肝阳上亢吃什么中成药| 白内障什么症状| 肠梗阻是什么| 什么手串最好| 心脏彩超能检查出什么| 手臂粗是什么原因| 薄荷有什么作用| 老是打嗝什么原因| 右眼跳是什么兆头| 梦见小婴儿是什么意思| 心脏早搏吃什么药效果好| 脑部缺氧有什么症状| 酸菜鱼用什么鱼做好吃| 公务员是干什么的| 龙跟什么生肖配对最好| 宫后积液是什么意思| 退役和退伍有什么区别| 小孩睡觉趴着睡是什么原因| 乌梅是什么| 肺的作用和功能是什么| 什么白| 恭送是什么意思| 腰椎间盘突出有什么症状| 起司是什么| 告辞是什么意思| 尿隐血弱阳性是什么意思| 藏风聚气是什么意思| 皮肤过敏忌口什么食物| mrcp检查是什么意思| 组cp是什么意思| 总蛋白低是什么意思| 曹植字什么| 绝望的绝是什么意思| 牙齿松动是什么原因| o和b型生的孩子是什么血型| 高血压什么症状表现| 为什么会有床虱| 子宫内膜增厚有什么影响| 经颅多普勒检查什么| 十月五号是什么星座| 月经不能吃什么东西| 脂肪肝吃什么中药| 时柱将星是什么意思| 喉咙痛吃什么| 文房四宝指的是什么| 梦见洗鞋子是什么意思| ch是什么牌子| 转氨酶高不能吃什么| 为什么睡觉流口水| 脖子肿大是什么病的症状| 蛇吐信子是什么意思| 湿气重吃什么中药| 小孩吃鹅蛋有什么好处| 打呼噜是什么引起的| 打喷嚏流清鼻涕属于什么感冒| 原始鳞状上皮成熟是什么意思| 喉咙发炎吃什么食物好| 冬枣什么时候成熟| 中专是什么学历| 阴道长什么样| 心下痞是什么意思| 阴道炎用什么药效果好| 肚脐叫什么穴位| 支原体阳性是什么意思| 脾胃虚弱吃什么药最好| 去痣挂号挂什么科| 炒菜什么油最好| 性冷淡吃什么药| 淋巴排毒是什么意思| 善变是什么意思| 大脖子病有什么症状| 豆浆和什么不能一起吃| 火险痣是什么意思| 三月二十二是什么星座| apf值是什么意思| 白内障是什么引起的| 什么植物驱蚊效果最好| 家里蟑螂多是什么原因| 凤鸾是什么意思| 什么什么为笑| 吃什么能让月经量增多| 沙棘是什么东西| 勃起困难是什么原因造成的| 换肾是什么病| 感染科主要看什么病| 前面有个豹子是什么车| 42年属什么生肖| 腿肿吃什么药消肿最快最有效| cs和cf有什么区别| 三花鱼是什么鱼| 噗噗噗是什么意思| 语言障碍挂什么科| 花圃是什么意思| a216是什么材质| 脚筋膜炎什么办法恢复的最快| 防字代表什么生肖| 塞来昔布是什么药| 产后吃什么对身体恢复好| 青少年吃什么钙片有助于长高| 什么的垂下| 男人头发硬说明什么| 油腻腻的什么| 压片糖果是什么意思| 长明灯是什么意思| 手掌发紫是什么原因| 胡萝卜是什么科| 消化不良吃什么水果好| 睡多了头疼是什么原因| 中暑是什么感觉| 漏蛋白是什么原因造成的| 大人是什么意思| 为什么现在不建议输液| 钙片什么牌子好| 遗精是什么症状| 25羟基维生素d是什么| 梦见打官司预示着什么| 狗狗为什么喜欢舔人| 观赏是什么意思| 胃肠型感冒吃什么药| 什么球身上长毛| 青霉素是什么药| 英语介词是什么意思| 无异于是什么意思| 玛丽苏是什么意思| 蛇缠腰是什么病| 鬼冢虎属于什么档次| 血红蛋白浓度是什么意思| 为什么心率过快| 什么东西化痰效果最好最快| 际遇是什么意思| 忌廉是什么东西| 箔是什么意思| 芥末油是什么提炼出来的| 风寒感冒咳嗽吃什么药| 姨妈痛吃什么药| 89年属什么生肖| 红茶适合什么季节喝| 破釜沉舟的釜是什么意思| amv是什么意思| 11月29是什么星座| 手抖是什么病的预兆| 3月7日是什么星座| 蝉鸣声耳鸣是什么原因引起的| 夏天为什么要吃姜| 胖头鱼又叫什么鱼| 素海参是什么做的| 血脂高吃什么好| 计数单位是什么意思| 为什么会湿气重| 魔鬼城是什么地貌| 近五行属什么| 木芙蓉什么时候开花| 白带是什么| 生肖马和什么生肖最配| 为什么突然就得肝炎了| 甘草不能和什么一起吃| 尿失禁是什么意思| 柠檬水苦是什么原因| bug什么意思| 起痱子是什么原因| punk什么意思| 宝宝嘴巴臭臭的是什么原因| 胃ct能检查出什么病| 做什么检查确诊是白塞| 胃痉挛有什么症状表现| 百度
Skip to content

tobymao/sqlglot

Repository files navigation

SQLGlot logo

SQLGlot is a no-dependency SQL parser, transpiler, optimizer, and engine. It can be used to format SQL or translate between 30 different dialects like DuckDB, Presto / Trino, Spark / Databricks, Snowflake, and BigQuery. It aims to read a wide variety of SQL inputs and output syntactically and semantically correct SQL in the targeted dialects.

It is a very comprehensive generic SQL parser with a robust test suite. It is also quite performant, while being written purely in Python.

You can easily customize the parser, analyze queries, traverse expression trees, and programmatically build SQL.

SQLGlot can detect a variety of syntax errors, such as unbalanced parentheses, incorrect usage of reserved keywords, and so on. These errors are highlighted and dialect incompatibilities can warn or raise depending on configurations.

Learn more about SQLGlot in the API documentation and the expression tree primer.

Contributions are very welcome in SQLGlot; read the contribution guide and the onboarding document to get started!

Table of Contents

Install

From PyPI:

pip3 install "sqlglot[rs]"

# Without Rust tokenizer (slower):
# pip3 install sqlglot

Or with a local checkout:

make install

Requirements for development (optional):

make install-dev

Versioning

Given a version number MAJOR.MINOR.PATCH, SQLGlot uses the following versioning strategy:

  • The PATCH version is incremented when there are backwards-compatible fixes or feature additions.
  • The MINOR version is incremented when there are backwards-incompatible fixes or feature additions.
  • The MAJOR version is incremented when there are significant backwards-incompatible fixes or feature additions.

Get in Touch

We'd love to hear from you. Join our community Slack channel!

FAQ

I tried to parse SQL that should be valid but it failed, why did that happen?

  • Most of the time, issues like this occur because the "source" dialect is omitted during parsing. For example, this is how to correctly parse a SQL query written in Spark SQL: parse_one(sql, dialect="spark") (alternatively: read="spark"). If no dialect is specified, parse_one will attempt to parse the query according to the "SQLGlot dialect", which is designed to be a superset of all supported dialects. If you tried specifying the dialect and it still doesn't work, please file an issue.

I tried to output SQL but it's not in the correct dialect!

  • Like parsing, generating SQL also requires the target dialect to be specified, otherwise the SQLGlot dialect will be used by default. For example, to transpile a query from Spark SQL to DuckDB, do parse_one(sql, dialect="spark").sql(dialect="duckdb") (alternatively: transpile(sql, read="spark", write="duckdb")).

What happened to sqlglot.dataframe?

  • The PySpark dataframe api was moved to a standalone library called SQLFrame in v24. It now allows you to run queries as opposed to just generate SQL.

Examples

Formatting and Transpiling

Easily translate from one dialect to another. For example, date/time functions vary between dialects and can be hard to deal with:

import sqlglot
sqlglot.transpile("SELECT EPOCH_MS(1618088028295)", read="duckdb", write="hive")[0]
'SELECT FROM_UNIXTIME(1618088028295 / POW(10, 3))'

SQLGlot can even translate custom time formats:

import sqlglot
sqlglot.transpile("SELECT STRFTIME(x, '%y-%-m-%S')", read="duckdb", write="hive")[0]
"SELECT DATE_FORMAT(x, 'yy-M-ss')"

Identifier delimiters and data types can be translated as well:

import sqlglot

# Spark SQL requires backticks (`) for delimited identifiers and uses `FLOAT` over `REAL`
sql = """WITH baz AS (SELECT a, c FROM foo WHERE a = 1) SELECT f.a, b.b, baz.c, CAST("b"."a" AS REAL) d FROM foo f JOIN bar b ON f.a = b.a LEFT JOIN baz ON f.a = baz.a"""

# Translates the query into Spark SQL, formats it, and delimits all of its identifiers
print(sqlglot.transpile(sql, write="spark", identify=True, pretty=True)[0])
WITH `baz` AS (
  SELECT
    `a`,
    `c`
  FROM `foo`
  WHERE
    `a` = 1
)
SELECT
  `f`.`a`,
  `b`.`b`,
  `baz`.`c`,
  CAST(`b`.`a` AS FLOAT) AS `d`
FROM `foo` AS `f`
JOIN `bar` AS `b`
  ON `f`.`a` = `b`.`a`
LEFT JOIN `baz`
  ON `f`.`a` = `baz`.`a`

Comments are also preserved on a best-effort basis:

sql = """
/* multi
   line
   comment
*/
SELECT
  tbl.cola /* comment 1 */ + tbl.colb /* comment 2 */,
  CAST(x AS SIGNED), # comment 3
  y               -- comment 4
FROM
  bar /* comment 5 */,
  tbl #          comment 6
"""

# Note: MySQL-specific comments (`#`) are converted into standard syntax
print(sqlglot.transpile(sql, read='mysql', pretty=True)[0])
/* multi
   line
   comment
*/
SELECT
  tbl.cola /* comment 1 */ + tbl.colb /* comment 2 */,
  CAST(x AS INT), /* comment 3 */
  y /* comment 4 */
FROM bar /* comment 5 */, tbl /*          comment 6 */

Metadata

You can explore SQL with expression helpers to do things like find columns and tables in a query:

from sqlglot import parse_one, exp

# print all column references (a and b)
for column in parse_one("SELECT a, b + 1 AS c FROM d").find_all(exp.Column):
    print(column.alias_or_name)

# find all projections in select statements (a and c)
for select in parse_one("SELECT a, b + 1 AS c FROM d").find_all(exp.Select):
    for projection in select.expressions:
        print(projection.alias_or_name)

# find all tables (x, y, z)
for table in parse_one("SELECT * FROM x JOIN y JOIN z").find_all(exp.Table):
    print(table.name)

Read the ast primer to learn more about SQLGlot's internals.

Parser Errors

When the parser detects an error in the syntax, it raises a ParseError:

import sqlglot
sqlglot.transpile("SELECT foo FROM (SELECT baz FROM t")
sqlglot.errors.ParseError: Expecting ). Line 1, Col: 34.
  SELECT foo FROM (SELECT baz FROM t
                                   ~

Structured syntax errors are accessible for programmatic use:

import sqlglot
try:
    sqlglot.transpile("SELECT foo FROM (SELECT baz FROM t")
except sqlglot.errors.ParseError as e:
    print(e.errors)
[{
  'description': 'Expecting )',
  'line': 1,
  'col': 34,
  'start_context': 'SELECT foo FROM (SELECT baz FROM ',
  'highlight': 't',
  'end_context': '',
  'into_expression': None
}]

Unsupported Errors

It may not be possible to translate some queries between certain dialects. For these cases, SQLGlot may emit a warning and will proceed to do a best-effort translation by default:

import sqlglot
sqlglot.transpile("SELECT APPROX_DISTINCT(a, 0.1) FROM foo", read="presto", write="hive")
APPROX_COUNT_DISTINCT does not support accuracy
'SELECT APPROX_COUNT_DISTINCT(a) FROM foo'

This behavior can be changed by setting the unsupported_level attribute. For example, we can set it to either RAISE or IMMEDIATE to ensure an exception is raised instead:

import sqlglot
sqlglot.transpile("SELECT APPROX_DISTINCT(a, 0.1) FROM foo", read="presto", write="hive", unsupported_level=sqlglot.ErrorLevel.RAISE)
sqlglot.errors.UnsupportedError: APPROX_COUNT_DISTINCT does not support accuracy

There are queries that require additional information to be accurately transpiled, such as the schemas of the tables referenced in them. This is because certain transformations are type-sensitive, meaning that type inference is needed in order to understand their semantics. Even though the qualify and annotate_types optimizer rules can help with this, they are not used by default because they add significant overhead and complexity.

Transpilation is generally a hard problem, so SQLGlot employs an "incremental" approach to solving it. This means that there may be dialect pairs that currently lack support for some inputs, but this is expected to improve over time. We highly appreciate well-documented and tested issues or PRs, so feel free to reach out if you need guidance!

Build and Modify SQL

SQLGlot supports incrementally building SQL expressions:

from sqlglot import select, condition

where = condition("x=1").and_("y=1")
select("*").from_("y").where(where).sql()
'SELECT * FROM y WHERE x = 1 AND y = 1'

It's possible to modify a parsed tree:

from sqlglot import parse_one
parse_one("SELECT x FROM y").from_("z").sql()
'SELECT x FROM z'

Parsed expressions can also be transformed recursively by applying a mapping function to each node in the tree:

from sqlglot import exp, parse_one

expression_tree = parse_one("SELECT a FROM x")

def transformer(node):
    if isinstance(node, exp.Column) and node.name == "a":
        return parse_one("FUN(a)")
    return node

transformed_tree = expression_tree.transform(transformer)
transformed_tree.sql()
'SELECT FUN(a) FROM x'

SQL Optimizer

SQLGlot can rewrite queries into an "optimized" form. It performs a variety of techniques to create a new canonical AST. This AST can be used to standardize queries or provide the foundations for implementing an actual engine. For example:

import sqlglot
from sqlglot.optimizer import optimize

print(
    optimize(
        sqlglot.parse_one("""
            SELECT A OR (B OR (C AND D))
            FROM x
            WHERE Z = date '2025-08-06' + INTERVAL '1' month OR 1 = 0
        """),
        schema={"x": {"A": "INT", "B": "INT", "C": "INT", "D": "INT", "Z": "STRING"}}
    ).sql(pretty=True)
)
SELECT
  (
    "x"."a" <> 0 OR "x"."b" <> 0 OR "x"."c" <> 0
  )
  AND (
    "x"."a" <> 0 OR "x"."b" <> 0 OR "x"."d" <> 0
  ) AS "_col_0"
FROM "x" AS "x"
WHERE
  CAST("x"."z" AS DATE) = CAST('2025-08-06' AS DATE)

AST Introspection

You can see the AST version of the parsed SQL by calling repr:

from sqlglot import parse_one
print(repr(parse_one("SELECT a + 1 AS z")))
Select(
  expressions=[
    Alias(
      this=Add(
        this=Column(
          this=Identifier(this=a, quoted=False)),
        expression=Literal(this=1, is_string=False)),
      alias=Identifier(this=z, quoted=False))])

AST Diff

SQLGlot can calculate the semantic difference between two expressions and output changes in a form of a sequence of actions needed to transform a source expression into a target one:

from sqlglot import diff, parse_one
diff(parse_one("SELECT a + b, c, d"), parse_one("SELECT c, a - b, d"))
[
  Remove(expression=Add(
    this=Column(
      this=Identifier(this=a, quoted=False)),
    expression=Column(
      this=Identifier(this=b, quoted=False)))),
  Insert(expression=Sub(
    this=Column(
      this=Identifier(this=a, quoted=False)),
    expression=Column(
      this=Identifier(this=b, quoted=False)))),
  Keep(
    source=Column(this=Identifier(this=a, quoted=False)),
    target=Column(this=Identifier(this=a, quoted=False))),
  ...
]

See also: Semantic Diff for SQL.

Custom Dialects

Dialects can be added by subclassing Dialect:

from sqlglot import exp
from sqlglot.dialects.dialect import Dialect
from sqlglot.generator import Generator
from sqlglot.tokens import Tokenizer, TokenType


class Custom(Dialect):
    class Tokenizer(Tokenizer):
        QUOTES = ["'", '"']
        IDENTIFIERS = ["`"]

        KEYWORDS = {
            **Tokenizer.KEYWORDS,
            "INT64": TokenType.BIGINT,
            "FLOAT64": TokenType.DOUBLE,
        }

    class Generator(Generator):
        TRANSFORMS = {exp.Array: lambda self, e: f"[{self.expressions(e)}]"}

        TYPE_MAPPING = {
            exp.DataType.Type.TINYINT: "INT64",
            exp.DataType.Type.SMALLINT: "INT64",
            exp.DataType.Type.INT: "INT64",
            exp.DataType.Type.BIGINT: "INT64",
            exp.DataType.Type.DECIMAL: "NUMERIC",
            exp.DataType.Type.FLOAT: "FLOAT64",
            exp.DataType.Type.DOUBLE: "FLOAT64",
            exp.DataType.Type.BOOLEAN: "BOOL",
            exp.DataType.Type.TEXT: "STRING",
        }

print(Dialect["custom"])
<class '__main__.Custom'>

SQL Execution

SQLGlot is able to interpret SQL queries, where the tables are represented as Python dictionaries. The engine is not supposed to be fast, but it can be useful for unit testing and running SQL natively across Python objects. Additionally, the foundation can be easily integrated with fast compute kernels, such as Arrow and Pandas.

The example below showcases the execution of a query that involves aggregations and joins:

from sqlglot.executor import execute

tables = {
    "sushi": [
        {"id": 1, "price": 1.0},
        {"id": 2, "price": 2.0},
        {"id": 3, "price": 3.0},
    ],
    "order_items": [
        {"sushi_id": 1, "order_id": 1},
        {"sushi_id": 1, "order_id": 1},
        {"sushi_id": 2, "order_id": 1},
        {"sushi_id": 3, "order_id": 2},
    ],
    "orders": [
        {"id": 1, "user_id": 1},
        {"id": 2, "user_id": 2},
    ],
}

execute(
    """
    SELECT
      o.user_id,
      SUM(s.price) AS price
    FROM orders o
    JOIN order_items i
      ON o.id = i.order_id
    JOIN sushi s
      ON i.sushi_id = s.id
    GROUP BY o.user_id
    """,
    tables=tables
)
user_id price
      1   4.0
      2   3.0

See also: Writing a Python SQL engine from scratch.

Used By

Documentation

SQLGlot uses pdoc to serve its API documentation.

A hosted version is on the SQLGlot website, or you can build locally with:

make docs-serve

Run Tests and Lint

make style  # Only linter checks
make unit   # Only unit tests (or unit-rs, to use the Rust tokenizer)
make test   # Unit and integration tests (or test-rs, to use the Rust tokenizer)
make check  # Full test suite & linter checks

Benchmarks

Benchmarks run on Python 3.10.12 in seconds.

Query sqlglot sqlglotrs sqlfluff sqltree sqlparse moz_sql_parser sqloxide
tpch 0.00944 (1.0) 0.00590 (0.625) 0.32116 (33.98) 0.00693 (0.734) 0.02858 (3.025) 0.03337 (3.532) 0.00073 (0.077)
short 0.00065 (1.0) 0.00044 (0.687) 0.03511 (53.82) 0.00049 (0.759) 0.00163 (2.506) 0.00234 (3.601) 0.00005 (0.073)
long 0.00889 (1.0) 0.00572 (0.643) 0.36982 (41.56) 0.00614 (0.690) 0.02530 (2.844) 0.02931 (3.294) 0.00059 (0.066)
crazy 0.02918 (1.0) 0.01991 (0.682) 1.88695 (64.66) 0.02003 (0.686) 7.46894 (255.9) 0.64994 (22.27) 0.00327 (0.112)
make bench            # Run parsing benchmark
make bench-optimize   # Run optimization benchmark

Optional Dependencies

SQLGlot uses dateutil to simplify literal timedelta expressions. The optimizer will not simplify expressions like the following if the module cannot be found:

x + interval '1' month
地贫是什么病 obsidian什么意思 暹什么意思 支那人什么意思 鹞子是什么鸟
胆汁反流什么症状 两个菱形是什么牌子 宝宝贫血有什么危害 zero什么意思 检点是什么意思
做梦梦到已故的亲人是什么意思 1999是什么年 夏的五行属什么 眼角发痒是什么原因 山见念什么
心脏支架是什么病 男人蛋蛋疼是什么原因 上午12点是什么时候 十月30号是什么星座 绿豆和什么相克中毒
载脂蛋白是什么意思hcv9jop0ns1r.cn 印度人为什么用手抓饭吃hcv8jop6ns3r.cn 万人空巷是什么意思hcv8jop1ns4r.cn 有什么放不下hcv7jop6ns0r.cn 跟腱炎吃什么药效果好hcv8jop7ns7r.cn
二尖瓣关闭不全是什么意思hcv7jop6ns9r.cn 什么的绽放hcv7jop5ns5r.cn 什么牌子的冰箱好用又省电hcv8jop3ns5r.cn 石女是什么意思啊hcv8jop8ns7r.cn 镜里观花是什么生肖hcv8jop3ns6r.cn
右肺小结节是什么意思travellingsim.com 桃花开在什么季节hcv8jop0ns8r.cn 梦见修路什么意思hcv8jop2ns9r.cn 沙拉是什么意思hcv8jop5ns1r.cn 什么食物含有维生素bhcv9jop5ns5r.cn
护士是干什么的hcv9jop7ns3r.cn 住院门槛费是什么意思hcv8jop1ns3r.cn 咖啡对心脏有什么影响hcv9jop4ns1r.cn db是什么单位hcv9jop7ns1r.cn 宜五行属什么hcv9jop5ns8r.cn
百度