Parsing C++

Using clang's ast-dump

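import json
import subprocess
import sys

def walk(item):
  # Print the kind of this node, then recurse into its children.
  if "kind" in item: print(item["kind"])
  for child in item.get("inner", []):
    walk(child)

# Ask clang to dump the AST of a.cpp as JSON without generating any code.
proc = subprocess.run(["clang++", "-x", "c++", "-std=c++2b", "-Xclang", "-ast-dump=json", "-fsyntax-only", "a.cpp"], capture_output=True)
if 0 == proc.returncode:
  walk(json.loads(proc.stdout))
else:
  # Parsing failed: show clang's diagnostics and exit with an error status.
  print(proc.stderr.decode("utf-8"), file=sys.stderr)
  sys.exit(1)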
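
Beyond printing every kind, the same traversal can pick out specific nodes. The following is a minimal sketch, not part of the original script: `find_nodes` is a hypothetical helper, and `sample_ast` is a hand-written stand-in for `json.loads(proc.stdout)`, trimmed to the fields used here so that it runs without clang installed.

```python
def find_nodes(node, kind):
    """Yield every node of the given kind, depth-first."""
    if isinstance(node, dict):  # skip anything in "inner" that is not a node object
        if node.get("kind") == kind:
            yield node
        for child in node.get("inner", []):
            yield from find_nodes(child, kind)

# Hand-written stand-in for the parsed clang output (an assumption for illustration).
sample_ast = {
    "kind": "TranslationUnitDecl",
    "inner": [
        {
            "kind": "FunctionDecl",
            "name": "main",
            "type": {"qualType": "int ()"},
            "inner": [{"kind": "CompoundStmt", "inner": [{"kind": "ReturnStmt"}]}],
        },
    ],
}

# Collect (name, type) pairs for every function declaration.
functions = [
    (n.get("name"), n.get("type", {}).get("qualType"))
    for n in find_nodes(sample_ast, "FunctionDecl")
]
print(functions)
```

With real clang output, `sample_ast` would be replaced by the parsed JSON. The translation unit also contains implicit declarations, so some filtering by `loc` may be needed in practice.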

Output format

List of kinds

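kind                 Code example     Description
TranslationUnitDecl  -                Translation unit (root node)
TypedefDecl          typedef A B;     typedef declaration
NamespaceDecl        namespace { }    Namespace declaration
FunctionDecl         void f() { }     Function declaration
CompoundStmt         { }              Block (compound statement)
ReturnStmt           return;          return statement
IntegerLiteral       0                Integer literal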

More entries will be added later.
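
One way to find kinds that are not yet in the table is to tally every `kind` that appears in a dump. This is a rough sketch under assumptions: `count_kinds` is a hypothetical helper, and `sample_ast` is a hand-written, trimmed stand-in for the JSON that clang produces.

```python
from collections import Counter

def count_kinds(node, counts=None):
    """Return a Counter of how often each kind occurs in the tree."""
    counts = Counter() if counts is None else counts
    if isinstance(node, dict):
        if "kind" in node:
            counts[node["kind"]] += 1
        for child in node.get("inner", []):
            count_kinds(child, counts)
    return counts

# Hand-written stand-in for the parsed clang output (an assumption for illustration).
sample_ast = {
    "kind": "TranslationUnitDecl",
    "inner": [
        {"kind": "TypedefDecl"},
        {"kind": "FunctionDecl", "inner": [
            {"kind": "CompoundStmt", "inner": [
                {"kind": "ReturnStmt", "inner": [{"kind": "IntegerLiteral"}]},
            ]},
        ]},
        {"kind": "FunctionDecl"},
    ],
}

counts = count_kinds(sample_ast)
for kind, n in counts.most_common():
    print(f"{n:4d} {kind}")
```

Running this over the dump of a larger source file gives a quick view of which kinds occur most often and are worth documenting first.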

Output example

int main() { return 0; }
{
  "id": "0x121829c08",
  "kind": "TranslationUnitDecl",
  "loc": {},
  "range": {
    "begin": {},
    "end": {}
  },
  "inner": [
    "...omitted...",
    {
      "id": "0x12102e440",
      "kind": "FunctionDecl",
      "loc": {
        "offset": 4,
        "file": "a.cc",
        "line": 1,
        "col": 5,
        "tokLen": 4
      },
      "range": {
        "begin": {
          "offset": 0,
          "col": 1,
          "tokLen": 3
        },
        "end": {
          "offset": 23,
          "col": 24,
          "tokLen": 1
        }
      },
      "name": "main",
      "mangledName": "_main",
      "type": {
        "qualType": "int ()"
      },
      "inner": [
        {
          "id": "0x12102e588",
          "kind": "CompoundStmt",
          "range": {
            "begin": {
              "offset": 11,
              "col": 12,
              "tokLen": 1
            },
            "end": {
              "offset": 23,
              "col": 24,
              "tokLen": 1
            }
          },
          "inner": [
            {
              "id": "0x12102e578",
              "kind": "ReturnStmt",
              "range": {
                "begin": {
                  "offset": 13,
                  "col": 14,
                  "tokLen": 6
                },
                "end": {
                  "offset": 20,
                  "col": 21,
                  "tokLen": 1
                }
              },
              "inner": [
                {
                  "id": "0x12102e558",
                  "kind": "IntegerLiteral",
                  "range": {
                    "begin": {
                      "offset": 20,
                      "col": 21,
                      "tokLen": 1
                    },
                    "end": {
                      "offset": 20,
                      "col": 21,
                      "tokLen": 1
                    }
                  },
                  "type": {
                    "qualType": "int"
                  },
                  "valueCategory": "prvalue",
                  "value": "0"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
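
In the output above, only the first location (`loc` of the `FunctionDecl`) carries `file` and `line`; the later ones list just `offset`, `col`, and `tokLen`. Assuming that an omitted field means "same as the most recently emitted location", the missing values can be filled in by walking the JSON in key order. This is a sketch under that assumption: `fill_locations` is a hypothetical helper, and `node` is copied from a trimmed part of the sample.

```python
def fill_locations(value, last=None):
    """Copy the last seen file/line into location objects that omit them."""
    last = {} if last is None else last
    if isinstance(value, dict):
        if "offset" in value:  # treat any object with an offset as a source location
            for key in ("file", "line"):
                if key in value:
                    last[key] = value[key]   # remember the newest explicit value
                elif key in last:
                    value[key] = last[key]   # inherit from the previous location
        for child in value.values():         # dicts keep the order of the JSON text
            fill_locations(child, last)
    elif isinstance(value, list):
        for child in value:
            fill_locations(child, last)
    return value

# Trimmed copy of the FunctionDecl node from the sample output.
node = {
    "kind": "FunctionDecl",
    "loc": {"offset": 4, "file": "a.cc", "line": 1, "col": 5, "tokLen": 4},
    "range": {
        "begin": {"offset": 0, "col": 1, "tokLen": 3},
        "end": {"offset": 23, "col": 24, "tokLen": 1},
    },
}

fill_locations(node)
print(node["range"]["begin"])
```

The function modifies the tree in place, so it should be applied once to the whole parsed document before any node is inspected on its own.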

Using libclang