Sunday, 12 March 2023

Explicit line ranges when parsing YAML in Python

I have a YAML file with the following contents (a single block is shown as an example):

.deploy_container:
  tags:
    - gcp
  image: google/cloud-sdk
  services:
    - docker:dind
  variables:
    PORT: "$APP_PORT"
  script:
    - cp $GCP_SERVICE_KEY gcloud-service-key.json # Google Cloud

I have a wrapper around the default loader:

from typing import Any, Hashable

from yaml import SafeLoader
from yaml.nodes import MappingNode


class SafeLineLoader(SafeLoader):
    def construct_mapping(self, node: MappingNode, deep: bool = False) -> dict[Hashable, Any]:
        mapping = super().construct_mapping(node, deep=deep)

        # Marks are 0-based, so add 1 to get human-readable line numbers.
        mapping['__startline__'] = node.start_mark.line + 1
        mapping['__endline__'] = node.end_mark.line + 1
        return mapping

Given those file contents and that loader, the parsed object I get is:

".deploy_container": {
  "tags": [
    "gcp"
  ],
  "image": "google/cloud-sdk",
  "services": [
    "docker:dind"
  ],
  "variables": {
    "PORT": "$APP_PORT",
    "__startline__": 137,
    "__endline__": 138
  },
  "script": [
    "cp $GCP_SERVICE_KEY gcloud-service-key.json"
  ],
  "__startline__": 131,
  "__endline__": 144
}

What I am struggling with is that string and list values carry no start-line or end-line information. Is there a way to add that metadata to these objects too, even if it means essentially converting them to dicts?

For example, instead of

"services": [
      "docker:dind"
    ]

I would like to get

"services": {
  1: [
      "docker:dind"
    ],
  "__startline__": ...,
  "__endline__": ...
}
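
One possible direction, sketched here under assumptions rather than offered as a definitive answer: PyYAML lets a loader subclass register its own constructors for the standard seq and str tags via add_constructor. In the sketch below, LineRangeLoader, LineStr, the two constructor functions, and the "__value__" key are all hypothetical names invented for illustration. Lists are wrapped in a dict, as in the desired output above, while strings become a str subclass with extra attributes, because mapping keys are strings too and a dict could not be used as a key.

```python
import yaml
from yaml import SafeLoader
from yaml.nodes import ScalarNode, SequenceNode


class LineStr(str):
    """A str that also remembers which lines it was parsed from."""
    start_line: int
    end_line: int


class LineRangeLoader(SafeLoader):
    """Hypothetical SafeLoader subclass; constructors are registered below."""


def construct_seq_with_lines(loader: LineRangeLoader, node: SequenceNode) -> dict:
    # Wrap the list in a dict so the line metadata has somewhere to live.
    return {
        "__value__": loader.construct_sequence(node, deep=True),
        "__startline__": node.start_mark.line + 1,  # marks are 0-based
        "__endline__": node.end_mark.line + 1,
    }


def construct_str_with_lines(loader: LineRangeLoader, node: ScalarNode) -> LineStr:
    # A str subclass stays hashable, so mapping keys keep working.
    value = LineStr(loader.construct_scalar(node))
    value.start_line = node.start_mark.line + 1
    value.end_line = node.end_mark.line + 1
    return value


# add_constructor copies the constructor table onto the subclass,
# so the stock SafeLoader is left untouched.
LineRangeLoader.add_constructor("tag:yaml.org,2002:seq", construct_seq_with_lines)
LineRangeLoader.add_constructor("tag:yaml.org,2002:str", construct_str_with_lines)

DOC = """\
.deploy_container:
  tags:
    - gcp
  image: google/cloud-sdk
  services:
    - docker:dind
"""

data = yaml.load(DOC, Loader=LineRangeLoader)
block = data[".deploy_container"]
services = block["services"]
print(services["__value__"], services["__startline__"], services["__endline__"])
print(block["image"], block["image"].start_line, block["image"].end_line)
```

Two caveats on this sketch. Only values resolved as strings get line attributes; integers, booleans and other scalar types would need their own constructors. Also, for block-style lists and mappings the end mark appears to sit where the next token begins rather than on the last line of the collection, which would explain why "variables" above reports lines 137 to 138 for a single entry, so the end line may need adjusting before it is shown to a user.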

