I have a column named diff
in my df, where the value is like a json string of the format:
{'info': {'version': {'from': '2.0.0', 'to': '2.3.4'}}, 'paths': {'modified': {'/dummy': {'operations': {'added': ['PUT']}}}}, 'endpoints': {'added': [{'method': 'PUT', 'path': '/dummy'}]}, 'components': {'schemas': {'added': ['ObjectOfObjects', 'inline_object', 'ObjectOfObjects_inner']}, 'requestBodies': {'added': ['inline_object', 'nested_response']}}}
Over here info
, paths
, endpoints
and components
represent the first set of nested elements. Like the first category, then we have the next category: such as info
has different fields like: title
, description
etc, components
has fields like: schemas
and so on.
The df column looks something like this:
I want to flatten the json,meaning divided all the parameters, so that means I get around 5-6 new columns( as these are all the parameters changing,first set of elements). I don't want to keep the changes like you can see in the pic from
:... to:..
, I only want the field
, sub-field
and the sub-sub-field
changed
so I get output as something as follows:
info paths endpoints components
version modified added schemas:added
requestBodies:added
I looked into json_normalize
, flatten
and jsonpath
, but somehow neither of these work for the use case. It yields a completely different output to the one I want. It would be really great if someone could help me with this! I seem to be a bit stuck.
from Flatten and extract keywords from json field in csv
No comments:
Post a Comment