Saturday, 18 November 2023

How to update nested object in delta-rs

Given the below code, I'm able to update root-level fields as e.g. "age". But how to update fields inside my_struct?

The attempt with struct("a": "1", "b": "2") gives

DeltaTable error: This feature is not implemented: Unsupported SQL json operator Colon

I also tried named_struct, it gives

Generic DeltaTable error: Error during planning: Invalid function 'named_struct'.

I've been searching a couple of hours for docs/examples, but without luck so far.

import pandas as pd
from deltalake import write_deltalake, DeltaTable

df = pd.DataFrame.from_records([
     {"name": "Alice", "age": 25, "gender": "Female", "my_struct": {"a": 1, "b": 2}},
     {"name": "Bob", "age": 30, "gender": "Male", "my_struct": {"a": 3, "b": 4}}
])

write_deltalake('./db/my_table', df)

dt = DeltaTable('./db/my_table')
dt.update({"age": "42"})  # this is working
dt.update({"my_struct": 'struct("a": "1", "b": "2")'})  # not working
print(dt.to_pandas())

My pyproject.toml

[tool.poetry]
name = "delta-test"

[tool.poetry.dependencies]
python = "^3.10"
deltalake = "^0.12.0"
pandas = "1.5"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"


from How to update nested object in delta-rs

No comments:

Post a Comment