Thursday, 12 August 2021

Factory function for dataclass fields

I'm making a library where I want to take advantage of metadata on the fields of a dataclass.

To get my desired results, I can write the dataclass like the following:

from dataclasses import dataclass, field


@dataclass
class Foo:
    a: int = field(
        metadata={'my_metadata': {'my_required_key': "c"}}
    )
    b: dict[str, str] = field(
        metadata={'my_metadata': {'my_required_key': "d"}}, default_factory=dict
    )
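For context, here is a minimal sketch of how the library side might read that metadata back using dataclasses.fields. The helper name required_keys is hypothetical and only there for illustration; the metadata layout is the one shown above.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Foo:
    a: int = field(
        metadata={'my_metadata': {'my_required_key': "c"}}
    )
    b: dict[str, str] = field(
        metadata={'my_metadata': {'my_required_key': "d"}}, default_factory=dict
    )


def required_keys(cls) -> dict[str, str]:
    """Map each field name to its 'my_required_key' metadata value.

    Hypothetical helper: fields without 'my_metadata' are skipped.
    """
    return {
        f.name: f.metadata['my_metadata']['my_required_key']
        for f in fields(cls)
        if 'my_metadata' in f.metadata
    }


print(required_keys(Foo))  # {'a': 'c', 'b': 'd'}
```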

Writing out field(metadata=...) like this seems like a lot of boilerplate, especially if I want to make many classes with many fields. I was thinking I could write a factory function that wraps dataclasses.field to reduce the repetition.

However, I can't seem to get the type parameters right for calling into dataclasses.field, and the proper return type is a mystery to me. What I have so far:

from dataclasses import dataclass, field, MISSING, _MISSING_TYPE
from typing import TypeVar, Union, Callable

_T = TypeVar("_T")


def myfield(
    my_required_key: str,
    *,
    default: Union[_MISSING_TYPE, _T] = MISSING,
    default_factory: Union[_MISSING_TYPE, Callable[[], _T]] = MISSING
) -> _T:
    return field(  # type: ignore
        metadata={'my_metadata': {'my_required_key': my_required_key}},
        default=default,
        default_factory=default_factory,
    )


@dataclass
class Foo:
    a: int = myfield("c")
    b: dict[str, str] = myfield("d", default_factory=dict)

This code will pass mypy validation, but PyCharm doesn't seem to like it, reporting that:

Mutable default 'myfield("d", default_factory=dict)' is not allowed. Use 'default_factory'

I'm okay with ignoring the PyCharm error, as the class does appear to function correctly, and I run mypy in my CI/CD pipeline, which seems to be cool with it.

As for the return type, I currently have myfield(...) -> _T. I feel like the signature should look more like myfield(...) -> Field[_T], but mypy rejects that idea, and reports:

error: Incompatible types in assignment (expression has type "Field[<nothing>]", variable has type "int")
error: Incompatible types in assignment (expression has type "Field[Dict[_KT, _VT]]", variable has type "Dict[str, str]")

I'm also not sure about how to type the default and default_factory parameters. Without the # type: ignore I will get:

error: No overload variant of "field" matches argument types "Dict[str, Dict[str, str]]", "Union[_MISSING_TYPE, _T]", "Union[_MISSING_TYPE, Callable[[], _T]]"
note: Possible overload variants:
note:     def [_T] field(*, default: _T, init: bool = ..., repr: bool = ..., hash: Optional[bool] = ..., compare: bool = ..., metadata: Optional[Mapping[str, Any]] = ...) -> _T
note:     def [_T] field(*, default_factory: Callable[[], _T], init: bool = ..., repr: bool = ..., hash: Optional[bool] = ..., compare: bool = ..., metadata: Optional[Mapping[str, Any]] = ...) -> _T
note:     def field(*, init: bool = ..., repr: bool = ..., hash: Optional[bool] = ..., compare: bool = ..., metadata: Optional[Mapping[str, Any]] = ...) -> Any
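One direction I am considering, sketched under the assumption that mirroring the three overloads listed in the error message is the right approach: give myfield the same overloads, and have the implementation branch so that each call into field matches exactly one of them. I have not checked this against every mypy version. As far as I can tell, mypy's dataclass handling only special-cases dataclasses.field itself, so a wrapper like this may still be treated as a plain default value when the generated __init__ is checked.

```python
from dataclasses import MISSING, dataclass, field, fields
from typing import Any, Callable, TypeVar, overload

_T = TypeVar("_T")


# Overloads mirroring the three variants of dataclasses.field shown above.
@overload
def myfield(my_required_key: str, *, default: _T) -> _T: ...
@overload
def myfield(my_required_key: str, *, default_factory: Callable[[], _T]) -> _T: ...
@overload
def myfield(my_required_key: str) -> Any: ...


def myfield(
    my_required_key: str,
    *,
    default: Any = MISSING,
    default_factory: Any = MISSING,
) -> Any:
    metadata = {'my_metadata': {'my_required_key': my_required_key}}
    if default is not MISSING and default_factory is not MISSING:
        # Same restriction that dataclasses.field enforces.
        raise ValueError('cannot specify both default and default_factory')
    # Each branch calls field() with the arguments of exactly one overload.
    if default is not MISSING:
        return field(default=default, metadata=metadata)
    if default_factory is not MISSING:
        return field(default_factory=default_factory, metadata=metadata)
    return field(metadata=metadata)


@dataclass
class Foo:
    a: int = myfield("c")
    b: dict[str, str] = myfield("d", default_factory=dict)
```

At runtime this behaves like the version with the type: ignore comment: a is still a required argument, and b gets a fresh dict per instance.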

I see other libraries have reduced the boilerplate by having the factory function return only the metadata dictionary, e.g.

@dataclass
class Fizz:
    a: int = field(metadata=myfield("c"))
    b: dict[str, str] = field(metadata=myfield("d"), default_factory=dict)

This still feels a bit ugly to me, but maybe this is the way to go.
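For completeness, a minimal sketch of what such a metadata-only helper could look like. In this variant myfield returns just the dictionary and no longer wraps field, so every declaration is still a direct call to dataclasses.field; the return annotation is my own guess at a reasonable type.

```python
from dataclasses import dataclass, field, fields


def myfield(my_required_key: str) -> dict[str, dict[str, str]]:
    """Build only the metadata mapping; the caller passes it to field()."""
    return {'my_metadata': {'my_required_key': my_required_key}}


@dataclass
class Fizz:
    a: int = field(metadata=myfield("c"))
    b: dict[str, str] = field(metadata=myfield("d"), default_factory=dict)


# The metadata ends up on the Field objects exactly as in the hand-written form.
for f in fields(Fizz):
    print(f.name, f.metadata['my_metadata'])
```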

Any help or ideas for cleaning this up would be appreciated!


