It seems to me that Pandas ExtensionArrays are a case where a simple starter example would really help. However, I have not found a simple enough example anywhere.
Creating an ExtensionArray
To create an ExtensionArray, you need to
- Create an ExtensionDtype and register it
- Create an ExtensionArray by implementing the required methods.
There is also a section in the Pandas documentation with a brief overview.
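For reference, the first step (creating and registering a dtype) might look roughly like the sketch below. The class name MyIntDtype and the alias "my_int" are made up for illustration, and the array class is deliberately left unimplemented, so this only shows how registration makes the string alias resolvable.

```python
import pandas as pd
from pandas.api.extensions import ExtensionDtype, register_extension_dtype


@register_extension_dtype
class MyIntDtype(ExtensionDtype):
    """Hypothetical dtype, used here only to illustrate registration."""

    name = "my_int"    # string alias that pandas resolves via the registry
    type = int         # scalar type of the individual elements
    na_value = pd.NA   # value used to represent missing entries

    @classmethod
    def construct_array_type(cls):
        # Should return the ExtensionArray subclass paired with this dtype;
        # left unimplemented in this dtype-only sketch.
        raise NotImplementedError("pair this dtype with an array class")


# Registration lets pandas turn the string alias into a dtype instance.
resolved = pd.api.types.pandas_dtype("my_int")
print(type(resolved).__name__)
```

The decorator only adds the class to the pandas dtype registry; nothing useful can be stored until construct_array_type returns a real array class.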
Example implementations
There are many examples of implementations:
- Pandas' own internal extension arrays
- Geopandas' GeometryArray
- Pandas documentation has a list of projects with extension data types
  - e.g. CyberPandas' IPArray
- Many others around the web, for example Fletcher's StringSupportingExtensionArray
Question
Despite having studied all of the above, I still find extension arrays difficult to understand. All of the examples have a lot of specifics and custom functionality that makes it difficult to work out what is actually necessary. I suspect many have faced a similar problem.
I am thus asking for a simple and minimal example of a working ExtensionArray.
To have a concrete example, let's say I want to extend ExtensionArray to obtain an integer array that is able to hold NA values. That is essentially IntegerArray, but stripped of any actual functionality beyond the basics of ExtensionArray.
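To make the request concrete, here is a rough sketch of the shape I have in mind, assuming a values-plus-mask layout (which may or may not match how IntegerArray works internally). The names NullableIntDtype, NullableIntArray, and the alias "nullable_int_sketch" are invented for illustration. Comparison (`__eq__`), `__setitem__`, and arithmetic are left out, so this is not a complete implementation of the interface.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import (
    ExtensionArray,
    ExtensionDtype,
    register_extension_dtype,
    take,
)


@register_extension_dtype
class NullableIntDtype(ExtensionDtype):
    name = "nullable_int_sketch"  # hypothetical alias
    type = int
    na_value = pd.NA

    @classmethod
    def construct_array_type(cls):
        return NullableIntArray


class NullableIntArray(ExtensionArray):
    def __init__(self, data, mask):
        self._data = np.asarray(data, dtype="int64")  # the integer values
        self._mask = np.asarray(mask, dtype=bool)     # True where missing

    @classmethod
    def _from_sequence(cls, scalars, *, dtype=None, copy=False):
        values = list(scalars)
        mask = np.array([bool(pd.isna(v)) for v in values], dtype=bool)
        # Missing slots get a placeholder 0 that the mask hides.
        data = np.array(
            [0 if m else int(v) for v, m in zip(values, mask)], dtype="int64"
        )
        return cls(data, mask)

    @classmethod
    def _from_factorized(cls, values, original):
        return cls._from_sequence(values)

    @classmethod
    def _concat_same_type(cls, to_concat):
        return cls(
            np.concatenate([a._data for a in to_concat]),
            np.concatenate([a._mask for a in to_concat]),
        )

    @property
    def dtype(self):
        return NullableIntDtype()

    @property
    def nbytes(self):
        return self._data.nbytes + self._mask.nbytes

    def __len__(self):
        return len(self._data)

    def __getitem__(self, item):
        if pd.api.types.is_integer(item):
            # Scalar access returns a Python int or pd.NA.
            return pd.NA if self._mask[item] else int(self._data[item])
        # Slices, boolean masks, and integer lists return a new array.
        item = pd.api.indexers.check_array_indexer(self, item)
        return type(self)(self._data[item], self._mask[item])

    def isna(self):
        return self._mask.copy()

    def take(self, indices, allow_fill=False, fill_value=None):
        # With allow_fill=True, -1 in indices marks a slot to fill.
        fill_is_na = fill_value is None or bool(pd.isna(fill_value))
        data_fill = 0 if fill_is_na else int(fill_value)
        data = take(self._data, indices, allow_fill=allow_fill, fill_value=data_fill)
        mask = take(self._mask, indices, allow_fill=allow_fill, fill_value=fill_is_na)
        return type(self)(data, mask)

    def copy(self):
        return type(self)(self._data.copy(), self._mask.copy())


arr = NullableIntArray._from_sequence([1, None, 3])
ser = pd.Series(arr)
print(ser.dtype, ser.isna().tolist())
```

What I cannot tell from the existing implementations is whether something this small is really enough, or which further methods pandas expects before such an array behaves well inside a Series or DataFrame.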
from Simple example of Pandas ExtensionArray