Hemant Vishwakarma: how to convert a bytearray in one row of a pyspark dataframe to a column of bytes?

Monday 25 October 2021

My data currently looks something like this:

import pandas as pd
df = pd.DataFrame({'content': [bytearray(b'\x01%\xeb\x8cH\x89')]})
spark.createDataFrame(df).show()  # spark is an existing SparkSession

+-------------------+
|            content|
+-------------------+
|[01 25 EB 8C 48 89]|
+-------------------+

How do I get a column that has a row for each value in the array?

+-------+
|content|
+-------+
|      1|
|     37|
|    235|
|    140|
|     72|
|    137|
+-------+
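To make the relationship between the two tables explicit, here is a small plain-Python sketch (no Spark involved). The variable names `hex_view` and `values` are only illustrative. It shows that the hex pairs printed by show() and the desired decimal rows are the same six bytes.

```python
content = bytearray(b'\x01%\xeb\x8cH\x89')

# show() renders a binary value as space-separated hex pairs.
hex_view = ' '.join(f'{b:02X}' for b in content)

# Iterating over a bytearray yields one integer (0-255) per byte,
# which is exactly the list of values wanted as rows.
values = list(content)

print(hex_view)  # 01 25 EB 8C 48 89
print(values)    # [1, 37, 235, 140, 72, 137]
```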

I've tried explode, but it does not work on a bytearray.

Edit: for additional context, the df is the result of reading in a binary file with spark.read.format('binaryfile').load(...).
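One possible direction, offered as a sketch rather than a confirmed answer: explode expects an array or map column, so the binary value could first be turned into an array of integers with a Python UDF and then exploded. The helper names `byte_values` and `explode_bytes` are hypothetical, and the column name `content` is taken from the example above.

```python
def byte_values(blob):
    """Return the bytes in `blob` as a list of ints (0-255), or None for a null."""
    # Iterating over bytes/bytearray yields integers in Python 3.
    return list(blob) if blob is not None else None


def explode_bytes(df, column="content"):
    """Hypothetical helper: produce one output row per byte of a binary column."""
    # Imported here so byte_values can be tried without Spark installed.
    from pyspark.sql import functions as F
    from pyspark.sql.types import ArrayType, IntegerType

    # Wrap the plain function as a UDF returning array<int>,
    # which explode can then expand into rows.
    to_ints = F.udf(byte_values, ArrayType(IntegerType()))
    return df.select(F.explode(to_ints(F.col(column))).alias(column))
```

With the DataFrame from the example, `explode_bytes(spark.createDataFrame(df)).show()` should give one row per byte. A Python UDF serializes every value between the JVM and Python, and one row per byte grows quickly, so this may be slow for large files.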
