Sunday, 18 April 2021

How to encode binary data as any arbitrary text representation?

I need a pair of functions for encoding binary data as an arbitrary text representation and decoding it back.

Say we have an ArrayBuffer of any size:

const buffer = new ArrayBuffer(1000)

Then we define a hexadecimal "lingo", and use it for encoding and decoding hex strings:

const lingo = "0123456789abcdef"

const text = encode(buffer, lingo)
const data = decode(text, lingo)

My goal is to define my own base48 "lingo", which omits vowels to avoid naughty words:

const lingo = "256789bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"

const text = encode(buffer, lingo)
const data = decode(text, lingo)

How can we approach writing algorithms that efficiently transform data between arbitrary representations? This strikes me as something quite fundamental, yet I'm having a hard time finding resources to help me with the task.
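One possible starting point, offered as a minimal sketch rather than a definitive answer: treat the whole buffer as a single large base-256 number and rewrite it in base lingo.length using BigInt. The function names encode and decode come from the question; everything else is an assumption. In particular, the sketch assumes every character in the lingo is unique and fits in one UTF-16 code unit, and it prepends a sentinel value of 1 so that leading zero bytes survive the round trip. Because of that sentinel, the hex output is not the conventional hex dump of the bytes.

```javascript
// Sketch only: the buffer is read as one big base-256 integer and
// re-expressed with the characters of `lingo` as digits.
function encode(buffer, lingo) {
  const base = BigInt(lingo.length)
  // Assumption: start from a sentinel 1 so leading zero bytes are not lost.
  let value = 1n
  for (const byte of new Uint8Array(buffer)) {
    value = value * 256n + BigInt(byte)
  }
  // Peel off digits from least significant to most significant.
  const digits = []
  while (value > 0n) {
    digits.push(lingo[Number(value % base)])
    value /= base
  }
  return digits.reverse().join("")
}

function decode(text, lingo) {
  const base = BigInt(lingo.length)
  // Rebuild the big integer from the text digits.
  let value = 0n
  for (const char of text) {
    const digit = lingo.indexOf(char)
    if (digit < 0) throw new Error(`character not in lingo: ${char}`)
    value = value * base + BigInt(digit)
  }
  // Peel off bytes until only the sentinel 1 remains.
  const bytes = []
  while (value > 1n) {
    bytes.push(Number(value % 256n))
    value /= 256n
  }
  return new Uint8Array(bytes.reverse()).buffer
}
```

This works for any alphabet size, including 48, but the repeated big-number arithmetic grows roughly quadratically with the buffer length. When the alphabet size is a power of two (16, 32, 64), slicing the input into fixed-width bit groups is the usual faster route.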

Bonus points if you can think of any plausible naughty words without any vowels. I even took out the numbers that look like vowels!

I'm working in JavaScript, but I'd also like to understand the principles in general. Thanks!



from How to encode binary data as any arbitrary text representation?
