Tuesday, 9 April 2019

Search for bitstring most unlike a set of bitstrings

I have a set of bitstrings: {'0011', '1100', '1110'} (all bitstrings within a set are of same length).

I want to quickly find the bitstring of same length that has the smallest max-similarity to the set. Max-similarity can be computed as such:

def max_similarity(bitstring, set):
    max = 0
    for item in set:
        temp = 0
        for i in range(len(bitstring)):
            if bitstring[i] == item[i]:
                temp += 1
        if temp > max:
            max = temp
    return max

I know that I can iterate through all the possible bitstrings of that length, compute the max-similarity for each one and finally keep the smallest of these iterations. But this solves the problem in O(2^n). I would like to know if someone sees any quicker alternatives.

I've been playing around with Pythons XOR:

def int2bin(integer, digits):
    if integer >= 0:
        return bin(integer)[2:].zfill(digits)
    else:
        return bin(2**digits + integer)[2:]


def XOR(bitset):  
    intset = [int('{}'.format(bitstring), 2) for bitstring in bitset]

    digits = len(bitset.pop())

    if len(intset) == 1:
        return int2bin(~intset.pop(), digits)        
    else:
        curr = 0    
        while intset:
            curr = curr ^ intset.pop()

        return int2bin(curr, digits)

if __name__ == '__main__':
    bitset1 = {'0011', '1100', '1110'}
    bitset2 = {'01001', '11100', '10111'}
    print(XOR(bitset1))
    print(XOR(bitset2))

>>> python test.py
0001
00010

(Function int2bin stolen from here)

But I've found it works for some input, and not for others. In the test above it finds the correct solution for bitset2, but not for bitset1. Any solutions below O(2^n) available?



from Search for bitstring most unlike a set of bitstrings

No comments:

Post a Comment