Thursday, 7 April 2022

How to remove a watermark from a document image?

I have the following images

and another variant of it with the exact same logo

In both, I'm trying to get rid of the logo itself while preserving the underlying text. I'm using the following code segment:

import cv2
import numpy as np
import skimage.filters as filters

image = cv2.imread('ingrained.jpeg')

# Flatten the background by dividing the grayscale image by a blurred copy of itself
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
smooth1 = cv2.GaussianBlur(gray, (5, 5), 0)
division1 = cv2.divide(gray, smooth1, scale=255)

# Sharpen; unsharp_mask returns a float image here, so rescale back to uint8
sharpened = filters.unsharp_mask(division1, radius=3, amount=7, preserve_range=False)
sharpened = (255 * sharpened).clip(0, 255).astype(np.uint8)

# Keep only connected components of at least `size` pixels
components, output, stats, centroids = cv2.connectedComponentsWithStats(sharpened, connectivity=8)
sizes = stats[1:, -1]  # component areas, skipping the background label 0
components = components - 1
size = 100
result = np.zeros(output.shape, dtype=np.uint8)  # uint8 so it can be written as JPEG
for i in range(0, components):
    if sizes[i] >= size:
        result[output == i + 1] = 255

cv2.imwrite('image-after.jpeg', result)

I've got these results

But as shown, the results are inconsistent between the two images: remnants of the watermark contours are still visible, and the letters are washed out. Is there a better approach? Ideally, the watermark borders would be removed without affecting the text lying beneath them.


