Monday 20 November 2023

Extracting feature embeddings from an image

I'm trying to use TensorFlow.js to extract feature embeddings from images.

Elsewhere I'm using PyTorch and ResNet152 to extract feature embeddings to good effect.

The following is a sample of how I'm extracting those feature embeddings.

import torch
import torchvision.models as models
from torchvision import transforms
from PIL import Image

# Load the model
resnet152_torch = models.resnet152(pretrained=True)

# Take all of the model's child modules except the last one (the fully connected
# layer), so that the average pooling layer becomes the final layer.
layers = list(resnet152_torch.children())[:-1]

resnet152 = torch.nn.Sequential(*layers)

# Set the truncated model to evaluation mode.
resnet152.eval()

# Load the image (it's already 224x224, so no resizing is needed)
image_path = "test.png" 
img = Image.open(image_path).convert("RGB")

# Define the image transformation
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Apply the preprocessing steps to the image
img_tensor = preprocess(img).unsqueeze(0)

with torch.no_grad():
    # Get the image features from the ResNet-152 model
    img_features = resnet152(img_tensor)

print(img_features.squeeze())

Essentially, I'm using the pre-trained model and dropping the last layer to get my feature embeddings.

The result of the above script is:

tensor([0.2098, 0.4687, 0.0914,  ..., 0.0309, 0.0919, 0.0480])

So now, I want to do something similar with TensorFlow.js.

The first thing that I need is an instance of the ResNet152 model that I can use with TensorFlow.js. So I created the following Python script to export ResNet152 to the Keras format...

from tensorflow.keras.applications import ResNet152
from tensorflow.keras.models import save_model

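# Load the pre-trained ResNet-152 model. include_top defaults to True, so the
# top (fully connected) layer is still part of the saved model.
resnet152 = ResNet152(weights='imagenet')

# Freeze the weights so they are not updated (this is not an evaluation-mode switch)
resnet152.trainable = False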

# Save the ResNet-152 model
save_model(resnet152, "resnet152.h5")

And then I exported the Keras (.h5) model to the TensorFlow.js format using the "tensorflowjs_converter" utility...

tensorflowjs_converter --input_format keras resnet152.h5 resnet152   

Once I have the model in the appropriate format (I think), I switch over to JavaScript.

import * as tf from '@tensorflow/tfjs-node';
import fs from 'fs';

async function main() {
    const model = await tf.loadLayersModel('file://resnet152/model.json');

    const modelWithoutFinalLayer = tf.model({
        inputs: model.input,
        outputs: model.getLayer('avg_pool').output
    });

    // Load the image from disk
    const image = fs.readFileSync('example_images/test.png'); // This is the exact same image file.
    const imageTensor = tf.node.decodeImage(image, 3);
    const preprocessedInput = tf.div(tf.sub(imageTensor, [123.68, 116.779, 103.939]), [58.393, 57.12, 57.375]);

    const batchedInput = preprocessedInput.expandDims(0);
    const embeddings = modelWithoutFinalLayer.predict(batchedInput).squeeze();

    embeddings.print();

    return;
}

await main();

The result of the above script is:

Tensor
    [0, 0, 0, ..., 0, 0, 0.029606]

Comparing the first three values of the two outputs, I expected some variation but not THIS MUCH: PyTorch gives 0.2098, 0.4687 and 0.0914, while TensorFlow.js gives 0 for all three.
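Eyeballing a few printed values is a weak comparison, so it may help to score whole vectors instead. The following is a minimal sketch, assuming both embeddings have been copied out as plain Python lists. The `cosine_similarity` and `zero_fraction` helpers are hypothetical names introduced here, and the sample lists hold only the six values each script printed above, not the full embeddings.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # An all-zero vector has no direction, so the similarity is undefined.
        return float("nan")
    return dot / (norm_a * norm_b)

def zero_fraction(v):
    """Return the fraction of entries that are exactly zero."""
    return sum(1 for x in v if x == 0) / len(v)

# Only the six values printed by each script above; the real vectors are longer.
torch_visible = [0.2098, 0.4687, 0.0914, 0.0309, 0.0919, 0.0480]
tfjs_visible = [0.0, 0.0, 0.0, 0.0, 0.0, 0.029606]

print(cosine_similarity(torch_visible, tfjs_visible))
print(zero_fraction(tfjs_visible))
```

Run on the full vectors, a similarity near 1 would mean the two embeddings point in roughly the same direction, and a large share of exact zeros in one output would be a separate thing to look into.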

Where do I go from here? Is this much variation expected? Am I just doing this wrong?
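One piece that can be checked in isolation is the preprocessing arithmetic. The sketch below uses only the constants from the two scripts above and compares the PyTorch normalization (pixels scaled to [0, 1] first) with the JavaScript normalization (raw 0-255 pixels). The function names are hypothetical, and single pixel values stand in for a real image.

```python
# Constants copied from the PyTorch script (applied to pixels scaled to [0, 1]).
TORCH_MEAN = [0.485, 0.456, 0.406]
TORCH_STD = [0.229, 0.224, 0.225]

# Constants copied from the JavaScript script (applied to raw 0-255 pixels).
JS_MEAN = [123.68, 116.779, 103.939]
JS_STD = [58.393, 57.12, 57.375]

def torch_style(pixel, channel):
    """Scale a 0-255 value to [0, 1] as ToTensor does, then normalize."""
    return (pixel / 255.0 - TORCH_MEAN[channel]) / TORCH_STD[channel]

def js_style(pixel, channel):
    """Normalize a raw 0-255 value with the 0-255 constants."""
    return (pixel - JS_MEAN[channel]) / JS_STD[channel]

def max_abs_difference():
    """Largest gap between the two schemes over every pixel value and channel."""
    return max(
        abs(torch_style(p, c) - js_style(p, c))
        for p in range(256)
        for c in range(3)
    )

print(max_abs_difference())
```

The two schemes agree to within 0.01 for every pixel value, so the normalization arithmetic on its own seems unlikely to explain the gap. This does not show that the Keras ImageNet weights expect this normalization, though. Keras ships its own `preprocess_input` function alongside its ResNet models, and comparing against it would be a reasonable next step. The pre-trained weights in the two frameworks may also not be the same weights, in which case identical embeddings should not be expected even with identical inputs.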

Any help would be greatly appreciated.


