I'm trying to use TensorFlow.js to extract feature embeddings from images.
Elsewhere I'm using PyTorch and ResNet152 to extract feature embeddings to good effect.
The following is a sample of how I'm extracting those feature embeddings.
import torch
import torchvision.models as models
from torchvision import transforms
from PIL import Image
# Load the model
resnet152_torch = models.resnet152(pretrained=True)
# Keep every layer except the last one (the fc classifier). This leaves
# the global average-pooling layer as the final layer.
resnet152 = torch.nn.Sequential(*list(resnet152_torch.children())[:-1])
# Set to evaluation mode.
resnet152.eval()
# Load and preprocess the image; it's already 224x224.
image_path = "test.png"
img = Image.open(image_path).convert("RGB")
# Define the image transformation
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
# Apply the preprocessing steps to the image
img_tensor = preprocess(img).unsqueeze(0)
with torch.no_grad():
    # Get the image features from the ResNet-152 model.
    img_features = resnet152(img_tensor)
print(img_features.squeeze())
Essentially, I'm using the pre-trained model and dropping the last layer to get my feature embeddings.
The result of the above script is:
tensor([0.2098, 0.4687, 0.0914, ..., 0.0309, 0.0919, 0.0480])
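For context, the layer being dropped is the final fc classifier, so the truncated network ends at the global average-pooling layer and emits a 2048-dim feature vector. Here's a quick self-contained sanity check of that, using a dummy input instead of a real image:

import torch
import torchvision.models as models

m = models.resnet152(pretrained=True)
# The last child is the classification head that gets dropped.
print(list(m.children())[-1])  # Linear(in_features=2048, out_features=1000, bias=True)
trunk = torch.nn.Sequential(*list(m.children())[:-1])
# After average pooling, the features are (1, 2048, 1, 1); squeeze() flattens them to 2048 values.
print(trunk(torch.zeros(1, 3, 224, 224)).shape)  # torch.Size([1, 2048, 1, 1])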
So now, I want to do something similar with TensorFlow.js.
The first thing that I need is an instance of the ResNet152 model that I can use with TensorFlow.js. So I created the following Python script to export ResNet152 to the Keras format...
from tensorflow.keras.applications import ResNet152
from tensorflow.keras.models import save_model
# Load the pre-trained ResNet-152 model, including the top classification
# layer (it gets stripped later, on the TensorFlow.js side).
resnet152 = ResNet152(weights='imagenet')
# Freeze the weights. (Keras runs layers such as batch norm in inference
# mode during predict() regardless.)
resnet152.trainable = False
# Save the ResNet-152 model
save_model(resnet152, "resnet152.h5")
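One way to separate converter problems from preprocessing problems would be to run the same image through the Keras model directly in Python first. Note that Keras's ResNet family ships its own preprocess_input, which does "caffe"-style preprocessing: RGB flipped to BGR and ImageNet channel means subtracted, with no division by std. A minimal sketch of that check, assuming the same test.png:

import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.resnet import ResNet152, preprocess_input

model = ResNet152(weights='imagenet')
# Cut the model at the pooled features, mirroring what the TensorFlow.js code does.
feature_extractor = tf.keras.Model(inputs=model.input, outputs=model.get_layer('avg_pool').output)
img = tf.keras.utils.load_img('test.png', target_size=(224, 224))
x = np.expand_dims(tf.keras.utils.img_to_array(img), axis=0)
x = preprocess_input(x)  # BGR + mean subtraction; no std scaling
print(feature_extractor.predict(x).squeeze())

If these values match what the TensorFlow.js script prints, the conversion itself is fine and any mismatch comes from preprocessing. (Also worth knowing: torchvision and Keras ship different ResNet-152 checkpoints, so the two frameworks will never agree exactly.)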
And then I exported the Keras (.h5) model to the TensorFlow.js format using the "tensorflowjs_converter" utility...
tensorflowjs_converter --input_format keras resnet152.h5 resnet152
Once I have the model in the appropriate format (I think), I switch over to JavaScript.
import * as tf from '@tensorflow/tfjs-node';
import fs from 'fs';

async function main() {
  const model = await tf.loadLayersModel('file://resnet152/model.json');
  // Cut the model at the pooled features, dropping the classification head.
  const modelWithoutFinalLayer = tf.model({
    inputs: model.input,
    outputs: model.getLayer('avg_pool').output
  });

  // Load the image from disk. This is the exact same image file.
  const image = fs.readFileSync('example_images/test.png');
  const imageTensor = tf.node.decodeImage(image, 3);
  // Subtract per-channel means and divide by per-channel stds on the 0-255 scale.
  const preprocessedInput = tf.div(tf.sub(imageTensor, [123.68, 116.779, 103.939]), [58.393, 57.12, 57.375]);
  const batchedInput = preprocessedInput.expandDims(0);

  const embeddings = modelWithoutFinalLayer.predict(batchedInput).squeeze();
  embeddings.print();
}

await main();
The result of the above script is:
Tensor
[0, 0, 0, ..., 0, 0, 0.029606]
Looking at the first few values of the two outputs, I expected there to be some variation, but not THIS MUCH.
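One concrete difference between the two pipelines is the normalization itself: the PyTorch script uses torchvision's ImageNet statistics (scale to [0, 1], standardize with mean/std, RGB order), while Keras's ResNet weights were trained against its "caffe"-style preprocess_input (BGR order, mean subtraction only). A small NumPy sketch of the two transforms side by side, for comparison (the helper names here are made up):

import numpy as np

def torchvision_style(x):
    # What the PyTorch script does: scale to [0, 1], then standardize
    # with the ImageNet mean/std, keeping RGB channel order.
    x = x.astype('float32') / 255.0
    return (x - [0.485, 0.456, 0.406]) / [0.229, 0.224, 0.225]

def keras_caffe_style(x):
    # What tf.keras.applications.resnet.preprocess_input does: flip RGB
    # to BGR, subtract the ImageNet channel means, no std division.
    x = x[..., ::-1].astype('float32')
    return x - [103.939, 116.779, 123.68]

rgb = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)
print(torchvision_style(rgb)[0, 0], keras_caffe_style(rgb)[0, 0])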
Where do I go from here? Is this much variation expected? Am I just doing this wrong?
Any help would be greatly appreciated.