I have a FAISS index populated with 8M embedding vectors. I don't have the embedding vectors anymore, only the index, and it is expensive to recompute the embeddings.
Can I search the index for the top-k most similar vectors to each of the index's vectors?
To be more concrete, say this is how my index was populated:
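import numpy as np
import faiss
d = 1024
N = 100
embeddings = np.random.rand(N, d).astype('float32')  # FAISS expects float32 input
ids = np.arange(N, dtype='int64')  # add_with_ids expects an int64 array
index = faiss.index_factory(
    d, 'IDMap,Flat', faiss.METRIC_INNER_PRODUCT
)
index.add_with_ids(embeddings, ids)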
I would like to get D, I such that:
D, I = index.search(embeddings, k)
but I don't have access to embeddings anymore; I only have the index.
I tried using index.reconstruct() to get back my (approximated?) embeddings, but I run into:
RuntimeError: Error in virtual void faiss::Index::reconstruct(faiss::Index::idx_t, float*) const at /root/miniconda3/conda-bld/faiss-pkg_1613228717761/work/faiss/Index.cpp:57: reconstruct not implemented for this type of index