Problem description
I have a folder of images. I want to compute VLAD features from each image.
I loop over each image, load it, and obtain the SIFT descriptors as follows:
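repo = '/media/data/images/';
filelist = dir(fullfile(repo, '*.jpg'));
sift_descr = cell(1, numel(filelist));   % one 128-by-N descriptor matrix per image
for i = 1:numel(filelist)
    I = imread(fullfile(repo, filelist(i).name));
    if size(I, 3) == 3
        I = rgb2gray(I);                 % vl_sift expects a grayscale image
    end
    [~, d] = vl_sift(single(I));         % d is 128-by-N, class uint8
    sift_descr{i} = d;
end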
However, VLAD requires the matrix of descriptors to be 2D. See here. What is the correct way to process my SIFT descriptors, before VLAD encoding? Thank you.
Recommended answer
First, you need to obtain a dictionary of visual words, or to be more specific: cluster the SIFT features of all images using k-means clustering. In [1], a coarse clustering with, e.g., 64 or 256 clusters is recommended.
For that, we have to concatenate all descriptors into one matrix, which we can then pass to the vl_kmeans function. Further, we convert the descriptors from uint8 to single, as the vl_kmeans function requires the input to be either single or double.
all_descr = single([sift_descr{:}]);
centroids = vl_kmeans(all_descr, 64);
Second, you have to create an assignment matrix of size NumberOfClusters-by-NumberOfDescriptors, which assigns each descriptor to a cluster. You have a lot of flexibility in creating this assignment matrix: you can do soft or hard assignments, and you can use a simple nearest neighbor search, kd-trees, or other approximate or hierarchical nearest neighbor schemes at your discretion.
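To make the shape of that matrix concrete, here is a minimal sketch of a hard assignment using a brute-force nearest neighbor search in plain MATLAB, without VLFeat. The toy descriptors and centroids are made up for illustration; with real data, descr would be the single-precision SIFT descriptors of one image.

```matlab
% Toy data (assumed for illustration): 5 descriptors of dimension 2
% and 3 centroids, all stored as columns.
descr     = single([0 1 9 10 5; 0 1 9 10 5]);
centroids = single([0 10 5; 0 10 5]);

numClusters = size(centroids, 2);
numDescr    = size(descr, 2);

% Squared Euclidean distance from every centroid (rows) to every descriptor (columns).
d2 = zeros(numClusters, numDescr, 'single');
for c = 1:numClusters
    delta = bsxfun(@minus, descr, centroids(:, c));
    d2(c, :) = sum(delta .^ 2, 1);
end

% Index of the closest centroid for each descriptor.
[~, nearest] = min(d2, [], 1);

% Hard assignment: exactly one 1 per column, in the row of the nearest centroid.
assignments = zeros(numClusters, numDescr, 'single');
assignments(sub2ind(size(assignments), nearest, 1:numDescr)) = 1;
```

A brute-force search like this costs NumberOfClusters distance computations per descriptor, which is usually acceptable for 64 or 256 centroids. A soft assignment would put fractional weights in each column instead of a single 1.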
In the tutorial, they use kd-trees, so let's stick to that: First, a kd-tree has to be built. This operation belongs right after finding the centroids:
kdtree = vl_kdtreebuild(centroids);
Then, we are ready to construct the VLAD vector for each image. We have to go through all images again and calculate each VLAD vector independently. First, we create the assignment matrix exactly as described in the tutorial. Then, we encode the SIFT descriptors using the vl_vlad function. The resulting VLAD vector has NumberOfClusters * SiftDescriptorSize elements, i.e. 64*128 in our example.
enc = zeros(64*128, numel(sift_descr));
for k = 1:numel(sift_descr)
    descr = single(sift_descr{k});
    % Create assignment matrix (one 1 per column, at the nearest centroid)
    nn = double(vl_kdtreequery(kdtree, centroids, descr));
    assignments = zeros(64, numel(nn), 'single');
    assignments(sub2ind(size(assignments), nn, 1:numel(nn))) = 1;
    % Encode using VLAD
    enc(:, k) = vl_vlad(descr, centroids, assignments);
end
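For intuition about what the encoding step accumulates, the following plain-MATLAB sketch computes a VLAD-style vector by hand: for each cluster it sums the residuals (descriptor minus centroid) of the descriptors assigned to it, stacks the per-cluster sums, and L2-normalizes the result. This only illustrates the idea from [1] on made-up toy data. It is not a replacement for vl_vlad, whose normalization options and exact output may differ.

```matlab
% Toy data (assumed for illustration): 4 descriptors of dimension 2,
% 2 centroids, and a hard assignment matrix (clusters-by-descriptors).
descr       = single([1 2 9 11; 1 0 10 10]);
centroids   = single([1 10; 1 10]);
assignments = single([1 1 0 0; 0 0 1 1]);

[dim, numClusters] = size(centroids);
blocks = zeros(dim, numClusters, 'single');
for c = 1:numClusters
    members = descr(:, assignments(c, :) > 0);
    % Sum of residuals (descriptor minus centroid) over this cluster's members.
    residuals = bsxfun(@minus, members, centroids(:, c));
    blocks(:, c) = sum(residuals, 2);
end

vlad_raw = blocks(:);                                  % dim * numClusters elements
vlad = vlad_raw / max(norm(vlad_raw), eps('single'));  % global L2 normalization
```

The vector length is dim * numClusters, which is where the 64*128 size in the example above comes from.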
Finally, we have the high-dimensional VLAD vectors for all images in the database. Usually, you'll want to reduce the dimensionality of the VLAD descriptors e.g. using PCA.
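One way to do the PCA step is sketched below in plain MATLAB using an economy-size SVD. The small enc matrix and the target dimensionality numComponents are stand-ins chosen for illustration; in practice enc would be the matrix of VLAD vectors (one column per image) and the number of components is something you would tune.

```matlab
% Stand-in for the matrix of VLAD vectors: 6-dimensional codes for 5 images (columns).
enc = single([1 2 3 4 5; 2 4 6 8 10; 0 1 0 1 0; 5 4 3 2 1; 1 1 1 1 1; 0 0 1 1 2]);
numComponents = 2;                  % hypothetical target dimensionality

% Center the data; PCA directions are defined relative to the mean.
mu = mean(enc, 2);
centered = bsxfun(@minus, enc, mu);

% Economy-size SVD: the columns of U are the principal directions,
% ordered by decreasing singular value.
[U, ~, ~] = svd(centered, 'econ');
proj = U(:, 1:numComponents);

enc_reduced = proj' * centered;     % numComponents-by-numImages

% A vector for an image outside the database is reduced with the same mu and proj.
query = enc(:, 1);                  % stand-in for a vector returned by vl_vlad
query_reduced = proj' * (query - mu);
```

Keeping mu and proj around matters: queries must be projected with the same mean and basis as the database vectors, otherwise distances between them are not comparable.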
Now, given a new image that is not in the database, you can extract the SIFT features using vl_sift, create the assignment matrix with vl_kdtreequery, and create the VLAD vector for that image using vl_vlad. So, you don't have to find new centroids or create a new kd-tree:
% Load image and extract SIFT features
new_image = imread('filename.jpg');
new_image = single(rgb2gray(new_image));
[~, new_sift] = vl_sift(new_image);
% Create assignment matrix
nn = vl_kdtreequery(kdtree, centroids, single(new_sift));
assignments = zeros(64, numel(nn), 'single');
assignments(sub2ind(size(assignments), double(nn), 1:numel(nn))) = 1;
% Encode using VLAD
new_vlad = vl_vlad(single(new_sift), centroids, assignments);
[1] Arandjelovic, R., & Zisserman, A. (2013). All About VLAD. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1578–1585. https://doi.org/10.1109/CVPR.2013.207