python - OpenCV-Python Bag of Words (BoW): generating histograms from a dictionary

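I have been trying to create an image classifier in Python OpenCV 3.2.0 using keypoints and the bag-of-words technique. After some reading, I found that I can do this as follows: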

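1. Extract image descriptors using AKAZE
2. Perform k-means clustering on the descriptors to generate a dictionary
3. Generate histograms of the images based on the dictionary
4. Train an SVM using the histograms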

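I managed to perform steps 1 and 2, but got stuck on steps 3 and 4.

I successfully generated histograms using the labels returned by k-means clustering. However, when I wanted to use new test data that was not used to generate the dictionary, I got some unexpected results. I tried using a FLANN matcher as in this tutorial, but the histogram generated from the label data does not match the one built from the FLANN matches.

I load the images: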

import cv2
import numpy as np
import matplotlib.pyplot as plt

dictionary_size = 512
# Loading images
# imreads returns a list of all images in that directory
imgs = imreads(imgs_path)
# create one zeroed numpy array per image to hold its histogram
imgs_data = [np.zeros((dictionary_size, 1)) for _ in imgs]

Then I create an array of descriptors (desc):

def get_descriptors(img, detector):
    # returns descriptors of an image
    return detector.detectAndCompute(img, None)[1]

# Extracting descriptors
detector = cv2.AKAZE_create()

desc = []
# desc_src_img is a list which says which image a descriptor belongs to
desc_src_img = []
for i, img in enumerate(imgs):
    descriptors = get_descriptors(img, detector)
    if descriptors is None:
        # no keypoints were found in this image
        continue
    desc.append(descriptors)
    # Keep track of which image a descriptor belongs to
    desc_src_img.extend([i] * len(descriptors))
# important: cv2.kmeans only accepts float32 descriptors
desc = np.float32(np.vstack(desc))

Then I cluster the descriptors using k-means:

# Clustering
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 0.01)
flags = cv2.KMEANS_PP_CENTERS
# desc is a float32 numpy array of vstacked descriptors
compactness, labels, dictionary = cv2.kmeans(desc, dictionary_size, None, criteria, 1, flags)
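As a sanity check on the clustering step, the following sketch assigns each descriptor to its nearest dictionary row by brute force in NumPy. It assumes plain Euclidean (L2) distance and that each label is a row index into the dictionary; the helper name `nearest_word_ids` and the toy arrays are made up for illustration.

```python
import numpy as np

def nearest_word_ids(descriptors, dictionary):
    """Return, for each descriptor row, the index of the closest dictionary row (L2)."""
    d = np.asarray(descriptors, dtype=np.float32)
    c = np.asarray(dictionary, dtype=np.float32)
    # Squared distances via ||d - c||^2 = ||d||^2 - 2 d.c + ||c||^2, shape (N, K)
    sq = (d * d).sum(axis=1)[:, None] - 2.0 * d.dot(c.T) + (c * c).sum(axis=1)[None, :]
    return sq.argmin(axis=1)

# Toy check: two well-separated centres and four points around them
toy_dictionary = np.array([[0.0, 0.0], [10.0, 10.0]], dtype=np.float32)
toy_desc = np.array([[0.5, -0.2], [9.0, 11.0], [0.1, 0.3], [10.2, 9.9]], dtype=np.float32)
toy_ids = nearest_word_ids(toy_desc, toy_dictionary)
```

With the variables from the question, `np.mean(nearest_word_ids(desc, dictionary) == labels.ravel())` would give the fraction of descriptors whose k-means label agrees with the brute-force assignment. The full distance matrix has one entry per descriptor and visual word, so a large descriptor set may need to be processed in chunks.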

Then I create a histogram for each image using the labels returned by k-means:

# Getting histograms from labels
for i, label in enumerate(labels.ravel()):
    # Get this descriptor's image id
    img_id = desc_src_img[i]
    # imgs_data is a list of the same size as the number of images
    data = imgs_data[img_id]
    # data is a numpy array of size (dictionary_size, 1) filled with zeros
    data[label] += 1

ax = plt.subplot(311)
ax.set_title("Histogram from labels")
ax.set_xlabel("Visual words")
ax.set_ylabel("Frequency")
ax.plot(imgs_data[0].ravel())
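The per-descriptor loop above can also be written in vectorized form. This is a minimal sketch, assuming `labels` has the `(N, 1)` shape returned by `cv2.kmeans` and `desc_src_img` holds one image index per descriptor; the function name and the toy data are hypothetical.

```python
import numpy as np

def histograms_from_labels(labels, desc_src_img, n_images, dictionary_size):
    """Count, per image, how many descriptors were assigned to each visual word."""
    labels = np.asarray(labels).ravel()  # flatten the (N, 1) label array
    img_ids = np.asarray(desc_src_img)
    hist = np.zeros((n_images, dictionary_size), dtype=np.float32)
    # np.add.at accumulates correctly even when (image, word) pairs repeat
    np.add.at(hist, (img_ids, labels), 1)
    return hist

# Toy data: 5 descriptors, 2 images, 3 visual words
toy_labels = np.array([[0], [2], [2], [1], [0]])
toy_src = [0, 0, 0, 1, 1]
toy_hist = histograms_from_labels(toy_labels, toy_src, n_images=2, dictionary_size=3)
```

Row `i` of the result corresponds to `imgs_data[i].ravel()` in the loop version, so `toy_hist[0]` could be plotted in the same way.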

This outputs a histogram like this:

[Figure: "Histogram from labels"]

which is very evenly distributed and is what I expected.

Then I tried to do the same thing on the same image using FLANN:

matcher = cv2.FlannBasedMatcher_create()
matcher.add(dictionary)
matcher.train()

descriptors = get_descriptors(imgs[0], detector)

result = np.zeros((dictionary_size, 1), np.float32)
# the FLANN matcher needs descriptors to be float32
matches = matcher.match(np.float32(descriptors))
for match in matches:
    visual_word = match.trainIdx
    result[visual_word] += 1

ax = plt.subplot(313)
ax.set_title("Histogram from FLANN")
ax.set_xlabel("Visual words")
ax.set_ylabel("Frequency")
ax.plot(result.ravel())

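This outputs a histogram like this:

[Figure: "Histogram from FLANN"]

which is very unevenly distributed and does not match the first histogram.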

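You can view the complete code and images on GitHub. Before running, change "imgs_path" (line 20) to the directory containing your images.

Where am I going wrong? Why are the histograms so different? How do I use the dictionary to generate histograms for new data?

As a side note, I tried using the OpenCV BOW implementation but ran into another problem, which raised the error "_queryDescriptors.type() == trainDescType in function cv::BFMatcher::knnMatchImpl". That is why I tried to implement it myself. If someone could provide a working example using Python OpenCV BOW and AKAZE, that would be just as helpful.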

Best answer

It seems you cannot train the FlannBasedMatcher with the dictionary beforehand, like this:

matcher = cv2.FlannBasedMatcher_create()
matcher.add(dictionary)
matcher.train()

However, you can pass the dictionary at match time, like this:

matcher = cv2.FlannBasedMatcher_create()

...

matches = matcher.match(np.float32(descriptors), dictionary)

I'm not quite sure why this is the case. Maybe the train method is only meant to be used by the match method, as hinted in this post.

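Also, according to the OpenCV docs, the parameters of match include queryDescriptors and trainDescriptors.

So I guess you are simply meant to pass dictionary as trainDescriptors, because that is what it is.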

If anyone can shed more light on this, it would be much appreciated.

The results after using the above method are as follows:

[Figure: resulting histograms]

You can see the full updated code here.
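The approach in this answer can be wrapped in a small helper that builds a histogram for an image that was not used to create the dictionary. This is only a sketch: the name `bow_histogram` is hypothetical, and the only thing it assumes about `matcher` is an OpenCV-style `match(query, train)` method whose results expose `trainIdx`.

```python
import numpy as np

def bow_histogram(descriptors, dictionary, matcher):
    """Build a visual-word histogram for one image.

    `matcher` is any object with an OpenCV-style match(query, train) method,
    for example cv2.FlannBasedMatcher_create() as used in the answer.
    """
    hist = np.zeros(len(dictionary), dtype=np.float32)
    if descriptors is None or len(descriptors) == 0:
        return hist  # the image produced no descriptors
    # The dictionary is passed as the train set at match time
    matches = matcher.match(np.float32(descriptors), np.float32(dictionary))
    for m in matches:
        # trainIdx is the index of the matched dictionary row (the visual word)
        hist[m.trainIdx] += 1
    return hist
```

Assuming the `detector`, `dictionary`, and `get_descriptors` from the question, a call might look like `bow_histogram(get_descriptors(new_img, detector), dictionary, cv2.FlannBasedMatcher_create())`, where `new_img` stands for a hypothetical image outside the training set.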

Regarding python - OpenCV-Python Bag of Words (BoW): generating histograms from a dictionary, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43104111/