Problem description
Update
Direct Colab link. Just grab the given dummy data set and load it into Colab.
I'm trying to train an object detection model for a multi-class problem. In my training, I am using Mosaic augmentation (Paper) for this task.
In my training pipeline, I'm stuck on how to properly retrieve the class label for each box, because the augmentation randomly picks a sub-portion of each sample. Below is the result of the mosaic augmentation that we've achieved so far, with the relevant bounding boxes.
Data Set
I've created a dummy data set. Link here. The df.head(): [output image]
It has 4 classes in total, and df.object.value_counts() gives:
human    23
car      13
cat       5
dog       3
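Since the multi-class branch of the loader converts labels to an int64 tensor, the class names above need integer ids at some point. A minimal sketch of one way to do that follows. The names class_to_id, id_to_class and row_objects are hypothetical, and reserving id 0 for background is an assumption that depends on the detector being trained.

```python
class_names = ["human", "car", "cat", "dog"]   # the four classes listed above

# Assumption: id 0 is reserved for background, so real classes start at 1.
class_to_id = {name: i for i, name in enumerate(sorted(class_names), start=1)}
id_to_class = {i: name for name, i in class_to_id.items()}

# Hypothetical per-box class names for one image, e.g. taken from df.object.
row_objects = ["human", "dog", "car"]
label_ids = [class_to_id[name] for name in row_objects]
```

Sorting the names first keeps the mapping stable across runs, regardless of the order in which classes appear in the data frame.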
Data Loader and Mosaic Augmentation
The data loader is defined as follows. The mosaic augmentation should be defined inside it, but for now I'll show it as a separate code snippet for better demonstration.
import random

import albumentations as A
import numpy as np
import torch
from albumentations.pytorch import ToTensorV2
from torch.utils.data import Dataset

IMG_SIZE = 2000

class DatasetRetriever(Dataset):
    def __init__(self, main_df, image_ids, transforms=None, test=False):
        super().__init__()
        self.image_ids = image_ids
        self.main_df = main_df
        self.transforms = transforms
        self.size_limit = 1
        self.test = test

    def __getitem__(self, index: int):
        image_id = self.image_ids[index]
        image, boxes, labels = self.load_mosaic_image_and_boxes(index)

        # labels = torch.tensor(labels, dtype=torch.int64) # for multi-class
        labels = torch.ones((boxes.shape[0],), dtype=torch.int64) # for single-class

        target = {}
        target['boxes'] = boxes
        target['cls'] = labels
        target['image_id'] = torch.tensor([index])

        if self.transforms:
            # retry up to 10 times until at least one box survives the transform
            for i in range(10):
                sample = self.transforms(**{
                    'image' : image,
                    'bboxes': target['boxes'],
                    'labels': target['cls']
                })

                assert len(sample['bboxes']) == target['cls'].shape[0], 'not equal!'
                if len(sample['bboxes']) > 0:
                    # image
                    image = sample['image']

                    # box: swap (x1, y1, x2, y2) to (y1, x1, y2, x2)
                    target['boxes'] = torch.tensor(sample['bboxes'])
                    target['boxes'][:,[0,1,2,3]] = target['boxes'][:,[1,0,3,2]]

                    # label
                    target['cls'] = torch.stack(sample['labels'])
                    break

        return image, target

    def __len__(self) -> int:
        return self.image_ids.shape[0]
Basic Transform
def get_transforms():
    return A.Compose(
        [
            A.Resize(height=IMG_SIZE, width=IMG_SIZE, p=1.0),
            ToTensorV2(p=1.0),
        ],
        p=1.0,
        bbox_params=A.BboxParams(
            format='pascal_voc',
            min_area=0,
            min_visibility=0,
            label_fields=['labels']
        )
    )
Mosaic Augmentation
Note that it should be defined inside the data loader. The main issue is that, while iterating over all 4 samples to build the mosaic, each image is cropped into its tile and its bounding boxes are shifted as follows:
mosaic_image[y1a:y2a, x1a:x2a] = image[y1b:y2b, x1b:x2b]
offset_x = x1a - x1b
offset_y = y1a - y1b
boxes[:, 0] += offset_x
boxes[:, 1] += offset_y
boxes[:, 2] += offset_x
boxes[:, 3] += offset_y
Given this, how would I select the relevant class labels for the bounding boxes that end up in the mosaic? Please see the full code below:
def load_mosaic_image_and_boxes(self, index, s=3000,
                                minfrac=0.25, maxfrac=0.75):
    self.mosaic_size = s
    xc, yc = np.random.randint(s * minfrac, s * maxfrac, (2,))

    # pick 3 other random samples
    indices = [index] + random.sample(range(len(self.image_ids)), 3)

    mosaic_image = np.zeros((s, s, 3), dtype=np.float32)
    final_boxes = []   # boxes for the sub-regions
    final_labels = []  # relevant class labels

    for i, index in enumerate(indices):
        image, boxes, labels = self.load_image_and_boxes(index)

        if i == 0:    # top left
            x1a, y1a, x2a, y2a = 0, 0, xc, yc
            x1b, y1b, x2b, y2b = s - xc, s - yc, s, s  # from bottom right
        elif i == 1:  # top right
            x1a, y1a, x2a, y2a = xc, 0, s, yc
            x1b, y1b, x2b, y2b = 0, s - yc, s - xc, s  # from bottom left
        elif i == 2:  # bottom left
            x1a, y1a, x2a, y2a = 0, yc, xc, s
            x1b, y1b, x2b, y2b = s - xc, 0, s, s - yc  # from top right
        elif i == 3:  # bottom right
            x1a, y1a, x2a, y2a = xc, yc, s, s
            x1b, y1b, x2b, y2b = 0, 0, s - xc, s - yc  # from top left

        # calculate and apply box offsets due to replacement
        offset_x = x1a - x1b
        offset_y = y1a - y1b
        boxes[:, 0] += offset_x
        boxes[:, 1] += offset_y
        boxes[:, 2] += offset_x
        boxes[:, 3] += offset_y

        # cut image, save boxes
        mosaic_image[y1a:y2a, x1a:x2a] = image[y1b:y2b, x1b:x2b]
        final_boxes.append(boxes)

        '''
        ATTENTION:
        Need some mechanism to get relevant class labels
        '''
        final_labels.append(labels)

    # collect boxes
    final_boxes = np.vstack(final_boxes)
    final_labels = np.hstack(final_labels)

    # clip boxes to the image area
    final_boxes[:, 0:] = np.clip(final_boxes[:, 0:], 0, s).astype(np.int32)
    w = (final_boxes[:,2] - final_boxes[:,0])
    h = (final_boxes[:,3] - final_boxes[:,1])

    # discard boxes where w or h < self.size_limit
    final_boxes = final_boxes[(w>=self.size_limit) & (h>=self.size_limit)]

    return mosaic_image, final_boxes, final_labels
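To make the offset step concrete, here is a small worked example for the top-left tile only. The values of s, xc, yc and the two boxes are made up for illustration, and the source image is assumed to be s x s, as the crop coordinates in the function imply. Shifting changes coordinates but not row order, so labels stay aligned with boxes by index for as long as no rows are dropped.

```python
import numpy as np

s, xc, yc = 3000, 1000, 1200           # illustrative values, not from the data set

# Top-left tile: destination region in the mosaic, source region in the image.
x1a, y1a, x2a, y2a = 0, 0, xc, yc
x1b, y1b, x2b, y2b = s - xc, s - yc, s, s

offset_x = x1a - x1b                   # 0 - 2000 = -2000
offset_y = y1a - y1b                   # 0 - 1800 = -1800

boxes = np.array([[2200, 2000, 2800, 2600],   # inside the source crop
                  [100, 100, 400, 400]],      # outside the source crop
                 dtype=np.float32)
labels = np.array(["human", "car"])           # row i labels box i

# Same shift as the four in-place additions in the function above.
shifted = boxes + np.array([offset_x, offset_y, offset_x, offset_y],
                           dtype=np.float32)
```

The first box lands at (200, 200, 800, 800) inside the tile. The second ends up at negative coordinates, so clipping to the canvas collapses it to zero width and height, and it is later discarded by the size check.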
That's it. I hope I've made my query clear. Your suggestions would be highly appreciated.
Along with this query, I've also updated another closely related query that I asked a few days ago but that didn't get enough responses, and made it clearer. In case you're interested, here is the link: Stratified K-Fold For Multi-Class Object Detection?
Solution
Solved -)
The problem is solved. Initially, I was thinking about it in a very hard way. However, all I needed to do was parse the bounding box and class label information at the same time. Jokes aside, I lost 100 bounty points >_<, I should try one more time.
Anyway, below is the output that we've achieved now. In case you're interested in trying it with your own data set, here is the colab notebook as a starter. Happy coding -)
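For readers wondering what parsing the bounding box and class label information at the same time might look like, here is a minimal sketch. It is not taken from the notebook: the helper name clip_and_filter and its arguments are hypothetical, and it assumes boxes are an (N, 4) NumPy array in pascal_voc order with labels as an (N,) array aligned by row. The idea is simply that any mask used to drop boxes is applied to the labels as well.

```python
import numpy as np


def clip_and_filter(boxes, labels, s, size_limit=1):
    """Clip boxes to an s x s canvas and drop tiny ones, keeping labels aligned.

    Hypothetical helper: boxes is (N, 4) as [x1, y1, x2, y2], labels is (N,).
    """
    boxes = np.clip(np.asarray(boxes, dtype=np.float32), 0, s)
    labels = np.asarray(labels)
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    keep = (w >= size_limit) & (h >= size_limit)
    # The same boolean mask indexes both arrays, so row i of the result
    # still pairs box i with its class label.
    return boxes[keep], labels[keep]


boxes = np.array([
    [-50, 10, 120, 200],     # partly outside: clipped, kept
    [3100, 40, 3300, 90],    # fully outside to the right: zero width after clip
    [500, 500, 900, 800],    # fully inside
])
labels = np.array([1, 2, 3])
kept_boxes, kept_labels = clip_and_filter(boxes, labels, s=3000)
```

In the question's function, the equivalent change would be to index final_labels with the same (w >= self.size_limit) & (h >= self.size_limit) mask that is applied to final_boxes, so both arrays keep the same length.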