Problem description
I googled around a bit but I only found questions about enabling data augmentation.
I followed this tutorial, but with my own dataset (only one class). I had already performed data augmentation on my dataset beforehand, so I deleted the responsible lines from the pipeline.config.
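For context, augmentation in the Object Detection API is configured through data_augmentation_options entries inside train_config, so deleting those entries is what turns it off. The lines I removed looked roughly like this (taken from the SSD ResNet50 FPN sample config; the exact ops may differ for other models):

  # These blocks live inside train_config; deleting them disables the
  # corresponding augmentation ops.
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    random_crop_image {
      min_object_covered: 0.0
      min_aspect_ratio: 0.75
      max_aspect_ratio: 3.0
      min_area: 0.75
      max_area: 1.0
      overlap_thresh: 0.0
    }
  }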
Now my pipeline.config looks like this:
model {
  ssd {
    num_classes: 1
    image_resizer {
      fixed_shape_resizer {
        height: 640
        width: 640
      }
    }
    feature_extractor {
      type: "ssd_resnet50_v1_fpn_keras"
      depth_multiplier: 1.0
      min_depth: 16
      conv_hyperparams {
        regularizer {
          l2_regularizer {
            weight: 0.00039999998989515007
          }
        }
        initializer {
          truncated_normal_initializer {
            mean: 0.0
            stddev: 0.029999999329447746
          }
        }
        activation: RELU_6
        batch_norm {
          decay: 0.996999979019165
          scale: true
          epsilon: 0.0010000000474974513
        }
      }
      override_base_feature_extractor_hyperparams: true
      fpn {
        min_level: 3
        max_level: 7
      }
    }
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
        use_matmul_gather: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    box_predictor {
      weight_shared_convolutional_box_predictor {
        conv_hyperparams {
          regularizer {
            l2_regularizer {
              weight: 0.00039999998989515007
            }
          }
          initializer {
            random_normal_initializer {
              mean: 0.0
              stddev: 0.009999999776482582
            }
          }
          activation: RELU_6
          batch_norm {
            decay: 0.996999979019165
            scale: true
            epsilon: 0.0010000000474974513
          }
        }
        depth: 256
        num_layers_before_predictor: 4
        kernel_size: 3
        class_prediction_bias_init: -4.599999904632568
      }
    }
    anchor_generator {
      multiscale_anchor_generator {
        min_level: 3
        max_level: 7
        anchor_scale: 4.0
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        scales_per_octave: 2
      }
    }
    post_processing {
      batch_non_max_suppression {
        score_threshold: 9.99999993922529e-09
        iou_threshold: 0.6000000238418579
        max_detections_per_class: 100
        max_total_detections: 100
        use_static_shapes: false
      }
      score_converter: SIGMOID
    }
    normalize_loss_by_num_matches: true
    loss {
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      classification_loss {
        weighted_sigmoid_focal {
          gamma: 2.0
          alpha: 0.25
        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    encode_background_as_zeros: true
    normalize_loc_loss_by_codesize: true
    inplace_batchnorm_update: true
    freeze_batchnorm: false
  }
}
train_config {
  batch_size: 1
  sync_replicas: true
  optimizer {
    momentum_optimizer {
      learning_rate {
        cosine_decay_learning_rate {
          learning_rate_base: 0.03999999910593033
          total_steps: 25000
          warmup_learning_rate: 0.013333000242710114
          warmup_steps: 2000
        }
      }
      momentum_optimizer_value: 0.8999999761581421
    }
    use_moving_average: false
  }
  fine_tune_checkpoint: "/home/sally/work/training/TensorFlow/workspace/pre-trained-models/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/checkpoint/ckpt-0"
  num_steps: 25000
  startup_delay_steps: 0.0
  replicas_to_aggregate: 8
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
  fine_tune_checkpoint_type: "detection"
  use_bfloat16: false
  fine_tune_checkpoint_version: V2
}
train_input_reader {
  label_map_path: "/home/sally/work/training/TensorFlow/workspace/annotations/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "/home/sally/work/training/TensorFlow/workspace/annotations/train.record"
  }
}
eval_config {
  metrics_set: "coco_detection_metrics"
  use_moving_averages: false
}
eval_input_reader {
  label_map_path: "/home/sally/work/training/TensorFlow/workspace/annotations/label_map.pbtxt"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "/home/sally/work/training/TensorFlow/workspace/annotations/test.record"
  }
}
I started the training, but in TensorBoard I can see that the training images are very distorted.
For reference, normal images look like this:
As you can see, I am trying to detect Kellogg's boxes. The dataset was generated with Blender (the soda can and the fence are there as decoy objects and to partially occlude the boxes).
Now my question: how do I disable any sort of data augmentation in the Object Detection API? The mAP is very low because these distorted images are used during the training process.
This is an issue with the normalization of the images; it does not affect your training. However, if you want the images to be displayed correctly in TensorBoard, normalize them to the (0, 1) range. Check this link for some possible changes.
Note: normalizing between (-1, 1) has been reported to create the same issue.
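A minimal sketch of the idea, assuming you log the preprocessed batches yourself with tf.summary.image (which interprets float values as being in [0, 1]); the rescale_for_tensorboard helper below is illustrative and not part of the Object Detection API:

import tensorflow as tf

def rescale_for_tensorboard(images):
    """Min-max rescale a float image batch to [0, 1] purely for display.

    The model still trains on the original preprocessed values; this only
    changes what TensorBoard renders.
    """
    images = tf.cast(images, tf.float32)
    lo = tf.reduce_min(images)
    hi = tf.reduce_max(images)
    return (images - lo) / tf.maximum(hi - lo, 1e-6)

# Usage: log a stand-in "preprocessed" batch so it renders correctly.
writer = tf.summary.create_file_writer("logs/train_images")
with writer.as_default():
    batch = tf.random.uniform([4, 640, 640, 3], minval=-1.0, maxval=1.0)
    tf.summary.image("train_input_images", rescale_for_tensorboard(batch),
                     step=0, max_outputs=4)

Converting to uint8 in [0, 255] works as well; the point is simply that tf.summary.image treats floats as [0, 1] data, which is why batches preprocessed to (-1, 1) display incorrectly.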