train_with_data_aug(no_aug, no_aug)
-------- Result --------
training on [gpu(0)]
epoch 1, loss 1.3485, train acc 0.522, test acc 0.556, time 62.9 sec
epoch 2, loss 0.7872, train acc 0.722, test acc 0.705, time 65.0 sec
epoch 3, loss 0.5654, train acc 0.802, test acc 0.738, time 67.0 sec
epoch 4, loss 0.4175, train acc 0.853, test acc 0.777, time 67.9 sec
epoch 5, loss 0.3043, train acc 0.895, test acc 0.789, time 67.8 sec
epoch 6, loss 0.2183, train acc 0.923, test acc 0.799, time 68.2 sec
epoch 7, loss 0.1547, train acc 0.946, test acc 0.810, time 68.5 sec
epoch 8, loss 0.1150, train acc 0.960, test acc 0.799, time 68.9 sec
epoch 9, loss 0.0814, train acc 0.972, test acc 0.809, time 69.1 sec
epoch 10, loss 0.0725, train acc 0.974, test acc 0.806, time 70.2 sec
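Add different image augmentation methods when training the model on the CIFAR-10 dataset, and observe the results.
Answer: The time per epoch stays roughly the same, the final test accuracy improves (0.820 versus 0.806 without augmentation), and the model fits more slowly (training accuracy reaches 0.934 after 10 epochs instead of 0.974).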
complex_aug = gdata.vision.transforms.Compose([
    gdata.vision.transforms.RandomFlipLeftRight(),
    gdata.vision.transforms.RandomHue(0.5),
    gdata.vision.transforms.ToTensor()])
train_with_data_aug(complex_aug, no_aug)
-------- Result --------
training on [gpu(0)]
epoch 1, loss 1.5822, train acc 0.446, test acc 0.496, time 69.3 sec
epoch 2, loss 0.9240, train acc 0.673, test acc 0.676, time 68.4 sec
epoch 3, loss 0.6791, train acc 0.764, test acc 0.739, time 68.6 sec
epoch 4, loss 0.5490, train acc 0.810, test acc 0.728, time 69.8 sec
epoch 5, loss 0.4555, train acc 0.842, test acc 0.777, time 70.5 sec
epoch 6, loss 0.3836, train acc 0.868, test acc 0.762, time 70.2 sec
epoch 7, loss 0.3227, train acc 0.889, test acc 0.795, time 69.8 sec
epoch 8, loss 0.2728, train acc 0.906, test acc 0.807, time 70.0 sec
epoch 9, loss 0.2392, train acc 0.918, test acc 0.823, time 70.8 sec
epoch 10, loss 0.1931, train acc 0.934, test acc 0.820, time 70.1 sec
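To make the comparison between the two runs concrete, the final-epoch lines of both logs can be parsed and the gap between training and test accuracy computed. This is only a small sketch using the Python standard library; the helper names `parse_epoch` and `generalization_gap` are made up for this illustration and are not part of MXNet or d2l.

```python
import re

# Final-epoch lines copied from the two training logs above.
NO_AUG = "epoch 10, loss 0.0725, train acc 0.974, test acc 0.806, time 70.2 sec"
WITH_AUG = "epoch 10, loss 0.1931, train acc 0.934, test acc 0.820, time 70.1 sec"

LINE_RE = re.compile(
    r"epoch (\d+), loss ([\d.]+), train acc ([\d.]+), "
    r"test acc ([\d.]+), time ([\d.]+) sec"
)


def parse_epoch(line):
    """Parse one log line into a dict of numbers (hypothetical helper)."""
    match = LINE_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    epoch, loss, train_acc, test_acc, secs = match.groups()
    return {
        "epoch": int(epoch),
        "loss": float(loss),
        "train_acc": float(train_acc),
        "test_acc": float(test_acc),
        "time": float(secs),
    }


def generalization_gap(record):
    """Training accuracy minus test accuracy, rounded to 3 decimals."""
    return round(record["train_acc"] - record["test_acc"], 3)


for name, line in (("no_aug", NO_AUG), ("complex_aug", WITH_AUG)):
    rec = parse_epoch(line)
    print(name, "test acc:", rec["test_acc"], "gap:", generalization_gap(rec))
```

A smaller gap for the augmented run is consistent with the answer above: the model fits the training set more slowly but generalizes slightly better after 10 epochs.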
According to the MXNet documentation, what other image augmentation methods does Gluon's transforms module provide?
| Method | Meaning |
| --- | --- |
| transforms.Compose | Sequentially composes multiple transforms. |
| transforms.Cast | Cast inputs to a specific data type. |
| transforms.ToTensor | Converts an image NDArray or batch of image NDArray to a tensor NDArray. |
| transforms.Normalize | Normalize a tensor of shape (C x H x W) or (N x C x H x W) with mean and standard deviation. |
| transforms.RandomResizedCrop | Crop the input image with random scale and aspect ratio. |
| transforms.CenterCrop | Crops the image src to the given size by trimming on all four sides and preserving the center of the image. |
| transforms.Resize | Resize an image or a batch of image NDArray to the given size. |
| transforms.RandomFlipLeftRight | Randomly flip the input image left to right with a probability of p (0.5 by default). |
| transforms.RandomFlipTopBottom | Randomly flip the input image top to bottom with a probability of p (0.5 by default). |
| transforms.RandomBrightness | Randomly jitters image brightness with a factor chosen from [max(0, 1 - brightness), 1 + brightness]. |
| transforms.RandomContrast | Randomly jitters image contrast with a factor chosen from [max(0, 1 - contrast), 1 + contrast]. |
| transforms.RandomSaturation | Randomly jitters image saturation with a factor chosen from [max(0, 1 - saturation), 1 + saturation]. |
| transforms.RandomHue | Randomly jitters image hue with a factor chosen from [max(0, 1 - hue), 1 + hue]. |
| transforms.RandomColorJitter | Randomly jitters the brightness, contrast, saturation, and hue of an image. |
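The jitter interval [max(0, 1 - x), 1 + x] that appears in several of the descriptions above can be worked out with a few lines of plain Python. The sketch below is not MXNet code: `jitter_range`, `sample_jitter_factor`, and `should_flip` are hypothetical helpers, and drawing the factor uniformly from the interval is an assumption made for illustration.

```python
import random


def jitter_range(strength):
    """Return the interval [max(0, 1 - strength), 1 + strength]."""
    if strength < 0:
        raise ValueError("strength must be non-negative")
    return max(0.0, 1.0 - strength), 1.0 + strength


def sample_jitter_factor(strength, rng=random):
    """Draw one factor from the interval (uniform sampling is an assumption)."""
    low, high = jitter_range(strength)
    return rng.uniform(low, high)


def should_flip(p=0.5, rng=random):
    """Decide whether to flip an image, with probability p."""
    return rng.random() < p


rng = random.Random(0)  # seeded so repeated runs draw the same values
print(jitter_range(0.5))   # (0.5, 1.5)
print(jitter_range(2.0))   # (0.0, 3.0): the lower bound is clipped at 0
print(sample_jitter_factor(0.5, rng), should_flip(0.5, rng))
```

With a strength of 0.5, as in the RandomHue(0.5) call used earlier, the factor stays between 0.5 and 1.5; any strength of 1 or more clips the lower bound to 0.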
finetune_net.features.collect_params().setattr('grad_req', 'null')
-------- Output --------
training on [gpu(0)]
epoch 1, loss 0.4164, train acc 0.824, test acc 0.849, time 13.2 sec
epoch 2, loss 0.4104, train acc 0.820, test acc 0.848, time 13.3 sec
epoch 3, loss 0.4065, train acc 0.812, test acc 0.849, time 13.1 sec
epoch 4, loss 0.3911, train acc 0.820, test acc 0.850, time 13.1 sec
epoch 5, loss 0.3945, train acc 0.822, test acc 0.846, time 13.0 sec
Parameters of image.ImageDetIter
aug_list (list or None) – Augmenter list for generating distorted images.
batch_size (int) – Number of examples per batch.
data_shape (tuple) – Data shape in (channels, height, width) format. For now, only RGB image with 3 channels is supported.
path_imgrec (str) – Path to image record file (.rec). Created with tools/im2rec.py or bin/im2rec.
path_imglist (str) – Path to image list (.lst). Created with tools/im2rec.py or with custom script. Format: Tab separated record of index, one or more labels and relative_path_from_root.
imglist (list) – A list of images with the label(s). Each item is a list [imagelabel: float or list of float, imgpath].
path_root (str) – Root folder of image files.
path_imgidx (str) – Path to image index file. Needed for partition and shuffling when using .rec source.
shuffle (bool) – Whether to shuffle all images at the start of each iteration or not. Can be slow for HDD.
part_index (int) – Partition index.
num_parts (int) – Total number of partitions.
data_name (str) – Data name for provided symbols.
label_name (str) – Name for detection labels.
last_batch_handle (str, optional) – How to handle the last batch. This parameter can be 'pad' (default), 'discard' or 'roll_over'. If 'pad', the last batch will be padded with data starting from the beginning. If 'discard', the last batch will be discarded. If 'roll_over', the remaining elements will be rolled over to the next iteration.
kwargs – More arguments for creating augmenter. See mx.image.CreateDetAugmenter.
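The three last_batch_handle modes described above can be illustrated with a simplified, pure-Python batching function. This is a sketch of the documented behaviour only, not MXNet's implementation; `make_batches` and its `carry` argument are hypothetical names introduced here.

```python
def make_batches(samples, batch_size, last_batch_handle="pad", carry=()):
    """Split one pass over `samples` into batches.

    Returns (batches, leftover). `carry` holds samples rolled over from the
    previous pass and only matters for 'roll_over'.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if last_batch_handle not in ("pad", "discard", "roll_over"):
        raise ValueError(f"unknown mode: {last_batch_handle!r}")

    pool = list(carry) + list(samples)
    full = len(pool) // batch_size * batch_size
    batches = [pool[i:i + batch_size] for i in range(0, full, batch_size)]
    tail = pool[full:]

    if not tail or last_batch_handle == "discard":
        return batches, []          # incomplete batch is dropped
    if last_batch_handle == "roll_over":
        return batches, tail        # keep the remainder for the next pass
    # 'pad': fill the incomplete batch with data from the beginning
    need = batch_size - len(tail)
    batches.append(tail + [pool[i % len(pool)] for i in range(need)])
    return batches, []


data = list(range(5))
print(make_batches(data, 2, "pad"))        # last batch is [4, 0]
print(make_batches(data, 2, "discard"))    # sample 4 is dropped
first, left = make_batches(data, 2, "roll_over")
print(first, left)                         # sample 4 is carried over
print(make_batches(data, 2, "roll_over", carry=left))
```

With 5 samples and a batch size of 2, 'pad' yields three batches, 'discard' yields two, and 'roll_over' yields two batches now and starts the next pass with the carried sample.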
Parameters of image.CreateDetAugmenter
data_shape (tuple of int) – Shape for output data
resize (int) – Resize shorter edge if larger than 0 at the beginning
rand_crop (float) – [0, 1], probability to apply random cropping
rand_pad (float) – [0, 1], probability to apply random padding
rand_gray (float) – [0, 1], probability to convert to grayscale for all channels
rand_mirror (bool) – Whether to apply horizontal flip to image with probability 0.5
mean (np.ndarray or None) – Mean pixel values for [r, g, b]
std (np.ndarray or None) – Standard deviations for [r, g, b]
brightness (float) – Brightness jittering range (percent)
contrast (float) – Contrast jittering range (percent)
saturation (float) – Saturation jittering range (percent)
hue (float) – Hue jittering range (percent)
pca_noise (float) – PCA noise level (percent)
inter_method (int, default=2 (Area-based)) – Interpolation method for all resizing operations. Possible values:
0: Nearest Neighbors Interpolation.
1: Bilinear interpolation.
2: Area-based (resampling using pixel area relation). It may be a preferred method for image decimation, as it gives moire-free results. But when the image is zoomed, it is similar to the Nearest Neighbors method. (Used by default.)
3: Bicubic interpolation over 4x4 pixel neighborhood.
4: Lanczos interpolation over 8x8 pixel neighborhood.
9: Cubic for enlarge, area for shrink, bilinear for others.
10: Random select from the interpolation methods mentioned above.
Note: When shrinking an image, it will generally look best with AREA-based interpolation, whereas, when enlarging an image, it will generally look best with Bicubic (slow) or Bilinear (faster but still looks OK).
min_object_covered (float) – The cropped area of the image must contain at least this fraction of any bounding box supplied. The value of this parameter should be non-negative. In the case of 0, the cropped area does not need to overlap any of the bounding boxes supplied.
min_eject_coverage (float) – The minimum coverage of cropped sample w.r.t its original size. With this constraint, objects that have marginal area after crop will be discarded.
aspect_ratio_range (tuple of floats) – The cropped area of the image must have an aspect ratio = width / height within this range.
area_range (tuple of floats) – The cropped area of the image must contain a fraction of the supplied image within this range.
max_attempts (int) – Number of attempts at generating a cropped/padded region of the image of the specified constraints. After max_attempts failures, return the original image.
pad_val (float) – Pixel value to be filled when padding is enabled. pad_val will automatically be subtracted by mean and divided by std if applicable.
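The cropping constraints above (min_object_covered, aspect_ratio_range, area_range, max_attempts) can be sketched in plain Python. This is not MXNet's code. It assumes boxes and crops are given as (xmin, ymin, xmax, ymax) in coordinates normalized to [0, 1], and it reads "any bounding box" as "at least one bounding box". The function names and the default values in `sample_valid_crop` are hypothetical.

```python
import random


def coverage(crop, box):
    """Fraction of the box area that lies inside the crop."""
    inter_w = max(0.0, min(crop[2], box[2]) - max(crop[0], box[0]))
    inter_h = max(0.0, min(crop[3], box[3]) - max(crop[1], box[1]))
    box_area = (box[2] - box[0]) * (box[3] - box[1])
    if box_area <= 0:
        return 0.0
    return inter_w * inter_h / box_area


def crop_is_valid(crop, boxes, min_object_covered,
                  aspect_ratio_range, area_range):
    """Check one candidate crop against the documented constraints."""
    width, height = crop[2] - crop[0], crop[3] - crop[1]
    if width <= 0 or height <= 0:
        return False
    if not aspect_ratio_range[0] <= width / height <= aspect_ratio_range[1]:
        return False
    # In normalized coordinates the crop area is its fraction of the image.
    if not area_range[0] <= width * height <= area_range[1]:
        return False
    if min_object_covered == 0:
        return True  # no overlap with any box is required
    return any(coverage(crop, b) >= min_object_covered for b in boxes)


def sample_valid_crop(boxes, min_object_covered=0.1,
                      aspect_ratio_range=(0.75, 1.33),
                      area_range=(0.05, 1.0), max_attempts=50, rng=random):
    """Try up to max_attempts random crops, else keep the whole image."""
    for _ in range(max_attempts):
        x0, x1 = sorted(rng.random() for _ in range(2))
        y0, y1 = sorted(rng.random() for _ in range(2))
        crop = (x0, y0, x1, y1)
        if crop_is_valid(crop, boxes, min_object_covered,
                         aspect_ratio_range, area_range):
            return crop
    return (0.0, 0.0, 1.0, 1.0)  # fall back to the original image


box = (0.25, 0.25, 0.75, 0.75)
print(coverage((0.0, 0.0, 0.5, 0.5), box))  # 0.25
print(sample_valid_crop([box], rng=random.Random(0)))
```

The fallback in `sample_valid_crop` mirrors the documented behaviour of max_attempts: after that many failures, the original image is returned.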