WO2022111231A1 - CNN training method, electronic device and computer-readable storage medium - Google Patents

CNN training method, electronic device and computer-readable storage medium

Info

Publication number
WO2022111231A1
Authority
WO
WIPO (PCT)
Prior art keywords
training
cnn
verification
image input
image
Prior art date
Application number
PCT/CN2021/127979
Other languages
English (en)
French (fr)
Inventor
栗伟清
屠要峰
王永成
高洪
刘涛
金士英
Original Assignee
中兴通讯股份有限公司 (ZTE Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Publication of WO2022111231A1 publication Critical patent/WO2022111231A1/zh

Classifications

    • G PHYSICS · G06 COMPUTING; CALCULATING OR COUNTING · G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS · G06N 3/00 Computing arrangements based on biological models · G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods

Definitions

  • the embodiments of the present application relate to the technical field of image processing, and in particular, to a CNN training method, an electronic device, and a computer-readable storage medium.
  • CNN (Convolutional Neural Networks) is a class of feedforward neural networks that contain convolution computations and have a deep structure, and is one of the representative algorithms of deep learning.
  • The traditional CNN architecture generally stacks multiple convolution modules, comprising convolution layers and pooling layers.
  • The convolution layer is responsible for extracting features from images.
  • The pooling layer is responsible for downsampling, that is, reducing the feature dimension, expanding the receptive field, and retaining salient features.
  • These modules are generally followed by two or more fully connected layers, and finally a Softmax layer.
  • The number of nodes output by the Softmax layer equals the number of categories, each node corresponding to one category.
  • This traditional architecture has two problems: on the one hand, the CNN has a huge number of parameters and is prone to overfitting; on the other hand, the architecture requires a fixed-size image input.
  • An embodiment of the present application provides a CNN training method, the method including: determining each training stage in the training process of a convolutional neural network (CNN) and the sequence between the training stages; determining, according to the sequence, the image input size of each training stage, wherein the image input size of each training stage increases from small to large according to the sequence; and training the CNN according to the images corresponding to the image input size of each training stage.
  • An embodiment of the present application further provides an electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to execute the above CNN training method.
  • An embodiment of the present application further provides a readable storage medium storing a computer program which, when executed by a processor, implements the above CNN training method.
  • FIG. 1 is a flowchart of a CNN training method according to a first embodiment of the present application
  • FIG. 2 is a flowchart of determining the image input size of each training stage according to the sequence according to the first embodiment of the present application;
  • FIG. 3 is a flowchart of a CNN training method according to a second embodiment of the present application.
  • Fig. 4 is according to the second embodiment of the present application, according to the image corresponding to the image input size of each training stage and the number of training cycles, the flow chart of training CNN;
  • FIG. 5 is a schematic diagram of the training speed of the CNN training method according to the second embodiment of the present application.
  • FIG. 6 is a flowchart of a CNN training method according to a third embodiment of the present application.
  • FIG. 7 is a flow chart of acquiring several verification sets according to the third embodiment of the present application.
  • FIG. 8 is a flowchart of normalizing the size of each verification image in the same verification set according to the third embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present application.
  • The main purpose of the embodiments of this application is to propose a CNN training method, an electronic device, and a computer-readable storage medium, which aim to divide the CNN training process into several training stages and learn the features of images from small sizes to large sizes according to the sequence between the training stages.
  • This can improve the training speed of the CNN and, at the same time, its training accuracy.
  • Step 101 determining each training stage in the training process of the convolutional neural network CNN and the sequence between the training stages;
  • When the server trains the convolutional neural network CNN, it may first determine the training stages and the sequence between the training stages in the CNN training process.
  • the server may divide the training process of the CNN to be trained into several training stages, and determine the sequence of the several training stages by means of numbering or the like.
  • the number of divided training stages may be set by those skilled in the art according to actual needs, which is not specifically limited in the embodiments of the present application.
  • the server divides the training process of the CNN to be trained into four training stages, namely: the first training stage, the second training stage, the third training stage and the fourth training stage.
  • Step 102 determine the image input size of each training stage
  • Specifically, the server may determine the image input size of each training stage according to the sequence, wherein the image input size of each training stage increases from small to large in that order.
  • the image input size of each training stage is determined according to the sequence, which can be realized by each sub-step as shown in FIG. 2, as follows:
  • Sub-step 1021 determine the original image input size of the CNN
  • the server may determine the original image input size of the CNN after determining the training stages and the sequence between the training stages in the training process of the convolutional neural network CNN.
  • the server may determine the original image input size of the CNN according to the data of the CNN provider (such as the CNN production company, etc.).
  • The CNN trained in the embodiments of the present application is a CNN that includes a global average pooling (Global Average Pooling) layer.
  • The global average pooling layer is a pooling layer proposed to replace some of the fully connected layers.
  • The global average pooling layer operates directly on the feature channels. For example, if the output of the last convolutional layer has 2048 channels, the global average pooling layer sums and averages the data of the entire plane on each channel, finally obtaining a 2048-dimensional vector, after which one more fully connected layer is added.
  • The global average pooling layer greatly reduces the number of parameters of the CNN and reduces its risk of overfitting, while at the same time enabling images of any size to be input to the CNN. Since a CNN containing a global average pooling layer allows images of different sizes to be input, it also allows rectangular images to be input. Considering that the target features of some images are not in the central area of the image, or that the shape of the recognition target is rectangular (such as swords, mops, etc.), training with rectangular images can prevent the loss of important features and effectively improve the training effect.
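The size-independence described above — each channel's plane collapses to a single average, so the output length depends only on the channel count — can be sketched in a few lines. This is an illustrative NumPy sketch, not the patent's implementation:

```python
import numpy as np

def global_average_pool(feature_maps):
    """Average each channel's entire H x W plane down to one number.

    feature_maps: array of shape (channels, height, width), e.g. the
    2048-channel output of a final convolutional layer.  The result is
    a vector of length `channels`, regardless of the spatial size --
    which is why a CNN ending in this layer accepts inputs of any size,
    including rectangular ones.
    """
    return feature_maps.mean(axis=(1, 2))

# Feature maps of different (even rectangular) spatial sizes
# yield vectors of the same length.
v_small = global_average_pool(np.ones((2048, 3, 3)))
v_wide = global_average_pool(np.ones((2048, 9, 7)))
assert v_small.shape == v_wide.shape == (2048,)
```

Because the pooled vector has a fixed length, the single fully connected layer that follows it needs no changes when the input image size changes between training stages.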
  • the CNN being trained is ResNet-50
  • the server determines that the original image input size of ResNet-50 is: 224px × 224px.
  • the CNN being trained is AlexNet
  • the server determines that the original image input size of AlexNet is: 256px × 256px.
  • Sub-step 1022 according to the sequence and the original image input size of CNN, determine the image input size of each training stage;
  • the server may determine the image input size of each training stage according to the sequence and the original image input size of the CNN.
  • the image input size of the first training stage is smaller than the original image input size
  • the image input size of the last training stage is larger than the original image input size.
  • In the early stages, the model has learned few features, so learning from small images can meet its requirements.
  • As training proceeds, the features learned by the model gradually increase, and the size of the input image needs to be increased to meet the learning requirements of the CNN.
  • Gradually increasing the image input size allows the CNN to learn from each image at different resolutions, which is beneficial for improving the recognition accuracy of the CNN; and because the images input in the earlier stages are small, the training speed of the CNN is effectively improved.
  • the CNN being trained is ResNet-50
  • the original image input size of ResNet-50 is 224px.
  • The server divides the training process of ResNet-50 into four training stages. According to the sequence and the original image input size of ResNet-50, the image input size of each training stage is determined as: 96px for the first training stage; 128px for the second training stage; 224px for the third training stage; and 288px for the fourth training stage.
  • the CNN being trained is AlexNet
  • the original image input size of AlexNet is 227px.
  • The server divides the training process of AlexNet into three training stages. According to the sequence and the original image input size of AlexNet, the image input size of each training stage is determined as: 128px for the first training stage; 227px for the second training stage; and 320px for the third training stage.
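Both schedules above share the same invariant: the sizes strictly increase, the first is below the network's original input size, and the last is above it. A minimal sketch of that check (the helper name is illustrative, not from the patent):

```python
def valid_stage_sizes(sizes, original_size):
    """Check the staged image-input-size schedule described above:
    sizes must grow from stage to stage, start below the network's
    original input size, and end above it."""
    increasing = all(a < b for a, b in zip(sizes, sizes[1:]))
    return increasing and sizes[0] < original_size < sizes[-1]

# The ResNet-50 and AlexNet examples from the text both satisfy it.
assert valid_stage_sizes([96, 128, 224, 288], original_size=224)
assert valid_stage_sizes([128, 227, 320], original_size=227)
```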
  • Step 103 Train the CNN according to the images corresponding to the image input sizes of each training stage.
  • the server may train the CNN according to the image corresponding to the image input size of each training stage.
  • the server can obtain images corresponding to the image input size of each training stage by performing data enhancement on the original image, and then train the CNN according to the images corresponding to the image input size of each training stage.
  • Data enhancement operations include, but are not limited to: image flipping, size cropping, affine transformation, super-resolution transformation, image blurring, sharpening, brightness adjustment, contrast fine-tuning, embossing, edge detection, adding Gaussian noise to sample images, color channel offset, etc.
  • For example, the size of the original training images is 224px.
  • The server obtains training images with sizes of 96px, 128px, and 288px by cropping and scaling, and then inputs the 96px training images in the first training stage, the 128px training images in the second training stage, the 224px training images in the third training stage, and the 288px training images in the fourth training stage.
  • In the embodiments of the present application, each training stage in the training process of the convolutional neural network CNN and the sequence between the training stages are determined; the image input size of each training stage is determined according to the sequence; and the image input size increases from small to large in that order, which makes the training process more scientific and reasonable.
  • The input images in each training stage have different sizes, which can greatly improve the CNN's ability to extract image features.
  • The CNN is trained according to the images corresponding to the image input size of each training stage.
  • In the early stages the model has learned few features, and learning from small images can meet the learning requirements.
  • As training proceeds, the features learned by the model gradually increase, and the size of the input image needs to be increased to meet the learning requirements of the CNN. That is, the training process of the CNN is divided into several training stages, and the features of the images are learned from small sizes to large sizes according to the sequence between the training stages.
  • In this way, the training speed of the CNN is improved, and its training accuracy is improved at the same time.
  • FIG. 3 is a flowchart of the CNN training method described in the second embodiment of the present application, including:
  • Step 201 determining each training stage in the training process of the convolutional neural network CNN and the sequence between the training stages;
  • the step 201 has been described in the first embodiment, and will not be repeated here.
  • Step 202 determining the number of training cycles in each training stage
  • the server may determine the number of training epochs (Epochs) of each training stage after determining the training stages and the sequence between the training stages in the training process of the convolutional neural network CNN.
  • the server may obtain the number of training cycles input by those skilled in the art.
  • the number of training cycles input by those skilled in the art is summed up based on a large amount of actual training experience, which can make the setting of the number of training cycles in each training stage more scientific and reasonable.
  • the server determines that the number of training epochs for the last training phase is 1 or 2.
  • the CNN that has entered the last training stage has learned most of the features of the image. Setting the number of training cycles in the last training stage to 1 or 2 can prevent the CNN from learning the noise contained in the training image, thereby improving the final performance of the CNN. recognition accuracy.
  • For example, the server divides the training process of the CNN into four training stages and determines the number of training cycles in each training stage as: 22 for the first training stage; 26 for the second training stage; 10 for the third training stage; and 2 for the fourth training stage. That is, the number of training cycles in the entire training process is 60.
  • Step 203 determine the image input size of each training stage
  • step 203 has been described in the first embodiment, and will not be repeated here.
  • Step 204 according to the image corresponding to the image input size of each training stage and the number of training cycles, train the CNN;
  • the server may train the CNN according to the images corresponding to the image input size of each training stage and the number of training cycles.
  • The server inputs all the images corresponding to the image input size of a given training stage into the CNN to complete one training cycle. After completing a training cycle, the server updates the parameters of each layer of the CNN according to the training results and then trains for the next training cycle. Training the CNN according to the images corresponding to the image input size of each training stage and the number of training cycles therefore amounts to training the CNN iteratively.
  • the CNN is trained according to the images corresponding to the image input size of each training stage and the number of training cycles, which can be implemented by each sub-step as shown in Figure 4, as follows:
  • Sub-step 2041 according to the label value of the image corresponding to the image input size of each training stage and the output value of the CNN, determine the cost value after training in each training period;
  • the server may determine the cost value after training in each training cycle according to the label value of the image corresponding to the image input size of each training stage and the output value of the CNN.
  • Since the label value of the image corresponding to the image input size of each training stage is manually annotated, the label value is real and accurate.
  • the output value of CNN is calculated according to the input image and the parameters of each layer of CNN, which can truly reflect the recognition effect of CNN.
  • the cost value is calculated from the tag value, the output value and the cost function, and the cost function can be selected by those skilled in the art according to actual needs, which is not specifically limited in the embodiments of the present application.
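The patent leaves the cost function to the practitioner; as one common choice, an average cross-entropy cost over a labelled batch could look like the following sketch (illustrative only — the function name and the specific cost are assumptions, not the patent's):

```python
import math

def cross_entropy_cost(label_indices, output_probs):
    """Average cross-entropy between the manually annotated labels and
    the CNN's softmax outputs.

    label_indices[i] is the true class index of image i;
    output_probs[i] is the CNN's probability vector for image i.
    """
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for label, probs in zip(label_indices, output_probs):
        total -= math.log(probs[label] + eps)
    return total / len(label_indices)

# A confident correct prediction gives a low cost; a wrong one, a high cost.
low = cross_entropy_cost([0], [[0.9, 0.1]])
high = cross_entropy_cost([0], [[0.1, 0.9]])
assert low < high
```

Any other cost built from the label values and output values would slot into sub-step 2041 the same way.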
  • Sub-step 2042 determine whether the cost value has decreased over a preset number of consecutive training cycles; if it is still decreasing, return to sub-step 2042 and continue training; otherwise, execute sub-step 2043;
  • the server may determine whether the cost value of the continuous preset number of training cycles decreases.
  • the preset number of training cycles can be set by those skilled in the art according to actual needs.
  • For example, the preset number of training cycles is 5, and the cost values after training in the last 5 training cycles are: 1.21, 1.23, 1.22, 1.21, and 1.21.
  • The server determines that the cost value has not decreased over 5 consecutive training cycles.
  • As another example, the preset number of training cycles is 4, and the cost values after training in the last 4 training cycles are: 0.97, 0.98, 0.91, and 0.9.
  • The server determines that the cost value is still decreasing over these 4 consecutive training cycles.
  • Sub-step 2043 enter the next training stage.
  • If the server determines that the cost value has not decreased over the preset number of consecutive training cycles, it directly enters the next training stage; if the current stage is the last training stage, the training process of the CNN is ended.
  • This can effectively improve the training speed of the CNN and avoid repeated, ineffective expenditure of training resources. It has been verified that the training speed of the CNN training method provided by the embodiments of the present application, using graphics processing units (Graphics Processing Unit, GPU), is as shown in FIG. 5, where "1GPU" means that training is performed by 1 GPU and "8GPU" means that 8 GPUs train the CNN together.
  • 96 means the input image size is 96px
  • 128 means the input image size is 128px
  • 224 means the input image size is 224px
  • 288 means the input image size is 288px
  • the ordinate represents the training speed.
  • the current training stage is the second training stage
  • the preset number of training cycles is 5
  • the server determines that the cost value after training for 5 consecutive training cycles has not decreased, ends the second training stage, and directly enters the third training stage.
  • If the current training stage is the fourth training stage, which is also the last training stage, and the preset number of training cycles is 4, the server determines that the cost value has not decreased after 4 consecutive training cycles and directly ends the training process of the CNN.
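The plateau rule of sub-steps 2042–2043 can be sketched as follows. Matching the two worked examples above, "not decreased" is read as "no cost in the window fell below the first cost in the window" — an interpretation, since the patent does not pin down the exact comparison:

```python
def cost_plateaued(cost_history, preset_cycles):
    """Return True when the cost has not decreased over the last
    `preset_cycles` consecutive training cycles, signalling that the
    server should end the current stage and advance to the next one."""
    if len(cost_history) < preset_cycles:
        return False  # not enough cycles observed yet
    recent = cost_history[-preset_cycles:]
    # "not decreased": nothing in the window dropped below its start.
    return min(recent) >= recent[0]

# The two examples from the text:
assert cost_plateaued([1.21, 1.23, 1.22, 1.21, 1.21], 5)       # stalled
assert not cost_plateaued([0.97, 0.98, 0.91, 0.90], 4)         # still falling
```

A real implementation might also compare against a small tolerance so that negligible fluctuations still count as a plateau.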
  • In the second embodiment of the present application, after determining the training stages and the sequence between the training stages in the training process of the convolutional neural network CNN, the method further includes determining the number of training cycles in each training stage; and training the CNN according to the images corresponding to the image input size of each stage includes training the CNN according to those images and the number of training cycles, which makes the CNN training process more scientific and reasonable and further improves the final recognition accuracy of the CNN.
  • Training the CNN according to the images corresponding to the image input size of each training stage and the number of training cycles includes: determining the cost value after training in each training cycle according to the label values of the images corresponding to the image input size of each training stage and the output values of the CNN; and, if the cost value does not decrease over a preset number of consecutive training cycles, entering the next training stage. This can effectively improve the training speed of the CNN and avoid repeated, ineffective expenditure of training resources.
  • FIG. 6 is a flowchart of the CNN training method described in the third embodiment of the present application, including:
  • Step 301 determining each training stage and the sequence between each training stage in the training process of the convolutional neural network CNN;
  • Step 302 determine the image input size of each training stage
  • Step 303 according to the image corresponding to the image input size of each training stage, train the CNN
  • steps 301 to 303 have been described in the first embodiment, and will not be repeated here.
  • Step 304 obtaining several verification sets
  • After the server finishes the training process of the CNN, it can obtain several verification sets and enter the verification process.
  • the verification set contains several verification images used to verify the recognition effect of CNN.
  • obtaining several verification sets can be achieved by the sub-steps shown in Figure 7, as follows:
  • Sub-step 3041 obtain several verification images, and determine the aspect ratio of several verification images
  • the server may acquire several verification images for verifying the recognition effect of the CNN, and determine the aspect ratio of the several verification images.
  • When the server acquires the training images for training, it may randomly select some of them as verification images for verifying the recognition effect of the CNN, and calculate the aspect ratio of each verification image.
  • For example, the size of a verification image is 244px × 960px, so its aspect ratio is 244/960 ≈ 0.254.
  • Sub-step 3042 sort several verification images, and determine the sorting order of several verification images
  • The server may sort the verification images according to their aspect ratios and determine the sorting order of the verification images, where the sorting order may be from large to small or from small to large, which is not specifically limited in the embodiments of the present application.
  • Sub-step 3043 according to the sorting order, obtain several verification sets
  • the server can obtain several verification sets according to the sorting order. Sorting all verification images can ensure that the aspect ratios of the same batch of images are not much different, so as to maximize the retention of important feature information.
  • The server may obtain a preset batch size and obtain several verification sets in sorted order according to the preset batch size and the sorted verification images.
  • the batch size is the number of verification images required for one verification.
  • For example, the server sorts the 50,000 verification images in ascending order of aspect ratio.
  • With a preset batch size of 500, the server slices the sorted verification images, in order, into 100 verification sets of 500 images each.
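Sub-steps 3041–3043 amount to a sort followed by fixed-size slicing; a minimal sketch, in which list indices stand in for the verification images themselves:

```python
def make_verification_sets(aspect_ratios, batch_size):
    """Sort verification images by aspect ratio (ascending here) and
    slice them, in order, into verification sets of `batch_size`
    images, so images in the same set have similar aspect ratios."""
    order = sorted(range(len(aspect_ratios)), key=aspect_ratios.__getitem__)
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

# 50,000 images with a batch size of 500 yield 100 verification sets.
ratios = [0.2 + 0.00001 * i for i in range(50000)]
sets = make_verification_sets(ratios, 500)
assert len(sets) == 100
assert all(len(s) == 500 for s in sets)
```

Because the slices come from a sorted order, each batch's aspect ratios differ as little as possible, which is what lets the later per-set normalization retain the images' important features.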
  • Step 305 normalize the size of each verification image in the same verification set
  • the server may normalize the size of each verification image in the same verification set. Considering that the verification process is to evaluate the accuracy of the trained model, normalizing the size of the images in the verification set and then verifying the CNN can significantly improve the recognition effect of the CNN.
  • the verification image can be a rectangle
  • Rectangular images can be used for verification and can be normalized using different aspect ratios in different verification sets, so as to retain important information to the maximum extent, thereby improving recognition accuracy.
  • normalizing the size of each verification image in the same verification set can be achieved by each sub-step as shown in FIG. 8 , as follows:
  • Sub-step 3051 according to the aspect ratio of each verification image in the same verification set, determine the average aspect ratio of each verification image in the same verification set;
  • When the server normalizes the size of each verification image in the same verification set, it may first determine the average aspect ratio of the verification images in the set according to the aspect ratio of each verification image in the set.
  • Sub-step 3052 normalize the size of each verification image in the same verification set according to the average aspect ratio
  • the server may normalize the size of each verification image in the same verification set according to the average aspect ratio. Normalizing the size of each validation image in the same validation set according to the average aspect ratio can make the normalization process more scientific and reasonable.
  • the server can normalize the aspect ratio of each verification image to the average aspect ratio of each verification image in the verification set by means of clipping, scaling, etc. for each verification image in the same verification set.
  • For example, a verification set contains 10 verification images, and the server determines that the average aspect ratio of the verification images in this set is 0.258. The server then normalizes the aspect ratio of each verification image in the set to 0.258.
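Sub-steps 3051–3052 can be sketched as follows; only the target-ratio computation is concrete, while the actual cropping/scaling of pixel data is left abstract (the function names are illustrative, not from the patent):

```python
def average_aspect_ratio(aspect_ratios):
    """Sub-step 3051: the normalization target for a verification set
    is the mean aspect ratio of the images it contains."""
    return sum(aspect_ratios) / len(aspect_ratios)

def normalize_set(aspect_ratios):
    """Sub-step 3052 (sketch): every image in the set ends up with the
    set's average aspect ratio; a real implementation would crop or
    scale the pixel data to reach that ratio."""
    target = average_aspect_ratio(aspect_ratios)
    return [target] * len(aspect_ratios)

# As in the example: a set whose mean ratio is 0.258 is normalized to 0.258.
ratios = [0.254, 0.262]
assert abs(average_aspect_ratio(ratios) - 0.258) < 1e-9
assert all(abs(r - 0.258) < 1e-9 for r in normalize_set(ratios))
```

Normalizing each set to its own average, rather than to one global shape, is what keeps the distortion of any individual image small.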
  • Step 306 verify the recognition effect of the CNN according to each verification image whose size is normalized.
  • the server can verify the recognition effect of the CNN according to each verification image whose size is normalized.
  • the server can determine the cost value after each verification according to the label value of each verification image, the output value of the CNN and the preset cost function, and judge the recognition effect of the CNN according to the cost value.
  • In the third embodiment, the method further includes: acquiring several verification sets, wherein each verification set contains several verification images used to verify the recognition effect of the CNN; normalizing the size of each verification image in the same verification set; and verifying the recognition effect of the CNN according to the size-normalized verification images, which can significantly improve the recognition effect of the CNN.
  • Acquiring several verification sets includes: acquiring several verification images and determining the aspect ratio of several verification images; sorting several verification images according to the aspect ratio, and determining the sorting order of several verification images; and acquiring several verification sets according to the sorting order. It can ensure that the aspect ratios of the same batch of images are not much different, so as to maximize the retention of important feature information.
  • Normalizing the size of each verification image in the same verification set includes: determining the average aspect ratio of each verification image in the same verification set according to the aspect ratio of each verification image in the same verification set; according to the average aspect ratio , normalizing the size of each verification image in the same verification set can make the normalization process more scientific and reasonable.
  • The fourth embodiment of the present application relates to an electronic device, as shown in FIG. 9, comprising: at least one processor 401; and a memory 402 communicatively connected to the at least one processor 401; wherein the memory 402 stores instructions executable by the at least one processor 401, the instructions being executed by the at least one processor 401 so that the at least one processor 401 can execute the CNN training methods in the foregoing embodiments.
  • the memory and the processor are connected by a bus, and the bus may include any number of interconnected buses and bridges, and the bus connects one or more processors and various circuits of the memory.
  • the bus may also connect together various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore will not be described further herein.
  • the bus interface provides the interface between the bus and the transceiver.
  • a transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, providing a means for communicating with various other devices over a transmission medium.
  • the data processed by the processor is transmitted on the wireless medium through the antenna, and further, the antenna also receives the data and transmits the data to the processor.
  • The processor is responsible for managing the bus and general processing, and can also provide various functions, including timing, peripheral interfaces, voltage regulation, power management, and other control functions, while the memory may be used to store data used by the processor in performing operations.
  • the fifth embodiment of the present application relates to a computer-readable storage medium storing a computer program.
  • the above method embodiments are implemented when the computer program is executed by the processor.
  • The storage medium includes several instructions to cause a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.


Abstract

Embodiments of the present application relate to the technical field of image processing, and in particular to a CNN training method, an electronic device, and a computer-readable storage medium. The CNN training method includes: determining each training stage in the training process of a convolutional neural network (CNN) and the sequence between the training stages; determining, according to the sequence, the image input size of each training stage, wherein the image input size of each training stage increases from small to large according to the sequence; and training the CNN according to the images corresponding to the image input size of each training stage.

Description

CNN training method, electronic device, and computer-readable storage medium
Cross-Reference to Related Applications
This application is based on, and claims priority to, Chinese patent application No. 202011349094.5, filed on November 26, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present application relate to the technical field of image processing, and in particular to a CNN training method, an electronic device, and a computer-readable storage medium.
Background
Convolutional Neural Networks (CNN) are a class of feedforward neural networks that contain convolution computations and have a deep structure, and are one of the representative algorithms of deep learning. In the field of image processing, the traditional CNN architecture generally stacks multiple convolution modules, comprising convolution layers and pooling layers: the convolution layer is responsible for extracting features from the image, and the pooling layer is responsible for downsampling, that is, reducing the feature dimension, expanding the receptive field, and retaining salient features. These are generally followed by two or more fully connected layers, and finally a Softmax layer, whose number of output nodes equals the number of categories, each node corresponding to one category. However, the traditional architecture has two problems: on the one hand, the CNN has a huge number of parameters and is prone to overfitting; on the other hand, this architecture requires a fixed-size image input.
To address these problems of the traditional CNN architecture, researchers have proposed the global average pooling (Global Average Pooling) layer to replace some of the fully connected layers. However, when training a CNN that contains a global average pooling layer, the training speed is low and the training accuracy is poor.
Summary
An embodiment of the present application provides a CNN training method, the method comprising: determining the training stages in the training process of a convolutional neural network (CNN) and the sequential order of the training stages; determining, according to the sequential order, the image input size of each training stage, wherein the image input sizes of the training stages increase from small to large in the sequential order; and training the CNN according to images corresponding to the image input sizes of the training stages.
An embodiment of the present application further provides an electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the above CNN training method.
An embodiment of the present application further provides a readable storage medium storing a computer program which, when executed by a processor, implements the above CNN training method.
Brief description of the drawings
Fig. 1 is a flowchart of the CNN training method according to the first embodiment of the present application;
Fig. 2 is a flowchart of determining the image input size of each training stage according to the sequential order in the first embodiment of the present application;
Fig. 3 is a flowchart of the CNN training method according to the second embodiment of the present application;
Fig. 4 is a flowchart of training the CNN according to the images corresponding to the image input sizes of the training stages and the numbers of training epochs in the second embodiment of the present application;
Fig. 5 is a schematic diagram of the training speed of the CNN training method according to the second embodiment of the present application;
Fig. 6 is a flowchart of the CNN training method according to the third embodiment of the present application;
Fig. 7 is a flowchart of obtaining several validation sets in the third embodiment of the present application;
Fig. 8 is a flowchart of normalizing the sizes of the validation images within the same validation set in the third embodiment of the present application;
Fig. 9 is a schematic structural diagram of the electronic device according to the fourth embodiment of the present application.
Detailed description of the embodiments
The main purpose of the embodiments of the present application is to provide a CNN training method, an electronic device, and a computer-readable storage medium that divide the training process of a CNN into several training stages and learn image features from small sizes to large sizes in the sequential order of the stages, thereby increasing the training speed of the CNN while improving its training accuracy.
To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate, however, that many technical details are given in the embodiments merely to help the reader better understand the present application; the claimed technical solutions can be implemented even without these details and with various changes and modifications based on the following embodiments. The division into the following embodiments is for convenience of description only and does not limit the specific implementation of the present application; the embodiments may be combined with and refer to one another provided they do not contradict each other.
The specific flow of the CNN training method of this embodiment may be as shown in Fig. 1 and includes:
Step 101: determining the training stages in the training process of the convolutional neural network (CNN) and the sequential order of the training stages.
Specifically, when training the convolutional neural network CNN, the server may first determine the training stages of the CNN training process and the sequential order of the training stages.
In a specific implementation, the server may divide the training process of the CNN to be trained into several training stages and determine their sequential order, for example by numbering them. The number of training stages may be set by those skilled in the art according to actual needs; the embodiments of the present application do not specifically limit it.
In one example, the server divides the training process of the CNN to be trained into four training stages: a first training stage, a second training stage, a third training stage, and a fourth training stage.
Step 102: determining the image input size of each training stage according to the sequential order.
Specifically, after determining the training stages of the CNN training process and their sequential order, the server may determine the image input size of each training stage according to the sequential order, wherein the image input sizes of the training stages increase from small to large in the sequential order.
In one example, determining the image input sizes of the training stages according to the sequential order may be implemented by the sub-steps shown in Fig. 2, as follows:
Sub-step 1021: determining the original image input size of the CNN.
Specifically, after determining the training stages of the CNN training process and their sequential order, the server may determine the original image input size of the CNN.
In a specific implementation, the server may determine the original image input size of the CNN from data supplied by the CNN's provider (such as the company that built the CNN). The CNN trained in the embodiments of the present application contains a global average pooling layer, a pooling layer proposed to replace part of the fully connected layers. Global average pooling operates directly on the feature channels: if, for example, the last convolutional layer outputs 2048 channels, the global average pooling layer sums and averages the values over the entire spatial plane of each channel, yielding a 2048-dimensional vector, after which a single fully connected layer is added. The global average pooling layer greatly reduces the number of parameters of the CNN and lowers the risk of overfitting, while allowing images of arbitrary size to be input into the CNN. Since a CNN containing a global average pooling layer accepts images of different sizes, it also accepts rectangular images. Considering that the target features of some images do not lie in the central region of the image, or that the target to be recognized is rectangular in shape (such as a sword or a mop), training with rectangular images prevents the loss of important features and effectively improves the training result.
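The channel-wise averaging described above can be illustrated with a minimal NumPy sketch (the function name is ours, not the patent's; the 2048-channel example follows the text):

```python
import numpy as np

def global_average_pooling(feature_maps):
    # Average each channel over its entire spatial plane:
    # (channels, height, width) -> (channels,)
    return feature_maps.mean(axis=(1, 2))

# The 2048-channel output of the last convolutional layer collapses to a
# 2048-dimensional vector regardless of the spatial size, which is why a
# CNN with this layer accepts images of arbitrary, even rectangular, size.
small = global_average_pooling(np.random.rand(2048, 7, 7))
large = global_average_pooling(np.random.rand(2048, 9, 12))
print(small.shape, large.shape)  # (2048,) (2048,)
```

Because the pooled vector length depends only on the channel count, the single fully connected layer that follows it never sees the input resolution.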
In one example, the CNN to be trained is ResNet-50, and the server determines that the original image input size of ResNet-50 is 224px×224px.
In another example, the CNN to be trained is AlexNet, and the server determines that the original image input size of AlexNet is 256px×256px.
Sub-step 1022: determining the image input size of each training stage according to the sequential order and the original image input size of the CNN.
Specifically, after determining the original image input size of the CNN, the server may determine the image input size of each training stage according to the sequential order and the original image input size, wherein the image input size of the first training stage is smaller than the original image input size and the image input size of the last training stage is larger than the original image input size. In the first few training stages the model learns few features, so learning from small images is sufficient; in the later stages the model learns progressively more features, and the input image size needs to be increased to meet the learning requirements of the CNN. Gradually increasing the image input size lets the CNN learn each image at several different resolutions, which helps improve the recognition accuracy of the CNN; and because the image input sizes of the first few stages are small, the training speed of the CNN is effectively increased.
In one example, the CNN to be trained is ResNet-50, whose original image input size is 224px. The server divides the training process of ResNet-50 into four training stages and, according to the sequential order and the original image input size, determines the image input sizes as: 96px for the first training stage, 128px for the second, 224px for the third, and 288px for the fourth.
In another example, the CNN to be trained is AlexNet, whose original image input size is 227px. The server divides the training process of AlexNet into three training stages and, according to the sequential order and the original image input size, determines the image input sizes as: 128px for the first training stage, 227px for the second, and 320px for the third.
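The staged sizes in these examples are chosen by hand. One hypothetical way to derive such a schedule automatically is sketched below; the scale range (0.45 to 1.3) and the multiple-of-32 rounding are our assumptions, not from the patent:

```python
def stage_input_sizes(original_size, num_stages):
    # Spread scale factors from below 1.0 to above 1.0, so the first stage
    # is smaller than the original input size and the last stage is larger.
    scales = [0.45 + i * (1.3 - 0.45) / (num_stages - 1) for i in range(num_stages)]
    # Round each size to a multiple of 32, a common alignment for CNN inputs.
    return [max(32, round(original_size * s / 32) * 32) for s in scales]

print(stage_input_sizes(224, 4))  # [96, 160, 224, 288]
```

With the ResNet-50 example (original size 224px, four stages) this yields sizes close to, though not identical with, the hand-picked 96/128/224/288 schedule.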
Step 103: training the CNN according to images corresponding to the image input sizes of the training stages.
Specifically, after determining the image input sizes of the training stages, the server may train the CNN according to images corresponding to those image input sizes.
In a specific implementation, the server may obtain the images corresponding to the image input sizes of the training stages by applying data augmentation to the original images, and then train the CNN with these images. Data augmentation operations include, but are not limited to: image flipping, size cropping, affine transformation, super-resolution conversion, image blurring, sharpening, brightness adjustment, contrast fine-tuning, embossing, edge detection, additive Gaussian noise, and color-channel shifting of the sample images.
In one example, the size of the training images is 224px. By cropping and scaling, the server obtains training images of sizes 96px, 128px, and 288px, and inputs the 96px training images to the first training stage, the 128px images to the second, the 224px images to the third, and the 288px images to the fourth.
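Producing the per-stage inputs by scaling can be sketched as follows; the nearest-neighbour resize is a stand-in for whatever cropping/resampling the training pipeline actually uses (an assumption for illustration):

```python
import numpy as np

def resize_nearest(image, size):
    # Nearest-neighbour resize of an (H, W, C) image to (size, size, C).
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[rows][:, cols]

original = np.zeros((224, 224, 3), dtype=np.uint8)  # a 224px training image
stage_sizes = [96, 128, 224, 288]  # per-stage input sizes from the example
stage_inputs = [resize_nearest(original, s) for s in stage_sizes]
print([img.shape for img in stage_inputs])
# [(96, 96, 3), (128, 128, 3), (224, 224, 3), (288, 288, 3)]
```

The same source image thus yields one tensor per stage, each matching that stage's input size.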
In the first embodiment of the present application, the training stages of the CNN training process and their sequential order are determined, and the image input size of each training stage is then determined according to that order, increasing from small to large. This makes the training process more scientific and reasonable: using a different input size in each training stage greatly improves the CNN's ability to extract image features. The CNN is then trained with images corresponding to the image input sizes of the training stages. At the start of training the model learns few features, and small images are enough to meet its learning requirements; as training deepens, the model learns progressively more features, and the input image size must be increased accordingly. In other words, the training process of the CNN is divided into several training stages, and image features are learned from small sizes to large sizes in the sequential order of the stages, thereby increasing the training speed of the CNN while improving its training accuracy.
The second embodiment of the present application relates to a CNN training method. The implementation details of this embodiment are described below; they are provided only for ease of understanding and are not required to implement this solution. Fig. 3 is a flowchart of the CNN training method according to the second embodiment, which includes:
Step 201: determining the training stages in the training process of the convolutional neural network (CNN) and the sequential order of the training stages.
Step 201 has been described in the first embodiment and is not repeated here.
Step 202: determining the number of training epochs of each training stage.
Specifically, after determining the training stages of the CNN training process and their sequential order, the server may determine the number of training epochs of each training stage.
In one example, the server may obtain the numbers of training epochs input by those skilled in the art. These values are distilled from extensive practical training experience, so the number of epochs of each training stage can be set more scientifically and reasonably.
In another example, the server sets the number of training epochs of the last training stage to 1 or 2. A CNN that has reached the last training stage has already learned the vast majority of the image features; setting the last stage to 1 or 2 epochs prevents the CNN from learning the noise contained in the training images, thereby improving the final recognition accuracy of the CNN.
For example, the server divides the CNN training process into four training stages and sets the numbers of training epochs as: 22 for the first training stage, 26 for the second, 10 for the third, and 2 for the fourth, i.e. 60 epochs for the whole training process.
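Combining the example input sizes from the first embodiment with these epoch counts, the whole schedule can be recorded as a simple table (a hypothetical configuration structure, not an API of any framework):

```python
# Per-stage schedule: the input size grows stage by stage, and the last
# stage is kept very short (1-2 epochs) to avoid fitting noise.
STAGES = [
    {"input_size": 96,  "epochs": 22},
    {"input_size": 128, "epochs": 26},
    {"input_size": 224, "epochs": 10},
    {"input_size": 288, "epochs": 2},
]

total_epochs = sum(stage["epochs"] for stage in STAGES)
print(total_epochs)  # 60
```

A training loop would walk this list in order, rebuilding the input pipeline at each stage's size and running it for that stage's epoch count.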
Step 203: determining the image input size of each training stage according to the sequential order.
Step 203 has been described in the first embodiment and is not repeated here.
Step 204: training the CNN according to the images corresponding to the image input sizes of the training stages and the numbers of training epochs.
Specifically, after determining the image input sizes of the training stages, the server may train the CNN according to the images corresponding to those image input sizes and the numbers of training epochs.
In a specific implementation, feeding all images corresponding to a training stage's image input size into that stage of the CNN once completes one training epoch. After each epoch, the server updates the parameters of each layer of the CNN according to the training result and starts the next epoch. Training the CNN according to the images corresponding to the image input sizes of the training stages and the numbers of epochs is therefore iterative training.
In one example, training the CNN according to the images corresponding to the image input sizes of the training stages and the numbers of training epochs may be implemented by the sub-steps shown in Fig. 4, as follows:
Sub-step 2041: determining the cost value after each training epoch according to the label values of the images corresponding to the image input sizes of the training stages and the output values of the CNN.
Specifically, the server may determine the cost value after each training epoch according to the label values of the images corresponding to the image input sizes of the training stages and the output values of the CNN.
In a specific implementation, since the label values of these images are annotated manually, they are real and accurate. The output values of the CNN are computed from the input images and the parameters of each layer of the CNN, and truly reflect the recognition performance of the CNN. The cost value is computed from the label values, the output values, and a cost function; the cost function may be chosen by those skilled in the art according to actual needs, and the embodiments of the present application do not specifically limit it.
Sub-step 2042: judging whether the cost value has decreased over a preset number of consecutive training epochs; if it has, return to sub-step 2042 and continue waiting; otherwise, execute sub-step 2043.
Specifically, after determining the cost value of each training epoch, the server may judge whether the cost value has decreased over a preset number of consecutive training epochs. The preset number of epochs may be set by those skilled in the art according to actual needs.
In one example, the preset number of training epochs is 5, and the cost values after the last 5 epochs are 1.21, 1.23, 1.22, 1.21, and 1.21; the server determines that the cost value has not decreased over 5 consecutive training epochs.
In another example, the preset number of training epochs is 4, and the cost values after the last 4 epochs are 0.97, 0.98, 0.91, and 0.9; the server determines that the cost value is still decreasing over 4 consecutive training epochs.
Sub-step 2043: entering the next training stage.
Specifically, if the server judges that the cost value has not decreased over the preset number of consecutive training epochs, it directly enters the next training stage; if the current stage is the last training stage, the training process of the CNN ends. The embodiments of the present application can thus effectively increase the training speed of the CNN and avoid investing training resources repeatedly and ineffectively. It has been verified that the training speeds achieved with graphics processing units (GPUs) by the CNN training method provided in the embodiments of the present application are as shown in Fig. 5, where 1GPU denotes training with one GPU, 8GPU denotes eight GPUs training the CNN together, 96, 128, 244, and 288 denote input image sizes in px, and the vertical axis denotes the training speed.
In one example, the current training stage is the second training stage and the preset number of training epochs is 5. The server determines that the cost value has not decreased over 5 consecutive training epochs, ends the second training stage, and directly enters the third training stage.
In another example, the current training stage is the fourth training stage, which is also the last one, and the preset number of training epochs is 4. The server determines that the cost value has not decreased over 4 consecutive training epochs and directly ends the training process of the CNN.
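One plausible reading of the stop criterion in sub-steps 2042 and 2043, consistent with both numeric examples above, is sketched here; the helper name and the exact comparison rule are our assumptions, since the patent does not fix a formula:

```python
def should_advance(cost_history, window):
    # "No decrease over `window` consecutive epochs": the latest cost is
    # no lower than the cost recorded `window - 1` epochs earlier.
    if len(cost_history) < window:
        return False  # not enough epochs yet to judge
    return cost_history[-1] >= cost_history[-window]

# The two examples from the text:
print(should_advance([1.21, 1.23, 1.22, 1.21, 1.21], window=5))  # True: advance
print(should_advance([0.97, 0.98, 0.91, 0.90], window=4))        # False: still decreasing
```

When `should_advance` returns True, the loop moves to the next stage, or ends training if the current stage is the last one.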
In the second embodiment of the present application, after determining the training stages of the CNN training process and their sequential order, the method further includes determining the number of training epochs of each training stage, and training the CNN according to the images corresponding to the image input sizes of the training stages includes training the CNN according to those images and the numbers of epochs. This makes the CNN training process more scientific and reasonable and further improves the final recognition accuracy of the CNN. Training the CNN according to the images corresponding to the image input sizes of the training stages and the numbers of epochs includes: determining the cost value after each training epoch according to the label values of the images and the output values of the CNN, and entering the next training stage if the cost value has not decreased over a preset number of consecutive epochs, which effectively increases the training speed of the CNN and avoids investing training resources repeatedly and ineffectively.
The third embodiment of the present application relates to a CNN training method. The implementation details of this embodiment are described below; they are provided only for ease of understanding and are not required to implement this solution. Fig. 6 is a flowchart of the CNN training method according to the third embodiment, which includes:
Step 301: determining the training stages in the training process of the convolutional neural network (CNN) and the sequential order of the training stages.
Step 302: determining the image input size of each training stage according to the sequential order.
Step 303: training the CNN according to images corresponding to the image input sizes of the training stages.
Steps 301 to 303 have been described in the first embodiment and are not repeated here.
Step 304: obtaining several validation sets.
Specifically, after finishing the training process of the CNN, the server may obtain several validation sets and enter the verification process. Each validation set contains several validation images for verifying the recognition performance of the CNN.
In one example, obtaining several validation sets may be implemented by the sub-steps shown in Fig. 7, as follows:
Sub-step 3041: obtaining several validation images and determining the aspect ratios of the validation images.
Specifically, the server may obtain several validation images for verifying the recognition performance of the CNN and determine their aspect ratios.
In a specific implementation, when obtaining the training images, the server may randomly select some of them as validation images for verifying the recognition performance of the CNN and compute the aspect ratio of each validation image.
In one example, the size of a validation image is 244px×960px, and the server determines its aspect ratio as 244÷960≈0.254.
Sub-step 3042: sorting the validation images according to the aspect ratios to determine the sorted order of the validation images.
Specifically, after determining the aspect ratios of all validation images, the server may sort the validation images by aspect ratio to determine their sorted order. The order may be descending or ascending; the embodiments of the present application do not specifically limit it.
Sub-step 3043: obtaining several validation sets according to the sorted order.
Specifically, after sorting the validation images by aspect ratio and determining their sorted order, the server may obtain several validation sets according to that order. Sorting all validation images ensures that the aspect ratios within one batch differ little, so important feature information is preserved to the greatest extent.
In a specific implementation, the server may obtain a preset batch size and, according to the preset batch size and the sorted validation images, obtain several validation sets in the sorted order. The batch size is the number of validation images required for one verification.
In one example, there are 50000 validation images in total, and the server sorts them by aspect ratio in ascending order. With a preset batch size of 500, the server takes every 500 validation images as one validation set, obtaining 50000÷500=100 validation sets.
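Sub-steps 3041 to 3043 amount to sorting by aspect ratio and slicing the sorted list into fixed-size batches. A minimal sketch (the function name and the toy sizes are illustrative, not from the patent):

```python
def make_validation_sets(image_sizes, batch_size):
    # Sort (width, height) pairs by aspect ratio, then slice the sorted
    # list into batches so that each set holds images of similar shape.
    ordered = sorted(image_sizes, key=lambda wh: wh[0] / wh[1])
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]

sizes = [(244, 960), (300, 300), (256, 980), (640, 480), (128, 512), (500, 495)]
sets = make_validation_sets(sizes, batch_size=2)
print(len(sets))  # 3
print(sets[0])    # the two narrowest images end up in the same set
```

With 50000 images and a batch size of 500, the same slicing yields the 100 validation sets of the example.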
Step 305: normalizing the sizes of the validation images within the same validation set.
Specifically, after obtaining the several validation sets, the server may normalize the sizes of the validation images within the same validation set. Since the verification process evaluates the accuracy of the trained model, normalizing the sizes of the images within a validation set before verifying the CNN significantly improves the recognition performance of the CNN.
In one example, the validation images may be rectangular. Using rectangular images for verification allows different validation sets to be normalized with different aspect ratios, which preserves important information to the greatest extent and thus improves the recognition accuracy.
In one example, normalizing the sizes of the validation images within the same validation set may be implemented by the sub-steps shown in Fig. 8, as follows:
Sub-step 3051: determining the average aspect ratio of the validation images in the same validation set according to the aspect ratios of those validation images.
Specifically, when normalizing the sizes of the validation images within the same validation set, the server may first determine the average aspect ratio of the validation images in that set according to their aspect ratios.
In one example, a validation set contains 10 validation images whose aspect ratios are 0.254, 0.254, 0.256, 0.257, 0.257, 0.257, 0.258, 0.261, 0.263, and 0.264, and the server determines the average aspect ratio of the validation images in this set as (0.254+0.254+0.256+0.257+0.257+0.257+0.258+0.261+0.263+0.264)÷10=0.258.
Sub-step 3052: normalizing the sizes of the validation images within the same validation set according to the average aspect ratio.
Specifically, after determining the average aspect ratio of the validation images in the same validation set, the server may normalize the sizes of those images according to the average aspect ratio. Normalizing the sizes of the validation images within the same validation set according to the average aspect ratio makes the normalization process more scientific and reasonable.
In a specific implementation, the server may, by cropping, scaling, and similar operations, normalize the aspect ratio of each validation image in the same validation set to the average aspect ratio of that set.
In one example, a validation set contains 10 validation images, and the server determines that the average aspect ratio of the validation images in this set is 0.258; the server may then, by cropping, scaling, and similar operations, normalize the aspect ratios of these 10 validation images to 0.258.
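The averaging in sub-steps 3051 and 3052 can be sketched with the ten ratios from the example; the fixed-height choice in the usage line is an assumption, since the patent leaves the crop/scale details open:

```python
def average_aspect_ratio(ratios):
    # Mean aspect ratio of one validation set; every image in the set is
    # then cropped/scaled to this ratio (the crop/scale itself is omitted).
    return sum(ratios) / len(ratios)

ratios = [0.254, 0.254, 0.256, 0.257, 0.257,
          0.257, 0.258, 0.261, 0.263, 0.264]
target = average_aspect_ratio(ratios)
print(round(target, 3))  # 0.258

# E.g. with a hypothetical fixed height of 960px, each image in the set
# would be normalized to a width of about round(960 * target) px.
print(round(960 * target))  # 248
```

Because the sets were bucketed by aspect ratio in step 304, each image is already close to its set's average, so the normalization distorts the images only slightly.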
Step 306: verifying the recognition performance of the CNN according to the size-normalized validation images.
Specifically, after normalizing the sizes of the validation images within each validation set, the server may verify the recognition performance of the CNN according to the size-normalized validation images.
In a specific implementation, the server may determine the cost value after each verification according to the label values of the validation images, the output values of the CNN, and a preset cost function, and judge the recognition performance of the CNN according to the cost value.
In the third embodiment of the present application, after training the CNN according to the images corresponding to the image input sizes of the training stages, the method further includes: obtaining several validation sets, each containing several validation images for verifying the recognition performance of the CNN; normalizing the sizes of the validation images within the same validation set; and verifying the recognition performance of the CNN according to the size-normalized validation images, which significantly improves the recognition performance of the CNN. Obtaining several validation sets includes: obtaining several validation images and determining their aspect ratios; sorting the validation images by aspect ratio to determine their sorted order; and obtaining the validation sets according to the sorted order, which ensures that the aspect ratios within one batch differ little and thus preserves important feature information to the greatest extent. Normalizing the sizes of the validation images within the same validation set includes: determining the average aspect ratio of the validation images in the set according to their aspect ratios, and normalizing the sizes of the images in the set according to the average aspect ratio, which makes the normalization process more scientific and reasonable.
The fourth embodiment of the present application relates to an electronic device which, as shown in Fig. 9, comprises: at least one processor 401; and a memory 402 communicatively connected to the at least one processor 401, wherein the memory 402 stores instructions executable by the at least one processor 401, and the instructions are executed by the at least one processor 401 to enable the at least one processor 401 to perform the CNN training method of the above embodiments.
The memory and the processor are connected by a bus, which may comprise any number of interconnected buses and bridges linking the various circuits of one or more processors and the memory. The bus may also link various other circuits such as peripherals, voltage regulators, and power management circuits, all of which are well known in the art and therefore not further described here. A bus interface provides an interface between the bus and a transceiver. The transceiver may be one element or multiple elements, such as multiple receivers and transmitters, providing a unit for communicating with various other apparatuses over a transmission medium. Data processed by the processor is transmitted over a wireless medium via an antenna, and the antenna also receives data and passes it to the processor.
The processor is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions, while the memory may be used to store data used by the processor in performing operations.
The fifth embodiment of the present application relates to a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, the above method embodiments are implemented.
That is, those skilled in the art will understand that all or part of the steps of the methods in the above embodiments may be completed by a program instructing the relevant hardware. The program is stored in a storage medium and includes several instructions to cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage media include: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.
Those of ordinary skill in the art will understand that the above embodiments are specific embodiments for implementing the present application, and that in practical applications various changes may be made to them in form and detail without departing from the spirit and scope of the present application.

Claims (10)

  1. A CNN training method, comprising:
    determining the training stages in the training process of a convolutional neural network (CNN) and the sequential order of the training stages;
    determining, according to the sequential order, the image input size of each of the training stages, wherein the image input sizes of the training stages increase from small to large in the sequential order; and
    training the CNN according to images corresponding to the image input sizes of the training stages.
  2. The CNN training method according to claim 1, wherein determining, according to the sequential order, the image input size of each of the training stages comprises:
    determining the original image input size of the CNN; and
    determining that the image input size of the first training stage is smaller than the original image input size, and that the image input size of the last training stage is larger than the original image input size.
  3. The CNN training method according to claim 1 or 2, further comprising, after determining the training stages in the training process of the convolutional neural network (CNN) and the sequential order of the training stages:
    determining the number of training epochs of each of the training stages;
    wherein training the CNN according to images corresponding to the image input sizes of the training stages comprises:
    training the CNN according to the images corresponding to the image input sizes of the training stages and the numbers of training epochs.
  4. The CNN training method according to claim 3, wherein training the CNN according to the images corresponding to the image input sizes of the training stages and the numbers of training epochs comprises:
    determining the cost value after each training epoch according to the label values of the images corresponding to the image input sizes of the training stages and the output values of the CNN; and
    entering the next training stage if the cost value has not decreased over a preset number of consecutive training epochs.
  5. The CNN training method according to claim 3 or 4, wherein determining the number of training epochs of each of the training stages comprises: determining that the number of training epochs of the last training stage is 1 or 2.
  6. The CNN training method according to any one of claims 1 to 5, further comprising, after training the CNN according to the images corresponding to the image input sizes of the training stages:
    obtaining several validation sets, wherein each validation set contains several validation images for verifying the recognition performance of the CNN;
    normalizing the sizes of the validation images within the same validation set; and
    verifying the recognition performance of the CNN according to the size-normalized validation images.
  7. The CNN training method according to claim 6, wherein obtaining several validation sets comprises:
    obtaining several validation images and determining the aspect ratios of the validation images;
    sorting the validation images according to the aspect ratios to determine a sorted order of the validation images; and
    obtaining the several validation sets according to the sorted order.
  8. The CNN training method according to claim 7, wherein normalizing the sizes of the validation images within the same validation set comprises:
    determining the average aspect ratio of the validation images in the same validation set according to the aspect ratios of the validation images in the same validation set; and
    normalizing the sizes of the validation images within the same validation set according to the average aspect ratio.
  9. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the CNN training method according to any one of claims 1 to 8.
  10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the CNN training method according to any one of claims 1 to 8.
PCT/CN2021/127979 2020-11-26 2021-11-01 CNN training method, electronic device, and computer-readable storage medium WO2022111231A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011349094.5A CN114548355A (zh) 2020-11-26 2020-11-26 CNN training method, electronic device, and computer-readable storage medium
CN202011349094.5 2020-11-26

Publications (1)

Publication Number Publication Date
WO2022111231A1 true WO2022111231A1 (zh) 2022-06-02

Family

ID=81668077

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/127979 WO2022111231A1 (zh) 2020-11-26 2021-11-01 CNN training method, electronic device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN114548355A (zh)
WO (1) WO2022111231A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114882442A (zh) * 2022-05-31 2022-08-09 广州信诚信息科技有限公司 Personnel and equipment situation recognition method based on electric power operation sites

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108197602A (zh) * 2018-01-30 2018-06-22 厦门美图之家科技有限公司 Convolutional neural network generation method and expression recognition method
CN109299733A (zh) * 2018-09-12 2019-02-01 江南大学 Method for image recognition using a compact deep convolutional neural network
US20190303677A1 (en) * 2018-03-30 2019-10-03 Naver Corporation System and method for training a convolutional neural network and classifying an action performed by a subject in a video using the trained convolutional neural network
CN111767860A (zh) * 2020-06-30 2020-10-13 阳光学院 Method and terminal for implementing image recognition through a convolutional neural network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108197602A (zh) * 2018-01-30 2018-06-22 厦门美图之家科技有限公司 Convolutional neural network generation method and expression recognition method
US20190303677A1 (en) * 2018-03-30 2019-10-03 Naver Corporation System and method for training a convolutional neural network and classifying an action performed by a subject in a video using the trained convolutional neural network
CN109299733A (zh) * 2018-09-12 2019-02-01 江南大学 Method for image recognition using a compact deep convolutional neural network
CN111767860A (zh) * 2020-06-30 2020-10-13 阳光学院 Method and terminal for implementing image recognition through a convolutional neural network

Also Published As

Publication number Publication date
CN114548355A (zh) 2022-05-27

Similar Documents

Publication Publication Date Title
Su et al. Vitas: Vision transformer architecture search
WO2021238262A1 Vehicle identification method, apparatus, device, and storage medium
CN111275107A Multi-label scene image classification method and apparatus based on transfer learning
WO2021129181A1 Portrait segmentation method, model training method and electronic device
CN111709406B Text line recognition method and apparatus, readable storage medium, and electronic device
CN109063719B Image classification method combining structural similarity and class information
CN106383912A Image retrieval method and apparatus
WO2021051987A1 Neural network model training method and apparatus
CN110245683B Residual relation network construction method for few-shot object recognition and application thereof
CN107516128A Flower recognition method based on a convolutional neural network with the ReLU activation function
CN115080749B Weakly supervised text classification method, system and apparatus based on self-supervised training
CN111931813A CNN-based broad learning classification method
WO2022111231A1 CNN training method, electronic device and computer-readable storage medium
Dai Real-time and accurate object detection on edge device with TensorFlow Lite
CN111783688B Remote sensing image scene classification method based on a convolutional neural network
CN117152438A Lightweight street-view image semantic segmentation method based on an improved DeepLabV3+ network
CN116109868A Image classification model construction based on a lightweight neural network and few-shot image classification method
Lv et al. Image semantic segmentation method based on atrous algorithm and convolution CRF
CN115937852A Efficient text-driven weakly supervised semantic segmentation method and apparatus
CN114332491A Salient object detection algorithm based on feature reconstruction
CN114298909A Super-resolution network model and application thereof
CN110211041B Optimization method for a neural network image classifier based on receptive field integration
CN111091198A Data processing method and apparatus
Wang et al. Image Semantic Segmentation Algorithm Based on Self-learning Super-Pixel Feature Extraction
CN111931773B Image recognition method, apparatus, device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21896732

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 18.10.2023)