WO2021027142A1 - Image classification model training method, system and computer device (图片分类模型训练方法、系统和计算机设备) - Google Patents


Info

Publication number
WO2021027142A1
WO2021027142A1 (application No. PCT/CN2019/117405; CN2019117405W)
Authority
WO
WIPO (PCT)
Prior art keywords
sample
picture
model
training
layer
Prior art date
Application number
PCT/CN2019/117405
Other languages
English (en)
French (fr)
Inventor
沈吉祥
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2021027142A1 publication Critical patent/WO2021027142A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • the embodiments of the present application relate to the field of data processing, and in particular to a method, system, computer device, and non-volatile computer-readable storage medium for training a picture classification model based on a small amount of data.
  • the embodiment of the present application provides a method for training a picture classification model based on a small amount of data, and the method steps include:
  • constructing a pre-training model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures;
  • obtaining a plurality of sample pictures, and obtaining a sample picture training set according to the plurality of sample pictures, the plurality of sample pictures including pictures of a plurality of target picture types; and
  • the pre-training model is trained through the sample picture training set to obtain a picture classification model.
  • an embodiment of the present application also provides a picture classification model training system based on a small amount of data, including:
  • the construction module is used to construct a pre-training model according to the ImageNet model, the ImageNet model being a network model trained according to a preset pre-sample picture;
  • An obtaining module configured to obtain a plurality of sample pictures, and obtain a sample picture training set according to the plurality of sample pictures, and the plurality of sample pictures include pictures of a plurality of target picture types;
  • the training module is used to train the pre-training model through the sample picture training set to obtain a picture classification model.
  • an embodiment of the present application further provides a computer device, the computer device including a memory, a processor, and computer-readable instructions stored on the memory and executable on the processor; when the computer-readable instructions are executed by the processor, the following steps are implemented:
  • constructing a pre-training model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures;
  • obtaining a plurality of sample pictures, and obtaining a sample picture training set according to the plurality of sample pictures, the plurality of sample pictures including pictures of a plurality of target picture types; and
  • the pre-training model is trained through the sample picture training set to obtain a picture classification model.
  • the embodiments of the present application also provide a non-volatile computer-readable storage medium.
  • the non-volatile computer-readable storage medium stores computer-readable instructions, and the computer-readable instructions can be executed by at least one processor, so that the at least one processor executes the following steps:
  • constructing a pre-training model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures;
  • obtaining a plurality of sample pictures, and obtaining a sample picture training set according to the plurality of sample pictures, the plurality of sample pictures including pictures of a plurality of target picture types; and
  • the pre-training model is trained through the sample picture training set to obtain a picture classification model.
  • the method, system, computer device, and non-volatile computer-readable storage medium for training a picture classification model based on a small amount of data provided in the embodiments of this application offer an effective way to classify pictures; by training a model on a small amount of data, they improve image training speed, image recognition speed, and image recognition accuracy, increase the number of distinguishable image types, and improve the efficiency of picture classification.
  • FIG. 1 is a schematic flowchart of a method for training a picture classification model based on a small amount of data according to an embodiment of the application.
  • Fig. 2 is a schematic diagram of a specific flow of step S100 in Fig. 1.
  • Fig. 3 is a schematic diagram of a specific flow of step S102 in Fig. 1.
  • FIG. 4 is a schematic diagram of the program modules of Embodiment 2 of the image classification model training system based on a small amount of data according to this application.
  • FIG. 5 is a schematic diagram of the hardware structure of the computer device according to Embodiment 3 of this application.
  • the computer device 2 will be used as the execution subject for exemplary description.
  • FIG. 1 shows a flowchart of steps of a method for training a picture classification model based on a small amount of data in an embodiment of the present application. It can be understood that the flowchart in this method embodiment is not used to limit the order of execution of the steps.
  • the following description exemplarily uses the computer device 2 as the execution subject. The details are as follows.
  • Step S100: a pre-training model is constructed according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures.
  • the pre-sample pictures are a large number of arbitrary pictures, which can be obtained from an existing picture database or downloaded directly from the Internet.
  • the step S100 may further include:
  • Step S100a: input each pre-sample picture into the network model, each pre-sample picture being pre-associated with a corresponding classification label.
  • Step S100b: output, through the network model, the classification label confidence corresponding to each pre-sample picture.
  • Step S100c: adjust the model parameters of the network model based on the classification label confidence corresponding to each pre-sample picture and the classification label pre-associated with each pre-sample picture, to obtain the ImageNet model.
  • Step S100d: based on the ImageNet model, configure the layer structure of each network layer in the ImageNet model according to a layer-by-layer network analysis method to construct the pre-training model.
  • Step S100e: the pre-training model is a nine-layer image classification network comprising, in order: an input layer, the conv1 convolutional layer, the conv2 convolutional layer, the conv3 convolutional layer, the conv4 convolutional layer, the conv5 convolutional layer, the fc6 fully connected layer, the fc7 fully connected layer, and an output layer.
  • the design method of the nine-layer image classification network includes:
  • the design method of the conv1 convolutional layer is: the original image is input through the image input layer and convolved, yielding a feature map; the maxout activation function is applied to obtain the activated feature map; pooling and downsampling are then performed, the pooling being max pooling with the kernel size and sliding stride of the pooling layer configured, and the downsampled feature map is output; a Batch Normalization (BN) preprocessing step is performed before the input to the conv2 convolutional layer.
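  The conv1 pipeline above (convolution, then maxout activation, then max pooling) can be sketched in NumPy. This is an illustrative sketch only: the maxout group size k = 2 and the 2x2 non-overlapping pooling window are assumed values, not parameters stated in the application.

```python
import numpy as np

def maxout(feature_maps: np.ndarray, k: int = 2) -> np.ndarray:
    """Maxout activation: split channels into groups of k and keep the
    element-wise maximum. feature_maps: (channels, height, width);
    channels must be divisible by k."""
    c, h, w = feature_maps.shape
    return feature_maps.reshape(c // k, k, h, w).max(axis=1)

def max_pool(feature_maps: np.ndarray, size: int = 2) -> np.ndarray:
    """Non-overlapping max pooling with a square size x size kernel
    (stride equal to the kernel size); edge rows/cols that do not fill
    a full window are trimmed."""
    c, h, w = feature_maps.shape
    trimmed = feature_maps[:, :h - h % size, :w - w % size]
    blocks = trimmed.reshape(c, h // size, size, w // size, size)
    return blocks.max(axis=(2, 4))

x = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)  # 2 channels, 4x4
activated = maxout(x, k=2)    # (2, 4, 4) -> (1, 4, 4)
pooled = max_pool(activated)  # (1, 4, 4) -> (1, 2, 2)
```

  Maxout pairs channels and keeps the larger response, which is why the channel count halves before pooling in this sketch.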
  • the design method of the conv2 convolutional layer is as follows: the output of conv1 is used as the input of conv2, and convolution is performed first; to prevent the feature map from shrinking too quickly, this convolutional layer also adds edge padding, and a feature map is obtained after convolution; the maxout activation function is then applied to obtain the activated feature map; max pooling is likewise applied, with the kernel size and sliding stride of the pooling layer configured, and the feature map is output; BN processing is performed before the input to the conv3 convolutional layer, and the feature map is output.
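  The BN preprocessing applied before the conv2 and conv3 inputs can be illustrated as follows; this is a minimal per-channel normalization sketch, with the learnable scale and shift parameters of full Batch Normalization omitted for brevity.

```python
import numpy as np

def batch_norm(feature_maps: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Per-channel normalization (BN preprocessing): shift each channel
    to zero mean and scale it to unit variance. feature_maps has shape
    (channels, height, width)."""
    mean = feature_maps.mean(axis=(1, 2), keepdims=True)
    var = feature_maps.var(axis=(1, 2), keepdims=True)
    return (feature_maps - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
fmap = rng.normal(loc=3.0, scale=2.0, size=(8, 16, 16))  # toy feature maps
normalized = batch_norm(fmap)
```

  After normalization every channel has approximately zero mean and unit variance, which stabilizes the statistics seen by the next convolutional layer.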
  • the design method of the conv3 convolutional layer is as follows: the output of conv2 is used as the input of conv3; convolution with edge padding is performed first, yielding a feature map; the maxout activation function is applied, and the activated output is fed to the next convolutional layer.
  • the design method of the conv4 convolutional layer is as follows: the output of conv3 is used as the input of conv4; convolution with edge padding is performed first, yielding a feature map; the maxout activation function is applied, and the activated output is fed to the next convolutional layer.
  • the design method of the conv5 convolutional layer is as follows: the output of conv4 is used as the input of conv5; convolution with edge padding is performed first, yielding a feature map; the maxout activation function is applied to obtain the activated feature map; max pooling is applied, with the kernel size and sliding stride of the pooling layer configured, and the feature map is output. This layer does not perform BN preprocessing.
  • the design method of the fc6 fully connected layer is as follows: the output of conv5 is used as the input of the fc6 fully connected layer; the feature map is input, the number of neurons in fc6 is configured, the maxout activation function is applied, and the output is processed with dropout.
  • the design method of the fc7 fully connected layer is as follows: the output of fc6 is used as the input of the fc7 fully connected layer; the structure of this layer is essentially the same as that of fc6: the number of neurons is configured, the maxout activation function is applied, and the output is likewise processed with dropout.
  • the design method of the output classification layer is as follows: the softmax classifier is selected as the classifier of this layer, and the number of nodes is determined by the training samples. For example, for the CIFAR-10 image database the number of nodes is set to 10; for the CIFAR-100 image database it is set to 100; and for the CIFAR-1000 image database it is set to 1000. The nine-layer image classification network is obtained through the above steps.
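  The softmax output layer described above can be sketched as follows. The 10-node setting follows the CIFAR-10 example; the fc7 width of 4096 and the random weights are illustrative assumptions, not values from the application.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis: subtract the max
    before exponentiating, then normalize to a probability distribution."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
num_classes = 10                             # CIFAR-10: 10 output nodes
fc7_output = rng.normal(size=4096)           # assumed fc7 width, illustrative
weights = rng.normal(size=(4096, num_classes)) * 0.01
probs = softmax(fc7_output @ weights)        # one probability per class
```

  Swapping the dataset only changes `num_classes` (100 for CIFAR-100, 1000 for CIFAR-1000); the layer itself is unchanged.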
  • Step S102: obtain a plurality of sample pictures, and obtain a sample picture training set according to the plurality of sample pictures, the plurality of sample pictures including pictures of a plurality of target picture types.
  • the step S102 may further include:
  • Step S102a: obtain N sample pictures and sample them to obtain a plurality of sample picture sets.
  • Step S102b: select one of the sample picture sets from the plurality of sample picture sets. Exemplarily, suppose the selected sample picture set contains n pictures x_1, x_2, …, x_n.
  • Step S102c: extract the image features of each sample picture in the selected sample picture set, and, according to these image features, extract by incremental learning the image features of each sample picture in the unselected sample picture sets, so as to obtain the image features of the N sample pictures.
  • the incremental principal component analysis method is used to gradually extract the image features of the pictures in the remaining sample picture sets; extracting the image features of each sample picture in the selected sample picture set includes:
  • the finally calculated covariance matrix Σ is the covariance matrix calculated by the PCA method, namely Σ = (1/n) Σ_{i=1}^{n} (x_i − μ)(x_i − μ)^T, where μ = (1/n) Σ_{i=1}^{n} x_i is the mean of the image features of the n pictures.
  • T(X) can be selected according to user needs, for example according to the classification rules provided by the user.
  • eigendecomposition is performed on the covariance matrix Σ to obtain its eigenvalues and eigenvectors, from which the image features are extracted.
  • the image features are gradually extracted through the incremental learning method; for the i-th (1 ≤ i ≤ n) image I_i, a vector f_i is obtained, and the dimension of f_i is smaller than the dimension of I_i.
  • the embodiment of the application obtains the image features of each sample picture in the selected sample picture set, and extracts, by the incremental learning method, the image features of each sample picture in the unselected sample picture sets, so as to obtain the image features of the N sample pictures.
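  The incremental extraction described above computes statistics batch by batch instead of loading all n pictures into memory at once. A minimal NumPy sketch, assuming the standard pooled-mean and pooled-covariance merge formula as the incremental step, shows that two batches' statistics combine exactly into the full-sample covariance:

```python
import numpy as np

def batch_stats(X: np.ndarray):
    """Mean and (population) covariance of one batch of feature vectors,
    X of shape (num_samples, dim)."""
    mu = X.mean(axis=0)
    centered = X - mu
    return mu, (centered.T @ centered) / len(X)

def merge_stats(mu_m, cov_m, m, mu_p, cov_p, p):
    """Merge the stats of an m-sample batch and a p-sample batch into the
    stats of the combined m+p samples (standard pooled-covariance formula;
    one way to realize the incremental step)."""
    n = m + p
    mu = (m * mu_m + p * mu_p) / n
    d = (mu_m - mu_p).reshape(-1, 1)
    cov = (m * cov_m + p * cov_p) / n + (m * p / n**2) * (d @ d.T)
    return mu, cov

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))                    # 50 toy feature vectors
mu1, c1 = batch_stats(X[:30])                   # first batch (m = 30)
mu2, c2 = batch_stats(X[30:])                   # second batch (p = 20)
mu, cov = merge_stats(mu1, c1, 30, mu2, c2, 20)
```

  Because the merge is exact, batches can be folded in one at a time and only the running mean and covariance need to stay in memory.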
  • Step S102d: obtain a sample picture training set based on the image features of the N sample pictures and a clustering algorithm.
  • the sample picture training set includes a plurality of picture sets, each picture set corresponds to a cluster category, and each sample picture is located in one or more picture sets.
  • the steps of classifying the N sample pictures through a clustering algorithm to obtain the sample picture training set include:
  • the N sample pictures are divided, based on a clustering algorithm, into a plurality of picture sets corresponding to the plurality of categories, and each picture is located in one or more picture sets;
  • the image feature of each picture includes a category label, and each category label corresponds to the picture set to which the picture belongs and its number within that picture set.
  • the N sample pictures are divided into M picture sets, that is, M clusters; each image feature corresponds to a cluster, and the category label identifies the cluster corresponding to the image feature. For example, the image feature of the i-th (1 ≤ i ≤ n) picture corresponds to a class label L_i, where L_i ranges from 1 to M.
  • the embodiment of the present application processes the features through a cluster analysis method, and divides the image features of the N sample pictures into M picture sets, so that pictures with similar image characteristics are grouped together, and one picture set corresponds to one cluster.
  • the i-th (1 ≤ i ≤ n) picture obtains a class label L_i in the range 1 to M; there are M cluster centers, and the data in each cluster is closely gathered around its cluster center.
  • when searching, one only needs to compare the image features of the picture to be recognized with the M cluster centers and find the closest one; the picture to be recognized belongs only to that cluster and not to any other, so further matching within that cluster yields the final matching result, reducing search time in the database.
  • the cluster center of each cluster, that is, the mean value of each cluster, needs to be saved in the database.
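  The search procedure above (compare the query feature with the M stored cluster centers and keep only the closest cluster for further matching) can be sketched as follows; the 2-D features and three cluster means are toy illustrative values.

```python
import numpy as np

def nearest_cluster(feature: np.ndarray, centers: np.ndarray) -> int:
    """Return the index of the closest of the M stored cluster centers
    (Euclidean distance); only that cluster needs further matching."""
    dists = np.linalg.norm(centers - feature, axis=1)
    return int(dists.argmin())

centers = np.array([[0.0, 0.0],     # M = 3 saved cluster means
                    [10.0, 10.0],
                    [0.0, 10.0]])
query = np.array([9.0, 8.5])        # image feature of the picture to recognize
idx = nearest_cluster(query, centers)
# only pictures inside cluster `idx` are compared in the final matching step
```

  This turns a scan over the whole database into one scan over M centers plus one scan over a single cluster, which is where the claimed search-time reduction comes from.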
  • Step S104: train the pre-training model through the sample picture training set to obtain a picture classification model.
  • the step of training the pre-training model through the sample picture training set includes: training the pre-training model through a transfer learning method, wherein the sample picture training set is input to the penultimate layer of the pre-training model.
  • the sample picture training set is fed through transfer learning into the fc7 fully connected layer of the pre-training model as the training set, which fine-tunes the last layers of the pre-training model; this completes the training of the pre-training model and yields a picture classification model with higher robustness and better accuracy.
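  The transfer-learning step, in which the new training set enters at the top of the network so that only the last layers are effectively fine-tuned, can be illustrated with a toy parameter-freezing sketch. The layer names mirror the nine-layer network, but the weight shapes, the placeholder gradients, and the choice to unfreeze exactly fc7 and the output layer are illustrative assumptions.

```python
import numpy as np

# Layers of the assumed pre-trained model, mirroring the nine-layer network.
layers = ["conv1", "conv2", "conv3", "conv4", "conv5", "fc6", "fc7", "output"]
rng = np.random.default_rng(0)
params = {name: rng.normal(size=(4, 4)) for name in layers}  # toy weights

trainable = {"fc7", "output"}  # fine-tune only the last layers

def sgd_step(params: dict, grads: dict, lr: float = 0.01) -> dict:
    """One gradient step that updates only the unfrozen layers;
    frozen layers keep their pre-trained weights."""
    for name in params:
        if name in trainable:
            params[name] = params[name] - lr * grads[name]
    return params

grads = {name: np.ones((4, 4)) for name in layers}  # placeholder gradients
before = {name: p.copy() for name, p in params.items()}
params = sgd_step(params, grads)
```

  Freezing the convolutional stack keeps the features learned from the large pre-sample dataset intact, which is what lets a small sample-picture training set suffice.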
  • FIG. 4 is a schematic diagram of the program modules of Embodiment 2 of the image classification model training system based on a small amount of data according to this application.
  • the picture classification model training system 20 may include or be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to complete this application and realize the above-described image classification model training method based on a small amount of data.
  • the program module referred to in the embodiments of the present application refers to a series of computer-readable instruction segments that can complete specific functions. The following description will specifically introduce the functions of each program module in this embodiment:
  • the construction module 200 is configured to construct a pre-training model according to the ImageNet model, which is a network model trained according to a preset pre-sample picture.
  • the construction module 200 is further configured to: input each pre-sample picture into the network model, each pre-sample picture being pre-associated with a corresponding classification label; output, through the network model, the classification label confidence corresponding to each pre-sample picture; adjust the model parameters of the network model based on the classification label confidence corresponding to each pre-sample picture and the classification label pre-associated with each pre-sample picture, to obtain the ImageNet model; and, based on the ImageNet model, configure the layer structure of each network layer in the ImageNet model according to a layer-by-layer network analysis method to construct the pre-training model, wherein the pre-training model is a nine-layer image classification network comprising, in order: an input layer, the conv1 convolutional layer, the conv2 convolutional layer, the conv3 convolutional layer, the conv4 convolutional layer, the conv5 convolutional layer, the fc6 fully connected layer, the fc7 fully connected layer, and an output layer.
  • the obtaining module 202 is further configured to: obtain N sample pictures and sample them to obtain a plurality of sample picture sets; select one of the sample picture sets from the plurality of sample picture sets; extract the image features of each sample picture in the selected sample picture set, and extract, by incremental learning according to those image features, the image features of each sample picture in the unselected sample picture sets to obtain the image features of the N sample pictures; and obtain a sample picture training set based on the image features of the N sample pictures and a clustering algorithm.
  • the sample picture training set includes multiple picture sets, each picture set corresponds to a cluster category, and each sample picture is located in one or more picture sets.
  • the training module 204 is configured to train the pre-training model through the sample picture training set to obtain a picture classification model.
  • the step of training the pre-training model through the sample picture training set includes: training the pre-training model through a transfer learning method, wherein the sample picture training set is input to the penultimate layer of the pre-training model.
  • the computer device 2 is a device that can automatically perform numerical calculation and/or information processing according to pre-set or stored instructions.
  • the computer device 2 may be a rack server, a blade server, a tower server, or a cabinet server (including an independent server, or a server cluster composed of multiple servers).
  • the computer device 2 at least includes, but is not limited to, a memory 21, a processor 22, a network interface 23, and a picture classification model training system 20 that can communicate with each other through a system bus.
  • the memory 21 includes at least one type of non-volatile computer-readable storage medium.
  • the readable storage medium includes a flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc, etc.
  • the memory 21 may be an internal storage unit of the computer device 2, such as a hard disk or memory of the computer device 2.
  • the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, etc.
  • the memory 21 may also include both the internal storage unit of the computer device 2 and its external storage device.
  • the memory 21 is generally used to store the operating system and various application software installed in the computer device 2, for example, the program code of the image classification model training system 20 based on a small amount of data in the second embodiment.
  • the memory 21 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 22 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments.
  • the processor 22 is generally used to control the overall operation of the computer device 2.
  • the processor 22 is used to run the program code stored in the memory 21 or process data, for example, to run the image classification model training system 20 based on a small amount of data, so as to implement the image classification model training method based on a small amount of data in the first embodiment.
  • the network interface 23 may include a wireless network interface or a wired network interface, and the network interface 23 is generally used to establish a communication connection between the computer device 2 and other electronic devices.
  • the network interface 23 is used to connect the computer device 2 with an external terminal through a network, and establish a data transmission channel and a communication connection between the computer device 2 and the external terminal.
  • the network may be an intranet, the Internet, a Global System for Mobile Communications (GSM) network, Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, Wi-Fi, or another wireless or wired network.
  • FIG. 5 only shows the computer device 2 with components 20-23, but it should be understood that it is not required to implement all the components shown, and more or fewer components may be implemented instead.
  • the image classification model training system 20 based on a small amount of data stored in the memory 21 can also be divided into one or more program modules, the one or more program modules are stored in the memory 21, and It is executed by one or more processors (the processor 22 in this embodiment) to complete the application.
  • FIG. 4 shows a schematic diagram of the program modules of the image classification model training system 20 based on a small amount of data according to the second embodiment of the present application.
  • the image classification model training system 20 based on a small amount of data can be divided into a construction module 200, an obtaining module 202, and a training module 204.
  • the program module referred to in this application refers to a series of computer-readable instruction segments that can complete specific functions. The specific functions of the program modules 200-204 have been described in detail in the second embodiment, and will not be repeated here.
  • This embodiment also provides a non-volatile computer-readable storage medium, such as a flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc, server, or application store, on which computer-readable instructions are stored; the corresponding functions are realized when the instructions are executed by a processor.
  • the non-volatile computer-readable storage medium of this embodiment is used in the image classification model training system 20 based on a small amount of data, and the processor executes the following steps:
  • constructing a pre-training model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures;
  • obtaining a plurality of sample pictures, and obtaining a sample picture training set according to the plurality of sample pictures, the plurality of sample pictures including pictures of a plurality of target picture types; and
  • the pre-training model is trained through the sample picture training set to obtain a picture classification model.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method for training a picture classification model based on a small amount of data, the method comprising: constructing a pre-training model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures (S100); obtaining a plurality of sample pictures and deriving a sample picture training set from the plurality of sample pictures, the plurality of sample pictures including pictures of a plurality of target picture types (S102); and training the pre-training model with the sample picture training set to obtain a picture classification model (S104). By training a model on a small amount of data, the method improves image training speed, image recognition speed, and image recognition accuracy, increases the number of distinguishable image types, and greatly improves the efficiency of picture classification.

Description

Image classification model training method, system and computer device
This application claims priority to Chinese patent application No. 201910747597.9, entitled "图片分类模型训练方法、系统和计算机设备" (Image classification model training method, system and computer device), filed on August 14, 2019, the entire content of which is incorporated herein by reference.
Technical Field
The embodiments of this application relate to the field of data processing, and in particular to a method, system, computer device, and non-volatile computer-readable storage medium for training a picture classification model based on a small amount of data.
Background
With the advancement of science and technology, new image data is becoming better understood, so traditional image datasets risk becoming obsolete while new image datasets remain limited in quantity. An effective image classification system is therefore of great significance. The inventor found that image classification software currently on the market suffers from technical problems such as requiring a large amount of data for model training, slow image training and recognition, a limited number of distinguishable image types, and low image recognition accuracy.
Summary
In view of this, it is necessary to provide a method, system, computer device, and non-volatile computer-readable storage medium for training a picture classification model based on a small amount of data, so as to solve the current technical problems of requiring a large amount of data for model training, slow image training and recognition, a limited number of distinguishable image types, and low image recognition accuracy.
To achieve the above objective, an embodiment of this application provides a method for training a picture classification model based on a small amount of data, the method comprising the steps of:
constructing a pre-training model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures;
obtaining a plurality of sample pictures and deriving a sample picture training set from the plurality of sample pictures, the plurality of sample pictures including pictures of a plurality of target picture types; and
training the pre-training model with the sample picture training set to obtain a picture classification model.
To achieve the above objective, an embodiment of this application further provides a picture classification model training system based on a small amount of data, comprising:
a construction module, configured to construct a pre-training model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures;
an obtaining module, configured to obtain a plurality of sample pictures and derive a sample picture training set from the plurality of sample pictures, the plurality of sample pictures including pictures of a plurality of target picture types; and
a training module, configured to train the pre-training model with the sample picture training set to obtain a picture classification model.
To achieve the above objective, an embodiment of this application further provides a computer device, the computer device comprising a memory, a processor, and computer-readable instructions stored on the memory and executable on the processor, the computer-readable instructions, when executed by the processor, implementing the following steps:
constructing a pre-training model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures;
obtaining a plurality of sample pictures and deriving a sample picture training set from the plurality of sample pictures, the plurality of sample pictures including pictures of a plurality of target picture types; and
training the pre-training model with the sample picture training set to obtain a picture classification model.
To achieve the above objective, an embodiment of this application further provides a non-volatile computer-readable storage medium storing computer-readable instructions, the computer-readable instructions being executable by at least one processor so that the at least one processor executes the following steps:
constructing a pre-training model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures;
obtaining a plurality of sample pictures and deriving a sample picture training set from the plurality of sample pictures, the plurality of sample pictures including pictures of a plurality of target picture types; and
training the pre-training model with the sample picture training set to obtain a picture classification model.
The method, system, computer device, and non-volatile computer-readable storage medium for training a picture classification model based on a small amount of data provided in the embodiments of this application offer an effective way to classify pictures; by training a model on a small amount of data, they improve image training speed, image recognition speed, and image recognition accuracy, increase the number of distinguishable image types, and improve the efficiency of picture classification.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a method for training a picture classification model based on a small amount of data according to an embodiment of this application.
FIG. 2 is a schematic flowchart detailing step S100 in FIG. 1.
FIG. 3 is a schematic flowchart detailing step S102 in FIG. 1.
FIG. 4 is a schematic diagram of the program modules of Embodiment 2 of the picture classification model training system based on a small amount of data of this application.
FIG. 5 is a schematic diagram of the hardware structure of the computer device according to Embodiment 3 of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, this application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain this application and not to limit it. Based on the embodiments in this application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of this application.
It should be noted that descriptions involving "first", "second", and the like in this application are for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments can be combined with each other, provided that a person of ordinary skill in the art can realize the combination; when a combination of technical solutions is contradictory or cannot be realized, it should be considered that such a combination does not exist and is not within the scope of protection claimed by this application.
In the following embodiments, the computer device 2 is used as the execution subject for exemplary description.
Embodiment 1
Referring to FIG. 1, a flowchart of the steps of the method for training a picture classification model based on a small amount of data according to an embodiment of this application is shown. It can be understood that the flowchart in this method embodiment does not limit the order in which the steps are executed. The following description exemplarily uses the computer device 2 as the execution subject. The details are as follows.
Step S100: construct a pre-training model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures.
Exemplarily, the pre-sample pictures are a large number of arbitrary pictures, which can be obtained from an existing picture database or downloaded directly from the Internet.
Specifically, step S100 may further include:
Step S100a: input each pre-sample picture into the network model, each pre-sample picture being pre-associated with a corresponding classification label.
Step S100b: output, through the network model, the classification label confidence corresponding to each pre-sample picture.
Step S100c: adjust the model parameters of the network model based on the classification label confidence corresponding to each pre-sample picture and the classification label pre-associated with each pre-sample picture, to obtain the ImageNet model.
Step S100d: based on the ImageNet model, configure the layer structure of each network layer in the ImageNet model according to a layer-by-layer network analysis method to construct the pre-training model.
Step S100e: the pre-training model is a nine-layer image classification network comprising, in order: an input layer, the conv1 convolutional layer, the conv2 convolutional layer, the conv3 convolutional layer, the conv4 convolutional layer, the conv5 convolutional layer, the fc6 fully connected layer, the fc7 fully connected layer, and an output layer.
Illustratively, the nine-layer image classification network is designed as follows:
Illustratively, the conv1 convolutional layer is designed as follows: the original image is fed in through the image input layer and convolved, yielding a feature map; a feature map is then obtained through the maxout activation function; pooling-based downsampling follows, using max pooling with a configured pooling kernel size and sliding stride, and the downsampled feature map is output; before the output is fed into the conv2 layer, Batch Normalization (BN) preprocessing is applied.
Illustratively, the conv2 convolutional layer is designed as follows: the output of conv1 serves as the input of conv2. Convolution is applied first; to prevent the feature map from shrinking too quickly, this convolutional layer additionally applies edge padding, after which a feature map is obtained; a feature map is then likewise obtained through the maxout activation function; max pooling with a configured kernel size and sliding stride again outputs a feature map; BN is applied before the input to conv3, and the feature map is output.
Illustratively, the conv3 convolutional layer is designed as follows: the output of conv2 serves as the input of conv3. Convolution with edge padding is applied first, yielding a feature map; a feature map is likewise obtained through the maxout activation function, and the activated output is fed into the next convolutional layer.
Illustratively, the conv4 convolutional layer is designed as follows: the output of conv3 serves as the input of conv4. Convolution with edge padding is applied, yielding a feature map; a feature map is likewise obtained through the maxout activation function, and the activated output is fed into the next convolutional layer.
Illustratively, the conv5 convolutional layer is designed as follows: the output of conv4 serves as the input of conv5. Convolution with edge padding is applied, yielding a feature map; a feature map is likewise obtained through the maxout activation function; max pooling with a configured kernel size and sliding stride outputs the feature map. No BN preprocessing is applied in this layer.
Illustratively, the fc6 fully connected layer is designed as follows: the output of conv5 serves as the input of fc6. The feature map is input, the number of neurons of the fc6 layer is set, the maxout activation function is applied to the output neurons, and the output is produced with dropout.
Illustratively, the fc7 fully connected layer is designed as follows: the output of fc6 serves as the input of fc7. This layer's structure is essentially the same as fc6: the number of neurons of the fc7 layer is set, the maxout activation function is applied to the output neurons, and the output is likewise produced with dropout.
Illustratively, the output classification layer is designed as follows: a softmax classifier is used, and the number of neural nodes is determined by the training samples. For example, for the CIFAR-10 image database the number of nodes is set to 10; for the CIFAR-100 image database it is set to 100; for a 1000-class image database it is set to 1000. The above steps yield the nine-layer image classification network.
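For reference, the layer sequence and the recurring maxout activation described above can be summarized in a short sketch. This is an illustrative summary of the description, not the patent's implementation; the `NINE_LAYER_NET` tuple structure, the helper name `maxout`, and the group size `k` are assumptions of this sketch.

```python
import numpy as np

# Illustrative summary of the nine-layer network described above; the
# tuple structure and notes are this sketch's own convention, not the patent's.
NINE_LAYER_NET = [
    ("input",  "input",   "raw image"),
    ("conv1",  "conv",    "conv -> maxout -> max pool -> BN before conv2"),
    ("conv2",  "conv",    "padding -> conv -> maxout -> max pool -> BN before conv3"),
    ("conv3",  "conv",    "padding -> conv -> maxout"),
    ("conv4",  "conv",    "padding -> conv -> maxout"),
    ("conv5",  "conv",    "padding -> conv -> maxout -> max pool, no BN"),
    ("fc6",    "fc",      "maxout -> dropout"),
    ("fc7",    "fc",      "maxout -> dropout"),
    ("output", "softmax", "nodes = number of classes, e.g. 10 for CIFAR-10"),
]

def maxout(z, k=2):
    """Maxout activation: split the channel axis into groups of k and keep
    the maximum of each group (the group size k is an assumed parameter)."""
    *lead, c = z.shape
    assert c % k == 0, "channel count must be divisible by the group size"
    return z.reshape(*lead, c // k, k).max(axis=-1)
```

For example, `maxout(np.array([[1., 5., 2., 3.]]))` keeps the larger value of each adjacent pair of channels, giving `[[5., 3.]]`.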
Step S102: a plurality of sample pictures are acquired, and a sample picture training set is obtained from the plurality of sample pictures; the plurality of sample pictures include pictures of a plurality of target picture types.
Specifically, as shown in Fig. 3, step S102 may further include:
Step S102a: N sample pictures are acquired and sampled to obtain a plurality of sample picture sets.
Step S102b: one sample picture set is selected from the plurality of sample picture sets.
Illustratively, assume that the selected sample picture set contains n pictures $x_1, x_2, \ldots, x_n$.
Step S102c: the image features of each sample picture in the selected sample picture set are extracted, and, based on these image features, the image features of each sample picture in each of the unselected sample picture sets are extracted by incremental learning, so as to obtain the image features of the N sample pictures.
Illustratively, if the principal component analysis (PCA) method were applied directly, the mean of the image features of all pictures would be computed as
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
together with the covariance matrix
$\Sigma = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(x_i - \bar{x})^T$.
The eigendecomposition of the covariance matrix $\Sigma$ would then be computed, and the image features extracted from its result. This approach requires reading the vectors of all n pictures and holding them in memory; when n is very large, the memory may not meet the storage demand, and the computation fails.
Therefore, incremental principal component analysis is used to extract the image features of the remaining sample picture sets step by step. Extracting the image features of each sample picture in the selected sample picture set includes the following.
First, a part of the n pictures is taken, denoted $x_1, x_2, \ldots, x_m$ with $m < n$, and the mean of these m pictures,
$\bar{x}_m = \frac{1}{m}\sum_{i=1}^{m} x_i$,
and their covariance matrix $\Sigma_m$ are computed. Another part, denoted $x_{m+1}, x_{m+2}, \ldots, x_{m+p}$, is then taken from the n pictures; likewise, the mean of these p pictures is
$\bar{x}_p = \frac{1}{p}\sum_{i=m+1}^{m+p} x_i$
and their covariance matrix is $\Sigma_p$. The covariance matrix of the combined $m+p$ pictures $x_1, x_2, \ldots, x_m, x_{m+1}, x_{m+2}, \ldots, x_{m+p}$ is then
$\Sigma_{m+p} = \dfrac{m\,\Sigma_m + p\,\Sigma_p}{m+p} + \dfrac{mp}{(m+p)^2}\,(\bar{x}_m - \bar{x}_p)(\bar{x}_m - \bar{x}_p)^T.$
Parts of the n pictures are drawn repeatedly in this way until all pictures have been processed; the covariance matrix finally obtained is exactly the covariance matrix computed by the PCA method,
$\Sigma = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(x_i - \bar{x})^T$,
where $\bar{x}$ is the mean of the image features of the n pictures.
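The incremental covariance computation above can be sketched in NumPy. The merge step uses the standard identity for combining the means and (population) covariances of two batches, which is assumed here to be the formula the original equation image showed; the function names are illustrative.

```python
import numpy as np

def batch_mean_cov(X):
    """Mean and (population) covariance of the feature rows of X,
    matching the 1/n-normalized definitions above."""
    mu = X.mean(axis=0)
    D = X - mu
    return mu, D.T @ D / len(X)

def merge_cov(mu_m, cov_m, m, mu_p, cov_p, p):
    """Merge the statistics of two parts (sizes m and p) into the statistics
    of their union, using the standard incremental-PCA merge identity:
        Sigma = (m*Sigma_m + p*Sigma_p)/(m+p)
              + m*p/(m+p)**2 * (mu_m - mu_p)(mu_m - mu_p)^T
    """
    n = m + p
    mu = (m * mu_m + p * mu_p) / n
    d = (mu_m - mu_p).reshape(-1, 1)
    cov = (m * cov_m + p * cov_p) / n + (m * p / n**2) * (d @ d.T)
    return mu, cov
```

Merging the statistics of the two parts reproduces exactly the covariance that a single pass over all n pictures would give, so the full data never needs to be held in memory at once.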
In feature extraction, T(X) can be chosen according to the user's needs, for example through user-provided classification rules. Taking PCA as an example, eigendecomposition of the covariance matrix $\Sigma$ gives
$\Sigma = P \Lambda P^T$,
and for each picture $x_i$ its feature is
$f_i = P^T (x_i - \bar{x}).$
Through incremental learning the image features are extracted step by step: for the i-th image $I_i$ ($1 \le i \le n$), a vector $f_i$ is obtained whose dimension is lower than that of $I_i$.
Specifically, a function T(X) is constructed that takes any image $I_i$ as input and outputs a vector $f_i$ with a dimension lower than that of $I_i$, i.e. $f_i = T(I_i)$.
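The PCA construction of T(X) described above, eigendecomposing $\Sigma = P\Lambda P^T$ and projecting each feature onto the top-k eigenvectors, can be sketched as follows; the helper name is illustrative and features are assumed to be row vectors.

```python
import numpy as np

def fit_pca_projector(X, k):
    """Build T(.) from n feature vectors (rows of X): eigendecompose the
    covariance Sigma = P Lambda P^T and project onto the top-k eigenvectors,
    so that T(x) = P_k^T (x - mean). The function name is an assumption."""
    mu = X.mean(axis=0)
    D = X - mu
    cov = D.T @ D / len(X)
    vals, vecs = np.linalg.eigh(cov)            # eigh: for symmetric matrices
    P_k = vecs[:, np.argsort(vals)[::-1][:k]]   # top-k eigenvectors as columns
    return lambda x: P_k.T @ (x - mu)
```

The returned T(.) maps any image vector to a k-dimensional feature, i.e. a vector of lower dimension than the input, as required above.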
Illustratively, since the dimensionality of pictures is high, processing them directly would involve a large amount of computation; feature extraction is therefore used to extract the features of the images.
In this embodiment, on the basis of the image features of each sample picture in the selected sample picture set, incremental learning is used to extract the image features of each sample picture in each of the unselected sample picture sets, so as to obtain the image features of the N sample pictures.
Step S102d: the sample picture training set is obtained from the image features of the N sample pictures by a clustering algorithm; the sample picture training set includes a plurality of picture sets, each picture set corresponding to one cluster, and each sample picture belonging to one or more picture sets.
The step of classifying the N sample pictures by the clustering algorithm to obtain the sample picture training set includes:
Illustratively, based on the image features of the N sample pictures, the clustering algorithm partitions the N sample pictures into a plurality of picture sets corresponding to the plurality of classes; each picture belongs to one or more picture sets, and the image feature of each picture includes a category label that identifies the picture set the picture belongs to and its number within that set.
Specifically, cluster analysis divides the N sample pictures into M picture sets, i.e. M clusters; each image feature corresponds to one cluster, which can be identified by a category label. For example, the i-th image feature ($1 \le i \le n$) corresponds to a category label $L_i$ taking values from 1 to M.
Because the number of image features after dimensionality reduction is still large, direct querying and matching would be too time-consuming.
This embodiment therefore processes the features by cluster analysis, dividing the image features of the N sample pictures into M picture sets, so that pictures with similar image features are grouped together, one picture set per cluster. After clustering, the i-th picture ($1 \le i \le n$) receives a category label $L_i$ ranging from 1 to M, and the M cluster centers are $c_1, c_2, \ldots, c_M$.
The data in each cluster are tightly gathered around their cluster center. During lookup, the image feature of the picture to be recognized only needs to be compared with the M cluster centers; the nearest center determines the single cluster the picture belongs to (it belongs to no other cluster), and further matching within that cluster then yields the final match, which reduces the time spent searching the database.
Finally, for each picture, the picture itself, its associated image information (for example, the profile information of the person the image shows), its extracted image feature, and its category label are recorded in the database.
In addition, the center of each cluster, i.e. each cluster's mean, also needs to be saved in the database.
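The clustering and nearest-center lookup described above can be sketched with a minimal Lloyd's k-means. The patent does not name a specific clustering algorithm, so k-means here is an assumed stand-in, and both function names are illustrative.

```python
import numpy as np

def kmeans(F, M, iters=20, seed=0):
    """Minimal Lloyd's k-means over feature rows F: returns a label in
    {0, ..., M-1} for each row and the M cluster centers."""
    rng = np.random.default_rng(seed)
    centers = F[rng.choice(len(F), size=M, replace=False)].astype(float)
    for _ in range(iters):
        labels = np.argmin(((F[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(M):
            if np.any(labels == j):             # leave empty clusters unchanged
                centers[j] = F[labels == j].mean(axis=0)
    return labels, centers

def nearest_cluster(f, centers):
    """Compare a query feature only against the M cluster centers, so that
    further matching is restricted to the single nearest cluster."""
    return int(np.argmin(((centers - f) ** 2).sum(-1)))
```

A query is thus compared against M centers instead of all N stored features, which is what reduces the database search time.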
Step S104: the pre-trained model is trained with the sample picture training set to obtain the picture classification model.
Specifically, the step of training the pre-trained model with the sample picture training set includes: training the pre-trained model by transfer learning, the sample picture training set being input into the penultimate layer of the pre-trained model.
Illustratively, on the basis of this pre-trained model, transfer learning places the sample picture training set into the fc7 fully connected layer of the pre-trained model as the training set, which fine-tunes the last layer of the pre-trained model. This completes the training of the pre-trained model and yields a picture classification model with high robustness and good accuracy.
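The effect described above, feeding the training set in at fc7 so that only the last layer is fine-tuned, can be sketched as training a single softmax layer on frozen penultimate-layer features. This is a simplified linear stand-in under stated assumptions, not the patent's implementation; all names and hyperparameters are illustrative.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, numerically stabilized."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def finetune_last_layer(feats, labels, n_classes, lr=0.5, steps=200, seed=0):
    """Train only a final softmax layer on frozen penultimate-layer (fc7)
    features: the earlier layers' weights are untouched, so only the last
    layer is adjusted, mirroring the fine-tuning effect described above."""
    rng = np.random.default_rng(seed)
    W = 0.01 * rng.normal(size=(feats.shape[1], n_classes))
    Y = np.eye(n_classes)[labels]                 # one-hot targets
    for _ in range(steps):
        P = softmax(feats @ W)                    # class confidences
        W -= lr * feats.T @ (P - Y) / len(feats)  # cross-entropy gradient step
    return W
```

Because only `W` is updated, very little labeled data is needed, which is the motivation for transfer learning from the pre-trained model.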
Embodiment 2
Fig. 4 is a schematic diagram of the program modules of Embodiment 2 of the system for training a picture classification model based on a small amount of data. The picture classification model training system 20 may include, or be divided into, one or more program modules stored in a storage medium and executed by one or more processors, so as to complete the present application and implement the above method for training a picture classification model based on a small amount of data. A program module in the embodiments of the present application refers to a series of computer-readable instruction segments capable of completing a specific function. The functions of the program modules of this embodiment are described below:
The construction module 200 is configured to construct a pre-trained model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures.
Illustratively, the construction module 200 is further configured to: input each pre-sample picture into a network model, each pre-sample picture being pre-associated with a corresponding classification label; output, by the network model, a classification-label confidence for each pre-sample picture; adjust each model parameter of the network model based on the classification-label confidences and the pre-associated classification labels, to obtain the ImageNet model; and configure, based on the ImageNet model, the layer structure of each of its network layers by a layer-by-layer network analysis method, to construct the pre-trained model; the pre-trained model is a nine-layer image classification network comprising, in order: an input layer, a conv1 convolutional layer, a conv2 convolutional layer, a conv3 convolutional layer, a conv4 convolutional layer, a conv5 convolutional layer, an fc6 fully connected layer, an fc7 fully connected layer, and an output layer.
Illustratively, the acquiring module 202 is further configured to: acquire N sample pictures and sample them to obtain a plurality of sample picture sets; select one sample picture set from the plurality of sample picture sets; extract the image features of each sample picture in the selected set and, based on these, extract by incremental learning the image features of each sample picture in each unselected set, so as to obtain the image features of the N sample pictures; and obtain the sample picture training set from these image features by a clustering algorithm, the training set including a plurality of picture sets, each picture set corresponding to one cluster, with each sample picture belonging to one or more picture sets.
The training module 204 is configured to train the pre-trained model with the sample picture training set to obtain the picture classification model.
Illustratively, the step of training the pre-trained model with the sample picture training set includes: training the pre-trained model by transfer learning, the sample picture training set being input into the penultimate layer of the pre-trained model.
Embodiment 3
Referring to Fig. 5, a schematic diagram of the hardware architecture of the computer device of Embodiment 3 of the present application is shown. In this embodiment, the computer device 2 is a device capable of automatically performing numerical computation and/or information processing according to preset or stored instructions. The computer device 2 may be a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a cluster of multiple servers). As shown in the figure, the computer device 2 at least includes, but is not limited to, a memory 21, a processor 22, a network interface 23, and the picture classification model training system 20, which can be communicatively connected to one another through a system bus.
In this embodiment, the memory 21 includes at least one type of non-volatile computer-readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (e.g. SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc, and so on. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, such as its hard disk or internal memory. In other embodiments, the memory 21 may be an external storage device of the computer device 2, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the computer device 2. Of course, the memory 21 may also include both the internal storage unit of the computer device 2 and its external storage device. In this embodiment, the memory 21 is generally used to store the operating system and various application software installed on the computer device 2, such as the program code of the training system 20 based on a small amount of data of Embodiment 2. In addition, the memory 21 may also be used to temporarily store various data that has been or will be output.
In some embodiments, the processor 22 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 22 is generally used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is used to run the program code or process the data stored in the memory 21, for example, to run the training system 20 based on a small amount of data, so as to implement the method of Embodiment 1.
The network interface 23 may include a wireless network interface or a wired network interface and is generally used to establish communication connections between the computer device 2 and other electronic apparatuses. For example, the network interface 23 connects the computer device 2 with an external terminal through a network and establishes data transmission channels and communication connections between them. The network may be an intranet, the Internet, the Global System for Mobile Communications (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, Wi-Fi, or another wireless or wired network.
It should be pointed out that Fig. 5 only shows the computer device 2 with components 20–23, but it should be understood that not all of the shown components are required to be implemented; more or fewer components may be implemented instead.
In this embodiment, the training system 20 stored in the memory 21 may also be divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (the processor 22 in this embodiment), so as to complete the present application.
For example, Fig. 4 shows a schematic diagram of the program modules implementing the training system 20 of Embodiment 2; in that embodiment, the training system 20 may be divided into the construction module 200, the acquiring module 202, and the training module 204. A program module referred to in the present application is a series of computer-readable instruction segments capable of completing a specific function; the specific functions of the program modules 200–204 have been described in detail in Embodiment 2 and are not repeated here.
Embodiment 4
This embodiment further provides a non-volatile computer-readable storage medium, such as flash memory, hard disk, multimedia card, card-type memory (e.g. SD or DX memory), RAM, SRAM, ROM, EEPROM, PROM, magnetic memory, magnetic disk, optical disc, server, or app store, on which computer-readable instructions are stored, the corresponding functions being implemented when the program is executed by a processor. The non-volatile computer-readable storage medium of this embodiment is used for the picture classification model training system 20 based on a small amount of data, and, when executed by a processor, performs the following steps:
constructing a pre-trained model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures;
acquiring a plurality of sample pictures and obtaining a sample picture training set from them, the plurality of sample pictures including pictures of a plurality of target picture types; and
training the pre-trained model with the sample picture training set to obtain a picture classification model.
The serial numbers of the above embodiments of the present application are for description only and do not represent the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or alternatively by hardware, though in many cases the former is the better implementation.
The above are only preferred embodiments of the present application and do not thereby limit its patent scope; any equivalent structural or process transformation made using the contents of the specification and drawings of the present application, applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present application.

Claims (20)

  1. A method for training a picture classification model based on a small amount of data, the method comprising:
    constructing a pre-trained model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures;
    acquiring a plurality of sample pictures and obtaining a sample picture training set from the plurality of sample pictures, the plurality of sample pictures comprising pictures of a plurality of target picture types; and
    training the pre-trained model with the sample picture training set to obtain the picture classification model.
  2. The method for training a picture classification model based on a small amount of data of claim 1, wherein constructing the pre-trained model according to the ImageNet model comprises:
    inputting each pre-sample picture into a network model, each pre-sample picture being pre-associated with a corresponding classification label;
    outputting, by the network model, a classification-label confidence for each pre-sample picture;
    adjusting each model parameter of the network model based on the classification-label confidences and the pre-associated classification labels, to obtain the ImageNet model; and
    configuring, based on the ImageNet model, the layer structure of each network layer of the ImageNet model by a layer-by-layer network analysis method, to construct the pre-trained model;
    wherein the pre-trained model is a nine-layer image classification network comprising, in order: an input layer, a conv1 convolutional layer, a conv2 convolutional layer, a conv3 convolutional layer, a conv4 convolutional layer, a conv5 convolutional layer, an fc6 fully connected layer, an fc7 fully connected layer, and an output layer.
  3. The method for training a picture classification model based on a small amount of data of claim 2, wherein obtaining the sample picture training set from the plurality of sample pictures comprises:
    acquiring N sample pictures and sampling the N sample pictures to obtain a plurality of sample picture sets;
    selecting one sample picture set from the plurality of sample picture sets;
    extracting image features of each sample picture in the selected sample picture set, and, based on those image features, extracting by incremental learning the image features of each sample picture in each unselected sample picture set, so as to obtain the image features of the N sample pictures; and
    obtaining the sample picture training set from the image features of the N sample pictures by a clustering algorithm, the sample picture training set comprising a plurality of picture sets, each picture set corresponding to one cluster, and each sample picture belonging to one or more of the picture sets.
  4. The method for training a picture classification model based on a small amount of data of claim 1, wherein training the pre-trained model with the sample picture training set comprises: training the pre-trained model by transfer learning.
  5. The method for training a picture classification model based on a small amount of data of claim 4, wherein the sample picture training set is input into the penultimate layer of the pre-trained model.
  6. A system for training a picture classification model based on a small amount of data, comprising:
    a construction module, configured to construct a pre-trained model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures;
    an acquiring module, configured to acquire a plurality of sample pictures and obtain a sample picture training set from the plurality of sample pictures, the plurality of sample pictures comprising pictures of a plurality of target picture types; and
    a training module, configured to train the pre-trained model with the sample picture training set to obtain the picture classification model.
  7. The system for training a picture classification model based on a small amount of data of claim 6, wherein the construction module is further configured to:
    input each pre-sample picture into a network model, each pre-sample picture being pre-associated with a corresponding classification label;
    output, by the network model, a classification-label confidence for each pre-sample picture;
    adjust each model parameter of the network model based on the classification-label confidences and the pre-associated classification labels, to obtain the ImageNet model; and
    configure, based on the ImageNet model, the layer structure of each network layer of the ImageNet model by a layer-by-layer network analysis method, to construct the pre-trained model;
    wherein the pre-trained model is a nine-layer image classification network comprising, in order: an input layer, a conv1 convolutional layer, a conv2 convolutional layer, a conv3 convolutional layer, a conv4 convolutional layer, a conv5 convolutional layer, an fc6 fully connected layer, an fc7 fully connected layer, and an output layer.
  8. The system for training a picture classification model based on a small amount of data of claim 7, wherein the acquiring module is further configured to:
    acquire N sample pictures and sample the N sample pictures to obtain a plurality of sample picture sets;
    select one sample picture set from the plurality of sample picture sets;
    extract image features of each sample picture in the selected sample picture set, and, based on those image features, extract by incremental learning the image features of each sample picture in each unselected sample picture set, so as to obtain the image features of the N sample pictures; and
    obtain the sample picture training set from the image features of the N sample pictures by a clustering algorithm, the sample picture training set comprising a plurality of picture sets, each picture set corresponding to one cluster, and each sample picture belonging to one or more of the picture sets.
  9. The system for training a picture classification model based on a small amount of data of claim 6, wherein the training module is further configured to train the pre-trained model by transfer learning.
  10. The system for training a picture classification model based on a small amount of data of claim 9, wherein the sample picture training set is input into the penultimate layer of the pre-trained model.
  11. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, the computer-readable instructions, when executed by the processor, implementing the following steps:
    constructing a pre-trained model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures;
    acquiring a plurality of sample pictures and obtaining a sample picture training set from the plurality of sample pictures, the plurality of sample pictures comprising pictures of a plurality of target picture types; and
    training the pre-trained model with the sample picture training set to obtain the picture classification model.
  12. The computer device of claim 11, wherein constructing the pre-trained model according to the ImageNet model comprises:
    inputting each pre-sample picture into a network model, each pre-sample picture being pre-associated with a corresponding classification label;
    outputting, by the network model, a classification-label confidence for each pre-sample picture;
    adjusting each model parameter of the network model based on the classification-label confidences and the pre-associated classification labels, to obtain the ImageNet model; and
    configuring, based on the ImageNet model, the layer structure of each network layer of the ImageNet model by a layer-by-layer network analysis method, to construct the pre-trained model;
    wherein the pre-trained model is a nine-layer image classification network comprising, in order: an input layer, a conv1 convolutional layer, a conv2 convolutional layer, a conv3 convolutional layer, a conv4 convolutional layer, a conv5 convolutional layer, an fc6 fully connected layer, an fc7 fully connected layer, and an output layer.
  13. The computer device of claim 12, wherein obtaining the sample picture training set from the plurality of sample pictures comprises:
    acquiring N sample pictures and sampling the N sample pictures to obtain a plurality of sample picture sets;
    selecting one sample picture set from the plurality of sample picture sets;
    extracting image features of each sample picture in the selected sample picture set, and, based on those image features, extracting by incremental learning the image features of each sample picture in each unselected sample picture set, so as to obtain the image features of the N sample pictures; and
    obtaining the sample picture training set from the image features of the N sample pictures by a clustering algorithm, the sample picture training set comprising a plurality of picture sets, each picture set corresponding to one cluster, and each sample picture belonging to one or more of the picture sets.
  14. The computer device of claim 11, wherein training the pre-trained model with the sample picture training set comprises: training the pre-trained model by transfer learning.
  15. The computer device of claim 14, wherein the sample picture training set is input into the penultimate layer of the pre-trained model.
  16. A non-volatile computer-readable storage medium storing computer-readable instructions executable by at least one processor, so as to cause the at least one processor to perform the following steps:
    constructing a pre-trained model according to an ImageNet model, the ImageNet model being a network model trained on preset pre-sample pictures;
    acquiring a plurality of sample pictures and obtaining a sample picture training set from the plurality of sample pictures, the plurality of sample pictures comprising pictures of a plurality of target picture types; and
    training the pre-trained model with the sample picture training set to obtain the picture classification model.
  17. The non-volatile computer-readable storage medium of claim 16, wherein constructing the pre-trained model according to the ImageNet model comprises:
    inputting each pre-sample picture into a network model, each pre-sample picture being pre-associated with a corresponding classification label;
    outputting, by the network model, a classification-label confidence for each pre-sample picture;
    adjusting each model parameter of the network model based on the classification-label confidences and the pre-associated classification labels, to obtain the ImageNet model; and
    configuring, based on the ImageNet model, the layer structure of each network layer of the ImageNet model by a layer-by-layer network analysis method, to construct the pre-trained model;
    wherein the pre-trained model is a nine-layer image classification network comprising, in order: an input layer, a conv1 convolutional layer, a conv2 convolutional layer, a conv3 convolutional layer, a conv4 convolutional layer, a conv5 convolutional layer, an fc6 fully connected layer, an fc7 fully connected layer, and an output layer.
  18. The non-volatile computer-readable storage medium of claim 17, wherein obtaining the sample picture training set from the plurality of sample pictures comprises:
    acquiring N sample pictures and sampling the N sample pictures to obtain a plurality of sample picture sets;
    selecting one sample picture set from the plurality of sample picture sets;
    extracting image features of each sample picture in the selected sample picture set, and, based on those image features, extracting by incremental learning the image features of each sample picture in each unselected sample picture set, so as to obtain the image features of the N sample pictures; and
    obtaining the sample picture training set from the image features of the N sample pictures by a clustering algorithm, the sample picture training set comprising a plurality of picture sets, each picture set corresponding to one cluster, and each sample picture belonging to one or more of the picture sets.
  19. The non-volatile computer-readable storage medium of claim 16, wherein training the pre-trained model with the sample picture training set comprises: training the pre-trained model by transfer learning.
  20. The non-volatile computer-readable storage medium of claim 19, wherein the sample picture training set is input into the penultimate layer of the pre-trained model.
PCT/CN2019/117405 2019-08-14 2019-11-12 Method, system and computer device for training a picture classification model WO2021027142A1 (zh)

Applications Claiming Priority (2)

Application Number — Priority Date — Filing Date — Title
CN201910747597.9A — 2019-08-14 — 2019-08-14 — Method, system and computer device for training a picture classification model
CN201910747597.9 — 2019-08-14

Publications (1)

Publication Number: WO2021027142A1
Family ID: 69037014
Family Applications (1): PCT/CN2019/117405 — priority date 2019-08-14 — filed 2019-11-12 — Method, system and computer device for training a picture classification model
Country Status (2): CN 110659667 A (zh); WO 2021027142 A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
CN113066053A (2021-07-02) — Model-transfer-based duodenum self-training classification method and system
CN113112518A (2021-07-13) — Feature extractor generation method and apparatus based on stitched images, and computer device
CN113239964A (2021-08-10) — Vehicle data processing method, apparatus, device, and storage medium
CN114873097A (2022-08-09) — Intelligent waste-sorting bin based on object recognition
CN115114467A (2022-09-27) — Training method and apparatus for a picture neural network model

Families Citing this family (2)

* Cited by examiner, † Cited by third party
CN112016622A (2020-12-01) — Model training method, electronic device, and computer-readable storage medium
CN113360696A (2021-09-07) — Image pairing method, apparatus, device, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
CN106960219A (2017-07-18) — Picture recognition method and apparatus, computer device, and computer-readable medium
CN107292333A (2017-10-24) — Fast image classification method based on deep learning
CN109102002A (2018-12-28) — Image classification method combining a convolutional neural network and a concept-machine recurrent neural network
CN109711426A (2019-05-03) — Pathological picture classification apparatus and method based on GAN and transfer learning

Family Cites Families (4)

* Cited by examiner, † Cited by third party
CN101604394A (2009-12-16) — Incremental learning classification method under limited storage resources
CN107341518A (2017-11-10) — Image classification method based on a convolutional neural network
CN109840530A (2019-06-04) — Method and apparatus for training a multi-label classification model
CN109934242A (2019-06-25) — Picture recognition method and apparatus


Also Published As

Publication number — Publication date
CN110659667A (zh) — 2020-01-07


Legal Events

121: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19941448; Country of ref document: EP; Kind code of ref document: A1)
NENP: Non-entry into the national phase (Ref country code: DE)
122: PCT application non-entry into the European phase (Ref document number: 19941448; Country of ref document: EP; Kind code of ref document: A1)