WO2021151318A1 - Deep learning-based image classification method and apparatus, and computer device - Google Patents

Deep learning-based image classification method and apparatus, and computer device

Info

Publication number
WO2021151318A1
WO2021151318A1 PCT/CN2020/122131 CN2020122131W WO2021151318A1 WO 2021151318 A1 WO2021151318 A1 WO 2021151318A1 CN 2020122131 W CN2020122131 W CN 2020122131W WO 2021151318 A1 WO2021151318 A1 WO 2021151318A1
Authority
WO
WIPO (PCT)
Prior art keywords
supernet
convolutional layer
training
image
training set
Prior art date
Application number
PCT/CN2020/122131
Other languages
English (en)
Chinese (zh)
Inventor
沈赞
庄伯金
王少军
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2021151318A1

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/084 - Backpropagation, e.g. using gradient descent

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to an image classification method, device and computer equipment based on deep learning.
  • Images can be classified intelligently using deep learning methods. Deep learning has achieved great success in the field of machine learning, and many classic and effective network structures have emerged.
  • However, the design of these network structures relies on the rich experience of domain experts and requires a great deal of time and effort for design and experimentation. Neural architecture search has therefore become a hot research field in recent years: by defining a search space, reinforcement learning, evolutionary algorithms and other methods can be used to automatically search for the optimal network structure. However, these methods are very time-consuming and require a large amount of GPU resources.
  • To address this, the inventor found that a One-Shot method based on weight sharing has been proposed: a directed acyclic graph containing all candidate operation items, namely the supernet, is constructed and trained only once; networks consisting of a single path composed of different operation items are then sampled from the trained supernet, their accuracy is evaluated on a test set, and the optimal neural architecture is selected.
  • However, the inventor of this application found in research that, because the output of the previous layer in a convolutional neural network and the input of the next layer must be consistent in the number of channels, the supernet cannot define the search in the dimension of the number of channels; instead, the number of channels in each layer of the network is defined manually in advance. The resulting neural architecture is therefore not necessarily a suitable one, which in turn affects the accuracy of image classification when a model of that neural architecture is used to classify images.
  • In view of this, this application provides an image classification method, device and computer equipment based on deep learning, the main purpose of which is to address the technical problem in the prior art that affects the accuracy of image classification.
  • According to one aspect, an image classification method based on deep learning includes: configuring search space information of a neural architecture based on a MobileNet network; constructing a supernet according to the search space information, and configuring a spring structure corresponding to each convolutional layer of the supernet, wherein the spring structure is used during supernet training to fix the number of channels corresponding to different operation items of the same convolutional layer to the same number of channels and output it to the next convolutional layer; training the supernet using a first image training set to determine a target neural architecture suitable for image classification; training a model of the target neural architecture using a second image training set; and using the trained model to classify images to be classified.
  • According to another aspect, an image classification device based on deep learning includes: a configuration module for configuring search space information of a neural architecture based on a MobileNet network; and a construction module for constructing a supernet according to the search space information and configuring a spring structure corresponding to each convolutional layer of the supernet, wherein the spring structure is used during supernet training to fix the number of channels corresponding to different operation items of the same convolutional layer to the same number of channels and output it to the next convolutional layer.
  • The device further includes a training module for training the supernet using the first image training set to determine a target neural architecture suitable for image classification;
  • the training module is further used to train a model of the target neural architecture using the second image training set;
  • and a classification module for using the trained model to classify the images to be classified.
  • According to yet another aspect, a storage medium (such as a non-volatile readable storage medium) is provided, on which a computer program is stored; when the program is executed by a processor, the following method is implemented: configuring search space information of a neural architecture based on a MobileNet network; constructing a supernet according to the search space information, and configuring a spring structure corresponding to each convolutional layer of the supernet, wherein the spring structure is used during supernet training to fix the number of channels corresponding to different operation items of the same convolutional layer to the same number of channels and output it to the next convolutional layer;
  • training the supernet using the first image training set to determine the target neural architecture suitable for image classification;
  • and training a model of the target neural architecture using the second image training set, and using the trained model to classify the images to be classified.
  • According to yet another aspect, a computer device is provided, including a storage medium (such as a non-volatile readable storage medium), a processor, and a computer program stored on the storage medium and executable on the processor.
  • When the processor executes the program, the following method is implemented: configuring search space information of a neural architecture based on a MobileNet network; constructing a supernet according to the search space information, and configuring a spring structure corresponding to each convolutional layer of the supernet, wherein the spring structure is used during supernet training to fix the number of channels corresponding to different operation items of the same convolutional layer to the same number of channels and output it to the next convolutional layer; training the supernet using the first image training set to determine a target neural architecture suitable for image classification; and training a model of the target neural architecture using the second image training set, and using the model that meets the training standard to classify the images to be classified.
  • A model with the optimal neural architecture that has been trained up to standard can thus be used to classify images accurately, which improves the accuracy of image classification.
  • Fig. 1 shows a schematic flowchart of a deep learning-based image classification method provided by an embodiment of the present application.
  • FIG. 2 shows a schematic flowchart of another image classification method based on deep learning provided by an embodiment of the present application.
  • Fig. 3 shows a schematic structural diagram of an image classification device based on deep learning provided by an embodiment of the present application.
  • the technical solution of this application can be applied to the fields of artificial intelligence, smart city, blockchain and/or big data technology, for example, it can specifically involve neural network technology.
  • the data involved in this application such as classification data, can be stored in a database, or can be stored in a blockchain, which is not limited in this application.
  • This embodiment provides an image classification method based on deep learning. As shown in Figure 1, the method includes the following steps.
  • the search space information can include the search space range parameters of the optimal neural architecture.
  • The search space range parameters may specifically include range values such as the number, stride and size of the convolution kernels, the number of convolutional layers, the number of neurons, whether skip connections are used, and the type of activation function. Different neural architectures can be constructed according to different search space range parameters, and the subsequent search for the optimal neural architecture suitable for image classification is carried out among these different neural architectures.
  • The solution of this embodiment follows the idea of the One-Shot method, and the search space of the neural architecture is based on the MobileNet network designed for mobile devices (a lightweight deep neural network proposed for embedded devices such as mobile phones).
  • The advantage of choosing the MobileNet network is that the model has few parameters and computes quickly, which can reduce server-side latency and increase the queries per second (QPS) of detection.
  • Because the stored MobileNet model is very small, it can easily be deployed on the mobile side (such as the client of a mobile phone or tablet computer), so that offline image detection can be performed on the mobile side. For example, if it is built into an APP, an image can be detected and intercepted before the user uploads it (illegal image interception), further reducing the pressure on the server and greatly expanding detection capacity.
  • The execution subject of this embodiment may be a deep learning-based image classification apparatus or device, which can be deployed on a client or a server, and which can improve the accuracy of image classification.
  • a directed acyclic graph containing all operation options is constructed, that is, a supernet. Subsequently, a network consisting of a single path composed of different operations can be sampled on the trained supernet, and the accuracy on the test set can be evaluated, and then the optimal neural architecture can be selected.
  • This embodiment therefore introduces a new spring block, which can easily adapt to the selection of different channel numbers while avoiding damage to the stability of the network.
  • the spring structure can be used to fix the number of channels corresponding to different operation items of the same convolutional layer to the same number of channels and output to the next convolutional layer during supernet training.
  • the first image training set is created in advance to train the supernet to find the optimal neural architecture, that is, the target neural architecture, and then find a deep learning model structure suitable for image classification.
  • The image training set contains different image features (such as image content features like the patterns, colors and line shapes in an image) as well as the image labels corresponding to these image features (such as girls, small fresh, cars, animals, animation, advertisements, etc.).
  • the second image training set may contain more sample features and label data corresponding to the sample features than the first image training set.
  • The first image training set can be partially obtained from the second image training set.
  • The purpose of the first image training set is to allow the supernet to find the optimal neural architecture model suitable for image classification, while the second image training set is used to train the optimal neural architecture model until its accuracy is greater than a certain threshold.
  • the classification model is used to classify images to be classified to determine the classification results corresponding to the images to be classified, such as girls, fresh, cars, animals, animations, advertisements and other classification results.
  • The model with the optimal neural architecture may be a MobileNet model, such as a MobileNetV2 or MobileNetV3 model. After the MobileNet model with the optimal neural architecture has been trained and its test results reach the standard, it can be used as the classification model for image classification.
  • the trained MobileNetV3 model can be deployed on the smartphone side.
  • In this scenario, the picture features of a user picture are first extracted locally and then input into the MobileNetV3 model, which finds the most similar sample features and outputs a classification result according to the corresponding image label; the client of the smartphone then determines, according to the classification result, whether to upload the user picture to the server. If the classification result is "girl", "small fresh", "cartoon", "advertising", etc., the user's upload request can be rejected locally and the user prompted to upload another, legal picture. In this way, the pressure on the server to identify and classify pictures uploaded by users is reduced, and illegal pictures are intercepted locally at the first opportunity.
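  • For illustration only, the following is a minimal client-side sketch of this gating flow in Python/PyTorch; the label set, the preprocessing pipeline and the should_upload helper are assumptions made for the example, not details specified by this application.

```python
import torch
from torchvision import transforms
from PIL import Image

# Illustrative label set; the actual labels come from the trained classification model.
LABELS = ["girl", "small fresh", "car", "animal", "cartoon", "advertising"]
BLOCKED_LABELS = {"girl", "small fresh", "cartoon", "advertising"}

# Typical ImageNet-style preprocessing (an assumption, not specified in the text).
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def should_upload(image_path: str, model: torch.nn.Module) -> bool:
    """Classify a user picture locally and decide whether it may be uploaded."""
    model.eval()
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)[0]
    label = LABELS[int(probs.argmax())]
    return label not in BLOCKED_LABELS
```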
  • Through the above technical solution, this embodiment first configures the search space information of the neural architecture based on the MobileNet network, then constructs the supernet according to the search space information and configures a spring structure corresponding to each convolutional layer of the supernet. During supernet training, the spring structure fixes the number of channels corresponding to different operation items of the same convolutional layer to the same number of channels and outputs it to the next convolutional layer. This ensures that the number of input channels of the next convolutional layer is always fixed, so that the output of the previous convolutional layer and the input of the following convolutional layer in the convolutional network are consistent in the number of channels, and avoids the situation in which differences in the number of output channels of the previous convolutional layer make the number of input channels of the following convolutional layer inconsistent and the supernet impossible to train.
  • A supernet trained in this way can then be used to accurately determine the optimal neural architecture suitable for image classification, so that a model with the optimal neural architecture that meets the training standard can be used to classify images accurately, which improves the accuracy of image classification.
  • To refine the embodiment above, another image classification method based on deep learning is provided. As shown in Figure 2, the method includes the following steps.
  • Set the number of convolutional layers in the MobileNet network and, according to the set MobileNet network, define the dimension information of the neural architecture search space and the size information of the search space.
  • the dimension information includes at least the size of the convolution kernel, expansion coefficient, and number of channels of each convolution layer.
  • Specifically, the search space is based on the MobileNetV2 network designed for mobile devices, with a total of 19 layers, and the optional operation items of each convolutional layer can be defined as inverted residual blocks.
  • The dimensions of the search space include the convolution kernel size k: 3×3, 5×5 or 7×7; the expansion coefficient t: 3 or 6; and the number of channels c (three options for each convolutional layer).
  • Each convolutional layer thus has 3 × 6 = 18 optional operation items (3 kernel sizes × 2 expansion coefficients × 3 channel options), and the size of the search space over the 19 layers is (3 × 6)^19 = 18^19.
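  • For illustration, the following is a minimal Python sketch of how this search space could be enumerated and its size computed; the variable names are assumptions made for the example, not taken from this application.

```python
from itertools import product

KERNEL_SIZES = [3, 5, 7]          # convolution kernel size k
EXPANSION_FACTORS = [3, 6]        # expansion coefficient t of the inverted residual block
CHANNEL_CHOICES_PER_LAYER = 3     # three candidate channel numbers c per layer
NUM_LAYERS = 19                   # depth of the MobileNetV2-based supernet

# Candidate operation items for one convolutional layer: every combination of
# kernel size, expansion coefficient and channel choice.
ops_per_layer = list(product(KERNEL_SIZES, EXPANSION_FACTORS,
                             range(CHANNEL_CHOICES_PER_LAYER)))
assert len(ops_per_layer) == 18   # 3 x 2 x 3 = 18 options per layer

# Total number of single-path architectures contained in the supernet.
search_space_size = len(ops_per_layer) ** NUM_LAYERS   # 18 ** 19
print(f"{len(ops_per_layer)} options per layer, about {search_space_size:.2e} architectures")
```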
  • In step 202, part of the structure of the constructed supernet is shown in Table 1.
  • The spring structure may be obtained by transforming each convolutional layer of the supernet on the basis of the inverted residual structure. The intermediate depth-wise convolutional layer of the spring structure is used for deep feature extraction, and it is preceded and followed by 1×1 convolutional layers: the first 1×1 convolutional layer, before the depth-wise convolution, is used to expand the diversity of the input features, and the last 1×1 convolutional layer, after the depth-wise convolution, is used to restore the extracted deep features to a fixed number of channels and output them to the next convolutional layer.
  • The fixed number of channels is the maximum number of channels selectable for that convolutional layer structure.
  • In fact, the last linear 1×1 convolutional layer in this structure can transform the number of output channels to any number of channels.
  • In this way, the numbers of channels corresponding to different operation items of the same layer are fixed to the same number of channels when they pass through the last 1×1 convolutional layer and are output, which ensures that the number of input channels of the next layer's inverted residual structure is always fixed (that is, that the output of the previous layer and the input of the next layer in the convolutional neural network are consistent in the number of channels), and avoids the situation in which differences in the number of channels of the previous layer make the number of input channels inconsistent and training impossible.
  • In this embodiment, the fixed number of channels after the transformation adopts the maximum number of channels selectable in that layer structure.
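  • For illustration only, the following is a minimal PyTorch sketch of such a spring block. It is a sketch under assumptions, not the definitive implementation: in particular, the text does not fully specify how the candidate channel number is wired inside the block, so here it is assumed to set the width of the intermediate depth-wise stage, while the final linear 1×1 convolution always projects to the maximum selectable output channel number.

```python
import torch
import torch.nn as nn

class SpringBlock(nn.Module):
    """Sketch of the spring-structured inverted residual block described above."""

    def __init__(self, in_channels, max_out_channels, channels,
                 kernel_size=3, expansion=6, stride=1):
        super().__init__()
        hidden = channels * expansion          # assumed: candidate channel number sets the hidden width
        self.block = nn.Sequential(
            # first 1x1 conv: expand the input features to increase their diversity
            nn.Conv2d(in_channels, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # intermediate depth-wise conv: deep feature extraction
            nn.Conv2d(hidden, hidden, kernel_size, stride=stride,
                      padding=kernel_size // 2, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # last linear 1x1 conv: restore to the fixed (maximum) channel number
            # expected by the next convolutional layer
            nn.Conv2d(hidden, max_out_channels, 1, bias=False),
            nn.BatchNorm2d(max_out_channels),
        )
        self.use_residual = stride == 1 and in_channels == max_out_channels

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_residual else out
```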
  • the training process of the supernet is divided into multiple sub-training processes according to a preset time interval.
  • Optionally, the first image training set may be stored in a blockchain; accordingly, using the first image training set to train the supernet specifically includes: obtaining the first image training set from the blockchain and training the supernet.
  • the first image training set data can be obtained from the target node of the blockchain, and then the supernet can be trained.
  • the blockchain referred to in this embodiment is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • Blockchain is essentially a decentralized database: a chain of data blocks linked to one another using cryptographic methods, where each data block contains a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • this scheme divides the supernet training into two stages.
  • the first stage maintains normal training, and each time a path of the supernet is randomly sampled to update the weight; the second stage gradually shrinks the search space based on the model trained in the first stage.
  • Specifically, each time a sub-training process is executed, the process shown in step 205 can be performed: based on the supernet obtained from the previous sub-training process, a path of the supernet is randomly sampled for weight update, and training continues on the supernet after the path weights have been updated, so as to shrink the search space corresponding to the supernet.
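  • For illustration only, the following is a minimal Python/PyTorch sketch of this single-path training step; the supernet(images, path) interface, the helper names and the data-iteration details are assumptions made for the example, not details specified by this application.

```python
import random
import torch

def train_supernet_single_path(supernet, loader, optimizer, criterion,
                               candidate_ops, steps):
    """One sub-training process: at every step, sample one path (one operation
    item per convolutional layer) uniformly at random and update only the
    weights along that path by back-propagation."""
    supernet.train()
    data_iter = iter(loader)
    for _ in range(steps):
        # uniformly sample a single path through the supernet
        path = [random.choice(layer_ops) for layer_ops in candidate_ops]
        try:
            images, labels = next(data_iter)
        except StopIteration:
            data_iter = iter(loader)          # restart the data loader when exhausted
            images, labels = next(data_iter)
        optimizer.zero_grad()
        logits = supernet(images, path)       # forward pass only along the sampled path
        loss = criterion(logits, labels)
        loss.backward()                       # gradients flow only through the sampled operations
        optimizer.step()
```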
  • Correspondingly, step 205 may specifically include: randomly sampling a preset number of models from the supernet trained in the previous sub-training process; testing the sampled models with an image test set, where the image test set may be determined according to the first image training set; sorting the sampled models according to test accuracy; for each convolutional layer, counting the first number of models containing each operation item among the top preset proportion of the ranking and the second number of models containing it among the bottom preset proportion; according to the difference between the first number and the second number, retaining in each convolutional layer the first preset number of operation items whose difference is greater than 0 and deleting the remaining, unretained operation items; executing the current sub-training process after the unretained operation items in each convolutional layer have been deleted; and, when the number of operation items remaining in each convolutional layer after supernet training is less than or equal to a predetermined number threshold, determining that the trained supernet has finished shrinking its search space.
  • In a specific example, the first step is to reduce the optional operation items of each layer from 18 to 9 while training the supernet.
  • The method adopted is to randomly sample 18 × 200 models from the supernet currently being trained, test their accuracy on the test set (which can be obtained, for example, from the first image training set), and sort the models by accuracy. For each operation item in each layer, the number of models containing it in the top third of the ranking and the number in the bottom third are counted; the operation items are ranked according to the difference between these two numbers, the first 9 operation items whose difference is greater than zero are retained, and training then continues for a period of time on the remaining search space.
  • The same method is then adopted to further reduce the optional operation items to 5 and then to 3.
  • The final search space is much smaller than the initial one, having shrunk to an appropriate size. This greatly alleviates the coupling and averaging effects caused by weight sharing between models, making it easier to distinguish performance differences between models and to maintain the correlation of their rankings, and it also improves training efficiency.
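  • For illustration only, the following is a minimal Python sketch of this shrinking step; the evaluate(path) helper (which is assumed to return the test accuracy of a single-path model whose weights are inherited from the supernet), the data structures and the default arguments are assumptions made for the example, not details specified by this application.

```python
import random
from collections import Counter

def shrink_search_space(candidate_ops, evaluate, num_samples=18 * 200, keep_per_layer=9):
    """candidate_ops: list (one entry per layer) of the operation items still selectable.
    Returns a reduced candidate list, keeping per layer the operation items that appear
    more often in the top third of the accuracy ranking than in the bottom third."""
    num_layers = len(candidate_ops)
    # 1. randomly sample single-path models from the current search space
    paths = [[random.choice(candidate_ops[layer]) for layer in range(num_layers)]
             for _ in range(num_samples)]
    # 2. rank the sampled models by their accuracy on the image test set
    ranked = sorted(paths, key=evaluate, reverse=True)
    third = len(ranked) // 3
    top, bottom = ranked[:third], ranked[-third:]

    new_candidates = []
    for layer in range(num_layers):
        top_counts = Counter(p[layer] for p in top)        # "first number"
        bottom_counts = Counter(p[layer] for p in bottom)  # "second number"
        # score each operation item by the difference between the two counts
        scores = {op: top_counts[op] - bottom_counts[op] for op in candidate_ops[layer]}
        kept = [op for op, s in sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
                if s > 0][:keep_per_layer]
        new_candidates.append(kept or candidate_ops[layer])  # never leave a layer empty
    return new_candidates
```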
  • The optimal structure (the optimal neural architecture for image classification) is then searched for in the supernet whose search space was shrunk in step 205; because the model weights in the search phase are all inherited from the supernet and do not need to be retrained, the search time is greatly reduced.
  • Correspondingly, step 207 may specifically include: obtaining the second image training set from the blockchain and training the model of the target neural architecture.
  • Step 207 may also specifically include: when the model of the target neural architecture is trained independently from scratch, adjusting all spring structures that output the maximum number of channels back to the original channel numbers of the operation items currently selected for each convolutional layer, so as to restore the standard inverted residual structure; and then using the second image training set to train the model of the target neural architecture restored to the standard inverted residual structure.
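  • For illustration only, the following is a minimal PyTorch sketch of rebuilding the searched architecture with standard inverted residual blocks and the originally selected channel numbers before training it from scratch; the block and helper names are assumptions made for the example, not details specified by this application.

```python
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Standard MobileNetV2-style inverted residual block (sketch)."""

    def __init__(self, in_channels, out_channels, kernel_size, expansion, stride=1):
        super().__init__()
        hidden = in_channels * expansion
        self.block = nn.Sequential(
            nn.Conv2d(in_channels, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, kernel_size, stride=stride,
                      padding=kernel_size // 2, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # project to the originally selected channel number, not the maximum
            nn.Conv2d(hidden, out_channels, 1, bias=False),
            nn.BatchNorm2d(out_channels),
        )
        self.use_residual = stride == 1 and in_channels == out_channels

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_residual else out

def build_target_network(selected_ops, in_channels):
    """selected_ops: one (kernel_size, expansion, out_channels) tuple per layer,
    as chosen by the architecture search."""
    layers, c_in = [], in_channels
    for k, t, c_out in selected_ops:
        layers.append(InvertedResidual(c_in, c_out, k, t))
        c_in = c_out   # the next layer now sees the selected channel number
    return nn.Sequential(*layers)
```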
  • With this solution, two network structures, BS-NAS-A and BS-NAS-B, were found in the new search space, achieving top-1 accuracies of 75.9% and 76.3% respectively on the publicly available large-scale ImageNet classification data set, which reaches a world-leading level among mobile-side models.
  • The method of this embodiment breaks through the limitation that, under the One-Shot framework, the number of channels of a network layer cannot be searched.
  • By introducing the new spring block (spring structure), the method can easily adapt to the selection of different channel numbers while avoiding damage to the stability of the network.
  • In addition, a new training strategy of gradually shrinking the search space is proposed: by ranking the performance of the operation items in each layer, poorly performing operations are gradually eliminated and the search space is reduced to an appropriate size. This approach effectively alleviates the averaging effect between good and bad models caused by indiscriminate weight sharing and maintains the ranking correlation between good and bad models, which is more conducive to finding the best model.
  • this embodiment provides an image classification device based on deep learning.
  • As shown in Fig. 3, the device includes a configuration module 31, a construction module 32, a training module 33 and a classification module 34.
  • the configuration module 31 is used to configure the search space information of the neural architecture based on the MobileNet network.
  • The construction module 32 is configured to construct a supernet according to the search space information and to configure a spring structure corresponding to each convolutional layer of the supernet, wherein the spring structure is used during supernet training to fix the number of channels corresponding to different operation items of the same convolutional layer to the same number of channels and output it to the next convolutional layer.
  • the training module 33 is configured to train the supernet using the first image training set to determine a target neural architecture suitable for image classification.
  • the training module 33 is also used to train the model of the target neural architecture by using the second image training set.
  • the classification module 34 is configured to use the trained model to classify images to be classified.
  • Optionally, the spring structure is obtained by transforming each convolutional layer of the supernet on the basis of the inverted residual structure. The intermediate depth-wise convolutional layer of the spring structure is used for deep feature extraction, and there are 1×1 convolutional layers before and after it: the first 1×1 convolutional layer is used to expand the diversity of the input features, and the last 1×1 convolutional layer is used to restore the extracted deep features to a fixed number of channels and output them to the next convolutional layer.
  • The fixed number of channels is the maximum number of channels selectable for that convolutional layer structure.
  • The training module 33 is specifically configured, when the model of the target neural architecture is trained independently from scratch, to adjust all spring structures that output the maximum number of channels back to the original channel numbers of the operation items currently selected for each convolutional layer, so as to restore the standard inverted residual structure, and to use the second image training set to train the model of the target neural architecture restored to the standard inverted residual structure.
  • The training module 33 is specifically configured to divide the training process of the supernet into multiple sub-training processes according to a preset time interval; each time a sub-training process is executed, based on the supernet obtained from the previous sub-training process, to randomly sample a path of the supernet for weight update and to continue training on the supernet after the path weights have been updated, so as to shrink the search space corresponding to the supernet; and to search for the target neural architecture in the supernet after the search space has been shrunk.
  • The training module 33 is specifically further configured to randomly sample a preset number of models from the supernet trained in the previous sub-training process; to test the sampled models with an image test set, where the image test set is determined according to the first image training set; to sort the sampled models according to test accuracy; for each convolutional layer, to count the first number of models containing each operation item among the top preset proportion of the ranking and the second number of models containing it among the bottom preset proportion; according to the difference between the first number and the second number, to retain in each convolutional layer the first preset number of operation items whose difference is greater than 0 and to delete the remaining, unretained operation items; to execute the current sub-training process after the unretained operation items in each convolutional layer have been deleted; and, when the number of operation items remaining in each convolutional layer after supernet training is less than or equal to a predetermined number threshold, to determine that the trained supernet has finished shrinking its search space.
  • The configuration module 31 is specifically configured to set the number of convolutional layers in the MobileNet network and, according to the set MobileNet network, to define the dimension information of the search space and the size information of the search space, where the dimension information includes at least the convolution kernel size, expansion coefficient and number of channels of each convolutional layer.
  • the first image training set and the second image training set are stored in a blockchain.
  • the training module 33 is specifically further configured to obtain the first image training set from the blockchain, and train the supernet.
  • the training module 33 is specifically further configured to obtain the second image training set from the blockchain, and train the model of the target neural architecture.
  • this embodiment also provides a storage medium on which a computer program is stored.
  • The storage medium involved in this application may be a readable storage medium (or computer-readable storage medium); the storage medium may be non-volatile (such as a non-volatile readable storage medium) or volatile (such as a volatile readable storage medium).
  • The technical solution of this application can be embodied in the form of a software product. The software product can be stored in a non-volatile storage medium (which can be a CD-ROM, a USB flash drive, a removable hard disk, etc.) and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods in the implementation scenarios of this application.
  • this embodiment also provides a computer device, which may specifically be a personal computer, a notebook computer, or a server.
  • The physical device includes a storage medium and a processor; the storage medium is used to store a computer program, and the processor is used to execute the computer program to implement the above-mentioned image classification method based on deep learning shown in Figure 1 and Figure 2.
  • the computer device may also include a user interface, a network interface, a camera, a radio frequency (RF) circuit, a sensor, an audio circuit, a WI-FI module, and so on.
  • the user interface may include a display screen (Display), an input unit such as a keyboard (Keyboard), etc., and the optional user interface may also include a USB interface, a card reader interface, and the like.
  • the optional network interface can include standard wired interface, wireless interface (such as Bluetooth interface, WI-FI interface), etc.
  • the computer device structure provided in this embodiment does not constitute a limitation on the physical device, and may include more or fewer components, or combine certain components, or arrange different components.
  • the storage medium may also include an operating system and a network communication module.
  • the operating system is a program that manages the hardware and software resources of the aforementioned physical devices, and supports the operation of information processing programs and other software and/or programs.
  • the network communication module is used to realize the communication between the various components in the storage medium and the communication with other hardware and software in the physical device.
  • this application can be implemented by means of software plus a necessary general hardware platform, or can be implemented by hardware.
  • By applying the technical solution of this embodiment, the search space information of the neural architecture is first configured based on the MobileNet network; the supernet is then constructed according to the search space information, and a spring structure corresponding to each convolutional layer of the supernet is configured, where the spring structure can be used during supernet training to fix the number of channels corresponding to different operation items of the same convolutional layer to the same number of channels and output it to the next convolutional layer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a deep learning-based image classification method and apparatus, and a computer device, and belongs to the technical field of artificial intelligence. The method comprises: first, configuring, on the basis of a MobileNet network, search space information of a neural architecture; then constructing a supernet according to the search space information and configuring a spring block corresponding to each convolutional layer of the supernet, the spring block being used, during supernet training, to fix the numbers of channels corresponding to different operation items of the same convolutional layer to the same number of channels and output them to the next convolutional layer; then training the supernet using a first image training set to determine a target neural architecture suitable for image classification; and finally, training a model of the target neural architecture using a second image training set and classifying images to be classified using the model trained up to standard. The present invention improves the accuracy of image classification. In addition, the present invention also relates to blockchain technology, and the model training data can be stored in a blockchain so as to ensure data privacy and security.
PCT/CN2020/122131 2020-07-31 2020-10-20 Procédé et appareil de classification d'images basés sur l'apprentissage profond et dispositif informatique WO2021151318A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010761098.8 2020-07-31
CN202010761098.8A CN111898683B (zh) 2020-07-31 2020-07-31 基于深度学习的图像分类方法、装置及计算机设备

Publications (1)

Publication Number Publication Date
WO2021151318A1 true WO2021151318A1 (fr) 2021-08-05

Family

ID=73184168

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/122131 WO2021151318A1 (fr) 2020-07-31 2020-10-20 Procédé et appareil de classification d'images basés sur l'apprentissage profond et dispositif informatique

Country Status (2)

Country Link
CN (1) CN111898683B (fr)
WO (1) WO2021151318A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114445674A (zh) * 2021-12-13 2022-05-06 上海悠络客电子科技股份有限公司 一种基于多尺度融合卷积的目标检测模型搜索方法
CN114936625A (zh) * 2022-04-24 2022-08-23 西北工业大学 一种基于神经网络架构搜索的水声通信调制方式识别方法
CN115631388A (zh) * 2022-12-21 2023-01-20 第六镜科技(成都)有限公司 图像分类方法、装置、电子设备及存储介质
WO2023055689A1 (fr) * 2021-09-29 2023-04-06 Subtle Medical, Inc. Systèmes et procédés d'amélioration autosupervisée sensible au bruit d'images à l'aide d'un apprentissage profond
CN117173446A (zh) * 2023-06-26 2023-12-05 北京百度网讯科技有限公司 图像分类与训练方法、装置、电子设备和存储介质

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734015B (zh) * 2021-01-14 2023-04-07 北京市商汤科技开发有限公司 网络生成方法及装置、电子设备和存储介质
CN113076938B (zh) * 2021-05-06 2023-07-25 广西师范大学 一种结合嵌入式硬件信息的深度学习目标检测方法
CN113780146B (zh) * 2021-09-06 2024-05-10 西安电子科技大学 基于轻量化神经架构搜索的高光谱图像分类方法及系统
CN114266769B (zh) * 2022-03-01 2022-06-21 北京鹰瞳科技发展股份有限公司 一种基于神经网络模型进行眼部疾病识别的系统及其方法
CN115170973B (zh) * 2022-09-05 2022-12-20 广州艾米生态人工智能农业有限公司 一种智能化稻田杂草识别方法、装置、设备及介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740534A (zh) * 2018-12-29 2019-05-10 北京旷视科技有限公司 图像处理方法、装置及处理设备
US20190279033A1 (en) * 2018-03-08 2019-09-12 Capital One Services, Llc Object detection using image classification models
CN110414570A (zh) * 2019-07-04 2019-11-05 北京迈格威科技有限公司 图像分类模型生成方法、装置、设备和存储介质
CN111433785A (zh) * 2017-10-19 2020-07-17 通用电气公司 用于自动化图像特征提取的深度学习架构

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111819580A (zh) * 2018-05-29 2020-10-23 谷歌有限责任公司 用于密集图像预测任务的神经架构搜索

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111433785A (zh) * 2017-10-19 2020-07-17 通用电气公司 用于自动化图像特征提取的深度学习架构
US20190279033A1 (en) * 2018-03-08 2019-09-12 Capital One Services, Llc Object detection using image classification models
CN109740534A (zh) * 2018-12-29 2019-05-10 北京旷视科技有限公司 图像处理方法、装置及处理设备
CN110414570A (zh) * 2019-07-04 2019-11-05 北京迈格威科技有限公司 图像分类模型生成方法、装置、设备和存储介质

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023055689A1 (fr) * 2021-09-29 2023-04-06 Subtle Medical, Inc. Systèmes et procédés d'amélioration autosupervisée sensible au bruit d'images à l'aide d'un apprentissage profond
CN114445674A (zh) * 2021-12-13 2022-05-06 上海悠络客电子科技股份有限公司 一种基于多尺度融合卷积的目标检测模型搜索方法
CN114936625A (zh) * 2022-04-24 2022-08-23 西北工业大学 一种基于神经网络架构搜索的水声通信调制方式识别方法
CN114936625B (zh) * 2022-04-24 2024-03-19 西北工业大学 一种基于神经网络架构搜索的水声通信调制方式识别方法
CN115631388A (zh) * 2022-12-21 2023-01-20 第六镜科技(成都)有限公司 图像分类方法、装置、电子设备及存储介质
CN117173446A (zh) * 2023-06-26 2023-12-05 北京百度网讯科技有限公司 图像分类与训练方法、装置、电子设备和存储介质

Also Published As

Publication number Publication date
CN111898683A (zh) 2020-11-06
CN111898683B (zh) 2023-07-28

Similar Documents

Publication Publication Date Title
WO2021151318A1 (fr) Procédé et appareil de classification d'images basés sur l'apprentissage profond et dispositif informatique
WO2021082743A1 (fr) Procédé et appareil de classification de vidéo et dispositif électronique
CN105917359B (zh) 移动视频搜索
CN112508094B (zh) 垃圾图片的识别方法、装置及设备
CN108197532A (zh) 人脸识别的方法、装置及计算机装置
CN110033332A (zh) 一种人脸识别方法、系统及电子设备和存储介质
CN108197285A (zh) 一种数据推荐方法以及装置
CN109145766A (zh) 模型训练方法、装置、识别方法、电子设备及存储介质
CN110276406A (zh) 表情分类方法、装置、计算机设备及存储介质
WO2020238039A1 (fr) Procédé et appareil de recherche de réseau neuronal
CN107679560A (zh) 数据传输方法、装置、移动终端及计算机可读存储介质
CN112989085B (zh) 图像处理方法、装置、计算机设备及存储介质
CN112232889A (zh) 一种用户兴趣画像扩展方法、装置、设备及存储介质
CN107590267A (zh) 基于图片的信息推送方法及装置、终端和可读存储介质
CN111400615B (zh) 一种资源推荐方法、装置、设备及存储介质
CN110059747A (zh) 一种网络流量分类方法
CN108763452A (zh) 基于大数据的游戏应用推送方法、系统及计算机存储介质
CN112132279A (zh) 卷积神经网络模型压缩方法、装置、设备及存储介质
CN105631404B (zh) 对照片进行聚类的方法及装置
CN112950640A (zh) 视频人像分割方法、装置、电子设备及存储介质
CN114281976A (zh) 一种模型训练方法、装置、电子设备及存储介质
CN112259078A (zh) 一种音频识别模型的训练和非正常音频识别的方法和装置
CN106294584B (zh) 排序模型的训练方法及装置
CN109697628B (zh) 产品数据推送方法及装置、存储介质、计算机设备
WO2021000411A1 (fr) Procédé et appareil de classification de documents basé sur un réseau neuronal, et dispositif et support de stockage

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20916735

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20916735

Country of ref document: EP

Kind code of ref document: A1