CN110674326A - Neural network structure retrieval method based on polynomial distribution learning - Google Patents
Neural network structure retrieval method based on polynomial distribution learning
- Publication number
- CN110674326A CN110674326A CN201910722978.1A CN201910722978A CN110674326A CN 110674326 A CN110674326 A CN 110674326A CN 201910722978 A CN201910722978 A CN 201910722978A CN 110674326 A CN110674326 A CN 110674326A
- Authority
- CN
- China
- Prior art keywords
- sampling
- nodes
- neural network
- sampled
- cells
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/53—Querying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/55—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
A neural network structure retrieval method based on polynomial distribution learning, relating to neural architecture search. 1) Provide a calibrated set of image-label pairs, divide it into a training sample set, a test sample set and a verification sample set, and define the possible search space of the neural network to be searched; 2) sample possible network structures in the search space; 3) after the sampling of step 2), train the sampled neural network structure with the image-label pairs of step 1); 4) record the number of times each operation is sampled and its accuracy on the verification set; 5) calculate the differences in sampling counts and in accuracy between operations; 6) update the probabilities defined in step 2) with the differences calculated in step 5); 7) repeat steps 3) to 6) until a fixed number of training epochs is reached. The method is suitable for relatively large data sets, and is both efficient and accurate.
Description
Technical Field
The invention relates to neural architecture search, in particular to a neural network structure retrieval method based on polynomial distribution learning.
Background
In recent years, with the development of artificial intelligence and deep learning, demand for customized deep learning network structures has grown exponentially. Users increasingly expect deep learning to adapt to the task at hand and to generate customized network structures and parameters, which has motivated the emergence of neural network structure retrieval systems. Given a data set, neural architecture search (NAS) aims to find a high-performance convolutional architecture in a huge search space by means of a search algorithm. NAS has achieved great success in automated architecture search for various deep learning tasks, such as image classification, language modeling and semantic segmentation. As described in [1] (T. DeVries and G. W. Taylor. Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552, 2017.), a neural architecture search method consists of three parts: a search space, a search strategy, and performance evaluation.
A conventional NAS algorithm samples a particular convolutional architecture through its search strategy and estimates the architecture's performance; that performance in turn serves as the objective used to update the search strategy. Despite significant advances, conventional neural network structure search methods are still limited by heavy computation and memory costs. For example, the reinforcement learning based method [2] (B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le. Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8697-8710, 2018.) needs to train and evaluate more than 20,000 neural networks on 500 GPUs over 4 days. Recent work has improved scalability by making the search differentiable: the search space is relaxed into a continuous space, so that the architecture can be optimized by gradient descent on its validation-set performance. However, differentiable neural network structure search still suffers from high GPU memory consumption, which grows linearly with the size of the candidate operation set.
Disclosure of Invention
The invention aims to provide a neural network structure retrieval method based on polynomial distribution learning.
The invention comprises the following steps:
1) providing a calibrated set of image-label pairs, dividing it into a training sample set, a test sample set and a verification sample set, and defining the possible search space of the neural network to be searched;
2) sampling possible network structures in a search space, and defining sampling probability of each operation; the network structure is divided into networks, cells and nodes according to different scales;
3) after the sampling in the step 2), training the sampled neural network structure by using the image label pair in the step 1);
4) after training, recording the sampling times of each operation and the precision of each operation on a verification set;
5) calculating the difference of the sampling times and the difference of the accuracies among the operations according to the sampling times and the accuracies on the verification set of each operation obtained in the step 4);
6) updating the sampling probability defined in step 2) by using the difference calculated in step 5);
7) repeating steps 3) to 6) until a fixed number of training epochs is reached.
In step 2), the network structure refers to the entire network topology. Different numbers of cells are stacked linearly to form different network structures; the cells are mainly divided into down-sampling cells and ordinary cells. An ordinary cell keeps the width, height and depth of its output consistent with its input, whereas a down-sampling cell halves the width and height and doubles the depth. Each cell is composed of nodes, and the nodes within a cell form an ordered, acyclic, fully connected topological graph. Nodes are mainly divided into input nodes, output nodes and intermediate nodes; each node stores an intermediate feature map of the neural network, and each connection between two nodes is a specific operation. The neural network search mainly determines which operation should be chosen between each pair of nodes. Between any two nodes i, j, the sampling probability of the N candidate operations is initialized as
p_{i,j} = (p^1_{i,j}, ..., p^N_{i,j}), with p^k_{i,j} = 1/N for k = 1, ..., N,
where N is the number of operations; that is, each operation is initially sampled uniformly.
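As a concrete illustration of this search-space definition, the cell topology and the uniform initialization can be sketched as follows. This is a minimal sketch, not the patented implementation; the operation names in `OPS` and the 4-node cell size are assumptions made for the example.

```python
import random

# Hypothetical candidate operations for each node pair (names are
# illustrative only; the patent does not fix a specific operation set).
OPS = ["identity", "conv_3x3", "conv_5x5", "max_pool_3x3"]

def init_cell(num_nodes=4):
    """For every ordered node pair (i, j) with i < j, initialize a
    multinomial over the N candidate operations with p_k = 1/N."""
    n = len(OPS)
    return {(i, j): [1.0 / n] * n
            for i in range(num_nodes) for j in range(i + 1, num_nodes)}

def sample_architecture(cell, rng=random):
    """Step 2): draw one operation per edge from that edge's multinomial."""
    return {edge: rng.choices(range(len(OPS)), weights=p)[0]
            for edge, p in cell.items()}

cell = init_cell()
arch = sample_architecture(cell)
```

With 4 nodes the cell has 6 ordered edges, and every edge starts from the same uniform distribution, so the earliest sampled architectures explore the space without bias.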
In step 4), the recording is done per operation space: for the operation space between two nodes, assume it contains N possible operations, where N is the number of operations. The number of times each operation has been sampled, H^e = (e_1, ..., e_N), and the accuracy of each operation on the verification set, H^a = (a_1, ..., a_N), are each an N-dimensional vector.
In step 6), the specific updating method may be: when two operations are compared, if one has been sampled fewer times yet attains higher accuracy, the probability of its being sampled is raised; conversely, if it has been sampled more times yet attains lower accuracy, the probability of its being sampled is lowered. In formula form, for operation m:
p_m ← p_m + λ · Σ_{n=1..N} [ 1(e_m < e_n) · 1(a_m > a_n) − 1(e_m > e_n) · 1(a_m < a_n) ],
where λ is a step size, e and a are the sampling counts and verification accuracies recorded in step 4), and 1(·) is the indicator function, which returns 1 when its input is true and 0 otherwise.
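One way to realize this update in code is sketched below. The step size `lam` and the probability floor are assumptions of the sketch (the patent does not state concrete values); the pairwise indicator logic follows the rule just described.

```python
def update_probs(probs, counts, accs, lam=0.01, floor=1e-6):
    """Update the multinomial over N operations on one edge: raise p_m
    when operation m is under-sampled yet more accurate than another
    operation, and lower it when over-sampled yet less accurate."""
    n = len(probs)
    new = list(probs)
    for m in range(n):
        # Pairwise indicator comparisons against every other operation.
        gain = sum(counts[m] < counts[k] and accs[m] > accs[k] for k in range(n))
        loss = sum(counts[m] > counts[k] and accs[m] < accs[k] for k in range(n))
        new[m] = max(probs[m] + lam * (gain - loss), floor)
    total = sum(new)                      # renormalize to a distribution
    return [p / total for p in new]

# Example: operation 1 was sampled least but scored best, so it gains mass.
out = update_probs([0.25] * 4, counts=[5, 1, 3, 3], accs=[0.6, 0.9, 0.5, 0.7])
```

In the example, the under-sampled, most accurate operation 1 ends with the largest probability, while the over-sampled, mediocre operation 0 ends with the smallest.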
The method provided by the invention is, in essence, a fast neural network structure search method based on distribution learning. First, a completely new network search framework is proposed. Second, for better training, a distribution-learning-based algorithm is provided, which achieves excellent training speed and accuracy. Notably, the two components reinforce each other.
Compared with the prior art, the method has the following outstanding advantages:
First, the invention explicitly introduces the idea of distribution learning, which to a certain extent alleviates the difficulty of training in neural network structure retrieval.
Second, the invention provides a brand-new neural network search framework, enabling efficient and accurate neural network structure retrieval.
Third, the invention can be applied to relatively large data sets, with both speed and accuracy optimized.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The invention discloses a neural network structure retrieval method based on polynomial distribution learning. The invention is further described below with reference to the accompanying drawing.
Referring to FIG. 1, the steps of the embodiment of the present invention are as follows:
Step 1) Provide a calibrated set of image-label pairs, divide it into a training sample set, a test sample set and a verification sample set, and define the possible search space of the neural network to be searched.
Step 2) Sample possible network structures in the search space; between any two nodes i, j, the sampling probability of the N candidate operations is initialized as p_{i,j} = (1/N, ..., 1/N), where N is the number of operations, that is, each operation is initially sampled uniformly.
Step 3) After the sampling of step 2), train the sampled neural network structure with the image-label pairs of step 1).
Step 4) After training, record the number of times each operation has been sampled, H^e = (e_1, ..., e_N), and the accuracy of each operation on the verification set, H^a = (a_1, ..., a_N).
Step 5) From the counts and accuracies obtained in step 4), calculate the differences in sampling counts and in accuracy between operations, Δe_{m,n} = e_m − e_n and Δa_{m,n} = a_m − a_n, where N is as defined in step 4).
Step 6) Update the probability defined in step 2) with the differences calculated in step 5): when two operations are compared, if one has been sampled fewer times yet attains higher accuracy, its sampling probability is raised; conversely, if it has been sampled more times yet attains lower accuracy, its sampling probability is lowered. In formula form, for operation m: p_m ← p_m + λ · Σ_n [ 1(Δe_{m,n} < 0) · 1(Δa_{m,n} > 0) − 1(Δe_{m,n} > 0) · 1(Δa_{m,n} < 0) ], where λ is a step size and 1(·) is the indicator function, which returns 1 when its input is true and 0 otherwise.
Step 7) Repeat steps 3) to 6) until a fixed number of training epochs is reached.
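Putting steps 1) to 7) together, the whole search loop can be sketched as a toy simulation. Everything below is an illustrative stand-in: the real training of step 3) is replaced by a fixed per-operation quality plus noise, and the number of operations, epochs, step size and quality values are all assumed for the example.

```python
import random

def search_loop(qualities=(0.50, 0.62, 0.74, 0.90), epochs=60, lam=0.1, seed=0):
    """Toy version of steps 2)-7) for a single edge: `qualities` stands in
    for the verification accuracy a really trained operation would reach."""
    rng = random.Random(seed)
    n = len(qualities)
    probs = [1.0 / n] * n       # step 2): uniform multinomial
    counts = [0] * n            # step 4): times each operation was sampled
    accs = [0.0] * n            # step 4): latest verification accuracy
    for _ in range(epochs):
        op = rng.choices(range(n), weights=probs)[0]         # step 3): sample
        counts[op] += 1
        accs[op] = qualities[op] + rng.uniform(-0.05, 0.05)  # stand-in training
        for m in range(n):      # steps 5)-6): pairwise differences -> update
            delta = sum(counts[m] < counts[k] and accs[m] > accs[k] for k in range(n)) \
                  - sum(counts[m] > counts[k] and accs[m] < accs[k] for k in range(n))
            probs[m] = max(probs[m] + lam * delta / n, 1e-6)
        total = sum(probs)      # keep a valid probability distribution
        probs = [p / total for p in probs]
    return probs                # step 7): loop for a fixed number of epochs

final = search_loop()
```

Because the highest-quality operation is never simultaneously over-sampled and less accurate, its probability can only grow relative to the worst operation, so after the loop the distribution leans toward the better operations.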
The method provided by the invention is, in essence, a fast neural network structure search method based on distribution learning. First, a completely new network search framework is proposed. Second, for better training, a distribution-learning-based algorithm is provided, which achieves excellent training speed and accuracy. More subtly, the two components reinforce each other.
The effects of the present invention are further illustrated by the following simulation experiments.
1. Simulation conditions
The invention was developed on the PyCharm platform, with PyTorch as the deep learning framework. The main language used is Python, and OpenCV is used to implement the conventional vision algorithms employed by the invention.
2. Emulated content
Simulations were performed on the CIFAR-10 and ILSVRC2012 data sets. The CIFAR-10 data set consists of 60,000 32x32 color pictures in 10 classes, 6,000 pictures per class, with 50,000 pictures used as the training set and 10,000 as the test set. The data set is divided into 5 training batches and 1 test batch, each containing 10,000 pictures. The test batch consists of 1,000 pictures randomly selected from each class; the training batches contain the remaining 50,000 pictures in random order, so a given training batch may contain more pictures from one class than from another, while the training batches together contain exactly 5,000 pictures from each class, for a total of 50,000 training pictures. ImageNet is a large-scale visual database for visual object recognition research: more than 14 million image URLs have been manually annotated to indicate the objects in the pictures, a bounding box is additionally provided in at least one million images, and the database covers more than 20,000 categories. The comparison between the invention and the best existing fine-grained retrieval methods is shown in Table 1; Table 1 shows that, compared with the other methods, the invention achieves higher accuracy at higher speed.
TABLE 1
The invention provides a fast neural network structure search method based on distribution learning. A new network architecture search algorithm is introduced that is applicable to a variety of large-scale data sets, because its memory and computation costs are similar to those of ordinary neural network training. Furthermore, a performance ranking hypothesis is proposed that can be incorporated into existing NAS algorithms to speed up their search. The proposed method achieves a significant improvement in search efficiency: for example, on a single GTX 1080Ti, the network structure found within 4 hours attains a test error of only 2.4% on the relevant data set (6.0 times faster than the most advanced algorithms), which is attributed to the distribution learning of the present invention being completely different from previous reinforcement-learning-based and differentiable methods.
Claims (5)
1. A neural network structure retrieval method based on polynomial distribution learning is characterized by comprising the following steps:
1) providing a calibrated set of image-label pairs, dividing it into a training sample set, a test sample set and a verification sample set, and defining the possible search space of the neural network to be searched;
2) sampling possible network structures in a search space, and defining sampling probability of each operation; the network structure is divided into networks, cells and nodes according to different scales;
3) after the sampling in the step 2), training the sampled neural network structure by using the image label pair in the step 1);
4) after training, recording the sampling times of each operation and the precision of each operation on a verification set;
5) calculating the difference of the sampling times and the difference of the accuracies among the operations according to the sampling times and the accuracies on the verification set of each operation obtained in the step 4);
6) updating the sampling probability defined in step 2) by using the difference calculated in step 5);
7) repeating steps 3) to 6) until a fixed number of training epochs is reached.
2. The method for searching a neural network structure based on polynomial distribution learning of claim 1, wherein in step 2) the network structure refers to the entire network topology; different numbers of cells are stacked linearly to form different network structures, the cells being mainly divided into down-sampling cells and ordinary cells; an ordinary cell keeps the width, height and depth of its output consistent with its input, while a down-sampling cell halves the width and height and doubles the depth; each cell is composed of nodes, the nodes within a cell forming an ordered, acyclic, fully connected topological graph; the nodes are mainly divided into input nodes, output nodes and intermediate nodes, each node stores an intermediate feature map of the neural network, and each connection between two nodes is a specific operation; the neural network search mainly determines which operation should be chosen between each pair of nodes; between any two nodes i, j, the sampling probability of the N candidate operations is initialized as p_{i,j} = (1/N, ..., 1/N), where N is the number of operations, that is, each operation is initially sampled uniformly.
3. The method as claimed in claim 1, wherein in step 4) the recording is done per operation space: for the operation space between two nodes, assuming that it contains N possible operations, the number of times each operation has been sampled, H^e = (e_1, ..., e_N), and the accuracy of each operation on the verification set, H^a = (a_1, ..., a_N), are each an N-dimensional vector.
5. The method for searching a neural network structure based on polynomial distribution learning of claim 1, wherein in step 6) the specific updating method is: when two operations are compared, if one has been sampled fewer times yet attains higher accuracy, the probability of its being sampled is raised; conversely, if it has been sampled more times yet attains lower accuracy, the probability of its being sampled is lowered; expressed as a formula, for operation m: p_m ← p_m + λ · Σ_{n=1..N} [ 1(e_m < e_n) · 1(a_m > a_n) − 1(e_m > e_n) · 1(a_m < a_n) ], where λ is a step size and 1(·) is the indicator function, which returns 1 when its input is true and 0 otherwise.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910722978.1A CN110674326A (en) | 2019-08-06 | 2019-08-06 | Neural network structure retrieval method based on polynomial distribution learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110674326A true CN110674326A (en) | 2020-01-10 |
Family
ID=69068705
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910722978.1A Pending CN110674326A (en) | 2019-08-06 | 2019-08-06 | Neural network structure retrieval method based on polynomial distribution learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110674326A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105373829A (en) * | 2014-09-02 | 2016-03-02 | 北京大学 | Full-connection neural network structure |
US20190026639A1 (en) * | 2017-07-21 | 2019-01-24 | Google Llc | Neural architecture search for convolutional neural networks |
CN109948029A (en) * | 2019-01-25 | 2019-06-28 | 南京邮电大学 | Based on the adaptive depth hashing image searching method of neural network |
CN109871995A (en) * | 2019-02-02 | 2019-06-11 | 浙江工业大学 | The quantum optimization parameter adjustment method of distributed deep learning under Spark frame |
Non-Patent Citations (1)
Title |
---|
XIAWU ZHENG et al.: "Multinomial Distribution Learning for Effective Neural Architecture Search", arXiv *
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111325328A (en) * | 2020-03-06 | 2020-06-23 | 上海商汤临港智能科技有限公司 | Neural network generation method, data processing method and device |
CN111325328B (en) * | 2020-03-06 | 2023-10-24 | 上海商汤临港智能科技有限公司 | Neural network generation method, data processing method and device |
CN111667056A (en) * | 2020-06-05 | 2020-09-15 | 北京百度网讯科技有限公司 | Method and apparatus for searching model structure |
CN111667056B (en) * | 2020-06-05 | 2023-09-26 | 北京百度网讯科技有限公司 | Method and apparatus for searching model structures |
CN111967569A (en) * | 2020-06-29 | 2020-11-20 | 北京百度网讯科技有限公司 | Neural network structure generation method and device, storage medium and electronic equipment |
CN111967569B (en) * | 2020-06-29 | 2024-02-13 | 北京百度网讯科技有限公司 | Neural network structure generation method and device, storage medium and electronic equipment |
CN112183742A (en) * | 2020-09-03 | 2021-01-05 | 南强智视(厦门)科技有限公司 | Neural network hybrid quantization method based on progressive quantization and Hessian information |
CN112183742B (en) * | 2020-09-03 | 2023-05-12 | 南强智视(厦门)科技有限公司 | Neural network hybrid quantization method based on progressive quantization and Hessian information |
CN114896436A (en) * | 2022-06-14 | 2022-08-12 | 厦门大学 | Network structure searching method based on representation mutual information |
CN114896436B (en) * | 2022-06-14 | 2024-04-30 | 厦门大学 | Network structure searching method based on characterization mutual information |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110866140B (en) | Image feature extraction model training method, image searching method and computer equipment | |
CN110597735B (en) | Software defect prediction method for open-source software defect feature deep learning | |
CN110674326A (en) | Neural network structure retrieval method based on polynomial distribution learning | |
CN110851645B (en) | Image retrieval method based on similarity maintenance under deep metric learning | |
CN111858954A (en) | Task-oriented text-generated image network model | |
CN111737535B (en) | Network characterization learning method based on element structure and graph neural network | |
CN103116766B (en) | A kind of image classification method of encoding based on Increment Artificial Neural Network and subgraph | |
CN108446312B (en) | Optical remote sensing image retrieval method based on deep convolution semantic net | |
CN115019123B (en) | Self-distillation contrast learning method for remote sensing image scene classification | |
CN109063112A (en) | A kind of fast image retrieval method based on multi-task learning deep semantic Hash, model and model building method | |
CN110598022B (en) | Image retrieval system and method based on robust deep hash network | |
CN113032613B (en) | Three-dimensional model retrieval method based on interactive attention convolution neural network | |
CN113255892B (en) | Decoupled network structure searching method, device and readable storage medium | |
CN109472282B (en) | Depth image hashing method based on few training samples | |
Sood et al. | Neunets: An automated synthesis engine for neural network design | |
CN112307914B (en) | Open domain image content identification method based on text information guidance | |
CN113887698A (en) | Overall knowledge distillation method and system based on graph neural network | |
CN111079840B (en) | Complete image semantic annotation method based on convolutional neural network and concept lattice | |
CN114972959B (en) | Remote sensing image retrieval method for sample generation and in-class sequencing loss in deep learning | |
CN114896436B (en) | Network structure searching method based on characterization mutual information | |
CN111507472A (en) | Precision estimation parameter searching method based on importance pruning | |
CN116797830A (en) | Image risk classification method and device based on YOLOv7 | |
Jing et al. | NASABN: A neural architecture search framework for attention-based networks | |
CN116011564A (en) | Entity relationship completion method, system and application for power equipment | |
CN112905820B (en) | Multi-graph retrieval method based on logic learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20200110 |