CN110674326A - Neural network structure retrieval method based on polynomial distribution learning - Google Patents

Neural network structure retrieval method based on polynomial distribution learning

Info

Publication number
CN110674326A
Authority
CN
China
Prior art keywords
sampling
nodes
neural network
sampled
cells
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910722978.1A
Other languages
Chinese (zh)
Inventor
纪荣嵘 (Rongrong Ji)
郑侠武 (Xiawu Zheng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen University filed Critical Xiamen University
Priority to CN201910722978.1A priority Critical patent/CN110674326A/en
Publication of CN110674326A publication Critical patent/CN110674326A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Library & Information Science (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A neural network structure retrieval method based on polynomial distribution learning, relating to neural architecture search. 1) A calibrated image-label pair set is provided, divided into a training sample set, a test sample set, and a validation sample set, and the possible search space of the neural network to be searched is defined; 2) possible network structures are sampled in the search space; 3) after the sampling in step 2), the sampled neural network structure is trained with the image-label pairs from step 1); 4) the number of times each operation is sampled and the accuracy of each operation on the validation set are recorded; 5) the differences in sampling counts and in accuracies between operations are calculated; 6) the probabilities defined in step 2) are updated with the differences calculated in step 5); 7) steps 3) to 6) are repeated until a fixed number of training iterations is reached. The method is suitable for relatively large datasets, and is efficient and accurate.

Description

Neural network structure retrieval method based on polynomial distribution learning
Technical Field
The invention relates to neural architecture search, in particular to a neural network structure retrieval method based on polynomial distribution learning.
Background
In recent years, with the development of artificial intelligence and deep learning, demand for customized deep learning network structures has grown exponentially. Users increasingly expect deep learning to be applied to their current tasks with customized network structures and parameters generated automatically, which has motivated the creation of neural network structure retrieval systems. Given a dataset, Neural Architecture Search (NAS) aims to find high-performance convolutional architectures in a huge search space through search algorithms. NAS has enjoyed great success in automated architecture search for various deep learning tasks, such as image classification, language modeling, and semantic segmentation. As described in [1] (T. DeVries and G. W. Taylor. Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552, 2017.), a neural architecture search method consists of three parts: the search space, the search strategy, and performance evaluation.
Conventional NAS algorithms sample a particular convolutional architecture through a search strategy and estimate its performance; that performance, in turn, serves as the objective signal for updating the search strategy. Despite significant advances, conventional neural network structure search methods are still limited by intensive computation and memory costs. For example, the reinforcement-learning-based method [2] (B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le. Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8697-8710, 2018.) needs to train and evaluate more than 20,000 neural networks on 500 GPUs over 4 days. Recent work has improved scalability by making the search differentiable: the search space is relaxed to a continuous space, so that the architecture can be optimized by gradient descent on validation-set performance. However, differentiable neural network structure search still suffers from high GPU memory consumption, which grows linearly with the size of the candidate search set.
Disclosure of Invention
The invention aims to provide a neural network structure retrieval method based on polynomial distribution learning.
The invention comprises the following steps:
1) providing a calibrated image-label pair set, dividing it into a training sample set, a test sample set, and a validation sample set, and defining the possible search space of the neural network to be searched;
2) sampling possible network structures in the search space and defining a sampling probability for each operation, the network structure being divided into networks, cells, and nodes according to scale;
3) after the sampling in step 2), training the sampled neural network structure with the image-label pairs from step 1);
4) after training, recording the number of times each operation is sampled and the accuracy of each operation on the validation set;
5) calculating the differences in sampling counts and in accuracies between operations from the counts and validation accuracies obtained in step 4);
6) updating the sampling probabilities defined in step 2) using the differences calculated in step 5);
7) repeating steps 3) to 6) until a fixed number of training iterations is reached.
In step 2), the network structure refers to the entire network topology. Different numbers of cells are stacked linearly to form different network structures, where the cells fall into two kinds: down-sampling cells and normal cells. A normal cell keeps the width, height, and depth of its input and output consistent, while a down-sampling cell halves the width and height and doubles the depth. Cells are composed of nodes, and the nodes within each cell form an ordered, acyclic, fully connected topology. Nodes are divided into input nodes, output nodes, and intermediate nodes; each node stores an intermediate feature map of the neural network, and each connection between nodes is a specific operation. Neural architecture search mainly determines which operation to select between two nodes. Between any two nodes i, j, the sampling probability of each operation is defined as

$p_{i,j}^{k} = \frac{1}{N}, \quad k = 1, 2, \dots, N,$

where N is the number of operations; that is, each operation is initially sampled uniformly.
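For illustration only, a minimal sketch of this uniform initialization and of drawing one operation for an edge might look as follows; the operation names, the function name, and the use of NumPy are assumptions of this sketch, not part of the invention:

import numpy as np

# Hypothetical candidate operation set for the edge between nodes (i, j).
OPS = ["3x3_conv", "5x5_conv", "3x3_sep_conv", "max_pool", "avg_pool", "identity"]
N = len(OPS)

# Uniform multinomial distribution: p[k] = 1/N for every operation k.
p = np.full(N, 1.0 / N)

def sample_operation(p, rng=np.random.default_rng()):
    # Draw one operation index for this edge from the multinomial distribution p.
    return int(rng.choice(len(p), p=p))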
In step 4), the recording is as follows: for the operation space between two nodes i, j, assume the space contains N possible operations. The number of times each operation has been sampled,

$\mathbf{H}_{i,j} = [H_{i,j}^{1}, H_{i,j}^{2}, \dots, H_{i,j}^{N}],$

and the accuracy of each operation on the validation set,

$\mathbf{A}_{i,j} = [A_{i,j}^{1}, A_{i,j}^{2}, \dots, A_{i,j}^{N}],$

are each N-dimensional vectors.
In step 5), the difference in sampling counts between operations, $\Delta \mathbf{H}_{i,j}$, is computed as

$\Delta H_{i,j}^{k,l} = H_{i,j}^{k} - H_{i,j}^{l}, \quad k, l = 1, 2, \dots, N,$

and the difference in accuracies, $\Delta \mathbf{A}_{i,j}$, as

$\Delta A_{i,j}^{k,l} = A_{i,j}^{k} - A_{i,j}^{l}, \quad k, l = 1, 2, \dots, N,$

where N is the number of operations.
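As a sketch of this computation, under the same assumptions as the sketch above, with the counts H and accuracies A stored as NumPy arrays, the two difference matrices can be formed by broadcasting:

import numpy as np

def pairwise_differences(H, A):
    # H[k]: number of times operation k was sampled; A[k]: its validation accuracy.
    H = np.asarray(H, dtype=float)
    A = np.asarray(A, dtype=float)
    dH = H[:, None] - H[None, :]   # dH[k, l] = H_k - H_l
    dA = A[:, None] - A[None, :]   # dA[k, l] = A_k - A_l
    return dH, dA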
In step 6), the specific update method may be: when two operations are compared, if one operation has been sampled fewer times yet achieves higher accuracy, the probability of that operation being sampled is increased; conversely, if one operation has been sampled more times yet achieves lower accuracy, its sampling probability is decreased. Expressed as a formula:

$p_{i,j}^{k} \leftarrow p_{i,j}^{k} + \alpha \sum_{l=1}^{N} \left[ \mathbb{1}\left(\Delta H_{i,j}^{k,l} < 0 \wedge \Delta A_{i,j}^{k,l} > 0\right) - \mathbb{1}\left(\Delta H_{i,j}^{k,l} > 0 \wedge \Delta A_{i,j}^{k,l} < 0\right) \right],$

where $\mathbb{1}(\cdot)$ is the indicator function, returning 1 when its input is true and 0 otherwise, and $\alpha$ is a step size.
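A minimal sketch of this update rule follows; the step size alpha, the clipping, and the renormalization that keeps p a valid probability distribution are assumptions of this sketch:

import numpy as np

def update_probabilities(p, dH, dA, alpha=0.01):
    # An operation sampled less often (dH < 0) yet more accurate (dA > 0) gains
    # probability; one sampled more often (dH > 0) yet less accurate (dA < 0) loses it.
    gain = ((dH < 0) & (dA > 0)).sum(axis=1)
    loss = ((dH > 0) & (dA < 0)).sum(axis=1)
    p = p + alpha * (gain - loss)
    p = np.clip(p, 1e-8, None)     # keep probabilities positive (assumption)
    return p / p.sum()             # renormalize to a valid distribution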
The method provided by the invention is mainly a fast neural network structure search method based on distribution learning. First, a completely new network architecture search framework is proposed. Second, for better training, a distribution-learning-based algorithm is provided, which achieves optimal training speed and accuracy. Moreover, the two methods reinforce each other.
Compared with the prior art, the method has the following outstanding advantages:
Firstly, the invention explicitly introduces the idea of distribution learning, which alleviates, to a certain extent, the difficulty of training in neural network structure retrieval.
Secondly, the invention provides a brand-new neural network search framework, enabling efficient and accurate neural network structure retrieval.
Thirdly, the invention can be applied to relatively large datasets, with optimizations for both speed and accuracy.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The invention discloses a neural network structure retrieval method based on polynomial distribution learning. The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1, the steps of the embodiment of the present invention are as follows:
Step 1: a labeled image-label pair set is given and divided into a training sample set, a test sample set, and a validation sample set, and the possible search space of the neural network to be searched is defined.
Step 2: sample possible network structures in the search space. The network structure can be divided into networks, cells, and nodes according to scale. A network refers to the entire network topology. Different numbers of cells are stacked linearly to form different network structures, where the cells fall into down-sampling cells and normal cells. A normal cell keeps the width, height, and depth of its input and output consistent, while a down-sampling cell halves the width and height and doubles the depth. Each cell is composed of nodes that form an ordered, acyclic, fully connected topology. Nodes are divided into input nodes, output nodes, and intermediate nodes; each node stores an intermediate feature map, and each connection between nodes is a specific operation. Neural architecture search mainly determines which operation to select between two nodes. We assume that between any two nodes i, j, the sampling probability of each operation is

$p_{i,j}^{k} = \frac{1}{N}, \quad k = 1, 2, \dots, N,$

where N is the number of operations; that is, each operation is initially sampled uniformly.
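To make this bookkeeping concrete, the following sketch represents one cell as a directed acyclic graph whose edges each carry their own multinomial distribution over the N candidate operations; all names are illustrative assumptions, not the patent's notation:

import numpy as np

def init_cell_distributions(num_nodes, num_ops):
    # One uniform distribution per ordered node pair (i, j) with i < j.
    return {(i, j): np.full(num_ops, 1.0 / num_ops)
            for i in range(num_nodes) for j in range(i + 1, num_nodes)}

def sample_cell(dists, rng=np.random.default_rng()):
    # Sample one concrete cell: pick an operation index for every edge.
    return {edge: int(rng.choice(len(p), p=p)) for edge, p in dists.items()}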
Step 3), after the step 2) is carried out sampling, training the sampled neural network structure by using the image label pair in the step 1);
Step 4: after the neural network has been trained in step 3, record the accuracy on the validation set. Essentially two pieces of information are recorded: the number of times each operation has been sampled,

$\mathbf{H}_{i,j} = [H_{i,j}^{1}, H_{i,j}^{2}, \dots, H_{i,j}^{N}],$

and the accuracy of each operation on the validation set,

$\mathbf{A}_{i,j} = [A_{i,j}^{1}, A_{i,j}^{2}, \dots, A_{i,j}^{N}].$

For the operation space between two nodes, it is assumed that the space contains N possible operations, so $\mathbf{H}_{i,j}$ and $\mathbf{A}_{i,j}$ are each N-dimensional vectors.
Step 5) according to the sampling times and the precision of each operation obtained in step 4), calculating the difference of the sampling times among the operations
Figure BDA0002157894020000046
And difference between precisions
Figure BDA0002157894020000048
Figure BDA0002157894020000049
Wherein N is defined in step 4) as the number of operations.
Step 6 the difference calculated in step 5) can be used to update the probability defined in step 2), and the following update is performed, when two operations are compared, one of the operations has a smaller number of times of being sampled and has higher precision, so as to raise the probability of being sampled, and conversely, when one of the operations has a larger number of times of being sampled and has lower precision, the probability of being sampled is lowered, and the formula is expressed as:
Figure BDA00021578940200000410
wherein,
Figure BDA00021578940200000411
to indicate a function, when the input is true, 1 is returned, and the rest are returned to 0.
Step 7: repeat steps 3 through 6 until a fixed number of training epochs is reached; a minimal end-to-end sketch of this loop is given below.
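The following sketch ties steps 3 through 6 into one loop, reusing pairwise_differences, update_probabilities, and sample_cell from the earlier sketches; train_and_evaluate is a placeholder for ordinary supervised training of the sampled architecture plus evaluation on the validation set, not an API from the patent, and the per-edge accuracy bookkeeping is an assumption of this sketch:

import numpy as np

def search(dists, train_and_evaluate, epochs=100, alpha=0.01):
    num_ops = len(next(iter(dists.values())))
    H = {e: np.zeros(num_ops) for e in dists}   # sampling counts per edge
    A = {e: np.zeros(num_ops) for e in dists}   # validation accuracies per edge
    for _ in range(epochs):
        arch = sample_cell(dists)               # step 3: sample a structure
        acc = train_and_evaluate(arch)          # step 3: train; step 4: evaluate
        for e, k in arch.items():               # step 4: record counts and accuracy
            H[e][k] += 1
            A[e][k] = acc
        for e in dists:                         # steps 5 and 6: update distributions
            dH, dA = pairwise_differences(H[e], A[e])
            dists[e] = update_probabilities(dists[e], dH, dA, alpha)
    return dists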
The method provided by the invention is mainly a fast neural network structure search method based on distribution learning. First, a completely new network architecture search framework is proposed. Second, for better training, a distribution-learning-based algorithm is provided, which achieves optimal training speed and accuracy. More subtly, the two algorithms reinforce each other.
The effects of the present invention are further illustrated by the following simulation experiments.
1. Simulation conditions
The invention was developed on the PyCharm platform, with PyTorch as the deep learning framework. The main language used is Python, and OpenCV is used to implement the traditional vision algorithms employed in the invention.
2. Simulation content
Simulations were performed on the CIFAR-10 and ILSVRC2012 datasets. The CIFAR-10 dataset consists of 60,000 32x32 color pictures in 10 classes, 6,000 pictures per class, of which 50,000 pictures form the training set and 10,000 the test set. The dataset is divided into 5 training batches and 1 test batch, each containing 10,000 pictures. The test batch consists of 1,000 randomly selected pictures from each class; the training batches contain the remaining 50,000 pictures in random order, so an individual training batch may hold more pictures from one class than another, but together the training batches contain exactly 5,000 pictures from each class. The ImageNet project is a large visual database for visual object recognition research: more than 14 million image URLs have been manually annotated by ImageNet to indicate the objects in the pictures, bounding boxes are provided in at least one million of the images, and ImageNet contains more than 20,000 categories. The comparison between the invention and the best existing methods is shown in Table 1, which shows that the invention achieves higher accuracy and higher speed than the other methods.
TABLE 1
The invention provides a fast neural network structure search method based on distribution learning. A new network architecture search algorithm is introduced that is applicable to a variety of large-scale datasets, because its memory and computational costs are similar to those of ordinary neural network training. Furthermore, a performance ranking hypothesis is proposed that can be incorporated into existing NAS algorithms to speed up their search. The proposed method achieves a significant improvement in search efficiency: for example, using a single GTX 1080Ti GPU, the network structure found within 4 hours achieves a test error of only 2.4% on the relevant dataset (6.0 times faster than the most advanced algorithm). This is attributed to the distribution learning used by the invention, which is completely different from previous reinforcement-learning-based and differentiable methods.

Claims (5)

1. A neural network structure retrieval method based on polynomial distribution learning is characterized by comprising the following steps:
1) providing a calibrated image-label pair set, dividing it into a training sample set, a test sample set, and a validation sample set, and defining the possible search space of the neural network to be searched;
2) sampling possible network structures in the search space and defining a sampling probability for each operation, the network structure being divided into networks, cells, and nodes according to scale;
3) after the sampling in step 2), training the sampled neural network structure with the image-label pairs from step 1);
4) after training, recording the number of times each operation is sampled and the accuracy of each operation on the validation set;
5) calculating the differences in sampling counts and in accuracies between operations from the counts and validation accuracies obtained in step 4);
6) updating the sampling probabilities defined in step 2) using the differences calculated in step 5);
7) repeating steps 3) to 6) until a fixed number of training iterations is reached.
2. The neural network structure retrieval method based on polynomial distribution learning according to claim 1, wherein in step 2), the network structure refers to the entire network topology; different numbers of cells are stacked linearly to form different network structures, where the cells fall into down-sampling cells and normal cells; a normal cell keeps the width, height, and depth of its input and output consistent, while a down-sampling cell halves the width and height and doubles the depth; cells are composed of nodes, and the nodes within each cell form an ordered, acyclic, fully connected topology; nodes are divided into input nodes, output nodes, and intermediate nodes, each node stores an intermediate feature map of the neural network, and each connection between nodes is a specific operation; the neural architecture search mainly determines which operation to select between two nodes; between any two nodes i, j, the sampling probability of each operation is defined as

$p_{i,j}^{k} = \frac{1}{N}, \quad k = 1, 2, \dots, N,$

where N is the number of operations; that is, each operation is initially sampled uniformly.
3. The neural network structure retrieval method based on polynomial distribution learning according to claim 1, wherein in step 4), the recording is as follows: for the operation space between two nodes, assume the space contains N possible operations; the number of times each operation has been sampled, $\mathbf{H}_{i,j}$, and the accuracy of each operation on the validation set, $\mathbf{A}_{i,j}$, are each N-dimensional vectors.
4. The neural network structure retrieval method based on polynomial distribution learning according to claim 1, wherein in step 5), the difference in sampling counts between operations is computed as

$\Delta H_{i,j}^{k,l} = H_{i,j}^{k} - H_{i,j}^{l}, \quad k, l = 1, 2, \dots, N,$

and the difference in accuracies as

$\Delta A_{i,j}^{k,l} = A_{i,j}^{k} - A_{i,j}^{l}, \quad k, l = 1, 2, \dots, N,$

where N is the number of operations.
5. The neural network structure retrieval method based on polynomial distribution learning according to claim 1, wherein in step 6), the specific update method is: when two operations are compared, if one operation has been sampled fewer times yet achieves higher accuracy, the probability of that operation being sampled is increased; conversely, if one operation has been sampled more times yet achieves lower accuracy, its sampling probability is decreased. Expressed as a formula:

$p_{i,j}^{k} \leftarrow p_{i,j}^{k} + \alpha \sum_{l=1}^{N} \left[ \mathbb{1}\left(\Delta H_{i,j}^{k,l} < 0 \wedge \Delta A_{i,j}^{k,l} > 0\right) - \mathbb{1}\left(\Delta H_{i,j}^{k,l} > 0 \wedge \Delta A_{i,j}^{k,l} < 0\right) \right],$

where $\mathbb{1}(\cdot)$ is the indicator function, returning 1 when its input is true and 0 otherwise, and $\alpha$ is a step size.
CN201910722978.1A 2019-08-06 2019-08-06 Neural network structure retrieval method based on polynomial distribution learning Pending CN110674326A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910722978.1A CN110674326A (en) 2019-08-06 2019-08-06 Neural network structure retrieval method based on polynomial distribution learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910722978.1A CN110674326A (en) 2019-08-06 2019-08-06 Neural network structure retrieval method based on polynomial distribution learning

Publications (1)

Publication Number Publication Date
CN110674326A true CN110674326A (en) 2020-01-10

Family

ID=69068705

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910722978.1A Pending CN110674326A (en) 2019-08-06 2019-08-06 Neural network structure retrieval method based on polynomial distribution learning

Country Status (1)

Country Link
CN (1) CN110674326A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325328A (en) * 2020-03-06 2020-06-23 上海商汤临港智能科技有限公司 Neural network generation method, data processing method and device
CN111667056A (en) * 2020-06-05 2020-09-15 北京百度网讯科技有限公司 Method and apparatus for searching model structure
CN111967569A (en) * 2020-06-29 2020-11-20 北京百度网讯科技有限公司 Neural network structure generation method and device, storage medium and electronic equipment
CN112183742A (en) * 2020-09-03 2021-01-05 南强智视(厦门)科技有限公司 Neural network hybrid quantization method based on progressive quantization and Hessian information
CN114896436A (en) * 2022-06-14 2022-08-12 厦门大学 Network structure searching method based on representation mutual information

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105373829A (en) * 2014-09-02 2016-03-02 北京大学 Full-connection neural network structure
US20190026639A1 (en) * 2017-07-21 2019-01-24 Google Llc Neural architecture search for convolutional neural networks
CN109871995A (en) * 2019-02-02 2019-06-11 浙江工业大学 The quantum optimization parameter adjustment method of distributed deep learning under Spark frame
CN109948029A (en) * 2019-01-25 2019-06-28 南京邮电大学 Based on the adaptive depth hashing image searching method of neural network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105373829A (en) * 2014-09-02 2016-03-02 北京大学 Full-connection neural network structure
US20190026639A1 (en) * 2017-07-21 2019-01-24 Google Llc Neural architecture search for convolutional neural networks
CN109948029A (en) * 2019-01-25 2019-06-28 南京邮电大学 Based on the adaptive depth hashing image searching method of neural network
CN109871995A (en) * 2019-02-02 2019-06-11 浙江工业大学 The quantum optimization parameter adjustment method of distributed deep learning under Spark frame

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XIAWU ZHENG et al.: "Multinomial Distribution Learning for Effective Neural Architecture Search", arXiv *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325328A (en) * 2020-03-06 2020-06-23 上海商汤临港智能科技有限公司 Neural network generation method, data processing method and device
CN111325328B (en) * 2020-03-06 2023-10-24 上海商汤临港智能科技有限公司 Neural network generation method, data processing method and device
CN111667056A (en) * 2020-06-05 2020-09-15 北京百度网讯科技有限公司 Method and apparatus for searching model structure
CN111667056B (en) * 2020-06-05 2023-09-26 北京百度网讯科技有限公司 Method and apparatus for searching model structures
CN111967569A (en) * 2020-06-29 2020-11-20 北京百度网讯科技有限公司 Neural network structure generation method and device, storage medium and electronic equipment
CN111967569B (en) * 2020-06-29 2024-02-13 北京百度网讯科技有限公司 Neural network structure generation method and device, storage medium and electronic equipment
CN112183742A (en) * 2020-09-03 2021-01-05 南强智视(厦门)科技有限公司 Neural network hybrid quantization method based on progressive quantization and Hessian information
CN112183742B (en) * 2020-09-03 2023-05-12 南强智视(厦门)科技有限公司 Neural network hybrid quantization method based on progressive quantization and Hessian information
CN114896436A (en) * 2022-06-14 2022-08-12 厦门大学 Network structure searching method based on representation mutual information
CN114896436B (en) * 2022-06-14 2024-04-30 厦门大学 Network structure searching method based on characterization mutual information

Similar Documents

Publication Publication Date Title
CN110866140B (en) Image feature extraction model training method, image searching method and computer equipment
CN110597735B (en) Software defect prediction method for open-source software defect feature deep learning
CN110674326A (en) Neural network structure retrieval method based on polynomial distribution learning
CN110851645B (en) Image retrieval method based on similarity maintenance under deep metric learning
CN111858954A (en) Task-oriented text-generated image network model
CN111737535B (en) Network characterization learning method based on element structure and graph neural network
CN103116766B (en) A kind of image classification method of encoding based on Increment Artificial Neural Network and subgraph
CN108446312B (en) Optical remote sensing image retrieval method based on deep convolution semantic net
CN115019123B (en) Self-distillation contrast learning method for remote sensing image scene classification
CN109063112A (en) A kind of fast image retrieval method based on multi-task learning deep semantic Hash, model and model building method
CN110598022B (en) Image retrieval system and method based on robust deep hash network
CN113032613B (en) Three-dimensional model retrieval method based on interactive attention convolution neural network
CN113255892B (en) Decoupled network structure searching method, device and readable storage medium
CN109472282B (en) Depth image hashing method based on few training samples
Sood et al. Neunets: An automated synthesis engine for neural network design
CN112307914B (en) Open domain image content identification method based on text information guidance
CN113887698A (en) Overall knowledge distillation method and system based on graph neural network
CN111079840B (en) Complete image semantic annotation method based on convolutional neural network and concept lattice
CN114972959B (en) Remote sensing image retrieval method for sample generation and in-class sequencing loss in deep learning
CN114896436B (en) Network structure searching method based on characterization mutual information
CN111507472A (en) Precision estimation parameter searching method based on importance pruning
CN116797830A (en) Image risk classification method and device based on YOLOv7
Jing et al. NASABN: A neural architecture search framework for attention-based networks
CN116011564A (en) Entity relationship completion method, system and application for power equipment
CN112905820B (en) Multi-graph retrieval method based on logic learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200110