CN111582482B - Method, apparatus, device and medium for generating network model information - Google Patents

Method, apparatus, device and medium for generating network model information

Info

Publication number
CN111582482B
CN111582482B (application CN202010393598.0A)
Authority
CN
China
Prior art keywords
network
network model
generating
time delay
evolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010393598.0A
Other languages
Chinese (zh)
Other versions
CN111582482A (en)
Inventor
夏鑫
肖学锋
王星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Douyin Vision Co Ltd
Original Assignee
Douyin Vision Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Douyin Vision Co Ltd filed Critical Douyin Vision Co Ltd
Priority to CN202010393598.0A priority Critical patent/CN111582482B/en
Publication of CN111582482A publication Critical patent/CN111582482A/en
Application granted granted Critical
Publication of CN111582482B publication Critical patent/CN111582482B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/12Computing arrangements based on biological models using genetic models
    • G06N3/126Evolutionary algorithms, e.g. genetic algorithms or genetic programming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Embodiments of the present disclosure disclose methods, apparatuses, electronic devices, and computer-readable media for generating network model information. One embodiment of the method comprises: sampling a pre-trained super network a plurality of times to obtain a first network model set; generating a second network model set based on the first network model set and an evolution algorithm, wherein a time delay constraint is added in the evolution process of the evolution algorithm so that the time delay of each network model in the second network model set falls within a preset range; and determining a pareto curve based on the time delay and accuracy of each network model in the second network model set. This embodiment makes it possible to determine network structures under multiple time delays efficiently with limited resources.

Description

Method, apparatus, device and medium for generating network model information
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular, to a method, apparatus, device, and computer readable medium for generating network model information.
Background
At present, a problem with neural architecture search is that the speed and the accuracy of the neural network cannot both be adequately taken into account during the search. A method is needed for efficiently and accurately searching out network structures under multiple time delays with limited resources.
Disclosure of Invention
This part of the disclosure is intended to introduce concepts in a simplified form that are further described below in the detailed description. It is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to limit the scope of the claimed subject matter.
Some embodiments of the present disclosure propose a method, apparatus, device and computer readable medium for generating network model information to solve the technical problems mentioned in the background section above.
In a first aspect, some embodiments of the present disclosure provide a method for generating network model information, the method comprising: sampling a pre-trained super network a plurality of times to obtain a first network model set; generating a second network model set based on the first network model set and an evolution algorithm, wherein a time delay constraint is added in the evolution process of the evolution algorithm so that the time delay of each network model in the second network model set falls within a preset range; and determining a pareto curve based on the time delay and accuracy of each network model in the second network model set.
In a second aspect, some embodiments of the present disclosure provide an apparatus for generating network model information, the apparatus comprising: a sampling unit configured to sample a pre-trained super network a plurality of times to obtain a first network model set; a generating unit configured to generate a second network model set based on the first network model set and an evolution algorithm, wherein a time delay constraint is added in the evolution process of the evolution algorithm so that the time delay of each network model in the second network model set falls within a preset range; and a determining unit configured to determine a pareto curve based on the time delay and accuracy of each network model in the second network model set.
In a third aspect, some embodiments of the present disclosure provide an electronic device comprising: one or more processors; and a storage device having one or more programs stored thereon, which when executed by the one or more processors, cause the one or more processors to implement the method as in any of the first and second aspects.
In a fourth aspect, some embodiments of the present disclosure provide a computer readable medium having a computer program stored thereon, wherein the program when executed by a processor implements a method as in any of the first and second aspects.
One of the above embodiments of the present disclosure has the following advantageous effects. First, a first network model set is obtained as the input basis of an evolution algorithm by sampling a pre-trained super network a plurality of times. Then, a second network model set is generated based on the first network model set and the evolution algorithm. Here, adding a time delay constraint in the evolution process of the evolution algorithm can eliminate network models that do not meet the conditions. Finally, a pareto curve is determined based on the time delay and accuracy of each network model in the second network model set, intuitively displaying the highest-accuracy network model corresponding to each of the different time delays. This implementation makes it possible to efficiently and accurately search out network structures under multiple time delays with limited resources.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIGS. 1-3 are schematic diagrams of an application scenario of a method for generating network model information according to some embodiments of the present disclosure;
FIG. 4 is a flow chart of some embodiments of a method for generating network model information according to the present disclosure;
FIG. 5 is a flow chart of further embodiments of a method for generating network model information according to the present disclosure;
FIG. 6 is a schematic structural diagram of some embodiments of an apparatus for generating network model information according to the present disclosure;
fig. 7 is a schematic structural diagram of an electronic device suitable for use in implementing some embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings. Embodiments of the present disclosure and features of embodiments may be combined with each other without conflict.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that references to "a," "an," and "a plurality of" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1-3 are schematic diagrams of one application scenario of a method for generating network model information according to some embodiments of the present disclosure.
As shown in fig. 1, as an example, an electronic device 101 samples a pre-trained super network 102 multiple times to obtain a first network model set 103. The first network model set 103 includes: network model 1031, network model 1032, and network model 1033. The network model 1031 is obtained by selecting operator 1 from each of the first layer network 1021, the second layer network 1022, and the third layer network 1023 of the super network 102. The network model 1032 is obtained by selecting operator 1 from the first layer network 1021, the second layer network 1022, and the third layer network 1023 of the super network 102. The network model 1033 is obtained by selecting operator 2 from the first layer network 1021, the second layer network 1022, and the third layer network 1023 of the super network 102.
As shown in fig. 2, as an example, the electronic device 101 generates a second network model set 105 according to the first network model set 103 and the evolution algorithm 104, where a delay constraint is added in the evolution process of the evolution algorithm 104 so that the delay of the network model in the second network model set 105 meets a preset range. The first set of network models 103 includes a network model 1031, a network model 1032, and a network model 1033. The second set of network models 105 includes a network model 1051, a network model 1052, and a network model 1053.
As shown in fig. 3, as an example, the electronic device 101 may determine the pareto curve 106 based on the time delay and accuracy of each network model in the second network model set 105 described above. For example, the network model 1051 may correspond to a time delay of 0.004 s with a corresponding accuracy of 96%. The network model 1052 may correspond to a time delay of 0.003 s with a corresponding accuracy of 92%. The network model 1053 may correspond to a time delay of 0.005 s with a corresponding accuracy of 98%.
It should be noted that the method of generating network model information may be performed by the electronic device 101. The electronic device 101 may be hardware or software. When the electronic device 101 is hardware, it may be implemented as a distributed cluster formed by a plurality of servers or terminal devices, or as a single server or a single terminal device. When the electronic device 101 is software, it may be implemented as a plurality of software programs or software modules, for example for providing distributed services, or as a single software program or software module. No specific limitation is imposed herein.
It should be understood that the number of electronic devices in fig. 1-3 is merely illustrative. There may be any number of electronic devices as desired for an implementation.
With continued reference to fig. 4, a flow 400 of some embodiments of a method for generating network model information according to the present disclosure is shown. The method for generating network model information comprises the following steps:
Step 401, sampling the pre-trained super network a plurality of times to obtain a first network model set.
In some embodiments, an executing body of the method for generating network model information (e.g., the electronic device shown in fig. 1) may sample a pre-trained super network a plurality of times to obtain a first network model set. The super network includes a predetermined number of layer networks (e.g., the first layer network 1021, the second layer network 1022, and the third layer network 1023 shown in fig. 1). Each layer of the super network includes a predetermined number of operators (e.g., the first layer network 1021 shown in fig. 1 includes operator 1, operator 2, and operator 3). The operators may include, but are not limited to, at least one of: IBConv-K3-E3, IBConv-K3-E6, IBConv-K5-E3, IBConv-K5-E6, IBConv-K7-E3, IBConv-K7-E6. Here, IBConv-KX-EY may denote a specific operator with expansion ratio Y and kernel size X, and IBConv may be the inverted bottleneck block of MobileNetV2. As an example, the pre-trained super network may be sampled multiple times in various ways to obtain the first network model set described above.
In some alternative implementations of some embodiments, the step of sampling the pre-trained super-network multiple times to obtain the first set of network models may be as follows:
selecting a preset number of paths from the path set corresponding to the pre-trained super network, and taking the networks corresponding to the preset number of paths as the first network model set. A path can be obtained by the following steps:
First, select one operator from each layer network of the super network.
Second, combine the operators selected for each layer network to obtain a single path.
On this basis, different paths are formed because each layer network can select different operators; these different paths are then collected to obtain the path set. A minimal sketch of this sampling is given below.
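As an illustrative sketch of the path sampling described above (the operator names follow the IBConv-KX-EY convention and the layer count mirrors the three-layer example in Fig. 1; the function names are assumptions for illustration, not part of this disclosure), a single path may be formed by selecting one operator per layer network:

import random

# Candidate operators per layer, using the IBConv-KX-EY naming described above.
OPERATORS = [
    "IBConv-K3-E3", "IBConv-K3-E6",
    "IBConv-K5-E3", "IBConv-K5-E6",
    "IBConv-K7-E3", "IBConv-K7-E6",
]
NUM_LAYERS = 3  # illustrative layer count, mirroring the example in Fig. 1


def sample_path(num_layers=NUM_LAYERS):
    """Select one operator from each layer network, yielding a single path."""
    return [random.choice(OPERATORS) for _ in range(num_layers)]


def sample_first_model_set(num_models):
    """Sample the super network a number of times to build the first network model set."""
    return [sample_path() for _ in range(num_models)]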
It should be noted that, in the training process of the above super network, the parameter weights of the super network may be optimized by the following formula:
W_S = argmin_W Loss_train(N(S, W))
where S represents the search space, namely the set of operator types included in each layer network of the above super network; W represents the parameter weights; N(S, W) represents the super network with search space S and parameter weights W; Loss_train(N(S, W)) represents the loss function on the training set of the super network with search space S and parameter weights W; argmin_W denotes the weight values at which this loss function reaches its minimum; and W_S is the resulting parameter weight value.
Step 402, generating a second network model set based on the first network model set and the evolution algorithm.
In some embodiments, the executing entity may generate the second network model set based on the first network model set and an evolution algorithm. A time delay constraint is added in the evolution process of the evolution algorithm so that the time delay of each network model in the second network model set falls within a preset range. The second network model set may be the network model set obtained by performing a certain number of crossovers and mutations on the input of the evolution algorithm. The evolution algorithm may be a multi-objective genetic algorithm (NSGA-II) with an added time delay constraint. As an example, the first network model set may be encoded, and the encoded result may be input into the multi-objective genetic algorithm with the added time delay constraint to generate the second network model set.
In some optional implementations of some embodiments, the delay of the network model is obtained by querying a delay prediction table. The delay prediction table can be obtained through the following steps:
(1) Run at least one network model a first number of times.
(2) Run the at least one network model a second number of times, and record the corresponding time delays of the at least one network model.
(3) Determine the average time delay of the operators included in the at least one network model based on the recorded time delays.
(4) Construct the corresponding delay prediction table based on the average time delay of each operator.
It should be noted that the at least one network model may be executed on a given mobile terminal, using a single thread and a big core of the mobile terminal. A simplified sketch of this procedure follows.
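As an example, the construction of the delay prediction table may be sketched as follows. This is a simplified illustration that times operators directly rather than whole models, and run_operator is a hypothetical stand-in for executing one operator on the target mobile terminal; neither the function names nor the constants are part of this disclosure.

import time
from collections import defaultdict


def run_operator(layer, op):
    """Hypothetical stand-in for a single-threaded, big-core on-device forward pass."""
    time.sleep(0.0001)


def build_latency_table(paths, warmup_runs=10, timed_runs=50):
    """Average per-operator time delay, measured after a warm-up phase."""
    records = defaultdict(list)
    for path in paths:
        for layer, op in enumerate(path):
            for _ in range(warmup_runs):   # first number of runs: warm-up only
                run_operator(layer, op)
            for _ in range(timed_runs):    # second number of runs: timed and recorded
                start = time.perf_counter()
                run_operator(layer, op)
                records[(layer, op)].append(time.perf_counter() - start)
    return {key: sum(vals) / len(vals) for key, vals in records.items()}


def predict_latency(path, table):
    """The time delay of a model is predicted as the sum of its per-layer operator delays."""
    return sum(table[(layer, op)] for layer, op in enumerate(path))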
In some optional implementations of some embodiments, the step of generating the second network model set based on the first network model set and the evolution algorithm may be as follows:
the first step is to encode the first network model set and to determine the encoded result as the initial parent population of the evolution algorithm. Wherein the encoding of the first set of network models aims at processing the first set of network models into a format that complies with an evolutionary algorithm input.
Step two, generating a child population based on evolution of the initial parent population;
and thirdly, determining the offspring population as a second network model set.
Optionally, evolving the initial parent population to generate the offspring population may include the following steps:
(1) Perform the following evolution steps on the initial parent population: select, from the initial parent population, networks whose time delay meets a preset condition to obtain a selection result; determine the weights of the sub-networks in the selection result based on the weights of the pre-trained super network (as an example, the weights of the sub-networks in the selection result may be taken directly from the weights of the pre-trained super network); determine the accuracy of each network in the selection result based on those weights (the accuracy of each network may be obtained by verifying it on a verification set with its weights); sort the selection result by accuracy to obtain a network sequence (as an example, the networks in the selection result may be sorted in ascending order of accuracy); evolve based on the sorted result to obtain an initial offspring population; and generate the offspring population in response to the number of evolutions being equal to a predetermined number.
(2) In response to the number of evolutions being smaller than the predetermined number, take the initial offspring population together with the sorted result as a new initial parent population, and continue to execute the evolution steps. A rough sketch of this loop is given below.
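As an example, the time-delay-constrained evolution loop described above may be sketched as follows. This is an assumption-laden simplification: it reuses OPERATORS from the earlier sketch, takes predict_latency (e.g., the earlier sketch bound to a delay prediction table) and evaluate_accuracy (validation-set accuracy with weights inherited from the super network) as given callables, and omits the non-dominated sorting and crowding-distance bookkeeping of a full NSGA-II implementation.

import random


def evolve_population(parents, predict_latency, evaluate_accuracy,
                      lat_min, lat_max, num_generations=20, mutate_prob=0.1):
    """Time-delay-constrained evolution over operator-encoded paths."""
    for _ in range(num_generations):
        # Selection: keep only networks whose time delay meets the preset range.
        selected = [p for p in parents if lat_min <= predict_latency(p) <= lat_max]
        # Rank the surviving networks in ascending order of validation accuracy.
        ranked = sorted(selected, key=evaluate_accuracy)
        # Evolution: single-point crossover between neighbours plus random mutation.
        offspring = []
        for a, b in zip(ranked, ranked[1:]):
            cut = random.randrange(1, len(a))
            child = a[:cut] + b[cut:]
            child = [random.choice(OPERATORS) if random.random() < mutate_prob else op
                     for op in child]
            offspring.append(child)
        # The offspring together with the ranked parents seed the next generation.
        parents = offspring + ranked
    return parents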
Step 403, determining a pareto curve based on the time delay and the accuracy of each network model in the second network model set.
In some embodiments, the execution body for generating the network model information may determine the pareto curve based on the time delay and the accuracy of each network model in the second set of network models.
In alternative implementations of some embodiments, the target point on the above pareto curve may be determined by the following formula:
s* = argmax_{s ∈ S} Acc_val(N(s, W_S(s)))   s.t. Lat_min ≤ Latency(s*) ≤ Lat_max
where s represents the search space (architecture) of the network corresponding to a target point on the pareto curve; W_S(s) represents the weight parameter values of that network, which may be taken from the weights of the pre-trained super network according to the architecture s; N(s, W_S(s)) represents the network corresponding to the target point with architecture s and parameter weights W_S(s); Acc_val(N(s, W_S(s))) represents the accuracy of that network on the validation set; s ∈ S indicates that the architecture of the network corresponding to the target point belongs to the search space S of the super network; argmax indicates that, among the architectures belonging to the search space of the super network, the one whose network has the highest accuracy corresponds to the target point; and the constraint s.t. Lat_min ≤ Latency(s*) ≤ Lat_max requires the time delay of the network corresponding to the target point to fall within a given interval.
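As an example, once the time delay and accuracy of each network model in the second network model set are known, the pareto curve may be extracted with a simple dominance check; the sketch below is an illustration under assumed data shapes, not a definitive implementation.

def pareto_front(models):
    """models: iterable of (time_delay, accuracy, architecture) tuples.

    A model stays on the curve if no other model is both at least as fast and
    at least as accurate, with at least one of the two strictly better.
    """
    points = list(models)
    front = []
    for lat, acc, arch in points:
        dominated = any(
            other_lat <= lat and other_acc >= acc and (other_lat, other_acc) != (lat, acc)
            for other_lat, other_acc, _ in points
        )
        if not dominated:
            front.append((lat, acc, arch))
    return sorted(front)  # ascending time delay, ready for plotting the curve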
The method provided by some embodiments of the present disclosure obtains the first network model set as the input basis of the evolution algorithm by sampling the pre-trained super network a plurality of times. A second network model set is then generated based on the first network model set and the evolution algorithm. Here, adding a time delay constraint in the evolution process of the evolution algorithm can eliminate network models that do not meet the conditions. Finally, the pareto curve is determined based on the time delay and accuracy of each network model in the second network model set, intuitively displaying the highest-accuracy network model corresponding to each of the different time delays. This implementation makes it possible to efficiently and accurately search out network structures under multiple time delays with limited resources.
With further reference to fig. 5, a flow 500 of further embodiments of a method for generating network model information is shown. The flow 500 of the method for generating network model information includes the steps of:
step 501, sampling a pre-trained super network for a plurality of times to obtain a first network model set.
Step 502, generating a second network model set based on the first network model set and the evolution algorithm.
Step 503, determining a pareto curve based on the time delay and the accuracy of each network model in the second network model set.
In some embodiments, the specific implementation of steps 501-503 and the technical effects thereof may refer to steps 401-403 in those embodiments corresponding to fig. 4, which are not described herein.
Step 504, determining a corresponding network model based on the pareto curve and the target time delay.
In some embodiments, the executing entity may determine the corresponding network model based on the pareto curve. As an example, according to the target time delay, the network with the highest accuracy at that time delay on the pareto curve may be determined, for example by a manual search.
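The manual search described above can equivalently be sketched as a small lookup over the pareto curve; this automated form is an assumption for illustration, not the method of this disclosure.

def select_for_target_delay(front, target_delay):
    """front: (time_delay, accuracy, architecture) tuples on the pareto curve."""
    feasible = [item for item in front if item[0] <= target_delay]
    if not feasible:
        raise ValueError("no network on the pareto curve meets the target time delay")
    return max(feasible, key=lambda item: item[1])  # highest accuracy within the budget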
Step 505, training the determined network model to obtain the network model after training is finished.
In some embodiments, the executing body may train the determined network model to obtain a trained network model. As an example, the initial weights of the determined network may be obtained from the weights of the pre-trained super network, and the determined network model may then be trained on the training set starting from these initial weights; this training manner greatly reduces the convergence time of the model.
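A schematic sketch of this warm-started training is given below; supernet_weights is assumed to map (layer index, operator name) to weight tensors, and build_model and train_one_epoch are framework-specific helpers left abstract here, so this is an illustration of the idea rather than the patented procedure.

def finetune_selected(architecture, supernet_weights, build_model, train_one_epoch, epochs=5):
    """Warm-start the chosen architecture from the super network, then fine-tune."""
    model = build_model(architecture)
    for layer, op in enumerate(architecture):
        # Inherit the matching slice of super-network weights, which shortens
        # convergence compared with training from scratch.
        model.layers[layer].load_weights(supernet_weights[(layer, op)])
    for _ in range(epochs):
        train_one_epoch(model)
    return model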
In an alternative implementation of some embodiments, in response to the training-finished network model including a target detection network, face recognition may be performed using the training-finished network model. The target detection network may include, but is not limited to, at least one of: the SSD (Single Shot MultiBox Detector) algorithm, the R-CNN (Region-Convolutional Neural Networks) algorithm, the Fast R-CNN (Fast Region-Convolutional Neural Networks) algorithm, the SPP-Net (Spatial Pyramid Pooling Network) algorithm, the YOLO (You Only Look Once) algorithm, the FPN (Feature Pyramid Networks) algorithm, the DCN (Deformable ConvNets) algorithm, and the RetinaNet target detection algorithm. A neural network for image segmentation may include, but is not limited to, at least one of: the FCN network (Fully Convolutional Networks), the SegNet network (Semantic Segmentation Network), the DeepLab semantic segmentation network, the PSPNet network (Pyramid Scene Parsing Network), and the Mask R-CNN network (Mask Region-CNN, image instance segmentation network).
As can be seen from fig. 5, the flow 500 of the method of generating network model information in some embodiments corresponding to fig. 5 embodies the step of selecting the corresponding highest accuracy network based on latency, as compared to the description of some embodiments corresponding to fig. 4. Therefore, the scheme described by the embodiments can select the determined network structure according to the time delay, and the efficiency of selecting the network structure is greatly improved.
With further reference to fig. 6, as an implementation of the method shown in the above figures, the present disclosure provides some embodiments of an apparatus for generating network model information, which apparatus embodiments correspond to those method embodiments shown in fig. 4, and which apparatus is particularly applicable in various electronic devices.
As shown in fig. 6, an apparatus 600 for generating network model information of some embodiments includes: a sampling unit 601, a generating unit 602, and a determining unit 603. The sampling unit 601 is configured to sample a pre-trained super network a plurality of times to obtain a first network model set. The generating unit 602 is configured to generate a second network model set based on the first network model set and an evolution algorithm, wherein a time delay constraint is added in the evolution process of the evolution algorithm so that the time delay of each network model in the second network model set falls within a preset range. The determining unit 603 is configured to determine a pareto curve based on the time delay and accuracy of each network model in the second network model set.
In an alternative implementation of some embodiments, the sampling unit 601 of the apparatus 600 for generating network model information may be further configured to: selecting a preset number of paths from the path set corresponding to the pre-trained super network, and taking a network corresponding to the preset number of paths as the first network model set.
In an alternative implementation of some embodiments, the delay of the network model is obtained by querying a delay prediction table.
In an alternative implementation of some embodiments, the generating unit 602 of the apparatus 600 for generating network model information may be further configured to: encoding the first network model set, and determining the encoded result as an initial parent population of an evolution algorithm; generating a child population based on evolving the initial parent population; and determining the child population as a second network model set.
In an alternative implementation of some embodiments, the generating unit 602 of the apparatus 600 for generating network model information may be further configured to: the following evolution steps are performed on the initial parent population: selecting a network with time delay meeting preset conditions for the initial parent population to obtain a selection result; determining the weight of each sub-network in the selected result based on the weight of the pre-trained super-network; determining the accuracy of each network in the selected result based on the weight; based on accuracy, sorting the selected results to obtain a network sequence; evolving based on the sequencing result to obtain an initial offspring population; generating a population of offspring in response to the number of evolutions being equal to a predetermined number; and in response to the evolution times being smaller than the preset number, taking the initial offspring population and the sequencing result as a new initial parent population, and continuing to execute the evolution steps.
In an alternative implementation of some embodiments, the apparatus 600 may further include: a determination unit and a training unit (not shown in the figure). Wherein the determining unit may be configured to determine the respective network model based on the pareto curve and the target delay. The training unit may be configured to train the determined network model.
In an alternative implementation of some embodiments, the apparatus 600 may further include: an identification unit (not shown in the figure). Wherein the recognition unit may be configured to, in response to the determined network model comprising the object detection network, perform face recognition using the determined network model.
It will be appreciated that the elements described in the apparatus 600 correspond to the various steps in the method described with reference to fig. 4. Thus, the operations, features and resulting benefits described above with respect to the method are equally applicable to the apparatus 600 and the units contained therein, and are not described in detail herein.
Referring now to fig. 7, a schematic diagram of an electronic device (e.g., the electronic device of fig. 1) 700 suitable for use in implementing some embodiments of the present disclosure is shown. The electronic device shown in fig. 7 is only one example and should not impose any limitations on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 7, the electronic device 700 may include a processing means (e.g., a central processor, a graphics processor, etc.) 701, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage means 708 into a Random Access Memory (RAM) 703. In the RAM703, various programs and data required for the operation of the electronic device 700 are also stored. The processing device 701, the ROM 702, and the RAM703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
In general, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication means 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 shows an electronic device 700 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead. Each block shown in fig. 7 may represent one device or a plurality of devices as needed.
In particular, according to some embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, some embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via communications device 709, or from storage 708, or from ROM 702. The above-described functions defined in the methods of some embodiments of the present disclosure are performed when the computer program is executed by the processing means 701.
It should be noted that, in some embodiments of the present disclosure, the computer readable medium may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In some embodiments of the present disclosure, however, the computer-readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some implementations, the clients, servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol ), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), the internet (e.g., the internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: sample a pre-trained super network a plurality of times to obtain a first network model set; generate a second network model set based on the first network model set and an evolution algorithm, wherein a time delay constraint is added in the evolution process of the evolution algorithm so that the time delay of each network model in the second network model set falls within a preset range; and determine a pareto curve based on the time delay and accuracy of each network model in the second network model set.
Computer program code for carrying out operations for some embodiments of the present disclosure may be written in one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in some embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The described units may also be provided in a processor, for example, described as: a processor includes a sampling unit, a generating unit, and a determining unit. Where the names of the units do not constitute a limitation on the unit itself in some cases, for example, the sampling unit may also be described as "a unit that samples a pre-trained super-network multiple times, resulting in a first set of network models".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In accordance with one or more embodiments of the present disclosure, there is provided a method for generating network model information, comprising: sampling a pre-trained super network a plurality of times to obtain a first network model set; generating a second network model set based on the first network model set and an evolution algorithm, wherein a time delay constraint is added in the evolution process of the evolution algorithm so that the time delay of each network model in the second network model set falls within a preset range; and determining a pareto curve based on the time delay and accuracy of each network model in the second network model set.
According to one or more embodiments of the present disclosure, the above method further comprises: determining a corresponding network model based on the pareto curve and the target time delay; the determined network model is trained.
According to one or more embodiments of the present disclosure, the above method further comprises: in response to the determined network model including the object detection network, face recognition may be performed using the determined network model.
According to one or more embodiments of the present disclosure, the sampling the pre-trained super network multiple times to obtain the first network model set includes: selecting a preset number of paths from the path set corresponding to the pre-trained super network, and taking a network corresponding to the preset number of paths as the first network model set.
According to one or more embodiments of the present disclosure, the delay of the network model is obtained by querying a delay prediction table.
According to one or more embodiments of the present disclosure, the generating a second network model set based on the first network model set and the evolution algorithm includes: encoding the first network model set, and determining the encoded result as an initial parent population of an evolution algorithm; generating a child population based on evolving the initial parent population; and determining the child population as a second network model set.
According to one or more embodiments of the present disclosure, the generating a child population based on evolving the initial parent population includes: the following evolution steps are performed on the initial parent population: selecting a network with time delay meeting preset conditions for the initial parent population to obtain a selection result; determining the weight of each sub-network in the selected result based on the weight of the pre-trained super-network; determining the accuracy of each network in the selected result based on the weight; based on accuracy, sorting the selected results to obtain a network sequence; evolving based on the sequencing result to obtain an initial offspring population; generating a population of offspring in response to the number of evolutions being equal to a predetermined number; and in response to the evolution times being smaller than the preset number, taking the initial offspring population and the sequencing result as a new initial parent population, and continuing to execute the evolution steps.
According to one or more embodiments of the present disclosure, there is provided an apparatus for generating network model information, comprising: a sampling unit configured to sample a pre-trained super network a plurality of times to obtain a first network model set; a generating unit configured to generate a second network model set based on the first network model set and an evolution algorithm, wherein a time delay constraint is added in the evolution process of the evolution algorithm so that the time delay of each network model in the second network model set falls within a preset range; and a determining unit configured to determine a pareto curve based on the time delay and accuracy of each network model in the second network model set.
According to one or more embodiments of the present disclosure, the sampling unit of the apparatus for generating network model information may be further configured to: selecting a preset number of paths from the path set corresponding to the pre-trained super network, and taking a network corresponding to the preset number of paths as the first network model set.
According to one or more embodiments of the present disclosure, the generating unit of the apparatus for generating network model information may be further configured to: encoding the first network model set, and determining the encoded result as an initial parent population of an evolution algorithm; generating a child population based on evolving the initial parent population; and determining the child population as a second network model set.
According to one or more embodiments of the present disclosure, the generating unit of the apparatus for generating network model information may be further configured to: the following evolution steps are performed on the initial parent population: selecting a network with time delay meeting preset conditions for the initial parent population to obtain a selection result; determining the weight of each sub-network in the selected result based on the weight of the pre-trained super-network; determining the accuracy of each network in the selected result based on the weight; based on accuracy, sorting the selected results to obtain a network sequence; evolving based on the sequencing result to obtain an initial offspring population; generating a population of offspring in response to the number of evolutions being equal to a predetermined number; and in response to the evolution times being smaller than the preset number, taking the initial offspring population and the sequencing result as a new initial parent population, and continuing to execute the evolution steps.
In accordance with one or more embodiments of the present disclosure, the apparatus may further include: a determination unit and a training unit (not shown in the figure). Wherein the determining unit may be configured to determine the respective network model based on the pareto curve and the target delay. The training unit may be configured to train the determined network model.
In accordance with one or more embodiments of the present disclosure, the apparatus may further include: the recognition unit may be configured to, in response to the determined network model comprising the object detection network, perform face recognition using the determined network model.
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combination of the above technical features, but also encompasses other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example technical solutions formed by substituting the above features with (but not limited to) features having similar functions disclosed in the embodiments of the present disclosure.

Claims (8)

1. A method for generating network model information, comprising:
performing multiple single-path random sampling on a pre-trained super network to obtain a first network model set;
generating a second network model set based on the first network model set and an evolution algorithm, wherein a time delay constraint is added in the evolution process of the evolution algorithm so that the time delay of a network model in the second network model set meets a preset range;
determining a pareto curve based on time delay and accuracy of each network model in the second network model set, wherein the accuracy is obtained by verifying each network on a verification set according to the weight of each network;
determining a corresponding network model based on the pareto curve and the target time delay;
training the determined network model to obtain a network model after training is finished;
and responding to the network model after the training is finished to comprise a target detection network, and performing face recognition by utilizing the network model after the training is finished.
2. The method of claim 1, wherein the performing multiple single-path random sampling on the pre-trained super-network to obtain the first set of network models comprises:
selecting a preset number of paths from the path set corresponding to the pre-trained super network, and taking the network corresponding to the preset number of paths as the first network model set.
3. The method of claim 1, wherein the delay of the network model is obtained by querying a delay prediction table.
4. The method of claim 1, wherein the generating a second set of network models based on the first set of network models and an evolution algorithm comprises:
encoding the first set of network models and determining the encoded result as an initial parent population for an evolutionary algorithm;
generating a child population based on evolving the initial parent population;
the offspring population is determined as a second set of network models.
5. The method of claim 4, wherein the generating a offspring population based on evolving the initial parent population comprises:
performing the following evolution steps on the initial parent population:
selecting a network with time delay meeting preset conditions for the initial parent population to obtain a selection result;
determining the weight of each sub-network in the selected result based on the weight of the pre-trained super-network;
determining the accuracy of each network in the selected result based on the weight;
based on accuracy, sorting the selected results to obtain a network sequence;
evolving based on the sequencing result to obtain an initial offspring population;
generating a population of offspring in response to the number of evolutions being equal to a predetermined number;
and in response to the evolution times being smaller than the preset number, taking the initial offspring population and the sequencing result as a new initial parent population, and continuing to execute the evolution step.
6. An apparatus for generating network model information, comprising:
the sampling unit is configured to sample the pre-trained super network a plurality of times to obtain a first network model set;
the generating unit is configured to generate a second network model set based on the first network model set and an evolution algorithm, wherein a time delay constraint is added in the evolution process of the evolution algorithm so that the time delay of a network model in the second network model set meets a preset range;
a determining unit configured to determine a pareto curve based on a time delay and an accuracy of each network model in the second network model set, wherein the accuracy is obtained by verifying each network on a verification set according to a weight of each network; determining a corresponding network model based on the pareto curve and the target time delay; training the determined network model to obtain a network model after training is finished; and responding to the network model after the training is finished to comprise a target detection network, and performing face recognition by utilizing the network model after the training is finished.
7. An electronic device, comprising:
one or more processors;
a storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-5.
8. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of any of claims 1-5.
CN202010393598.0A 2020-05-11 2020-05-11 Method, apparatus, device and medium for generating network model information Active CN111582482B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010393598.0A CN111582482B (en) 2020-05-11 2020-05-11 Method, apparatus, device and medium for generating network model information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010393598.0A CN111582482B (en) 2020-05-11 2020-05-11 Method, apparatus, device and medium for generating network model information

Publications (2)

Publication Number Publication Date
CN111582482A CN111582482A (en) 2020-08-25
CN111582482B true CN111582482B (en) 2023-12-15

Family

ID=72112254

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010393598.0A Active CN111582482B (en) 2020-05-11 2020-05-11 Method, apparatus, device and medium for generating network model information

Country Status (1)

Country Link
CN (1) CN111582482B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016208504A (en) * 2015-04-15 2016-12-08 富士通株式会社 Network parameter determination method and determination device
CN107451363A (en) * 2017-08-03 2017-12-08 上海交通大学 A kind of computational methods of multiple target equalising network continuous optimization problems
CN109858445A (en) * 2019-01-31 2019-06-07 北京字节跳动网络技术有限公司 Method and apparatus for generating model
DE102018109835A1 (en) * 2018-04-24 2019-10-24 Albert-Ludwigs-Universität Freiburg Method and device for determining a network configuration of a neural network
CN110728358A (en) * 2019-09-30 2020-01-24 上海商汤智能科技有限公司 Data processing method and device based on neural network
CN110766089A (en) * 2019-10-30 2020-02-07 北京百度网讯科技有限公司 Model structure sampling method and device of hyper network and electronic equipment
CN110796231A (en) * 2019-09-09 2020-02-14 珠海格力电器股份有限公司 Data processing method, data processing device, computer equipment and storage medium
WO2020068437A1 (en) * 2018-09-28 2020-04-02 Xilinx, Inc. Training of neural networks by including implementation cost as an objective
WO2020092810A1 (en) * 2018-10-31 2020-05-07 Movidius Ltd. Automated generation of neural networks

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9064215B2 (en) * 2012-06-14 2015-06-23 Qualcomm Incorporated Learning spike timing precision

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016208504A (en) * 2015-04-15 2016-12-08 富士通株式会社 Network parameter determination method and determination device
CN107451363A (en) * 2017-08-03 2017-12-08 上海交通大学 A kind of computational methods of multiple target equalising network continuous optimization problems
DE102018109835A1 (en) * 2018-04-24 2019-10-24 Albert-Ludwigs-Universität Freiburg Method and device for determining a network configuration of a neural network
WO2020068437A1 (en) * 2018-09-28 2020-04-02 Xilinx, Inc. Training of neural networks by including implementation cost as an objective
WO2020092810A1 (en) * 2018-10-31 2020-05-07 Movidius Ltd. Automated generation of neural networks
CN109858445A (en) * 2019-01-31 2019-06-07 北京字节跳动网络技术有限公司 Method and apparatus for generating model
CN110796231A (en) * 2019-09-09 2020-02-14 珠海格力电器股份有限公司 Data processing method, data processing device, computer equipment and storage medium
CN110728358A (en) * 2019-09-30 2020-01-24 上海商汤智能科技有限公司 Data processing method and device based on neural network
CN110766089A (en) * 2019-10-30 2020-02-07 北京百度网讯科技有限公司 Model structure sampling method and device of hyper network and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Evolutionary multi-objective generation of recurrent neural network ensembles for time series prediction; Christopher Smith et al.; Neurocomputing; pp. 302-311 *
Software reliability prediction based on an improved Elman network model; Cheng Xuchao et al.; Journal on Communications; Vol. 32, No. 4; pp. 86-93 *

Also Published As

Publication number Publication date
CN111582482A (en) 2020-08-25

Similar Documents

Publication Publication Date Title
CN111340220B (en) Method and apparatus for training predictive models
CN111368973B (en) Method and apparatus for training a super network
CN110390493B (en) Task management method and device, storage medium and electronic equipment
CN112650841A (en) Information processing method and device and electronic equipment
CN111353601A (en) Method and apparatus for predicting delay of model structure
CN114765025A (en) Method for generating and recognizing speech recognition model, device, medium and equipment
CN110956127A (en) Method, apparatus, electronic device, and medium for generating feature vector
CN116862319B (en) Power index information generation method, device, electronic equipment and medium
CN111158881B (en) Data processing method, device, electronic equipment and computer readable storage medium
CN116107666B (en) Program service flow information generation method, device, electronic equipment and computer medium
CN111582456B (en) Method, apparatus, device and medium for generating network model information
CN117241092A (en) Video processing method and device, storage medium and electronic equipment
CN116340632A (en) Object recommendation method, device, medium and electronic equipment
CN111582482B (en) Method, apparatus, device and medium for generating network model information
CN111626044B (en) Text generation method, text generation device, electronic equipment and computer readable storage medium
CN113240108B (en) Model training method and device and electronic equipment
CN112507676B (en) Method and device for generating energy report, electronic equipment and computer readable medium
CN111754984B (en) Text selection method, apparatus, device and computer readable medium
CN111680754B (en) Image classification method, device, electronic equipment and computer readable storage medium
CN116800834B (en) Virtual gift merging method, device, electronic equipment and computer readable medium
CN110633596A (en) Method and device for predicting vehicle direction angle
CN110633707A (en) Method and device for predicting speed
CN117743555B (en) Reply decision information transmission method, device, equipment and computer readable medium
CN113283115B (en) Image model generation method and device and electronic equipment
CN116541421B (en) Address query information generation method and device, electronic equipment and computer medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant after: Tiktok vision (Beijing) Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant after: Douyin Vision Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant before: Tiktok vision (Beijing) Co.,Ltd.

CB02 Change of applicant information
GR01 Patent grant
GR01 Patent grant