CN112381227B - Neural network generation method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN112381227B
CN112381227B (application number CN202011381177.2A)
Authority
CN
China
Prior art keywords: path, search, network, training, discriminator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011381177.2A
Other languages
Chinese (zh)
Other versions
CN112381227A (en)
Inventor
游山
李路军
王飞
钱晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sensetime Technology Development Co Ltd filed Critical Beijing Sensetime Technology Development Co Ltd
Priority to CN202011381177.2A priority Critical patent/CN112381227B/en
Publication of CN112381227A publication Critical patent/CN112381227A/en
Application granted granted Critical
Publication of CN112381227B publication Critical patent/CN112381227B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Abstract

The present disclosure provides a neural network generation method and apparatus, an electronic device, and a storage medium. The method comprises: determining an initial search space of a neural network structure based on a hyper-network, wherein the initial search space comprises a plurality of search paths, the hyper-network comprises a plurality of network layers, each network layer comprises at least one operator, and each search path includes one operator from each network layer of the hyper-network; screening the search paths in the initial search space with a path discriminator, training the hyper-network based on the screening result, and determining a compressed search space corresponding to the trained hyper-network, wherein the path discriminator is a trained model used for classifying the performance of the neural network structure corresponding to a search path; and determining a target search path from the compressed search space, and generating a target neural network based on the target search path.

Description

Neural network generation method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of deep learning technologies, and in particular, to a neural network generation method, a data processing method, an intelligent driving control method, corresponding apparatuses, an electronic device, and a storage medium.
Background
Deep learning models perform well on many tasks. In particular, for tasks that take images as the objects to be processed, such as target detection and image segmentation, deep learning models show good image-understanding capability and can accurately extract the mapping of real spatial information in the images, so they are widely applied in many fields. The parameters of a deep learning model play an important role in its performance, but tuning them is difficult, because the many hyper-parameters and network structure parameters combine explosively.
Automatic network structure search is currently a new and practical research problem in the deep learning field. It aims to solve the high cost and experience bias of manually designed networks and to obtain the basic network structure of a deep learning model. However, because the search space contains a large number of network structures, a network structure search algorithm usually requires long training and search time to find an ideal network structure in the search space; this is inefficient and consumes a large amount of hardware resources.
Disclosure of Invention
In view of the above, the present disclosure provides at least a neural network generation method, a data processing method, an intelligent driving control method, corresponding apparatuses, an electronic device, and a storage medium.
In a first aspect, the present disclosure provides a neural network generation method, including:
determining an initial search space of a neural network structure based on a hyper-network; the initial search space comprises a plurality of search paths; the super network comprises a plurality of network layers, each network layer comprising at least one operator; each search path includes an operator in each network layer of the hyper-network;
screening the search path in the initial search space by using a path discriminator;
training the hyper-network based on the screening result;
determining a compressed search space corresponding to the trained hyper-network; the path discriminator is a trained model used for classifying the performance of the neural network structure corresponding to the search path;
and determining a target search path from the compressed search space, and generating a target neural network based on the target search path.
In the method, the path discriminator screens the plurality of search paths in the initial search space: the trained path discriminator judges the structural label of each search path in the initial search space and screens out the search paths with poor performance, and the hyper-network is then trained based on the screening result. This prevents the training of poorly performing search paths from disturbing the parameters of well-performing ones, and improves the training efficiency of the hyper-network. Meanwhile, after the hyper-network is trained, a compressed search space corresponding to the trained hyper-network can be obtained. Because the compressed search space is obtained by screening the initial search paths, it contains fewer search paths, so determining the target search path from the compressed search space is more efficient; this in turn improves the efficiency of generating the target neural network based on the target search path and reduces the hardware resources consumed in the network structure search.
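The hyper-network and search-path structure described above can be sketched in a few lines of Python. This is an illustrative toy, not the patent's implementation: the operator names, the layer count, and the function name are all assumptions; the point is only that a path picks one operator per layer, so a hyper-network with n layers and m operators per layer spans a search space of m**n paths.

```python
from itertools import product

# Assumed toy configuration: m = 4 candidate operators, n = 3 network layers.
OPERATORS = ["conv3x3", "conv5x5", "skip", "maxpool"]
NUM_LAYERS = 3

def initial_search_space(operators, num_layers):
    """Enumerate every search path as a tuple holding one operator per layer."""
    return list(product(operators, repeat=num_layers))

space = initial_search_space(OPERATORS, NUM_LAYERS)
assert len(space) == len(OPERATORS) ** NUM_LAYERS  # m**n = 64 paths here
```

In a realistic setting the space is far too large to enumerate, which is exactly why the method samples and screens paths instead.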
In one possible embodiment, the filtering the search path in the initial search space by using a path discriminator, and performing training of the super network based on the filtering result includes:
determining, with a path discriminator, structural labels for a plurality of search paths in the initial search space; wherein the structural labels comprise a first structural label and a second structural label, the first structural label having a structural performance superior to the second structural label;
and training a search path corresponding to the first structural label in the plurality of search paths to obtain the trained hyper-network.
With this method, the path discriminator determines the structural label of each search path, so that only the search paths belonging to the first structural label are selected for training. This reduces the influence of the second-structural-label search paths on the parameter training of the first-structural-label search paths, improving both the training accuracy and the training efficiency of the hyper-network.
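A minimal sketch of this screening step, under stated assumptions: the function name, the label strings "first"/"second", and the toy discriminator are all illustrative, not the patent's models.

```python
# Screen sampled paths with the discriminator; train only the paths that
# receive the first (better-performing) structure label.
def screen_and_train(paths, discriminator, train_one_path):
    trained = []
    for path in paths:
        if discriminator(path) == "first":   # keep first-label paths only
            train_one_path(path)             # update shared hyper-network weights
            trained.append(path)
    return trained                           # the screening result

# Toy stand-ins (assumptions): paths containing "skip" get the worse label.
toy_discriminator = lambda p: "second" if "skip" in p else "first"
trained_log = []
kept = screen_and_train([("conv3x3", "conv3x3"), ("skip", "conv3x3")],
                        toy_discriminator, trained_log.append)
```

Here `train_one_path` stands in for one step of weight-sharing supernet training; only the kept path reaches it.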
In one possible embodiment, the hyper-network is trained according to the following steps:
taking the initial search space as a current search space, and selecting a plurality of search paths from the current search space;
determining the structure label of each selected search path in the plurality of search paths by using the path discriminator;
training a search path corresponding to the first structural label in the plurality of selected search paths, and generating a hyper-network after the training of the current round based on the trained search path corresponding to the first structural label and the unselected search path of the current round;
and taking the search space after the current round of compression corresponding to the hyper-network after the current round of training as the current search space, and returning to the step of selecting a plurality of search paths from the current search space until the training cutoff condition is met.
With this method, the path discriminator determines the structure label of each selected search path, and the search paths corresponding to the first structural label among the selected paths, i.e., the paths with better structural performance, are trained. The hyper-network after the current round of training is then generated from the trained first-structural-label search paths together with the search paths not selected in this round, completing multi-round compression of the initial search space and multi-round training of the hyper-network.
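The round-based loop above can be sketched as follows. All names are illustrative assumptions; the sampling strategy, round count, and training step are placeholders for whatever the real system uses.

```python
import random

# Each round: sample paths from the current space, screen with the
# discriminator, train only kept paths, then carry the compressed space
# forward as the next round's current search space.
def train_hypernetwork(initial_space, discriminator, train_paths,
                       num_rounds=3, sample_size=4, seed=0):
    rng = random.Random(seed)
    space = list(initial_space)
    for _ in range(num_rounds):                 # round count = cutoff condition
        sampled = rng.sample(space, min(sample_size, len(space)))
        kept = [p for p in sampled if discriminator(p) == "first"]
        train_paths(kept)                       # update shared weights
        dropped = {p for p in sampled if p not in kept}
        space = [p for p in space if p not in dropped]  # compressed space
    return space

# Toy run: any path containing "skip" is labeled "second" and screened out.
paths = [("conv", "conv"), ("conv", "skip"), ("skip", "conv"), ("skip", "skip")]
label = lambda p: "second" if "skip" in p else "first"
final_space = train_hypernetwork(paths, label, train_paths=lambda kept: None)
```

After the first round the three "skip" paths are gone, so the compressed space holds only the well-performing path.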
In a possible implementation, in the case where the path discriminator comprises multiple levels, the hyper-network is trained according to the following steps:
taking the i-th-level path discriminator as the current path discriminator corresponding to the current search space; wherein, in the case where the current search space is the initial search space, the i-th-level path discriminator is the 1st-level path discriminator, and i is a positive integer;
selecting a plurality of search paths from the current search space, determining a structural label of each search path in the plurality of selected search paths by using a current path discriminator, training the search path corresponding to a first structural label in the plurality of selected search paths, and generating a hyper-network after the training of the current round based on the trained search path corresponding to the first structural label and the unselected search path of the current round;
in the case where the number of rounds of training the hyper-network meets a preset condition, training the current path discriminator to generate an (i+1)-th-level path discriminator, and determining the (i+1)-th-level path discriminator as the current path discriminator;
and taking the search space after the current round of compression corresponding to the hyper-network after the current round of training as the current search space, and returning to the step of selecting a plurality of search paths from the current search space until the training cutoff condition is met.
Here, a multi-level path discriminator is provided, where the (i+1)-th-level path discriminator is obtained by training based on the i-th-level path discriminator; determining the structural label of a search path with the multi-level path discriminator improves the accuracy of the determined structural label.
In one possible embodiment, the path discriminator is trained according to the following steps:
selecting a plurality of first search path samples from a search space corresponding to the hyper-network, and evaluating the performance of a neural network structure corresponding to each first search path sample by using a verification sample;
ranking the selected first search path samples based on the performance of the neural network structure corresponding to each first search path sample;
determining a labeled structure label for each first search path sample based on the ranking result of the selected plurality of first search path samples; wherein the labeled structure label is used to characterize the structural performance of the first search path sample;
training the path discriminator based on the plurality of first search path samples with labeled structure labels.
With this method, the performance of the neural network structure corresponding to each first search path sample is evaluated using the verification samples, and the first search path samples are ranked by the evaluated performance, so that the relative performance among them is determined. Based on this relative performance, the labeled structure label of each first search path sample is determined; for example, the labeled structure label of a first search path ranked higher (i.e., with better relative performance) is determined as the first labeled structure label, and that of a first search path ranked lower (i.e., with worse relative performance) as the second labeled structure label. The path discriminator is then trained on the first search path samples with labeled structure labels.
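The labeling scheme just described can be sketched as below. This is an assumption-laden toy: the validation scores, the function name, and the 40% cut used in the example are illustrative, not the patent's numbers.

```python
# Rank paths by validation performance; label the top fraction as "first"
# (better structural performance) and the rest as "second".
def label_by_ranking(path_scores, top_fraction=0.2):
    """path_scores: dict mapping search path -> validation accuracy."""
    ranked = sorted(path_scores, key=path_scores.get, reverse=True)
    cutoff = int(len(ranked) * top_fraction)
    return {p: ("first" if i < cutoff else "second")
            for i, p in enumerate(ranked)}

# Hypothetical validation accuracies for five sampled paths.
scores = {"path_a": 0.92, "path_b": 0.88, "path_c": 0.55,
          "path_d": 0.41, "path_e": 0.10}
labels = label_by_ranking(scores, top_fraction=0.4)
```

Note that the labels depend only on relative ordering, not on absolute accuracy values, which matches the ranking-based description above.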
In a possible implementation manner, the determining, based on the ranking result of the selected multiple first search path samples, a labeled structure label of each first search path sample includes:
based on a set ranking percentage and the ranking results of the selected multiple first search path samples, determining the labeled structure label of each first search path sample whose ranking result falls within the ranking percentage as a first labeled structure label, and determining the labeled structure label of each first search path sample whose ranking result falls outside the ranking percentage as a second labeled structure label;
wherein the labeled structure labels comprise a first labeled structure label and a second labeled structure label, and the first labeled structure label has structural performance superior to the second labeled structure label.
Here, a ranking percentage may be set. For example, if the ranking percentage is 20% and there are 100 first search path samples, then, based on the ranking result of the selected samples, the labeled structure label of the first search path samples ranked in the top 20 positions is determined as the first labeled structure label, and the labeled structure label of the samples ranked in the bottom 80 positions is determined as the second labeled structure label. In this way, the labeled structure label of each first search path sample is determined from the relative performance among the first search path samples.
In one possible embodiment, training the path discriminator based on a plurality of first search path samples with labeled structure labels includes:
determining a first loss of the path discriminator based on the plurality of first search path samples with labeled structure labels, and training the path discriminator using the first loss;
wherein the first loss comprises a binary classification loss and/or a ranking loss; the ranking loss is determined based on the number of selected first search path samples, the confidence of the predicted structure label that the path discriminator determines for each first search path sample, and the set ranking percentage.
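The patent does not spell out the loss formula, so the sketch below is one plausible construction under stated assumptions: a binary cross-entropy term over the ranking-derived labels, plus a pairwise hinge ranking term that pushes the discriminator's confidence for every top-ranked sample above every bottom-ranked one. It does use the three quantities named above (sample count, predicted-label confidences, ranking percentage).

```python
import math

def first_loss(confidences, top_fraction=0.2):
    """confidences: predicted probability of the first label for each sample,
    listed in ground-truth performance order (best path first). The split
    point k comes from the set ranking percentage."""
    n = len(confidences)
    k = int(n * top_fraction)                 # samples labeled "first"
    # Binary classification loss (cross-entropy against the ranking labels).
    bce = -sum(math.log(c) for c in confidences[:k])
    bce -= sum(math.log(1.0 - c) for c in confidences[k:])
    # Ranking loss: hinge on every (top, bottom) confidence pair.
    rank = sum(max(0.0, 1.0 - (ct - cb))
               for ct in confidences[:k] for cb in confidences[k:])
    return (bce + rank) / n
```

A discriminator that is confidently correct incurs a much smaller loss than one that is confidently wrong, which is the behavior any concrete choice of first loss should share.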
In one possible implementation, after determining the labeled structure label of each first search path sample based on the ranking result of the plurality of first search path samples, the method further includes:
selecting at least one second search path sample from the search paths, other than the plurality of first search path samples, in the search space corresponding to the trained hyper-network;
determining a labeled structure label for the second search path sample based on the edit distance between the second search path sample and the first search path samples;
the training of the path discriminator based on a plurality of first search path samples with labeled structure labels then comprises: training the path discriminator based on the plurality of first search path samples with labeled structure labels and the plurality of second search path samples.
To improve the training efficiency of the path discriminator, a plurality of second search path samples may be selected from the search paths, other than the plurality of first search path samples, in the search space corresponding to the trained hyper-network. Based on the observation that models whose paths have a small edit distance between them have similar performance, the labeled structure label of a second search path sample is determined from the labeled structure labels of the first search path samples. This semi-supervised data augmentation effectively improves data utilization efficiency, and the path discriminator is then trained on the plurality of first search path samples with labeled structure labels together with the plurality of second search path samples.
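Because every search path here has the same length (one operator per layer), the edit distance reduces to the number of layers where the operators differ. The sketch below labels a second sample with the label of its nearest labeled first sample; the nearest-neighbor rule and all names are assumptions for illustration.

```python
# Edit distance between equal-length paths: count of differing layers.
def edit_distance(path_a, path_b):
    return sum(a != b for a, b in zip(path_a, path_b))

# A second sample inherits the label of the closest labeled first sample.
def label_by_nearest(second_path, labeled_paths):
    """labeled_paths: dict mapping first-sample path -> structure label."""
    nearest = min(labeled_paths, key=lambda p: edit_distance(second_path, p))
    return labeled_paths[nearest]

# Hypothetical labeled first samples.
labeled = {("conv", "conv", "conv"): "first",
           ("skip", "skip", "skip"): "second"}
```

An unlabeled path that differs from a well-performing labeled path in only one layer thus receives the first label without any validation-set evaluation, which is where the data-efficiency gain comes from.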
In one possible embodiment, determining a target search path from the compressed search space and generating a target neural network based on the target search path includes:
determining a target search path from the compressed search space by using a trained path discriminator and a set evolutionary algorithm;
and training a target network structure corresponding to the target search path to generate the target neural network.
Here, the compressed search space is the search space from which the search paths whose structure label is the second structure label have been deleted, so it contains fewer search paths than the initial search space. Therefore, when the target search path is determined from the compressed search space using the trained path discriminator and the set evolutionary algorithm, both the efficiency of determining the target search path and the efficiency of generating the target neural network are improved.
In one possible embodiment, determining a target search path from the compressed search space by using a trained path discriminator and a set evolutionary algorithm includes:
taking an initial population formed by at least one search path in the compressed search space as a current population;
determining a structure label of each search path in the current population by using a path discriminator;
generating a current population after screening based on a search path corresponding to the first structure label;
generating the population after this iteration based on the screened current population and the evolutionary algorithm;
taking the population after this iteration as the current population, and returning to the step of determining the structure label of each search path in the current population with the path discriminator, until the number of iterations equals a set count threshold;
and determining a target search path from the initial population and the population generated by multiple iterations.
Here, in each iteration, the path discriminator may be used to determine the structure tag of each search path in the current population, delete the search path with the structure tag being the second structure tag, where the search path with the structure tag being the first structure tag constitutes the current population after screening, and generate the population after the current iteration based on the current population after screening and the evolutionary algorithm, thereby improving the efficiency of each iteration.
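The iteration loop above can be sketched as follows. The evolutionary step here is a single-layer operator mutation, which is an assumption the text leaves unspecified (a real system might also use crossover or fitness-based selection); all names are illustrative.

```python
import random

# Each iteration: screen the population with the discriminator, mutate the
# kept paths to form the next population, and accumulate every generated
# population so the target path can be picked from all of them.
def evolve(initial_population, discriminator, operators, iterations=3, seed=0):
    rng = random.Random(seed)
    history = list(initial_population)        # candidate pool for the target path
    population = list(initial_population)
    for _ in range(iterations):               # iteration-count threshold
        screened = [p for p in population if discriminator(p) == "first"]
        population = screened or population   # fallback if all were screened out
        children = []
        for path in population:
            child = list(path)
            child[rng.randrange(len(child))] = rng.choice(operators)
            children.append(tuple(child))
        population = children                 # population after this iteration
        history.extend(children)
    return history

pool = evolve([("conv", "conv")], lambda p: "first", ["conv", "skip"])
```

The target search path would then be chosen from `pool`, e.g. by evaluating each candidate on a validation set.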
In one possible embodiment, before training the path discriminator, the method further comprises: pre-training the hyper-network until the hyper-network meets a preset initialization condition;
training the path arbiter, comprising: and initializing the path discriminator, and training the initialized path discriminator based on the hyper-network after pre-training.
In this method, the hyper-network is pre-trained until it meets the preset initialization condition. Since the performance of each search path in the search space corresponding to the pre-trained hyper-network is more stable than before pre-training, the initialized path discriminator can be trained more accurately based on the pre-trained hyper-network.
The following descriptions of the effects of the apparatus, the electronic device, and the like refer to the description of the above method, and are not repeated here.
In a second aspect, the present disclosure provides a data processing method, including:
acquiring data to be processed; the data to be processed comprises any one of an image to be processed, text to be processed, and point cloud data to be processed;
and processing the data to be processed by using the neural network generated by the neural network generation method based on any one of the first aspect to obtain a data processing result of the data to be processed.
In a third aspect, the present disclosure provides an intelligent driving control method, including:
acquiring image or point cloud data acquired by a driving device in the driving process;
detecting a target object in the image or point cloud data by using a neural network generated based on the neural network generation method of any one of the first aspect;
controlling the running device based on the detected target object.
In a fourth aspect, the present disclosure provides a neural network generating device, including:
the determining module is used for determining an initial search space of the neural network structure based on the hyper-network; the initial search space comprises a plurality of search paths; the hyper-network comprises a plurality of network layers, each network layer comprising at least one operator; each search path including an operator in each network layer of the hyper-network;
the screening module is used for screening the search paths in the initial search space by using a path discriminator, training the hyper-network based on the screening result, and determining a compressed search space corresponding to the trained hyper-network; the path discriminator is a trained model used for classifying the performance of the neural network structure corresponding to a search path;
and the generating module is used for determining a target search path from the compressed search space and generating a target neural network based on the target search path.
In a fifth aspect, the present disclosure provides a data processing apparatus comprising:
the first acquisition module is used for acquiring data to be processed; the data to be processed comprises any one of an image to be processed, text to be processed, and point cloud data to be processed;
a processing module, configured to process the data to be processed by using a neural network generated based on any one of the neural network generation methods in the first aspect, so as to obtain a data processing result of the data to be processed.
In a sixth aspect, the present disclosure provides an intelligent travel control apparatus, comprising:
the second acquisition module is used for acquiring the image or point cloud data acquired by the driving device in the driving process;
a detection module, configured to detect a target object in the image or point cloud data by using a neural network generated based on the neural network generation method of any one of the first aspects;
a control module for controlling the travel device based on the detected target object.
In a seventh aspect, the present disclosure provides an electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing the steps of the neural network generating method according to the first aspect or any one of the embodiments; or the step of performing the data processing method according to the second aspect; or the steps of the intelligent running control method according to the third aspect described above.
In an eighth aspect, the present disclosure provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the neural network generating method according to the first aspect or any one of the embodiments; or the step of performing the data processing method according to the second aspect; or the steps of the intelligent running control method according to the third aspect described above.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required for use in the embodiments will be briefly described below. The drawings herein are incorporated in and form a part of the specification, illustrate embodiments consistent with the present disclosure, and, together with the description, serve to explain the technical solutions of the present disclosure. It is appreciated that the following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope, for those skilled in the art will be able to derive additional related drawings therefrom without inventive effort.
Fig. 1 shows a schematic flow chart of a neural network generation method provided by an embodiment of the present disclosure;
fig. 2 illustrates an exemplary structural diagram of a super network in a neural network generation method provided in an embodiment of the present disclosure;
fig. 3 is a schematic flow chart illustrating a training path arbiter in a neural network generation method provided by an embodiment of the present disclosure;
FIG. 4 is a flow chart illustrating a data processing method provided by an embodiment of the present disclosure;
fig. 5 is a schematic flow chart illustrating an intelligent driving control method according to an embodiment of the present disclosure;
fig. 6 shows an architecture diagram of a neural network generating apparatus provided in an embodiment of the present disclosure;
FIG. 7 is a block diagram of a data processing apparatus according to an embodiment of the present disclosure;
fig. 8 is a schematic diagram illustrating an architecture of an intelligent driving control device according to an embodiment of the present disclosure;
fig. 9 shows a schematic structural diagram of an electronic device 900 provided by an embodiment of the present disclosure;
fig. 10 shows a schematic structural diagram of an electronic device 1000 provided by an embodiment of the present disclosure;
fig. 11 shows a schematic structural diagram of an electronic device 1100 provided in an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of the embodiments of the present disclosure, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure, presented in the figures, is not intended to limit the scope of the claimed disclosure, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.
Research shows that, when automatic network structure search is performed on a neural network, if the neural network comprises n layers and each layer has m optional operations, the resulting search space contains m^n search paths. Each search path is formed by connecting one optional operation from each of the n layers, and each search path corresponds to one neural network structure, where m and n are positive integers. In the current automatic network structure search method, at least one search path to be trained is randomly selected from the m^n search paths multiple times; each time at least one search path to be trained has been determined from the m^n search paths, the determined search paths are trained. Through multiple rounds of such training, the multi-round-trained hyper-network is finally obtained, and the network structure of the neural network is determined based on the multi-round-trained hyper-network.
In the above method, the search space contains m^n search paths, which is a large number; for example, if n is 20 and m is 4, the search space includes 4^20 search paths. As a result, the training process of the hyper-network takes a long time, occupies a great deal of hardware resources, and is inefficient. Meanwhile, in this method all search paths are treated identically during training; in practice, however, the search paths include both better-performing and worse-performing ones, and a better-performing search path achieves a better data-processing effect than a worse-performing one. Since different search paths are strongly correlated (for example, some search paths may share part of the network parameters), training the poorly performing search paths may interfere with the parameters of the better-performing ones. This interference distorts the performance evaluation result of each search path in the multi-round-trained hyper-network, so the optimal network structure cannot be obtained, and the generated neural network performs poorly.
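The exponential growth mentioned above can be computed directly; the n = 20, m = 4 figures are the ones used in the example.

```python
# With n = 20 layers and m = 4 optional operations per layer, the search
# space holds m**n candidate structures, far too many to train exhaustively.
m, n = 4, 20
num_paths = m ** n
assert num_paths == 1_099_511_627_776   # about 1.1 trillion search paths
```

This is the scale that motivates screening and compressing the search space rather than sampling it uniformly.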
Based on the above research, the present disclosure provides a neural network generation method in which a path discriminator screens the plurality of search paths in the initial search space: the trained path discriminator judges the structure label of each search path in the initial search space, where the structure label may represent the quality of the neural network structure corresponding to the search path, and screens out the search paths whose structure label is the second structure label (i.e., whose corresponding neural network structures perform poorly). The hyper-network is then trained based on the screening result, which prevents the training of second-structure-label search paths from affecting the parameter iteration of first-structure-label search paths (i.e., those whose corresponding neural network structures perform well), and improves the training efficiency of the hyper-network. Meanwhile, after the hyper-network is trained, a compressed search space corresponding to the trained hyper-network can be obtained. The compressed search space results from screening the initial search paths and contains fewer search paths, so determining the target search path from the compressed search space is more efficient; this in turn improves the efficiency of generating the target neural network based on the target search path and reduces the hardware resources consumed in the network structure search.
The above-mentioned drawbacks were discovered by the inventors only after practice and careful study; therefore, the process of discovering the above problems, as well as the solutions the present disclosure proposes for them, should be regarded as the inventors' contribution to the present disclosure.
The technical solutions in the present disclosure will be described clearly and completely with reference to the accompanying drawings in the present disclosure, and it is to be understood that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. The components of the present disclosure, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure, presented in the figures, is not intended to limit the scope of the claimed disclosure, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined or explained in subsequent figures.
For the understanding of the embodiments of the present disclosure, a detailed description will be given to a neural network generation method disclosed in the embodiments of the present disclosure. An execution subject of the neural network generation method provided by the embodiment of the present disclosure is generally a computer device with certain computing capability, and the computer device includes, for example: a terminal device, which may be a User Equipment (UE), a mobile device, a User terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or a server or other processing device. In some possible implementations, the neural network generation method may be implemented by a processor calling computer readable instructions stored in a memory.
Referring to fig. 1, a schematic flow chart of a neural network generation method provided in the embodiment of the present disclosure is shown, the method includes S101-S105, where:
S101, determining an initial search space of a neural network structure based on a hyper network; the initial search space comprises a plurality of search paths; the hyper-network comprises a plurality of network layers, each network layer comprising at least one operator; each search path includes an operator in each network layer of the hyper-network; operators corresponding to each network layer in each search path are sequentially connected to form a neural network structure.
S102, screening the search path in the initial search space by using a path discriminator.
S103, training the super network based on the screening result.
S104, determining a compressed search space corresponding to the trained hyper-network; the path discriminator is a trained model used for classifying the performance of the neural network structure corresponding to the search path.
And S105, determining a target search path from the compressed search space, and generating a target neural network based on the target search path.
In this method, the path discriminator screens the plurality of search paths in the initial search space, the poorly-performing search paths are screened out, and the super-network is trained based on the screening result; this avoids affecting the parameters of the well-performing search paths when the poorly-performing ones would otherwise be trained, and improves the training efficiency of the super-network. Meanwhile, after the hyper-network is trained, a compressed search space corresponding to the trained hyper-network can be obtained; this compressed search space contains fewer search paths, so that when the target search path is determined based on it, the hardware resources consumed by the network structure search are reduced, the efficiency of determining the target search path is improved, and the efficiency of generating the target neural network based on the target search path is improved.
For S101:
here, the super network may be, for example, a large neural network formed from the at least one operator included in each of a plurality of network layers. Each operator may correspond to an operation or to a basic network structure element; for example, an operator of a certain network layer may correspond to a convolution operation, or to a convolutional network unit. For example, assuming that a super network includes n network layers and each layer has m selectable operators, the generated search space contains m^n search paths.
Wherein each search path comprises an operator in each network layer of the super network. Here, the optional operators include, for example: at least one of a convolutional network unit, a pooling network unit, an identity mapping network unit, and a predetermined function block.
Here, the predetermined function blocks refer to neural networks that have already been trained and that can each perform a certain function, such as a MobileNetV2 mobile-network block used for classification, detection and segmentation, or ShuffleNetV2, an extremely efficient convolutional neural network for mobile devices.
As shown in fig. 2, the disclosed embodiments provide an exemplary structure of a super network. The super network includes 3 network layers, and each layer has 4 selectable operators: a convolutional network unit, a pooling network unit, an identity mapping network unit and a function block. The formed search paths therefore total 4^3 = 64; the thicker line in fig. 2 marks one of these 4^3 search paths.
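The path count in this example can be sketched in a few lines (the operator names are illustrative placeholders, not identifiers from the disclosure):

```python
from itertools import product

# One selectable operator set per network layer, as in the fig. 2 example:
# m = 4 operators, n = 3 layers.
OPERATORS = ["conv", "pool", "identity", "block"]
NUM_LAYERS = 3

def enumerate_search_paths(operators, num_layers):
    """A search path picks exactly one operator in every network layer."""
    return list(product(operators, repeat=num_layers))

paths = enumerate_search_paths(OPERATORS, NUM_LAYERS)
print(len(paths))  # m**n = 4**3 = 64 candidate network structures
```

Each tuple in `paths` corresponds to one neural network structure obtained by connecting the chosen operators layer by layer.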
For S102, S103, and S104:
the path discriminator may be a binary classifier whose two classes are a good class and a bad class. For each of a plurality of sampled search paths, a vector representing the structure of the search path may be input into the trained path discriminator to obtain the structure label of the search path, for example a first structure label indicating good performance, together with a corresponding confidence; in this way a structure label is obtained for each of the plurality of sampled search paths. The vector corresponding to each search path can be determined by encoding; the vector represents the neural network structure of the search path, and different search paths correspond to different vectors.
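As a sketch of the encoding step, a search path can be turned into a vector by concatenating a one-hot code of the chosen operator per layer; the disclosure does not fix a particular encoding, so this scheme is an assumption:

```python
OPERATORS = ["conv", "pool", "identity", "block"]  # illustrative operator set

def encode_path(path, operators=OPERATORS):
    """Concatenate a one-hot code of each layer's chosen operator into a
    single vector representing the neural network structure of the path."""
    vec = []
    for op in path:
        one_hot = [0.0] * len(operators)
        one_hot[operators.index(op)] = 1.0  # mark the selected operator
        vec.extend(one_hot)
    return vec

v = encode_path(("conv", "pool", "block"))
# 3 layers x 4 operators -> a 12-dimensional structure vector; distinct
# paths always map to distinct vectors.
```

The resulting vector is what would be fed to the discriminator to obtain a structure label and confidence.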
Here, considering that the number of search paths included in the initial search space is large and the initial search space includes a search path with poor structural performance, a path discriminator may be used to filter a plurality of search paths before training the super network in order to improve training efficiency of the super network and reduce hardware resource consumption, and in order to reduce influence on parameters of a search path with good structural performance when a search path with poor structural performance in the initial search space is trained.
In an optional embodiment, screening the search paths in the initial search space with the path discriminator and training the super network based on the screening result includes: determining, with the path discriminator, the structure labels of a plurality of search paths in the initial search space, wherein the structure labels include a first structure label and a second structure label, and the structural performance corresponding to the first structure label is better than that corresponding to the second structure label; and training the search paths corresponding to the first structure label among the plurality of search paths to obtain the trained hyper-network.
Here, a plurality of search paths may be randomly selected from a large number of search paths included in the initial search space, each of the plurality of selected search paths is determined by using a trained path discriminator, a structure label of each search path is determined, a search path with a structure label as a second structure label is screened out (i.e., a search path with a poor performance is screened out), a search path with a structure label as a first result label is obtained from the plurality of selected search paths, a super-network is trained based on a screening result, i.e., a search path with a good performance from the plurality of selected search paths is trained, a trained super-network is determined, and a compressed search space corresponding to the trained super-network is determined.
By adopting the method, the structural label of the search path is determined by utilizing the path discriminator, so that the search path belonging to the first structural label can be selected to be trained, the influence of the search path of the second structural label on the parameter training process of the search path of the first structural label is reduced, the training accuracy is improved, and meanwhile, the training efficiency of the super-network can be improved.
Wherein the hyper-network may be trained according to the following steps:
Step one, taking the initial search space as a current search space, and selecting a plurality of search paths from the current search space.
And step two, determining the structure label of each of the plurality of selected search paths by using the path discriminator.
And step three, training the search paths corresponding to the first structure label among the plurality of selected search paths, and generating the hyper-network after this round of training based on the trained search paths corresponding to the first structure label and the search paths not selected in this round of training.
And step four, taking the search space after the current round of compression corresponding to the hyper-network after the current round of training as the current search space, and returning to the step of selecting a plurality of search paths from the current search space until the training cutoff condition is met.
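Steps one to four can be sketched as a loop; the discriminator, the weight-update step, and the sampling strategy below are stand-in stubs for illustration, not the disclosure's implementation:

```python
import random

def train_super_network(initial_space, is_good_path, train_step,
                        sample_size=8, max_rounds=10, min_space_size=4):
    """Multi-round super-network training with discriminator screening.
    `is_good_path` plays the role of the path discriminator: it returns
    True when a path is predicted to carry the first structure label."""
    space = list(initial_space)
    for _ in range(max_rounds):  # cutoff: round-number threshold
        sampled = random.sample(space, min(sample_size, len(space)))
        kept = [p for p in sampled if is_good_path(p)]  # screen out bad paths
        for path in kept:
            train_step(path)  # only good paths drive parameter iteration
        # The compressed space keeps the good sampled paths plus every
        # path that was not sampled this round.
        space = kept + [p for p in space if p not in sampled]
        if len(space) <= min_space_size:  # cutoff: space small enough
            break
    return space
```

Each round therefore both trains the super network and shrinks the current search space, matching the recursion in step four.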
By adopting this method, the structure label of each selected search path is determined by the path discriminator, and the search paths corresponding to the first structure label among the selected search paths, that is, the search paths with better structural performance, are trained; the hyper-network after this round of training is generated based on the trained search paths corresponding to the first structure label and the search paths not selected in this round, thereby completing multi-round compression of the initial search space and multi-round training of the hyper-network.
Here, the initial search space may be subjected to multiple rounds of screening and multiple rounds of training by using the path discriminator to obtain a compressed search space. In specific implementation, the initial search space may be used as a current search space, multiple search paths are selected from the current search space, the multiple selected search paths are subjected to a first round of screening by using the path discriminator, that is, the path discriminator is used to determine a structure label of each of the multiple selected search paths, and the search paths in which the structure labels are second structure labels are screened out, so as to obtain the search paths in which the structure labels are first structure labels in the multiple selected search paths.
And then, performing a first round of training of the hyper-network based on the first round of screening results, namely training a search path with a first structural label as a structural label in the selected multiple search paths, and generating the hyper-network after the training of the current round and a compressed search space corresponding to the hyper-network after the training of the current round based on the search path with the first structural label as the trained structural label and the search path which is not selected in the current round.
A second round of training is then performed with the compressed search space corresponding to the hyper-network after the first round of training as the current search space. Specifically, a plurality of search paths are selected from this current search space, and the path discriminator performs a second round of screening on them: the structure label of each selected search path is determined, the search paths whose structure label is the second structure label are screened out, and the search paths whose structure label is the first structure label are obtained. A second round of training of the hyper-network is then performed based on the second-round screening result, that is, the search paths whose structure label is the first structure label are trained, and the hyper-network after this (second) round of training, together with its corresponding compressed search space, is generated based on the trained search paths carrying the first structure label and the search paths not selected in this round. Multiple rounds of training are repeated with the same processing procedure until the training cutoff condition is met, whereupon the compressed search space corresponding to the multi-round-trained hyper-network is generated.
The training cutoff conditions include: the number of rounds of training is equal to the set threshold of the number of rounds, or the number of search paths included in the compressed search space corresponding to the trained hyper-network is less than or equal to the set threshold of the number.
The path discriminator includes a one-stage case and a multi-stage case. And under the condition that the path discriminator is in one stage, the path discriminator is trained once to obtain the path discriminator finished by one training, wherein the path discriminator can be trained for multiple times each time the path discriminator is trained, and the path discriminator finished by the training is obtained through the multiple times of training. And then, a path discriminator finished by one-time training can be used for subsequent processing, for example, the path discriminator finished by one-time training is used for screening a plurality of search paths in the initial search space, training of the hyper-network is carried out based on a screening result, and a compressed search space corresponding to the trained hyper-network is determined.
In the multi-stage case, the path discriminator is trained multiple times during the super-network-based neural network structure search. Specifically, the untrained path discriminator may first be trained based on the initial search space to obtain a 1st-level path discriminator; part of the processing may then be performed with the 1st-level path discriminator, for example screening the plurality of search paths in the initial search space. The 1st-level path discriminator may then be trained a second time based on the search space it has screened, to obtain a 2nd-level path discriminator, and part of the processing may be performed with the 2nd-level path discriminator, for example re-screening the plurality of search paths in the search space already screened by the 1st-level path discriminator. By the same processing procedure, an nth-level path discriminator and the compressed search space can be obtained, where n is the number of stages of the path discriminator and n is a positive integer.
When the path discriminator includes multiple stages, the super-network may be trained according to the following steps:
Step one, taking an ith-level path discriminator as the current path discriminator corresponding to the current search space; when the current search space is the initial search space, the ith-level path discriminator is the 1st-level path discriminator, where i is a positive integer and i is less than or equal to n.
And step two, selecting a plurality of search paths from the current search space, determining the structure label of each of the plurality of selected search paths with the current path discriminator, training the search paths corresponding to the first structure label among the selected search paths, and generating the hyper-network after this round of training based on the trained search paths corresponding to the first structure label and the search paths not selected in this round.
And step three, when the number of training rounds of the hyper-network meets a preset condition, training the current path discriminator to generate an (i+1)th-level path discriminator, and determining the (i+1)th-level path discriminator as the new current path discriminator.
And step four, taking the search space after the current round of compression corresponding to the hyper-network after the current round of training as the current search space, and returning to the step of selecting a plurality of search paths from the current search space until the training cutoff condition is met.
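A minimal sketch of the multi-stage scheme in steps one to four; the retraining trigger, the deterministic sampling, and the stub discriminators are assumptions for illustration:

```python
def multi_stage_training(initial_space, level1_discriminator, train_step,
                         retrain, sample_size=8, retrain_every=3,
                         max_rounds=9):
    """The current discriminator screens every round's sample; whenever
    the round count meets the preset condition (here: a multiple of
    `retrain_every`), it is retrained on the current compressed space,
    yielding the next-level discriminator."""
    space = list(initial_space)
    discriminator = level1_discriminator  # level-1 discriminator
    for round_no in range(1, max_rounds + 1):
        sampled = space[:sample_size]  # deterministic sampling for brevity
        kept = [p for p in sampled if discriminator(p)]
        for path in kept:
            train_step(path)
        space = kept + space[sample_size:]
        if round_no % retrain_every == 0:  # preset condition met
            discriminator = retrain(discriminator, space)  # level i -> i+1
    return space, discriminator
```

With `retrain_every=3` and `max_rounds=9`, the discriminator is promoted after rounds 3, 6 and 9, mirroring the level-1, level-2, ... progression described above.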
Here, a multi-stage path discriminator is provided, where the i +1 th stage path discriminator is obtained based on the i-th stage path discriminator training, and the multi-stage path discriminator is provided to determine the structural label of the search path, so that the accuracy of determining the structural label can be improved.
In a specific implementation, the initial search space may be taken as the current search space and the 1st-level path discriminator as the current path discriminator. A plurality of search paths are selected from the current search space, and the current path discriminator (the 1st-level path discriminator) performs a first round of screening on them: the structure label of each selected search path is determined, the search paths whose structure label is the second structure label are screened out, and the search paths whose structure label is the first structure label are obtained and trained. The super-network after this (first) round of training is then generated based on the trained search paths carrying the first structure label and the search paths not selected in this round.
Then, it can be determined whether the number of rounds of training the super network meets a preset condition, for example, whether the number of training rounds reaches a preset threshold value of the number of rounds; if the search space is satisfied, training a current path discriminator by using the search space after the local compression corresponding to the hyper-network after the local training, namely training an ith-level path discriminator to generate an i + 1-level path discriminator, taking the i + 1-level path discriminator as the current path discriminator, taking the search space after the local compression corresponding to the hyper-network after the local training as the current search space, returning to the step of selecting a plurality of search paths from the current search space, and performing the second training until the training cutoff condition is satisfied; if the search space does not meet the training cut-off condition, the current path discriminator is not trained, namely the ith path discriminator is not trained, the ith path discriminator is still used as the current path discriminator, the search space after the current round of compression corresponding to the hyper-network after the current round of training is used as the current search space, the step of selecting a plurality of search paths from the current search space is returned, the next round of training is carried out until the training cut-off condition is met, and then the compressed search space corresponding to the hyper-network after the multiple rounds of training is generated.
For example, the determining whether the number of rounds of the training super network satisfies the preset condition may be determining whether the number of rounds of the training super network is a multiple of m, where m is a positive integer, and for example, the value of m may be any value such as 10, 20, 25, 30, and the like. When m is 30, judging whether the number of rounds of the training super network is a multiple of 30, if so, meeting a preset condition, otherwise, not meeting the preset condition, for example, if the number of rounds of the training super network is 20, not meeting the preset condition; if the number of rounds of training the super network is 60, the preset condition is met.
In an alternative embodiment, the path discriminator may be trained according to the following steps:
Step a1, selecting a plurality of first search path samples from the search space corresponding to the hyper-network, and evaluating the performance of the neural network structure corresponding to each first search path sample using verification samples.
Step a2, ranking the selected plurality of first search path samples based on the performance of the neural network structure corresponding to each first search path sample.
Step a3, determining the labeled structure label of each first search path sample based on the ranking result of the selected plurality of first search path samples; wherein the labeled structure label is used to characterize the structural performance of the first search path sample.
Step a4, training the path discriminator based on the plurality of first search path samples with labeled structure labels.
By adopting this method, the performance of the neural network structure corresponding to each first search path sample is evaluated using the verification samples, and the plurality of first search path samples are ranked by the evaluated performance, so that their relative performance is determined. Based on this relative performance, the labeled structure label of each first search path sample is determined: for example, the labeled structure label of a first search path sample ranked near the top (that is, with better relative performance) is determined to be the first labeled structure label, and that of a sample ranked near the bottom (that is, with worse relative performance) is determined to be the second labeled structure label. The path discriminator is then trained using the plurality of first search path samples carrying labeled structure labels.
In a possible implementation, the determining, based on the ranking result of the selected plurality of first search path samples, the labeled structure label of each first search path sample includes:
based on a set ranking percentage and the ranking results of the selected plurality of first search path samples, determining the labeled structure labels of the first search path samples whose ranking results fall within the ranking percentage to be the first labeled structure label, and determining the labeled structure labels of the first search path samples whose ranking results fall outside the ranking percentage to be the second labeled structure label;
wherein the labeled structure labels include a first labeled structure label and a second labeled structure label, and the structural performance corresponding to the first labeled structure label is better than that corresponding to the second labeled structure label.
Here, a ranking percentage may be set, for example 20%. When the number of first search path samples is 100, the labeled structure labels of the first search path samples ranked in the top 20 positions are determined, based on the ranking result, to be the first labeled structure label, and the labeled structure labels of the samples ranked in the remaining 80 positions are determined to be the second labeled structure label; in this way the labeled structure label of each first search path sample is determined from the relative performance among the samples.
In the first case, when the path discriminator is one-stage, a plurality of first search path samples are selected from the initial search space corresponding to the super-network, and the performance of the neural network structure corresponding to each first search path sample is evaluated using the verification samples; the performance may be, for example, accuracy, or the hardware latency incurred when the neural network structure runs on specific hardware (for example, a CPU or a GPU of a certain model). Illustratively, for each first search path sample, the verification samples may be input into the neural network structure corresponding to that sample, the prediction results for the verification samples determined, and these prediction results compared with the labeling results to determine the evaluation accuracy of the first search path sample; in this way the evaluation accuracy of each selected first search path sample can be obtained.
The plurality of first search path samples may then be ranked according to the evaluated performance, and the labeled structure label of each first search path sample determined from the ranking result and the set ranking percentage. For example, if 100 first search path samples are selected and the set ranking percentage is 80%, then based on the ranking result the labeled structure labels of the first 80 samples are determined to be the first labeled structure label and those of the last 20 samples to be the second labeled structure label, yielding the labeled structure label of each selected first search path sample, where the structural performance corresponding to the first labeled structure label is better than that corresponding to the second.
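Steps a1 to a3 with the ranking-percentage rule can be sketched as follows; the `evaluate` callback standing in for validation-set performance, and the "good"/"bad" label strings, are illustrative assumptions:

```python
def label_by_ranking(samples, evaluate, ranking_percentage=0.8):
    """Rank the sampled paths by evaluated performance (best first) and
    give the top `ranking_percentage` the first labeled structure label
    ("good"); the rest receive the second labeled structure label ("bad")."""
    ranked = sorted(samples, key=evaluate, reverse=True)
    cutoff = int(len(ranked) * ranking_percentage)
    return {s: ("good" if rank < cutoff else "bad")
            for rank, s in enumerate(ranked)}

# Toy run: performance equals the sample id, 10 samples, 80% labeled good.
labels = label_by_ranking(range(10), evaluate=lambda s: s)
```

Because only the relative ordering matters, the absolute performance values never enter the labels.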
Finally, the path discriminator may be trained based on the plurality of first search path samples with labeled structure labels; for example, the path discriminator may be trained using a first loss calculated based on the plurality of first search path samples with labeled structure labels.
In one possible implementation, training the path discriminator based on a plurality of first search path samples with labeled structure labels includes:
determining a first loss of the path discriminator based on the plurality of first search path samples with labeled structure labels, and training the path discriminator using the first loss;
wherein the first loss includes a binary classification loss and/or a rank loss; the rank loss is determined based on the number of selected first search path samples, the confidence of the predicted structure label that the path discriminator determines for each first search path sample, and the set ranking percentage.
Here, the calculation formula of the first loss may be:
Figure BDA0002808491850000141
wherein, H (a) i ,y i ) Is a binary classification loss function (i.e., CE-loss), which may be a logical loss function or a two-dimensional cross-entropy loss function; l is r Is the Rank-loss (i.e., rank-loss); k is the number of the selected first search path samples, r is a set proportion (i.e. a set sorting percentage), for example, r is 0.8, k is 100, and then the label structure of the first k × r =80 first search path samples in the 100 first search path samples is a first label structure label; lambda is a preset balance parameter;
Figure BDA0002808491850000142
the confidence that the labeled structure label of the ith first search path sample is the first labeled structure label; />
Figure BDA0002808491850000143
The confidence that the label structure label for the jth first search path sample is the second label structure label.
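The first loss can be sketched numerically as follows. The cross-entropy term follows the description above, but the exact rank-loss expression is an assumption here (a pairwise hinge penalty), since the disclosure only states that the rank loss depends on the number of samples k, the predicted confidences, and the set ranking percentage r:

```python
import math

def first_loss(confidences, labels, r=0.8, lam=0.5):
    """L_D = (1/k) * sum_i CE(a_i, y_i) + lam * rank_loss (sketch).
    `confidences[i]`: predicted probability that sample i carries the
    first (good) labeled structure label; `labels[i]`: 1 for the first
    label, 0 for the second. Samples are assumed pre-sorted so that the
    first k*r entries hold the good label."""
    k = len(confidences)
    eps = 1e-12  # numerical guard for log(0)
    ce = -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
              for p, y in zip(confidences, labels)) / k
    top = int(k * r)
    # Hinge penalty whenever a bad sample outscores a good one (assumed form).
    pairs = [(i, j) for i in range(top) for j in range(top, k)]
    rank = sum(max(0.0, confidences[j] - confidences[i])
               for i, j in pairs) / max(len(pairs), 1)
    return ce + lam * rank
```

A well-ordered discriminator output yields a small loss, while swapping the confidences inflates both terms.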
As an optional embodiment, after determining the tag structure label of each first search path sample based on the ranking result of the plurality of first search path samples, the method further includes:
Step a5, selecting at least one second search path sample from the search paths, other than the plurality of first search path samples, in the search space corresponding to the trained hyper-network.
Step a6, determining the labeled structure label of each second search path sample based on the edit distance between the second search path sample and the first search path samples.
In this case, step a4 of training the path discriminator based on the plurality of first search path samples with labeled structure labels includes: step a41, training the path discriminator based on the plurality of first search path samples with labeled structure labels and the plurality of second search path samples.
In order to improve the training efficiency of the path discriminator, a plurality of second search path samples may be selected from the search paths, other than the plurality of first search path samples, in the search space corresponding to the trained hyper-network. Based on the observation that models whose structures are close in edit distance tend to have close performance, the labeled structure label of each second search path sample is determined from the labeled structure labels of the first search path samples; this semi-supervised data augmentation method effectively improves data utilization efficiency, and the path discriminator is trained based on the plurality of first search path samples with labeled structure labels together with the plurality of second search path samples.
Here, in order to improve the classification capability and training efficiency of the path discriminator, after the labeled structure label of each first search path sample is determined, at least one second search path sample may be selected from the search paths, other than the plurality of first search path samples, in the initial search space corresponding to the super-network. The number of selected second search path samples may be determined from the number of first search path samples so that the two numbers satisfy a set ratio; for example, if the set ratio is 2:1 and the number of first search path samples is 100, the number of second search path samples is 50.
For each second search path sample, the edit distance between that sample and each first search path sample may be calculated, and the labeled structure label of the first search path sample with the shortest edit distance is determined to be the labeled structure label of the second search path sample; in this way the labeled structure label of every second search path sample is obtained.
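The nearest-neighbour labelling of step a6 can be sketched as follows; since every search path here chooses one operator per layer, the edit distance between two equal-length paths reduces to counting the layers where they differ (a simplifying assumption of this sketch):

```python
def path_edit_distance(a, b):
    """Number of layers where two equal-length paths choose different
    operators (substitution-only edit distance)."""
    return sum(x != y for x, y in zip(a, b))

def label_second_samples(second_samples, labeled_first_samples):
    """Give each second search path sample the labeled structure label of
    the first search path sample closest to it in edit distance."""
    return {s: labeled_first_samples[
                min(labeled_first_samples,
                    key=lambda f: path_edit_distance(s, f))]
            for s in second_samples}

# Toy example: two labeled first samples, two unlabeled second samples.
first = {("conv", "conv", "conv"): "good", ("pool", "pool", "pool"): "bad"}
second = [("conv", "conv", "pool"), ("pool", "pool", "conv")]
labels = label_second_samples(second, first)
```

Each second sample inherits the label of whichever labeled path it differs from in the fewest layers.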
After the labeled structure label of each second search path sample is obtained, a second loss function may be calculated based on the plurality of first search path samples with the labeled structure labels and the plurality of second search path samples, and the path discriminator may be trained using the second loss function.
The second loss function L_total may be calculated as:

L_total = L_D(A) + α · L_D(A ∪ Ã)

wherein α is a preset balance parameter, A is the set of the plurality of first search path samples, Ã is the set of the plurality of second search path samples, L_D(A) is the loss function calculated based on the set of first search path samples, and L_D(A ∪ Ã) is the loss function calculated based on the mixed set of the plurality of first search path samples and the plurality of second search path samples.
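As a rough sketch of this semi-supervised objective, assuming L_D is a mean binary cross-entropy over the discriminator's predicted path scores (the concrete form of L_D is an assumption here):

```python
import math

def bce_loss(scores, labels, eps=1e-12):
    # Mean binary cross-entropy over (predicted score, label) pairs.
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for p, y in zip(scores, labels)) / len(scores)

def total_loss(first_scores, first_labels, second_scores, second_labels,
               alpha=0.5):
    """Loss on the first-sample set plus alpha times the loss on the
    mixed set of first and second samples."""
    loss_first = bce_loss(first_scores, first_labels)
    loss_mixed = bce_loss(first_scores + second_scores,
                          first_labels + second_labels)
    return loss_first + alpha * loss_mixed
```

The balance parameter alpha weights how strongly the augmented (second) samples influence training relative to the directly evaluated first samples.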
When the path discriminator has multiple stages, a plurality of first search path samples are selected from the initial search space, i.e., the search space of the super network corresponding to the untrained path discriminator, and the untrained path discriminator is trained according to steps a1 to a4, or steps a1 to a3, a5, a6 and a41, to obtain a 1st-level path discriminator. Further, the 1st-level path discriminator may be used to perform multiple rounds of screening (e.g., 30 rounds) on the initial search space of the super network, and the super network is trained for multiple rounds (e.g., 30 rounds) based on the screening results, so as to obtain the compressed search space corresponding to the super network trained under the 1st-level path discriminator.
To further train the 1st-level path discriminator, a plurality of first search path samples are selected from the compressed search space corresponding to the super network trained under the 1st-level path discriminator, and the 1st-level path discriminator is trained according to steps a1 to a4, or steps a1 to a3, a5, a6 and a41, to obtain a 2nd-level path discriminator. Further, the 2nd-level path discriminator may be used to perform multiple rounds of screening (e.g., 30 rounds) on the search space of the super network, and the super network is trained for multiple rounds (e.g., 30 rounds) based on the screening results, so as to obtain the compressed search space corresponding to the super network trained under the 2nd-level path discriminator. By the same process, the ith-level path discriminator can be trained to obtain the (i+1)th-level path discriminator.
For example, a detailed process of generating the compressed search space is described below, assuming m is set to 30 (that is, the number of rounds of training the super network satisfies the preset condition whenever it is a multiple of m = 30). After the initial search space is determined based on the super network, first, the initial search space is taken as the current search space, and the trained 1st-level path discriminator is taken as the current path discriminator corresponding to the current search space. A plurality of search paths are then selected from the current search space, the structure label of each selected search path is determined by using the current path discriminator, the search paths whose structure label is the first structure label are trained, and the super network after the current (first) round of training is generated based on the trained search paths whose structure label is the first structure label and the search paths not selected in the current round.
Then, the search space after the current round of compression corresponding to the super network after the current round of training is taken as the current search space, the trained 1st-level path discriminator is taken as the current path discriminator corresponding to the current search space, and the process returns to the step of selecting a plurality of search paths from the current search space, generating the super network after the current (second) round of training. Multiple rounds of super-network training are carried out in this way until the super network after the thirtieth round of training is generated. At this point, since the number of rounds of training the super network satisfies the preset condition, the current path discriminator (the 1st-level path discriminator) may be trained to generate the 2nd-level path discriminator, the 2nd-level path discriminator is determined as the current path discriminator, the search space corresponding to the super network after the thirtieth round of training is taken as the current search space, and the process returns to the step of selecting a plurality of search paths from the current search space, until the training cutoff condition is satisfied and the compressed search space is generated.
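The staged schedule in this example can be sketched with stub functions; all names, the label convention (1 for the first structure label), and the exact call structure are illustrative assumptions, not the exact implementation of the present disclosure.

```python
def train_supernet_with_staged_discriminators(
        initial_space, num_stages, m, sample_paths, predict_labels,
        train_paths, compress_space, upgrade_discriminator,
        initial_discriminator):
    """Alternate m rounds of discriminator-guided super-network training
    with one discriminator upgrade per stage."""
    space = initial_space
    discriminator = initial_discriminator
    for stage in range(num_stages):
        for round_idx in range(m):  # e.g. m = 30 rounds per stage
            paths = sample_paths(space)
            labels = predict_labels(discriminator, paths)
            # Train only the paths labeled with the first (better) label.
            good = [p for p, y in zip(paths, labels) if y == 1]
            train_paths(good)
            space = compress_space(space, discriminator)
        # After m rounds, train the current discriminator into the
        # next-stage discriminator.
        discriminator = upgrade_discriminator(discriminator, space)
    return space, discriminator
```

With stub callables this loop runs end to end, e.g. two stages of three rounds over a toy search space.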
Referring to fig. 3, a schematic flow chart of training the path discriminator in the neural network generation method is shown. Fig. 3 includes multiple rounds of search space compression, where a plurality of search paths are selected from the search space after each compression; each point in the figure represents one search path, the structure labels (the first structure label and the second structure label) of each search path are marked, different colors represent different structure labels, and each search path corresponds to one neural network structure. For each search path, a vector of the search path is determined by encoding, the vector representing the neural network structure of the search path is input into the path discriminator, and the predicted structure label of the search path is determined. A loss value is then determined based on the structure label marked for each search path and the predicted structure label (Rank-loss is a ranking loss, and CE-loss may be a cross-entropy loss), the path discriminator is trained by using the loss value, and the super network is trained by using the trained path discriminator.
For S105:
in an alternative embodiment, determining a target search path from the compressed search space and generating a target neural network based on the target search path includes:
1. Determining a target search path from the compressed search space by using the trained path discriminator and a set evolutionary algorithm.
2. Training a target network structure corresponding to the target search path to generate the target neural network.
Here, the target search path may be determined from the compressed search space using a trained path discriminator and a set reinforcement learning algorithm, a recurrent neural network, or an evolutionary algorithm. When the path discriminator is multi-stage, the trained path discriminator is the last stage path discriminator, i.e. the nth stage path discriminator. The reinforcement learning algorithm, the recurrent neural network or the evolutionary algorithm may be set according to actual needs, for example, the recurrent neural network is a trained neural network for sampling a search path from a super network, and the evolutionary algorithm may be a genetic algorithm.
Here, since the compressed search space is a search space in which the search path having the structure label as the second structure label is deleted, and the number of the search paths in the compressed search space is smaller than the number of the search paths in the initial search space, when the target search path is determined from the compressed search space by using the trained path discriminator and the set reinforcement learning algorithm, the recurrent neural network, or the evolutionary algorithm, the efficiency of determining the target search path can be improved, and the efficiency of generating the target neural network is improved, thereby reducing hardware resources consumed in the super-network training and the neural network structure search.
In an alternative embodiment, determining a target search path from the compressed search space using a trained path discriminator and a set evolutionary algorithm includes:
1. Taking an initial population formed by at least one search path in the compressed search space as the current population.
2. Determining the structure label of each search path in the current population by using the path discriminator, and generating the screened current population based on the search paths corresponding to the first structure label; generating the population after the current iteration based on the screened current population and the evolutionary algorithm.
3. Taking the population after the current iteration as the current population, and returning to the step of determining the structure label of each search path in the current population by using the path discriminator, until the number of iterations equals a set threshold.
4. Determining a target search path from the initial population and the populations generated by the multiple iterations.
In specific implementation, the initial population formed by at least one search path (genetic path) in the compressed search space obtained in S104 may be used as the current population, the trained path discriminator (last-stage path discriminator) is used to determine the structural label of each search path (genetic path) in the current population, the search path with the structural label as the second structural label is deleted, and the search path with the structural label as the first structural label is configured into the current population after screening; and generating the population subjected to the iterative processing on the basis of the current population after screening and an evolutionary algorithm.
Here, for example, an evolutionary algorithm may be used to perform a path search on the super network after multiple rounds of training. Exemplary evolutionary algorithms include the Non-dominated Sorting Genetic Algorithm II (NSGA-II). Specifically, in each iteration, the evolutionary algorithm first evaluates each genetic path in the screened current population to obtain the evaluation accuracy of each genetic path on the verification data set, and then performs selection, recombination and mutation operations according to the evaluation result of each genetic path to obtain a new set of genetic paths (i.e., the population after the current iteration) for the next iteration. After all iterations are completed, the algorithm finally outputs the genetic path with the highest evaluation index encountered during the iterations as the target search path, i.e., the target network structure is obtained.
Meanwhile, when the path search is performed on the super network after multiple rounds of training based on the evolutionary algorithm, hardware constraint conditions may be set, so that the finally obtained target neural network satisfies these hardware constraints. The hardware constraints include: a computation amount constraint, a parameter amount constraint, and the like.
After the target search path is obtained, the training sample can be used to train the network structure corresponding to the target search path until the accuracy of the trained network structure is greater than the set accuracy threshold, or until the loss value of the trained network structure is less than the set loss threshold, and the like, so as to obtain the target neural network.
Here, in each iteration, the path discriminator may be used to determine the structure tag of each search path in the current population, delete the search path with the structure tag being the second structure tag, where the search path with the structure tag being the first structure tag constitutes the current population after screening, and generate the population after the current iteration based on the current population after screening and the evolutionary algorithm, thereby improving the efficiency of each iteration.
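A minimal sketch of this discriminator-filtered evolutionary loop is shown below; a simple mutation-only genetic algorithm stands in for NSGA-II, and the discriminator and evaluation functions are assumed stubs rather than the exact implementation of the present disclosure.

```python
import random

def evolutionary_search(init_population, discriminator, evaluate,
                        num_iters, num_ops, seed=0):
    """Screen each generation with the path discriminator, evolve the
    survivors by mutation, and return the best path seen overall."""
    rng = random.Random(seed)
    population = [list(p) for p in init_population]
    history = [list(p) for p in population]
    for _ in range(num_iters):
        # Keep only paths the discriminator labels as well-performing;
        # fall back to the full population if nothing survives.
        survivors = [p for p in population if discriminator(p) == 1] or population
        # Mutate one randomly chosen layer of each survivor.
        population = []
        for path in survivors:
            child = list(path)
            child[rng.randrange(len(child))] = rng.randrange(num_ops)
            population.append(child)
        history.extend(population)
    return max(history, key=evaluate)
```

In practice `evaluate` would be the validation accuracy of the path on the trained super network, and additional hardware-constraint checks could be applied alongside the discriminator screen.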
In an alternative embodiment, before training the path discriminator, the method further comprises: pre-training the super network until the super network meets a preset initialization condition;
training the path discriminator, comprising: initializing the path discriminator, and training the initialized path discriminator based on the super network after the pre-training is completed.
Here, before the training path discriminator, the selected training sample may be used to pre-train the super network until the super network meets a preset initialization condition, for example, until the stability of the super network meets a preset stability condition, or until the number of times of pre-training is equal to a set number threshold, so that each search path corresponds to the pre-trained initial parameter in the initial search space corresponding to the super network after the pre-training is completed.
Then, the initialized path discriminator is trained based on the super network after the pre-training is completed. Since each search path corresponds to pre-trained initial parameters in the initial search space of the pre-trained super network, the evaluation performance of a search path can be accurately determined through the search path with these initial parameters, and the path discriminator can thus be accurately trained.
For example, in the field of automatic driving or assisted driving, in order to accurately control a target vehicle, road images acquired during driving need to be accurately identified. To realize accurate identification of a road image, a relatively complex neural network needs to be designed, performing multilayer feature extraction on the road image to obtain the target object in it. When the neural network structure is complex, the search space of the corresponding super network is large; for example, if the neural network structure has 20 layers and each layer includes 4 candidate operators, the initial search space corresponding to the super network includes 4^20 = 1,099,511,627,776 search paths. Since the number of search paths is very large, the computational complexity of training the super network is high, more hardware resources are needed, and determining the neural network structure is time-consuming and inefficient.
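The path count in this example follows directly from 4 candidate operators at each of 20 independent layers:

```python
# Each of the 20 layers independently selects one of 4 operators,
# so the number of distinct search paths is 4 ** 20.
num_layers, ops_per_layer = 20, 4
num_paths = ops_per_layer ** num_layers
print(num_paths)  # 1099511627776
```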
In order to solve the above problem, according to the neural network generation method provided in the embodiment of the present disclosure, the path discriminator is used to screen the initial search space corresponding to the super network, and then training the super network based on the screening result is performed to determine the compressed search space corresponding to the trained super network, for example, the number of search paths included in the search space corresponding to the super network may be compressed to 2000, so that the number of search paths in the super network is greatly reduced, and then the compressed search space is based on which the target search path can be efficiently determined, so that the consumption of hardware resources is reduced, and further, the efficiency of generating the target neural network based on the target search path is improved.
Referring to fig. 4, an embodiment of the present disclosure further provides a data processing method, including:
s401: acquiring data to be processed; the data to be processed comprises: any one of the image to be processed, the character to be processed and the point cloud data to be processed;
s402: the method comprises the steps of processing data to be processed by utilizing a neural network generated by a neural network generation method provided by any embodiment of the disclosure to obtain a data processing result of the data to be processed.
The following are exemplary: (1) For the case that the data to be processed includes image data, the processing of the data to be processed includes: at least one of face recognition, object detection, and semantic segmentation. Here, the face recognition includes, for example: at least one of face key point identification, face emotion identification, face attribute (such as age, gender and the like) identification and living body detection. Object detection includes, for example: and detecting at least one of object position and object type.
(2) For the situation that the data to be processed comprises the character data, the processing of the data to be processed comprises the following steps: dialog generation, and character prediction. Dialog generation includes, for example: intelligent question answering, voice self-help and the like. Character prediction includes, for example: search keyword prediction, character completion prediction, and the like.
(3) For the condition that the data to be processed comprises point cloud data, the processing of the data to be processed comprises the following steps: and at least one of obstacle detection and target detection.
According to the data processing method provided by the embodiment of the disclosure, the neural network generated based on the neural network generation method provided by any embodiment of the disclosure is used for processing the data to be processed, and the generated neural network has better performance, so that the obtained data processing result has higher accuracy.
Referring to fig. 5, an embodiment of the present disclosure further provides an intelligent driving control method, including:
s501: acquiring image or point cloud data acquired by a driving device in the driving process;
s502: detecting a target object in the image or point cloud data by using a neural network generated based on the neural network generation method provided by any embodiment of the disclosure;
s503: controlling the running device based on the detected target object.
In a specific implementation, the driving device is, for example, but not limited to, any one of the following: an automatically driven vehicle, a vehicle equipped with an Advanced Driver Assistance System (ADAS), a robot, or the like. Controlling the driving device includes, for example, controlling it to accelerate, decelerate, steer, or brake, or playing voice prompt information to prompt the driver to control the driving device to accelerate, decelerate, steer, or brake.
It will be understood by those skilled in the art that in the method of the present invention, the order of writing the steps does not imply a strict order of execution and any limitations on the implementation, and the specific order of execution of the steps should be determined by their function and possible inherent logic.
Based on the same concept, an embodiment of the present disclosure further provides a neural network generating device, as shown in fig. 6, an architecture schematic diagram of the neural network generating device provided in the embodiment of the present disclosure includes a determining module 601, a screening module 602, a generating module 603, and a training module 604, specifically:
a determining module 601, configured to determine an initial search space of a neural network structure based on a super network; the initial search space comprises a plurality of search paths; the super network comprises a plurality of network layers, each network layer comprising at least one operator; each search path includes an operator in each network layer of the hyper-network;
a screening module 602, configured to screen a search path in the initial search space by using a path discriminator, perform training of the super network based on a screening result, and determine a compressed search space corresponding to the trained super network; the path discriminator is a trained model used for classifying the performance of the neural network structure corresponding to the search path;
a generating module 603, configured to determine a target search path from the compressed search space, and generate a target neural network based on the target search path.
In one possible implementation, the filtering module 602, when filtering the plurality of search paths in the initial search space by using a path discriminator and performing training of the super network based on the filtering result, is configured to:
determining, with a path discriminator, structural labels of a plurality of search paths in the initial search space, wherein the structural labels include a first structural label and a second structural label, and the structural performance of the first structural label is better than the second structural label;
and training a search path corresponding to the first structural label in the plurality of search paths to obtain the trained hyper-network.
In one possible implementation, the screening module 602 is configured to train the hyper-network according to the following steps:
taking the initial search space as a current search space, and selecting a plurality of search paths from the current search space;
determining the structure label of each selected search path in the plurality of search paths by using the path discriminator;
training a search path corresponding to the first structural label in the plurality of selected search paths, and generating a hyper-network after the training of the current round based on the trained search path corresponding to the first structural label and the unselected search path of the current round;
and taking the search space after the current round of compression corresponding to the hyper-network after the current round of training as the current search space, and returning to the step of selecting a plurality of search paths from the current search space until the training cutoff condition is met.
In a possible implementation, in case the path arbiter comprises multiple stages, the filtering module 602 is configured to train the super-network according to the following steps:
taking the i-th-level path discriminator as a current path discriminator corresponding to the current search space; wherein, under the condition that the current search space is the initial search space, the ith-level path discriminator is a1 st-level path discriminator, and i is a positive integer;
selecting a plurality of search paths from the current search space, determining a structural label of each search path in the plurality of selected search paths by using a current path discriminator, training the search path corresponding to a first structural label in the plurality of selected search paths, and generating a hyper-network after the training of the current round based on the trained search path corresponding to the first structural label and the unselected search path of the current round;
under the condition that the number of rounds of training the hyper-network meets a preset condition, training a current path discriminator to generate an i + 1-level path discriminator, and determining the i + 1-level path discriminator as the current path discriminator;
and taking the search space after the current round of compression corresponding to the hyper-network after the current round of training as the current search space, and returning to the step of selecting a plurality of search paths from the current search space until the training cutoff condition is met.
In a possible implementation, the apparatus further includes a training module 604 for training the path discriminator according to the following steps:
selecting a plurality of first search path samples from a search space corresponding to the hyper-network, and evaluating the performance of a neural network structure corresponding to each first search path sample by using a verification sample;
based on the performance of the neural network structure corresponding to each first search path sample, sequencing the selected first search path samples;
determining a label structure label of each first search path sample based on the sequencing result of the selected plurality of first search path samples; wherein the labeled structure labels are used to characterize the structural performance of the first search path sample;
the path discriminator is trained on a plurality of first search path samples labeled with a label structure.
In a possible implementation manner, the training module 604, determining the label structure label of each first search path sample based on the ranking result of the selected plurality of first search path samples, includes:
based on the set sorting percentage and the sorting results of the selected multiple first search path samples, determining the mark structure labels of the first search path samples with the sorting results within the sorting percentage as first mark structure labels, and determining the mark structure labels of the first search path samples with the sorting results outside the sorting percentage as second mark structure labels;
wherein the tag structure labels comprise a first tag structure label and a second tag structure label; the first tag structure label has a structural performance superior to the second tag structure label.
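The percentage-based labeling described above can be sketched as follows; label 1 stands for the first labeled structure label, 0 for the second, and the helper name and tie-breaking rule are assumptions.

```python
def label_by_rank(accuracies, percentage=0.3):
    """Label a sample 1 if its accuracy ranks within the top
    `percentage` of all samples, else 0."""
    order = sorted(range(len(accuracies)),
                   key=lambda i: accuracies[i], reverse=True)
    cutoff = max(1, int(len(accuracies) * percentage))
    labels = [0] * len(accuracies)
    for rank, idx in enumerate(order):
        if rank < cutoff:
            labels[idx] = 1
    return labels
```

For example, with a 40% sorting percentage over five samples, the two highest-accuracy samples receive the first label and the rest the second.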
In one possible implementation, the training module 604, training the path discriminator based on a plurality of first search path samples with label structure labels, includes:
determining a first loss of the path discriminator based on a plurality of first search path samples with label structure labels, and training the path discriminator by using the first loss;
wherein the first loss comprises a binary classification loss and/or a ranking loss; the ranking loss is determined based on the number of selected first search path samples, the confidence of the predicted structure label determined by the path discriminator for each first search path sample, and the set sorting percentage.
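One plausible form of such a ranking loss — not necessarily the exact formula of the present disclosure — is a pairwise margin loss pushing the confidences of samples ranked within the sorting percentage above those ranked outside it:

```python
def pairwise_rank_loss(confidences, labels, margin=0.1):
    """Hinge loss over all (positive, negative) pairs: a positively
    labeled sample's confidence should exceed a negative one's by at
    least `margin`."""
    pos = [c for c, y in zip(confidences, labels) if y == 1]
    neg = [c for c, y in zip(confidences, labels) if y == 0]
    if not pos or not neg:
        return 0.0
    pairs = [(p, n) for p in pos for n in neg]
    return sum(max(0.0, margin - (p - n)) for p, n in pairs) / len(pairs)
```

The loss vanishes once every in-percentage sample's confidence exceeds every out-of-percentage sample's confidence by the margin.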
In one possible implementation, the training module 604, after determining the label structure label of each first search path sample based on the ranking results of the plurality of first search path samples, is further configured to:
selecting at least one second search path sample from other search paths except the plurality of first search path samples in a search space corresponding to the trained hyper-network;
determining a marker structure label for a second search path sample based on an edit distance between the second search path sample and the first search path sample;
the training the path discriminator based on a plurality of first search path samples with label structure labels comprises:
training the path discriminator based on a plurality of first search path samples with label structure labels and a plurality of second search path samples.
In a possible implementation, the generating module 603, when determining a target search path from the compressed search space and generating a target neural network based on the target search path, is configured to:
determining a target search path from the compressed search space by using a trained path discriminator and a set evolutionary algorithm;
and training a target network structure corresponding to the target search path to generate a target neural network.
In one possible implementation, the generating module 603, when determining the target search path from the compressed search space by using the trained path discriminator and the set evolutionary algorithm, is configured to:
taking an initial population formed by at least one search path in the compressed search space as a current population;
determining a structure label of each search path in the current population by using a path discriminator;
generating a current population after screening based on a search path corresponding to the first structure label;
generating a population subjected to iteration processing at this time based on the current population after screening and the evolutionary algorithm;
taking the population after the iteration processing as a current population, and returning to the step of determining the structure label of each search path in the current population by using a path discriminator until the iteration times are equal to a set time threshold;
and determining a target search path from the initial population and the population generated by multiple iterations.
In an alternative embodiment, before training the path discriminator, the apparatus further comprises:
a pre-training module 605, configured to pre-train the super network until the super network meets a preset initialization condition;
the training module 604, when training the path discriminator, is configured to: initializing the path discriminator, and training the initialized path discriminator based on the hyper-network after completing the pre-training.
Based on the same concept, an embodiment of the present disclosure further provides a data processing apparatus, as shown in fig. 7, which is an architecture schematic diagram of the data processing apparatus provided in the embodiment of the present disclosure, and includes a first obtaining module 701 and a processing module 702, specifically:
a first obtaining module 701, configured to obtain data to be processed; the data to be processed comprises: any one of an image to be processed, a character to be processed and point cloud data to be processed;
a processing module 702, configured to process the data to be processed by using the neural network generated based on the neural network generation method provided in the foregoing embodiment, so as to obtain a data processing result of the data to be processed.
Based on the same concept, an embodiment of the present disclosure further provides an intelligent driving control device, as shown in fig. 8, which is a schematic diagram of an architecture of the intelligent driving control device provided in the embodiment of the present disclosure, and includes a second obtaining module 801, a detecting module 802, and a control module 803, specifically:
a second obtaining module 801, configured to obtain an image or point cloud data acquired by a driving device in a driving process;
a detection module 802, configured to detect a target object in the image or point cloud data by using a neural network generated based on the neural network generation method provided in the embodiment of the present disclosure;
a control module 803 for controlling the running gear based on the detected target object.
In some embodiments, the functions of the apparatus provided in the embodiments of the present disclosure, or the modules included therein, may be used to execute the methods described in the above method embodiments; for specific implementation, reference may be made to the description of the above method embodiments, and for brevity, no further description is provided here.
Based on the same technical concept, the embodiment of the disclosure also provides an electronic device. Referring to fig. 9, a schematic structural diagram of an electronic device provided in the embodiment of the present disclosure includes a processor 901, a memory 902, and a bus 903. The memory 902 is used for storing execution instructions, and includes a memory 9021 and an external memory 9022; the memory 9021 is also referred to as an internal memory, and is configured to temporarily store operation data in the processor 901 and data exchanged with an external memory 9022 such as a hard disk, the processor 901 exchanges data with the external memory 9022 through the memory 9021, and when the electronic device 900 is operated, the processor 901 communicates with the memory 902 through the bus 903, so that the processor 901 executes the following instructions:
determining an initial search space of a neural network structure based on a hyper-network; the initial search space comprises a plurality of search paths; the hyper-network comprises a plurality of network layers, each network layer comprising at least one operator; each search path includes an operator in each network layer of the hyper-network;
screening a plurality of search paths in the initial search space by using a path discriminator;
training the hyper-network based on the screening result;
determining a compressed search space corresponding to the trained hyper-network; the path discriminator is a trained model used for classifying the performance of the neural network structure corresponding to the search path;
and determining a target search path from the compressed search space, and generating a target neural network based on the target search path.
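The search-space construction described in the instructions above — a hyper-network whose layers each offer candidate operators, with a search path selecting one operator per layer — can be sketched as follows. The operator names and layer counts are illustrative assumptions, not taken from the disclosure.

```python
import itertools
import random

# Hypothetical candidate operators per network layer of the hyper-network.
OPERATORS_PER_LAYER = [
    ["conv3x3", "conv5x5", "skip"],   # layer 1 candidates
    ["conv3x3", "maxpool", "skip"],   # layer 2 candidates
    ["conv3x3", "conv5x5"],           # layer 3 candidates
]

def initial_search_space(ops_per_layer):
    """Enumerate every search path (one operator chosen per layer)."""
    return list(itertools.product(*ops_per_layer))

def sample_path(ops_per_layer, rng=random):
    """Uniformly sample a single search path from the hyper-network."""
    return tuple(rng.choice(layer) for layer in ops_per_layer)

space = initial_search_space(OPERATORS_PER_LAYER)
print(len(space))  # 3 * 3 * 2 = 18 candidate paths
```

Screening then amounts to discarding paths from this enumeration, which is what compresses the search space round by round.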
Based on the same technical concept, an embodiment of the present disclosure further provides an electronic device. Referring to fig. 10, a schematic structural diagram of an electronic device provided in the embodiment of the present disclosure, the electronic device includes a processor 1001, a memory 1002, and a bus 1003. The memory 1002 is configured to store execution instructions and includes an internal memory 10021 and an external memory 10022. The internal memory 10021 is configured to temporarily store operation data in the processor 1001 and data exchanged with the external memory 10022, such as a hard disk; the processor 1001 exchanges data with the external memory 10022 through the internal memory 10021. When the electronic device 1000 operates, the processor 1001 communicates with the memory 1002 through the bus 1003, causing the processor 1001 to execute the following instructions:
acquiring data to be processed; the data to be processed comprises: any one of an image to be processed, a character to be processed and point cloud data to be processed;
and processing the data to be processed by using the neural network generated by the neural network generation method provided by the embodiment to obtain a data processing result of the data to be processed.
Based on the same technical concept, an embodiment of the present disclosure further provides an electronic device. Referring to fig. 11, a schematic structural diagram of an electronic device provided in the embodiment of the present disclosure, the electronic device includes a processor 1101, a memory 1102, and a bus 1103. The memory 1102 is configured to store execution instructions and includes an internal memory 11021 and an external memory 11022. The internal memory 11021 is configured to temporarily store operation data in the processor 1101 and data exchanged with the external memory 11022, such as a hard disk; the processor 1101 exchanges data with the external memory 11022 through the internal memory 11021. When the electronic device 1100 operates, the processor 1101 communicates with the memory 1102 through the bus 1103, causing the processor 1101 to execute the following instructions:
acquiring image or point cloud data acquired by a driving device in the driving process;
detecting a target object in the image or point cloud data by using a neural network generated by the neural network generation method provided by the embodiment;
controlling the running device based on the detected target object.
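The acquire-detect-control pipeline in the instructions above can be sketched minimally as follows. The detection format, object classes, and braking threshold are assumptions for illustration; the generated target neural network would stand in for whatever produces the detection list.

```python
# A detection here is assumed to be a dict with a class name and a distance
# (in meters) to the driving device; both fields are illustrative.
def control_decision(detections, brake_distance=5.0):
    """Return a driving command derived from detected target objects."""
    for obj in detections:
        if obj["cls"] == "pedestrian" and obj["dist"] < brake_distance:
            return "brake"
    return "cruise"

print(control_decision([{"cls": "pedestrian", "dist": 3.2}]))  # brake
```

In the disclosed method, the detection step would run the target neural network generated from the compressed search space on each acquired image or point cloud frame.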
Furthermore, an embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, it performs the steps of the neural network generation method described in the above method embodiments, or the steps of the data processing method described in the above method embodiments, or the steps of the intelligent driving control method described in the above method embodiments.
The computer program product of the neural network generation method provided in the embodiments of the present disclosure includes a computer-readable storage medium storing program code; the instructions included in the program code may be used to execute the steps of the neural network generation method described in the above method embodiments, to which reference may be made for details, and which are not repeated here.
The computer program product of the data processing method provided in the embodiments of the present disclosure includes a computer-readable storage medium storing program code; the instructions included in the program code may be used to execute the steps of the data processing method described in the above method embodiments, to which reference may be made for details, and which are not repeated here.
The computer program product of the intelligent driving control method provided in the embodiments of the present disclosure includes a computer-readable storage medium storing program code; the instructions included in the program code may be used to execute the steps of the intelligent driving control method described in the above method embodiments, to which reference may be made for details, and which are not repeated here.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of the units is only one kind of logical division, and other divisions are possible in actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes.
The above are only specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present disclosure, and shall be covered by the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (13)

1. An intelligent travel control method, characterized by comprising:
determining an initial search space of a neural network structure based on a hyper-network; the initial search space comprises a plurality of search paths; the hyper-network comprises a plurality of network layers, each network layer comprising at least one operator; each search path including an operator in each network layer of the hyper-network;
screening the search path in the initial search space by using a path discriminator;
training the hyper-network based on the screening result;
determining a compressed search space corresponding to the trained hyper-network; the path discriminator is a trained model used for classifying the performance of the neural network structure corresponding to the search path;
determining a target search path from the compressed search space, and generating a target neural network based on the target search path;
in the case where the path arbiter comprises multiple stages, training the super-network according to the following steps:
taking the i-th-level path discriminator as a current path discriminator corresponding to the current search space; wherein, under the condition that the current search space is an initial search space, the i-th-level path discriminator is a 1st-level path discriminator, and i is a positive integer;
selecting a plurality of search paths from the current search space, determining a structural label of each search path in the plurality of selected search paths by using a current path discriminator, training the search path corresponding to a first structural label in the plurality of selected search paths, and generating a hyper-network after the training of the current round based on the trained search path corresponding to the first structural label and the unselected search path of the current round;
under the condition that the number of rounds of training the hyper-network meets a preset condition, training the current path discriminator to generate an (i+1)-th-level path discriminator, and determining the (i+1)-th-level path discriminator as the current path discriminator;
taking the search space after the current round of compression corresponding to the hyper-network after the current round of training as the current search space, and returning to the step of selecting a plurality of search paths from the current search space until the training cutoff condition is met;
acquiring image or point cloud data acquired by a driving device in the driving process;
detecting a target object in the image or point cloud data by using the generated target neural network;
controlling the running device based on the detected target object.
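The multi-stage training loop in claim 1 can be sketched as follows: each round, a batch of paths is drawn from the current space, the current-stage discriminator keeps only the paths it assigns the first (better) structure label, those paths are trained, rejected paths are dropped to compress the space, and after a preset number of rounds the discriminator is re-trained to the next stage. All components here are stubs; batch sizes, round counts, and the upgrade schedule are illustrative assumptions.

```python
import random

def train_supernet(space, discriminator, n_rounds=6, batch=4,
                   upgrade_every=3, rng=random):
    """Sketch of the multi-stage screen-train-compress loop.

    `space` is a list of search paths; `discriminator(path, stage)` returns
    True when the path gets the first structure label at that stage.
    """
    stage = 1
    for rnd in range(1, n_rounds + 1):
        candidates = rng.sample(space, min(batch, len(space)))
        good = [p for p in candidates if discriminator(p, stage)]
        # ... train the hyper-network weights along each path in `good` ...
        # Compress: the next round's space drops the paths rejected this round.
        rejected = set(candidates) - set(good)
        space = [p for p in space if p not in rejected]
        if rnd % upgrade_every == 0:   # preset round-number condition met:
            stage += 1                 # stage-i discriminator -> stage i+1
    return space, stage
```

With the defaults above, six rounds upgrade the discriminator twice (after rounds 3 and 6), matching the claim's "train the current path discriminator to generate an (i+1)-th-level path discriminator" step.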
2. The method of claim 1, wherein filtering the search path in the initial search space using a path discriminator, and wherein training the hyper-network based on the filtering comprises:
determining, with a path discriminator, structural labels for a plurality of search paths in the initial search space; wherein the structural labels comprise a first structural label and a second structural label, the first structural label having a structural performance superior to the second structural label;
and training a search path corresponding to the first structural label in the plurality of search paths to obtain the trained hyper-network.
3. The method of claim 2, wherein the hyper-network is trained according to the steps of:
taking the initial search space as a current search space, and selecting a plurality of search paths from the current search space;
determining the structure label of each of the selected multiple search paths by using the path discriminator;
training a search path corresponding to the first structural label in the multiple selected search paths, and generating a hyper-network after the training on the basis of the trained search path corresponding to the first structural label and the unselected search path of the round;
and taking the search space after the current round of compression corresponding to the hyper-network after the current round of training as the current search space, and returning to the step of selecting a plurality of search paths from the current search space until the training cutoff condition is met.
4. The method of any one of claims 1 to 3, wherein the path discriminator is trained according to the following steps:
selecting a plurality of first search path samples from a search space corresponding to the hyper-network, and evaluating the performance of a neural network structure corresponding to each first search path sample by using a verification sample;
based on the performance of the neural network structure corresponding to each first search path sample, sequencing the selected first search path samples;
determining a label structure label of each first search path sample based on the sequencing result of the selected plurality of first search path samples; wherein the labeled structure labels are used to characterize the structural performance of the first search path sample;
the path discriminator is trained on a plurality of first search path samples labeled with a label structure.
5. The method according to claim 4, wherein determining the label structure label of each first search path sample based on the ranking result of the selected plurality of first search path samples comprises:
based on the set sorting percentage and the sorting results of the selected multiple first search path samples, determining the mark structure labels of the first search path samples with the sorting results within the sorting percentage as first mark structure labels, and determining the mark structure labels of the first search path samples with the sorting results outside the sorting percentage as second mark structure labels;
wherein the tag structure labels comprise a first tag structure label and a second tag structure label; the first tag structure label has a structural performance superior to the second tag structure label.
6. The method of claim 4, wherein training the path discriminator based on a plurality of first search path samples labeled with a label structure comprises:
determining a first loss of the path discriminator based on a plurality of first search path samples with label structure labels, and training the path discriminator by using the first loss;
wherein the first penalty comprises a binary classification penalty and/or an ordering penalty; the sorting loss is determined based on the number of the selected first search path samples, the confidence degree of the predicted structure label corresponding to each first search path sample determined by the path discriminator and the set sorting percentage.
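A ranking loss of the kind named in claim 6 can be sketched in a pairwise hinge form: samples whose label structure label places them in the set top percentage should receive higher discriminator confidence than the remaining samples. This particular formula is an assumption for illustration; the disclosure specifies only the quantities the loss depends on, not its exact form.

```python
def ranking_loss(conf, labels, margin=0.1):
    """Pairwise hinge ranking loss over discriminator confidences.

    conf[i]: discriminator confidence that sample i carries the first label;
    labels[i]: 1 for ground-truth top-fraction samples, 0 otherwise.
    """
    pos = [c for c, y in zip(conf, labels) if y == 1]
    neg = [c for c, y in zip(conf, labels) if y == 0]
    pairs = [(p, q) for p in pos for q in neg]
    if not pairs:
        return 0.0
    # Penalise any top sample whose confidence does not exceed a bottom
    # sample's confidence by at least the margin.
    return sum(max(0.0, margin - (p - q)) for p, q in pairs) / len(pairs)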
7. The method of claim 4, further comprising, after determining the label structure label of each first search path sample based on the ranking results of the plurality of first search path samples:
selecting at least one second search path sample from other search paths except the plurality of first search path samples in the search space corresponding to the trained hyper-network;
determining a label structure label for the second search path sample based on an edit distance between the second search path sample and the first search path samples;
the training the path arbiter based on a plurality of first search path samples with label structure labels comprises:
training the path discriminator based on a plurality of first search path samples with label structure labels and a plurality of second search path samples.
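The edit-distance labelling in claim 7 can be sketched as a nearest-neighbour rule: a second (unevaluated) search path sample inherits the label of the closest evaluated first sample, with distance counted as the number of layers at which the two paths choose different operators. The nearest-neighbour rule and equal-depth assumption are illustrative; the disclosure only states that the label is based on the edit distance.

```python
def path_edit_distance(a, b):
    """For paths of equal depth: count positions with differing operators."""
    return sum(x != y for x, y in zip(a, b))

def label_by_edit_distance(second, first_labeled):
    """Assign the second sample the label of the nearest first sample.

    `first_labeled` maps each first search path sample to its label
    structure label (e.g. 1 for the first label, 0 for the second).
    """
    nearest = min(first_labeled, key=lambda p: path_edit_distance(second, p))
    return first_labeled[nearest]
```

This lets the discriminator train on more samples than were actually evaluated on the verification set, since the extra labels are inferred structurally.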
8. The method of any one of claims 1 to 7, wherein determining a target search path from the compressed search space and generating a target neural network based on the target search path comprises:
determining a target search path from the compressed search space by using a trained path discriminator and a set evolutionary algorithm;
and training a target network structure corresponding to the target search path to generate the target neural network.
9. The method of claim 8, wherein determining a target search path from the compressed search space using a trained path discriminator and a set evolutionary algorithm comprises:
taking an initial population formed by at least one search path in the compressed search space as a current population;
determining a structure label of each search path in the current population by using a path discriminator;
generating a current population after screening based on a search path corresponding to the first structure label;
generating a population of the current iteration based on the screened current population and the evolutionary algorithm;
taking the population generated by the current iteration as the current population, and returning to the step of determining the structure label of each search path in the current population by using the path discriminator, until the number of iterations equals a set threshold;
and determining a target search path from the initial population and the population generated by multiple iterations.
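The discriminator-guided evolutionary search in claim 9 can be sketched as follows: a population of paths is filtered by the trained path discriminator, survivors are mutated to form the next generation, and the target path is chosen from everything generated across iterations. Here the discriminator is treated as a scorer whose truthiness plays the role of the first structure label; the population size, mutation rule, and iteration count are illustrative assumptions.

```python
import random

def evolve_search(space_ops, discriminator, pop_size=8, n_iters=5, rng=random):
    """Sketch of filter-then-mutate evolutionary search over search paths."""
    population = [tuple(rng.choice(layer) for layer in space_ops)
                  for _ in range(pop_size)]
    history = list(population)           # initial population + all generations
    for _ in range(n_iters):
        # Screening: keep paths the discriminator scores as "good".
        survivors = [p for p in population if discriminator(p)] or population
        # Mutation: re-sample one randomly chosen layer of a survivor.
        population = []
        for _ in range(pop_size):
            parent = list(rng.choice(survivors))
            i = rng.randrange(len(parent))
            parent[i] = rng.choice(space_ops[i])
            population.append(tuple(parent))
        history.extend(population)
    # Target search path: best candidate seen across all populations.
    return max(history, key=discriminator)
```

Because the discriminator is cheap to query, this screens far more candidates than could be evaluated on a validation set directly, which is the point of combining it with the evolutionary algorithm.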
10. The method of any one of claims 1 to 9, wherein, prior to training the path discriminator, the method further comprises:
pre-training the hyper-network until the hyper-network meets a preset initialization condition;
training the path arbiter, comprising:
and initializing the path discriminator, and training the initialized path discriminator based on the hyper-network after the pre-training is finished.
11. An intelligent travel control device, comprising:
a determining module for determining an initial search space of the neural network structure based on a hyper-network; the initial search space comprises a plurality of search paths; the hyper-network comprises a plurality of network layers, each network layer comprising at least one operator; each search path includes an operator in each network layer of the hyper-network;
the screening module is used for screening the search paths in the initial search space by using a path discriminator, training the super network based on a screening result and determining a compressed search space corresponding to the trained super network; the path discriminator is a trained model used for classifying the performance of the neural network structure corresponding to the search path;
the generating module is used for determining a target searching path from the compressed searching space and generating a target neural network based on the target searching path;
in a case where the path arbiter comprises multiple stages, a screening module to train the hyper-network according to the following steps:
taking the i-th-level path discriminator as a current path discriminator corresponding to the current search space; wherein, under the condition that the current search space is an initial search space, the i-th-level path discriminator is a 1st-level path discriminator, and i is a positive integer;
selecting a plurality of search paths from the current search space, determining a structural label of each search path in the plurality of selected search paths by using a current path discriminator, training the search path corresponding to a first structural label in the plurality of selected search paths, and generating a hyper-network after the training of the current round based on the trained search path corresponding to the first structural label and the unselected search path of the current round;
under the condition that the number of rounds of training the hyper-network meets a preset condition, training the current path discriminator to generate an (i+1)-th-level path discriminator, and determining the (i+1)-th-level path discriminator as the current path discriminator;
taking the search space after the current round of compression corresponding to the hyper-network after the current round of training as the current search space, and returning to the step of selecting a plurality of search paths from the current search space until the training cutoff condition is met;
the second acquisition module is used for acquiring the image or point cloud data acquired by the driving device in the driving process;
the detection module is used for detecting a target object in the image or point cloud data by using the generated target neural network;
a control module for controlling the travel device based on the detected target object.
12. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is operating, the machine readable instructions when executed by the processor performing the steps of the intelligent driving control method according to any one of claims 1 to 10.
13. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of the intelligent driving control method according to any one of claims 1 to 10.
CN202011381177.2A 2020-11-30 2020-11-30 Neural network generation method and device, electronic equipment and storage medium Active CN112381227B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011381177.2A CN112381227B (en) 2020-11-30 2020-11-30 Neural network generation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011381177.2A CN112381227B (en) 2020-11-30 2020-11-30 Neural network generation method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112381227A CN112381227A (en) 2021-02-19
CN112381227B true CN112381227B (en) 2023-03-24

Family

ID=74590933

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011381177.2A Active CN112381227B (en) 2020-11-30 2020-11-30 Neural network generation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112381227B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111278085B (en) * 2020-02-24 2023-08-29 北京百度网讯科技有限公司 Method and device for acquiring target network
CN116415647A (en) * 2021-12-29 2023-07-11 华为云计算技术有限公司 Method, device, equipment and storage medium for searching neural network architecture
CN115099393B (en) * 2022-08-22 2023-04-07 荣耀终端有限公司 Neural network structure searching method and related device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018093926A1 (en) * 2016-11-15 2018-05-24 Google Llc Semi-supervised training of neural networks
CA3028601C (en) * 2018-12-18 2021-10-26 Beijing Didi Infinity Technology And Development Co., Ltd. Systems and methods for determining driving path in autonomous driving
CN111325328B (en) * 2020-03-06 2023-10-24 上海商汤临港智能科技有限公司 Neural network generation method, data processing method and device
CN111582453B (en) * 2020-05-09 2023-10-27 北京百度网讯科技有限公司 Method and device for generating neural network model

Also Published As

Publication number Publication date
CN112381227A (en) 2021-02-19

Similar Documents

Publication Publication Date Title
CN112381227B (en) Neural network generation method and device, electronic equipment and storage medium
JP6862579B2 (en) Acquisition of image features
CN109522942B (en) Image classification method and device, terminal equipment and storage medium
CN107330074B (en) Image retrieval method based on deep learning and Hash coding
CN110782015A (en) Training method and device for network structure optimizer of neural network and storage medium
CN112487168B (en) Semantic question-answering method and device of knowledge graph, computer equipment and storage medium
CN111382868A (en) Neural network structure search method and neural network structure search device
CN112767997A (en) Protein secondary structure prediction method based on multi-scale convolution attention neural network
CN109460793A (en) A kind of method of node-classification, the method and device of model training
CN110795527B (en) Candidate entity ordering method, training method and related device
CN108536784B (en) Comment information sentiment analysis method and device, computer storage medium and server
CN111079780A (en) Training method of space map convolution network, electronic device and storage medium
CN113688851B (en) Data labeling method and device and fine granularity identification method and device
CN105184260A (en) Image characteristic extraction method, pedestrian detection method and device
CN111489366A (en) Neural network training and image semantic segmentation method and device
CN116596095B (en) Training method and device of carbon emission prediction model based on machine learning
CN114037055A (en) Data processing system, method, device, equipment and storage medium
CN112819050A (en) Knowledge distillation and image processing method, device, electronic equipment and storage medium
CN111881854A (en) Action recognition method and device, computer equipment and storage medium
CN111414951A (en) Method and device for finely classifying images
WO2022100607A1 (en) Method for determining neural network structure and apparatus thereof
CN111709475A (en) Multi-label classification method and device based on N-grams
CN112801271B (en) Method for generating neural network, data processing method and intelligent driving control method
CN116206201A (en) Monitoring target detection and identification method, device, equipment and storage medium
CN113032612B (en) Construction method of multi-target image retrieval model, retrieval method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant