CN110543944A - neural network structure searching method, apparatus, electronic device, and medium - Google Patents

Neural network structure searching method, apparatus, electronic device, and medium

Info

Publication number: CN110543944A (application CN201910859899.5A; granted as CN110543944B)
Authority: CN (China)
Prior art keywords: network structure, controller, trainer, information, neural network
Legal status: Granted; Active
Inventors: 温圣召, 希滕, 张刚
Original and current assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
Other languages: Chinese (zh)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Abstract

The embodiments of the present application disclose a neural network structure searching method, apparatus, electronic device, and medium, relating to the technical field of neural networks. The specific implementation scheme is as follows: a controller sends candidate network structure information obtained by searching to a trainer according to a preset communication protocol; the trainer trains a neural network model according to the candidate network structure information and sample data, and feeds index information generated based on the neural network model back to the controller according to the preset communication protocol; the controller then searches again according to the index information, and if the candidate network structure information obtained by searching has converged, the candidate network structure whose structure information has converged is determined as the target neural network structure obtained by the final search. Based on a preset communication mechanism, information interaction between the controller and the trainer under the preset communication protocol is realized, which solves the technical problem of strong coupling caused by the controller and the trainer having to be built on the same development framework and decouples the controller from the trainer.

Description

Neural network structure searching method, apparatus, electronic device, and medium
Technical Field
The embodiments of the present application relate to the field of computer technology, in particular to the field of neural networks, and specifically to a neural network structure searching method, apparatus, electronic device, and medium.
Background
Neural network structure search (Neural Architecture Search, NAS) can help developers automatically search out an optimal neural network structure together with model parameters and the like.
At present, in the automatic search process of a neural network structure, the controller and the trainer must be built on the same development framework in order to obtain the optimal network structure through repeated cycles of structure search, model construction, training and the like. The coupling between the controller and the trainer in the prior art is therefore too strong: the trainer cannot be debugged as an independent process, and when the development frameworks differ, secondary development and index alignment must be performed on the trainer, which increases development difficulty and time consumption and places high demands on the technical capability of developers.
Disclosure of the Invention
The embodiments of the present application provide a neural network structure searching method, apparatus, electronic device, and medium, which remove the coupling between a controller and a trainer and allow the controller and the trainer to be debugged and run independently.
In a first aspect, an embodiment of the present application provides a neural network structure search method, including:
The controller sends the candidate network structure information obtained by searching to the trainer according to a preset communication protocol;
the trainer trains a neural network model according to the candidate network structure information and the sample data, and feeds back index information generated based on the neural network model to the controller according to the preset communication protocol;
And the controller searches again according to the index information, and if the candidate network structure information obtained by searching has converged, determines the candidate network structure whose structure information has converged as the target neural network structure obtained by the final search.
One embodiment in the above application has the following advantages or benefits: based on a preset communication mechanism, information interaction between the controller and the trainer under a preset communication protocol is realized, which solves the technical problem of strong coupling caused by the controller and the trainer having to be built on the same development framework and decouples the controller from the trainer, so that the controller and the trainer can be debugged and run independently. The controller and the trainer therefore do not need to adopt the same development framework for network structure search, secondary development of the trainer is avoided, development difficulty and time consumption are reduced, and the technical capability requirements on developers are lowered.
Optionally, the controller sends the candidate network structure information obtained by searching to the trainer according to a preset communication protocol, including:
The controller extracts network structure sub-information from the information searched by the controller by adopting an information reading mode associated with a development framework of the controller in the preset communication protocol;
And the controller packages the network structure sub-information into candidate network structure information with a uniform format according to a format packaging mode in the preset communication protocol, and sends the candidate network structure information to the trainer.
Optionally, the trainer feeding back the index information generated based on the neural network model to the controller according to the preset communication protocol includes:
The trainer adopts an information reading mode associated with the development framework of the trainer in the preset communication protocol to extract index sub-information from the information generated by the trainer;
And the trainer packages the index sub-information into index information with a uniform format according to a format packaging mode in the preset communication protocol, and feeds the index information back to the controller.
One embodiment in the above application has the following advantages or benefits: no matter whether the development framework of the controller is the same as that of the trainer or not, the data and the indexes generated by the controller and the trainer can be converted in a unified manner based on a preset communication mechanism, and the coupling between the controller and the trainer is relieved.
Optionally, the controller sends the candidate network structure information obtained by searching to the trainer according to a preset communication protocol, including:
The controller searches a neural network structure in a search space based on a search strategy to obtain candidate network structure information;
And the controller sends the candidate network structure information to the trainer according to a preset communication protocol.
One embodiment in the above application has the following advantages or benefits: by automatically searching the network structure information, the controller determines the maximum number of layers of the network, the operation type of each layer, the activation function, the operation-related hyper-parameters and the like, which removes the need for manual parameter tuning and improves the model construction effect.
Optionally, the trainer training a neural network model according to the candidate network structure information and the sample data, and feeding back, to the controller, index information generated based on the neural network model according to the preset communication protocol, includes:
The trainer parses the received candidate network structure information according to the preset communication protocol;
The trainer builds a neural network model according to the parsed candidate network structure information, and trains and tests the neural network model based on the sample data to generate index information of the neural network model;
And the trainer feeds the index information back to the controller according to the preset communication protocol.
Optionally, the controller performs a re-search according to the index information, including:
The controller carries out standardization processing on the index information according to a development framework of the trainer;
And the controller searches again according to the index information after the standardization processing.
One embodiment in the above application has the following advantages or benefits: standardizing the index information overcomes the inconsistency of index standards under different development frameworks, so that the controller can interact with trainers built on different development frameworks and the search of a network structure is realized.
Optionally, the controller performs a re-search according to the index information, including:
And the controller performs re-search according to the index information obtained by analysis and a preset index expected value associated with the development framework of the trainer.
One embodiment in the above application has the following advantages or benefits: for the trainer that interacts with the controller, the controller is preset with index expected values associated with the trainer's development framework, which overcomes the inconsistency of index standards under different development frameworks.
Optionally, the controller sends the candidate network structure information obtained by searching to the trainer according to a preset communication protocol, including:
The controller searches a neural network structure based on a preset basic network model to obtain candidate network structure information;
And the controller sends the candidate network structure information to the trainer according to a preset communication protocol.
Optionally, before the controller performs a re-search according to the index information, the method further includes:
And if the controller determines, according to the index information fed back by the trainer, that the index information of the neural network model trained on the basis of the candidate network structure information is lower than the index information of the preset basic network model, the preset basic network model is determined as the target neural network structure obtained by the final search.
One embodiment in the above application has the following advantages or benefits: the controller searches based on the preset basic network model, so that the defect of starting searching from the lowest-level network structure is overcome, the useless searching operation in the initial searching stage is reduced, and the searching efficiency is improved.
Optionally, the sample data includes at least one of an image classification sample, a speech recognition sample, and a semantic understanding sample.
Optionally, if the sample data is an image classification sample, the neural network model constructed based on the target neural network structure is used for image classification.
One embodiment in the above application has the following advantages or benefits: the trainer trains and tests the model with the sample data, so that the neural network model can be applied in the technical field to which the sample data belongs, which improves the efficiency and accuracy of data processing based on the neural network model in that field.
Optionally, the development framework of the controller is different from the development framework of the trainer.
One embodiment in the above application has the following advantages or benefits: a controller and a trainer developed on different development frameworks can be debugged, run and made to interact independently based on the preset communication mechanism, which removes the strong coupling imposed on the controller and the trainer by the development-framework restriction.
In a second aspect, an embodiment of the present application provides a neural network structure searching apparatus, including:
The network structure searching module is used for sending the candidate network structure information obtained by searching to the trainer by the controller according to a preset communication protocol;
The model training module is used for the trainer to train a neural network model according to the candidate network structure information and the sample data and to feed back index information generated based on the neural network model to the controller according to the preset communication protocol;
A search result determining module, configured for the controller to perform a re-search according to the index information, and if the candidate network structure information obtained through searching has converged, to determine the candidate network structure whose structure information has converged as the target neural network structure obtained by the final search.
In a third aspect, an embodiment of the present application provides an electronic device, including:
At least one processor; and
A memory communicatively connected to the at least one processor; wherein
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform a neural network structure search method as described in any of the embodiments of the present application.
In a fourth aspect, embodiments of the present application provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute a neural network structure searching method according to any of the embodiments of the present application.
One embodiment in the above application has the following advantages or benefits: based on a preset communication mechanism, the controller can send the candidate network structure information obtained by searching to the trainer; the trainer builds a neural network model according to the received candidate network structure information, trains and tests the neural network model based on sample data, and feeds the generated index information of the neural network model back to the controller; the controller then searches again based on the index information. These steps are repeated in a loop until the candidate network structure information obtained by searching is determined to have converged, and the candidate network structure whose structure information has converged is determined as the target neural network structure obtained by the final search. Based on the preset communication mechanism, information interaction between the controller and the trainer under a preset communication protocol is realized, which solves the technical problem of strong coupling caused by the controller and the trainer having to be built on the same development framework and decouples the controller from the trainer, so that the controller and the trainer can be debugged and run independently. The controller and the trainer therefore do not need to adopt the same development framework for network structure search, secondary development of the trainer is avoided, development difficulty and time consumption are reduced, and the technical capability requirements on developers are lowered.
Other effects of the above-described alternatives will be described below with reference to specific embodiments.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
Fig. 1 is a flowchart of a neural network structure searching method according to a first embodiment of the present application;
Fig. 2 is a flowchart of a neural network structure searching method according to a second embodiment of the present application;
Fig. 3 is an architecture diagram of a neural network structure search according to the second embodiment of the present application;
Fig. 4 is a flowchart of a network structure search performed by a controller according to the second embodiment of the present application;
Fig. 5 is a flowchart of neural network training performed by a trainer according to the second embodiment of the present application;
Fig. 6 is a schematic structural diagram of a neural network structure searching apparatus according to a third embodiment of the present application;
Fig. 7 is a block diagram of an electronic device for a neural network structure searching method according to an embodiment of the present application.
Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
First Embodiment
Fig. 1 is a flowchart of a neural network structure search method according to the first embodiment of the present application. This embodiment is applicable to searching for neural network structure information; during the search, a model may be trained on the searched network structure information using sample data from the technical field in which the model is to be applied, so as to generate index information that assists the structure search. The method may be executed by a neural network structure searching apparatus, which is implemented in software and/or hardware and is preferably configured in an electronic device. As shown in Fig. 1, the method specifically includes the following steps:
S110. The controller sends the candidate network structure information obtained by searching to the trainer according to a preset communication protocol.
In the specific embodiments of the present application, the controller refers to an executable program for neural network structure selection, feature extraction and hyper-parameter tuning. The trainer is an executable program that constructs, trains and tests a neural network model using labelled sample data, based on the network structure information obtained by the controller's search. Given that neural network models can be applied in various specific fields of artificial intelligence, the sample data may include at least one of image classification samples, speech recognition samples, and semantic understanding samples. Illustratively, if the sample data is an image classification sample, the neural network model determined by the search is used for image classification. The sample data may be pre-labelled, for example image classification samples including positive and negative samples of the classes to be distinguished. The sample data can further be divided into training samples and test samples: the training samples are used for training the constructed neural network model, and the test samples are used for testing the trained neural network model to obtain index information. It should be noted that this embodiment is not limited to the above-mentioned fields; the field of application is determined by the sample data adopted by the trainer.
The controller and the trainer can be developed in advance on some development framework, such as TensorFlow. This embodiment does not limit the development frameworks of the controller and the trainer: the frameworks of the two may be the same or different, and the controller and the trainer may be configured on the same computer device or on different computer devices, each running as an independent process, with information interaction realized through message-based communication.
Correspondingly, the controller and the trainer loop continuously through multiple rounds of optimizing search to obtain the optimal network structure, so the candidate network structure information refers to the network structure information obtained by the controller in the current loop iteration after operations such as neural network structure selection, feature extraction and hyper-parameter tuning. The network structure information obtained in each iteration is an optimization of the network structure obtained in the previous iteration. The candidate network structure information may include the maximum number of layers of the network, the operation type of each layer, an activation function, and operation-related hyper-parameters, such as the size and number of filters, so as to specify the design of the neural network.
Specifically, the controller may perform a neural network structure search in a search space based on a search strategy to obtain candidate network structure information, and send the candidate network structure information to the trainer according to the preset communication protocol. The search space defines a number of single network structures, such as convolutional layers, fully connected layers and pooling layers, as well as commonly used spliced basic network structures, such as recurrent neural networks (RNNs). For a given search task, the size of the search space can be reduced and the search simplified by combining prior knowledge of the relevant attributes. The search strategy defines which algorithm is used to quickly and accurately find the optimal network structure parameter configuration. During the search, the controller can perform further optimizing search on the basis of the initially searched network structure, combined with the index information fed back from the neural network model that the trainer trained on that structure, until the best-performing neural network structure is obtained through the cyclic search.
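By way of illustration only, the following is a minimal Python sketch of sampling one candidate structure from such a search space; the operation types, value ranges and function names are assumptions of this sketch rather than anything prescribed by the embodiment, and spliced structures such as RNNs are omitted for brevity.

    import random

    # Illustrative search space; the concrete operation types and value ranges
    # below are assumptions for this sketch.
    OP_TYPES = ["conv", "fully_connected", "pooling"]
    ACTIVATIONS = ["relu", "tanh", "sigmoid"]
    FILTER_SIZES = [3, 5, 7]
    FILTER_COUNTS = [16, 32, 64]
    MAX_LAYERS = 8

    def sample_candidate_structure(rng):
        """Randomly sample one candidate network structure from the search space."""
        layers = []
        for _ in range(rng.randint(1, MAX_LAYERS)):
            op = rng.choice(OP_TYPES)
            layer = {"op": op, "activation": rng.choice(ACTIVATIONS)}
            if op == "conv":
                # Operation-related hyper-parameters, e.g. filter size and count.
                layer["filter_size"] = rng.choice(FILTER_SIZES)
                layer["filter_count"] = rng.choice(FILTER_COUNTS)
            layers.append(layer)
        return {"max_layers": MAX_LAYERS, "layers": layers}

    candidate = sample_candidate_structure(random.Random(0))

A real search strategy would, of course, be guided by the index information fed back from the trainer rather than sampling uniformly at random.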
In the prior art, the controller and the trainer must be built on the same development framework to operate, and they must be debugged and run together rather than separately. This embodiment therefore removes the restriction on the development frameworks of the controller and the trainer and, correspondingly, adds a communication mechanism between them, ensuring that the controller and the trainer can be debugged and run separately and independently and can exchange information regardless of whether their development frameworks are the same.
Specifically, a communication protocol for the interaction between the controller and the trainer can be established in advance, covering aspects such as the information reading modes and index standards under the various development frameworks. The preset communication protocol may specify how information is read under each development framework, how model performance indexes under each framework are standardized, which fields a message to be sent contains and how they are represented, and so on. The preset communication protocol is configured in both the controller and the trainer. Therefore, after the controller has executed a search task and obtained candidate network structure information, it sends that candidate network structure information to the trainer based on the preset communication protocol.
For example, controllers developed on different development frameworks generate information in their own, often complex, formats. The controller may therefore extract the network structure sub-information from the information obtained by its own search using the information reading mode that the preset communication protocol associates with the controller's development framework. The controller then encapsulates the network structure sub-information into candidate network structure information in a uniform format, for example a key-value format consisting of a field identifier and a field value, according to the format encapsulation mode in the preset communication protocol, and sends the candidate network structure information to the trainer.
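As a minimal sketch of this packaging and sending step, assuming a JSON-over-TCP transport and field names that are purely illustrative (the embodiment itself only requires a key-value format and some inter-process communication mechanism):

    import json
    import socket

    def pack_candidate_message(structure, framework):
        """Wrap the extracted network-structure sub-information in a uniform
        key-value message; the field names and JSON encoding are assumptions."""
        message = {
            "msg_type": "candidate_structure",
            "sender_framework": framework,   # lets the receiver pick the right parser
            "payload": structure,            # key-value pairs: field identifier -> field value
        }
        return (json.dumps(message) + "\n").encode("utf-8")

    def send_to_trainer(structure, host="127.0.0.1", port=9000):
        """Send one candidate structure to the trainer process."""
        with socket.create_connection((host, port)) as conn:
            conn.sendall(pack_candidate_message(structure, framework="framework_a"))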
In addition, when performing a network structure search the controller typically optimizes step by step from scratch: for example, a primitive network structure is first formed by splicing individual structures and initializing their parameters, and further optimization is then performed based on index information such as running speed and accuracy fed back by the trainer. In other words, the searched network structure goes through a gradual process from nothing to something and from low performance to high performance. This embodiment can therefore configure a preset basic network for the controller; since the preset basic network already has a reasonably mature structure and performance, the controller can perform further optimizing search on this basis when the search is triggered, avoiding a large number of useless search operations in the initial stage and shortening the search flow and time.
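A minimal sketch of such a warm start is given below; the base structure shown (a small convolutional network) and the mutation rule are assumptions of the sketch, not part of the embodiment:

    import copy
    import random

    # Illustrative preset basic network structure; its layers are assumptions.
    BASE_STRUCTURE = {
        "layers": [
            {"op": "conv", "activation": "relu", "filter_size": 3, "filter_count": 32},
            {"op": "pooling"},
            {"op": "fully_connected", "activation": "relu"},
        ],
    }

    def mutate(structure, rng):
        """Propose the next candidate by perturbing the current best structure
        instead of searching from scratch."""
        candidate = copy.deepcopy(structure)
        conv_layers = [layer for layer in candidate["layers"] if layer["op"] == "conv"]
        rng.choice(conv_layers)["filter_count"] = rng.choice([16, 32, 64, 128])
        return candidate

    next_candidate = mutate(BASE_STRUCTURE, random.Random(0))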
S120. The trainer trains a neural network model according to the candidate network structure information and the sample data, and feeds index information generated based on the neural network model back to the controller according to the preset communication protocol.
In the specific embodiments of the present application, the index information describes performance indexes of the neural network model, such as running time, accuracy, latency and loss. The neural network model it describes is constructed and trained based on the candidate network structure information sent by the controller, and the index information is obtained by testing that model under a certain performance evaluation policy. As with the candidate structure information, the controller and the trainer loop continuously through multiple rounds of optimizing search to obtain the optimal network structure, so the index information refers to the index information generated by the trainer in the current loop iteration; the index information obtained in each iteration assists the controller in searching for better network structure information in the next iteration and thus guides the controller's search.
Specifically, the trainer can analyze the received candidate network structure information according to a preset communication protocol to obtain candidate network structure information which accords with a development framework of the trainer; the trainer constructs a neural network model according to the candidate network structure information obtained by analysis, and adopts pre-marked sample data to train and test the constructed neural network model to generate index information of the neural network model; and finally, according to a preset communication protocol, feeding the index information back to the controller so as to guide the next cyclic search of the controller.
Trainers developed on different development frameworks likewise generate information in their own, often complex, formats, and the trainer may extract the index sub-information from the information it generates using the information reading mode that the preset communication protocol associates with the trainer's development framework. The trainer then encapsulates the index sub-information into index information in a uniform format, for example a key-value format consisting of a field identifier and a field value, according to the format encapsulation mode in the preset communication protocol, and feeds the index information back to the controller.
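A minimal sketch of the trainer side of this exchange is shown below, mirroring the controller-side sketch above; the message fields and the train_and_eval callable are assumptions of the sketch:

    import json

    def handle_candidate_message(raw, train_and_eval):
        """Parse a uniform candidate-structure message, run one train/test cycle
        through the supplied train_and_eval callable, and package the resulting
        index information as a key-value reply."""
        message = json.loads(raw.decode("utf-8"))
        structure = message["payload"]

        # train_and_eval builds the model from the structure, trains it on the
        # labelled training samples and evaluates it on the test samples.
        metrics = train_and_eval(structure)  # e.g. {"accuracy": 0.91, "latency_ms": 12.3}

        reply = {
            "msg_type": "metrics",
            "sender_framework": "framework_b",
            "payload": metrics,
        }
        return (json.dumps(reply) + "\n").encode("utf-8")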
S130. The controller searches again according to the index information, and if the candidate network structure information obtained by searching has converged, determines the candidate network structure whose structure information has converged as the target neural network structure obtained by the final search.
In this embodiment of the application, the controller may parse the received index information according to the preset communication protocol to obtain index information that conforms to its own development framework, so that the controller performs a further optimizing search of the network structure under the guidance of the parsed index information.
In this embodiment, during the cyclic search performed by the controller according to the index information, if the candidate network structure information obtained over multiple iterations is the same or similar, that is, no better network structure can be found under the guidance of the index information, the candidate network structure whose structure information has converged may be determined as the target neural network structure obtained by the final search.
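For illustration, a minimal convergence check of this kind might look as follows; the window size and the equality test (identical structures, rather than a similarity measure) are assumptions of the sketch:

    def has_converged(history, window=3):
        """Treat the search as converged when the last `window` candidate
        structures are identical; 'similar' could instead be implemented with an
        edit distance between structure descriptions."""
        if len(history) < window:
            return False
        recent = history[-window:]
        return all(candidate == recent[0] for candidate in recent)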
If the controller searches on the basis of a preset basic network model and determines, from the index information fed back by the trainer, that the index information of the neural network model trained on the candidate network structure information is lower than the index information of the preset basic network model, that is, the preset basic network model is still the best neural network model, the preset basic network model can be directly determined as the target neural network structure obtained by the final search.
In addition, this embodiment removes the strong coupling between the controller and the trainer through the communication mechanism. Further, when the controller searches again based on the index information, given that the evaluation standards for indexes differ between development frameworks, the controller can standardize the index information according to the trainer's development framework and search again using the standardized index information; alternatively, if the controller is preset with index expected values associated with the development framework of the trainer it communicates with, the controller can search again according to the parsed index information and those index expected values.
According to the technical solution of this embodiment, based on a preset communication mechanism, the controller can send the candidate network structure information obtained by searching to the trainer; the trainer builds a neural network model according to the received candidate network structure information, trains and tests it based on sample data, and feeds the generated index information of the neural network model back to the controller; the controller then searches again based on the index information. These steps are repeated in a loop until the candidate network structure information obtained by searching is determined to have converged, and the candidate network structure whose structure information has converged is determined as the target neural network structure obtained by the final search. Based on the preset communication mechanism, information interaction between the controller and the trainer under a preset communication protocol is realized, which solves the technical problem of strong coupling caused by the controller and the trainer having to be built on the same development framework and decouples the controller from the trainer, so that the controller and the trainer can be debugged and run independently. The controller and the trainer therefore do not need to adopt the same development framework for network structure search, secondary development of the trainer is avoided, development difficulty and time consumption are reduced, and the technical capability requirements on developers are lowered.
Second Embodiment
Fig. 2 is a flowchart of a neural network structure searching method according to the second embodiment of the present application. On the basis of the first embodiment, this embodiment further explains how the controller and the trainer send information: sub-information can be extracted from the generated information using the information reading mode that the preset communication protocol associates with the development framework of the sending end, and then assembled into candidate network structure information or index information in a uniform format according to the preset communication protocol before being sent. As shown in Fig. 2, the method specifically includes the following steps:
S210. The controller searches for a neural network structure in a search space based on a search strategy to obtain candidate network structure information.
In this embodiment of the application, the search space defines a number of single network structures, such as convolutional layers, fully connected layers and pooling layers, as well as commonly used spliced basic network structures, such as RNNs. For a given search task, the size of the search space can be reduced and the search simplified by combining prior knowledge of the relevant attributes. The search strategy defines which algorithm is used to quickly and accurately find the optimal network structure parameter configuration.
During the search, the controller can perform further optimizing search on the basis of the initially searched network structure, combined with the index information fed back from the neural network model that the trainer trained on that structure, until the best-performing neural network structure is obtained through the cyclic search.
Specifically, when performing the network structure search the controller usually optimizes step by step from zero: for example, a primitive network structure is formed by splicing individual structures and initializing their parameters, and further optimization is then performed based on index information such as running speed and accuracy fed back by the trainer, so that the searched network structure goes through a gradual process from nothing to something and from low performance to high performance. Alternatively, a preset basic network with a reasonably mature structure and performance can be configured for the controller, so that when the search is triggered the controller performs further optimizing search on that basis, avoiding a large number of useless search operations in the initial stage and shortening the search flow and time.
S220. The controller extracts network structure sub-information from the information obtained by its own search, using the information reading mode that the preset communication protocol associates with the controller's development framework.
In this embodiment, the network structure sub-information refers to one or more specific items of information in the network structure information, and belongs to a subset of the network structure information. For example, the network structure information includes network structure sub-information such as the maximum number of layers of the network, the operation type of each layer, an activation function, and a hyper-parameter related to the operation.
Because controllers developed on different development frameworks generate information in their own, often complex, formats, the communication protocol can be pre-configured with a reading mode for the information generated by executable programs built on each development framework. For example, the network structure sub-information a may be located in the nth row and mth column of the information generated by an executable program built on development framework A, but in a different row and column of the information generated by an executable program built on development framework B. The controller therefore uses the information reading mode that the preset communication protocol associates with its own development framework to extract, from the information obtained by its search, the effective information that serves as the network structure sub-information.
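A minimal sketch of such a framework-specific reading mode is given below; the framework names, field names and (row, column) positions are assumptions of the sketch:

    # The protocol records, for each development framework, where each field
    # sits in that framework's tabular output; the positions are illustrative.
    READERS = {
        "framework_a": {"activation": (2, 1), "filter_count": (3, 4)},
        "framework_b": {"activation": (5, 0), "filter_count": (1, 2)},
    }

    def extract_sub_info(table, framework, field):
        """Read one field out of framework-specific output using the
        (row, column) position the protocol associates with that framework."""
        row, col = READERS[framework][field]
        return table[row][col]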
S230. The controller encapsulates the network structure sub-information into candidate network structure information in a uniform format according to the format encapsulation mode in the preset communication protocol, and sends the candidate network structure information to the trainer.
In the embodiment of the application, the information to be sent is packaged based on the preset communication protocol, so that the integrity and the format consistency of the sent information are ensured, and the information interaction between the controller and the trainer is not limited by a development framework. Therefore, the controller packages the searched network structure information and sends the network structure information to the trainer.
S240. The trainer parses the received candidate network structure information according to the preset communication protocol.
In the specific embodiments of the application, because the information to be sent is encapsulated according to the preset communication protocol, the received information can be recognized by executable programs built on any of the development frameworks. Correspondingly, the trainer can parse the received candidate network structure information according to the preset communication protocol to obtain candidate network structure information that conforms to its own development framework.
S250. The trainer constructs a neural network model according to the parsed candidate network structure information, and trains and tests the neural network model based on the sample data to generate index information of the neural network model.
In this embodiment of the present application, the trainer constructs the neural network model according to the parsed candidate network structure information, thereby obtaining the neural network model under the current network structure information. The constructed neural network model is trained with the training samples in the sample data, based on their labels, and the trained neural network model is then tested with the test samples in the sample data to obtain the various items of index information of the neural network model; the index information is used to evaluate the performance of the neural network model and to help guide the next iteration of the search.
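Purely as an illustration of this build-train-test step, the sketch below assumes a TensorFlow/Keras trainer (TensorFlow appears in this application only as one possible development framework), a simple image-classification dataset shape, and that convolution and pooling layers precede the classification head; none of these choices is prescribed by the embodiment:

    import tensorflow as tf

    def build_model(structure, input_shape=(28, 28, 1), num_classes=10):
        """Assemble a Keras model from the parsed candidate structure information."""
        model = tf.keras.Sequential([tf.keras.Input(shape=input_shape)])
        for layer in structure["layers"]:
            if layer["op"] == "conv":
                model.add(tf.keras.layers.Conv2D(layer["filter_count"], layer["filter_size"],
                                                 padding="same", activation=layer["activation"]))
            elif layer["op"] == "pooling":
                model.add(tf.keras.layers.MaxPooling2D())
            # Fully connected layers are folded into the fixed classification head
            # in this simplified sketch.
        model.add(tf.keras.layers.Flatten())
        model.add(tf.keras.layers.Dense(num_classes, activation="softmax"))
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        return model

    def train_and_eval(structure, x_train, y_train, x_test, y_test):
        """Train on the labelled training samples, test on the held-out test
        samples, and return the index information expected by the controller."""
        model = build_model(structure)
        model.fit(x_train, y_train, epochs=1, verbose=0)
        loss, accuracy = model.evaluate(x_test, y_test, verbose=0)
        return {"accuracy": float(accuracy), "loss": float(loss)}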
S260. The trainer extracts index sub-information from the information it generates, using the information reading mode that the preset communication protocol associates with the trainer's development framework.
In the specific embodiment of the present application, the index sub-information refers to one or more specific items of information in the index information, and belongs to a subset of the index information. For example, the index information includes specific index item information such as running time, precision, delay, loss degree, and the like.
Because trainers developed on different development frameworks likewise generate information in their own, often complex, formats, the communication protocol can be pre-configured with a reading mode for the information generated by executable programs built on each development framework. For example, the index sub-information a may be located in the nth row and mth column of the information generated by an executable program built on development framework A, but in a different row and column of the information generated by an executable program built on development framework B. The trainer therefore uses the information reading mode that the preset communication protocol associates with its own development framework to extract, from the information it generates, the effective information that serves as the index sub-information.
S270. The trainer encapsulates the index sub-information into index information in a uniform format according to the format encapsulation mode in the preset communication protocol, and feeds the index information back to the controller.
In this embodiment of the application, encapsulating the information to be sent according to the preset communication protocol ensures the integrity and format consistency of the sent information, so that the information interaction between the controller and the trainer is not restricted by the development framework. The trainer therefore feeds the generated index information back to the controller after encapsulation.
S280. The controller searches again according to the index information, and if the candidate network structure information obtained by searching has converged, determines the candidate network structure whose structure information has converged as the target neural network structure obtained by the final search.
In this embodiment of the application, the controller may parse the received index information according to the preset communication protocol to obtain index information that conforms to its own development framework, so that the controller performs a further optimizing search of the network structure under the guidance of the parsed index information. During the cyclic search performed by the controller according to the index information, if the candidate network structure information obtained over multiple iterations is the same or similar, that is, no better network structure can be found under the guidance of the index information, the candidate network structure whose structure information has converged can be determined as the target neural network structure obtained by the final search.
Optionally, the controller performs standardization processing on the index information according to a development framework of the trainer; and the controller searches again according to the index information after the standardization processing.
In this embodiment, the evaluation criteria for neural network indexes under the various development frameworks can be collected in advance, and standard conversion rules formulated so that, after conversion, the indexes from every framework are evaluated against the same standard. Because the interaction between the controller and the trainer is one-to-one, the development framework information of the trainer it communicates with can be configured in the controller in advance; alternatively, the preset communication protocol may stipulate that each message carries the development framework identifier of its sender. The controller can therefore determine the standard conversion rule for the trainer's development framework and standardize the index information based on that rule, so that the controller searches again under a uniform index standard and the alignment of indexes across different development frameworks is guaranteed.
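A minimal sketch of such standardization is given below; the metric names, the framework identifiers and the conversion factors are assumptions of the sketch:

    # Illustrative standard-conversion rules: map each framework's native metric
    # names and units onto one shared convention.
    CONVERSION_RULES = {
        "framework_a": {"acc_percent": ("accuracy", 0.01), "latency_s": ("latency_ms", 1000.0)},
        "framework_b": {"top1": ("accuracy", 1.0), "latency_ms": ("latency_ms", 1.0)},
    }

    def standardize_metrics(metrics, trainer_framework):
        """Rename and rescale the trainer's metrics according to the conversion
        rule associated with its development framework."""
        rules = CONVERSION_RULES[trainer_framework]
        standardized = {}
        for name, value in metrics.items():
            std_name, scale = rules.get(name, (name, 1.0))
            standardized[std_name] = value * scale
        return standardized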
Optionally, the controller performs a re-search according to the index information obtained through the analysis and a preset index expected value associated with the development framework of the trainer.
In this embodiment, since the interaction between the controller and the trainer is one-to-one, the controller can be configured in advance with the index expected values of the trainer it communicates with. The controller can then search again based on the comparison between those expected values under the trainer's development framework and the index information currently received under that framework, thereby ensuring the consistency of the indexes.
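For illustration, such a comparison could be turned into a search signal as in the sketch below; the expected values, the metric names and the weighting are assumptions of the sketch:

    # Illustrative expected index values for the trainer the controller talks to.
    EXPECTED = {"accuracy": 0.90, "latency_ms": 20.0}

    def search_reward(metrics, expected=EXPECTED):
        """Score a candidate by comparing its metrics with the expected values
        preset for the trainer's development framework: accuracy above the
        expectation is rewarded, latency above it is penalised."""
        accuracy_gain = metrics["accuracy"] - expected["accuracy"]
        latency_penalty = max(0.0, metrics["latency_ms"] - expected["latency_ms"])
        return accuracy_gain - 0.01 * latency_penalty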
Fig. 3 is an architecture diagram of a neural network structure search. As shown in Fig. 3, the controller and the trainer in this embodiment can each be debugged and run as an independent process, and when a network structure search task is executed, information interaction is achieved through the communication mechanism between them. Fig. 4 is a flowchart of the network structure search performed by the controller. As shown in Fig. 4, a communication mechanism is added to the controller: the controller first searches the network structure based on a certain algorithm, generates and packages the candidate network structure information, and sends it to the trainer; correspondingly, after receiving the index information fed back by the trainer, it parses and extracts the index information to guide the update of the search strategy algorithm. Fig. 5 is a flowchart of the neural network training performed by the trainer. As shown in Fig. 5, a communication mechanism is added to the trainer, which may include a trainer launcher: after receiving the information sent by the controller, the launcher starts the trainer and passes the received candidate network structure information to it. The trainer then parses and extracts the candidate network structure information, constructs a neural network model, trains and tests it with the sample data, generates the index information of the neural network model, packages the index information and feeds it back to the controller, after which the trainer can exit.
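As a minimal sketch of such a launcher, the code below waits for one candidate-structure message, runs one training cycle and replies with the packaged index information; the TCP transport, the JSON-lines framing and the run_training entry point are assumptions of the sketch:

    import json
    import socketserver

    def run_training(structure):
        """Placeholder for the real build/train/test step shown earlier."""
        return {"accuracy": 0.0, "loss": 0.0}

    class TrainerLauncher(socketserver.StreamRequestHandler):
        """Receive a candidate-structure message from the controller, start one
        training run, and reply with the packaged index information."""

        def handle(self):
            message = json.loads(self.rfile.readline().decode("utf-8"))
            metrics = run_training(message["payload"])
            reply = {"msg_type": "metrics", "payload": metrics}
            self.wfile.write((json.dumps(reply) + "\n").encode("utf-8"))

    if __name__ == "__main__":
        with socketserver.TCPServer(("127.0.0.1", 9000), TrainerLauncher) as server:
            server.handle_request()  # serve one request, then let the trainer exit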
In the technical solution of this embodiment, based on a preset communication mechanism and regardless of whether the controller's development framework is the same as the trainer's, the controller can send the candidate network structure information obtained by searching to the trainer, the trainer can feed the index information of the neural network model built on that candidate network structure information back to the controller, and the controller searches again based on the index information, with this process repeated in a loop. Converting the data and indexes generated by both parties in a unified way based on the preset communication mechanism realizes information interaction between the controller and the trainer under a preset communication protocol, solves the technical problems of strong coupling caused by the controller and the trainer having to be built on the same development framework and of index alignment across different development frameworks, and decouples the controller from the trainer so that both can be debugged and run independently. The controller and the trainer therefore do not need to adopt the same development framework for network structure search, secondary development of the trainer is avoided, development difficulty and time consumption are reduced, and the technical capability requirements on developers are lowered.
Third Embodiment
Fig. 6 is a schematic structural diagram of a neural network structure search apparatus according to a third embodiment of the present application, which is applicable to a case of searching neural network structure information, and during a search process, a model may be trained according to the searched network structure information based on sample data in a technical field to which the model is applied, so as to generate index information for assisting a structure search. The device can realize the neural network structure searching method in any embodiment of the application. The apparatus 600 specifically includes the following:
The network structure searching module 610 is used for the controller to send the candidate network structure information obtained by searching to the trainer according to the preset communication protocol;
a model training module 620, configured to train a neural network model by the trainer according to the candidate network structure information and the sample data, and feed back, according to the preset communication protocol, index information generated based on the neural network model to the controller;
And a search result determining module 630, configured to perform re-search by the controller according to the index information, and if the candidate network structure information obtained through the search is converged, determine the candidate network structure with the converged structure information as a target neural network structure obtained through final search.
Optionally, the network structure searching module 610 is specifically configured to:
The controller extracts network structure sub-information from the information searched by the controller by adopting an information reading mode associated with a development framework of the controller in the preset communication protocol;
and the controller packages the network structure sub-information into candidate network structure information with a uniform format according to a format packaging mode in the preset communication protocol, and sends the candidate network structure information to the trainer.
Optionally, the model training module 620 is specifically configured to:
The trainer adopts an information reading mode associated with the development framework of the trainer in the preset communication protocol to extract index sub-information from the information generated by the trainer;
and the trainer packages the index sub-information into index information with a uniform format according to a format packaging mode in the preset communication protocol, and feeds the index information back to the controller.
optionally, the network structure searching module 610 is specifically configured to:
The controller searches a neural network structure in a search space based on a search strategy to obtain candidate network structure information;
And the controller sends the candidate network structure information to the trainer according to a preset communication protocol.
Optionally, the model training module 620 is specifically configured to:
The trainer parses the received candidate network structure information according to the preset communication protocol;
The trainer builds a neural network model according to the candidate network structure information obtained by analysis, and trains and tests the neural network model based on the sample data to generate index information of the neural network model;
And the trainer feeds the index information back to the controller according to a preset communication protocol.
Optionally, the search result determining module 630 is specifically configured to:
The controller carries out standardization processing on the index information according to a development framework of the trainer;
And the controller searches again according to the index information after the standardization processing.
Optionally, the search result determining module 630 is specifically configured to:
And the controller performs re-search according to the index information obtained by analysis and a preset index expected value associated with the development framework of the trainer.
Optionally, the network structure searching module 610 is specifically configured to:
The controller searches a neural network structure based on a preset basic network model to obtain candidate network structure information;
And the controller sends the candidate network structure information to the trainer according to a preset communication protocol.
Optionally, the search result determining module 630 is specifically configured to:
Before the controller searches again according to the index information, if the controller determines that the index information of the neural network model trained on the basis of the candidate network structure information is lower than the index information of the preset basic network model according to the index information fed back by the trainer, the preset basic model is determined as the target neural network structure obtained by final searching.
Optionally, the sample data includes at least one of an image classification sample, a speech recognition sample, and a semantic understanding sample.
Optionally, if the sample data is an image classification sample, the neural network model constructed based on the target neural network structure is used for image classification.
Optionally, the development framework of the controller is different from the development framework of the trainer.
According to the technical solution of this embodiment, the functional modules cooperate to realize network structure searching, sending of candidate network structure information, construction and training of the neural network model, generation and sending of index information, and encapsulation and conversion of information. Converting the data and indexes generated by both parties in a unified way based on a preset communication mechanism realizes information interaction between the controller and the trainer under a preset communication protocol, solves the technical problems of strong coupling caused by the controller and the trainer having to be built on the same development framework and of index alignment across different development frameworks, and decouples the controller from the trainer so that both can be debugged and run independently. The controller and the trainer therefore do not need to adopt the same development framework for network structure search, secondary development of the trainer is avoided, development difficulty and time consumption are reduced, and the technical capability requirements on developers are lowered.
Fourth Embodiment
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 7 is a block diagram of an electronic device according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only and are not meant to limit implementations of the present application described and/or claimed herein.
As shown in fig. 7, the electronic device includes: one or more processors 701, a memory 702, and interfaces for connecting the components, including a high-speed interface and a low-speed interface. The components are interconnected by different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information for a graphical user interface (GUI) on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, as desired, along with multiple memories. Also, multiple electronic devices may be connected, with each device providing part of the necessary operations, for example, as a server array, a group of blade servers, or a multi-processor system. In fig. 7, one processor 701 is taken as an example.
The memory 702 is a non-transitory computer-readable storage medium provided herein. The memory stores instructions executable by at least one processor to cause the at least one processor to perform the neural network structure searching method provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the neural network structure searching method provided by the present application.
The memory 702, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the neural network structure searching method in the embodiment of the present application (for example, the network structure searching module 610, the model training module 620, and the search result determining module 630 shown in fig. 6). The processor 701 executes various functional applications of the server and performs data processing, that is, implements the neural network structure searching method in the above method embodiments, by running the non-transitory software programs, instructions, and modules stored in the memory 702.
The memory 702 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required for at least one function, and the storage data area may store data created according to the use of the electronic device of the neural network structure searching method, and the like. Further, the memory 702 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 702 may optionally include memory located remotely from the processor 701, and such remote memory may be connected over a network to the electronic device of the neural network structure searching method. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the neural network structure searching method may further include: an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703, and the output device 704 may be connected by a bus or in other manners; in fig. 7, connection by a bus is taken as an example.
The input device 703 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device of the neural network structure searching method, and may be, for example, a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, a joystick, or the like. The output device 704 may include a display device, an auxiliary lighting device such as a light-emitting diode (LED), a tactile feedback device such as a vibration motor, and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), an LED display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs, also known as programs, software applications, or code, include machine instructions for a programmable processor and may be implemented using high-level procedural and/or object-oriented programming languages and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (for example, magnetic disks, optical disks, memories, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device for displaying information to the user, for example, a cathode ray tube (CRT) or an LCD monitor; and a keyboard and a pointing device, such as a mouse or a trackball, by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here, or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical scheme of the embodiment of the present application, based on the preset communication mechanism, the controller sends the candidate network structure information obtained by searching to the trainer; the trainer builds a neural network model according to the received candidate network structure information, trains and tests the model on the sample data, and feeds the generated index information of the model back to the controller; and the controller searches again based on the index information. These steps are repeated until the candidate network structure information obtained by searching is determined to have converged, and the candidate network structure whose structure information has converged is determined as the target neural network structure obtained by the final search. Because information interaction between the controller and the trainer is carried out according to the preset communication protocol, the technical problem of strong coupling caused by requiring the controller and the trainer to be based on the same development framework is solved, the controller and the trainer are decoupled and can be debugged and run independently, the two no longer need to adopt the same development framework for network structure search, secondary development of the trainer is avoided, and development difficulty, time consumption, and the technical capability required of developers are all reduced.
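The closed loop described in this paragraph can be summarized by the sketch below. The proposal and evaluation functions are stand-ins for the controller's search strategy and the trainer's training/testing, and stopping when the index stops improving is used here only as a simple proxy for the convergence of the candidate structure information.

```python
import random


def propose_structure(previous_best):
    # Stand-in for the controller's search step: here, a random layer count.
    num_layers = random.randint(2, 6)
    return {"layers": ["conv3x3"] * num_layers + ["fc"]}


def evaluate(structure):
    # Stand-in for the trainer: train/test the candidate and return an index value.
    return 0.6 + 0.05 * len(structure["layers"]) + random.uniform(-0.02, 0.02)


def search(max_rounds=20, tol=1e-3):
    """Alternate propose/evaluate rounds until the index stops improving."""
    best = None
    for _ in range(max_rounds):
        candidate = propose_structure(best)
        index = evaluate(candidate)
        if best is not None and abs(index - best[1]) < tol:
            return candidate                 # treated as converged
        if best is None or index > best[1]:
            best = (candidate, index)
    return best[0]


print(search())
```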
In addition, one embodiment in the above application has the following advantages or benefits: regardless of whether the development framework of the controller is the same as that of the trainer, the data and indexes generated by the controller and the trainer can be converted in a unified manner based on the preset communication mechanism, which relieves the coupling between the controller and the trainer.
In addition, one embodiment in the above application has the following advantages or benefits: by automatically searching the network structure information, the controller determines the maximum number of network layers, the operation type of each layer, the activation function, the operation-related hyper-parameters, and the like, which avoids manual parameter tuning and improves the model construction effect.
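A search space covering these choices might be represented as in the sketch below; the specific operations, activations, and hyper-parameter values listed are illustrative assumptions rather than options disclosed by the patent.

```python
import random

# Illustrative search space: maximum depth, per-layer operation type,
# activation function, and operation-related hyper-parameters.
SEARCH_SPACE = {
    "max_layers": 8,
    "operations": ["conv3x3", "conv5x5", "depthwise_conv", "max_pool"],
    "activations": ["relu", "swish", "tanh"],
    "hyperparameters": {"filters": [16, 32, 64], "stride": [1, 2]},
}


def sample_structure(space=SEARCH_SPACE):
    """Sample one candidate structure instead of hand-tuning each choice."""
    depth = random.randint(1, space["max_layers"])
    return [{"op": random.choice(space["operations"]),
             "activation": random.choice(space["activations"]),
             "filters": random.choice(space["hyperparameters"]["filters"]),
             "stride": random.choice(space["hyperparameters"]["stride"])}
            for _ in range(depth)]


print(sample_structure())
```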
In addition, one embodiment in the above application has the following advantages or benefits: by standardizing the index information, the inconsistency of index standards across different development frameworks is overcome, so that the controller can interact with trainers based on different development frameworks and the search of the network structure can be carried out.
In addition, one embodiment in the above application has the following advantages or benefits: for a trainer interacting with the controller, the controller is preset with expected index values associated with the development framework of that trainer, which likewise overcomes the inconsistency of index standards across different development frameworks.
In addition, one embodiment in the above application has the following advantages or benefits: the controller searches based on the preset basic network model, so that the defect of starting searching from the lowest-level network structure is overcome, the useless searching operation in the initial searching stage is reduced, and the searching efficiency is improved.
In addition, one embodiment in the above application has the following advantages or benefits: the trainer trains and tests the model through the sample data, so that the neural network model can be applied to the technical field to which the sample data belongs, and the data processing efficiency and accuracy based on the neural network model in the field are improved.
In addition, one embodiment in the above application has the following advantages or benefits: a controller and a trainer developed on different development frameworks can be debugged, run, and made to interact independently based on the preset communication mechanism, which relieves the strong coupling that the development-framework restriction would otherwise impose between the controller and the trainer.
It should be understood that the various forms of flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations, and substitutions may be made according to design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (15)

1. A neural network structure search method, comprising:
the controller sends the candidate network structure information obtained by searching to the trainer according to a preset communication protocol;
the trainer trains a neural network model according to the candidate network structure information and the sample data, and feeds back index information generated based on the neural network model to the controller according to the preset communication protocol;
And the controller searches again according to the index information, and determines the candidate network structure with converged structure information as the target neural network structure obtained by final search if the candidate network structure information obtained by search is converged.
2. The method of claim 1, wherein the controller sends the candidate network structure information obtained by searching to the trainer according to a preset communication protocol, and the method comprises:
the controller extracts network structure sub-information from the information searched by the controller by adopting an information reading mode associated with a development framework of the controller in the preset communication protocol;
And the controller packages the network structure sub-information into candidate network structure information with a uniform format according to a format packaging mode in the preset communication protocol, and sends the candidate network structure information to the trainer.
3. The method of claim 1, wherein the trainer feeds back index information generated based on the neural network model to the controller according to the preset communication protocol, and the method comprises the following steps:
The trainer adopts an information reading mode associated with the development framework of the trainer in the preset communication protocol to extract index sub-information from the information generated by the trainer;
And the trainer packages the index sub-information into index information with a uniform format according to a format packaging mode in the preset communication protocol, and feeds the index information back to the controller.
4. The method of claim 1, wherein the controller sends the candidate network structure information obtained by searching to the trainer according to a preset communication protocol, and the method comprises:
the controller searches a neural network structure in a search space based on a search strategy to obtain candidate network structure information;
And the controller sends the candidate network structure information to the trainer according to a preset communication protocol.
5. The method of claim 1, wherein the trainer trains a neural network model according to the candidate network structure information and the sample data, and feeds back index information generated based on the neural network model to the controller according to a preset communication protocol, and the method comprises:
The trainer analyzes the received candidate network structure information according to a preset communication protocol;
the trainer builds a neural network model according to the candidate network structure information obtained by analysis, and trains and tests the neural network model based on the sample data to generate index information of the neural network model;
And the trainer feeds the index information back to the controller according to a preset communication protocol.
6. The method of claim 1, wherein the controller searches again according to the index information, comprising:
the controller carries out standardization processing on the index information according to a development framework of the trainer;
And the controller searches again according to the index information after the standardization processing.
7. The method of claim 1, wherein the controller searches again according to the index information, comprising:
And the controller performs re-search according to the index information obtained by analysis and a preset index expected value associated with the development framework of the trainer.
8. The method of claim 1, wherein the controller sends the candidate network structure information obtained by searching to the trainer according to a preset communication protocol, and the method comprises:
the controller searches a neural network structure based on a preset basic network model to obtain candidate network structure information;
and the controller sends the candidate network structure information to the trainer according to a preset communication protocol.
9. The method of claim 8, further comprising, before the controller searches again according to the index information:
and if the controller determines, according to the index information fed back by the trainer, that the index information of the neural network model trained on the basis of the candidate network structure information is lower than the index information of the preset basic network model, determining the preset basic network model as the target neural network structure obtained by the final search.
10. The method of claim 1, wherein the sample data comprises at least one of an image classification sample, a speech recognition sample, and a semantic understanding sample.
11. The method of claim 10, wherein if the sample data is an image classification sample, a neural network model constructed based on the target neural network structure is used for image classification.
12. The method of claim 1, wherein a development framework of the controller is different from a development framework of the trainer.
13. A neural network structure search apparatus, comprising:
the network structure searching module is used for the controller to send the candidate network structure information obtained by searching to the trainer according to a preset communication protocol;
The model training module is used for the trainer to train a neural network model according to the candidate network structure information and the sample data and to feed back index information generated based on the neural network model to the controller according to the preset communication protocol;
And the search result determining module is used for the controller to search again according to the index information, and if the candidate network structure information obtained by searching is converged, determining the candidate network structure with the converged structure information as the target neural network structure obtained by final searching.
14. An electronic device, comprising:
At least one processor; and
a memory communicatively coupled to the at least one processor; wherein
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the neural network structure searching method of any one of claims 1-12.
15. A non-transitory computer readable storage medium storing computer instructions for causing a computer to execute the neural network structure searching method according to any one of claims 1 to 12.
CN201910859899.5A 2019-09-11 2019-09-11 Neural network structure searching method, apparatus, electronic device, and medium Active CN110543944B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910859899.5A CN110543944B (en) 2019-09-11 2019-09-11 Neural network structure searching method, apparatus, electronic device, and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910859899.5A CN110543944B (en) 2019-09-11 2019-09-11 Neural network structure searching method, apparatus, electronic device, and medium

Publications (2)

Publication Number Publication Date
CN110543944A true CN110543944A (en) 2019-12-06
CN110543944B CN110543944B (en) 2022-08-02

Family

ID=68713342

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910859899.5A Active CN110543944B (en) 2019-09-11 2019-09-11 Neural network structure searching method, apparatus, electronic device, and medium

Country Status (1)

Country Link
CN (1) CN110543944B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE202017106532U1 (en) * 2016-10-28 2018-02-05 Google Llc Search for a neural architecture
US20180357541A1 (en) * 2017-06-09 2018-12-13 Htc Corporation Training task optimization system, training task optimization method and non-transitory computer readable medium for operating the same
CN109297974A (en) * 2017-07-25 2019-02-01 发那科株式会社 Information processing unit
CN110175671A (en) * 2019-04-28 2019-08-27 华为技术有限公司 Construction method, image processing method and the device of neural network
CN110210609A (en) * 2019-06-12 2019-09-06 北京百度网讯科技有限公司 Model training method, device and terminal based on the search of neural frame

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HAN CAI ET AL: "Once for All: Train One Network and Specialize it for Efficient Deployment", 《ARXIV:1908.09791V1》 *
ZICHAO GUO: "Single Path One-Shot Neural Architecture Search with Uniform Sampling", 《ARXIV:1904.00420V1》 *
GAO Xin: "Research on Model Compression and Acceleration Methods Based on a Network Growing Method", China Excellent Master's and Doctoral Dissertations Full-text Database (Master), Information Science and Technology *

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382868A (en) * 2020-02-21 2020-07-07 华为技术有限公司 Neural network structure search method and neural network structure search device
CN111340221A (en) * 2020-02-25 2020-06-26 北京百度网讯科技有限公司 Method and device for sampling neural network structure
CN111353585A (en) * 2020-02-25 2020-06-30 北京百度网讯科技有限公司 Structure searching method and device of neural network model
CN111353601A (en) * 2020-02-25 2020-06-30 北京百度网讯科技有限公司 Method and apparatus for predicting delay of model structure
CN111340219A (en) * 2020-02-25 2020-06-26 北京百度网讯科技有限公司 Neural network model searching method and device, image processing method and processor
CN111340221B (en) * 2020-02-25 2023-09-12 北京百度网讯科技有限公司 Neural network structure sampling method and device
CN113361680A (en) * 2020-03-05 2021-09-07 华为技术有限公司 Neural network architecture searching method, device, equipment and medium
CN113361680B (en) * 2020-03-05 2024-04-12 华为云计算技术有限公司 Neural network architecture searching method, device, equipment and medium
CN113408692A (en) * 2020-03-16 2021-09-17 顺丰科技有限公司 Network structure searching method, device, equipment and storage medium
CN111539514A (en) * 2020-04-16 2020-08-14 北京百度网讯科技有限公司 Method and apparatus for generating structure of neural network
CN111539514B (en) * 2020-04-16 2023-06-06 北京百度网讯科技有限公司 Method and apparatus for generating a structure of a neural network
CN111582478A (en) * 2020-05-09 2020-08-25 北京百度网讯科技有限公司 Method and device for determining model structure
CN111582478B (en) * 2020-05-09 2023-09-22 北京百度网讯科技有限公司 Method and device for determining model structure
CN111639752A (en) * 2020-05-29 2020-09-08 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for training a hyper-network
CN111639752B (en) * 2020-05-29 2023-09-26 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for training super network
CN111667056B (en) * 2020-06-05 2023-09-26 北京百度网讯科技有限公司 Method and apparatus for searching model structures
CN111667055A (en) * 2020-06-05 2020-09-15 北京百度网讯科技有限公司 Method and apparatus for searching model structure
CN111667056A (en) * 2020-06-05 2020-09-15 北京百度网讯科技有限公司 Method and apparatus for searching model structure
CN111967569A (en) * 2020-06-29 2020-11-20 北京百度网讯科技有限公司 Neural network structure generation method and device, storage medium and electronic equipment
CN111967569B (en) * 2020-06-29 2024-02-13 北京百度网讯科技有限公司 Neural network structure generation method and device, storage medium and electronic equipment
WO2022037039A1 (en) * 2020-08-18 2022-02-24 中国银联股份有限公司 Neural network architecture search method and apparatus
CN112116090B (en) * 2020-09-28 2022-08-30 腾讯科技(深圳)有限公司 Neural network structure searching method and device, computer equipment and storage medium
CN112116090A (en) * 2020-09-28 2020-12-22 腾讯科技(深圳)有限公司 Neural network structure searching method and device, computer equipment and storage medium
CN112685623A (en) * 2020-12-30 2021-04-20 京东数字科技控股股份有限公司 Data processing method and device, electronic equipment and storage medium
CN113327572B (en) * 2021-06-02 2024-02-09 清华大学深圳国际研究生院 Controllable emotion voice synthesis method and system based on emotion type label
CN113327572A (en) * 2021-06-02 2021-08-31 清华大学深圳国际研究生院 Controllable emotion voice synthesis method and system based on emotion category label

Also Published As

Publication number Publication date
CN110543944B (en) 2022-08-02

Similar Documents

Publication Publication Date Title
CN110543944B (en) Neural network structure searching method, apparatus, electronic device, and medium
KR102534721B1 (en) Method, apparatus, device and storage medium for training model
CN111221984B (en) Multi-mode content processing method, device, equipment and storage medium
US20210397947A1 (en) Method and apparatus for generating model for representing heterogeneous graph node
US20210216882A1 (en) Method and apparatus for generating temporal knowledge graph, device, and medium
KR102528748B1 (en) Method, apparatus, device and storage medium for constructing knowledge graph
JP7222040B2 (en) Model training, image processing method and device, storage medium, program product
JP2021190095A (en) Method for identifying video, device, electric apparatus, storage medium, and computer program
US20220027575A1 (en) Method of predicting emotional style of dialogue, electronic device, and storage medium
CN111241819A (en) Word vector generation method and device and electronic equipment
KR102616470B1 (en) Method and apparatus for detecting mobile traffic light, electronic device, and storag medium
US11610389B2 (en) Method and apparatus for positioning key point, device, and storage medium
CN112052185B (en) Exception handling method and device for applet, electronic equipment and storage medium
CN112149741A (en) Training method and device of image recognition model, electronic equipment and storage medium
CN111814959A (en) Model training data processing method, device and system and storage medium
CN111666372B (en) Method, device, electronic equipment and readable storage medium for analyzing query word query
CN111582374A (en) Hyper-parameter searching method, device, equipment and storage medium
CN112422412B (en) Information processing method, apparatus, device and medium
CN112597288B (en) Man-machine interaction method, device, equipment and storage medium
CN110443321B (en) Model structure adjusting method and device
CN111680599B (en) Face recognition model processing method, device, equipment and storage medium
CN113190154B (en) Model training and entry classification methods, apparatuses, devices, storage medium and program
CN111539225B (en) Searching method and device for semantic understanding framework structure
CN112508163A (en) Method and device for displaying subgraph in neural network model and storage medium
CN115361290B (en) Configuration comparison method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant