CN109685204B - Image processing method and device, storage medium and electronic equipment - Google Patents

Image processing method and device, storage medium and electronic equipment

Info

Publication number
CN109685204B
Authority
CN
China
Prior art keywords
model
searched
edge
nodes
trained
Prior art date
Legal status
Active
Application number
CN201811584647.8A
Other languages
Chinese (zh)
Other versions
CN109685204A (en)
Inventor
郭梓超
Current Assignee
Beijing Kuangshi Technology Co Ltd
Original Assignee
Beijing Kuangshi Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Kuangshi Technology Co Ltd
Priority to CN201811584647.8A
Publication of CN109685204A
Application granted
Publication of CN109685204B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of deep learning, and provides a model searching method and device and an image processing method and device. The model searching method comprises the following steps: constructing a structure to be searched, wherein at least one edge is connected between any two connected nodes of the structure to be searched, and each edge among the at least one edge corresponds to a different candidate operation; training the structure to be searched, wherein in each iteration the model obtained by reserving one edge between every two connected nodes is determined as the model to be trained in the current iteration, and if that model contains edges trained in previous iterations, the trained parameters of those edges are transferred directly; after the structure to be searched is trained, at least one available model is selected from the models contained in the structure to be searched according to test results of model performance. The method searches models efficiently, covers a large search range, and avoids missing valuable models.

Description

Image processing method and device, storage medium and electronic equipment
Technical Field
The invention relates to the technical field of deep learning, in particular to a model searching method and device and an image processing method and device.
Background
The convolutional neural network is the most commonly used model in the field of deep learning today. In recent years, through manual design and experiments, researchers have proposed convolutional neural network models with good performance, such as AlexNet, VGG16, Inception, ResNet and Xception.
However, manual design demands considerable skill and experience, and researchers often need to spend a great deal of time and effort to design a suitable model for a particular task or data set. Moreover, the space of possible model designs is vast, while the number of models a manual design process can consider is very limited, so models with good performance are likely to be missed.
Disclosure of Invention
In view of this, embodiments of the present invention provide a model searching method and apparatus, and an image processing method and apparatus, which can bring a large number of models into the search range and efficiently search out a model meeting specific requirements.
To this end, the invention provides the following technical solutions:
in a first aspect, an embodiment of the present invention provides a model search method, which is used for searching a neural network model, and includes:
constructing a structure to be searched, wherein the structure to be searched comprises a plurality of nodes and directional edges connecting the plurality of nodes, each node represents a unit for caching data in a neural network, each edge indicates that the data cached at the edge's start node, after being processed by a candidate operation, is input to the edge's end node, at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation;
training the structure to be searched using data in a training set, wherein in each iteration of the training process, the model obtained by reserving one edge among the edges between every two connected nodes is determined as the model to be trained in the current iteration, and if the model to be trained in the current iteration contains an edge trained in a previous iteration, the trained parameter corresponding to that edge is determined as the edge's initial parameter in the current iteration;
after the structure to be searched is trained, selecting at least one available model from the models contained in the structure to be searched according to test results of model performance, wherein a model contained in the structure to be searched is a model obtained by reserving one edge among the edges between every two connected nodes.
In the method, the neural network is represented as a directed graph composed of nodes and edges: each node represents a unit for caching data in the neural network, and each edge indicates that the data cached at its start node, after being processed by a certain operation, is input to its end node.
Nodes can be added to the structure to be searched freely, and edges can be added between nodes freely; at least one edge is connected between any two connected nodes of the structure, and each edge corresponds to a candidate operation. The structure to be searched can therefore contain a large number of neural networks, which share common nodes and some common edges. By constructing the structure to be searched, the model search can be performed over a larger range, preventing valuable model structures from being missed.
Meanwhile, during training, if the model to be trained in a given iteration contains an edge trained in a previous iteration, the trained parameter corresponding to that edge is used as its initial parameter in the current iteration. This amounts to parameter sharing: with shared parameters, model convergence accelerates, training efficiency and training effect improve, and in turn the efficiency and the results of the model search improve.
In addition, the model searching method is highly automated: the user does not need to expend much effort designing a model structure, since the searching method automatically selects a model structure whose performance meets the requirements.
In some embodiments, the model to be trained in each iteration is a model obtained by randomly reserving one edge among the edges between every two connected nodes.
In these embodiments, the model for training is obtained by randomly reserving edges. When the number of iterations is large enough, each edge in the structure to be searched is trained approximately the same number of times, so every edge can be sufficiently trained and the parameters corresponding to each edge can be fully shared.
Because the structure to be searched contains a large number of models, fully training each edge of the structure is equivalent to fully training those models; and because parameters are shared among all the edges, a good training effect can be obtained with relatively few training iterations.
In some embodiments, selecting at least one available model from the models comprised by the structure to be searched based on the test results for the model performance comprises:
and selecting the first N models with the optimal performance from the models contained in the structure to be searched according to the test result of the model performance, wherein N is a positive integer greater than or equal to 1.
In some embodiments, selecting the top N models with the best performance from the models included in the structure to be searched according to the test result of the model performance includes:
testing the performance of each model contained in the structure to be searched;
and selecting the top N models with optimal performance from all models contained in the structure to be searched according to the test result of the model performance.
These embodiments select the top N models by exhaustively testing the models, which guarantees that the selected models have the best performance in an absolute sense. Moreover, because testing is fast, usually far faster than training, exhaustion is feasible even when the number of models is large, so these embodiments have practical value.
In some embodiments, selecting the top N models with the best performance from the models included in the structure to be searched according to the test result of the model performance includes:
and selecting the first N models with the optimal performance from the models contained in the structure to be searched by utilizing a heuristic search algorithm according to the test result of the model performance.
When a heuristic search algorithm explores the state space, it evaluates each searched position, continues searching from some locally superior positions, and iterates until a target is reached. A large number of meaningless search paths can thus be skipped, significantly improving search efficiency. The N models found by a heuristic search algorithm may not be optimal in an absolute sense, but they can perform well enough. Heuristic search algorithms include, but are not limited to, genetic algorithms, ant colony algorithms, simulated annealing, hill climbing, and particle swarm algorithms.
In some embodiments, after selecting at least one available model from the models comprised by the structure to be searched based on the test results for the model performance, the method further comprises:
and further training at least one available model by using the data of the target task, and selecting the model with the optimal performance according to the further training result.
The performance of the available models can be further optimized using the data of the target task and finally the model best suited to perform the target task is selected.
In some embodiments, constructing the structure to be searched includes:
constructing at least one unit to be searched, wherein the unit to be searched comprises a plurality of nodes and directional edges connecting the nodes;
and constructing the structure to be searched according to at least one type of unit to be searched, wherein each type of unit to be searched can be copied into a plurality of units during construction.
When the number of nodes and edges is large, it may be difficult for a user to directly design the whole structure to be searched, and the structure may be constructed in a modular manner. Namely, a unit to be searched is constructed firstly, and then a structure to be searched is formed by copying and combining the unit to be searched. Therefore, the user can concentrate on the design of the unit to be searched, and the difficulty of model design is reduced.
In some embodiments, the candidate operation comprises a multiply-by-0 operation.
The absence of an edge between two nodes can be treated as equivalent to an edge corresponding to the multiply-by-0 operation, which allows uniform processing.
In some embodiments, the plurality of nodes includes a node with a summation function, and the node with the summation function can add input data from different nodes to obtain data that the node needs to cache.
When a node has several input nodes, it must fuse the input data; the fusing operation can be summation, averaging, multiplication, concatenation, and so on. In particular, if the input edges include an edge corresponding to the multiply-by-0 operation, a node with a summation function is preferable, because adding 0 does not change the summation result, so the edge behaves as if it did not exist, which is consistent with the meaning of the multiply-by-0 operation.
In a second aspect, an embodiment of the present invention provides an image processing method using a neural network model, where the neural network model includes an input layer, an intermediate layer, and an output layer, and the method includes:
constructing a structure to be searched, wherein the structure to be searched comprises a plurality of nodes and directional edges connecting the plurality of nodes, each node represents a unit for caching data in a neural network, each edge indicates that the data cached at the edge's start node, after being processed by a candidate operation, is input to the edge's end node, at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation;
training the structure to be searched using images in a training set, wherein in each iteration of the training process, the model obtained by reserving one edge among the edges between every two connected nodes is determined as the model to be trained in the current iteration, and if the model to be trained in the current iteration contains an edge trained in a previous iteration, the trained parameter corresponding to that edge is determined as the edge's initial parameter in the current iteration;
after the structure to be searched is trained, selecting at least one available model from the models contained in the structure to be searched according to test results of model performance, wherein a model contained in the structure to be searched is a model obtained by reserving one edge among the edges between every two connected nodes;
determining a target model from at least one available model;
the method includes receiving an input image with an input layer of an object model, extracting image features of the input image with an intermediate layer of the object model, and outputting a processing result for the input image with an output layer of the object model.
The target model used in the image processing method is obtained by the model searching method provided in the first aspect. Since that method searches models efficiently and covers a large search range, a model suited to the image processing task can be found, producing good processing results while improving the efficiency of the whole image processing process.
In a third aspect, an embodiment of the present invention provides a model search apparatus, configured to search a neural network model, including:
the apparatus comprises a building module for building a structure to be searched, where the structure comprises a plurality of nodes and directional edges connecting them, each node represents a unit for caching data in a neural network, and each edge indicates that the data cached at its start node, after being processed by a candidate operation, is input to its end node; at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation;
a training module for training the structure to be searched using data in a training set, determining, in each iteration of the training process, the model obtained by reserving one edge among the edges between every two connected nodes as the model to be trained in the current iteration, and, if the model to be trained in the current iteration contains an edge trained in a previous iteration, determining the trained parameter corresponding to that edge as the edge's initial parameter in the current iteration;
and a selection module for selecting, after the structure to be searched is trained, at least one available model from the models contained in the structure to be searched according to the test results of model performance, where a model contained in the structure to be searched is a model obtained by reserving one edge among the edges between every two connected nodes.
In some embodiments, the model to be trained in each iteration is a model obtained by randomly reserving one edge among the edges between every two connected nodes.
In some embodiments, the selection module is specifically configured to:
and selecting the first N models with the optimal performance from the models contained in the structure to be searched according to the test result of the model performance, wherein N is a positive integer greater than or equal to 1.
In some embodiments, the selection module comprises:
the test unit is used for testing the performance of each model contained in the structure to be searched;
and the selection unit is used for selecting the top N models with the optimal performance from all the models contained in the structure to be searched according to the test result of the model performance.
In some embodiments, the selection unit is specifically configured to:
and selecting the first N models with the optimal performance from the models contained in the structure to be searched by utilizing a heuristic search algorithm according to the test result of the model performance.
In some embodiments, the apparatus further comprises:
a retraining module for further training the at least one available model using data of the target task, and selecting the best-performing model according to the result of the further training.
In some embodiments, the building module is specifically configured to:
constructing at least one unit to be searched, wherein the unit to be searched comprises a plurality of nodes and directional edges connecting the nodes;
and constructing the structure to be searched according to at least one type of unit to be searched, wherein each type of unit to be searched can be copied into a plurality of units during construction.
In some embodiments, the candidate operation comprises a multiply-by-0 operation.
In some embodiments, the plurality of nodes includes a node with a summation function, and the node with the summation function can add input data from different nodes to obtain data that the node needs to cache.
In a fourth aspect, an embodiment of the present invention provides an image processing apparatus using a neural network model, where the neural network model includes an input layer, an intermediate layer, and an output layer, and the apparatus includes:
a building module for building a structure to be searched, where the structure comprises a plurality of nodes and directional edges connecting them, each node represents a unit for caching data in a neural network, and each edge indicates that the data cached at its start node, after being processed by a candidate operation, is input to its end node; at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation;
a training module for training the structure to be searched using images in a training set, determining, in each iteration of the training process, the model obtained by reserving one edge among the edges between every two connected nodes as the model to be trained in the current iteration, and, if the model to be trained in the current iteration contains an edge trained in a previous iteration, determining the trained parameter corresponding to that edge as the edge's initial parameter in the current iteration;
a selection module for selecting, after the structure to be searched is trained, at least one available model from the models contained in the structure to be searched according to the test results of model performance, where a model contained in the structure to be searched is a model obtained by reserving one edge among the edges between every two connected nodes;
a model determination module for determining a target model from at least one available model;
and the execution module is used for receiving the input image by using the input layer of the target model, extracting the image characteristics of the input image by using the middle layer of the target model, and outputting the processing result of the input image by using the output layer of the target model.
In a fifth aspect, an embodiment of the present invention provides a computer-readable storage medium storing computer program instructions which, when read and executed by a processor, perform the steps of the method provided by the first aspect, the second aspect, or any possible implementation thereof.
In a sixth aspect, an embodiment of the present invention provides an electronic device comprising a memory and a processor, the memory storing computer program instructions which, when read and executed by the processor, perform the steps of the method provided by the first aspect, the second aspect, or any possible implementation thereof.
In order to make the above objects, technical solutions and advantages of the present invention more comprehensible, embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be considered as limiting the scope; those skilled in the art can obtain other related drawings from them without inventive effort.
Fig. 1 shows a block diagram of an electronic device applicable to an embodiment of the present invention;
FIG. 2 is a flow chart of a model search method provided by an embodiment of the invention;
FIG. 3 is a diagram illustrating a structure to be searched according to an embodiment of the present invention;
FIG. 4 shows a schematic diagram of three models contained in the structure to be searched of FIG. 3;
fig. 5 is a schematic diagram illustrating a construction method of a structure to be searched according to an embodiment of the present invention;
FIG. 6 is a functional block diagram of a model searching apparatus according to an embodiment of the present invention;
FIG. 7 is a functional block diagram of another model searching apparatus according to an embodiment of the present invention;
fig. 8 is a functional block diagram of an image processing apparatus using a neural network model according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that like reference numbers and letters refer to like items in the following figures; once an item is defined in one figure, it need not be further defined and explained in subsequent figures. In the description of the present invention, the terms "first", "second", and the like are used only to distinguish one entity or operation from another, and are not to be construed as indicating or implying any relative importance or order between such entities or operations, nor as requiring or implying any actual relationship or order between them. The terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but possibly other elements not expressly listed or inherent to it. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises it.
Fig. 1 shows a block diagram of an electronic device applicable to an embodiment of the present invention. Referring to FIG. 1, an electronic device 100 includes one or more processors 102, one or more memory devices 104, an input device 106, and an output device 108, which are interconnected by a bus system 112 and/or other form of connection mechanism (not shown).
Processor 102 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in electronic device 100 to perform desired functions.
The storage device 104 may be various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory can include, for example, Random Access Memory (RAM), cache memory, and the like. Non-volatile memory may include, for example, Read Only Memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on a computer-readable storage medium and executed by the processor 102 to implement the methods of embodiments of the present invention and/or other desired functionality. Various applications and data, such as data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 may output various information (e.g., images or sounds) to the outside (e.g., a user), and may include one or more of a display, a speaker, and the like.
It will be appreciated that the configuration shown in FIG. 1 is merely illustrative and that electronic device 100 may include more or fewer components than shown in FIG. 1 or have a different configuration than shown in FIG. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof. In the embodiment of the present invention, the electronic device 100 may be, but is not limited to, an entity device such as a desktop, a notebook computer, a smart phone, an intelligent wearable device, and a vehicle-mounted device, and may also be a virtual device such as a virtual machine.
In the following description of the model search method, the steps are described, by way of example, as executed on the processor 102 of the electronic device 100, but this should not be construed as limiting the scope of the invention.
Fig. 2 is a flowchart of a model searching method provided by an embodiment of the present invention, which is used for searching a neural network model; the specific type of the neural network is not limited, and may be, for example, a convolutional neural network, a recurrent neural network, and the like. Note that, hereinafter, the neural network model is sometimes simply referred to as a neural network or a model. Referring to Fig. 2, the method includes:
step S20: and constructing a structure to be searched.
In the embodiment of the invention, any neural network is represented as a directed graph. The directed graph comprises a plurality of nodes and directed edges connecting the nodes, with only one edge connected between any two connected nodes. Each node represents a unit for caching data in the neural network. Taking a convolutional neural network as an example, an input node of the network may represent a unit caching the input image, an output node may represent a unit caching the output result, and an intermediate node may represent a unit caching a feature map. Each edge connects two nodes; since edges are directional, the two nodes are called the start node and the end node respectively. Each edge corresponds to an operation: the data cached at an edge's start node is processed by the corresponding operation and then input to the edge's end node. Taking a convolutional neural network as an example, such operations may be a 1x1 convolution, a 3x3 convolution, a depthwise separable convolution, max pooling, average pooling, and the like.
The structure to be searched likewise comprises a plurality of nodes and directional edges connecting them, with nodes and edges defined as above, except that at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different operation. These operations are called candidate operations, meaning the operations that may be adopted between the two connected nodes. The structure to be searched can thus accommodate a large number of neural networks: each such network includes the common nodes (i.e., all nodes of the structure to be searched) and has only one edge connected between any two of its connected nodes, and an edge of the structure to be searched may be common to several of these neural networks.
Fig. 3 is a schematic diagram of a structure to be searched according to an embodiment of the present invention. Referring to Fig. 3, the structure to be searched includes 4 nodes, and 3 edges are connected between any two connected nodes. In general, if the structure to be searched includes n nodes with m edges between any two connected nodes, the total number of models it contains is m x m x ... x m = m^(n(n-1)/2). For Fig. 3, the structure to be searched contains 3^6 models, among them the three models shown in Fig. 4; these models have the same nodes and may have common edges. For example, the edge between the second node (in top-to-bottom order) and the fourth node is the same in the first (in left-to-right order) and second models.
The set of neural networks represented by the structure to be searched actually determines the scope of the search: the model finally searched out will come from this set. Taking the above formula as an example, the number of models contained in the structure to be searched grows exponentially with m and n; with m = 10 and n = 10, the number of models reaches 10^45. The structure to be searched can therefore cover a very large search range, so that more model structures are searched and better-performing structures are not missed.
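To make the counting concrete, here is a minimal Python sketch (the names build_search_structure and count_models and the dict representation are illustrative assumptions, not part of the patent): the structure to be searched is held as a mapping from each connected node pair to its list of candidate operations, and the number of contained models is the product of the candidate counts over all pairs, i.e. m^(n(n-1)/2) for a fully connected structure.

```python
from itertools import combinations
from math import prod

def build_search_structure(n_nodes, candidate_ops):
    """Fully connected structure to be searched: every node pair is
    joined by one edge per candidate operation."""
    return {pair: list(candidate_ops)
            for pair in combinations(range(n_nodes), 2)}

def count_models(structure):
    """A contained model keeps exactly one edge per connected pair, so
    the model count is the product of the candidates per pair."""
    return prod(len(ops) for ops in structure.values())

# The Fig. 3 case: 4 nodes, 3 candidate operations -> 3**6 = 729 models.
fig3 = build_search_structure(4, ["conv1x1", "conv3x3", "maxpool"])
assert count_models(fig3) == 3 ** 6

# The larger example in the text: m = 10, n = 10 -> 10**45 models.
big = build_search_structure(10, [f"op{i}" for i in range(10)])
assert count_models(big) == 10 ** 45
```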
In some implementations, the structure to be searched may be designed by the user, and the electronic device 100 then constructs the corresponding data structure from the user's design. For example, the user writes code describing the structure to be searched, and the processor 102 constructs it by executing the code; or the user writes a configuration file describing the structure, and the processor 102 constructs it by reading the file; or nodes and edges are presented visually on a graphical interface, the user builds the corresponding directed graph through operations such as dragging, copying and pasting, and the processor 102 constructs the actual structure from that graph.
In these implementations, the user does not need to spend much effort designing the structure to be searched: only the number of nodes and the connections between them need to be determined. In particular, when designing the edges, there is no need to analyze which operation would work best; it suffices to add edges for all possible candidate operations between nodes, and the subsequent steps search out the result automatically. This greatly reduces the user's design burden and the technical demands on the user, who is not required to have deep expertise in model design.
In other implementations, the structure to be searched may also be automatically designed and constructed by the electronic device 100 according to some preset rules.
Step S21: train the structure to be searched using the data in the training set.
Training a neural network requires a large number of iterations, each trained with one batch of data from the training set. In each iteration, one model in the structure to be searched is selected for training; the model is determined by reserving one edge among the edges between every two connected nodes of the structure. How the edge is selected is not limited: for example, an edge may be chosen randomly in each iteration, an edge different from the previous one may be chosen, or an edge may be chosen according to some preset rule. For the structure to be searched in Fig. 3, for instance, the three models shown in Fig. 4 might be selected during the iterative training process.
As mentioned above, some models in the structure to be searched share common edges, so some edges of the model to be trained in the current iteration may already have been trained. If the model to be trained in the current iteration contains an edge trained in a previous iteration, the trained parameter corresponding to that edge is determined as the edge's initial parameter in the current iteration. The parameter corresponding to an edge is the parameter of the operation the edge represents; for example, if an edge represents a convolution operation, its parameters include those of the convolution kernel. In other words, previously trained parameters are migrated into the model trained in the current iteration: the common edges realize parameter sharing between the model trained earlier and the model trained now. With shared parameters, models converge faster, which improves training efficiency and training effect. Since training is the time-consuming part of model search, improved training efficiency directly improves search efficiency, and the improved training effect also improves the search results.
In some implementations, the model to be trained in the current iteration is obtained by randomly reserving one edge among the edges between every two connected nodes of the structure to be searched. Random edge selection is very simple to implement, and when the number of iterations is large enough, each edge in the structure to be searched is trained approximately the same number of times, so every edge is sufficiently trained and the parameters corresponding to each edge are fully shared.
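The sampling-with-sharing idea can be sketched as follows; this is only an illustrative skeleton under assumed names (SharedEdgeParams, train_step), with a scalar per edge standing in for real operation parameters. The point it shows is structural: one parameter table serves the whole structure to be searched, each iteration samples a sub-model by keeping one random edge per connected pair, and training updates the shared entries in place, so a later model containing an already-trained edge starts from that edge's trained parameters.

```python
import random

class SharedEdgeParams:
    """One parameter table for the whole structure to be searched,
    keyed by (node_pair, candidate_op); sampled sub-models read and
    write these shared entries."""

    def __init__(self, structure):
        self.params = {(pair, op): 0.0
                       for pair, ops in structure.items() for op in ops}

    def sample_model(self, structure):
        # Keep exactly one randomly chosen edge per connected node pair.
        return {pair: random.choice(ops) for pair, ops in structure.items()}

    def train_step(self, model):
        # Placeholder update: a real implementation would run a forward
        # and backward pass on a batch and apply gradients here.
        for pair, op in model.items():
            self.params[(pair, op)] += 0.01  # stands in for a gradient step

structure = {(0, 1): ["conv1x1", "conv3x3", "maxpool"],
             (1, 2): ["conv1x1", "conv3x3", "maxpool"]}
shared = SharedEdgeParams(structure)
for _ in range(300_000):           # each iteration trains one sampled model
    shared.train_step(shared.sample_model(structure))
```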
Taking Fig. 3 as an example of what parameter sharing means: suppose 300,000 iterations are performed. Since each edge is selected with probability 1/3, each edge is trained about 100,000 times (in expectation); and since every edge of any given model in the structure to be searched has then been trained 100,000 times, that model is equivalent to one trained 100,000 times on its own.
By contrast, if the 3^6 models contained in Fig. 3 were trained separately without parameter sharing, then for each edge of each model to be trained 100,000 times, 3^6 x 100,000 training iterations would be required, which is far less efficient.
When the structure to be searched contains more nodes, the effect of parameter sharing becomes even more prominent. Continuing the example from step S20 with m = 10 and n = 10, the number of models in the structure to be searched is 10^45. If each model were trained separately for 100,000 iterations, 10^45 x 100,000 iterations would be required in total; this is so large that common present-day equipment can hardly meet the computational demand, and even if it could, the training time would be unacceptable.
In the model searching method provided by the embodiment of the invention, thanks to parameter sharing, only 1,000,000 iterations (m x 100,000) are needed for every edge to be trained 100,000 times. This reaches the same training effect several orders of magnitude faster than training each model separately, and the number of iterations depends only on the number of edges between every two connected nodes, not on the number of nodes.
In addition, because the model to be trained is obtained by randomly selecting edges, more distinct models are generated (in a probabilistic sense) for training, which improves the training effect.
Step S22: after the structure to be searched is trained, at least one available model is selected from models contained in the structure to be searched according to the test result of the model performance.
An available model is one that can be used for the target task and meets the performance requirement; performance can be understood broadly, meaning the model's accuracy on the target task, its running speed on the target task, or other metrics. Although a practical task may require only one model, different model structures can still give researchers ideas for model design, so obtaining multiple available models is of practical significance.
In some implementations, if N (N >= 1) models whose performance meets a certain requirement are to be searched out, the performance of the models in the structure to be searched can be tested one by one using data in the test set, and testing stops once N models meeting the performance requirement have been found.
In other implementations, the top N best-performing models are to be selected from the models contained in the structure to be searched as the search result, according to the test results of model performance. At least the following two schemes can be adopted:
First, test the performance of every model contained in the structure to be searched, then select the top N best-performing models from all of them according to the test results.
This scheme exhausts the models in the structure to be searched and tests their performance one by one, guaranteeing that the selected models are optimal in an absolute sense. Moreover, testing is fast, usually much faster than training; testing one model may take only a few seconds, so exhaustion is feasible even when the number of models is large and the scheme remains practical.
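A sketch of this exhaustive scheme, assuming a scoring function test_performance(model) is available (hypothetical here; it would run the model on the test set and return a performance number):

```python
import heapq
from itertools import product

def exhaustive_top_n(structure, test_performance, n):
    """Enumerate every model (one edge kept per connected node pair)
    and return the n best according to the test results."""
    pairs = list(structure.keys())
    scored = []
    for ops in product(*(structure[p] for p in pairs)):
        model = dict(zip(pairs, ops))        # one concrete model
        scored.append((test_performance(model), model))
    return heapq.nlargest(n, scored, key=lambda item: item[0])
```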
Second, use a heuristic search algorithm to select, according to the test results of model performance, the top N best-performing models from the models contained in the structure to be searched.
When a heuristic search algorithm explores the state space, it evaluates each searched position, continues searching from some locally superior positions, and iterates until a target is reached. A large number of meaningless search paths can thus be skipped, significantly improving search efficiency. The N models found by a heuristic search algorithm may not be optimal in an absolute sense (compared with exhaustion), but they can perform well enough. Mature heuristic search algorithms currently include genetic algorithms, ant colony algorithms, simulated annealing, hill climbing, and particle swarm algorithms.
Taking Fig. 3 as an example, the following describes how to perform a model search with a genetic algorithm, assuming the top 10 best-performing models are to be found. The three edges between each pair of connected nodes are denoted a, b and c respectively; every two connected nodes form a pair, and the pairs are numbered 1 to 6, giving 6 pairs in total. Any model in the structure to be searched can then be represented by a code, for example (1a, 2b, 3a, 4b, 5b, 6c). Over the state space formed by these codes, one possible workflow of the genetic algorithm is as follows:
Step 1: randomly generate 20 codes, i.e., 20 models; test these models, select the 10 best-performing codes according to the test results, and store them.
Step 2: vary the 10 stored codes to generate another 20 codes. The variation can be done in several ways, for example:
A. Crossover: randomly select 2 of the 10 stored codes, such as (1a, 2a, 3a, 4c, 5c, 6c) and (1b, 2b, 3a, 4a, 5c, 6b), and generate a new code each of whose bits comes randomly from one of the two, such as (1a, 2b, 3a, 4c, 5c, 6b). Repeating this 10 times yields 10 new codes.
B. Mutation: randomly select 1 of the 10 stored codes, such as (1a, 2a, 3a, 4c, 5c, 6c), and change one randomly chosen bit, e.g., the first bit from 1a to 1c, to obtain a new code (1c, 2a, 3a, 4c, 5c, 6c). Repeating this 10 times yields 10 new codes.
Step 3: test the performance of the 20 codes newly generated by crossover and mutation, combine them with the 10 previously stored codes, reselect the 10 best-performing codes, and store them again.
Step 4: repeat steps 2 and 3 ten times, and keep the model structures corresponding to the finally stored 10 codes as the top 10 best-performing models found by the algorithm.
The algorithm tests 20 models per round and repeats 10 times, so only about 200 models are tested in total, far fewer than the 3^6 models contained in the structure to be searched. Yet because each iteration generates new structures by varying the currently better ones, the 10 stored models keep improving from round to round, and tests of many poorly performing models are avoided. For example, if among the three edges a, b and c the candidate operation corresponding to a performs worst, then over repeated iterations the newly generated 20 codes will contain edge a less and less, and edges b and c more and more, skipping the time spent testing poor structures that contain edge a.
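Put into code, the workflow above might look like the following sketch (population size, generation count, and the test_performance scorer are illustrative assumptions; the code tuples are the per-pair edge choices from the example):

```python
import random

def genetic_top_n(structure, test_performance, n=10,
                  pop_size=20, generations=10):
    pairs = list(structure.keys())

    def random_code():
        return tuple(random.choice(structure[p]) for p in pairs)

    def crossover(a, b):
        # Each bit of the child comes randomly from one of two parents.
        return tuple(random.choice(bit) for bit in zip(a, b))

    def mutate(code):
        # Re-draw one randomly chosen bit of the code.
        i = random.randrange(len(code))
        child = list(code)
        child[i] = random.choice(structure[pairs[i]])
        return tuple(child)

    # Step 1: random initial population; keep the n best codes.
    kept = sorted((random_code() for _ in range(pop_size)),
                  key=test_performance, reverse=True)[:n]
    for _ in range(generations):
        # Step 2: generate pop_size new codes by crossover and mutation.
        children = [crossover(*random.sample(kept, 2))
                    for _ in range(pop_size // 2)]
        children += [mutate(random.choice(kept))
                     for _ in range(pop_size // 2)]
        # Step 3: re-select the n best among old and new codes.
        kept = sorted(kept + children, key=test_performance, reverse=True)[:n]
    return kept  # Step 4: the n best-performing codes found
```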
For other heuristic search algorithms, the search principle is similar to that of the genetic algorithm and is not elaborated here.
In addition, in the prior art, after a model is trained it is usually fine-tuned on the training set of the target task to avoid insufficient training. In the embodiment of the present invention, if each edge is sufficiently trained when the structure to be searched is trained (for example, by the aforementioned method of randomly selecting edges), the available model selected in step S22 can be used directly for the target task without fine-tuning its parameters, which helps improve the efficiency of the model search.
In some embodiments, after the at least one available model is obtained, it may be further trained using the data of the target task, and the best-performing model selected according to the result of the further training. For example, if the at least one available model consists of the top N best-performing models searched out of the structure to be searched, further training may change their performance ranking, and finally only the single best-performing model is selected for the target task: the model determined, according to the training data, to be the most suitable for executing the task. The data in the training set may be a subset of the data of the target task.
In some embodiments, the candidate operations for an edge of the structure to be searched include a multiply-by-0 operation. Such an edge passes no data and is therefore equivalent to there being no connecting edge between the two nodes. The absence of a connecting edge can thus be regarded as an operation on a par with convolution, pooling and the like; since it is very difficult to decide, before searching, whether two nodes should be connected, an edge corresponding to the multiply-by-0 operation can be added between them alongside the edges for other operations, meaning that "no connection" is treated as one of the alternatives. Introducing multiply-by-0 edges helps expand the search range, allows different model structures to be processed uniformly, and further reduces the difficulty of designing the structure to be searched.
In some embodiments, the plurality of nodes of the structure to be searched includes a node having a summation function, and the node having the summation function can add input data from different nodes to obtain data that the node needs to cache.
When a node has several input nodes, it must fuse the input data; the fusing operation can be summation, averaging, multiplication, concatenation, and so on. In particular, if the input edges include an edge corresponding to the multiply-by-0 operation, a node with a summation function is preferable, because adding 0 does not change the summation result, so the edge behaves as if it did not exist, which is consistent with the meaning of the multiply-by-0 operation. In other cases, nodes with different functions may be used according to actual needs. Referring to Fig. 3, assuming that the edges from the first node to the third and the edges from the first node to the fourth each include an edge corresponding to the multiply-by-0 operation, the third and fourth nodes may be nodes with a summation function, shown by a plus sign in Fig. 3.
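A tiny sketch of why the summation node pairs naturally with the multiply-by-0 edge; plain floats stand in for cached feature maps here:

```python
def multiply_by_zero(x):
    return 0.0 * x      # the "no connection" candidate operation

def identity(x):
    return x            # stands in for any real candidate operation

def summation_node(inputs):
    """Fuse the data arriving over several input edges by adding it.
    A 0 contributed by a multiply-by-0 edge leaves the sum unchanged,
    so that edge behaves exactly as if it were absent."""
    return sum(inputs)

cached = summation_node([identity(3.0), multiply_by_zero(5.0)])
assert cached == 3.0    # identical to having no second input edge
```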
When the number of nodes and edges is large, it may be difficult for a user to design the whole structure to be searched directly; in some embodiments, the structure may be constructed in a modular manner, specifically as follows:
First construct at least one type of unit to be searched, where each type has a different structure. A unit to be searched is similar in form to the structure to be searched: it also comprises a plurality of nodes and directional edges connecting them. For example, to build a larger structure to be searched, the structure of Fig. 3 could itself serve as a unit to be searched.
Then construct the structure to be searched from the units. During construction, each type of unit can be copied several times, and different types of units can be combined to form the final structure. In Fig. 5, for example, three types of units (unit A, unit B and unit C) are combined: unit A is copied 3 times, unit B is copied 2 times, and the units are connected by single edges whose function may simply be to pass data, as shown in Fig. 5. Besides the units to be searched, the structure may contain nodes not belonging to any unit, such as the input and output nodes in Fig. 5.
In this way, when designing a large structure to be searched, the user can concentrate on designing small units to be searched, which reduces the difficulty of model design. Once the units are designed, a more complex structure to be searched can be built modularly in a short time.
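A sketch of the modular construction (the unit names follow Fig. 5, but the exact layout order and the flat list-of-units representation are assumptions made for illustration):

```python
from itertools import combinations

def make_unit(n_nodes, candidate_ops):
    """One unit to be searched: a small fully connected sub-structure."""
    return {pair: list(candidate_ops)
            for pair in combinations(range(n_nodes), 2)}

def build_from_units(layout, unit_specs):
    """Copy units to be searched and chain them into a structure to be
    searched; edges between consecutive units would simply pass data."""
    return [make_unit(*unit_specs[name]) for name in layout]

unit_specs = {"A": (4, ["conv1x1", "conv3x3", "maxpool"]),
              "B": (3, ["conv3x3", "avgpool"]),
              "C": (4, ["conv1x1", "sep_conv3x3", "maxpool"])}
# Fig. 5 combines 3 copies of unit A, 2 of unit B, and 1 of unit C.
structure = build_from_units(["A", "B", "A", "B", "A", "C"], unit_specs)
```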
An embodiment of the invention further provides an image processing method using a neural network model. The neural network model used in the method comprises an input layer, an intermediate layer and an output layer; this three-layer organization is the common structure of present neural network models, and its specific meaning is not explained in detail. The image processing method specifically comprises the following steps:
step a: and constructing a structure to be searched, wherein the structure to be searched comprises a plurality of nodes and directional edges connecting the plurality of nodes, the nodes represent units for caching data in the neural network, the edges represent terminal nodes for inputting the data cached by the initial nodes of the edges to the edges after candidate operation processing, at least one edge is connected between any two connected nodes, and each edge in at least one edge corresponds to different candidate operations.
Step b: and training a structure to be searched by using the images in the training set, determining a model obtained after one edge is reserved in the edge between every two connected nodes as the model to be trained in the current iteration during each iteration in the training process, and if the model to be trained in the current iteration contains the edge trained in the previous iteration, determining the trained parameter corresponding to the trained edge as the initial parameter of the trained edge during the current iteration.
Step c: after the structure to be searched is trained, at least one available model is selected from models contained in the structure to be searched according to a test result of the performance of the model, wherein the model contained in the structure to be searched is a model obtained after one edge is reserved in the edges between every two connected nodes.
Steps a to c are similar to steps S20 to S22 above, except that the training data are limited to images; they are not described again.
Step d: determine a target model from the at least one available model.
The target model is the neural network model to be used by a specific image processing task; such tasks may be, but are not limited to, image classification, object detection, image segmentation, image recognition, and so on.
The manner of determining the target model from the at least one available model is not limited: for example, the best-performing one, the fastest-running one, or the one with the simplest structure may be chosen. In particular, since the image processing task uses only one model, it is possible to select only one available model in step c and take it directly as the target model in step d.
In some implementations, the target model need not be selected directly from the at least one available model; the available models may first be processed, and the target model selected from the processed models. For example, the at least one available model may be further trained using data of the image processing task, and the best-performing model selected as the target model according to the result of the further training.
Step e: receive an input image with the input layer of the target model, extract image features of the input image with the intermediate layer of the target model, and output a processing result for the input image with the output layer of the target model.
The processing in step e follows the common usage of present neural network models, and its specific course is not explained in detail.
The target model used in the image processing method is obtained by the model searching method provided by the embodiment of the invention. Since that method searches models efficiently and covers a large search range, a model suited to the image processing task can be found, producing better processing results and improving the efficiency of the whole image processing process.
The embodiment of the invention also provides a model searching apparatus 300 for searching for a neural network model. Referring to fig. 6, the apparatus includes: a building module 310, a training module 320, and a selection module 330.
The building module 310 is configured to construct a structure to be searched, where the structure to be searched includes a plurality of nodes and directed edges connecting the plurality of nodes, each node represents a unit for caching data in a neural network, and each edge represents that the data cached by the edge's start node is processed by a candidate operation and then input to the edge's end node, where at least one edge is connected between any two connected nodes, and each of the at least one edge corresponds to a different candidate operation;
the training module 320 is configured to train the structure to be searched by using data in a training set, to determine, in each iteration of the training process, a model obtained by reserving one edge among the edges between every two connected nodes as the model to be trained in the current iteration, and, if the model to be trained in the current iteration includes an edge that has been trained in a previous iteration, to take the trained parameters corresponding to that edge as its initial parameters in the current iteration;
the selection module 330 is configured to select, after the structure to be searched is trained, at least one available model from the models included in the structure to be searched according to the test results of model performance, where a model included in the structure to be searched is a model obtained by reserving one edge among the edges between every two connected nodes.
In some implementations of the apparatus, the model to be trained in the current iteration is a model obtained by randomly reserving one edge among the edges between every two connected nodes.
In some implementations of the apparatus, the selection module 330 is specifically configured to:
selecting the top N models with the best performance from the models contained in the structure to be searched according to the test results of model performance, where N is a positive integer greater than or equal to 1.
Referring to fig. 7, in some implementations of the apparatus, the selection module 330 includes: a test unit 331 and a selection unit 332.
The test unit 331 is configured to test performance of each model included in the structure to be searched;
the selection unit 332 is configured to select the top N models with the best performance from all models included in the structure to be searched according to the test result of the model performance.
In some implementations of the apparatus, the selecting unit 332 is specifically configured to:
selecting, by means of a heuristic search algorithm, the top N models with the best performance from the models contained in the structure to be searched according to the test results of model performance.
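The patent does not fix a particular heuristic, so the following is only one plausible instantiation under the earlier sketch's assumptions: a simple evolutionary loop that mutates the best-scoring edge choices, using evaluate from above as the performance test.

    def mutate(choice, rate=0.2):
        """Randomly re-draw a fraction of the edge choices."""
        return {key: (random.choice(list(CANDIDATE_OPS)) if random.random() < rate else op)
                for key, op in choice.items()}

    def heuristic_top_n(structure, head, val_loader, generations=10, population=20, top_n=5):
        pool = [sample_model(structure) for _ in range(population)]
        for _ in range(generations):
            scored = sorted(((evaluate(structure, head, c, val_loader), c) for c in pool),
                            key=lambda pair: pair[0], reverse=True)
            parents = [c for _, c in scored[:top_n]]          # keep the best performers
            pool = parents + [mutate(random.choice(parents))  # refill the pool by mutation
                              for _ in range(population - top_n)]
        scored = sorted(((evaluate(structure, head, c, val_loader), c) for c in pool),
                        key=lambda pair: pair[0], reverse=True)
        return scored[:top_n]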
With continued reference to fig. 7, in some implementations of the apparatus, the apparatus further includes a retraining module 340, where the retraining module 340 is configured to further train the at least one available model using data of the target task and to select the model with the best performance according to the results of the further training.
In some implementations of the apparatus, the building module 310 is specifically configured to:
constructing at least one unit to be searched, wherein the unit to be searched comprises a plurality of nodes and directional edges connecting the nodes;
and constructing the structure to be searched from the at least one type of unit to be searched, where each type of unit to be searched may be replicated into a plurality of copies during construction.
In some implementations of the apparatus, the candidate operations include a multiply-by-0 operation.
In some implementations of the apparatus, the plurality of nodes includes a node with a summing function, and the node with the summing function can sum input data from different nodes to obtain data that the node needs to cache.
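Both features admit a very small sketch in the terms used above (again illustrative, with Zero being an assumed name): a multiply-by-0 candidate operation makes "no connection" selectable, since the kept edge then contributes nothing to its end node, and a summing node simply adds the tensors arriving on its incoming edges, exactly as forward_sampled already does.

    class Zero(nn.Module):
        def forward(self, x):
            return x * 0.0  # the edge contributes nothing: effectively no connection

    CANDIDATE_OPS["zero"] = Zero  # the class acts as a factory, like the lambdas above

    # A summing node's cached data is the sum of its incoming edges' outputs:
    incoming = [torch.randn(1, C, 8, 8) for _ in range(3)]
    cached = sum(incoming)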
The model searching apparatus 300 according to the embodiment of the present invention has been introduced in the foregoing method embodiments, and for the sake of brief description, reference may be made to the corresponding contents in the method embodiments for parts that are not mentioned in the apparatus embodiments.
The embodiment of the invention also provides an image processing apparatus 400 adopting a neural network model, where the neural network model includes an input layer, an intermediate layer, and an output layer. Referring to fig. 8, the apparatus includes: a building module 410, a training module 420, a selection module 430, a model determination module 440, and an execution module 450.
The building module 410 is configured to construct a structure to be searched, where the structure to be searched includes a plurality of nodes and directed edges connecting the plurality of nodes, each node represents a unit for caching data in a neural network, and each edge represents that the data cached by the edge's start node is processed by a candidate operation and then input to the edge's end node, where at least one edge is connected between any two connected nodes, and each of the at least one edge corresponds to a different candidate operation;
the training module 420 is configured to train the structure to be searched by using the images in the training set, to determine, in each iteration of the training process, a model obtained by reserving one edge among the edges between every two connected nodes as the model to be trained in the current iteration, and, if the model to be trained in the current iteration includes an edge that has been trained in a previous iteration, to take the trained parameters corresponding to that edge as its initial parameters in the current iteration;
the selection module 430 is configured to select, after the structure to be searched is trained, at least one available model from the models included in the structure to be searched according to the test results of model performance, where a model included in the structure to be searched is a model obtained by reserving one edge among the edges between every two connected nodes;
a model determination module 440 for determining a target model from at least one available model;
the execution module 450 is configured to receive an input image using the input layer of the target model, extract image features of the input image using the intermediate layer of the target model, and output a processing result for the input image using the output layer of the target model.
The implementation principle and technical effects of the image processing apparatus 400 using the neural network model provided by the embodiment of the present invention have been introduced in the foregoing method embodiments; for the sake of brief description, reference may be made to the corresponding contents in the method embodiments for parts of the apparatus embodiment that are not mentioned here.
The embodiment of the present invention further provides a computer-readable storage medium, where computer program instructions are stored on the computer-readable storage medium, and when the computer program instructions are read and executed by a processor, the steps of the model searching method and/or the image processing method provided by the embodiment of the present invention are executed. Such a computer-readable storage medium may be, but is not limited to, storage device 104 shown in fig. 1.
The embodiment of the present invention further provides an electronic device, which includes a memory and a processor, where the memory stores computer program instructions, and when the computer program instructions are read and executed by the processor, the steps of the model searching method and/or the image processing method provided by the embodiment of the present invention are performed. The electronic device may be, but is not limited to, the electronic device 100 shown in fig. 1.
It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other. For the device-like embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
If the functions are implemented in the form of software functional modules and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device to execute all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned computer device includes various devices capable of executing program code, such as a personal computer, a server, a mobile device, an intelligent wearable device, a network device, and a virtual device; the storage medium includes a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, a magnetic tape, or an optical disk.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (12)

1. An image processing method using a neural network model, the neural network model including an input layer, an intermediate layer, and an output layer, the method comprising:
constructing a structure to be searched, wherein the structure to be searched comprises a plurality of nodes and directed edges connecting the plurality of nodes, each node represents a unit for caching data in a neural network, each edge represents that the data cached by the edge's start node is processed by a candidate operation and then input to the edge's end node, at least one edge is connected between any two connected nodes, and each of the at least one edge corresponds to a different candidate operation;
training the structure to be searched by using images in a training set, determining, in each iteration of the training process, a model obtained by reserving one edge among the edges between every two connected nodes as the model to be trained in the current iteration, and, if the model to be trained in the current iteration contains an edge trained in a previous iteration, determining the trained parameter corresponding to the trained edge as the initial parameter of the trained edge in the current iteration;
after the structure to be searched is trained, selecting at least one available model from models contained in the structure to be searched according to a test result of the model performance, wherein the model contained in the structure to be searched is a model obtained by reserving one edge among the edges between every two connected nodes;
determining a target model from the at least one available model;
the method includes receiving an input image with an input layer of the target model, extracting image features of the input image with an intermediate layer of the target model, and outputting a processing result for the input image with an output layer of the target model.
2. The image processing method according to claim 1, wherein the model to be trained in the current iteration is a model obtained by randomly reserving one edge among the edges between every two connected nodes.
3. The image processing method according to claim 1, wherein selecting at least one available model from the models included in the structure to be searched according to the test result of the model performance comprises:
and selecting the first N models with the optimal performance from the models contained in the structure to be searched according to the test result of the model performance, wherein N is a positive integer greater than or equal to 1.
4. The image processing method according to claim 3, wherein said selecting the top N models with the best performance from the models included in the structure to be searched according to the test result of the model performance comprises:
testing the performance of each model contained in the structure to be searched;
and selecting the top N models with the optimal performance from all models contained in the structure to be searched according to the test result of the model performance.
5. The image processing method according to claim 3, wherein said selecting the top N models with the best performance from the models included in the structure to be searched according to the test result of the model performance comprises:
and selecting the first N models with the optimal performance from the models contained in the structure to be searched by utilizing a heuristic search algorithm according to the test result of the model performance.
6. The image processing method according to any of claims 1-5, characterized in that after said selecting at least one available model from the models comprised by the structure to be searched according to the test results on the model performance, the method further comprises:
and further training the at least one available model by using images of the target task, and selecting the model with the optimal performance according to the result of the further training.
7. The image processing method according to claim 1, wherein the constructing a structure to be searched comprises:
constructing at least one unit to be searched, wherein the unit to be searched comprises a plurality of nodes and directional edges connecting the nodes;
and constructing the structure to be searched according to the at least one unit to be searched, wherein each unit to be searched can be copied into a plurality of units during construction.
8. The image processing method of claim 1, wherein the candidate operation comprises a multiply-by-0 operation.
9. The image processing method according to claim 1, wherein the plurality of nodes include a node having a summation function, and the node having the summation function is capable of adding input data from different nodes to obtain data to be buffered by the node.
10. An image processing apparatus employing a neural network model, the neural network model including an input layer, an intermediate layer, and an output layer, the apparatus comprising:
a building module, configured to construct a structure to be searched, wherein the structure to be searched comprises a plurality of nodes and directed edges connecting the nodes, each node represents a unit for caching data in a neural network, each edge represents that the data cached by the edge's start node is processed by a candidate operation and then input to the edge's end node, at least one edge is connected between any two connected nodes, and each of the at least one edge corresponds to a different candidate operation;
the training module is configured to train the structure to be searched by using images in a training set, determine, in each iteration of the training process, a model obtained by reserving one edge among the edges between every two connected nodes as the model to be trained in the current iteration, and, if the model to be trained in the current iteration contains an edge trained in a previous iteration, determine the trained parameter corresponding to the trained edge as the initial parameter of the trained edge in the current iteration;
the selection module is configured to select, after the structure to be searched is trained, at least one available model from models contained in the structure to be searched according to a test result of the model performance, wherein the model contained in the structure to be searched is a model obtained by reserving one edge among the edges between every two connected nodes;
a model determination module for determining a target model from the at least one available model;
the execution module is used for receiving an input image by utilizing an input layer of the target model, extracting image characteristics of the input image by utilizing a middle layer of the target model, and outputting a processing result of the input image by utilizing an output layer of the target model.
11. A computer-readable storage medium, having stored thereon computer program instructions, which, when read and executed by a processor, perform the steps of the method of any one of claims 1-9.
12. An electronic device comprising a memory and a processor, the memory having stored therein computer program instructions, wherein the computer program instructions, when read and executed by the processor, perform the steps of the method of any of claims 1-9.
CN201811584647.8A 2018-12-24 2018-12-24 Image processing method and device, storage medium and electronic equipment Active CN109685204B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811584647.8A CN109685204B (en) 2018-12-24 2018-12-24 Image processing method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811584647.8A CN109685204B (en) 2018-12-24 2018-12-24 Image processing method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN109685204A CN109685204A (en) 2019-04-26
CN109685204B true CN109685204B (en) 2021-10-01

Family

ID=66188977

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811584647.8A Active CN109685204B (en) 2018-12-24 2018-12-24 Image processing method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN109685204B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110197258B (en) * 2019-05-29 2021-10-29 北京市商汤科技开发有限公司 Neural network searching method, image processing device, neural network searching apparatus, image processing apparatus, and recording medium
WO2020237688A1 (en) * 2019-05-31 2020-12-03 深圳市大疆创新科技有限公司 Method and device for searching network structure, computer storage medium and computer program product
CN110598763A (en) * 2019-08-27 2019-12-20 南京云计趟信息技术有限公司 Image identification method and device and terminal equipment
CN113469891A (en) * 2020-03-31 2021-10-01 武汉Tcl集团工业研究院有限公司 Neural network architecture searching method, training method and image completion method
CN111582478B (en) * 2020-05-09 2023-09-22 北京百度网讯科技有限公司 Method and device for determining model structure
CN111667055A (en) * 2020-06-05 2020-09-15 北京百度网讯科技有限公司 Method and apparatus for searching model structure
CN114647472B (en) * 2022-03-24 2023-08-15 北京字跳网络技术有限公司 Picture processing method, apparatus, device, storage medium, and program product

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9195941B2 (en) * 2013-04-23 2015-11-24 International Business Machines Corporation Predictive and descriptive analysis on relations graphs with heterogeneous entities
CN104978601B (en) * 2015-06-26 2017-08-25 深圳市腾讯计算机系统有限公司 neural network model training system and method
CN106295796B (en) * 2016-07-22 2018-12-25 浙江大学 entity link method based on deep learning
CN107832807B (en) * 2017-12-07 2020-08-07 上海联影医疗科技有限公司 Image processing method and system
CN108364068B (en) * 2018-01-05 2021-04-13 华南师范大学 Deep learning neural network construction method based on directed graph and robot system
CN108985386A (en) * 2018-08-07 2018-12-11 北京旷视科技有限公司 Obtain method, image processing method and the corresponding intrument of image processing model

Also Published As

Publication number Publication date
CN109685204A (en) 2019-04-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant