CN110781999B

CN110781999B - Neural network architecture selection method and device

Info

Publication number: CN110781999B
Application number: CN201911037319.0A
Authority: CN
Inventors: 初祥祥; 许瑞军; 张勃; 李吉祥; 李庆源; 王斌
Original assignee: Beijing Xiaomi Mobile Software Co Ltd
Current assignee: Beijing Xiaomi Mobile Software Co Ltd
Priority date: 2019-10-29
Filing date: 2019-10-29
Publication date: 2022-10-11
Anticipated expiration: 2039-10-29
Also published as: CN110781999A

Abstract

The disclosure relates to a method and a device for selecting a neural network architecture. The method comprises the following steps: obtaining the t-th generation population R_t; performing non-dominated sorting on the individuals contained in the population R_t to obtain k individual groups with different ranks; sequentially selecting i individual groups from the k individual groups; for the jth individual in the ith individual group, calculating the weighted crowding distance of the jth individual under a plurality of targets according to the crowding distance of the jth individual under each target and the weight corresponding to each target; selecting m individuals from the ith individual group according to the weighted crowding distance, the m individuals and all the individuals in the first i-1 individual groups forming the (t+1)-th generation population R_t+1; letting t = t+1 and executing again from the step of performing non-dominated sorting on the individuals contained in the population R_t; and, when the T-th generation population R_T is generated, determining the population R_T to be the target population. The method and the device generate a population that meets the preferences of a decision maker and improve the flexibility of determining the crowding degree of individuals.

Description

Neural network architecture selection method and device

Technical Field

The embodiment of the disclosure relates to the technical field of computers, in particular to a method and a device for selecting a neural network architecture.

Background

NSGA-II (Non-dominated Sorting Genetic Algorithm II, with an elitist strategy) is an improved version of NSGA (Non-dominated Sorting Genetic Algorithm).

In NSGA-II, each individual in a population has two attributes: a non-dominated rank and a crowding degree. When two individuals have different non-dominated ranks, the individual with the smaller rank is preferentially retained; when two individuals have the same non-dominated rank, the individual with the better crowding degree, i.e. the one with fewer individuals crowded around it, is preferentially retained.
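The retention rule just described can be sketched as a small comparison function. This is only an illustration: the `Individual` class, its field names and the function name `crowded_better` are hypothetical and do not come from the disclosure, and a larger crowding value is assumed to mean a less crowded neighbourhood.

```python
from dataclasses import dataclass


@dataclass
class Individual:
    rank: int        # non-dominated rank; a smaller rank is preferred
    crowding: float  # crowding value; larger is assumed to mean less crowded


def crowded_better(a: Individual, b: Individual) -> bool:
    """Return True if `a` should be retained in preference to `b`."""
    if a.rank != b.rank:
        # Different ranks: the smaller rank wins regardless of crowding.
        return a.rank < b.rank
    # Same rank: the less crowded individual wins.
    return a.crowding > b.crowding
```

For instance, an individual of rank 0 is preferred over one of rank 1 even if the latter sits in a much sparser region.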

However, for individuals with the same non-dominated rank, the crowding degree is determined solely from the individual's crowding on the different optimization targets, so there is only a single way of determining it.

Disclosure of Invention

The embodiment of the disclosure provides a method and a device for selecting a neural network architecture. The technical scheme is as follows:

according to a first aspect of the embodiments of the present disclosure, there is provided a population generating method, including:

obtaining a t-th generation population R_t, where the population R_t includes a population P_t and a population Q_t, the population Q_t is obtained by processing the population P_t, the population P_t and the population Q_t each contain n individuals, t is a positive integer with an initial value of 1, and n is a positive integer;

performing non-dominated sorting on the individuals contained in the population R_t to obtain k individual groups with different ranks, where k is a positive integer greater than 1;

sequentially selecting i individual groups from the k individual groups in ascending order of rank, where the total number of individuals contained in the i individual groups is greater than or equal to n, and the total number of individuals contained in the first i-1 individual groups is less than n;

for a jth individual in the ith individual group, calculating the weighted crowding distance of the jth individual under multiple targets according to the crowding distance of the jth individual under each target and the weight corresponding to each target;

selecting m individuals from the ith individual group according to the weighted crowding distance of each individual in the ith individual group, where m is the difference between n and the total number of individuals contained in the first i-1 individual groups;

generating a (t+1)-th generation population R_t+1, where the population R_t+1 includes a population P_t+1 and a population Q_t+1, the population P_t+1 includes the m individuals and the individuals in the first i-1 individual groups, the population Q_t+1 is obtained by processing the population P_t+1, and the population Q_t+1 and the population P_t+1 each contain n individuals;

letting t = t+1 and executing again from the step of performing non-dominated sorting on the individuals contained in the population R_t to obtain k individual groups with different ranks; and, when a T-th generation population R_T is generated, determining the population R_T to be the target population, where T is a preset value and an integer greater than 1.

Optionally, the calculating the weighted crowding distance of the jth individual under the multiple targets according to the crowding distance of the jth individual under each target and the weight corresponding to each target includes: for a target x among the multiple targets, sorting the individuals in the ith individual group according to the function value corresponding to the target x, where x is a positive integer; calculating the crowding distance of the jth individual under the target x according to the function values of the two individuals adjacent to the jth individual and the maximum and minimum values in the sorting result; and calculating the weighted crowding distance of the jth individual under the multiple targets according to the crowding distance of the jth individual under the target x and the weight corresponding to the target x.

Optionally, the calculating the crowding distance of the jth individual under the target x according to the function values of the two individuals adjacent to the jth individual and the maximum and minimum values in the sorting result includes: calculating the difference between the function values of the two individuals adjacent to the jth individual; calculating the difference between the maximum value and the minimum value; and dividing the difference between the function values by the difference between the maximum value and the minimum value to obtain the crowding distance of the jth individual under the target x.

Optionally, the calculating the weighted crowding distance of the jth individual under the multiple targets according to the crowding distance of the jth individual under each target and the weight corresponding to each target includes: multiplying the crowding distance of the jth individual under the target x by the weight corresponding to the target x; and accumulating the multiplication results of the jth individual over the multiple targets to obtain the weighted crowding distance of the jth individual under the multiple targets.
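The two optional computations above can be written compactly as follows. The symbols are introduced here purely for illustration and are not notation used by the disclosure: f_x denotes the function value under target x, j-1 and j+1 denote the two neighbours of the jth individual after sorting by target x, w_x is the weight of target x, and X is the number of targets.

```latex
d_j^{(x)} = \frac{f_x(j+1) - f_x(j-1)}{f_x^{\max} - f_x^{\min}},
\qquad
D_j = \sum_{x=1}^{X} w_x \, d_j^{(x)}
```

Here d_j^{(x)} is the crowding distance of the jth individual under target x and D_j is its weighted crowding distance under the multiple targets.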

Optionally, after the determining the population R_T to be the target population, the method further includes: acquiring a preset index of each individual in the target population; and selecting a target individual from the target population according to the preset index.

According to a second aspect of embodiments of the present disclosure, there is provided a method of selecting a neural network architecture, the method comprising:

obtaining a t-th generation architecture population R_t, where the architecture population R_t includes an architecture population P_t and an architecture population Q_t, the architecture population Q_t is obtained by processing the architecture population P_t, the architecture population P_t and the architecture population Q_t each contain n neural network architectures, t is a positive integer with an initial value of 1, and n is a positive integer;

performing non-dominated sorting on the neural network architectures contained in the architecture population R_t to obtain k individual groups with different ranks, where k is a positive integer greater than 1;

sequentially selecting i individual groups from the k individual groups in ascending order of rank, where the total number of neural network architectures contained in the i individual groups is greater than or equal to n, and the total number of neural network architectures contained in the first i-1 individual groups is less than n;

for a jth neural network architecture in the ith individual group, calculating the weighted crowding distance of the jth neural network architecture under multiple targets according to the crowding distance of the jth neural network architecture under each target and the weight corresponding to each target;

selecting m neural network architectures from the ith individual group according to the weighted crowding distance of each neural network architecture in the ith individual group, where m is the difference between n and the total number of neural network architectures contained in the first i-1 individual groups;

generating a (t+1)-th generation architecture population R_t+1, where the architecture population R_t+1 includes an architecture population P_t+1 and an architecture population Q_t+1, the architecture population P_t+1 includes the m neural network architectures and the neural network architectures in the first i-1 individual groups, the architecture population Q_t+1 is obtained by processing the architecture population P_t+1, and the architecture population Q_t+1 and the architecture population P_t+1 each contain n neural network architectures;

letting t = t+1 and executing again from the step of performing non-dominated sorting on the neural network architectures contained in the architecture population R_t to obtain k individual groups with different ranks; and, when a T-th generation architecture population R_T is generated, determining the architecture population R_T to be the target architecture population, where T is a preset value and an integer greater than 1;

selecting a target neural network architecture from the target architecture population;

deploying the target neural network architecture in a terminal.

Optionally, the selecting a target neural network architecture from the target architecture population includes: acquiring parameter indexes of each neural network architecture in the target architecture population; and selecting the target neural network architecture from the target architecture population according to the parameter index.

Optionally, the selecting a target neural network architecture from the target architecture population according to the parameter index includes: selecting z neural network architectures from the target architecture population according to operation speed, where z is a positive integer; and selecting the target neural network architecture from the z neural network architectures according to an evaluation index, where the evaluation index includes a business evaluation index and/or a floating-point operation count.
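A minimal sketch of this two-stage selection is given below. Everything concrete in it is an assumption made for illustration: the `Arch` record, the use of latency as the measure of operation speed, the use of accuracy as the business evaluation index, and the tie-breaking on the floating-point operation count are not specified by the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Arch:
    name: str
    latency_ms: float  # assumed proxy for operation speed (lower is faster)
    accuracy: float    # assumed business evaluation index (higher is better)
    flops: float       # floating-point operation count (lower is better)


def select_target(archs: list[Arch], z: int) -> Arch:
    """Keep the z fastest architectures, then pick the best-scoring one."""
    # Stage 1: shortlist by operation speed.
    fastest = sorted(archs, key=lambda a: a.latency_ms)[:z]
    # Stage 2: rank the shortlist by the evaluation index, breaking ties
    # in favour of the smaller floating-point operation count.
    return max(fastest, key=lambda a: (a.accuracy, -a.flops))
```

With z = 2, an accurate but slow architecture is filtered out in the first stage and never reaches the second.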

According to a third aspect of embodiments of the present disclosure, there is provided a population generation apparatus, the apparatus comprising:

a population acquisition module configured to acquire a t-th generation population R_t, where the population R_t includes a population P_t and a population Q_t, the population Q_t is obtained by processing the population P_t, the population P_t and the population Q_t each contain n individuals, t is a positive integer with an initial value of 1, and n is a positive integer;

an individual group generation module configured to perform non-dominated sorting on the individuals contained in the population R_t to obtain k individual groups with different ranks, where k is a positive integer greater than 1;

an individual group selection module configured to sequentially select i individual groups from the k individual groups in ascending order of rank, where the total number of individuals contained in the i individual groups is greater than or equal to n, and the total number of individuals contained in the first i-1 individual groups is less than n;

a weighted crowding distance calculation module configured to calculate, for a jth individual in the ith individual group, the weighted crowding distance of the jth individual under multiple targets according to the crowding distance of the jth individual under each target and the weight corresponding to each target;

an individual selection module configured to select m individuals from the ith individual group according to the weighted crowding distance of each individual in the ith individual group, where m is the difference between n and the total number of individuals contained in the first i-1 individual groups;

a population generation module configured to generate a (t+1)-th generation population R_t+1, where the population R_t+1 includes a population P_t+1 and a population Q_t+1, the population P_t+1 includes the m individuals and the individuals in the first i-1 individual groups, the population Q_t+1 is obtained by processing the population P_t+1, and the population Q_t+1 and the population P_t+1 each contain n individuals;

a population determination module configured to let t = t+1 and execute again from the step of performing non-dominated sorting on the individuals contained in the population R_t to obtain k individual groups with different ranks, and, when a T-th generation population R_T is generated, to determine the population R_T to be the target population, where T is a preset value and an integer greater than 1.

Optionally, the weighted crowding distance calculation module includes: an individual sorting sub-module configured to, for a target x among the multiple targets, sort the individuals in the ith individual group according to the function value corresponding to the target x, where x is a positive integer; a crowding distance calculation sub-module configured to calculate the crowding distance of the jth individual under the target x according to the function values of the two individuals adjacent to the jth individual and the maximum and minimum values in the sorting result; and a weighted crowding distance calculation sub-module configured to calculate the weighted crowding distance of the jth individual under the multiple targets according to the crowding distance of the jth individual under the target x and the weight corresponding to the target x.

Optionally, the crowding distance calculation sub-module is configured to: calculate the difference between the function values of the two individuals adjacent to the jth individual; calculate the difference between the maximum value and the minimum value; and divide the difference between the function values by the difference between the maximum value and the minimum value to obtain the crowding distance of the jth individual under the target x.

Optionally, the weighted crowding distance calculation sub-module is configured to: multiply the crowding distance of the jth individual under the target x by the weight corresponding to the target x; and accumulate the multiplication results of the jth individual over the multiple targets to obtain the weighted crowding distance of the jth individual under the multiple targets.

Optionally, the apparatus further comprises: a preset index acquisition module configured to acquire a preset index of each individual in the target population; and a target individual selection module configured to select a target individual from the target population according to the preset index.

According to a fourth aspect of embodiments of the present disclosure, there is provided a selection apparatus of a neural network architecture, the apparatus including:

an architecture population acquisition module configured to acquire a t-th generation architecture population R_t, where the architecture population R_t includes an architecture population P_t and an architecture population Q_t, the architecture population Q_t is obtained by processing the architecture population P_t, the architecture population P_t and the architecture population Q_t each contain n neural network architectures, t is a positive integer with an initial value of 1, and n is a positive integer;

an individual group generation module configured to perform non-dominated sorting on the neural network architectures contained in the architecture population R_t to obtain k individual groups with different ranks, where k is a positive integer greater than 1;

an individual group selection module configured to sequentially select i individual groups from the k individual groups in ascending order of rank, where the total number of neural network architectures contained in the i individual groups is greater than or equal to n, and the total number of neural network architectures contained in the first i-1 individual groups is less than n;

a weighted crowding distance calculation module configured to calculate, for a jth neural network architecture in the ith individual group, the weighted crowding distance of the jth neural network architecture under multiple targets according to the crowding distance of the jth neural network architecture under each target and the weight corresponding to each target;

an individual selection module configured to select m neural network architectures from the ith individual group according to the weighted crowding distance of each neural network architecture in the ith individual group, where m is the difference between n and the total number of neural network architectures contained in the first i-1 individual groups;

an architecture population generation module configured to generate a (t+1)-th generation architecture population R_t+1, where the architecture population R_t+1 includes an architecture population P_t+1 and an architecture population Q_t+1, the architecture population P_t+1 includes the m neural network architectures and the neural network architectures in the first i-1 individual groups, the architecture population Q_t+1 is obtained by processing the architecture population P_t+1, and the architecture population Q_t+1 and the architecture population P_t+1 each contain n neural network architectures;

an architecture population determination module configured to let t = t+1 and execute again from the step of performing non-dominated sorting on the neural network architectures contained in the architecture population R_t to obtain k individual groups with different ranks, and, when a T-th generation architecture population R_T is generated, to determine the architecture population R_T to be the target architecture population, where T is a preset value and an integer greater than 1;

a neural network architecture selection module configured to select a target neural network architecture from the target architecture population;

a neural network architecture deployment module configured to deploy the target neural network architecture in a terminal.

Optionally, the neural network architecture selecting module includes: a parameter index obtaining submodule configured to obtain a parameter index of each neural network architecture in the target architecture population; and the neural network architecture selecting submodule is configured to select the target neural network architecture from the target architecture population according to the parameter index.

Optionally, the neural network architecture selection sub-module is configured to: select z neural network architectures from the target architecture population according to operation speed, where z is a positive integer; and select the target neural network architecture from the z neural network architectures according to an evaluation index, where the evaluation index includes a business evaluation index and/or a floating-point operation count.

According to a fifth aspect of embodiments of the present disclosure, there is provided a computer device comprising a processor and a memory, the memory having stored therein a computer program, the computer program being loaded and executed by the processor to implement the steps of the population generating method according to the first aspect as described above, or to implement the steps of the selecting method of the neural network architecture according to the second aspect as described above.

According to a sixth aspect of embodiments of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the population generation method according to the first aspect or implements the steps of the selection method of the neural network architecture according to the second aspect.

The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:

The weighted crowding distance of each individual in an individual group is calculated according to the individual's crowding distance under each target and the weight corresponding to each target, and the required number of individuals is then selected from the individual group according to the weighted crowding distance to generate the target population. This extends the ways in which a population can be generated. Because the crowding degree of an individual is determined according to the weights of the targets, the priority of the targets can be adjusted according to the decision maker's preferences among them, so that a population meeting those preferences is generated and the flexibility of determining the crowding degree of individuals is improved.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.

FIG. 1 is a flow diagram illustrating a population generation method in accordance with an exemplary embodiment;

FIG. 2 is a schematic diagram illustrating an individual selection method according to an exemplary embodiment;

FIG. 3 is a flow diagram illustrating a method of selecting a neural network architecture in accordance with an exemplary embodiment;

FIG. 4 is a block diagram illustrating a population generation apparatus in accordance with an exemplary embodiment;

FIG. 5 is a block diagram illustrating a population generation apparatus in accordance with another exemplary embodiment;

FIG. 6 is a block diagram illustrating a selection device of a neural network architecture, in accordance with an exemplary embodiment;

FIG. 7 is a block diagram illustrating a selection apparatus of a neural network architecture, in accordance with another exemplary embodiment;

FIG. 8 is a block diagram illustrating a computer device in accordance with an example embodiment.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the disclosure, as detailed in the appended claims.

In the technical solution provided by the embodiments of the present disclosure, each step may be executed by a computer device, such as a server with computing and storage capabilities. Optionally, the computer device may be a single server, a server cluster composed of multiple servers, or a cloud computing service center. For convenience of explanation, in the following method embodiments each step is described as being executed by a server, but the disclosure is not limited thereto.

FIG. 1 is a flow diagram illustrating a population generation method according to an exemplary embodiment. The method can comprise the following steps (steps 101-107):

Step 101, obtaining the t-th generation population R_t, where t is a positive integer and the initial value of t is 1.

The t-th generation population R_t includes a population P_t and a population Q_t, and the population Q_t is obtained by processing the population P_t. Optionally, the population Q_t is obtained by sequentially applying a series of processing operations, such as selection, crossover and mutation, to the population P_t.

The selection processing, also called replication processing, selects a certain number of individuals from the population according to a certain probability, so that these individuals serve as parents for subsequent reproduction to generate new individuals. Optionally, the probability and the number of individuals are set according to the actual application scenario. Specific selection methods include roulette wheel selection, tournament selection, steady-state reproduction, proportion transformation and ranking selection, sharing methods, and the like, which are not limited in the embodiments of the present disclosure.

The crossover processing replaces and recombines parts of the structures of each pair of individuals selected as parents by the selection processing. Optionally, the specific crossover method depends on the encoding of the parent individuals: for example, when the encoding is binary or Gray code, the method is crossover; when the encoding is real-valued, the method is recombination; and when the encoding is an integer or letter permutation, the method is reordering.

The mutation processing selects a certain number of individuals from the population after crossover according to a certain probability, and inverts one randomly chosen position in each selected individual. Optionally, the probability and the number of individuals are set according to the actual application scenario. Mutation methods include binary mutation, real-valued mutation, sequence-number mutation, and the like, which are not limited in the embodiments of the present disclosure.
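One possible way of deriving Q_t from P_t through selection, crossover and mutation is sketched below. The concrete choices are assumptions made for illustration, not the method of the disclosure: binary-encoded individuals, binary tournament selection on a single scalar `fitness` score, single-point crossover, and single-bit inversion with a hypothetical probability `p_mut`. The function name `make_offspring` is likewise invented here.

```python
import random


def make_offspring(parents, fitness, rng, p_mut=0.1):
    """Produce len(parents) children from binary-encoded parents.

    Assumes at least two parents and genomes of length >= 2.
    """
    n = len(parents)

    def tournament():
        # Selection: draw two distinct parents, keep the fitter one.
        a, b = rng.sample(range(n), 2)
        return parents[a] if fitness[a] >= fitness[b] else parents[b]

    children = []
    while len(children) < n:
        p1, p2 = tournament(), tournament()
        # Crossover: swap the tails of the two parents at a random cut point.
        cut = rng.randrange(1, len(p1))
        for child in (p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]):
            child = list(child)
            # Mutation: with probability p_mut, invert one random bit.
            if rng.random() < p_mut:
                pos = rng.randrange(len(child))
                child[pos] ^= 1
            children.append(child)
    return children[:n]


# Usage: four 4-bit parents yield four 4-bit children.
rng = random.Random(0)
P = [[0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 1, 1]]
Q = make_offspring(P, fitness=[1, 2, 3, 4], rng=rng)
```

The returned population has the same size as the parent population, which matches the requirement that P_t and Q_t each contain n individuals.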

In the embodiments of the present disclosure, the population Q_t contains the same number of individuals as the population P_t; that is, the population P_t and the population Q_t each contain n individuals, where n is a positive integer.

Step 102, performing non-dominated sorting on the individuals contained in the population R_t to obtain k individual groups with different ranks, where k is a positive integer greater than 1.

The rank refers to an order, so the k individual groups with different ranks are k individual groups of different orders. Non-dominated sorting refers to sorting the individuals contained in the population R_t according to the dominance and non-dominance relationships among them. Assume the population contains an individual A and an individual B: when all objective function values of individual A are larger than those of individual B, individual A and individual B are in a dominance relationship, and individual A dominates individual B; when individual A has an objective function value smaller than that of individual B, individual A and individual B are considered to be in a non-dominance relationship.

For example, first, all non-dominated individuals in the population R_t, i.e. the individuals that are not dominated by any other individual, are found and stored in the individual group with rank 0, and each individual in that group is assigned the same non-dominated order 0. Then, for each individual in the rank-0 group, the individuals it dominates are examined; any of them that is not dominated by any other remaining individual is stored in the individual group with rank 1, and each individual in that group is assigned the same non-dominated order 1. Next, the same operation is performed on each individual in the rank-1 group, the individuals thus selected are stored in the individual group with rank 2, and each of them is assigned the same non-dominated order 2. This continues until every individual in the population R_t has been stored in a corresponding individual group and assigned a corresponding non-dominated order.
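The layered procedure described above resembles the fast non-dominated sort commonly used with NSGA-II, which can be sketched as follows. The sketch assumes that every objective is maximized and uses the standard Pareto dominance test (no worse on every objective and strictly better on at least one), which is slightly more permissive than the "all values larger" wording above. The function names are hypothetical.

```python
def dominates(a, b):
    """Assumed Pareto dominance under maximization."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def non_dominated_sort(objs):
    """Return a list of fronts; fronts[r] holds the indices with rank r."""
    n = len(objs)
    dominated = [[] for _ in range(n)]  # indices that individual p dominates
    count = [0] * n                     # number of individuals dominating p
    for p in range(n):
        for q in range(n):
            if p == q:
                continue
            if dominates(objs[p], objs[q]):
                dominated[p].append(q)
            elif dominates(objs[q], objs[p]):
                count[p] += 1
    # Rank 0: individuals that nobody dominates.
    fronts = [[p for p in range(n) if count[p] == 0]]
    while fronts[-1]:
        nxt = []
        for p in fronts[-1]:
            for q in dominated[p]:
                # Remove p's influence; q joins the next rank once no
                # remaining individual dominates it.
                count[q] -= 1
                if count[q] == 0:
                    nxt.append(q)
        fronts.append(nxt)
    return fronts[:-1]  # drop the trailing empty front
```

Each returned front corresponds to one individual group, and its position in the list is the rank of that group.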

Step 103, sequentially selecting i individual groups from the k individual groups in ascending order of rank, where the total number of individuals in the i individual groups is greater than or equal to n, and the total number of individuals in the first i-1 individual groups is less than n.

Each of the k individual groups corresponds to a rank; optionally, the value of the rank is a natural number. The server sequentially selects i individual groups from the k individual groups with different ranks in ascending order of rank.

For example, as shown in FIG. 2, the population P_t and the population Q_t each contain n individuals, so the population R_t formed by P_t and Q_t contains 2n individuals. After non-dominated sorting, the individuals in R_t form k individual groups, which are arranged in ascending order of rank: the individual group F₀ with rank 0, the individual group F₁ with rank 1, the individual group F₂ with rank 2, and so on. Suppose the total number of individuals in F₀ and F₁ is less than n, while the total number of individuals in F₀, F₁ and F₂ is greater than or equal to n. Then 3 individual groups, namely F₀, F₁ and F₂, are sequentially selected from the k individual groups, and the remaining individual groups, together with the individuals in them, are eliminated, i.e. discarded.
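The group selection in this example can be sketched as follows. The function name `select_groups` and its return shape are hypothetical; the sketch assumes the groups are already ordered by ascending rank and together contain at least n individuals.

```python
def select_groups(fronts, n):
    """Pick the first i groups whose cumulative size reaches n.

    Returns (first i-1 groups, the ith group, m), where m is the number
    of individuals still needed from the ith group.
    """
    kept, total = [], 0
    for front in fronts:
        kept.append(front)
        total += len(front)
        if total >= n:
            break  # the i groups now hold at least n individuals
    full, last = kept[:-1], kept[-1]
    m = n - sum(len(f) for f in full)
    return full, last, m


# Usage: with n = 6, F0 and F1 hold 5 individuals, so F2 is the ith group
# and one more individual is needed from it; F3 is discarded.
full, last, m = select_groups([[0, 1], [2, 3, 4], [5, 6, 7], [8]], n=6)
```

The groups after the ith one never enter the result, matching the elimination described above.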

Step 104, for the jth individual in the ith individual group, calculating the weighted crowding distance of the jth individual under a plurality of targets according to the crowding distance of the jth individual under each target and the weight corresponding to each target.

In the embodiment of the present disclosure, n individuals need to be selected from population R_t to generate the next-generation population. However, when the total number of individuals contained in the i individual groups selected in step 103 is greater than n and the total number of individuals contained in the first i-1 individual groups is less than n, some of the individuals contained in the i individual groups must be eliminated, and the server needs a criterion for deciding which individuals to keep. In the embodiment of the present disclosure, the server determines whether each individual in the ith individual group is kept according to the weighted crowding distance of that individual.

The weighted crowding distance measures the crowding degree of an individual across a plurality of targets. It is calculated from the weight corresponding to each target and the crowding distance of the individual under each target. Optionally, to simplify the calculation of the weighted crowding distance and reduce the processing overhead of the server, the weights corresponding to the targets sum to 1 in the embodiment of the present disclosure. Optionally, to reflect the decision maker's degree of preference for each target and to let the decision maker set the weights flexibly for different application scenarios, the weight corresponding to each target may be set according to the actual application scenario or according to the importance of the target. For example, suppose three targets, denoted target 1, target 2, and target 3, are used in calculating the weighted crowding distance, and the decision maker cares most about how individuals perform on target 2, that is, the decision maker prefers to retain the individuals that perform better on target 2. The weight corresponding to target 2 may then be set higher than the weights corresponding to targets 1 and 3; for example, the weights corresponding to target 1, target 2, and target 3 are set to 0.2, 0.7, and 0.1, respectively.

Step 105, selecting m individuals from the ith individual group according to the weighted crowding distance of each individual in the ith individual group, wherein m is the difference between n and the total number of individuals in the first i-1 individual groups.

After the server calculates the weighted crowding distance of each individual in the ith individual group, it selects m individuals to keep from the ith individual group according to the weighted crowding distance and eliminates the rest. Optionally, to simplify this selection, selecting m individuals from the ith individual group according to the weighted crowding distance of each individual in the ith individual group includes: sorting the individuals in the ith individual group according to their weighted crowding distances; and selecting m individuals from the ith individual group according to the sorting result. For example, as shown in FIG. 2, the ith individual group is individual group F_2. To select m individuals from F_2, the server first calculates the weighted crowding distance of each individual in F_2 and then sorts the individuals by weighted crowding distance; in the embodiment of the present disclosure, the individuals may be sorted in descending or ascending order of weighted crowding distance. The server then keeps the m individuals with the larger weighted crowding distances: if the individuals are sorted in descending order, the first m individuals in the sorting result are kept; if they are sorted in ascending order, the last m individuals in the sorting result are kept.
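The truncation of the ith individual group can be sketched as follows, assuming the weighted crowding distances have already been computed and are supplied as a dictionary. The name `select_by_distance` and the numeric values are hypothetical.

```python
from typing import Dict, List


def select_by_distance(group: List[int], distance: Dict[int, float], m: int) -> List[int]:
    """Keep the m individuals of the group with the largest weighted crowding distance."""
    # Descending sort, so the first m entries are the ones to keep.
    ranked = sorted(group, key=lambda j: distance[j], reverse=True)
    return ranked[:m]


# Hypothetical weighted crowding distances for the individuals of group F_2.
f2 = [4, 5, 6, 7]
distance = {4: 0.90, 5: 0.41, 6: 0.59, 7: 0.20}
survivors = select_by_distance(f2, distance, m=2)
```

Sorting in ascending order and keeping the last m entries would select the same individuals; only the position in the sorted list changes.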

In the embodiment of the present disclosure, to ensure that the number of individuals selected from population R_t to generate the next-generation population is n, m is the difference between n and the total number of individuals contained in the first i-1 individual groups.

Step 106, generating the (t+1)th generation population R_{t+1}.

The (t+1)th generation population R_{t+1} includes population P_{t+1} and population Q_{t+1}, where Q_{t+1} is obtained by processing P_{t+1}. Optionally, Q_{t+1} is obtained by applying selection, crossover, and mutation to P_{t+1}; these operations are described in detail in step 101 and are not repeated here.

In the embodiment of the present disclosure, population P_{t+1} includes the m individuals and the individuals in the first i-1 individual groups. As shown in FIG. 2, the ith individual group is individual group F_2, and the first i-1 individual groups are individual groups F_0 and F_1. According to the weighted crowding distance of each individual in F_2, m individuals are selected from F_2; the other individuals in F_2 and all individual groups other than F_0, F_1, and F_2 are eliminated. The m individuals and the individuals in F_0 and F_1 form population P_{t+1}; population Q_{t+1} is then generated from P_{t+1}, and P_{t+1} and Q_{t+1} are combined into population R_{t+1}. In the embodiment of the present disclosure, Q_{t+1} contains the same number of individuals as P_{t+1}, that is, P_{t+1} and Q_{t+1} each contain n individuals.

Step 107, let t = t+1, and execute again from the step of performing non-dominated sorting on the individuals contained in population R_t to obtain k individual groups with different ranks; when the Tth generation population R_T is generated, determine population R_T as the target population, where T is a preset value and is an integer greater than 1.

The non-dominated sorting and the process of obtaining k individual groups with different ranks are described in step 102 and are not repeated here. Optionally, to meet the requirements of the actual application scenario, the server determines the value of T according to that scenario; for example, T is 50.
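The iteration over generations in steps 101 to 107 can be outlined as a simple loop. The sketch below is schematic: `evolve`, `make_offspring`, and `select_survivors` are hypothetical names, the survivor selection of steps 102 to 105 and the selection, crossover, and mutation of step 101 are passed in as callables, and the toy callables in the usage example stand in for them only to show the control flow.

```python
from typing import Callable, List, TypeVar

Ind = TypeVar("Ind")


def evolve(
    p1: List[Ind],
    make_offspring: Callable[[List[Ind]], List[Ind]],
    select_survivors: Callable[[List[Ind], int], List[Ind]],
    generations: int,
) -> List[Ind]:
    """Return R_T, starting from P_1 with t = 1 and stopping once t reaches T."""
    n = len(p1)
    p = list(p1)
    r = p + make_offspring(p)        # R_1 = P_1 combined with Q_1
    for _ in range(1, generations):  # t = 1, ..., T-1
        p = select_survivors(r, n)   # steps 102 to 105: P_{t+1} from R_t
        r = p + make_offspring(p)    # step 106: R_{t+1} = P_{t+1} combined with Q_{t+1}
    return r


# Toy stand-ins: individuals are numbers, smaller is better,
# and each offspring is its parent minus one.
target_population = evolve(
    [5, 6],
    make_offspring=lambda p: [x - 1 for x in p],
    select_survivors=lambda r, n: sorted(r)[:n],
    generations=3,
)
```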

Optionally, in order to further select the required individuals from the target population, step 107 further includes: acquiring preset indexes of each individual in the target population; and selecting target individuals from the target population according to the preset indexes. For example, when the technical solution of the embodiment of the present disclosure is applied to neural network architectures, the preset indexes may include an operation speed, an operation precision, a number of floating-point operations, and the like; these are described in detail in the following embodiments and are not repeated here. Optionally, to simplify the selection, selecting target individuals from the target population according to the preset indexes includes: sorting the individuals in the target population according to the preset indexes; and selecting target individuals from the target population according to the sorting result. Optionally, in order to introduce the decision maker's degree of preference for each preset index, sorting the individuals in the target population according to the preset indexes includes: sorting the individuals in the target population according to the preset indexes and the weights corresponding to the preset indexes. For example, for neural network architectures there may be three preset indexes, namely the operation speed, the service evaluation index, and the number of floating-point operations. If the decision maker cares most about the operation speed, the weight corresponding to the operation speed is set to be the largest; the individuals in the target population are sorted accordingly, and a target individual that meets the decision maker's requirement on operation speed is selected from the target population according to the sorting result.
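One possible reading of the weighted sorting by preset indexes is a weighted sum of index scores, sketched below. The disclosure does not specify how the indexes and weights are combined, so this is an assumption; it is further assumed that every index has already been normalized to the range [0, 1] with larger values being better. The name `rank_by_indexes` and all numbers are hypothetical.

```python
from typing import Dict, List


def rank_by_indexes(
    indexes: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[str]:
    """Sort individuals by the weighted sum of their normalized index scores, best first."""

    def score(name: str) -> float:
        return sum(weights[k] * indexes[name][k] for k in weights)

    return sorted(indexes, key=score, reverse=True)


# Hypothetical normalized scores (larger is better) for three architectures.
indexes = {
    "A": {"speed": 0.9, "service": 0.6, "flops": 0.5},
    "B": {"speed": 0.5, "service": 0.9, "flops": 0.6},
    "C": {"speed": 0.7, "service": 0.7, "flops": 0.9},
}
# The decision maker cares most about operation speed.
weights = {"speed": 0.6, "service": 0.3, "flops": 0.1}
ranking = rank_by_indexes(indexes, weights)
```

With these weights the fastest architecture, A, ranks first even though B has the best service evaluation score.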

In summary, according to the technical solution provided by the embodiment of the present disclosure, the weighted crowding distance of each individual is calculated from the crowding distance of the individual under each target and the weight corresponding to each target, and the required number of individuals is then selected from the individual group according to the weighted crowding distance to generate the target population. This extends the ways in which a population can be generated. Because the crowding degree of an individual is determined using the weights of multiple targets, the priorities of the targets can be adjusted according to the decision maker's preference, a population that meets that preference is generated, and the flexibility of determining the crowding degree of an individual is improved.

In addition, according to the technical solution provided by the embodiment of the present disclosure, the preset indexes of the individuals in the generated population are obtained, and the required individuals are then selected from the generated population according to the preset indexes and applied to the actual application scenario. The required individuals can thus be selected flexibly for different application scenarios, which meets the decision maker's individual requirements.

In one possible implementation, the step 104 includes: for a target x in a plurality of targets, sorting each individual in the ith individual group according to a function value corresponding to the target x, wherein x is a positive integer; calculating the crowding distance of the jth individual under the target x according to the function values of two adjacent individuals of the jth individual and the maximum value and the minimum value in the sequencing result; and calculating weighted crowding distances of the jth individual under a plurality of targets according to the crowding distance of the jth individual under the target x and the weight corresponding to the target x.

In the embodiment of the present disclosure, each target is represented in the form of a function, and each individual in the ith individual group corresponds to one function value on each target. For a target x among the multiple targets, the individuals in the ith individual group may be sorted by function value; the server may sort them in ascending or descending order of function value, which is not limited in the embodiment of the present disclosure. For the jth individual in the ith individual group, the server may calculate the crowding distance of the jth individual under target x according to the function values of the two adjacent individuals of the jth individual and the maximum value and the minimum value in the sorting result, where the two adjacent individuals of the jth individual are the two individuals whose function values are closest to that of the jth individual in the sorting result. After calculating the crowding distance of the jth individual under target x, the server can calculate the weighted crowding distance of the jth individual under the multiple targets from this crowding distance and the weight corresponding to target x.

Optionally, the calculating a crowding distance of the jth individual under the target x according to the function values of two adjacent individuals of the jth individual and the maximum value and the minimum value in the sorting result includes: calculating the difference value of the function values of two adjacent individuals of the jth individual; calculating the difference value between the maximum value and the minimum value; and dividing the difference value of the function values by the difference value of the maximum value and the minimum value to obtain the crowding distance of the jth individual under the target x.

After obtaining the two adjacent individuals of the jth individual and their function values on target x, the server calculates the difference between these two function values. Optionally, to simplify subsequent calculation and processing, the server obtains the difference by subtracting the smaller of the two function values from the larger. It should be noted that the step of calculating the difference between the function values of the two adjacent individuals of the jth individual and the step of calculating the difference between the maximum value and the minimum value may be executed simultaneously or sequentially. The server then divides the difference between the function values by the difference between the maximum value and the minimum value to obtain the crowding distance of the jth individual under target x. Optionally, the server calculates the crowding distance of the jth individual under target x according to the following formula:

$$D = \frac{O_{+}^{x} - O_{-}^{x}}{O_{\max}^{x} - O_{\min}^{x}}$$

where $D$ represents the crowding distance of the jth individual under target x, $O_{+}^{x}$ represents the larger of the function values of the two adjacent individuals of the jth individual, $O_{-}^{x}$ represents the smaller of the function values of the two adjacent individuals of the jth individual, $O_{\max}^{x}$ represents the maximum value in the sorting result based on target x, and $O_{\min}^{x}$ represents the minimum value in the sorting result based on target x.
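The single-target crowding distance described above can be sketched as follows. The text does not say how to treat the two boundary individuals, which have only one neighbor, or a group whose function values are all equal; this sketch assumes the standard NSGA-II convention of assigning infinity to boundary individuals and assigns 0 when the maximum equals the minimum. The name `crowding_distance` is hypothetical.

```python
import math
from typing import List


def crowding_distance(values: List[float]) -> List[float]:
    """Crowding distance of each individual under one target.

    values[j] is the function value of individual j on that target.
    """
    n = len(values)
    if n == 0:
        return []
    order = sorted(range(n), key=lambda j: values[j])
    span = values[order[-1]] - values[order[0]]  # O_max - O_min
    dist = [0.0] * n
    for pos, j in enumerate(order):
        if pos == 0 or pos == n - 1:
            # Assumption (not stated in the text): boundary individuals get infinity.
            dist[j] = math.inf
        elif span > 0:
            # (O_plus - O_minus) / (O_max - O_min)
            dist[j] = (values[order[pos + 1]] - values[order[pos - 1]]) / span
    return dist


# Hypothetical function values of four individuals on target x.
d = crowding_distance([1.0, 2.0, 4.0, 9.0])
```

For the second individual, the neighbors have values 1.0 and 4.0 and the range is 8.0, giving 3/8 = 0.375; for the third individual the result is (9.0 - 2.0)/8.0 = 0.875.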

Optionally, calculating the weighted crowding distance of the jth individual under the multiple targets according to the crowding distance of the jth individual under each target and the weight corresponding to each target includes: multiplying the crowding distance of the jth individual under target x by the weight corresponding to target x; and accumulating the products obtained for the jth individual over the multiple targets to obtain the weighted crowding distance of the jth individual under the multiple targets.

In the embodiment of the present disclosure, the crowding distance of the jth individual under each target other than target x is calculated in the same way as the crowding distance under target x; the calculation process is described above and is not repeated here. After calculating the crowding distances of the jth individual under the multiple targets, the server multiplies each crowding distance by the weight of the corresponding target and sums the products to obtain the weighted crowding distance of the jth individual under the multiple targets. Optionally, the server calculates the weighted crowding distance of the jth individual under the multiple targets according to the following formula:

$$D(j) = \sum_{x=1}^{y} w_{x} \cdot \frac{O_{+}^{x} - O_{-}^{x}}{O_{\max}^{x} - O_{\min}^{x}}$$

where $D(j)$ represents the weighted crowding distance of the jth individual, $O_{+}^{x}$ represents the larger of the function values of the two adjacent individuals of the jth individual under target x, $O_{-}^{x}$ represents the smaller of the function values of the two adjacent individuals of the jth individual under target x, $O_{\max}^{x}$ represents the maximum value in the sorting result based on target x, $O_{\min}^{x}$ represents the minimum value in the sorting result based on target x, $w_{x}$ represents the weight corresponding to target x, and $y$ represents the number of targets, $y$ being a positive integer.
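The weighted sum over targets can be sketched as below. As an assumption not found in the text, boundary individuals on any target with a non-zero weight receive an infinite distance (the standard NSGA-II convention), and a target whose weight is zero is skipped. The name `weighted_crowding_distance` and the data are hypothetical.

```python
import math
from typing import List, Sequence


def weighted_crowding_distance(
    objs: List[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """D(j) = sum over targets x of w_x * (O_plus - O_minus) / (O_max - O_min)."""
    n = len(objs)
    total = [0.0] * n
    if n == 0:
        return total
    for x, w in enumerate(weights):
        if w == 0:
            continue  # a zero-weight target contributes nothing
        order = sorted(range(n), key=lambda j: objs[j][x])
        span = objs[order[-1]][x] - objs[order[0]][x]
        for pos, j in enumerate(order):
            if pos == 0 or pos == n - 1:
                total[j] += math.inf  # assumption: boundary individuals are always kept
            elif span > 0:
                gap = objs[order[pos + 1]][x] - objs[order[pos - 1]][x]
                total[j] += w * gap / span
    return total


# Five hypothetical individuals of one group, evaluated on two targets.
group = [(1.0, 10.0), (2.0, 7.0), (4.0, 6.0), (7.0, 2.0), (10.0, 1.0)]
dist = weighted_crowding_distance(group, weights=(0.3, 0.7))
```

For the second individual, the per-target distances are 3/9 and 4/9, so the weighted crowding distance is 0.3 * 3/9 + 0.7 * 4/9, roughly 0.41. Changing the weights changes the relative order of the interior individuals, which is how the decision maker's preference enters the selection.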

In summary, according to the technical solution provided by the embodiment of the present disclosure, the crowding distance of an individual under a single target is calculated from the adjacent individuals of the individual under that target and the maximum value and the minimum value under that target, and the weighted crowding distance of the individual under multiple targets is then calculated from the crowding distance under each single target and the weight corresponding to that target. This extends the ways in which the crowding degree of an individual can be determined: the weight corresponding to each target can be set flexibly according to actual requirements, so that the decision maker's degree of preference for each target is taken into account when the crowding degree of an individual is determined.

FIG. 3 is a flow chart illustrating a method of selecting a neural network architecture, according to an example embodiment. The method can comprise the following steps (steps 301-309):

Step 301, obtaining the tth generation architecture population R_t, where architecture population R_t includes architecture population P_t and architecture population Q_t, architecture population Q_t is obtained by processing architecture population P_t, architecture population P_t and architecture population Q_t each contain n neural network architectures, t is a positive integer with an initial value of 1, and n is a positive integer.

Step 302, performing non-dominated sorting on the neural network architectures contained in architecture population R_t to obtain k individual groups with different ranks, where k is a positive integer greater than 1.

Step 303, sequentially selecting i individual groups from the k individual groups in order of rank from small to large, where the total number of neural network architectures contained in the i individual groups is greater than or equal to n, and the total number of neural network architectures contained in the first i-1 individual groups is less than n.

Step 304, for the jth neural network architecture in the ith individual group, calculating the weighted crowding distance of the jth neural network architecture under a plurality of targets according to the crowding distance of the jth neural network architecture under each target and the weight corresponding to each target.

Step 305, selecting m neural network architectures from the ith individual group according to the weighted crowding distance of each neural network architecture in the ith individual group, where m is the difference between n and the total number of neural network architectures contained in the first i-1 individual groups.

Step 306, generating the (t+1)th generation architecture population R_{t+1}, where architecture population R_{t+1} includes architecture population P_{t+1} and architecture population Q_{t+1}, architecture population P_{t+1} includes the m neural network architectures and the neural network architectures in the first i-1 individual groups, architecture population Q_{t+1} is obtained by processing architecture population P_{t+1}, and architecture population Q_{t+1} and architecture population P_{t+1} each contain n neural network architectures.

Step 307, let t = t+1, and execute again from the step of performing non-dominated sorting on the neural network architectures contained in architecture population R_t to obtain k individual groups with different ranks; when the Tth generation architecture population R_T is generated, determine architecture population R_T as the target architecture population, where T is a preset value and is an integer greater than 1.

Based on the descriptions of step 101 to step 107 in the embodiment shown in fig. 1, the descriptions of step 301 to step 307 in this embodiment are obtained, and for the details of step 301 to step 307, please refer to the embodiment shown in fig. 1, which is not repeated herein.

Step 308, selecting a target neural network architecture from the target architecture population.

Optionally, the selecting the target neural network architecture from the target architecture population includes: and selecting a target neural network architecture from the target architecture population according to a preset index. The preset index is an index for evaluating each neural network architecture in the target architecture population, and the preset index may be a value corresponding to one parameter index for evaluating the quality of the neural network architecture or a value obtained by integrating multiple parameter indexes for evaluating the quality of the neural network architecture, which is not limited in the embodiments of the present disclosure. Optionally, in order to facilitate the server to select the target neural network architecture from the target architecture population, the selecting the target neural network architecture from the target architecture population according to the preset index includes: sorting the neural network architectures in the target architecture population according to a preset index; and selecting a target neural network architecture from the target architecture population according to the sequencing result. In the embodiment of the present disclosure, the neural network architectures may be sorted from small to large according to the preset index, and the neural network architectures may also be sorted from large to small according to the preset index.

Optionally, in order to specify how the neural network architecture is selected, step 308 includes: acquiring parameter indexes of each neural network architecture in the target architecture population; and selecting a target neural network architecture from the target architecture population according to the parameter indexes. The parameter indexes may be set flexibly according to the actual application scenario. For example, when applied to the selection method of a neural network architecture in the embodiment of the present disclosure, the parameter indexes may optionally include at least one of an operation speed, a service evaluation index, and a number of floating-point operations; the number of parameter indexes is not limited in the embodiment of the present disclosure.

Optionally, in order to simplify the selection of the target neural network architecture from the target architecture population by the server, selecting the target neural network architecture from the target architecture population according to the parameter indexes includes: selecting z neural network architectures from the target architecture population according to the operation speed; and selecting a target neural network architecture from the z neural network architectures according to an evaluation index, where the evaluation index includes a service evaluation index and/or a number of floating-point operations, and the service evaluation index includes any one of the following: accuracy, mAP (Mean Average Precision), PSNR (Peak Signal-to-Noise Ratio), and IoU (Intersection over Union). In the embodiment of the present disclosure, the server may preset a speed threshold; for each neural network architecture in the target architecture population, the neural network architecture is discarded if its operation speed is less than the speed threshold and retained if its operation speed is greater than or equal to the speed threshold. Alternatively, the server may sort the neural network architectures in the target architecture population in descending order of operation speed and take the first z neural network architectures. After selecting the z neural network architectures, the server can select the required target neural network architecture from them according to the service evaluation index and/or the number of floating-point operations.
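The two-stage selection described above (filter by operation speed, then choose by evaluation index) can be sketched as follows. The `Arch` record, its field names and units, and the tie-breaking rule (prefer fewer floating-point operations when accuracy is equal) are assumptions made for illustration; the text only states that the service evaluation index and/or the number of floating-point operations may be used.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Arch:
    name: str
    speed: float     # hypothetical unit: inferences per second
    accuracy: float  # service evaluation index (here: accuracy)
    flops: float     # number of floating-point operations


def select_target(archs: List[Arch], speed_threshold: float) -> Arch:
    """Keep architectures at or above the speed threshold, then pick the best one."""
    candidates = [a for a in archs if a.speed >= speed_threshold]
    if not candidates:
        raise ValueError("no architecture meets the speed threshold")
    # Assumption: highest accuracy wins; fewer floating-point operations break ties.
    return max(candidates, key=lambda a: (a.accuracy, -a.flops))


# Hypothetical target architecture population.
archs = [
    Arch("A", speed=120.0, accuracy=0.71, flops=300e6),
    Arch("B", speed=80.0, accuracy=0.78, flops=600e6),
    Arch("C", speed=110.0, accuracy=0.74, flops=400e6),
    Arch("D", speed=100.0, accuracy=0.74, flops=350e6),
]
target = select_target(archs, speed_threshold=100.0)
```

Architecture B has the highest accuracy but is discarded by the speed threshold; among the remaining candidates, C and D tie on accuracy and D is chosen because it needs fewer floating-point operations.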

Step 309, deploying the target neural network architecture in the terminal.

In the embodiment of the present disclosure, after the server selects the target neural network architecture, the target neural network architecture is deployed in the terminal. Optionally, the server selects the target neural network architecture according to the performance of the terminal in which it is to be deployed. For example, if the target neural network architecture is to be applied to a lower-performance terminal and is required to achieve there the same operation speed as on a higher-performance terminal, a target neural network architecture with a lower service evaluation index may be selected from the z neural network architectures and applied to the lower-performance terminal. Optionally, the target neural network architecture selected by the server may be deployed in the terminal and used in technical fields such as image recognition and voice recognition, which is not limited in the embodiment of the present disclosure.

In summary, according to the technical solution provided by the embodiment of the present disclosure, the weighted crowding distance of each neural network architecture in the individual group is calculated from the crowding distance of the neural network architecture under each target and the weight corresponding to each target; the required number of neural network architectures is then selected from the individual group according to the weighted crowding distance to generate the target architecture population; and finally the target neural network architecture is selected from the target architecture population according to the preset index. This extends the ways in which a neural network architecture can be selected: a target neural network architecture that meets the application conditions is selected from the target architecture population according to the preset index set by the decision maker, so that the selected target neural network architecture better fits the decision maker's needs.

The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods. For details not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the method of the present disclosure.

FIG. 4 is a block diagram illustrating a population generation apparatus according to an exemplary embodiment. The apparatus has the functions of implementing the above method examples, and the functions may be implemented by hardware or by hardware executing corresponding software. The apparatus may be the computer device described above, or may be provided in a computer device. As shown in FIG. 4, the apparatus 400 may include: a population acquisition module 410, an individual group generation module 420, an individual group selection module 430, a weighted crowding distance calculation module 440, an individual selection module 450, a population generation module 460, and a population determination module 470.

A population acquisition module 410 configured to acquire a tth generation population R_t, where the population R_t includes a population P_t and a population Q_t, the population Q_t is obtained by processing the population P_t, the population P_t and the population Q_t each contain n individuals, t is a positive integer with an initial value of 1, and n is a positive integer.

An individual group generation module 420 configured to perform non-dominated sorting on the individuals contained in the population R_t to obtain k individual groups with different ranks, where k is a positive integer greater than 1.

An individual group selection module 430 configured to sequentially select i individual groups from the k individual groups in order of rank from small to large, where the total number of individuals contained in the i individual groups is greater than or equal to n, and the total number of individuals contained in the first i-1 individual groups is less than n.

A weighted crowding distance calculation module 440 configured to calculate, for a jth individual in the ith individual group, the weighted crowding distance of the jth individual under multiple targets according to the crowding distance of the jth individual under each target and the weight corresponding to each target.

An individual selection module 450 configured to select m individuals from the ith individual group according to the weighted crowding distance of each individual in the ith individual group, where m is the difference between n and the total number of individuals contained in the first i-1 individual groups.

A population generation module 460 configured to generate a (t+1)th generation population R_{t+1}, where the population R_{t+1} includes a population P_{t+1} and a population Q_{t+1}, the population P_{t+1} includes the m individuals and the individuals in the first i-1 individual groups, the population Q_{t+1} is obtained by processing the population P_{t+1}, and the population Q_{t+1} and the population P_{t+1} each contain n individuals.

A population determination module 470 configured to let t = t+1 and to execute again from the step of performing non-dominated sorting on the individuals contained in the population R_t to obtain k individual groups with different ranks, and, when the Tth generation population R_T is generated, to determine the population R_T as the target population, where T is a preset value and is an integer greater than 1.

Optionally, as shown in fig. 5, the weighted congestion distance calculating module 440 includes: the individual sorting submodule 441 is configured to, for a target x in the multiple targets, sort each individual in the ith individual group according to a function value corresponding to the target x, where x is a positive integer; a crowding distance calculating sub-module 442 configured to calculate a crowding distance of the jth individual under the target x according to the function values of two adjacent individuals of the jth individual and the maximum value and the minimum value in the sorting results; the weighted crowding distance calculating sub-module 443 is configured to calculate weighted crowding distances of the jth individual under the multiple targets according to the crowding distance of the jth individual under the target x and weights corresponding to the target x.

Optionally, as shown in fig. 5, the crowding distance calculating sub-module 443 is configured to: calculating a difference value of the function values of two adjacent individuals of the jth individual; calculating a difference between the maximum value and the minimum value; and dividing the difference value of the function values by the difference value of the maximum value and the minimum value to obtain the crowding distance of the j individual under the target x.

Optionally, as shown in fig. 5, the weighted crowding distance calculating sub-module 443 is configured to: multiply the crowding distance of the jth individual under the target x by the weight corresponding to the target x; and accumulate the products obtained for the jth individual over the multiple targets to obtain the weighted crowding distance of the jth individual under the multiple targets.
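The multiply-and-accumulate step can be sketched as follows. The function name and the list-of-lists layout are illustrative assumptions; the weights stand for the decision maker's preference over the targets.

```python
def weighted_crowding_distance(per_target_distances, weights):
    """Combine per-target crowding distances into one weighted distance.

    per_target_distances[x][j] -- crowding distance of individual j under target x
    weights[x]                 -- weight corresponding to target x
    Returns a list D with D[j] the weighted crowding distance of individual j.
    """
    if len(per_target_distances) != len(weights):
        raise ValueError("one weight is required per target")
    n = len(per_target_distances[0]) if per_target_distances else 0
    totals = [0.0] * n
    for dists, w in zip(per_target_distances, weights):
        for j, d in enumerate(dists):
            # Note: 0 * inf is NaN in floating point, so this sketch assumes
            # finite distances or strictly positive weights.
            totals[j] += w * d
    return totals


combined = weighted_crowding_distance([[0.5, 0.25], [1.0, 0.5]], [0.5, 2.0])
```

Raising the weight of one target makes spacing along that target count more when individuals of the ith group are compared.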

Optionally, as shown in fig. 5, the apparatus 400 further includes: a preset index obtaining module 480 configured to obtain preset indexes of the individuals in the target population; and the target individual selecting module 490 is configured to select a target individual from the target population according to the preset index.

FIG. 6 is a block diagram illustrating a selection device of a neural network architecture, in accordance with an exemplary embodiment. The device has functions for implementing the above method examples, and these functions may be implemented by hardware or by hardware executing corresponding software. The apparatus may be the computer device described above, or may be provided in a computer device. As shown in fig. 6, the apparatus 600 may include: an architecture population acquisition module 610, an individual group generation module 620, an individual group selection module 630, a weighted crowding distance calculation module 640, an individual selection module 650, an architecture population generation module 660, an architecture population determination module 670, a neural network architecture selection module 680 and a neural network architecture deployment module 690.

An architecture population acquisition module 610 configured to acquire a t-th generation architecture population R_t, where the architecture population R_t includes an architecture population P_t and an architecture population Q_t, the architecture population Q_t is obtained by processing the architecture population P_t, the architecture population P_t and the architecture population Q_t each contain n neural network architectures, t is a positive integer with an initial value of 1, and n is a positive integer.

An individual group generating module 620 configured to perform non-dominated sorting on the neural network architectures contained in the architecture population R_t to obtain k individual groups with different ranks, where k is a positive integer greater than 1.

An individual group selecting module 630 configured to sequentially select i individual groups from the k individual groups in ascending order of rank, where the total number of neural network architectures contained in the i individual groups is greater than or equal to n, and the total number of neural network architectures contained in the i-1 individual groups is less than n.

A weighted crowding distance calculation module 640 configured to calculate, for a jth neural network architecture in the ith individual group, weighted crowding distances of the jth neural network architecture under multiple targets according to the crowding distance of the jth neural network architecture under each target and a weight corresponding to each target.

An individual selecting module 650 configured to select m neural network architectures from the ith individual group according to the weighted crowding distance of each neural network architecture in the ith individual group, wherein m is a difference between n and a total number of neural network architectures included in the i-1 individual group.

An architecture population generation module 660 configured to generate a (t+1)-th generation architecture population R_{t+1}, where the architecture population R_{t+1} includes an architecture population P_{t+1} and an architecture population Q_{t+1}, the architecture population P_{t+1} includes the m neural network architectures and the neural network architectures in the i-1 individual groups, the architecture population Q_{t+1} is obtained by processing the architecture population P_{t+1}, and the architecture population Q_{t+1} and the architecture population P_{t+1} each contain n neural network architectures.

An architecture population determination module 670 configured to let t = t + 1 and to resume execution from the step of performing non-dominated sorting on the neural network architectures contained in the architecture population R_t to obtain k individual groups with different ranks; when the T-th generation architecture population R_T is generated, the architecture population R_T is determined as the target architecture population, where T is a preset value and is an integer greater than 1.

A neural network architecture selection module 680 configured to select a target neural network architecture from the target architecture population.

A neural network architecture deployment module 690 configured to deploy the target neural network architecture in a terminal.
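The non-dominated sorting performed by the individual group generation module 620 can be illustrated with a simple, unoptimized sketch. It assumes that every target is to be minimized and that each architecture is represented by a tuple of function values; both are assumptions made for the example, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every target and strictly better on one.

    Assumption: all targets are minimized.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_sort(objectives):
    """Split individuals into groups of increasing rank.

    objectives[i] is the tuple of function values of individual i.
    Returns a list of groups (lists of indices); groups[0] holds the
    individuals that no other individual dominates.
    """
    remaining = set(range(len(objectives)))
    groups = []
    while remaining:
        # Individuals not dominated by any other remaining individual.
        front = sorted(
            i for i in remaining
            if not any(dominates(objectives[j], objectives[i])
                       for j in remaining if j != i)
        )
        groups.append(front)
        remaining -= set(front)
    return groups


groups = non_dominated_sort([(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)])
```

This repeated-peeling version is quadratic per group; it is meant only to show what "groups with different ranks" means, not to be an efficient implementation.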

Optionally, as shown in fig. 7, the neural network architecture selecting module 680 includes: a parameter index obtaining sub-module 681 configured to obtain a parameter index of each neural network architecture in the target architecture population; a neural network architecture selection submodule 682 configured to select the target neural network architecture from the target architecture population according to the parameter index.

Optionally, the neural network architecture selection submodule 682 is configured to: select z neural network architectures from the target architecture population according to the operation speed, where z is a positive integer; and select the target neural network architecture from the z neural network architectures according to an evaluation index, where the evaluation index includes a business evaluation index and/or a number of floating-point operations.
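A minimal sketch of this two-stage selection is given below. It uses the speed-threshold criterion recited in claim 1 for the first stage. The dictionary keys `name`, `speed` and `score`, and the convention that a higher score is better (for example, accuracy), are assumptions made for illustration.

```python
def select_target_architecture(candidates, speed_threshold):
    """Pick one architecture from the target architecture population.

    candidates      -- list of dicts with hypothetical keys:
                       'name', 'speed' (operation speed), 'score' (evaluation index)
    speed_threshold -- minimum operation speed required by the terminal
    Returns the chosen candidate, or None if no architecture is fast enough.
    """
    # Stage 1: keep the z architectures whose speed exceeds the threshold.
    fast_enough = [c for c in candidates if c["speed"] > speed_threshold]
    if not fast_enough:
        return None
    # Stage 2: among those, take the best evaluation index.
    # Assumption: a larger score is better.
    return max(fast_enough, key=lambda c: c["score"])


population = [
    {"name": "A", "speed": 30, "score": 0.90},
    {"name": "B", "speed": 60, "score": 0.80},
    {"name": "C", "speed": 80, "score": 0.85},
]
chosen = select_target_architecture(population, speed_threshold=50)
```

If the evaluation index were a cost such as the number of floating-point operations, the second stage would use `min` instead of `max`.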

An exemplary embodiment of the present disclosure also provides a group generating apparatus, including: a processor; a memory for storing executable instructions of the processor; wherein the processor is configured to:

obtaining a t-th generation population R_t, where the population R_t includes a population P_t and a population Q_t, the population Q_t is obtained by processing the population P_t, the population P_t and the population Q_t each contain n individuals, t is a positive integer with an initial value of 1, and n is a positive integer;

performing non-dominated sorting on the individuals contained in the population R_t to obtain k individual groups with different ranks, where k is a positive integer greater than 1;

sequentially selecting i individual groups from the k individual groups in ascending order of rank, where the total number of individuals contained in the i individual groups is greater than or equal to n, and the total number of individuals contained in the i-1 individual groups is less than n;

generating a (t+1)-th generation population R_{t+1}, where the population R_{t+1} includes a population P_{t+1} and a population Q_{t+1}, the population P_{t+1} includes the m individuals and the individuals in the i-1 individual groups, the population Q_{t+1} is obtained by processing the population P_{t+1}, and the population Q_{t+1} and the population P_{t+1} each contain n individuals;

letting t = t + 1 and resuming execution from the step of performing non-dominated sorting on the individuals contained in the population R_t to obtain k individual groups with different ranks; when the T-th generation population R_T is generated, determining the population R_T as the target population, where T is a preset value and is an integer greater than 1.
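The group selection in the steps above, together with the choice of m individuals from the ith group, can be sketched as follows. The sketch assumes that the groups are already ordered by ascending rank, that the weighted crowding distances of the ith group are available in a dictionary, and that a larger weighted crowding distance is preferred; the preference direction and all names are assumptions made for illustration.

```python
def select_next_parents(groups, weighted_distance, n):
    """Build P_{t+1} from rank-ordered individual groups.

    groups            -- list of groups (lists of individual ids), ascending rank
    weighted_distance -- dict: id -> weighted crowding distance
                         (only looked up for the ith group)
    n                 -- number of individuals to keep
    """
    selected = []
    for group in groups:
        if len(selected) + len(group) < n:
            # One of the first i-1 groups: taken in full.
            selected.extend(group)
            continue
        # The ith group: the running total reaches or exceeds n here.
        m = n - len(selected)
        # Assumption: individuals with larger weighted crowding distance win.
        ranked = sorted(group, key=lambda j: weighted_distance[j], reverse=True)
        selected.extend(ranked[:m])
        return selected
    raise ValueError("the groups contain fewer than n individuals")


parents = select_next_parents(
    groups=[[0, 1, 3], [2, 4, 5], [6]],
    weighted_distance={2: 0.3, 4: 0.9, 5: 0.6},
    n=5,
)
```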

Optionally, the calculating the crowding distance of the jth individual under the target x according to the function values of the two individuals adjacent to the jth individual and the maximum value and the minimum value in the sorting result includes: calculating the difference between the function values of the two individuals adjacent to the jth individual; calculating the difference between the maximum value and the minimum value; and dividing the difference between the function values by the difference between the maximum value and the minimum value to obtain the crowding distance of the jth individual under the target x.

Optionally, the calculating the weighted crowding distance of the jth individual under the multiple targets according to the crowding distance of the jth individual under each target and the weight corresponding to each target includes: multiplying the crowding distance of the jth individual under the target x by the weight corresponding to the target x; and accumulating the products obtained for the jth individual over the multiple targets to obtain the weighted crowding distance of the jth individual under the multiple targets.
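Written compactly, the two optional steps above amount to the following pair of expressions. The symbols are introduced here only for illustration and are not taken from the disclosure: f_x(j+1) and f_x(j-1) denote the function values of the two individuals adjacent to the jth individual in the sorting under target x, f_x^max and f_x^min denote the maximum and minimum values in that sorting, and w_x denotes the weight of target x.

```latex
d_j^{(x)} = \frac{f_x(j+1) - f_x(j-1)}{f_x^{\max} - f_x^{\min}},
\qquad
D_j = \sum_{x} w_x \, d_j^{(x)}
```

Here d_j^{(x)} is the crowding distance of the jth individual under target x and D_j is its weighted crowding distance under the multiple targets.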

An exemplary embodiment of the present disclosure also provides a selecting apparatus of a neural network architecture, the apparatus including: a processor; a memory for storing executable instructions of the processor; wherein the processor is configured to:

obtaining a t-th generation architecture population R_t, where the architecture population R_t includes an architecture population P_t and an architecture population Q_t, the architecture population Q_t is obtained by processing the architecture population P_t, the architecture population P_t and the architecture population Q_t each contain n neural network architectures, t is a positive integer with an initial value of 1, and n is a positive integer;

sequentially selecting i individual groups from the k individual groups in ascending order of rank, where the total number of neural network architectures contained in the i individual groups is greater than or equal to n, and the total number of neural network architectures contained in the i-1 individual groups is less than n;

generating a (t+1)-th generation architecture population R_{t+1}, where the architecture population R_{t+1} includes an architecture population P_{t+1} and an architecture population Q_{t+1}, the architecture population P_{t+1} includes the m neural network architectures and the neural network architectures in the i-1 individual groups, the architecture population Q_{t+1} is obtained by processing the architecture population P_{t+1}, and the architecture population Q_{t+1} and the architecture population P_{t+1} each contain n neural network architectures;

letting t = t + 1 and resuming execution from the step of performing non-dominated sorting on the neural network architectures contained in the architecture population R_t to obtain k individual groups with different ranks; when the T-th generation architecture population R_T is generated, determining the architecture population R_T as the target architecture population, where T is a preset value and is an integer greater than 1;

deploying the target neural network architecture in a terminal.

FIG. 8 is a block diagram illustrating a computer device 800 according to an example embodiment. For example, the computer device 800 may be provided as a server. Referring to fig. 8, computer device 800 includes a processing component 822, which further includes one or more processors, and memory resources, represented by memory 832, for storing computer programs that are executable by processing component 822. The memory 832 stores the computer programs described above. Furthermore, the processing component 822 is configured to execute a computer program to implement the population generating method described above, or to implement the selecting method of the neural network architecture described above.

The computer device 800 may also include a power supply component 826 configured to perform power management of the computer device 800, a wired or wireless network interface 850 configured to connect the computer device 800 to a network, and an input/output (I/O) interface 858. The computer device 800 may operate based on an operating system stored in the memory 832, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.

An exemplary embodiment of the present disclosure also provides a non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the steps of the population generating method as described above, or implements the steps of the selecting method of the neural network architecture as described above.

It should be understood that reference herein to "a plurality" means two or more. "And/or" describes the association relationship of the associated objects and indicates that three relationships are possible; for example, "A and/or B" may indicate: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the objects before and after it are in an "or" relationship.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A method of selecting a neural network architecture, performed by a computer device, the method comprising:

for a jth neural network architecture in the ith individual group, calculating weighted crowding distances of the jth neural network architecture under a plurality of targets according to the crowding distance of the jth neural network architecture under each target and the corresponding weight of each target;

generating a (t+1)-th generation architecture population R_{t+1}, where the architecture population R_{t+1} includes an architecture population P_{t+1} and an architecture population Q_{t+1}, the architecture population P_{t+1} includes the m neural network architectures and the neural network architectures in the i-1 individual groups, the architecture population Q_{t+1} is obtained by processing the architecture population P_{t+1}, and the architecture population Q_{t+1} and the architecture population P_{t+1} each contain n neural network architectures;

acquiring parameter indexes of each neural network architecture in the target architecture population, wherein the parameter indexes comprise operation speed and evaluation indexes;

selecting a target neural network architecture from the target architecture population according to the performance of the terminal on which it is to be deployed, wherein the selecting comprises: selecting, from the target architecture population, z neural network architectures whose operation speeds are greater than a speed threshold, where z is a positive integer; and selecting the target neural network architecture from the z neural network architectures according to the evaluation indexes, wherein the evaluation indexes comprise business evaluation indexes; wherein the business evaluation indexes include, but are not limited to, any of the following: accuracy, mean average precision (mAP), peak signal-to-noise ratio (PSNR) and intersection over union (IoU);

deploying the target neural network architecture in the terminal.

2. The method of claim 1, wherein calculating the weighted crowding distance of the jth neural network architecture under multiple objectives according to the crowding distance of the jth neural network architecture under each objective and the corresponding weight of each objective comprises:

for a target x in the multiple targets, sorting each neural network architecture in the ith individual group according to a function value corresponding to the target x, wherein x is a positive integer;

calculating the crowding distance of the jth neural network architecture under the target x according to the function values of the two neural network architectures adjacent to the jth neural network architecture and the maximum value and the minimum value in the sorting result;

and calculating the weighted crowding distance of the jth neural network architecture under the multiple targets according to the crowding distance of the jth neural network architecture under each target and the weight corresponding to each target.

3. The method of claim 2, wherein the calculating the crowding distance of the jth neural network architecture under the target x according to the function values of two adjacent neural network architectures of the jth neural network architecture and the maximum value and the minimum value in the ranking results comprises:

calculating a difference value of the function values of two adjacent neural network architectures of the jth neural network architecture;

calculating a difference between the maximum value and the minimum value;

and dividing the difference value of the function values by the difference value of the maximum value and the minimum value to obtain the crowding distance of the jth neural network architecture under the target x.

4. The method of claim 2, wherein calculating the weighted crowding distances of the jth neural network architecture under the plurality of targets according to the crowding distances of the jth neural network architecture under each target and the corresponding weights of each target comprises:

multiplying the crowding distance of the jth neural network architecture under the target x by the weight corresponding to the target x;

and accumulating the multiplication results of the jth neural network architecture and the multiple targets to obtain the weighted crowding distance of the jth neural network architecture under the multiple targets.

5. An apparatus for selecting a neural network architecture, the apparatus comprising:

an architecture population acquisition module configured to acquire a t-th generation architecture population R_t, where the architecture population R_t includes an architecture population P_t and an architecture population Q_t, the architecture population Q_t is obtained by processing the architecture population P_t, the architecture population P_t and the architecture population Q_t each contain n neural network architectures, t is a positive integer with an initial value of 1, and n is a positive integer;

an individual selecting module configured to select m neural network architectures from the ith individual group according to the weighted crowding distance of each neural network architecture in the ith individual group, wherein m is a difference value between n and a total number of neural network architectures included in the i-1 individual group;

an architecture population generation module configured to generate a (t+1)-th generation architecture population R_{t+1}, where the architecture population R_{t+1} includes an architecture population P_{t+1} and an architecture population Q_{t+1}, the architecture population P_{t+1} includes the m neural network architectures and the neural network architectures in the i-1 individual groups, the architecture population Q_{t+1} is obtained by processing the architecture population P_{t+1}, and the architecture population Q_{t+1} and the architecture population P_{t+1} each contain n neural network architectures;

a parameter index obtaining submodule configured to obtain parameter indexes of each neural network architecture in the target architecture population, where the parameter indexes include an operation speed and an evaluation index;

a neural network architecture selection sub-module configured to select a target neural network architecture from the target architecture population according to the performance of the terminal on which it is to be deployed, including: selecting, from the target architecture population, z neural network architectures whose operation speeds are greater than a speed threshold, where z is a positive integer; and selecting the target neural network architecture from the z neural network architectures according to the evaluation indexes, wherein the evaluation indexes comprise business evaluation indexes; wherein the business evaluation indexes include, but are not limited to, any of the following: accuracy, mean average precision (mAP), peak signal-to-noise ratio (PSNR) and intersection over union (IoU);

a neural network architecture deployment module configured to deploy the target neural network architecture in the terminal.

6. A computer device, characterized in that it comprises a processor and a memory, in which a computer program is stored, which computer program is loaded and executed by the processor to implement the steps of the selection method of the neural network architecture according to any one of claims 1 to 4.

7. A non-transitory computer readable storage medium, having stored thereon a computer program, wherein the computer program, when being executed by a processor, implements the steps of the method for selecting a neural network architecture according to any one of claims 1 to 4.