WO2020048389A1 - Method for compressing neural network model, device, and computer apparatus - Google Patents


Publication number
WO2020048389A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
network model
compressed
layer
compression
Prior art date
Application number
PCT/CN2019/103511
Other languages
French (fr)
Chinese (zh)
Inventor
金玲玲
饶东升
何文玮
Original Assignee
深圳灵图慧视科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳灵图慧视科技有限公司
Publication of WO2020048389A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/086 Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]

Definitions

  • the present application relates to the field of computer application technology, and in particular, to a method and a device for compressing a neural network model, a computer device, and a computer-readable medium.
  • In recent years, with the development of artificial intelligence, neural network (NN) algorithms have been widely used in image processing, speech recognition, natural language processing, and other fields.
  • Deep neural networks with good performance often have a large number of nodes (neurons) and model parameters. They are computationally expensive and occupy considerable storage space in actual deployment, which limits their application on devices where both storage and computing resources are restricted. How to compress a neural network model is therefore particularly important: compressing a trained neural network model helps apply it in scenarios such as embedded devices and integrated hardware devices.
  • Embodiments of the present invention provide a method and apparatus for compressing a neural network model, a computer device, and a computer-readable medium, which can compress a trained neural network model, thereby reducing the amount of calculation and the storage space of the neural network model and enabling neural network models to be applied to devices with limited storage and computing resources.
  • A method for compressing a neural network model includes: obtaining a trained first neural network model; selecting at least one layer from the layers of the first neural network model as layers to be compressed; sorting the layers to be compressed according to a preset rule; and, in the sorted order, performing compression processing on some or all of the layers to be compressed using a genetic algorithm to obtain a second neural network model, wherein the accuracy of the second neural network model on preset training samples is not lower than a preset accuracy.
  • A neural network model compression device includes: an acquisition module for acquiring a trained first neural network model; a selection module for selecting at least one layer from the layers of the first neural network model as layers to be compressed; a sorting module for sorting the layers to be compressed according to a preset rule; and a compression module for performing, in the sorted order, compression processing on some or all of the layers to be compressed using a genetic algorithm to obtain a second neural network model, wherein the accuracy of the second neural network model on the preset training samples is not lower than the preset accuracy.
  • a computer device includes: a processor; and a memory on which executable instructions are stored, wherein the executable instructions, when executed, cause the processor to perform the aforementioned method.
  • a computer-readable medium has executable instructions stored thereon, wherein the executable instructions, when executed, cause a computer to perform the aforementioned method.
  • The solution of the embodiment of the present invention uses a genetic algorithm to compress a trained neural network model, reduces the calculation amount and storage space of the neural network model, and enables it to be applied to devices where both storage and computing resources are restricted.
  • the solution of the embodiment of the present invention can simultaneously take into account the accuracy and compression of the neural network model.
  • FIG. 1 is an exemplary architecture diagram to which an embodiment of the present invention can be applied;
  • FIG. 2 is a flowchart of a neural network model compression method according to an embodiment of the present invention
  • FIG. 3 is a flowchart of a method for performing a compression process on a compression layer using a genetic algorithm according to an embodiment of the present invention
  • Figure 3a is an example diagram of a neural network structure
  • FIG. 4 is a flowchart of a neural network model compression apparatus according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a computer device according to an embodiment of the present invention.
  • FIG. 6 is a block diagram of an exemplary computer device suitable for use in implementing embodiments of the present invention, according to one embodiment of the present invention.
  • the term “including” and variations thereof mean open terms, meaning “including but not limited to.”
  • the term “based on” means “based at least in part on.”
  • the terms “one embodiment” and “an embodiment” mean “at least one embodiment.”
  • the term “another embodiment” means “at least one other embodiment.”
  • The terms “first”, “second”, etc. may refer to different or the same objects. Other definitions, whether explicit or implicit, may be included below. Unless the context clearly indicates otherwise, the definition of a term is consistent throughout the specification.
  • the embodiment of the present invention uses a genetic algorithm to compress a neural network model.
  • the genetic algorithm and the neural network are briefly described below.
  • Genetic algorithm (GA) is a randomized search method that borrows from the evolutionary laws of the biological world (survival of the fittest as a genetic mechanism). It was first proposed by Professor J. Holland of the United States in 1975. Its main features are that it operates directly on structural objects, with no restrictions of differentiability or function continuity; it has inherent implicit parallelism and good global optimization ability; and, using a probabilistic optimization method, it can automatically acquire and guide the optimized search space and adaptively adjust the search direction, without requiring predetermined rules. Thanks to these properties, genetic algorithms have been widely applied in combinatorial optimization, machine learning, signal processing, adaptive control, and artificial life, and are a key technology in modern intelligent computing.
  • Neural network (NN) has been a research hotspot in the field of artificial intelligence since the 1980s. It abstracts the neuron network of the human brain from the perspective of information processing, establishes simple models, and forms different networks according to different connection methods.
  • A neural network is a computing model consisting of a large number of nodes (or neurons) connected to each other. Each node represents a specific output function, called an activation function. Each connection between two nodes carries a weighted value for the signal passing through it, called the connection weight. The output of the network differs depending on the connection mode, the connection weights, and the activation functions.
  • The structural information of a neural network includes information such as its nodes and connection weights.
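  • As a generic illustration (not taken from this application), the weighted-sum-and-activation behaviour of a single node described above can be sketched as follows; the `tanh` activation is an arbitrary choice:

```python
import math

def node_output(inputs, weights, activation=math.tanh):
    # A node's output: the activation function applied to the weighted
    # sum of the signals arriving over its connections.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum)

# Two incoming signals whose weighted contributions cancel out.
y = node_output([0.5, 0.5], [1.0, -1.0])  # tanh(0.0) = 0.0
```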
  • FIG. 1 illustrates an exemplary system architecture 100 to which a neural network model compression method or a neural network model compression apparatus of an embodiment of the present invention can be applied.
  • the system architecture 100 may include servers 102, 104 and a network 106.
  • the network 106 is a medium that provides a communication link between the server 102 and the server 104.
  • the network 106 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
  • the server 102 may be a server that provides various services, such as a data storage server that stores a trained neural network model.
  • the server 104 may be a server providing various services, such as a server for compressing a neural network model.
  • The server 104 may obtain the trained neural network model from the server 102, perform processing such as analysis and compression on it, and store the processing result (for example, the compressed neural network model).
  • the neural network model compression method in the embodiment of the present invention is generally executed by the server 104, and accordingly, the neural network model compression device is generally disposed in the server 104.
  • the system architecture may not include the server 102.
  • The number of servers and networks in FIG. 1 is merely exemplary; there can be any number of servers and networks according to actual needs.
  • FIG. 2 shows a flowchart of a neural network model compression method according to an embodiment of the present invention.
  • the method 200 shown in FIG. 2 may be performed by a computer or an electronic device with computing capabilities (such as the server 104 shown in FIG. 1).
  • any system that performs the method 200 is within the scope and spirit of embodiments of the present invention.
  • In step S202, a trained first neural network model is obtained.
  • The electronic device (for example, the server 104 shown in FIG. 1) may obtain the trained first neural network model from another server (for example, the server 102), or it may obtain the first neural network model locally.
  • the first neural network model has been previously trained on a training sample, and its accuracy has met a preset accuracy requirement.
  • The first neural network model in this embodiment may be any common neural network model, for example a back propagation neural network (BPNN) model, a convolutional neural network (CNN) model, a region-based convolutional neural network (RCNN) model, a recurrent neural network (RNN) model, a long short-term memory (LSTM) model, or a gated recurrent unit (GRU) model; in addition, it may be another type of neural network model, or a cascade neural network model combining multiple neural networks.
  • In step S204, at least one layer is selected from the layers of the first neural network model as a layer to be compressed.
  • the electronic device may select at least one layer from each layer of the obtained first neural network model as a layer to be compressed.
  • the above electronic device may select each layer of the first neural network model as a layer to be compressed.
  • Alternatively, the above electronic device may select at least one convolutional layer and at least one fully connected layer as the layers to be compressed.
  • In step S206, the layers to be compressed are sorted according to a preset rule.
  • In this embodiment, the electronic device may sort the layers to be compressed according to a preset rule.
  • For example, the foregoing electronic device may sort the layers to be compressed in descending order of their level numbers in the first neural network model.
  • the first neural network model may include, for example, at least one input layer, at least one hidden layer, and at least one output layer. Each layer of the first neural network model may have a corresponding number of layers. As an example, it is assumed that the first neural network model includes an input layer, a hidden layer, and an output layer.
  • The input layer may be the first layer of the first neural network model, with level number 1; the hidden layer may be the second layer, with level number 2; and the output layer may be the third layer, with level number 3. Sorted by level number from large to small, the order is: output layer, hidden layer, input layer.
  • the electronic device may also sort the layers to be compressed according to the contribution of the layer to be compressed to the loss of the first neural network model.
  • Specifically, the loss of the first neural network model can be propagated to each of its layers through back propagation (BP), the contribution of each layer to the network loss is then calculated, and the layers to be compressed are sorted by contribution from small to large.
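  • Both ordering rules can be sketched as plain sorts. The layer names, level numbers, and loss-contribution scores below are hypothetical placeholders, not values from this application:

```python
# Hypothetical layers as (name, level-number) pairs.
layers = [("input", 1), ("hidden", 2), ("output", 3)]

# Rule 1: sort by level number from large to small.
by_level = sorted(layers, key=lambda layer: layer[1], reverse=True)

# Rule 2: sort by each layer's contribution to the network loss, from
# small to large (contributions assumed already computed via BP).
loss_contribution = {"input": 0.5, "hidden": 0.2, "output": 0.3}
by_contribution = sorted(layers, key=lambda layer: loss_contribution[layer[0]])
```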
  • In step S208, compression processing is performed on some or all of the layers to be compressed using a genetic algorithm, in the sorted order, to obtain a second neural network model, wherein the accuracy of the second neural network model on the preset training samples is not lower than the preset accuracy.
  • the genetic algorithm is used to perform compression processing on the compression layer.
  • The implementation is based on the survival-of-the-fittest principle of the genetic algorithm: taking the accuracy of the neural network model into account, and using a compression-based fitness value as the selection criterion, various genetic operations are performed on the layer to be compressed until a compressed structure is obtained. Chromosome individuals that meet the requirements are selected for genetic operations so as to generate the chromosome individual with the best network simplification (that is, the most simplified structure), from which the compressed layer is obtained.
  • The compression-based fitness value is a fitness value that reflects network simplification (or network complexity). For example, the larger the fitness value, the higher the network simplification, that is, effective compression is achieved; the smaller the fitness value, the lower the network simplification, that is, no effective compression is achieved.
  • When the genetic algorithm is used to compress the neural network model, chromosome individuals with large fitness values can be selected to perform genetic operations; the chromosome with the largest fitness value among the individuals generated in the N-th generation population is then the optimal chromosome individual. Alternatively, chromosome individuals with small fitness values can be selected, in which case the chromosome with the smallest fitness value among the individuals generated in the N-th generation population is the optimal chromosome individual.
  • In order to balance the accuracy and compression of the first neural network model, a preset accuracy can be set to constrain the compression of the first neural network model. It should be noted that the preset accuracy may be the original accuracy of the first neural network model, or a value slightly lower than the original accuracy. The preset accuracy may be set manually, or set by the foregoing electronic device based on a preset algorithm, and it may be adjusted according to actual needs; this embodiment does not limit it.
  • The compression processing includes deleting at least one node of the layer to be compressed and its corresponding connections, and/or deleting at least one connection of the layer to be compressed, so as to reduce the network complexity of the layer to be compressed, that is, to improve its network simplification.
  • After each layer to be compressed is processed, the preset training samples are used to train the current neural network model. If the accuracy of the current neural network model is not lower than the preset accuracy, compression processing continues on the next layer to be compressed according to the sorted order; after the last layer has been processed, the current neural network model is determined as the compressed second neural network model. If the accuracy of the current neural network model is lower than the preset accuracy, the neural network model obtained after compressing the previous layer to be compressed is determined as the compressed second neural network model.
  • the number of layers to be compressed is N.
  • the obtained sequence is as follows: layer 1 to be compressed, layer 2 to be compressed, layer 3 to be compressed, ..., layer N to be compressed.
  • First, the genetic algorithm is used to perform compression processing on layer 1 to be compressed, and the uncompressed layer 1 in the first neural network model is replaced with the compressed layer 1.
  • The resulting neural network model is then trained to obtain the accuracy of the current model, and it is determined whether this accuracy is lower than the preset accuracy. If it is not, compression processing continues on layer 2 to be compressed, the same steps are repeated, and so on. If, after compression processing has been performed on layer N to be compressed, the accuracy of the current neural network model is still not lower than the preset accuracy, the current neural network model (with all layers to be compressed replaced by their compressed versions) is determined as the compressed second neural network model.
  • If, for example, after compression processing is performed on layer 3 to be compressed, the accuracy of the current neural network model (in which layers 1, 2, and 3 of the first neural network model have been replaced by their compressed versions) is lower than the preset accuracy, then the neural network model obtained after compressing the previous layer (that is, with only layers 1 and 2 replaced by their compressed versions) is determined as the compressed second neural network model.
  • Optionally, the current neural network model can be fine-tuned.
  • For example, a neural network model whose accuracy is slightly lower than the preset accuracy can be fine-tuned until it meets the preset accuracy requirement, so that the neural network model can be compressed further.
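  • The layer-by-layer loop described above, with its accuracy check and roll-back, can be sketched as follows; `compress_layer` and `evaluate` are hypothetical stand-ins for the genetic-algorithm compression step and the retrain-and-measure step:

```python
def compress_in_order(model, layers, compress_layer, evaluate, preset_accuracy):
    # Compress the layers one by one in the sorted order.  After each
    # layer, evaluate the retrained model; if its accuracy falls below
    # the preset threshold, roll back to the previous layer's result.
    best = model
    for layer in layers:
        candidate = compress_layer(best, layer)
        if evaluate(candidate) < preset_accuracy:
            return best        # keep the model from the previous layer
        best = candidate       # accept and continue with the next layer
    return best
```

For instance, with stub functions where compressing each layer lowers accuracy step by step, the loop stops at the last model that still meets the threshold.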
  • The electronic device may store the second neural network model obtained through compression processing, for example locally (on a hard disk or in memory) or on a server remotely connected to the electronic device.
  • The solution provided by the embodiment of the present invention utilizes a genetic algorithm to compress a trained neural network model, reduces the calculation amount and storage space of the neural network model, and enables it to be applied to devices where both storage and computing resources are restricted. Further, the solution of the embodiment of the present invention can take into account both the accuracy and the compression of the neural network model.
  • FIG. 3 shows a flowchart of a method for performing a compression process on a compression layer using a genetic algorithm according to an embodiment of the present invention.
  • The method 300 shown in FIG. 3 may be implemented by a computer or an electronic device with computing capabilities (for example, the server 104 shown in FIG. 1).
  • any system that performs the method 300 is within the scope and spirit of embodiments of the present invention.
  • In step S302, the network structure information of a layer to be compressed is acquired.
  • In step S304, according to the network structure information of the layer to be compressed, the layer to be compressed is encoded to obtain a chromosome.
  • Specifically, the structure of the neural network needs to be expressed as an individual chromosome code of the genetic algorithm so that the genetic algorithm can operate on it.
  • Suppose there are N neurons in the layer to be compressed, with the nodes numbered from 1 to N.
  • An N ⁇ N matrix may be used to represent the network structure of the layer to be compressed.
  • the neural network structure with 7 nodes shown in FIG. 3a is taken as an example to illustrate the method for encoding the neural network model in this embodiment.
  • Table 1 is the node connection relationship of the neural network structure.
  • The element at (i, j) in the matrix represents the connection relationship from the i-th node to the j-th node. Because the embodiment of the present invention does not change the connection weights of the neural network model when compressing it, this embodiment expresses the connection relationship of the nodes using the values 0, 1, and -1, where "0" indicates no connection; "1" indicates a connection weight of 1, which has an excitatory effect and is indicated by a solid line in Figure 3a; and "-1" indicates a connection weight of -1, which has an inhibitory effect and is indicated by a dotted line in Figure 3a. Table 1 is thus equivalent to the structure shown in Figure 3a.
  • In this way, the coding of the neural network can be expressed as a digital string composed of 0, 1, and -1: reading the matrix from element (3,1) to element (7,6), left to right and top to bottom, the elements form the chromosome code.
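  • A minimal sketch of this encoding follows. The actual connections of Fig. 3a are not reproduced in this text, so the matrix entries below are placeholders; only the traversal order, from element (3,1) to element (7,6), follows the description above:

```python
N = 7  # nodes in the example layer, numbered 1..7

# Hypothetical N x N connection matrix; entry (i, j) is the connection
# from node i to node j: 0 = none, 1 = excitatory (solid line),
# -1 = inhibitory (dotted line).
M = [[0] * N for _ in range(N)]
M[2][0] = 1    # element (3, 1) in 1-based terms
M[3][1] = -1   # element (4, 2)
M[6][5] = 1    # element (7, 6)

# Read the lower-triangular elements from (3, 1) to (7, 6),
# left to right and top to bottom, to form the chromosome.
chromosome = [M[i][j] for i in range(2, N) for j in range(i)]
```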
  • In step S306, population initialization is performed based on the chromosome obtained above to generate an initial population.
  • Specifically, a replication operation may be performed on the chromosome obtained above to generate a predetermined number of chromosome individuals, and the set of these chromosome individuals is used as the initial population.
  • The size of the initial population is determined by the population size M, which may be, for example but not limited to, 10 to 100. Because of the replication operation, all chromosome individuals in the initial population are identical.
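  • A minimal sketch of the replication-based initialization, with a made-up chromosome and population size:

```python
def init_population(chromosome, population_size):
    # Replicate the encoded chromosome to form the initial population;
    # every individual starts out identical.
    return [list(chromosome) for _ in range(population_size)]

population = init_population([1, 0, -1, 0, 1], population_size=10)
```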
  • In step S308, the fitness value of each chromosome individual in the population is calculated.
  • the fitness function may use the following formula:
  • f (i, t) represents the fitness of the i-th individual of the t-th generation
  • E (i, t) represents the network error of the neural network model corresponding to the i-th individual of the t-th generation
  • H(i, t) represents the network simplification of the i-th individual of the t-th generation.
  • E (i, t) can be calculated using the following formula:
  • where the network error of the neural network model corresponding to the i-th individual of the t-th generation is calculated from the expected output values and the actual output values over the preset q training samples; the smaller the network error value, the higher the accuracy.
  • H (i, t) can be calculated using the following formula:
  • m (i, t) is the number of nodes of the i-th individual in the t-th generation. The fewer the number of nodes, the larger the network simplification value, the higher the network simplification, and the simpler the neural network model.
  • the network error E (i, t) is used to constrain the compression process of the neural network model to be compressed, and both accuracy and compression can be taken into account at the same time.
  • the fitness function may also use the following formula:
  • f (i, t) represents the fitness of the i-th individual of the t-th generation
  • E (i, t) represents the network error of the neural network model corresponding to the i-th individual of the t-th generation
  • H(i, t) represents the network simplification of the i-th individual of the t-th generation.
  • the fitness function includes formula 1 and formula 2.
  • formula 1 is a fitness function based on network errors, which reflects the accuracy of the neural network model
  • Formula 2 is a fitness function based on network simplification, which reflects the compression of the neural network model. Therefore, in this embodiment, the accuracy-based fitness value and the compression-based fitness value of each chromosome individual are calculated separately.
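  • The formula images are not reproduced in this text, so the sketch below uses plausible stand-in forms (a sum of squared errors for E(i,t) and an inverse node count for H(i,t)) purely to illustrate the two separate fitness values:

```python
def network_error(expected, actual):
    # E(i, t): assumed here to be a sum of squared differences between
    # expected and actual outputs over the preset training samples.
    return sum((e - a) ** 2 for e, a in zip(expected, actual))

def network_simplification(num_nodes):
    # H(i, t): assumed inverse of the node count, so that fewer nodes
    # yield a larger simplification value, as the text describes.
    return 1.0 / num_nodes

def accuracy_fitness(expected, actual):
    # Formula 1 (accuracy-based): the smaller the error, the larger
    # the fitness.
    return 1.0 / (1.0 + network_error(expected, actual))

def compression_fitness(num_nodes):
    # Formula 2 (compression-based): the higher the simplification,
    # the larger the fitness.
    return network_simplification(num_nodes)
```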
  • the termination condition may include a preset threshold for the number of iterations or a set convergence condition.
  • The number of iterations can be set to, for example but not limited to, 500, and it is determined that the termination condition is reached when the number of iterations reaches 500.
  • The convergence condition may be set, for example but not limited to, such that the termination condition is reached when the fitness value meets a certain condition; for example, when the fitness value is greater than a preset threshold.
  • In step S312, if it is determined in step S310 that the termination condition is not met, then, using the fitness value as the criterion, chromosome individuals whose fitness values meet the requirements are selected and genetic operations such as replication, crossover, or mutation are performed on them to generate a new generation of the population. The method then returns to step S308.
  • This embodiment selects chromosome individuals with relatively large fitness values to perform genetic operations, and eliminates some chromosome individuals with small fitness values.
  • The selection of this embodiment may adopt the following steps: (1) calculate the accuracy-based fitness value of each chromosome individual in the population using formula 1, then calculate the first selection probability of each individual being selected, and select first chromosome individuals according to the first selection probability; (2) calculate the compression-based fitness value of each chromosome individual, then calculate the second selection probability, and select second chromosome individuals from the first chromosome individuals selected in step (1) according to the second selection probability.
  • In addition, the chromosome individuals with the highest and lowest fitness values in the current population can be found; the best chromosome individual is retained and passed directly into the next generation, and the worst chromosome individual is eliminated, which ensures that good genes are passed on to the next generation.
  • The selection strategy of this embodiment can restrict the compression process of the layer to be compressed through accuracy constraints, and can ensure that chromosome individuals with small network errors and large network simplification enter the next generation.
  • As a commonly used selection method, the fitness-proportional selection method can be used, in which the higher the fitness, the greater the probability of being selected, that is:
  • p (i, t) is the selection probability of the i-th individual in the t-th generation
  • f (i, t) is the fitness of the i-th individual in the t-th generation
  • f (sum, t) is the total fitness of the t-th population.
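  • The fitness-proportional (roulette-wheel) selection described above can be sketched as follows:

```python
import random

def selection_probabilities(fitness):
    # p(i, t) = f(i, t) / f(sum, t): each individual's chance of being
    # picked is proportional to its share of the total fitness.
    total = sum(fitness)
    return [f / total for f in fitness]

def roulette_select(population, fitness, rng=random.random):
    # Spin the roulette wheel once and return the selected individual.
    r, cumulative = rng(), 0.0
    for individual, p in zip(population, selection_probabilities(fitness)):
        cumulative += p
        if r <= cumulative:
            return individual
    return population[-1]
```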
  • The replication operation refers to directly copying the selected parental chromosome individuals from the current generation into the new generation without any change.
  • The crossover operation refers to randomly selecting two parental chromosome individuals from the population according to the above selection method and exchanging some components of the two parents with each other to form new offspring chromosome individuals.
  • The mutation operation refers to randomly selecting a parental chromosome individual from the population according to the selection method described above, randomly selecting a node in the individual's representation as a mutation point, and changing the value of the gene at the mutation point to another valid value, forming a new offspring chromosome individual.
  • Whether a crossover operation occurs can be determined according to the crossover probability P c . The method is to randomly generate a random number P between 0 and 1: when P ≤ P c , the crossover operation occurs, and when P > P c , it does not. Similarly, whether the mutation operation occurs can be determined according to the mutation probability P m ; since this is prior art, its description is omitted here.
  • When performing a crossover operation, a crossover point may be randomly selected in each parental chromosome individual with a certain probability, and the part below the crossover point is referred to as the crossover segment. The first parental chromosome individual deletes its crossover segment, and the crossover segment of the second parental chromosome individual is inserted at the first parent's crossover point, generating the first offspring chromosome individual. Likewise, the second parental chromosome individual deletes its crossover segment, and the crossover segment of the first parental chromosome individual is inserted at the second parent's crossover point, forming the second offspring chromosome individual. Even if the two selected parental chromosome individuals are the same, their crossover points differ, so the resulting offspring chromosomes are also different, which effectively avoids inbreeding and improves global search ability.
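  • A sketch of the crossover described above, swapping the trailing segments of two parent chromosomes at a crossover point (the chromosomes shown are made-up examples):

```python
import random

def one_point_crossover(parent1, parent2, point=None, rng=random.Random()):
    # Choose a crossover point and swap the segments below it, producing
    # two offspring; with different points, even identical parents can
    # yield different children.
    if point is None:
        point = rng.randrange(1, len(parent1))
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

c1, c2 = one_point_crossover([1, 1, 1, 1], [0, 0, -1, 0], point=2)
```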
  • When performing a mutation operation, one of the following operations may be adopted at random: (a) delete at least one node in the hidden layer of the neural network model together with its corresponding connections; (b) delete at least one connection in the hidden layer of the neural network model; (c) randomly repair a deleted node or connection with a certain probability; (d) add a hidden-layer node and randomly generate the corresponding connection weights. It should be noted that deleting nodes always precedes adding nodes, the number of added nodes should not be greater than the number of deleted nodes, and nodes are added only when deleting nodes cannot produce a good offspring. Such a mutation operation guarantees that the method always moves in the direction of compressing the neural network model.
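  • Mutation operation (a), deleting a node together with its connections, can be sketched on the connection-matrix representation; the matrix values are hypothetical:

```python
def delete_node(matrix, node):
    # Mutation (a): remove one node and all of its connections by
    # zeroing its row (outgoing) and column (incoming) in the matrix.
    for j in range(len(matrix)):
        matrix[node][j] = 0
        matrix[j][node] = 0
    return matrix

m = [[0, 1], [-1, 1]]
delete_node(m, 0)
```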
  • In step S314, if the determination result in step S310 is that the termination condition is reached, the chromosome individual with the best fitness value is output, so as to obtain the compressed layer.
  • the optimal chromosome individual may be defined as the individual maximizing the fitness f i; that is, the chromosome individual having the greatest fitness when the termination condition is reached is regarded as the optimal chromosome individual.
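Putting these steps together, the evaluate/judge/evolve/output loop of the genetic algorithm can be sketched as below; the `fitness` and `evolve` callables and the termination thresholds are assumptions for illustration, not the application's actual parameters.

```python
def run_ga(initial_population, fitness, evolve, max_generations=100,
           target_fitness=None):
    population = list(initial_population)
    for _ in range(max_generations):
        # judge whether the termination condition is reached
        if (target_fitness is not None
                and max(fitness(c) for c in population) >= target_fitness):
            break
        # selection, copy, crossover, and mutation produce the next generation
        population = evolve(population)
    # output the chromosome individual with the best fitness value
    return max(population, key=fitness)
```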
  • FIG. 4 shows a schematic diagram of a neural network model compression device according to an embodiment of the present invention.
  • the device 400 shown in FIG. 4 corresponds to the above-mentioned neural network model compression method. Since the embodiment of the device 400 is basically similar to the method embodiment, it is described relatively simply. For the relevant part, refer to the description of the method embodiment.
  • the device 400 may be implemented in software, hardware, or a combination of software and hardware, and may be installed in a computer or other suitable electronic device with computing capabilities.
  • the device 400 may include an acquisition module 402, a selection module 404, a sorting module 406, and a compression module 408.
  • the obtaining module 402 is configured to obtain a trained first neural network model.
  • the selection module 404 is configured to select at least one layer from each layer of the first neural network model as a layer to be compressed.
  • the sorting module 406 is configured to sort the layers to be compressed according to a preset rule.
  • the compression module 408 is configured to perform, according to the sorted order, compression processing on part or all of the layers to be compressed using a genetic algorithm to obtain a second neural network model, wherein the accuracy of the second neural network model on the preset training samples is not lower than a preset accuracy.
  • the sorting module 406 is specifically configured to sort the layers to be compressed according to the level numbers of the layers to be compressed in the first neural network model.
  • the sorting module 406 is specifically configured to sort the layers to be compressed according to the contribution of the layers to be compressed to the loss of the first neural network model.
  • the compression module 408 includes a training unit and a determination unit.
  • the training unit is configured to train a current neural network model using a preset training sample after performing compression processing on one of the layers to be compressed each time by using a genetic algorithm.
  • the determining unit is configured to: if the accuracy of the current neural network model is not lower than the preset accuracy, continue to perform compression processing on the next layer to be compressed when there is still a layer to be compressed that has not yet been compressed, and determine the current neural network model as the second neural network model obtained after compression when all layers to be compressed have been compressed; and if the accuracy of the current neural network model is lower than the preset accuracy, determine the neural network model obtained after compressing the previous layer to be compressed as the second neural network model obtained after compression.
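The behavior of the training unit and determining unit can be sketched as the following loop; `ga_compress_layer`, `train`, and `accuracy` are hypothetical callables standing in for the genetic-algorithm step, the retraining step, and the accuracy measurement.

```python
import copy

def compress_model(model, layers, ga_compress_layer, train, accuracy,
                   min_accuracy):
    current = model
    for layer in layers:
        # compress one layer to be compressed with the genetic algorithm
        candidate = ga_compress_layer(copy.deepcopy(current), layer)
        # train the current neural network model on the preset training samples
        train(candidate)
        if accuracy(candidate) < min_accuracy:
            # accuracy fell below the preset accuracy: keep the model obtained
            # after compressing the previous layer
            return current
        current = candidate
    # every layer was compressed without violating the accuracy constraint
    return current
```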
  • the compression module 408 further includes an acquisition unit, a coding unit, an initialization unit, a calculation unit, a judgment unit, a genetic operation unit, and an output unit.
  • the obtaining unit is configured to obtain network structure information of a layer to be compressed.
  • the encoding unit is configured to encode the layer to be compressed according to the network structure information of the layer to be compressed to obtain a chromosome.
  • the initialization unit is configured to perform population initialization according to a chromosome obtained to generate an initial population.
  • the calculation unit is used to calculate the fitness value of the individual chromosomes in the population.
  • the judging unit is used to judge whether the termination condition is reached.
  • the genetic operation unit is used to select a chromosome individual whose fitness value meets the requirements based on the fitness value if the termination condition is not reached, and perform replication, crossover or mutation operations to generate a new generation of population.
  • the output unit is used to output the chromosome individual with the best fitness value if the termination condition is reached, so as to obtain the compressed version of the layer to be compressed.
  • the calculation unit is further configured to calculate the precision-based and compression-based fitness values of the individual chromosomes in the population, respectively.
  • the genetic operation unit is further configured to obtain a first selection probability for the chromosome individuals in the population according to the accuracy-based fitness values, select first chromosome individuals according to the first selection probability, obtain a second selection probability for the chromosome individuals in the population according to the compression-based fitness values, and select second chromosome individuals from the first chromosome individuals according to the second selection probability; the second chromosome individuals are copied, crossed, or mutated to generate a new generation of the population.
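A sketch of this two-stage selection is below; the roulette-wheel (fitness-proportionate) form of the selection probabilities and the two fitness functions are assumptions for illustration.

```python
import random

def roulette(population, fitnesses, k):
    # selection probability of each individual = its fitness / total fitness
    total = float(sum(fitnesses))
    return random.choices(population,
                          weights=[f / total for f in fitnesses], k=k)

def two_stage_select(population, accuracy_fitness, compression_fitness,
                     k1, k2):
    # first selection probability from the accuracy-based fitness values
    first = roulette(population,
                     [accuracy_fitness(c) for c in population], k1)
    # second selection probability, applied only to the first-stage survivors,
    # from the compression-based fitness values
    return roulette(first, [compression_fitness(c) for c in first], k2)
```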
  • FIG. 5 shows a schematic diagram of a computer device according to an embodiment of the present invention.
  • the computer device 500 may include a processor 502 and a memory 504, where the memory 504 stores executable instructions that, when executed, cause the processor 502 to perform the aforementioned method.
  • FIG. 6 shows a block diagram of an exemplary computer device suitable for use in implementing embodiments of the present invention.
  • the computer device 600 shown in FIG. 6 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.
  • the computer device 600 is implemented in the form of a general-purpose computing device.
  • the components of the computer device 600 may include, but are not limited to, a processor 602, a system memory 604, and a bus 606 connecting different system components (including the processor 602 and the system memory 604).
  • the bus 606 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures.
  • these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
  • Computer device 600 typically includes a variety of computer system-readable media. These media can be any available media that can be accessed by the computer device 600, including volatile and non-volatile media, removable and non-removable media.
  • System memory 604 may include computer system-readable media in the form of volatile memory, such as random access memory (RAM) 608 and / or cache memory 610.
  • Computer device 600 may further include other removable / non-removable, volatile / nonvolatile computer system storage media.
  • the storage system 612 may be used to read and write non-removable, non-volatile magnetic media (not shown in FIG. 6 and is commonly referred to as a "hard drive").
  • a disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk"), and an optical drive for reading from and writing to a removable non-volatile optical disc (such as a CD-ROM or DVD-ROM), may also be provided.
  • each drive may be connected to the bus 606 through one or more data medium interfaces.
  • the system memory 604 may include at least one program product having a set (for example, at least one) of program modules configured to perform the functions of the embodiment of FIG. 1 or FIG. 2 of the present invention.
  • a program / utility tool 614 having a set (at least one) of program modules 616 may be stored in, for example, system memory 604.
  • Such program modules 616 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment.
  • the program module 616 generally performs the functions and / or methods in the embodiment of FIG. 1 or FIG. 2 described in the present invention.
  • the computer device 600 may also communicate with one or more external devices 700 (such as a keyboard, a pointing device, a display 800, etc.), with one or more devices that enable a user to interact with the computer device 600, and/or with any device (e.g., a network card, a modem, etc.) that enables the computer device 600 to communicate with one or more other computing devices. Such communication can take place through an input/output (I/O) interface 618.
  • the computer device 600 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through the network adapter 620. As shown, the network adapter 620 communicates with the other modules of the computer device 600 through the bus 606.
  • the processor 602 executes various functional applications and data processing by running a program stored in the system memory 604, for example, implementing the neural network model compression method shown in the foregoing embodiment.
  • An embodiment of the present invention further provides a computer-readable medium having executable instructions stored thereon, where the executable instructions, when executed, cause a computer to execute the method 200 shown in FIG. 2 or the method 300 shown in FIG. 3.
  • the computer-readable medium of this embodiment may include the RAM 608 and / or the cache memory 610 and / or the storage system 612 in the system memory 604 in the embodiment shown in FIG. 6.
  • the computer-readable medium in this embodiment may include not only a tangible medium but also an intangible medium.
  • the computer-readable medium of this embodiment may adopt any combination of one or more computer-readable media.
  • the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium.
  • the computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device.
  • the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Computer program code for performing the operations of the present invention may be written in one or more programming languages, or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages, such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • the embodiments of the present invention may be provided as a method, an apparatus, or a computer program product. Therefore, the embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, etc.) containing computer-usable program code.
  • Embodiments of the present invention are described with reference to flowcharts and / or block diagrams of methods, apparatuses, and computer program products according to embodiments of the present invention. It should be understood that each process and / or block in the flowcharts and / or block diagrams, and combinations of processes and / or blocks in the flowcharts and / or block diagrams can be implemented by computer program instructions.
  • These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device generate means for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

Abstract

A method for compressing a neural network model, a device, a computer apparatus, and a computer-readable medium. The method comprises: acquiring a trained first neural network model (S202); selecting at least one layer from the layers of the first neural network model as layers to be compressed (S204); sorting the layers to be compressed according to a preset rule (S206); and compressing, according to the sorted order and by means of a genetic algorithm, part or all of the layers to be compressed to obtain a second neural network model (S208), wherein the accuracy of the second neural network model on preset training samples is not lower than a preset accuracy. The method, the device, the computer apparatus, and the computer-readable medium compress a trained neural network model by means of a genetic algorithm, thereby reducing the computational load and storage space of the neural network model and making it applicable to apparatuses with limited memory and computational resources, while taking into account both the accuracy and the compression of the neural network model.

Description

Neural network model compression method, device, and computer equipment

Technical Field
The present application relates to the field of computer application technology, and in particular, to a method and a device for compressing a neural network model, a computer device, and a computer-readable medium.
Background Art
In recent years, with the development of artificial intelligence, neural network (NN) algorithms have been widely used in image processing, speech recognition, natural language processing, and other fields. However, deep neural networks with good performance often have a large number of nodes (neurons) and model parameters; they not only require a large amount of calculation but also occupy a large amount of space in actual deployment, which limits their application to devices with restricted storage and computing resources. Therefore, how to compress a neural network model is particularly important. In particular, compressing a neural network model that has already been trained will help apply the trained model to application scenarios such as embedded devices and integrated hardware devices.
Summary of the Invention
In view of the above problems, embodiments of the present invention provide a method and a device for compressing a neural network model, a computer device, and a computer-readable medium, which can compress a trained neural network model, reducing the calculation amount and storage space of the neural network model so that it can be applied to devices with limited storage and computing resources.
A method for compressing a neural network model according to an embodiment of the present invention includes: obtaining a trained first neural network model; selecting at least one layer from the layers of the first neural network model as layers to be compressed; sorting the layers to be compressed according to a preset rule; and, according to the sorted order, performing compression processing on part or all of the layers to be compressed using a genetic algorithm to obtain a second neural network model, wherein the accuracy of the second neural network model on preset training samples is not lower than a preset accuracy.
A neural network model compression device according to an embodiment of the present invention includes: an acquisition module for acquiring a trained first neural network model; a selection module for selecting at least one layer from the layers of the first neural network model as layers to be compressed; a sorting module for sorting the layers to be compressed according to a preset rule; and a compression module for performing, according to the sorted order, compression processing on part or all of the layers to be compressed using a genetic algorithm to obtain a second neural network model, wherein the accuracy of the second neural network model on preset training samples is not lower than a preset accuracy.
A computer device according to an embodiment of the present invention includes: a processor; and a memory on which executable instructions are stored, wherein the executable instructions, when executed, cause the processor to perform the aforementioned method.
A computer-readable medium according to an embodiment of the present invention has executable instructions stored thereon, wherein the executable instructions, when executed, cause a computer to perform the aforementioned method.
It can be seen from the above description that the solutions of the embodiments of the present invention use a genetic algorithm to compress a trained neural network model, reducing the calculation amount and storage space of the neural network model and enabling it to be applied to devices with limited storage and computing resources. In addition, the solutions of the embodiments of the present invention can simultaneously take into account the accuracy and the compression of the neural network model.
Brief Description of the Drawings
FIG. 1 is an exemplary architecture diagram to which an embodiment of the present invention can be applied;

FIG. 2 is a flowchart of a neural network model compression method according to an embodiment of the present invention;

FIG. 3 is a flowchart of a method for performing compression processing on a layer to be compressed using a genetic algorithm according to an embodiment of the present invention;

FIG. 3a is an example diagram of a neural network structure;

FIG. 4 is a schematic diagram of a neural network model compression device according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a computer device according to an embodiment of the present invention;

FIG. 6 is a block diagram of an exemplary computer device suitable for implementing embodiments of the present invention.
Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be understood that these embodiments are discussed only to enable those skilled in the art to better understand and implement the subject matter described herein, and are not intended to limit the scope of protection, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of the elements discussed without departing from the scope of protection of the present disclosure. Various examples may omit, substitute, or add various procedures or components as needed. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with respect to some examples may also be combined in other examples.
As used herein, the term "including" and variations thereof are open terms meaning "including but not limited to." The term "based on" means "based at least in part on." The terms "one embodiment" and "an embodiment" mean "at least one embodiment." The term "another embodiment" means "at least one other embodiment." The terms "first", "second", etc. may refer to different or the same objects. Other definitions, whether explicit or implicit, may be included below. Unless the context clearly indicates otherwise, the definition of a term is consistent throughout the specification.
The embodiments of the present invention use a genetic algorithm to compress a neural network model. Genetic algorithms and neural networks are briefly introduced below.
A genetic algorithm (GA) is a class of randomized search methods that evolved from the evolutionary laws of the biological world (survival of the fittest). It was first proposed by Professor J. Holland of the United States in 1975. Its main features are that it operates directly on structural objects, without the restrictions of differentiation and function continuity; it has inherent implicit parallelism and better global optimization ability; and, using a probabilistic optimization method, it can automatically acquire and guide the optimized search space and adaptively adjust the search direction, without requiring predetermined rules. These properties of genetic algorithms have been widely applied in fields such as combinatorial optimization, machine learning, signal processing, adaptive control, and artificial life. Genetic algorithms are a key technology in modern intelligent computing.
Neural networks (NN) have been a research hotspot in the field of artificial intelligence since the 1980s. A neural network abstracts the neuron network of the human brain from the perspective of information processing, establishes a simple model, and forms different networks according to different connection methods. A neural network is a computing model consisting of a large number of nodes (or neurons) connected to each other. Each node represents a specific output function, called an activation function. Each connection between two nodes represents a weight for the signal passing through that connection, called the connection weight. The output of the network differs according to the network's connection mode, connection weights, and activation functions. The structural information of a neural network includes information such as nodes and connection weights.
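As a concrete illustration of the node model described above (not part of the application itself), a single node's output can be computed as the weighted sum of its inputs passed through an activation function; the sigmoid used here is one common, assumed choice.

```python
import math

def node_output(inputs, connection_weights, bias=0.0):
    # each connection contributes input * connection weight; the weighted
    # sum is then passed through the activation function (here, a sigmoid)
    weighted_sum = sum(x * w for x, w in zip(inputs, connection_weights)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))
```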
FIG. 1 illustrates an exemplary system architecture 100 to which the neural network model compression method or neural network model compression device of an embodiment of the present invention can be applied.
As shown in FIG. 1, the system architecture 100 may include servers 102 and 104 and a network 106. The network 106 is a medium that provides a communication link between the server 102 and the server 104. The network 106 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
The server 102 may be a server that provides various services, such as a data storage server that stores a trained neural network model.

The server 104 may be a server that provides various services, for example, a server for compressing a neural network model. The server 104 may obtain the trained neural network model from the server 102, perform processing such as analysis on the neural network model, and store the processing result (for example, the neural network model after compression processing).
It should be noted that the neural network model compression method of the embodiments of the present invention is generally executed by the server 104; accordingly, the neural network model compression device is generally disposed in the server 104.

It should be pointed out that if the neural network model obtained by the server 104 is stored locally in advance, the system architecture may also not include the server 102.

It should be understood that the numbers of servers and networks in FIG. 1 are merely illustrative. There may be any number of servers and networks according to actual needs.
FIG. 2 shows a flowchart of a neural network model compression method according to an embodiment of the present invention. The method 200 shown in FIG. 2 may be performed by a computer or an electronic device with computing capabilities (such as the server 104 shown in FIG. 1). In addition, those skilled in the art will understand that any system that performs the method 200 is within the scope and spirit of the embodiments of the present invention.
As shown in FIG. 2, in step S202, a trained first neural network model is obtained. In this embodiment, the electronic device on which the neural network model compression method runs (for example, the server 104 shown in FIG. 1) may obtain the first neural network model to be compressed from a remotely connected server (for example, the server 102 shown in FIG. 1) through a wired or wireless connection. Of course, if the first neural network model is stored locally on the electronic device in advance, the electronic device may also obtain the first neural network model locally.
In this embodiment, the first neural network model has previously been trained on training samples, and its accuracy already meets a preset accuracy requirement. The first neural network model of this embodiment may be any general neural network model, for example, a back-propagation neural network (BPNN) model, a convolutional neural network (CNN) model, a region-based convolutional neural network (RCNN) model, a recurrent neural network (RNN) model, a long short-term memory (LSTM) model, or a gated recurrent unit (GRU); it may also be another type of neural network model or a cascaded neural network model combining multiple neural networks.
In step S204, at least one layer is selected from the layers of the first neural network model as a layer to be compressed. In this embodiment, the electronic device may select at least one layer from the layers of the obtained first neural network model as a layer to be compressed. For example, the electronic device may select every layer of the first neural network model as a layer to be compressed.
In some optional implementations of this embodiment, if the first neural network model includes a convolution layer and a fully connected layer (FC), the electronic device may select at least one convolution layer and at least one fully connected layer as layers to be compressed.
In step S206, the layers to be compressed are sorted according to a preset rule. In this embodiment, after the above electronic device has selected the layers to be compressed from the obtained first neural network model, it may sort them according to a preset rule.
In an optional implementation of this embodiment, the above electronic device may order the layers to be compressed by the level number of the level each occupies in the first neural network model, from largest to smallest. The first neural network model may include, for example, at least one input layer, at least one hidden layer, and at least one output layer, where each layer of the first neural network model may have a corresponding level number. As an example, assume the first neural network model includes one input layer, one hidden layer, and one output layer. The input layer may be the first level of the first neural network model, with level number 1; the hidden layer may be the second level, with level number 2; and the output layer may be the third level, with level number 3. Ordering by level number from largest to smallest then yields: output layer, hidden layer, input layer.
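As an illustrative sketch (not part of the patent text; the layer records and field names are hypothetical), the level-based ordering described above can be expressed in a few lines of Python:

```python
# Hypothetical layer records: (name, level_number) pairs for the three-layer
# example network. Sorting by level number in descending order yields the
# compression order: output layer -> hidden layer -> input layer.
layers = [("input", 1), ("hidden", 2), ("output", 3)]

def order_by_level_desc(layers):
    """Return the layers sorted by their level number, largest first."""
    return sorted(layers, key=lambda layer: layer[1], reverse=True)

compression_order = order_by_level_desc(layers)
```

This mirrors the example in the text: the layer with the largest level number (the output layer) is compressed first.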
In another optional implementation of this embodiment, the above electronic device may instead order the layers to be compressed by their degree of contribution to the loss of the first neural network model, from smallest to largest. Specifically, the loss of the first neural network model may be propagated to each of its layers by the back-propagation (BP: Back Propagation) method, the contribution of each layer to the network loss is then computed, and the layers to be compressed are ordered by contribution from smallest to largest.
In some optional implementations of this embodiment, a layer to be compressed may be represented by a connection matrix; for example, an N×N matrix C = (c_ij)_{N×N} represents a network structure with N nodes, where the value of c_ij represents the connection weight from node i to node j; c_ij = 0 indicates that there is no connection from node i to node j; and c_ii represents the bias of node i. The contribution of a layer to be compressed may then be calculated with the following formula:
G_k = Σ_{i=1}^{N} Σ_{j=1}^{N} |c_ij|
where |c_ij| is the absolute value of the connection weight from node i to node j of the k-th layer to be compressed, with i = 1, 2, 3, …, N and j = 1, 2, 3, …, N. The larger G_k is, the greater the impact of the error produced by the k-th layer to be compressed on the performance of the whole neural network, the lower the importance of the k-th layer to be compressed, and the smaller its contribution.
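The contribution-based ordering can be sketched as follows (an illustrative reading, not the patented implementation: `layer_score` takes G_k as the sum of the absolute connection weights, and a larger G_k is treated as a smaller contribution, per the rule above):

```python
# Hypothetical helper: score each candidate layer's connection matrix and
# order the layers by contribution from smallest to largest. Since a larger
# G_k means a smaller contribution, layers with larger G_k come first.
def layer_score(connection_matrix):
    """One reading of G_k: sum of |c_ij| over the layer's N x N matrix."""
    return sum(abs(c) for row in connection_matrix for c in row)

def order_by_contribution(named_matrices):
    """Sort (name, matrix) pairs by ascending contribution degree."""
    return sorted(named_matrices, key=lambda nm: layer_score(nm[1]),
                  reverse=True)
```

The connection matrices here are plain nested lists; any numeric matrix representation would do.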
In step S208, compression processing is performed, in the sorted order, on part or all of the layers to be compressed using a genetic algorithm, so as to obtain a second neural network model, where the accuracy of the second neural network model on preset training samples is not lower than a preset accuracy. In this embodiment, the genetic algorithm is used to compress the layers to be compressed. The underlying principle is that, following the genetic algorithm's principle of "survival of the fittest" and while taking the accuracy of the neural network model into account, various genetic operations are performed on a layer to be compressed with "compressing the layer to be compressed" as the criterion, finally yielding a layer to be compressed with a simplified structure. In a specific implementation, taking a compression-based fitness value as the standard, chromosome individuals whose fitness values meet the requirement are selected to perform genetic operations, so as to produce the chromosome individual with the best network simplification degree (i.e., the most simplified structure), from which the compressed layer to be compressed is obtained. In this embodiment, a compression-based fitness value is a fitness value that reflects the network simplification degree (or network complexity); for example, the larger the fitness value, the higher the network simplification degree, i.e., effective compression has been achieved, and the smaller the fitness value, the lower the network simplification degree, i.e., effective compression has not been achieved. When the genetic algorithm is used to compress the neural network model, chromosome individuals with large fitness values may then be selected to perform the genetic operations, and finally the chromosome individual with the largest fitness value among those produced in the N-th generation population is the optimal chromosome individual.
It should be noted that, in other embodiments of the present invention, the opposite convention may also be adopted: the larger the fitness value, the higher the network complexity, i.e., effective compression has not been achieved, and the smaller the fitness value, the lower the network complexity, i.e., effective compression has been achieved. In that case, when the genetic algorithm is used to compress the neural network model, chromosome individuals with small fitness values may be selected to perform the genetic operations, and the chromosome individual with the smallest fitness value among those produced in the N-th generation population is the optimal chromosome individual.
In some optional implementations of this embodiment, to balance the accuracy and the compression of the first neural network model, a preset accuracy may be set to constrain its compression. It should be noted that the preset accuracy may be the original accuracy of the first neural network model, or a value slightly lower than that original accuracy. The preset accuracy may be set manually or set by the above electronic device based on a preset algorithm, and it may be adjusted according to actual needs; this embodiment imposes no limitation in this respect.
In some optional implementations of this embodiment, the compression processing includes deleting at least one node of the layer to be compressed together with its corresponding connections, and/or deleting at least one connection of the layer to be compressed, so as to reduce the network complexity of the layer to be compressed, i.e., to increase its network simplification degree.
In some optional implementations of this embodiment, each time the genetic algorithm has performed compression processing on one of the layers to be compressed, the current neural network model is trained with the preset training samples. If the accuracy of the current neural network model is not lower than the preset accuracy, then, while layers to be compressed remain unprocessed, compression processing continues on the next layer to be compressed in the sorted order; once compression processing has been performed on all layers to be compressed, the current neural network model is determined to be the compressed second neural network model. If the accuracy of the current neural network model is lower than the preset accuracy, the neural network model obtained after compression of the previous layer to be compressed is determined to be the compressed second neural network model.
As an example, assume the number of layers to be compressed is N, and sorting these N layers yields the order: layer 1 to be compressed, layer 2 to be compressed, layer 3 to be compressed, …, layer N to be compressed. First, compression processing is performed on layer 1 using the genetic algorithm, and the uncompressed layer 1 in the first neural network model is replaced with the compressed layer 1. The resulting neural network model is trained with the preset training samples to obtain the accuracy of the current neural network model, and it is judged whether this accuracy is lower than the preset accuracy. If it is not lower, compression processing continues on layer 2, the same steps are repeated, and so on. If, after compression processing has been performed on layer N, the accuracy of the current neural network model is still not lower than the preset accuracy, the current neural network model (with every layer to be compressed replaced by its compressed counterpart) is determined to be the compressed second neural network model. If, after compression processing is performed on some layer, for example layer 3 (at which point layers 1, 2, and 3 of the first neural network model have been replaced with their compressed counterparts), the accuracy of the current neural network model is lower than the preset accuracy, then the neural network model obtained after compression of the previous layer (i.e., with only layers 1 and 2 of the first neural network model replaced by their compressed counterparts) is determined to be the compressed second neural network model.
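The layer-by-layer loop with rollback described in this example can be sketched as follows. This is a minimal illustration: `compress_layer`, `evaluate_accuracy`, and the model representation are hypothetical stand-ins for the genetic-algorithm compression of one layer and for training/evaluation on the preset training samples.

```python
import copy

def compress_model(model, layers_in_order, compress_layer, evaluate_accuracy,
                   preset_accuracy):
    """Compress the layers one by one in the given order; as soon as accuracy
    drops below the preset accuracy, roll back to the previous model."""
    current = model
    for layer_id in layers_in_order:
        candidate = copy.deepcopy(current)
        compress_layer(candidate, layer_id)      # GA compression of one layer
        if evaluate_accuracy(candidate) < preset_accuracy:
            return current                       # keep the last good model
        current = candidate                      # accept and continue
    return current
```

With N = 3 layers, if compressing layer 3 drops the accuracy below the threshold, the function returns the model in which only layers 1 and 2 were replaced, matching the example above.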
It should be noted that, when training the compressed neural network model, the above electronic device may fine-tune the current neural network model. In this way, a neural network model whose accuracy is slightly below the preset accuracy can be fine-tuned until it meets the preset accuracy requirement, so that the neural network model can be compressed further.
In this embodiment, the above electronic device may store the second neural network model obtained by the compression processing, for example locally on the electronic device (e.g., on a hard disk or in memory) or on a server in remote communication with the electronic device.
As can be seen from the above description, the solution provided by the embodiments of the present invention uses a genetic algorithm to compress a trained neural network model, reducing the model's computation load and storage footprint so that it can be deployed on devices with limited storage and computing resources. Further, the solution of the embodiments of the present invention can balance the accuracy and the compression of the neural network model at the same time.
FIG. 3 shows a flowchart of a method for performing compression processing on a layer to be compressed using a genetic algorithm according to an embodiment of the present invention. The method 300 shown in FIG. 3 may be executed by a computer or an electronic device with computing capability (for example, the server 104 shown in FIG. 1). In addition, those skilled in the art will understand that any system that performs the method 300 is within the scope and spirit of the embodiments of the present invention.
As shown in FIG. 3, in step S302, the network structure information of the layer to be compressed is obtained. The network structure may be represented by a connection matrix; for example, an N×N matrix C = (c_ij)_{N×N} represents a network structure with N nodes, where the value of c_ij represents the connection weight from node i to node j; c_ij = 0 indicates no connection from node i to node j; and c_ii represents the bias of node i.
In step S304, the layer to be compressed is encoded according to its network structure information to obtain a chromosome. The structure of the neural network must be expressed as the chromosome code of a genetic-algorithm individual before the genetic algorithm can operate on it. In one implementation, assume the layer to be compressed has N neurons, numbered as nodes 1 through N; an N×N matrix can then represent the network structure of the layer to be compressed. The neural network structure with 7 nodes shown in FIG. 3a is taken as an example to illustrate the encoding method of this embodiment. Table 1 gives the node connection relationships of this neural network structure; in Table 1, the element at (i, j) of the matrix represents the connection relationship from the i-th node to the j-th node. Since the embodiments of the present invention do not involve modifying the connection weights of the neural network model being compressed, this embodiment expresses the node connection relationships in the form 0, 1, -1, where "0" means no connection; "1" means a connection weight of 1 with an excitatory effect, drawn as a solid line in FIG. 3a; and "-1" means a connection weight of -1 with an inhibitory effect, drawn as a dotted line in FIG. 3a. It can thus be seen that Table 1 is equivalent to the structure shown in FIG. 3a.
Table 1. Node connection relationships of the example neural network structure of this embodiment
[Table 1: 7×7 node connection matrix with entries 0, 1, and -1 — rendered as an image in the original publication]
According to the node connection relationships shown in Table 1, the code of this neural network can be expressed as a digit string composed of 0, 1, and -1: connecting the elements from (3, 1) through (7, 6), left to right and top to bottom, yields the following chromosome code:
[chromosome code: a digit string of 0, 1, and -1 values — rendered as an image in the original publication]
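The traversal from element (3, 1) through element (7, 6) can be sketched in Python as follows (an illustrative reading: for a 7-node feedforward structure, the entries below the diagonal from row 3 onward are concatenated row by row; the matrix used in the test is made up, not the one in Table 1):

```python
# Encode the sub-diagonal entries of a connection matrix into a chromosome:
# elements (i, j) with 3 <= i <= N and j < i in 1-based indices, taken left
# to right and top to bottom, each entry being 0, 1, or -1.
def encode_lower_triangle(matrix):
    """Concatenate entries (i, j), j < i, starting at row 3 (1-based)."""
    return [matrix[i][j]                      # 0-based indexing internally
            for i in range(2, len(matrix))    # rows 3..N
            for j in range(i)]                # columns 1..i-1
```

For a 7×7 matrix this yields 2 + 3 + 4 + 5 + 6 = 20 genes, consistent with spanning (3, 1) through (7, 6).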
In step S306, population initialization is performed from the chromosome obtained above to generate an initial population. In a specific implementation of this embodiment, a replication operation may be performed on the chromosome obtained above to generate a predetermined number of chromosome individuals, and the set of these chromosome individuals is taken as the initial population. The size of the initial population is determined by the population scale M, which may be, for example but without limitation, 10 to 100. Since a replication operation is used, all chromosome individuals in the initial population are identical.
In step S308, the fitness values of the chromosome individuals in the population are calculated. In some optional implementations of this embodiment, the fitness function may use the following formulas:
[two alternative fitness-function formulas combining E(i, t) and H(i, t) — rendered as images in the original publication; the fitness grows as the network error E(i, t) shrinks and as the network simplification degree H(i, t) grows]
where f(i, t) denotes the fitness of the i-th individual of the t-th generation; E(i, t) denotes the network error of the neural network model corresponding to the i-th individual of the t-th generation; and H(i, t) denotes the network simplification degree of the i-th individual of the t-th generation.
In a specific implementation, E(i, t) may be calculated with the following formula:
E(i, t) = Σ_q ( d_q(i, t) − y_q(i, t) )²
where d_q(i, t) and y_q(i, t) are, respectively, the expected output value and the actual output value of the neural network model corresponding to the i-th individual of the t-th generation on the preset q-th training sample. The smaller the network error value, the higher the accuracy.
H(i, t) may be calculated with the following formula:
H(i, t) = 1 / m(i, t)
where m(i, t) is the number of nodes of the i-th individual of the t-th generation. The fewer the nodes, the larger the network simplification value, the higher the network simplification degree, and the more simplified the neural network model.
In this implementation, the network error E(i, t) is used to constrain the compression of the neural network model being compressed, so that accuracy and compression can be balanced at the same time. The smaller the network error E(i, t), the higher the accuracy of the compressed neural network model; the larger the network simplification value, the more simplified the structure of the compressed neural network model. Therefore, in this implementation, the smaller the network error and the larger the network simplification degree of a chromosome individual, the larger its fitness value.
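A sketch of this fitness evaluation follows. The exact combined formulas appear as images in the original publication, so the combination used here, f = H / (1 + E), is an assumption: one possible monotone combination that grows as the error shrinks and the simplification grows, with the error taken as a sum of squared differences over the training samples and the simplification as the reciprocal of the node count.

```python
def network_error(expected, actual):
    """E(i, t): sum of squared differences over all training samples."""
    return sum((d - y) ** 2 for d, y in zip(expected, actual))

def network_simplification(node_count):
    """H(i, t): fewer nodes -> larger simplification value."""
    return 1.0 / node_count

def fitness(expected, actual, node_count):
    # Hypothetical combination (not the patented formula): larger when the
    # network error is small and the network simplification is large.
    e = network_error(expected, actual)
    h = network_simplification(node_count)
    return h / (1.0 + e)
```

Under this combination, pruning nodes without increasing the error raises the fitness, while any increase in error lowers it, matching the selection pressure described above.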
In other optional implementations of this embodiment, the fitness function may instead use the following formulas:
[formulas ① and ② — rendered as images in the original publication: ① a fitness function based on the network error E(i, t), and ② a fitness function based on the network simplification degree H(i, t)]
where f(i, t) denotes the fitness of the i-th individual of the t-th generation; E(i, t) denotes the network error of the neural network model corresponding to the i-th individual of the t-th generation; and H(i, t) denotes the network simplification degree of the i-th individual of the t-th generation.
In this implementation, the fitness function comprises formula ① and formula ②. Formula ① is a fitness function based on the network error and reflects the accuracy of the neural network model; formula ② is a fitness function based on the network simplification degree and reflects the compression of the neural network model. This embodiment therefore computes, for each chromosome individual, both an accuracy-based fitness value and a compression-based fitness value.
In step S310, it is judged whether a termination condition has been reached. The termination condition may include a preset threshold on the number of iterations or a preset convergence condition. The number of iterations may, for example but without limitation, be set to 500, in which case the termination condition is judged to be reached when the number of iterations reaches 500. The convergence condition may, for example but without limitation, be that the fitness value satisfies a certain condition, e.g., that the fitness value exceeds a preset threshold, at which point the termination condition is judged to be reached.
In step S312, if the judgment result of step S310 is that the termination condition has not been reached, then, taking the fitness value as the standard, some chromosome individuals whose fitness values meet the requirement are selected, and genetic operations such as replication, crossover, or mutation are performed on them, thereby producing a new generation of the population, after which the method returns to step S308. According to the fitness function of step S308, this embodiment selects chromosome individuals with relatively large fitness values to perform the genetic operations and eliminates some chromosome individuals with small fitness values.
When an accuracy-based fitness value and a compression-based fitness value are computed separately for each chromosome individual, the selection criterion of this embodiment may proceed as follows: (1) compute the accuracy-based fitness value of each chromosome individual in the population with formula ①, then compute each individual's first selection probability, and select first chromosome individuals according to the first selection probability; (2) compute the compression-based fitness value of each chromosome individual in the population with formula ②, then compute each individual's second selection probability, and select second chromosome individuals from the first chromosome individuals selected in step (1) according to the second selection probability. Optionally, before selecting chromosome individuals according to the selection probabilities, the chromosome individuals with the highest and lowest fitness values in the current population may be identified; the best chromosome individual is retained and passed directly into the next generation, and the worst is eliminated, which ensures that good genes are inherited by the next generation. The selection strategy of this embodiment constrains the compression of the layer to be compressed through the accuracy, ensuring that chromosome individuals with small network error and large network simplification degree enter the next generation.
In some optional implementations of this embodiment, the fitness-proportional selection method (roulette-wheel selection), a commonly used selection method, may be adopted; it means that the higher the fitness, the larger the probability of being selected, namely:
p(i, t) = f(i, t) / f(sum, t)
where p(i, t) is the selection probability of the i-th individual of the t-th generation, f(i, t) is the fitness of the i-th individual of the t-th generation, and f(sum, t) is the total fitness of the t-th generation population.
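Roulette-wheel selection with p(i, t) = f(i, t) / f(sum, t) can be sketched as follows (an illustrative helper; the population layout and the injectable random source are conveniences for this sketch):

```python
import random

def selection_probabilities(fitnesses):
    """p(i, t) = f(i, t) / f(sum, t) for every individual."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def roulette_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    r = rng.random() * sum(fitnesses)
    acc = 0.0
    for individual, f in zip(population, fitnesses):
        acc += f
        if r <= acc:
            return individual
    return population[-1]   # guard against floating-point round-off
```

An individual with three times the fitness of another is selected three times as often on average, which is exactly the "higher fitness, higher selection probability" rule above.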
Replication, crossover, or mutation operations are performed on the selected chromosome individuals. The replication operation copies a selected parent chromosome individual directly, without any change, from the current generation into the new generation. The crossover operation randomly selects two parent chromosome individuals from the population by the selection method above and exchanges parts of their components to form new offspring chromosome individuals. The mutation operation randomly selects one parent chromosome individual from the population by the selection method above, randomly designates one node in that individual's expression as the mutation point, and changes the value of the gene at the mutation point to another valid value, forming a new offspring chromosome individual.
Whether a crossover operation occurs may be decided according to the crossover probability P_c: a random number P between 0 and 1 is generated; when P ≤ P_c the crossover operation occurs, and when P > P_c it does not. Likewise, whether a mutation operation occurs may be decided according to the mutation probability P_m; since this is prior art, its description is omitted here.
In this embodiment, when the crossover operation is performed, a crossover point may be selected at random with a certain probability in each parent chromosome individual, the part below the crossover point being called the crossover segment. The first parent chromosome individual deletes its crossover segment, and the crossover segment of the second parent is inserted at its crossover point, generating the first offspring chromosome individual. Likewise, the second parent deletes its crossover segment, and the crossover segment of the first parent is inserted at its crossover point, forming the second offspring chromosome individual. In this case, even if the two selected parent chromosome individuals are identical, their crossover points differ, so the resulting offspring chromosome individuals also differ, which effectively avoids inbreeding and improves the global search capability.
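The crossover just described can be sketched as follows. The cut points are passed in explicitly so the exchange is easy to follow; in an actual GA they would be drawn at random for each parent, with the operation gated by the crossover probability P_c. Chromosomes are plain lists of 0 / 1 / -1 genes.

```python
def crossover(parent1, parent2, cut1, cut2):
    """Each parent deletes the segment below its own cut point and receives
    the other parent's segment there, producing two offspring."""
    child1 = parent1[:cut1] + parent2[cut2:]
    child2 = parent2[:cut2] + parent1[cut1:]
    return child1, child2
```

Because each parent has its own cut point, two identical parents with different cut points still yield offspring that differ from both, as noted in the text.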
In this embodiment, when the mutation operation is performed, one of the following operations may be adopted at random: (a) delete at least one node in the hidden layer of the neural network model together with its corresponding connections; (b) delete at least one connection in the hidden layer of the neural network model; (c) randomly repair a deleted node or connection with a certain probability; (d) add a hidden-layer node and randomly generate the corresponding connection weights. Deleting nodes always precedes adding nodes, and the number of nodes added should not exceed the number deleted; moreover, a node is added only when deletion cannot produce a good offspring. Such a mutation operation ensures that the method always moves in the direction of compressing the neural network model.
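Operations (a) and (b) on a connection-matrix representation can be sketched as follows (an illustrative helper: deleting a node removes its row and column, i.e., the node and all its connections, while deleting a connection zeroes one entry; the indices would be chosen at random, gated by the mutation probability P_m):

```python
def delete_connection(matrix, i, j):
    """Mutation (b): remove the single connection from node i to node j."""
    new = [row[:] for row in matrix]   # copy, leaving the parent intact
    new[i][j] = 0
    return new

def delete_node(matrix, k):
    """Mutation (a): remove node k and all connections touching it."""
    return [[v for j, v in enumerate(row) if j != k]
            for i, row in enumerate(matrix) if i != k]
```

Both helpers return a new matrix so the parent chromosome can still be retained or repaired (operation (c)) if the offspring turns out worse.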
In step S314, if the judgment result of step S310 is that the termination condition has been reached, the chromosome individual with the optimal fitness value is output, from which the compressed layer to be compressed is obtained.
In some optional implementations of this embodiment, the optimal chromosome individual may be set as max f(i, t), i.e., the chromosome individual having the largest fitness when the termination condition is reached is taken as the optimal chromosome individual. Performing a decoding operation on the optimal chromosome individual yields the optimal network structure of the layer to be compressed.
FIG. 4 shows a schematic diagram of a neural network model compression device according to an embodiment of the present invention. The device 400 shown in FIG. 4 corresponds to the neural network model compression method described above; since the embodiment of the device 400 is basically similar to the method embodiments, it is described relatively simply, and for the relevant parts reference may be made to the description of the method embodiments. The device 400 may be implemented in software, hardware, or a combination of the two, and may be installed in a computer or other suitable electronic device with computing capability.
As shown in FIG. 4, the device 400 may include an obtaining module 402, a selection module 404, a sorting module 406, and a compression module 408. The obtaining module 402 is configured to obtain a trained first neural network model. The selection module 404 is configured to select at least one layer from the layers of the first neural network model as a layer to be compressed. The sorting module 406 is configured to sort the layers to be compressed according to a preset rule. The compression module 408 is configured to perform, in the sorted order, compression processing on part or all of the layers to be compressed using a genetic algorithm to obtain a second neural network model, where the accuracy of the second neural network model on preset training samples is not lower than a preset accuracy.
In one embodiment of the device 400, the sorting module 406 is specifically configured to sort the layers to be compressed in descending order of the level number of the level at which each layer to be compressed is located in the first neural network model.
In another embodiment of the device 400, the sorting module 406 is specifically configured to sort the layers to be compressed in ascending order of each layer's contribution to the loss of the first neural network model.
In yet another embodiment of the device 400, the compression module 408 includes a training unit and a determination unit. The training unit is configured to train the current neural network model with preset training samples after each time compression processing is performed on one of the layers to be compressed using the genetic algorithm. The determination unit is configured to: if the accuracy of the current neural network model is not lower than the preset accuracy, continue to perform compression processing on the next layer to be compressed when layers to be compressed remain unprocessed, and determine the current neural network model as the second neural network model obtained after compression processing when compression processing has been performed on all layers to be compressed; if the accuracy of the current neural network model is lower than the preset accuracy, determine the neural network model obtained after compression processing of the previous layer to be compressed as the second neural network model obtained after compression processing.
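The training-unit/determination-unit flow above can be sketched as a simple loop. This is an illustrative sketch only; `compress_layer`, `train`, and `accuracy` are hypothetical stand-ins for the genetic-algorithm compression step, the retraining step, and the evaluation on the preset training samples:

```python
def compress_model(model, layers, compress_layer, train, accuracy, min_accuracy):
    # `layers` is assumed to be already sorted according to the preset rule
    for layer in layers:
        candidate = compress_layer(model, layer)  # genetic-algorithm compression
        train(candidate)                          # retrain on preset training samples
        if accuracy(candidate) < min_accuracy:
            return model     # roll back: model after the previous layer's compression
        model = candidate    # accept and move on to the next layer to be compressed
    return model             # all layers compressed; this is the second model

# toy stand-ins: a "model" is a list of layer widths, compression halves one width,
# and "accuracy" falls as the model shrinks
halve = lambda m, i: m[:i] + [m[i] // 2] + m[i + 1:]
acc = lambda m: sum(m) / 24
result = compress_model([8, 8, 8], [0, 1, 2], halve, lambda m: None, acc, 0.7)
```

In the toy run, compressing the second layer drops the accuracy below 0.7, so the loop returns the model as it stood after the first layer's compression.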
In still another embodiment of the device 400, the compression module 408 further includes an acquisition unit, an encoding unit, an initialization unit, a calculation unit, a judgment unit, a genetic operation unit, and an output unit. The acquisition unit is configured to acquire the network structure information of the layer to be compressed. The encoding unit is configured to encode the layer to be compressed according to its network structure information to obtain a chromosome. The initialization unit is configured to perform population initialization based on the obtained chromosome to generate an initial population. The calculation unit is configured to calculate the fitness values of the chromosome individuals in the population. The judgment unit is configured to judge whether a termination condition is reached. The genetic operation unit is configured to, if the termination condition is not reached, select, using the fitness value as the criterion, chromosome individuals whose fitness values meet the requirement, and perform replication, crossover, or mutation operations to generate a new generation of the population. The output unit is configured to, if the termination condition is reached, output the chromosome individual with the optimal fitness value, thereby obtaining the compressed layer to be compressed.
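The seven units above together implement a standard genetic-algorithm loop. A minimal generic sketch, assuming a fixed generation count as the termination condition and with `fitness` as a hypothetical stand-in for scoring a decoded layer structure, might look like:

```python
import random

def genetic_compress(encode_len, fitness, generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    # initialization: a population of bit-string chromosomes (encoded layer structures)
    pop = [[rng.randint(0, 1) for _ in range(encode_len)] for _ in range(pop_size)]
    for _ in range(generations):                 # termination: fixed generation count
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]        # keep individuals whose fitness meets the bar
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)        # pick two parent chromosomes
            cut = rng.randrange(1, encode_len)   # single-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.1:               # occasional mutation: flip one bit
                child[rng.randrange(encode_len)] ^= 1
            children.append(child)
        pop = parents + children                 # new generation of the population
    return max(pop, key=fitness)                 # output the optimal individual
```

With a one-max fitness (`sum`), the loop steadily concentrates the population on high-fitness bit strings, mirroring how the genetic operation unit evolves candidate layer structures.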
In yet a further embodiment of the device 400, the calculation unit is further configured to calculate, for the chromosome individuals in the population, accuracy-based and compression-based fitness values respectively. Correspondingly, the genetic operation unit is further configured to obtain a first selection probability of the chromosome individuals in the population according to the accuracy-based fitness values and select first chromosome individuals according to the first selection probability, and to obtain a second selection probability of the chromosome individuals in the population according to the compression-based fitness values and select second chromosome individuals from the first chromosome individuals according to the second selection probability; replication, crossover, or mutation operations are then performed on the second chromosome individuals to generate a new generation of the population.
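The two-stage selection described here (an accuracy-based draw followed by a compression-based draw among the survivors) can be sketched as follows; the fitness callables and the bit-list chromosome encoding are hypothetical illustrations:

```python
import random

def two_stage_select(population, acc_fitness, comp_fitness, k1, k2, rng):
    def roulette(pop, fit, k):
        # each individual's selection probability is proportional to its fitness
        weights = [fit(ind) for ind in pop]
        return rng.choices(pop, weights=weights, k=k)

    first = roulette(population, acc_fitness, k1)   # first selection probability
    return roulette(first, comp_fitness, k2)        # second selection probability

# toy run: accuracy fitness favours more 1-bits, compression fitness favours fewer
pop = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
rng = random.Random(0)
second = two_stage_select(pop, lambda c: sum(c) + 1, lambda c: 4 - sum(c), 3, 2, rng)
```

The two draws pull in opposite directions by design: the first keeps accurate structures, the second biases the survivors toward smaller ones.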
FIG. 5 shows a schematic diagram of a computer device according to an embodiment of the present invention. As shown in FIG. 5, the computer device 500 may include a processor 502 and a memory 504, wherein the memory 504 stores executable instructions which, when executed, cause the processor 502 to perform the method 200 shown in FIG. 2 or the method 300 shown in FIG. 3.
FIG. 6 shows a block diagram of an exemplary computer device suitable for implementing embodiments of the present invention. The computer device 600 shown in FIG. 6 is merely an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in FIG. 6, the computer device 600 is implemented in the form of a general-purpose computing device. The components of the computer device 600 may include, but are not limited to, a processor 602, a system memory 604, and a bus 606 connecting different system components (including the processor 602 and the system memory 604).
The bus 606 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer device 600 typically includes a variety of computer-system-readable media. These media may be any available media accessible to the computer device 600, including volatile and non-volatile media as well as removable and non-removable media.
The system memory 604 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 608 and/or cache memory 610. The computer device 600 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 612 may be used to read and write non-removable, non-volatile magnetic media (not shown in FIG. 6, commonly referred to as a "hard drive"). Although not shown in FIG. 6, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from and writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus 606 through one or more data media interfaces. The system memory 604 may include at least one program product having a set (e.g., at least one) of program modules configured to perform the functions of the embodiments of FIG. 2 or FIG. 3 of the present invention described above.
A program/utility 614 having a set (at least one) of program modules 616 may be stored, for example, in the system memory 604. Such program modules 616 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. The program modules 616 generally perform the functions and/or methods of the embodiments of FIG. 2 or FIG. 3 described in the present invention.
The computer device 600 may also communicate with one or more external devices 700 (e.g., a keyboard, a pointing device, a display 800, etc.), with one or more devices that enable a user to interact with the computer device 600, and/or with any device (e.g., a network card, a modem, etc.) that enables the computer device 600 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 618. In addition, the computer device 600 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through the network adapter 620. As shown, the network adapter 620 communicates with the other modules of the computer device 600 through the bus 606. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the computer device 600, including but not limited to microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
The processor 602 executes various functional applications and data processing by running programs stored in the system memory 604, for example, implementing the neural network model compression method shown in the foregoing embodiments.
An embodiment of the present invention further provides a computer-readable medium having executable instructions stored thereon, wherein the executable instructions, when executed, cause a computer to perform the method 200 shown in FIG. 2 or the method 300 shown in FIG. 3.
The computer-readable medium of this embodiment may include the RAM 608, and/or the cache memory 610, and/or the storage system 612 in the system memory 604 of the embodiment shown in FIG. 6.
With the development of technology, the distribution of computer programs is no longer limited to tangible media; programs may also be downloaded directly from a network or obtained in other ways. Therefore, the computer-readable medium in this embodiment may include not only tangible media but also intangible media.
The computer-readable medium of this embodiment may employ any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, an apparatus, or a computer program product. Therefore, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, etc.) containing computer-usable program code.
Embodiments of the present invention are described with reference to flowcharts and/or block diagrams of methods, apparatuses, and computer program products according to embodiments of the present invention. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
The specific embodiments set forth above in conjunction with the drawings describe exemplary embodiments, but do not represent all embodiments that can be implemented or that fall within the scope of protection of the claims. The term "exemplary" as used throughout this specification means "serving as an example, instance, or illustration" and does not mean "preferred" or "advantageous" over other embodiments. The specific embodiments include specific details for the purpose of providing an understanding of the described techniques; however, these techniques can be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
The foregoing description of the present disclosure is provided to enable any person of ordinary skill in the art to make or use the present disclosure. Various modifications to the present disclosure will be apparent to those of ordinary skill in the art, and the general principles defined herein may be applied to other variations without departing from the scope of protection of the present disclosure. Therefore, the present disclosure is not limited to the examples and designs described herein, but is to be accorded the broadest scope consistent with the principles and novel features disclosed herein.

Claims (14)

  1. A neural network model compression method, comprising:
    obtaining a trained first neural network model;
    selecting at least one layer from the layers of the first neural network model as a layer to be compressed;
    sorting the layers to be compressed according to a preset rule;
    performing, in the sorted order, compression processing on some or all of the layers to be compressed using a genetic algorithm to obtain a second neural network model, wherein the accuracy of the second neural network model on preset training samples is not lower than a preset accuracy.
  2. The method according to claim 1, wherein sorting the layers to be compressed according to a preset rule comprises:
    sorting the layers to be compressed in descending order of the level number of the level at which each layer to be compressed is located in the first neural network model.
  3. The method according to claim 1, wherein sorting the layers to be compressed according to a preset rule comprises:
    sorting the layers to be compressed in ascending order of each layer's contribution to the loss of the first neural network model.
  4. The method according to any one of claims 1-3, wherein performing, in the sorted order, compression processing on some or all of the layers to be compressed using a genetic algorithm to obtain a second neural network model comprises:
    after each time compression processing is performed on one of the layers to be compressed using the genetic algorithm, training the current neural network model with preset training samples;
    if the accuracy of the current neural network model is not lower than the preset accuracy, continuing, in the sorted order, to perform compression processing on the next layer to be compressed when layers to be compressed remain unprocessed, and determining the current neural network model as the second neural network model after compression processing when compression processing has been performed on all layers to be compressed; if the accuracy of the current neural network model is lower than the preset accuracy, determining the neural network model obtained after compression processing of the previous layer to be compressed as the second neural network model after compression processing.
  5. The method according to claim 1, wherein performing compression processing on some or all of the layers to be compressed using a genetic algorithm comprises:
    obtaining network structure information of the layer to be compressed;
    encoding the layer to be compressed according to its network structure information to obtain a chromosome;
    performing population initialization based on the obtained chromosome to generate an initial population;
    calculating fitness values of the chromosome individuals in the population;
    judging whether a termination condition is reached;
    if the termination condition is not reached, selecting, using the fitness value as the criterion, chromosome individuals whose fitness values meet the requirement, performing replication, crossover, or mutation operations to generate a new generation of the population, and then returning to the step of calculating the fitness values of the chromosome individuals in the population;
    if the termination condition is reached, outputting the chromosome individual with the optimal fitness value, thereby obtaining the compressed layer to be compressed.
  6. The method according to claim 5, wherein calculating fitness values of the chromosome individuals in the population comprises:
    calculating, for the chromosome individuals in the population, accuracy-based and compression-based fitness values respectively;
    correspondingly, selecting, using the fitness value as the criterion, chromosome individuals whose fitness values meet the requirement and performing replication, crossover, or mutation operations to generate a new generation of the population comprises:
    obtaining a first selection probability of the chromosome individuals in the population according to the accuracy-based fitness values and selecting first chromosome individuals according to the first selection probability; obtaining a second selection probability of the chromosome individuals in the population according to the compression-based fitness values and selecting second chromosome individuals from the first chromosome individuals according to the second selection probability; and performing replication, crossover, or mutation operations on the second chromosome individuals to generate a new generation of the population.
  7. A neural network model compression device, comprising:
    an acquisition module, configured to acquire a trained first neural network model;
    a selection module, configured to select at least one layer from the layers of the first neural network model as a layer to be compressed;
    a sorting module, configured to sort the layers to be compressed according to a preset rule;
    a compression module, configured to perform, in the sorted order, compression processing on some or all of the layers to be compressed using a genetic algorithm to obtain a second neural network model, wherein the accuracy of the second neural network model on preset training samples is not lower than a preset accuracy.
  8. The device according to claim 7, wherein the sorting module is specifically configured to:
    sort the layers to be compressed in descending order of the level number of the level at which each layer to be compressed is located in the first neural network model.
  9. The device according to claim 7, wherein the sorting module is specifically configured to:
    sort the layers to be compressed in ascending order of each layer's contribution to the loss of the first neural network model.
  10. The device according to any one of claims 7-9, wherein the compression module comprises:
    a training unit, configured to train the current neural network model with preset training samples after each time compression processing is performed on one of the layers to be compressed using the genetic algorithm;
    a determination unit, configured to: if the accuracy of the current neural network model is not lower than the preset accuracy, continue, in the sorted order, to perform compression processing on the next layer to be compressed when layers to be compressed remain unprocessed, and determine the current neural network model as the second neural network model obtained after compression processing when compression processing has been performed on all layers to be compressed; if the accuracy of the current neural network model is lower than the preset accuracy, determine the neural network model obtained after compression processing of the previous layer to be compressed as the second neural network model obtained after compression processing.
  11. The device according to claim 7, wherein the compression module further comprises:
    an acquisition unit, configured to acquire network structure information of the layer to be compressed;
    an encoding unit, configured to encode the layer to be compressed according to its network structure information to obtain a chromosome;
    an initialization unit, configured to perform population initialization based on the obtained chromosome to generate an initial population;
    a calculation unit, configured to calculate fitness values of the chromosome individuals in the population;
    a judgment unit, configured to judge whether a termination condition is reached;
    a genetic operation unit, configured to, if the termination condition is not reached, select, using the fitness value as the criterion, chromosome individuals whose fitness values meet the requirement, and perform replication, crossover, or mutation operations to generate a new generation of the population;
    an output unit, configured to, if the termination condition is reached, output the chromosome individual with the optimal fitness value, thereby obtaining the compressed layer to be compressed.
  12. The device according to claim 11, wherein the calculation unit is further configured to:
    calculate, for the chromosome individuals in the population, accuracy-based and compression-based fitness values respectively;
    correspondingly, the genetic operation unit is further configured to:
    obtain a first selection probability of the chromosome individuals in the population according to the accuracy-based fitness values and select first chromosome individuals according to the first selection probability; obtain a second selection probability of the chromosome individuals in the population according to the compression-based fitness values and select second chromosome individuals from the first chromosome individuals according to the second selection probability; and perform replication, crossover, or mutation operations on the second chromosome individuals to generate a new generation of the population.
  13. A computer device, comprising:
    a processor; and
    a memory having executable instructions stored thereon, wherein the executable instructions, when executed, cause the processor to perform the method according to any one of claims 1-4.
  14. A computer-readable medium having executable instructions stored thereon, wherein the executable instructions, when executed, cause a computer to perform the method according to any one of claims 1-4.
PCT/CN2019/103511 2018-09-05 2019-08-30 Method for compressing neural network model, device, and computer apparatus WO2020048389A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811037330.2A CN109165720A (en) 2018-09-05 2018-09-05 Neural network model compression method, device and computer equipment
CN201811037330.2 2018-09-05

Publications (1)

Publication Number Publication Date
WO2020048389A1 true WO2020048389A1 (en) 2020-03-12

Family

ID=64894255

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/103511 WO2020048389A1 (en) 2018-09-05 2019-08-30 Method for compressing neural network model, device, and computer apparatus

Country Status (2)

Country Link
CN (1) CN109165720A (en)
WO (1) WO2020048389A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165720A (en) * 2018-09-05 2019-01-08 深圳灵图慧视科技有限公司 Neural network model compression method, device and computer equipment
CN112219208A (en) * 2019-02-01 2021-01-12 华为技术有限公司 Deep neural network quantization method, device, equipment and medium
CN110175671B (en) * 2019-04-28 2022-12-27 华为技术有限公司 Neural network construction method, image processing method and device
CN110135498A (en) * 2019-05-17 2019-08-16 电子科技大学 Image recognition method based on deep evolutionary neural network
CN110276448B (en) * 2019-06-04 2023-10-24 深圳前海微众银行股份有限公司 Model compression method and device
CN112348177B (en) * 2019-07-05 2024-01-09 安徽寒武纪信息科技有限公司 Neural network model verification method, device, computer equipment and storage medium
CN112784952B (en) * 2019-11-04 2024-03-19 珠海格力电器股份有限公司 Convolutional neural network operation system, method and equipment
CN111028226A (en) * 2019-12-16 2020-04-17 北京百度网讯科技有限公司 Method and device for algorithm transplantation
CN111338816B (en) * 2020-02-18 2023-05-12 深圳鲲云信息科技有限公司 Instruction interaction method, system, equipment and storage medium based on neural network
CN111275190B (en) * 2020-02-25 2023-10-10 北京百度网讯科技有限公司 Compression method and device of neural network model, image processing method and processor
CN112529278B (en) * 2020-12-02 2021-08-31 中国人民解放军93209部队 Method and device for planning navigation network based on connection matrix optimization
CN114239792B (en) * 2021-11-01 2023-10-24 荣耀终端有限公司 System, apparatus and storage medium for image processing using quantization model

Citations (5)

Publication number Priority date Publication date Assignee Title
CN101599138A (en) * 2009-07-07 2009-12-09 武汉大学 Land evaluation method based on artificial neural network
CN103971162A (en) * 2014-04-04 2014-08-06 华南理工大学 Method for improving BP (back propagation) neural network based on genetic algorithm
CN106503802A (en) * 2016-10-20 2017-03-15 上海电机学院 Method for optimizing a BP neural network system using a genetic algorithm
CN108038546A (en) * 2017-12-29 2018-05-15 百度在线网络技术(北京)有限公司 Method and apparatus for compressing neural networks
CN109165720A (en) * 2018-09-05 2019-01-08 深圳灵图慧视科技有限公司 Neural network model compression method, device and computer equipment

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US7313550B2 (en) * 2002-03-27 2007-12-25 Council Of Scientific & Industrial Research Performance of artificial neural network models in the presence of instrumental noise and measurement errors
CN108229646A (en) * 2017-08-08 2018-06-29 北京市商汤科技开发有限公司 neural network model compression method, device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN109165720A (en) 2019-01-08

Similar Documents

Publication Publication Date Title
WO2020048389A1 (en) Method for compressing neural network model, device, and computer apparatus
CN110674880B (en) Network training method, device, medium and electronic equipment for knowledge distillation
US11487954B2 (en) Multi-turn dialogue response generation via mutual information maximization
CN108875807B (en) Image description method based on multiple attention and multiple scales
CN108536679B (en) Named entity recognition method, device, equipment and computer readable storage medium
CN110366734B (en) Optimizing neural network architecture
CN111128137A (en) Acoustic model training method and device, computer equipment and storage medium
CN110929515A (en) Reading understanding method and system based on cooperative attention and adaptive adjustment
CN109919221B (en) Image description method based on bidirectional double-attention mechanism
US20200134471A1 (en) Method for Generating Neural Network and Electronic Device
WO2020238783A1 (en) Information processing method and device, and storage medium
CN112508085A (en) Social network link prediction method based on perceptual neural network
CN112465120A (en) Fast attention neural network architecture searching method based on evolution method
CN111538827A (en) Case recommendation method and device based on content and graph neural network and storage medium
CN112766496B (en) Deep learning model safety guarantee compression method and device based on reinforcement learning
CN116049459B (en) Cross-modal mutual retrieval method, device, server and storage medium
CN110941964A (en) Bilingual corpus screening method and device and storage medium
CN111814489A (en) Spoken language semantic understanding method and system
CN115455171B (en) Text-video mutual retrieval and model training method, device, equipment and medium
CN117475038B (en) Image generation method, device, equipment and computer readable storage medium
CN113779244B (en) Document emotion classification method and device, storage medium and electronic equipment
CN112884019B (en) Image language conversion method based on fusion gate circulation network model
CN112818658A (en) Method for training classification model by text, classification method, equipment and storage medium
CN117521674B (en) Method, device, computer equipment and storage medium for generating countermeasure information
CN115269844B (en) Model processing method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19856593

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19856593

Country of ref document: EP

Kind code of ref document: A1