WO2021179117A1 - Method and apparatus for searching the number of neural network channels - Google Patents
- Publication number: WO2021179117A1 (application PCT/CN2020/078413)
- Authority: WIPO (PCT)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
Definitions
- This application relates to the field of artificial intelligence, and more specifically, to a method and device for searching the number of neural network channels.
- Artificial intelligence is a theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.
- Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence.
- Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning, and decision-making.
- A neural network (for example, a deep neural network) with good performance often has a sophisticated network structure, which requires human experts with superb skills and rich experience to spend a great deal of effort to construct.
- Neural architecture search (NAS) technology can be divided into different categories according to the search method. Differentiable search is one of the important NAS technologies. It is mainly divided into three stages: constructing a differentiable neural network search space, performing the network structure search, and decoding the search result to obtain the final network structure.
- The calculation unit can be a single operation, such as convolution or pooling, or a block operation composed of multiple basic operations.
- Differentiable search technology can search over different calculation units when searching for a network structure, but it does not support searching the number of channels of a single convolution, so it cannot meet the requirements when a network with a smaller amount of computation is desired.
- The present application provides a neural network channel number search method and device, which enable the differentiable search technology to search the number of network channels and reduce the computational complexity of the network while ensuring network performance.
- In a first aspect, a method for searching the number of neural network channels is provided, including: determining the number of output channels N of a convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer that divides N, and n ≥ 2; determining n groups of weighting coefficients, where each group of weighting coefficients includes multiple weighting coefficients in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.
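The claimed procedure can be sketched numerically. This is a minimal illustration only; the array shapes, the use of random coefficients, and all variable names are assumptions for demonstration, not the patent's implementation.

```python
import numpy as np

N, n = 16, 4                          # N output channels, split into n sub-tensors
feature = np.random.randn(N, 8, 8)    # feature tensor: (channels, height, width)
subs = np.split(feature, n, axis=0)   # n sub-feature tensors, N/n channels each

# n groups of weighting coefficients, one coefficient per sub-feature tensor
coeffs = np.random.randn(n, n)

# For each group, the sub-feature tensor with the largest coefficient is kept
kept = [int(np.argmax(group)) for group in coeffs]

# The number of mutually different kept sub-tensors determines the new channel count
k = len(set(kept))
new_channels = k * N // n             # kN/n output channels
```

Since several groups may select the same sub-feature tensor, `new_channels` can be smaller than N, which is where the channel compression comes from.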
- The calculation unit can be a single operation, such as convolution or pooling, or a block operation composed of multiple basic operations.
- Differentiable search technology can search over different calculation units when searching for a network structure, but it does not support searching the number of channels of a single convolution, so it cannot meet the requirements when a network with a smaller amount of computation is desired.
- the method for searching the number of neural network channels provided by the embodiments of the present application can realize the search of the number of neural network channels based on a differentiable search technology.
- each group of weighting coefficients includes n weighting coefficients, and the n weighting coefficients have a one-to-one correspondence with the n sub-feature tensors.
- Each group of weighting coefficients includes m weighting coefficients, and the m weighting coefficients are in one-to-one correspondence with m sub-feature tensors among the n sub-feature tensors, where m is a positive integer smaller than n.
- The embodiments of the present application provide two possible implementations for determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients: the sub-feature tensor corresponding to each maximum value can be determined from all n sub-feature tensors, or from part of the sub-feature tensors among the n sub-feature tensors.
- In some implementations, before determining the sub-feature tensor corresponding to the maximum value of each of the n groups of weighting coefficients, the method further includes: generating n candidate feature tensors according to the n groups of weighting coefficients and multiple sub-feature tensors among the n sub-feature tensors, where one group of weighting coefficients corresponds to one candidate feature tensor.
- Determining the sub-feature tensor corresponding to the maximum value of each group of weighting coefficients in the n groups, to obtain the sub-feature tensors corresponding to the n maximum values, further includes: determining the sub-feature tensor with the largest weight among the multiple sub-feature tensors used to generate each candidate feature tensor, so as to obtain the n sub-feature tensors with the largest weights.
- Each candidate feature tensor can be calculated based on one group of weighting coefficients and multiple sub-feature tensors among the n sub-feature tensors, where the multiple sub-feature tensors can be all n sub-feature tensors or part of them; the sub-feature tensor with the largest weight in each candidate feature tensor is then taken as the sub-feature tensor corresponding to the maximum value.
- Re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values includes: determining the number k of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values, where k is a positive integer less than or equal to n; the re-determined number of output channels of the convolutional layer is kN/n.
- The number of output channels of the convolutional layer is re-determined according to the number of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values.
- The re-determined number of channels is k/n of the original, which compresses the number of neural network channels and thereby reduces the computational complexity of the neural network.
- In a second aspect, an image processing method is provided, including: acquiring an image to be processed; and classifying the image to be processed according to a target neural network to obtain a classification result of the image to be processed; where determining the number of channels of the target neural network includes: determining the number of output channels N of the convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer that divides N, and n ≥ 2; determining n groups of weighting coefficients, where each group includes multiple weighting coefficients in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.
- The neural network obtained by the neural network channel number search method provided in the embodiments of the present application is used for image processing; compared with a neural network whose number of channels is not compressed, its overall computational complexity is reduced.
- each group of weighting coefficients includes n weighting coefficients, and the n weighting coefficients are in one-to-one correspondence with the n sub-feature tensors.
- Each group of weighting coefficients includes m weighting coefficients, and the m weighting coefficients are in one-to-one correspondence with m sub-feature tensors among the n sub-feature tensors, where m is a positive integer smaller than n.
- In some implementations, before determining the sub-feature tensor corresponding to the maximum value of each of the n groups of weighting coefficients, the method further includes: generating n candidate feature tensors according to the n groups of weighting coefficients and multiple sub-feature tensors among the n sub-feature tensors, where one group of weighting coefficients corresponds to one candidate feature tensor.
- Determining the sub-feature tensor corresponding to the maximum value of each group of weighting coefficients in the n groups, to obtain the sub-feature tensors corresponding to the n maximum values, further includes: determining the sub-feature tensor with the largest weight among the multiple sub-feature tensors used to generate each candidate feature tensor, so as to obtain the n sub-feature tensors with the largest weights.
- Re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values includes: determining the number k of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values, where k is a positive integer less than or equal to n; the re-determined number of output channels of the convolutional layer is kN/n.
- In a third aspect, a neural network channel number search device is provided, including: a memory for storing a program; and a processor for executing the program stored in the memory. When the program stored in the memory is executed, the processor is configured to perform the following process: determine the number of output channels N of the convolutional layer, where N is a positive integer; divide the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer that divides N, and n ≥ 2; determine n groups of weighting coefficients, where each group includes multiple weighting coefficients in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors; determine the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and re-determine the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.
- In a fourth aspect, an image processing device is provided, including: a memory for storing a program; and a processor for executing the program stored in the memory. When the program stored in the memory is executed, the processor is configured to perform the following process: obtain an image to be processed; and classify the image to be processed according to a target neural network to obtain a classification result of the image to be processed; where determining the number of channels of the target neural network includes: determining the number of output channels N of the convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer that divides N, and n ≥ 2; determining n groups of weighting coefficients, where each group includes multiple weighting coefficients in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and re-determining the number of output channels of the convolutional layer accordingly.
- In a fifth aspect, a computer-readable storage medium is provided, which stores program code for execution by a device, where the program code includes instructions for executing the method in any one of the implementations of the first aspect or the second aspect.
- In a sixth aspect, a computer program product containing instructions is provided.
- When the computer program product runs on a computer, the computer executes the method in any one of the implementations of the first aspect or the second aspect.
- In a seventh aspect, a chip is provided, including a processor and a data interface.
- The processor reads instructions stored in a memory through the data interface to execute the method in any one of the implementations of the first aspect or the second aspect.
- Optionally, the chip may further include a memory in which instructions are stored; the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor executes the method in any one of the implementations of the first aspect or the second aspect.
- FIG. 1 is a schematic structural diagram of a convolutional neural network provided by an embodiment of the present application.
- FIG. 2 is a schematic block diagram of a differentiable neural network structure search method provided by an embodiment of the present application.
- FIG. 3 is a schematic flowchart of a neural network channel number search method provided by an embodiment of the present application.
- FIG. 4 is a schematic block diagram of a neural network channel number search method provided by an embodiment of the present application.
- FIG. 5 is a schematic block diagram of another neural network channel number search method provided by an embodiment of the present application.
- FIG. 6 is a schematic diagram of a super-division neural network structure provided by an embodiment of the present application.
- FIG. 7 is a schematic flowchart of an image processing method provided by an embodiment of the present application.
- FIG. 8 is a schematic diagram of the hardware structure of a neural network channel number search device provided by an embodiment of the present application.
- FIG. 9 is a schematic diagram of the hardware structure of an image processing apparatus provided by an embodiment of the present application.
- FIG. 10 is a schematic diagram of the hardware structure of a neural network training device according to an embodiment of the present application.
- FIG. 11 is a schematic structural block diagram of a neural network channel number search device provided by an embodiment of the present application.
- FIG. 12 is a schematic structural block diagram of an image processing device provided by an embodiment of the present application.
- The neural network obtained according to the method for searching the number of neural network channels may be a convolutional neural network (CNN), a deep convolutional neural network (DCNN), a recurrent neural network (RNN), and so on. Since CNN is a very common neural network, the structure of a CNN is introduced below in conjunction with FIG. 1.
- a convolutional neural network (CNN) 100 may include an input layer 110, a convolutional layer/pooling layer 120 (the pooling layer is optional), and a neural network layer 130.
- the input layer 110 can obtain the image to be processed, and pass the obtained image to be processed to the convolutional layer/pooling layer 120 and the subsequent neural network layer 130 for processing, and the processing result of the image can be obtained.
- the following describes the internal layer structure of CNN 100 in Figure 1 in detail.
- The convolutional layer/pooling layer 120 may include layers 121 to 126 as shown in the example.
- In one implementation, layer 121 is a convolutional layer, layer 122 is a pooling layer, layer 123 is a convolutional layer, layer 124 is a pooling layer, layer 125 is a convolutional layer, and layer 126 is a pooling layer; in another implementation, layers 121 and 122 are convolutional layers, layer 123 is a pooling layer, layers 124 and 125 are convolutional layers, and layer 126 is a pooling layer.
- That is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
- the convolution layer 121 can include many convolution operators.
- the convolution operator is also called a kernel. Its role in image processing is equivalent to a filter that extracts specific information from the input image matrix.
- The convolution operator is essentially a weight matrix, which is usually predefined. During an image convolution operation, the weight matrix slides along the horizontal direction of the input image one pixel at a time (or two pixels at a time, and so on, depending on the value of the stride), thereby extracting specific features from the image.
- The size of the weight matrix is related to the size of the image. Note that the depth dimension of the weight matrix is the same as the depth dimension of the input image.
- The weight matrix extends to the entire depth of the input image. Therefore, convolution with a single weight matrix produces a convolution output with a single depth dimension. In most cases, however, a single weight matrix is not used; instead, multiple weight matrices of the same size (rows × columns), that is, multiple homogeneous matrices, are applied.
- The outputs of the weight matrices are stacked to form the depth dimension of the convolved image, where this dimension can be understood as being determined by the "multiple" mentioned above.
- Different weight matrices can be used to extract different features from the image. For example, one weight matrix is used to extract edge information of the image, another weight matrix is used to extract specific colors of the image, and another weight matrix is used to eliminate unwanted noise in the image.
- Since the multiple weight matrices have the same size (rows × columns), the convolution feature maps extracted by them also have the same size; the multiple extracted feature maps of the same size are then merged to form the output of the convolution operation.
- In practical applications, the weight values in these weight matrices need to be obtained through extensive training; each weight matrix formed by the trained weight values can extract information from the input image, enabling the convolutional neural network 100 to make correct predictions.
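The stacking described above can be sketched with a naive convolution loop. This is an illustrative sketch only (no padding, random data, assumed shapes), not the patent's implementation; each kernel spans the full input depth, and the output depth equals the number of kernels.

```python
import numpy as np

def conv2d(image, kernels, stride=1):
    """image: (C_in, H, W); kernels: (C_out, C_in, kH, kW) -> (C_out, H_out, W_out)."""
    c_out, c_in, kh, kw = kernels.shape
    _, h, w = image.shape
    h_out = (h - kh) // stride + 1
    w_out = (w - kw) // stride + 1
    out = np.zeros((c_out, h_out, w_out))
    for o in range(c_out):                     # one output channel per weight matrix
        for i in range(h_out):
            for j in range(w_out):
                patch = image[:, i*stride:i*stride+kh, j*stride:j*stride+kw]
                out[o, i, j] = np.sum(patch * kernels[o])
    return out

image = np.random.randn(3, 8, 8)        # 3-channel input image
kernels = np.random.randn(5, 3, 3, 3)   # 5 kernels, each spanning all 3 input channels
feat = conv2d(image, kernels)           # stacked outputs: depth 5
```

A single weight matrix (`kernels` of shape `(1, 3, 3, 3)`) would produce a single-depth output, matching the text above.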
- The initial convolutional layer (such as 121) often extracts more general features, which can also be called low-level features; as the depth of the convolutional neural network increases, the features extracted by subsequent convolutional layers (for example, 126) become increasingly complex, such as features with high-level semantics.
- The pooling layer may follow a convolutional layer: it can be a single convolutional layer followed by a pooling layer, or multiple convolutional layers followed by one or more pooling layers.
- the pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling the input image to obtain an image with a smaller size.
- the average pooling operator can calculate the pixel values in the image within a specific range to generate an average value as the result of the average pooling.
- the maximum pooling operator can take the pixel with the largest value within a specific range as the result of the maximum pooling.
- the operators in the pooling layer should also be related to the image size.
- the size of the image output after processing by the pooling layer can be smaller than the size of the image of the input pooling layer, and each pixel in the image output by the pooling layer represents the average value or the maximum value of the corresponding sub-region of the image input to the pooling layer.
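The two pooling operators described above can be sketched as follows; this is a minimal single-channel illustration with an assumed 2×2 window, not the patent's implementation.

```python
import numpy as np

def pool(image, size=2, mode="max"):
    """Non-overlapping pooling: each output pixel is the max or mean of one window."""
    h, w = image.shape
    out = np.zeros((h // size, w // size))
    for i in range(h // size):
        for j in range(w // size):
            window = image[i*size:(i+1)*size, j*size:(j+1)*size]
            out[i, j] = window.max() if mode == "max" else window.mean()
    return out

img = np.array([[1., 2., 5., 6.],
                [3., 4., 7., 8.],
                [0., 0., 1., 1.],
                [0., 4., 1., 1.]])
pooled_max = pool(img, mode="max")   # [[4., 8.], [4., 1.]]
pooled_avg = pool(img, mode="avg")   # [[2.5, 6.5], [1., 1.]]
```

The 4×4 input becomes a 2×2 output, and each output pixel is the maximum or average of the corresponding sub-region, as the text describes.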
- After processing by the convolutional layer/pooling layer 120, the convolutional neural network 100 is still not able to output the required output information, because, as mentioned above, the convolutional layer/pooling layer 120 only extracts features and reduces the number of parameters brought by the input image. To generate the final output information (the required class information or other related information), the convolutional neural network 100 needs to use the neural network layer 130 to generate one output or a group of outputs of the required number of classes. Therefore, the neural network layer 130 may include multiple hidden layers (131, 132 to 13n as shown in FIG. 1) and an output layer 140. The parameters contained in the multiple hidden layers may be trained based on relevant training data of a specific task type; for example, the task type can include image recognition, image classification, image super-resolution reconstruction, and so on.
- After the multiple hidden layers in the neural network layer 130 comes the final layer of the entire convolutional neural network 100, the output layer 140.
- The output layer 140 has a loss function similar to categorical cross-entropy, which is specifically used to calculate the prediction error.
- the neural network shown in Figure 1 can be obtained by the neural network structure search method.
- Differentiable search is one of the important techniques of neural network structure search. It is mainly divided into three stages: constructing a differentiable neural network search space, performing the network structure search, and decoding the search result to obtain the final network structure.
- the following is a brief introduction to the differentiable neural network structure search technology in conjunction with Figure 2.
- the first step is to construct a differentiable neural network search space.
- The candidate network calculation units 1, 2, 3, 4, and 5 are deployed in the network, and the search space is constructed by a weighted summation method.
- The weighting coefficients are obtained through the gumbel_softmax conversion function.
- The gumbel_softmax conversion function can convert the weighting coefficients into a vector whose elements lie in [0, 1] and sum to 1.
- The parameter temperature controls the output distribution: when the temperature is high, the output tends toward a uniform distribution; when the temperature is very low, the output tends toward a one-hot distribution, that is, only one element tends to 1 and the other elements tend to 0.
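The temperature behavior can be sketched with the standard gumbel_softmax formulation (softmax of logits plus Gumbel noise, divided by a temperature); this is an assumed standard form with illustrative logits, not necessarily the exact function used in the application.

```python
import numpy as np

def gumbel_softmax(logits, temperature, rng=None):
    """Softmax of (logits + optional Gumbel(0,1) noise) / temperature."""
    noise = 0.0
    if rng is not None:
        noise = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0, 1) samples
    y = (logits + noise) / temperature
    e = np.exp(y - y.max())       # numerically stable softmax
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])

# Low temperature -> near one-hot; high temperature -> near uniform.
sharp = gumbel_softmax(logits, temperature=0.1)
soft = gumbel_softmax(logits, temperature=100.0)

# With Gumbel noise the output is still a valid distribution over the candidates.
noisy = gumbel_softmax(logits, temperature=1.0, rng=np.random.default_rng(0))
```

`sharp` concentrates almost all mass on one element, while `soft` is close to uniform, matching the temperature description above.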
- a1, a2, a3, a4, and a5 are weighting coefficients, where a1 is the weighting coefficient of calculation unit 1, a2 is the weighting coefficient of calculation unit 2, and so on.
- the second step is to search the network structure.
- alternate training of the network parameters and weighting coefficients of the network calculation unit is performed, where the weighting coefficients represent the structural parameters of the network structure.
- the input data is passed into the calculation units 1 to 5 respectively, and the network parameter training of the calculation unit is performed.
- the data processed by the calculation units 1 to 5 are respectively multiplied by the weighting coefficients a1 to a5 and then added to obtain an output 1.
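The weighted-sum step above can be sketched as follows; the "calculation units" here are toy elementwise stand-ins and the coefficient values are illustrative assumptions, not the patent's operations.

```python
import numpy as np

# Five toy candidate calculation units (stand-ins for convolution, pooling, etc.)
units = [
    lambda x: x,          # identity
    lambda x: x * 2,
    lambda x: x + 1,
    lambda x: -x,
    lambda x: x ** 2,
]
a = np.array([0.5, 0.2, 0.1, 0.1, 0.1])   # weighting coefficients a1..a5, sum to 1

x = np.array([1.0, 2.0])                   # input data
# Each unit processes the input; outputs are scaled by a1..a5 and summed -> output 1
output1 = sum(ai * unit(x) for ai, unit in zip(a, units))
```

`output1` would then be passed into the next group of candidate units (1' to 5'), as the text describes.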
- The data of output 1 is passed into calculation units 1' to 5' respectively, and network parameter training of these calculation units is performed.
- the third step is to decode the search results.
- the calculation unit with the largest weighting coefficient is retained, and other calculation units are deleted, and the final network structure is obtained as the search result.
- According to the largest weighting coefficient among the weighting coefficients a1 to a5, the corresponding calculation unit is retained and the other calculation units are deleted; likewise, according to the largest weighting coefficient among the weighting coefficients b1 to b5, the corresponding calculation unit is retained and the other calculation units are deleted.
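The decoding step reduces to an argmax over each coefficient group; the coefficient values below are illustrative assumptions.

```python
import numpy as np

a = np.array([0.1, 0.05, 0.6, 0.15, 0.1])   # coefficients a1..a5 (first position)
b = np.array([0.3, 0.25, 0.1, 0.2, 0.15])   # coefficients b1..b5 (second position)

# Keep only the calculation unit with the largest coefficient at each position
kept_units = [int(np.argmax(a)) + 1, int(np.argmax(b)) + 1]   # 1-based unit indices
```

Here unit 3 survives at the first position and unit 1 at the second; all other candidate units are deleted from the final structure.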
- The calculation unit can be a single operation, such as convolution or pooling, or a block operation composed of multiple basic operations.
- Differentiable search technology can search over different calculation units when searching for a network structure, but it does not support searching the number of channels of a single convolution, so it cannot meet the requirements when a network with a smaller amount of computation is desired.
- the method for searching the number of neural network channels provided by the embodiments of the present application can realize the search of the number of neural network channels based on a differentiable search technology.
- Fig. 3 shows a schematic flow chart of the method for searching the number of neural network channels provided by the present application.
- the method shown in FIG. 3 can be executed by a neural network structure search device.
- The neural network structure search device can be a computer, a server, a cloud device, or another device with sufficient computing power to perform a neural network structure search.
- the method shown in FIG. 3 includes steps 301 to 305, which are described in detail below.
- S301 Determine the number N of output channels of the convolutional layer, where N is a positive integer.
- the number N of output channels of the convolutional layer may be the maximum number of output channels of the convolutional layer, and the maximum number of output channels of the convolutional layer may be a value set according to a specific embodiment.
- S302 Divide the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer that divides N, and n ≥ 2.
- the segmentation of the feature tensor output by the convolutional layer is the segmentation in the channel dimension.
- A picture is a three-dimensional tensor: the length and width are two of the dimensions, and the third dimension is the number of channels.
- the feature tensor is divided in the channel dimension, so that each sub-feature tensor can equally divide the number of channels of the convolutional layer.
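Step S302 is a split along the channel axis; the shapes below are illustrative assumptions.

```python
import numpy as np

N, n = 8, 4
feature = np.random.randn(N, 32, 32)     # feature tensor: (channels, height, width)
subs = np.split(feature, n, axis=0)      # n equal slices along the channel dimension
# Each sub-feature tensor carries N/n of the convolutional layer's channels
```

Because n divides N, the sub-feature tensors partition the channels evenly, as the text requires.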
- each group of weighting coefficients includes multiple weighting coefficients, and the multiple weighting coefficients are in one-to-one correspondence with multiple sub-feature tensors of the n sub-feature tensors.
- each group of weighting coefficients includes n weighting coefficients, and the n weighting coefficients have a one-to-one correspondence with the n sub-feature tensors.
- For example, four groups of weighting coefficients are determined, and each group includes four weighting coefficients a1, a2, a3, and a4. These four weighting coefficients correspond one-to-one to the four sub-feature tensors T1, T2, T3, and T4: a1 corresponds to T1, a2 to T2, a3 to T3, and a4 to T4.
- each set of weighting coefficients may be different from each other.
- each group of weighting coefficients includes m weighting coefficients, and the m weighting coefficients have a one-to-one correspondence with m sub-feature tensors in the n sub-feature tensors, where m is a positive integer less than n.
- For example, four groups of weighting coefficients are determined, and each group includes two weighting coefficients a1 and a2, which correspond one-to-one to any two of the four sub-feature tensors T1, T2, T3, and T4; for example,
- a1 corresponds to T1 and a2 corresponds to T2, or
- a1 corresponds to T1 and a2 corresponds to T3.
- the weighting coefficients of each group may be different from each other.
- The embodiments of the present application provide two possible implementations for determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients: the sub-feature tensor corresponding to each maximum value can be determined from all n sub-feature tensors, or from part of the sub-feature tensors among the n sub-feature tensors.
- S304 Determine the sub-feature tensor corresponding to the maximum value of each group of weighting coefficients in the n groups of weighting coefficients, so as to obtain the sub-feature tensor corresponding to the n maximum values.
- For example, suppose the largest weighting coefficient in the first group is a1; then the sub-feature tensor corresponding to that maximum value is T1. The largest weighting coefficient in the second group is also a1, so the corresponding sub-feature tensor is also T1. The largest weighting coefficient in the third group is a2, so the corresponding sub-feature tensor is T2. The largest weighting coefficient in the fourth group is a3, so the corresponding sub-feature tensor is T3.
- The sub-feature tensors corresponding to the four maximum values thus obtained are T1, T1, T2, and T3, respectively.
- n candidate feature tensors may be generated according to the n sets of weighting coefficients and multiple sub-feature tensors of the n sub-feature tensors, where one set of weighting coefficients corresponds to one candidate feature tensor. The sub-feature tensor with the largest weight among the sub-feature tensors used to generate each candidate feature tensor is then determined, so as to obtain the n sub-feature tensors with the largest weights.
- For ease of description, one set of weighting coefficients is taken as an example.
- a candidate feature tensor is generated based on the four weighting coefficients a1, a2, a3, and a4 in a group and the four sub-feature tensors T1, T2, T3, and T4. If a1 is the largest of the four coefficients, T1 is the sub-feature tensor with the largest weight.
- S305 Re-determine the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.
- the sub-feature tensors corresponding to the four maximum values are T1, T1, T2, and T3, respectively.
- the mutually different sub-feature tensors are T1, T2, and T3, so their number is 3. The number of output channels of the convolutional layer can therefore be re-determined as 3N/4, which compresses the number of neural network channels and thereby reduces the computational complexity of the neural network.
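The channel-search steps above (S303 to S305) can be sketched in a few lines. The tensor shape and coefficient values below are hypothetical, chosen only to mirror the T1, T1, T2, T3 example from the text:

```python
import numpy as np

def search_output_channels(T, n, weight_sets):
    """S302-S305 sketch: split feature tensor T (channels first) into n
    sub-feature tensors, find the sub-tensor with the largest weighting
    coefficient in each of the n coefficient sets, and re-determine the
    number of output channels as k*N/n."""
    N = T.shape[0]
    subs = np.split(T, n, axis=0)             # S302: n sub-tensors of N/n channels
    assert all(s.shape[0] == N // n for s in subs)
    winners = [int(np.argmax(w)) for w in weight_sets]  # S304: per-set maximum
    k = len(set(winners))                     # S305: mutually different winners
    return k * N // n

# Mirrors the text: winners T1, T1, T2, T3 -> k = 3, so 3N/4 channels
weight_sets = [[0.7, 0.1, 0.1, 0.1],   # max -> T1
               [0.6, 0.2, 0.1, 0.1],   # max -> T1
               [0.1, 0.7, 0.1, 0.1],   # max -> T2
               [0.1, 0.1, 0.7, 0.1]]   # max -> T3
print(search_output_channels(np.zeros((8, 4, 4)), 4, weight_sets))  # 6 (= 3*8/4)
```

Here N = 8 and n = 4 are illustrative values; any N divisible by n behaves the same way.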
- the overall process of the neural network channel number search method according to the embodiment of the present application will be introduced below in conjunction with FIG. 4.
- the method of constructing a search space with a searchable number of convolutional channels in the embodiment of the present application is as follows.
- the maximum number of output channels N of the convolutional layer may be a value set according to a specific embodiment.
- the maximum number of search channels is determined according to the maximum number of output channels N of the convolutional layer.
- Convolutional layer 1 outputs the feature tensor T, which is divided in the channel dimension into four sub-feature tensors, as shown in FIG. 4: T0, T1, T2, and T3. Thus, the number of channels of each sub-feature tensor is N/4.
- In the weighting-coefficient calculation, g is a randomly generated variable, τ is a set value (a temperature), π is the input, and y is the output, namely the calculated weighting coefficients.
- a corresponding set of weighting coefficients a00, a01, a02, a03 can be calculated, where a00, a01, a02, and a03 all lie in [0, 1] and together form a vector that sums to 1.
- 4 sets of weighting coefficients can be obtained.
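The variables above (random g, temperature-like τ, input π, outputs in [0, 1] that sum to 1) match the standard Gumbel-softmax form. The sketch below assumes that form, so treat it as an illustration rather than the patent's exact formula:

```python
import numpy as np

def gumbel_softmax_weights(pi, tau=1.0, rng=None):
    """One set of weighting coefficients from positive inputs pi:
    y = softmax((log(pi) + g) / tau), with g ~ Gumbel(0, 1).
    The result lies in [0, 1] and sums to 1."""
    rng = rng or np.random.default_rng(0)
    g = -np.log(-np.log(rng.uniform(size=len(pi))))  # Gumbel(0, 1) noise
    z = (np.log(pi) + g) / tau
    e = np.exp(z - z.max())                          # numerically stable softmax
    return e / e.sum()

a = gumbel_softmax_weights(np.array([0.25, 0.25, 0.25, 0.25]))
print(a.sum())   # ~1.0
```

Drawing this four times (with fresh noise g each time) would yield the four coefficient sets mentioned in the text.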
- the candidate feature tensor TC0 is a00×T0 + a01×T1 + a02×T2 + a03×T3,
- TC1 is a10×T0 + a11×T1 + a12×T2 + a13×T3,
- TC2 is a20×T0 + a21×T1 + a22×T2 + a23×T3,
- TC3 is a30×T0 + a31×T1 + a32×T2 + a33×T3.
- the 4 candidate feature tensors TC0, TC1, TC2, and TC3 are spliced into a feature tensor Tout as the output of the convolutional layer 1, and the number of channels is still N.
- the feature tensor Tout is input to the next convolutional layer 2.
- In this way, the sub-feature tensor contributing most to each candidate feature tensor can be obtained. For example, for TC0, a00 > a01 > a02 > a03, so the sub-feature tensor that contributes the most to TC0 is T0; for TC1, a10 > a11 > a12 > a13, so the sub-feature tensor that contributes the most to TC1 is T0; for TC2, a23 > a20 > a21 > a22, so the sub-feature tensor that contributes the most to TC2 is T3; for TC3, a32 > a30 > a31 > a33, so the sub-feature tensor that contributes the most to TC3 is T2.
- Since the sub-feature tensors that contribute the most are T0, T0, T3, and T2, only T0, T2, and T3 need to be retained among the four sub-feature tensors T0, T1, T2, and T3, so the number of channels only needs to be 3N/4. This achieves compression of the number of channels.
- the actual number of channels of the searched network structure in the convolutional layer 1 is only 3/4 of the original number of channels, the number of channels is reduced, and the overall computational complexity of the network is reduced.
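The FIG. 4 flow (weighted candidates, splicing into Tout, then pruning by largest contribution) can be sketched as follows. The coefficient matrix below is hypothetical, ordered only to reproduce the T0, T0, T3, T2 winners from the text:

```python
import numpy as np

def generate_candidates_and_prune(subs, A):
    """subs: list of n sub-feature tensors T0..T{n-1} (channels first);
    A: n x n matrix whose row i holds the weighting coefficients of
    candidate TC_i. Returns the spliced output tensor and the
    re-determined channel count k*N/n."""
    n = len(subs)
    # each candidate is a weighted sum of all sub-feature tensors
    candidates = [sum(A[i, j] * subs[j] for j in range(n)) for i in range(n)]
    Tout = np.concatenate(candidates, axis=0)    # spliced output, still N channels
    # the sub-tensor contributing most to TC_i is the row-wise argmax
    winners = {int(np.argmax(A[i])) for i in range(n)}
    N = sum(s.shape[0] for s in subs)
    return Tout, len(winners) * N // n

# Hypothetical example: winners T0, T0, T3, T2 -> 3 distinct, so 3N/4 channels
subs = [np.full((2, 4, 4), i, dtype=float) for i in range(4)]  # N = 8
A = np.array([[0.7, 0.1, 0.1, 0.1],
              [0.6, 0.2, 0.1, 0.1],
              [0.1, 0.1, 0.1, 0.7],
              [0.1, 0.1, 0.7, 0.1]])
Tout, channels = generate_candidates_and_prune(subs, A)
print(Tout.shape, channels)  # (8, 4, 4) 6
```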
- FIG. 5 shows another way of generating candidate feature tensors.
- the sub-feature tensor T0 is directly used as the candidate feature tensor TC0, and the sub-feature tensors T0 and T1 are weighted and summed to generate the candidate feature tensor TC1 = a10×T0 + a11×T1.
- The sub-feature tensors T0 and T2 are weighted and summed to generate the candidate feature tensor TC2 = a20×T0 + a21×T2, and the sub-feature tensors T0 and T3 are weighted and summed to generate the candidate feature tensor TC3 = a30×T0 + a31×T3.
- The sub-feature tensor that contributes the most to TC0 is obviously T0; for TC1, a10 > a11, so the sub-feature tensor that contributes the most to TC1 is T0; for TC2, a20 > a21, so the sub-feature tensor that contributes the most to TC2 is T0; for TC3, a31 > a30, so the sub-feature tensor that contributes the most to TC3 is T3.
- Since the sub-feature tensors that contribute the most are T0, T0, T0, and T3, only T0 and T3 need to be retained among the four sub-feature tensors T0, T1, T2, and T3, so the number of channels only needs to be N/2. As a result, the number of channels can be compressed.
- the combination method of generating candidate feature tensors shown in FIG. 5 is beneficial to search for a network structure with a smaller number of channels.
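The FIG. 5 combination can be sketched the same way: TC0 = T0, and each other candidate pairs T0 with one further sub-tensor, which makes repeated T0 winners (and thus a smaller channel count) more likely. The coefficient values below are illustrative only:

```python
import numpy as np

def pairwise_candidates(subs, coeffs):
    """FIG. 5 scheme sketch: TC0 = T0; TC_i = a_i0*T0 + a_ii*T_i for i >= 1,
    with coeffs[i] = (a_i0, a_ii). Returns the retained channel count."""
    n = len(subs)
    winners = {0}                            # TC0's contributor is always T0
    for i in range(1, n):
        a0, ai = coeffs[i]
        winners.add(0 if a0 > ai else i)     # larger coefficient wins
    N = sum(s.shape[0] for s in subs)
    return len(winners) * N // n

# Hypothetical coefficients: winners T0, T0, T0, T3 -> only T0 and T3 kept
subs = [np.zeros((2, 4, 4)) for _ in range(4)]        # N = 8, n = 4
coeffs = {1: (0.8, 0.2), 2: (0.9, 0.1), 3: (0.3, 0.7)}
print(pairwise_candidates(subs, coeffs))  # 4 (= N/2)
```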
- FIG. 6 is a schematic diagram of a super-resolution neural network structure found by the neural network channel number search method provided by an embodiment of the present application.
- Its performance is equivalent to that of the super-resolution neural network structure found without using the neural network channel number search method provided by the embodiments of the present application.
- the number of channels of convolutional layer 0 of the super-resolution neural network structure found by the neural network channel number search method provided by the embodiments of the present application is reduced by 50%,
- the number of channels of convolutional layer 1 is reduced by 50%,
- the number of channels of convolutional layer 2 is reduced by 0%,
- the number of channels of convolutional layer 3 is reduced by 0%,
- the number of channels of convolutional layer 4 is reduced by 50%.
- the overall computational complexity of the network is reduced by 37%.
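The overall figure can be cross-checked from the per-layer channel reductions, since a convolutional layer's cost scales with C_in × C_out. The sketch below uses hypothetical equal per-layer base costs, which gives roughly 45%; the 37% reported above reflects the actual (unequal) layer costs of the network in FIG. 6:

```python
def overall_flops_reduction(base_flops, keep_fracs):
    """base_flops[i]: cost of conv layer i before the search (illustrative);
    keep_fracs[i]: fraction of output channels kept for layer i.
    A conv layer's cost scales with C_in * C_out, so layer i's cost is
    multiplied by keep_fracs[i-1] * keep_fracs[i] (the network input of
    layer 0 is not pruned)."""
    in_scale = 1.0
    new_total = 0.0
    for f, keep in zip(base_flops, keep_fracs):
        new_total += f * in_scale * keep
        in_scale = keep                  # next layer's input channels
    return 1.0 - new_total / sum(base_flops)

# Channel reductions from the text: 50%, 50%, 0%, 0%, 50%
keep = [0.5, 0.5, 1.0, 1.0, 0.5]
print(round(overall_flops_reduction([1, 1, 1, 1, 1], keep), 2))  # 0.45
```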
- An image processed by the neural network structure found by the method provided in the embodiments of the present application is subjectively equivalent to the original image, and is better than an image processed by interpolation.
- After processing with the neural network structure found by the method provided in the embodiments of the present application, the image details are clearer than in the original image.
- FIG. 7 is a schematic flowchart of an image processing method according to an embodiment of the present application. It should be understood that the above definitions, explanations, and extensions of the relevant content of the method shown in FIG. 3 are also applicable to the method shown in FIG. 7, and repeated descriptions are appropriately omitted when introducing the method shown in FIG. 7.
- the method shown in FIG. 7 can be applied to terminal equipment, including:
- S701 Acquire an image to be processed.
- S702 Classify the image to be processed according to the target neural network to obtain a classification result of the image to be processed.
- The determination of the number of channels of the target neural network includes: determining the number N of output channels of a convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer by which N is divisible, and n ≥ 2; determining n sets of weighting coefficients, where each set includes multiple weighting coefficients that correspond one-to-one to multiple sub-feature tensors of the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n sets of weighting coefficients, so as to obtain the sub-feature tensors corresponding to the n maximum values; and re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.
- FIG. 8 is a schematic diagram of the hardware structure of a neural network channel number search device provided by an embodiment of the present application.
- the neural network channel number search device 800 shown in FIG. 8 (the device 800 may specifically be a computer device) includes a memory 801, a processor 802, a communication interface 803, and a bus 804. Among them, the memory 801, the processor 802, and the communication interface 803 realize the communication connection between each other through the bus 804.
- the memory 801 may be a read only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
- the memory 801 may store a program. When the program stored in the memory 801 is executed by the processor 802, the processor 802 is configured to execute each step of the method for searching the number of neural network channels in the embodiment of the present application.
- the processor 802 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is configured to execute related programs to implement the neural network channel number search method in the method embodiments of the present application.
- the processor 802 may also be an integrated circuit chip with signal processing capability.
- each step of the method for searching the number of neural network channels of the present application can be completed by an integrated logic circuit of hardware in the processor 802 or instructions in the form of software.
- the above-mentioned processor 802 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
- the software module can be located in a storage medium mature in the field, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
- the storage medium is located in the memory 801; the processor 802 reads the information in the memory 801 and, in combination with its hardware, completes the functions required by the units included in the neural network channel number search device, or executes the neural network channel number search method of the method embodiments of the present application.
- the communication interface 803 uses a transceiving device such as but not limited to a transceiver to implement communication between the device 800 and other devices or a communication network. For example, the information of the target neural network to be determined and the training data needed in the process of determining the target neural network can be obtained through the communication interface 803.
- the bus 804 may include a path for transferring information between various components of the device 800 (for example, the memory 801, the processor 802, and the communication interface 803).
- FIG. 9 is a schematic diagram of the hardware structure of an image processing apparatus according to an embodiment of the present application.
- the image processing apparatus 900 shown in FIG. 9 includes a memory 901, a processor 902, a communication interface 903, and a bus 904.
- the memory 901, the processor 902, and the communication interface 903 implement communication connections between each other through the bus 904.
- the memory 901 may be ROM, static storage device and RAM.
- the memory 901 may store a program. When the program stored in the memory 901 is executed by the processor 902, the processor 902 and the communication interface 903 are used to execute each step of the image processing method of the embodiment of the present application.
- the processor 902 may adopt a general-purpose CPU, a microprocessor, an ASIC, a GPU or one or more integrated circuits to execute related programs to realize the functions required by the units in the image processing apparatus of the embodiments of the present application. Or execute the image processing method in the method embodiment of this application.
- the processor 902 may also be an integrated circuit chip with signal processing capability.
- each step of the image processing method of the embodiment of the present application can be completed by an integrated logic circuit of hardware in the processor 902 or instructions in the form of software.
- the aforementioned processor 902 may also be a general-purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component.
- the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
- the software module can be located in a storage medium mature in the field, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
- the storage medium is located in the memory 901; the processor 902 reads the information in the memory 901 and, in combination with its hardware, completes the functions required by the units included in the image processing apparatus of the embodiments of the present application, or performs the image processing method of the method embodiments of the present application.
- the communication interface 903 uses a transceiving device such as but not limited to a transceiver to implement communication between the device 900 and other devices or a communication network.
- the image to be processed can be acquired through the communication interface 903.
- the bus 904 may include a path for transferring information between various components of the device 900 (for example, the memory 901, the processor 902, and the communication interface 903).
- FIG. 10 is a schematic diagram of the hardware structure of a neural network training device according to an embodiment of the present application. Similar to the aforementioned device 800 and device 900, the neural network training device 1000 shown in FIG. 10 includes a memory 1001, a processor 1002, a communication interface 1003, and a bus 1004. Among them, the memory 1001, the processor 1002, and the communication interface 1003 implement communication connections between each other through the bus 1004.
- After a neural network has been found by searching with the neural network channel number search device shown in FIG. 8, the neural network can be trained by the neural network training device 1000 shown in FIG. 10, and the trained neural network can be used to execute the image processing method of the embodiments of the present application.
- the device shown in FIG. 10 can obtain training data and the neural network to be trained from the outside through the communication interface 1003, and then the processor trains the neural network to be trained according to the training data.
- Although only a memory, a processor, and a communication interface are shown for each of the device 800, the device 900, and the device 1000, in a specific implementation process those skilled in the art should understand that the device 800, the device 900, and the device 1000 may also include other components necessary for normal operation. At the same time, according to specific needs, those skilled in the art should understand that the device 800, the device 900, and the device 1000 may also include hardware components that implement other additional functions. In addition, those skilled in the art should understand that the device 800, the device 900, and the device 1000 may also include only the components necessary to implement the embodiments of the present application, and need not include all the components shown in FIGS. 8, 9, and 10.
- FIG. 11 is a schematic structural block diagram of a neural network channel number search device provided by an embodiment of the present application, where the neural network channel number search device 1100 includes:
- the first determining unit 1101 is configured to determine the number N of output channels of the convolutional layer, where N is a positive integer;
- the dividing unit 1102 is configured to divide the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer by which N is divisible, and n ≥ 2;
- the second determining unit 1103 is configured to determine n sets of weighting coefficients, where each set of weighting coefficients includes multiple weighting coefficients, and the multiple weighting coefficients have a one-to-one correspondence with multiple sub-feature tensors of the n sub-feature tensors;
- the third determining unit 1104 is configured to determine the sub-feature tensor corresponding to the maximum value in each group of weighting coefficients in the n groups of weighting coefficients, so as to obtain the sub-feature tensor corresponding to the n maximum values;
- the update unit 1105 is configured to update the number of output channels of the convolutional layer according to the sub-feature tensor corresponding to the n maximum values.
- each group of weighting coefficients includes n weighting coefficients, and the n weighting coefficients are in one-to-one correspondence with the n sub-feature tensors.
- each set of weighting coefficients includes m weighting coefficients, and the m weighting coefficients have a one-to-one correspondence with m sub-feature tensors among the n sub-feature tensors, where m is a positive integer less than n.
- the third determining unit 1104 is further configured to generate n candidate feature tensors according to the n sets of weighting coefficients and multiple sub-feature tensors of the n sub-feature tensors, where one set of weighting coefficients corresponds to one candidate feature tensor.
- the third determining unit 1104 is further configured to determine the sub-feature tensor with the largest weight among the multiple sub-feature tensors used to generate each of the candidate feature tensors, so as to obtain the n sub-feature tensors with the largest weights.
- the update unit 1105 is specifically configured to determine the number k of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values, where k is a positive integer less than or equal to n, and the updated number of output channels of the convolutional layer is kN/n.
- FIG. 12 is a schematic structural block diagram of an image processing device according to an embodiment of the present application, where the image processing device 1200 includes:
- the acquiring unit 1201 is configured to acquire the image to be processed
- the classification unit 1202 is configured to classify the image to be processed according to a target neural network to obtain a classification result of the image to be processed, wherein the number of channels of the target neural network is determined by the device 1100.
- the disclosed system, device, and method can be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- If the functions are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium.
- The technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present application.
- The aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or other media that can store program code.
Abstract
A neural network channel number search method and apparatus are provided, which can implement a differentiable search technique, solve the problem of searching the number of network channels, and reduce the computational complexity of a network while guaranteeing its performance. The method comprises: determining the number N of output channels of a convolutional layer, N being a positive integer (S301); dividing a feature tensor output by the convolutional layer into n sub-feature tensors, the number of channels of each sub-feature tensor being N/n, n being an integer by which N is divisible, and n being greater than or equal to 2 (S302); determining n groups of weighting coefficients, each group comprising multiple weighting coefficients having a one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors (S303); determining the sub-feature tensor corresponding to the maximum value in each group of the n groups of weighting coefficients, so as to obtain the sub-feature tensors corresponding to n maximum values (S304); and re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values (S305).
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202080091992.7A CN114902240A (zh) | 2020-03-09 | 2020-03-09 | 神经网络通道数搜索方法和装置 |
PCT/CN2020/078413 WO2021179117A1 (fr) | 2020-03-09 | 2020-03-09 | Procédé et appareil de recherche de nombre de canaux de réseau de neurones artificiels |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/078413 WO2021179117A1 (fr) | 2020-03-09 | 2020-03-09 | Procédé et appareil de recherche de nombre de canaux de réseau de neurones artificiels |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021179117A1 true WO2021179117A1 (fr) | 2021-09-16 |
Family
ID=77671220
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/078413 WO2021179117A1 (fr) | 2020-03-09 | 2020-03-09 | Procédé et appareil de recherche de nombre de canaux de réseau de neurones artificiels |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114902240A (fr) |
WO (1) | WO2021179117A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117634711A (zh) * | 2024-01-25 | 2024-03-01 | 北京壁仞科技开发有限公司 | 张量维度切分方法、系统、设备和介质 |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105631466A (zh) * | 2015-12-21 | 2016-06-01 | 中国科学院深圳先进技术研究院 | 图像分类的方法及装置 |
CN108596274A (zh) * | 2018-05-09 | 2018-09-28 | 国网浙江省电力有限公司 | 基于卷积神经网络的图像分类方法 |
US20190026600A1 (en) * | 2017-07-19 | 2019-01-24 | XNOR.ai, Inc. | Lookup-based convolutional neural network |
CN109635842A (zh) * | 2018-11-14 | 2019-04-16 | 平安科技(深圳)有限公司 | 一种图像分类方法、装置及计算机可读存储介质 |
CN110197258A (zh) * | 2019-05-29 | 2019-09-03 | 北京市商汤科技开发有限公司 | 神经网络搜索方法、图像处理方法及装置、设备和介质 |
CN110533068A (zh) * | 2019-07-22 | 2019-12-03 | 杭州电子科技大学 | 一种基于分类卷积神经网络的图像对象识别方法 |
- 2020-03-09 WO PCT/CN2020/078413 patent/WO2021179117A1/fr active Application Filing
- 2020-03-09 CN CN202080091992.7A patent/CN114902240A/zh active Pending
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105631466A (zh) * | 2015-12-21 | 2016-06-01 | 中国科学院深圳先进技术研究院 | 图像分类的方法及装置 |
US20190026600A1 (en) * | 2017-07-19 | 2019-01-24 | XNOR.ai, Inc. | Lookup-based convolutional neural network |
CN108596274A (zh) * | 2018-05-09 | 2018-09-28 | 国网浙江省电力有限公司 | 基于卷积神经网络的图像分类方法 |
CN109635842A (zh) * | 2018-11-14 | 2019-04-16 | 平安科技(深圳)有限公司 | 一种图像分类方法、装置及计算机可读存储介质 |
CN110197258A (zh) * | 2019-05-29 | 2019-09-03 | 北京市商汤科技开发有限公司 | 神经网络搜索方法、图像处理方法及装置、设备和介质 |
CN110533068A (zh) * | 2019-07-22 | 2019-12-03 | 杭州电子科技大学 | 一种基于分类卷积神经网络的图像对象识别方法 |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117634711A (zh) * | 2024-01-25 | 2024-03-01 | 北京壁仞科技开发有限公司 | 张量维度切分方法、系统、设备和介质 |
CN117634711B (zh) * | 2024-01-25 | 2024-05-14 | 北京壁仞科技开发有限公司 | 张量维度切分方法、系统、设备和介质 |
Also Published As
Publication number | Publication date |
---|---|
CN114902240A (zh) | 2022-08-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20923794; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 20923794; Country of ref document: EP; Kind code of ref document: A1 |