WO2020097795A1 - Image processing method, apparatus and device, and storage medium and program product - Google Patents

Image processing method, apparatus and device, and storage medium and program product Download PDF

Info

Publication number
WO2020097795A1
WO2020097795A1 (PCT/CN2018/115252)
Authority
WO
WIPO (PCT)
Prior art keywords
convolution
convolution kernels
image
preset
processing
Prior art date
Application number
PCT/CN2018/115252
Other languages
French (fr)
Chinese (zh)
Inventor
阙灿
白涛
Original Assignee
北京比特大陆科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京比特大陆科技有限公司
Priority to CN201880098341.3A priority Critical patent/CN112913253A/en
Priority to PCT/CN2018/115252 priority patent/WO2020097795A1/en
Publication of WO2020097795A1 publication Critical patent/WO2020097795A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications

Definitions

  • This application relates to the field of image processing, for example, to an image processing method, apparatus, device, storage medium, and program product.
  • Deep convolutional networks are considered to be the most effective machine learning algorithms at present and are widely used in image processing, for example in image detection, classification, and recognition.
  • When an image is processed by a neural network, image features can be extracted by the convolutional layers, the extracted features are compressed by a pooling layer so that the main features are retained, and the main features are then sent to the fully connected layer to compute the output value.
  • The pooling layer in the prior art processes image features in one of two ways: average pooling and maximum pooling. Average pooling computes the average value of an image region as the pooled value of that region; maximum pooling selects the maximum value of an image region as the pooled value of that region (both are sketched below).
  • However, this prior-art approach of averaging all extracted features or taking their maximum makes the differences between the extracted features insignificant, which leads to inaccurate output results of the neural network during image processing.
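  • The two prior-art pooling modes can be illustrated with the following minimal Python sketch; the function name and example values are illustrative and not taken from the application:

```python
import numpy as np

def pool(feature, mode="avg"):
    # prior-art pooling: reduce an image region to a single value
    return feature.mean() if mode == "avg" else feature.max()

feature = np.array([[1., 2., 0.],
                    [0., 3., 1.],
                    [2., 1., 0.]])
print(pool(feature, "avg"))   # 1.111... : average pooling
print(pool(feature, "max"))   # 3.0      : maximum pooling
```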
  • An embodiment of the present disclosure provides an image processing method, including: acquiring an image to be processed, and performing convolution processing on the image to be processed according to a convolutional layer in a neural network to obtain a first number of image features; performing convolution processing on the image features according to a second number of preset convolution kernels to obtain a first number of processing results; and inputting the processing results into a fully connected layer of the neural network, so that the fully connected layer determines an output result according to the processing results;
  • wherein the second number is greater than or equal to the first number.
  • An embodiment of the present disclosure also provides an image processing apparatus, including:
  • An obtaining module, configured to obtain an image to be processed and perform convolution processing on the image to be processed according to a convolution layer in a neural network to obtain a first number of image features;
  • a processing module configured to perform convolution processing on the image features according to a second number of preset convolution kernels to obtain a first number of processing results
  • a determining module configured to input the processing result into a fully connected layer of the neural network, so that the fully connected layer determines an output result according to the processing result;
  • the second quantity is greater than or equal to the first quantity.
  • An embodiment of the present disclosure also provides a computer including the above-mentioned image processing device.
  • An embodiment of the present disclosure also provides a computer-readable storage medium that stores computer-executable instructions that are configured to perform the above-described image processing method.
  • An embodiment of the present disclosure also provides a computer program product.
  • the computer program product includes a computer program stored on a computer-readable storage medium.
  • the computer program includes program instructions; when the program instructions are executed by a computer, the computer is caused to execute the image processing method described above.
  • An embodiment of the present disclosure also provides an electronic device, including:
  • at least one processor; and
  • a memory communicatively connected to the at least one processor; wherein,
  • the memory stores instructions executable by the at least one processor, and when the instructions are executed by the at least one processor, the at least one processor is caused to execute the above-mentioned image processing method.
  • The present disclosure provides an image processing method, apparatus, device, storage medium, and program product, including: acquiring an image to be processed, and performing convolution processing on the image to be processed according to a convolutional layer in a neural network to obtain a first number of image features; performing convolution processing on the image features according to a second number of preset convolution kernels to obtain a first number of processing results; and inputting the processing results into the fully connected layer of the neural network, so that the fully connected layer determines the output result according to the processing results; wherein the second number is greater than or equal to the first number.
  • In the method, apparatus, device, storage medium, and program product provided in this embodiment, after the image features are extracted through the convolutional layer, the pooling layer of the prior art is not used to process them; instead, preset convolution kernels are set, the preset convolution kernels perform convolution processing on the image features to obtain processing results that include more feature information, and the fully connected layer then fuses the processing results to determine an output result with higher accuracy (a minimal sketch follows).
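  • A minimal PyTorch-style sketch of this replacement for the pooling layer is shown below. The class and variable names are hypothetical; the 3 × 3 feature size and 512 features follow the examples in the text, while the batch size and the 1000-class fully connected layer are placeholder assumptions:

```python
import torch
import torch.nn as nn

class PresetKernelHead(nn.Module):
    """One trainable preset kernel per image feature, with the same spatial
    size as the feature and no bias; each feature is reduced to one value."""
    def __init__(self, first_number, feat_h, feat_w):
        super().__init__()
        # weights start random and are learned during training
        self.weight = nn.Parameter(torch.randn(first_number, feat_h, feat_w))

    def forward(self, feats):                         # (batch, first_number, H, W)
        return (feats * self.weight).sum(dim=(2, 3))  # (batch, first_number)

# usage: 512 image features of size 3x3 produced by the convolutional layers
head = PresetKernelHead(first_number=512, feat_h=3, feat_w=3)
fc = nn.Linear(512, 1000)                 # fully connected layer (1000 classes assumed)
feats = torch.randn(8, 512, 3, 3)         # stand-in for the conv-layer output
out = fc(head(feats))                     # (8, 1000)
print(out.shape)
```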
  • FIG. 1 is a structural diagram of a prior-art neural network used to process pictures, according to an exemplary embodiment;
  • FIG. 2 is a flowchart of an image processing method according to an exemplary embodiment of the present invention.
  • FIG. 3 is a flowchart of an image processing method according to another exemplary embodiment of the present invention.
  • FIG. 4 is a structural diagram of an image processing apparatus according to an exemplary embodiment of the present invention.
  • FIG. 5 is a structural diagram of an image processing apparatus shown in another exemplary embodiment of the present invention.
  • Fig. 6 is a structural diagram of an electronic device according to an exemplary embodiment of the present invention.
  • FIG. 1 is a structural diagram of a prior-art neural network used to process pictures, according to an exemplary embodiment.
  • As shown in FIG. 1, taking ResNet in the prior art as an example, the picture first passes through 34 convolutional layers that extract image features; these features are averaged by the average pooling layer, and the fully connected layer then outputs the fused features. For example, if an output feature includes 9 feature values, the average pooling layer averages these nine values and inputs the average to the fully connected layer. Assuming a total of 512 features are included, the average pooling layer outputs 512 averages to the fully connected layer. This averages away the picture features and reduces the accuracy with which the fully connected layer recognizes pictures based on those features (see the shape sketch below).
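  • For the 512-feature example above, the prior-art flow can be pictured with the following NumPy sketch (random stand-in data; the 3 × 3 feature size and 1000-class fully connected layer are assumptions):

```python
import numpy as np

features = np.random.rand(512, 3, 3)     # 512 image features from the 34 conv layers

pooled = features.mean(axis=(1, 2))      # average pooling: one average per feature
print(pooled.shape)                      # (512,) -- 512 averages fed to the FC layer

w_fc = np.random.rand(1000, 512)         # hypothetical fully connected weights
print((w_fc @ pooled).shape)             # (1000,) -- fused output of the FC layer
```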
  • In the solution disclosed in this embodiment, the conventional average pooling layer or maximum pooling layer is not set; instead, the extracted image features are processed with convolution kernels, which avoids the drop in final recognition accuracy caused by averaging the picture features or keeping only their maximum values.
  • Fig. 2 is a flowchart of an image processing method according to an exemplary embodiment of the present invention.
  • the image processing method provided in this embodiment includes:
  • Step 101 Obtain an image to be processed, and perform convolution processing on the image to be processed according to the convolutional layer in the neural network to obtain a first number of image features.
  • the method provided in this embodiment may be executed by an electronic device with a computing function, for example, a computer, a mobile phone, a tablet computer, or the like.
  • the method provided in this embodiment may be packaged in software, and then the method provided in this embodiment may be executed through the software.
  • the method provided in this embodiment may also be set in a background server, where the background server is used to process images, and the processing result may be output through front-end software.
  • the electronic device that executes the method provided in this embodiment can acquire the image to be processed, and the image can be in a picture format (such as jpg, png, tif, gif, etc.) or a video format (such as RMVB, AVI, WMV, MPG, etc.). If the image acquired by the electronic device is in a picture format, the picture can be directly processed. If the image acquired by the electronic device is in a video format, several pictures can be extracted from the video in units of frames, and then the pictures are processed.
  • In practical applications, if the electronic device is a background server used for image processing, the server can obtain the image to be processed from a preset database, or the user can input the image in the front-end terminal used for image processing, and the front-end terminal then sends the image to the server so that the server obtains the image to be processed.
  • the above-mentioned preset database may be set in the background server or in other devices, for example, it may be a removable storage device, or it may be a cloud database.
  • The electronic device used for image processing may also be a user terminal: the user uploads an image in the terminal or specifies an image in the terminal, and the terminal, which is provided with the method of this embodiment, acquires the uploaded or specified image according to the user's operation and processes it.
  • the method provided in this embodiment processes the image based on the neural network.
  • the image to be processed may be convoluted based on the convolution layer set in the neural network to obtain the first number of image features.
  • Further, the neural network may include one convolutional layer or multiple convolutional layers. These convolutional layers are used to extract image features and output a first number of image features, where the value of the first number is related to the number of convolution kernels set in the convolutional layer. Assuming the image to be processed is a color image, it has three channels, R, G, and B, when it is input to the convolutional layer. Assuming 5 convolution kernels are set in the convolutional layer, each of the 5 kernels convolves the R, G, and B channel images separately; applying a kernel to one channel yields one output, and superimposing the outputs that the kernel produces for the R, G, and B channels gives the image feature extracted by that kernel, so that 5 image features of the image to be processed are output (see the sketch below).
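  • The per-channel convolution and channel-wise superposition described above can be sketched as follows; the image size, kernel values, and helper name are illustrative only:

```python
import numpy as np

def conv2d_valid(channel, kernel):
    # valid (no-padding) convolution of one channel with one 2-D kernel, stride 1
    kh, kw = kernel.shape
    h, w = channel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(channel[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(3, 8, 8)           # R, G, B channels of the image to be processed
kernels = np.random.rand(5, 3, 3, 3)      # 5 convolution kernels, one 3x3 slice per channel

# each kernel convolves R, G and B separately, the three outputs are superimposed,
# giving one image feature per kernel -> 5 image features in total
features = np.stack([
    sum(conv2d_valid(image[c], kernels[k, c]) for c in range(3))
    for k in range(5)
])
print(features.shape)                     # (5, 6, 6)
```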
  • Step 102 Perform convolution processing on image features according to a second number of preset convolution kernels to obtain a first number of processing results.
  • a second number of preset convolution kernels may be set for processing image features output by the convolution layer.
  • the second quantity is greater than or equal to the first quantity.
  • the specific second quantity may be an integer multiple of the first quantity. For example, if the first quantity is 512, the second quantity may also be 512, or may be a value such as 512 ⁇ 512.
  • the preset number of convolution kernels can be set according to requirements, which is not limited in this embodiment.
  • Specifically, before the image to be processed is handled with the method provided in this embodiment, the preset convolution kernels may also be trained to determine the weight values inside these convolution kernels.
  • Further, the convolutional layer, the preset convolution kernels, and the fully connected layer can be built in advance, and training data is input to the convolutional layer; the convolutional layer processes the training data and extracts data features, the data features are input to the preset convolution kernels, which perform convolution processing on them, and the processing results are output to the fully connected layer. The fully connected layer calculates a data result and compares it with the known result in the training data, and the weight values in the convolutional layer and in the preset convolution kernels are then adjusted according to the comparison.
  • In practical applications, the weight values in the preset convolution kernels are initially random and are adjusted through training; when the proportion of data results output by the fully connected layer that agree with the known results exceeds an allowed value, training of the preset convolution kernels can be stopped. Training rules can also be added during training to adjust the weight values so that the preset convolution kernels differ from one another (a training sketch follows).
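  • A minimal training sketch in the same PyTorch style is shown below. Here `model` is assumed to chain the convolutional layers, the preset kernels, and the fully connected layer, `loader` yields (image, label) batches, all layers are optimized jointly, and the SGD optimizer, learning rate, accuracy threshold, and epoch cap are arbitrary placeholder choices rather than values prescribed by the application:

```python
import torch
import torch.nn as nn

def train_preset_kernels(model, loader, target_accuracy=0.9, max_epochs=100):
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(max_epochs):
        correct, total = 0, 0
        for images, labels in loader:
            opt.zero_grad()
            logits = model(images)
            loss = loss_fn(logits, labels)          # compare with the known labels
            loss.backward()                         # adjust conv-layer and preset-kernel weights
            opt.step()
            correct += (logits.argmax(dim=1) == labels).sum().item()
            total += labels.numel()
        if correct / total >= target_accuracy:      # stop once the agreement ratio
            break                                   # exceeds the allowed value
    return model
```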
  • the first image feature of the image to be processed can be convoluted using the preset convolution kernel after training, so as to output an accurate processing result.
  • a convolution kernel corresponding to the image feature can be used to perform convolution processing on the image feature.
  • Specifically, the convolution kernel corresponding to an image feature can be used to perform convolution processing on that image feature; the convolution itself is the same as a conventional convolution. For example, if the image feature is 3 × 3 and the convolution kernel is 3 × 3, the feature value and the kernel weight at each corresponding position are multiplied, and the products are superimposed to obtain the processing result.
  • In one example (the specific 3 × 3 feature matrix and 3 × 3 kernel matrix are given as figures in the original application), the resulting processing result is 4.
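  • A worked instance of this position-wise multiply-and-sum is shown below; since the application's own example matrices appear only as figures, the feature and kernel values here are hypothetical ones chosen to reproduce the stated result of 4:

```python
import numpy as np

feature = np.array([[1., 1., 1.],
                    [1., 1., 1.],
                    [1., 1., 1.]])        # hypothetical 3x3 image feature
kernel = np.array([[1., 0., 1.],
                   [0., 0., 0.],
                   [1., 0., 1.]])         # hypothetical 3x3 preset convolution kernel

print(np.sum(feature * kernel))           # 4.0: multiply position-wise, then superimpose
```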
  • Further, if the size of the preset convolution kernel is the same as the size of the image feature (that is, the two have the same dimensions), no offset parameter needs to be set; otherwise, an offset parameter needs to be set.
  • If the preset convolution kernel and the image feature are the same size, for example both 3 × 3, the preset convolution kernel can complete the convolution of the image feature without moving. If they differ, for example a 3 × 3 preset convolution kernel and a 3 × 4 image feature, the preset convolution kernel must move one step along the row direction and perform two convolution operations to complete the convolution of the image feature.
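  • The effect of matching or mismatching sizes on how many positions the preset kernel must visit can be checked with a small helper (an illustrative sketch, assuming a valid convolution with stride 1):

```python
def valid_positions(feat_shape, kernel_shape):
    # number of positions a kernel visits in a valid (no-padding) convolution, stride 1
    return (feat_shape[0] - kernel_shape[0] + 1) * (feat_shape[1] - kernel_shape[1] + 1)

print(valid_positions((3, 3), (3, 3)))   # 1: same size, the kernel does not move
print(valid_positions((3, 4), (3, 3)))   # 2: one extra step along the row direction
```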
  • In practical applications, the size of an image feature is related to the size of the convolution kernels in the convolutional layer. Therefore, the size of the image feature can first be determined according to the size of the convolution kernels set in the convolutional layer, the size of the preset convolution kernels can then be determined according to the size of the image feature, and the preset convolution kernels can be trained.
  • If the first number is the same as the second number, a convolution operation can be performed directly on the image features according to the preset convolution kernels to obtain the first number of processing results.
  • If the second number is greater than the first number, the convolution results can be further processed to obtain the processing results.
  • For example, if the first number is 512 and a total of 512 × 512 preset convolution kernels are included, then 512 preset convolution kernels correspond to each image feature; each image feature is convolved with its corresponding preset convolution kernels to obtain convolution results, the convolution results are superimposed to obtain the processing result corresponding to that image feature, and 512 processing results are finally obtained.
  • In this way, all the values included in each image feature are taken into account. Compared with directly averaging the values in an image feature or directly taking the maximum value of a feature as the processing result, the resulting processing result loses less feature information, so that the data input to the fully connected layer includes more detailed image features and the output result determined by the fully connected layer from that data is more accurate.
  • In the prior art, the number of inputs to the fully connected layer is the same as the number of features output by the convolutional layer. In the method provided in this embodiment, processing the image features with the preset convolution kernels yields a first number of processing results, so these processing results can be input directly to the fully connected layer without changing the structure of the prior-art fully connected layer.
  • Step 103 Input the processing result into the fully connected layer of the neural network, so that the fully connected layer determines the output result according to the processing result.
  • the processing result obtained by performing convolution processing on the image features by the preset convolution kernel can be input to the fully connected layer of the neural network, and then the fully connected layer determines the final output result. Because the processing result of the input fully connected layer includes richer image feature information, the accuracy of the output result determined by the fully connected layer is higher, and the result of image processing is more accurate.
  • The image features of the image to be processed can be extracted through the convolutional layer, and these features are independent of one another.
  • The preset convolution kernels can reduce and compress the extracted image features, for example reducing a 3 × 3 image feature to a single value; the results after dimensionality reduction are also independent of one another. The fully connected layer therefore needs to combine all the results to determine the output result for the image.
  • the output result may be the result of image recognition, classification, and detection.
  • the principle of the fully connected layer in the prior art may be used.
  • the method provided in this embodiment is used to process an image.
  • The method is performed by an apparatus configured with the method provided in this embodiment, and the apparatus is usually implemented in hardware and/or software.
  • An embodiment of the present disclosure provides an image processing method, including: acquiring an image to be processed, and performing convolution processing on the image to be processed according to a convolutional layer in a neural network to obtain a first number of image features; performing convolution processing on the image features according to a second number of preset convolution kernels to obtain a first number of processing results; and inputting the processing results into the fully connected layer of the neural network, so that the fully connected layer determines the output result according to the processing results; wherein the second number is greater than or equal to the first number.
  • In the method provided in this embodiment, after the image features are extracted, they are not processed using the pooling layer of the prior art; instead, preset convolution kernels are set, the preset convolution kernels perform convolution processing on the image features to obtain processing results that include more feature information, and the processing results are fused by the fully connected layer to determine an output result with higher accuracy.
  • Fig. 3 is a flowchart of an image processing method according to another exemplary embodiment of the present invention.
  • the image processing method provided in this embodiment includes:
  • Step 201 Obtain an image to be processed, and perform convolution processing on the image to be processed according to the convolutional layer in the neural network to obtain a first number of image features.
  • step 201 and step 101 are similar, and will not be repeated here.
  • After the image features of the image to be processed are extracted, the image features can be processed based on the preset convolution kernels.
  • If the second number is equal to the first number, step 2021 is performed; if the first number is N and the preset convolution kernels form N groups, each group including N preset convolution kernels, step 2022 is performed; if the first number is N and a total of M groups of preset convolution kernels are set, each group including N preset convolution kernels, step 2024 is performed.
  • the size of the preset convolution kernel is the same as the size of the image feature.
  • Each preset convolution kernel has a corresponding image feature, and the preset convolution kernel has the same size as its corresponding image feature, meaning the same data dimensions: for example, if the image feature is m × n, that is, it includes m rows and n columns of feature values, the corresponding preset convolution kernel is also m × n and includes m rows and n columns of weight values.
  • The offset parameter of a convolution kernel is the step size used when the convolution kernel performs the convolution calculation on the image features. For example, if the offset parameter is 1, then when the convolution kernel processes the image features, the receptive field of the second convolution calculation is moved 1 pixel to the right or down relative to the first calculation.
  • Step 2021 Perform convolution calculation according to each preset convolution kernel and corresponding image features to obtain a first number of processing results.
  • each image feature is calculated by the convolutional layer on the entire image to be processed. Therefore, each image feature represents a type of feature of the entire image. You can set different types of preset convolution kernels, and use the corresponding preset convolution kernels and image features to perform convolution calculations.
  • each preset convolution kernel and its corresponding image feature are convoluted to obtain N processing results.
  • Step 2022 Perform convolution calculation on each preset convolution kernel in each group of convolution kernels and corresponding image features to obtain N dimensionality reduction feature values corresponding to each group of convolution kernels.
  • N groups of preset convolution kernels may be set, and each group includes N preset convolution kernels.
  • each preset convolution kernel corresponds to one image feature, that is, N preset convolution kernels included in a group correspond to N image features one-to-one.
  • N preset convolution kernels in a group and corresponding N image features can be convoluted to obtain N dimensionality reduction feature values.
  • In step 2023, the N dimensionality reduction feature values corresponding to each group of convolution kernels are superimposed to obtain N processing results.
  • In the prior art, the number of inputs to the fully connected layer is the same as the number of features output by the convolutional layer. If all the processing results of the N groups were input directly to the fully connected layer, its calculation load would be excessive. Therefore, the N dimensionality reduction feature values corresponding to one group of convolution kernels can be superimposed to obtain one processing result, so that N groups of preset convolution kernels yield N processing results.
  • Convolving the image features with the preset convolution kernels reduces the dimensionality of the image features without losing their original meaning; that is, each dimensionality reduction feature value still represents a type of feature of the picture to be processed.
  • Adding the N dimensionality reduction feature values of one group of convolution kernels gives one processing result that represents an overall characteristic of the image to be processed, so setting N groups of preset convolution kernels makes it possible to determine N overall features of the image to be processed (see the sketch below).
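  • Steps 2022 and 2023 can be condensed into two array operations, as in the following NumPy sketch; N, the 3 × 3 feature size, and the random data are placeholders:

```python
import numpy as np

N, H, W = 512, 3, 3
features = np.random.rand(N, H, W)        # N image features from the convolutional layer
kernels = np.random.rand(N, N, H, W)      # N groups, each with N preset convolution kernels

# step 2022: each kernel reduces its corresponding feature to one value
reduced = np.einsum('gnhw,nhw->gn', kernels, features)   # (N groups, N values)

# step 2023: superimpose the N values of each group -> N processing results
results = reduced.sum(axis=1)
print(results.shape)                       # (N,), fed to the fully connected layer
```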
  • Step 2024 Perform convolution calculation on each preset convolution kernel in each group of convolution kernels and corresponding image features to obtain N dimensionality reduction feature values corresponding to each group of convolution kernels.
  • M groups of preset convolution kernels may also be set, and each group includes N preset convolution kernels.
  • the N preset convolution kernels included in each group have a one-to-one correspondence with the image features, and the N convolution kernels included in each group can be used to perform convolution calculation with the image features to obtain N dimensionality reduction feature values . If M sets of preset convolution kernels are set, a total of M ⁇ N dimension reduction feature values can be obtained.
  • Step 2025 Equally divide the N dimensionality reduction feature values corresponding to each group of convolution kernels into T sub-groups, and superimpose the dimensionality reduction feature values in each sub-group to obtain T processing results corresponding to each group of convolution kernels.
  • The product of T and M is N.
  • Specifically, the N dimensionality reduction feature values corresponding to each group of preset convolution kernels can be divided into T sub-groups.
  • Each dimensionality reduction feature value can represent a type of feature of the image to be processed; dividing the N values into T sub-groups combines several types of features within each sub-group, and the dimensionality reduction feature values in each sub-group are then superimposed to obtain T processing results.
  • For each group of preset convolution kernels, T processing results are obtained, so with M groups of preset convolution kernels, T × M processing results, that is, N processing results, are obtained.
  • Each dimensionality reduction feature value can represent a type of feature information of the image to be processed.
  • After grouping, T sub-groups of dimensionality reduction feature values are obtained, and the values in each sub-group are superimposed to obtain T pieces of comprehensive feature information about the image to be processed.
  • With M groups of preset convolution kernels, T × M pieces of comprehensive feature information are obtained (a sketch follows).
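  • Steps 2024 and 2025 follow the same pattern with M groups and T sub-groups, where T × M = N; the sketch below uses the placeholder values N = 512 and M = 8:

```python
import numpy as np

N, M, H, W = 512, 8, 3, 3
T = N // M                                 # T * M = N
features = np.random.rand(N, H, W)
kernels = np.random.rand(M, N, H, W)       # M groups of N preset convolution kernels

# step 2024: N dimensionality reduction feature values per group
reduced = np.einsum('mnhw,nhw->mn', kernels, features)   # (M, N)

# step 2025: split each group's N values into T sub-groups and superimpose each
results = reduced.reshape(M, T, N // T).sum(axis=2).reshape(-1)
print(results.shape)                       # (N,) = T * M processing results
```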
  • step 203 may be executed.
  • Step 203 Input the processing result into the fully connected layer of the neural network, so that the fully connected layer determines the output result according to the processing result.
  • step 203 and step 103 are similar, and will not be repeated here.
  • Before the preset convolution kernels are used to process the image to be processed, they need to be trained to determine the weight values inside each preset convolution kernel.
  • Before training, the weight values in the preset convolution kernels are randomly generated, and during training these weight values are adjusted. After the weight values are randomly generated, or after each one or several adjustments of the weight values, the method provided in this embodiment may further include:
  • determining the vector corresponding to each preset convolution kernel according to its weight values, and normalizing the convolution kernels according to the vectors to enhance the differences between the convolution kernels.
  • Specifically, the vector corresponding to each preset convolution kernel can be determined according to its weight values, and whether two kernels are similar is then judged from their vectors; if they are similar, the weight values in the corresponding preset convolution kernels are adjusted so that the two vectors differ to a certain degree.
  • The weight values in a preset convolution kernel can be arranged into a one-dimensional vector. For example, a 3 × 3 convolution kernel includes 9 weight values; splicing these values row by row, that is, the first row, then the second row, then the third row, yields a one-dimensional vector. A corresponding one-dimensional vector can be generated from the weight values of each preset convolution kernel.
  • The inner product of every two vectors can then be calculated, and whether the two vectors are similar can be determined according to the inner product.
  • The inner product equals the product of the magnitudes of the two vectors and the cosine of the angle between them, so the smaller the inner product, the greater the angle between the two vectors and the greater the difference between them. Whether two preset convolution kernels are similar can therefore be determined from the inner product of their corresponding vectors (see the sketch below).
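  • A sketch of this similarity check is given below. It flattens each kernel row-wise into a vector and, as a design choice not spelled out in the application, normalizes the vectors so that the inner product becomes the cosine of the angle; the threshold marking a "too similar" pair is an arbitrary assumption:

```python
import numpy as np

def similar_pairs(kernels, threshold=0.95):
    vecs = kernels.reshape(len(kernels), -1)                   # row-wise flattening, e.g. 3x3 -> 9 values
    vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)  # normalize so the inner product is a cosine
    cos = vecs @ vecs.T                                        # pairwise inner products
    n = len(vecs)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if cos[i, j] > threshold]

kernels = np.random.rand(512, 3, 3)        # placeholder preset kernels
print(similar_pairs(kernels)[:5])          # pairs whose weight values should be adjusted apart
```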
  • Fig. 4 is a structural diagram of an image processing apparatus according to an exemplary embodiment of the present invention.
  • the image processing apparatus provided in this embodiment includes:
  • the obtaining module 41 is configured to obtain an image to be processed, and perform convolution processing on the image to be processed according to a convolutional layer in a neural network to obtain a first number of image features;
  • the processing module 42 is configured to perform convolution processing on the image features according to a second number of preset convolution kernels to obtain a first number of processing results;
  • the determining module 43 is configured to input the processing result into the fully connected layer of the neural network, so that the fully connected layer determines the output result according to the processing result;
  • the second quantity is greater than or equal to the first quantity.
  • The image processing device includes: an acquisition module for acquiring an image to be processed and performing convolution processing on it according to a convolutional layer in a neural network to obtain a first number of image features; a processing module for performing convolution processing on the image features according to a second number of preset convolution kernels to obtain a first number of processing results; and a determination module for inputting the processing results into the fully connected layer of the neural network, so that the fully connected layer determines the output result according to the processing results; where the second number is greater than or equal to the first number.
  • With this device, after the image features are extracted, they are not processed using the pooling layer of the prior art; instead, preset convolution kernels are set, the preset convolution kernels perform convolution processing on the image features to obtain processing results that include more feature information, and the processing results are fused by the fully connected layer to determine an output result with higher accuracy.
  • Fig. 5 is a structural diagram of an image processing apparatus according to another exemplary embodiment of the present invention.
  • the size of the preset convolution kernel is the same as the size of the image feature.
  • If the second number is the same as the first number,
  • the processing module 42 includes:
  • the first processing unit 421 is configured to perform convolution calculation according to each of the preset convolution kernels and the corresponding image features to obtain a first number of the processing results.
  • the second number of preset convolution kernels includes N groups of preset convolution kernels, and each group includes N preset convolution kernels, where N is equal to the first number;
  • the processing module 42 includes a second processing unit 422, configured to: perform a convolution calculation with each preset convolution kernel in each group and the corresponding image feature to obtain N dimensionality reduction feature values corresponding to each group of convolution kernels; and
  • superimpose the N dimensionality reduction feature values corresponding to each group of convolution kernels to obtain N processing results.
  • the second number of preset convolution kernels includes M groups of convolution kernels, and each group includes N convolution kernels, where N is equal to the first number;
  • the processing module 42 includes a third processing unit 423, configured to: perform a convolution calculation with each preset convolution kernel in each group and the corresponding image feature to obtain N dimensionality reduction feature values corresponding to each group of convolution kernels;
  • equally divide the N dimensionality reduction feature values corresponding to each group of convolution kernels into T sub-groups, and superimpose the dimensionality reduction feature values of each sub-group to obtain T processing results corresponding to each group of convolution kernels;
  • where the product of T and M is N.
  • each of the preset convolution kernels does not include offset parameters.
  • the device provided in this embodiment further includes an adjustment module 44 for:
  • determine the vector corresponding to each preset convolution kernel according to its weight values, and normalize the convolution kernels according to the vectors to enhance the differences between the convolution kernels.
  • Optionally, the adjustment module 44 is specifically configured to: splice the weight values of each preset convolution kernel row by row into a one-dimensional vector, calculate the inner product of every two vectors, determine from the inner product whether the two vectors are similar, and, if they are, adjust the weight values in the corresponding preset convolution kernels so that the vectors differ.
  • An embodiment of the present disclosure also provides a computer including the above-mentioned image processing device.
  • An embodiment of the present disclosure also provides a computer-readable storage medium that stores computer-executable instructions that are configured to perform the above-described image processing method.
  • An embodiment of the present disclosure also provides a computer program product.
  • the computer program product includes a computer program stored on a computer-readable storage medium.
  • the computer program includes program instructions; when the program instructions are executed by a computer, the computer is caused to execute the above image processing method.
  • the aforementioned computer-readable storage medium may be a transient computer-readable storage medium or a non-transitory computer-readable storage medium.
  • Fig. 6 is a structural diagram of an electronic device according to an exemplary embodiment of the present invention.
  • An embodiment of the present disclosure also provides an electronic device, whose structure is shown in FIG. 6, the electronic device includes:
  • at least one processor 60 (one processor 60 is taken as an example in FIG. 6) and a memory 61, and may further include a communication interface 62 and a bus 63.
  • the processor 60, the communication interface 62, and the memory 61 can complete communication with each other through the bus 63.
  • the communication interface 62 may be used for information transmission.
  • the processor 60 may call logic instructions in the memory 61 to execute the image processing method of the above-mentioned embodiment.
  • logic instructions in the aforementioned memory 61 can be implemented in the form of software functional units and sold or used as an independent product, and can be stored in a computer-readable storage medium.
  • the memory 61 is a computer-readable storage medium that can be used to store software programs and computer-executable programs, such as program instructions / modules corresponding to the methods in the embodiments of the present disclosure.
  • the processor 60 executes functional applications and data processing by running software programs, instructions, and modules stored in the memory 61, that is, implementing the image processing method in the above method embodiments.
  • the memory 61 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and application programs required for at least one function; the storage data area may store data created according to the use of a terminal device and the like.
  • the memory 61 may include a high-speed random access memory, and may also include a non-volatile memory.
  • The technical solutions of the embodiments of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes one or more instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present disclosure.
  • The aforementioned storage medium may be a non-transitory storage medium, including a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium that can store program code, and it may also be a transitory storage medium.
  • Although the terms first, second, and so on may be used in this application to describe various elements, these elements should not be limited by these terms; the terms are only used to distinguish one element from another.
  • For example, the first element could be called the second element, and likewise the second element could be called the first element, as long as all occurrences of the "first element" are renamed consistently and all occurrences of the "second element" are renamed consistently.
  • the first element and the second element are both elements, but they may not be the same element.
  • The various aspects, implementations, or features of the described embodiments can be used alone or in any combination.
  • Various aspects in the described embodiments may be implemented by software, hardware, or a combination of software and hardware.
  • the described embodiments may also be embodied by a computer-readable medium that stores computer-readable code including instructions executable by at least one computing device.
  • the computer-readable medium can be associated with any data storage device capable of storing data, which can be read by a computer system.
  • Examples of computer-readable media include read-only memory, random access memory, CD-ROM, HDD, DVD, magnetic tape, and optical data storage devices.
  • the computer-readable medium may also be distributed in computer systems connected through a network, so that computer-readable codes can be stored and executed in a distributed manner.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present invention relate to an image processing method, apparatus and device, and a storage medium and a program product. The method comprises: acquiring an image to be processed, and convolving the image to be processed according to a convolution layer in a neural network to obtain a first number of image features; convolving the image features according to a second number of preset convolution kernels to obtain a first number of processing results; and inputting the processing results into a fully connected layer of the neural network so that the fully connected layer determines an output result according to the processing results; wherein the second number is greater than or equal to the first number. In the solutions provided by the embodiments of the present invention, after the image features are extracted through the convolution layer, the image features are not processed by a prior-art pooling layer; instead, preset convolution kernels are set to convolve the image features, yielding processing results that contain more feature information, and the processing results are then fused by the fully connected layer to determine a more accurate output result.

Description

图像处理方法、装置、设备、存储介质及程序产品Image processing method, device, equipment, storage medium and program product 技术领域Technical field
本申请涉及图像处理领域,例如涉及一种图像处理方法、装置、设备、存储介质及程序产品。This application relates to the field of image processing, for example, to an image processing method, device, device, storage medium, and program product.
背景技术Background technique
深度卷积网络被认为是目前最有效的机器学习算法,广泛应用于图像处理的技术领域,例如图像的检测、分类、识别等等。Deep convolutional network is considered to be the most effective machine learning algorithm at present, and it is widely used in the technical field of image processing, such as image detection, classification, recognition and so on.
目前,通过神经网络对图像进行处理时,可以通过卷积层提取图像特征,通过池化层对提取的特征进行压缩,提取其中的主要特征,再将提取的主要特征发送给全连接层,计算出输出值。现有技术中的池化层对图像特征进行处理时,采用的方式是平均池化和最大池化两种方式,平均池化是指计算图像区域的平均值作为该区域池化后的值;最大池化是指选图像区域的最大值作为该区域池化后的值。At present, when processing an image through a neural network, the image features can be extracted through the convolution layer, the extracted features are compressed through the pooling layer, the main features are extracted, and then the extracted main features are sent to the fully connected layer for calculation Output value. When the pooling layer in the prior art processes image features, two methods are used: average pooling and maximum pooling. The average pooling refers to calculating the average value of the image area as the pooled value of the area; Maximum pooling refers to selecting the maximum value of the image area as the value after pooling the area.
但是,现有技术中的这种提取所有的特征取平均值或最大值的处理方式,会导致提取的特征差异性不明显,造成神经网络在图像处理过程中输出的结果不准确的问题。However, in the prior art, the processing method of extracting all the features to take the average value or the maximum value will cause the difference in the extracted features to be insignificant, resulting in the problem of inaccurate output results of the neural network during the image processing process.
发明内容Summary of the invention
本公开实施例提供了一种图像处理方法,包括:An embodiment of the present disclosure provides an image processing method, including:
获取待处理图像,根据神经网络中的卷积层对所述待处理图像进行卷积处理,得到第一数量的图像特征;Obtain the image to be processed, and perform convolution on the image to be processed according to the convolution layer in the neural network to obtain a first number of image features;
根据第二数量的预设卷积核对所述图像特征进行卷积处理,得到第一数量的处理结果;Performing convolution processing on the image features according to a second number of preset convolution kernels to obtain a first number of processing results;
将所述处理结果输入所述神经网络的全连接层,以使所述全连接层根据所述处理结果确定输出结果;Input the processing result into the fully connected layer of the neural network, so that the fully connected layer determines the output result according to the processing result;
其中,所述第二数量大于等于所述第一数量。Wherein, the second quantity is greater than or equal to the first quantity.
本公开实施例还提供了一种图像处理装置,包括:An embodiment of the present disclosure also provides an image processing apparatus, including:
获取模块,用于获取待处理图像,根据神经网络中的卷积层对所述待处理图像进行卷积处理,得到第一数量的图像特征;An obtaining module, configured to obtain an image to be processed, and performing convolution processing on the image to be processed according to a convolution layer in a neural network to obtain a first number of image features;
处理模块,用于根据第二数量的预设卷积核对所述图像特征进行卷积处理,得到第一数量的处理结果;A processing module, configured to perform convolution processing on the image features according to a second number of preset convolution kernels to obtain a first number of processing results;
确定模块,用于将所述处理结果输入所述神经网络的全连接层,以使所述全连接层根据所述处理结果确定输出结果;A determining module, configured to input the processing result into a fully connected layer of the neural network, so that the fully connected layer determines an output result according to the processing result;
其中,所述第二数量大于等于所述第一数量。Wherein, the second quantity is greater than or equal to the first quantity.
本公开实施例还提供了一种计算机,包含上述的图像处理装置。An embodiment of the present disclosure also provides a computer including the above-mentioned image processing device.
本公开实施例还提供了一种计算机可读存储介质,存储有计算机可执行指令,所述计算机可执行指令设置为执行上述的图像处理方法。An embodiment of the present disclosure also provides a computer-readable storage medium that stores computer-executable instructions that are configured to perform the above-described image processing method.
本公开实施例还提供了一种计算机程序产品,所述计算机程序产品包括存储在计算机可读存储介质上的计算机程序,所述计算机程序包括程序指令,当所述程序指令被计算机执行时,使所述计算机执行上述的图像处理方法。An embodiment of the present disclosure also provides a computer program product. The computer program product includes a computer program stored on a computer-readable storage medium. The computer program includes program instructions. When the program instructions are executed by a computer, the The computer executes the image processing method described above.
本公开实施例还提供了一种电子设备,包括:An embodiment of the present disclosure also provides an electronic device, including:
至少一个处理器;以及At least one processor; and
与所述至少一个处理器通信连接的存储器;其中,A memory communicatively connected to the at least one processor; wherein,
所述存储器存储有可被所述至少一个处理器执行的指令,所述指令被所述至少一个处理器执行时,使所述至少一个处理器执行上述的图像处理方法。The memory stores instructions executable by the at least one processor. When the instructions are executed by the at least one processor, the at least one processor executes the above-mentioned image processing method.
本公开提供一种图像处理方法、装置、设备、存储介质及程序产品,包括:获取待处理图像,根据神经网络中的卷积层对待处理图像进行卷积处理,得到第一数量的图像特征;根据第二数量的预设卷积核对图像特征进行卷积处理,得到第一数量的处理结果;将处理结果输入神经网络的全连接层,以使全连接层根据处理结果确定输出结果;其中,第二数量大于等于第一数量。本实施例提供的方法、装置、设备、存储介质及程序产品中,在通过卷积层提取图像特征后,不采用现有技术中的池化层对图像特征进行处理,而是设置预设卷积核,由预设卷积核对图像特征进行卷积处理,得到包括更多特征 信息的处理结果,再由全连接层对处理结果进行融合,从而确定出精度更高的输出结果。The present disclosure provides an image processing method, device, equipment, storage medium, and program product, including: acquiring an image to be processed, and performing convolution processing on the image to be processed according to a convolutional layer in a neural network to obtain a first number of image features; Perform convolution processing on the image features according to the second number of preset convolution kernels to obtain the first number of processing results; input the processing results into the fully connected layer of the neural network, so that the fully connected layer determines the output result according to the processing results; wherein, The second quantity is greater than or equal to the first quantity. In the method, device, device, storage medium, and program product provided in this embodiment, after the image features are extracted through the convolution layer, the pooling layer in the prior art is not used to process the image features, but a preset volume is set Convolution kernel, which performs convolution processing on image features by a preset convolution kernel to obtain a processing result including more feature information, and then fuses the processing result by a fully connected layer to determine an output result with higher accuracy.
附图说明BRIEF DESCRIPTION
一个或多个实施例通过与之对应的附图进行示例性说明,这些示例性说明和附图并不构成对实施例的限定,附图中具有相同参考数字标号的元件示为类似的元件,附图不构成比例限制,并且其中:One or more embodiments are exemplified by the corresponding drawings. These exemplary descriptions and the drawings do not constitute a limitation on the embodiments. Elements with the same reference numerals in the drawings are shown as similar elements. The drawings do not constitute a proportional limitation, and among them:
图1为一示例性实施例示出现有技术中用于对图片进行处理的神经网络结构图;FIG. 1 is an exemplary embodiment showing a structure diagram of a neural network for processing pictures in the prior art;
图2为本发明一示例性实施例示出的图像处理方法的流程图;2 is a flowchart of an image processing method according to an exemplary embodiment of the present invention;
图3为本发明另一示例性实施例示出的图像处理方法的流程图;FIG. 3 is a flowchart of an image processing method according to another exemplary embodiment of the present invention;
图4为本发明一示例性实施例示出的图像处理装置的结构图;4 is a structural diagram of an image processing apparatus according to an exemplary embodiment of the present invention;
图5为本发明另一示例性实施例示出的图像处理装置的结构图;5 is a structural diagram of an image processing apparatus shown in another exemplary embodiment of the present invention;
图6为本发明一示例性实施例示出的电子设备的结构图。Fig. 6 is a structural diagram of an electronic device according to an exemplary embodiment of the present invention.
具体实施方式detailed description
为了能够更加详尽地了解本公开实施例的特点与技术内容,下面结合附图对本公开实施例的实现进行详细阐述,所附附图仅供参考说明之用,并非用来限定本公开实施例。在以下的技术描述中,为方便解释起见,通过多个细节以提供对所披露实施例的充分理解。然而,在没有这些细节的情况下,一个或多个实施例仍然可以实施。在其它情况下,为简化附图,熟知的结构和装置可以简化展示。In order to understand the features and technical contents of the embodiments of the present disclosure in more detail, the following describes the implementation of the embodiments of the present disclosure in detail with reference to the drawings. The accompanying drawings are for reference only and are not intended to limit the embodiments of the present disclosure. In the following technical description, for convenience of explanation, various details are provided to provide a sufficient understanding of the disclosed embodiments. However, without these details, one or more embodiments can still be implemented. In other cases, to simplify the drawings, well-known structures and devices can be simplified.
图1为一示例性实施例示出现有技术中用于对图片进行处理的神经网络结构图。FIG. 1 is an exemplary embodiment showing a structure diagram of a neural network for processing pictures in the prior art.
如图1所示,以现有技术中的ResNet为例,图片先经过34层卷积提取图像特征,这些特征经过平均池化层取平均值后,再由全连接层输出融合后的特征。例如,输出的一个特征共包括9个特征值,则平均池化层会将这九 个特征值进行平均,然后将平均值输入到全连接层。假设共包括512个特征,则平均池化层向全连接层输出512个平均值。这就导致这些图片特征被平均化,造成全连接层根据图片特征识别图片的精度降低。As shown in FIG. 1, taking ResNet in the prior art as an example, the image is first subjected to 34-layer convolution to extract image features, and these features are averaged by the average pooling layer, and then the fully connected layer outputs the fused features. For example, if a feature output includes 9 feature values, the average pooling layer will average these nine feature values, and then input the average value to the fully connected layer. Assuming that a total of 512 features are included, the average pooling layer outputs 512 averages to the fully connected layer. This causes these picture features to be averaged, causing the accuracy of the fully connected layer to recognize pictures based on the picture features.
本实施例公开的方案中,不设置常规的平均池化层或最大池化层,而是采用卷积核的方式对提取的图像特征进行处理,避免之间平均图片特征或提取图片特征中的最大值,导致最终识别图片的精度下降的问题。In the solution disclosed in this embodiment, the conventional average pooling layer or the maximum pooling layer is not set, but the extracted image features are processed in the form of convolution kernels to avoid the average picture feature or the extraction of image features. The maximum value leads to the problem that the accuracy of the final recognition picture decreases.
图2为本发明一示例性实施例示出的图像处理方法的流程图。Fig. 2 is a flowchart of an image processing method according to an exemplary embodiment of the present invention.
如图2所示,本实施例提供的图像处理方法包括:As shown in FIG. 2, the image processing method provided in this embodiment includes:
步骤101,获取待处理图像,根据神经网络中的卷积层对待处理图像进行卷积处理,得到第一数量的图像特征。Step 101: Obtain an image to be processed, and perform convolution processing on the image to be processed according to the convolutional layer in the neural network to obtain a first number of image features.
其中,可以由具备计算功能的电子设备执行本实施例提供的方法,例如是计算机、手机、平板电脑等。可以将本实施例提供的方法封装在软件中,进而通过软件执行本实施例提供的方法。Among them, the method provided in this embodiment may be executed by an electronic device with a computing function, for example, a computer, a mobile phone, a tablet computer, or the like. The method provided in this embodiment may be packaged in software, and then the method provided in this embodiment may be executed through the software.
具体的,还可以将本实施例提供的方法设置在后台服务器中,该后台服务器用于处理图像,可以通过前端的软件输出处理结果。Specifically, the method provided in this embodiment may also be set in a background server, where the background server is used to process images, and the processing result may be output through front-end software.
进一步的,执行本实施例提供的方法的电子设备可以获取待处理图像,该图像可以是图片格式(如jpg,png,tif,gif等),也可以是视频格式(如RMVB、AVI、WMV、MPG等)。若电子设备获取的图像是图片格式,则可以直接对该图片进行处理。若电子设备获取的图像是视频格式,则可以以帧为单位从视频中提取出若干张图片,再对图片进行处理。Further, the electronic device that executes the method provided in this embodiment can acquire the image to be processed, and the image can be in a picture format (such as jpg, png, tif, gif, etc.) or a video format (such as RMVB, AVI, WMV, MPG, etc.). If the image acquired by the electronic device is in a picture format, the picture can be directly processed. If the image acquired by the electronic device is in a video format, several pictures can be extracted from the video in units of frames, and then the pictures are processed.
实际应用时,若电子设备是用于图像处理的后台服务器,则该服务器可以从预设数据库中获取待处理图像,还可以由用户在用于图像处理的前端终端中输入图像,再由前端终端将图像发送给该服务器,从而使服务器获取待处理图像。上述预设数据库可以设置在后台服务器中,也可以设置在其他设备中,例如可以是可移动的存储装置,还可以是云端数据库。In practical applications, if the electronic device is a background server for image processing, the server can obtain the image to be processed from a preset database, or the user can input the image in the front-end terminal for image processing, and then the front-end terminal Send the image to the server, so that the server can obtain the image to be processed. The above-mentioned preset database may be set in the background server or in other devices, for example, it may be a removable storage device, or it may be a cloud database.
其中,用于图像处理的电子设备也可以是用户终端,用户在终端中上传图像,或者在终端中指定图像,该终端中设置有本实施例提供的方法,能够根据用户的操作获取用户上传或指定的图像,并对其进行处理。The electronic device used for image processing may also be a user terminal. The user uploads an image in the terminal or specifies an image in the terminal. The method provided in this embodiment is provided in the terminal, and the user upload or Specify the image and process it.
具体的,本实施例提供的方法基于神经网络对图像进行处理。在获取完待处理图像后,可以基于神经网络中设置的卷积层对待处理图像进行卷积处理,得到第一数量的图像特征。Specifically, the method provided in this embodiment processes the image based on the neural network. After acquiring the image to be processed, the image to be processed may be convoluted based on the convolution layer set in the neural network to obtain the first number of image features.
进一步的,神经网络的卷积层可以包括一个卷积层,也可以包括多个卷积层。可以通过这些卷积层提取图像特征,并输出第一数量的图像特征。第一数量的值与卷积层中设置的卷积核数量相关。假设待处理图像是彩色图像,则在其输入卷积层时具有R、G、B三个通道,假设卷积层中设置有5个卷积核,则通过这5个卷积核分别对R、G、B三个通道的图像进行卷积处理,使用一个卷积核对一个通道的图像进行处理时,能够输出1个图像特征,将一个卷积核对R、G、B三个通道的图像进行卷积处理输出的图像特征进行叠加,就得到该卷积核待处理图像进行卷积提取的图像特征,进而能够输出5个待处理图像的图像特征。Further, the convolutional layer of the neural network may include one convolutional layer or multiple convolutional layers. These convolutional layers can be used to extract image features and output a first number of image features. The value of the first number is related to the number of convolution kernels set in the convolutional layer. Assuming that the image to be processed is a color image, it has three channels of R, G, and B when it is input to the convolution layer. Assuming that 5 convolution kernels are set in the convolution layer, the 5 convolution kernels are used to compare R , G, and B channels of images are convolved. When a convolution kernel is used to process one channel of images, one image feature can be output, and a convolution kernel is used to perform R, G, and B channel images. The image features output by the convolution process are superimposed to obtain the image features of the convolution kernel image to be processed by convolution extraction, and then the image features of 5 images to be processed can be output.
步骤102,根据第二数量的预设卷积核对图像特征进行卷积处理,得到第一数量的处理结果。Step 102: Perform convolution processing on image features according to a second number of preset convolution kernels to obtain a first number of processing results.
实际应用时,可以设置第二数量的预设卷积核,用于对卷积层输出的图像特征进行处理。第二数量大于等于第一数量,具体第二数量可以是第一数量的整数倍,例如,若第一数量是512,则第二数量也可以是512,还可以是512×512等数值。In actual application, a second number of preset convolution kernels may be set for processing image features output by the convolution layer. The second quantity is greater than or equal to the first quantity. The specific second quantity may be an integer multiple of the first quantity. For example, if the first quantity is 512, the second quantity may also be 512, or may be a value such as 512 × 512.
其中,可以根据需求设置预设卷积核的数量,本实施例不对此进行限制。Among them, the preset number of convolution kernels can be set according to requirements, which is not limited in this embodiment.
具体的,在使用本实施例提供的方法对待处理图片进行处理前,还可以对预设卷积核进行训练,从而确定这些卷积核内部的权重值。Specifically, before the image to be processed is processed using the method provided in this embodiment, preset convolution kernels may also be trained to determine the weight values inside these convolution kernels.
进一步的,可以预先搭建卷积层、预设卷积核以及全连接层,并将训练数据输入卷积层,从而使卷积层对训练数据进行处理,提取数据特征,再将数据特征输入预设卷积核,通过预设卷积核度数据特征进行卷积处理,并将处理结果输出至全连接层,全连接层可以计算出数据结果,并与训练数据中的已知数据结果比对,再根据比对结果调整卷积层以及预设卷积核中的权重值。Further, you can build a convolutional layer, a preset convolution kernel, and a fully connected layer in advance, and input the training data into the convolutional layer, so that the convolutional layer processes the training data, extracts the data features, and then enters the data features into the Set up a convolution kernel, perform convolution processing with preset convolution kernel data features, and output the processing results to the fully connected layer. The fully connected layer can calculate the data results and compare it with the known data results in the training data , And then adjust the weight value in the convolutional layer and the preset convolution kernel according to the comparison result.
实际应用时,预设卷积核内的权重值最初为随机值,经过训练学习,能够调整其中的权重值,当全连接层输出的数据结果与已知数据结果一致的比 例大于允许值,则可以停止对预设卷积核的训练。还可以在训练时加入训练规则,用于调整卷积核内的权重值,使得各个预设卷积核之间不相同。In actual application, the weight value in the preset convolution kernel is initially a random value. After training and learning, the weight value can be adjusted. When the ratio of the data result output by the fully connected layer to the known data result is greater than the allowable value, then The training of the preset convolution kernel can be stopped. Training rules can also be added during training to adjust the weight values in the convolution kernels so that the preset convolution kernels are not the same.
When the image to be processed is processed, the trained preset convolution kernels can be used to perform convolution processing on the image features of the image to be processed, so as to output accurate processing results.
Specifically, a convolution kernel corresponding to an image feature can be used to perform convolution processing on that image feature. The convolution process itself is the same as the convolution method of the prior art. For example, if the size of the image feature is 3 × 3 and the size of the convolution kernel is 3 × 3, the image feature value at each position is multiplied by the weight value at the corresponding position in the kernel, and the products are summed to obtain the processing result. Suppose an image feature is
Figure PCTCN2018115252-appb-000001
and the corresponding convolution kernel is
Figure PCTCN2018115252-appb-000002
The resulting processing result is 4.
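The multiply-and-sum step of this example can be sketched as follows. The matrices shown in the figures are not reproduced in the text, so the values below are hypothetical, and the printed result is theirs rather than the 4 of the example above.

```python
# Hypothetical values; only the multiply-corresponding-positions-and-sum
# step described in the paragraph above is illustrated here.
import numpy as np

image_feature = np.array([[1, 2, 0],
                          [0, 1, 2],
                          [2, 0, 1]])
preset_kernel = np.array([[1, 0, 1],
                          [0, 1, 0],
                          [1, 0, 1]])

# Because kernel and feature have the same 3x3 size, a single multiply-and-sum
# reduces the whole feature to one processing result (no sliding needed).
processing_result = np.sum(image_feature * preset_kernel)
print(processing_result)   # 5 for these hypothetical values
```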
Further, if the size of the preset convolution kernel is the same as the size of the image feature, that is, the two have the same dimensions, there is no need to set an offset parameter; otherwise, an offset parameter needs to be set. If the preset kernel and the image feature have the same size, for example both are 3 × 3, the preset kernel can complete the convolution processing of the image feature without moving. If the two differ, for example the preset kernel is 3 × 3 and the image feature is 3 × 4, the preset kernel needs to move one step along the row direction and perform two convolution operations to complete the convolution processing of the image feature.
In practical applications, the size of an image feature is related to the size of the convolution kernels in the convolutional layer. Therefore, the size of the image feature can be determined first according to the kernel size set in the convolutional layer, the size of the preset convolution kernels can then be determined according to the size of the image feature, and the preset convolution kernels can be trained.
If the first number is the same as the second number, the convolution operation can be performed on the image features directly according to the preset convolution kernels to obtain the first number of processing results. If the second number is greater than the first number, the convolution results can be processed further to obtain the processing results. For example, if the first number is 512 and a total of 512 × 512 convolution kernels are included, then 512 preset kernels correspond to each image feature; each image feature can be convolved with its corresponding preset kernels to obtain convolution results, the convolution results are then superimposed to obtain the processing result corresponding to that image feature, and finally 512 processing results are obtained.
By processing the image features with the preset convolution kernels, the values included in each image feature can be taken into account comprehensively. Compared with the prior art, which directly averages the values in an image feature or directly takes the maximum value in a feature as the processing result, the processing result obtained here loses less of the feature values. The data input to the fully connected layer therefore includes more detailed image features, so that the output result determined by the fully connected layer from the input data is more accurate.
Specifically, the number of fully connected inputs in the prior art is the same as the number of features output by the convolutional layer. Therefore, in the method provided in this embodiment, after the image features are processed according to the preset convolution kernels, a first number of processing results is obtained, and these processing results can be input directly into the fully connected layer without changing the structure of the fully connected layer of the prior art.
Step 103: Input the processing results into the fully connected layer of the neural network, so that the fully connected layer determines an output result according to the processing results.
The processing results obtained by performing convolution processing on the image features with the preset convolution kernels can be input into the fully connected layer of the neural network, and the fully connected layer then determines the final output result. Because the processing results input to the fully connected layer include richer image feature information, the output result determined by the fully connected layer is more precise and the image is processed more accurately.
The image features of the image to be processed can be extracted through the convolutional layer, and these features are independent of one another. The preset convolution kernels can reduce the dimensionality of and compress the extracted image features, for example reducing a 3 × 3 image feature to a single value; the results after dimensionality reduction are also independent of one another. The fully connected layer therefore needs to combine all the results to determine the output result for the image. The output result may be an image recognition, classification, or detection result. The method provided in this embodiment may use the fully connected layer principle of the prior art.
The method provided in this embodiment is used to process images. The method is performed by a device configured with the method of this embodiment, and the device is usually implemented by hardware and/or software.
An embodiment of the present disclosure provides an image processing method, including: acquiring an image to be processed, and performing convolution processing on the image to be processed according to a convolutional layer in a neural network to obtain a first number of image features; performing convolution processing on the image features according to a second number of preset convolution kernels to obtain a first number of processing results; and inputting the processing results into a fully connected layer of the neural network, so that the fully connected layer determines an output result according to the processing results, where the second number is greater than or equal to the first number. In the method provided in this embodiment, after the image features are extracted through the convolutional layer, the image features are not processed by the pooling layer of the prior art; instead, preset convolution kernels are set, and the preset convolution kernels perform convolution processing on the image features to obtain processing results that include more feature information. The fully connected layer then fuses the processing results, so that an output result with higher accuracy is determined.
Fig. 3 is a flowchart of an image processing method according to another exemplary embodiment of the present invention.
As shown in Fig. 3, the image processing method provided in this embodiment includes the following steps.
Step 201: Obtain an image to be processed, and perform convolution processing on the image to be processed according to the convolutional layer in a neural network to obtain a first number of image features.
The specific principle and implementation of step 201 are similar to those of step 101 and are not repeated here.
After the image features of the image to be processed have been extracted, the image features can be processed based on the preset convolution kernels.
If the second number of preset convolution kernels that are set is the same as the first number, step 2021 is performed; if the first number is N and N groups of preset convolution kernels are set, each group including N preset convolution kernels, step 2022 is performed; if the first number is N and a total of M groups of preset convolution kernels are set, each group including N preset convolution kernels, step 2024 is performed.
In the method provided in this embodiment, the size of a preset convolution kernel is the same as the size of the corresponding image feature. Each preset convolution kernel has a corresponding image feature, and the kernel and its image feature have the same size, meaning that their data dimensions are the same. For example, if the image feature is m × n, that is, it contains m rows and n columns of feature values, the corresponding preset convolution kernel is also m × n and contains m rows and n columns of weight values.
When a preset convolution kernel is used to process an image feature, because the two are the same size, the image feature can be processed with a single convolution calculation, and there is no need to set an offset parameter for the preset kernel. The offset parameter is the step by which the kernel moves when convolution calculations are performed on an image feature; for example, if the offset parameter is 1, then when the kernel processes the image feature, the receptive field of the second convolution calculation is shifted one pixel to the right or downward relative to the first calculation.
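A small sketch of this movement, reusing the 3 × 3 kernel and 3 × 4 feature case mentioned earlier: with an offset parameter of 1, two convolution calculations are needed, the second with the receptive field shifted one column to the right. NumPy and the random values are assumptions for illustration only.

```python
# Sketch of the offset (step) parameter when kernel and feature sizes differ.
import numpy as np

feature_3x4 = np.random.rand(3, 4)
kernel_3x3 = np.random.rand(3, 3)

# With an offset parameter of 1, the receptive field moves one column to the
# right for the second calculation, so two convolution results are produced.
results = [np.sum(feature_3x4[:, j:j + 3] * kernel_3x3) for j in range(2)]
print(len(results))   # 2
```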
Step 2021: Perform a convolution calculation with each preset convolution kernel and its corresponding image feature to obtain a first number of processing results.
If the number of preset convolution kernels that are set is the same as the number of image features, one preset convolution kernel corresponds to one image feature. Each image feature is computed by the convolutional layer over the entire image to be processed, so each image feature represents one kind of characteristic of the whole image. Preset convolution kernels of different kinds can be set, and the corresponding preset kernel is used to perform the convolution calculation with each image feature.
Assuming that the number of preset convolution kernels is N and the number of image features is also N, convolving each preset kernel with its corresponding image feature yields N processing results.
Step 2022: Perform a convolution calculation with each preset convolution kernel in each group and its corresponding image feature to obtain N dimension-reduction feature values corresponding to each group of kernels.
In another implementation provided by this embodiment, if the first number is N, N groups of preset convolution kernels can be set, each group including N preset kernels. In each group, each preset kernel corresponds to one image feature, that is, the N preset kernels included in one group correspond one-to-one with the N image features.
In this case, the N preset kernels in one group can be convolved with the corresponding N image features to obtain N dimension-reduction feature values.
Step 2023: Superimpose the N dimension-reduction feature values corresponding to each group of kernels to obtain N processing results.
In general, the number of features output by the convolutional layer is the same as the number of inputs of the fully connected layer. If the processing results of all N groups were input directly into the fully connected layer, the computation load of the fully connected layer would be too large. Therefore, the N dimension-reduction feature values corresponding to one group of kernels can be superimposed to obtain one processing result, so that N groups of preset kernels yield N processing results.
In practical applications, convolving an image feature with a preset convolution kernel reduces the dimensionality of the image feature without losing the original meaning of the feature, that is, each dimension-reduction feature value still represents one kind of characteristic of the image to be processed. Adding up the N dimension-reduction feature values of one group of kernels yields a processing result that represents an overall characteristic of the image to be processed, and setting N groups of preset kernels therefore makes it possible to determine N overall characteristics of the image to be processed.
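A sketch of steps 2022 and 2023 for N groups of N full-size preset kernels is given below; NumPy, the value of N, and the 4 × 4 feature size are assumptions for illustration only.

```python
# Sketch of steps 2022-2023 under the assumptions stated above.
import numpy as np

N = 4
features = np.random.rand(N, 4, 4)              # N image features
kernel_groups = np.random.rand(N, N, 4, 4)      # N groups, each with N kernels

processing_results = np.empty(N)
for g in range(N):
    # Step 2022: each kernel in group g is convolved with its corresponding
    # feature, giving N dimension-reduction feature values for the group.
    reduced = [np.sum(features[i] * kernel_groups[g, i]) for i in range(N)]
    # Step 2023: superimpose the N values of the group into one processing result.
    processing_results[g] = np.sum(reduced)

print(processing_results.shape)                 # (N,) processing results for the FC layer
```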
Step 2024: Perform a convolution calculation with each preset convolution kernel in each group and its corresponding image feature to obtain N dimension-reduction feature values corresponding to each group of kernels.
In another implementation, if the number of image features is N, M groups of preset convolution kernels may also be set, each group including N preset kernels.
Similar to the implementation above, the N preset kernels included in each group correspond one-to-one with the image features, and the N kernels of each group can be convolved with the image features to obtain N dimension-reduction feature values. If M groups of preset kernels are set, a total of M × N dimension-reduction feature values can be obtained.
Step 2025: Divide the N dimension-reduction feature values corresponding to each group of kernels evenly into T subgroups, and superimpose the dimension-reduction feature values of each subgroup to obtain T processing results corresponding to each group of kernels.
Here, the product of T and M is N.
The N dimension-reduction feature values corresponding to each group of preset kernels can be divided evenly into T subgroups. Each dimension-reduction feature value can represent one kind of characteristic of the image to be processed; dividing the N values into T subgroups collects several such characteristics into each subgroup, and superimposing the values within each subgroup then yields T processing results. One group of preset kernels thus yields T processing results, and with M groups of preset kernels, T × M processing results, that is, N processing results, can be obtained.
Each dimension-reduction feature value can represent one kind of feature information of the image to be processed. After the dimension-reduction feature values are grouped, T subgroups of values are obtained, and superimposing the values of each subgroup yields T pieces of combined feature information of the image to be processed. With M groups of preset kernels, T × M pieces of combined feature information can be obtained.
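Steps 2024 and 2025 can be sketched in the same style: M groups of N kernels produce M × N dimension-reduction feature values, which are split into subgroups of equal size and superimposed, with T × M = N. The concrete values of N, M, and T, the 4 × 4 feature size, and the use of NumPy are illustrative assumptions.

```python
# Sketch of steps 2024-2025, assuming full-size preset kernels.
import numpy as np

N, M = 6, 2
T = N // M                                       # T * M = N
features = np.random.rand(N, 4, 4)               # N image features
kernel_groups = np.random.rand(M, N, 4, 4)       # M groups of N kernels

processing_results = []
for g in range(M):
    # Step 2024: N dimension-reduction feature values for group g.
    reduced = np.array([np.sum(features[i] * kernel_groups[g, i]) for i in range(N)])
    # Step 2025: split the N values evenly into T subgroups and superimpose
    # each subgroup, giving T processing results for this group of kernels.
    processing_results.extend(chunk.sum() for chunk in np.split(reduced, T))

print(len(processing_results))                   # T * M == N processing results
```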
After step 2021, 2023, or 2025 has been executed, step 203 can be executed.
Step 203: Input the processing results into the fully connected layer of the neural network, so that the fully connected layer determines an output result according to the processing results.
The specific principle and implementation of step 203 are similar to those of step 103 and are not repeated here.
Optionally, in the method provided in this embodiment, before the preset convolution kernels are used to process the image to be processed, the preset kernels need to be trained, so as to determine the weight values inside each preset kernel.
When the preset convolution kernels are trained, the weight values inside them are first generated randomly, and the weight values are further adjusted during training. After the weight values are generated randomly, or after each adjustment (or every several adjustments) of the weight values, the method provided in this embodiment may further include:
determining a corresponding vector according to each preset convolution kernel; and
normalizing the convolution kernels according to the vectors, so that the differences between the convolution kernels are enhanced.
After the weight values change, the vector corresponding to a preset convolution kernel can be determined from its weight values. Whether the weight values of two kernels are similar is then determined from their vectors; if they are similar, the weight values in the corresponding preset kernels are adjusted so that the two vectors differ to a certain degree.
Here, the weight values inside a preset convolution kernel can be arranged as a one-dimensional vector. For example, if the kernel is 3 × 3, it contains 9 weight values, and these weight values can be concatenated in row order, that is, first row, second row, third row, to obtain a one-dimensional vector. A corresponding one-dimensional vector can be generated from the weight values of each preset kernel.
Specifically, the inner product of every two vectors can be calculated, and whether the two vectors are similar is determined from the inner product. The inner product equals the product of the magnitudes of the two vectors and the cosine of the angle between them; the smaller the inner product, the larger the angle between the two vectors and the greater the difference between them. Therefore, whether two preset kernels are similar can be determined based on the inner product of their corresponding vectors.
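A sketch of this check: each kernel is flattened row by row into a one-dimensional vector, the inner product of every pair is computed, and kernels judged too similar are adjusted. Normalizing the inner product by the vector magnitudes (the cosine), the 0.9 threshold, and the random re-initialization used as the "adjustment" are assumptions made for the example; the disclosure only states that similarity is judged from the inner product.

```python
# Illustrative sketch under the assumptions stated above.
import numpy as np

def kernel_to_vector(kernel):
    """Concatenate the weight values row by row into a one-dimensional vector."""
    return kernel.reshape(-1)

def enhance_difference(kernels, threshold=0.9, rng=np.random.default_rng(0)):
    vectors = [kernel_to_vector(k) for k in kernels]
    for a in range(len(vectors)):
        for b in range(a + 1, len(vectors)):
            inner = np.dot(vectors[a], vectors[b])
            cosine = inner / (np.linalg.norm(vectors[a]) * np.linalg.norm(vectors[b]))
            if cosine > threshold:
                # Kernels judged too similar: adjust one so the vectors differ.
                kernels[b] = rng.standard_normal(kernels[b].shape)
                vectors[b] = kernel_to_vector(kernels[b])
    return kernels

kernels = [np.random.rand(3, 3) for _ in range(4)]
kernels = enhance_difference(kernels)
```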
Because the preset convolution kernels differ markedly from one another, they can compress the image features from different angles, which improves the expressiveness of the image features and improves the accuracy of image processing.
Fig. 4 is a structural diagram of an image processing apparatus according to an exemplary embodiment of the present invention.
As shown in Fig. 4, the image processing apparatus provided in this embodiment includes:
an obtaining module 41, configured to obtain an image to be processed and perform convolution processing on the image to be processed according to a convolutional layer in a neural network to obtain a first number of image features;
a processing module 42, configured to perform convolution processing on the image features according to a second number of preset convolution kernels to obtain a first number of processing results; and
a determining module 43, configured to input the processing results into a fully connected layer of the neural network, so that the fully connected layer determines an output result according to the processing results;
where the second number is greater than or equal to the first number.
The image processing apparatus provided in this embodiment includes: an obtaining module configured to obtain an image to be processed and perform convolution processing on the image to be processed according to a convolutional layer in a neural network to obtain a first number of image features; a processing module configured to perform convolution processing on the image features according to a second number of preset convolution kernels to obtain a first number of processing results; and a determining module configured to input the processing results into a fully connected layer of the neural network, so that the fully connected layer determines an output result according to the processing results, where the second number is greater than or equal to the first number. In the apparatus provided in this embodiment, after the image features are extracted through the convolutional layer, the image features are not processed by the pooling layer of the prior art; instead, preset convolution kernels are set, and the preset convolution kernels perform convolution processing on the image features to obtain processing results that include more feature information. The fully connected layer then fuses the processing results, so that an output result with higher accuracy is determined.
The specific principle and implementation of the image processing apparatus provided in this embodiment are similar to those of the embodiment shown in Fig. 2 and are not repeated here.
Fig. 5 is a structural diagram of an image processing apparatus according to another exemplary embodiment of the present invention.
As shown in Fig. 5, on the basis of the above embodiment, in the image processing apparatus provided in this embodiment, the size of a preset convolution kernel is the same as the size of the corresponding image feature.
Optionally, the second number is the same as the first number;
the processing module 42 includes:
a first processing unit 421, configured to perform a convolution calculation with each preset convolution kernel and its corresponding image feature to obtain a first number of processing results.
Optionally, the second number of preset convolution kernels includes N groups of preset convolution kernels, each group including N preset kernels, where N is equal to the first number;
the processing module 42 includes a second processing unit 422, configured to:
perform a convolution calculation with each preset convolution kernel in each group and its corresponding image feature to obtain N dimension-reduction feature values corresponding to each group of kernels; and
superimpose the N dimension-reduction feature values corresponding to each group of kernels to obtain N processing results.
Optionally, the second number of preset convolution kernels includes M groups of convolution kernels, each group including N convolution kernels, where N is equal to the first number;
the processing module 42 includes a third processing unit 423, configured to:
perform a convolution calculation with each preset convolution kernel in each group and its corresponding image feature to obtain N dimension-reduction feature values corresponding to each group of kernels; and
divide the N dimension-reduction feature values corresponding to each group of kernels evenly into T subgroups, and superimpose the dimension-reduction feature values of each subgroup to obtain T processing results corresponding to each group of kernels;
where the product of T and M is N.
Optionally, each preset convolution kernel does not contain an offset parameter.
Optionally, the apparatus provided in this embodiment further includes an adjustment module 44, configured to:
determine a corresponding vector according to each preset convolution kernel; and
normalize the convolution kernels according to the vectors, so that the differences between the convolution kernels are enhanced.
Optionally, the adjustment module 44 is specifically configured to:
determine the inner product of two of the vectors, determine, according to the inner product, whether the preset convolution kernels corresponding to the vectors are similar, and if so, adjust the preset convolution kernels.
The specific principle and implementation of the image processing apparatus provided in this embodiment are similar to those of the embodiment shown in Fig. 3 and are not repeated here.
An embodiment of the present disclosure also provides a computer including the above image processing apparatus.
An embodiment of the present disclosure also provides a computer-readable storage medium storing computer-executable instructions, where the computer-executable instructions are configured to perform the above image processing method.
An embodiment of the present disclosure also provides a computer program product, where the computer program product includes a computer program stored on a computer-readable storage medium, the computer program includes program instructions, and when the program instructions are executed by a computer, the computer is caused to perform the above image processing method.
The above computer-readable storage medium may be a transitory computer-readable storage medium or a non-transitory computer-readable storage medium.
Fig. 6 is a structural diagram of an electronic device according to an exemplary embodiment of the present invention.
An embodiment of the present disclosure also provides an electronic device whose structure is shown in Fig. 6. The electronic device includes:
at least one processor 60 (one processor 60 is taken as an example in Fig. 6) and a memory 61, and may further include a communication interface 62 and a bus 63. The processor 60, the communication interface 62, and the memory 61 can communicate with one another through the bus 63. The communication interface 62 may be used for information transmission. The processor 60 may call logic instructions in the memory 61 to perform the image processing method of the above embodiments.
In addition, the logic instructions in the above memory 61 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium.
As a computer-readable storage medium, the memory 61 may be used to store software programs and computer-executable programs, such as the program instructions/modules corresponding to the methods in the embodiments of the present disclosure. The processor 60 performs functional applications and data processing by running the software programs, instructions, and modules stored in the memory 61, that is, implements the image processing method in the above method embodiments.
The memory 61 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application required by at least one function, and the data storage area may store data created according to the use of the terminal device, and the like. In addition, the memory 61 may include a high-speed random access memory and may also include a non-volatile memory.
The technical solutions of the embodiments of the present disclosure may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes one or more instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium may be a non-transitory storage medium, including a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or any other medium that can store program code, and may also be a transitory storage medium.
As used in this application, although the terms "first", "second", and so on may be used to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, without changing the meaning of the description, a first element could be called a second element and, likewise, a second element could be called a first element, as long as all occurrences of "first element" are renamed consistently and all occurrences of "second element" are renamed consistently. The first element and the second element are both elements, but they may not be the same element.
The terms used in this application are only for describing the embodiments and are not intended to limit the claims. As used in the description of the embodiments and the claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Similarly, the term "and/or" as used in this application refers to and encompasses any and all possible combinations of one or more of the associated items listed. In addition, when used in this application, the term "comprise" and its variants "comprises" and/or "comprising" specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The aspects, implementations, or features of the described embodiments can be used alone or in any combination. The aspects of the described embodiments may be implemented by software, hardware, or a combination of software and hardware. The described embodiments may also be embodied by a computer-readable medium storing computer-readable code, the computer-readable code including instructions executable by at least one computing device. The computer-readable medium may be associated with any data storage device capable of storing data that can be read by a computer system. Examples of the computer-readable medium include a read-only memory, a random access memory, a CD-ROM, an HDD, a DVD, a magnetic tape, an optical data storage device, and the like. The computer-readable medium may also be distributed over computer systems connected through a network, so that the computer-readable code is stored and executed in a distributed manner.
The above technical description may refer to the accompanying drawings, which form a part of this application and show, by way of description, implementations in accordance with the described embodiments. Although these embodiments are described in sufficient detail to enable those skilled in the art to implement them, they are non-limiting; other embodiments may be used, and changes may be made without departing from the scope of the described embodiments. For example, the order of operations described in a flowchart is non-limiting, so the order of two or more operations illustrated in and described with reference to a flowchart may be changed according to several embodiments. As another example, in several embodiments, one or more operations illustrated in and described with reference to a flowchart are optional or may be deleted. In addition, certain steps or functions may be added to the disclosed embodiments, or the order of two or more steps may be swapped. All such changes are considered to be included in the disclosed embodiments and claims.
In addition, terminology is used in the above technical description to provide a thorough understanding of the described embodiments. However, excessive detail is not required to implement the described embodiments. Therefore, the above description of the embodiments is presented for the purpose of illustration and description. The embodiments presented in the above description, and the examples disclosed in accordance with these embodiments, are provided to add context and aid understanding of the described embodiments. The above description is not intended to be exhaustive or to limit the described embodiments to the precise form of the present disclosure. Several modifications, adaptations, and variations are possible in light of the above teachings. In some cases, well-known processing steps are not described in detail in order to avoid unnecessarily obscuring the described embodiments.

Claims (20)

  1. An image processing method, comprising:
    obtaining an image to be processed, and performing convolution processing on the image to be processed according to a convolutional layer in a neural network to obtain a first number of image features;
    performing convolution processing on the image features according to a second number of preset convolution kernels to obtain a first number of processing results; and
    inputting the processing results into a fully connected layer of the neural network, so that the fully connected layer determines an output result according to the processing results;
    wherein the second number is greater than or equal to the first number.
  2. The method according to claim 1, wherein a size of each of the preset convolution kernels is the same as a size of the corresponding image feature.
  3. The method according to claim 1 or 2, wherein the second number is the same as the first number; and
    the performing convolution processing on the image features according to the preset convolution kernels to obtain a first number of processing results comprises:
    performing a convolution calculation with each of the preset convolution kernels and the corresponding image feature to obtain the first number of processing results.
  4. The method according to claim 1 or 2, wherein the second number of preset convolution kernels comprises N groups of preset convolution kernels, each group comprising N preset convolution kernels, wherein N is equal to the first number; and
    the performing convolution processing on the image features according to the preset convolution kernels to obtain a first number of processing results comprises:
    performing a convolution calculation with each of the preset convolution kernels in each group and the corresponding image feature to obtain N dimension-reduction feature values corresponding to each group of convolution kernels; and
    superimposing the N dimension-reduction feature values corresponding to each group of convolution kernels to obtain N processing results.
  5. The method according to claim 1 or 2, wherein the second number of preset convolution kernels comprises M groups of convolution kernels, each group comprising N convolution kernels, wherein N is equal to the first number; and
    the performing convolution processing on the image features according to the preset convolution kernels to obtain a first number of processing results comprises:
    performing a convolution calculation with each of the preset convolution kernels in each group and the corresponding image feature to obtain N dimension-reduction feature values corresponding to each group of convolution kernels; and
    dividing the N dimension-reduction feature values corresponding to each group of convolution kernels evenly into T subgroups, and superimposing the dimension-reduction feature values of each subgroup to obtain T processing results corresponding to each group of convolution kernels;
    wherein the product of T and M is N.
  6. The method according to claim 1 or 2, wherein each of the preset convolution kernels does not contain an offset parameter.
  7. The method according to claim 1, further comprising:
    determining a corresponding vector according to each of the preset convolution kernels; and
    normalizing the convolution kernels according to the vectors, so that differences between the convolution kernels are enhanced.
  8. The method according to claim 7, wherein the normalizing the convolution kernels according to the vectors comprises:
    determining an inner product of two of the vectors, determining, according to the inner product, whether the preset convolution kernels corresponding to the vectors are similar, and if so, adjusting the preset convolution kernels.
  9. An image processing apparatus, comprising:
    an obtaining module, configured to obtain an image to be processed and perform convolution processing on the image to be processed according to a convolutional layer in a neural network to obtain a first number of image features;
    a processing module, configured to perform convolution processing on the image features according to a second number of preset convolution kernels to obtain a first number of processing results; and
    a determining module, configured to input the processing results into a fully connected layer of the neural network, so that the fully connected layer determines an output result according to the processing results;
    wherein the second number is greater than or equal to the first number.
  10. The apparatus according to claim 9, wherein a size of each of the preset convolution kernels is the same as a size of the corresponding image feature.
  11. The apparatus according to claim 9 or 10, wherein the second number is the same as the first number; and
    the processing module comprises:
    a first processing unit, configured to perform a convolution calculation with each of the preset convolution kernels and the corresponding image feature to obtain the first number of processing results.
  12. The apparatus according to claim 9 or 10, wherein the second number of preset convolution kernels comprises N groups of preset convolution kernels, each group comprising N preset convolution kernels, wherein N is equal to the first number; and
    the processing module comprises a second processing unit, configured to:
    perform a convolution calculation with each of the preset convolution kernels in each group and the corresponding image feature to obtain N dimension-reduction feature values corresponding to each group of convolution kernels; and
    superimpose the N dimension-reduction feature values corresponding to each group of convolution kernels to obtain N processing results.
  13. The apparatus according to claim 9 or 10, wherein the second number of preset convolution kernels comprises M groups of convolution kernels, each group comprising N convolution kernels, wherein N is equal to the first number; and
    the processing module comprises a third processing unit, configured to:
    perform a convolution calculation with each of the preset convolution kernels in each group and the corresponding image feature to obtain N dimension-reduction feature values corresponding to each group of convolution kernels; and
    divide the N dimension-reduction feature values corresponding to each group of convolution kernels evenly into T subgroups, and superimpose the dimension-reduction feature values of each subgroup to obtain T processing results corresponding to each group of convolution kernels;
    wherein the product of T and M is N.
  14. The apparatus according to claim 9 or 10, wherein each of the preset convolution kernels does not contain an offset parameter.
  15. The apparatus according to claim 9, further comprising an adjustment module, configured to:
    determine a corresponding vector according to each of the preset convolution kernels; and
    normalize the convolution kernels according to the vectors, so that differences between the convolution kernels are enhanced.
  16. The apparatus according to claim 15, wherein the adjustment module is specifically configured to:
    determine an inner product of two of the vectors, determine, according to the inner product, whether the preset convolution kernels corresponding to the vectors are similar, and if so, adjust the preset convolution kernels.
  17. A computer, comprising the apparatus according to any one of claims 9 to 16.
  18. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor;
    wherein the memory stores instructions executable by the at least one processor, and when the instructions are executed by the at least one processor, the at least one processor is caused to perform the method according to any one of claims 1 to 8.
  19. A computer-readable storage medium storing computer-executable instructions, wherein the computer-executable instructions are configured to perform the method according to any one of claims 1 to 8.
  20. A computer program product, wherein the computer program product comprises a computer program stored on a computer-readable storage medium, the computer program comprises program instructions, and when the program instructions are executed by a computer, the computer is caused to perform the method according to any one of claims 1 to 8.
PCT/CN2018/115252 2018-11-13 2018-11-13 Image processing method, apparatus and device, and storage medium and program product WO2020097795A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201880098341.3A CN112913253A (en) 2018-11-13 2018-11-13 Image processing method, apparatus, device, storage medium, and program product
PCT/CN2018/115252 WO2020097795A1 (en) 2018-11-13 2018-11-13 Image processing method, apparatus and device, and storage medium and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/115252 WO2020097795A1 (en) 2018-11-13 2018-11-13 Image processing method, apparatus and device, and storage medium and program product

Publications (1)

Publication Number Publication Date
WO2020097795A1 true WO2020097795A1 (en) 2020-05-22

Family

ID=70731002

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/115252 WO2020097795A1 (en) 2018-11-13 2018-11-13 Image processing method, apparatus and device, and storage medium and program product

Country Status (2)

Country Link
CN (1) CN112913253A (en)
WO (1) WO2020097795A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114399828B (en) * 2022-03-25 2022-07-08 深圳比特微电子科技有限公司 Training method of convolution neural network model for image processing
CN115987511B (en) * 2023-03-07 2023-05-23 北京数牍科技有限公司 Image reasoning method, device, electronic equipment and computer readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6782135B1 (en) * 2000-02-18 2004-08-24 Conexant Systems, Inc. Apparatus and methods for adaptive digital video quantization
US6917703B1 (en) * 2001-02-28 2005-07-12 Nevengineering, Inc. Method and apparatus for image analysis of a gabor-wavelet transformed image using a neural network
CN107563290A (en) * 2017-08-01 2018-01-09 中国农业大学 A kind of pedestrian detection method and device based on image
CN107871306A (en) * 2016-09-26 2018-04-03 北京眼神科技有限公司 Method and device for denoising picture
CN107968962A (en) * 2017-12-12 2018-04-27 华中科技大学 A kind of video generation method of the non-conterminous image of two frames based on deep learning
CN108376386A (en) * 2018-03-23 2018-08-07 深圳天琴医疗科技有限公司 A kind of construction method and device of the super-resolution model of image

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9536293B2 (en) * 2014-07-30 2017-01-03 Adobe Systems Incorporated Image assessment using deep convolutional neural networks
CN106295566B (en) * 2016-08-10 2019-07-09 北京小米移动软件有限公司 Facial expression recognizing method and device
CN107316013B (en) * 2017-06-14 2020-04-07 西安电子科技大学 Hyperspectral image classification method based on NSCT (non-subsampled Contourlet transform) and DCNN (data-to-neural network)
CN108259997B (en) * 2018-04-02 2019-08-23 腾讯科技(深圳)有限公司 Image correlation process method and device, intelligent terminal, server, storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6782135B1 (en) * 2000-02-18 2004-08-24 Conexant Systems, Inc. Apparatus and methods for adaptive digital video quantization
US6917703B1 (en) * 2001-02-28 2005-07-12 Nevengineering, Inc. Method and apparatus for image analysis of a gabor-wavelet transformed image using a neural network
CN107871306A (en) * 2016-09-26 2018-04-03 北京眼神科技有限公司 Method and device for denoising picture
CN107563290A (en) * 2017-08-01 2018-01-09 中国农业大学 A kind of pedestrian detection method and device based on image
CN107968962A (en) * 2017-12-12 2018-04-27 华中科技大学 A kind of video generation method of the non-conterminous image of two frames based on deep learning
CN108376386A (en) * 2018-03-23 2018-08-07 深圳天琴医疗科技有限公司 A kind of construction method and device of the super-resolution model of image

Also Published As

Publication number Publication date
CN112913253A (en) 2021-06-04

Similar Documents

Publication Publication Date Title
TWI721510B (en) Method, apparatus and storage medium for binocular image depth estimation
US9697416B2 (en) Object detection using cascaded convolutional neural networks
CN110473137B (en) Image processing method and device
CN109284733B (en) Shopping guide negative behavior monitoring method based on yolo and multitask convolutional neural network
US11222211B2 (en) Method and apparatus for segmenting video object, electronic device, and storage medium
WO2018036293A1 (en) Image segmentation method, apparatus, and fully convolutional network system
US20180285683A1 (en) Methods and apparatus for image salient object detection
CN109840477B (en) Method and device for recognizing shielded face based on feature transformation
US20230085605A1 (en) Face image processing method, apparatus, device, and storage medium
WO2022121485A1 (en) Image multi-tag classification method and apparatus, computer device, and storage medium
WO2021003936A1 (en) Image segmentation method, electronic device, and computer-readable storage medium
US11615612B2 (en) Systems and methods for image feature extraction
CN103744974B (en) Method and device for selecting local interest points
CN111339884A (en) Image recognition method and related equipment and device
WO2020097795A1 (en) Image processing method, apparatus and device, and storage medium and program product
CN111914908A (en) Image recognition model training method, image recognition method and related equipment
CN111597933A (en) Face recognition method and device
CN112115295A (en) Video image detection method and device and electronic equipment
WO2017202086A1 (en) Image screening method and device
CN110717929A (en) Image target detection method, device and storage medium
CN111598176A (en) Image matching processing method and device
KR102123835B1 (en) System and method for image registration based on adaptive classification
CN111738272A (en) Target feature extraction method and device and electronic equipment
CN114612926A (en) Method and device for counting number of people in scene, computer equipment and storage medium
CN112424787B (en) Method and device for extracting image key points

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18940142

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 09/09/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 18940142

Country of ref document: EP

Kind code of ref document: A1