WO2022198517A1

WO2022198517A1 - Image processing method and apparatus, and medium and electronic device

Info

Publication number: WO2022198517A1
Application number: PCT/CN2021/082809
Authority: WO
Inventors: 聂谷洪; 施泽浩; 王栋
Original assignee: 深圳市大疆创新科技有限公司
Priority date: 2021-03-24
Filing date: 2021-03-24
Publication date: 2022-09-29

Abstract

The present invention provides an image processing method and apparatus, and a computer-readable medium and an electronic device. The image processing method comprises: obtaining an original image, the original image comprising a target object; when the original image and a preset image processing model are not matched, respectively performing first processing on the original image and performing second processing on the image processing model, such that the original image and the image processing model are matched; and processing the original image by means of the image processing model, so as to recognize the target object in the original image. According to the present invention, only one image processing model needs to be stored and loaded, thereby greatly saving memory space and reducing a loading duration, and the original image does not generate lateral or longitudinal deformation, such that the image display effect is ensured, the processing effect of the image processing model is further ensured, the real-time requirements of target detection are satisfied, and the image processing model can be deployed on a photographing device for on-line use.

Description

Image processing method, apparatus, medium and electronic device

Technical Field

The present disclosure relates to the technical field of image processing, and in particular, to an image processing method, an image processing apparatus, a computer-readable medium, and an electronic device.

Background

Imaging devices that support both horizontal and vertical shooting modes often need to run functions such as target detection. Conventionally, such functions can only be realized by adapting two image processing models, one for each of the two shooting modes.

Maintaining two image processing models, however, makes the application that deploys them oversized and wastes memory space.

In view of this, there is an urgent need to develop a new image processing method in the art.

It should be noted that the information disclosed in the above Background section is only for enhancement of understanding of the background of the present disclosure, and therefore may contain information that does not form the prior art that is already known to a person of ordinary skill in the art.

SUMMARY OF THE INVENTION

The present disclosure provides an image processing method, an image processing apparatus, a computer-readable medium, and an electronic device, thereby alleviating, at least to a certain extent, the technical problem of wasted memory space in the prior art.

Other features and advantages of the present disclosure will become apparent from the following detailed description, or be learned in part by practice of the present disclosure.

According to a first aspect of the present disclosure, there is provided an image processing method, comprising: acquiring an original image, the original image including a target object;

when the original image and a preset image processing model do not match, performing first processing on the original image and second processing on the image processing model, respectively, so that the original image and the image processing model match; and

processing the original image by means of the image processing model to identify the target object in the original image.

In an exemplary embodiment of the present disclosure, the original image is generated after an image is captured by a preset photographing device;

Wherein, when the photographing device shoots horizontally, the horizontal side of the original image is larger than the vertical side; when the photographing device shoots vertically, the horizontal side of the original image is smaller than the vertical side.

In an exemplary embodiment of the present disclosure, the method further includes:

When the horizontal side of the original image is larger than the vertical side, the original image matches the preset image processing model, and when the horizontal side of the original image is smaller than the vertical side, the original image does not match the preset image processing model; or,

When the horizontal side of the original image is smaller than the vertical side, the original image matches the preset image processing model, and when the horizontal side of the original image is larger than the vertical side, the original image does not match the preset image processing model.

In an exemplary embodiment of the present disclosure, the target object corresponds to a horizontal edge of the original image in the original image.

In an exemplary embodiment of the present disclosure, performing the first processing on the original image and the second processing on the image processing model, respectively, when the original image does not match a preset image processing model includes:

Determine the shape of the original image according to the data collected by the inertial measurement unit of the photographing device;

When the shape of the original image does not match the preset image processing model, the first processing is performed on the original image, and the second processing is performed on the image processing model.

In an exemplary embodiment of the present disclosure, performing the first processing on the original image includes:

The original image is rotated by 90 degrees so that the target object in the original image is also rotated by 90 degrees.

In an exemplary embodiment of the present disclosure, before the first processing on the original image and the second processing on the image processing model respectively, the method further includes:

Determine the orientation of the target object of the original image according to the data collected by the inertial measurement unit of the photographing device;

When the orientation of the target object is inverted, the original image is flipped 180 degrees.

When the original image and the preset image processing model do not match, a second process is performed on the image processing model to match the original image and the image processing model.

In an exemplary embodiment of the present disclosure, performing the second processing on the image processing model includes:

If the image processing model is a convolutional neural network model, obtain the weight matrix in the convolutional neural network model;

Transpose processing is performed on the weight matrix to perform a second processing on the convolutional neural network model.

In an exemplary embodiment of the present disclosure, the processing of the original image by the image processing model includes:

Perform an inner product operation on the original image by using the weight matrix after the transposition process to obtain image features;

Performing nonlinear processing on the image features to obtain nonlinear features, and performing feature compression processing on the nonlinear features to obtain compressed features;

Perform full connection processing on the compressed features to process the original image.

When the original image matches a preset image processing model, the original image is processed by the image processing model to identify the target object in the original image.

According to a second aspect of the present disclosure, there is provided an image processing apparatus, comprising: a processor;

a memory for storing executable instructions for the processor;

wherein the processor is configured to perform, via executing the executable instructions:

obtaining an original image, the original image including the target object;

In an exemplary embodiment of the present disclosure, the apparatus further includes:

In an exemplary embodiment of the present disclosure, before performing the first processing on the original image and performing the second processing on the image processing model respectively, the apparatus further includes:

According to a third aspect of the present disclosure, there is provided a computer-readable medium on which a computer program is stored, and when the computer program is executed by a processor, implements any one of the image processing methods provided in the first aspect.

According to a fourth aspect of the present disclosure, there is provided an electronic device, comprising:

processor; and

a memory for storing executable instructions for the processor;

Wherein, the processor is configured to execute any one of the image processing methods provided in the first aspect by executing the executable instructions.

The technical solution of the present disclosure has the following beneficial effects:

According to the above-mentioned image processing method, image processing apparatus, computer-readable medium and electronic device, on the one hand, by performing the first processing on the original image and the second processing on the image processing model, only one image processing model needs to be stored and loaded, which greatly saves memory space and reduces the loading time; on the other hand, the target object can be identified by processing the original image with the image processing model, and target detection in both shooting modes can be completed while preserving the performance of the image processing model, which meets the real-time requirements of target detection and makes the model easy to deploy on a photographing device for online use.

It is to be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the present disclosure.

Brief Description of the Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description serve to explain the principles of the disclosure. Obviously, the drawings in the following description are only some embodiments of the present disclosure, and for those of ordinary skill in the art, other drawings can also be obtained from these drawings without creative effort.

FIG. 1 shows a schematic flowchart of an image processing method in an exemplary embodiment of the present disclosure;

FIG. 2 shows a schematic flowchart of a method for flipping an original image in an exemplary embodiment of the present disclosure;

FIG. 3 shows interface schematic diagrams of four orientations of a target object in an exemplary embodiment of the present disclosure;

FIG. 4 shows a schematic flowchart of a method for respectively performing a first process and a second process in an exemplary embodiment of the present disclosure;

FIG. 5 shows a schematic flowchart of a method for performing a second process in an exemplary embodiment of the present disclosure;

FIG. 6 shows a schematic flowchart of a method for processing an original image in an exemplary embodiment of the present disclosure;

FIG. 7 shows a schematic interface diagram of transposing a weight matrix in an exemplary embodiment of the present disclosure;

FIG. 8 shows a schematic flowchart of an image processing apparatus in an exemplary embodiment of the present disclosure;

FIG. 9 schematically shows an electronic device for implementing an image processing method in an exemplary embodiment of the present disclosure;

FIG. 10 schematically illustrates a computer-readable storage medium for implementing an image processing method in an exemplary embodiment of the present disclosure.

Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments, however, can be embodied in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided in order to give a thorough understanding of the embodiments of the present disclosure. However, those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced without one or more of the specific details, or other methods, components, devices, steps, etc. may be employed. In other instances, well-known solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.

The terms "a", "an", "the" and "said" are used in this specification to indicate the presence of one or more elements/components/etc.; the terms "include" and "have" are used in an open-ended, inclusive sense and mean that additional elements/components/etc. may be present in addition to those listed; the terms "first", "second", etc. are used only as labels and do not limit the number of their objects.

Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repeated descriptions will be omitted. Some of the block diagrams shown in the figures are functional entities that do not necessarily correspond to physically or logically separate entities.

For imaging devices such as mobile phones or cameras that have both a horizontal shooting mode and a vertical shooting mode, running functions such as target detection (for example, of a face, head and shoulders, or whole body) requires adapting two different image processing models, one for each mode.

Generally, separate image processing models are used for the horizontal and vertical shooting modes, so two image processing models need to be maintained. Since the size of image processing models such as convolutional neural networks is often on the order of megabytes, this can easily make the application that deploys them oversized. Moreover, to ensure smooth switching, both image processing models need to be loaded at the same time, which wastes memory space and increases the loading time.

Alternatively, the image processing model for the horizontal shooting mode can be merged with the image processing model for the vertical shooting mode. To balance performance in both modes, the input image is usually made square. A square input, however, deforms horizontal images and vertical images in the horizontal and vertical directions respectively, which distorts the image, degrades the performance of functions such as target detection, and worsens the user experience.

In addition, a fully convolutional neural network, which is insensitive to the size and shape of the input image, can be used to process horizontal and vertical images separately. However, the usual deployment platform on mobile devices is TFLite, and TFLite does not support dynamic memory allocation. One image processing model therefore has to be converted offline into a second image processing model. Although only one image processing model then needs to be maintained, the application is still oversized, the loading time is still long, and memory is still wasted.

In view of the problems existing in the related art, the present disclosure provides an image processing method, an image processing apparatus, a computer-readable medium, and an electronic device. Various aspects of the present exemplary embodiment are described in detail below.

FIG. 1 shows a schematic flowchart of an image processing method in this exemplary embodiment. As shown in FIG. 1, the method includes at least the following steps S110, S120 and S130. Specifically:

Step S110. Acquire an original image, where the original image includes the target object.

Step S120. When the original image and the preset image processing model do not match, respectively perform the first processing on the original image and the second processing on the image processing model, so that the original image and the image processing model match.

Step S130. Process the original image through an image processing model to identify the target object in the original image.

In an exemplary embodiment of the present disclosure, on the one hand, by performing the first processing on the original image and the second processing on the image processing model, only one image processing model needs to be stored and loaded, which greatly saves memory space; on the other hand, the target object can be identified by processing the original image with the image processing model, and target detection in both shooting modes can be completed while preserving the performance of the image processing model, which meets the real-time requirements of target detection and makes the model easy to deploy and use online on a photographing device.

Each step of the image processing method will be described in detail below.

In step S110, an original image is acquired, and the original image includes the target object.

In an exemplary embodiment of the present disclosure, the inclusion of the target object in the original image indicates that the original image is an image obtained by photographing the target object. The target object is, for example, a person, a tree, a vehicle, or a boat, which is not particularly limited in this exemplary embodiment.

In an optional embodiment, the original image is generated after a preset photographing device captures an image; when the photographing device shoots horizontally, the horizontal side of the original image is larger than the vertical side, and when the photographing device shoots vertically, the horizontal side of the original image is smaller than the vertical side.

Wherein, the photographing device may be an imaging device having two photographing modes of horizontal shooting and vertical shooting. For example, the photographing device may be a mobile phone and a camera, or other imaging devices, which are not particularly limited in this exemplary embodiment.

When the photographing device is in the horizontal shooting mode, the horizontal edge of the original image obtained by shooting is larger than the vertical edge; and when the shooting device is in the vertical shooting mode, the horizontal edge of the original image obtained by shooting is smaller than the vertical edge.

In an optional embodiment, the target object corresponds to a horizontal edge of the original image in the original image. That is, when the original image is displayed, the target object faces one of the horizontal sides of the original image.

In step S120, when the original image and the preset image processing model do not match, respectively perform the first processing on the original image and the second processing on the image processing model, so that the original image and the image processing model match.

In an exemplary embodiment of the present disclosure, in order to further perform image processing on the original image, an image processing model may be preset. To determine whether the image processing model can perform image processing on the original image, it may be determined whether the original image matches the preset image processing model.

Moreover, in either the horizontal shooting mode or the vertical shooting mode, the target object can have two orientations. Specifically, when a photographing device such as a mobile phone is in the horizontal shooting mode and is held with its top-to-bottom axis running from left to right, the target object has a first orientation in the original image; when it is held with its top-to-bottom axis running from right to left, the target object has a second orientation in the original image. Likewise, when the device is in the vertical shooting mode and its top-to-bottom axis runs from top to bottom, the target object has a first orientation in the original image; when the axis runs from bottom to top, the target object has a second orientation. Therefore, the inertial measurement unit of the photographing device can first be used to determine the orientation of the target object, so that processing such as flipping can be performed.

In an optional embodiment, FIG. 2 shows a schematic flowchart of a method for flipping an original image. As shown in FIG. 2, the method includes at least the following steps: in step S210, the orientation of the target object in the original image is determined according to the data collected by the inertial measurement unit of the photographing device.

The inertial measurement unit (IMU) consists of three single-axis acceleration sensors and three single-axis angular velocity sensors (gyroscopes) and measures IMU data, namely the acceleration data and angular velocity data of the photographing device in three-dimensional space. An IMU can therefore be installed in a portable device, such as a wearable device or a handheld device, so that the motion posture of the photographing device can be calculated from the measured IMU data.

Specifically, FIG. 3 shows a schematic interface diagram of four orientations of the target object. As shown in FIG. 3, in direction A the target object is oriented from bottom to top in the vertical direction; in direction B it is oriented from top to bottom in the vertical direction; in direction C it is oriented from right to left in the horizontal direction; and in direction D it is oriented from left to right in the horizontal direction.

In step S220, when the orientation of the target object is upside down, the original image is flipped 180 degrees.

In directions A and B of FIG. 3, the two original images both lie in the vertical direction but are flipped by 180° relative to each other; in directions C and D, the two original images both lie in the horizontal direction and are likewise flipped by 180° relative to each other.

Moreover, the preset image processing model may be a model matching the horizontal shooting mode, or may be a model matching the vertical shooting mode.

Before judging whether the original image matches the preset image processing model, the original image in either of directions A and B can be regarded as the inverted version of the image in the other direction, that is, the target object is upside down relative to that image, so the original image can be flipped 180 degrees. Similarly, the original image in either of directions C and D can be regarded as the inverted version of the image in the other direction, and can likewise be flipped 180 degrees.

In this exemplary embodiment, the original image can be flipped according to the orientation of the target object so that the original image is oriented upward, which reduces the computational cost of the subsequent matching judgment and image processing and improves the efficiency of image processing.
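As an illustration of steps S210 and S220, the following Python sketch classifies the orientation from a gravity vector and flips inverted images with NumPy. The function names, the axis convention for the gravity vector, the mapping of gravity signs to directions A to D, and the choice of directions B and D as the inverted ones are all assumptions made for this example; the disclosure does not specify them.

```python
import numpy as np

# Assumption: (gx, gy) is the gravity vector reported by the IMU, expressed
# along the image's x axis (pointing right) and y axis (pointing down).
def orientation_from_gravity(gx, gy):
    """Return a hypothetical label "A", "B", "C" or "D" as in FIG. 3."""
    if abs(gy) >= abs(gx):
        # Gravity mostly along the image's vertical axis.
        return "A" if gy > 0 else "B"
    # Gravity mostly along the image's horizontal axis.
    return "C" if gx > 0 else "D"

# Assumption: B and D are treated as the inverted counterparts of A and C.
INVERTED = {"B", "D"}

def flip_if_inverted(image, orientation):
    """Rotate the image by 180 degrees when its orientation is inverted."""
    if orientation in INVERTED:
        return np.rot90(image, 2)  # two 90-degree turns = 180 degrees
    return image

image = np.arange(6).reshape(2, 3)
print(flip_if_inverted(image, "B"))
```

With this convention an upright frame (gravity along +y) is left untouched, while its upside-down counterpart is turned by 180 degrees before any matching judgment.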

After the original image is flipped, it can be further determined whether the original image matches the preset image processing model.

In an optional embodiment, when the horizontal side of the original image is larger than the vertical side, the original image matches the preset image processing model, and when the horizontal side of the original image is smaller than the vertical side, the original image does not match the preset image processing model; or, when the horizontal side of the original image is smaller than the vertical side, the original image matches the preset image processing model, and when the horizontal side of the original image is larger than the vertical side, the original image does not match the preset image processing model.

Specifically, when the preset image processing model is a model matching the horizontal shooting mode of the photographing device, and the original image is obtained by shooting in the horizontal shooting mode, the original image matches the preset image processing model. Wherein, the horizontal side of the original image captured in the horizontal shooting mode is larger than the vertical side.

On the contrary, when the original image is obtained by using the vertical shooting mode of the photographing device, the original image does not match the preset image processing model. Wherein, the horizontal side of the original image captured in the vertical shooting mode is smaller than the vertical side.

When the preset image processing model is a model matching the vertical shooting mode of the photographing device, and the original image is obtained by shooting in the vertical shooting mode, the original image matches the preset image processing model. Wherein, the horizontal side of the original image captured in the vertical shooting mode is smaller than the vertical side.

On the contrary, when the original image is obtained by using the horizontal shooting mode of the photographing device, the original image does not match the preset image processing model. Wherein, the horizontal side of the original image captured in the horizontal shooting mode is larger than the vertical side.
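The matching rule described above can be sketched as a small Python predicate. The function name and the mode labels "horizontal" and "vertical" are hypothetical, and the disclosure does not address square images, so this sketch simply treats them as matching neither mode.

```python
def matches_model(width, height, model_mode):
    """Return True when the image shape suits a model built for model_mode.

    model_mode is "horizontal" or "vertical" (hypothetical labels).
    A square image (width == height) matches neither mode in this sketch.
    """
    if model_mode == "horizontal":
        return width > height   # horizontal side larger than vertical side
    if model_mode == "vertical":
        return width < height   # horizontal side smaller than vertical side
    raise ValueError(f"unknown model_mode: {model_mode!r}")

print(matches_model(1920, 1080, "horizontal"))  # True
print(matches_model(1080, 1920, "horizontal"))  # False
```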

Further, when the original image does not match the preset image processing model, the first processing may be performed on the original image, and the second processing may be performed on the image processing model to match the original image and the image processing model.

In an optional embodiment, FIG. 4 shows a schematic flowchart of a method for performing the first processing and the second processing respectively. As shown in FIG. 4, the method includes at least the following steps: in step S410, the shape of the original image is determined according to the data collected by the inertial measurement unit of the photographing device.

In addition to determining the orientation of the target object according to the data collected by the inertial measurement unit of the shooting device, the shape of the original image can also be determined.

Specifically, the shape of the original image may be an aspect ratio of the original image.

When the original image is captured in horizontal shooting mode, the aspect ratio of the original image is greater than 1. The preset image processing model is a model that matches the horizontal shooting mode of the photographing device, indicating that the image processing model matches the original image with an aspect ratio greater than 1.

When the original image is shot in vertical mode, the aspect ratio of the original image is less than 1. The preset image processing model is a model that matches the vertical shooting mode of the photographing device, indicating that the image processing model matches the original image with an aspect ratio smaller than 1.

In step S420, when the shape of the original image does not match the preset image processing model, the first processing is performed on the original image, and the second processing is performed on the image processing model.

When the aspect ratio of the original image is less than 1, and the preset image processing model is a model matching the horizontal shooting mode of the photographing device, it indicates that the shape of the original image does not match the preset image processing model.

When the aspect ratio of the original image is greater than 1 and the preset image processing model is a model matching the vertical shooting mode of the photographing device, this likewise indicates that the shape of the original image does not match the preset image processing model.

Therefore, a first processing can be performed on the original image and a second processing on the image processing model.

In this exemplary embodiment, the shape of the original image can be determined through the data collected by the inertial measurement unit, and the first processing and the second processing can be further determined. The determination method is simple and accurate, and has strong applicability.

Wherein, performing the first processing on the original image may be rotation processing.

In an alternative embodiment, the original image is rotated by 90 degrees so that the target object in the original image is also rotated by 90 degrees.

After the four directions shown in FIG. 3 have been judged and the inverted images flipped, original images in the horizontal direction and in the vertical direction are obtained.

When the image processing model is a model that matches the horizontal shooting mode of the photographing device, the original image in the vertical direction can be rotated by 90 degrees, so that the target object in the original image is also rotated by 90 degrees; when the image processing model is a model that matches the vertical shooting mode of the photographing device, the original image in the horizontal direction can be rotated by 90 degrees, so that the target object in the original image is also rotated by 90 degrees.

Rotating the original image by 90 degrees can make the size or shape of the original image match the preset image processing model.
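A minimal sketch of the first processing, assuming the image is held as a NumPy array of shape (height, width, channels). The function name and the default clockwise direction are choices made for this example only.

```python
import numpy as np

def first_processing(image, clockwise=True):
    """Rotate an H x W (x C) image by 90 degrees in the image plane."""
    # np.rot90 rotates counterclockwise for positive k, so k=-1 is clockwise.
    return np.rot90(image, k=-1 if clockwise else 1)

vertical = np.zeros((16, 9, 3), dtype=np.uint8)  # height 16, width 9
horizontal = first_processing(vertical)
print(horizontal.shape)  # (9, 16, 3)
```

The rotation swaps the height and width of the array, so a vertical image takes on the shape expected by a model built for the horizontal shooting mode, and vice versa.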

Further, the second processing may also be performed on the image processing model.

In an optional embodiment, FIG. 5 shows a schematic flowchart of a method for performing the second processing. As shown in FIG. 5, the method includes at least the following steps: in step S510, if the image processing model is a convolutional neural network model, the weight matrix in the convolutional neural network model is obtained.

A convolutional neural network (CNN) is an artificial neural network that is widely used in fields such as image recognition and target detection. A typical CNN model includes convolutional layers, pooling layers, activation layers, and fully connected layers. Each layer performs its operation on its input data and passes the result to the next layer, until a final result is obtained.

The convolution operation of a convolutional layer uses a convolution kernel (also called a filter) to operate on an image and output another image; the operation may be an inner product of the feature values of the image and the weight values of the convolution kernel.

The computation of the convolutional layers is the most important feature extraction process. A feature extraction convolutional neural network can be designed with multiple convolutional layers, and each convolutional layer is characterized by the size of the input feature map, the convolution kernel that traverses it, and the traversal stride of the convolution kernel over the input feature map. For example, if the input feature map is 32*32*3, the convolution kernel is 5*5, and the traversal stride is 1, then (without padding) the feature map output by the convolutional layer is 28*28*3.
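The spatial size in this example follows from the standard formula for a convolution without dilation, output = (input + 2 * padding - kernel) / stride + 1, rounded down. The helper below is a sketch of that formula; its name and parameters are not taken from the disclosure.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# The example from the text: 32-wide input, 5-wide kernel, stride 1, no padding.
print(conv_output_size(32, 5, stride=1))  # 28
```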

Among them, the convolution kernel of the convolutional layer is the weight matrix in the convolutional neural network model.

In step S520, transpose processing is performed on the weight matrix to perform second processing on the convolutional neural network model.

When the convolutional neural network model is a model corresponding to the horizontal shooting mode of the shooting device, in order to make the convolutional neural network model suitable for the vertical shooting mode of the shooting device, the weight matrix of the convolutional neural network model can be transposed.

Similarly, when the convolutional neural network model is a model corresponding to the vertical shooting mode of the shooting device, in order to make the convolutional neural network model suitable for the horizontal shooting mode of the shooting device, the weight matrix of the convolutional neural network model can also be transposed.

Therefore, after obtaining the weight matrix of the convolutional neural network model, that is, the convolution kernel, the weight matrix can be transposed.

The transposition can be mirror flipping of all elements of the weight matrix around a ray of 45 degrees to the lower right starting from the elements in the first row and the first column, which can obtain the transposed matrix of the weight matrix. That is, the first row of the weight matrix becomes the first column, the second row becomes the second column, ... and the last row becomes the last column.
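As a minimal illustration of the plain transposition described above, the following sketch uses NumPy (an assumption; the disclosure does not name a numerical library) and an arbitrary 3*3 example matrix.

```python
import numpy as np

# Arbitrary example weight matrix (values chosen only for illustration).
weights = np.array([[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]])

# Mirror the elements around the main diagonal: row i becomes column i.
transposed = weights.T

print(transposed)
```

Here the first row 1, 2, 3 of `weights` appears as the first column of `transposed`, and the last row as the last column.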

It should be noted that the direction of the transposition is related to the orientation of the target object after the first processing. Specifically, the direction of the transposition coincides with the rotation direction of the first processing. For example, as shown in Figure 7, if the first processing rotates the vertical original image by 90 degrees to the right, the target object is also rotated by 90 degrees to the right, and the weight matrix also needs to be transposed to the right, that is, the first row becomes the last column of the transposed matrix, so that the transposed matrix corresponds to the orientation of the target object and the target object can be accurately identified.

After transposing the weight matrix, the second processing of the convolutional neural network model is implemented.

In this exemplary embodiment, by transposing the weight matrix of the convolutional neural network model, the second processing of the convolutional neural network model is realized, the image processing model is unified for the two kinds of original images, and the image processing effect is guaranteed.

After the first processing is performed on the original image and the second processing is performed on the image processing model, the shape, that is, the size, of the original image matches the image processing model, and the orientation of the target object also matches the processing orientation of the image processing model, so that the original image and the image processing model are matched.

For example, when the original image is an image captured in the vertical shooting mode whose horizontal side is smaller than the vertical side, the original image is rotated 90 degrees to the right, so that the horizontal side of the original image becomes larger than the vertical side. Moreover, when the target object in the original image is a portrait with its head up and feet down, the target object is rotated 90 degrees to the right together with the original image, so that the head of the portrait faces the right side and the feet face the left side.

Further, when the image processing model is a convolutional neural network model, suppose the obtained weight matrix is a matrix whose first row is 1, 2, 3, whose second row is 4, 5, 6, and whose third row is 7, 8, 9. So that the transposed weight matrix can still process the corresponding positions of the portrait in the original image, the weight matrix can be transposed to the right, so that 1, 2, 3 of the first row still process the head of the portrait, 4, 5, 6 of the second row still process the body of the portrait, and 7, 8, 9 of the last row still process the feet of the portrait, thereby matching the original image with the image processing model.
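The effect of this example can be sketched numerically. The sketch below assumes that "transposed to the right" (first row becomes last column) corresponds to a 90-degree clockwise rotation of the weight matrix, which in NumPy is `np.rot90(..., k=-1)`; this reading, the helper `correlate_valid`, and the stand-in image are illustrative assumptions rather than details given in the disclosure.

```python
import numpy as np


def correlate_valid(image, kernel):
    """Slide the kernel over the image (no padding, step 1) and take inner products."""
    kh, kw = kernel.shape
    rows = image.shape[0] - kh + 1
    cols = image.shape[1] - kw + 1
    return np.array([[np.sum(image[i:i + kh, j:j + kw] * kernel)
                      for j in range(cols)] for i in range(rows)])


weights = np.array([[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]])
portrait = np.arange(8 * 6).reshape(8, 6)  # stand-in vertical image, 8 rows by 6 columns

# First processing: rotate the image 90 degrees to the right (clockwise).
rotated_image = np.rot90(portrait, k=-1)
# Second processing (assumed reading): first row of the weights becomes the last column.
rotated_weights = np.rot90(weights, k=-1)

direct = correlate_valid(portrait, weights)
via_rotation = correlate_valid(rotated_image, rotated_weights)

# The features of the rotated image equal the rotated features of the original image.
print(np.array_equal(via_rotation, np.rot90(direct, k=-1)))
```

In this sketch every weight meets the same pixels before and after the two processings, which is the sense in which the row 1, 2, 3 still processes the head of the portrait.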

Besides, it is also possible to only perform the second processing on the image processing model to achieve matching between the original image and the image processing model.

In an optional embodiment, when the original image and the preset image processing model do not match, a second process is performed on the image processing model to match the original image and the image processing model.

When the shape, that is, the size of the original image, matches the image processing model, but the processing orientation of the target object does not match the image processing model, it can also be considered that the original image does not match the preset image processing model. Therefore, a second process can be performed on the image processing model.

First, if the image processing model is a convolutional neural network model, obtain the weight matrix in the convolutional neural network model. Then, the weight matrix is transposed to perform the second processing on the convolutional neural network model.

Specifically, the convolution operation of a convolutional layer in the convolutional neural network model may use the convolution kernel to operate on an image and output another image; the operation may be an inner product operation.

In this exemplary embodiment, a processing method is provided for one of the mismatch cases, which unifies the image processing model for the two kinds of original images in that case, guarantees the image processing effect, and broadens the application scenarios of the image processing model.

In step S130, the original image is processed by an image processing model to identify the target object in the original image.

In an exemplary embodiment of the present disclosure, after the image processing model is matched with the original image, the original image may be processed using the image processing model.

In an optional embodiment, FIG. 6 shows a flowchart of the steps of a method for processing the original image. As shown in FIG. 6, the method includes at least the following steps: In step S610, an inner product operation is performed on the original image using the transposed weight matrix to obtain image features.

Specifically, when the image processing model is a convolutional neural network model, the input layer of the convolutional neural network model can detect pixel features of each region of the original image, such as pixel grayscale values of each region. Further, the convolutional layer of the convolutional neural network model can perform an inner product operation on the pixel features to obtain image features.

The inner product operation is performed by sliding the convolution kernel, that is, the weight matrix. Taking the upper left corner of the original image as the starting point, the weight matrix is slid to the lower right corner of the original image to generate a feature map. After each slide of the weight matrix, a feature matrix of the same size as the weight matrix is extracted from the original image, and the corresponding image feature is generated by performing an inner product operation on the feature matrix and the weight matrix.
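The sliding procedure described above can be sketched as follows. The function name `slide_inner_product`, the `stride` parameter, and the toy 4*4 image are hypothetical; a single-channel image without padding is assumed.

```python
import numpy as np


def slide_inner_product(image, weights, stride=1):
    """Slide the weight matrix from the upper left to the lower right of the image."""
    kh, kw = weights.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    feature_map = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            # Feature matrix with the same size as the weight matrix.
            patch = image[top:top + kh, left:left + kw]
            # Inner product of the feature matrix and the weight matrix.
            feature_map[i, j] = np.sum(patch * weights)
    return feature_map


image = np.arange(16, dtype=float).reshape(4, 4)  # toy 4*4 image
weights = np.ones((3, 3))                         # toy 3*3 weight matrix
features = slide_inner_product(image, weights)
print(features)  # [[45. 54.] [81. 90.]]
```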

In step S620, nonlinear processing is performed on the image features to obtain nonlinear features, and feature compression processing is performed on the nonlinear features to obtain compressed features.

After the image features are obtained, the activation function of the convolutional neural network model can add nonlinear factors to the image features to improve the feature representation effect of the image features. Specifically, a specific activation function can be used to perform point-to-point mapping to obtain nonlinear features.

Further, the pooling layer of the convolutional neural network model is used to compress the nonlinear features, which reduces the computational complexity of the convolutional neural network for nonlinear feature extraction. Specifically, the feature compression processing may adopt a sliding window manner to obtain compressed features, or may adopt other manners, which are not particularly limited in this exemplary embodiment.
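Step S620 can be illustrated with a small sketch. The disclosure does not fix the activation function or the pooling scheme, so ReLU and non-overlapping 2*2 max pooling are used here purely as assumed examples; the names `relu` and `max_pool` are hypothetical.

```python
import numpy as np


def relu(features):
    """Point-to-point nonlinear mapping: negative values become zero."""
    return np.maximum(features, 0)


def max_pool(features, window=2):
    """Compress features by keeping the maximum of each non-overlapping window."""
    h = features.shape[0] // window * window
    w = features.shape[1] // window * window
    trimmed = features[:h, :w]  # drop rows/columns that do not fill a whole window
    blocks = trimmed.reshape(h // window, window, w // window, window)
    return blocks.max(axis=(1, 3))


image_features = np.array([[-1.0, 2.0, 0.5, -3.0],
                           [4.0, -2.0, 1.0, 0.0],
                           [0.0, 1.0, -5.0, 2.0],
                           [3.0, -1.0, 6.0, -2.0]])
nonlinear = relu(image_features)
compressed = max_pool(nonlinear)
print(compressed)  # [[4. 1.] [3. 6.]]
```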

In step S630, full connection processing is performed on the compressed feature to process the original image.

After the compressed features are obtained, the compressed features can be input to the fully connected layer of the convolutional neural network model for full connection processing. Fully connected processing can map the compressed features into a long output vector and output it to process the original image.
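A minimal sketch of the full connection processing is given below. The weights, bias, and output length are made-up values for illustration; the function name `fully_connected` is hypothetical.

```python
import numpy as np


def fully_connected(compressed, fc_weights, fc_bias):
    """Flatten the compressed features and map them to one output vector."""
    flat = compressed.reshape(-1)
    return fc_weights @ flat + fc_bias


compressed = np.array([[4.0, 1.0],
                       [3.0, 6.0]])
# One row of weights per output element; the column count equals the flattened length.
fc_weights = np.array([[1.0, 0.0, 0.0, 1.0],
                       [0.0, 1.0, 1.0, 0.0],
                       [0.5, 0.5, 0.5, 0.5]])
fc_bias = np.zeros(3)
output = fully_connected(compressed, fc_weights, fc_bias)
print(output)  # [10.  4.  7.]
```

Because the number of columns of `fc_weights` is fixed by the flattened feature length, a model with such a layer expects one particular input shape, which is why the original image has to be matched to the model before processing.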

In this exemplary embodiment, the transposed convolutional neural network model is used to process the original image, so that the image processing model is adapted to the original images of two sizes, and the detection effect of subsequent target detection is guaranteed.

After the image processing model processes the original image, the target object in the original image can be identified. Image target detection refers to the location detection and classification of target objects in the original image, and the convolutional neural network model is widely used for its high-precision detection effect.

In addition, when the original image has been matched with the preset image processing model, the image processing model can be directly used to process the original image.

In an optional embodiment, when the original image matches the preset image processing model, the original image is processed by the image processing model to identify the target object in the original image.

When the image processing model corresponds to the horizontal shooting mode and the horizontal side of the original image is larger than the vertical side, the original image and the preset image processing model match each other; likewise, when the image processing model corresponds to the vertical shooting mode and the horizontal side of the original image is smaller than the vertical side, the original image and the preset image processing model match each other.

In both cases, the original image can be processed directly using the image processing model.

First, the untransposed weight matrix is used to perform an inner product operation on the original image to obtain image features; then, nonlinear processing is performed on the image features to obtain nonlinear features, and feature compression is performed on the nonlinear features to obtain compressed features; finally, full connection processing is performed on the compressed features to process the original image and identify the target object.

After the compressed features are obtained, the compressed features can be input to the fully connected layer of the convolutional neural network model for full connection processing. The full connection processing can map the compressed features into a long output vector and output it to realize the target detection processing of the original image.

After the original image and the image processing model are processed, the image processing model can also be deployed on the photographing device.

Specifically, this can be implemented using TFLite, a toolkit for deploying image processing models to mobile and embedded devices. The image processing model can be imported into the resource directory of a TFLite application, and running the application deploys the image processing model to the shooting device.

The image processing method in the embodiment of the present disclosure will be described in detail below with reference to an application scenario.

Figure 7 shows a schematic diagram of the interface for transposing the weight matrix. As shown in Figure 7, the first row is the horizontal shooting mode of the photographing device, in which an original image with the target object facing upward and the horizontal side larger than the vertical side can be photographed. In the case of a convolutional neural network corresponding to the horizontal mode, the target object can be identified directly using the untransposed weight matrix.

The second row is the vertical shooting mode of the shooting device, which can capture an original image with the target facing up and the horizontal side smaller than the vertical side. When the convolutional neural network model is a fully convolutional neural network model, since the fully convolutional neural network model is not sensitive to the size and shape of the original image, the untransposed weight matrix could in principle be used to identify the target. However, since TFLite does not support dynamic allocation of memory when loading a model, this cannot be achieved.

Among them, all layers in the fully convolutional neural network model are convolutional layers without fully connected layers, so they are not sensitive to the size and shape of the original image.

The third row is the vertical mode of the shooting device, in which the original image is rotated by 90 degrees so that the horizontal side of the original image is larger than the vertical side. In this case, the size of the original image matches the horizontal mode of the shooting device. In addition, in order to match the rotated original image with the image processing model adapted to the horizontal mode, the weight matrix of the image processing model can be transposed. At this time, the memory allocation has not changed, and the original image and the image processing model can be matched. Therefore, a single image processing model can be used to process the original image to complete target detection.
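The three rows of Figure 7 can be summarized in a small dispatch sketch. The function `match_image_and_model`, the `model_is_landscape` flag, and the image sizes are hypothetical, and the transposition is again read as a 90-degree clockwise rotation of the weight matrix (first row becomes last column), which is an assumption rather than a detail stated in the disclosure.

```python
import numpy as np


def match_image_and_model(image, weights, model_is_landscape=True):
    """Return an (image, weights) pair whose orientations agree."""
    height, width = image.shape[:2]
    image_is_landscape = width > height  # a square image is treated as vertical here
    if image_is_landscape == model_is_landscape:
        # Already matched: use the original image and the untransposed weights.
        return image, weights
    # First processing: rotate the image 90 degrees to the right.
    rotated_image = np.rot90(image, k=-1, axes=(0, 1))
    # Second processing (assumed reading): rotate the weights the same way.
    rotated_weights = np.rot90(weights, k=-1, axes=(0, 1))
    return rotated_image, rotated_weights


weights = np.arange(9).reshape(3, 3)
portrait = np.zeros((640, 480))   # vertical shot: 640 rows, 480 columns
landscape = np.zeros((480, 640))  # horizontal shot

matched_image, matched_weights = match_image_and_model(portrait, weights)
print(matched_image.shape)  # (480, 640)
```

The weight matrix keeps its 3*3 shape after the second processing, so the model's memory footprint is unchanged while the rotated image takes the shape the model expects.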

In the application scenario of the present disclosure, on the one hand, by performing the first processing on the original image and the second processing on the image processing model, only one image processing model needs to be stored and loaded, which greatly saves memory space and reduces the loading time. On the other hand, the target object can be identified through the processing of the original image by the image processing model, and target detection in both shooting modes can be completed while maintaining the performance of the image processing model, which satisfies the real-time requirements of target detection and makes the model easy to deploy on the shooting device for online use.

It should be noted that although the above exemplary embodiments describe the steps of the methods in the present disclosure in a specific order, this does not require or imply that these steps must be performed in this specific order, or that all steps must be performed, to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step for execution, and/or one step may be decomposed into multiple steps for execution, and the like.

Furthermore, in an exemplary embodiment of the present disclosure, an image processing apparatus is also provided. FIG. 8 shows a schematic structural diagram of an image processing apparatus. As shown in FIG. 8, the image processing apparatus may include a memory 810 and a processor 820, wherein:

a memory 810 for storing executable instructions of the processor 820;

wherein the processor 820 is configured to perform, via executing executable instructions:

obtaining an original image, the original image including the target object;

The specific details of the above image processing apparatus have been described in detail in the corresponding image processing method, and therefore are not repeated here.

With the image processing apparatus provided by this exemplary embodiment, on the one hand, by performing the first processing on the original image and the second processing on the image processing model, only one image processing model needs to be stored and loaded, which greatly saves memory space and reduces the loading time; on the other hand, the target object can be identified through the processing of the original image by the image processing model, and target detection in both shooting modes can be completed while maintaining the performance of the image processing model, which satisfies the real-time requirements of target detection and makes the model easy to deploy on the shooting device for online use.

It should be noted that although several modules or units of the image processing apparatus 800 are mentioned in the above detailed description, such division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided into multiple modules or units to be embodied.

In addition, in an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.

An electronic device 900 according to such an embodiment of the present invention is described below with reference to FIG. 9 . The electronic device 900 shown in FIG. 9 is only an example, and should not impose any limitations on the function and scope of use of the embodiments of the present invention.

As shown in FIG. 9, electronic device 900 takes the form of a general-purpose computing device. The components of the electronic device 900 may include, but are not limited to: the above-mentioned at least one processing unit 910, the above-mentioned at least one storage unit 920, a bus 930 connecting different system components (including the storage unit 920 and the processing unit 910), and a display unit 940.

The storage unit stores program code that can be executed by the processing unit 910, so that the processing unit 910 performs the steps according to various exemplary embodiments of the present invention described in the above-mentioned "Exemplary Methods" section of this specification.

The storage unit 920 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 921 and/or a cache storage unit 922 , and may further include a read only storage unit (ROM) 923 .

The storage unit 920 may also include a program/utility 924 having a set (at least one) of program modules 925, such program modules 925 including, but not limited to, an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.

The bus 930 may be representative of one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, a graphics acceleration port, a processing unit, or a local bus using any of a variety of bus structures.

The electronic device 900 may also communicate with one or more external devices 1100 (eg, keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable a user to interact with the electronic device 900, and/or with any device (eg, router, modem, etc.) that enables the electronic device 900 to communicate with one or more other computing devices. Such communication may take place through input/output (I/O) interface 950. Also, the electronic device 900 may communicate with one or more networks (eg, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 960. As shown, network adapter 960 communicates with other modules of electronic device 900 via bus 930. It should be understood that, although not shown, other hardware and/or software modules may be used in conjunction with electronic device 900, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives and data backup storage systems.

From the description of the above embodiments, those skilled in the art can easily understand that the exemplary embodiments described herein may be implemented by software, or may be implemented by software combined with necessary hardware. Therefore, the technical solutions according to the embodiments of the present disclosure may be embodied in the form of software products, and the software products may be stored in a non-volatile storage medium (which may be a CD-ROM, U disk, mobile hard disk, etc.) or on a network, and include several instructions to cause a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to an embodiment of the present disclosure.

In an exemplary embodiment of the present disclosure, there is also provided a computer-readable storage medium on which a program product capable of implementing the above-described method of the present specification is stored. In some possible embodiments, aspects of the present invention may also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to perform the steps according to various exemplary embodiments of the present invention described in the above-mentioned "Example Method" section of this specification.

Referring to FIG. 10, a program product 1000 for implementing the above method according to an embodiment of the present invention is described, which can adopt a portable compact disk read-only memory (CD-ROM), include program code, and run on a terminal device, for example a personal computer. However, the program product of the present invention is not limited thereto, and in this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.

A computer readable signal medium may include a propagated data signal in baseband or as part of a carrier wave with readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A readable signal medium can also be any readable medium, other than a readable storage medium, that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

Program code embodied on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (eg, through the Internet using an Internet service provider).

Other embodiments of the present disclosure will readily suggest themselves to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow the general principles of the present disclosure and include common knowledge or techniques in the technical field not disclosed by the present disclosure. The specification and examples are to be regarded as exemplary only, with the true scope and spirit of the disclosure being indicated by the claims.

Claims

An image processing method, comprising:

obtaining an original image, the original image including the target object;

When the original image and the preset image processing model do not match, respectively perform the first processing on the original image and the second processing on the image processing model, so that the original image and the image processing model match;

The original image is processed by the image processing model to identify the target object in the original image.
The method according to claim 1, wherein the original image is generated after the image is collected by a preset photographing device;

Wherein, when the photographing device shoots horizontally, the horizontal side of the original image is larger than the vertical side; when the photographing device shoots vertically, the horizontal side of the original image is smaller than the vertical side.
The method according to claim 1 or 2, wherein the method further comprises:

When the horizontal side of the original image is larger than the vertical side, the original image and the preset image processing model match each other; when the horizontal side of the original image is smaller than the vertical side, the original image and the preset image processing model do not match; or,

When the horizontal side of the original image is smaller than the vertical side, the original image and the preset image processing model match each other; when the horizontal side of the original image is larger than the vertical side, the original image and the preset image processing model do not match.
The method according to claim 1 or 2, wherein the target object corresponds to a horizontal edge of the original image in the original image.
The method according to claim 2, wherein when the original image does not match a preset image processing model, respectively performing the first processing on the original image and the second processing on the image processing model comprises:

Determine the shape of the original image according to the data collected by the inertial measurement unit of the photographing device;

When the shape of the original image does not match the preset image processing model, the first processing is performed on the original image, and the second processing is performed on the image processing model.
The method according to claim 5, wherein the performing the first processing on the original image comprises:

The original image is rotated by 90 degrees so that the target object in the original image is also rotated by 90 degrees.
The method according to claim 2, characterized in that, before performing the first processing on the original image and performing the second processing on the image processing model respectively, the method further comprises:

Determine the orientation of the target object of the original image according to the data collected by the inertial measurement unit of the photographing device;

When the orientation of the target object is inverted, the original image is flipped 180 degrees.
The method according to claim 1, wherein the method further comprises:

When the original image and the preset image processing model do not match, a second process is performed on the image processing model to match the original image and the image processing model.
The method according to claim 7 or 8, wherein the second processing is performed on the image processing model, comprising:

If the image processing model is a convolutional neural network model, obtain the weight matrix in the convolutional neural network model;

Transpose processing is performed on the weight matrix to perform a second processing on the convolutional neural network model.
The method according to claim 9, wherein the processing the original image by the image processing model comprises:

Perform an inner product operation on the original image by using the weight matrix after the transposition process to obtain image features;

Performing nonlinear processing on the image features to obtain nonlinear features, and performing feature compression processing on the nonlinear features to obtain compressed features;

Perform full connection processing on the compressed features to process the original image.
The method according to claim 1, wherein the method further comprises:

When the original image matches a preset image processing model, the original image is processed by the image processing model to identify the target object in the original image.
An image processing device, comprising:

processor;

a memory for storing executable instructions for the processor;

wherein the processor is configured to perform, via executing the executable instructions:

obtaining an original image, the original image including the target object;

When the original image and the preset image processing model do not match, respectively perform the first processing on the original image and the second processing on the image processing model, so that the original image and the image processing model match;

The original image is processed by the image processing model to identify the target object in the original image.
The apparatus according to claim 12, wherein the original image is generated after the image is collected by a preset photographing device;

Wherein, when the photographing device shoots horizontally, the horizontal side of the original image is larger than the vertical side; when the photographing device shoots vertically, the horizontal side of the original image is smaller than the vertical side.
The device according to claim 12 or 13, wherein the device further comprises:

When the horizontal side of the original image is larger than the vertical side, the original image and the preset image processing model match each other; when the horizontal side of the original image is smaller than the vertical side, the original image and the preset image processing model do not match; or,

When the horizontal side of the original image is smaller than the vertical side, the original image and the preset image processing model match each other; when the horizontal side of the original image is larger than the vertical side, the original image and the preset image processing model do not match.
The device according to claim 12 or 13, wherein the target object in the original image corresponds to a horizontal edge of the original image.
The device according to claim 13, wherein performing the first processing on the original image and the second processing on the image processing model, respectively, when the original image does not match the preset image processing model comprises:

determining the shape of the original image according to the data collected by the inertial measurement unit of the photographing device; and

when the shape of the original image does not match the preset image processing model, performing the first processing on the original image and the second processing on the image processing model.
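One possible way to derive the image shape from inertial measurement unit data is to compare the gravity components along the two sensor axes. The sketch below is only an illustration: the axis convention (`gx` along the long side of the sensor, `gy` along the short side), the function names, and the default `model_shape` are all assumptions and do not come from the claims.

```python
def image_shape_from_gravity(gx: float, gy: float) -> str:
    """Classify the image shape from gravity components reported by an IMU.

    Assumption: gx is measured along the long side of the image sensor and
    gy along its short side.
    """
    # Gravity mostly along the short side means the long side is level,
    # which corresponds to a horizontal (landscape) shot.
    return "landscape" if abs(gy) >= abs(gx) else "portrait"


def needs_first_and_second_processing(gx: float, gy: float,
                                      model_shape: str = "landscape") -> bool:
    """Return True when the shape inferred from the IMU differs from the model's shape."""
    return image_shape_from_gravity(gx, gy) != model_shape
```

With these assumptions, a reading such as `(gx, gy) = (9.8, 0.2)` is classified as a vertical shot and, for a model prepared for horizontal images, indicates that both processing steps are required.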
The apparatus according to claim 16, wherein the performing the first processing on the original image comprises:

rotating the original image by 90 degrees, so that the target object in the original image is also rotated by 90 degrees.
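The 90-degree rotation can be sketched on an image stored as a list of pixel rows. The claim does not specify a rotation direction, so the clockwise direction used here, like the function name `rotate_90_clockwise`, is an assumption.

```python
def rotate_90_clockwise(image):
    """Rotate a row-major 2-D image (a list of rows) by 90 degrees clockwise.

    The result has as many rows as the input has columns, so a vertical
    image becomes a horizontal one and vice versa.
    """
    # Reverse the row order, then read the columns as the new rows.
    return [list(column) for column in zip(*image[::-1])]
```

Applied to a 3-row by 2-column image `[[1, 2], [3, 4], [5, 6]]`, the function returns the 2-row by 3-column image `[[5, 3, 1], [6, 4, 2]]`: no pixel is resampled, so the content is not stretched in either direction.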
The device according to claim 13, wherein, before the first processing is performed on the original image and the second processing is performed on the image processing model, respectively, the processor is further configured to perform:

determining the orientation of the target object in the original image according to the data collected by the inertial measurement unit of the photographing device; and

when the orientation of the target object is inverted, flipping the original image by 180 degrees.
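A minimal sketch of the 180-degree correction, assuming the image is a list of pixel rows and that the inversion decision has already been made from the inertial measurement unit data. The names `flip_180` and `correct_inversion` and the boolean `is_inverted` flag are hypothetical.

```python
def flip_180(image):
    """Rotate a row-major 2-D image (a list of row lists) by 180 degrees."""
    # Reverse the row order and reverse the pixels within every row.
    return [row[::-1] for row in image[::-1]]


def correct_inversion(image, is_inverted: bool):
    """Return an upright copy when the target object is inverted, else the image itself."""
    return flip_180(image) if is_inverted else image
```

For example, `flip_180([[1, 2], [3, 4]])` gives `[[4, 3], [2, 1]]`, and flipping twice restores the original image.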
The apparatus of claim 12, wherein the apparatus further comprises:

when the original image does not match the preset image processing model, performing the second processing on the image processing model, so that the original image and the image processing model match.
The apparatus according to claim 18 or 19, wherein the performing the second processing on the image processing model comprises:

if the image processing model is a convolutional neural network model, obtaining the weight matrix in the convolutional neural network model; and

transposing the weight matrix, thereby performing the second processing on the convolutional neural network model.
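The effect of transposing a convolution weight matrix can be illustrated with a small pure-Python sketch. It demonstrates one general property only: sliding a transposed kernel over a transposed image yields the transpose of the original feature map, so a single stored set of weights can serve both layouts. The helper names are hypothetical, the sketch uses a single-channel "valid" sliding inner product, and it does not model how the transposition is combined with the 90-degree rotation of the first processing.

```python
def transpose(matrix):
    """Swap the rows and columns of a 2-D list."""
    return [list(column) for column in zip(*matrix)]


def correlate2d_valid(image, kernel):
    """Slide the kernel over the image and take the inner product at each position."""
    image_h, image_w = len(image), len(image[0])
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    output = []
    for r in range(image_h - kernel_h + 1):
        row = []
        for c in range(image_w - kernel_w + 1):
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kernel_h)
                           for j in range(kernel_w)))
        output.append(row)
    return output


# Hypothetical data: a 3x4 image and an asymmetric 2x2 weight matrix.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]
kernel = [[1, 2],
          [3, 4]]

original_features = correlate2d_valid(image, kernel)
transposed_features = correlate2d_valid(transpose(image), transpose(kernel))
```

Here `transposed_features` equals `transpose(original_features)`, which is why transposing the stored weights, rather than keeping a second model, is enough to handle the transposed layout.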
The apparatus according to claim 20, wherein the processing the original image by the image processing model comprises:

performing an inner product operation on the original image by using the transposed weight matrix to obtain image features;

performing nonlinear processing on the image features to obtain nonlinear features, and performing feature compression processing on the nonlinear features to obtain compressed features; and

performing fully connected processing on the compressed features to complete the processing of the original image.
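The sequence of steps in this claim can be sketched as a tiny single-channel forward pass. The sketch makes several assumptions that the claims do not state: ReLU is used as the nonlinear processing, 2x2 max pooling is used as the feature compression, and all function names and numeric values are hypothetical.

```python
def inner_product_features(image, kernel):
    """Sliding inner product of the (already transposed) kernel with the image."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(len(image[0]) - kw + 1)]
            for r in range(len(image) - kh + 1)]


def relu(features):
    """Nonlinear processing (assumed here to be ReLU)."""
    return [[max(0, value) for value in row] for row in features]


def max_pool_2x2(features):
    """Feature compression (assumed here to be 2x2 max pooling, stride 2)."""
    return [[max(features[r][c], features[r][c + 1],
                 features[r + 1][c], features[r + 1][c + 1])
             for c in range(0, len(features[0]) - 1, 2)]
            for r in range(0, len(features) - 1, 2)]


def fully_connected(features, weights, bias):
    """Flatten the compressed features and apply one fully connected layer."""
    flat = [value for row in features for value in row]
    return [sum(w * x for w, x in zip(row_weights, flat)) + b
            for row_weights, b in zip(weights, bias)]


def forward(image, kernel, fc_weights, fc_bias):
    """Inner product, nonlinearity, compression, then full connection."""
    compressed = max_pool_2x2(relu(inner_product_features(image, kernel)))
    return fully_connected(compressed, fc_weights, fc_bias)


# Hypothetical values chosen only to make the arithmetic easy to follow.
scores = forward(image=[[1, 2, 0], [0, 1, 3], [2, 0, 1]],
                 kernel=[[1, -1], [0, 1]],
                 fc_weights=[[2], [-1]],
                 fc_bias=[0, 1])
```

In this example the inner product gives `[[0, 5], [-1, -1]]`, ReLU gives `[[0, 5], [0, 0]]`, pooling gives `[[5]]`, and the fully connected layer produces `scores == [10, -4]`.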
The apparatus of claim 12, wherein the apparatus further comprises:

When the original image matches a preset image processing model, the original image is processed by the image processing model to identify the target object in the original image.
A computer-readable medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, the image processing method according to any one of claims 1-11 is implemented.
An electronic device, comprising:

a processor; and

a memory for storing executable instructions for the processor;

Wherein, the processor is configured to perform the image processing method of any one of claims 1-11 by executing the executable instructions.