WO2022198517A1 - Image processing method and apparatus, and medium and electronic device - Google Patents

Image processing method and apparatus, and medium and electronic device

Info

Publication number
WO2022198517A1
WO2022198517A1 PCT/CN2021/082809 CN2021082809W WO2022198517A1 WO 2022198517 A1 WO2022198517 A1 WO 2022198517A1 CN 2021082809 W CN2021082809 W CN 2021082809W WO 2022198517 A1 WO2022198517 A1 WO 2022198517A1
Authority
WO
WIPO (PCT)
Prior art keywords
original image
image
processing
image processing
model
Prior art date
Application number
PCT/CN2021/082809
Other languages
French (fr)
Chinese (zh)
Inventor
聂谷洪
施泽浩
王栋
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司 filed Critical 深圳市大疆创新科技有限公司
Priority to PCT/CN2021/082809 priority Critical patent/WO2022198517A1/en
Publication of WO2022198517A1 publication Critical patent/WO2022198517A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition

Definitions

  • the present disclosure relates to the technical field of image processing, and in particular, to an image processing method, an image processing apparatus, a computer-readable medium, and an electronic device.
  • Imaging devices with both horizontal and vertical shooting modes often need to run functions such as object detection.
  • functions such as target detection can only be realized by adapting two image processing models corresponding to the two shooting modes of horizontal shooting and vertical shooting.
  • the present disclosure provides an image processing method, an image processing apparatus, a computer-readable medium, and an electronic device, which alleviate, at least to a certain extent, the technical problem of wasted memory space in the prior art.
  • an image processing method comprising: acquiring an original image, the original image including a target object;
  • the original image is processed by the image processing model to identify the target object in the original image.
  • the original image is generated after an image is captured by a preset photographing device
  • the horizontal side of the original image is larger than the vertical side; when the photographing device shoots vertically, the horizontal side of the original image is smaller than the vertical side.
  • the method further includes:
  • the target object corresponds to a horizontal edge of the original image in the original image.
  • the first processing is performed on the original image and the second processing is performed on the image processing model, respectively.
  • the first processing is performed on the original image
  • the second processing is performed on the image processing model
  • performing the first processing on the original image includes:
  • the original image is rotated by 90 degrees so that the target object in the original image is also rotated by 90 degrees.
  • before performing the first processing on the original image and the second processing on the image processing model respectively, the method further includes:
  • the original image is flipped 180 degrees.
  • the method further includes:
  • performing the second processing on the image processing model includes:
  • if the image processing model is a convolutional neural network model, obtaining the weight matrix in the convolutional neural network model;
  • performing transposition processing on the weight matrix to perform the second processing on the convolutional neural network model.
  • the processing of the original image by the image processing model includes:
  • the method further includes:
  • the original image matches a preset image processing model
  • the original image is processed by the image processing model to identify the target object in the original image.
  • an image processing apparatus comprising: a processor;
  • a memory for storing executable instructions for the processor
  • processor is configured to perform, via executing the executable instructions:
  • the original image including the target object
  • the original image is processed by the image processing model to identify the target object in the original image.
  • the original image is generated after an image is captured by a preset photographing device
  • the horizontal side of the original image is larger than the vertical side; when the photographing device shoots vertically, the horizontal side of the original image is smaller than the vertical side.
  • the apparatus further includes:
  • the target object corresponds to a horizontal edge of the original image in the original image.
  • the first processing is performed on the original image and the second processing is performed on the image processing model, respectively.
  • the first processing is performed on the original image
  • the second processing is performed on the image processing model
  • performing the first processing on the original image includes:
  • the original image is rotated by 90 degrees so that the target object in the original image is also rotated by 90 degrees.
  • before performing the first processing on the original image and the second processing on the image processing model respectively, the apparatus further includes:
  • the original image is flipped 180 degrees.
  • the apparatus further includes:
  • performing the second processing on the image processing model includes:
  • if the image processing model is a convolutional neural network model, obtaining the weight matrix in the convolutional neural network model;
  • performing transposition processing on the weight matrix to perform the second processing on the convolutional neural network model.
  • the processing of the original image by the image processing model includes:
  • the apparatus further includes:
  • the original image matches a preset image processing model
  • the original image is processed by the image processing model to identify the target object in the original image.
  • a computer-readable medium on which a computer program is stored, and when the computer program is executed by a processor, implements any one of the image processing methods provided in the first aspect.
  • an electronic device comprising:
  • a memory for storing executable instructions for the processor
  • the processor is configured to execute any one of the image processing methods provided in the first aspect by executing the executable instructions
  • with the image processing method and apparatus, on the one hand, by performing the first processing on the original image and the second processing on the image processing model, only one image processing model needs to be stored and loaded, which greatly saves memory space and reduces the loading time; on the other hand, the target object can be identified by processing the original image with the image processing model, so target detection in both shooting modes can be completed while preserving the performance of the image processing model, which meets the real-time requirements of target detection and is easy to deploy on shooting equipment for online use.
  • FIG. 1 shows a schematic flowchart of an image processing method in an exemplary embodiment of the present disclosure
  • FIG. 2 shows a schematic flowchart of a method for flipping an original image in an exemplary embodiment of the present disclosure
  • FIG. 3 shows interface schematic diagrams of four orientations of a target object in an exemplary embodiment of the present disclosure
  • FIG. 4 shows a schematic flowchart of a method for respectively performing a first process and a second process in an exemplary embodiment of the present disclosure
  • FIG. 5 shows a schematic flowchart of a method for performing a second process in an exemplary embodiment of the present disclosure
  • FIG. 6 shows a schematic flowchart of a method for processing an original image in an exemplary embodiment of the present disclosure
  • FIG. 7 shows a schematic interface diagram of transposing a weight matrix in an exemplary embodiment of the present disclosure
  • FIG. 8 shows a schematic flowchart of an image processing apparatus in an exemplary embodiment of the present disclosure
  • FIG. 9 schematically shows an electronic device for implementing an image processing method in an exemplary embodiment of the present disclosure
  • FIG. 10 schematically illustrates a computer-readable storage medium for implementing an image processing method in an exemplary embodiment of the present disclosure.
  • Example embodiments will now be described more fully with reference to the accompanying drawings.
  • Example embodiments can be embodied in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
  • the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
  • numerous specific details are provided in order to give a thorough understanding of the embodiments of the present disclosure.
  • those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced without one or more of the specific details, or other methods, components, devices, steps, etc. may be employed.
  • well-known solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
  • the image processing model is differentiated between the horizontal and vertical shooting modes, so two image processing models need to be maintained. Since the size of image processing models such as convolutional neural networks is often measured in megabytes, applications that deploy them can easily become oversized. Moreover, to ensure smooth switching, both image processing models need to be loaded at the same time, which wastes memory space and increases the loading time.
  • the image processing model in the landscape mode can be combined with the image processing model in the portrait mode.
  • the input image size is usually square.
  • resizing to a square input deforms horizontal and vertical images in the horizontal and vertical directions, respectively, causing image distortion, degrading the performance of functions such as target detection, and worsening the user experience.
  • the fully convolutional neural network model can also make full use of the insensitivity to the size and shape of the image, and use the fully convolutional neural network to process the horizontal and vertical images respectively.
  • the current mobile deployment platform is usually TFLite.
  • TFLite does not support dynamic memory allocation. In that case, one of the image processing models has to be converted offline into the other, so only one image processing model needs to be maintained. However, the application is still too large, the loading time is still too long, and the problem of wasted memory remains.
  • the present disclosure provides an image processing method, an image processing apparatus, a computer-readable medium, and an electronic device.
  • Various aspects of the present exemplary embodiment are described in detail below.
  • FIG. 1 shows a schematic flowchart of an image processing method in this exemplary embodiment. As shown in FIG. 1, the method includes at least the following steps S110, S120 and S130. Specifically:
  • Step S110 Acquire an original image, where the original image includes the target object.
  • Step S120 When the original image and the preset image processing model do not match, respectively perform the first processing on the original image and the second processing on the image processing model, so that the original image and the image processing model match.
  • Step S130 Process the original image through an image processing model to identify the target object in the original image.
  • the target object can be identified by processing the original image with the image processing model, and target detection in both shooting modes can be completed while preserving the performance of the image processing model, which meets the real-time requirements of target detection and is easy to deploy and use online on shooting equipment.
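  • Steps S110 to S130 can be sketched roughly as follows. This is a minimal illustration, not the implementation of the disclosure: it assumes NumPy arrays for the image and the kernels, assumes a 90-degree clockwise rotation for both the first and the second processing, and the name `prepare` is hypothetical.

```python
import numpy as np

def prepare(image, kernels, model_is_landscape):
    """Sketch of step S120: return an (image, kernels) pair that match.

    image: 2-D or 3-D array shaped (height, width[, channels]).
    kernels: list of 2-D convolution kernels (weight matrices).
    model_is_landscape: True if the stored model expects width > height.
    """
    height, width = image.shape[:2]
    image_is_landscape = width > height
    if image_is_landscape == model_is_landscape:
        # Already matched: the model can be applied directly (step S130).
        return image, kernels
    # First processing: rotate the image 90 degrees clockwise.
    rotated_image = np.rot90(image, k=-1)
    # Second processing: rotate every kernel the same way, so the first
    # row of each weight matrix becomes its last column.
    rotated_kernels = [np.rot90(kernel, k=-1) for kernel in kernels]
    return rotated_image, rotated_kernels

# A portrait image (4 rows, 3 columns) fed to a landscape model.
portrait = np.zeros((4, 3))
kernels = [np.arange(9).reshape(3, 3)]
out_image, out_kernels = prepare(portrait, kernels, model_is_landscape=True)
```

  • Only one set of weights is stored in this sketch; the rotated copy is derived on demand, which is the memory-saving idea described above.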
  • step S110 an original image is acquired, and the original image includes the target object.
  • the inclusion of the target object in the original image indicates that the original image is an image obtained by photographing the target object.
  • the target object is, for example, a person, a tree, a vehicle, or a boat, which is not particularly limited in this exemplary embodiment.
  • the original image is generated by a preset photographing device after it captures an image; when the photographing device shoots horizontally, the horizontal side of the original image is larger than the vertical side; when the photographing device shoots vertically, the horizontal side of the original image is smaller than the vertical side.
  • the photographing device may be an imaging device having two photographing modes of horizontal shooting and vertical shooting.
  • the photographing device may be a mobile phone, a camera, or another imaging device, which is not particularly limited in this exemplary embodiment.
  • the horizontal edge of the original image obtained by shooting is larger than the vertical edge; and when the shooting device is in the vertical shooting mode, the horizontal edge of the original image obtained by shooting is smaller than the vertical edge.
  • the target object corresponds to a horizontal edge of the original image in the original image. That is, when the original image is displayed, the target object faces one of the horizontal sides of the original image.
  • step S120 when the original image and the preset image processing model do not match, respectively perform the first processing on the original image and the second processing on the image processing model, so that the original image and the image processing model match.
  • an image processing model may be preset. To determine whether the image processing model can perform image processing on the original image, it may be determined whether the original image matches the preset image processing model.
  • the target object has two orientations.
  • a photographing device such as a mobile phone
  • the target object has the first orientation in the original picture
  • the shooting device such as a mobile phone
  • the end to the bottom of the mobile phone is placed from top to bottom
  • the target object is in the original picture.
  • the target object has the first orientation, and when the phone is placed from the bottom to the bottom, the target object has the second orientation in the original screen. Therefore, the orientation of the target object can be judged by using the inertial measurement unit on the photographing device first, so as to perform processing such as flipping.
  • FIG. 2 shows a schematic flowchart of a method for flipping an original image.
  • the method includes at least the following steps: In step S210, the orientation of the target object of the original image is determined according to the data collected by the inertial measurement unit of the photographing device.
  • the inertial measurement unit (IMU) is composed of three single-axis acceleration sensors and three single-axis angular velocity sensors (gyroscopes), and measures IMU data, including the acceleration data and angular velocity data of the photographing device in three-dimensional space.
  • the IMU can be installed in a portable device, such as a wearable writing device or a handheld device, so that the motion posture of the photographing device can be calculated from the IMU data it measures.
  • FIG. 3 shows a schematic interface diagram of four orientations of the target object.
  • in direction A, the orientation of the target object is from bottom to top in the vertical direction; in direction B, the orientation of the target object is from top to bottom in the vertical direction; in direction C, the orientation of the target object is from right to left in the horizontal direction; in direction D, the orientation of the target object is from left to right in the horizontal direction.
  • step S220 when the orientation of the target object is upside down, the original image is flipped 180 degrees.
  • the preset image processing model may be a model matching the horizontal shooting mode, or may be a model matching the vertical shooting mode.
  • the original image in either of directions A and B can be regarded as an inverted image of the other, that is, the target object is upside down relative to the original image in the other direction, so the original image can be flipped 180 degrees.
  • likewise, the original image in either of directions C and D can be regarded as an inverted image of the other, so the original image can also be flipped 180 degrees.
  • in this exemplary embodiment, the original image can be flipped according to the orientation of the target object so that the original image faces upward, which reduces the computational cost of the subsequent matching judgment and image processing and improves the efficiency of image processing.
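  • The 180-degree flip of step S220 can be illustrated with a small sketch. It assumes the image is a list of pixel rows and that an `upside_down` flag has already been derived from the IMU data; both the flag and the function names are hypothetical.

```python
def flip_180(image):
    """Flip an image 180 degrees: reverse the row order and each row."""
    return [row[::-1] for row in image[::-1]]

def normalize_orientation(image, upside_down):
    """Return the image with the target object facing up.

    upside_down is assumed to come from the IMU-based orientation check.
    """
    return flip_180(image) if upside_down else image

# Pixel 1 starts in the top-left corner and ends in the bottom-right.
flipped = normalize_orientation([[1, 2], [3, 4]], upside_down=True)
```

  • Flipping twice restores the original image, so the operation loses no information.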
  • after the original image is flipped, it can be further determined whether the original image matches the preset image processing model.
  • when the horizontal side of the original image is larger than the vertical side, the original image matches the preset image processing model, and when the horizontal side is smaller than the vertical side, they do not match; or, when the horizontal side of the original image is smaller than the vertical side, the original image matches the preset image processing model, and when the horizontal side is larger than the vertical side, they do not match.
  • the preset image processing model is a model matching the horizontal shooting mode of the photographing device, and the original image is obtained by shooting in the horizontal shooting mode, the original image matches the preset image processing model.
  • the horizontal side of the original image captured in the horizontal shooting mode is larger than the vertical side.
  • the original image when the original image is obtained by using the vertical shooting mode of the photographing device, the original image does not match the preset image processing model.
  • the horizontal side of the original image captured in the vertical shooting mode is smaller than the vertical side.
  • the preset image processing model is a model matching the vertical shooting mode of the photographing device, and the original image is obtained by shooting in the vertical shooting mode, the original image matches the preset image processing model.
  • the horizontal side of the original image captured in the vertical shooting mode is smaller than the vertical side.
  • the original image when the original image is obtained by using the horizontal shooting mode of the photographing device, the original image does not match the preset image processing model.
  • the horizontal side of the original image captured in the horizontal shooting mode is larger than the vertical side.
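  • The matching rule above amounts to comparing the longer side of the image with the shooting mode the model was built for. A minimal sketch follows; the function name and the mode strings are assumptions, and the square case, which the text does not cover, is rejected explicitly.

```python
def matches_model(width, height, model_mode):
    """Return True if the image shape matches the preset model.

    model_mode is "horizontal" or "vertical" (labels chosen here for
    illustration only).
    """
    if model_mode not in ("horizontal", "vertical"):
        raise ValueError("model_mode must be 'horizontal' or 'vertical'")
    if width == height:
        raise ValueError("square images are not covered by this rule")
    image_mode = "horizontal" if width > height else "vertical"
    return image_mode == model_mode

landscape_ok = matches_model(1920, 1080, "horizontal")   # sides agree
portrait_ok = matches_model(1080, 1920, "horizontal")    # sides disagree
```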
  • the first processing may be performed on the original image
  • the second processing may be performed on the image processing model to match the original image and the image processing model.
  • FIG. 4 shows a schematic flowchart of a method for performing the first processing and the second processing respectively. As shown in FIG. 4, the method includes at least the following steps: In step S410, the shape of the original image is determined according to the data collected by the inertial measurement unit of the photographing device.
  • the shape of the original image can also be determined.
  • the shape of the original image may be an aspect ratio of the original image.
  • the preset image processing model is a model that matches the horizontal shooting mode of the photographing device, indicating that the image processing model matches the original image with an aspect ratio greater than 1.
  • the preset image processing model is a model that matches the vertical shooting mode of the photographing device, indicating that the image processing model matches the original image with an aspect ratio smaller than 1.
  • step S420 when the shape of the original image does not match the preset image processing model, the first processing is performed on the original image, and the second processing is performed on the image processing model.
  • when the aspect ratio of the original image is smaller than 1 and the preset image processing model is a model matching the horizontal shooting mode of the photographing device, the shape of the original image does not match the preset image processing model.
  • when the aspect ratio of the original image is larger than 1 and the preset image processing model is a model matching the vertical shooting mode of the photographing device, the shape of the original image likewise does not match the preset image processing model.
  • a first processing can be performed on the original image and a second processing on the image processing model.
  • the shape of the original image can be determined through the data collected by the inertial measurement unit, and the first processing and the second processing can be further determined.
  • the determination method is simple and accurate, and has strong applicability.
  • performing the first processing on the original image may be rotation processing.
  • the original image is rotated by 90 degrees so that the target object in the original image is also rotated by 90 degrees.
  • when the image processing model is a model matching the horizontal shooting mode of the photographing device, the vertically shot original image can be rotated by 90 degrees, so that the target object in the original image is also rotated by 90 degrees;
  • when the image processing model is a model matching the vertical shooting mode of the photographing device, the horizontally shot original image can be rotated by 90 degrees, so that the target object in the original image is also rotated by 90 degrees.
  • Rotating the original image by 90 degrees can make the size or shape of the original image match the preset image processing model.
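  • The effect of the first processing on the image shape can be checked with a short sketch. It assumes a NumPy array shaped (height, width) and a clockwise rotation; the text does not fix the array layout, and the image size is only an example.

```python
import numpy as np

# A vertically shot image: 1920 rows (height) by 1080 columns (width).
portrait = np.zeros((1920, 1080), dtype=np.uint8)
portrait[0, 0] = 255  # mark the top-left pixel to track where it goes

# k=-1 rotates 90 degrees clockwise ("to the right").
landscape = np.rot90(portrait, k=-1)

rotated_shape = landscape.shape          # height and width are swapped
marker_position = np.argwhere(landscape == 255)[0].tolist()
```

  • After the rotation the horizontal side is the longer one, and the marked top-left pixel ends up in the top-right corner, which is how the target object ends up lying on its side.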
  • the second processing may also be performed on the image processing model.
  • FIG. 5 shows a schematic flowchart of a method for performing the second processing.
  • the method includes at least the following steps: In step S510, if the image processing model is a convolutional neural network model, the weight matrix in the convolutional neural network model is obtained.
  • CNN Convolutional Neural Network
  • a typical CNN model includes convolutional layers, pooling layers, activation layers, and fully-connected layers.
  • the upper layer performs corresponding operations based on the input data, and outputs the operation results to the next layer. After the operation, a final result is obtained.
  • the convolution operation of the convolution layer may use a convolution kernel (also called a filter) to operate on the image and then output another image
  • the operation may be an inner product of the feature values of the image and the weights of the convolution kernel.
  • each preset feature-extraction convolutional neural network can be designed with multiple convolutional layers, and each convolutional layer can specify the size of the convolution kernel that traverses the feature map of the input layer and the traversal step size of the convolution kernel on that feature map.
  • For example, if the size of the feature map of the input layer is 32*32*3, the size of the convolution kernel is 5*5, and the traversal step size is 1, then the size of the feature map output by the convolutional layer is 28*28*3.
  • the convolution kernel of the convolutional layer is the weight matrix in the convolutional neural network model.
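  • The 32 to 28 figure in the example follows from the usual output-size arithmetic for a convolution without padding. The sketch below uses the standard formula, which the text implies but does not state; the `padding` parameter is an addition for illustration.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# 32-wide input, 5-wide kernel, step size 1, no padding.
side = conv_output_size(32, 5, stride=1)
```

  • The same formula applies to height and width independently, so a 32*32 input yields a 28*28 output in each channel.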
  • step S520 transpose processing is performed on the weight matrix to perform second processing on the convolutional neural network model.
  • the convolutional neural network model is a model corresponding to the horizontal shooting mode of the shooting device
  • the weight matrix of the convolutional neural network model can be transposed.
  • the convolutional neural network model is a model corresponding to the vertical shooting mode of the shooting device
  • the weight matrix of the convolutional neural network model can also be transposed.
  • the weight matrix can be transposed.
  • the transposition can be mirror flipping of all elements of the weight matrix around a ray of 45 degrees to the lower right starting from the elements in the first row and the first column, which can obtain the transposed matrix of the weight matrix. That is, the first row of the weight matrix becomes the first column, the second row becomes the second column, ... and the last row becomes the last column.
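  • The mirror flip about the main diagonal described above is the ordinary matrix transpose. A minimal sketch on a 3*3 matrix, with the matrix values chosen only for illustration:

```python
def transpose(matrix):
    """Mirror a matrix about its main diagonal: rows become columns."""
    return [list(column) for column in zip(*matrix)]

weights = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
transposed = transpose(weights)
```

  • The first row 1, 2, 3 becomes the first column, and transposing twice gives back the original matrix.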
  • the direction of the transposition processing is related to the orientation of the target object after the first processing. Specifically, the direction of the transposition coincides with the rotation direction of the first processing. For example, as shown in FIG. 7, if the first processing rotates the vertically shot original image by 90 degrees to the right, the target object is also rotated by 90 degrees to the right, and the weight matrix also needs to be transposed to the right, that is, the first row becomes the last column of the transposed matrix, so that the transposed matrix corresponds to the orientation of the target object and the target object can be accurately identified.
  • in this exemplary embodiment, transposing the weight matrix realizes the second processing of the convolutional neural network model, unifies the image processing model for the two kinds of original images, and guarantees the image processing effect.
  • the shape of the original image, that is, the size
  • the target object is also matched with the processing orientation of the image processing model, so that the original image and the image processing model match.
  • the original image when the original image is an image whose horizontal side is smaller than the vertical side in the vertical shooting mode, the original image is rotated 90 degrees to the right, so that the horizontal side of the original image is larger than the vertical side.
  • when the target object in the original image is a portrait with its head up and feet down, and the target object is rotated 90 degrees to the right together with the original image, the head of the portrait faces the right side and the feet face the left side.
  • for example, the weight matrix obtained from the convolutional neural network model has 1, 2, 3 in the first row, 4, 5, 6 in the second row, and 7, 8, 9 in the third row. So that the transposed weight matrix can still process the corresponding positions of the portrait in the original image, the weight matrix can be transposed to the right, so that 1, 2, 3 from the first row still process the head of the portrait, 4, 5, 6 from the second row still process the body of the portrait, and 7, 8, 9 from the last row still process the feet of the portrait, thereby matching the original image with the image processing model.
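  • The "transpose to the right" in this example moves the first row into the last column, which is a 90-degree clockwise rotation rather than the mirror flip of an ordinary transpose. A minimal sketch on the 1 to 9 matrix from the text follows; the function name is hypothetical.

```python
def transpose_right(matrix):
    """Rotate 90 degrees clockwise: the first row becomes the last column."""
    return [list(row) for row in zip(*matrix[::-1])]

weights = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
rotated = transpose_right(weights)
# rotated == [[7, 4, 1],
#             [8, 5, 2],
#             [9, 6, 3]]
```

  • The weights 1, 2, 3 now sit in the rightmost column, where the head of the rotated portrait lies, and 7, 8, 9 sit in the leftmost column, where the feet lie.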
  • a second process is performed on the image processing model to match the original image and the image processing model.
  • the image processing model is a convolutional neural network model
  • obtain the weight matrix in the convolutional neural network model; then, the weight matrix is transposed to perform the second processing on the convolutional neural network model.
  • the convolution operation of the convolutional layer in the convolutional neural network model may use the convolution kernel to operate on the image and then output another image; the operation may be an inner product operation.
  • each preset feature-extraction convolutional neural network can be designed with multiple convolutional layers, and each convolutional layer can specify the size of the convolution kernel that traverses the feature map of the input layer and the traversal step size of the convolution kernel on that feature map.
  • For example, if the size of the feature map of the input layer is 32*32*3, the size of the convolution kernel is 5*5, and the traversal step size is 1, then the size of the feature map output by the convolutional layer is 28*28*3.
  • the convolution kernel of the convolutional layer is the weight matrix in the convolutional neural network model.
  • the convolutional neural network model is a model corresponding to the horizontal shooting mode of the shooting device
  • the weight matrix of the convolutional neural network model can be transposed.
  • the convolutional neural network model is a model corresponding to the vertical shooting mode of the shooting device
  • the weight matrix of the convolutional neural network model can also be transposed.
  • the weight matrix can be transposed.
  • the transposition can be mirror flipping of all elements of the weight matrix around a ray of 45 degrees to the lower right starting from the elements in the first row and the first column, which can obtain the transposed matrix of the weight matrix. That is, the first row of the weight matrix becomes the first column, the second row becomes the second column, ... and the last row becomes the last column.
  • in this exemplary embodiment, a processing method is provided for one of the mismatched cases, which unifies the image processing model for the two kinds of original images in that case, guarantees the image processing effect, and broadens the application scenarios of the image processing model.
  • step S130 the original image is processed by an image processing model to identify the target object in the original image.
  • the original image may be processed using the image processing model.
  • FIG. 6 shows a flow chart of steps of a method for processing an original image.
  • the method includes at least the following steps: In step S610, an inner product operation is performed on the original image using the transposed weight matrix to obtain image features.
  • the input layer of the convolutional neural network model can detect pixel features of each region of the original image, such as pixel grayscale values of each region. Further, the convolutional layer of the convolutional neural network model can perform an inner product operation on the pixel features to obtain image features.
  • the inner product operation is performed by sliding the convolution kernel, that is, the weight matrix. Taking the upper left corner of the original image as the starting point, sliding the weight matrix to the lower right corner of the original image generates a feature map. Among them, after each sliding of the weight matrix, a feature matrix with the same size as the weight matrix can be extracted from the original image, and the corresponding image features can be generated by performing an inner product operation on the feature matrix and the weight matrix.
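  • The sliding inner product can be sketched as below, together with a numerical check that rotating both the image and the weight matrix by 90 degrees clockwise yields the same features as the unrotated pair, only rotated. This is an illustration under assumptions: a single-channel image, step size 1, no padding, and hypothetical names.

```python
import numpy as np

def slide_inner_product(image, weights):
    """Slide the weight matrix from the top-left to the bottom-right corner
    and take the inner product with each same-sized patch."""
    kh, kw = weights.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    features = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            features[i, j] = np.sum(patch * weights)
    return features

rng = np.random.default_rng(0)
portrait = rng.random((6, 4))                      # vertically shot image
weights = np.arange(1.0, 10.0).reshape(3, 3)       # rows 1-3, 4-6, 7-9

reference = slide_inner_product(portrait, weights)
rotated = slide_inner_product(np.rot90(portrait, k=-1),
                              np.rot90(weights, k=-1))
same = np.allclose(np.rot90(reference, k=-1), rotated)
```

  • For a single layer of this kind, the check shows that the rotated pair extracts the same feature values at the correspondingly rotated positions.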
  • in step S620, nonlinear processing is performed on the image features to obtain nonlinear features, and feature compression processing is performed on the nonlinear features to obtain compressed features.
  • the activation function of the convolutional neural network model can add nonlinear factors to the image features to improve the feature representation effect of the image features.
  • a specific activation function can be used to perform point-to-point mapping to obtain nonlinear features.
  • the pooling layer of the convolutional neural network model is used to compress the nonlinear features and reduce the computational complexity of the convolutional neural network.
  • the feature compression processing may adopt a sliding window manner to obtain compressed features, or may adopt other manners, which are not particularly limited in this exemplary embodiment.
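The nonlinear processing and feature compression steps can be sketched as below. The text does not name a specific activation function or pooling scheme, so ReLU and non-overlapping 2x2 max pooling are assumptions made purely for illustration.

```python
import numpy as np

def relu(features):
    # Point-to-point mapping: negative values become zero (assumed activation).
    return np.maximum(features, 0.0)

def max_pool(features, size=2):
    # Non-overlapping window; trailing rows/columns that do not fill a window are dropped.
    h, w = features.shape
    h_trim, w_trim = h - h % size, w - w % size
    trimmed = features[:h_trim, :w_trim]
    blocks = trimmed.reshape(h_trim // size, size, w_trim // size, size)
    return blocks.max(axis=(1, 3))

# Illustrative image features.
features = np.array([[-1.0, 2.0, 0.5, -3.0],
                     [4.0, -2.0, 1.5, 0.0],
                     [0.0, 0.0, -1.0, -1.0],
                     [3.0, -5.0, -2.0, 6.0]])
nonlinear = relu(features)
compressed = max_pool(nonlinear, size=2)
```

Here the 4x4 feature map is reduced to 2x2, which is the kind of reduction the compression step is meant to provide.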
  • in step S630, full connection processing is performed on the compressed features to process the original image.
  • the compressed features can be input to the fully connected layer of the convolutional neural network model for full connection processing.
  • Fully connected processing can map the compressed features into a long output vector and output it to process the original image.
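Full connection processing can be sketched as flattening the compressed features and applying one affine map. The weights, bias, and output length below are hypothetical placeholders, not values from the disclosure.

```python
import numpy as np

def fully_connected(compressed, fc_weights, fc_bias):
    # Flatten the compressed features into a vector, then map it to the output vector.
    flat = compressed.reshape(-1)
    return fc_weights @ flat + fc_bias

compressed = np.array([[4.0, 1.5],
                       [3.0, 6.0]])
fc_weights = np.ones((3, 4))            # hypothetical: 3 outputs, 4 inputs
fc_bias = np.array([0.0, 1.0, -1.0])    # hypothetical bias
output_vector = fully_connected(compressed, fc_weights, fc_bias)
```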
  • the transposed convolutional neural network model is used to process the original image, so that the image processing model is adapted to the original images of two sizes, and the detection effect of subsequent target detection is guaranteed.
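One way to see why transposed weights can serve an image of the other shape: for a plain sliding inner product, transposing both the image and the kernel yields the transpose of the original feature map. The sketch below checks this numerically for a single layer with stride 1 and no padding. It illustrates the transpose identity only and is not claimed to reproduce the full procedure of the disclosure; all names and sizes are assumptions.

```python
import numpy as np

def cross_correlate(image, kernel):
    """Sliding inner product, stride 1, no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
landscape = rng.standard_normal((4, 6))   # horizontal side larger than vertical side
kernel = rng.standard_normal((3, 3))      # square kernel: shape unchanged by transposition

reference = cross_correlate(landscape, kernel)
from_transposed = cross_correlate(landscape.T, kernel.T)

# The feature map of the transposed pair is the transpose of the reference.
identity_holds = np.allclose(from_transposed, reference.T)
```

Note that a 90-degree rotation differs from a transposition by an additional flip, so the identity shown here is for transposed inputs specifically.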
  • Image target detection refers to the location detection and classification of target objects in the original image, and the convolutional neural network model is widely used for its high-precision detection effect.
  • the image processing model can be directly used to process the original image.
  • the original image matches the preset image processing model
  • the original image is processed by the image processing model to identify the target object in the original image.
  • when the horizontal side of the original image is larger than the vertical side, the original image and the preset image processing model match each other; alternatively, when the horizontal side of the original image is smaller than the vertical side, the original image and the preset image processing model match each other.
  • the original image can be processed directly using the image processing model.
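The matching rule can be sketched as a comparison of the two sides of the image against the shape the model was prepared for. The function name and the `model_expects_landscape` flag are hypothetical, and the handling of square images is an assumption the text does not address.

```python
def matches_model(width, height, model_expects_landscape=True):
    """Return True when the image shape agrees with the shape the model expects.

    width  -- length of the horizontal side of the original image
    height -- length of the vertical side of the original image
    """
    # Assumption: a square image is treated as "not landscape".
    is_landscape = width > height
    return is_landscape == model_expects_landscape
```

For example, `matches_model(1920, 1080)` is true for a model prepared for horizontal shooting, while `matches_model(1080, 1920)` is false and would trigger the first and second processing.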
  • the input layer of the convolutional neural network model can detect pixel features of each region of the original image, such as pixel grayscale values of each region. Further, the convolutional layer of the convolutional neural network model can perform an inner product operation on the pixel features to obtain image features.
  • the inner product operation is performed by sliding the convolution kernel, that is, the weight matrix. Taking the upper left corner of the original image as the starting point, sliding the weight matrix to the lower right corner of the original image generates a feature map. Among them, after each sliding of the weight matrix, a feature matrix with the same size as the weight matrix can be extracted from the original image, and the corresponding image features can be generated by performing an inner product operation on the feature matrix and the weight matrix.
  • the activation function of the convolutional neural network model can add nonlinear factors to the image features to improve the feature representation effect of the image features.
  • a specific activation function can be used to perform point-to-point mapping to obtain nonlinear features.
  • the pooling layer of the convolutional neural network model is used to compress the nonlinear features and reduce the computational complexity of the convolutional neural network.
  • the feature compression processing may adopt a sliding window manner to obtain compressed features, or may adopt other manners, which are not particularly limited in this exemplary embodiment.
  • the compressed features can be input to the fully connected layer of the convolutional neural network model for full connection processing.
  • the full connection processing can map the compressed features into a long output vector and output it to realize the target detection processing of the original image.
  • the image processing model can also be deployed on the photographing device.
  • TFLite is a toolkit for deploying image processing models to mobile and embedded devices.
  • the exported image processing model can be imported into the resource directory of the TFLite application, and the application can then be run to deploy the image processing model to the shooting device.
  • Figure 7 shows a schematic diagram of the interface for transposing the weight matrix.
  • the first row is the horizontal shooting mode of the photographing device, and the original image with the target object facing upward and the horizontal side larger than the vertical side can be photographed.
  • the target object can be identified directly using the untransposed weight matrix.
  • the second row is the vertical shooting mode of the shooting device, which can capture the original image with the target facing up and the horizontal side smaller than the vertical side.
  • when the convolutional neural network model is a fully convolutional neural network model, the model is not sensitive to the size and shape of the original image, so in principle the untransposed weight matrix could be used to identify the target.
  • however, because TFLite does not support dynamic allocation of model loading memory, this cannot be achieved.
  • all layers in the fully convolutional neural network model are convolutional layers without fully connected layers, so they are not sensitive to the size and shape of the original image.
  • the third row is the vertical mode of the shooting device, which rotates the original image by 90 degrees so that the horizontal side of the original image is larger than the vertical side.
  • the size of the original image matches the landscape mode of the shooting device.
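Rotating a vertically shot image by 90 degrees can be sketched with `numpy.rot90`. The text does not state the direction of rotation, so the counter-clockwise direction used here is an assumption, and the tiny array stands in for a real image.

```python
import numpy as np

# Hypothetical portrait image: 3 rows (vertical side) by 2 columns (horizontal side).
portrait = np.arange(6).reshape(3, 2)

# Rotate by 90 degrees counter-clockwise (assumed direction).
rotated = np.rot90(portrait)

# After rotation the horizontal side (3) is larger than the vertical side (2).
print(rotated.shape)
```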
  • the weight matrix of the image processing model can be transposed. At this time, the memory allocation has not changed, and the original image and the image processing model can be matched. Therefore, a single image processing model can be used to process the original image to complete target detection.
  • the target object can be identified through the processing of the original image by the image processing model, and the target detection processing in the two shooting modes can be completed while ensuring the performance of the image processing model, which satisfies the real-time requirements of target detection and makes the model easy to deploy on the shooting device for online use.
  • FIG. 8 shows a schematic structural diagram of an image processing apparatus.
  • the image processing apparatus may include a memory 810 and a processor 820, wherein:
  • a memory 810 for storing executable instructions of the processor 820
  • processor 820 is configured to perform, via executing executable instructions:
  • the original image including the target object
  • the original image is processed by the image processing model to identify the target object in the original image.
  • the original image is generated after an image is captured by a preset photographing device
  • the horizontal side of the original image is larger than the vertical side; when the photographing device shoots vertically, the horizontal side of the original image is smaller than the vertical side.
  • the apparatus further includes:
  • the target object corresponds to a horizontal edge of the original image in the original image.
  • the first processing is performed on the original image and the second processing is performed on the image processing model, respectively.
  • the first processing is performed on the original image
  • the second processing is performed on the image processing model
  • performing the first processing on the original image includes:
  • the original image is rotated by 90 degrees so that the target object in the original image is also rotated by 90 degrees.
  • the apparatus before performing the first processing on the original image and performing the second processing on the image processing model respectively, the apparatus further includes:
  • the original image is flipped 180 degrees.
  • the apparatus further includes:
  • performing the second processing on the image processing model includes:
  • if the image processing model is a convolutional neural network model, obtain the weight matrix in the convolutional neural network model;
  • Transpose processing is performed on the weight matrix to perform a second processing on the convolutional neural network model.
  • the processing of the original image by the image processing model includes:
  • the apparatus further includes:
  • the original image matches a preset image processing model
  • the original image is processed by the image processing model to identify the target object in the original image.
  • on the one hand, by performing the first processing on the original image and the second processing on the image processing model, the image processing apparatus only needs to store and load one image processing model, which greatly saves memory space and reduces the loading time; on the other hand, the target object can be identified through the processing of the original image by the image processing model, and the target detection processing in the two shooting modes can be completed while ensuring the performance of the image processing model, which satisfies the real-time requirements of target detection and makes the model easy to deploy on the shooting device for online use.
  • although several modules or units of the image processing apparatus 800 are mentioned in the above detailed description, such division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided into multiple modules or units to be embodied.
  • an electronic device capable of implementing the above method is also provided.
  • an electronic device 900 according to such an embodiment of the present invention is described below with reference to FIG. 9.
  • the electronic device 900 shown in FIG. 9 is only an example, and should not impose any limitations on the function and scope of use of the embodiments of the present invention.
  • electronic device 900 takes the form of a general-purpose computing device.
  • the components of the electronic device 900 may include, but are not limited to: the above-mentioned at least one processing unit 910, the above-mentioned at least one storage unit 920, a bus 930 connecting different system components (including the storage unit 920 and the processing unit 910), and a display unit 940.
  • the storage unit stores program code that can be executed by the processing unit 910, so that the processing unit 910 performs the steps of the various exemplary embodiments of the present invention described in the "Exemplary Methods" section of this specification.
  • the storage unit 920 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 921 and/or a cache storage unit 922 , and may further include a read only storage unit (ROM) 923 .
  • the storage unit 920 may also include a program/utility 924 having a set (at least one) of program modules 925, including, but not limited to, an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.
  • the bus 930 may represent one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
  • the electronic device 900 may also communicate with one or more external devices 1100 (e.g., keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable a user to interact with the electronic device 900, and/or with any device (e.g., router, modem, etc.) that enables the electronic device 900 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 950. Also, the electronic device 900 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 960. As shown, the network adapter 960 communicates with other modules of the electronic device 900 via the bus 930. It should be understood that, although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 900, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
  • the exemplary embodiments described herein may be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solutions according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or mobile hard disk) or on a network, and which includes several instructions to cause a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to an embodiment of the present disclosure.
  • a computer-readable storage medium is also provided, on which a program product capable of implementing the above-described method of the present specification is stored.
  • aspects of the present invention may also be implemented in the form of a program product comprising program code; when the program product is run on a terminal device, the program code causes the terminal device to perform the steps according to the various exemplary embodiments of the present invention described in the "Exemplary Methods" section of this specification.
  • a program product 1000 for implementing the above method according to an embodiment of the present invention is described, which may take the form of a portable compact disk read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer.
  • a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • the program product may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a computer readable signal medium may include a propagated data signal in baseband or as part of a carrier wave with readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a readable signal medium can also be any readable medium, other than a readable storage medium, that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user computing device, partly on the user computing device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server.
  • the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Image Processing (AREA)

Abstract

The present invention provides an image processing method and apparatus, and a computer-readable medium and an electronic device. The image processing method comprises: obtaining an original image, the original image comprising a target object; when the original image and a preset image processing model are not matched, respectively performing first processing on the original image and performing second processing on the image processing model, such that the original image and the image processing model are matched; and processing the original image by means of the image processing model, so as to recognize the target object in the original image. According to the present invention, only one image processing model needs to be stored and loaded, thereby greatly saving memory space and reducing a loading duration, and the original image does not generate lateral or longitudinal deformation, such that the image display effect is ensured, the processing effect of the image processing model is further ensured, the real-time requirements of target detection are satisfied, and the image processing model can be deployed on a photographing device for on-line use.

Description

Image processing method, apparatus, medium and electronic device
Technical field
The present disclosure relates to the technical field of image processing, and in particular, to an image processing method, an image processing apparatus, a computer-readable medium, and an electronic device.
Background
Imaging devices with both horizontal and vertical shooting modes often need to run functions such as target detection. However, such functions can only be realized by adapting two image processing models, one for each of the two shooting modes.
However, maintaining two image processing models makes the application that deploys the models too large and wastes memory space.
In view of this, there is an urgent need in the art to develop a new image processing method.
It should be noted that the information disclosed in the above Background section is only intended to enhance understanding of the background of the present disclosure, and therefore may contain information that does not constitute prior art already known to a person of ordinary skill in the art.
Summary
The present disclosure provides an image processing method, an image processing apparatus, a computer-readable medium, and an electronic device, thereby alleviating, at least to a certain extent, the technical problem of wasted memory space in the prior art.
Other features and advantages of the present disclosure will become apparent from the following detailed description, or will be learned in part through practice of the present disclosure.
According to a first aspect of the present disclosure, an image processing method is provided, comprising: acquiring an original image, the original image including a target object;
when the original image and a preset image processing model do not match, performing first processing on the original image and second processing on the image processing model, respectively, so that the original image and the image processing model match;
processing the original image by the image processing model to identify the target object in the original image.
In an exemplary embodiment of the present disclosure, the original image is generated after an image is captured by a preset photographing device;
wherein, when the photographing device shoots horizontally, the horizontal side of the original image is larger than the vertical side; when the photographing device shoots vertically, the horizontal side of the original image is smaller than the vertical side.
In an exemplary embodiment of the present disclosure, the method further includes:
when the horizontal side of the original image is larger than the vertical side, the original image and the preset image processing model match each other; when the horizontal side of the original image is smaller than the vertical side, the original image and the preset image processing model do not match; or,
when the horizontal side of the original image is smaller than the vertical side, the original image and the preset image processing model match each other; when the horizontal side of the original image is larger than the vertical side, the original image and the preset image processing model do not match.
In an exemplary embodiment of the present disclosure, in the original image, the target object corresponds to the horizontal side of the original image.
In an exemplary embodiment of the present disclosure, performing the first processing on the original image and the second processing on the image processing model, respectively, when the original image and the preset image processing model do not match includes:
determining the shape of the original image according to the data collected by the inertial measurement unit of the photographing device;
when the shape of the original image does not match the preset image processing model, performing the first processing on the original image and the second processing on the image processing model.
In an exemplary embodiment of the present disclosure, performing the first processing on the original image includes:
rotating the original image by 90 degrees, so that the target object in the original image is also rotated by 90 degrees.
In an exemplary embodiment of the present disclosure, before performing the first processing on the original image and the second processing on the image processing model, respectively, the method further includes:
determining the orientation of the target object in the original image according to the data collected by the inertial measurement unit of the photographing device;
when the orientation of the target object is inverted, flipping the original image by 180 degrees.
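The orientation check and 180-degree flip can be sketched as follows. The disclosure does not specify how inertial measurement unit data is converted into an orientation, so the `roll_degrees` parameter, the snapping to the nearest multiple of 90 degrees, and the function name are all assumptions made for illustration.

```python
import numpy as np

def normalize_orientation(image, roll_degrees):
    """Flip the image by 180 degrees when the camera roll indicates an inverted target.

    roll_degrees -- hypothetical camera roll angle derived from IMU data.
    """
    # Snap the roll angle to the nearest multiple of 90 degrees (0, 1, 2, or 3).
    quadrant = int(round(roll_degrees / 90.0)) % 4
    if quadrant == 2:
        # Inverted: rotate by 180 degrees so the target faces upward again.
        return np.rot90(image, 2)
    return image

image = np.array([[1, 2],
                  [3, 4]])
upright = normalize_orientation(image, 180.0)
```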
In an exemplary embodiment of the present disclosure, the method further includes:
when the original image and the preset image processing model do not match, performing second processing on the image processing model so that the original image and the image processing model match.
In an exemplary embodiment of the present disclosure, performing the second processing on the image processing model includes:
if the image processing model is a convolutional neural network model, obtaining the weight matrix in the convolutional neural network model;
transposing the weight matrix, so as to perform the second processing on the convolutional neural network model.
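In practice convolution weights are usually stored as four-dimensional tensors, so transposing the weight matrix amounts to swapping the two spatial axes while leaving the channel axes alone. The sketch below assumes a (height, width, input channels, output channels) layout; the layout, the function name, and the sample tensor are assumptions and are not stated in the disclosure.

```python
import numpy as np

def transpose_conv_weights(weights_hwio):
    """Swap the kernel height and width axes; keep the channel axes in place.

    weights_hwio -- array with assumed layout (height, width, in_channels, out_channels).
    """
    return np.transpose(weights_hwio, (1, 0, 2, 3))

# Hypothetical 2x3 kernel with 1 input channel and 4 output channels.
weights = np.arange(2 * 3 * 1 * 4).reshape(2, 3, 1, 4)
transposed = transpose_conv_weights(weights)
```

Each spatial slice of `transposed` is the matrix transpose of the corresponding slice of `weights`; the total number of stored values is unchanged.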
In an exemplary embodiment of the present disclosure, processing the original image by the image processing model includes:
performing an inner product operation on the original image using the weight matrix after the transposition processing to obtain image features;
performing nonlinear processing on the image features to obtain nonlinear features, and performing feature compression processing on the nonlinear features to obtain compressed features;
performing full connection processing on the compressed features, so as to process the original image.
In an exemplary embodiment of the present disclosure, the method further includes:
when the original image and the preset image processing model match, processing the original image by the image processing model to identify the target object in the original image.
According to a second aspect of the present disclosure, an image processing apparatus is provided, comprising: a processor;
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform, by executing the executable instructions:
acquiring an original image, the original image including a target object;
when the original image and a preset image processing model do not match, performing first processing on the original image and second processing on the image processing model, respectively, so that the original image and the image processing model match;
processing the original image by the image processing model to identify the target object in the original image.
In an exemplary embodiment of the present disclosure, the original image is generated after an image is captured by a preset photographing device;
wherein, when the photographing device shoots horizontally, the horizontal side of the original image is larger than the vertical side; when the photographing device shoots vertically, the horizontal side of the original image is smaller than the vertical side.
In an exemplary embodiment of the present disclosure, the apparatus further includes:
when the horizontal side of the original image is larger than the vertical side, the original image and the preset image processing model match each other; when the horizontal side of the original image is smaller than the vertical side, the original image and the preset image processing model do not match; or,
when the horizontal side of the original image is smaller than the vertical side, the original image and the preset image processing model match each other; when the horizontal side of the original image is larger than the vertical side, the original image and the preset image processing model do not match.
In an exemplary embodiment of the present disclosure, in the original image, the target object corresponds to the horizontal side of the original image.
In an exemplary embodiment of the present disclosure, performing the first processing on the original image and the second processing on the image processing model, respectively, when the original image and the preset image processing model do not match includes:
determining the shape of the original image according to the data collected by the inertial measurement unit of the photographing device;
when the shape of the original image does not match the preset image processing model, performing the first processing on the original image and the second processing on the image processing model.
在本公开的一种示例性实施例中,所述对所述原始图像进行第一处理,包括:In an exemplary embodiment of the present disclosure, performing the first processing on the original image includes:
对所述原始图像旋转90度,以使得所述原始图像中的目标对象也旋转90度。The original image is rotated by 90 degrees so that the target object in the original image is also rotated by 90 degrees.
在本公开的一种示例性实施例中,在所述分别对所述原始图像进行第一处理和所述图像处理模型进行第二处理前,所述装置还包括:In an exemplary embodiment of the present disclosure, before performing the first processing on the original image and performing the second processing on the image processing model respectively, the apparatus further includes:
根据所述拍摄设备的惯性测量单元所采集到的数据确定所述原始图像的目标对象的朝向;Determine the orientation of the target object of the original image according to the data collected by the inertial measurement unit of the photographing device;
当所述目标对象的朝向为倒置时,对所述原始图像进行180度翻转。When the orientation of the target object is inverted, the original image is flipped 180 degrees.
在本公开的一种示例性实施例中,所述装置还包括:In an exemplary embodiment of the present disclosure, the apparatus further includes:
当所述原始图像和预设的图像处理模型不匹配时,对所述图像处理模型进行第二处理,以使所述原始图像和所述图像处理模型匹配。When the original image and the preset image processing model do not match, a second process is performed on the image processing model to match the original image and the image processing model.
In an exemplary embodiment of the present disclosure, performing the second processing on the image processing model includes:
if the image processing model is a convolutional neural network model, obtaining the weight matrix of the convolutional neural network model;
transposing the weight matrix, thereby performing the second processing on the convolutional neural network model.
In an exemplary embodiment of the present disclosure, processing the original image through the image processing model includes:
performing an inner product operation on the original image using the transposed weight matrix to obtain image features;
performing nonlinear processing on the image features to obtain nonlinear features, and performing feature compression on the nonlinear features to obtain compressed features;
performing fully connected processing on the compressed features, thereby processing the original image.
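The sequence above (inner product, nonlinear processing, feature compression, fully connected processing) can be sketched as a single-channel forward pass. This is only an illustrative sketch: the choice of ReLU as the nonlinearity, 2x2 max pooling as the feature compression, and all function names are assumptions made for the example and are not specified by the disclosure.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding), taking inner products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Nonlinear processing (assumed here to be ReLU)."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool_2x2(fmap):
    """Feature compression (assumed here to be 2x2 max pooling with stride 2)."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


def fully_connected(fmap, weights, bias):
    """Flatten the compressed features and apply one linear output unit."""
    flat = [v for row in fmap for v in row]
    return sum(v * w for v, w in zip(flat, weights)) + bias


def forward(image, kernel, fc_weights, fc_bias):
    features = conv2d_valid(image, kernel)   # image features
    nonlinear = relu(features)               # nonlinear features
    compressed = max_pool_2x2(nonlinear)     # compressed features
    return fully_connected(compressed, fc_weights, fc_bias)
```

In the scheme described here, `kernel` would be the weight matrix after the second processing, and `image` the original image after the first processing.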
In an exemplary embodiment of the present disclosure, the apparatus further includes:
when the original image matches the preset image processing model, processing the original image through the image processing model to identify the target object in the original image.
According to a third aspect of the present disclosure, a computer-readable medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements any one of the image processing methods provided in the first aspect.
According to a fourth aspect of the present disclosure, an electronic device is provided, including:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to perform any one of the image processing methods provided in the first aspect by executing the executable instructions.
The technical solution of the present disclosure has the following beneficial effects:
According to the image processing method, image processing apparatus, computer-readable medium, and electronic device described above, on the one hand, because the first processing is performed on the original image and the second processing on the image processing model, only one image processing model needs to be stored and loaded, which greatly saves memory space and reduces loading time; on the other hand, processing the original image through the image processing model identifies the target object, so target detection in both shooting modes can be completed without degrading the performance of the image processing model, which meets the real-time requirements of target detection and makes the solution easy to deploy for online use on photographing devices.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the present disclosure.
Description of Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain its principles. Obviously, the drawings described below show only some embodiments of the present disclosure; those of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 shows a schematic flowchart of an image processing method in an exemplary embodiment of the present disclosure;
FIG. 2 shows a schematic flowchart of a method for flipping an original image in an exemplary embodiment of the present disclosure;
FIG. 3 shows a schematic interface diagram of four orientations of a target object in an exemplary embodiment of the present disclosure;
FIG. 4 shows a schematic flowchart of a method for performing the first processing and the second processing, respectively, in an exemplary embodiment of the present disclosure;
FIG. 5 shows a schematic flowchart of a method for performing the second processing in an exemplary embodiment of the present disclosure;
FIG. 6 shows a schematic flowchart of a method for processing an original image in an exemplary embodiment of the present disclosure;
FIG. 7 shows a schematic interface diagram of transposing a weight matrix in an exemplary embodiment of the present disclosure;
FIG. 8 shows a schematic diagram of an image processing apparatus in an exemplary embodiment of the present disclosure;
FIG. 9 schematically shows an electronic device for implementing the image processing method in an exemplary embodiment of the present disclosure;
FIG. 10 schematically shows a computer-readable storage medium for implementing the image processing method in an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments can, however, be implemented in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the present disclosure will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of the embodiments of the present disclosure. However, those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced without one or more of these specific details, or with other methods, components, apparatuses, steps, and so on. In other instances, well-known technical solutions are not shown or described in detail, to avoid obscuring aspects of the present disclosure.
In this specification, the terms "a", "an", "the", and "said" indicate the presence of one or more elements, components, etc.; the terms "include" and "have" are open-ended and mean that additional elements, components, etc. may be present besides those listed; the terms "first", "second", etc. are used only as labels and do not limit the number of their objects.
In addition, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, so their repeated description is omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically separate entities.
For imaging devices such as mobile phones or cameras that have both a horizontal and a vertical shooting mode, running a function such as target detection (for example of faces, head and shoulders, or the whole body) requires adapting two different image processing models, one for each mode.
Typically, the image processing models are separated in the same way as the horizontal and vertical shooting modes, so two image processing models must be maintained. Because the size of an image processing model such as a convolutional neural network is usually on the order of megabytes, the application that deploys the models easily becomes too large. Moreover, to ensure smooth switching, both image processing models must be loaded at the same time, which wastes memory space and increases loading time.
Alternatively, the image processing model for horizontal shooting mode can be merged with the one for vertical shooting mode. To balance performance in the two modes, the input image is usually made square. A square input deforms horizontally shot images and vertically shot images along the horizontal and vertical directions, respectively, which distorts the image, degrades functions such as target detection, and worsens the user experience.
A further option is to exploit the fact that a fully convolutional neural network is insensitive to the size and shape of its input image, and to use a fully convolutional network to process horizontally shot and vertically shot images separately. However, the usual deployment platform on mobile devices today is TFLite, which does not support dynamic memory allocation. One image processing model must therefore be converted offline into the other, so that only one image processing model needs to be maintained. Even so, the application is still too large, loading still takes too long, and memory is still wasted.
In view of the problems in the related art, the present disclosure provides an image processing method, an image processing apparatus, a computer-readable medium, and an electronic device. Various aspects of this exemplary embodiment are described in detail below.
FIG. 1 shows a schematic flowchart of an image processing method in this exemplary embodiment. As shown in FIG. 1, the method includes at least the following steps S110, S120, and S130. Specifically:
Step S110. Acquire an original image, where the original image includes a target object.
Step S120. When the original image does not match the preset image processing model, perform the first processing on the original image and the second processing on the image processing model, respectively, so that the original image and the image processing model match.
Step S130. Process the original image through the image processing model to identify the target object in the original image.
In an exemplary embodiment of the present disclosure, on the one hand, because the first processing is performed on the original image and the second processing on the image processing model, only one image processing model needs to be stored and loaded, which greatly saves memory space; on the other hand, processing the original image through the image processing model identifies the target object, so target detection in both shooting modes can be completed without degrading the performance of the image processing model, which meets the real-time requirements of target detection and makes the solution easy to deploy for online use on photographing devices.
Each step of the image processing method is described in detail below.
In step S110, an original image is acquired, and the original image includes a target object.
In an exemplary embodiment of the present disclosure, the original image including the target object means that the original image was obtained by photographing the target object. The target object may be, for example, a person, a tree, a vehicle, or a boat; this exemplary embodiment places no particular limitation on it.
In an optional embodiment, the original image is generated from an image captured by a preset photographing device; when the photographing device shoots horizontally, the horizontal side of the original image is longer than the vertical side; when the photographing device shoots vertically, the horizontal side of the original image is shorter than the vertical side.
The photographing device may be an imaging device that has both a horizontal and a vertical shooting mode. For example, the photographing device may be a mobile phone or a camera, or another imaging device; this exemplary embodiment places no particular limitation on it.
When the photographing device is in horizontal shooting mode, the horizontal side of the captured original image is longer than the vertical side; when the photographing device is in vertical shooting mode, the horizontal side of the captured original image is shorter than the vertical side.
In an optional embodiment, the target object corresponds to a horizontal side of the original image. That is, when the original image is displayed, the target object faces one of the horizontal sides of the original image.
In step S120, when the original image does not match the preset image processing model, the first processing is performed on the original image and the second processing on the image processing model, respectively, so that the original image and the image processing model match.
In an exemplary embodiment of the present disclosure, an image processing model may be preset in order to perform further image processing on the original image. To determine whether this image processing model can process the original image, it can be judged whether the original image matches the preset image processing model.
Moreover, the target object has two possible orientations in each of the horizontal and vertical shooting modes. Specifically, when a photographing device such as a mobile phone shoots in horizontal mode with its top-to-bottom axis running from left to right, the target object has a first orientation in the original image, and when the top-to-bottom axis runs from right to left, the target object has a second orientation. When the mobile phone shoots in vertical mode with its top-to-bottom axis running from top to bottom, the target object has a first orientation in the original image, and when the top-to-bottom axis runs from bottom to top, the target object has a second orientation. Therefore, the inertial measurement unit of the photographing device can first be used to judge the orientation of the target object, so that processing such as flipping can be performed.
In an optional embodiment, FIG. 2 shows a schematic flowchart of a method for flipping the original image. As shown in FIG. 2, the method includes at least the following steps. In step S210, the orientation of the target object in the original image is determined according to data collected by the inertial measurement unit of the photographing device.
An inertial measurement unit (IMU) consists of three single-axis acceleration sensors and three single-axis angular velocity sensors (gyroscopes). It measures IMU data, including the acceleration and angular velocity of the photographing device in three-dimensional space. An IMU can therefore be installed in portable devices, such as wearable and handheld devices, so that the motion attitude of the photographing device can be computed from the measured IMU data.
Specifically, FIG. 3 shows a schematic interface diagram of four orientations of the target object. As shown in FIG. 3, in direction A the target object is oriented vertically from bottom to top; in direction B, vertically from top to bottom; in direction C, horizontally from right to left; and in direction D, horizontally from left to right.
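As a hypothetical illustration of step S210, the sketch below classifies the device posture from the gravity components reported by the accelerometers of the IMU. The axis convention (x toward the right of the screen, y toward the top of the device), the mapping to the labels A to D of FIG. 3, and the choice of which labels count as inverted are all assumptions made for this example; the disclosure does not specify them.

```python
def classify_orientation(ax, ay):
    """Return an assumed FIG. 3 label from accelerometer readings.

    ax, ay: gravity components along the device's x axis (right) and
    y axis (up), under the convention assumed for this sketch.
    The axis with the larger magnitude decides the label.
    """
    if abs(ay) >= abs(ax):
        return "A" if ay >= 0 else "B"
    return "C" if ax >= 0 else "D"


def is_inverted(direction):
    """Assumption: B and D are the upside-down counterparts of A and C."""
    return direction in ("B", "D")
```

A real device would also need filtering of the raw readings and a rule for the near-flat case, which this sketch leaves out.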
In step S220, when the target object is upside down, the original image is flipped by 180 degrees.
For directions A and B in FIG. 3, the two original images clearly lie along the same vertical direction but differ by a 180° flip; for directions C and D, the two original images both lie along the horizontal direction and likewise coincide after a 180° flip.
In addition, the preset image processing model may be a model matching the horizontal shooting mode or a model matching the vertical shooting mode.
To simplify the subsequent judgment of whether the original image matches the preset image processing model, the original image in either of directions A and B can be regarded as an inverted version of the image in the other direction, that is, the target object is upside down relative to it, so the original image can be flipped by 180 degrees. Likewise, the original image in either of directions C and D can be regarded as an inverted version of the image in the other direction, and it can also be flipped by 180 degrees.
In this exemplary embodiment, the original image can be flipped according to the orientation of the target object, which normalizes the orientation of the original image, reduces the computational cost of the subsequent matching judgment and image processing, and improves image processing efficiency.
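The 180-degree flip of step S220 can be sketched on an image stored as a list of pixel rows. The function name and the list-of-rows representation are assumptions made for this illustration.

```python
def flip_180(image):
    """Rotate an image (list of rows) by 180 degrees.

    Reversing the row order and then reversing each row moves the pixel
    at (row, col) to (H - 1 - row, W - 1 - col), so the width and height
    of the image are unchanged.
    """
    return [row[::-1] for row in image[::-1]]


image = [[1, 2, 3],
         [4, 5, 6]]
print(flip_180(image))  # [[6, 5, 4], [3, 2, 1]]
```

Because the flip preserves the aspect ratio, it does not affect whether the image matches a horizontal or vertical model; it only removes the upside-down case.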
After the original image has been flipped, it can be further judged whether the original image matches the preset image processing model.
In an optional embodiment, when the horizontal side of the original image is longer than the vertical side, the original image matches the preset image processing model, and when the horizontal side is shorter than the vertical side, the original image does not match the preset image processing model; or, when the horizontal side of the original image is shorter than the vertical side, the original image matches the preset image processing model, and when the horizontal side is longer than the vertical side, the original image does not match the preset image processing model.
Specifically, when the preset image processing model is a model matching the horizontal shooting mode of the photographing device and the original image was captured in horizontal shooting mode, the original image matches the preset image processing model. The horizontal side of an original image captured in horizontal shooting mode is longer than its vertical side.
Conversely, when the original image was captured in the vertical shooting mode of the photographing device, the original image does not match this preset image processing model. The horizontal side of an original image captured in vertical shooting mode is shorter than its vertical side.
When the preset image processing model is a model matching the vertical shooting mode of the photographing device and the original image was captured in vertical shooting mode, the original image matches the preset image processing model. The horizontal side of an original image captured in vertical shooting mode is shorter than its vertical side.
Conversely, when the original image was captured in the horizontal shooting mode of the photographing device, the original image does not match this preset image processing model. The horizontal side of an original image captured in horizontal shooting mode is longer than its vertical side.
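The matching rule described above compares the side lengths of the image with the shooting mode the model was built for. A minimal sketch follows; the function name, the string labels, and the rejection of square images are assumptions for illustration, since the disclosure does not discuss square inputs here.

```python
def matches_model(width, height, model_mode):
    """Return True when the image shape suits the preset model.

    width, height: lengths of the horizontal and vertical sides in pixels.
    model_mode: "horizontal" or "vertical", the shooting mode the preset
    model matches (hypothetical labels for this sketch).
    """
    if model_mode not in ("horizontal", "vertical"):
        raise ValueError("model_mode must be 'horizontal' or 'vertical'")
    if width == height:
        # The text only covers longer-than and shorter-than cases.
        raise ValueError("square images are not covered by this sketch")
    image_mode = "horizontal" if width > height else "vertical"
    return image_mode == model_mode


print(matches_model(1920, 1080, "horizontal"))  # True
print(matches_model(1080, 1920, "horizontal"))  # False
```

When the function returns False, the first and second processing would be applied before the model is run.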
Further, when the original image does not match the preset image processing model, the first processing may be performed on the original image and the second processing on the image processing model, so that the original image and the image processing model match.
In an optional embodiment, FIG. 4 shows a schematic flowchart of a method for performing the first processing and the second processing, respectively. As shown in FIG. 4, the method includes at least the following steps. In step S410, the shape of the original image is determined according to data collected by the inertial measurement unit of the photographing device.
Besides the orientation of the target object, the data collected by the inertial measurement unit of the photographing device can also be used to determine the shape of the original image.
Specifically, the shape of the original image may be its aspect ratio, that is, the ratio of the horizontal side to the vertical side.
When the original image is captured in horizontal shooting mode, its aspect ratio is greater than 1. If the preset image processing model is a model matching the horizontal shooting mode of the photographing device, the image processing model matches original images whose aspect ratio is greater than 1.
When the original image is captured in vertical shooting mode, its aspect ratio is less than 1. If the preset image processing model is a model matching the vertical shooting mode of the photographing device, the image processing model matches original images whose aspect ratio is less than 1.
In step S420, when the shape of the original image does not match the preset image processing model, the first processing is performed on the original image and the second processing is performed on the image processing model.
When the aspect ratio of the original image is less than 1 and the preset image processing model is a model matching the horizontal shooting mode of the photographing device, the shape of the original image does not match the preset image processing model.
When the aspect ratio of the original image is greater than 1 and the preset image processing model is a model matching the vertical shooting mode of the photographing device, the shape of the original image likewise does not match the preset image processing model.
In both cases, the first processing can be performed on the original image and the second processing on the image processing model.
In this exemplary embodiment, the data collected by the inertial measurement unit is used to determine the shape of the original image and, from it, whether to perform the first processing and the second processing. This way of deciding is simple and accurate and is widely applicable.
The first processing performed on the original image may be rotation.
In an optional embodiment, the original image is rotated by 90 degrees, so that the target object in the original image is also rotated by 90 degrees.
After the inversion judgment and flipping have been applied to the four directions shown in FIG. 3, the original images are reduced to two directions: horizontal and vertical.
When the image processing model is a model matching the horizontal shooting mode of the photographing device, the original image in the vertical direction can be rotated by 90 degrees, so that the target object in the original image is also rotated by 90 degrees; when the image processing model is a model matching the vertical shooting mode of the photographing device, the original image in the horizontal direction can be rotated by 90 degrees, so that the target object in the original image is also rotated by 90 degrees.
Rotating the original image by 90 degrees makes the size, or shape, of the original image match the preset image processing model.
Further, the second processing may also be performed on the image processing model.
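The effect of the 90-degree rotation on the image shape can be sketched on a list-of-rows image. The function name and the clockwise direction are choices made for this illustration; the text above does not fix a direction at this point.

```python
def rotate90_cw(image):
    """Rotate an image (list of rows) 90 degrees clockwise.

    Reversing the row order and then reading columns as rows turns an
    H x W image into a W x H image; the first row of the input becomes
    the last column of the output.
    """
    return [list(col) for col in zip(*image[::-1])]


# A vertically shot image: 3 rows (vertical side) by 2 columns (horizontal side).
portrait = [[1, 2],
            [3, 4],
            [5, 6]]
landscape = rotate90_cw(portrait)
print(landscape)                         # [[5, 3, 1], [6, 4, 2]]
print(len(landscape), len(landscape[0]))  # 2 3
```

After the rotation the horizontal side is longer than the vertical side, so the image has the shape expected by a model built for horizontal shooting, although the target object now lies on its side.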
In an optional embodiment, FIG. 5 shows a schematic flowchart of a method for performing the second processing. As shown in FIG. 5, the method includes at least the following steps. In step S510, if the image processing model is a convolutional neural network model, the weight matrix of the convolutional neural network model is obtained.
A convolutional neural network (CNN) is an artificial neural network that is widely used in fields such as image recognition and target detection. A typical CNN model includes convolutional layers, pooling layers, activation layers, fully connected layers, and so on. Each layer performs its operation on the data it receives and passes the result to the next layer, so that the initial input data yields a final result after passing through multiple layers.
The convolution operation of a convolutional layer applies a convolution kernel (also called a filter) to an image and outputs another image; the operation may be an inner product between the feature values of the image and the weights of the convolution kernel.
The computation in the convolutional layers is the most important feature extraction process. A preset feature extraction convolutional neural network may be designed with multiple convolutional layers, and each convolutional layer may be characterized by the size of the feature map of its input layer, the convolution kernel that traverses the features, and the stride with which the kernel traverses the feature map of the input layer. For example, if the feature map of the input layer has size 32*32*3, the convolution kernel has size 5*5, and the stride is 1, then the feature map output by the convolutional layer has size 28*28*3.
The convolution kernel of a convolutional layer is the weight matrix of the convolutional neural network model.
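The spatial size in the example above follows from the usual output-size formula for a convolution without padding, (input - kernel) / stride + 1, which gives (32 - 5) / 1 + 1 = 28. The helper below is a small sketch of that arithmetic; its name and the optional `padding` parameter are additions for illustration. It only covers the spatial dimensions; the depth of the output depends on how the kernels are arranged, which the example does not detail.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


print(conv_output_size(32, 5, stride=1))  # 28
```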
在步骤S520中,对权重矩阵进行转置处理,以对卷积神经网络模型进行第二处理。In step S520, transpose processing is performed on the weight matrix to perform second processing on the convolutional neural network model.
当卷积神经网络模型为与拍摄设备的横拍模式对应的模型时,为使该卷积神经网络模型适用于拍摄设备的竖拍模式,可以将卷积神经网络模型的权重矩阵进行转置处理。When the convolutional neural network model is a model corresponding to the horizontal shooting mode of the shooting device, in order to make the convolutional neural network model suitable for the vertical shooting mode of the shooting device, the weight matrix of the convolutional neural network model can be transposed. .
同样,当卷积神经网络模型为与拍摄设备的竖拍模式对应的模型时,为使该卷积神经网络模型适用于拍摄设备的横拍模式,也可以将卷积神经网络模型的权重矩阵进行转置处理。Similarly, when the convolutional neural network model is a model corresponding to the vertical shooting mode of the shooting device, in order to make the convolutional neural network model suitable for the horizontal shooting mode of the shooting device, the weight matrix of the convolutional neural network model can also be adjusted. Transpose processing.
因此,在得到卷积神经网络模型的权重矩阵,亦即卷积核之后,可以对该权重矩阵进行转置处理。Therefore, after obtaining the weight matrix of the convolutional neural network model, that is, the convolution kernel, the weight matrix can be transposed.
转置可以是将权重矩阵的所有元素绕着一条从第1行第1列元素出发的右下方45度的射线作镜面翻转,既可得到权重矩阵的转置矩阵。亦即,将权重矩阵的第一行变成第一列,第二行变成第二列,……,最后一行变成最后一列即可得到。The transposition can be mirror flipping of all elements of the weight matrix around a ray of 45 degrees to the lower right starting from the elements in the first row and the first column, which can obtain the transposed matrix of the weight matrix. That is, the first row of the weight matrix becomes the first column, the second row becomes the second column, ... and the last row becomes the last column.
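The row-to-column mapping described above can be illustrated with a minimal sketch. The helper name `transpose` and the sample values are assumptions chosen for illustration only.

```python
def transpose(matrix):
    """Return the transpose: row i of the input becomes column i of the output."""
    return [list(column) for column in zip(*matrix)]


# Illustrative 3x3 weight matrix (values are placeholders).
weights = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
transposed = transpose(weights)
for row in transposed:
    print(row)
# [1, 4, 7]
# [2, 5, 8]
# [3, 6, 9]
```

Elements on the diagonal (1, 5, 9) stay in place, which corresponds to mirroring about the 45-degree ray from the first element.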
It is worth noting that the direction of the transposition is related to the orientation of the target object after the first processing. Specifically, the direction of the transposition is consistent with the rotation direction of the first processing. For example, as shown in FIG. 7, if the first processing rotates the original image 90 degrees to the right, the target object is also rotated 90 degrees to the right, and the transposition must likewise be performed to the right, for example so that the first row of the original matrix becomes the last column of the transposed matrix. The transposed matrix then corresponds to the orientation of the target object, and the target object can be identified accurately.
Once the weight matrix has been transposed, the second processing of the convolutional neural network model is complete.
In this exemplary embodiment, transposing the weight matrix of the convolutional neural network model implements the second processing of the model, allows a single image processing model to be used for both kinds of original image, and safeguards the image processing result.
After the first processing is performed on the original image and the second processing is performed on the image processing model, the shape of the original image, that is, its size, matches the image processing model, and the target object also matches the processing orientation of the image processing model, so that the original image and the image processing model match.
For example, when the original image is captured in the vertical shooting mode and its horizontal side is shorter than its vertical side, the original image is rotated 90 degrees to the right so that its horizontal side becomes longer than its vertical side. Moreover, when the target object in the original image is a person with the head up and the feet down, the target object rotates 90 degrees to the right along with the original image, so that the person's head points to the right and the feet point to the left.
Further, when the image processing model is a convolutional neural network model, suppose the weight matrix obtained from the model has 1, 2, 3 in its first row, 4, 5, 6 in its second row, and 7, 8, 9 in its third row. So that the transposed weight matrix still processes the corresponding positions of the person in the original image, the weight matrix may be transposed to the right. The weights 1, 2, 3 of the first row then still process the head of the person, the weights 4, 5, 6 of the second row still process the body, and the weights 7, 8, 9 of the last row still process the feet, so that the original image matches the image processing model.
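The following sketch illustrates one reading of the example above. It assumes that the "transposition to the right", in which the first row becomes the last column, coincides with a 90-degree clockwise rotation of the matrix, and that the inner product is a plain sliding-window correlation without padding. The helper names `rotate_right` and `correlate` and the sample image are hypothetical.

```python
def rotate_right(matrix):
    """Rotate 90 degrees clockwise: the first row becomes the last column."""
    return [list(row) for row in zip(*matrix[::-1])]


def correlate(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and take inner products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Weight matrix from the text and a small vertical (4 rows x 3 columns) image.
kernel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
vertical_image = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]

# Process the upright image with the original weights ...
upright_result = correlate(vertical_image, kernel)
# ... and the rotated image with the rotated weights.
rotated_result = correlate(rotate_right(vertical_image), rotate_right(kernel))

# Under these assumptions the two results agree up to the same rotation.
print(rotated_result == rotate_right(upright_result))  # True
```

Under these assumptions, each weight keeps meeting the same image content after both the image and the weight matrix are rotated, which is the matching effect the example describes.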
In addition, the original image may be matched with the image processing model by performing only the second processing on the image processing model.
In an optional embodiment, when the original image does not match the preset image processing model, the second processing is performed on the image processing model so that the original image and the image processing model match.
When the shape of the original image, that is, its size, matches the image processing model but the target object does not match the processing orientation of the image processing model, the original image may also be regarded as not matching the preset image processing model. In that case, the second processing may be performed on the image processing model.
First, if the image processing model is a convolutional neural network model, the weight matrix in the convolutional neural network model is obtained. The weight matrix is then transposed, thereby performing the second processing on the convolutional neural network model.
Specifically, the convolution operation of a convolutional layer in the convolutional neural network model may apply a convolution kernel to an image and output another image; the operation may be an inner product of the feature values of the image and the weights of the convolution kernel.
The computation of the convolutional layers is the most important feature extraction process. Each preset feature-extraction convolutional neural network may be designed with multiple convolutional layers, and each convolutional layer may be characterized by the size of the feature map of its input layer, the convolution kernel that traverses the features, and the stride with which the kernel traverses the feature map of the input layer. For example, if the feature map of the input layer has a size of 32*32*3, the convolution kernel has a size of 5*5, and the stride is 1, then the feature map output by the convolutional layer has a size of 28*28*3.
Here, the convolution kernel of the convolutional layer is the weight matrix in the convolutional neural network model.
When the convolutional neural network model corresponds to the horizontal shooting mode of the photographing device, the weight matrix of the model may be transposed so that the model becomes suitable for the vertical shooting mode of the photographing device.
Likewise, when the convolutional neural network model corresponds to the vertical shooting mode of the photographing device, the weight matrix of the model may also be transposed so that the model becomes suitable for the horizontal shooting mode of the photographing device.
Therefore, after the weight matrix of the convolutional neural network model, that is, the convolution kernel, is obtained, the weight matrix may be transposed.
Transposition may mirror all elements of the weight matrix about the ray that starts at the element in row 1, column 1 and runs toward the lower right at 45 degrees, which yields the transpose of the weight matrix. In other words, the first row of the weight matrix becomes the first column, the second row becomes the second column, ..., and the last row becomes the last column.
Once the weight matrix has been transposed, the second processing of the convolutional neural network model is complete.
In this exemplary embodiment, a processing approach is provided for one of the mismatch cases, which allows a single image processing model to be used for both kinds of original image in that case, safeguards the image processing result, and broadens the application scenarios of the image processing model.
In step S130, the original image is processed by the image processing model to identify the target object in the original image.
In an exemplary embodiment of the present disclosure, after the image processing model matches the original image, the image processing model may be used to process the original image.
In an optional embodiment, FIG. 6 shows a flowchart of the steps of a method for processing the original image. As shown in FIG. 6, the method includes at least the following steps. In step S610, an inner product operation is performed on the original image using the transposed weight matrix to obtain image features.
Specifically, when the image processing model is a convolutional neural network model, the input layer of the model may detect the pixel features of each region of the original image, such as the pixel grayscale values of each region. The convolutional layer of the model may then perform an inner product operation on these pixel features to obtain image features.
The inner product operation is performed by sliding the convolution kernel, that is, the weight matrix: starting from the upper left corner of the original image, the weight matrix is slid to the lower right corner of the original image to produce a feature map. After each slide of the weight matrix, a feature matrix of the same size as the weight matrix is extracted from the original image, and the inner product of this feature matrix and the weight matrix produces the corresponding image feature.
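The sliding procedure above can be sketched as follows for a single-channel image. The function name `sliding_inner_product`, the `stride` parameter, and the sample values are assumptions for illustration; the disclosure does not prescribe a particular implementation.

```python
def sliding_inner_product(image, kernel, stride=1):
    """Slide the weight matrix from the upper left to the lower right of the image.

    At each position, a patch the same size as the weight matrix is extracted
    and its inner product with the weight matrix is stored in the feature map.
    """
    kh, kw = len(kernel), len(kernel[0])
    feature_map = []
    for top in range(0, len(image) - kh + 1, stride):
        feature_row = []
        for left in range(0, len(image[0]) - kw + 1, stride):
            # Extract the patch ("feature matrix") under the kernel.
            patch = [image_row[left:left + kw] for image_row in image[top:top + kh]]
            # Inner product of the patch and the weight matrix.
            value = sum(
                p * k
                for patch_row, kernel_row in zip(patch, kernel)
                for p, k in zip(patch_row, kernel_row)
            )
            feature_row.append(value)
        feature_map.append(feature_row)
    return feature_map


# 4x4 image whose pixel at (row, col) is 4 * row + col, and a 2x2 weight matrix.
image = [[4 * r + c for c in range(4)] for r in range(4)]
kernel = [[1, 0], [0, 1]]
features = sliding_inner_product(image, kernel)
for row in features:
    print(row)
# [5, 7, 9]
# [13, 15, 17]
# [21, 23, 25]
```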
In step S620, nonlinear processing is performed on the image features to obtain nonlinear features, and feature compression is performed on the nonlinear features to obtain compressed features.
After the image features are obtained, the activation function of the convolutional neural network model may introduce nonlinearity into the image features to improve their representational power. Specifically, a particular activation function may be applied as a point-to-point mapping to obtain the nonlinear features.
Further, the pooling layer of the convolutional neural network model compresses the nonlinear features, which reduces the computational complexity of the network. Specifically, the feature compression may obtain the compressed features by means of a sliding window, or in other ways; this exemplary embodiment imposes no particular limitation in this regard.
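As one possible instance of step S620, the sketch below uses ReLU as the point-to-point activation and 2x2 max pooling as the sliding-window compression. Both choices, as well as the function names `relu` and `max_pool`, are assumptions; the disclosure leaves the activation function and the compression method open.

```python
def relu(feature_map):
    """Point-to-point nonlinearity: negative values are mapped to zero."""
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool(feature_map, size=2, stride=2):
    """Sliding-window compression: keep the maximum of each window."""
    pooled = []
    for top in range(0, len(feature_map) - size + 1, stride):
        pooled_row = []
        for left in range(0, len(feature_map[0]) - size + 1, stride):
            window = [
                feature_map[top + i][left + j]
                for i in range(size) for j in range(size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


image_features = [
    [-1, 2, 3, -4],
    [5, -6, 7, 8],
    [-9, 10, -11, 12],
    [13, 14, -15, -16],
]
nonlinear = relu(image_features)
compressed = max_pool(nonlinear)
print(compressed)  # [[5, 8], [14, 12]]
```

The 4x4 feature map is reduced to 2x2, which illustrates how pooling lowers the amount of data passed to later layers.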
In step S630, full connection processing is performed on the compressed features to complete the processing of the original image.
After the compressed features are obtained, they may be input to the fully connected layer of the convolutional neural network model for full connection processing. The full connection processing maps the compressed features to a long output vector and outputs it, thereby completing the processing of the original image.
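A minimal sketch of the full connection processing is given below, assuming the compressed features are flattened and multiplied by a weight matrix with an added bias. The name `fully_connected` and all numeric values are hypothetical placeholders, not trained parameters.

```python
def fully_connected(compressed_features, weights, biases):
    """Flatten the compressed features and map them to an output vector."""
    flat = [value for row in compressed_features for value in row]
    if any(len(weight_row) != len(flat) for weight_row in weights):
        raise ValueError("each weight row must match the flattened feature length")
    return [
        sum(w * x for w, x in zip(weight_row, flat)) + bias
        for weight_row, bias in zip(weights, biases)
    ]


compressed_features = [[5, 8], [14, 12]]   # e.g. the output of a pooling step
fc_weights = [[1, 0, 0, 0], [0, 1, 1, 0]]  # placeholder weights, one row per output
fc_biases = [1, -1]                        # placeholder biases
output_vector = fully_connected(compressed_features, fc_weights, fc_biases)
print(output_vector)  # [6, 21]
```

In a detection model the output vector would typically encode positions and class scores, but that interpretation depends on the trained model and is not specified here.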
In this exemplary embodiment, the original image is processed with the transposed convolutional neural network model, so that the image processing model is adapted to original images of both sizes and the detection performance of subsequent target detection is safeguarded.
After the image processing model processes the original image, the target object in the original image can be identified. Image target detection refers to locating and classifying target objects in the original image; convolutional neural network models are widely used for this purpose because of their high detection accuracy.
In addition, when the original image already matches the preset image processing model, the image processing model may be used directly to process the original image.
In an optional embodiment, when the original image matches the preset image processing model, the original image is processed by the image processing model to identify the target object in the original image.
The original image and the preset image processing model match either when the model corresponds to the horizontal shooting mode and the horizontal side of the original image is longer than its vertical side, or when the model corresponds to the vertical shooting mode and the horizontal side of the original image is shorter than its vertical side.
In both cases, the image processing model may be used directly to process the original image.
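The two matching cases can be expressed as a small predicate. The function name `matches_model` and the mode labels `"horizontal"` and `"vertical"` are illustrative assumptions. Square images are not addressed by the disclosure, and this sketch simply reports them as matching neither mode.

```python
def matches_model(width: int, height: int, model_mode: str) -> bool:
    """Return True when the image shape matches the mode the model was adapted to.

    model_mode is a hypothetical label: "horizontal" or "vertical".
    """
    if model_mode == "horizontal":
        return width > height   # horizontal side longer than vertical side
    if model_mode == "vertical":
        return width < height   # horizontal side shorter than vertical side
    raise ValueError(f"unknown model mode: {model_mode!r}")


print(matches_model(1920, 1080, "horizontal"))  # True
print(matches_model(1080, 1920, "horizontal"))  # False
print(matches_model(1080, 1920, "vertical"))    # True
```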
First, an inner product operation is performed on the original image using the untransposed weight matrix to obtain image features. Next, nonlinear processing is performed on the image features to obtain nonlinear features, and feature compression is performed on the nonlinear features to obtain compressed features. Finally, full connection processing is performed on the compressed features to process the original image and identify the target object.
Specifically, when the image processing model is a convolutional neural network model, the input layer of the model may detect the pixel features of each region of the original image, such as the pixel grayscale values of each region. The convolutional layer of the model may then perform an inner product operation on these pixel features to obtain image features.
The inner product operation is performed by sliding the convolution kernel, that is, the weight matrix: starting from the upper left corner of the original image, the weight matrix is slid to the lower right corner of the original image to produce a feature map. After each slide of the weight matrix, a feature matrix of the same size as the weight matrix is extracted from the original image, and the inner product of this feature matrix and the weight matrix produces the corresponding image feature.
After the image features are obtained, the activation function of the convolutional neural network model may introduce nonlinearity into the image features to improve their representational power. Specifically, a particular activation function may be applied as a point-to-point mapping to obtain the nonlinear features.
Further, the pooling layer of the convolutional neural network model compresses the nonlinear features, which reduces the computational complexity of the network. Specifically, the feature compression may obtain the compressed features by means of a sliding window, or in other ways; this exemplary embodiment imposes no particular limitation in this regard.
After the compressed features are obtained, they may be input to the fully connected layer of the convolutional neural network model for full connection processing. The full connection processing maps the compressed features to a long output vector and outputs it, thereby implementing target detection on the original image.
After the original image and the image processing model have been processed, the image processing model may also be deployed on the photographing device.
Specifically, this may be implemented with TFLite, a toolkit for deploying image processing models to mobile and embedded devices. With TFLite, the image processing model can be imported into the TFLite resource directory and the application run, thereby deploying the image processing model to the photographing device.
The image processing method in the embodiments of the present disclosure is described in detail below with reference to an application scenario.
FIG. 7 shows a schematic diagram of transposing the weight matrix. As shown in FIG. 7, the first row corresponds to the horizontal shooting mode of the photographing device, which captures an original image in which the target object is upright and the horizontal side is longer than the vertical side. When the convolutional neural network corresponds to the horizontal shooting mode, the target object can be identified directly using the untransposed weight matrix.
The second row corresponds to the vertical shooting mode of the photographing device, which captures an original image in which the target is upright and the horizontal side is shorter than the vertical side. When the convolutional neural network model is a fully convolutional neural network model, the model is insensitive to the size and shape of the original image, so in principle the untransposed weight matrix could be used to identify the target object in the original image. However, because TFLite does not support dynamic memory allocation when loading a model, this cannot be realized.
Here, all layers in a fully convolutional neural network model are convolutional layers and there is no fully connected layer, so the model is insensitive to the size and shape of the original image.
The third row corresponds to the vertical shooting mode of the photographing device with the original image rotated by 90 degrees, so that the horizontal side of the original image is longer than its vertical side. In this case, the size of the original image matches the horizontal shooting mode of the photographing device. Furthermore, so that the rotated original image matches the image processing model adapted to the horizontal shooting mode, the weight matrix of the image processing model may be transposed. The memory allocation does not change in this case, and the original image and the image processing model match. The image processing model can therefore be used to process the original image and complete target detection.
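The third row of FIG. 7 can be sketched as a small preparation step, assuming a model adapted to the horizontal shooting mode and reading the rightward transposition as a 90-degree clockwise rotation. The names `rotate_right` and `prepare_for_horizontal_model` are hypothetical. The point of the sketch is that the input handed to the model has the same shape in both shooting modes, so a fixed memory allocation suffices.

```python
def rotate_right(matrix):
    """Rotate 90 degrees clockwise: the first row becomes the last column."""
    return [list(row) for row in zip(*matrix[::-1])]


def prepare_for_horizontal_model(image, weights):
    """Return (image, weights) for a model assumed to expect horizontal input.

    A vertical image is rotated to the right and the weight matrix is rotated
    the same way; a horizontal image and its weights are passed through as-is.
    """
    height, width = len(image), len(image[0])
    if width < height:  # vertical shot: first and second processing
        return rotate_right(image), rotate_right(weights)
    return image, weights


weights = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
horizontal_shot = [[0] * 4 for _ in range(3)]  # 3 rows x 4 columns
vertical_shot = [[0] * 3 for _ in range(4)]    # 4 rows x 3 columns

for shot in (horizontal_shot, vertical_shot):
    prepared_image, prepared_weights = prepare_for_horizontal_model(shot, weights)
    # Both shots end up with the same input shape: 3 rows x 4 columns.
    print(len(prepared_image), len(prepared_image[0]))
```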
In the application scenario of the present disclosure, on the one hand, by performing the first processing on the original image and the second processing on the image processing model, only one image processing model needs to be stored and loaded, which greatly saves memory space and reduces loading time. On the other hand, the target object can be identified by processing the original image with the image processing model, so that target detection in both shooting modes can be completed while the performance of the image processing model is maintained, which satisfies the real-time requirements of target detection and facilitates deployment on the photographing device for online use.
It should be noted that although the above exemplary embodiments describe the steps of the method of the present disclosure in a particular order, this does not require or imply that the steps must be performed in that order, or that all of the steps must be performed to achieve the desired result. Additionally or alternatively, some steps may be omitted, multiple steps may be combined into one step, and/or one step may be divided into multiple steps.
In addition, an exemplary embodiment of the present disclosure further provides an image processing apparatus. FIG. 8 shows a schematic structural diagram of the image processing apparatus. As shown in FIG. 8, the image processing apparatus may include a memory 810 and a processor 820, wherein:
the memory 810 is configured to store executable instructions of the processor 820;
the processor 820 is configured to perform, by executing the executable instructions:
acquiring an original image, the original image including a target object;
when the original image does not match a preset image processing model, performing first processing on the original image and second processing on the image processing model, respectively, so that the original image and the image processing model match;
processing the original image by the image processing model to identify the target object in the original image.
In an exemplary embodiment of the present disclosure, the original image is generated after a preset photographing device captures an image;
wherein, when the photographing device shoots horizontally, the horizontal side of the original image is longer than its vertical side; when the photographing device shoots vertically, the horizontal side of the original image is shorter than its vertical side.
In an exemplary embodiment of the present disclosure, the apparatus further includes:
when the horizontal side of the original image is longer than its vertical side, the original image and the preset image processing model match; when the horizontal side of the original image is shorter than its vertical side, the original image and the preset image processing model do not match; or,
when the horizontal side of the original image is shorter than its vertical side, the original image and the preset image processing model match; when the horizontal side of the original image is longer than its vertical side, the original image and the preset image processing model do not match.
In an exemplary embodiment of the present disclosure, the target object in the original image corresponds to the horizontal side of the original image.
In an exemplary embodiment of the present disclosure, performing the first processing on the original image and the second processing on the image processing model, respectively, when the original image does not match the preset image processing model includes:
determining the shape of the original image according to data collected by an inertial measurement unit of the photographing device;
when the shape of the original image does not match the preset image processing model, performing the first processing on the original image and the second processing on the image processing model.
In an exemplary embodiment of the present disclosure, performing the first processing on the original image includes:
rotating the original image by 90 degrees, so that the target object in the original image is also rotated by 90 degrees.
In an exemplary embodiment of the present disclosure, before the first processing is performed on the original image and the second processing is performed on the image processing model, respectively, the apparatus further includes:
determining the orientation of the target object in the original image according to data collected by the inertial measurement unit of the photographing device;
when the target object is oriented upside down, flipping the original image by 180 degrees.
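The 180-degree flip can be sketched as follows. How the inertial measurement unit data is converted into an orientation is not detailed here, so the sketch assumes a boolean flag `is_inverted` has already been derived from that data; the flag and the function names `rotate_180` and `make_upright` are hypothetical.

```python
def rotate_180(image):
    """Turn the image upside down: reverse the row order and each row."""
    return [row[::-1] for row in reversed(image)]


def make_upright(image, is_inverted: bool):
    """Flip the image by 180 degrees only when the target object is upside down.

    is_inverted is assumed to have been derived from inertial measurement data.
    """
    return rotate_180(image) if is_inverted else image


inverted_image = [
    [1, 2],
    [3, 4],
]
print(make_upright(inverted_image, is_inverted=True))   # [[4, 3], [2, 1]]
print(make_upright(inverted_image, is_inverted=False))  # [[1, 2], [3, 4]]
```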
In an exemplary embodiment of the present disclosure, the apparatus further includes:
when the original image does not match the preset image processing model, performing the second processing on the image processing model so that the original image and the image processing model match.
In an exemplary embodiment of the present disclosure, performing the second processing on the image processing model includes:
if the image processing model is a convolutional neural network model, obtaining the weight matrix in the convolutional neural network model;
transposing the weight matrix, thereby performing the second processing on the convolutional neural network model.
In an exemplary embodiment of the present disclosure, processing the original image by the image processing model includes:
performing an inner product operation on the original image using the transposed weight matrix to obtain image features;
performing nonlinear processing on the image features to obtain nonlinear features, and performing feature compression on the nonlinear features to obtain compressed features;
performing full connection processing on the compressed features to process the original image.
In an exemplary embodiment of the present disclosure, the apparatus further includes:
when the original image matches the preset image processing model, processing the original image by the image processing model to identify the target object in the original image.
The specific details of the above image processing apparatus have already been described in detail for the corresponding image processing method and are therefore not repeated here.
With the image processing apparatus provided by this exemplary embodiment, on the one hand, by performing the first processing on the original image and the second processing on the image processing model, only one image processing model needs to be stored and loaded, which greatly saves memory space and reduces loading time. On the other hand, the target object can be identified by processing the original image with the image processing model, so that target detection in both shooting modes can be completed while the performance of the image processing model is maintained, which satisfies the real-time requirements of target detection and facilitates deployment on the photographing device for online use.
It should be noted that although several modules or units of the image processing apparatus 800 are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.
此外,在本公开的示例性实施例中,还提供了一种能够实现上述方法的电子设备。In addition, in an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
下面参照图9来描述根据本发明的这种实施例的电子设备900。图9显示的电子设备900仅仅是一个示例,不应对本发明实施例的功能和使用范围带来任何限制。An electronic device 900 according to such an embodiment of the present invention is described below with reference to FIG. 9 . The electronic device 900 shown in FIG. 9 is only an example, and should not impose any limitations on the function and scope of use of the embodiments of the present invention.
如图9所示,电子设备900以通用计算设备的形式表现。电子设备900的组件可以包括但不限于:上述至少一个处理单元910、上述至少一个存储单元920、连接不同系统组件(包括 存储单元920和处理单元910)的总线930、显示单元940。As shown in FIG. 9, electronic device 900 takes the form of a general-purpose computing device. The components of the electronic device 900 may include, but are not limited to: the above-mentioned at least one processing unit 910, the above-mentioned at least one storage unit 920, a bus 930 connecting different system components (including the storage unit 920 and the processing unit 910), and a display unit 940.
The storage unit stores program code that can be executed by the processing unit 910, so that the processing unit 910 performs the steps according to the various exemplary embodiments of the present invention described in the "Exemplary Method" section above.
The storage unit 920 may include readable media in the form of volatile storage units, such as a random access memory (RAM) unit 921 and/or a cache storage unit 922, and may further include a read-only memory (ROM) unit 923.
The storage unit 920 may also include a program/utility 924 having a set of (at least one) program modules 925. Such program modules 925 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 930 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 900 may also communicate with one or more external devices 1100 (for example, a keyboard, a pointing device, or a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 900, and/or with any device (for example, a router or a modem) that enables the electronic device 900 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 950. The electronic device 900 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 960. As shown in the figure, the network adapter 960 communicates with the other modules of the electronic device 900 via the bus 930. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 900, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
From the description of the above embodiments, those skilled in the art will readily understand that the exemplary embodiments described here may be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solutions according to the embodiments of the present disclosure may be embodied in the form of a software product. The software product may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and includes several instructions that cause a computing device (such as a personal computer, a server, a terminal apparatus, or a network device) to execute the method according to the embodiments of the present disclosure.
An exemplary embodiment of the present disclosure further provides a computer-readable storage medium on which a program product capable of implementing the method described in this specification is stored. In some possible embodiments, aspects of the present invention may also be implemented in the form of a program product that includes program code. When the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to the various exemplary embodiments of the present invention described in the "Exemplary Method" section above.
Referring to FIG. 10, a program product 1000 for implementing the above method according to an embodiment of the present invention is described. It may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer. However, the program product of the present invention is not limited thereto. In this document, a readable storage medium may be any tangible medium that contains or stores a program for use by, or in connection with, an instruction execution system, apparatus, or device.
The program product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device.
The program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.
Program code for carrying out the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computing device (for example, over the Internet through an Internet service provider).
Other embodiments of the present disclosure will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed here. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common general knowledge or customary technical means in the art that are not disclosed in the present disclosure. The specification and embodiments are to be regarded as exemplary only; the true scope and spirit of the present disclosure are indicated by the claims.

Claims (24)

  1. An image processing method, characterized by comprising:
    acquiring an original image, the original image including a target object;
    when the original image does not match a preset image processing model, performing first processing on the original image and second processing on the image processing model, respectively, so that the original image matches the image processing model; and
    processing the original image by means of the image processing model to identify the target object in the original image.
  2. The method according to claim 1, characterized in that the original image is generated after a preset photographing device captures an image;
    wherein, when the photographing device shoots in landscape mode, the horizontal side of the original image is longer than the vertical side; and when the photographing device shoots in portrait mode, the horizontal side of the original image is shorter than the vertical side.
  3. The method according to claim 1 or 2, characterized in that the method further comprises:
    when the horizontal side of the original image is longer than the vertical side, the original image and the preset image processing model match each other; when the horizontal side of the original image is shorter than the vertical side, the original image and the preset image processing model do not match; or,
    when the horizontal side of the original image is shorter than the vertical side, the original image and the preset image processing model match each other; when the horizontal side of the original image is longer than the vertical side, the original image and the preset image processing model do not match.
  4. The method according to claim 1 or 2, characterized in that the target object in the original image corresponds to the horizontal side of the original image.
  5. The method according to claim 2, characterized in that performing the first processing on the original image and the second processing on the image processing model, respectively, when the original image does not match the preset image processing model comprises:
    determining the shape of the original image according to data collected by an inertial measurement unit of the photographing device; and
    when the shape of the original image does not match the preset image processing model, performing the first processing on the original image and the second processing on the image processing model.
  6. The method according to claim 5, characterized in that performing the first processing on the original image comprises:
    rotating the original image by 90 degrees, so that the target object in the original image is also rotated by 90 degrees.
  7. The method according to claim 2, characterized in that, before performing the first processing on the original image and the second processing on the image processing model, respectively, the method further comprises:
    determining the orientation of the target object in the original image according to data collected by an inertial measurement unit of the photographing device; and
    when the target object is upside down, flipping the original image by 180 degrees.
  8. The method according to claim 1, characterized in that the method further comprises:
    when the original image does not match the preset image processing model, performing the second processing on the image processing model so that the original image matches the image processing model.
  9. The method according to claim 7 or 8, characterized in that performing the second processing on the image processing model comprises:
    if the image processing model is a convolutional neural network model, obtaining a weight matrix of the convolutional neural network model; and
    transposing the weight matrix, thereby performing the second processing on the convolutional neural network model.
  10. The method according to claim 9, characterized in that processing the original image by means of the image processing model comprises:
    performing an inner product operation on the original image using the transposed weight matrix to obtain image features;
    performing nonlinear processing on the image features to obtain nonlinear features, and performing feature compression on the nonlinear features to obtain compressed features; and
    performing fully connected processing on the compressed features, thereby processing the original image.
  11. The method according to claim 1, characterized in that the method further comprises:
    when the original image matches the preset image processing model, processing the original image by means of the image processing model to identify the target object in the original image.
  12. An image processing apparatus, characterized by comprising:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to perform, by executing the executable instructions:
    acquiring an original image, the original image including a target object;
    when the original image does not match a preset image processing model, performing first processing on the original image and second processing on the image processing model, respectively, so that the original image matches the image processing model; and
    processing the original image by means of the image processing model to identify the target object in the original image.
  13. The apparatus according to claim 12, characterized in that the original image is generated after a preset photographing device captures an image;
    wherein, when the photographing device shoots in landscape mode, the horizontal side of the original image is longer than the vertical side; and when the photographing device shoots in portrait mode, the horizontal side of the original image is shorter than the vertical side.
  14. The apparatus according to claim 12 or 13, characterized in that the apparatus further comprises:
    when the horizontal side of the original image is longer than the vertical side, the original image and the preset image processing model match each other; when the horizontal side of the original image is shorter than the vertical side, the original image and the preset image processing model do not match; or,
    when the horizontal side of the original image is shorter than the vertical side, the original image and the preset image processing model match each other; when the horizontal side of the original image is longer than the vertical side, the original image and the preset image processing model do not match.
  15. The apparatus according to claim 12 or 13, characterized in that the target object in the original image corresponds to the horizontal side of the original image.
  16. The apparatus according to claim 13, characterized in that performing the first processing on the original image and the second processing on the image processing model, respectively, when the original image does not match the preset image processing model comprises:
    determining the shape of the original image according to data collected by an inertial measurement unit of the photographing device; and
    when the shape of the original image does not match the preset image processing model, performing the first processing on the original image and the second processing on the image processing model.
  17. The apparatus according to claim 16, characterized in that performing the first processing on the original image comprises:
    rotating the original image by 90 degrees, so that the target object in the original image is also rotated by 90 degrees.
  18. The apparatus according to claim 13, characterized in that, before the first processing is performed on the original image and the second processing is performed on the image processing model, respectively, the apparatus further comprises:
    determining the orientation of the target object in the original image according to data collected by an inertial measurement unit of the photographing device; and
    when the target object is upside down, flipping the original image by 180 degrees.
  19. The apparatus according to claim 12, characterized in that the apparatus further comprises:
    when the original image does not match the preset image processing model, performing the second processing on the image processing model so that the original image matches the image processing model.
  20. The apparatus according to claim 18 or 19, characterized in that performing the second processing on the image processing model comprises:
    if the image processing model is a convolutional neural network model, obtaining a weight matrix of the convolutional neural network model; and
    transposing the weight matrix, thereby performing the second processing on the convolutional neural network model.
  21. The apparatus according to claim 20, characterized in that processing the original image by means of the image processing model comprises:
    performing an inner product operation on the original image using the transposed weight matrix to obtain image features;
    performing nonlinear processing on the image features to obtain nonlinear features, and performing feature compression on the nonlinear features to obtain compressed features; and
    performing fully connected processing on the compressed features, thereby processing the original image.
  22. The apparatus according to claim 12, characterized in that the apparatus further comprises:
    when the original image matches the preset image processing model, processing the original image by means of the image processing model to identify the target object in the original image.
  23. A computer-readable medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, the image processing method according to any one of claims 1-11 is implemented.
  24. An electronic device, characterized by comprising:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to perform the image processing method according to any one of claims 1-11 by executing the executable instructions.
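The weight transposition of claim 9 and the processing chain of claim 10 (inner product, nonlinearity, feature compression, fully connected layer) can be illustrated with a small NumPy sketch. It rests on assumptions that are not taken from the claims: a single-channel image, one 3x3 kernel, ReLU as the nonlinearity, 2x2 max pooling as the feature compression, and hypothetical helper names. It demonstrates only the transposition identity; a 90-degree rotation differs from a transposition by one axis flip, and how that flip is handled is not specified in this section.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Sliding inner product of kernel and image patches ('valid' region)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    """Nonlinear processing (assumed to be ReLU for this sketch)."""
    return np.maximum(x, 0.0)


def max_pool2x2(x):
    """Feature compression (assumed to be 2x2 max pooling)."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


def forward(image, kernel, fc_weights):
    """Inner product -> nonlinearity -> compression -> fully connected."""
    pooled = max_pool2x2(relu(correlate2d_valid(image, kernel)))
    return np.tensordot(fc_weights, pooled, axes=([1, 2], [0, 1]))


rng = np.random.default_rng(0)
image = rng.standard_normal((6, 8))          # landscape input
kernel = rng.standard_normal((3, 3))         # convolution weight matrix
fc_weights = rng.standard_normal((5, 2, 3))  # 5 outputs over a 2x3 feature map

# Original model on the original image.
logits = forward(image, kernel, fc_weights)

# Same stored weights, transposed on their spatial axes, on the transposed image.
logits_t = forward(image.T, kernel.T, fc_weights.transpose(0, 2, 1))

features = correlate2d_valid(image, kernel)
features_t = correlate2d_valid(image.T, kernel.T)
print(np.allclose(features_t, features.T), np.allclose(logits_t, logits))
```

The design point is that the transposed weights are derived from the single stored model at run time, so the transposed input yields the same outputs without a second set of trained parameters.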
PCT/CN2021/082809 2021-03-24 2021-03-24 Image processing method and apparatus, and medium and electronic device WO2022198517A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/082809 WO2022198517A1 (en) 2021-03-24 2021-03-24 Image processing method and apparatus, and medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/082809 WO2022198517A1 (en) 2021-03-24 2021-03-24 Image processing method and apparatus, and medium and electronic device

Publications (1)

Publication Number Publication Date
WO2022198517A1 true WO2022198517A1 (en) 2022-09-29

Family

ID=83395002

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/082809 WO2022198517A1 (en) 2021-03-24 2021-03-24 Image processing method and apparatus, and medium and electronic device

Country Status (1)

Country Link
WO (1) WO2022198517A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130058582A1 (en) * 2011-09-02 2013-03-07 Petrus J.L. van Beek Edge based template matching
CN109165619A (en) * 2018-09-04 2019-01-08 阿里巴巴集团控股有限公司 A kind of processing method of image, device and electronic equipment
CN111862124A (en) * 2020-07-29 2020-10-30 Oppo广东移动通信有限公司 Image processing method, device, equipment and computer readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21932145; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21932145; Country of ref document: EP; Kind code of ref document: A1)