CN112434587A - Image processing method and device and storage medium

Info

Publication number
CN112434587A
CN112434587A (application CN202011281283.3A)
Authority
CN
China
Prior art keywords
detection
sample
position information
key point
target
Prior art date
Legal status
Granted
Application number
CN202011281283.3A
Other languages
Chinese (zh)
Other versions
CN112434587B (en)
Inventor
朱兆琪
安山
Current Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202011281283.3A
Publication of CN112434587A
Application granted
Publication of CN112434587B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/60 Rotation of whole images or parts thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the invention disclose an image processing method and device and a storage medium. The image processing method includes: when an original image is acquired, inputting the original image into a detection object recognition model to obtain a plurality of detection frames corresponding to a target detection object in the original image, a plurality of score information of the detection frames, and a plurality of key point position information of the target detection object in the detection frames; screening a target detection frame from the plurality of detection frames according to the plurality of score information; determining a rotation angle of the original image according to the target key point position information corresponding to the target detection frame, where the target key point position information is the key point position information, among the plurality of key point position information, that corresponds to the target detection frame; and rotating the original image according to the rotation angle and displaying the rotated original image.

Description

Image processing method and device and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image processing method and apparatus, and a storage medium.
Background
In recent years, live video streaming has developed rapidly, and livestream selling has gradually become an important direction for major e-commerce platforms. The beautification technology of live-streaming software makes the anchor (the streamer) look better on camera and plays an important role in attracting consumers; live-broadcast beautification has therefore become a new requirement for live-streaming software. During a broadcast, the anchor may rotate the mobile phone and stream in landscape ("horizontal screen") mode, which places higher demands on the processing of original images that include the anchor.
In the beautification pipeline, the position of the face must be detected first, followed by key point detection and other processing. A conventional face detection algorithm, however, can only detect a face in the upright direction, a so-called frontal face image. During a live broadcast the screen may be rotated to landscape, in which case the direction of the face is horizontal and a conventional detection algorithm cannot detect it.
In the prior art, once the mobile phone determines the detection frame of the anchor's head, it displays the original image including the anchor directly according to the position of that detection frame. Consequently, when the phone captures the original image in landscape orientation but displays it in portrait orientation, the displayed body direction of the anchor is inconsistent with the anchor's actual body direction, which reduces the accuracy of image processing.
Disclosure of Invention
In order to solve the above technical problem, embodiments of the present invention are directed to providing an image processing method and apparatus, and a storage medium, which can improve accuracy of image processing in an image processing apparatus.
The technical scheme of the invention is realized as follows:
an embodiment of the present application provides an image processing method, including:
under the condition that an original image is obtained, inputting the original image into a detection object identification model to obtain a plurality of detection frames corresponding to a target detection object in the original image, a plurality of score information of the detection frames and a plurality of key point position information of the target detection object in the detection frames;
screening a target detection frame from the plurality of detection frames according to the plurality of score information;
determining a rotation angle of an original image according to target key point position information corresponding to the target detection frame, wherein the target key point position information is key point position information corresponding to the target detection frame in the plurality of key point position information;
and rotating the original image according to the rotation angle and displaying the rotated original image.
An embodiment of the present application provides an image processing apparatus, including:
an input unit, configured to, when an original image is obtained, input the original image into a detection object identification model to obtain a plurality of detection frames corresponding to a target detection object in the original image, a plurality of score information of the detection frames and a plurality of key point position information of the target detection object in the detection frames;
the screening unit is used for screening a target detection frame from the plurality of detection frames according to the plurality of score information;
a determining unit, configured to determine a rotation angle of an original image according to target key point position information corresponding to the target detection frame, where the target key point position information is key point position information corresponding to the target detection frame in the multiple pieces of key point position information;
a rotation unit for rotating the original image according to the rotation angle;
and the display unit is used for displaying the rotated original image.
An embodiment of the present application provides an image processing apparatus, including:
a memory, a processor and a communication bus, where the memory communicates with the processor through the communication bus, the memory stores an image processing program executable by the processor, and, when the image processing program is executed, the processor performs the image processing method described above.
An embodiment of the present application provides a storage medium storing a computer program for use in an image processing device; when the computer program is executed by a processor, it implements the image processing method described above.
Embodiments of the invention provide an image processing method and device and a storage medium. The image processing method includes: when an original image is acquired, inputting the original image into a detection object recognition model to obtain a plurality of detection frames corresponding to a target detection object in the original image, a plurality of score information of the detection frames, and a plurality of key point position information of the target detection object in the detection frames; screening a target detection frame from the plurality of detection frames according to the plurality of score information; determining a rotation angle of the original image according to the target key point position information corresponding to the target detection frame, where the target key point position information is the key point position information, among the plurality of key point position information, that corresponds to the target detection frame; and rotating the original image according to the rotation angle and displaying the rotated original image. With this method, when the image processing device acquires an original image, it can input the original image into the detection object recognition model to obtain the plurality of detection frames, the plurality of score information and the plurality of key point position information; it can then determine the target detection frame according to the plurality of score information, determine the rotation angle of the original image using the target key point position information corresponding to the target detection frame, and directly rotate the original image according to the rotation angle and display the rotated original image, thereby improving the accuracy of the image processing device in image processing.
Drawings
Fig. 1 is a flowchart of an image processing method according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of an exemplary image processing apparatus according to an embodiment of the present application;
Fig. 3 is a first schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a composition structure of an image processing apparatus according to an embodiment of the present application.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the prior art there are many face detection algorithms. They fall broadly into anchor-based and anchor-free methods; anchor-free models involve a large amount of computation and are ill-suited to integration on mobile terminals. Anchor-based detection models are divided into single-stage and multi-stage detection algorithms: a single-stage model outputs the detection frame positions directly, whereas a multi-stage algorithm requires several models to process the image before the final positions are output, so the single-stage model is faster on a mobile phone and better suited to mobile deployment. The SSD (Single Shot MultiBox Detector) algorithm is a common single-stage algorithm with advantages such as high speed and ease of deployment, but the conventional SSD algorithm cannot detect faces at all angles, so the angle of the face in an image cannot be identified, which reduces accuracy in image processing.
Example one
An embodiment of the present application provides an image processing method. Fig. 1 is a flowchart of the image processing method provided in the embodiment of the present application; as shown in Fig. 1, the image processing method may include:
s101, when the original image is acquired, inputting the original image into a detection object recognition model to obtain a plurality of detection frames corresponding to the target detection object in the original image, a plurality of score information of the detection frames and a plurality of key point position information of the target detection object in the detection frames.
The image processing method provided by the embodiment of the application is suitable for a scene that an image processing device processes a received original image.
In the embodiment of the present application, the image processing apparatus may be implemented in various forms. For example, the image processing apparatus described in the present application may include mobile devices such as a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a Personal Digital Assistant (PDA), a Portable Media Player (PMP), a navigation apparatus, a wearable device, a smart band or a pedometer, as well as fixed devices such as a digital TV or a desktop computer.
In this embodiment of the present application, the original image may be image information acquired by an image processing device using a camera, or may also be an image acquired by the image processing device from another device, which may be specifically determined according to an actual situation, and this is not limited in this embodiment of the present application.
In this embodiment of the present application, the original image may be an image of an animal, an image of a plant, an image of a person, or some other image, which may be determined according to the actual situation; this is not limited in the embodiments of the present application.
It should be noted that the original image includes a target detection object, and the target detection object may be an animal, a plant, or a human, and may be specifically determined according to an actual situation, which is not limited in this embodiment of the application.
In this embodiment of the application, the detection object recognition model may specifically be a model for recognizing a target detection object in an original image, and the detection object recognition model is a model obtained by training sample images with different rotation angles. For example, if the target detection object is a person, the detection object recognition model is a model for recognizing the person in the original image.
In this embodiment of the application, the detection object recognition model may be a model obtained according to the Single Shot MultiBox Detector (SSD) algorithm.
In the embodiment of the present application, the number of the target detection objects may be one, two, or multiple, and may be specifically determined according to actual situations, which is not limited in the embodiment of the present application.
In the embodiment of the present application, the detection frames are used to frame the face of a target detection object, and one target detection object corresponds to a plurality of detection frames.
In the embodiment of the present application, each piece of score information is obtained from the area of a detection frame that is occupied by the target detection object. The detection frames correspond to the score information one to one; specifically, each detection frame corresponds to one piece of score information.
Note that the score information is determined by the image processing apparatus from the fraction of a detection frame's area occupied by the image of the target detection object framed within it. For example, if the image in a detection frame is the face image of the target detection object, the score information corresponding to that detection frame may be a score value for the fraction of the frame's area occupied by the face image; the score value may be 1 point, 0 points, or any value between 0 and 1, depending on the actual situation, which is not limited in the embodiment of the present application.
Note that one piece of score information may contain two score values, namely a face-frame score value and a non-face-frame score value, or a single score value, namely the face-frame score value; this may be determined according to the actual situation and is not limited in the embodiment of the present application.
In the embodiment of the present application, if the image in a detection frame is the face image of the target detection object, the face-frame score corresponding to that detection frame may be 1 point, 0.9 points, or another score value, depending on the actual situation, which is not limited in the embodiment of the present application.
In the embodiment of the present application, when the image processing apparatus acquires an original image, it inputs the original image into the detection object recognition model. After the original image passes through the convolutional layers in the detection object recognition model, the model obtains all the pixel points in the original image, derives a plurality of anchor points corresponding to the original image from the correspondence between pixel points and anchor points, and determines a plurality of detection frames from the correspondence between anchor points and detection frames, where one anchor point corresponds to one detection frame. The image processing apparatus identifies the position of each detection frame by its top-left and bottom-right corner points. The detection object recognition model scores each detection frame according to the image within it to obtain the plurality of score information of the plurality of detection frames, determines the plurality of key point position information corresponding to the plurality of detection frames using the SSD algorithm, and then outputs the plurality of detection frames corresponding to the target detection object in the original image, the plurality of score information of the detection frames, and the plurality of key point position information corresponding to the detection frames.
In the embodiment of the present application, the plurality of key point position information includes a plurality of left eye corner position information, a plurality of right eye corner position information, a plurality of nose tip position information, a plurality of left mouth corner position information, and a plurality of right mouth corner position information of the target detection object.
In the embodiment of the application, one detection frame corresponds to one left eye corner position information, one right eye corner position information, one nose tip position information, one left mouth corner position information and one right mouth corner position information; the plurality of detection frames correspond to the plurality of left eye corner position information, the plurality of right eye corner position information, the plurality of nose tip position information, the plurality of left mouth corner position information and the plurality of right mouth corner position information, namely the plurality of detection frames correspond to the plurality of key point information.
In this embodiment of the present application, the position information of the plurality of key points may specifically be 2-dimensional coordinate point information, and may also be other coordinate point information, which may specifically be determined according to an actual situation, and this is not limited in this embodiment of the present application.
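To make the shape of these outputs concrete, the following is a minimal sketch, assuming an SSD-style detector, of decoding per-anchor network outputs into detection frames (identified by their top-left and bottom-right corner points), face-frame score values and the five facial key points listed above. The array layouts, names and decoding constants are illustrative assumptions, not the patent's implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple

import numpy as np


@dataclass
class Detection:
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2): top-left and bottom-right corners
    score: float                            # face-frame score value in [0, 1]
    keypoints: np.ndarray                   # (5, 2): left eye corner, right eye corner, nose tip,
                                            # left mouth corner, right mouth corner


def decode_detections(anchors: np.ndarray, box_deltas: np.ndarray,
                      scores: np.ndarray, kpt_offsets: np.ndarray) -> List[Detection]:
    """Decode per-anchor outputs; one anchor point corresponds to one detection frame.

    anchors:     (N, 4) anchor boxes as (cx, cy, w, h)
    box_deltas:  (N, 4) predicted box offsets (dx, dy, dw, dh)
    scores:      (N,)   face-frame score values
    kpt_offsets: (N, 5, 2) key point offsets relative to the anchor centre, scaled by its size
    """
    detections = []
    for anchor, delta, score, kpts in zip(anchors, box_deltas, scores, kpt_offsets):
        cx, cy, w, h = anchor
        # standard SSD-style decoding: shift the anchor centre, rescale its size
        cx, cy = cx + delta[0] * w, cy + delta[1] * h
        w, h = w * np.exp(delta[2]), h * np.exp(delta[3])
        box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
        # key points are expressed in image coordinates around the decoded centre
        keypoints = np.array([cx, cy]) + kpts * np.array([w, h])
        detections.append(Detection(box, float(score), keypoints))
    return detections
```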
In this embodiment of the application, the image processing apparatus includes an original detection object recognition model. Before the image processing apparatus inputs an original image into the detection object recognition model to obtain the plurality of detection frames, the plurality of score information and the plurality of key point position information, it needs to train the original detection object recognition model; the trained original detection object recognition model is the detection object recognition model. Specifically, the training process is as follows. The image processing device acquires sample images at different rotation angles, the sample detection frames corresponding to the sample images, the sample score information of the sample detection frames, and the sample key point position information corresponding to the sample detection frames. It then inputs the sample images into the original detection object recognition model to obtain the to-be-trained sample detection frames, to-be-trained sample score information and to-be-trained sample key point position information corresponding to the sample images. Finally, it trains the original detection object recognition model using the to-be-trained sample detection frames, to-be-trained sample score information and to-be-trained sample key point position information together with the sample detection frames, sample score information and sample key point position information, to obtain the detection object recognition model.
In the embodiment of the present application, the sample images at different rotation angles are image information obtained by rotating the sample detection object by different angles.
For example, having obtained an original sample image I, the image processing apparatus may rotate I by different rotation angles using the affine function Rotate in formula (1), obtaining rotated images I_R, i.e., sample images at different rotation angles. The image processing device inputs these sample images at different rotation angles into the original detection object recognition model to obtain the to-be-trained sample detection frame, to-be-trained sample score information and to-be-trained sample key point position information corresponding to each sample image, and then trains the original detection object recognition model using the to-be-trained sample detection frame, the to-be-trained sample score information, the to-be-trained sample key point position information, the sample score information, the sample key point position information and the sample detection frame until the detection object recognition model is obtained.
I_R = Rotate(I, angle)    (1)
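As a concrete illustration of formula (1), the following sketch implements the Rotate affine function with OpenCV and uses it to generate sample images at different rotation angles. The library choice, file path and angle step are assumptions made for illustration; the patent does not specify them.

```python
import cv2
import numpy as np


def rotate(image: np.ndarray, angle: float) -> np.ndarray:
    """I_R = Rotate(I, angle): rotate the image by `angle` degrees about its centre."""
    h, w = image.shape[:2]
    matrix = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)  # 2x3 affine rotation matrix
    return cv2.warpAffine(image, matrix, (w, h))


# Generate training samples at different rotation angles, e.g. every 30 degrees.
original_sample = cv2.imread("sample.jpg")  # hypothetical sample image path
rotated_samples = [rotate(original_sample, angle) for angle in range(0, 360, 30)]
```

Note that the sample detection frames and sample key point positions would have to be transformed by the same rotation matrix so that the labels stay aligned with each rotated sample image; that bookkeeping is omitted from the sketch.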
It can be understood that, because the image processing apparatus trains the original detection object recognition model on sample images at different rotation angles, the resulting detection object recognition model can recognize target detection objects at different rotation angles in an acquired original image, so that the original image can be rotated by the appropriate angle, improving the accuracy of the image processing apparatus in image processing.
And S102, screening a target detection frame from the plurality of detection frames according to the plurality of score information.
In the embodiment of the present application, after the image processing apparatus inputs the original image into the detection object recognition model and obtains the plurality of detection frames corresponding to the target detection object in the original image, the plurality of score information of the plurality of detection frames, and the plurality of key point position information corresponding to the plurality of detection frames, the image processing apparatus may screen the target detection frame from the plurality of detection frames according to the plurality of score information.
In this embodiment of the present application, the target detection frame may be a detection frame for framing a face of a target detection object in an original image, the number of the target detection frames may be one or multiple, and the target detection frames may be specifically determined according to actual situations, which is not limited in this embodiment of the present application.
In the embodiment of the present application, the number of target detection objects corresponds to the number of target detection frames one to one, that is, one target detection object corresponds to one target detection frame.
In this embodiment of the application, when the image processing apparatus obtains the plurality of detection frames, it may screen out the target detection frame from them using Non-Maximum Suppression (NMS). Specifically, the process of screening the target detection frame from the plurality of detection frames according to the plurality of score information is as follows. The image processing device extracts, according to the plurality of score information, a first detection frame with the largest score from the plurality of detection frames. It then eliminates from the plurality of detection frames those whose overlapping area with the first detection frame is larger than a preset area threshold, obtaining a plurality of first candidate detection frames. If the image processing apparatus determines that the number of first candidate detection frames is less than or equal to a preset number threshold, it takes the first detection frame as the target detection frame. If it determines that the number of first candidate detection frames is greater than the preset number threshold, it continues by extracting a second detection frame with the largest score from the first candidate detection frames, eliminates from the first candidate detection frames those whose overlapping area with the second detection frame is larger than the preset area threshold to obtain a plurality of second candidate detection frames, and compares the number of second candidate detection frames with the preset number threshold. This continues until the image processing apparatus determines that the number of remaining candidate detection frames is less than or equal to the preset number threshold, at which point it takes the first detection frame and the second detection frame as the target detection frames.
The first detection frame corresponds to the first score information, and the first score information is the score information having the highest score among the plurality of score information.
In the embodiment of the present application, the image processing apparatus may sort the plurality of score information in descending order of score and take the first entry of the sorted list, thereby obtaining the first score information, i.e., the score information with the highest score among the plurality of score information.
In the embodiment of the present application, the number of the first detection frames may be one or multiple, and may be specifically determined according to an actual situation, which is not limited in the embodiment of the present application.
In the embodiment of the present application, the number of the first score information may be one, or may be multiple, and may be specifically determined according to actual situations, which is not limited in the embodiment of the present application.
The plurality of pieces of score information correspond to the plurality of detection frames, and specifically, one piece of score information corresponds to one detection frame.
In this embodiment of the application, the preset area threshold may be an area threshold configured in the image processing apparatus, may also be an area threshold obtained by the image processing apparatus according to an operation instruction of a user, and may also be an area threshold determined by the image processing apparatus according to another manner, which may specifically be determined according to an actual situation, and this is not limited in this embodiment of the application.
In this embodiment of the application, the preset number threshold may be a number threshold configured in the image processing apparatus, may also be a number threshold obtained by the image processing apparatus according to an operation instruction of a user, and may also be a number threshold determined by the image processing apparatus according to another manner, which may specifically be determined according to an actual situation, and this is not limited in this embodiment of the application.
It should be noted that the preset number threshold may be 0 or 1; other quantities can be used, and the specific quantity can be determined according to actual conditions, and the embodiment of the application is not limited to the specific quantity.
In the embodiment of the present application, when the image processing apparatus obtains the plurality of detection frames, it may store them in a set B and store the corresponding plurality of score information in a set S. When the image processing apparatus finds, from the set S, the first detection frame M corresponding to the first score information with the largest score, it deletes M from B, deletes the first score information from S, and adds M to a set D. The image processing device then eliminates from B the detection frames whose overlapping area with the first detection frame is larger than the preset area threshold, so that only the first candidate detection frames remain in B, and deletes the score information of the eliminated frames from S. If the image processing apparatus determines that the number of first candidate detection frames is less than or equal to the preset number threshold, it takes the first detection frame as the target detection frame. If it determines that the number of first candidate detection frames is greater than the preset number threshold, it continues by extracting the second detection frame with the largest score from the first candidate detection frames according to S, deletes the corresponding score information from S, and adds the extracted second detection frame to D; it then eliminates from B the detection frames whose overlapping area with the second detection frame is larger than the preset area threshold, obtaining the second candidate detection frames (the score information of the eliminated frames is likewise removed from S), and compares the number of second candidate detection frames with the preset number threshold. This continues until the image processing apparatus determines that the number of candidate detection frames remaining in B is less than or equal to the preset number threshold (for a threshold of zero, until no candidate detection frames remain in B), after which the image processing apparatus may take the first detection frame and the second detection frame in the set D as the target detection frames.
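The set-based procedure above is standard non-maximum suppression. The following is a minimal sketch of it in terms of the sets B, S and D, using intersection-over-union as the overlap measure; that measure and the default thresholds are illustrative assumptions rather than values fixed by the patent.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union, used here as the 'overlapping area' test."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        area_threshold: float = 0.5,  # the preset area (overlap) threshold
        count_threshold: int = 0      # the preset number threshold
        ) -> List[Box]:
    B = list(zip(boxes, scores))  # set B of detection frames paired with their scores S
    D: List[Box] = []             # set D of target detection frames
    while len(B) > count_threshold:
        best = max(B, key=lambda item: item[1])  # extract the frame with the largest score
        B.remove(best)
        D.append(best[0])
        # eliminate frames whose overlap with the extracted frame exceeds the threshold
        B = [item for item in B if iou(item[0], best[0]) <= area_threshold]
    return D
```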
S103, determining the rotation angle of the original image according to the target key point position information corresponding to the target detection frame, where the target key point position information is the key point position information, among the plurality of key point position information, that corresponds to the target detection frame.
In this embodiment of the application, after the image processing device screens out the target detection frame from the plurality of detection frames according to the plurality of score information, the image processing device may determine the rotation angle of the original image according to the position information of the target key point corresponding to the target detection frame.
After the image processing apparatus specifies the target detection frame, the image processing apparatus may specify the key point position information corresponding to the target detection frame from the plurality of key point position information, thereby obtaining the target key point position information.
In this embodiment of the present application, the process of determining the rotation angle of the original image by the image processing apparatus according to the position information of the target key point corresponding to the target detection frame includes: the image processing device inputs the position information of the target key point into the multi-dimensional pose model to obtain multi-dimensional position information corresponding to the position information of the target key point; and the position information of the target key point of the image processing device is input into the multi-dimensional pose model, and after the multi-dimensional position information corresponding to the position information of the target key point is obtained, the image processing device can determine the rotation angle according to the multi-dimensional position information.
In this embodiment of the present application, the multidimensional pose model may be a model for determining multidimensional position information corresponding to the target key point position information according to the target key point position information, and the multidimensional pose model may specifically be a 3-dimensional coordinate point model or other models, and may specifically be determined according to an actual situation, which is not limited in this embodiment of the present application.
In the embodiment of the present application, the multi-dimensional pose model may be a model obtained according to the Perspective-n-Point (PnP) algorithm.
In this embodiment, the multi-dimensional position information may specifically be the point pairs formed by the 2-dimensional target key point positions and their corresponding 3-dimensional points; having obtained these 2D-3D point pairs, the image processing device may determine the rotation angle of the original image from them.
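The following is a minimal sketch of this step, assuming the multi-dimensional pose model is realised with OpenCV's solvePnP: the five 2D target key points are paired with 3D points on a generic face model, and the in-plane (roll) component of the solved rotation gives the rotation angle. The 3D model coordinates and the pinhole-camera intrinsics are illustrative assumptions, not values from the patent.

```python
import cv2
import numpy as np

# Generic 3D face-model points paired with the five 2D key points, in the order
# left eye corner, right eye corner, nose tip, left mouth corner, right mouth corner.
MODEL_POINTS_3D = np.array([
    [-36.0,  37.0, -30.0],
    [ 36.0,  37.0, -30.0],
    [  0.0,   0.0,   0.0],
    [-27.0, -35.0, -30.0],
    [ 27.0, -35.0, -30.0],
], dtype=np.float64)


def rotation_angle(keypoints_2d: np.ndarray, image_width: int, image_height: int) -> float:
    """Return the in-plane rotation angle (degrees) implied by the 2D-3D point pairs."""
    camera_matrix = np.array([[image_width, 0, image_width / 2],
                              [0, image_width, image_height / 2],
                              [0, 0, 1]], dtype=np.float64)  # crude pinhole assumption
    dist_coeffs = np.zeros(4)  # assume no lens distortion
    _, rvec, _ = cv2.solvePnP(MODEL_POINTS_3D, keypoints_2d.astype(np.float64),
                              camera_matrix, dist_coeffs)
    rotation_matrix, _ = cv2.Rodrigues(rvec)
    # roll: rotation about the camera's optical axis, i.e. the in-plane angle of the face
    return float(np.degrees(np.arctan2(rotation_matrix[1, 0], rotation_matrix[0, 0])))
```

In step S104 below, the original image can then be rotated back by this angle (for example with a Rotate call of the kind shown for formula (1)) before being displayed.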
And S104, rotating the original image according to the rotation angle and displaying the rotated original image.
In the embodiment of the application, after the image processing device determines the rotation angle of the original image according to the position information of the target key point corresponding to the target detection frame, the image processing device may rotate the original image according to the rotation angle to obtain the rotated original image, and then the image processing device may display the rotated original image.
For example, as shown in Fig. 2, when the image processing apparatus acquires an original image, it inputs the original image into the detection object recognition model. After the original image passes through the convolutional layers in the detection object recognition model, the model obtains all the pixel points in the original image, derives a plurality of anchor points corresponding to the original image from the correspondence between pixel points and anchor points, and determines a plurality of detection frames from those anchor points according to the correspondence between anchor points and detection frames, the anchor points corresponding to the detection frames one to one. The detection object recognition model identifies each detection frame by its top-left and bottom-right corner points, scores each detection frame according to the image within it to obtain the plurality of score information of the plurality of detection frames, and determines the plurality of key point position information corresponding to the plurality of detection frames using the SSD algorithm.
The number of convolutional layers in the detection object recognition model may be 5 convolutional layers, 8 convolutional layers, or other number of convolutional layers, and may be determined according to actual conditions, which is not limited in the embodiment of the present application.
It can be understood that, when the image processing apparatus acquires an original image, it may input the original image into the detection object recognition model, obtain the plurality of detection frames, the plurality of score information and the plurality of key point position information using the detection object recognition model, determine the target detection frame according to the plurality of score information, determine the rotation angle of the original image using the target key point position information corresponding to the target detection frame, and directly rotate the original image according to the rotation angle and display the rotated original image, thereby improving the accuracy of the image processing apparatus in image processing.
Example two
Based on the same inventive concept of the embodiments, the embodiments of the present application provide an image processing apparatus 1 corresponding to an image processing method; fig. 3 is a schematic diagram illustrating a first configuration of an image processing apparatus according to an embodiment of the present disclosure, where the image processing apparatus 1 may include:
an input unit 11, configured to, when an original image is obtained, input the original image into a detection object identification model, to obtain a plurality of detection frames corresponding to a target detection object in the original image, a plurality of score information of the plurality of detection frames, and a plurality of key point position information of the target detection object in the plurality of detection frames;
a screening unit 12, configured to screen a target detection frame from the plurality of detection frames according to the plurality of score information;
a determining unit 13, configured to determine a rotation angle of the original image according to target key point position information corresponding to the target detection frame, where the target key point position information is key point position information corresponding to the target detection frame in the multiple pieces of key point position information;
a rotation unit 14 for rotating the original image according to the rotation angle;
and a display unit 15 for displaying the rotated original image.
In some embodiments of the present application, the apparatus further comprises an acquisition unit and a training unit;
the acquisition unit is used for acquiring the sample image, a sample detection frame corresponding to the sample image, sample score information of the sample detection frame and sample key point position information corresponding to the sample detection frame;
the input unit 11 is configured to input the sample image into an original detection object recognition model, so as to obtain a to-be-trained sample detection frame, to-be-trained sample score information, and to-be-trained sample key point position information corresponding to the sample image;
the training unit is configured to train the original detection object identification model by using the to-be-trained sample detection box, the to-be-trained sample score information, the to-be-trained sample keypoint position information, the sample score information, the sample keypoint position information, and the sample detection box, so as to obtain the detection object identification model.
In some embodiments of the present application, the plurality of keypoint location information comprises a plurality of left eye corner location information, a plurality of right eye corner location information, a plurality of nose tip location information, a plurality of left mouth corner location information, and a plurality of right mouth corner location information of the target detection object.
In some embodiments of the present application, the input unit 11 is configured to input the target keypoint location information into a multi-dimensional pose model, so as to obtain multi-dimensional location information corresponding to the target keypoint location information;
the determining unit 13 is configured to determine the rotation angle according to the multi-dimensional position information.
In some embodiments of the present application, the apparatus further comprises an extraction unit and a culling unit;
the extracting unit is used for extracting a first detection frame with the largest score from the plurality of detection frames according to the plurality of score information;
the removing unit is used for removing the detection frames with the overlapping area larger than a preset area threshold value from the plurality of detection frames to obtain a plurality of first candidate detection frames; and taking the first detection frame as the target detection frame when the number of the plurality of first candidate detection frames is less than or equal to a preset number threshold.
In some embodiments of the present application, the extracting unit is configured to, in a case that the number of the plurality of first candidate detection frames is greater than the preset number threshold, continue to extract a second detection frame with a largest score from the plurality of first candidate detection frames;
the removing unit is used for removing the detection frames with the overlapping area larger than a preset area threshold value from the plurality of first candidate detection frames to obtain a plurality of second candidate detection frames; comparing the number of the second candidate detection frames with the preset number threshold; and taking the first detection frame and the second detection frame as the target detection frame until the number of the plurality of second candidate detection frames is less than or equal to the preset number threshold.
In practical applications, the input unit 11, the screening unit 12, the determining unit 13, the rotation unit 14 and the display unit 15 may be implemented by a processor 16 on the image processing apparatus 1, specifically by a Central Processing Unit (CPU), a Microprocessor Unit (MPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA) or the like; the above data storage may be realized by the memory 17 on the image processing apparatus 1.
An embodiment of the present invention further provides an image processing apparatus 1, and as shown in fig. 4, the image processing apparatus 1 includes: a processor 16, a memory 17 and a communication bus 18, the memory 17 communicating with the processor 16 via the communication bus 18, the memory 17 storing a program executable by the processor 16, the program, when executed, performing the image processing method as described above via the processor 16.
In practical applications, the memory 17 may be a volatile memory, such as a Random-Access Memory (RAM); or a non-volatile memory, such as a Read-Only Memory (ROM), a flash memory, a Hard Disk Drive (HDD) or a Solid-State Drive (SSD); or a combination of the above types of memories, and it provides instructions and data to the processor 16.
An embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, which when executed by the processor 16 implements the image processing method as described above.
It can be understood that, when the image processing apparatus acquires an original image, it may input the original image into the detection object recognition model, obtain the plurality of detection frames, the plurality of score information and the plurality of key point position information using the detection object recognition model, determine the target detection frame according to the plurality of score information, determine the rotation angle of the original image using the target key point position information corresponding to the target detection frame, and directly rotate the original image according to the rotation angle and display the rotated original image, thereby improving the accuracy of the image processing apparatus in image processing.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims (10)

1. An image processing method, characterized in that the method comprises:
under the condition that an original image is obtained, inputting the original image into a detection object identification model to obtain a plurality of detection frames corresponding to a target detection object in the original image, a plurality of score information of the detection frames and a plurality of key point position information of the target detection object in the detection frames;
screening a target detection frame from the plurality of detection frames according to the plurality of score information;
determining a rotation angle of an original image according to target key point position information corresponding to the target detection frame, wherein the target key point position information is key point position information corresponding to the target detection frame in the plurality of key point position information;
and rotating the original image according to the rotation angle and displaying the rotated original image.
2. The method according to claim 1, wherein before inputting the original image into the detection object recognition model to obtain a plurality of detection frames corresponding to the target detection object in the original image, a plurality of score information of the plurality of detection frames, and a plurality of key point position information of the target detection object in the plurality of detection frames, the method further comprises:
acquiring the sample image, a sample detection frame corresponding to the sample image, sample score information of the sample detection frame and sample key point position information corresponding to the sample detection frame;
inputting the sample image into an original detection object recognition model to obtain a sample detection frame to be trained, score information of the sample to be trained and key point position information of the sample to be trained, which correspond to the sample image;
and training the original detection object identification model by using the to-be-trained sample detection frame, the to-be-trained sample score information, the to-be-trained sample key point position information, the sample score information, the sample key point position information and the sample detection frame to obtain the detection object identification model.
3. The method of claim 1, wherein the plurality of keypoint location information comprises a plurality of left eye corner location information, a plurality of right eye corner location information, a plurality of nose tip location information, a plurality of left mouth corner location information, and a plurality of right mouth corner location information of the target detection object.
4. The method according to claim 1, wherein the determining the rotation angle of the original image according to the position information of the target key point corresponding to the target detection frame comprises:
inputting the position information of the target key point into a multi-dimensional pose model to obtain multi-dimensional position information corresponding to the position information of the target key point;
and determining the rotation angle according to the multi-dimensional position information.
5. The method of claim 1, wherein the screening the plurality of detection frames for a target detection frame based on the plurality of score information comprises:
extracting a first detection frame with the largest score from the plurality of detection frames according to the plurality of score information;
removing the detection frames with the overlapping area larger than a preset area threshold value from the plurality of detection frames to obtain a plurality of first candidate detection frames;
and taking the first detection frame as the target detection frame when the number of the plurality of first candidate detection frames is less than or equal to a preset number threshold.
6. The method according to claim 5, wherein after the removing, from the plurality of detection frames, of the detection frames whose overlapping area with the first detection frame is larger than the preset area threshold to obtain the plurality of first candidate detection frames, the method further comprises:
when the number of the plurality of first candidate detection frames is larger than the preset number threshold, continuing to extract a second detection frame with the largest score from the plurality of first candidate detection frames;
removing, from the plurality of first candidate detection frames, the detection frames whose overlapping area with the second detection frame is larger than the preset area threshold, to obtain a plurality of second candidate detection frames; comparing the number of the plurality of second candidate detection frames with the preset number threshold;
and when the number of the plurality of second candidate detection frames is less than or equal to the preset number threshold, taking the first detection frame and the second detection frame as the target detection frames.
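Claims 5 and 6 together describe a greedy, non-maximum-suppression-style loop: pick the highest-scoring frame, discard frames overlapping it beyond an area threshold, and repeat while too many candidates remain. A sketch follows; measuring overlap as raw intersection area mirrors the claims' wording, though intersection-over-union is the more common practical choice.

```python
# A sketch of the screening loop of claims 5-6; thresholds and the raw
# intersection-area overlap measure follow the claims' wording.
import numpy as np

def intersection_area(a: np.ndarray, b: np.ndarray) -> float:
    """Overlap area of two [x1, y1, x2, y2] boxes."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    return w * h

def screen_targets(boxes: np.ndarray, scores: np.ndarray,
                   area_threshold: float, count_threshold: int) -> list:
    targets, candidates = [], list(range(len(boxes)))
    while candidates:
        best = max(candidates, key=lambda i: scores[i])   # frame with the largest score
        targets.append(best)
        candidates = [i for i in candidates if i != best and
                      intersection_area(boxes[best], boxes[i]) <= area_threshold]
        if len(candidates) <= count_threshold:            # few enough remain: stop
            break
    return targets                                        # indices of target frames
```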
7. An image processing apparatus, characterized in that the apparatus comprises:
an input unit, configured to, when an original image is acquired, input the original image into a detection object recognition model to obtain a plurality of detection frames corresponding to a target detection object in the original image, a plurality of score information of the plurality of detection frames, and a plurality of key point position information of the target detection object in the plurality of detection frames;
a screening unit, configured to screen a target detection frame from the plurality of detection frames according to the plurality of score information;
a determining unit, configured to determine a rotation angle of the original image according to target key point position information corresponding to the target detection frame, wherein the target key point position information is the key point position information corresponding to the target detection frame among the plurality of key point position information;
a rotation unit, configured to rotate the original image according to the rotation angle;
and a display unit, configured to display the rotated original image.
8. The apparatus of claim 7, further comprising an acquisition unit and a training unit;
the acquisition unit is configured to acquire a sample image, a sample detection frame corresponding to the sample image, sample score information of the sample detection frame, and sample key point position information corresponding to the sample detection frame;
the input unit is further configured to input the sample image into an original detection object recognition model to obtain a to-be-trained sample detection frame, to-be-trained sample score information, and to-be-trained sample key point position information corresponding to the sample image;
the training unit is configured to train the original detection object recognition model by using the to-be-trained sample detection frame, the to-be-trained sample score information, the to-be-trained sample key point position information, the sample detection frame, the sample score information, and the sample key point position information, to obtain the detection object recognition model.
9. An image processing apparatus, characterized in that the apparatus comprises:
a memory, a processor, and a communication bus, wherein the memory communicates with the processor through the communication bus, the memory stores an image processing program executable by the processor, and when the image processing program is executed, the processor performs the method according to any one of claims 1 to 6.
10. A storage medium storing a computer program for use in an image processing apparatus, characterized in that the computer program, when executed by a processor, implements the method according to any one of claims 1 to 6.
CN202011281283.3A 2020-11-16 Image processing method and device and storage medium Active CN112434587B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011281283.3A CN112434587B (en) 2020-11-16 Image processing method and device and storage medium

Publications (2)

Publication Number Publication Date
CN112434587A (en) 2021-03-02
CN112434587B (en) 2024-10-22

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012066724A (en) * 2010-09-24 2012-04-05 Denso Corp Vehicle backward parking assist-device, and program for the same
CN109711508A (en) * 2017-10-25 2019-05-03 北京京东尚科信息技术有限公司 Image processing method and device
CN109086662A (en) * 2018-06-19 2018-12-25 浙江大华技术股份有限公司 A kind of anomaly detection method and device
CN109190636A (en) * 2018-07-30 2019-01-11 北京航空航天大学 A kind of remote sensing images Ship Target information extracting method
CN109409327A (en) * 2018-11-09 2019-03-01 哈尔滨工业大学 RRU module object position and posture detection method based on end-to-end deep neural network
CN110728234A (en) * 2019-10-12 2020-01-24 爱驰汽车有限公司 Driver face recognition method, system, device and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FENG Chao; CHEN Qingjiang: "A 3D Face Key Point Detection Method Combining Multiple Features", Chinese Journal of Liquid Crystals and Displays, no. 04, 15 April 2018 (2018-04-15) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113033372A (en) * 2021-03-19 2021-06-25 北京百度网讯科技有限公司 Vehicle damage assessment method and device, electronic equipment and computer readable storage medium
CN113033372B (en) * 2021-03-19 2023-08-18 北京百度网讯科技有限公司 Vehicle damage assessment method, device, electronic equipment and computer readable storage medium
CN114187488A (en) * 2021-12-10 2022-03-15 北京百度网讯科技有限公司 Image processing method, apparatus, device, medium, and program product
CN114187488B (en) * 2021-12-10 2023-11-17 北京百度网讯科技有限公司 Image processing method, device, equipment and medium

Similar Documents

Publication Publication Date Title
WO2020211624A1 (en) Object tracking method, tracking processing method, corresponding apparatus and electronic device
CN110610453B (en) Image processing method and device and computer readable storage medium
US10832069B2 (en) Living body detection method, electronic device and computer readable medium
US9710698B2 (en) Method, apparatus and computer program product for human-face features extraction
WO2019128508A1 (en) Method and apparatus for processing image, storage medium, and electronic device
CN111243093B (en) Three-dimensional face grid generation method, device, equipment and storage medium
WO2018028546A1 (en) Key point positioning method, terminal, and computer storage medium
US11900557B2 (en) Three-dimensional face model generation method and apparatus, device, and medium
US20100165150A1 (en) Detecting orientation of digital images using face detection information
Qiang et al. SqueezeNet and fusion network-based accurate fast fully convolutional network for hand detection and gesture recognition
CN112287866A (en) Human body action recognition method and device based on human body key points
JP6351243B2 (en) Image processing apparatus and image processing method
CN104050449A (en) Face recognition method and device
US10922531B2 (en) Face recognition method
WO2018082308A1 (en) Image processing method and terminal
CN111738344A (en) Rapid target detection method based on multi-scale fusion
CN111626113A (en) Facial expression recognition method and device based on facial action unit
WO2023045183A1 (en) Image processing
CN109214324A (en) Most face image output method and output system based on polyphaser array
CN106407978B (en) Method for detecting salient object in unconstrained video by combining similarity degree
Sulong et al. Human activities recognition via features extraction from skeleton
US20220207917A1 (en) Facial expression image processing method and apparatus, and electronic device
CN104156689B (en) Method and device for positioning feature information of target object

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant