CN113313009A - Method, device, terminal and readable storage medium for outputting continuously shot images - Google Patents

Method, device, terminal and readable storage medium for outputting continuously shot images

Info

Publication number
CN113313009A
Authority
CN
China
Prior art keywords
detection
continuous shooting
image
detection result
face
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110578211.3A
Other languages
Chinese (zh)
Inventor
苏展
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202110578211.3A
Publication of CN113313009A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/174 Facial expression recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80 Camera processing pipelines; Components thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a method for outputting images from continuous shooting, an image output device, a terminal and a non-volatile computer-readable storage medium. The image output method comprises the following steps: performing continuous shooting to generate multiple frames of continuously shot images; performing human body posture detection and facial expression detection on each frame of the multiple frames of continuously shot images; and outputting at least one target image from the multiple frames of continuously shot images according to a posture detection result obtained by the human body posture detection and an expression detection result obtained by the expression detection. By performing human body posture detection and facial expression detection on each frame of the continuously shot images obtained by continuous shooting and outputting at least one target image from them, the application can automatically and intelligently select the images with a graceful posture and a better expression from the multiple frames, without the user having to spend time picking a desired image from the burst after the continuous shooting.

Description

Method, device, terminal and readable storage medium for outputting continuously shot images
Technical Field
The present application relates to the field of image processing technologies, and in particular to a method for outputting images from continuous shooting, an image output apparatus, a terminal, and a non-volatile computer-readable storage medium.
Background
At present, mobile phone manufacturers implement the camera's continuous shooting (burst) function simply by storing every frame of the continuously shot images into the album. Generally, a user uses the burst function only to capture a certain desired image, but the images acquired during a burst are often blurred because of the exposure time and other factors, so the quality of the images generated by continuous shooting is uneven. The user then has to pick the desired images out of all the continuously shot images, and the redundant images increase the time the user spends searching for them.
Disclosure of Invention
The embodiments of the application provide a method for outputting continuously shot images, an image output device, a terminal and a non-volatile computer-readable storage medium.
The method for outputting continuously shot images comprises the following steps: performing continuous shooting to generate multiple frames of continuously shot images; performing human body posture detection and facial expression detection on each frame of the multiple frames of continuously shot images; and outputting at least one target image from the multiple frames of continuously shot images according to a posture detection result obtained by the human body posture detection and an expression detection result obtained by the expression detection.
The device for outputting continuously shot images comprises a continuous shooting module, a detection module and an output module. The continuous shooting module is used for performing continuous shooting to generate multiple frames of continuously shot images; the detection module is used for performing human body posture detection and facial expression detection on each frame of the multiple frames of continuously shot images; and the output module is used for outputting at least one target image from the multiple frames of continuously shot images according to a posture detection result obtained by the human body posture detection and an expression detection result obtained by the expression detection.
The terminal of the embodiments of the present application includes one or more processors, a memory, and one or more programs, wherein the one or more programs are stored in the memory and executed by the one or more processors, and the programs include instructions for performing the image output method of the embodiments of the present application. The image output method comprises the following steps: performing continuous shooting to generate multiple frames of continuously shot images; performing human body posture detection and facial expression detection on each frame of the multiple frames of continuously shot images; and outputting at least one target image from the multiple frames of continuously shot images according to a posture detection result obtained by the human body posture detection and an expression detection result obtained by the expression detection.
The non-transitory computer-readable storage medium of the embodiments of the present application contains a computer program which, when executed by one or more processors, causes the processors to implement the image output method of the embodiments of the present application. The image output method comprises the following steps: performing continuous shooting to generate multiple frames of continuously shot images; performing human body posture detection and facial expression detection on each frame of the multiple frames of continuously shot images; and outputting at least one target image from the multiple frames of continuously shot images according to a posture detection result obtained by the human body posture detection and an expression detection result obtained by the expression detection.
In the image output method, the image output device, the terminal and the non-volatile computer-readable storage medium described above, human body posture detection and facial expression detection are performed on each frame of the multiple frames of continuously shot images obtained by continuous shooting, and at least one target image is then output from those frames according to the posture detection result obtained by the human body posture detection and the expression detection result obtained by the expression detection, so that images with a graceful posture and a better expression can be selected automatically and intelligently from the burst, without the user having to spend time picking a desired image after the continuous shooting.
Additional aspects and advantages of embodiments of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The above and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic flow chart diagram of a method of outputting an image according to some embodiments of the present application;
FIG. 2 is a schematic diagram of an output image device according to some embodiments of the present application;
FIG. 3 is a schematic block diagram of a terminal according to some embodiments of the present application;
FIGS. 4 and 5 are schematic flow diagrams of methods of outputting an image according to certain embodiments of the present application;
FIGS. 6 and 7 are schematic diagrams of the structure of an output image device according to some embodiments of the present application;
FIG. 8 is a schematic flow chart diagram of a method of outputting an image according to some embodiments of the present application;
FIG. 9 is a schematic diagram of an output image device according to some embodiments of the present application;
FIGS. 10-14 are schematic flow charts of methods of outputting an image according to certain embodiments of the present application;
FIG. 15 is a schematic diagram of an output image device according to some embodiments of the present application;
FIGS. 16 and 17 are schematic flow diagrams of methods of outputting an image according to certain embodiments of the present application;
FIG. 18 is a schematic diagram of an output image device according to some embodiments of the present application;
FIGS. 19 and 20 are schematic flow diagrams of methods of outputting an image according to certain embodiments of the present application;
FIG. 21 is a schematic view of a scene of a method of outputting an image according to some embodiments of the present application;
FIGS. 22 and 23 are schematic flow diagrams of methods of outputting an image according to certain embodiments of the present application;
FIG. 24 is a schematic view of a scene of a method of outputting an image according to some embodiments of the present application;
FIGS. 25 and 26 are schematic flow charts of methods of outputting an image according to certain embodiments of the present application;
FIG. 27 is a schematic diagram of an output image device according to some embodiments of the present application;
FIG. 28 is a schematic diagram of a connection between a computer-readable storage medium and a processor according to some embodiments of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below by referring to the drawings are exemplary only for the purpose of explaining the embodiments of the present application, and are not to be construed as limiting the embodiments of the present application.
Referring to fig. 1 to 3, a method for outputting continuously shot images according to an embodiment of the present application includes the following steps:
02: performing continuous shooting to generate a multi-frame continuous shooting image;
04: carrying out human body posture detection and human face expression detection on each frame of continuous shooting images in the multi-frame continuous shooting images; and
06: and outputting at least one target image from the multi-frame continuous shooting images according to a posture detection result obtained by human body posture detection and an expression detection result obtained by expression detection.
The device 10 for outputting continuously shot images of the present embodiment includes a continuous shooting module 12, a detection module 14 and an output module 16. The continuous shooting module 12, the detection module 14 and the output module 16 may be used to implement step 02, step 04 and step 06, respectively. That is, the continuous shooting module 12 may be configured to perform continuous shooting to generate multiple frames of continuously shot images; the detection module 14 may be configured to perform human body posture detection and facial expression detection on each frame of the multiple frames of continuously shot images; and the output module 16 may be configured to output at least one target image from the multiple frames of continuously shot images according to a posture detection result obtained by the human body posture detection and an expression detection result obtained by the expression detection.
The terminal 100 of the present embodiment includes one or more processors 20, a memory 30, and one or more programs, wherein the one or more programs are stored in the memory 30 and executed by the one or more processors 20, the programs including instructions for performing the output image method of the present embodiment. That is, when the processor 20 executes the program, the processor 20 may implement step 02, step 04, and step 06. That is, the processor 20 may be configured to: controlling the camera 40 to perform continuous shooting to generate a multi-frame continuous shooting image; carrying out human body posture detection and human face expression detection on each frame of continuous shooting images in the multi-frame continuous shooting images; and outputting at least one target image from the multi-frame continuous shooting images according to a posture detection result obtained by human body posture detection and an expression detection result obtained by expression detection.
In the image output method, the image output device 10 and the terminal 100 of the embodiments of the application, human body posture detection and facial expression detection are performed on each frame of the multiple frames of continuously shot images obtained by continuous shooting, and then at least one target image is output from the multiple frames of continuously shot images according to the posture detection result obtained by the human body posture detection and the expression detection result obtained by the expression detection.
Specifically, the terminal 100 may be a mobile phone, a tablet computer, a display device, a notebook computer, a teller machine, a gate, a smart watch, a head-up display device, a game machine, and the like. As shown in fig. 3, the terminal 100 is a mobile phone as an example in the embodiment of the present application, and it is understood that the specific form of the terminal is not limited to the mobile phone.
Referring to fig. 3, the terminal 100 may further include a camera 40, which may be a front camera or a rear camera of the terminal 100. The processor 20 may be connected to the camera 40 and control the camera 40 to perform continuous shooting. In step 02, continuous shooting is performed to generate multiple frames of continuously shot images. Specifically, the camera 40 may perform continuous shooting, after which multiple frames of continuously shot images are generated. For example, 5, 10, 15, 20, 25, 30 or more frames may be generated; the number of generated frames may be a fixed value, may be set by the user, or may be determined by the duration of the continuous shooting. In the embodiments of the present application, N frames are used to denote the multiple frames of images.
Referring to fig. 4, step 04 performs human body posture detection and facial expression detection on each frame of the multiple frames of continuously shot images. It can be understood that step 04 includes step 042: performing human body posture detection on each frame of the multiple frames of continuously shot images; and step 044: performing facial expression detection on each frame of the multiple frames of continuously shot images. Specifically, a preset human body posture detection model can be used to perform human body posture detection on each frame and generate posture detection data for each frame, so that the image with a graceful posture can be picked quickly; and a preset facial expression detection model can be used to perform facial expression detection on each frame and generate expression detection data for each frame, so that the image with a better expression can be picked quickly. For example, after human body posture detection is performed on each frame of the continuously shot images, the human body posture score of that frame can be calculated from the posture detection result; and after facial expression detection is performed on each frame, the facial expression score of that frame can be calculated from the expression detection result.
Step 042 and step 044 can be performed simultaneously, to reduce the time taken to perform human body posture detection and facial expression detection on every frame of the continuously shot images and improve efficiency. Of course, in other embodiments, step 042 and step 044 may also be performed sequentially in either order, without limitation; for example, step 042 is performed first and then step 044, or step 044 is performed first and then step 042.
Step 06 outputs at least one target image from the multiple frames of continuously shot images according to the posture detection result obtained by the human body posture detection and the expression detection result obtained by the expression detection. According to the posture detection result and the expression detection result of each frame, at least one image with a graceful posture and a better expression can be selected from the multiple frames as a target image and presented to the user, so that the user does not need to spend time choosing among the continuously shot frames, which saves the user's time. The number of target images may be one, two, three, four, five, six or more, which are not all listed here. The number of output target images may be a fixed number, a number customized by the user, or a number determined by the number of continuously shot frames; for example, it may be a predetermined proportion of the number of continuously shot frames, such as one tenth, one ninth, one eighth or another proportion, without limitation, so that the number of target images is reasonable: not so many that they occupy too much memory, and not so few that the user's range of choice is affected.
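As a concrete illustration of how the per-frame scoring of step 04 and the selection of step 06 could be wired together, the following Python sketch scores every frame and keeps a proportion of the best ones. It is a minimal sketch under stated assumptions: pose_score, expression_score and the output_ratio default are illustrative placeholders for the detection models and selection rule described above, not an interface defined by this disclosure.

from typing import Callable, List, Sequence


def select_target_images(frames: Sequence,
                         pose_score: Callable[[object], float],
                         expression_score: Callable[[object], float],
                         output_ratio: float = 0.1) -> List:
    """Score each continuously shot frame (step 04) and output the best ones (step 06)."""
    scored = []
    for frame in frames:
        # Posture detection result plus expression detection result of this frame.
        scored.append((pose_score(frame) + expression_score(frame), frame))
    # Output at least one target image, proportional to the burst length
    # (e.g. one tenth of the frames, as discussed above).
    k = max(1, int(len(frames) * output_ratio))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [frame for _, frame in scored[:k]]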
Referring to fig. 5 to 7, in some embodiments, step 042 may include the following steps:
0422: judging whether the focal distance of the continuously shot image is greater than a preset distance;
0424: when the focal distance of the continuously shot image is greater than a preset distance, identifying human body joint points in the continuously shot image;
0426: generating a posture detection result according to the human body joint points; and
0428: and when the focusing distance of the continuously shot image is smaller than or equal to the preset distance, generating a posture detection result, wherein the posture detection result is zero.
In some embodiments, the detection module 14 includes a human posture detection submodule 13, and the human posture detection submodule 13 includes a determination unit 131, a first recognition unit 132, a first generation unit 133, and a second generation unit 134. The judging unit 131, the first identifying unit 132, the first generating unit 133 and the second generating unit 134 may be respectively configured to implement step 0422, step 0424, step 0426 and step 0428. That is, the determination unit 131 may be configured to determine whether the focus distance of the continuously shot image is greater than a predetermined distance; the first recognition unit 132 may be configured to recognize a human body joint point in the continuously shot image when a focus distance of the continuously shot image is greater than a predetermined distance; the first generating unit 133 may be configured to generate a posture detection result according to a joint point of the human body; the second generating unit 134 may be configured to generate a posture detection result when the focus distance of the continuously shot image is less than or equal to a predetermined distance, the posture detection result being zero.
In some embodiments, the processor 20 may be further configured to: judging whether the focal distance of the continuously shot image is greater than a preset distance; when the focal distance of the continuously shot image is greater than a preset distance, identifying human body joint points in the continuously shot image; generating a posture detection result according to the human body joint points; and generating a posture detection result when the focusing distance of the continuously shot image is less than or equal to a preset distance, wherein the posture detection result is zero. That is, processor 20 may also be used to implement steps 0422, 0424, 0426 and 0428.
Specifically, if the focus distance of a continuously shot image is less than or equal to the predetermined distance, the subject is considered to be too close to the camera, so that the camera 40 cannot capture the subject's whole body and the whole body does not appear in the continuously shot image. For example, when a selfie is taken with the front camera, the distance between the front camera and the subject is short, and the front camera can only capture the subject's face. If human body joint point recognition were performed on such an image, the complete set of human body joint points could not be recognized, the posture detection result generated from them would be inaccurate, and the resources and processing time of the processor 20 would be wasted. Therefore, when the focus distance of the continuously shot image is judged to be less than or equal to the predetermined distance, the posture detection result is directly set to zero; that is, step 0424 and step 0426 are not performed, which reduces the resource consumption of the processor 20 and saves processing time.
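A small sketch of the gating in steps 0422 to 0428 follows; the 0.5 m threshold and the two callables are illustrative assumptions rather than values or interfaces taken from this disclosure.

from typing import Callable


def posture_result(frame,
                   focus_distance_m: float,
                   detect_joints: Callable,   # frame -> human body joint points (step 0424)
                   score_joints: Callable,    # joint points -> posture detection result (step 0426)
                   predetermined_distance_m: float = 0.5) -> float:
    """Posture detection result for one continuously shot frame."""
    if focus_distance_m <= predetermined_distance_m:
        # Subject too close to contain the whole body (e.g. a front-camera selfie):
        # skip joint recognition and report a zero posture detection result (step 0428).
        return 0.0
    return score_joints(detect_joints(frame))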
Further, when the focus distance of a continuously shot image is greater than the predetermined distance, the human body joint points in the continuously shot image can be identified. Specifically, the human body joint points in each frame of the continuously shot images, from the first frame to the Nth frame, can be identified by a human body posture estimation algorithm. When there is only a single person in a continuously shot image, the joint points of that single person are identified; when there are multiple persons in the continuously shot image, for example two, three, four, five or more, the human body joint points of each person can be identified. The human body posture estimation algorithm may include, but is not limited to, algorithms such as PifPaf (Part Intensity Field and Part Association Field), PoseNet, or YOLOv4, which are not listed here.
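The per-joint output that the later scoring steps rely on can be pictured as a coordinate pair plus a confidence value, as in the following sketch; the field names are illustrative and are not the output format of PifPaf, PoseNet, YOLOv4 or any other specific library.

from dataclasses import dataclass


@dataclass
class JointPoint:
    """One detected human body joint point of one person in one frame."""
    name: str          # e.g. "nose", "left_shoulder", "right_ankle"
    x: float           # abscissa in image coordinates
    y: float           # ordinate in image coordinates
    confidence: float  # probability that this really is the joint point; a low value
                       # is later interpreted as the joint being occluded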
After the human body joint points in the continuous shooting image are identified, a posture detection result can be generated according to the human body joint points. For example, whether the posture of the human body is relatively stretched, whether the human body is blocked, whether the human body is clear or not can be detected according to the human body joint points, and then a posture detection result can be obtained. The gesture detection result may include a gesture detection score of the continuously shot image, and the like. In the embodiment, the human body joint points are firstly identified, and then the posture detection result is generated according to the identified human body joint points, so that the posture detection result is more accurate, and finally, the target image output based on the posture detection result meets the requirements of the user better.
Referring to fig. 8, in some embodiments, the human body pose detection includes face sharpness detection, face occlusion detection, pose stretching detection, and human body height detection, and step 0426 may include the following steps:
4261: performing face definition detection according to face joint points in the human body joint points to obtain face definition;
4262: executing face shielding degree detection according to the confidence corresponding to the face joint point to obtain face shielding degree;
4263: performing posture extension degree detection according to limb joint points in the human body joint points to obtain posture extension degree;
4264: performing human body height detection according to human body joint points to obtain human body height; and
4265: and generating a posture detection result according to at least one of the human face definition, the human face shielding degree, the posture extension degree and the human body height.
In some embodiments, please refer to fig. 9, the first generating unit 133 may include a face sharpness detecting subunit 1331, a face occlusion degree detecting subunit 1332, a pose-extension degree detecting subunit 1333, a body height detecting subunit 1334, and a generating subunit 1335, which may be respectively configured to implement step 4261, step 4262, step 4263, step 4264, and step 4265. That is, the face sharpness detection subunit 1331 may be configured to perform face sharpness detection according to face joint points in the human body joint points to obtain face sharpness; the face occlusion degree detection subunit 1332 may be configured to perform face occlusion degree detection according to the confidence corresponding to the face joint point to obtain a face occlusion degree; the posture extension degree detecting subunit 1333 may be configured to perform posture extension degree detection according to a limb joint point of the human body joint points to obtain a posture extension degree; the human body height detection subunit 1334 may be configured to perform human body height detection according to human body joint points to obtain a human body height; the generating sub-unit 1335 may be configured to generate a posture detection result according to the face sharpness, the face occlusion degree, the posture extension degree, and the human body height.
In some embodiments, referring to fig. 3, the processor 20 may further be configured to: performing face definition detection according to face joint points in the human body joint points to obtain face definition; executing face shielding degree detection according to the confidence corresponding to the face joint point to obtain face shielding degree; performing posture extension degree detection according to limb joint points in the human body joint points to obtain posture extension degree; performing human body height detection according to human body joint points to obtain human body height; and generating a posture detection result according to the human face definition, the human face shielding degree, the posture extension degree and the human body height. That is, the processor 20 may also be configured to implement step 4261, step 4262, step 4263, step 4264 and step 4265.
Specifically, the human body joint points may include, but are not limited to, the nose, two eyes, two ears, two shoulders, two elbows, two hands, two hips, two knees, two feet, and the like, and may be selectively increased or decreased according to user requirements. At least one of face definition detection, face occlusion degree detection, posture extension degree detection and body height detection can be selectively performed according to the joint point information of each part among the human body joint points, generating at least one of the corresponding face definition Score_face_clarity, face occlusion degree Score_face_occlusion, posture extension degree Score_stretch and body height Score_h.
In some embodiments, when the human body posture is detected, only one of step 4261, step 4262, step 4263 and step 4264 may be performed. In other embodiments, two of step 4261, step 4262, step 4263 and step 4264 may be performed, for example step 4261 + step 4262, step 4261 + step 4263, step 4262 + step 4263, step 4263 + step 4264, and so on, which are not all listed here. In still other embodiments, three of step 4261, step 4262, step 4263 and step 4264 may be performed, for example step 4261 + step 4262 + step 4263, step 4261 + step 4262 + step 4264, or step 4262 + step 4263 + step 4264. In still other embodiments, all four of step 4261, step 4262, step 4263 and step 4264 may be performed. Step 4261, step 4262, step 4263 and step 4264 may be performed simultaneously or sequentially in any order.
Further, in order to accurately detect the pose of the photographed person in the continuously shot images, step 4263 is generally executed when detecting the human body posture: the posture extension degree is obtained from the limb joint points among the human body joint points, and it is important for judging whether the posture is graceful. In addition to step 4263, at least one of step 4261, step 4262 and step 4264 may be selectively executed, simultaneously or in sequence, according to requirements. For example, if whether the human face in the continuously shot image is occluded is of concern, step 4262 is also executed; if whether the human face in the continuously shot image is clear is of concern, step 4261 is executed; if the height of the person in the continuously shot image is of concern, step 4264 is executed; and if both whether the face is occluded and whether it is clear are of concern, steps 4261 and 4262 are executed. Other combinations are possible and are not listed here.
Further, if only one of step 4261, step 4262, step 4263 and step 4264 is performed, a posture detection result may be generated from one type of detection data correspondingly generated. For example, if only step 4263 is performed, the pose detection result may be generated based only on the pose stretching degree. If two of step 4261, step 4262, step 4263, and step 4264 are performed, a posture detection result may be generated from at least one of the two kinds of detection data correspondingly generated. If three of step 4261, step 4262, step 4263, and step 4264 are performed, a posture detection result may be generated from at least one of the three detection data correspondingly generated. If the steps 4261, 4262, 4263 and 4264 are performed, a posture detection result may be generated according to at least one of the corresponding generated posture extension degree, the human face occlusion degree, the human face definition degree and the human body height.
In one embodiment, step 4261, step 4262, step 4263 and step 4264 are all performed to obtain the posture extension degree, the face occlusion degree, the face definition and the body height, and the posture detection result Score_pose may be calculated from at least one of them. For example, a posture detection result can be calculated from the posture extension degree and the face definition; or from the posture extension degree and the face occlusion degree; or from the posture extension degree, the face occlusion degree and the face definition; or from the posture extension degree, the face occlusion degree, the face definition and the body height.
In one example, the posture detection result Score_pose can be calculated from the posture extension degree Score_stretch, the face occlusion degree Score_face_occlusion, the face definition Score_face_clarity and the body height Score_h: Score_pose = Score_face_clarity + Score_face_occlusion + Score_stretch + Score_h. Or, in another example, if Score_face_clarity, Score_face_occlusion, Score_stretch and Score_h have corresponding weights a, b, c and d respectively, then Score_pose = a·Score_face_clarity + b·Score_face_occlusion + c·Score_stretch + d·Score_h. Other calculation methods are also possible and are not listed here. Compared with posture detection models that only predict the pose of the photographed person and ignore the interference of portrait sharpness and face occlusion, combining face definition, face occlusion degree, posture extension degree and body height to obtain the posture detection result can improve the accuracy of human body posture detection, so that the output target image has a graceful posture and a clear portrait.
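The weighted combination above can be written directly; the weights a, b, c and d are left as parameters because the disclosure does not fix their values. A minimal sketch:

def pose_detection_result(face_clarity: float,
                          face_occlusion: float,
                          stretch: float,
                          body_height: float,
                          a: float = 1.0, b: float = 1.0,
                          c: float = 1.0, d: float = 1.0) -> float:
    """Score_pose = a*Score_face_clarity + b*Score_face_occlusion + c*Score_stretch + d*Score_h;
    the unweighted sum is the special case a = b = c = d = 1."""
    return a * face_clarity + b * face_occlusion + c * stretch + d * body_height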
Further, referring to fig. 9 and 10, in some embodiments, step 4261 may include the following steps:
42611: determining a face region according to the nose joint point, the ear joint point and the eye joint point; and
42612: and calculating the face definition of the face area.
In some embodiments, the face sharpness detection subunit 1331 may also be configured to: determining a face region according to the nose joint point, the ear joint point and the eye joint point; and calculating the face definition of the face region. That is, the face sharpness detection subunit 1331 may also be used to implement step 42611 and step 42612.
In some embodiments, referring to fig. 3, the processor 20 may further be configured to: determining a face region according to the nose joint point, the ear joint point and the eye joint point; and calculating the face definition of the face region. That is, the processor 20 may also be configured to implement step 42611 and step 42612.
In particular, the ear joint points may include a left ear joint point and a right ear joint point, and the eye joint points may include a left eye joint point and a right eye joint point. A face region may be determined from the nose joint point, the left and right ear joint points, and the left and right eye joint points, and a face sharpness detection algorithm may then be used to calculate the face sharpness within the face region; for example, the variance of the Laplacian may be used to represent the face sharpness Score_face_clarity within the face region. Of course, the face sharpness may also be calculated by other algorithms, which are not listed here. In this embodiment, the face sharpness is calculated so that the pose detection result includes face sharpness data and sharpness is considered when performing pose detection, so that the face in the obtained target image is clearer.
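The variance-of-Laplacian measure mentioned above can be computed with OpenCV as in the sketch below; how the face bounding box is derived from the nose, eye and ear joint points is an assumption of this sketch, not a step specified here.

import cv2
import numpy as np


def face_clarity_score(image_bgr: np.ndarray, face_box: tuple) -> float:
    """Variance of the Laplacian inside the face region (higher means sharper)."""
    x0, y0, x1, y1 = face_box              # box assumed to be built from the face joint points
    gray = cv2.cvtColor(image_bgr[y0:y1, x0:x1], cv2.COLOR_BGR2GRAY)
    return float(cv2.Laplacian(gray, cv2.CV_64F).var())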
Further, referring to fig. 11, in some embodiments, step 4262 comprises the steps of:
42621: calculating the nose occlusion degree of the nose according to the confidence coefficient of the nose joint point;
42622: calculating the eye occlusion degree of the eye according to the confidence coefficient of the eye joint point;
42623: calculating the ear shielding degree of the ear according to the confidence coefficient of the ear joint point; and
42624: and calculating the face shielding degree according to the nose shielding degree, the eye shielding degree and the ear shielding degree.
In some embodiments, please refer to fig. 9, the face occlusion degree detection subunit 1332 may further be configured to: calculating the nose occlusion degree of the nose according to the confidence coefficient of the nose joint point; calculating the eye occlusion degree of the eye according to the confidence coefficient of the eye joint point; calculating the ear shielding degree of the ear according to the confidence coefficient of the ear joint point; and calculating the face shielding degree according to the nose shielding degree, the eye shielding degree and the ear shielding degree. That is, the face occlusion degree detection subunit 1332 can also be used to implement step 42621, step 42622, step 42623 and step 42624.
In some embodiments, referring to fig. 3, the processor 20 may further be configured to: calculating the nose occlusion degree of the nose according to the confidence coefficient of the nose joint point; calculating the eye occlusion degree of the eye according to the confidence coefficient of the eye joint point; calculating the ear shielding degree of the ear according to the confidence coefficient of the ear joint point; and calculating the face shielding degree according to the nose shielding degree, the eye shielding degree and the ear shielding degree. That is, the processor 20 may also be used to implement step 42621, step 42622, step 42623 and step 42624.
Specifically, the human face is a relatively critical area in the image, and the occlusion of the human face has a large influence on the quality of the image, so that the human face occlusion degree needs to be calculated. The obvious feature in the human face is the five sense organs, and the human face shielding degree can be accurately determined by calculating the shielding degree of the five sense organs. In this embodiment, the nose shielding degree, the eye shielding degree and the ear shielding degree can be respectively calculated according to the data of the corresponding human body joint point, then the face shielding degree is calculated according to the nose shielding degree, the eye shielding degree and the ear shielding degree, and the face shielding degree obtained by calculation is more accurate.
When the human body posture estimation algorithm is used to identify the human body joint points, it can provide not only the coordinates of the joint points but also the confidence of each joint point. The confidence indicates the probability that the point is really the corresponding human joint point; a higher confidence means a higher probability, and such a joint point can be considered to be occluded to a lower degree. For example, there may be a mapping relationship between the confidence and the occlusion degree. One such mapping between the confidence A and the occlusion degree S is S = 1 - A: if the confidence of the nose joint point is 75%, the nose occlusion degree can be considered to be 25%. Alternatively, the mapping between the confidence A and the occlusion degree S may be S = aA, where a is a coefficient that can be determined through multiple experiments. The mapping between confidence and occlusion degree can be the same or different for each human body joint point. For example, the mapping for the nose joint point may differ from the mapping for the ear joint points, so that different calculation rules can be set for different human body joint points to better fit them, and the nose occlusion degree and the ear occlusion degree can then be calculated more accurately.
Further, the nose occlusion degree Score_nose_occlusion can be calculated from the confidence of the nose joint point. The left-eye occlusion degree Score_eye_occlusion_l can be calculated from the confidence of the left-eye joint point, and the right-eye occlusion degree Score_eye_occlusion_r from the confidence of the right-eye joint point; the eye occlusion degree Score_eye_occlusion can then be calculated from Score_eye_occlusion_l and Score_eye_occlusion_r, e.g. Score_eye_occlusion = Score_eye_occlusion_l + Score_eye_occlusion_r. Likewise, the left-ear occlusion degree Score_ear_occlusion_l can be calculated from the confidence of the left-ear joint point and the right-ear occlusion degree Score_ear_occlusion_r from the confidence of the right-ear joint point, and the ear occlusion degree Score_ear_occlusion is then calculated from them, e.g. Score_ear_occlusion = Score_ear_occlusion_l + Score_ear_occlusion_r.
Further, the face occlusion degree Score_face_occlusion is calculated from the nose occlusion degree Score_nose_occlusion, the eye occlusion degree Score_eye_occlusion and the ear occlusion degree Score_ear_occlusion. For example, Score_face_occlusion = Score_nose_occlusion + Score_eye_occlusion + Score_ear_occlusion. Alternatively, weights corresponding to the nose occlusion degree, the eye occlusion degree and the ear occlusion degree may be set, and the face occlusion degree calculated according to those weights; the possibilities are not listed here.
Wherein, in other embodiments, the mouth occlusion degree of the mouth can also be calculated. For example, mouth occlusion may be calculated from the confidence of the mouth joint points. The mouth shielding degree can be added when the face shielding degree is calculated, so that the obtained face shielding degree is more accurate.
Of course, the face occlusion degree can also be calculated by deep-learning algorithms. For example, the occluded area within the face region may be identified, and the proportion of the occluded area to the face region may then be calculated, and so on, without limitation.
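Using the simple S = 1 - A mapping discussed above, the face occlusion degree can be assembled from the facial joint-point confidences as in the following sketch; the dictionary keys and the unweighted sum are illustrative choices.

def face_occlusion_score(confidence: dict) -> float:
    """Score_face_occlusion = Score_nose_occlusion + Score_eye_occlusion + Score_ear_occlusion,
    with each per-joint occlusion degree taken as 1 - confidence (missing joints count as fully occluded)."""
    occluded = lambda key: 1.0 - confidence.get(key, 0.0)
    nose = occluded("nose")
    eyes = occluded("left_eye") + occluded("right_eye")
    ears = occluded("left_ear") + occluded("right_ear")
    return nose + eyes + ears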
Referring to fig. 12, in some embodiments, step 4263 may comprise the following steps:
42631: calculating the bending degree of the arms according to the hand joint points, the elbow joint points and the shoulder joint points;
42632: calculating the stretching degree of the leg according to the foot joint point and the hip joint point;
42633: calculating a first distortion degree of the feet and the trunk according to the foot joint points, the hip joint points and the shoulder joint points;
42634: calculating a second degree of distortion of the leg and the trunk according to the knee joint point, the hip joint point and the shoulder joint point; and
42635: and calculating the posture extension degree according to the arm bending degree, the leg stretching degree, the first distortion degree and the second distortion degree.
In some embodiments, please refer to fig. 9, the gesture extension detection subunit 1333 may further be configured to: calculating the bending degree of the arms according to the hand joint points, the elbow joint points and the shoulder joint points; calculating the stretching degree of the leg according to the foot joint point and the hip joint point; calculating a first distortion degree of the feet and the trunk according to the foot joint points, the hip joint points and the shoulder joint points; calculating a second degree of distortion of the leg and the trunk according to the knee joint point, the hip joint point and the shoulder joint point; and calculating the posture extension degree according to the arm bending degree, the leg stretching degree, the first distortion degree and the second distortion degree. That is, the gesture stretching degree detection subunit 1333 can also be used to implement step 42631, step 42632, step 42633, step 42634, and step 42635.
In some embodiments, referring to fig. 3, the processor 20 may further be configured to: calculating the bending degree of the arms according to the hand joint points, the elbow joint points and the shoulder joint points; calculating the stretching degree of the leg according to the foot joint point and the hip joint point; calculating a first distortion degree of the feet and the trunk according to the foot joint points, the hip joint points and the shoulder joint points; calculating a second degree of distortion of the leg and the trunk according to the knee joint point, the hip joint point and the shoulder joint point; and calculating the posture extension degree according to the arm bending degree, the leg stretching degree, the first distortion degree and the second distortion degree. That is, the processor 20 may also be used to implement step 42631, step 42632, step 42633, step 42634, and step 42635.
Specifically, the posture extension degree has a relatively critical influence on image quality: users normally shoot a burst precisely to capture an image with a graceful pose. If the posture extension degree is too small, the user's body may not be fully extended, and that frame may not be the image the user wants; therefore, the posture extension degree of the human body in the continuously shot images needs to be detected. The hand joint points may include a left-hand joint point and a right-hand joint point, the elbow joint points a left elbow joint point and a right elbow joint point, the shoulder joint points a left shoulder joint point and a right shoulder joint point, the hip joint points a left hip joint point and a right hip joint point, the knee joint points a left knee joint point and a right knee joint point, and the foot joint points a left foot joint point and a right foot joint point.
Further, the degree of arm bending, the degree of leg stretching, the first degree of distortion, and the second degree of distortion may be calculated by the following formulas.
S = arccos( ((A - B) · (C - B)) / (||A - B||_2 · ||C - B||_2) )

where A, B and C are the position coordinates of the three related human joint points (B being the vertex of the angle), ||A - B||_2 represents the 2-norm of A - B, and arccos represents the inverse cosine function.
For example, taking the left arm alone, the degree of bending of the left elbow Score_elbow_l (i.e., S in the above formula) is calculated using the left-hand joint point coordinates Wrist_l(x, y) (i.e., A in the above formula), the left elbow joint point coordinates Elbow_l(x, y) (i.e., B in the above formula), and the left shoulder joint point coordinates Shoulder_l(x, y) (i.e., C in the above formula), as follows:

Score_elbow_l = arccos( ((Wrist_l - Elbow_l) · (Shoulder_l - Elbow_l)) / (||Wrist_l - Elbow_l||_2 · ||Shoulder_l - Elbow_l||_2) )
the degree of flexion Score of the right elbow can be calculated by the above formula using the coordinates of the right hand joint point, the right elbow joint point and the right shoulder joint pointelbow_r. Wherein, the right hand joint point coordinate Wrist _ r (x, y) is a in the above formula, the right Elbow joint point coordinate Elbow _ r (x, y) is B in the above formula, and the right Shoulder joint point coordinate Shoulder _ r (x, y) is C in the above formula.
The stretching degree Score of the left leg can be calculated by the formula by using the coordinates of the left foot joint point, the left hip joint point and the left knee joint pointleg-l. The left Foot joint point coordinates Foot _ l (x, y) are a in the above formula, the left Hip joint point coordinates Hip _ l (x, y) are B in the above formula, and the left Knee joint point coordinates Knee _ l (x, y) are C in the above formula.
The stretching degree Score of the right leg can be calculated by the formula by using the coordinates of the right foot joint point, the right hip joint point and the right knee joint pointleg-r. Wherein, the right Foot joint point coordinate Foot _ r (x, y) is A in the above formula, the right Hip joint point coordinate Hip _ r (x, y) is B in the above formula, the right knee joint isThe joint point coordinate Knee _ r (x, y) is C in the above formula.
The first degree of distortion Score of the left foot and the trunk can be calculated by using the coordinates of the left foot joint point, the left hip joint point and the left shoulder joint point through the formulatwist_ankle_l. The left Foot joint point coordinate Foot _ l (x, y) is a in the above formula, the left Hip joint point coordinate Hip _ l (x, y) is B in the above formula, and the left Shoulder joint point coordinate Shoulder _ l (x, y) is C in the above formula.
The first distortion degree Score of the right foot and the trunk can be calculated by using the right foot joint point coordinate, the right hip joint point coordinate and the right shoulder joint point coordinate through the formulatwist_ankle_r. The right Foot joint point coordinate Foot _ r (x, y) is a in the above formula, the right Hip joint point coordinate Hip _ r (x, y) is B in the above formula, and the right Shoulder joint point coordinate Shoulder _ r (x, y) is C in the above formula.
The second distortion degree Score of the left leg and the trunk can be calculated by using the coordinates of the left knee joint point, the left hip joint point and the left shoulder joint point through the formulatwist_knee_l. The left knee joint point coordinate knee _ l (x, y) is a in the above formula, the left Hip joint point coordinate Hip _ l (x, y) is B in the above formula, and the left Shoulder joint point coordinate Shoulder _ l (x, y) is C in the above formula.
The second distortion degree Score of the right leg and the trunk can be calculated by using the right knee joint point coordinate, the right hip joint point coordinate and the right shoulder joint point coordinate through the formulatwist_knee_r. Wherein, the right foot joint point coordinate Knee _ r (x, y) is A in the above formula, the right Hip joint point coordinate Hip _ r (x, y) is B in the above formula, and the right Shoulder joint point coordinate Shoulder _ r (x, y) is C in the above formula.
Further, the posture extension degree Score_stretch of the human body in the continuously shot image can be calculated from the degree of bending of the left elbow Score_elbow_l, the degree of bending of the right elbow Score_elbow_r, the stretching degree of the left leg Score_leg_l, the stretching degree of the right leg Score_leg_r, the first distortion degree of the left foot and the trunk Score_twist_ankle_l, the first distortion degree of the right foot and the trunk Score_twist_ankle_r, the second distortion degree of the left leg and the trunk Score_twist_knee_l, and the second distortion degree of the right leg and the trunk Score_twist_knee_r. For example, these eight terms may be added directly, or added after weighting, to obtain the posture extension degree of the human body, e.g. Score_stretch = Score_elbow_l + Score_elbow_r + Score_leg_l + Score_leg_r + Score_twist_ankle_l + Score_twist_ankle_r + Score_twist_knee_l + Score_twist_knee_r. Of course, the posture extension degree may also be calculated in other ways, which are not listed here.
In this embodiment, the degree of bending of the left elbow, the degree of bending of the right elbow, the stretching degree of the left leg, the stretching degree of the right leg, the first distortion degree of the left foot and the trunk, the first distortion degree of the right foot and the trunk, the second distortion degree of the left leg and the trunk, and the second distortion degree of the right leg and the trunk are all calculated, so that the extension of every limb of the human body is fully considered, the calculated posture extension degree is more accurate, and the posture in the finally output target image is more graceful.
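All eight angle terms above reduce to one helper that evaluates the arccos formula at a given joint. A minimal NumPy sketch follows; the dictionary key names for the joint coordinates are illustrative.

import numpy as np


def joint_angle(a, b, c) -> float:
    """Angle at joint B (in radians) between the segments B->A and B->C,
    i.e. arccos(((A - B) . (C - B)) / (||A - B||_2 * ||C - B||_2))."""
    a, b, c = (np.asarray(p, dtype=float) for p in (a, b, c))
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))


def stretch_score(j: dict) -> float:
    """Score_stretch as the direct (unweighted) sum of the eight terms listed above."""
    return (joint_angle(j["left_wrist"], j["left_elbow"], j["left_shoulder"])       # left elbow bend
            + joint_angle(j["right_wrist"], j["right_elbow"], j["right_shoulder"])  # right elbow bend
            + joint_angle(j["left_foot"], j["left_hip"], j["left_knee"])            # left leg stretch
            + joint_angle(j["right_foot"], j["right_hip"], j["right_knee"])         # right leg stretch
            + joint_angle(j["left_foot"], j["left_hip"], j["left_shoulder"])        # left foot/trunk distortion
            + joint_angle(j["right_foot"], j["right_hip"], j["right_shoulder"])     # right foot/trunk distortion
            + joint_angle(j["left_knee"], j["left_hip"], j["left_shoulder"])        # left leg/trunk distortion
            + joint_angle(j["right_knee"], j["right_hip"], j["right_shoulder"]))    # right leg/trunk distortion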
Referring to fig. 13, in some embodiments, step 4264 may comprise the steps of:
42641: and calculating the height of the human body according to the shoulder joint points and the foot joint points.
In some embodiments, please refer to fig. 9, the body height detecting sub-unit 1334 can also be used to calculate the body height according to the shoulder joint point and the foot joint point. That is, the human body height detecting subunit 1334 may also be used to implement step 42641.
In some embodiments, referring to fig. 3, the processor 20 is further configured to calculate the height of the human body according to the shoulder joint point and the foot joint point. That is, the processor 20 may also be used to implement step 42641.
In particular, the height of the human body in the frame also matters for the quality of a continuously shot image: if the human body appears short in a continuously shot image, the user's form in that frame is relatively poor and it is not the continuously shot image the user wants. Therefore, the height of the human body in the continuously shot images needs to be detected so that an image in which the human body appears tall can be found. The body height can be calculated from the shoulder joint points and the foot joint points.
More specifically, the body height can be calculated by using the ordinate of the left shoulder joint point, the ordinate of the right shoulder joint point, the ordinate of the left foot joint point and the ordinate of the right foot joint point, and the calculation formula is as follows:
[Formula image BDA0003085242760000091: the human body height is computed from the shoulder and foot joint ordinates y_ij, normalized by the frame height H, with the variables defined below.]
wherein i represents the i-th person in the image, j = 0 represents the left shoulder, j = 1 represents the right shoulder, j = 2 represents the left foot, j = 3 represents the right foot, y_ij represents the ordinate of the j-th joint point of the i-th person, and H represents the height of the frame of the continuous shooting image.
Of course, the head coordinates of the human body and the foot coordinates of the human body may be recognized to calculate the height of the human body. The height of the human body can also be calculated by other algorithms, which are not limited herein.
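As a rough illustration only, the sketch below takes the human body height as the vertical span between the shoulder ordinates and the foot ordinates, normalized by the frame height H. This is an assumption standing in for the formula image above, whose exact form is not reproduced here.

```python
# Hedged sketch, assuming body height = vertical span between the shoulder line
# and the foot line, normalized by the frame height H (image y grows downward).
def body_height_score(shoulder_l, shoulder_r, foot_l, foot_r, frame_height):
    """Each joint is an (x, y) tuple; returns the height as a fraction of the frame."""
    shoulder_y = (shoulder_l[1] + shoulder_r[1]) / 2.0  # mean shoulder ordinate
    foot_y = (foot_l[1] + foot_r[1]) / 2.0              # mean foot ordinate
    return abs(foot_y - shoulder_y) / frame_height
```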
Referring to fig. 14 and 15, in some embodiments, step 044 may include the steps of:
0442: identifying key points of the face of the image; and
0444: and generating an expression detection result according to the key points of the face.
In some embodiments, please refer to fig. 7, the detection module 14 includes an expression detection sub-module 15, and the expression detection sub-module 15 may include a second recognition unit 151 and a third generation unit 152. The second recognition unit 151 and the third generation unit 152 may be used to implement step 0442 and step 0444, respectively. That is, the second recognition unit 151 may be used to recognize face key points of an image; the third generating unit 152 may be configured to generate an expression detection result according to the face key point.
In some embodiments, referring to fig. 3, the processor 20 may further be configured to: identifying key points of the face of the image; and generating an expression detection result according to the key points of the face. That is, processor 20 may also be used to implement steps 0442 and 0444.
Specifically, in order to find the continuous shooting image with a better expression, the expression of the photographed portrait in each continuous shooting image needs to be detected accurately. The face key point information of each frame of the continuous shooting images, from the first frame to the Nth frame, can be detected through a face key point detection algorithm. The face key point detection algorithm can detect the face key point information of all people in the continuous shooting image: for example, when there is only one person in the continuous shooting image, the face key point information of that person can be detected, and when there are a plurality of persons in the continuous shooting image, the face key point information of each person can be detected. The face key point detection algorithm may be the Dlib algorithm or a Practical Facial Landmark Detector (PFLD) algorithm, among others not listed here; a suitable algorithm can be selected according to actual requirements. The face key points may include key points of features such as the eyes (left and right eyes), the ears (left and right ears), the nose and the mouth of the photographed person.
The expression of the photographed portrait can then be detected according to the detected face key points. For example, the degree of smiling of the photographed person, how wide the eyes are open, the degree of mouth opening, and the like can be determined from the detected face key points. Therefore, the expression detection result of the continuous shooting image can be generated according to the detected expression data. The expression of the photographed portrait in each continuous shooting image can be obtained from the expression detection result, so that the continuous shooting image with a better expression can be selected according to the expression detection result.
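For readers who want to see what the face key point detection step might look like in practice, here is a minimal sketch using the Dlib 68-point landmark model mentioned above; the model file path is a placeholder, and the patent does not require Dlib specifically.

```python
# Minimal sketch of 68-point face landmark detection with dlib. The predictor
# file path is a placeholder; PFLD or another detector could be used instead.
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # placeholder path

def face_keypoints(image_rgb):
    """Return a list of 68 (x, y) landmark tuples for every face in the frame."""
    faces = detector(image_rgb, 1)  # upsample once so small faces are found
    all_landmarks = []
    for rect in faces:
        shape = predictor(image_rgb, rect)
        all_landmarks.append([(shape.part(i).x, shape.part(i).y) for i in range(68)])
    return all_landmarks
```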
Referring to fig. 16-18, in some embodiments, the facial expression detection includes smile detection, blink detection and blink monocular detection, and step 0444 includes the following steps:
4442: performing at least one of smile detection, blink detection and blink monocular detection according to the key points of the human face, and correspondingly generating at least one of smile detection results, blink detection results and blink monocular detection results; and
4444: and acquiring an expression detection result according to at least one of the smile detection result, the blink detection result and the blink monocular detection result.
In some embodiments, please refer to fig. 15, the third generating unit 152 may include a smile detection subunit 1521, a blink detection subunit 1522, a blink monocular detection subunit 1523, and an obtaining subunit 1524. The smile detection subunit 1521 may be configured to perform smile detection based on the face key points and generate a corresponding smile detection result, the blink detection subunit 1522 may be configured to perform blink detection based on the face key points and generate a corresponding blink detection result, the blink monocular detection subunit 1523 may be configured to perform blink monocular detection based on the face key points and generate a corresponding blink monocular detection result, and the obtaining subunit 1524 may be configured to obtain the expression detection result based on at least one of the smile detection result, the blink detection result and the blink monocular detection result. That is, the smile detection subunit 1521, the blink detection subunit 1522 and the blink monocular detection subunit 1523 may be collectively used to implement step 4442, and the obtaining subunit 1524 may be used to implement step 4444. When performing step 4442, one or more of the smile detection subunit 1521, the blink detection subunit 1522 and the blink monocular detection subunit 1523 may operate.
In some embodiments, referring to fig. 3, the processor 20 may further be configured to: performing at least one of smile detection, blink detection and blink monocular detection according to the key points of the human face, and correspondingly generating at least one of smile detection results, blink detection results and blink monocular detection results; and acquiring an expression detection result according to at least one of the smile detection result, the blink detection result and the blink monocular detection result. That is, the processor 20 may also be used to implement step 4442 and step 4444.
Specifically, in order to facilitate accurate expression detection, at least one of smile detection, blink detection and blink monocular detection may be performed according to the identified face key points, and at least one of a smile detection result, a blink detection result and a blink monocular detection result may be generated correspondingly. For example, smile detection can be performed according to the face key points, and a smile detection result is generated after the smile detection; blink detection can be performed according to the face key points, and a blink detection result is generated after the blink detection; blink monocular detection can be performed according to the face key points, and a blink monocular detection result is generated after the blink monocular detection. Alternatively, smile detection and blink detection can be performed according to the face key points, and a smile detection result and a blink detection result are correspondingly generated. Alternatively, smile detection, blink detection and blink monocular detection can all be performed according to the face key points, and a smile detection result, a blink detection result and a blink monocular detection result are correspondingly generated.
It is understood that, referring to fig. 17, step 4442 may include step 44422: performing smile detection according to the face key points to generate a smile detection result; step 44424: performing blink detection according to the face key points to generate a blink detection result; and step 44426: performing blink monocular detection according to the face key points to generate a blink monocular detection result.
In some embodiments, at least one of step 44422, step 44424 and step 44426 may be selectively performed according to the needs of the user, so that the obtained expression detection result meets the expectations of the user. In one example, one of step 44422, step 44424 and step 44426 may be selected to be performed as desired. In another example, two of step 44422, step 44424 and step 44426 may be selected to be performed as desired, e.g., step 44422 + step 44424, step 44422 + step 44426, or step 44424 + step 44426. In yet another example, all three of step 44422, step 44424 and step 44426 may be performed.
In other embodiments, at least one of step 44422, step 44424 and step 44426 may be selected automatically based on common characteristics learned from multiple continuous shooting images previously selected by the user. For example, if the user often selects a relatively happy continuous shooting image with a smile and open eyes as the target image, step 44422 and step 44424 may be automatically selected to be performed. Alternatively, if the user often selects a happier continuous shooting image with a blinking single eye and a smile as the target image, step 44422 and step 44426 may be automatically selected to be performed. Other situations are also possible and are not listed here.
Further, the expression detection result of the continuous shooting image may be obtained according to the detection result of step 4442. For example, the expression detection result may be obtained from the smile detection result obtained in step 44422, from the blink detection result obtained in step 44424, from the blink monocular detection result obtained in step 44426, or from any combination of the smile detection result, the blink detection result and the blink monocular detection result. If the expression detection result is obtained from only one of the smile detection result, the blink detection result and the blink monocular detection result, that result can be used directly as the expression detection result; if the expression detection result is obtained from two or three of them, the two or three results can be added to obtain the expression detection result.
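A minimal sketch of the combination rule just described, assuming whichever results were produced in step 4442 are passed in and simply summed:

```python
# Minimal sketch: sum whichever of the three results are available; a single
# available result is returned unchanged, matching the rule described above.
def expression_score(smile=None, blink=None, wink=None):
    parts = [score for score in (smile, blink, wink) if score is not None]
    return sum(parts)
```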
In this embodiment, at least one of smile detection, blink detection, and blink monocular detection is performed according to the face key points, and an expression detection result is generated according to the detection result, so that expression detection data of a photographed portrait in a continuously photographed image can be obtained, and a target image with a better expression can be selected according to the expression detection result.
Referring to FIG. 18, in some embodiments, the face keypoints include nose keypoints and lip keypoints, and step 44422 includes the steps of:
444222: calculating the mouth corner rising degree according to the nose key point and the lip key point;
444224: calculating the opening degree of the mouth angle according to the lip key points; and
444226: and calculating the smile detection result of the human face according to the mouth angle rising degree and the mouth angle opening degree.
In some embodiments, please refer to fig. 18, the smile detection subunit 1521 may further be configured to: calculate the mouth corner rising degree according to the nose key point and the lip key points; calculate the mouth corner opening degree according to the lip key points; and calculate the smile detection result of the human face according to the mouth corner rising degree and the mouth corner opening degree. That is, the smile detection subunit 1521 may be used to implement step 444222, step 444224 and step 444226.
In some embodiments, referring to fig. 3, the processor 20 may further be configured to: calculating the mouth corner rising degree according to the nose key point and the lip key point; calculating the opening degree of the mouth angle according to the lip key points; and calculating the smile detection result of the human face according to the mouth corner rising degree and the mouth corner opening degree. That is, processor 20 may be used to implement step 444222, step 444224, and step 444226.
Specifically, a smile on a human face is mainly reflected in changes of the lips, so the smile data of the human face can be determined according to the mouth corner rising degree and the mouth corner opening degree. The change of the lips relative to the nose can be calculated according to the coordinates of the nose key point and the coordinates of the lip key points, so that the mouth corner rising degree Score_rise_mouth can be obtained; generally, the greater the rise of the mouth corners, the more broadly the photographed person is smiling. The number of nose key points can be one or more, and the number of lip key points can be one or more. The mouth corner opening degree can be calculated from the lip key points; for example, the mouth corner opening degree Score_expand_mouth can be calculated from the coordinates of the upper lip key points and the coordinates of the lower lip key points. The number of upper lip key points and lower lip key points can each be one or more.
Further, after the mouth corner rising degree Score_rise_mouth and the mouth corner opening degree Score_expand_mouth are obtained, the smile detection result Score_smile of the human face is calculated according to the mouth corner rising degree Score_rise_mouth and the mouth corner opening degree Score_expand_mouth. In one example, the smile detection result may be obtained by adding the mouth corner rising degree and the mouth corner opening degree, i.e., Score_smile = Score_rise_mouth + Score_expand_mouth. In another example, the smile detection result is a weighted combination of the mouth corner rising degree and the mouth corner opening degree, e.g., Score_smile = a*Score_rise_mouth + b*Score_expand_mouth. Of course, the smile detection result of the face can also be calculated in other ways from the mouth corner rising degree and the mouth corner opening degree.
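A small sketch of the weighted combination described above; the default weights a = b = 1 reduce it to the plain sum and are illustrative only:

```python
# Sketch of the smile score combination; a and b are illustrative weights,
# with a = b = 1 reproducing the plain-sum example above.
def smile_score(rise_mouth, expand_mouth, a=1.0, b=1.0):
    return a * rise_mouth + b * expand_mouth
```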
Referring to FIG. 20, in some embodiments, the nose key points include a nose tip key point, the lip key points include two mouth corner key points, and step 444222 includes the following steps:
444221: and calculating the mouth corner rising degree according to the nose key point and the two mouth corner key points.
In some embodiments, please refer to fig. 18, the smile detection subunit 1521 may further be configured to calculate the mouth corner raising degree according to the nose key point and the two mouth corner key points. That is, the smile detection subunit 1521 may also be used to implement step 444221.
In some embodiments, referring to fig. 3, processor 20 may be further configured to calculate the mouth corner lift based on the nose key point and the two mouth corner key points. That is, the processor may also be used to implement step 444221.
Specifically, referring to fig. 21, the coordinates of the nose key point 33 are Nose(x, y), the coordinates of the left mouth corner key point 48 are Lips_l(x, y), and the coordinates of the right mouth corner key point 54 are Lips_r(x, y). The calculation formula of the mouth corner rising degree Score_rise_mouth may be as follows:
[Formula image BDA0003085242760000121: Score_rise_mouth computed from Nose(x, y), Lips_l(x, y) and Lips_r(x, y).]
referring to fig. 22, in some embodiments, the lip keypoints include upper lip keypoints, lower lip keypoints, and mouth corner keypoints, and step 444224 includes the steps of:
444223: and calculating the mouth angle opening degree according to the upper lip key point, the lower lip key point and the mouth angle key point.
In some embodiments, please refer to fig. 18, the smile detection subunit 1521 may further be configured to calculate the mouth corner opening degree according to the upper lip key points, the lower lip key points and the mouth corner key points. That is, the smile detection subunit 1521 may also be used to implement step 444223.
In some embodiments, referring to fig. 3, the processor 20 may be further configured to calculate the mouth corner opening degree according to the upper lip key points, the lower lip key points and the mouth corner key points. That is, the processor 20 may also be used to implement step 444223.
Referring to fig. 21, the mouth corner key points may include two key points, key point 48 and key point 54; the upper lip key points may include two key points, key point 49 and key point 53; and the lower lip key points may include two key points, key point 55 and key point 59. The formula for calculating the mouth corner opening degree Score_expand_mouth may be as follows:
[Formula image BDA0003085242760000122: Score_expand_mouth computed from dist_49_59, dist_53_55 and dist_48_54.]
wherein dist_49_59 represents the distance between key point 49 and key point 59, dist_53_55 represents the distance between key point 53 and key point 55, and dist_48_54 represents the distance between key point 48 and key point 54. The coordinates of key points 49, 59, 53, 55, 48 and 54 can be calculated by the above-mentioned face key point detection algorithm or other algorithms, which will not be described in detail herein.
Of course, in other embodiments, the mouth corner opening degree can also be calculated in other manners, which are not limited herein.
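The following sketch shows one plausible way to compute the mouth corner opening degree from the distances defined above (a mouth-aspect-ratio style measure); the exact normalization used by the patent is not recoverable from the unreproduced formula image, so the division by twice the mouth width here is an assumption.

```python
# Hedged sketch of a mouth-aspect-ratio style opening degree using the
# distances dist_49_59, dist_53_55 and dist_48_54 defined above. The exact
# normalization (here 2 * mouth width) is an assumption.
import math

def _dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def mouth_opening_degree(landmarks):
    """landmarks: list of 68 (x, y) tuples in the numbering used above."""
    vertical = _dist(landmarks[49], landmarks[59]) + _dist(landmarks[53], landmarks[55])
    width = _dist(landmarks[48], landmarks[54])
    return vertical / (2.0 * width)
```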
Referring to fig. 23, in some embodiments, the face key points include eye key points, and step 44424 includes the steps of:
444242: and calculating the eye openness degree of each eye according to the eye key points of each eye to obtain a blink detection result.
In some embodiments, please refer to fig. 18, the blink detection subunit 1522 may further be configured to calculate an eye openness degree of each eye according to the eye key points of each eye, so as to obtain a blink detection result. That is, the blink detection subunit 1522 may also be used to implement step 444242.
In some embodiments, referring to fig. 3, the processor 20 may be further configured to calculate an eye openness degree of each eye according to the eye key point of each eye, so as to obtain the blink detection result. That is, processor 20 may also be used to implement step 444242.
The degree of eye openness is important for the expression of the photographed portrait, and the degree of eye openness directly affects the overall beauty of the photographed portrait, and in general, users want to be able to photograph images with their eyes open. In this embodiment, the eye openness degree of each eye is calculated, and the obtained blink detection result includes the eye openness degree of each eye, so that the expression detection result also includes the eye openness degree of each eye, so that the eyes of the photographed portrait in the finally obtained target image are more likely to be open.
Specifically, each eye may include an upper eyelid, a lower eyelid and canthi, and the eye key points may include upper eyelid key points, lower eyelid key points and canthus key points. According to the upper eyelid, lower eyelid and canthus key points of the left eye, the eye openness degree of the left eye can be calculated. According to the upper eyelid, lower eyelid and canthus key points of the right eye, the eye openness degree of the right eye can be calculated. Then, a blink detection result is generated according to the eye openness degree of the left eye and the eye openness degree of the right eye. For example, the blink detection result may be the eye openness degree of the left eye plus the eye openness degree of the right eye.
More specifically, continuing with fig. 24, the canthus key points of the left eye may include two key points, key point 36 and key point 39; the upper eyelid key points of the left eye may include two key points, key point 37 and key point 38; and the lower eyelid key points of the left eye may include two key points, key point 40 and key point 41. From key points 36, 37, 38, 39, 40 and 41, the eye openness degree Score_expand_eye_l of the left eye can be obtained by an aspect ratio (vertical-to-horizontal) calculation. The canthus key points of the right eye may include two key points, key point 42 and key point 45; the upper eyelid key points of the right eye may include two key points, key point 43 and key point 44; and the lower eyelid key points of the right eye may include two key points, key point 46 and key point 47. From key points 42, 43, 44, 45, 46 and 47, the eye openness degree Score_expand_eye_r of the right eye can be obtained by the same aspect ratio calculation, and the overall eye openness degree is Score_expand_eye = Score_expand_eye_l + Score_expand_eye_r.
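The aspect-ratio calculation mentioned above can be sketched as follows (the classic eye-aspect-ratio construction); the exact formula used by the patent is not spelled out, so treat this as an assumption.

```python
# Hedged sketch of an eye-aspect-ratio style openness measure: mean vertical
# eyelid distance divided by the horizontal eye width, per eye.
import math

def _dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def eye_openness(landmarks, corners, upper, lower):
    """corners, upper, lower: pairs of landmark indices, e.g. for the left eye
    corners=(36, 39), upper=(37, 38), lower=(41, 40)."""
    v1 = _dist(landmarks[upper[0]], landmarks[lower[0]])
    v2 = _dist(landmarks[upper[1]], landmarks[lower[1]])
    h = _dist(landmarks[corners[0]], landmarks[corners[1]])
    return (v1 + v2) / (2.0 * h)

# e.g. total openness as described above:
# open_l = eye_openness(lm, (36, 39), (37, 38), (41, 40))
# open_r = eye_openness(lm, (42, 45), (43, 44), (47, 46))
# score_expand_eye = open_l + open_r
```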
Referring to FIG. 25, in some embodiments, step 44426 includes the steps of:
444262: and generating a blink monocular detection result according to the eye openness degree of the left eye and the eye openness degree of the right eye.
In some embodiments, please refer to fig. 18, the blink monocular detection subunit 1523 may be further configured to generate the blink monocular detection result according to the eye openness degree of the left eye and the eye openness degree of the right eye. That is, the blink monocular detection subunit 1523 may also be used to implement step 444262.
In some embodiments, referring to fig. 3, the processor 20 may be further configured to generate the blink monocular detection result according to the eye openness degree of the left eye and the eye openness degree of the right eye. That is, the processor 20 may also be used to implement step 444262.
Specifically, blink monocular detection is used to detect whether the photographed portrait is blinking only one eye, for example, only the left eye is open and the right eye is closed, or only the right eye is open and the left eye is closed. Whether a blinking-single-eye situation exists in the photographed portrait of the continuous shooting image can be judged according to the eye openness degree of the left eye and the eye openness degree of the right eye, and which eye is open and which eye is closed can also be determined.
In one example, whether there is a blinking condition can be determined based on a difference between the eye openness of the left eye and the eye openness of the right eye. For example, the greater the difference between the degree of eye openness of the left eye and the degree of eye openness of the right eye, the greater the probability of blinking of the single eye; the smaller the difference between the degree of eye openness of the left eye and the degree of eye openness of the right eye, the smaller the probability of blinking of the single eye.
In another example, whether blinking is present and the blinking detection result may be determined based on an absolute value of a ratio between a difference between the eye openness degree of the left eye and the eye openness degree of the right eye and a sum of the eye openness degree of the left eye and the eye openness degree of the right eye. The specific calculation formula may be as follows:
Score_wink = abs((Score_expand_eye_l - Score_expand_eye_r) / (Score_expand_eye_l + Score_expand_eye_r))
wherein abs represents taking the absolute value, Score_expand_eye_l represents the eye openness degree of the left eye, Score_expand_eye_r represents the eye openness degree of the right eye, and Score_wink represents the blink monocular detection result. The larger the value of Score_wink, the higher the probability of blinking a single eye; the smaller the value of Score_wink, the lower the probability of blinking a single eye.
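A direct transcription of the formula above into code; the small epsilon is an addition for numerical safety and is not part of the patent's formula:

```python
# Sketch of the blink-monocular (wink) score; eps guards against a zero
# denominator and is an addition, not part of the formula above.
def wink_score(open_l, open_r, eps=1e-6):
    return abs((open_l - open_r) / (open_l + open_r + eps))
```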
In some embodiments, the expression detection result includes the smile detection result, the blink detection result and the blink monocular detection result, and the expression detection result is obtained based on the smile detection result, the blink detection result and the blink monocular detection result. For example, the smile detection result Score_smile, the blink detection result Score_expand_eye and the blink monocular detection result Score_wink are added to obtain the expression detection result Score_emotion, i.e., Score_emotion = Score_smile + Score_expand_eye + Score_wink.
Referring to fig. 26 and 27, in some embodiments, step 06 includes the following steps:
062: calculating the detection score of each frame of continuous shooting image according to the posture detection result and the expression detection result of each frame of continuous shooting image;
064: sorting the multiple frames of continuous shooting images in ascending or descending order according to the detection scores; and
066: the continuous shooting images whose order is within a predetermined order are selected as the target images.
In some embodiments, the output module 16 may include a calculation unit 161, a sorting unit 162 and a selection unit 163. The calculation unit 161, the sorting unit 162 and the selection unit 163 may be used to implement step 062, step 064 and step 066, respectively. That is, the calculation unit 161 may be configured to calculate the detection score of each frame of the continuous shooting images according to the posture detection result and the expression detection result of each frame; the sorting unit 162 may be configured to sort the multiple frames of continuous shooting images in ascending or descending order according to the detection scores; and the selection unit 163 may be configured to select the continuous shooting images whose order is within a predetermined order as the target images.
In some embodiments, the processor 20 is further configured to calculate the detection score of each frame of the continuous shooting images according to the posture detection result and the expression detection result of each frame; sort the multiple frames of continuous shooting images in ascending or descending order according to the detection scores; and select the continuous shooting images whose order is within a predetermined order as target images. That is, the processor 20 may also be used to implement step 062, step 064 and step 066.
Specifically, the posture detection result obtained in the above embodiments may be directly used as the posture detection score, and the expression detection result may be directly used as the expression detection score. In one example, the posture detection score Score_pose is added to the expression detection score Score_emotion to obtain the detection score of each frame of the continuous shooting images, i.e., Score = Score_pose + Score_emotion. Alternatively, the posture detection score Score_pose and the expression detection score Score_emotion correspond to a weight k1 and a weight k2, respectively; when calculating the detection score, the posture detection score Score_pose is multiplied by the corresponding weight k1, the expression detection score Score_emotion is multiplied by the corresponding weight k2, and the two products are added to obtain the detection score, i.e., Score = k1*Score_pose + k2*Score_emotion. The weight k1 and the weight k2 may be fixed values or may be customized by the user; the weight k1 and the weight k2 may also be adjusted automatically according to the user's preferences learned by deep learning. For example, if it is found that the user repeatedly focuses on postures, k1 may be set larger, and if it is found that the user repeatedly focuses on expressions, k2 may be set larger.
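A minimal sketch of the weighted score just described; the default weights reduce it to the plain sum:

```python
# Sketch of the per-frame detection score; k1 and k2 may be fixed, user-set,
# or adapted to learned user preference as described above.
def detection_score(score_pose, score_emotion, k1=1.0, k2=1.0):
    return k1 * score_pose + k2 * score_emotion
```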
After the detection score of each frame of the continuous shooting images is calculated, the multiple frames of continuous shooting images can be sorted in ascending or descending order of the detection scores, and then the images whose order falls within the predetermined order can be selected as target images. For example, when the multiple frames of continuous shooting images are sorted in descending order of the detection scores, the predetermined order may be the top-ranked positions, e.g., the first, the first two, the first three, or the first five; alternatively, when the multiple frames of continuous shooting images are sorted in ascending order of the detection scores, the predetermined order may be the bottom-ranked positions, e.g., the last, the last two, the last three, or the last five. Therefore, the obtained target image has a beautiful posture and a better expression, and better matches the user's expectations.
In one example, the number of target images is one: when the multiple frames of continuous shooting images are sorted in ascending order of the detection scores, the predetermined order is the last position; when the multiple frames of continuous shooting images are sorted in descending order of the detection scores, the predetermined order is the first position, so that the quality of the obtained target image is the best.
In another example, the number of target images may be two, three, or more, so that the user may select the image that best matches their expectations from the plurality of target images. Compared with having the user directly select one target image from all the continuous shooting images, this saves the time the user spends selecting the target image and further improves the user experience.
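Putting the sorting and selection together, a minimal sketch (descending order, keep the top n frames; variable names are illustrative):

```python
# Sketch: rank burst frames by detection score (descending) and keep the top n
# as target images; n = 1 returns only the highest-scoring frame.
def select_targets(frames, scores, n=1):
    order = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)
    return [frames[i] for i in order[:n]]
```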
Further, in one example, after the target image is output, the processor 20 or the terminal 100 may delete the other continuous shooting images to save the storage space of the terminal. In another example, after the target image is output, the target image may be used as a base frame and the other continuous shooting images may be fused into it, so that the posture and expression in the target image are better; the target image may also be subjected to high dynamic range (HDR) rendering, color enhancement, deblurring and the like, so that the obtained target image is clearer and the quality and aesthetic feeling of the target image are improved.
Referring to fig. 28, one or more non-transitory computer-readable storage media 300 containing a computer program 301 according to an embodiment of the present disclosure, when the computer program 301 is executed by one or more processors 20, the processor 20 may execute the method for outputting a continuous shooting image according to any one of the embodiments. For example, the method of step 02, step 04, step 06, step 042, step 044, step 0422, step 0424, step 0426, step 0428, step 4261, step 4262, step 4263, step 4264, step 4265, step 42611, step 42612, step 42621, step 42622, step 42623, step 42624, step 42631, step 42632, step 42633, step 42634, step 42635, step 42641, step 0442, step 0444, step 4442, step 4444, step 44422, step 44424, step 44426, step 444222, step 444224, step 444226, step 444221, step 444223, step 444242, step 444262, step 062, step 064, step 066 is performed.
For example, referring to fig. 1, the computer program 301, when executed by the one or more processors 20, causes the processors 20 to perform the steps of:
02: performing continuous shooting to generate a multi-frame continuous shooting image;
04: carrying out human body posture detection and human face expression detection on each frame of continuous shooting images in the multi-frame continuous shooting images; and
06: and outputting at least one target image from the multi-frame continuous shooting images according to a posture detection result obtained by human body posture detection and an expression detection result obtained by expression detection.
For another example, referring to fig. 8, when the computer program 301 is executed by the one or more processors 20, the processor 20 is caused to perform the following steps:
4261: performing face definition detection according to face joint points in the human body joint points to obtain face definition;
4262: executing face shielding degree detection according to the confidence corresponding to the face joint point to obtain face shielding degree;
4263: performing posture extension degree detection according to limb joint points in the human body joint points to obtain posture extension degree;
4264: performing human body height detection according to human body joint points to obtain human body height; and
4265: and generating a posture detection result according to at least one of the human face definition, the human face shielding degree, the posture extension degree and the human body height.
In the description herein, reference to the description of the terms "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example" or "some examples" or the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
Although embodiments of the present application have been shown and described above, it is to be understood that the above embodiments are exemplary and not to be construed as limiting the present application, and that changes, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (12)

1. A method of outputting an image for continuous shooting, comprising:
performing continuous shooting to generate a multi-frame continuous shooting image;
carrying out human body posture detection and human face expression detection on each frame of continuous shooting images in a plurality of frames of continuous shooting images; and
and outputting at least one target image from the multi-frame continuous shooting images according to a posture detection result obtained by the human body posture detection and an expression detection result obtained by the expression detection.
2. The method for outputting images according to claim 1, wherein the detecting the human body posture of each frame of the continuous shooting images comprises:
when the focal distance of the continuously shot image is larger than a preset distance, identifying human body joint points in the continuously shot image; and
and generating the posture detection result according to the human body joint points.
3. The method for outputting images according to claim 2, wherein the human posture detection comprises human face sharpness detection, human face occlusion detection, posture extension detection and human body height detection, and the generating the posture detection result according to the human body joint point comprises:
executing at least one of the face definition detection, the face occlusion detection, the pose extension detection and the human body height detection according to the human body joint points to correspondingly generate at least one of face definition, face occlusion, pose extension and human body height; and
and generating the posture detection result according to at least one of the human face definition, the human face shielding degree, the posture stretching degree and the human body height.
4. The method for outputting images according to claim 1, wherein the performing facial expression detection on each frame of the continuous shooting images in the plurality of frames of the continuous shooting images comprises:
identifying key points of the human face of the continuously shot image; and
and generating the expression detection result according to the face key points.
5. The method of outputting images according to claim 4, wherein the facial expression detection comprises smile detection, blink detection and blink monocular detection, and wherein the generating the expression detection result from the face key points comprises:
performing at least one of the smile detection, the blink detection and the blink monocular detection according to the face key points to correspondingly generate at least one of a smile detection result, a blink detection result and a blink monocular detection result; and
and generating the expression detection result according to at least one of the smile detection result, the blink detection result and the blink monocular detection result.
6. The method of outputting an image according to claim 5, wherein the face key points include a nose key point and a lip key point, and the smile detection is performed according to the face key points to generate the smile detection result, comprising:
calculating the mouth corner raising degree according to the nose key point and the lip key point;
calculating the opening degree of the mouth angle according to the lip key points; and
and generating the smile detection result according to the mouth corner rising degree and the mouth corner opening degree.
7. The method of outputting an image of claim 5, wherein the face keypoints comprise eye keypoints, and wherein performing the blink detection on the face keypoints to generate the blink detection result comprises:
calculating an eye openness degree of each eye according to the eye key points of each eye to generate the blink detection result.
8. The method of outputting an image according to claim 5, wherein the eyes comprise a left eye and a right eye, and wherein performing the blink monocular detection according to the face key points to generate the blink monocular detection result comprises:
generating the blink monocular detection result according to the eye openness degree of the left eye and the eye openness degree of the right eye.
9. The method for outputting images according to claim 1, wherein the selecting at least one target image from the multiple frames of continuously shot images according to the posture detection result obtained by detecting the human body posture and the expression detection result obtained by detecting the expression comprises:
calculating the detection score of each frame of continuous shooting image according to the posture detection result and the expression detection result of each frame of continuous shooting image;
sequencing the multiple frames of continuous shooting images from large to small or from small to large according to the detection scores; and
the continuous shooting images whose order is within a predetermined order are selected as the target images.
10. An output image device for continuous shooting, comprising:
the continuous shooting module is used for executing continuous shooting to generate multi-frame continuous shooting images;
the detection module is used for carrying out human body posture detection and human face expression detection on each frame of continuous shooting images in a plurality of frames of continuous shooting images; and
and the output module is used for outputting at least one target image from the multi-frame continuous shooting images according to a posture detection result obtained by the human body posture detection and an expression detection result obtained by the expression detection.
11. A terminal, characterized in that the terminal comprises:
one or more processors, memory; and
one or more programs, wherein the one or more programs are stored in the memory and executed by the one or more processors, the programs comprising instructions for performing the method of outputting an image of any of claims 1 to 10.
12. A non-transitory computer readable storage medium containing a computer program which, when executed by one or more processors, causes the processors to implement the method of outputting an image of any one of claims 1 to 10.
CN202110578211.3A 2021-05-26 2021-05-26 Method, device and terminal for continuously shooting output image and readable storage medium Pending CN113313009A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110578211.3A CN113313009A (en) 2021-05-26 2021-05-26 Method, device and terminal for continuously shooting output image and readable storage medium

Publications (1)

Publication Number Publication Date
CN113313009A true CN113313009A (en) 2021-08-27

Family

ID=77374985



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017096861A1 (en) * 2015-12-08 2017-06-15 乐视控股(北京)有限公司 Method and device for taking photographs
CN107734253A (en) * 2017-10-13 2018-02-23 广东欧珀移动通信有限公司 Image processing method, device, mobile terminal and computer-readable recording medium
CN109034013A (en) * 2018-07-10 2018-12-18 腾讯科技(深圳)有限公司 A kind of facial image recognition method, device and storage medium
CN112771612A (en) * 2019-09-06 2021-05-07 华为技术有限公司 Method and device for shooting image
CN112215156A (en) * 2020-10-13 2021-01-12 北京中电兴发科技有限公司 Face snapshot method and system in video monitoring

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115209052A (en) * 2022-07-08 2022-10-18 维沃移动通信(深圳)有限公司 Image screening method and device, electronic equipment and storage medium
CN115209052B (en) * 2022-07-08 2023-04-18 维沃移动通信(深圳)有限公司 Image screening method and device, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination