CN113326775B - Image processing method and device, terminal and readable storage medium


Info

Publication number
CN113326775B
Authority
CN
China
Prior art keywords
score
image
images
recognition
scores
Prior art date
Legal status
Active
Application number
CN202110600413.3A
Other languages
Chinese (zh)
Other versions
CN113326775A (en)
Inventor
苏展
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202110600413.3A
Publication of CN113326775A
Application granted
Publication of CN113326775B
Status: Active
Anticipated expiration

Classifications

    • G06V 40/166 - Human faces: detection; localisation; normalisation using acquisition arrangements
    • G06V 10/25 - Image preprocessing: determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 40/171 - Human faces: local features and components; facial parts; occluding parts, e.g. glasses; geometrical relationships
    • G06V 40/174 - Facial expression recognition
    • H04N 23/951 - Computational photography systems, e.g. light-field imaging systems, using two or more images to influence resolution, frame rate or aspect ratio


Abstract

The application discloses an image processing method, an image processing device, a terminal and a non-volatile computer readable storage medium. The image processing method includes: performing continuous shooting to generate multi-frame images; performing feature recognition on each frame of the multi-frame images to output a score for each frame; when a score indicates that an image is abnormal, processing the abnormal image according to the scores of the multi-frame images to update the score of the abnormal image; and outputting at least one target image from the multi-frame images according to the updated scores of the multi-frame images. Because feature recognition is performed on each frame of the multi-frame images and any abnormal image is processed according to the scores of the multi-frame images to update its score, the imaging quality of the target image can be guaranteed when at least one target image is output from the multi-frame images.

Description

Image processing method and device, terminal and readable storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image processing method, an image processing apparatus, a terminal, and a non-volatile computer readable storage medium for continuous shooting.
Background
At present, to implement the continuous shooting (burst) function of a camera, mobile phone manufacturers simply store every frame of the continuously shot images to the album. The user then browses the burst and selects a target image, but images acquired by continuous shooting are often blurred because of the exposure time and other factors, so the imaging quality of the selected target image is poor.
Disclosure of Invention
The embodiment of the application provides a continuous shooting image processing method, an image processing device, a terminal and a nonvolatile computer readable storage medium.
The image processing method of continuous shooting in the embodiment of the application includes: performing continuous shooting to generate multi-frame images; performing feature recognition on each frame of the multi-frame images to output a score for each frame; when a score indicates that an image is abnormal, processing the abnormal image according to the scores of the multi-frame images to update the score of the abnormal image; and outputting at least one target image from the multi-frame images according to the updated scores of the multi-frame images.
The continuous shooting image processing device of the embodiment of the application includes a continuous shooting module, an identification module, an updating module and an output module. The continuous shooting module is used for performing continuous shooting to generate multi-frame images. The identification module is used for performing feature recognition on each frame of the multi-frame images to output a score for each frame. The updating module is used for, when a score indicates that an image is abnormal, processing the abnormal image according to the scores of the multi-frame images to update the score of the abnormal image. The output module is used for outputting at least one target image from the multi-frame images according to the updated scores of the multi-frame images.
The terminal of the embodiments of the present application includes one or more processors, a memory, and one or more programs. The one or more programs are stored in the memory and executed by the one or more processors, and the programs include instructions for performing the image processing method described in the embodiments of the present application. The image processing method includes: performing continuous shooting to generate multi-frame images; performing feature recognition on each frame of the multi-frame images to output a score for each frame; when a score indicates that an image is abnormal, processing the abnormal image according to the scores of the multi-frame images to update the score of the abnormal image; and outputting at least one target image from the multi-frame images according to the updated scores of the multi-frame images.
A non-transitory computer readable storage medium of an embodiment of the present application contains a computer program which, when executed by one or more processors, causes the processors to implement the image processing method of the embodiment of the present application. The image processing method includes: performing continuous shooting to generate multi-frame images; performing feature recognition on each frame of the multi-frame images to output a score for each frame; when a score indicates that an image is abnormal, processing the abnormal image according to the scores of the multi-frame images to update the score of the abnormal image; and outputting at least one target image from the multi-frame images according to the updated scores of the multi-frame images.
In the image processing method, the image processing device, the terminal and the non-volatile computer readable storage medium, feature recognition is performed on each frame of the multi-frame images, and when a score indicates that an image is abnormal, the abnormal image is processed according to the scores of the multi-frame images to update its score. This ensures that no frame in the multi-frame images remains abnormal and that the imaging quality of the multi-frame images is maintained, so the imaging quality of the target image can be guaranteed when at least one target image is output from the multi-frame images.
Additional aspects and advantages of embodiments of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow diagram of an image processing method of certain embodiments of the present application;
FIG. 2 is a schematic diagram of an image processing apparatus according to some embodiments of the present application;
FIG. 3 is a schematic diagram of a terminal according to some embodiments of the present application;
FIGS. 4 to 15 are flowcharts of an image processing method according to some embodiments of the present application;
FIG. 16 is a schematic view of a scenario of an image processing method of certain embodiments of the present application;
FIGS. 17 and 18 are flow diagrams of image processing methods of certain embodiments of the present application;
FIG. 19 is a schematic view of a scenario of an image processing method of certain embodiments of the present application;
FIGS. 20-26 are flow diagrams of image processing methods according to certain embodiments of the present application;
FIG. 27 is a schematic illustration of a scenario of an image processing method of certain embodiments of the present application;
FIG. 28 is a schematic diagram of a connection of a computer readable storage medium and a processor according to some embodiments of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar components or components having like or similar functionality throughout. The embodiments described below by referring to the drawings are exemplary only for explaining the embodiments of the present application and are not to be construed as limiting the embodiments of the present application.
Referring to fig. 1 to 3, the image processing method of continuous shooting according to the embodiment of the application includes the following steps:
01: performing continuous shooting to generate multi-frame images;
02: performing feature recognition on each frame of image in the multi-frame images to output scores of each frame of image;
03: when a score indicates that an image is abnormal, processing the abnormal image according to the scores of the multi-frame images to update the score of the abnormal image; and
04: outputting at least one target image from the multi-frame images according to the updated scores of the multi-frame images.
The continuous shooting image processing apparatus 10 of the embodiment of the present application includes a continuous shooting module 11, an identification module 12, an update module 13, and an output module 14. The continuous shooting module 11, the identification module 12, the updating module 13 and the output module 14 can be used for realizing the steps 01, 02, 03 and 04 respectively. That is, the continuous shooting module 11 may be configured to perform continuous shooting to generate a plurality of frame images; the identification module 12 may be configured to perform feature identification on each of the multiple frames of images to output a score of each of the multiple frames of images; the updating module 13 may be configured to, when the score indicates that the image has an abnormality, process the image having the abnormality according to the score of the multi-frame image, so as to update the score of the image having the abnormality; the output module 14 may be configured to output at least one target image from the multiple frame images according to the updated scores of the multiple frame images.
The terminal 100 of the present embodiment includes one or more processors 20, a memory 30, and one or more programs, wherein the one or more programs are stored in the memory 30 and executed by the one or more processors 20, and the programs include instructions for performing the image processing method of the present embodiment. That is, when the processor 20 executes the programs, the processor 20 may implement step 01, step 02, step 03, and step 04. In other words, the processor 20 may be configured to: perform continuous shooting to generate multi-frame images; perform feature recognition on each frame of the multi-frame images to output a score for each frame; when a score indicates that an image is abnormal, process the abnormal image according to the scores of the multi-frame images to update the score of the abnormal image; and output at least one target image from the multi-frame images according to the updated scores of the multi-frame images.
With the image processing method, the image processing device 10 and the terminal 100 described above, when a score indicates that an image is abnormal, the abnormal image is processed according to the scores of the multi-frame images to update its score. This ensures that no frame in the multi-frame images remains abnormal and that the imaging quality of the multi-frame images is maintained, so the imaging quality of the target image can be guaranteed when at least one target image is output from the multi-frame images.
The terminal 100 may be a mobile phone, a tablet computer, a display device, a notebook computer, a teller machine, a gate machine, a smart watch, a head-mounted display device, a game console, etc. As shown in FIG. 3, the embodiments of the present application are described by taking a mobile phone as an example of the terminal 100; it can be understood that the specific form of the terminal 100 is not limited to the mobile phone.
Referring to FIG. 3, the terminal 100 may further include a camera 40, which may be a front camera or a rear camera of the terminal 100. The processor 20 may be coupled to the camera 40 and control the camera 40 to perform continuous shooting. In step 01, continuous shooting is performed to generate multi-frame images. Specifically, the camera 40 performs continuous shooting, and multi-frame images can be generated after the continuous shooting; for example, 5 frames, 10 frames, 15 frames, 20 frames, 25 frames, 30 frames or more frames of images can be generated. The number of generated images may be a fixed value, may be set by the user, or may be determined according to the duration of the continuous shooting. In the embodiments of the present application, the multi-frame images are denoted as N frames.
Specifically, in step 02, feature recognition is performed on each frame of the multi-frame images to output the score of each frame. It will be appreciated that after the processor 20 controls the camera 40 to perform continuous shooting to generate the multi-frame images, the processor 20 may recognize the features in each frame of image and output a score for each frame. The score is a rating given according to how well the feature is presented after the feature of each frame is recognized: the higher the score, the better the feature is presented in the image. When an image includes a plurality of features, each frame of image also has a plurality of scores, one score corresponding to each feature.
For example, when the image is a portrait, the features in each frame of image may include the pose of the human body, the expression of the human face, and so on. The pose of the human body and the expression of the human face in each frame of image can then be recognized and scored. If the pose of the human body in each frame is scored, the frame with the highest score has the best pose, so the image with the best pose can be selected quickly; if the expression of the human face in each frame is scored, the frame with the highest score has the best expression, so the image with the best expression can be selected quickly.
In step 03, when a score indicates that an image is abnormal, the abnormal image is processed according to the scores of the multi-frame images to update the score of the abnormal image. A score indicating an abnormality means that the feature of the image was not recognized; specifically, the feature in the image is scored, and the score is 0 when the image is abnormal. The processor 20 may therefore determine whether an image is abnormal by determining whether its score is 0, and process the abnormal image according to the scores of the multi-frame images, thereby updating the score of the abnormal image.
When the score of one frame of the multi-frame images indicates an abnormality, the processor 20 may determine from that score that the frame is abnormal. In one embodiment, the processor 20 searches the other images for the scores corresponding to the feature that is abnormal in this frame. The processor 20 may then use the average value of those scores in the other images as the score of this frame; for example, the processor 20 may update the score of this frame according to the scores of the feature in the two frames that are closest to this frame in time sequence, so as to ensure that the multi-frame images contain no abnormality. For example, suppose the processor 20 generates 3 frames of images ordered in time sequence, and the score of a feature in the 2nd frame indicates that the image is abnormal, i.e., the score of that feature in the 2nd frame is 0. The score of that feature may then be updated according to the scores of the corresponding feature in the 1st frame and the 3rd frame; that is, the score of the 2nd frame may be updated according to the scores of the 1st frame and the 3rd frame.
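A minimal sketch of this update rule, assuming a score of 0 marks an abnormal frame and that the replacement is the average of the two temporally nearest frames in which the feature was recognized; the function and variable names are illustrative, not from the patent.

```python
def update_abnormal_scores(scores):
    """scores: list of per-frame scores for one feature; 0 marks an abnormal frame."""
    updated = list(scores)
    for i, s in enumerate(scores):
        if s != 0:
            continue
        # collect (temporal distance, score) for frames where the feature was recognized
        valid = [(abs(j - i), v) for j, v in enumerate(scores) if v != 0]
        if not valid:
            continue  # no frame recognized this feature; leave the score unchanged
        valid.sort(key=lambda t: t[0])
        nearest = [v for _, v in valid[:2]]       # the two temporally closest valid frames
        updated[i] = sum(nearest) / len(nearest)  # average, as in the 3-frame example
    return updated

# Example from the description: 3 frames, the 2nd frame's feature score is 0,
# so it is replaced by the average of frames 1 and 3.
print(update_abnormal_scores([0.8, 0, 0.6]))  # -> [0.8, 0.7, 0.6]
```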
In step 04, at least one target image is output from the multi-frame images according to the updated scores of the multi-frame images. According to the updated scores, the frame with the highest score can be selected from the multi-frame images; that is, at least one image in which the features perform well is selected from the multi-frame images as a target image. Because the updated scores no longer contain abnormal images, the imaging quality of the output target image is guaranteed. The number of target images may be one, two, three, four, five, six or more, which is not specifically limited here. The number of output target images may be a fixed number, a user-defined number, or a number determined according to the number of captured images; for example, it may be a preset proportion of the number of captured images, such as one tenth, one ninth, one eighth or another proportion, without limitation, so that the number of target images is reasonable, neither occupying too much memory nor being too few to affect the diversity of the user's choices.
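An illustrative sketch of step 04 under the "preset proportion" option just described; the function name and the default ratio of one tenth are assumptions for the example.

```python
def select_target_images(images, updated_scores, ratio=0.1):
    """Keep the top-scoring frames; the count is a fraction of the burst length, but at least 1."""
    k = max(1, int(len(images) * ratio))
    ranked = sorted(zip(updated_scores, range(len(images))), reverse=True)
    return [images[idx] for _, idx in ranked[:k]]
```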
Referring to fig. 2-4, in some embodiments, the scoring includes a gesture score and an expression score, and step 02 may include the steps of:
021: carrying out human body gesture recognition on each frame of image in the multi-frame images to output gesture scores;
022: and carrying out facial expression recognition on each frame of image in the multi-frame images to output expression scores.
In certain embodiments, the identification module 12 is used to implement steps 021 and 022. Namely, the recognition module 12 is used for recognizing the human body gesture of each frame of images in the multi-frame images so as to output gesture scores; and carrying out facial expression recognition on each frame of image in the multi-frame images to output expression scores.
In some embodiments, the processor 20 may also be configured to: carrying out human body gesture recognition on each frame of image in the multi-frame images to output gesture scores; and carrying out facial expression recognition on each frame of image in the multi-frame images to output expression scores. That is, the processor 20 is further configured to implement steps 021 and 022.
Specifically, a preset human body posture model can be used for carrying out human body posture recognition on each frame of image, and a posture score of each frame of image is generated, so that images with attractive postures can be selected quickly. The facial expression recognition can be carried out on each frame of image by using a preset facial expression model, and the expression score of each frame of image is generated, so that the image with better expression can be selected more quickly.
Step 021 and step 022 may be performed simultaneously, so as to reduce the time consumed by human body gesture recognition and facial expression recognition on each frame of image and improve efficiency. Of course, in other embodiments, step 021 and step 022 may also be performed sequentially, which is not limited here; for example, step 021 may be performed first and then step 022, or step 022 may be performed first and then step 021.
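A sketch of running steps 021 and 022 simultaneously, as allowed above; pose_score() and expression_score() are placeholder callables standing in for the preset pose and expression models, not real APIs.

```python
from concurrent.futures import ThreadPoolExecutor

def score_frame(image, pose_score, expression_score):
    """Run pose scoring (step 021) and expression scoring (step 022) in parallel."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        pose_future = pool.submit(pose_score, image)
        expr_future = pool.submit(expression_score, image)
        return pose_future.result(), expr_future.result()
```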
Referring to fig. 2, 3 and 5, in some embodiments, step 021 may comprise the steps of:
023: when the focusing distance of the image is greater than a predetermined distance, identifying human body joint points in the image; and
024: generating a gesture score according to the human body joint points.
In some embodiments, identification module 12 may be used to implement steps 023 and 024. That is, the identification module 12 may be configured to identify a human body node in the image when the focusing distance of the image is greater than a predetermined distance; and generating a gesture score according to the human body joint points.
In some embodiments, the processor 20 may also be configured to: when the focusing distance of the image is judged to be larger than a preset distance, identifying a human body joint point in the image; and generating a gesture score according to the human body joint points. That is, processor 20 may also be used to implement steps 023 and 024.
Specifically, if the focusing distance of the image is smaller than the predetermined distance, it can be considered that the subject is too close to the camera for the camera to capture the subject's whole body, so the whole body of the subject is not present in the image. For example, when a front camera is used for self-photographing, the front camera is close to the subject and generally only captures the subject's face. If human body joint point recognition is performed on such an image, complete human body joint points cannot be recognized, the gesture score generated from the joint points is inaccurate, and resources of the processor 20 and processing time are wasted. Therefore, when the focusing distance of the image is judged to be smaller than or equal to the predetermined distance, gesture recognition may simply not be performed on the image, reducing the consumption of resources of the processor 20 and the processing time.
Further, when the focusing distance of the image is greater than the predetermined distance, the human body joint points in the image may be recognized. Specifically, the human body joint points in each frame of image, from the first frame to the N-th frame, can be recognized through a human body posture estimation algorithm. When there is only a single person in the image, the joint points of that person can be recognized; when there are multiple people in the image, for example two, three, four, five or more, the joint points of each person in the image may be recognized. The human body posture estimation algorithm may include, but is not limited to, PifPaf, PoseNet, YOLOv4, etc.
After identifying the human body node in the image, a pose score may be generated from the human body node. For example, whether the posture of the human body is stretched, whether the human body is shielded, whether the human body is clear or not and the like can be identified according to the human body joint points, so that the posture score can be obtained.
In the embodiment, the human body joint point is first identified, then the gesture score is generated according to the identified human body joint point, so that the gesture score is more accurate, and finally the target image output based on the gesture score meets the requirements of the user.
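A hedged sketch of steps 023 and 024 as described above. detect_joints() and score_from_joints() are placeholders standing in for a pose-estimation model (such as the PifPaf, PoseNet or YOLOv4-based methods mentioned earlier) and the scoring logic; they are not real library calls.

```python
def pose_score_for_frame(image, focus_distance, predetermined_distance,
                         detect_joints, score_from_joints):
    """Skip pose scoring for close-range frames, otherwise detect joints and score the pose."""
    if focus_distance <= predetermined_distance:
        return None  # subject too close; the whole body is not in frame, so skip pose scoring
    joints = detect_joints(image)  # per-person joint points with coordinates and confidence
    return score_from_joints(joints)
```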
Referring to fig. 2, 3 and 6, in some embodiments, the human body pose recognition includes face definition recognition, face occlusion recognition, pose expansion recognition and human body height recognition, and step 024 may include the steps of:
025: performing at least one of face definition recognition, face shielding degree recognition, gesture stretching degree recognition and human height recognition according to the human body joint points, so as to correspondingly generate at least one of the face definition, the face shielding degree, the gesture stretching degree and the human body height; and
026: generating a gesture score according to at least one of the face definition, the face shielding degree, the gesture stretching degree and the human body height.
In certain embodiments, the identification module 12 may be used to implement step 025 and step 026. That is, the recognition module 12 may perform at least one of face definition recognition, face occlusion recognition, pose expansion recognition, and human height recognition according to the human articulation point to correspond to at least one of face definition, face occlusion, pose expansion, and human height; and generating a pose score according to at least one of the face definition, the face shielding degree, the pose stretching degree and the human body height.
In some embodiments, the processor 20 may also be configured to: performing at least one of face definition recognition, face shielding degree recognition, gesture stretching degree recognition and human height recognition according to the human joint points so as to correspondingly generate at least one of face definition, face shielding degree, gesture stretching degree and human height; and generating a pose score according to at least one of the face definition, the face shielding degree, the pose stretching degree and the human body height. That is, the processor 20 may also be used to implement step 025 and step 026.
Specifically, the human body joint points may include, but are not limited to, the nose, eyes, ears, shoulders, elbows, hands, buttocks, knees, feet, etc., and may be selectively increased or decreased according to user needs. At least one of face definition recognition, face shielding degree recognition, gesture stretching degree recognition and human height recognition can be selectively performed according to the joint point information of each part among the human body joint points, so as to correspondingly generate at least one of the face definition Score_face_clarity, the face shielding degree Score_face_occlusion, the posture stretching degree Score_stretch and the human body height Score_h.
In some embodiments, when the human body gesture recognition is performed, any one of the recognition modes of the human face definition recognition, the human face shielding degree recognition, the gesture stretching degree recognition and the human body height recognition may be performed. In other embodiments, when performing human body pose recognition, two recognition modes of face definition recognition, face shielding degree recognition, pose spreading degree recognition and human body height recognition may be performed, for example, face definition recognition and face shielding degree recognition, face definition recognition and pose spreading degree recognition, face definition recognition and human body height recognition, face shielding degree recognition and pose spreading degree recognition, and the like are performed, which are not listed herein. In still other embodiments, three of face clarity recognition, face occlusion degree recognition, pose expansion degree recognition, and human height recognition, for example, face clarity recognition, face occlusion degree recognition, and pose expansion degree recognition, face clarity recognition, face occlusion degree recognition, and human height recognition, face occlusion degree recognition, pose expansion degree recognition, and human height recognition may be performed when human body pose recognition is performed. In some other embodiments, when the human body gesture is identified, four identification modes of human face definition identification, human face shielding degree identification, gesture stretching degree identification and human body height identification can be executed, wherein the human face definition identification, the human face shielding degree identification, the gesture stretching degree identification and the human body height identification can be performed simultaneously or sequentially in any order.
Further, in order to recognize the pose of the photographed person in the image more accurately, when human body pose recognition is performed it is generally necessary to perform gesture stretching degree recognition according to the human body joint points to generate the gesture stretching degree, because the stretching degree is important for judging whether the pose is graceful. After the gesture stretching degree recognition, at least one of face definition recognition, face shielding degree recognition and human height recognition can be performed, synchronously or successively, as required. For example, if attention is paid to whether the human face in the image is clear, face definition recognition is also performed; if attention is paid to whether the human face in the image is blocked, face shielding degree recognition is performed; if the height of the person in the image is emphasized, human height recognition is performed; and if attention is paid both to whether the face is blocked and to whether it is clear, face definition recognition and face shielding degree recognition are both performed. Other cases are possible and are not listed here.
Further, if only one recognition mode of face definition recognition, face shielding recognition, pose expansion recognition, and human height recognition is performed, a pose score may be generated according to one recognition data generated correspondingly. For example, if only gesture stretching recognition is performed, a gesture score is generated from the gesture stretching. If two recognition modes of face definition recognition, face shielding recognition, gesture stretching recognition and human body height recognition are performed, a gesture score may be generated according to at least one of the two recognition data correspondingly generated. If three recognition modes of face clarity recognition, face shielding recognition, pose expansion recognition and human body height recognition are performed, a pose score may be generated according to at least one of the three recognition data correspondingly generated. If four recognition modes of face definition recognition, face shielding degree recognition, pose stretching degree recognition and human body height recognition are performed, a pose score may be generated according to at least one of the corresponding generated pose stretching degree, face shielding degree, face definition and human body height.
In one embodiment, step 025 is performed and the gesture stretching degree, the face shielding degree, the face definition and the human body height are correspondingly obtained, and the pose score Score_pose may be calculated according to at least one of these data. For example, the pose score may be calculated from the gesture stretching degree and the face definition; or from the gesture stretching degree and the face shielding degree; or from the gesture stretching degree, the face shielding degree and the face definition; or from the gesture stretching degree, the face shielding degree, the face definition and the human body height.
In one example, the pose score Score_pose may be calculated from the posture stretching degree Score_stretch, the face shielding degree Score_face_occlusion, the face definition Score_face_clarity and the human body height Score_h as Score_pose = Score_face_clarity + Score_face_occlusion + Score_stretch + Score_h. Alternatively, in another example, Score_face_clarity, Score_face_occlusion, Score_stretch and Score_h are given corresponding weights a, b, c and d respectively, and then Score_pose = a·Score_face_clarity + b·Score_face_occlusion + c·Score_stretch + d·Score_h. Other calculation methods are also possible and are not listed here. Compared with some gesture recognition models that only predict the pose of the photographed person and ignore the interference of portrait blur and face occlusion, obtaining the pose score by combining the face definition, the face shielding degree, the gesture stretching degree and the human body height can improve the accuracy of human body pose recognition, so that the output target image has a graceful pose and a clear portrait.
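A minimal sketch of the weighted combination above; the function name and the default weights of 1.0 are assumptions, since the text does not give values for a, b, c and d.

```python
def pose_score(face_clarity, face_occlusion, stretch, height, a=1.0, b=1.0, c=1.0, d=1.0):
    # Score_pose = a*Score_face_clarity + b*Score_face_occlusion + c*Score_stretch + d*Score_h
    return a * face_clarity + b * face_occlusion + c * stretch + d * height
```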
Further, referring to fig. 2, 3 and 7, in some embodiments, step 025 may include the steps of:
0251: determining a face area according to the nose articulation point, the ear articulation points and the eye articulation points; and
0252: calculating the face definition of the face area.
In some embodiments, the identification module 12 may also be used to implement steps 0251 and 0252. I.e. the identification module 12 is further operable to: determining a face area according to the nose articulation point, the ear articulation point and the eye articulation point; and calculating the face definition of the face region.
In some embodiments, the processor 20 may also be configured to: determining a face area according to the nose articulation point, the ear articulation point and the eye articulation point; and calculating the face definition of the face region. That is, the processor 20 may also be used to implement steps 0251 and 0252.
Specifically, the ear articulation points may include a left ear articulation point and a right ear articulation point, and the eye articulation points may include a left eye articulation point and a right eye articulation point. The face region may be determined from the nose articulation point, the left ear articulation point, the right ear articulation point, the left eye articulation point and the right eye articulation point, and the face definition may then be calculated within the face region using a face definition detection algorithm; for example, the variance of the Laplacian within the face region may be used to represent the face definition Score_face_clarity. Of course, the face definition may also be calculated by other algorithms, which are not specifically illustrated here. In this embodiment, the face definition is calculated so that the pose score includes face definition data, so the problem of face definition is taken into account during pose detection and the face in the obtained target image is relatively clear.
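A sketch of the face-sharpness measure mentioned above, i.e., the variance of the Laplacian inside the face region, using OpenCV; the face box derived from the nose, eye and ear joint points is assumed to be given as (x, y, w, h).

```python
import cv2

def face_clarity(image_bgr, face_box):
    """Return the Laplacian variance of the face region; higher means sharper."""
    x, y, w, h = face_box
    face = image_bgr[y:y + h, x:x + w]
    gray = cv2.cvtColor(face, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()
```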
Further, referring to fig. 2, 3 and 8, in certain embodiments, step 025 further comprises the steps of:
0253: calculating nose shielding degree of the nose according to the confidence degree of the nose joint point;
0254: calculating the eye shielding degree of the eyes according to the confidence degree of the eye joint points;
0255: calculating the ear shielding degree of the ears according to the confidence degree of the ear joint points; and
0256: calculating the face shielding degree according to the nose shielding degree, the eye shielding degree and the ear shielding degree.
In some embodiments, the identification module 12 may also be used to implement step 0253, step 0254, step 0255, and step 0256. That is, the identification module 12 may also be configured to: calculating nose shielding degree of the nose according to the confidence degree of the nose joint point; calculating the eye shielding degree of the eyes according to the confidence degree of the eye joint points; calculating the ear shielding degree of the ear according to the confidence degree of the ear joint point; and calculating the face shielding degree according to the nose shielding degree, the eye shielding degree and the ear shielding degree.
In some embodiments, referring to fig. 3, the processor 20 may be further configured to: calculating nose shielding degree of the nose according to the confidence degree of the nose joint point; calculating the eye shielding degree of the eyes according to the confidence degree of the eye joint points; calculating the ear shielding degree of the ear according to the confidence degree of the ear joint point; and calculating the face shielding degree according to the nose shielding degree, the eye shielding degree and the ear shielding degree. That is, processor 20 may also be used to implement steps 0253, 0254, 0255, and 0256.
Specifically, the face is a relatively critical area in the image, and the influence of the occlusion of the face on the quality of the image is relatively large, so that the face occlusion degree needs to be calculated. The obvious characteristic of the face is the five sense organs, and the face shielding degree can be accurately determined by calculating the shielding degree of the five sense organs. In this embodiment, the nose shielding degree, the eye shielding degree and the ear shielding degree can be calculated according to the corresponding data of the human body joint point, and then the face shielding degree is calculated according to the nose shielding degree, the eye shielding degree and the ear shielding degree, so that the face shielding degree obtained by calculation is more accurate.
When the human body posture estimation algorithm is used to recognize the human body joint points, it can output not only the coordinates of the joint points but also the confidence of each joint point. The confidence represents the probability that the point is the corresponding human body joint point: the higher the confidence, the higher that probability, and the lower the degree to which the joint point can be considered occluded. A mapping relationship can therefore be established between the confidence and the occlusion degree. For example, the mapping between the confidence a and the occlusion degree S may be S = 1 - a; if the confidence of the nose joint point is 75%, the nose occlusion degree of the nose joint point can be considered to be 25%. Alternatively, the mapping between the confidence a and the occlusion degree S may be S = k·a, where k is a coefficient that can be determined through multiple experiments. The mapping relationship between confidence and occlusion degree may be the same or different for each human body joint point; for example, the mapping used for the nose joint point may differ from the mapping used for the ear joint points, so different calculation rules can be set for different joint points to fit them more accurately, and the nose occlusion degree and the ear occlusion degree can be calculated more accurately.
Further, the nose shielding degree Score_nose_occlusion of the nose can be calculated according to the confidence of the nose joint point. The left eye shielding degree Score_eye_occlusion_l of the left eye can be calculated according to the confidence of the left eye joint point, and the right eye shielding degree Score_eye_occlusion_r of the right eye according to the confidence of the right eye joint point; from these, the eye shielding degree Score_eye_occlusion can be calculated, for example Score_eye_occlusion = Score_eye_occlusion_l + Score_eye_occlusion_r. Likewise, the left ear shielding degree Score_ear_occlusion_l can be calculated according to the confidence of the left ear joint point and the right ear shielding degree Score_ear_occlusion_r according to the confidence of the right ear joint point, and then the ear shielding degree Score_ear_occlusion = Score_ear_occlusion_l + Score_ear_occlusion_r.
Still further, the face shielding degree Score_face_occlusion can be calculated from the nose shielding degree Score_nose_occlusion, the eye shielding degree Score_eye_occlusion and the ear shielding degree Score_ear_occlusion, for example Score_face_occlusion = Score_nose_occlusion + Score_eye_occlusion + Score_ear_occlusion. Alternatively, weights corresponding to the nose shielding degree, the eye shielding degree and the ear shielding degree may be set, and the face shielding degree calculated according to these weights; these variants are not listed here.
In other embodiments, the mouth shielding degree of the mouth can be calculated. For example, mouth occlusion may be calculated based on the confidence of the mouth articulation point. The mouth shielding degree can be added when the face shielding degree is calculated, so that the obtained face shielding degree is more accurate.
Of course, the face shielding degree can also be calculated by some deep learning algorithms. For example, the occluded region within the face region may be recognized, and then the proportion of the face region occupied by the occluded region may be calculated, and so on, without limitation.
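A sketch of the confidence-to-occlusion mapping S = 1 - a described above, applied to the nose, eye and ear joint points and summed into Score_face_occlusion; the dictionary key names are illustrative assumptions.

```python
def face_occlusion(joint_confidence):
    """joint_confidence: dict of joint name -> confidence in [0, 1]; returns the face occlusion degree."""
    occ = lambda name: 1.0 - joint_confidence.get(name, 0.0)  # S = 1 - a
    nose = occ("nose")
    eyes = occ("eye_l") + occ("eye_r")
    ears = occ("ear_l") + occ("ear_r")
    return nose + eyes + ears  # Score_face_occlusion = nose + eye + ear occlusion

print(face_occlusion({"nose": 0.75, "eye_l": 0.9, "eye_r": 0.9, "ear_l": 0.5, "ear_r": 0.5}))
# nose 0.25 + eyes 0.2 + ears 1.0 = 1.45
```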
Referring to fig. 2, 3 and 9, in some embodiments, step 025 may include the steps of:
02511: calculating the bending degree of the arm according to the hand joint point, the elbow joint point and the shoulder joint point;
02512: according to the foot joint points and the hip joint points, the leg stretching degree is calculated;
02513: calculating a first degree of twisting of the foot and the trunk according to the foot articulation point, the hip articulation point and the shoulder articulation point;
02514: calculating a second degree of twisting of the leg and the torso according to the knee joint point, the hip joint point and the shoulder joint point; and
02515: the pose expansion degree is calculated according to the arm bending degree, the leg stretching degree, the first twisting degree and the second twisting degree.
In some embodiments, the identification module 12 may also be configured to: calculating the bending degree of the arm according to the hand joint point, the elbow joint point and the shoulder joint point; according to the foot joint points and the hip joint points, the leg stretching degree is calculated; calculating a first degree of twisting of the foot and the trunk according to the foot articulation point, the hip articulation point and the shoulder articulation point; calculating a second degree of torsion of the leg and the torso according to the knee joint point, the hip joint point and the shoulder joint point; and calculating the posture extension degree according to the arm bending degree, the leg stretching degree, the first twisting degree and the second twisting degree. That is, the identification module 12 may also be used to implement steps 02511, 02512, 02513, 02514, and 02515.
In some embodiments, the processor 20 may also be configured to: calculating the bending degree of the arm according to the hand joint point, the elbow joint point and the shoulder joint point; according to the foot joint points and the hip joint points, the leg stretching degree is calculated; calculating a first degree of twisting of the foot and the trunk according to the foot articulation point, the hip articulation point and the shoulder articulation point; calculating a second degree of torsion of the leg and the torso according to the knee joint point, the hip joint point and the shoulder joint point; and calculating the posture extension degree according to the arm bending degree, the leg stretching degree, the first twisting degree and the second twisting degree. That is, the processor 20 may also be used to implement steps 02511, 02512, 02513, 02514, and 02515.
In particular, the influence of the pose expansion degree on the image quality is critical, and normal users continuously shoot so that images with graceful poses can be shot. If the pose expansion degree is too small, the user may not be fully expanded, and the frame image may not be an image desired by the user, and thus, it is necessary to detect the pose expansion degree of the human body in the image. Wherein the hand articulation points may include left hand articulation points and right hand articulation points, the elbow articulation points may include left elbow articulation points and right elbow articulation points, the shoulder articulation points may include left shoulder articulation points and right shoulder articulation points, the hip articulation points may include left hip articulation points and right hip articulation points, the knee articulation points may include left knee articulation points and right knee articulation points, and the foot articulation points may include left foot articulation points and right foot articulation points.
Further, the degree of arm bending, the degree of leg stretching, the first degree of twisting, and the second degree of twisting can all be calculated by the following formulas.
Wherein A, B, C can be the position coordinates of three related human body nodes, respectively, |A-B|| 2 Representing the 2-norm of a-B, arccos represents the inverse cosine function.
For example, for the left arm, the bending degree of the left elbow, Score_elbow_l (i.e., S in the above formula), is calculated using the left hand joint point coordinate Wrist_l(x, y) (i.e., A in the above formula), the left elbow joint point coordinate Elbow_l(x, y) (i.e., B in the above formula) and the left shoulder joint point coordinate Shoulder_l(x, y) (i.e., C in the above formula).
Similarly, the bending degree Score_elbow_r of the right elbow can be calculated with the above formula using the right hand joint point coordinate Wrist_r(x, y) as A, the right elbow joint point coordinate Elbow_r(x, y) as B and the right shoulder joint point coordinate Shoulder_r(x, y) as C.
The stretching degree Score_leg_l of the left leg can be calculated with the above formula using the left foot joint point coordinate Foot_l(x, y) as A, the left hip joint point coordinate Hip_l(x, y) as B and the left knee joint point coordinate Knee_l(x, y) as C.
The stretching degree Score_leg_r of the right leg can be calculated with the above formula using the right foot joint point coordinate Foot_r(x, y) as A, the right hip joint point coordinate Hip_r(x, y) as B and the right knee joint point coordinate Knee_r(x, y) as C.
The first twisting degree Score_twist_ankle_l of the left foot and the torso can be calculated with the above formula using the left foot joint point coordinate Foot_l(x, y) as A, the left hip joint point coordinate Hip_l(x, y) as B and the left shoulder joint point coordinate Shoulder_l(x, y) as C.
The first twisting degree Score_twist_ankle_r of the right foot and the torso can be calculated with the above formula using the right foot joint point coordinate Foot_r(x, y) as A, the right hip joint point coordinate Hip_r(x, y) as B and the right shoulder joint point coordinate Shoulder_r(x, y) as C.
The second twisting degree Score_twist_knee_l of the left leg and the torso can be calculated with the above formula using the left knee joint point coordinate Knee_l(x, y) as A, the left hip joint point coordinate Hip_l(x, y) as B and the left shoulder joint point coordinate Shoulder_l(x, y) as C.
The second twisting degree Score_twist_knee_r of the right leg and the torso can be calculated with the above formula using the right knee joint point coordinate Knee_r(x, y) as A, the right hip joint point coordinate Hip_r(x, y) as B and the right shoulder joint point coordinate Shoulder_r(x, y) as C.
Further, the pose stretching degree Score_stretch of the human body in the image can be calculated from the left elbow bending degree Score_elbow_l, the right elbow bending degree Score_elbow_r, the left leg stretching degree Score_leg_l, the right leg stretching degree Score_leg_r, the first twisting degree Score_twist_ankle_l of the left foot and the torso, the first twisting degree Score_twist_ankle_r of the right foot and the torso, the second twisting degree Score_twist_knee_l of the left leg and the torso, and the second twisting degree Score_twist_knee_r of the right leg and the torso. For example, these eight terms may be added directly, or added with weights, to obtain the pose stretching degree, e.g., Score_stretch = Score_elbow_l + Score_elbow_r + Score_leg_l + Score_leg_r + Score_twist_ankle_l + Score_twist_ankle_r + Score_twist_knee_l + Score_twist_knee_r. Of course, the pose stretching degree may also be calculated in other ways, which are not illustrated here.
In this embodiment, the bending degree of the left elbow, the bending degree of the right elbow, the stretching degree of the left leg, the stretching degree of the right leg, the first twisting degree of the left foot and the torso, the first twisting degree of the right foot and the torso, the second twisting degree of the left leg and the torso, and the second twisting degree of the right leg and the torso are all calculated, so the stretching of each limb of the human body is fully considered, the calculated pose stretching degree is more accurate, and the pose in the finally output target image is more graceful.
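A sketch of the joint-angle formula described above (arccos of the normalized dot product, a reconstruction of the formula referenced in the text) and of the direct-sum pose stretching degree; the joint dictionary keys are illustrative assumptions.

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle S at joint B formed by points A and C: arccos((A-B)·(C-B) / (||A-B||_2 ||C-B||_2))."""
    a, b, c = np.asarray(a, float), np.asarray(b, float), np.asarray(c, float)
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def pose_stretch(j):
    """j: dict of joint name -> (x, y) for one person; returns the summed pose stretching degree."""
    terms = [
        joint_angle(j["wrist_l"], j["elbow_l"], j["shoulder_l"]),  # left elbow bending
        joint_angle(j["wrist_r"], j["elbow_r"], j["shoulder_r"]),  # right elbow bending
        joint_angle(j["foot_l"], j["hip_l"], j["knee_l"]),         # left leg stretching
        joint_angle(j["foot_r"], j["hip_r"], j["knee_r"]),         # right leg stretching
        joint_angle(j["foot_l"], j["hip_l"], j["shoulder_l"]),     # left foot-torso twisting
        joint_angle(j["foot_r"], j["hip_r"], j["shoulder_r"]),     # right foot-torso twisting
        joint_angle(j["knee_l"], j["hip_l"], j["shoulder_l"]),     # left leg-torso twisting
        joint_angle(j["knee_r"], j["hip_r"], j["shoulder_r"]),     # right leg-torso twisting
    ]
    return sum(terms)  # Score_stretch as the direct sum of the eight terms
```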
Referring to fig. 2, 3 and 10, in some embodiments, step 025 may include the steps of:
02516: and calculating the height of the human body according to the shoulder joint points and the foot joint points.
In some embodiments, the identification module 12 may also be configured to calculate the height of the person based on the shoulder joint and the foot joint. That is, the identification module 12 may also be used to implement step 02516.
In some embodiments, the processor 20 may also be configured to calculate the height of the person based on the shoulder joint and the foot joint. That is, the processor 20 may also be used to implement step 02516.
Specifically, the height of the human body also matters for the quality of the image: if the human body appears short in the image, the figure of the user does not look good, and the image is not the one the user wants. Therefore, the height of the human body in each image needs to be detected so that an image in which the human body appears taller can be found. The height of the human body can be calculated from the shoulder joint points and the foot joint points.
More specifically, the human height may be calculated using the ordinate of the left shoulder joint, the ordinate of the right shoulder joint, the ordinate of the left foot joint, and the ordinate of the right foot joint, with the following calculation formula:
where i denotes the i-th person in the image, j = 0 denotes the left shoulder, j = 1 denotes the right shoulder, j = 2 denotes the left foot, j = 3 denotes the right foot, Y_ij denotes the ordinate of joint point j of the i-th person, and H denotes the height of the frame image.
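The formula itself is not reproduced here; as a hedged reading of the symbol definitions above (an assumption, not necessarily the exact disclosed formula), the body height can be taken as the vertical span between the shoulder joint points and the foot joint points, normalized by the frame height H:

```python
# Hedged sketch only: Y[i][j] is the ordinate of joint point j of person i, with
# j = 0/1 the left/right shoulder and j = 2/3 the left/right foot; H is the frame
# height. Image ordinates grow downward, so the feet have larger y than the shoulders.
def body_height_score(Y, i, H):
    shoulder_y = min(Y[i][0], Y[i][1])   # higher of the two shoulder points
    foot_y = max(Y[i][2], Y[i][3])       # lower of the two foot points
    return (foot_y - shoulder_y) / H     # normalized body height within the frame
```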
Of course, the head-top coordinates and the sole coordinates of the human body can also be recognized and used to calculate the human body height. The human body height may also be calculated by other algorithms, which are not limited here.
Referring to fig. 2, 3 and 11, in some embodiments, step 022 may include the steps of:
0221: identifying face key points of the image; and
0222: and generating expression scores according to the face key points.
In certain embodiments, the identification module 12 may be used to implement step 0221 and step 0222. That is, the recognition module 12 may be used to recognize the face key points of the image and to generate expression scores according to the face key points.
In some embodiments, the processor 20 may also be configured to: identifying key points of the face of the image; and generating expression scores according to the face key points. That is, processor 20 may also be used to implement steps 0221 and 0222.
Specifically, in order to find an image in which the photographed person has a better expression, the expression of the photographed person in each image needs to be recognized accurately. The face key point information of each image from the first frame to the N-th frame can be identified by a face key point recognition algorithm. The algorithm can recognize the face key point information of all people in the image: when there is only one person in the image, the face key point information of that person is recognized, and when there are several people, the face key point information of each person can be detected. The face key point recognition algorithm may be Dlib or PFLD, among others not listed here, and a suitable algorithm can be selected according to actual requirements. The face key points may include key points of features such as the eyes (left and right), ears (left and right), nose and mouth of the photographed person.
Further, the expression of the photographed person can be recognized from the recognized face key points. For example, expression data such as the smile degree of the photographed person, how wide the eyes are open, and how wide the mouth is open can be determined from the recognized face key points. An expression score of the image can then be generated from the identified expression data. From the expression scores, the image in which the photographed person has the better expression can be found.
Referring to fig. 2, 3 and 12, in some embodiments, facial expression recognition includes smile recognition, eye opening recognition and eye blinking recognition, and step 0222 includes the steps of:
2221: executing at least one of smile recognition, eye opening recognition and eye blinking recognition according to the face key points so as to correspondingly generate at least one of a smile score, an eye opening score and an eye blinking score; and
2222: and generating an expression score according to at least one of the smile score, the eye opening score and the eye blinking score.
In some embodiments, the identification module 12 may be used to implement step 2221 and step 2222. That is, the recognition module 12 may be configured to perform at least one of smile recognition, eye opening recognition and eye blinking recognition according to the face key points so as to correspondingly generate at least one of a smile score, an eye opening score and an eye blinking score; and to generate an expression score according to at least one of the smile score, the eye opening score and the eye blinking score.
In some embodiments, the processor 20 may also be configured to: perform at least one of smile recognition, eye opening recognition and eye blinking recognition according to the face key points so as to correspondingly generate at least one of a smile score, an eye opening score and an eye blinking score; and generate an expression score according to at least one of the smile score, the eye opening score and the eye blinking score. That is, the processor 20 may also be configured to implement step 2221 and step 2222.
Specifically, in order to obtain an accurate expression score, at least one of smile recognition, eye opening recognition and eye blinking recognition may be performed according to the recognized face key points, and at least one of a smile score, an eye opening score and an eye blinking score is generated correspondingly. For example, smile recognition may be performed according to the face key points, and a smile score is generated after the smile recognition; eye opening recognition may be performed according to the face key points, and an eye opening score is generated after the eye opening recognition; eye blinking recognition may be performed according to the face key points, and an eye blinking score is generated after the eye blinking recognition. Alternatively, smile recognition and eye opening recognition may both be performed according to the face key points, and a smile score and an eye opening score are generated correspondingly. Alternatively, smile recognition, eye opening recognition and eye blinking recognition may all be performed according to the face key points, and a smile score, an eye opening score and an eye blinking score are generated correspondingly.
Further, referring to fig. 13, step 2221 may include step 22211: performing smile recognition according to the face key points to generate a smile score; step 22212: performing eye opening recognition according to the face key points to generate an eye opening score; and step 22213: performing eye blinking recognition according to the face key points to generate an eye blinking score.
In some embodiments, at least one of step 22211, step 22212 and step 22213 may be performed selectively according to the needs of the user, so that the obtained expression score meets the expectations of the user. In one example, one of steps 22211, 22212 and 22213 may be selected to be performed as desired. In another example, two of steps 22211, 22212 and 22213 may be selected to be performed as desired, for example step 22211 + step 22212, step 22211 + step 22213, or step 22212 + step 22213. In yet another example, all three of steps 22211, 22212 and 22213 may be selected to be performed.
In other embodiments, at least one of step 22211, step 22212 and step 22213 may be selected automatically to be performed based on repeated learning of the common characteristics of the images the user selects. For example, if the user often selects images in which the person smiles broadly with eyes wide open as the target image, steps 22211 and 22212 may be selected automatically. Alternatively, if the user often selects images in which the person smiles while blinking one eye as the target image, steps 22211 and 22213 may be selected automatically. Other cases are possible and are not listed here.
Further, an expression score for the image may be obtained based on the score or scores from step 2221. For example, the expression score may be obtained from the smile score of step 22211, from the eye opening score of step 22212, or from the eye blinking score of step 22213; it may also be obtained from the smile score and the eye opening score together, or from the smile score, the eye opening score and the eye blinking score together. If the expression score is obtained from only one of the smile score, the eye opening score and the eye blinking score, that score can be used directly as the expression score; if it is obtained from two or three of them, the expression score can be obtained by adding the two or three results.
In this embodiment, at least one of smile recognition, eye opening recognition and eye blinking recognition is performed according to the face key points, and the expression score is generated from the resulting scores, so that expression data of the photographed person in each image can be obtained and a target image with a better expression can be selected according to the expression scores.
Referring to fig. 2, 3 and 14, in some embodiments, the face keypoints comprise nose keypoints and mouth keypoints, step 22211 comprises the steps of:
222112: calculating the mouth corner raising degree according to the nose key points and the mouth key points;
222114: calculating the opening degree of the mouth angle according to the mouth key points; and
222116: and generating smile scores according to the mouth corner upward degree and the mouth corner opening degree.
In some embodiments, the identification module 12 may also be configured to: calculating the mouth corner raising degree according to the nose key points and the mouth key points; calculating the opening degree of the mouth angle according to the mouth key points; and generating smile scores according to the mouth corner upward degree and the mouth corner opening degree. That is, the identification module 12 may be used to implement steps 222112, 222114, and 222116.
In some embodiments, the processor 20 may also be configured to: calculating the mouth corner raising degree according to the nose key points and the mouth key points; calculating the opening degree of the mouth angle according to the mouth key points; and generating smile scores according to the mouth corner upward degree and the mouth corner opening degree. That is, the processor 20 may be used to implement step 222112, step 222114, and step 222116.
Specifically, a smile on a face mainly changes the lips, so the smile data of the face can be determined from the degree to which the mouth corners are lifted and the degree to which the mouth is opened. The change of the mouth relative to the nose can be calculated from the coordinates of the nose key points and the coordinates of the mouth key points to obtain the mouth corner lift degree Score_rise_mouth; generally, the greater the lift of the mouth corners, the happier the photographed person. The number of nose key points may be one or more, and the number of mouth key points may be one or more. The mouth opening degree can be calculated from the mouth key points; for example, the mouth opening degree Score_expand_mouth can be calculated from the coordinates of the upper lip key points and the coordinates of the lower lip key points. The number of upper lip key points and lower lip key points may each be one or more.
Further, after the mouth corner lift degree Score_rise_mouth and the mouth opening degree Score_expand_mouth are obtained, the smile score Score_smile of the face is calculated from them. In one example, the mouth corner lift degree is added to the mouth opening degree to give the smile score, i.e. Score_smile = Score_rise_mouth + Score_expand_mouth. In another example, weights are applied to the mouth corner lift degree and the mouth opening degree, e.g. Score_smile = a*Score_rise_mouth + b*Score_expand_mouth. Of course, the smile score of the face can also be calculated from the mouth corner lift degree and the mouth opening degree in other ways.
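A minimal sketch of the two combinations just mentioned (the weights a and b are illustrative values, not values prescribed here):

```python
# Illustrative sketch: Score_smile from mouth corner lift and mouth opening degree.
def smile_score(score_rise_mouth, score_expand_mouth, a=1.0, b=1.0):
    # a = b = 1.0 reduces to the direct sum; other weights give the weighted variant.
    return a * score_rise_mouth + b * score_expand_mouth
```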
Referring to fig. 2, 3 and 15, in some embodiments, the nose keypoints comprise a nose head keypoint, the mouth keypoint comprises two mouth angle keypoints, and step 22211 comprises the steps of:
222111: and calculating the mouth corner upward degree according to the nose head key point and the two mouth corner key points.
In some embodiments, the recognition module 12 may also be configured to calculate the degree of mouth lift based on the nose keypoints and the two mouth corner keypoints. That is, the identification module 12 may also be used to implement step 222111.
In some embodiments, the processor 20 may also be configured to calculate the degree of mouth lift based on the nose key point and the two mouth corner key points. That is, the processor 20 may also be used to implement step 222111.
Specifically, referring to fig. 16, the nose key point 33 has the coordinates Nose(x, y), the left mouth corner key point 48 has the coordinates Lips_l(x, y), and the right mouth corner key point 54 has the coordinates Lips_r(x, y). The calculation formula of the mouth corner lift degree Score_rise_mouth may be as follows:
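The formula itself is not reproduced here; a hedged illustration consistent with the surrounding text (measuring how high the two mouth corners sit relative to the nose tip, normalized by the mouth width) could be:

```python
import math

# Hedged illustration only, not the exact disclosed formula. nose, lips_l and lips_r
# are the (x, y) coordinates of key points 33, 48 and 54. Image ordinates grow
# downward, so a larger (nose_y - corner_y) value means a higher mouth corner.
def mouth_corner_lift(nose, lips_l, lips_r):
    mouth_width = math.hypot(lips_r[0] - lips_l[0], lips_r[1] - lips_l[1]) or 1.0
    lift_l = (nose[1] - lips_l[1]) / mouth_width
    lift_r = (nose[1] - lips_r[1]) / mouth_width
    return lift_l + lift_r
```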
referring to fig. 2, 3 and 17, in some embodiments, the mouth keypoints comprise an upper lip keypoint, a lower lip keypoint and a mouth corner keypoint, and step 222114 comprises the steps of:
222113: and calculating the mouth angle opening degree according to the upper lip key point, the lower lip key point and the mouth angle key point.
In some embodiments, the recognition module 12 may also be used to calculate the degree of mouth opening based on the upper lip keypoints, the lower lip keypoints, and the mouth corner keypoints. That is, the identification module 12 may also be used to implement step 222113.
In some embodiments, the processor 20 may be further configured to calculate the degree of mouth opening based on the upper lip keypoint, the lower lip keypoint, and the mouth angle keypoint. That is, the processor 20 may also be used to implement step 222113.
Referring to fig. 16, the mouth corner key points may include two, namely key point 48 and key point 54, the upper lip key points may include two, namely key point 49 and key point 53, and the lower lip key points include two, namely key point 55 and key point 59. The formula for calculating the mouth opening degree Score_expand_mouth may be as follows:
where dist_49_59 represents the distance between key point 49 and key point 59, dist_53_55 represents the distance between key point 53 and key point 55, and dist_48_54 represents the distance between key point 48 and key point 54. The coordinates of key points 49, 59, 53, 55, 48 and 54 may be obtained by the face key point recognition algorithm described above or by other algorithms, and are not described in detail here.
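A hedged sketch of a mouth-aspect-ratio style measure built from these distances (the exact normalization is not reproduced here) could be:

```python
import math

def dist(p, q):
    # Euclidean distance between two (x, y) key points
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Hedged sketch only: vertical lip gaps normalized by the mouth width, using the
# key point indices of Fig. 16 (kp maps index -> (x, y)).
def mouth_open_degree(kp):
    vertical = dist(kp[49], kp[59]) + dist(kp[53], kp[55])
    width = dist(kp[48], kp[54]) or 1.0
    return vertical / width
```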
Of course, in other embodiments, the mouth opening degree may also be calculated in other ways, which are not limited here.
Referring to fig. 2, 3 and 18, in some embodiments, the face keypoints comprise eye keypoints, and step 22212 comprises the steps of:
222122: the eye opening degree of each eye is calculated according to the eye key point of each eye to generate an open eye score.
In some embodiments, the identification module 12 may also be configured to calculate the eye openness of each eye based on the eye keypoints for each eye to generate an eye-open score. That is, the identification module 12 may also be used to implement step 222122.
In some embodiments, the processor 20 may be further configured to calculate an eye openness of each eye based on the eye keypoints of each eye to generate an eye-open score. That is, the processor 20 may also be used to implement step 222122.
The degree to which the eyes are open is important for the expression of the photographed person and directly affects the overall look of the image; in general, users want photographs in which the eyes are open. In this embodiment, the eye opening degree of each eye is calculated, and the resulting eye opening score reflects the opening degree of each eye, so the expression score also reflects it, making it more likely that the eyes are open in the finally output target image.
In particular, each eye may include an upper eyelid, a lower eyelid, and an canthus, and the eye keypoints may include an upper eyelid keypoint, a lower eyelid keypoint, and an canthus keypoint. The eye opening degree of the left eye can be calculated according to the upper eyelid key point, the lower eyelid key point and the corner key point of the left eye. The eye opening degree of the right eye can be calculated according to the upper eyelid key point, the lower eyelid key point and the corner key point of the right eye. And then the eye opening score can be generated according to the eye opening degree of the left eye and the eye opening degree of the right eye. For example, the open eye score may be the degree of eye opening for the left eye plus the degree of eye opening for the right eye.
More specifically, with continued reference to fig. 19, the eye corner key points of the left eye may include two, namely key point 36 and key point 39, the upper eyelid key points of the left eye may include two, namely key point 37 and key point 38, and the lower eyelid key points of the left eye may include two, namely key point 40 and key point 41. The eye opening degree Score_expand_eye_l of the left eye can be obtained by an aspect ratio calculation based on key points 36, 37, 38, 39, 40 and 41. The eye corner key points of the right eye may include two, namely key point 42 and key point 45, the upper eyelid key points of the right eye may include two, namely key point 43 and key point 44, and the lower eyelid key points of the right eye may include two, namely key point 47 and key point 48. The eye opening degree Score_expand_eye_r of the right eye can be obtained by an aspect ratio (vertical-to-horizontal) calculation based on key points 42, 43, 44, 45, 47 and 48. The overall eye opening degree is Score_expand_eye = Score_expand_eye_l + Score_expand_eye_r.
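As an illustration of such a vertical-to-horizontal aspect ratio (an eye-aspect-ratio style measure; the exact disclosed ratio is not reproduced), one hedged sketch is:

```python
import math

def dist(p, q):
    # Euclidean distance between two (x, y) key points
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Hedged sketch: eye openness as (sum of vertical eyelid gaps) / (corner-to-corner width).
# kp maps a key point index to its (x, y) coordinates.
def eye_open_degree(kp, corner_a, upper_1, upper_2, corner_b, lower_1, lower_2):
    vertical = dist(kp[upper_1], kp[lower_2]) + dist(kp[upper_2], kp[lower_1])
    horizontal = dist(kp[corner_a], kp[corner_b]) or 1.0
    return vertical / horizontal

# Example use for the left eye with the indices of Fig. 19; the right eye is
# handled analogously with the indices listed in the text.
# score_expand_eye_l = eye_open_degree(kp, 36, 37, 38, 39, 40, 41)
```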
Referring to fig. 2, 3 and 20, in some embodiments, step 22213 includes the steps of:
222131: a blink monocular score is generated based on the eye opening level of the left eye and the eye opening level of the right eye.
In some embodiments, the identification module 12 may also be configured to generate a blink score based on the eye openness of the left eye and the eye openness of the right eye. That is, the identification module 12 may also be used to implement step 222131.
In some embodiments, the processor 20 may also be configured to generate a blink score based on the eye openness of the left eye and the eye openness of the right eye. That is, the processor 20 may also be used to implement step 222131.
Specifically, it needs to be recognized whether the photographed person in an image is blinking one eye, that is, whether the user has only one eye open, for example the left eye open and the right eye closed, or the right eye open and the left eye closed. Whether the photographed person is blinking one eye, and which eye is open and which is closed, can be judged from the eye opening degree of the left eye and the eye opening degree of the right eye. The expression score can therefore include the blink score, so that the case where the user wants to photograph a wink is also satisfied and the expressions in the output target image are richer.
In one example, whether a blink condition exists may be determined based on the difference between the eye opening level of the left eye and the eye opening level of the right eye. For example, the greater the difference between the eye opening degree of the left eye and the eye opening degree of the right eye, the greater the probability of blinking the single eye; the smaller the difference between the eye opening degree of the left eye and the eye opening degree of the right eye, the smaller the probability of blinking the single eye.
In another example, whether a one-eye blink exists can be determined, and the blink score generated, from the absolute value of the ratio between the difference of the eye opening degrees of the left and right eyes and the sum of the eye opening degrees of the left and right eyes. The specific calculation can be written as Score_wink = abs((Score_expand_eye_l - Score_expand_eye_r) / (Score_expand_eye_l + Score_expand_eye_r)),
where abs denotes the absolute value, Score_expand_eye_l denotes the eye opening degree of the left eye, Score_expand_eye_r denotes the eye opening degree of the right eye, and Score_wink denotes the blink (one-eye) score. The larger Score_wink is, the greater the probability that one eye is blinking; the smaller Score_wink is, the smaller that probability.
In some embodiments, the expression score includes the smile score, the eye opening score and the blink score, and the expression score is obtained from them. For example, the smile score Score_smile, the eye opening score Score_expand_eye and the blink score Score_wink are added to obtain the expression score Score_emotion, i.e. Score_emotion = Score_smile + Score_expand_eye + Score_wink.
Referring to fig. 2, 3 and 21, in some embodiments, step 03 includes the steps of:
031: determining that the score is abnormal when the score lacks a gesture score and/or an expression score;
032: processing the abnormal image according to the gesture scores of the multi-frame images so as to update the gesture scores of the abnormal image; and/or
033: and processing the image with the abnormality according to the expression scores of the multi-frame images so as to update the expression scores of the image with the abnormality.
In some embodiments, update module 13 may be configured to perform steps 031, 032 and 033. That is, the update module 13 may be configured to determine that an image is abnormal when its score lacks a gesture score and/or an expression score; to process the abnormal image according to the gesture scores of the multi-frame images so as to update the gesture score of the abnormal image; and/or to process the abnormal image according to the expression scores of the multi-frame images so as to update the expression score of the abnormal image.
In some embodiments, the processor 20 may be configured to determine that an image is abnormal when its score lacks a gesture score and/or an expression score; to process the abnormal image according to the gesture scores of the multi-frame images so as to update the gesture score of the abnormal image; and/or to process the abnormal image according to the expression scores of the multi-frame images so as to update the expression score of the abnormal image. That is, processor 20 may be configured to implement steps 031, 032 and 033.
Specifically, the pose score and the expression score of each image can be obtained as in the above embodiments. However, because the camera is shooting a rapidly moving scene, the quality of the captured multi-frame images cannot be guaranteed, and low-quality frames easily lead to missed recognition or false recognition in the pose score and the expression score. Missed recognition means, for example, that a face exists in an image but the pose score of that image lacks a face definition score or a face occlusion score; or that there is a smile in the image but the expression score of the image lacks a smile score.
Thus, when the score lacks a gesture score and/or an expression score, it can be determined that the image is abnormal. Specifically, when any scoring value in the pose score of a frame is 0, it indicates that the corresponding pose feature of that frame was not recognized, that is, the image is abnormal; and/or when any scoring value in the expression score of a frame is 0, it indicates that the corresponding expression feature of that frame was not recognized. For example, when the face definition value in the pose score of an image is 0 and the frame actually contains a face, it indicates that the face definition of that image was not recognized and the image is abnormal. For another example, when the eye opening score in the expression score of an image is 0 and the frame actually contains eyes, it indicates that the eye opening of that image was not recognized and the image is abnormal.
It follows that when the processor 20 traverses the pose score and the expression score of each frame, if the score of a pose feature or an expression feature that appears in the frame is 0, the processor 20 determines that the frame is abnormal.
In some embodiments, if a certain gesture feature or expression feature does not exist in the frame image, the score of that feature may be assigned as null in the scoring of the frame image, so that the score calculation of the image is not affected and no misjudgment is produced.
In one example, in step 032, the image with the anomaly may be processed according to the pose scores of the multi-frame images so as to update its pose score. For example, the pose score of the abnormal image is reassigned using the average of the pose scores of the multi-frame images. For another example, the pose score of the abnormal image may be reassigned using the average of the pose scores of the frames immediately before and after it.
In yet another example, in step 033, the image with the anomaly may be processed according to the expression scores of the multi-frame images so as to update its expression score. For example, the expression score of the abnormal image is reassigned using the average of the expression scores of the multi-frame images. For another example, the expression score of the abnormal image may be reassigned using the average of the expression scores of the frames immediately before and after it.
When both the gesture score and the expression score are determined to be abnormal, step 032 and step 033 may be performed simultaneously, so that the gesture score and the expression score of the image are processed and updated at the same time, reducing the time consumed for updating the scores and improving efficiency. Of course, in other embodiments, step 032 and step 033 may be performed sequentially, which is not limited here; for example, step 032 first and then step 033, or step 033 first and then step 032.
In this embodiment, whether an image is abnormal can be determined by checking whether its score lacks the pose score and/or the expression score, and the score of the abnormal image is processed and updated, so that missed pose recognition and/or expression recognition does not affect the output target image, ensuring the imaging quality of the output target image.
Referring to fig. 2, 3 and 22, in some embodiments, step 032 includes the steps of:
0321: and calculating the pose scores of the images with the abnormality according to the pose scores of two frames of images which are positioned before and after the images with the abnormality in the time sequence in the multi-frame images.
In some embodiments, update module 13 may be used to perform step 0321. That is, the updating module 13 may be configured to calculate the pose score of the image having the abnormality from the pose scores of two frames of images that are located in front and behind the image having the abnormality in time series among the multi-frame images.
In some embodiments, the processor 20 may be configured to calculate the pose score of the image with the anomaly from the pose scores of two frames of images of the multi-frame image that are temporally located before and after the image with the anomaly. That is, the processor 20 may be configured to implement step 0321.
Specifically, when the pose score of a certain frame among the multi-frame images is abnormal, the pose score of the abnormal image can be calculated from the pose scores of the two frames located before and after it in time sequence. These preceding and following frames are frames that contain the pose score item that is abnormal in the current frame.
For example, if the face definition score of a frame is 0, the face definition score is abnormal. If the previous frame also lacks a face definition score, the search continues forward in time until a frame containing a face definition score is found; that frame is the previous frame used to process and update the current frame. Similarly, the pose score of the following frame must also contain a face definition score. The specific formula is as follows:
where Score is the computed pose score of the image with the anomaly, i is the position of the abnormal image in the time sequence, l1 is the distance (number of frames) between the preceding frame image and the abnormal image i, l2 is the distance (number of frames) between the following frame image and the abnormal image i, Score_{i-l1} is the score of the preceding frame image, and Score_{i+l2} is the score of the following frame image.
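The exact formula is not reproduced here; one hedged reading of the definitions above is a distance-weighted average of the two neighboring valid scores, where the closer frame contributes more:

```python
# Hedged sketch only (the weighting is an assumption). score_prev / score_next are
# the scores of the nearest earlier / later frames that contain the missing item;
# l1 / l2 are their frame distances from the abnormal frame i.
def update_anomalous_score(score_prev, l1, score_next, l2):
    return (l2 * score_prev + l1 * score_next) / (l1 + l2)
```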
After Score is calculated from the pose scores of the preceding and following frames, it can be assigned to the pose score of the abnormal image, thereby completing the update of the pose score of the abnormal image.
In some embodiments, all items of the gesture score and the expression score are evaluated for every image, regardless of whether the image actually contains the gesture feature or expression feature corresponding to each item. For example, when an image contains no smile, smile recognition is still performed on that frame; its smile score is then 0, the processor 20 also determines that the frame is abnormal, and the smile score of the frame is updated from the smile scores of the preceding and following frames using the above formula. It should be noted that if a certain feature does not appear in an image, the feature will in principle also be absent in the frames immediately before and after it; for example, if the 20th frame in the sequence contains no smile, the 19th and 21st frames will usually contain no smile either. Therefore, even if the processor 20 determines that such a frame is abnormal and updates it, the effect on the final score of the frame is small.
Referring to fig. 2, 3 and 23, in some embodiments, step 033 includes the steps of:
0331: and calculating the expression scores of the images with the abnormality according to the expression scores of two frames of images which are positioned before and after the images with the abnormality in the time sequence in the multi-frame images.
In certain embodiments, the update module 13 may be used to perform step 0331. That is, the updating module 13 may be configured to calculate the expression score of the image having the abnormality from the expression scores of two frames of images that are located in front and behind the image having the abnormality in time series among the multiple frames of images.
In some embodiments, the processor 20 may be configured to calculate the expression score of the image with the abnormality from the expression scores of two frames of images of the multi-frame image that are temporally located before and after the image with the abnormality. That is, the processor 20 may be configured to implement step 0331.
Specifically, similarly to step 0321, when the expression score of a certain frame among the multi-frame images is abnormal, the expression score of the abnormal image can be calculated from the expression scores of the two frames located before and after it in time sequence. These preceding and following frames are frames that contain the expression score item that is abnormal in the current frame. For example, if the smile score of a frame is 0, that is, the smile score is abnormal, and the previous frame also lacks a smile score, the search continues forward in time until a frame containing a smile score is found; similarly, the expression score of the following frame must also contain a smile score.
Similarly, using the formula in the above embodiment, the Score can be calculated from the expression scores of the two frames and assigned to the expression score of the abnormal image, thereby completing the update of the expression score of the abnormal image.
Referring to fig. 2, 3 and 24, in some embodiments, step 04 includes the steps of:
041: sorting the multi-frame images in order from large to small or from small to large according to the recognition score; and
042: images in the order within the predetermined order are selected as target images.
In some embodiments, output module 14 is used to implement step 041 and step 042. That is, the output module 14 is configured to sort the multi-frame images in order from large to small or from small to large according to the recognition scores; and selecting images in a predetermined order as target images.
In some embodiments, the processor 20 may be further configured to rank the multi-frame images in order from large to small or from small to large according to the recognition score, and to select images in a predetermined order as target images. That is, processor 20 may also be used to implement step 041 and step 042.
Specifically, the gesture score obtained in the above embodiments may be used directly as the gesture component of the recognition score, and the expression score directly as the expression component. In one example, the pose score Score_pose is added to the expression score Score_emotion to obtain the recognition score of each frame, i.e. Score = Score_pose + Score_emotion. Alternatively, the pose score Score_pose and the expression score Score_emotion correspond to a weight k1 and a weight k2 respectively: the pose score is multiplied by k1, the expression score is multiplied by k2, and the two products are added to obtain the recognition score, i.e. Score = k1*Score_pose + k2*Score_emotion. The weights k1 and k2 may be fixed values or customized by the user, and they may also be adjusted automatically by learning the user's preference; for example, if the user is found to pay more attention to gesture, k1 is set larger, and if the user is found to pay more attention to expression, k2 is set larger.
Thus, the multi-frame images can be sorted from small to large or from large to small according to the recognition score, and the images whose order falls within the predetermined order are selected as target images. For example, when the images are sorted from large to small, the predetermined order may be the top of the ranking, for example the first, the first two, the first three, or the first five; when the images are sorted from small to large, the predetermined order may be the end of the ranking, for example the last, the last two, the last three, or the last five. The target images obtained in this way have better poses and better expressions and better match the user's expectations.
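A minimal sketch of this ranking-and-selection step (illustrative names only):

```python
# Rank frames by recognition score from large to small and keep those within the
# predetermined order (here, the top k frames).
def select_target_images(images, scores, k=1):
    order = sorted(range(len(images)), key=lambda i: scores[i], reverse=True)
    return [images[i] for i in order[:k]]
```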
In one example, the number of target images is one: when the images are sorted from small to large by recognition score, the predetermined order is the last; when they are sorted from large to small, the predetermined order is the first, so that the quality of the obtained target image is the best.
In another example, the number of the target images may be two, three, or the like, so that the user may select an image more suitable for the user's desire from the plurality of target images. Compared with the method that the user directly selects one target image from all the images, the time for selecting the target image by the user can be saved, and the use experience of the user can be further enhanced.
Further, in one example, after outputting the target image, the processor 20 or the terminal 100 may delete other images to save the memory space of the terminal 100. In another example, after the target image is output, the target image may be used as a basic frame, and other images may be fused into the target image, so that the pose and expression of the target image are better, and the target image may be processed by High-Dynamic Range (HDR), glare, deblurring, etc., so that the obtained target image is clearer, and the quality and aesthetic feeling of the target image are improved.
Referring to fig. 2, 3 and 25, in some embodiments, the scoring includes a gesture score and an expression score, and the image processing method may include the steps of:
027: and calculating the recognition score of each frame of image according to the gesture score, the expression score, the first weight preset by the gesture score and the second weight preset by the expression score of each frame of image.
In certain embodiments, the output module 14 is used to implement step 027. That is, the output module 14 is configured to calculate the recognition score of each frame image according to the pose score, the expression score, the first weight preset by the pose score, and the second weight preset by the expression score of each frame image.
In some embodiments, the processor 20 may be further configured to calculate the recognition score of each frame image according to the pose score, the expression score, the first weight preset by the pose score, and the second weight preset by the expression score of each frame image. That is, the processor 20 may also be used to implement step 027.
Specifically, the gesture score obtained in the above embodiments may be used directly as the gesture recognition score, and the expression score directly as the expression recognition score. A first weight α for the gesture score and a second weight β for the expression score may be preset, and the recognition score is the product of the gesture recognition score Score_pose and α plus the product of the expression recognition score Score_emotion and β, i.e. Score = α*Score_pose + β*Score_emotion.
Further, to make the resulting recognition score more accurate, different weights may be set for the different pose scores and expression scores. For example, within the pose score, the face definition Score_face_clarity is given a face definition first weight α1, the face occlusion degree Score_face_occlusion is given a face occlusion first weight α2, the pose stretch degree Score_stretch is given a pose stretch first weight α3, and the human body height Score_h is given a human body height first weight α4. The gesture recognition score Score_pose is then the sum of the product of the face definition and α1, the product of the face occlusion degree and α2, the product of the pose stretch degree and α3, and the product of the human body height and α4, i.e. Score_pose = α1*Score_face_clarity + α2*Score_face_occlusion + α3*Score_stretch + α4*Score_h. Similarly, the smile score Score_smile, the eye opening score Score_expand_eye and the blink score Score_wink of the expression score Score_emotion are given a smile second weight β1, an eye opening second weight β2 and a blink second weight β3 respectively, so that Score_emotion = β1*Score_smile + β2*Score_expand_eye + β3*Score_wink. Thus the recognition score Score = α1*Score_face_clarity + α2*Score_face_occlusion + α3*Score_stretch + α4*Score_h + β1*Score_smile + β2*Score_expand_eye + β3*Score_wink. Where the expression score also includes items such as the mouth corner lift degree Score_rise_mouth and the mouth opening degree Score_expand_mouth, each such item can be given its own weight, which is not repeated here.
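A hedged sketch of this weighted combination (the component names and weight values are illustrative):

```python
# Illustrative sketch: pose and emotion are dictionaries of component scores
# (e.g. face_clarity, face_occlusion, stretch, height; smile, expand_eye, wink),
# and alpha / beta hold the corresponding first and second weights.
def recognition_score(pose, emotion, alpha, beta):
    score_pose = sum(alpha[name] * value for name, value in pose.items())
    score_emotion = sum(beta[name] * value for name, value in emotion.items())
    return score_pose + score_emotion
```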
Thus, by calculating the score of each frame image, the multi-frame images can be sorted according to the magnitude value of the score, thereby taking the images in the predetermined order as target images to be output to the user.
Referring to fig. 2, 3 and 26, in some embodiments, step 042 may include the steps of:
0421: acquiring images in a predetermined order as pre-output images;
0422: acquiring a time difference between the shooting time of a pre-output image and the shooting time of an intermediate frame image, wherein the intermediate frame image is a pre-output image positioned at an intermediate time sequence in the pre-output image;
0423: deleting the pre-output image with the time difference greater than a predetermined threshold; a kind of electronic device with high-pressure air-conditioning system
0424: and outputting the pre-output image with the highest recognition score as a target image.
In certain embodiments, the output module 14 is used to implement step 0421, step 0422, step 0423 and step 0424. That is, the output module 14 is configured to acquire images in a predetermined order as pre-output images; acquire the time difference between the shooting time of each pre-output image and the shooting time of the intermediate frame image, where the intermediate frame image is the pre-output image located at the intermediate time sequence among the pre-output images; delete the pre-output images whose time difference is greater than a predetermined threshold; and output the pre-output image with the highest recognition score as the target image.
In some embodiments, the processor 20 may be further configured to acquire images in a predetermined order as pre-output images; acquire the time difference between the shooting time of each pre-output image and the shooting time of the intermediate frame image, where the intermediate frame image is the pre-output image located at the intermediate time sequence among the pre-output images; delete the pre-output images whose time difference is greater than a predetermined threshold; and output the pre-output image with the highest recognition score as the target image. That is, the processor 20 may also be used to implement step 0421, step 0422, step 0423 and step 0424.
After the images in the predetermined order are acquired according to the above embodiments, it should be noted that because the camera is shooting rapidly moving scenes, the quality of the captured multi-frame images cannot be guaranteed, and low-quality frames easily lead to missed recognition or false recognition in the pose score and the expression score. False recognition means that a certain frame is misjudged so that its score is too high; the images within the predetermined order may therefore contain falsely recognized images, which affects the accuracy of the finally output target image, so that the output frame is not actually the best one. Therefore, after selecting the images whose recognition scores fall within the predetermined order, the selected images are used as pre-output images for subsequent screening.
Specifically, in step 0422, after the pre-output images are acquired, the frame whose shooting time is in the middle of the pre-output images is taken as a reference image, and the time difference between the shooting time of each other pre-output frame and that of the reference image is calculated. During continuous shooting, the closer the shooting time is to the middle moment, the better the captured image quality tends to be; therefore the frame whose shooting time is in the middle is taken as the reference image. The predetermined order is an odd number, which ensures that the number of acquired pre-output images is odd and makes it easy to find the intermediate frame image.
The intermediate frame image is the frame whose shooting time is the most central among the pre-output images; it does not mean that the intermediate frame image is temporally equidistant from the first and last pre-output images.
Because a well-performing image should belong to a group of relatively consecutive frames within the burst, that is, the time difference between the frames should not be too long, in step 0423, after the time difference between the shooting time of each other pre-output frame and the reference image is obtained, abnormal frames can be removed by comparing the time difference with the predetermined threshold. Specifically, when the time difference between a certain frame and the intermediate frame image is greater than the predetermined threshold, that frame is abnormal and is removed.
As shown in fig. 27, there are 5 pre-output images 50 in total, namely the 17th, 19th, 20th, 25th and 26th frame images in time sequence. The intermediate frame image is therefore the 20th frame. If the predetermined threshold is 4, the time differences between the 25th and 26th frames and the 20th frame are greater than 4, so the 25th and 26th frames are rejected.
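A minimal sketch of steps 0422 to 0424 with this example (illustrative names; frame indices stand in for shooting times):

```python
# Keep only pre-output frames whose distance from the intermediate frame image is
# within the predetermined threshold, then output the highest-scoring survivor.
def filter_and_pick(frames, threshold):
    # frames: list of (frame_index, recognition_score)
    frames = sorted(frames)                      # order by shooting time
    mid_index = frames[len(frames) // 2][0]      # intermediate frame image
    kept = [f for f in frames if abs(f[0] - mid_index) <= threshold]
    return max(kept, key=lambda f: f[1])

# With the Fig. 27 example: frames 17, 19, 20, 25 and 26 with threshold 4,
# frames 25 and 26 are rejected and the best of 17, 19 and 20 is output.
```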
After step 0423 is performed, one or more images with higher scores, that is, images that perform better and contain no false recognition, are obtained. The frame with the highest recognition score among the remaining pre-output images can then be output as the target image.
In this embodiment, the time difference between each pre-output frame and the intermediate frame image is compared with the predetermined threshold, and the pre-output images whose time difference is greater than the predetermined threshold are deleted, so that no falsely recognized image remains among the pre-output images, ensuring the accuracy of the output target image and a higher imaging quality of the target image.
Referring to fig. 28, one or more non-transitory computer-readable storage media 300 embodying a computer program 301 of an embodiment of the present application, when executed by one or more processors 20, causes the processors 20 to perform the image processing method of any of the embodiments described above.
For example, referring to fig. 1, when the computer program 301 is executed by one or more processors 20, the processor 20 is caused to perform the steps of:
01: performing continuous shooting to generate multi-frame images;
02: performing feature recognition on each frame of image in the multi-frame images to output scores of each frame of image;
03: when the score indicates that an image is abnormal, processing the abnormal image according to the scores of the multi-frame images so as to update the score of the abnormal image; and
04: and outputting at least one target image from the multi-frame images according to the updated scores of the multi-frame images.
For another example, referring to fig. 26, when the computer program 301 is executed by one or more processors 20, the processors 20 are caused to perform the steps of:
0421: acquiring images in a predetermined order as pre-output images;
0422: acquiring a time difference between the shooting time of a pre-output image and the shooting time of an intermediate frame image, wherein the intermediate frame image is a pre-output image positioned at an intermediate time sequence in the pre-output image;
0423: deleting the pre-output image with the time difference greater than a predetermined threshold; and
0424: and outputting the pre-output image with the highest recognition score as a target image.
In the description of the present specification, reference to the terms "one embodiment," "some embodiments," "illustrative embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and further implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.
While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the present application, and that variations, modifications, alternatives, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the present application.

Claims (15)

1. An image processing method, comprising:
performing continuous shooting to generate multi-frame images;
calculating the score of each frame of the image according to the gesture score, the expression score, a first weight preset by the gesture score and a second weight preset by the expression score of each frame of the image;
determining that the score of the image is abnormal when the score of the image lacks the gesture score and/or the expression score;
processing the pose scores of the images with anomalies according to the pose scores of the images with multiple frames to update the pose scores of the images with anomalies; and/or
Processing the expression scores of the images with the abnormality according to the expression scores of the images with multiple frames to update the expression scores of the images with the abnormality; and
And outputting at least one target image from the multiple frames of images according to the updated scores of the multiple frames of images.
2. The image processing method according to claim 1, wherein calculating the score of the image per frame based on the pose score, the expression score, the first weight preset for the pose score, and the second weight preset for the expression score of the image per frame, comprises:
carrying out human body gesture recognition on each frame of images in a plurality of frames of images so as to output the gesture scores;
and carrying out facial expression recognition on each frame of images in the multiple frames of images so as to output the expression scores.
3. The image processing method according to claim 2, wherein said performing human body posture recognition on each of the plurality of frames of the images to output the posture score includes:
when the focusing distance of the image is larger than a preset distance, identifying a human body joint point in the image; and
And generating the gesture score according to the human body joint point.
4. The image processing method according to claim 3, wherein the human body pose recognition includes face definition recognition, face occlusion recognition, pose expansion recognition, and human body height recognition, the generating the pose score according to the human body articulation point includes:
Executing at least one of the face definition recognition, the face shielding degree recognition, the gesture stretching degree recognition and the human height recognition according to the human joint points so as to correspondingly generate at least one of the face definition, the face shielding degree, the gesture stretching degree and the human height; and
And generating the gesture score according to at least one of the face definition, the face shielding degree, the gesture stretching degree and the human height.
5. The image processing method according to claim 2, wherein performing facial expression recognition on each frame of the multiple frames of images to output the expression score comprises:
identifying face key points of the image; and
generating the expression score according to the face key points.
6. The image processing method according to claim 5, wherein the facial expression recognition includes smile recognition, open-eye recognition, and blink recognition, and generating the expression score according to the face key points comprises:
performing at least one of the smile recognition, the open-eye recognition, and the blink recognition according to the face key points to correspondingly generate at least one of a smile score, an open-eye score, and a blink score; and
generating the expression score according to at least one of the smile score, the open-eye score, and the blink score.
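For claim 6, a minimal sketch of combining the expression sub-scores is shown below; the per-item weights are assumptions, not values from the patent.

```python
# Sketch of combining the expression sub-scores in claim 6 (the weights are assumptions).
from typing import Optional

def expression_score(smile: Optional[float] = None,
                     open_eye: Optional[float] = None,
                     blink: Optional[float] = None) -> Optional[float]:
    weighted = [(s, w) for s, w in ((smile, 0.5), (open_eye, 0.3), (blink, 0.2))
                if s is not None]
    if not weighted:
        return None  # no sub-score produced: expression score stays missing
    total_weight = sum(w for _, w in weighted)
    return sum(s * w for s, w in weighted) / total_weight
```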
7. The image processing method according to claim 6, wherein the face key points comprise nose key points and mouth key points, and performing the smile recognition according to the face key points to generate the smile score comprises:
calculating a mouth corner rising degree according to the nose key points and the mouth key points;
calculating a mouth corner opening degree according to the mouth key points; and
generating the smile score according to the mouth corner rising degree and the mouth corner opening degree.
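A hedged sketch of the smile score in claim 7 from 2D key points follows; the exact geometry and the normalisation by the nose-to-mouth distance are assumptions.

```python
# Sketch of the smile score in claim 7; image coordinates with y growing downward.
import math
from typing import Tuple

Point = Tuple[float, float]

def smile_score(nose_tip: Point, mouth_left: Point, mouth_right: Point,
                mouth_top: Point, mouth_bottom: Point) -> float:
    mouth_center = ((mouth_top[0] + mouth_bottom[0]) / 2,
                    (mouth_top[1] + mouth_bottom[1]) / 2)
    face_scale = max(math.dist(nose_tip, mouth_center), 1e-6)
    corner_y = (mouth_left[1] + mouth_right[1]) / 2
    rising = max(0.0, mouth_center[1] - corner_y) / face_scale   # corners above the mouth centre
    opening = math.dist(mouth_top, mouth_bottom) / face_scale    # mouth opening degree
    return rising + opening
```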
8. The image processing method according to claim 6, wherein the face key points comprise eye key points, and performing the open-eye recognition according to the face key points to generate the open-eye score comprises:
calculating an eye opening degree of each eye according to the eye key points of each eye to generate the open-eye score.
9. The image processing method according to claim 8, wherein the eyes include a left eye and a right eye, and performing the blink recognition according to the face key points to generate the blink score comprises:
generating the blink score according to the eye opening degree of the left eye and the eye opening degree of the right eye.
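Claims 8 and 9 can be illustrated together with the sketch below; the eye-aspect-ratio style opening measure and the closed-eye threshold are assumptions, not the claimed formulas.

```python
# Sketch of claims 8 and 9: per-eye opening degree, open-eye score, and blink score.
import math
from typing import Tuple

Point = Tuple[float, float]

def eye_opening_degree(upper_lid: Point, lower_lid: Point,
                       inner_corner: Point, outer_corner: Point) -> float:
    # Vertical lid gap normalised by the horizontal eye width (assumed measure).
    return math.dist(upper_lid, lower_lid) / max(math.dist(inner_corner, outer_corner), 1e-6)

def open_eye_score(left_opening: float, right_opening: float) -> float:
    return (left_opening + right_opening) / 2

def blink_score(left_opening: float, right_opening: float,
                closed_threshold: float = 0.12) -> float:
    # Penalise frames in which either eye is (nearly) closed.
    return 0.0 if min(left_opening, right_opening) < closed_threshold else 1.0
```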
10. The image processing method according to claim 1, wherein processing the pose score of the image with the abnormality according to the pose scores of the multiple frames of images to update the pose score of the image with the abnormality comprises:
calculating the pose score of the image with the abnormality according to the pose scores of the two frames of images located before and after the image with the abnormality in the time sequence of the multiple frames of images;
and processing the expression score of the image with the abnormality according to the expression scores of the multiple frames of images to update the expression score of the image with the abnormality comprises:
calculating the expression score of the image with the abnormality according to the expression scores of the two frames of images located before and after the image with the abnormality in the time sequence of the multiple frames of images.
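A minimal sketch of claim 10 is given below: a missing score is updated from the frames immediately before and after it in the time sequence. Averaging the two neighbours (or copying the single available one) is an assumption about how the two neighbouring scores are combined.

```python
# Sketch of claim 10: repair missing (abnormal) scores from temporal neighbours.
from typing import List, Optional

def repair_scores(scores: List[Optional[float]]) -> List[Optional[float]]:
    repaired = list(scores)
    for i, s in enumerate(scores):
        if s is not None:
            continue
        prev_s = scores[i - 1] if i > 0 else None
        next_s = scores[i + 1] if i + 1 < len(scores) else None
        neighbours = [v for v in (prev_s, next_s) if v is not None]
        if neighbours:
            repaired[i] = sum(neighbours) / len(neighbours)
    return repaired
```

For example, `repair_scores([0.8, None, 0.6])` yields `[0.8, 0.7, 0.6]` under this assumed averaging rule.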
11. The image processing method according to claim 1, wherein outputting at least one target image from the multiple frames of images according to the updated scores of the multiple frames of images comprises:
sorting the multiple frames of images according to the scores of the images in descending or ascending order; and
selecting the images whose rank is within a predetermined order as the target images.
12. The image processing method according to claim 11, wherein selecting the images whose rank is within the predetermined order as the target images comprises:
acquiring the images within the predetermined order as pre-output images;
acquiring a time difference between the shooting time of each pre-output image and the shooting time of an intermediate frame image, wherein the intermediate frame image is the pre-output image located at the middle of the time sequence among the pre-output images;
deleting the pre-output images for which the time difference is greater than a predetermined threshold; and
outputting the pre-output image with the highest score as the target image.
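The selection pipeline of claims 11 and 12 can be sketched as below; the top-N count, the gap threshold, and the Frame fields are assumptions rather than values from the patent.

```python
# Sketch of claims 11-12: rank the burst, keep the top-N as pre-output images,
# drop those shot too far from the temporally middle one, then output the best.
from dataclasses import dataclass
from typing import List

@dataclass
class Frame:
    score: float       # updated combined score
    timestamp: float   # shooting time in seconds

def select_target(frames: List[Frame], top_n: int = 5, max_gap_s: float = 1.0) -> Frame:
    # Sort by score and keep the top-N frames as pre-output images.
    pre_output = sorted(frames, key=lambda f: f.score, reverse=True)[:top_n]
    # The intermediate frame image is the pre-output image in the middle of the time sequence.
    by_time = sorted(pre_output, key=lambda f: f.timestamp)
    middle = by_time[len(by_time) // 2]
    # Delete pre-output images shot too far from the intermediate frame, then output the best.
    kept = [f for f in pre_output if abs(f.timestamp - middle.timestamp) <= max_gap_s]
    return max(kept, key=lambda f: f.score)
```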
13. An image processing apparatus for continuous shooting, comprising:
a continuous shooting module for performing continuous shooting to generate multiple frames of images;
a recognition module for calculating the score of each frame of the image according to the pose score and the expression score of each frame of the image, a first weight preset for the pose score, and a second weight preset for the expression score;
an updating module for determining that the score of the image is abnormal when the score of the image lacks the pose score and/or the expression score, processing the pose score of the image with the abnormality according to the pose scores of the multiple frames of images to update the pose score of the image with the abnormality, and/or processing the expression score of the image with the abnormality according to the expression scores of the multiple frames of images to update the expression score of the image with the abnormality; and
an output module for outputting at least one target image from the multiple frames of images according to the updated scores of the multiple frames of images.
14. A terminal, the terminal comprising:
one or more processors, memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the image processing method of any one of claims 1 to 12.
15. A non-transitory computer-readable storage medium containing a computer program which, when executed by one or more processors, causes the processors to implement the image processing method of any one of claims 1 to 12.
CN202110600413.3A 2021-05-31 2021-05-31 Image processing method and device, terminal and readable storage medium Active CN113326775B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110600413.3A CN113326775B (en) 2021-05-31 2021-05-31 Image processing method and device, terminal and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110600413.3A CN113326775B (en) 2021-05-31 2021-05-31 Image processing method and device, terminal and readable storage medium

Publications (2)

Publication Number Publication Date
CN113326775A CN113326775A (en) 2021-08-31
CN113326775B true CN113326775B (en) 2023-12-29

Family

ID=77422597

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110600413.3A Active CN113326775B (en) 2021-05-31 2021-05-31 Image processing method and device, terminal and readable storage medium

Country Status (1)

Country Link
CN (1) CN113326775B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114495241A (en) * 2022-02-16 2022-05-13 平安科技(深圳)有限公司 Image identification method and device, electronic equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107124901A (en) * 2014-08-02 2017-09-01 加坦公司 The method for removing electron micrograph machine image abnormity point
CN109978884A (en) * 2019-04-30 2019-07-05 恒睿(重庆)人工智能技术研究院有限公司 More people's image methods of marking, system, equipment and medium based on human face analysis
CN110287877A (en) * 2019-06-25 2019-09-27 腾讯科技(深圳)有限公司 The processing method and processing device of video object
CN110765952A (en) * 2019-10-24 2020-02-07 上海眼控科技股份有限公司 Vehicle illegal video processing method and device and computer equipment
CN111199165A (en) * 2018-10-31 2020-05-26 浙江宇视科技有限公司 Image processing method and device
CN111241927A (en) * 2019-12-30 2020-06-05 新大陆数字技术股份有限公司 Cascading type face image optimization method, system and equipment and readable storage medium
CN111259857A (en) * 2020-02-13 2020-06-09 星宏集群有限公司 Human face smile scoring method and human face emotion classification method
CN112733575A (en) * 2019-10-14 2021-04-30 北京字节跳动网络技术有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN112771612A (en) * 2019-09-06 2021-05-07 华为技术有限公司 Method and device for shooting image

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107124901A (en) * 2014-08-02 2017-09-01 加坦公司 The method for removing electron micrograph machine image abnormity point
CN111199165A (en) * 2018-10-31 2020-05-26 浙江宇视科技有限公司 Image processing method and device
CN109978884A (en) * 2019-04-30 2019-07-05 恒睿(重庆)人工智能技术研究院有限公司 More people's image methods of marking, system, equipment and medium based on human face analysis
CN110287877A (en) * 2019-06-25 2019-09-27 腾讯科技(深圳)有限公司 The processing method and processing device of video object
CN112771612A (en) * 2019-09-06 2021-05-07 华为技术有限公司 Method and device for shooting image
CN112733575A (en) * 2019-10-14 2021-04-30 北京字节跳动网络技术有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN110765952A (en) * 2019-10-24 2020-02-07 上海眼控科技股份有限公司 Vehicle illegal video processing method and device and computer equipment
CN111241927A (en) * 2019-12-30 2020-06-05 新大陆数字技术股份有限公司 Cascading type face image optimization method, system and equipment and readable storage medium
CN111259857A (en) * 2020-02-13 2020-06-09 星宏集群有限公司 Human face smile scoring method and human face emotion classification method

Also Published As

Publication number Publication date
CN113326775A (en) 2021-08-31

Similar Documents

Publication Publication Date Title
WO2019128508A1 (en) Method and apparatus for processing image, storage medium, and electronic device
US20190347826A1 (en) Method and apparatus for pose processing
CN109815826B (en) Method and device for generating face attribute model
CN110544301A (en) Three-dimensional human body action reconstruction system, method and action training system
KR101303877B1 (en) Method and apparatus for serving prefer color conversion of skin color applying face detection and skin area detection
US20220383653A1 (en) Image processing apparatus, image processing method, and non-transitory computer readable medium storing image processing program
US9679383B2 (en) Display control apparatus displaying image
CN113239220A (en) Image recommendation method and device, terminal and readable storage medium
CN110544302A (en) Human body action reconstruction system and method based on multi-view vision and action training system
CN111382648A (en) Method, device and equipment for detecting dynamic facial expression and storage medium
JP2019191981A (en) Behavior recognition device, model construction device, and program
CN111723687A (en) Human body action recognition method and device based on neural network
CN113856186B (en) Pull-up action judging and counting method, system and device
Zhang et al. Automatic calibration of the fisheye camera for egocentric 3d human pose estimation from a single image
CN106034203A (en) Image processing method and apparatus for shooting terminal
CN113326775B (en) Image processing method and device, terminal and readable storage medium
WO2023155533A1 (en) Image driving method and apparatus, device and medium
JP5648452B2 (en) Image processing program and image processing apparatus
JP5503510B2 (en) Posture estimation apparatus and posture estimation program
CN110326287A (en) Image pickup method and device
TW202242797A (en) Device for detecting human body direction and method for detecting human body direction
CN113313009A (en) Method, device and terminal for continuously shooting output image and readable storage medium
WO2012153868A1 (en) Information processing device, information processing method and information processing program
CN116343325A (en) Intelligent auxiliary system for household body building
JP2011232845A (en) Feature point extracting device and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant