CN113239220A - Image recommendation method and device, terminal and readable storage medium - Google Patents


Info

Publication number
CN113239220A
Authority
CN
China
Prior art keywords
image
face
human body
detection
degree
Legal status
Pending
Application number
CN202110577167.4A
Other languages
Chinese (zh)
Inventor
苏展
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53 Querying
    • G06F16/535 Filtering based on additional data, e.g. user or group profiles
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata automatically derived from the content
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/757 Matching configurations of points or features
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/165 Detection; Localisation; Normalisation using facial parts and geometric relationships
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G06V40/172 Classification, e.g. identification
    • G06V40/174 Facial expression recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image recommendation method, an image recommendation device, a terminal and a computer-readable storage medium. The image recommendation method comprises the following steps: acquiring a preset image, wherein the preset image comprises a preset face; performing continuous shooting to obtain multiple frames of shot images; performing joint point detection on each frame of shot image to obtain the human body joint points of each frame of shot image; acquiring the human body information of each frame of shot image according to the preset face and the human body joint points of each frame of shot image; acquiring a recommended value of each frame of shot image according to the human body information; and sorting the multiple frames of shot images according to the recommended values, and recommending shot images according to the sorting result. Because the preset image comprising the preset face is obtained before the human body information is obtained, and the preset face is the face of the target person that the user wants to shoot, the image in which the target person has the best posture and/or expression can be recommended to the user according to the preset face, so that the image recommended to the user meets the user's expectations.

Description

Image recommendation method and device, terminal and readable storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image recommendation method, an image recommendation apparatus, a terminal, and a non-volatile computer-readable storage medium.
Background
At present, mobile phone manufacturers implement the camera continuous shooting (burst) function simply by storing each continuously shot frame into the album. However, the user often uses the burst function only to obtain one desired image. Because the continuous shooting function usually collects many images, the redundant images increase the time the user spends searching for the desired image, and how to automatically recommend an image that meets the user's expectation has become an urgent problem to be solved in this field.
Disclosure of Invention
The embodiment of the application provides an image recommendation method, an image recommendation device, a terminal and a non-volatile computer readable storage medium.
The image recommendation method according to the embodiment of the application comprises the following steps: acquiring a preset image, wherein the preset image comprises a preset face; executing continuous shooting to obtain a plurality of frames of shot images; carrying out joint point detection on each frame of the shot image to obtain human body joint points of each frame of the shot image; acquiring human body information of each frame of shot image according to the preset human face and the human body joint point of each frame of shot image; acquiring a recommended value of each frame of the shot image according to the human body information; and sequencing the plurality of frames of shot images according to the recommended value, and recommending the shot images according to a sequencing result.
The image recommendation device comprises an acquisition module, a continuous shooting module, a detection module and a recommendation module. The acquisition module is used for acquiring a preset image, acquiring multiple frames of shot images, acquiring human body joint points of each frame of shot images, acquiring human body information of each frame of shot images and acquiring recommended values of each frame of shot images. The continuous shooting module is used for executing continuous shooting. The detection module is used for detecting the joint points of each frame of the shot image. The recommending module is used for sequencing the plurality of frames of the shot images according to the recommended values and recommending the shot images according to sequencing results.
The terminal of embodiments of the present application includes one or more processors, memory, and one or more programs. Wherein the one or more programs are stored in the memory and executed by the one or more processors, the programs including instructions for performing the image recommendation method of embodiments of the present application. The image recommendation method comprises the following steps: acquiring a preset image, wherein the preset image comprises a preset face; executing continuous shooting to obtain a plurality of frames of shot images; carrying out joint point detection on each frame of the shot image to obtain human body joint points of each frame of the shot image; acquiring human body information of each frame of shot image according to the preset human face and the human body joint point of each frame of shot image; acquiring a recommended value of each frame of the shot image according to the human body information; and sequencing the plurality of frames of shot images according to the recommended value, and recommending the shot images according to a sequencing result.
A non-transitory computer-readable storage medium containing a computer program according to an embodiment of the present application, which when executed by one or more processors, causes the processors to implement an image recommendation method according to an embodiment of the present application. The image recommendation method comprises the following steps: acquiring a preset image, wherein the preset image comprises a preset face; executing continuous shooting to obtain a plurality of frames of shot images; carrying out joint point detection on each frame of the shot image to obtain human body joint points of each frame of the shot image; acquiring human body information of each frame of shot image according to the preset human face and the human body joint point of each frame of shot image; acquiring a recommended value of each frame of the shot image according to the human body information; and sequencing the plurality of frames of shot images according to the recommended value, and recommending the shot images according to a sequencing result.
In the image recommendation method, the image recommendation device, the terminal and the non-volatile computer-readable storage medium according to the embodiments of the application, the preset image is acquired before the human body information is acquired, the preset image includes the preset face, and the preset face is the face of the target person that the user wants to shoot. Therefore, when the terminal automatically recommends shot images, the shot image in which the target person has the best posture and/or expression is recommended to the user according to the preset face, which ensures that the image recommended to the user meets the user's expectations regarding both the person in the image and that person's posture and/or expression.
Additional aspects and advantages of embodiments of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The above and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 2 is a schematic diagram of an image recommendation device according to some embodiments of the present application;
FIG. 3 is a schematic block diagram of a terminal according to some embodiments of the present application;
FIG. 4 is a schematic diagram of a scenario for acquiring joint points of a human body according to some embodiments of the present application;
FIG. 5 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 6 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 7 is a schematic diagram of a scenario for acquiring human body information according to some embodiments of the present application;
FIG. 8 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 9 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 10 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 11 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 12 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 13 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 14 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 15 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 16 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 17 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 18 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 19 is a scene diagram illustrating expression detection in an image recommendation method according to some embodiments of the present application;
FIG. 20 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 21 is a scene diagram illustrating expression detection in an image recommendation method according to some embodiments of the present application;
FIG. 22 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 23 is a schematic flow chart diagram of an image recommendation method according to some embodiments of the present application;
FIG. 24 is a schematic diagram of a connection between a computer-readable storage medium and a processor according to some embodiments of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below by referring to the drawings are exemplary only for the purpose of explaining the embodiments of the present application, and are not to be construed as limiting the embodiments of the present application.
Referring to fig. 1 to 3, an image recommendation method according to an embodiment of the present application includes the following steps (a minimal code sketch of the overall flow is given after the list):
01: acquiring a preset image, wherein the preset image comprises a preset face;
02: executing continuous shooting to obtain a plurality of frames of shot images;
03: performing joint point detection on each frame of shot image to obtain human body joint points of each frame of shot image;
04: acquiring human body information of each frame of shot image according to a preset human face and human body joint points of each frame of shot image;
05: acquiring a recommended value of each frame of shot image according to the human body information; and
06: and sequencing the multiple frames of shot images according to the recommended values, and recommending the shot images according to the sequencing result.
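As an illustration only, the following minimal Python sketch shows one way the flow of steps 01 to 06 could be wired together. The helpers capture_burst, extract_face_features, detect_joints, extract_body_info and score_body_info are hypothetical stand-ins for the camera control, face recognition, joint detection and scoring components described in this application, not part of the original disclosure.

```python
# Illustrative sketch of steps 01-06; all helpers are hypothetical
# stand-ins for the camera, face-recognition, joint-detection and
# scoring components described in this application.

def recommend_images(preset_image, top_k=1):
    preset_face = extract_face_features(preset_image)              # step 01: preset face
    frames = capture_burst()                                       # step 02: N shot images
    scored = []
    for frame in frames:
        joints = detect_joints(frame)                              # step 03: human body joints
        body_info = extract_body_info(frame, joints, preset_face)  # step 04: human body info
        if body_info is None:                                      # no target person found
            continue
        value = score_body_info(body_info)                         # step 05: recommended value
        scored.append((value, frame))
    scored.sort(key=lambda pair: pair[0], reverse=True)            # step 06: sort by value
    return [frame for _, frame in scored[:top_k]]                  # recommend the best frame(s)
```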
The image recommendation device 10 according to the embodiment of the present application includes an acquisition module 12, a continuous shooting module 14, a detection module 16, and a recommendation module 18. The obtaining module 12 may be configured to implement steps 01, 04, and 05, the obtaining module 12 and the continuous shooting module 14 may be configured to implement step 02, the obtaining module 12 and the detecting module 16 may be configured to implement step 03, and the recommending module 18 may be configured to implement step 06. That is, the obtaining module 12 may be configured to obtain a preset image, obtain multiple frames of shot images, obtain a human body joint point of each frame of shot images, obtain human body information of each frame of shot images, and obtain a recommended value of each frame of shot images. The continuous shooting module 14 may be used to perform continuous shooting. The detection module 16 may be configured to perform joint detection on each frame of the captured image. The recommending module 18 is configured to sort the multiple frames of shot images according to the recommended values, and recommend the shot images according to the sorting result.
The terminal 100 of the present embodiment includes one or more processors 20, a memory 30, and one or more programs, wherein the one or more programs are stored in the memory 30 and executed by the one or more processors 20, and the programs include instructions for performing the image recommendation method of the embodiments of the present application. That is, when the processor 20 executes the programs, the processor 20 may implement steps 01 to 06. That is, the processor 20 may be configured to: acquire a preset image, wherein the preset image comprises a preset face; perform continuous shooting to obtain multiple frames of shot images; perform joint point detection on each frame of shot image to obtain the human body joint points of each frame of shot image; acquire the human body information of each frame of shot image according to the preset face and the human body joint points of each frame of shot image; acquire a recommended value of each frame of shot image according to the human body information; and sort the multiple frames of shot images according to the recommended values and recommend shot images according to the sorting result.
In the image recommendation method, the image recommendation apparatus 10, and the terminal 100 according to the embodiments of the present application, shot images are recommended to the user according to the ranking of the recommended values of each frame of shot image. The recommended value of each frame of shot image is acquired according to the human body information, so that images are recommended to the user based on the human body information in the shot images rather than based on information such as scene information, object information, animal information or plant information, and the state of the people in the image is emphasized when recommending images to the user. Further, the human body information of each frame of shot image is obtained according to the human body joint points of that frame and the preset face in the preset image, so as to determine whether the obtained human body information is related to the preset face. In this way, when the preset face is the face of the target person that the user cares about, the recommendation places more emphasis on the state of that target person in the shot images, so that the shot image automatically recommended to the user by the image recommendation device 10 and the terminal 100 is closer to the image containing the target person that the user would likely select.
Specifically, the terminal 100 may be a mobile phone, a tablet computer, a display device, a notebook computer, a teller machine, a gate, a smart watch, a head-up display device, a game machine, or the like. As shown in fig. 3, the embodiments of the present application are described by taking the terminal 100 being a mobile phone as an example, and it is understood that the specific form of the terminal 100 is not limited to a mobile phone.
Please refer to fig. 3, the terminal 100 may further include a camera 40, and the camera 40 may be a front camera or a rear camera of the terminal 100. The processor 20 may be connected to the camera 40 and control the camera 40 to perform continuous shooting. Performing continuous shooting to obtain multiple frames of shot images may specifically refer to the processor 20 obtaining multiple frames of images that the camera 40 continuously captures of a scene, or to the processor 20 extracting multiple frames of images from a video after the camera 40 records the video at a preset frame rate. For example, the continuous shooting may generate 5, 10, 15, 20, 25, 30 or more frames of images; similarly, the multiple frames extracted from the video may be 5, 10, 15, 20, 25, 30 or more frames. The number of acquired frames may be a fixed value, may be customized by the user, or may be determined according to the duration of the continuous shooting. The embodiments of the present application denote the multiple frames as N frames.
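As an illustration of the second option (extracting multiple frames from a recorded video), the following sketch samples N frames from a clip with OpenCV; the function name and the even-sampling strategy are assumptions made only for this example.

```python
import cv2

def extract_burst_frames(video_path, n_frames=10):
    """Sample n_frames roughly evenly from a short video clip, as one
    possible realisation of extracting multiple frames from a video
    recorded at a preset frame rate."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    step = max(total // max(n_frames, 1), 1)
    frames = []
    for index in range(0, total, step):
        cap.set(cv2.CAP_PROP_POS_FRAMES, index)
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
        if len(frames) == n_frames:
            break
    cap.release()
    return frames
```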
Referring to fig. 4, after the multiple frames of shot images are acquired, joint point detection may be performed on each frame to acquire the human body joint points of each frame, where a human body joint point may be expressed as one or more coordinate points on a human body in the shot image. Specifically, the human body joint points of each of the first to Nth frames may be identified through a human body posture estimation algorithm. As shown in the right diagram of fig. 4, the human body posture estimation algorithm fits the posture of the human body through the human body joint points d0 and the connecting lines between them, so as to determine the position of the human body in the image. The human body joint points may include joint points related to human body motion, such as limb joint points corresponding to limb parts such as the shoulders, elbows, hands, arms, knees and feet, for example the joint points d0 in the region Z2 illustrated in fig. 4; motion postures of the human body, such as walking, sitting, standing, running and lying, may be determined according to the limb joint points. The human body joint points may further include facial joint points, such as joint points corresponding to facial features such as the eyes, ears and nose, for example the joint points d0 in the region Z1 illustrated in fig. 4; factors related to the facial posture, such as the orientation of the face in the image (e.g., facing the lens or facing away from the lens), the degree of occlusion of the face, and the positions of the facial features, can be determined from the facial joint points.
When only a single person exists in the shot image, the human body joint points of that single person can be identified. When there are multiple persons, for example two, three, four, five or more persons, in the shot image, the human body joint points of each person in the shot image can be identified. The human body posture estimation algorithm may include, but is not limited to, algorithms such as PifPaf (Part Intensity Fields and Part Association Fields), PoseNet, and YOLOv4 (You Only Look Once, version 4), which are not listed one by one here.
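The later steps only need, for every person detected in a frame, the coordinates and confidence of each joint point. A minimal sketch of such a data structure follows; the COCO-style joint names and the detect_joints wrapper are assumptions, since any of the pose estimation algorithms mentioned above could supply the same information.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# COCO-style keypoint names cover the facial joints (nose, eyes, ears)
# and the limb joints (shoulders, elbows, wrists, hips, knees, ankles)
# referred to in the text.
JOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

@dataclass
class Person:
    # joint name -> (x, y, confidence); the confidence is used later
    # for the occlusion-degree estimate.
    joints: Dict[str, Tuple[float, float, float]]

def detect_joints(image) -> List[Person]:
    """Hypothetical wrapper around a pose-estimation model such as
    PifPaf, PoseNet or a YOLOv4-based keypoint detector; it is assumed
    to return one Person per human found in the image."""
    raise NotImplementedError
```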
After the human body joint points in the shot image are acquired, the human body information can be acquired according to the human body joint points. For example, whether the posture of the human body is relatively stretched, whether the human body is blocked, and whether the human body is clear can be detected according to the human body joint points, so that posture detection results corresponding to these detections can be obtained, and at least one item of the posture detection results is used as the human body information. For another example, expressions of the human face, such as smiling, eyes open or blinking, may be detected according to the human body joint points, expression detection results corresponding to these detections may be obtained, and at least one of the expression detection results may be used as the human body information. For still another example, both posture detection and expression detection may be performed on the shot image according to the human body joint points, so as to use at least one of the posture detection results and/or the expression detection results as the human body information.
The human body information may include scores of posture detection and/or expression detection in the shot images obtained by continuous shooting, and the recommended value of each shot image may be obtained according to the score calculated by performing posture detection and/or expression detection on that shot image. Generally, when a user selects a desired image, the user often judges with the naked eye whether the expression of the face in the image is good in order to pick the desired image. In the image recommendation method according to the embodiments of the application, the recommended value of the shot image can be obtained according to the human body information, so that the shot image with the best posture and/or expression can be automatically recommended to the user; for example, after the multiple frames of shot images are sorted according to the recommended values, one or more frames with high recommended values are recommended to the user, thereby saving the time the user spends searching for and selecting the desired image.
In addition, the persons in the shot image may not all be target persons that the user wants to shoot, and may include non-target persons that the user does not want to shoot, or non-target persons whose expression or posture the user does not care about, for example strangers the user does not know, or persons the user knows but does not care about. When the recommended value is obtained based on the human body information, the recommended value obtained from the human body information corresponding to a non-target person may be higher, so that the person with the better posture and/or expression in the shot image recommended to the user is not the target person that the user wants to shoot, which does not meet the user's expectation. For example, in shot images obtained by a user's self-timer continuous shooting, the user's face in a certain frame may be blocked, but that frame is recommended to the user because a stranger who entered the frame is smiling brightly; the recommended frame does not meet the user's expectation, because the user does not care about the stranger's expression, and the user's own face in the recommended frame is blocked.
Referring to fig. 5, in the image recommendation method of the present application, a preset image is obtained before obtaining human body information, where the preset image includes a preset face, and the preset face is a face of a target person that a user wants to photograph, so as to obtain human body information of each frame of photographed image according to the preset face and a human body joint of each frame of photographed image, that is, in a stage of obtaining the human body information, the preset face is used as one of bases for obtaining the human body information, for example, only the human body information of the target person corresponding to the preset face is obtained, or a recommendation value weight of the human body information of the target person corresponding to the preset face is increased. In this way, when the terminal 100 automatically recommends the captured image, the captured image with the best posture and/or expression of the target person can be recommended to the user according to the preset face, so as to ensure that the image recommended to the user meets the expected requirements of the user on the person in the image and the posture and/or expression of the person.
In one embodiment, the preset image may be obtained locally from the terminal 100, for example, one or more images containing the target person are selected from an album of the terminal 100 as the preset image. In another embodiment, the preset image may be obtained from the cloud, for example, downloaded through a cloud network. In another embodiment, the preset image may be obtained from other terminals of the same type or different types interacting with the terminal 100, for example, downloaded from other terminals through Bluetooth transmission, local area network transmission, hotspot transmission, and the like. The preset images acquired in these three ways can be stored in a pre-storage area, the user can change, replace, delete or add preset images in the pre-storage area at any time, and the preset images in the pre-storage area can be classified; for example, folders such as "family", "colleague" and "classmate" are established in the pre-storage area, and the corresponding pre-stored images are added to the folders of the different classifications. When acquiring the human body information, the processor 20 automatically calls a pre-stored preset image from the pre-storage area for acquiring the human body information. When the pre-storage area includes a plurality of categories, the user may be prompted to select a category, or the preset images in a default category or the most recently used category may be automatically invoked. Further, in some embodiments, in order to save storage space, after the preset image is obtained, the face in the preset image is identified and the preset image is cropped; for example, after the face is selected with a graphic frame such as a rectangular, circular or oval frame, the pixels outside the graphic frame are cropped, so that the preset image in the pre-storage area is a face image without background, thereby saving storage space.
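As a small illustration of cropping the preset image down to the face region before pre-storing it, the sketch below uses an off-the-shelf OpenCV Haar-cascade face detector; the choice of detector and the rectangular crop are assumptions for the example, not part of the original disclosure.

```python
import cv2

def crop_preset_face(preset_image):
    """Detect the face in a preset image and crop away the background,
    as one way to shrink what is kept in the pre-storage area."""
    gray = cv2.cvtColor(preset_image, cv2.COLOR_BGR2GRAY)
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return preset_image              # nothing detected: keep the original
    x, y, w, h = faces[0]                # crop to the first detected face box
    return preset_image[y:y + h, x:x + w].copy()
```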
In still another embodiment, at least one reference image may be shot of the target person(s) before the continuous shooting is performed, and the at least one reference image may be used as the preset image. For example, after entering the continuous shooting mode, the display interface of the terminal 100 shows a "shoot target person" prompt, and the user can shoot reference images of one or more target persons. For example, if the target persons include three persons, person A, person B and person C, then after the user clicks the "shoot target person" prompt box, the user can shoot three reference images P1 through the camera 40, where each reference image P1 includes only person A, person B or person C respectively, and the three reference images P1 can be used as the preset images. Alternatively, the camera 40 may shoot reference images P2 whose number is smaller than the number of target persons, for example one or two reference images P2, where at least one reference image P2 includes at least two target persons, and the reference images P2 can be used as the preset images. More specifically, if two reference images P2 are shot by the camera 40, one of which includes only person A and person B while the other includes only person C, the two reference images P2 can be used as the preset images. After the preset images are acquired, the user performs continuous shooting according to the prompt of the display interface, for example, clicking "finish" to switch from the single-frame shooting interface to the continuous shooting interface and then performing continuous shooting. In this way, the preset image does not need to be pre-stored, which saves storage space. The preset image can be deleted after the human body information is acquired, to save storage space; or the preset image can be saved, and when the continuous shooting mode is entered next time, whether to continue using the last saved preset image is determined according to the user's selection, and the user is prompted to shoot a preset image again when the last saved one is not used.
After the preset image is obtained, face recognition can be performed on the preset image to obtain a preset face. After the preset face is obtained, the human body information of the shot image can be obtained according to the preset face and the human body joint points in the preset image. Specifically, referring to fig. 6, in some embodiments, 04: the method for acquiring the human body information of each frame of shot image according to the preset human face and the human body joint point of each frame of shot image comprises the following steps:
041: acquiring the face features in the shot images according to the face joint points of each frame of shot images;
042: matching a preset face and the face features in each frame of shot image;
043: selecting a shot image with successfully matched human face features as an image to be recommended; and
044: and acquiring human body information according to human body joint points in the image to be recommended.
Referring to FIG. 2, in some embodiments, the obtaining module 12 may also be used to implement step 041, step 042, step 043 and step 044. That is, the obtaining module 12 may be configured to obtain the face features in the captured image according to the face joint points of each frame of captured image, match the preset face with the face features in each frame of captured image, select the captured image with successfully matched face features as the image to be recommended, and obtain the body information according to the body joint points in the image to be recommended.
Referring to fig. 3, in some embodiments, the processor 20 may further be configured to: acquiring the face features in the shot images according to the face joint points of each frame of shot images; matching a preset face and the face features in each frame of shot image; selecting a shot image with successfully matched human face features as an image to be recommended; and acquiring human body information according to the human body joint points in the image to be recommended. That is, processor 20 may also be configured to implement step 041, step 042, step 043 and step 044.
Facial joint points may include joint points of the facial features, such as eye joint points, nose joint points, ear joint points, mouth joint points, and the like. From the facial joint points of each frame of shot image, the face region of each frame, for example the region Z1 shown in fig. 4, may be determined, so as to extract the face features in the face region through a face detection algorithm. When a shot image includes multiple face regions, face features are extracted for each face region. The extracted face features are used for matching with the preset face, so as to judge whether the person in the shot image is the target person in the preset image.
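A face region can be approximated directly from the facial joint points before a face detection algorithm is applied to it. The sketch below derives a rectangular region from the nose, eye and ear joint points; the margin factor used to enlarge the box is an illustrative assumption.

```python
def face_region_from_joints(joints, margin=0.3):
    """Rectangular face region (x0, y0, x1, y1) around the facial joint
    points; the margin that enlarges the box to roughly cover forehead
    and chin is an illustrative assumption."""
    names = ["nose", "left_eye", "right_eye", "left_ear", "right_ear"]
    points = [joints[n][:2] for n in names if n in joints]
    if not points:
        return None                        # no facial joint points available
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    pad_x = margin * (max(xs) - min(xs))
    pad_y = margin * (max(ys) - min(ys))
    return (int(min(xs) - pad_x), int(min(ys) - pad_y),
            int(max(xs) + pad_x), int(max(ys) + pad_y))
```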
When the human body joint point cannot be detected in a certain frame of shot image or the detected human body joint point does not include a face joint point, the human face feature of the frame of shot image is not continuously obtained, and the human body information of the frame of shot image is not continuously obtained, so that the calculation amount is reduced, and the image recommendation efficiency is improved.
When the face features in a certain frame of shot image are successfully matched, namely when the certain frame of shot image comprises at least one target person, the frame of shot image can be selected as an image to be recommended, and the human body information can be further obtained according to the human body joint points corresponding to the successfully matched face features in the frame of shot image, so that the recommended value of the frame of image to be recommended can be obtained according to the human body information of the target person. And the human body joint points corresponding to the unsuccessfully matched human face features in the frame of shot image are not used for acquiring human body information, namely the influence of the human body joint points of the non-target person on the recommended value is zero, the human body joint points of the non-target person can be directly deleted to save storage space, and the human body information of the non-target person is not required to be acquired subsequently to save calculation amount and improve image recommendation efficiency. Therefore, when the shot image is recommended according to the recommended value, only the human body information of the target person influences the recommended value, and the human body information of the non-target person cannot be acquired, so that the recommended value cannot be influenced, and the situation that the human body information of the non-target person influences the calculation of the recommended value, and the shot image recommended to the user according to the recommended value is an image with the best posture and/or expression of the non-target person is avoided.
Referring to fig. 7, for example, the shot images include four frames: an image P1, an image P2, an image P3, and an image P4. The face feature A can be detected in the image P1, the face features A and B can be detected in the image P2, the face feature B can be detected in the image P3, and no face joint point or face feature can be detected in the image P4. The preset image P0 includes the preset face A of the target person A, and the face feature A matches the preset face A. Since the successfully matched face feature A exists in both the image P1 and the image P2, the image P1 and the image P2 are selected as images to be recommended. However, there is no successfully matched face feature A in the images P3 and P4, so the images P3 and P4 are not images to be recommended, their human body information is not obtained, and the images P3 and P4 can be deleted to save storage space.
In the image P1, the human body information of the target person A may be acquired from the human body joint points of the target person A, so as to acquire the recommended value of the image P1 from the human body information of the target person A. In the image P2, the human body information of the target person A can likewise be obtained from the human body joint points of the target person A; since the matching of the face feature B with the preset face is unsuccessful, no human body information is obtained from the human body joint points corresponding to the face feature B, so as to ensure that the recommended value of the image P2 is obtained only from the human body information of the target person A.
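A hedged sketch of steps 041 to 043 is given below; it keeps only frames that contain at least one face matching the preset face and discards the joint points of unmatched persons, reproducing the behaviour of the P1 to P4 example. extract_face_features, face_similarity and the similarity threshold are assumptions, and face_region_from_joints refers to the earlier sketch.

```python
def select_images_to_recommend(frames_with_persons, preset_face,
                               similarity_threshold=0.6):
    """Steps 041-043: keep frames containing at least one face that
    matches the preset face, together with the joints of the matched
    (target) persons only. extract_face_features, face_similarity and
    the threshold value are illustrative assumptions."""
    to_recommend = []
    for frame, persons in frames_with_persons:
        matched = []
        for person in persons:
            region = face_region_from_joints(person.joints)
            if region is None:                 # no facial joint points detected
                continue
            features = extract_face_features(frame, region)
            if face_similarity(features, preset_face) >= similarity_threshold:
                matched.append(person)         # target person (e.g. face feature A)
        if matched:
            to_recommend.append((frame, matched))
        # frames without a match (e.g. P3 and P4) are dropped entirely
    return to_recommend
```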
In combination with the foregoing, the human body information may include posture information and/or expression information obtained by performing posture detection and/or expression detection on the shot image according to the human body joint points of the target person. Therefore, the recommended value acquired based on the human body information can be used to evaluate whether the posture and/or expression of the target person is optimal. After the image to be recommended is selected, the human body information can be obtained according to the human body joint points in the image to be recommended. Please refer to fig. 8 and fig. 9, step 044: acquiring the human body information according to the human body joint points in the image to be recommended, includes the following steps:
0441: executing posture detection according to human body joint points in the image to be recommended to acquire posture information; and/or
0442: and executing expression detection according to the human body joint points in the image to be recommended so as to acquire expression information.
Referring to fig. 2, in some embodiments, the obtaining module 12 may further be configured to implement steps 0441 and 0442. That is, the obtaining module 12 may be further configured to perform posture detection according to the human body joint points in the image to be recommended, so as to obtain posture information; and/or performing expression detection according to human body joint points in the image to be recommended so as to acquire expression information.
Referring to fig. 3, in some embodiments, the processor 20 may further be configured to: executing posture detection according to human body joint points in the image to be recommended to acquire posture information; and/or performing expression detection according to human body joint points in the image to be recommended so as to acquire expression information. That is, processor 20 may also be used to implement steps 0441 and 0442.
The human body joint points used for acquiring the human body information in the image to be recommended are human body joint points corresponding to successfully matched human face features, the human body joint points corresponding to unsuccessfully matched human face features are not used for acquiring the human body information, and the human body joint points corresponding to unsuccessfully matched human face features can be deleted after the matching step is finished and cannot be used for acquiring the human body information.
Specifically, a preset human body posture detection model can be used to detect the human body posture of each frame of image to be recommended according to the human body joint points in the image to be recommended, and to generate posture detection data of each frame of image to be recommended, that is, to obtain the posture information, so that an image with a graceful posture can be selected quickly. A preset facial expression detection model can likewise be used to perform facial expression detection on each frame of image to be recommended according to the human body joint points in the image to be recommended, and to generate facial expression detection data of each frame of image to be recommended, that is, to obtain the expression information, so that an image with a better expression can be selected quickly. For example, the posture information includes a human body posture score; after human body posture detection is performed on each frame of image to be recommended, the human body posture score of each frame can be calculated according to the human body posture detection result, and the image with the better posture is then selected according to the magnitude of the human body posture score. For another example, the expression information includes a facial expression score; after facial expression detection is performed on each frame of image to be recommended, the facial expression score of each frame can be calculated according to the result of the facial expression detection, and the image with the better expression is then selected according to the magnitude of the facial expression score.
Step 0441 and step 0442 can be performed simultaneously, so that the time consumed by performing human posture detection and human facial expression detection on each frame of image to be recommended is reduced, and the working efficiency is improved. Of course, in other embodiments, step 0441 and step 0442 may be performed sequentially, without limitation, for example, step 0441 is performed first and then step 0442 is performed, or step 0442 is performed first and then step 0441 is performed.
Further, referring to fig. 10, in some embodiments, the human body posture detection includes at least one of a human face definition detection, a human face occlusion detection, a posture extension detection and a human body height detection, and the posture information includes a human face definition, a human face occlusion degree, a posture extension degree and a human body height, 0441: executing posture detection according to human body joint points in an image to be recommended to acquire posture information, wherein the posture detection comprises the following steps:
4411: and performing at least one of face definition detection, face shielding degree detection, posture stretching degree detection and human body height detection according to the human body joint points to correspondingly acquire at least one of face definition, face shielding degree, posture stretching degree and human body height.
Referring to fig. 2, in some embodiments, the obtaining module 12 may be further configured to implement step 4411. That is, the obtaining module 12 may be further configured to perform at least one of human face sharpness detection, human face occlusion degree detection, posture extension degree detection, and human body height detection according to the human body joint point, so as to correspondingly obtain at least one of human face sharpness, human face occlusion degree, posture extension degree, and human body height.
Referring to fig. 3, in some embodiments, the processor 20 may further be configured to: and performing at least one of face definition detection, face shielding degree detection, posture stretching degree detection and human body height detection according to the human body joint points to correspondingly acquire at least one of face definition, face shielding degree, posture stretching degree and human body height. That is, processor 20 may also be used to implement step 4411.
Specifically, the human body joint points may include, but are not limited to, the nose, eyes, ears, shoulders, elbows, hands, hips, knees, feet, and the like, and joint points may be selectively added or removed according to the user's needs, thereby increasing or reducing the number of joint points. As described above, the human body joint points corresponding to the target person are selected when the image to be recommended is selected, so that the human body joint points used in step 0441 for face sharpness detection, face occlusion degree detection, pose stretching degree detection, and human body height detection are the human body joint points corresponding to the target person. At least one of face sharpness detection, face occlusion degree detection, pose stretching degree detection, and human body height detection may be selectively performed according to the different joint points among the human body joint points, and the corresponding face sharpness Score_face_clarity, face occlusion degree Score_face_occlusion, pose stretching degree Score_stretch, and human body height Score_h are generated as the posture information.
In some embodiments, when performing human body pose detection, one of face sharpness detection, face occlusion detection, pose stretching detection, and human height detection may be performed. In other embodiments, two of face sharpness detection, face occlusion detection, pose stretching detection, and body height detection may be performed, such as face sharpness detection and face occlusion detection, face sharpness detection and pose stretching detection, face sharpness detection and body height detection, or face occlusion detection and pose stretching detection, which are not all enumerated here. In still other embodiments, three of face sharpness detection, face occlusion detection, pose stretching detection, and human height detection may be performed, e.g., face sharpness detection, face occlusion detection, and pose stretching detection; face sharpness detection, face occlusion detection, and human height detection; face sharpness detection, pose stretching detection, and human height detection; or face occlusion detection, pose stretching detection, and human height detection. In still other embodiments, all four detections, i.e., face sharpness detection, face occlusion detection, pose stretching detection, and human height detection, may be performed. The face sharpness detection, the face occlusion degree detection, the pose stretching degree detection, and the human body height detection can be carried out simultaneously or sequentially in any order.
Further, if only one of face sharpness detection, face occlusion detection, posture extension detection, or human height detection is performed, the posture information may be acquired from one of the detection data correspondingly generated. For example, only the gesture stretching degree detection is performed, the gesture detection result may be generated only according to the gesture stretching degree, and the gesture detection result is used as the gesture information. If two kinds of detection among face sharpness detection, face occlusion detection, pose stretching detection, and human height detection are performed, pose information can be acquired from at least one kind of detection data among the two kinds of detection data generated correspondingly. For example, the posture extension degree detection and the human body height detection are performed and the posture detection result and the height detection result are generated, respectively, the posture information includes at least one of the posture detection result and the height detection result. Similarly, if three kinds of detection among face sharpness detection, face occlusion detection, pose stretching detection, and human height detection are performed, pose information may be acquired from at least one of the three kinds of detection data generated correspondingly. If face clarity detection, face occlusion detection, pose stretching detection, and body height detection are performed, the available pose information includes at least one of pose stretching, face occlusion, face clarity, and body height.
In one embodiment, face definition detection, face occlusion degree detection, pose stretching degree detection, and human body height detection are respectively performed, and the pose stretching degree, the face occlusion degree, the face definition, and the human body height are correspondingly obtained; the posture information Score_pose can then be calculated from at least one of the pose stretching degree, the face occlusion degree, the face definition, and the human body height. For example, the posture information can be calculated from the pose stretching degree and the face definition, from the pose stretching degree and the face occlusion degree, from the pose stretching degree, the face occlusion degree and the face definition, or from the pose stretching degree, the face occlusion degree, the face definition and the human body height.
In one example, the posture information Score_pose can be calculated from the pose stretching degree Score_stretch, the face occlusion degree Score_face_occlusion, the face sharpness Score_face_clarity, and the human body height Score_h as Score_pose = Score_face_clarity + Score_face_occlusion + Score_stretch + Score_h. Or, in another example, if Score_face_clarity, Score_face_occlusion, Score_stretch, and Score_h have corresponding weights a, b, c, and d respectively, then Score_pose = a·Score_face_clarity + b·Score_face_occlusion + c·Score_stretch + d·Score_h. Other calculation methods are also possible and are not listed here. Compared with posture detection models that only predict the posture of the shot portrait and ignore the interference of portrait sharpness and face occlusion, obtaining the posture information by combining the face sharpness, the face occlusion degree, the pose stretching degree, and the human body height can improve the accuracy of human body posture detection, so that the recommended shot image has a graceful posture and a clear portrait.
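The weighted combination can be written as a one-line function; the sketch below is only a direct transcription of the formula above, with equal weights as the default.

```python
def pose_score(face_clarity, face_occlusion, stretch, body_height,
               a=1.0, b=1.0, c=1.0, d=1.0):
    """Score_pose = a*Score_face_clarity + b*Score_face_occlusion
                  + c*Score_stretch + d*Score_h.
    With the default weights this reduces to the unweighted sum of the
    four posture-detection results."""
    return (a * face_clarity + b * face_occlusion
            + c * stretch + d * body_height)
```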
Further, referring to fig. 11, in some embodiments, the human body joints include face joints, the face joints include nose joints, ear joints and eye joints, and the step 0441 of performing face sharpness detection according to the human body joints to obtain the face sharpness may include the following steps:
4412: determining a face region according to the nose joint points, the ear joint points and the eye joint points; and
4413: and calculating the face definition of the face area.
Referring to fig. 2, in some embodiments, the detection module 16 may further be configured to: determining a face region according to the nose joint points, the ear joint points and the eye joint points; and calculating the face definition of the face region. That is, the detection module 16 may also be used to implement step 4412 and step 4413.
Referring to fig. 3, in some embodiments, the processor 20 may further be configured to: determining a face region according to the nose joint points, the ear joint points and the eye joint points; and calculating the face definition of the face region. That is, processor 20 may also be used to implement step 4412 and step 4413.
Specifically, the ear joint points may include left and right ear joint points, and the eye joint points may include left and right eye joint points. A face region may be determined according to the nose joint point, the left and right ear joint points, and the left and right eye joint points, and the face sharpness may then be calculated within the face region using a face sharpness detection algorithm; for example, the variance of the Laplacian within the face region may be used to represent the face sharpness Score_face_clarity. Of course, the face sharpness may also be calculated by other algorithms, which are not listed here. In this embodiment, the face sharpness is calculated so that the pose detection result includes face sharpness data, and the face sharpness is taken into account when performing pose detection, so that the face in the obtained target image is clearer.
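A minimal sketch of the variance-of-Laplacian measure follows; the (x0, y0, x1, y1) face-region convention matches the earlier face_region_from_joints sketch and is an assumption of the example.

```python
import cv2

def face_clarity_score(image, face_region):
    """Variance of the Laplacian inside the face region as the face
    sharpness Score_face_clarity; face_region = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = face_region
    face = image[y0:y1, x0:x1]
    gray = cv2.cvtColor(face, cv2.COLOR_BGR2GRAY)
    return float(cv2.Laplacian(gray, cv2.CV_64F).var())
```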
Further, referring to fig. 12, in some embodiments, the step 0441 of performing face occlusion degree detection according to a human body joint point to obtain a face occlusion degree may include the following steps:
4414: calculating the nose occlusion degree of the nose according to the confidence coefficient of the nose joint point;
4415: calculating the eye shielding degree of the eyes according to the confidence coefficient of the eye joint points;
4416: calculating the ear shielding degree of the ear according to the confidence coefficient of the ear joint point; and
4417: and calculating the face shielding degree according to the nose shielding degree, the eye shielding degree and the ear shielding degree.
Referring to fig. 2, in some embodiments, the detection module 16 may further be configured to: calculating the nose occlusion degree of the nose according to the confidence coefficient of the nose joint point; calculating the eye shielding degree of the eyes according to the confidence coefficient of the eye joint points; calculating the ear shielding degree of the ear according to the confidence coefficient of the ear joint point; and calculating the face shielding degree according to the nose shielding degree, the eye shielding degree and the ear shielding degree. That is, the detection module 16 may also be used to implement step 4414, step 4415, step 4416 and step 4417.
Referring to fig. 3, in some embodiments, the processor 20 may further be configured to: calculating the nose occlusion degree of the nose according to the confidence coefficient of the nose joint point; calculating the eye shielding degree of the eyes according to the confidence coefficient of the eye joint points; calculating the ear shielding degree of the ear according to the confidence coefficient of the ear joint point; and calculating the face shielding degree according to the nose shielding degree, the eye shielding degree and the ear shielding degree. That is, processor 20 may also be used to implement step 4414, step 4415, step 4416, and step 4417.
Specifically, the human face is a relatively critical region in the image, and occlusion of the human face has a large influence on image quality, so the face occlusion degree needs to be calculated. The most salient features of the human face are the facial features (eyes, ears, nose, and mouth), and the face occlusion degree can be accurately determined by calculating the occlusion degree of these facial features. In this embodiment, the nose occlusion degree, the eye occlusion degree, and the ear occlusion degree can be respectively calculated according to the data of the corresponding human body joint points, and the face occlusion degree is then calculated according to the nose occlusion degree, the eye occlusion degree, and the ear occlusion degree, so that the calculated face occlusion degree is more accurate.
When a human body posture estimation algorithm is used to identify the human body joint points, the algorithm can provide not only the coordinates of the human body joint points but also the confidence of each human body joint point. The confidence may be used to indicate the probability that the point is the corresponding human body joint point: the higher the confidence, the higher that probability, and the lower the degree to which the joint point can be considered occluded. For example, there may be a mapping relationship between the confidence and the occlusion degree. For example, the mapping relationship between the confidence A and the occlusion degree S may be S = 1 - A: if the confidence of the nose joint point is 75%, the nose occlusion degree can be considered to be 25%. Or the mapping relationship between the confidence A and the occlusion degree S may be S = a·A, where a is a coefficient that can be determined through multiple experiments. The mapping relationship between the confidence and the occlusion degree of each human body joint point can be the same or different. For example, the mapping relationship between the confidence of the nose joint point and the nose occlusion degree may differ from the mapping relationship between the confidence of the ear joint point and the ear occlusion degree, so that different calculation rules may be set for different human body joint points to better fit the corresponding joint points, and the nose occlusion degree and the ear occlusion degree can then be calculated more accurately.
Further, the nose occlusion degree Score_nose_occlusion may be calculated based on the confidence of the nose joint point. The left-eye occlusion degree Score_eye_occlusion_l of the left eye can be calculated according to the confidence of the left eye joint point, and the right-eye occlusion degree Score_eye_occlusion_r of the right eye can be calculated according to the confidence of the right eye joint point. The eye occlusion degree Score_eye_occlusion can then be calculated from the left-eye occlusion degree Score_eye_occlusion_l and the right-eye occlusion degree Score_eye_occlusion_r, e.g. Score_eye_occlusion = Score_eye_occlusion_l + Score_eye_occlusion_r. Similarly, the left-ear occlusion degree Score_ear_occlusion_l of the left ear can be calculated according to the confidence of the left ear joint point, the right-ear occlusion degree Score_ear_occlusion_r of the right ear can be calculated according to the confidence of the right ear joint point, and the ear occlusion degree Score_ear_occlusion can then be calculated from them, e.g. Score_ear_occlusion = Score_ear_occlusion_l + Score_ear_occlusion_r.
Further, the face occlusion degree Score_face_occlusion is calculated according to the nose occlusion degree Score_nose_occlusion, the eye occlusion degree Score_eye_occlusion, and the ear occlusion degree Score_ear_occlusion, for example Score_face_occlusion = Score_nose_occlusion + Score_eye_occlusion + Score_ear_occlusion. Alternatively, weights corresponding to the nose occlusion degree, the eye occlusion degree, and the ear occlusion degree may be set, and the face occlusion degree may then be calculated according to these weights; such variations are not listed here.
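The following minimal sketch illustrates one possible realization of the confidence-to-occlusion mapping and the summation; the S = 1 - A mapping, the dictionary keys, and the function names are assumptions made only for illustration:

```python
def occlusion_from_confidence(confidence: float) -> float:
    # Assumed linear mapping S = 1 - A between joint-point confidence A and
    # occlusion degree S; other mappings (e.g. S = a * A with an experimentally
    # fitted coefficient a) could be substituted per joint point.
    return 1.0 - confidence

def face_occlusion(conf: dict) -> float:
    """`conf` maps assumed joint names ('nose', 'eye_l', 'eye_r', 'ear_l',
    'ear_r') to the confidences returned by the pose-estimation algorithm."""
    nose = occlusion_from_confidence(conf['nose'])
    eyes = occlusion_from_confidence(conf['eye_l']) + occlusion_from_confidence(conf['eye_r'])
    ears = occlusion_from_confidence(conf['ear_l']) + occlusion_from_confidence(conf['ear_r'])
    # Unweighted sum as in the example above; weighted sums are also possible.
    return nose + eyes + ears
```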
In other embodiments, the mouth occlusion degree of the mouth can also be calculated. For example, the mouth occlusion degree may be calculated according to the confidence of the mouth joint points. The mouth occlusion degree can be added when calculating the face occlusion degree, so that the obtained face occlusion degree is more accurate.
Of course, the face occlusion degree can also be calculated by some deep learning algorithms. For example, the occluded area in the face region may be identified, and the proportion of the occluded area to the face region may then be calculated, and so on, without limitation here.
Referring to fig. 13, in some embodiments, the human joint points include limb joint points including hand joint points, elbow joint points, shoulder joint points, hip joint points, knee joint points, and foot joint points, and the step 0441 of performing the gesture stretching degree detection according to the human joint points to obtain the gesture stretching degree may include the following steps:
4418: calculating the bending degree of the arms according to the hand joint points, the elbow joint points and the shoulder joint points;
4419: calculating the stretching degree of the leg according to the foot joint point and the hip joint point;
44110: calculating a first distortion degree of the feet and the trunk according to the foot joint points, the hip joint points and the shoulder joint points;
44111: calculating a second degree of distortion of the leg and the trunk according to the knee joint point, the hip joint point and the shoulder joint point; and
44112: and calculating the posture extension degree according to the arm bending degree, the leg stretching degree, the first distortion degree and the second distortion degree.
Referring to fig. 2, in some embodiments, the detection module 16 may further be configured to: calculating the bending degree of the arms according to the hand joint points, the elbow joint points and the shoulder joint points; calculating the stretching degree of the leg according to the foot joint point and the hip joint point; calculating a first distortion degree of the feet and the trunk according to the foot joint points, the hip joint points and the shoulder joint points; calculating a second degree of distortion of the leg and the trunk according to the knee joint point, the hip joint point and the shoulder joint point; and calculating the posture extension degree according to the arm bending degree, the leg stretching degree, the first distortion degree and the second distortion degree. The detection module 16 may also be used to implement step 4418, step 4419, step 44110, step 44111, and step 44112.
Referring to fig. 3, in some embodiments, the processor 20 may further be configured to: calculating the bending degree of the arms according to the hand joint points, the elbow joint points and the shoulder joint points; calculating the stretching degree of the leg according to the foot joint point and the hip joint point; calculating a first distortion degree of the feet and the trunk according to the foot joint points, the hip joint points and the shoulder joint points; calculating a second degree of distortion of the leg and the trunk according to the knee joint point, the hip joint point and the shoulder joint point; and calculating the posture extension degree according to the arm bending degree, the leg stretching degree, the first distortion degree and the second distortion degree. Processor 20 may also be configured to implement step 4418, step 4419, step 44110, step 44111, and step 44112.
Specifically, the influence of the posture stretching degree on image quality is relatively critical: a user usually shoots images continuously precisely in order to capture an image with a beautiful, stretched posture. If the posture stretching degree is too small, the user's body may not be fully stretched, and that frame is likely not the image the user wants; therefore, the posture stretching degree of the human body in the captured image needs to be detected. The hand joint points can include a left hand joint point and a right hand joint point, the elbow joint points can include a left elbow joint point and a right elbow joint point, the shoulder joint points can include a left shoulder joint point and a right shoulder joint point, the hip joint points can include a left hip joint point and a right hip joint point, the knee joint points can include a left knee joint point and a right knee joint point, and the foot joint points can include a left foot joint point and a right foot joint point.
Further, the degree of arm bending, the degree of leg stretching, the first degree of distortion, and the second degree of distortion may be calculated by the following formulas.
S = arccos( ((A - B) · (C - B)) / (||A - B||_2 · ||C - B||_2) )
where A, B, and C are the position coordinates of three related human body joint points, ||A - B||_2 denotes the 2-norm of A - B, and arccos denotes the inverse cosine function.
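A small numeric sketch of this angle computation is given below; the function name and the NumPy dependency are assumptions, and the clamp merely guards against floating-point drift:

```python
import numpy as np

def joint_angle(a, b, c) -> float:
    """Included angle (radians) at joint B formed by joints A and C,
    following the arccos formula above; a, b, c are (x, y) coordinates."""
    a, b, c = np.asarray(a, float), np.asarray(b, float), np.asarray(c, float)
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))
```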
For example, taking the left arm alone, the bending degree Score_elbow_l of the left elbow (i.e. S in the above formula) is calculated using the left hand joint point coordinates Wrist_l(x, y) (i.e. A in the above formula), the left elbow joint point coordinates Elbow_l(x, y) (i.e. B in the above formula), and the left shoulder joint point coordinates Shoulder_l(x, y) (i.e. C in the above formula), with the calculation formula as follows:
Score_elbow_l = arccos( ((Wrist_l - Elbow_l) · (Shoulder_l - Elbow_l)) / (||Wrist_l - Elbow_l||_2 · ||Shoulder_l - Elbow_l||_2) )
The bending degree Score_elbow_r of the right elbow can be calculated by the above formula using the coordinates of the right hand joint point, the right elbow joint point, and the right shoulder joint point, where the right hand joint point coordinates Wrist_r(x, y) are A in the above formula, the right elbow joint point coordinates Elbow_r(x, y) are B, and the right shoulder joint point coordinates Shoulder_r(x, y) are C.

The stretching degree Score_leg_l of the left leg can be calculated by the above formula using the coordinates of the left foot joint point, the left hip joint point, and the left knee joint point, where the left foot joint point coordinates Foot_l(x, y) are A, the left hip joint point coordinates Hip_l(x, y) are B, and the left knee joint point coordinates Knee_l(x, y) are C.

The stretching degree Score_leg_r of the right leg can be calculated by the above formula using the coordinates of the right foot joint point, the right hip joint point, and the right knee joint point, where the right foot joint point coordinates Foot_r(x, y) are A, the right hip joint point coordinates Hip_r(x, y) are B, and the right knee joint point coordinates Knee_r(x, y) are C.

The first distortion degree Score_twist_ankle_l of the left foot and the trunk can be calculated by the above formula using the coordinates of the left foot joint point, the left hip joint point, and the left shoulder joint point, where the left foot joint point coordinates Foot_l(x, y) are A, the left hip joint point coordinates Hip_l(x, y) are B, and the left shoulder joint point coordinates Shoulder_l(x, y) are C.

The first distortion degree Score_twist_ankle_r of the right foot and the trunk can be calculated by the above formula using the coordinates of the right foot joint point, the right hip joint point, and the right shoulder joint point, where the right foot joint point coordinates Foot_r(x, y) are A, the right hip joint point coordinates Hip_r(x, y) are B, and the right shoulder joint point coordinates Shoulder_r(x, y) are C.

The second distortion degree Score_twist_knee_l of the left leg and the trunk can be calculated by the above formula using the coordinates of the left knee joint point, the left hip joint point, and the left shoulder joint point, where the left knee joint point coordinates Knee_l(x, y) are A, the left hip joint point coordinates Hip_l(x, y) are B, and the left shoulder joint point coordinates Shoulder_l(x, y) are C.

The second distortion degree Score_twist_knee_r of the right leg and the trunk can be calculated by the above formula using the coordinates of the right knee joint point, the right hip joint point, and the right shoulder joint point, where the right knee joint point coordinates Knee_r(x, y) are A, the right hip joint point coordinates Hip_r(x, y) are B, and the right shoulder joint point coordinates Shoulder_r(x, y) are C.
Further, the posture stretching degree Score_stretch of the human body in the captured image can be calculated according to the bending degree Score_elbow_l of the left elbow, the bending degree Score_elbow_r of the right elbow, the stretching degree Score_leg_l of the left leg, the stretching degree Score_leg_r of the right leg, the first distortion degree Score_twist_ankle_l of the left foot and the trunk, the first distortion degree Score_twist_ankle_r of the right foot and the trunk, the second distortion degree Score_twist_knee_l of the left leg and the trunk, and the second distortion degree Score_twist_knee_r of the right leg and the trunk. For example, the sum of these eight degrees may be taken as the posture stretching degree of the human body, e.g. Score_stretch = Score_elbow_l + Score_elbow_r + Score_leg_l + Score_leg_r + Score_twist_ankle_l + Score_twist_ankle_r + Score_twist_knee_l + Score_twist_knee_r. Of course, the posture stretching degree may also be calculated in other manners, which are not listed here.
In this embodiment, the bending degree of the left elbow, the bending degree of the right elbow, the stretching degree of the left leg, the stretching degree of the right leg, the first distortion degree of the left foot and the trunk, the first distortion degree of the right foot and the trunk, the second distortion degree of the left leg and the trunk, and the second distortion degree of the right leg and the trunk are all calculated, so that the stretching of each limb of the human body is fully considered, the calculated posture stretching degree is more accurate, and the posture in the finally output target image is more beautiful.
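Purely as an illustration of how the eight terms could be combined (reusing the joint_angle sketch above; the joint-name keys and the unweighted sum are assumptions):

```python
def pose_stretch(kp: dict) -> float:
    """`kp` maps assumed joint names to (x, y) coordinates from the pose estimator."""
    terms = [
        joint_angle(kp['wrist_l'], kp['elbow_l'], kp['shoulder_l']),  # left arm bending
        joint_angle(kp['wrist_r'], kp['elbow_r'], kp['shoulder_r']),  # right arm bending
        joint_angle(kp['foot_l'], kp['hip_l'], kp['knee_l']),         # left leg stretching
        joint_angle(kp['foot_r'], kp['hip_r'], kp['knee_r']),         # right leg stretching
        joint_angle(kp['foot_l'], kp['hip_l'], kp['shoulder_l']),     # left foot/trunk distortion
        joint_angle(kp['foot_r'], kp['hip_r'], kp['shoulder_r']),     # right foot/trunk distortion
        joint_angle(kp['knee_l'], kp['hip_l'], kp['shoulder_l']),     # left leg/trunk distortion
        joint_angle(kp['knee_r'], kp['hip_r'], kp['shoulder_r']),     # right leg/trunk distortion
    ]
    return float(sum(terms))
```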
Referring to fig. 14, in some embodiments, the step 0441 of obtaining the body height by performing the body height detection according to the body joint point may include the following steps:
44113: and calculating the height of the human body according to the shoulder joint points and the foot joint points.
Referring to fig. 2, in some embodiments, the detecting module 16 can be further configured to calculate the height of the human body according to the shoulder joint point and the foot joint point. That is, the detection module 16 may also be used to implement step 44113.
Referring to fig. 3, in some embodiments, the processor 20 may be further configured to calculate the height of the human body according to the shoulder joint point and the foot joint point. That is, processor 20 may also be used to implement step 44113.
In particular, the height of the human body is also important for the quality of the captured image: if the human body in the captured image appears short, the user's figure does not look its best, and the image is likely not the one the user wants. Therefore, the height of the human body in the captured image needs to be detected so that an image in which the human body appears taller can be found. The height of the human body can be calculated according to the shoulder joint points and the foot joint points.
More specifically, the body height can be calculated by using the ordinate of the left shoulder joint point, the ordinate of the right shoulder joint point, the ordinate of the left foot joint point and the ordinate of the right foot joint point, and the calculation formula is as follows:
Score_h = | (y_i0 + y_i1)/2 - (y_i2 + y_i3)/2 | / H
where i denotes the i-th person in the image, j = 0 denotes the left shoulder, j = 1 denotes the right shoulder, j = 2 denotes the left foot, j = 3 denotes the right foot, y_ij denotes the ordinate of the j-th joint point of the i-th person, and H denotes the height of the frame of captured image.
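Under one plausible reading of the formula above (the normalization by H and the averaging of the left and right ordinates are assumptions), the computation might look like this:

```python
def body_height(kp: dict, image_height: float) -> float:
    """Normalized body height of one person from shoulder and foot ordinates.

    `kp` maps assumed joint names to (x, y) pixel coordinates; in image
    coordinates y grows downward, so foot ordinates exceed shoulder ordinates
    for an upright person.
    """
    shoulder_y = (kp['shoulder_l'][1] + kp['shoulder_r'][1]) / 2.0
    foot_y = (kp['foot_l'][1] + kp['foot_r'][1]) / 2.0
    # Vertical shoulder-to-foot extent normalized by the image height H, so the
    # score is comparable across frames of the same burst.
    return abs(foot_y - shoulder_y) / image_height
```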
Of course, the head coordinates of the human body and the foot coordinates of the human body may be recognized to calculate the height of the human body. The height of the human body can also be calculated by other algorithms, which are not limited herein.
Referring to fig. 15, in some embodiments, the expression detection includes at least one of smiling face detection, eye opening detection and blink detection, the expression information includes a smiling face degree, an eye opening degree and a blink degree, and step 0442 may include the following steps:
4421: identifying face key points in the image to be recommended according to the face joint points; and
4422: at least one of smiling face detection, eye opening detection and blink detection is performed according to the face key points to correspondingly acquire at least one of the smiling face degree, the eye opening degree and the blink degree.
Referring to fig. 2, in some embodiments, the detection module 16 can be used to implement steps 4421 and 4422. That is, the detection module 16 may be configured to identify face key points in the image to be recommended according to the face joint points; and to perform at least one of smiling face detection, eye opening detection and blink detection according to the face key points so as to correspondingly acquire at least one of the smiling face degree, the eye opening degree and the blink degree.
Referring to fig. 3, in some embodiments, the processor 20 may further be configured to: identify face key points in the image to be recommended according to the face joint points; and perform at least one of smiling face detection, eye opening detection and blink detection according to the face key points so as to correspondingly acquire at least one of the smiling face degree, the eye opening degree and the blink degree. That is, the processor 20 may also be used to implement step 4421 and step 4422.
Specifically, by accurately detecting the expression of the photographed portrait in each captured image, a captured image with a better expression can be found. The face key point information of each frame of the captured images from the first frame to the nth frame can be detected through a face key point detection algorithm. The face key point detection algorithm can detect the face key point information of all people in the captured image; for example, when there is only one person in the captured image, the face key point information of that person can be detected, and when there are multiple people in the captured image, the face key point information of each person can be detected. The face key point detection algorithm may include algorithms such as Dlib or PFLD, which are not listed here; a suitable algorithm can be selected to detect the face key points according to actual requirements. The face key points may include key points of the eyes (left and right eyes), ears (left and right ears), nose, mouth, and other features of the photographed person.
And then the expression of the shot portrait can be detected according to the detected key points of the face. For example, the smiling degree of the photographed person, the size of the opening of the eyes of the photographed person, the degree of opening of the mouth of the photographed person, and the like can be determined from the detected face key points. Therefore, according to the detected expression data, the expression detection result of the shot image can be generated. Expression information of the target person in each shot image can be acquired according to the expression detection result, so that the shot image with better expression of the target person can be selected according to the expression information.
Specifically, in order to facilitate accurate expression detection, at least one of smiling face detection, eye-open detection, and blink detection may be performed according to the identified key points of the human face, and then at least one of the smiling face detection result, the eye-open detection result, and the blink detection result may be correspondingly generated to respectively obtain at least one of a smiling face degree, an eye-open degree, and a blink degree. For example, smiling face detection may be performed according to the key points of the human face, and a smiling face detection result (smiling degree) may be generated after smiling face detection; eye opening detection can be carried out according to the key points of the human face, and an eye opening detection result (eye opening degree) can be generated after eye opening detection; blink detection can be performed according to the face joint points, and a blink detection result (blink degree) can be generated after blink detection. Alternatively, smile detection and eye-open detection may be performed based on the face key points, and smile detection results and eye-open detection results may be generated correspondingly. Alternatively, smile detection, eye-open detection, and blink detection may be performed based on the face key points, and a smile detection result, an eye-open detection result, and a blink detection result may be generated correspondingly.
The smiling face degree is used to evaluate how much the facial expression is smiling: the higher the smiling face degree, the more obvious the smile. The eye opening degree is used to evaluate how far the eyes are open: a higher eye opening degree indicates more clearly open eyes, and a lower eye opening degree indicates more clearly closed eyes. The blink degree is used to measure whether the user is winking: the higher the blink degree, the more obvious the wink.
In some embodiments, at least one of smiling face detection, eye-open detection, and blink detection may be selectively performed according to the user's demand, so that the obtained expression detection result conforms to the user's expectation. In one example, one of smiling face detection, eye-open detection, or blink detection may be selectively performed according to the demand. In another example, two of smiling face detection, eye-open detection, and blink detection may be performed selectively as needed, for example, smiling face detection and eye-open detection, smiling face detection and blink detection, or eye-open detection and blink detection. In yet another example, smiling face detection, eye-open detection, and blink detection may all be performed.
In other embodiments, which of smiling face detection, eye-open detection, and blink detection to perform may be selected automatically by learning common characteristics of the captured images the user has selected multiple times. For example, if the user often selects, as the recommended captured image, an image in which the smile is relatively big and the eyes are open wide, smiling face detection and eye-open detection may be selected automatically. Alternatively, if the user often selects, as the recommended captured image, a relatively happy image in which the subject winks and laughs, smiling face detection and blink detection may be selected automatically. Other situations are also possible and are not listed here.
Further, expression information can be acquired according to the expression detection result. For example, smile degrees may be obtained from the smile detection result, and the smile degrees may be used as expression information; the eye opening degree can be acquired according to the eye opening detection result and is used as expression information; the blink degree can be acquired according to the blink detection result and is used as expression information; expression information can be obtained according to the smiling face detection result and the eye opening detection result, and the expression information comprises at least one of smiling face degree and eye opening degree; further, expression information including at least one of a smiling face degree, an eye-opening degree, and a blink degree may be acquired based on the smiling face detection result, the eye-opening detection result, and the blink detection result. When expression information is acquired based on two or three of the smiling face detection result, the eye-open detection result, and the blink detection result, expression information may be obtained by adding two or three detection results (e.g., detection scores).
In this embodiment, at least one of smiling face detection, eye opening detection, and blink detection is performed according to the face key points, and an expression detection result is generated according to the detection result, so that expression detection data of a photographed portrait in a photographed image can be obtained, and thus the photographed image with a better expression can be selected according to the expression detection result.
Referring to fig. 16, in some embodiments, the key points of the human face include key points of the nose and key points of the mouth, and the step 4422 of performing the smiling face detection according to the key points of the face to obtain the smiling face degree includes the following steps:
44221: calculating the degree of mouth angle bending according to the nose key points and the mouth key points;
44222: calculating the opening degree of the mouth according to the key points of the mouth; and
44223: and obtaining the smiling face degree according to the mouth angle bending degree and the mouth opening degree.
Referring to fig. 2, in some embodiments, the detection module 16 may further be configured to: calculating the degree of mouth angle bending according to the nose key points and the mouth key points; calculating the opening degree of the mouth according to the key points of the mouth; and obtaining the smiling face degree according to the mouth angle bending degree and the mouth opening degree. That is, the detection module 16 may also be used to implement step 44221, step 44222 and step 44223.
Referring to fig. 3, in some embodiments, the processor 20 may further be configured to: calculating the degree of mouth angle bending according to the nose key points and the mouth key points; calculating the opening degree of the mouth according to the key points of the mouth; and obtaining the smiling face degree according to the mouth angle bending degree and the mouth opening degree. That is, the processor 20 may be used to implement step 44221, step 44222 and step 44223.
Specifically, a smile mainly changes the lips, so the smile data of the human face can be determined according to the bending degree of the mouth corners and the opening degree of the mouth. The change of the lips relative to the nose can be calculated according to the coordinates of the nose key points and the coordinates of the mouth key points, and the mouth corner bending degree Score_rise_mouth can then be obtained; generally, the more the mouth corners curve upward, the happier the photographed portrait appears. The number of nose key points can be one or more, and the number of mouth key points can be one or more. The mouth opening degree can be calculated from key points of the lips; for example, the mouth opening degree Score_expand_mouth can be calculated from the coordinates of the upper mouth key points and the coordinates of the lower mouth key points, where the number of upper mouth key points and lower mouth key points may each be one or more.
Further, after the mouth corner bending degree Score_rise_mouth and the mouth opening degree Score_expand_mouth are obtained, the smiling face detection result Score_smile of the human face is calculated according to the mouth corner bending degree Score_rise_mouth and the mouth opening degree Score_expand_mouth. In one example, the smiling face detection result can be obtained by adding the mouth corner bending degree and the mouth opening degree, i.e. Score_smile = Score_rise_mouth + Score_expand_mouth. In another example, the smiling face detection result is a weighted combination of the mouth corner bending degree and the mouth opening degree, e.g. Score_smile = a·Score_rise_mouth + b·Score_expand_mouth. Of course, the smiling face detection result of the human face can also be calculated in other ways according to the mouth corner bending degree and the mouth opening degree.
Referring to fig. 17, in some embodiments, the nose keypoints comprise a nose keypoint, the mouth keypoints comprise two mouth corner keypoints, and step 44221 comprises the steps of:
442211: and calculating the degree of mouth corner bending according to the key point of the nose head and the two key points of the mouth corner.
Referring to fig. 2, in some embodiments, the detection module 16 may be further configured to calculate the degree of mouth corner curvature according to the nose key point and the two mouth corner key points. That is, the detection module 16 may also be used to implement step 442211.
Referring to fig. 3, in some embodiments, processor 20 may be further configured to calculate the degree of mouth corner curvature based on the nose key point and the two mouth corner key points. That is, processor 20 may also be used to implement step 442211.
Specifically, referring to fig. 16, the coordinates of the nose key point 33 are Nose(x, y), the coordinates of the left mouth corner key point 48 are Lips_l(x, y), and the coordinates of the right mouth corner key point 54 are Lips_r(x, y). The calculation formula of the mouth corner bending degree Score_rise_mouth may be as follows:
[Formula not reproduced: Score_rise_mouth computed from Nose(x, y), Lips_l(x, y), and Lips_r(x, y).]
referring to fig. 18, in some embodiments, the mouth keypoints include upper mouth keypoints, lower mouth keypoints, and mouth corner keypoints, and step 44222 includes the steps of:
442221: and calculating the opening degree of the mouth according to the key points of the upper mouth, the lower mouth and the corner of the mouth.
Referring to fig. 2, in some embodiments, the detection module 16 may be further configured to calculate the mouth opening degree according to the upper mouth key point, the lower mouth key point and the mouth corner key point. That is, the detection module 16 may also be used to implement step 442221.
Referring to fig. 3, in some embodiments, processor 20 may be further configured to calculate the mouth openness based on the upper mouth keypoint, the lower mouth keypoint, and the mouth corner keypoint. That is, processor 20 may also be used to implement step 442221.
Referring to fig. 19, the mouth corner key points may include two key points 48 and 54, the upper mouth key points may include two key points 49 and 53, and the lower mouth key points may include two key points 55 and 59. The formula for calculating the mouth opening degree Score_expand_mouth may be as follows:
Score_expand_mouth = (dist_49_59 + dist_53_55) / (2 · dist_48_54)
where dist_49_59 denotes the distance between key point 49 and key point 59, dist_53_55 denotes the distance between key point 53 and key point 55, and dist_48_54 denotes the distance between key point 48 and key point 54. The coordinates of key points 49, 59, 53, 55, 48, and 54 can be obtained by the above-mentioned face key point detection algorithm or other algorithms, which will not be described in detail here.
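A minimal sketch of this aspect-ratio style measure is given below; the exact normalization (division by twice the mouth width) follows the reconstruction above and is an assumption:

```python
import math

def mouth_openness(pts: dict) -> float:
    """`pts` maps face key-point indices (48, 49, 53, 54, 55, 59) to (x, y)
    coordinates in the key-point layout referenced above."""
    d = lambda i, j: math.hypot(pts[i][0] - pts[j][0], pts[i][1] - pts[j][1])
    vertical = d(49, 59) + d(53, 55)   # upper-to-lower lip gaps
    horizontal = d(48, 54)             # mouth-corner width
    # A larger vertical gap relative to the mouth width means a more open mouth.
    return vertical / (2.0 * horizontal + 1e-8)
```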
Of course, in other embodiments, the mouth opening degree can be calculated in other ways, which are not limited here.
As such, in performing step 44223, a smiling face degree may be obtained from the degree of mouth angle curvature obtained in step 442211 and the degree of mouth opening obtained in step 442221.
Referring to fig. 20, in some embodiments, the facial key points include eye key points, and the step 4422 of performing eye opening detection according to the facial key points to obtain the eye opening degree includes the steps of:
44223: and calculating the eye opening degree of each eye according to the eye key points of each eye.
Referring to fig. 2, in some embodiments, the detection module 16 may be further configured to calculate an eye opening degree of each eye according to the eye key point of each eye. That is, the detection module 16 may also be used to implement step 44223.
Referring to fig. 3, in some embodiments, the processor 20 may be further configured to calculate an eye opening degree of each eye according to the eye key point of each eye. That is, processor 20 may also be used to implement step 44223.
The eye openness is used to evaluate the degree of openness of the eyes of the target person, which is important for the expression of the target person, and the degree of openness of the eyes directly affects the overall appearance of the captured image. According to the eye key points of each eye, the opening degree of each eye can be calculated and the eye opening detection result of each eye can be generated, and the eye opening detection result can be taken as the eye opening degree.
Specifically, each eye may include an upper eyelid, a lower eyelid, and eye corners, and the eye key points may include upper eyelid key points, lower eyelid key points, and eye corner key points. According to the key points of the upper eyelid, the lower eyelid, and the eye corners of the left eye, the eye opening degree of the left eye can be calculated. According to the key points of the upper eyelid, the lower eyelid, and the eye corners of the right eye, the eye opening degree of the right eye can be calculated. The eye-open detection result can then be generated according to the eye opening degree of the left eye and the eye opening degree of the right eye; for example, the eye-open detection result may be the eye opening degree of the left eye plus the eye opening degree of the right eye.
More specifically, continuing with fig. 21, the eye corner key points of the left eye may include two key points 36 and 39, the upper eyelid key points of the left eye may include two key points 37 and 38, and the lower eyelid key points of the left eye may include two key points 40 and 41. From key points 36, 37, 38, 39, 40, and 41, the eye opening degree Score_expand_eye_l of the left eye can be obtained by an aspect ratio (longitudinal-to-lateral) calculation. The eye corner key points of the right eye may include two key points 42 and 45, the upper eyelid key points of the right eye may include two key points 43 and 44, and the lower eyelid key points of the right eye may include two key points 46 and 47. From key points 42, 43, 44, 45, 46, and 47, the eye opening degree Score_expand_eye_r of the right eye can be obtained by the same aspect ratio calculation. In this example, the eye-open detection result is the opening degree of both eyes, Score_expand_eye = Score_expand_eye_l + Score_expand_eye_r, which can be taken as the eye opening degree, so that a captured image in which both eyes are clearly open can be recommended according to the eye opening degree.
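As a sketch only (the pairing of eyelid points and the key-point indices in the usage comment assume the standard 68-point layout):

```python
import math

def eye_openness(pts: dict, corner, upper, lower) -> float:
    """Aspect-ratio style eye opening degree for one eye.

    `corner`, `upper` and `lower` are pairs of key-point indices; `pts` maps
    indices to (x, y) coordinates.
    """
    d = lambda i, j: math.hypot(pts[i][0] - pts[j][0], pts[i][1] - pts[j][1])
    vertical = d(upper[0], lower[0]) + d(upper[1], lower[1])  # eyelid gaps
    horizontal = d(corner[0], corner[1])                      # eye width
    return vertical / (2.0 * horizontal + 1e-8)

# Hypothetical usage with the indexing assumed above:
# score_expand_eye_l = eye_openness(pts, (36, 39), (37, 38), (41, 40))
# score_expand_eye_r = eye_openness(pts, (42, 45), (43, 44), (47, 46))
# score_expand_eye = score_expand_eye_l + score_expand_eye_r
```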
Referring to fig. 22, in some embodiments, the step 4422 of performing blink detection according to the key points of the face to obtain the blink degree includes the steps of:
44224: the blink degree (monocular blink degree) is generated based on the eye opening degree of the left eye and the eye opening degree of the right eye.
Referring to fig. 2, in some embodiments, the detection module 16 may be further configured to generate a blink degree (monocular blink degree) according to the eye openness degree of the left eye and the eye openness degree of the right eye. That is, the detection module 16 may also be used to implement step 44224.
Referring to fig. 3, in some embodiments, the processor 20 may be further configured to generate a blink rate (monocular blink rate) according to the eye openness degree of the left eye and the eye openness degree of the right eye. That is, processor 20 may also be used to implement step 44224.
Specifically, it is detected whether the photographed person winks, that is, whether the user closes only one eye (a single-eye blink), for example, only the left eye is open and the right eye is closed, or only the right eye is open and the left eye is closed. Whether a wink exists in the photographed portrait of the captured image can be judged according to the eye opening degree of the left eye and the eye opening degree of the right eye, and it can also be determined which eye is open and which eye is closed.
In one example, whether there is a blinking condition can be determined based on a difference between the eye openness of the left eye and the eye openness of the right eye. For example, the greater the difference between the degree of eye openness of the left eye and the degree of eye openness of the right eye, the greater the probability of blinking of the single eye; the smaller the difference between the degree of eye openness of the left eye and the degree of eye openness of the right eye, the smaller the probability of blinking of the single eye.
In another example, the presence or absence of a blinking eye may be determined and a blink detection result may be generated based on an absolute value of a ratio between a difference between the eye openness degree of the left eye and the eye openness degree of the right eye and a sum of the eye openness degree of the left eye and the eye openness degree of the right eye. The specific calculation formula may be as follows:
Score_wink = abs( (Score_expand_eye_l - Score_expand_eye_r) / (Score_expand_eye_l + Score_expand_eye_r) )
where abs denotes the absolute value, Score_expand_eye_l denotes the eye opening degree of the left eye, Score_expand_eye_r denotes the eye opening degree of the right eye, and Score_wink denotes the blink degree. The larger the value of Score_wink, the higher the probability of a wink; the smaller the value of Score_wink, the lower the probability of a wink.
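A one-line sketch of this measure (the small epsilon guarding against a zero denominator is an addition for numerical safety, not part of the original formula):

```python
def blink_degree(score_eye_l: float, score_eye_r: float) -> float:
    # Normalized left/right asymmetry of eye openness: close to 0 when both eyes
    # are equally open (or closed), close to 1 when only one eye is open (a wink).
    return abs(score_eye_l - score_eye_r) / (score_eye_l + score_eye_r + 1e-8)
```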
In some embodiments, the expression detection result includes a smiling face detection result, an eye-open detection result, and a blink detection result, and the expression detection result is obtained from the smiling face detection result, the eye-open detection result, and the blink detection result. For example, the smiling face detection result Score_smile, the eye-open detection result Score_expand_eye, and the blink detection result Score_wink can be added to obtain the expression detection result Score_emotion, i.e. Score_emotion = Score_smile + Score_expand_eye + Score_wink.
In summary, the human body information may be obtained according to any of the above embodiments, so as to obtain the recommended value of each frame of the captured image according to the human body information. The human body information comprises posture information and/or expression information. The pose information may include at least one of face sharpness, face occlusion, pose extension, and body height. The expression information may include at least one of a smiling face degree, an eye-opening degree, and a blinking degree. When the human body information includes different types of posture information and/or expression information, recommendation strategies for the shot images are different.
For example, when the user only focuses on selecting an image with a clear face and the face not blocked, the body information may only include pose information, and the pose information includes face definition and face blocking degree, so as to calculate a recommended value of an image to be recommended in each frame of the captured image according to the face definition and the face blocking degree, so that the captured image recommended to the user is an image selected based on the face definition and the face blocking degree.
Understandably, the human body information may include only the posture information; or the human body information may include only the expression information; or the human body information may include both the posture information and the expression information. The posture information includes 4 elements, and the expression information includes 3 elements. When the human body information includes only the posture information, the number of combination modes of the posture information is Q1:
Q1 = C(4,1) + C(4,2) + C(4,3) + C(4,4) = 15
When the selection criteria for the user's desired image only involve the 4 elements of the posture information, the human body information may include only the posture information, and a recommendation value may be obtained, according to the selection criteria, using the one of the 15 kinds of human body information that matches the selection criteria, so as to recommend a captured image that meets the selection criteria.
Similarly, when the human body information includes only the expression information, the number of combination modes of the expression information is Q2:
Q2 = C(3,1) + C(3,2) + C(3,3) = 7
When the selection criteria for the user's desired image only involve the 3 elements of the expression information, the human body information may include only the expression information, and a recommendation value may be obtained, according to the selection criteria, using the one of the 7 kinds of human body information that matches the selection criteria, so as to recommend a captured image that meets the selection criteria.
Similarly, when the human body information includes both the posture information and the expression information, the number of combination modes of the human body information is Q3:
Q3 = C(7,1) + C(7,2) + C(7,3) + C(7,4) + C(7,5) + C(7,6) + C(7,7) = 127
When the selection criteria for the user's desired image involve the 7 elements of the posture information and the expression information, the human body information includes the posture information and the expression information, and a recommendation value can be obtained according to the one of the 127 kinds of human body information that matches the selection criteria, so as to recommend a captured image that meets the selection criteria.
Referring to FIG. 23, in some embodiments, step 05 includes the following steps:
051: acquiring a recommended value model, wherein the recommended value model represents a mapping relation between a recommended value and a preset recommendation coefficient, posture information and expression information;
052: and acquiring a recommended value according to the recommended value model, the posture information and the expression information.
In certain embodiments, acquisition module 12 may also be used to implement steps 051 and 052. That is, the obtaining module 12 may further be configured to obtain a recommendation value model, where the recommendation value model represents a mapping relationship between a recommendation value and a preset recommendation coefficient, posture information, and expression information; and acquiring a recommended value according to the recommended value model, the posture information and the expression information.
In some embodiments, the processor 20 may be further configured to obtain a recommendation value model, where the recommendation value model represents a mapping relationship between a recommendation value and preset recommendation coefficients, posture information, and expression information; and acquiring a recommended value according to the recommended value model, the posture information and the expression information. That is, processor 20 may also be used to implement steps 051 and 052.
Specifically, by importing the posture information and/or the expression information in the human body information obtained in the above embodiments into the recommendation value model, the recommendation value of the image to be recommended can be calculated. The recommendation value model is: Score_final = k1·Score_pose + k2·Score_emotion, where Score_final is the recommendation value and k1 and k2 are recommendation coefficients. When k1 = 0, the human body information includes only the expression information Score_emotion; when k2 = 0, the human body information includes only the posture information Score_pose; and when neither k1 nor k2 is 0, the human body information includes both the posture information and the expression information.
Further, the posture information Score_pose can be expanded as: Score_pose = α1·Score_face_clarity + α2·Score_face_occlusion + α3·Score_stretch + α4·Score_h, and the expression information Score_emotion can be expanded as: Score_emotion = α5·Score_rise_mouth + α6·Score_expand_eye + α7·Score_wink. The recommendation model is then:
Score_final = α1·Score_face_clarity + α2·Score_face_occlusion + α3·Score_stretch + α4·Score_h + α5·Score_rise_mouth + α6·Score_expand_eye + α7·Score_wink
where α1, α2, α3, α4, α5, α6, and α7 are recommendation coefficients, and each of them can take the value 0 (not all being 0 at the same time). Depending on whether each recommendation coefficient is 0, there are 127 combinations of the human body information elements involved in calculating the recommendation value Score_final. For example, when α1, α2, α3, and α4 are 0 and α5, α6, and α7 are not 0, Score_final = α5·Score_rise_mouth + α6·Score_expand_eye + α7·Score_wink is one combination of elements for calculating the recommendation value Score_final. The other 126 combinations of elements for calculating Score_final are not listed here.
In one embodiment, the non-zero values of the recommendation coefficients α1, α2, α3, α4, α5, α6, and α7 may be obtained using an SVM algorithm based on prior knowledge from a training set, so as to obtain a regression prediction model of the recommendation value Score_final; the human body information can then be imported into the trained regression prediction model of Score_final to obtain the recommendation value.
In yet another embodiment, the recommendation coefficients α1, α2, α3, α4, α5, α6, and α7 may be customized, that is, the recommendation coefficients corresponding to the face definition, the face shielding degree, the posture stretching degree, the human body height, the smiling face degree, the eye opening degree, and the blink degree may be determined manually. For example, if the user pays more attention to the face definition of the target person, the recommendation coefficient α1 corresponding to the face definition Score_face_clarity can be increased to raise the influence of the face definition on the recommendation value calculation, so that the face definition is considered preferentially when recommending captured images.
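A minimal sketch of the weighted linear model is given below; the dictionary keys and the example coefficient values are assumptions, and the learned-coefficient variant (SVM-based regression) is not shown:

```python
def recommendation_value(info: dict, alpha: dict) -> float:
    """Weighted linear combination of the detected human body information.

    `info` holds whichever of the seven elements were detected; `alpha` holds
    the recommendation coefficients, with 0 for elements that are not used.
    """
    elements = ['face_clarity', 'face_occlusion', 'stretch', 'height',
                'rise_mouth', 'expand_eye', 'wink']
    return sum(alpha.get(k, 0.0) * info.get(k, 0.0) for k in elements)

# Hypothetical usage: a user who only cares about expression sets the pose
# coefficients to zero, which is one of the 127 possible element combinations.
# score_final = recommendation_value(
#     info, {'rise_mouth': 0.5, 'expand_eye': 0.3, 'wink': 0.2})
```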
Further, the posture stretching degree Score_stretch can be expanded as follows:
[Formula not reproduced: α3·Score_stretch expanded as a β-weighted sum of the arm bending degrees, leg stretching degrees, and distortion degrees.]
and the eye opening degree Score_expand_eye can be expanded as: α6·Score_expand_eye = β8·Score_expand_eye_l + β9·Score_expand_eye_r. The recommendation model is then:
[Formula not reproduced: the recommendation model above with α3·Score_stretch and α6·Score_expand_eye replaced by their β-weighted expansions.]
where β1, β2, β3, β4, β5, β6, β7, β8, and β9 are also recommendation coefficients; when α3 is not 0, β1, β2, β3, β4, β5, β6, and β7 are not all 0, and when α6 is not 0, β8 and β9 are not both 0. The non-zero values of the recommendation coefficients α1, α2, α4, α5, α7, β1, β2, β3, β4, β5, β6, β7, β8, and β9 may be obtained using an SVM algorithm based on prior knowledge from a training set, so as to obtain a regression prediction model of the recommendation value Score_final; the human body information can then be imported into the trained regression prediction model of Score_final to obtain the recommendation value.
After the recommendation value of the image to be recommended in each frame of captured image is obtained through calculation, the multiple frames of images to be recommended can be sorted according to the recommendation value from small to large or from large to small, and the images whose rank falls within a predetermined order can then be selected as the target images. For example, when the multiple frames of captured images are sorted from large to small according to the recommendation value, the predetermined order may be the top of the ranking, for example the first place, the first two places, the first three places, or the first five places; alternatively, when the multiple frames of captured images are sorted from small to large according to the recommendation value, the predetermined order may be the bottom of the ranking, for example the last place, the last two places, the last three places, or the last five places. In this way, the obtained target image has a beautiful posture and/or a better expression and better matches the user's expectation.
In one example, the recommended number of captured images is one; when the multiple frames of captured images are sorted from small to large according to the recommendation value, the predetermined order is the last place, and when the multiple frames of captured images are sorted from large to small according to the recommendation value, the predetermined order is the first place, so that the recommended captured image has the best quality.
In another example, the recommended number of the shot images may be two, three, etc., so that the user may select an image more suitable for the user's desire from the target images. Compared with the method that the user directly selects one target image from all the shot images, the method can save the time for the user to select the target image, and further can enhance the use experience of the user.
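For illustration, sorting the burst by recommendation value and returning the top-ranked frames could be as simple as the following sketch (names are assumptions):

```python
def recommend(images: list, scores: list, top_k: int = 1) -> list:
    """Sort the captured frames by recommendation value (largest first) and
    return the top_k frames as the recommended captured images."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [images[i] for i in order[:top_k]]
```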
Further, in one example, after the captured image is recommended, the processor 20 or the terminal 100 may delete the other images to save storage space of the terminal 100. In another example, after the recommended captured image is determined, it may be used as a base frame, and the postures and expressions of other captured images may be fused into it to make its posture and expression better; or high-dynamic-range (HDR) rendering, dazzle-color processing, deblurring, and the like may be performed on the recommended captured image to make it clearer and to improve its quality and aesthetic appeal.
Referring to fig. 24, one or more non-transitory computer-readable storage media 300 containing a computer program 301 of embodiments herein, when the computer program 301 is executed by one or more processors 20, causes the processors 20 to perform the image recommendation method of any of the above embodiments, for example, performing one or more of steps 01, 02, 03, 04, 05, 06, 041, 042, 043, 044, 0441, 0442, 04411, 4412, 4413, 4414, 4415, 4416, 4417, 4418, 4419, 44110, 44111, 44112, 44113, 04421, 04422, 44221, 44222, 44223, 44223, 44224, 442211, 051, and 052.
For example, the computer program 301, when executed by the one or more processors 20, causes the processors 20 to perform the steps of:
01: acquiring a preset image, wherein the preset image comprises a preset face;
02: executing continuous shooting to obtain a plurality of frames of shot images;
03: performing joint point detection on each frame of shot image to obtain human body joint points of each frame of shot image;
04: acquiring human body information of each frame of shot image according to a preset human face and human body joint points of each frame of shot image;
05: acquiring a recommended value of each frame of shot image according to the human body information; and
06: and sequencing the multiple frames of shot images according to the recommended values, and recommending the shot images according to the sequencing result.
As another example, the computer program 301, when executed by the one or more processors 20, causes the processors 20 to perform the steps of:
01: acquiring a preset image, wherein the preset image comprises a preset face;
02: executing continuous shooting to obtain a plurality of frames of shot images;
03: performing joint point detection on each frame of shot image to obtain human body joint points of each frame of shot image;
041: acquiring the face features in the shot images according to the face joint points of each frame of shot images;
042: matching a preset face and the face features in each frame of shot image;
043: selecting a shot image with successfully matched human face features as an image to be recommended;
044: acquiring human body information according to human body joint points in an image to be recommended;
05: acquiring a recommended value of each frame of shot image according to the human body information; and
06: and sequencing the multiple frames of shot images according to the recommended values, and recommending the shot images according to the sequencing result.
In the description herein, reference to the description of the terms "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example" or "some examples" or the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
Although embodiments of the present application have been shown and described above, it is to be understood that the above embodiments are exemplary and not to be construed as limiting the present application, and that changes, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (12)

1. An image recommendation method, comprising:
acquiring a preset image, wherein the preset image comprises a preset face;
executing continuous shooting to obtain a plurality of frames of shot images;
carrying out joint point detection on each frame of the shot image to obtain human body joint points of each frame of the shot image;
acquiring human body information of each frame of shot image according to the preset human face and the human body joint point of each frame of shot image;
acquiring a recommended value of each frame of the shot image according to the human body information; and
sorting the plurality of frames of the shot images according to the recommended values, and recommending the shot images according to a sorting result.
2. The image recommendation method according to claim 1, wherein the human body joint points include face joint points; the acquiring the human body information of each frame of the shot image according to the preset human face and the human body joint point of each frame of the shot image comprises the following steps:
acquiring face features in the shot images according to the face joint points of each frame of the shot images;
matching the preset face with the face features in each frame of the shot images;
selecting a shot image in which the face features are successfully matched as an image to be recommended; and
acquiring the human body information according to the human body joint points in the image to be recommended.
3. The image recommendation method according to claim 2, wherein the human body information comprises posture information and expression information, and the acquiring the human body information according to the human body joint points in the image to be recommended comprises:
performing posture detection according to the human body joint points in the image to be recommended to acquire the posture information; and/or
performing expression detection according to the human body joint points in the image to be recommended to acquire the expression information.
4. The image recommendation method according to claim 3, wherein the posture detection comprises at least one of face sharpness detection, face occlusion detection, posture stretch detection and body height detection, and the posture information comprises a face sharpness, a face occlusion degree, a posture stretch degree and a body height; the performing posture detection according to the human body joint points in the image to be recommended to acquire the posture information comprises:
performing at least one of the face sharpness detection, the face occlusion detection, the posture stretch detection and the body height detection according to the human body joint points, so as to correspondingly acquire at least one of the face sharpness, the face occlusion degree, the posture stretch degree and the body height.
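Claim 4 names the posture measures without fixing formulas in this passage. The sketch below therefore uses illustrative proxies only (joint-confidence share for occlusion, wrist span over shoulder width for stretch, joint extent for height, gradient variance of a grayscale face crop for sharpness); none of these definitions should be read as the claimed ones.

```python
import numpy as np

def posture_info(joints, face_crop=None):
    """Illustrative proxies only; `joints` maps a name to (x, y, confidence)."""
    info = {}
    face_names = ("nose", "left_eye", "right_eye", "left_ear", "right_ear")
    face = [joints[name] for name in face_names if name in joints]
    if face:
        # face occlusion degree: share of face joints detected with low confidence
        info["face_occlusion"] = float(np.mean([conf < 0.3 for _, _, conf in face]))
    try:
        # posture stretch degree: wrist-to-wrist span relative to shoulder width
        lw, rw = joints["left_wrist"], joints["right_wrist"]
        ls, rs = joints["left_shoulder"], joints["right_shoulder"]
        shoulder_width = abs(ls[0] - rs[0]) or 1.0
        info["posture_stretch"] = abs(lw[0] - rw[0]) / shoulder_width
    except KeyError:
        pass
    # body height: vertical extent covered by all detected joints
    ys = [y for _, y, _ in joints.values()]
    if ys:
        info["body_height"] = float(max(ys) - min(ys))
    if face_crop is not None:
        # face sharpness: gradient-magnitude variance of a grayscale face crop
        gy, gx = np.gradient(np.asarray(face_crop, dtype=float))
        info["face_sharpness"] = float((gx ** 2 + gy ** 2).var())
    return info
```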
5. The image recommendation method according to claim 3, wherein the expression detection comprises at least one of smile detection, eye opening detection and blink detection, and the expression information comprises a smile degree, an eye opening degree and a blink degree; the performing expression detection according to the human body joint points in the image to be recommended to acquire the expression information comprises:
identifying face key points in the image to be recommended according to the face joint points; and
performing at least one of the smile detection, the eye opening detection and the blink detection according to the face key points, so as to correspondingly acquire at least one of the smile degree, the eye opening degree and the blink degree.
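The first step of claim 5, locating face key points from the face joint points, is sketched below: the face joints define a padded crop, and a caller-supplied landmark detector runs on that crop. The `detect_landmarks` callable and the 25 % margin are assumptions made for illustration; `image` is assumed to be a NumPy array.

```python
def face_keypoints(image, face_joints, detect_landmarks, margin=0.25):
    """Pad a face box derived from the face joint points and run a landmark
    detector on the crop; the detector and the margin are illustrative."""
    xs = [x for x, _ in face_joints]
    ys = [y for _, y in face_joints]
    w, h = max(xs) - min(xs), max(ys) - min(ys)
    x0 = max(int(min(xs) - margin * w), 0)
    y0 = max(int(min(ys) - margin * h), 0)
    x1 = int(max(xs) + margin * w)
    y1 = int(max(ys) + margin * h)
    crop = image[y0:y1, x0:x1]
    return detect_landmarks(crop)   # e.g. mouth, nose and eye key points
```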
6. The image recommendation method according to claim 5, wherein the face key points comprise a nose key point and mouth key points, and the performing the smile detection according to the face key points to acquire the smile degree comprises:
calculating a mouth corner bending degree according to the nose key point and the mouth key points;
calculating a mouth opening degree according to the mouth key points; and
acquiring the smile degree according to the mouth corner bending degree and the mouth opening degree.
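Claim 6 combines a mouth corner bending term and a mouth opening term. The sketch below is one possible reading with assumed, simplified formulas (image y grows downward, so raised mouth corners sit above the lip centre); the equal weighting of the two terms is likewise an assumption.

```python
import numpy as np

def smile_degree(nose, mouth_left, mouth_right, mouth_top, mouth_bottom):
    """Illustrative only: combine mouth corner bending and mouth opening.
    All arguments are (x, y) key points; the formulas are assumptions."""
    nose = np.asarray(nose, dtype=float)
    ml, mr = np.asarray(mouth_left, dtype=float), np.asarray(mouth_right, dtype=float)
    mt, mb = np.asarray(mouth_top, dtype=float), np.asarray(mouth_bottom, dtype=float)
    width = float(np.linalg.norm(mr - ml)) or 1.0
    corner_y = (ml[1] + mr[1]) / 2.0                   # average height of the mouth corners
    center_y = (mt[1] + mb[1]) / 2.0                   # height of the lip centre
    ref = abs(center_y - nose[1]) or 1.0               # nose-to-mouth distance as scale
    bending = max(0.0, center_y - corner_y) / ref      # corners pulled up => positive
    opening = float(np.linalg.norm(mt - mb)) / width   # vertical lip gap vs. mouth width
    return 0.5 * bending + 0.5 * opening
```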
7. The image recommendation method according to claim 5, wherein the face key points comprise eye key points, and the performing the eye opening detection according to the face key points to acquire the eye opening degree comprises:
calculating the eye opening degree of each eye according to the eye key points of that eye.
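Claim 7 leaves the per-eye computation open in this passage; a common choice is an eye-aspect-ratio style measure, used below as an assumed stand-in.

```python
import numpy as np

def eye_opening_degree(eye_points):
    """Eye-aspect-ratio style measure (an assumed stand-in).
    eye_points: six (x, y) landmarks p1..p6 around one eye."""
    p = np.asarray(eye_points, dtype=float)
    vertical = np.linalg.norm(p[1] - p[5]) + np.linalg.norm(p[2] - p[4])
    horizontal = 2.0 * np.linalg.norm(p[0] - p[3])
    return float(vertical / horizontal) if horizontal else 0.0
```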
8. The image recommendation method according to claim 5, wherein the eyes comprise a left eye and a right eye, and the performing the blink detection according to the face key points to acquire the blink degree comprises:
calculating the blink degree according to the eye opening degree of the left eye and the eye opening degree of the right eye.
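Claim 8 derives the blink degree from the two eye opening degrees. One hedged interpretation, taking the blink degree as how far both eyes fall below an assumed fully-open reference, is sketched here; the 0.2 reference value is not from this application.

```python
def blink_degree(left_opening, right_opening, open_reference=0.2):
    """Assumed reading: 0.0 when both eyes reach the open reference,
    1.0 when both eyes are fully closed."""
    left = max(0.0, open_reference - left_opening) / open_reference
    right = max(0.0, open_reference - right_opening) / open_reference
    return (left + right) / 2.0
```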
9. The image recommendation method according to claim 3, wherein the acquiring the recommended value of each frame of the shot image according to the human body information comprises:
acquiring a recommended value model representing a mapping relationship among the recommended value, a preset recommendation coefficient, the posture information and the expression information; and
acquiring the recommended value according to the recommended value model, the posture information and the expression information.
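The recommended value model of claim 9 is described only as a mapping between the recommended value, preset recommendation coefficients, the posture information and the expression information. A weighted-sum reading is sketched below; the coefficient names in the usage comment are invented for illustration.

```python
def recommended_value(posture_info, expression_info, coefficients):
    """Weighted-sum reading of the recommended value model; coefficient names
    are illustrative and any missing coefficient defaults to 0."""
    features = {**posture_info, **expression_info}
    return sum(coefficients.get(name, 0.0) * value for name, value in features.items())

# Assumed usage (all names and weights are illustrative):
# value = recommended_value(
#     {"face_sharpness": 0.8, "face_occlusion": 0.1},
#     {"smile_degree": 0.7, "eye_opening_degree": 0.3},
#     coefficients={"face_sharpness": 1.0, "face_occlusion": -1.0,
#                   "smile_degree": 2.0, "eye_opening_degree": 0.5},
# )
```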
10. An image recommendation apparatus characterized by comprising:
an acquisition module, configured to acquire a preset image, acquire a plurality of frames of shot images, acquire human body joint points of each frame of the shot images, acquire human body information of each frame of the shot images, and acquire a recommended value of each frame of the shot images;
a continuous shooting module, configured to perform continuous shooting;
a detection module, configured to perform joint point detection on each frame of the shot images; and
a recommending module, configured to sort the plurality of frames of the shot images according to the recommended values and recommend the shot images according to a sorting result.
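As a structural illustration of claim 10, the four named modules can be modelled as injected callables on one apparatus object. The wiring below is an implementation assumption; the claim itself only names the modules and their responsibilities.

```python
from typing import Callable, List, Sequence

class ImageRecommendationApparatus:
    """Sketch of the claimed module split; each module is modelled as a
    caller-supplied callable, which is an implementation assumption."""

    def __init__(self,
                 acquisition: Callable,                         # acquisition module
                 continuous_shooting: Callable[[], Sequence],   # continuous shooting module
                 detection: Callable,                           # detection module (joint points)
                 recommending: Callable):                       # recommending module (sort + recommend)
        self.acquisition = acquisition
        self.continuous_shooting = continuous_shooting
        self.detection = detection
        self.recommending = recommending

    def recommend(self, preset_image) -> List:
        frames = self.continuous_shooting()
        joints = [self.detection(frame) for frame in frames]
        values = self.acquisition(preset_image, frames, joints)   # per-frame recommended values
        return self.recommending(frames, values)
```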
11. A terminal, characterized in that the terminal comprises:
one or more processors, memory; and
one or more programs, wherein the one or more programs are stored in the memory and executed by the one or more processors, the programs comprising instructions for performing the image recommendation method of any of claims 1-9.
12. A non-transitory computer-readable storage medium containing a computer program which, when executed by one or more processors, causes the processors to implement the image recommendation method of any one of claims 1-9.
CN202110577167.4A 2021-05-26 2021-05-26 Image recommendation method and device, terminal and readable storage medium Pending CN113239220A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110577167.4A CN113239220A (en) 2021-05-26 2021-05-26 Image recommendation method and device, terminal and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110577167.4A CN113239220A (en) 2021-05-26 2021-05-26 Image recommendation method and device, terminal and readable storage medium

Publications (1)

Publication Number Publication Date
CN113239220A true CN113239220A (en) 2021-08-10

Family

ID=77138897

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110577167.4A Pending CN113239220A (en) 2021-05-26 2021-05-26 Image recommendation method and device, terminal and readable storage medium

Country Status (1)

Country Link
CN (1) CN113239220A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104683692A (en) * 2015-02-04 2015-06-03 广东欧珀移动通信有限公司 Continuous shooting method and continuous shooting device
CN110147744A (en) * 2019-05-09 2019-08-20 腾讯科技(深圳)有限公司 A kind of quality of human face image appraisal procedure, device and terminal
CN112036209A (en) * 2019-06-03 2020-12-04 Tcl集团股份有限公司 Portrait photo processing method and terminal
WO2021042364A1 (en) * 2019-09-06 2021-03-11 华为技术有限公司 Method and device for taking picture
CN111599002A (en) * 2020-05-15 2020-08-28 北京百度网讯科技有限公司 Method and apparatus for generating image
CN112464012A (en) * 2020-10-31 2021-03-09 浙江工业大学 Automatic scenic spot photographing system capable of automatically screening photos and automatic scenic spot photographing method

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673466A (en) * 2021-08-27 2021-11-19 深圳市爱深盈通信息技术有限公司 Method for extracting photo stickers based on face key points, electronic equipment and storage medium
CN115170441A (en) * 2022-08-30 2022-10-11 荣耀终端有限公司 Image processing method and electronic equipment
CN115170441B (en) * 2022-08-30 2023-02-07 荣耀终端有限公司 Image processing method and electronic equipment
WO2024046162A1 (en) * 2022-08-30 2024-03-07 华为技术有限公司 Image recommendation method and electronic device

Similar Documents

Publication Publication Date Title
CN113239220A (en) Image recommendation method and device, terminal and readable storage medium
US10964078B2 (en) System, device, and method of virtual dressing utilizing image processing, machine learning, and computer vision
US20200380594A1 (en) Virtual try-on system, virtual try-on method, computer program product, and information processing device
WO2019128508A1 (en) Method and apparatus for processing image, storage medium, and electronic device
CN109905593B (en) Image processing method and device
WO2021114814A1 (en) Human body attribute recognition method and apparatus, electronic device and storage medium
CN108810406B (en) Portrait light effect processing method, device, terminal and computer readable storage medium
US20120007859A1 (en) Method and apparatus for generating face animation in computer system
CN108182714A (en) Image processing method and device, storage medium
CN112164091B (en) Mobile device human body pose estimation method based on three-dimensional skeleton extraction
CN110637324B (en) Three-dimensional data system and three-dimensional data processing method
US11574477B2 (en) Highlight video generated with adaptable multimodal customization
CN106815803B (en) Picture processing method and device
WO2023143215A1 (en) Acquisition method for human body measurement data, processing method for human body measurement data, and device
CN110222597A (en) The method and device that screen is shown is adjusted based on micro- expression
CN112036209A (en) Portrait photo processing method and terminal
KR20140124087A (en) System and method for recommending hair based on face and style recognition
CN109986553B (en) Active interaction robot, system, method and storage device
KR20230060726A (en) Method for providing face synthesis service and apparatus for same
CN113326775B (en) Image processing method and device, terminal and readable storage medium
CN113313009A (en) Method, device and terminal for continuously shooting output image and readable storage medium
CN114943924A (en) Pain assessment method, system, device and medium based on facial expression video
KR101734212B1 (en) Facial expression training system
CN109035177B (en) Photo processing method and device
JP2023005567A (en) Machine learning program, machine learning method, and facial expression recognition device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20210810