CN115223238A - Information prompting method and device, storage medium and electronic equipment


Info

Publication number
CN115223238A
Authority
CN
China
Prior art keywords
information
shot
target
posture
prompting
Legal status
Pending
Application number
CN202210682243.2A
Other languages
Chinese (zh)
Inventor
柳阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Application filed by Ping An Bank Co Ltd
Priority to CN202210682243.2A
Publication of CN115223238A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces

Abstract

The application discloses an information prompting method and apparatus, a storage medium and an electronic device. The method comprises the following steps: recognizing the posture of an object to be shot in a shooting scene to obtain object posture information of the object to be shot; when the object posture information does not match preset posture information, determining the target distance between the object to be shot and the electronic device; acquiring the current time, and obtaining a target prompt time according to the target distance and the current time; and when the target prompt time is reached, outputting prompt information for prompting the object to be shot to adjust its posture. In this way, image quality can be improved.

Description

Information prompting method and device, storage medium and electronic equipment
Technical Field
The present application relates to the field of electronic technologies, and in particular, to an information prompting method and apparatus, a storage medium, and an electronic device.
Background
With the continuous development of electronic devices, the cameras on electronic devices such as smartphones offer ever higher resolution, so more and more users tend to take pictures with such devices. To meet users' photographing needs, major electronic device manufacturers continuously update and upgrade device hardware to improve camera resolution. However, taking a high-quality photograph requires not only a high-resolution camera but also an appropriate posture of the subject, and most subjects cannot pose properly on their own, so high-quality photographs often cannot be taken.
Disclosure of Invention
The embodiment of the application provides an information prompting method, an information prompting device, a storage medium and electronic equipment, which can improve the quality of images.
In a first aspect, an embodiment of the present application provides an information prompting method, which is applied to an electronic device, and includes:
recognizing the posture of an object to be shot in a shooting scene to obtain object posture information of the object to be shot;
when the object posture information does not match preset posture information, determining the target distance between the object to be shot and the electronic device;
acquiring the current time, and obtaining a target prompt time according to the target distance and the current time;
and when the target prompt time is reached, outputting prompt information for prompting the object to be shot to adjust its posture.
In a second aspect, an embodiment of the present application provides an information prompt apparatus, applied to an electronic device, including:
the information acquisition module is used for recognizing the posture of an object to be shot in a shooting scene to obtain object posture information of the object to be shot;
the distance determining module is used for determining the target distance between the object to be shot and the electronic device when the object posture information does not match preset posture information;
the time acquisition module is used for acquiring the current time and obtaining a target prompt time according to the target distance and the current time;
and the information output module is used for outputting, when the target prompt time is reached, prompt information for prompting the object to be shot to adjust its posture.
In a third aspect, an embodiment of the present application provides a storage medium, on which a computer program is stored, and when the computer program is executed on a computer, the computer is caused to execute an information prompting method provided in the embodiment of the present application.
In a fourth aspect, an embodiment of the present application further provides an electronic device, which includes a memory and a processor, where the processor is configured to execute the information prompting method provided in the embodiment of the present application by calling a computer program stored in the memory.
In the embodiments of the application, when the object posture information of the object to be shot does not match the preset posture information, the target prompt time is obtained according to the target distance between the object to be shot and the electronic device and attribute information of the object to be shot; when the target prompt time is reached, prompt information for prompting the object to be shot to adjust its posture is output, so that the object to be shot can adjust its posture based on the output prompt information and strike a suitable pose, improving the quality of the image obtained by shooting the object.
Drawings
The technical solutions and advantages of the present application will become apparent from the following detailed description of specific embodiments of the present application when taken in conjunction with the accompanying drawings.
Fig. 1 is a schematic flow chart of an information prompting method provided in an embodiment of the present application.
Fig. 2 is a schematic view of a first scenario of an information prompting method according to an embodiment of the present application.
Fig. 3 is a schematic view of a second scenario of an information prompting method according to an embodiment of the present application.
Fig. 4 is a schematic structural diagram of an information prompt apparatus according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of a first electronic device according to an embodiment of the present application.
Fig. 6 is a schematic structural diagram of a second electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made to the drawings, in which like reference numerals denote like elements throughout the figures; the principles described are illustrated as implemented in a suitable computing environment. The following description is based on the illustrated embodiments of the application and should not be taken as limiting the application with respect to other embodiments not detailed herein.
It should be noted that the terms "first", "second", and "third", etc. in this application are used to distinguish different objects, and are not used to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or modules is not limited to only those steps or modules recited, but rather, some embodiments include additional steps or modules not recited, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The embodiments of the application provide an information prompting method and apparatus, a storage medium and an electronic device. The execution body of the information prompting method may be the information prompting apparatus provided by the embodiments of the application, or an electronic device integrating the apparatus, and the apparatus may be implemented in hardware or software. The electronic device may be a smartphone, tablet computer, palmtop computer, notebook computer, or another device equipped with a processor and capable of information prompting.
Referring to fig. 1, fig. 1 is a schematic flow chart of an information prompting method provided in an embodiment of the present application, where the flow chart may include:
in 101, the posture of the object to be shot in the shooting scene is recognized, and the object posture information of the object to be shot is obtained.
With the continuous development of electronic devices, the cameras on electronic devices such as smartphones offer ever higher resolution, so more and more users tend to take pictures with such devices. To meet users' photographing needs, electronic device manufacturers continuously update and upgrade device hardware to improve camera resolution. However, taking a high-quality photograph requires not only a high-resolution camera but also an appropriate posture of the subject, and most subjects cannot pose properly, so high-quality photographs often cannot be taken.
In this embodiment, the electronic device identifies the posture of the object to be photographed in the photographing scene to obtain object posture information of the object to be photographed.
After the electronic device starts a shooting application (for example, the system "camera" application of the electronic device) according to a user operation, the scene at which the camera of the electronic device is aimed is the shooting scene. For example, after a user taps the icon of the "camera" application on the electronic device to start it and aims the camera at a certain scene, that scene is the shooting scene. From the above description, those skilled in the art will understand that the shooting scene is not one specific fixed scene, but whatever scene the camera is aimed at in real time as its orientation changes.
The object to be photographed is an object which needs to be photographed or recorded, such as a person, a cat or a dog, etc. who needs to photograph or record.
The object posture information may include the posture of the object to be shot. Referring to fig. 2, when the object to be shot is in the posture shown in fig. 2, the object posture information may be, for example, standing with both hands holding the hat.
Since key point positions generally reflect the pose of the object to be shot, the object posture information may further include the key point positions of the object to be shot. The key points may correspond to joints with a certain degree of freedom on the human body, such as the neck, shoulders, elbows, wrists, waist, knees and ankles. Key points may also include, for example, the mouth, nose, left eye, right eye, left ear, right ear, neck, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle and right ankle.
At 102, when the object posture information does not match preset posture information, the target distance between the object to be shot and the electronic device is determined.
The preset posture information can be collected in advance and stored in the electronic device. For example, a professional photographer may design different poses, and these poses, or the key point positions corresponding to them, are used as candidate posture information.
Before the object posture information of the object to be shot is obtained, the user selects, for example, one piece of candidate posture information from the plurality of candidates as the preset posture information, or the electronic device determines one candidate as the preset posture information. Then, after the object posture information of the object to be shot is acquired, the electronic device matches it against the preset posture information.
If the object posture information does not match the preset posture information, the electronic device determines the target distance between the object to be shot and the electronic device, for example by infrared ranging or time-of-flight (TOF) ranging.
If the object posture information matches the preset posture information, the electronic device can directly shoot the object to be shot to obtain a corresponding image, or it can shoot after receiving a shooting instruction.
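For illustration only, the matching check in 102 could be as simple as a mean keypoint distance. Below is a minimal Python sketch, assuming poses are represented as normalized (x, y) keypoint arrays; the threshold value and function names are assumptions, not taken from the patent.

```python
import numpy as np

def poses_match(object_keypoints: np.ndarray,
                preset_keypoints: np.ndarray,
                threshold: float = 0.05) -> bool:
    """Return True when the mean per-keypoint distance is below the threshold."""
    # Both arrays hold normalized (x, y) rows in the same keypoint order.
    mean_dist = float(np.mean(
        np.linalg.norm(object_keypoints - preset_keypoints, axis=1)))
    return mean_dist < threshold
```

When this returns False, the device would proceed to measure the target distance (e.g., via infrared or TOF ranging) as described above.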
In 103, the current time is obtained, and the target prompt time is obtained according to the target distance and the current time.
Different target distances may correspond to different target prompt times. For example, the target distance may be inversely related to the target prompt time, i.e., the farther the target distance, the earlier the target prompt time. For example, assuming the target distance between the object to be shot and the electronic device is 10 cm and the current time is 10:20, the target prompt time may be 10:22; assuming the target distance is 20 cm and the current time is 10:20, the target prompt time may be 10:21.
In an optional embodiment, a mapping between distance and prompt time may be preset. For example, distance D1 corresponds to T1 minutes after the current time (if the current time is T, the prompt time is T + T1), distance D2 to T2 minutes after, and distance D3 to T3 minutes after. Assuming the target distance between the object to be shot and the electronic device is D1, the current time is 10:10 and T1 is 5, the electronic device determines the target prompt time to be 10:15.
It will be appreciated that the target prompt time may not be earlier than the current time.
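A hedged sketch of step 103 follows: mapping the measured distance to a prompt-time offset. The breakpoints and minute offsets are invented for illustration (they reproduce the 10 cm/20 cm example above); the patent only states that a farther distance may map to an earlier prompt.

```python
from datetime import datetime, timedelta

# Each entry: (upper distance bound in cm, prompt offset in minutes).
# Farther distances map to smaller offsets, i.e., earlier prompts.
DISTANCE_TO_OFFSET = [(10.0, 2), (20.0, 1)]

def target_prompt_time(distance_cm: float, now: datetime) -> datetime:
    for max_distance, minutes in DISTANCE_TO_OFFSET:
        if distance_cm <= max_distance:
            return now + timedelta(minutes=minutes)
    return now  # the prompt time is never earlier than the current time
```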
In 104, when the target prompt time is reached, prompt information for prompting the object to be shot to adjust its posture is output.
In this embodiment, when the target prompt time is reached, the electronic device outputs prompt information for prompting the object to be shot to adjust its posture, so that the object to be shot can adjust its posture until its object posture information matches the preset posture information.
For example, the electronic device may display the preset posture information on the screen, so that the object to be shot can adjust its posture according to the displayed information.
For another example, the electronic device may output, by on-screen display or voice broadcast, a prompt message indicating how to adjust, such as "please raise the left hand by 1 cm".
In this embodiment, when the object posture information of the object to be shot does not match the preset posture information, the target prompt time is obtained according to the target distance between the object to be shot and the electronic device and attribute information of the object to be shot; when the target prompt time is reached, prompt information for prompting the object to be shot to adjust its posture is output, so that the object to be shot can adjust its posture based on the output prompt information and strike a suitable pose, improving the quality of the image obtained by shooting the object.
Furthermore, the corresponding target prompt time can be determined according to the target distance between the object to be shot and the electronic device, and the prompt information for prompting posture adjustment is output only when that time is reached, which improves the flexibility of outputting prompt information.
In an optional embodiment, obtaining the target prompt time according to the target distance and the current time includes:
determining a target age value of an object to be shot;
and acquiring target prompt time according to the target distance, the current time and the target age value.
In this embodiment, the electronic device may further determine a target age value of the object to be photographed, and obtain the target prompt time according to the target distance between the object to be photographed and the electronic device, the current time, and the target age value of the object to be photographed.
Wherein, different target age values can correspond to different target prompt times.
Considering that a young subject may not have much patience while an older subject has more, at the same target distance the prompt information may be output relatively early for a young subject and relatively late for an older one. For example, the electronic device may obtain a plurality of candidate prompt times according to the target distance, and then determine the target prompt time from them according to the target age value.
For example, assuming the candidate prompt times determined according to the target distance and the current time (10:18) include 10:20, 10:21 and 10:22: if the target age value is 10 years old, the target prompt time may be 10:20; if the target age value is 20 years old, the target prompt time may be 10:22.
In an optional embodiment, the mapping between distance and candidate prompt times may be preset, for example distance D1 corresponding to candidate prompt times T11, T12 and T13, distance D2 to T21, T22 and T23, and distance D3 to T31, T32 and T33. The mapping between age value and prompt time can also be preset, such as age value E1 corresponding to prompt time T11, age value E2 to T12, age value E3 to T13, and so on. Then, assuming the target distance is D1 and the target age value is E1, the electronic device may determine the target prompt time to be T11.
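As a rough sketch of this selection step, the lookup tables below mirror the symbolic example (D1, T11, etc.); the age breakpoints and all table contents are illustrative assumptions.

```python
# Hypothetical tables: distance bucket -> candidate prompt times,
# and age brackets -> candidate index (younger subjects get earlier slots).
CANDIDATES_BY_DISTANCE = {
    "D1": ["T11", "T12", "T13"],
    "D2": ["T21", "T22", "T23"],
    "D3": ["T31", "T32", "T33"],
}
AGE_TO_INDEX = [(12, 0), (18, 1), (float("inf"), 2)]

def pick_prompt_time(distance_bucket: str, age: float) -> str:
    candidates = CANDIDATES_BY_DISTANCE[distance_bucket]
    for max_age, index in AGE_TO_INDEX:
        if age <= max_age:
            return candidates[index]
    return candidates[-1]
```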
In an optional embodiment, determining a target age value of the object to be photographed includes:
acquiring a face image of an object to be shot;
and carrying out age identification on the face image by using an age identification model to obtain a target age value of the object to be shot.
For example, the electronic device may capture an image of the object to be shot to obtain a preview image, crop the face region from the preview image to obtain the face image of the object to be shot, and then input the face image into a pre-trained age recognition model to perform age recognition and obtain the target age value of the object to be shot.
In an optional embodiment, before acquiring the face image of the object to be photographed, the method may further include:
and establishing an age identification model.
In this embodiment, a plurality of different face images are collected as training data; to improve the accuracy of the resulting age recognition model, the collected face images should cover as many different age values as possible.
Further, deep learning can be performed according to a related technology based on the acquired training data to obtain a trained age identification model.
In an alternative embodiment, the influence of the external environment on age recognition is taken into account: the environmental factors that affect age recognition can be represented by age correction parameters, and in subsequent age recognition the target age value is corrected according to the influence of these parameters.
Therefore, after establishing the age identification model, the method may further include:
and performing statistical test on the training data based on the age correction parameters, and determining the corresponding relation between the correction parameter values of the age correction parameters and the corrected age values.
The age correction parameters comprise at least one of illumination parameters, facial expression parameters and facial pose parameters.
Taking the illumination parameter as an example: for the same face, if the external illumination is strong, i.e., the light intensity value in the face image is large, the age value obtained by age recognition tends to be smaller than the actual age value; conversely, if the external illumination is weak, i.e., the light intensity value in the face image is small, the recognized age value tends to be larger than the actual age value.
As for the facial expression parameter, the facial expression may be angry, sad, happy, and so on. For the same face, if the facial expression is happy, the age value obtained by age recognition may be smaller than the actual age value; if the facial expression is sad, the recognized age value may be larger than the actual age value.
The face pose also affects the result of age recognition. For example, for the same face, different pitch angle values obtained by face pose estimation may make the recognized age value larger or smaller than the actual age value.
Of course, the age correction parameter may include other parameters that affect the age recognition result in addition to the above parameters. The embodiment of the application aims to determine the correction value of the correction parameter values of different age correction parameters to the age of the face, so that the accuracy of age estimation is improved.
In the embodiment of the present application, after the age recognition model is established, statistical tests can be performed on all face images serving as training data based on the age correction parameters, to obtain the degree to which each parameter influences age recognition; optionally, this influence can be described by a corrected age value.
Assuming the age correction parameters include only the illumination parameter, a statistical test can be performed on all training face images based on illumination: perform illumination detection on the face images according to the related technology, test different illumination parameter values, and determine, for each tested illumination parameter value, the corrected age value for age recognition. The target age values of all face images are determined by the trained age recognition model, the actual age values of all face images were determined before the training data was collected, and the corrected age value of an illumination parameter value for a face image is obtained by subtracting the estimated age value from the actual age value of the face image.
For example, when the illumination parameter value is a1, the target age value determined for face image A by the age recognition model is b1, and the actual age value of face image A determined when the training data was collected is b2; then, when the illumination parameter value is a1, the corrected age value for age recognition is (b2 − b1).
Further, the embodiment of the application can also respectively determine the corresponding corrected age values based on different facial expression parameter values or facial pose parameter values.
In the embodiment of the present application, a statistical test is performed according to the above process, and finally an estimation function between the correction parameter values and the corrected age value can be obtained:

h(x) = θ0 + θ1x1 + θ2x2 + … + θnxn

where h(x) is the corrected age value; x1, x2, …, xn are the correction parameter values of the different age correction parameters; θ1, θ2, …, θn are the coefficients corresponding to the age correction parameters; and θ0 is the offset of all age correction parameters to the corrected age value.

For example, x1 may represent the illumination parameter value of the illumination parameter among the age correction parameters, x2 the facial expression parameter value of the facial expression parameter, x3 the face pose parameter value of the face pose parameter, and xn the parameter value of the nth age correction parameter. θ1, θ2, …, θn correspond to the illumination parameter, the facial expression parameter, the face pose parameter and the nth parameter respectively; once determined, θ0, θ1, θ2, …, θn do not change for the same age recognition model.
Alternatively, the corrected age value may range within [-100, 100].
After the age recognition model has been trained through the above process and the estimation function has been determined, when age recognition is performed based on the same age recognition model, there is no need to repeat the steps of establishing the model and statistically testing the training data to determine the correspondence between correction parameter values and corrected age values.
When age recognition is needed: acquire a face image of the object to be shot; input the face image into the pre-trained age recognition model to obtain the target age value of the object to be shot; determine the target correction parameter values of the age correction parameters in the face image; determine the corrected age value of the object to be shot according to the estimated age value and the target correction parameter values; and obtain the target prompt time according to the target distance, the current time and the corrected age value. For the specific implementation of obtaining the target prompt time from the target distance, the current time and the corrected age value, refer to the implementation that uses the target age value, which is not repeated here. The target age value may range within [0, 100].
In this embodiment, the age correction parameter includes at least one of an illumination parameter, a facial expression parameter, and a facial pose parameter.
Next, target correction parameter values of the age correction parameters in the face image of the object to be photographed are determined for different age correction parameters.
The process of determining the target illumination parameter value of the illumination parameter is as follows:
the illumination detection can be performed on the face image of the object to be shot according to the related technology, so that a target illumination parameter value of the illumination parameter is obtained, for example, a light intensity value in the face image of the object to be shot and the like.
The process of determining the target facial expression parameter value of the facial expression parameter is as follows:
the facial expression of the face image of the object to be shot can be identified according to the related technology, and the target facial expression parameter value of the facial expression parameter is obtained.
The process of determining the target face pose parameter value for the face pose parameter is as follows:
the face image of the object to be shot can be subjected to face attitude detection according to the correlation technique, and target face attitude parameter values of the face attitude parameters, such as a rotation angle value and a pitch angle value of a target face, are obtained.
After the target correction parameter values are determined, the target corrected age value corresponding to them is calculated based on the previously determined correspondence between correction parameter values and corrected age values, that is, the estimation function above. For example, the target corrected age value may be calculated by substituting the target correction parameter values into the estimation function. The target corrected age value also ranges within [-100, 100].
When the target corrected age value is determined, the sum of the target age value and the target corrected age value is determined as a corrected age value of the subject to be photographed. For example, the sum of the target age value and the target corrected age value may be directly calculated, and the result of the calculation may be finally determined as the corrected age value of the subject to be photographed.
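The corrected-age computation can be sketched as below, assuming the coefficients of the estimation function h(x) have already been fitted by the statistical test described above; the concrete coefficient values are placeholders, not values from the patent.

```python
import numpy as np

# Illustrative fitted values; real values come from the statistical test.
THETA0 = 0.0                        # offset term of the estimation function
THETA = np.array([0.8, -0.5, 0.3])  # coefficients for illumination,
                                    # facial expression and face pose

def corrected_age(target_age: float, x: np.ndarray) -> float:
    """x holds the target correction parameter values x1..xn."""
    h = THETA0 + float(THETA @ x)      # h(x) = θ0 + θ1x1 + ... + θnxn
    h = max(-100.0, min(100.0, h))     # target corrected age value in [-100, 100]
    return target_age + h              # sum taken as the corrected age value
```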
In an optional embodiment, obtaining the target prompt time according to the target distance and the current time includes:
determining the sex of an object to be shot;
and acquiring target prompt time according to the target distance, the current time and the gender.
Wherein different genders may correspond to different target cue times.
For example, the electronic device may obtain a plurality of candidate prompt times according to the target distance and the current time. Then, the electronic device can determine the target prompt time from the plurality of candidate prompt times according to the gender.
For example, assuming the candidate prompt times determined according to the target distance and the current time (10:15) include 10:20 and 10:21: if the gender is female, the target prompt time may be 10:21; if male, the target prompt time may be 10:20.
In an optional embodiment, the mapping between distance and candidate prompt times may be preset, for example distance D1 corresponding to candidate prompt times T11 and T12, distance D2 to T21 and T22, and distance D3 to T31 and T32. The mapping between gender and prompt time can also be preset, for example female corresponding to prompt times T11, T21 and T31, and male to T12, T22 and T32. Then, assuming the target distance is D2 and the gender is female, the electronic device may determine the prompt time to be T21.
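This gender-keyed lookup could be expressed as below; the table merely mirrors the symbolic example (D1/T11, etc.) and is an assumption for illustration.

```python
# Hypothetical table: (distance bucket, gender) -> prompt time.
PROMPT_TIME_BY_DISTANCE_AND_GENDER = {
    ("D1", "female"): "T11", ("D1", "male"): "T12",
    ("D2", "female"): "T21", ("D2", "male"): "T22",
    ("D3", "female"): "T31", ("D3", "male"): "T32",
}

def prompt_time(distance_bucket: str, gender: str) -> str:
    return PROMPT_TIME_BY_DISTANCE_AND_GENDER[(distance_bucket, gender)]
```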
In an optional embodiment, determining the gender of the subject to be photographed comprises:
acquiring a face image of an object to be shot;
and carrying out gender identification on the face image by using a gender identification model to obtain the gender of the object to be shot.
For example, the electronic device may capture an image of a subject to be captured to obtain a preview image. Then, the electronic device can cut out the face area from the preview image to obtain the face image of the object to be shot. And then, the electronic equipment can input the face image into a pre-trained gender identification model so as to identify the gender of the face image and obtain the gender of the object to be shot.
In an optional embodiment, before acquiring the face image of the object to be photographed, the method may further include:
acquiring a plurality of training face images;
acquiring a high-level feature set of a plurality of training face images by using a first neural network of a preset model, and acquiring a bottom-level feature set of the plurality of training face images by using a second neural network of the preset model;
fusing the high-level feature set and the bottom-level feature set to obtain a fused feature set;
and inputting the fusion characteristic set serving as training data into a prediction module of a preset model for training to obtain a gender identification model.
The face database comprises a plurality of face images, including frontal face images, side face images and multi-angle face images. The multi-angle face images comprise face images at multiple downward-looking angles, multiple upward-looking angles, multiple side angles, and so on. A plurality of training face images can therefore be acquired from the face database.
One face image in the plurality of face images can correspond to one user, and the plurality of face images can also correspond to the same user. The plurality of face images corresponding to the same user can be multi-angle face images.
The plurality of face images in the face database can comprise face images with high definition, face images with low definition, face images with different degrees of noise and face images with multiple postures. The multi-pose face images comprise smiling face images, serious face images and other face images with various expressions.
The face database can be built by the user, for example by collecting a large number of face images from the Internet, collecting face images of the user and surrounding relatives and friends, or shooting a large number of face images on the street.
The face database may also utilize existing face databases, such as the CelebA database, etc.
It should be noted that each face image in the face database corresponds to a gender label.
A plurality of training face images can be selected randomly from the face database, or selected according to user information. For example, the selection can be based on the images the user shoots: if the images shot by the user contain mostly East Asian faces, East Asian faces take a larger proportion of the training face images selected from the face database; if they contain many children's faces, children's faces take a larger proportion; and if they contain many female faces, female faces take a larger proportion.
An image has three major classes of bottom-level features, namely color, texture and shape; bottom-level features may also include intensity, orientation and edge features. The bottom-level feature set comprises a set of the various bottom-level features of the plurality of training face images.
High-level features are features extracted on the basis of the bottom-level features that better reflect the semantic information of the image. They can also be understood as being constructed from the bottom-level features by a specific algorithm (such as a convolutional neural network), and generally refer to more complex features, such as the contour of an object in the image. Compared with bottom-level features extracted directly from the raw image information, high-level feature information is more expressive and takes the context of the scene into account. The high-level feature set comprises a set of the various high-level features of the plurality of training face images.
Specifically, a first neural network of a preset model, for example, a deep convolutional neural network of the preset model, may be used to obtain a high-level feature set of a plurality of training face images, and a second neural network of the preset model, for example, a shallow convolutional neural network of the preset model, may be used to obtain a bottom-level feature set of a plurality of training face images.
Feature fusion can be understood as integrating different features from different sources and removing redundancy; the resulting fused information facilitates subsequent analysis and processing. Specifically, feature fusion may be implemented by an algorithm, for example one based on Bayesian decision theory, sparse representation theory or deep learning theory.
After the high-level feature set and the bottom-level feature set are obtained, they are fused to obtain the fused feature set. Specifically, the high-level features and bottom-level features corresponding to the same input information are fused into one fusion feature, where the same input information may be the same face image or a particular attribute within it, such as skin color. After multiple pieces of input information are processed, multiple high-level features and multiple bottom-level features are obtained, multiple fusion features are derived from them, and these fusion features form the fused feature set.
A convolutional neural network is a feed-forward neural network whose artificial neurons respond to units within a local receptive field; it performs excellently on large-scale image processing. A convolutional neural network includes convolutional layers and pooling layers. Specifically, a deep convolutional neural network includes more convolutional and pooling layers than a shallow one. The deep convolutional neural network can be used to acquire the high-level feature set of an image, and the shallow convolutional neural network the bottom-level feature set.
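A minimal PyTorch sketch of this two-branch arrangement follows: a deeper convolutional stack for high-level features, a shallow stack for bottom-level features, fused by concatenation before the prediction head. The patent does not specify an architecture, so all layer sizes and the class name are assumptions.

```python
import torch
import torch.nn as nn

class GenderNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.deep = nn.Sequential(             # "first neural network"
            nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.shallow = nn.Sequential(          # "second neural network"
            nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64 + 16, 2)      # prediction module (male/female)

    def forward(self, x):
        high = self.deep(x).flatten(1)         # high-level feature set
        low = self.shallow(x).flatten(1)       # bottom-level feature set
        fused = torch.cat([high, low], dim=1)  # fused feature set
        return self.head(fused)
```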
And after the fusion feature set is obtained, inputting the fusion feature set serving as training data into a prediction module of a preset model for training, wherein the prediction module of the preset model performs training learning according to the training data, and optimizes each calculation parameter in the preset model to obtain the trained preset model.
Specifically, a training face image may be obtained first, its bottom-level and high-level feature sets extracted, and the fused result input into the prediction module of the preset model, such as a logistic regression classifier, for training and learning to obtain a prediction result. If the prediction is correct, the trained calculation parameters of the preset model are retained; if incorrect, the calculation parameters are modified and training continues until the prediction is correct. Other training face images are then substituted in and the steps repeated until all training face images have been predicted once; all of them are then predicted again one or more times until the prediction results no longer change, yielding the final optimized calculation parameters. The preset model with the final optimized calculation parameters is the trained gender recognition model.
A plurality of training face images are input into the preset model, and the preset model performs gender prediction on each to obtain a prediction result. For example, if the probability of being predicted as male is 70% and as female 30%, the prediction result is male with probability 70%. The prediction is scored against the known correct label: if the correct label is male the prediction is correct, and if it is female the prediction is wrong. The calculation parameters of the prediction model are adjusted according to the accuracy of the prediction results; when the accuracy of the adjusted model can no longer be improved and the prediction probability of each training face image can no longer be raised overall, the calculation parameters at that point are considered optimal, and the preset model with the optimal calculation parameters is the trained gender recognition model.
It should be noted that, in the training process, the calculation parameters of the prediction module may be changed, and the calculation parameters of the first neural network and the second neural network may also be changed to optimize all the calculation parameters in the whole preset model.
It should be noted that the above process is a training process for a preset model. The training process can be carried out in the server, after the training is finished, the trained gender recognition model is transplanted to electronic equipment such as a smart phone, and the electronic equipment judges the face image of the object to be shot by utilizing the trained gender recognition model. The training process can also be carried out in the electronic equipment, and after the training is finished, the electronic equipment directly judges the face image of the object to be shot by using the trained gender identification model. The training process can also be carried out in a server, when the electronic equipment needs to judge the face image of the object to be shot, the face image of the object to be shot is sent to the server, the server judges the face image and sends the judgment result back to the electronic equipment.
In some embodiments, after the gender of the face image of the object to be shot is obtained, the preview image of the object to be shot (the image displayed on the screen of the electronic device in real time) may be optimized according to the gender: a low degree of facial beautification if the face image is judged male, and a high degree if judged female. Different optimization strategies can be set per gender; for females, for example, skin whitening, skin smoothing, dark-circle removal, makeup effects and the like can be applied.
In an optional embodiment, after the prompt information for prompting the object to be shot to adjust its posture is output when the target prompt time is reached, the method further includes:
and when the object posture information of the object to be shot is matched with the preset posture information, shooting the object to be shot to obtain a target image.
In this embodiment, when the object posture information of the object to be photographed matches the preset posture information, the electronic device may automatically photograph the object to be photographed, so as to obtain the target image.
In an optional embodiment, when the object posture information of the object to be shot matches the preset posture information, the user may perform a corresponding shooting operation, such as issuing a voice command like "please shoot", so that the electronic device receives a shooting instruction and, in response, shoots the object to be shot to obtain the target image.
In an optional embodiment, acquiring object posture information of an object to be photographed includes:
acquiring a human body image of an object to be shot;
carrying out key point detection on the human body image by using a key point detection model to obtain a plurality of key points corresponding to the object to be shot;
and determining object posture information of the object to be shot according to a plurality of key points corresponding to the object to be shot.
For example, the electronic device may obtain a preview image of the object to be photographed, and perform portrait detection on the preview image by using a pre-trained portrait detection model to obtain a human body bounding box. The electronic device can cut out a human body image from the preview image based on the human body boundary frame to be used as a human body image of the object to be shot. The electronic equipment can utilize a pre-trained key point detection model to perform key point detection on the human body image to obtain a plurality of key point positions corresponding to the object to be shot. The electronic equipment can determine the object posture information of the object to be shot according to a plurality of key point positions corresponding to the object to be shot. For example, the electronic device may use a plurality of key point positions corresponding to the object to be photographed as object posture information of the object to be photographed. The electronic equipment can also perform gesture recognition based on the plurality of key point positions by using a pre-trained gesture recognition model to obtain the gesture of the object to be shot. The electronic device may use the posture of the object to be photographed as object posture information of the object to be photographed.
In an optional embodiment, after obtaining the human body image of the object to be shot, the electronic device inputs the human body image into a preset key point detection model to obtain a plurality of corresponding heatmaps, and obtains a plurality of key point coordinates of the object to be shot from the heatmaps, where one heatmap corresponds to one key point coordinate.
For example, the electronic device may train a preset Cascaded Pyramid Network (CPN) model in advance and use the trained model as the preset key point detection model. After obtaining the human body image, the electronic device may input it into the preset key point detection model to obtain the corresponding heatmaps.
After obtaining the heatmaps, the electronic device may find the position of the maximum-probability pixel on each heatmap; that position is the key point coordinate corresponding to the heatmap, which yields the plurality of key point coordinates of the object to be shot. The electronic device may use these key point coordinates as the object posture information of the object to be shot.
The number of key points may be 14, 17, 21, etc., and is not limited here. Each key point coordinate includes an x-coordinate and a y-coordinate, i.e., can be represented as an (x, y) pair.
It is understood that, in the embodiment of the present application, heatmaps and key point coordinates are in one-to-one correspondence. For example, 17 heatmaps yield 17 key point coordinates, and 21 heatmaps yield 21 key point coordinates.
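Decoding key point coordinates from the heatmaps can be sketched as below: each key point's (x, y) is the location of the maximum-probability pixel of its heatmap. Function and parameter names are illustrative.

```python
import numpy as np

def heatmaps_to_keypoints(heatmaps: np.ndarray) -> np.ndarray:
    """heatmaps: (K, H, W), one heatmap per key point; returns (K, 2)."""
    k, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(k, -1).argmax(axis=1)  # max-probability pixel
    ys, xs = np.unravel_index(flat_idx, (h, w))
    return np.stack([xs, ys], axis=1)  # each key point as (x, y)
```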
In some embodiments, the electronic device inputting the human body image into the preset key point detection model to obtain the plurality of corresponding heatmaps may include:
the electronic device inputs the human body image into the preset key point detection model to obtain a plurality of corresponding groups of feature maps, where each group comprises a plurality of feature maps of different sizes;
and the electronic device fuses the feature maps within each group to obtain the plurality of corresponding heatmaps, where one group of feature maps corresponds to one heatmap.
For example, the electronic device may arrange the feature maps of different scales in each group in descending order, determine the feature map in the middle of each group as the first feature map, and then upsample or downsample the other feature maps in the group so that their size matches that of the first feature map. The first feature map and the resampled feature maps are then fused to obtain the corresponding heatmaps.
It is to be understood that upsampling enlarges the size of a feature map and downsampling reduces it. In the embodiment of the present application, feature maps smaller than the first feature map in each group are upsampled, and feature maps larger than the first feature map are downsampled.
In an optional embodiment, the electronic device may input the human body image into the preset key point detection model and obtain corresponding groups of second feature maps through the residual blocks of multiple convolutional layers (e.g., convolutional layers c2, c3, c4 and c5) of the model, where each group of second feature maps comprises a plurality of second feature maps and each convolutional layer corresponds to one second feature map in each group. The depth of convolutional layer c2 is smaller than that of c3, the depth of c3 smaller than that of c4, and the depth of c4 smaller than that of c5. The electronic device may then connect the second feature maps in each group to different numbers of bottleneck blocks to obtain the corresponding groups of feature maps, each group comprising feature maps of different sizes; the deeper the convolutional layer, the more bottleneck blocks its feature map is connected to. Finally, the electronic device may upsample the feature maps in each group to a uniform size and fuse them, for example by pixel-by-pixel addition, to obtain the corresponding heatmaps.
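The fusion of one group of feature maps can be sketched as below: resize every map to a common reference size, then add pixel by pixel. This is a minimal PyTorch illustration; the choice of bilinear interpolation and the reference size are assumptions.

```python
import torch
import torch.nn.functional as F

def fuse_group(feature_maps: list) -> torch.Tensor:
    """feature_maps: (C, Hi, Wi) tensors from one group; returns (C, H0, W0)."""
    ref_size = feature_maps[0].shape[-2:]  # reference size for the whole group
    resized = [
        F.interpolate(m.unsqueeze(0), size=ref_size, mode="bilinear",
                      align_corners=False).squeeze(0)
        for m in feature_maps
    ]
    return torch.stack(resized).sum(dim=0)  # pixel-by-pixel addition
```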
In an optional embodiment, after the electronic device inputs the human body image into the preset key point detection model and obtains the plurality of corresponding heatmaps, the method may further include:
the electronic device performs Gaussian filtering on each heatmap to obtain a plurality of corresponding target heatmaps;
and the electronic device obtaining the plurality of key point coordinates of the object to be shot according to the heatmaps may include:
the electronic device obtains the key point coordinates of the object to be shot from the target heatmaps, where one target heatmap corresponds to one key point coordinate.
For example, since each of the heatmaps obtained by the electronic device contains more or less noise, the electronic device may perform Gaussian filtering on each heatmap to filter out its noisy points and obtain the corresponding target heatmaps, from which it then obtains the key point coordinates of the object to be shot, one key point coordinate per target heatmap.
It should be noted that a noisy point is a point that interferes with obtaining the key point; its presence may make the key point determination inaccurate.
It can be understood that determining key point coordinates from the target heatmaps is more accurate than determining them from the raw heatmaps, but producing the target heatmaps consumes some processor resources. Therefore, when processor resources are sufficient, the key point coordinates can be determined from the target heatmaps; when they are insufficient, from the raw heatmaps.
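The denoising step above can be sketched as below, smoothing each heatmap with a Gaussian filter before taking the argmax; the sigma value is an assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def denoise_heatmaps(heatmaps: np.ndarray, sigma: float = 1.5) -> np.ndarray:
    """heatmaps: (K, H, W); returns the Gaussian-smoothed target heatmaps."""
    return np.stack([gaussian_filter(h, sigma=sigma) for h in heatmaps])
```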
In some embodiments, before acquiring the human body image of the object to be photographed, the method may further include:
the electronic equipment acquires a plurality of sample human body images;
the electronic equipment acquires a plurality of key point coordinates corresponding to the human body in each sample human body image;
the electronic equipment trains a preset neural network model by utilizing a plurality of sample human body images and a plurality of key point coordinates corresponding to a human body in each sample human body image;
and the electronic equipment takes the trained neural network model as a preset key point detection model.
For example, the electronic device may obtain the plurality of sample human body images from a database or from another device. Each sample human body image is marked with a plurality of key point coordinates, and the key point coordinates marked on each sample human body image correspond to the human body in that image. In this embodiment of the application, the electronic device may obtain the plurality of key point coordinates marked on each sample human body image, that is, the plurality of key point coordinates corresponding to the human body in each sample human body image.
After obtaining the plurality of sample human body images and the plurality of key point coordinates corresponding to the human body in each sample human body image, the electronic device may train the preset neural network model by using the plurality of sample human body images and the plurality of key point coordinates corresponding to the human body in each sample human body image. The trained neural network model is the preset key point detection model.
In some embodiments, the electronic device may further train a preset neural network model using the plurality of sample human body images, a plurality of key point coordinates corresponding to the human body in each sample human body image, and a preset loss function. The trained neural network model is the preset key point detection model.
It should be noted that the loss function is typically used to measure the degree of disagreement between the model's predicted value (e.g., the key point coordinates predicted by the model) and the actual value (e.g., the key point coordinates actually marked). It is a non-negative real-valued function. In general, the smaller the loss function, the better the robustness of the model. The loss function may be set according to actual requirements.
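Combining the training data above with such a loss function, a minimal training-loop sketch under common assumptions follows: each labelled key point coordinate is rendered into a Gaussian target map, and the model's predicted thermodynamic diagrams are regressed onto the targets with a mean-squared-error loss. The model and loader objects, the map size, and the sigma value are placeholders, not the patent's specification.

```python
import torch
import torch.nn.functional as F

def render_target(coords, h, w, sigma=2.0):
    """Render labelled (x, y) key point coordinates into Gaussian target maps."""
    ys, xs = torch.meshgrid(torch.arange(h).float(), torch.arange(w).float(),
                            indexing="ij")
    maps = [torch.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
            for x, y in coords]
    return torch.stack(maps)  # (K, h, w), one map per key point

def train(model, loader, epochs=10, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for images, targets in loader:  # targets pre-rendered with render_target
            loss = F.mse_loss(model(images), targets)  # the preset loss function
            opt.zero_grad()
            loss.backward()
            opt.step()
```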
The preset neural network model may be a cascaded pyramid network model. The cascaded pyramid network model can include a GlobalNet network and a RefineNet network. The GlobalNet network can be used for coarse training of all key points of the human body. The RefineNet network can refine the key points that the GlobalNet network finds difficult to train.
In some embodiments, the preset neural network model may include an Inception-v4 network or an Inception-ResNet network together with a RefineNet network. The Inception-v4 network or the Inception-ResNet network can be used for coarse training of all key points of the human body. The RefineNet network can refine the key points that the coarse network finds difficult to train.
In some embodiments, before acquiring the human body image of the object to be photographed, the method may further include:
the electronic equipment acquires a plurality of groups of key point coordinates, wherein each group of key point coordinates comprises a plurality of key point coordinates;
the electronic equipment acquires the human body postures corresponding to the coordinates of each group of key points;
the electronic equipment trains a preset shallow neural network model by using the multiple groups of key point coordinates and the human body postures corresponding to each group of key point coordinates;
and the electronic equipment takes the trained shallow neural network model as a preset posture recognition model.
For example, the electronic device may obtain a plurality of sets of key point coordinates and a human body posture corresponding to each set of key point coordinates. Wherein each set of keypoint coordinates comprises a plurality of keypoint coordinates.
After obtaining the plurality of groups of key point coordinates and the human body postures corresponding to each group of key point coordinates, the electronic equipment can train the preset shallow neural network model by using the plurality of groups of key point coordinates and the human body postures corresponding to each group of key point coordinates. The trained shallow neural network model can be used as a preset gesture recognition model.
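A minimal sketch of such a training setup, assuming a small fully connected network over flattened key point coordinates; the layer sizes, the number of keypoints, and the number of posture classes are illustrative assumptions (a ResNet-18, as mentioned below, could equally serve as the shallow model):

```python
import torch
import torch.nn as nn

num_keypoints, num_postures = 17, 8  # illustrative assumptions
model = nn.Sequential(               # a shallow stand-in for e.g. ResNet-18
    nn.Linear(num_keypoints * 2, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, num_postures),
)
criterion = nn.CrossEntropyLoss()    # the preset loss function
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(coords_batch, posture_labels):
    # coords_batch: (B, num_keypoints * 2) tensor of key point coordinates
    # posture_labels: (B,) tensor of posture class indices
    loss = criterion(model(coords_batch), posture_labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```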
In some embodiments, the electronic device may further train the preset shallow neural network model using the plurality of sets of key point coordinates, the body posture (real body posture) corresponding to each set of key point coordinates, and a preset loss function. The trained shallow neural network model can be used as the preset gesture recognition model.
It should be noted that the loss function is usually used to measure the degree of disagreement between the predicted value (e.g., the body pose predicted by the model) and the actual value (e.g., the actual body pose) of the model. It is a non-negative real-valued function. In general, the smaller the loss function, the better the robustness of the model. The loss function may be set according to actual requirements.
The preset shallow neural network model may be a ResNet-18 network model.
In some embodiments, since two persons in the same pose have very different coordinate representations when they appear at different positions in the picture, in order to control for this variable, the electronic device may normalize the key point coordinates in the multiple sets of key point coordinates after acquiring them. For example, the key point coordinates may be normalized using the following formula:
N2 = A × (N1 − N_min) / (N_max − N_min)

In this formula, N2 represents the normalized x-coordinate or y-coordinate, and N1 represents the x-coordinate or y-coordinate before normalization. N_min represents the x-coordinate or y-coordinate with the smallest value among the sets of keypoint coordinates, and N_max represents the one with the largest value. A is a constant and may take values such as 240, 264, 293, 320, 335, and 370.
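A direct transcription of this normalization, assuming the coordinates are held in a NumPy array and taking A = 240 as one of the listed values:

```python
import numpy as np

def normalize(coords, A=240.0):
    """Min-max normalize N1 coordinates into [0, A], as in the formula above."""
    n_min, n_max = coords.min(), coords.max()
    return A * (coords - n_min) / (n_max - n_min)  # the N2 values
```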
In other embodiments, in order to reflect the relevance between the x-coordinate and the y-coordinate of the same keypoint, the x-coordinate and the y-coordinate of the same keypoint can be placed at the same position of different channels for training, as in the sketch below. For example, assume a group of key points includes 5 key points with coordinates (x1, y1), (x2, y2), (x3, y3), (x4, y4), and (x5, y5), and that the human body posture corresponding to the group is "standing". If the data to be input into the preset shallow neural network model is denoted (a, b), then [x1, x2, x3, x4, x5] and [y1, y2, y3, y4, y5] can be used as a, and the human body posture "standing" can be used as b.
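A minimal sketch of this two-channel layout; the coordinate values are illustrative assumptions:

```python
import numpy as np

# five key points (x1..x5, y1..y5) whose posture label is "standing"
xs = np.array([120.0, 130.0, 125.0, 118.0, 132.0])
ys = np.array([ 40.0,  40.0,  80.0, 150.0, 150.0])
a = np.stack([xs, ys])  # shape (2, 5): x and y of the same key point share
                        # the same position in two different channels
b = "standing"          # the posture label paired with a for training
```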
In an optional embodiment, after determining the coordinates of the plurality of key points of the object to be photographed, the electronic device may input the coordinates of the plurality of key points into a preset gesture recognition model to recognize the gesture of the object to be photographed. The electronic device may use the posture of the object to be photographed as object posture information of the object to be photographed. The preset gesture recognition model is a trained model.
In an optional embodiment, when the object posture information does not match preset posture information, before determining a target distance between the object to be photographed and the electronic device, the method further includes:
acquiring scene information of a scene where an object to be shot is located;
and determining the preset posture information from a plurality of pieces of candidate posture information according to the scene information.
The scene information describes the scene where the object to be shot is located. For example, when the object to be shot is at the seaside, the scene information is "seaside"; when the object to be shot is at the foot of a mountain, the scene information is "foot of a mountain"; and when the object to be shot is beside a building, the scene information is "beside a building".
The electronic device can collect a plurality of pieces of candidate posture information and a plurality of pieces of scene information in advance and store each piece of candidate posture information in association with its corresponding scene information. The correspondence between candidate posture information and scene information can be determined by professional photographers. Alternatively, a professional photographer determines the correspondence for part of the candidate posture information, and the electronic device performs machine learning on the determined correspondences to infer the correspondence for the remaining candidate posture information. One piece of candidate posture information may correspond to one or more pieces of scene information, and one piece of scene information may likewise correspond to one or more pieces of candidate posture information.
In this embodiment, the electronic device obtains the scene information of the scene where the object to be shot is located and determines the preset posture information from the plurality of pieces of candidate posture information according to the scene information. This step may be executed before, after, or simultaneously with the step of acquiring the object posture information of the object to be shot.
For example, assume that candidate posture information P1 corresponds to scene information S1, candidate posture information P2 corresponds to scene information S2, and candidate posture information P3 corresponds to scene information S3. If the scene information of the scene where the object to be shot is located is S1, the electronic device may determine that the preset posture information is P1; if it is S2, the preset posture information is P2; and if it is S3, the preset posture information is P3.
It can be understood that, if a plurality of pieces of candidate posture information correspond to the scene information of the scene where the object to be shot is located, the electronic device may determine any one of them as the preset posture information, or may determine the most frequently used one among them as the preset posture information, as in the sketch below.
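A minimal sketch of such a lookup, assuming the candidate postures are stored per scene together with a usage count; all identifiers and counts are illustrative assumptions:

```python
# candidate postures per scene, stored with a usage count
candidates_by_scene = {
    "seaside":  [("P1", 42), ("P4", 17)],
    "mountain": [("P2", 9)],
    "building": [("P3", 28), ("P5", 30)],
}

def preset_posture(scene_info):
    options = candidates_by_scene[scene_info]
    return max(options, key=lambda p: p[1])[0]  # most frequently used candidate

print(preset_posture("building"))  # -> "P5"
```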
In an optional embodiment, when the object posture information does not match preset posture information, before determining a target distance between the object to be photographed and the electronic device, the method further includes:
and determining the preset posture information from a plurality of pieces of candidate posture information according to the object posture information.
In order to make the posture of the object to be shot both appropriate and in line with its own preference, the object to be shot may first strike a favorite posture, so that the electronic device obtains the object posture information of the object to be shot and determines the preset posture information from the plurality of pieces of candidate posture information according to the object posture information.
For example, the electronic device may select, from the plurality of pieces of candidate posture information, the candidate posture information closest to the object posture information as the preset posture information.
Taking candidate posture information that includes a plurality of key point positions as an example, the electronic device acquires the plurality of key point positions included in the object posture information, and compares the key point positions included in each piece of candidate posture information with those included in the object posture information in one-to-one correspondence, so as to determine, from the plurality of pieces of candidate posture information, the one most similar to the object posture information as the preset posture information. For example, the candidate posture information whose key point positions have the smallest Euclidean distance to those of the object posture information may be used as the preset posture information, or the candidate posture information having the largest number of key point positions closest to their corresponding key point positions in the object posture information may be used.
For example, assume that the candidate posture information includes P1 and P2, where P1 includes the key point positions left elbow L11, right elbow L12, and neck L13; P2 includes left elbow L21, right elbow L22, and neck L23; and the object posture information T includes left elbow LT1, right elbow LT2, and neck LT3. The electronic device calculates the Euclidean distance D1 between (L11, L12, L13) and (LT1, LT2, LT3) and the Euclidean distance D2 between (L21, L22, L23) and (LT1, LT2, LT3). If D1 is larger than D2, the electronic device can determine that the preset posture information is P2.
For another example, with the same key point positions as above, assume that the distance between L11 and LT1 is greater than the distance between L21 and LT1, the distance between L12 and LT2 is greater than the distance between L22 and LT2, and the distance between L13 and LT3 is greater than the distance between L23 and LT3. Since the distances from L21, L22, and L23 to LT1, LT2, and LT3, respectively, are all smaller than the distances from L11, L12, and L13 to LT1, LT2, and LT3, respectively, the candidate posture information P2 is closest to the object posture information T, and P2 can therefore be determined as the preset posture information.
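A minimal sketch of this matching, assuming each posture is represented as a (K, 2) array of key point positions; all coordinate values are illustrative assumptions:

```python
import numpy as np

def closest_candidate(candidates, object_posture):
    # candidates: {name: (K, 2) array}; object_posture: (K, 2) array
    dists = {name: np.linalg.norm(pts - object_posture)
             for name, pts in candidates.items()}
    return min(dists, key=dists.get)  # smallest Euclidean distance wins

P1 = np.array([[100, 60], [160, 60], [130, 30]])  # left elbow, right elbow, neck
P2 = np.array([[ 90, 80], [170, 80], [130, 40]])
T  = np.array([[ 92, 78], [168, 79], [129, 41]])  # object posture information T
print(closest_candidate({"P1": P1, "P2": P2}, T))  # -> "P2"
```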
In an optional embodiment, before obtaining the object posture information of the object to be photographed, the method further includes:
displaying a plurality of pieces of candidate posture information;
and in response to a selection operation on the plurality of pieces of candidate posture information, determining the candidate posture information targeted by the selection operation as the preset posture information.
In this embodiment, the user may select the preset posture information from the plurality of pieces of candidate posture information. For example, the electronic device may display the candidate posture information on the screen, such as by constructing a corresponding human-figure outline from each piece of candidate posture information and displaying the constructed outlines. The user can then issue a corresponding voice command or touch the screen accordingly, so that the electronic device receives a selection operation on the candidate posture information and determines the candidate posture information targeted by the selection operation as the preset posture information.
For example, as shown in fig. 3, assume that the electronic device displays candidate posture information P1, P2, P3, P4, and P5 on the screen. If the user taps P1, the preset posture information is P1; if the user taps P3, it is P3. If the user says "select the 1st candidate posture information", the preset posture information is P1; if the user says "select the 4th candidate posture information", it is P4.
In an optional embodiment, the electronic device may determine preset posture information from a plurality of candidate posture information according to the scene information and the object posture information.
For example, the electronic device may determine the pieces of candidate posture information corresponding to the scene information from the plurality of pieces of candidate posture information, and then determine the preset posture information from these according to the object posture information.
For another example, the electronic device may determine, from the plurality of pieces of candidate posture information, the pieces corresponding to the object posture information (for example, those with the smallest Euclidean distance to it), and then determine the preset posture information from these according to the scene information.
In practical applications, when a client transacts banking business, the bank sometimes needs to acquire an image of the client, and the client is usually required to strike a corresponding posture so that his or her identity can be confirmed. In this case, the image of the client can be obtained by the information prompting method provided by the embodiments of the application. For example, the posture that the client needs to strike is used as the preset posture information; the object posture information of the client is acquired and matched against the preset posture information. When the object posture information does not match the preset posture information, the target distance between the object to be shot and the electronic equipment is determined; the target prompt time is acquired according to the target distance; and when the target prompt time is reached, prompt information for prompting posture adjustment is output, so that the client can adjust his or her posture and strike the required posture to prove the identity.
Referring to fig. 4, fig. 4 is a schematic structural diagram of an information prompting device according to an embodiment of the present application. The information prompting device 200 includes: an information acquisition module 201, a distance determination module 202, a time acquisition module 203, and an information output module 204.
The information acquisition module 201 is configured to identify a posture of an object to be photographed in a photographing scene to obtain object posture information of the object to be photographed;
a distance determining module 202, configured to determine a target distance between the object to be photographed and the electronic device when the object posture information does not match preset posture information;
the time obtaining module 203 is configured to obtain current time, and obtain target prompt time according to the target distance and the current time;
and the information output module 204 is configured to output prompt information for prompting the object to be shot to adjust the posture when the target prompt time is reached.
In an optional embodiment, the time obtaining module 203 may be configured to: determining a target age value of the object to be shot; and acquiring target prompt time according to the target distance, the current time and the target age value.
In an optional embodiment, the time obtaining module 203 may be configured to: acquiring a face image of an object to be shot; and carrying out age identification on the face image by using an age identification model to obtain a target age value of the object to be shot.
In an optional embodiment, the time obtaining module 203 may be configured to: determining the sex of the object to be shot; and acquiring target prompt time according to the target distance, the current time and the gender.
In an optional embodiment, the time obtaining module 203 may be configured to: acquiring a face image of an object to be shot; and carrying out gender identification on the face image by using a gender identification model to obtain the gender of the object to be shot.
In an optional embodiment, the information prompting device 200 may further include a shooting module, where the shooting module is configured to: and when the object posture information of the object to be shot is matched with the preset posture information, shooting the object to be shot to obtain a target image.
In an optional embodiment, the information obtaining module 201 may be configured to: acquiring a human body image of an object to be shot; performing key point detection on the human body image by using a key point detection model to obtain a plurality of key points corresponding to the object to be shot; and determining object posture information of the object to be shot according to the plurality of key points corresponding to the object to be shot.
It should be noted that the information prompting device provided in the embodiment of the present application and the information prompting method in the foregoing embodiment belong to the same concept, and specific implementation processes thereof are described in the foregoing embodiment and are not described herein again.
The embodiment of the present application provides a storage medium, on which a computer program is stored, and when the computer program stored in the storage medium is executed on a processor of an electronic device provided in the embodiment of the present application, the processor of the electronic device is caused to execute any of the steps in the above information prompting method suitable for the electronic device. The storage medium may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like.
Referring to fig. 5, the electronic device 300 includes a processor 301 and a memory 302. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 5 does not constitute a limitation of the electronic device and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components. For example, the electronic device 300 may also include a screen or a camera, among others.
The processor 301 is a control center of the electronic device, connects various parts of the entire electronic device using various interfaces and lines, performs various functions of the electronic device and processes data by running or executing an application program stored in the memory 302 and calling data stored in the memory 302, thereby integrally monitoring the electronic device.
The memory 302 may be used to store applications and data. The memory 302 stores applications containing executable code. The application programs may constitute various functional modules. The processor 301 executes various functional applications and data processing by running an application program stored in the memory 302.
In this embodiment, the processor 301 in the electronic device loads the executable code corresponding to the processes of one or more application programs into the memory 302 according to the following instructions, and the processor 301 runs the application programs stored in the memory 302, thereby implementing the following processes:
recognizing the gesture of an object to be shot in a shooting scene to obtain object gesture information of the object to be shot;
when the object posture information is not matched with preset posture information, determining the target distance between the object to be shot and the electronic equipment;
acquiring current time, and acquiring target prompt time according to the target distance and the current time;
and outputting prompt information for prompting the object to be shot to adjust the posture when the target prompt time is reached.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an electronic device according to a second embodiment of the present disclosure.
The electronic device 300 may comprise components such as a memory 302, a processor 301, an input unit 303, an output unit 304, etc.
The processor 301 is a control center of the electronic device, connects various parts of the whole electronic device by using various interfaces and lines, and performs various functions of the electronic device and processes data by running or executing an application program stored in the memory 302 and calling the data stored in the memory 302, thereby performing overall monitoring of the electronic device.
The memory 302 may be used to store applications and data. The memory 302 stores applications containing executable code. The application programs may constitute various functional modules. The processor 301 executes various functional applications and data processing by running the application programs stored in the memory 302.
The input unit 303 may be used to receive input numbers, character information, or user characteristic information (such as a fingerprint), and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
The output unit 304 may be used to display information input by or provided to a user and various graphical user interfaces of the electronic device, which may be made up of graphics, text, icons, video, and any combination thereof. The output unit may include a display panel.
In this embodiment, the processor 301 in the electronic device loads the executable code corresponding to the processes of one or more application programs into the memory 302 according to the following instructions, and the processor 301 runs the application programs stored in the memory 302, thereby implementing the following processes:
recognizing the gesture of an object to be shot in a shooting scene to obtain object gesture information of the object to be shot;
when the object posture information is not matched with preset posture information, determining the target distance between the object to be shot and the electronic equipment;
acquiring current time, and acquiring target prompt time according to the target distance and the current time;
and outputting prompt information for prompting the object to be shot to adjust the posture when the target prompt time is reached.
In some embodiments, when the processor 301 executes to obtain the target prompt time according to the target distance and the current time, it may execute: determining a target age value of the object to be shot; and acquiring target prompt time according to the target distance, the current time and the target age value.
In some embodiments, when the processor 301 performs determining the target age value of the object to be photographed, it may perform: acquiring a face image of an object to be shot; and carrying out age identification on the face image by using an age identification model to obtain a target age value of the object to be shot.
In some embodiments, when the processor 301 executes obtaining the target prompt time according to the target distance and the current time, it may execute: determining the gender of the object to be shot; and acquiring target prompt time according to the target distance, the current time and the gender.
In some embodiments, when the processor 301 determines the gender of the subject to be photographed, the following steps may be performed: acquiring a face image of an object to be shot; and carrying out gender identification on the face image by using a gender identification model to obtain the gender of the object to be shot.
In some embodiments, after the processor 301 outputs prompt information for prompting the object to be photographed to adjust the posture when the target prompt time is reached, the following may be further performed: and when the object posture information of the object to be shot is matched with the preset posture information, shooting the object to be shot to obtain a target image.
In some embodiments, when the processor 301 performs acquiring the object posture information of the object to be photographed, it may perform: acquiring a human body image of an object to be shot; performing key point detection on the human body image by using a key point detection model to obtain a plurality of key points corresponding to the object to be shot; and determining object posture information of the object to be shot according to the plurality of key points corresponding to the object to be shot.
The information prompting method and device, the storage medium, and the electronic device provided by the present application are described in detail above. Specific examples are used herein to explain the principles and implementation of the application, and the description of the embodiments is only intended to help understand the method and its core ideas. Meanwhile, those skilled in the art may, according to the ideas of the present application, make changes to the specific embodiments and the application scope. In summary, the contents of this specification should not be construed as limiting the present application.

Claims (10)

1. An information prompting method is applied to electronic equipment and is characterized by comprising the following steps:
recognizing the gesture of an object to be shot in a shooting scene to obtain object gesture information of the object to be shot;
when the object posture information is not matched with preset posture information, determining the target distance between the object to be shot and the electronic equipment;
acquiring current time, and acquiring target prompt time according to the target distance and the current time;
and outputting prompt information for prompting the object to be shot to adjust the posture when the target prompt time is reached.
2. The information prompting method according to claim 1, wherein the obtaining a target prompting time according to the target distance and the current time comprises:
determining a target age value of the object to be shot;
and acquiring target prompt time according to the target distance, the current time and the target age value.
3. The information presentation method according to claim 2, wherein the determining a target age value of the subject to be photographed includes:
acquiring a face image of an object to be shot;
and carrying out age identification on the face image by using an age identification model to obtain a target age value of the object to be shot.
4. The information prompting method according to claim 1, wherein the obtaining of the target prompting time according to the target distance and the current time comprises:
determining the gender of the object to be shot;
and acquiring target prompt time according to the target distance, the current time and the gender.
5. The information prompting method according to claim 4, wherein the determining the sex of the object to be photographed includes:
acquiring a face image of an object to be shot;
and carrying out gender identification on the face image by using a gender identification model to obtain the gender of the object to be shot.
6. The information prompting method according to claim 1, wherein after outputting prompting information for prompting the object to be photographed to adjust the posture when the target prompting time is reached, the method further comprises:
and when the object posture information of the object to be shot is matched with the preset posture information, shooting the object to be shot to obtain a target image.
7. The information prompting method according to any one of claims 1 to 6, wherein the recognizing the posture of the object to be photographed in the photographing scene to obtain the object posture information of the object to be photographed includes:
acquiring a human body image of an object to be shot;
performing key point detection on the human body image by using a key point detection model to obtain a plurality of key points corresponding to the object to be shot;
and determining object attitude information of the object to be shot according to the plurality of key points corresponding to the object to be shot.
8. An information prompting device applied to electronic equipment is characterized by comprising:
the information acquisition module is used for identifying the posture of an object to be shot in a shooting scene to obtain object posture information of the object to be shot;
the distance determining module is used for determining the target distance between the object to be shot and the electronic equipment when the object posture information is not matched with preset posture information;
the time acquisition module is used for acquiring current time and acquiring target prompt time according to the target distance and the current time;
and the information output module is used for outputting prompt information for prompting the object to be shot to adjust the posture when the target prompt time is reached.
9. A storage medium having stored therein a computer program which, when run on a computer, causes the computer to execute the information presentation method according to any one of claims 1 to 7.
10. An electronic device, comprising a processor and a memory, wherein the memory stores a computer program, and the processor is configured to execute the information presentation method according to any one of claims 1 to 7 by calling the computer program stored in the memory.
CN202210682243.2A 2022-06-15 2022-06-15 Information prompting method and device, storage medium and electronic equipment Pending CN115223238A (en)

Priority Applications (1)

Application Number: CN202210682243.2A
Priority Date: 2022-06-15
Filing Date: 2022-06-15
Title: Information prompting method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number: CN202210682243.2A
Priority Date: 2022-06-15
Filing Date: 2022-06-15
Title: Information prompting method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN115223238A (en) 2022-10-21

Family

ID=83608163

Family Applications (1)

Application Number: CN202210682243.2A
Status: Pending
Priority Date: 2022-06-15
Filing Date: 2022-06-15
Title: Information prompting method and device, storage medium and electronic equipment

Country Status (1)

Country: CN · Publication: CN115223238A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination