CN111144197A - Human identification method, device, storage medium and electronic equipment - Google Patents

Human identification method, device, storage medium and electronic equipment

Info

Publication number
CN111144197A
CN111144197A (application CN201911085273.XA)
Authority
CN
China
Prior art keywords
facial
characteristic information
human
information
target object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201911085273.XA
Other languages
Chinese (zh)
Inventor
徐静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yulong Computer Telecommunication Scientific Shenzhen Co Ltd
Original Assignee
Yulong Computer Telecommunication Scientific Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yulong Computer Telecommunication Scientific Shenzhen Co Ltd filed Critical Yulong Computer Telecommunication Scientific Shenzhen Co Ltd
Priority to CN201911085273.XA priority Critical patent/CN111144197A/en
Publication of CN111144197A publication Critical patent/CN111144197A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/174 Facial expression recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L 25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L 25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L 25/57 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for processing of video signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a human identification method, a human identification device, a storage medium, and electronic equipment. The method includes the following steps: acquiring a facial video of a target object; determining facial feature information and voice feature information contained in the facial video; and determining human characteristic information of the target object based on the facial feature information and the voice feature information. With this technical scheme, the human characteristic information of a person can be identified objectively and effectively from the facial video of the target object. Compared with acquiring the human characteristics of the target object through a questionnaire, the technical scheme of the application reduces labor cost and saves time.

Description

Human identification method, device, storage medium and electronic equipment
Technical Field
The application relates to the field of information processing, in particular to a human recognition method, a human recognition device, a storage medium and electronic equipment.
Background
When recruiting employees, an enterprise needs to identify the human characteristics of candidates in order to select employees suited to its needs. When a group organizes social activities, it also needs to analyze the human characteristics of the people involved so as to organize the activities better. People frequently need to be analyzed and judged in life and work.
At present, human recognition is generally performed through questionnaires or spoken question-and-answer sessions, which consume a great deal of time and labor. Moreover, if the respondent is influenced by the surrounding environment, he or she often cannot answer objectively, so the analysis result is neither objective nor accurate.
Disclosure of Invention
In order to solve the above problems, embodiments of the present application provide a human identification method and apparatus, a storage medium, and an electronic device.
In a first aspect, an embodiment of the present application provides a human identification method, including the following steps:
acquiring a face video of a target object;
determining face feature information and voice feature information contained in the face video;
and determining the human characteristic information of the target object based on the facial characteristic information and the voice characteristic information.
Optionally, the acquiring a facial video of the target object includes:
receiving a human identification function instruction input on a camera application interface, and switching to a shooting mode;
acquiring a face video shot for a target object in the shooting mode.
Optionally, the determining the facial feature information and the voice feature information contained in the facial video includes:
extracting a face image portion and an audio portion of the face video, the face image portion including at least one frame of face image;
acquiring the facial feature information based on the facial image part;
and acquiring the voice characteristic information based on the audio part.
Optionally, the facial feature information includes facial micro-expression information and biometric feature information, and the obtaining the facial feature information based on the facial image part includes:
acquiring a first facial image in the facial image part, positioning a biological characteristic region of the first facial image, and identifying biological characteristic information of the biological characteristic region;
identifying facial micro-expression information of the first facial image based on a facial description sample;
acquiring a next frame of facial image of the first facial image, taking the next frame of facial image as the first facial image, and performing the step of locating the biometric region of the first facial image;
when the next frame of face image is determined not to exist, face feature information corresponding to the face video is generated.
Optionally, the determining the human characteristic information of the target object based on the facial characteristic information and the voice characteristic information includes:
and inputting the biological characteristic information, the facial micro-expression information and the voice characteristic information into a trained human-based recognition model, and outputting the human-based characteristic information of the target object.
Optionally, the method further comprises:
acquiring identity information of the target object;
the determining the human characteristic information of the target object based on the facial characteristic information and the voice characteristic information comprises:
and determining the human characteristic information of the target object based on the identity information, the facial characteristic information and the voice characteristic information.
Optionally, the acquiring a facial video of the target object includes:
acquiring a face video which is acquired by a camera aiming at a target object and lasts for a preset time;
after determining the human characteristic information of the target object based on the facial characteristic information and the voice characteristic information, the method further comprises:
acquiring a next face video which is acquired by the camera for the target object and lasts for a preset time length, and executing the step of determining face feature information and voice feature information contained in the face video;
and when the next facial video is the last facial video in the set, generating overall human characteristic information covering the human characteristic information of all the facial videos.
In a second aspect, an embodiment of the present application provides a human recognition device, including:
an acquisition unit configured to acquire a face video of a target object;
a first determination unit configured to determine face feature information and voice feature information contained in the face video;
a second determination unit configured to determine human characteristic information of the target object based on the facial characteristic information and the voice characteristic information.
In a third aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of any one of the above methods.
In a fourth aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of any one of the above methods when executing the program.
The human identification method and apparatus, storage medium, and electronic device of the present application acquire a facial video of a target object; determine the facial feature information and voice feature information contained in the facial video; and determine the human characteristic information of the target object based on the facial feature information and voice feature information. This technical scheme can objectively and effectively identify a person's human characteristic information by combining the facial feature information and voice feature information contained in the facial video of the target object. Compared with questionnaires and similar methods, the scheme of the application reduces labor cost, saves time, and improves the accuracy of the analysis result.
Drawings
FIG. 1 is a diagram illustrating an exemplary system architecture to which a human recognition method or apparatus according to an embodiment of the present application can be applied;
FIG. 2 is a flow chart illustrating a human recognition method according to an embodiment of the present application;
FIG. 3 is a flow chart illustrating another human recognition method according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating another human recognition method according to an embodiment of the present application;
FIG. 5 is a flowchart illustrating another human recognition method according to an embodiment of the present application;
FIG. 6 is a flow chart illustrating another human recognition method according to an embodiment of the present application;
FIG. 7 is a flowchart illustrating another human recognition method according to an embodiment of the present application;
FIG. 8 is a schematic diagram illustrating a human recognition device according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present application is further described with reference to the following figures and examples.
In the following description, the terms "first" and "second" are used for descriptive purposes only and are not intended to indicate or imply relative importance. The following description provides embodiments of the present application; different embodiments may be substituted or combined, and the present application is therefore intended to include all possible combinations of the same and/or different embodiments described. Thus, if one embodiment includes features A, B, and C and another embodiment includes features B and D, this application should also be considered to include an embodiment containing one or more of all other possible combinations of A, B, C, and D, even though that embodiment may not be explicitly recited below.
The following description provides examples, and does not limit the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements described without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as appropriate. For example, the described methods may be performed in an order different than the order described, and various steps may be added, omitted, or combined. Furthermore, features described with respect to some examples may be combined into other examples.
Fig. 1 is a schematic diagram of an exemplary system architecture to which the human recognition method or apparatus of the embodiments of the present application can be applied. The image pickup apparatus 101 may be provided inside the terminal apparatus 102, or it may be independent of the terminal apparatus 102 and connected to it. The connection may be a direct connection through a data line or a connection through a network. The network may include various connection types, such as wired or wireless communication links, or fiber optic cables, to name a few.
The terminal device 102 includes, but is not limited to, devices such as a server, a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), a digital television, a desktop computer, and the like.
The image pickup apparatus 101 captures a face video of a target object and transmits the face video to the terminal apparatus 102. The terminal device 102 determines the human characteristic information of the target object based on the facial characteristic information and the voice characteristic information contained in the facial video.
Referring to fig. 2, fig. 2 is a schematic flow chart of a human recognition method according to an embodiment of the present application, in which the method includes:
s201, acquiring a face video of the target object.
The target object is the object whose human characteristics are to be determined. A facial video of the target object may be captured by an image capture device such as a camera. Alternatively, several facial images of the target object may be acquired by an image capture device such as a camera, and the corresponding voice information acquired by a voice capture device such as a microphone. A facial video of the person concerned that already exists in a storage device may also be used directly. In short, the purpose of the present application is achieved as long as a facial video of the target object can be acquired; the application does not limit how the facial video is acquired.
S202, determining face feature information and voice feature information contained in the face video.
The face feature information is information obtained based on a face image of the target object. The facial feature information may include micro-expression information as well as biometric information.
The micro-expression information is facial morphology information exhibited when the target object is in an active state, and it changes with factors such as the person's mood. The micro-expression information may include slightly raised mouth corners, furrowed brows, and the like.
The biometric information is facial morphology information exhibited when the target object is at rest, and it does not change with factors such as the person's mood. The biometric information may include eye feature information, mouth feature information, nose feature information, and the like. The eye feature information may include the shape and length of the eyes; the mouth feature information may include the shape and color of the mouth.
The voice feature information is information extracted based on the voice information of the target object. The speech features may include: amplitude, frequency, and speech rate, etc.
S203, determining the human characteristic information of the target object based on the facial characteristic information and the voice characteristic information.
Human characteristic matching rules can be pre-stored in the system. For example: a fleshy nose indicates that the target object is straightforward and rational; a full, rounded forehead indicates that the target object is clever and far-sighted; a low speech amplitude combined with a slow speech rate indicates that the target object thinks deeply; and so on. The human characteristic information of the target object is determined from the pre-stored matching rules together with the facial feature information and voice feature information.
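The rule-matching step described above can be sketched as a simple lookup table. The rule keys, trait labels, and function name below are illustrative assumptions, not details taken from the application:

```python
# Hypothetical sketch of pre-stored human characteristic matching rules.
# Rule keys and trait labels are illustrative, not from the patent text.
RULES = {
    "fleshy_nose": "straightforward and rational",
    "rounded_forehead": "clever and far-sighted",
    "low_amplitude_slow_speech": "deep in thought",
}

def match_traits(observed_features):
    """Return the trait labels matched by the observed feature keys."""
    return [RULES[f] for f in observed_features if f in RULES]

traits = match_traits(["fleshy_nose", "low_amplitude_slow_speech"])
```

A real system would derive the observed feature keys from the facial and voice feature extraction steps rather than hard-coding them.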
Alternatively, a human recognition model can be built in the system and trained on a large number of samples until it reaches a preset recognition accuracy. Each sample is facial feature information labeled with a human characteristic. The output of the model may be several candidate human characteristics with different probabilities, and the one with the highest probability can be selected as the human characteristic information of the target object.
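The final selection step, picking the highest-probability characteristic from the model's output, might look like this minimal sketch (the trait labels and probabilities are invented for illustration):

```python
def select_trait(probabilities):
    """Pick the human characteristic with the highest predicted probability.

    `probabilities` maps characteristic label -> probability, as a stand-in
    for the output of the trained recognition model.
    """
    return max(probabilities, key=probabilities.get)

# Hypothetical model output
output = {"extroverted": 0.2, "calm and rational": 0.7, "impulsive": 0.1}
chosen = select_trait(output)
```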
The human identification method provided by this embodiment can objectively and effectively identify a person's human characteristic information from the facial video of the target object. It can therefore solve the technical problems of prior-art approaches such as questionnaires, namely high labor cost, low recognition efficiency, and insufficient objectivity and accuracy.
Optionally, the method further comprises:
acquiring a body video of the person to be detected;
determining a body feature of the body video;
the determining the human character of the person to be detected based on the facial feature and the voice feature comprises:
and determining the human characteristics of the person to be detected based on the body characteristics, the facial characteristics and the voice characteristics.
The above body features include a body leaning slightly to the left, both hands clenched into fists, both legs held together, and the like. Body features are often important manifestations of the target object's mood and character. Therefore, by also taking body features into account, this embodiment can determine the human characteristics of the target object more thoroughly and accurately.
Referring to fig. 3, fig. 3 is a schematic flow chart of another human identification method provided in the embodiment of the present application, in which the method includes:
s301, receiving a human-based function identification instruction input by a camera application interface, and switching to a shooting mode.
The human identification method provided by this embodiment can be applied to intelligent terminal devices equipped with a camera, such as servers, mobile phones, and notebook computers. The terminal device can be switched into the shooting mode with a simple operation, and human characteristics are then recognized through the shooting mode.
And S302, acquiring a face video shot for the target object in the shooting mode.
The target object should be within a reasonable distance of the camera device to ensure that a sufficiently clear video of the target object is obtained.
S303, determining the face feature information and the voice feature information contained in the face video.
The description of S303 may refer to the description of step S202 in fig. 2, and is not repeated.
S304, determining the human characteristic information of the target object based on the facial characteristic information and the voice characteristic information.
The description of S304 may refer to the description of step S203 in fig. 2, and is not repeated.
The human-based identification method provided by the embodiment of the application is applied to various intelligent devices with the camera device, so that a user can quickly and conveniently apply the human-based identification method under the condition that the user needs to perform human-based feature identification, and good use experience is brought to the user.
Referring to fig. 4, fig. 4 is a schematic flow chart of another human identification method provided in the embodiment of the present application, in which the method includes:
s401, acquiring a face video of the target object.
The description of S401 may refer to the description of step S201 in fig. 2, and is not repeated.
S402, extracting a face image part and an audio part of the face video, wherein the face image part comprises at least one frame of face image.
How to extract the face image part and the audio part of the face video is a technique well known to those skilled in the art, and the description of this embodiment is omitted.
And S403, acquiring the facial feature information based on the facial image part.
Each frame of image included in the facial image portion is processed to acquire the facial feature information. Specifically, histogram equalization may be applied to the image based on its grayscale histogram, and Gaussian smoothing filtering may then be applied so that the eye contour, nose contour, mouth contour, and so on of the target object can be extracted. The facial feature information is acquired from these contours.
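The two preprocessing operations mentioned here, histogram equalization and Gaussian smoothing, can be sketched with NumPy alone. This is a minimal illustration of the standard techniques, not the application's actual implementation (which is not disclosed):

```python
import numpy as np

def equalize_histogram(gray):
    """Histogram-equalize an 8-bit grayscale image (H x W uint8 array)."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    # Map each gray level through the normalized cumulative distribution.
    lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255),
                  0, 255).astype(np.uint8)
    return lut[gray]

def gaussian_smooth_1d(signal, sigma=1.0, radius=3):
    """Gaussian smoothing of a 1-D signal, e.g. a contour profile."""
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()  # normalize so flat regions are preserved
    return np.convolve(signal, kernel, mode="same")
```

Contour extraction proper would follow the smoothing step, typically with an edge detector or a landmark model.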
S404, acquiring the voice characteristic information based on the audio part.
The voice feature information is information extracted based on the voice information of the target object. The speech features may include: amplitude, frequency, and speech rate, etc. Extracting information such as amplitude, frequency, and speech rate corresponding to the audio through the audio is a technique known to those skilled in the art, and will not be described in detail in this embodiment.
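A minimal sketch of extracting amplitude and frequency from a waveform is shown below. The RMS amplitude and the zero-crossing-based pitch estimate are illustrative stand-ins for the features named in the text; speech rate would additionally require segmenting the audio into syllables or words:

```python
import numpy as np

def voice_features(samples, sample_rate):
    """Estimate simple voice features from a mono waveform.

    Amplitude: RMS level of the signal.
    Frequency: a crude estimate from the zero-crossing rate
    (one cycle contributes two zero crossings).
    """
    samples = np.asarray(samples, dtype=float)
    rms = np.sqrt(np.mean(samples**2))
    crossings = np.sum(np.abs(np.diff(np.sign(samples)))) / 2
    duration = len(samples) / sample_rate
    freq = crossings / (2 * duration)
    return {"amplitude": rms, "frequency_hz": freq}

# Sanity check on a pure 100 Hz sine wave sampled at 8 kHz
t = np.arange(8000) / 8000.0
wave = np.sin(2 * np.pi * 100 * t)
feats = voice_features(wave, 8000)
```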
S405, determining the human characteristic information of the target object based on the facial characteristic information and the voice characteristic information.
The description of S405 may refer to the description of step S203 in fig. 2, and is not repeated.
According to the human identification method provided by the embodiment of the application, the face video is divided into the face image part and the audio part, and corresponding face video characteristics are extracted from the face image part and the audio part respectively. Finally, the human characteristic information of the target object is determined based on the facial video characteristics. The method of the embodiment can determine the human character of the target object more reasonably and objectively.
Referring to fig. 5, fig. 5 is a schematic flow chart of another human recognition method according to an embodiment of the present application, in which the method includes:
s501, acquiring a face video of the target object.
The description of S501 may refer to the description of step S201 in fig. 2, and is not repeated.
S502, extracting a face image part and an audio part of the face video, wherein the face image part comprises at least one frame of face image.
The description of S502 may refer to the description of step S402 in fig. 4, and is not repeated.
S503, acquiring a first face image in the face image part, positioning a biological feature area of the first face image, and identifying biological feature information of the biological feature area.
The biological characteristic information is facial morphology information reflected when the target object is in a static state, and does not change with objective factors such as mood of people. The biometric information may include: eye feature information, mouth feature information, nose feature information, and the like. The eye feature information may include shape information of the eye, length information of the eye, and the like. The mouth feature information may include shape information of the mouth, color information of the mouth, and the like.
And S504, identifying the face micro-expression information of the first face image based on the face description sample.
The micro-expression information is facial morphology information reflected when the target object is in an active state, and the micro-expression information changes along with objective factors such as mood of people. The micro-expression information may include: the mouth corners are raised slightly, the eyebrows are wrinkled and the like.
The face description sample is a sample describing the correspondence between facial morphology information and facial micro-expressions. For example, a sample may specify that if the ratio of the distance between the two eyebrows to the forehead width is less than a preset ratio threshold, the target object is in a furrowed-brow micro-expression state.
The facial morphology information in the first facial image is determined, and the facial micro-expression information of the first facial image is identified based on the correspondence between facial morphology information and facial micro-expressions recorded in the face description sample.
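The eyebrow-ratio rule from the face description sample can be sketched as follows. The threshold value and the function name are illustrative assumptions; the patent only says the threshold is preset:

```python
def is_brow_furrowed(inter_brow_distance, forehead_width, ratio_threshold=0.18):
    """Apply the example rule from the face description sample: if the ratio
    of the inter-eyebrow distance to the forehead width falls below the
    preset threshold, classify the frame as a furrowed-brow micro-expression.
    The 0.18 default is an illustrative assumption."""
    return inter_brow_distance / forehead_width < ratio_threshold

# Distances in pixels, measured from located facial landmarks
furrowed = is_brow_furrowed(20, 140)   # ratio ~0.143, below threshold
relaxed = is_brow_furrowed(35, 140)    # ratio 0.25, above threshold
```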
S505, judging whether the next frame face image exists.
If it exists, the next frame of facial image is taken as the first facial image, and step S503 is executed. If not, step S506 is executed.
And S506, when the facial image of the next frame does not exist, generating facial feature information corresponding to the facial video.
The micro-expression information and biometric information acquired in steps S503 and S504 above are combined into the facial feature information, which may therefore include multiple items of facial micro-expression information and multiple items of biometric information.
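The combining step can be sketched as a simple per-frame aggregation. The field names are illustrative assumptions, not taken from the application:

```python
def merge_frame_features(frames):
    """Combine per-frame recognition results (the S503/S504 outputs) into
    the facial feature information for the whole video. Field names are
    illustrative assumptions."""
    merged = {"micro_expressions": [], "biometrics": []}
    for frame in frames:
        merged["micro_expressions"].append(frame["micro_expression"])
        merged["biometrics"].append(frame["biometrics"])
    return merged

# Two hypothetical per-frame results
frames = [
    {"micro_expression": "furrowed_brow", "biometrics": {"eye_shape": "round"}},
    {"micro_expression": "raised_mouth_corners", "biometrics": {"eye_shape": "round"}},
]
video_features = merge_frame_features(frames)
```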
S507, inputting the biological characteristic information, the facial micro-expression information and the voice characteristic information into a trained human-based recognition model, and outputting the human-based characteristic information of the target object.
In the human recognition method provided by this embodiment, the biometric information, facial micro-expression information, and voice feature information extracted from the video of the target object are used together to determine the target object's human characteristic information. Because the method comprehensively considers the influence of all three kinds of information on the human characteristic information, it can determine the human characteristic information of the target object objectively and accurately.
Referring to fig. 6, fig. 6 is a schematic flow chart of another human identification method provided in the embodiment of the present application, in which the method includes:
s601, acquiring a face video of the target object.
The description of S601 may refer to the description of step S201 in fig. 2, and is not repeated.
S602, determining the face feature information and the voice feature information contained in the face video.
The description of S602 may refer to the description of step S202 in fig. 2, and is not repeated.
S603, acquiring the identity information of the target object.
The identity information of the target object includes information of age, native place, education background, sex, etc. of the target object.
S604, determining the human characteristic information of the target object based on the identity information, the facial characteristic information and the voice characteristic information.
Identity information such as age, native place, and education background often influences a person's character. The human recognition method provided by this embodiment therefore takes not only the facial feature information and voice feature information but also the identity information of the target object as input when determining the target object's human characteristics. As a result, the method can determine the human characteristics of the target object more reliably.
Referring to fig. 7, fig. 7 is a schematic flow chart of another human identification method provided in the embodiment of the present application, in which the method includes:
S701, acquiring a face video which is acquired by a camera aiming at a target object and lasts for a preset time.
The preset duration can be set according to specific requirements and application scenarios, and may be, for example, 30 seconds, 1 minute or 5 minutes. The longer the preset duration is, the longer the acquired video is, the more facial feature information and voice feature information the system can determine from the video, and the more accurately the system can detect the human characteristics.
S702, determining the face feature information and the voice feature information contained in the face video.
The description of S702 may refer to the description of step S202 in fig. 2, and is not repeated.
S703, determining the human characteristic information of the target object based on the facial characteristic information and the voice characteristic information.
The description of S703 may refer to the description of step S203 in fig. 2, and is not repeated.
And S704, judging whether the current face video is the last face video acquired.
If yes, go to step S705; if not, step S701 is executed.
S705, generating the final human characteristic information based on the human characteristic information of all the face videos.
With the human identification method provided by this embodiment of the application, a plurality of human recognition results corresponding to the target object can be obtained, and the human characteristic recognized the most times can be selected as the human recognition result of the target object. Because the human identification is performed by using the facial features contained in multiple segments of facial video, the human characteristic information corresponding to the target object can be determined more accurately than with a method that uses only a single segment of facial video.
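The "most identification times" selection rule described above can be sketched as a simple majority vote over the per-video results; this is a sketch, not an implementation mandated by the embodiment:

```python
from collections import Counter

def aggregate_results(per_video_results: list) -> str:
    """Select the human characteristic recognized the most times.

    Ties resolve to the label that first reached the winning count,
    since Counter.most_common preserves first-insertion order.
    """
    if not per_video_results:
        raise ValueError("no recognition results to aggregate")
    return Counter(per_video_results).most_common(1)[0][0]

# One label per face video, e.g. from five 1-minute recordings.
label = aggregate_results(["extrovert", "introvert", "extrovert",
                           "extrovert", "introvert"])
```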
To better explain the technical scheme of the application, a specific embodiment in which the human recognition method provided by the embodiment of the application is applied to a smart phone or a tablet computer is given below. In this scheme, the camera of the phone or tablet shoots a video of the face while the subject speaks, and the positions of the key points of the face are located, captured and analyzed; the strengths and weaknesses of the person's character are then analyzed in combination with big data. This helps the other party to learn the rough character of a person in advance; the analyzed content has more reference value and is convenient for the general public to use. The human identification method provided by this embodiment may comprise the following steps:
Step 1, open the camera application and enable the "human identification" function.
It should be noted that the human-based identification method provided by the embodiment of the application can be applied to not only a mobile phone terminal, but also an intelligent terminal such as a personal computer, a server, a wearable device and the like.
Step 2, after the camera is opened, switch to the shooting mode and prompt the user to place the face within a specific distance range for video shooting.
Step 3, shoot for 1 minute each time, 5 times in total.
The shooting times and the shooting time can be set correspondingly according to specific requirements.
Step 4, capture the facial micro-expressions of the photographed person by using the camera AI (Artificial Intelligence) technology, identify the tone of the photographed person's voice, and identify the key points of the facial micro-expressions: the eyes, mouth and cheeks. Then analyze the human detection result of the person according to the human-nature correspondence rules in related books, or by using a big data analysis module.
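A minimal sketch of grouping facial key points into the eye, mouth and cheek regions named in step 4. The index ranges below imitate a common 68-point landmark layout but are purely illustrative assumptions; the patent does not name a landmark model:

```python
# Illustrative landmark groupings (resembling a 68-point layout); the
# actual regions depend on whatever landmark detector is used.
REGIONS = {
    "eyes":  range(36, 48),
    "mouth": range(48, 68),
    "cheek": range(1, 16),
}

def region_centroids(landmarks):
    """Average the (x, y) key points of each region of interest."""
    centroids = {}
    for name, indices in REGIONS.items():
        points = [landmarks[i] for i in indices]
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        centroids[name] = (sum(xs) / len(xs), sum(ys) / len(ys))
    return centroids

# Dummy landmarks on the diagonal, standing in for detector output.
centroids = region_centroids([(i, i) for i in range(68)])
```

Motion of these per-region centroids across frames is one simple signal a micro-expression recognizer could consume.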
Step 5, the human detection result may be composed of the following three parts: a humanity score, the content of the humanity advantages, and the content of the humanity disadvantages.
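The three-part result of step 5 could be held in a small record type. This is a sketch; the field names and the 0-100 score scale are assumptions, not part of the embodiment:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class HumanDetectionResult:
    """Three-part result of step 5; the 0-100 score scale is an assumption."""
    score: int
    advantages: List[str] = field(default_factory=list)
    disadvantages: List[str] = field(default_factory=list)

    def summary(self) -> str:
        return (f"score={self.score}, "
                f"+{len(self.advantages)} strengths, "
                f"-{len(self.disadvantages)} weaknesses")

result = HumanDetectionResult(78, ["patient"], ["impulsive", "stubborn"])
```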
The above-mentioned fig. 2 to fig. 7 illustrate the human recognition method in detail according to the embodiment of the present application. Referring to fig. 8, fig. 8 is a schematic structural diagram of a human recognition device according to an embodiment of the present application, and as shown in fig. 8, the human recognition device includes:
an acquisition unit 801 for acquiring a face video of a target object;
a first determination unit 802 configured to determine face feature information and voice feature information included in the face video;
a second determining unit 803, configured to determine human characteristic information of the target object based on the facial characteristic information and the voice characteristic information.
Optionally, the obtaining unit 801 is specifically configured to:
receiving a human-based function identification instruction input aiming at a camera application interface, and switching into a shooting mode;
acquiring a face video shot for a target object in the shooting mode.
Optionally, the first determining unit 802 is specifically configured to:
extracting a face image portion and an audio portion of the face video, the face image portion including at least one frame of face image;
acquiring the facial feature information based on the facial image part;
and acquiring the voice characteristic information based on the audio part.
Optionally, the first determining unit 802 is specifically configured to:
acquiring a first facial image in the facial image part, positioning a biological characteristic region of the first facial image, and identifying biological characteristic information of the biological characteristic region;
identifying facial micro-expression information of the first facial image based on a facial description sample;
acquiring a next frame of facial image of the first facial image, taking the next frame of facial image as the first facial image, and performing the step of locating the biometric region of the first facial image;
when the next frame of face image is determined not to exist, face feature information corresponding to the face video is generated.
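The frame-by-frame loop that the first determining unit performs can be sketched as follows; `locate_biometric_region` and `recognize_micro_expression` are stand-ins for the detectors the embodiment leaves unspecified:

```python
def extract_facial_features(frames, locate_biometric_region,
                            recognize_micro_expression):
    """Run both recognizers over every frame and collect the results.

    Mirrors the loop of the first determining unit: process the current
    frame, advance to the next, and stop when no next frame exists.
    """
    biometrics, micro_expressions = [], []
    for frame in frames:
        biometrics.append(locate_biometric_region(frame))
        micro_expressions.append(recognize_micro_expression(frame))
    # With no next frame left, the collected results form the facial
    # feature information corresponding to the whole face video.
    return {"biometric": biometrics, "micro_expression": micro_expressions}

# Toy stand-ins: numbers as frames, arithmetic as "recognition".
features = extract_facial_features([1, 2, 3],
                                   lambda f: f * 10,
                                   lambda f: f + 1)
```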
Optionally, the second determining unit 803 is specifically configured to:
and inputting the biological characteristic information, the facial micro-expression information and the voice characteristic information into a trained human-based recognition model, and outputting the human-based characteristic information of the target object.
Optionally, the apparatus further comprises:
a second obtaining unit 804, configured to obtain identity information of the target object;
the second determining unit 803 is specifically configured to:
and determining the human characteristic information of the target object based on the identity information, the facial characteristic information and the voice characteristic information.
Optionally, the obtaining unit 801 is configured to:
acquiring a face video which is acquired by a camera aiming at a target object and lasts for a preset time;
the obtaining unit 801 is further configured to:
acquiring a next face video which is acquired by the camera for the target object and lasts for a preset time length, and executing the step of determining face feature information and voice feature information contained in the face video;
and when the next face video is the last face video set to be acquired, generating the final human characteristic information based on the human characteristic information of all the face videos.
It is clear to a person skilled in the art that the solution according to the embodiments of the present application can be implemented by means of software and/or hardware. The "unit" and "module" in this specification refer to software and/or hardware that can perform a specific function independently or in cooperation with other components, where the hardware may be, for example, an FPGA (Field-Programmable Gate Array), an IC (Integrated Circuit), or the like.
Each processing unit and/or module in the embodiments of the present application may be implemented by an analog circuit that implements the functions described in the embodiments of the present application, or may be implemented by software that executes the functions described in the embodiments of the present application.
The embodiment of the application also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the human recognition method. The computer-readable storage medium may include, but is not limited to, any type of disk including floppy disks, optical disks, DVD, CD-ROMs, microdrive, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, DRAMs, VRAMs, flash memory devices, magnetic or optical cards, nanosystems (including molecular memory ICs), or any type of media or device suitable for storing instructions and/or data.
Referring to fig. 9, a schematic structural diagram of an electronic device according to an embodiment of the present application is shown, where the electronic device can be used to implement the human recognition method in the foregoing embodiment. Specifically, the method comprises the following steps:
the memory 920 may be used to store software programs and modules, and the processor 990 may execute various functional applications and data processing by operating the software programs and modules stored in the memory 920. The memory 920 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the terminal device, and the like. Further, the memory 920 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, memory 920 may also include a memory controller to provide access to memory 920 by processor 990 and input unit 930.
The input unit 930 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. In particular, the input unit 930 may include a touch-sensitive surface 931 (e.g., a touch screen, a touch pad, or a touch frame). The touch-sensitive surface 931, also referred to as a touch screen or a touch pad, may collect touch operations by a user on or near the touch-sensitive surface 931 (e.g., operations by a user on or near the touch-sensitive surface 931 using a finger, a stylus, or any other suitable object or attachment) and drive the corresponding connecting device according to a predetermined program. Alternatively, the touch sensitive surface 931 may include both a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, and sends the touch point coordinates to the processor 990, and can receive and execute commands sent by the processor 990. In addition, the touch sensitive surface 931 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave.
The display unit 940 may be used to display information input by or provided to a user and various graphic user interfaces of the terminal device, which may be configured by graphics, text, icons, video, and any combination thereof. The Display unit 940 may include a Display panel 941, and optionally, the Display panel 941 may be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like. Further, the touch-sensitive surface 931 can overlay the display panel 941, and when a touch operation is detected on or near the touch-sensitive surface 931, the touch operation is transmitted to the processor 990 to determine the type of touch event, and then the processor 990 provides a corresponding visual output on the display panel 941 according to the type of touch event. Although in FIG. 9 the touch-sensitive surface 931 and the display panel 941 are shown as two separate components to implement input and output functions, in some embodiments the touch-sensitive surface 931 and the display panel 941 may be integrated to implement input and output functions.
The processor 990 is a control center of the terminal device, connects various parts of the entire terminal device using various interfaces and lines, and performs various functions of the terminal device and processes data by operating or executing software programs and/or modules stored in the memory 920 and calling data stored in the memory 920, thereby integrally monitoring the terminal device. Optionally, processor 990 may include one or more processing cores; processor 990 may, among other things, integrate an application processor that handles primarily the operating system, user interface, and applications, etc., and a modem processor that handles primarily wireless communications. It is to be appreciated that the modem processor described above may not be integrated into processor 990.
Specifically, in this embodiment, the display unit of the terminal device is a touch screen display, the terminal device further includes a memory, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs include steps for implementing the human recognition method.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
All functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A method of human recognition, the method comprising:
acquiring a face video of a target object;
determining face feature information and voice feature information contained in the face video;
and determining the human characteristic information of the target object based on the facial characteristic information and the voice characteristic information.
2. The method of claim 1, wherein the obtaining a facial video of a target object comprises:
receiving a human-based function identification instruction input aiming at a camera application interface, and switching into a shooting mode;
acquiring a face video shot for a target object in the shooting mode.
3. The method of claim 1, wherein the determining facial feature information and voice feature information contained in the facial video comprises:
extracting a face image portion and an audio portion of the face video, the face image portion including at least one frame of face image;
acquiring the facial feature information based on the facial image part;
and acquiring the voice characteristic information based on the audio part.
4. The method of claim 3, wherein the facial feature information includes facial micro-expression information and biometric information, and wherein the obtaining the facial feature information based on the facial image portion comprises:
acquiring a first facial image in the facial image part, positioning a biological characteristic region of the first facial image, and identifying biological characteristic information of the biological characteristic region;
identifying facial micro-expression information of the first facial image based on a facial description sample;
acquiring a next frame of facial image of the first facial image, taking the next frame of facial image as the first facial image, and performing the step of locating the biometric region of the first facial image;
when the next frame of face image is determined not to exist, face feature information corresponding to the face video is generated.
5. The method according to claim 4, wherein the determining the human characteristic information of the target object based on the facial characteristic information and the voice characteristic information comprises:
and inputting the biological characteristic information, the facial micro-expression information and the voice characteristic information into a trained human-based recognition model, and outputting the human-based characteristic information of the target object.
6. The method of claim 1, further comprising:
acquiring identity information of the target object;
the determining the human characteristic information of the target object based on the facial characteristic information and the voice characteristic information comprises:
and determining the human characteristic information of the target object based on the identity information, the facial characteristic information and the voice characteristic information.
7. The method of claim 1, wherein the obtaining a facial video of a target object comprises:
acquiring a face video which is acquired by a camera aiming at a target object and lasts for a preset time;
after determining the human characteristic information of the target object based on the facial characteristic information and the voice characteristic information, the method further comprises:
acquiring a next face video which is acquired by the camera for the target object and lasts for a preset time length, and executing the step of determining face feature information and voice feature information contained in the face video;
and when the next face video is the set last face video, generating the human characteristic information of all the face videos including the human characteristic information.
8. A human recognition apparatus, the apparatus comprising:
an acquisition unit configured to acquire a face video of a target object;
a first determination unit configured to determine face feature information and voice feature information contained in the face video;
a second determination unit configured to determine human characteristic information of the target object based on the facial characteristic information and the voice characteristic information.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1-7 are implemented when the program is executed by the processor.
CN201911085273.XA 2019-11-08 2019-11-08 Human identification method, device, storage medium and electronic equipment Withdrawn CN111144197A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911085273.XA CN111144197A (en) 2019-11-08 2019-11-08 Human identification method, device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN111144197A true CN111144197A (en) 2020-05-12

Family

ID=70517046

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911085273.XA Withdrawn CN111144197A (en) 2019-11-08 2019-11-08 Human identification method, device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN111144197A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108053838A (en) * 2017-12-01 2018-05-18 上海壹账通金融科技有限公司 With reference to audio analysis and fraud recognition methods, device and the storage medium of video analysis
CN108764010A (en) * 2018-03-23 2018-11-06 姜涵予 Emotional state determines method and device
CN109409296A (en) * 2018-10-30 2019-03-01 河北工业大学 The video feeling recognition methods that facial expression recognition and speech emotion recognition are merged
CN109522818A (en) * 2018-10-29 2019-03-26 中国科学院深圳先进技术研究院 A kind of method, apparatus of Expression Recognition, terminal device and storage medium
CN110378228A (en) * 2019-06-17 2019-10-25 深圳壹账通智能科技有限公司 Video data handling procedure, device, computer equipment and storage medium are examined in face

Similar Documents

Publication Publication Date Title
US11062090B2 (en) Method and apparatus for mining general text content, server, and storage medium
KR100947990B1 (en) Gaze Tracking Apparatus and Method using Difference Image Entropy
US10198071B2 (en) Methods and apparatuses for determining control information
EP3693966B1 (en) System and method for continuous privacy-preserved audio collection
WO2020019591A1 (en) Method and device used for generating information
US11641352B2 (en) Apparatus, method and computer program product for biometric recognition
CN107609463B (en) Living body detection method, living body detection device, living body detection equipment and storage medium
WO2021135692A1 (en) Data processing method and device for attention deficit hyperactivity disorder and terminal device
CN104808794A (en) Method and system for inputting lip language
CN109887187A (en) A kind of pickup processing method, device, equipment and storage medium
US20210192221A1 (en) System and method for detecting deception in an audio-video response of a user
CN110059624B (en) Method and apparatus for detecting living body
JP2018032164A (en) Interview system
US8810362B2 (en) Recognition system and recognition method
CN111158490B (en) Auxiliary semantic recognition system based on gesture recognition
CN111967739A (en) Concentration degree-based online teaching method and system
CN110286771B (en) Interaction method, device, intelligent robot, electronic equipment and storage medium
Park et al. Achieving real-time sign language translation using a smartphone's true depth images
EP3200092A1 (en) Method and terminal for implementing image sequencing
CN110443238A (en) A kind of display interface scene recognition method, terminal and computer readable storage medium
CN110633677A (en) Face recognition method and device
US10558795B2 (en) Information processing apparatus, information processing system, and method of processing information
CN110634570A (en) Diagnostic simulation method and related device
CN104992085A (en) Method and device for human body in-vivo detection based on touch trace tracking
JP2016045724A (en) Electronic apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200512