WO2021007857A1 - Identity authentication method, terminal device, and storage medium - Google Patents

Identity authentication method, terminal device, and storage medium

Info

Publication number
WO2021007857A1
WO2021007857A1 PCT/CN2019/096579 CN2019096579W WO2021007857A1 WO 2021007857 A1 WO2021007857 A1 WO 2021007857A1 CN 2019096579 W CN2019096579 W CN 2019096579W WO 2021007857 A1 WO2021007857 A1 WO 2021007857A1
Authority
WO
WIPO (PCT)
Prior art keywords
text information
recognition
detected
image
lip
Prior art date
Application number
PCT/CN2019/096579
Other languages
French (fr)
Chinese (zh)
Inventor
艾静雅
柳彤
朱大卫
汤慧秀
Original Assignee
深圳海付移通科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳海付移通科技有限公司
Priority to CN201980010284.3A (publication CN111684459A)
Priority to PCT/CN2019/096579 (publication WO2021007857A1)
Publication of WO2021007857A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/32 User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships

Definitions

  • This application relates to the technical field of identity verification, in particular to an identity verification method, terminal device, and storage medium.
  • With the development of society, people rely more and more on terminal devices. For security and privacy reasons, terminal devices need identity verification to determine whether the current user has permission to use them; for example, most smartphones encrypt the lock screen and some private content.
  • Fingerprint verification cannot be performed without contact, so it does not allow quick, contactless verification.
  • Character (password) verification has shortcomings such as being easy to forget and easy to copy.
  • Fingerprint verification can only confirm that a matching feature is presented; it cannot guarantee that a real person is present, since the feature may come from a fingerprint film.
  • The main problem solved by this application is to provide an identity verification method, terminal device, and storage medium that realize a verification method that is difficult to copy, resistant to forgetting, and contactless, improving the accuracy of identity verification and making the use of terminal devices more secure.
  • the technical solution adopted in this application is to provide an identity verification method.
  • The method includes: when the terminal device obtains a setting operation instruction, collecting a first image to be detected and performing face recognition on it; after the face recognition passes, collecting a plurality of consecutive second images to be detected and performing lip shape recognition on them; and after the lip shape recognition passes, responding to the setting operation instruction.
  • After the face recognition passes, collecting a plurality of consecutive second images to be detected and performing lip shape recognition on them includes: collecting the consecutive second images to be detected; performing lip shape recognition on them to obtain recognized text information; determining whether the recognized text information is text information in the preset whitelist; and if so, determining that the lip shape recognition passes.
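The whitelist check in the step above can be sketched as follows. This is an illustrative reduction, not the patent's implementation; the phrase contents and function name are invented for the example.

```python
# Hypothetical sketch: lip reading yields a recognized phrase, and
# verification passes only if that exact phrase is in the preset whitelist.

def lip_recognition_passes(recognized_text, whitelist):
    """Return True when the lip-read phrase matches a whitelisted phrase."""
    return recognized_text.strip() in whitelist

# Example whitelist of user-entered phrases (invented values).
whitelist = {"open sesame", "rainy day fund"}
```

A real system would compare the model's decoded text, but the pass/fail logic is this simple membership test.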
  • the method further includes: acquiring text information entered by the user; adding the entered text information to the whitelist.
  • After the face recognition passes, collecting a plurality of consecutive second images to be detected and performing lip shape recognition on them may also include: collecting the consecutive second images to be detected; performing lip shape recognition on them to obtain recognized text information; determining whether the recognized text information is text information in the preset blacklist; and if so, determining that the lip shape recognition fails.
  • the method further includes: adding at least one piece of text information in the white list to the black list, and deleting the text information from the white list.
  • Alternatively, after the face recognition passes, standard text information is displayed and a plurality of consecutive second images to be detected are collected; lip shape recognition is performed on them to obtain recognized text information; it is determined whether the recognized text information is the same as the standard text information; and if so, the lip shape recognition is determined to pass.
  • displaying standard text information includes: randomly selecting one text information from a plurality of text information in the database as the standard text information, and displaying the standard text information.
  • Performing lip shape recognition on multiple second images to obtain recognized text information includes: extracting face information from the multiple second images; extracting multiple continuously changing lip shape features from the face information; and obtaining the recognized text information based on those continuously changing lip shape features.
  • Obtaining recognized text information based on a plurality of continuously changing lip shape features includes: inputting the features into the lip shape recognition model so that the model generates corresponding pronunciation information, and calculating the corresponding recognized text information based on that pronunciation information.
  • When the mobile terminal obtains the setting operation instruction, collecting the first image to be detected and performing face recognition on it includes: collecting the first image to be detected; extracting the face image in the first image to be detected; and performing face recognition on the face image.
  • The face recognition of the face image includes: extracting face feature information from the face image; comparing the face feature information with pre-stored standard face feature information for similarity; and determining that the face recognition passes when the similarity result meets the preset requirement.
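The similarity comparison described above can be sketched with cosine similarity over feature vectors. The patent does not specify a similarity measure or threshold, so both are assumptions here; the 0.9 cutoff is an example value only.

```python
import math

# Illustrative sketch (not the patent's exact algorithm): compare an
# extracted face-feature vector against the pre-stored standard vector
# by cosine similarity, passing when the score meets a preset threshold.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def face_recognition_passes(features, stored_features, threshold=0.9):
    """True when the similarity result meets the preset requirement."""
    return cosine_similarity(features, stored_features) >= threshold
```

In practice the feature vectors would come from the face image feature extraction step; any distance or similarity measure with a calibrated threshold fits the same pattern.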
  • the setting operation instruction is a payment operation instruction; after the lip shape recognition is passed, responding to the setting operation instruction includes: after the lip shape recognition is passed, responding to the payment operation instruction to complete the corresponding payment.
  • A terminal device is provided, which includes a processor, a camera module connected to the processor, and a memory; the memory is used to store program data, and the processor is used to execute the program data to implement the method described above.
  • another technical solution adopted in this application is to provide a computer storage medium for storing program data, and the program data is used to implement the above-mentioned method when the program data is executed by a processor.
  • The terminal device includes: a first recognition module that, when the setting operation instruction is obtained, collects the first image to be detected and performs face recognition on it; a second recognition module that, after the face recognition passes, collects a plurality of consecutive second images to be detected and performs lip shape recognition on them; and a response module that responds to the setting operation instruction after the lip shape recognition passes.
  • The identity verification method of the present application combines face recognition and lip shape recognition to achieve a verification method that is difficult to copy, resistant to forgetting, and contactless.
  • FIG. 1 is a schematic flowchart of the first embodiment of the identity verification method provided by the present application.
  • FIG. 2 is a schematic flowchart of a second embodiment of the identity verification method provided by the present application.
  • FIG. 3 is a schematic flowchart of a third embodiment of the identity verification method provided by the present application.
  • FIG. 4 is a schematic flowchart of a fourth embodiment of the identity verification method provided by the present application.
  • FIG. 5 is a schematic flowchart of a fifth embodiment of the identity verification method provided by the present application.
  • FIG. 6 is a schematic flowchart of a sixth embodiment of the identity verification method provided by the present application.
  • FIG. 7 is a schematic structural diagram of a first embodiment of a terminal device provided by the present application.
  • FIG. 8 is a schematic structural diagram of a second embodiment of a terminal device provided by the present application.
  • FIG. 9 is a schematic structural diagram of an embodiment of a computer storage medium provided by the present application.
  • Fig. 1 is a schematic flowchart of a first embodiment of an identity verification method provided by the present application. The method includes:
  • Step 11 When the terminal device obtains the setting operation instruction, it collects the first image to be detected, and performs face recognition on the first image to be detected.
  • When the terminal device obtains the setting operation instruction, it calls the camera module, collects the first image to be detected, and performs face recognition on it.
  • the terminal device may be a mobile terminal, such as a smart phone, a tablet computer, a wearable device, etc.
  • the setting operation instruction may be an unlock screen instruction.
  • When the terminal device obtains the unlock screen instruction, it turns on the camera to collect the image information currently within the camera's shooting range and performs face recognition on the collected image information. It can be understood that when there is no face information in the collected image information, the terminal device stops the recognition, or prompts the user to re-collect and recognize again.
  • Face recognition can be divided into face image acquisition and detection, face image preprocessing, face image feature extraction, and matching and recognition.
  • Face image collection. Different face images can be collected through the camera lens, including static images, dynamic images, different positions, and different expressions. When the user is within the shooting range of the capture device, the capture device automatically searches for and shoots the user's face image.
  • Face detection. In practice, face detection is mainly used as preprocessing for face recognition, that is, to accurately calibrate the position and size of the face in the image.
  • The pattern features contained in face images are very rich, such as histogram features, color features, template features, structural features, and Haar features. Face detection picks out this useful information and uses these features to realize detection.
  • The Adaboost algorithm is used to select the rectangular features (weak classifiers) that best represent the face; the weak classifiers are combined into a strong classifier by weighted voting, and several trained strong classifiers are connected in series to form a cascade classifier, which effectively improves the detection speed of the classifier.
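The cascade idea above can be shown with a toy sketch: each strong classifier is a weighted vote over simple weak classifiers, and a candidate window is rejected as soon as any stage says "not a face". All features, weights, and thresholds below are invented for illustration; a production detector would use trained Haar-feature stages.

```python
# Toy sketch of an Adaboost-style cascade. A weak classifier is modeled
# as a threshold test (index, cutoff, weight) on a feature vector.

def strong_classifier(features, weak_tests, threshold):
    """Weighted vote of weak classifiers; True when the vote clears threshold."""
    score = sum(weight for (index, cutoff, weight) in weak_tests
                if features[index] > cutoff)
    return score >= threshold

def cascade_detect(features, stages):
    """Early-exit cascade: every stage must accept the candidate window."""
    return all(strong_classifier(features, weak_tests, stage_threshold)
               for weak_tests, stage_threshold in stages)
```

The speedup comes from the early exit: most non-face windows fail the first cheap stage and never reach the later, more expensive ones.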
  • Face image preprocessing. Based on the face detection result, the image is processed in service of feature extraction. Due to various conditions and random interference, the original image acquired by the system cannot be used directly; it must be preprocessed by grayscale correction and noise filtering at an early stage of image processing.
  • the preprocessing process mainly includes light compensation, gray scale transformation, histogram equalization, normalization, geometric correction, filtering and sharpening of the face image.
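One of the preprocessing steps named above, histogram equalization, can be sketched in pure Python on a flat list of 8-bit grayscale pixel values. Real pipelines would operate on camera frames with an image library; this is a minimal stand-in to show the transform.

```python
# Minimal histogram equalization: stretch the cumulative distribution of
# gray levels so the output uses the full 0..255 range.

def equalize_histogram(pixels, levels=256):
    n = len(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution function of the gray levels.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    denom = n - cdf_min
    if denom == 0:  # flat image: nothing to stretch
        return list(pixels)
    return [round((cdf[p] - cdf_min) / denom * (levels - 1)) for p in pixels]
```

The other listed steps (light compensation, normalization, geometric correction, filtering, sharpening) follow the same shape: a per-pixel or per-region transform applied before feature extraction.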
  • Face image feature extraction. The features usable in face recognition are usually divided into visual features, pixel statistical features, face image transformation coefficient features, and face image algebraic features. Face feature extraction, also known as face representation, is a process of feature modeling of human faces based on certain characteristics of the face. Feature extraction methods can be summarized into two categories: knowledge-based representation methods, and methods based on algebraic features or statistical learning.
  • The knowledge-based representation method mainly derives feature data helpful for face classification from the shape description of facial organs and the distance characteristics between them; its feature components usually include the Euclidean distance, curvature, and angle between feature points.
  • the human face is composed of parts such as eyes, nose, mouth, and chin. The geometric description of these parts and the structural relationship between them can be used as important features to recognize the face. These features are called geometric features.
  • Knowledge-based face representation mainly includes geometric feature-based methods and template matching methods.
  • Face image matching and recognition. The extracted feature data of the face image is searched and matched against the feature templates stored in the database; a threshold is set, and when the similarity exceeds it, the matching result is output. Face recognition compares the facial features to be recognized with the stored facial feature template and judges the identity information of the face based on the degree of similarity. This process falls into two categories: confirmation, a one-to-one image comparison, and identification, a one-to-many image matching and comparison.
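The two matching modes above can be sketched as follows. The similarity function is passed in as a placeholder; any calibrated face-feature similarity (such as the cosine measure) fits, and the 0.9 threshold is an assumed example value.

```python
# Sketch of the two matching modes: "confirmation" is 1:1 against one
# claimed identity's template; "identification" is a 1:N search for the
# best-matching template in the database.

def confirm(features, template, similarity, threshold=0.9):
    """1:1 verification against a single stored template."""
    return similarity(features, template) >= threshold

def identify(features, templates, similarity, threshold=0.9):
    """1:N identification: return the best-matching identity, or None."""
    best_id, best_score = None, threshold
    for identity, template in templates.items():
        score = similarity(features, template)
        if score >= best_score:
            best_id, best_score = identity, score
    return best_id
```

The unlock-screen scenario in this application is the confirmation case: the collected features are checked against the one enrolled owner template.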
  • Step 12 After the face recognition is passed, a plurality of consecutive second to-be-detected images are collected, and lip shape recognition is performed on the second to-be-detected images.
  • the manner of collecting consecutive multiple second to-be-detected images may be that the user speaks a paragraph of text, and the camera module of the terminal device collects consecutive images.
  • the lip shape recognition model uses complex end-to-end deep neural network technology to model the lip sequence and establish a vocabulary.
  • The terminal device can broadcast a passage of text by voice; the user repeats it while continuous images of the repetition are collected. Lip features are extracted from the continuous images, the corresponding pronunciation is recognized through the lip recognition model, the pronunciation is matched with password characters to obtain text information, and that text information is matched with the text broadcast by the terminal device. If the match passes, step 13 is executed.
  • Alternatively, a section of text can be displayed on the display screen of the terminal device; the user reads it while continuous images of the reading are collected. Lip features are extracted from the continuous images, the corresponding pronunciation is recognized through the lip recognition model and matched with password characters to obtain text information, and that text information is matched with the text displayed by the terminal device. If the match passes, step 13 is executed.
  • the text displayed by the terminal device may be preset text information or random text information.
  • Step 13 After the lip shape recognition is passed, respond to the setting operation instruction.
  • After the lip recognition passes: if the setting operation instruction is to unlock the screen, the terminal device unlocks the screen and displays the screen content; if it is to unlock the private album, the terminal device unlocks the private album and displays its photos; if it is a payment instruction, the terminal device completes the corresponding payment; if it is to view private information, the terminal device displays the private information.
  • For example, when a user needs to use a terminal device to pay a bill, the terminal device obtains a payment instruction and prompts the user to perform face recognition. After face recognition passes, the user is prompted to say a passage while continuous image information is collected synchronously; lip features are extracted from the image information and lip shape recognition is performed. After the lip shape recognition passes, the terminal device completes the corresponding payment.
  • the user clicks on an application on the terminal device.
  • the application requires identity verification.
  • The terminal device obtains the operation instruction and prompts the user to perform face recognition. After face recognition passes, text information is displayed on the screen and the user is prompted to read it. Continuous image information is collected synchronously, lip shape extraction and lip shape recognition are performed on it, and after the lip shape recognition passes, the application is unlocked.
  • In another example, after obtaining the setting operation instruction, the terminal device performs face recognition on the user. After it passes, the device obtains video of the user speaking the text information and splits the video into an audio stream and an image stream. Speech recognition on the audio stream yields one text; continuous lip feature extraction on the image stream, fed through the lip recognition model, yields the text contained in the lip features. The two recognized texts are compared; if they are the same, the recognized text is matched against the preset text information. If the match succeeds, identity verification is considered successful and the terminal device responds to the setting operation instruction.
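The two-channel cross-check above can be sketched as a small decision function. The recognizers themselves (speech recognition and lip reading) are out of scope here; the sketch assumes their text outputs are already available, and the function name is invented.

```python
# Sketch of the audio/image cross-check: the spoken phrase is recognized
# twice, once from the audio stream (speech recognition) and once from
# the image stream (lip reading). Verification succeeds only when both
# transcripts agree AND the agreed text matches a preset phrase.

def verify_spoken_phrase(audio_text, lip_text, preset_phrases):
    if audio_text.strip() != lip_text.strip():
        return False  # the two recognition channels disagree
    return audio_text.strip() in preset_phrases
```

Requiring agreement between the two channels is what defeats a replayed recording or a silent lip-synced video: an attacker must fool both recognizers with the same phrase at the same time.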
  • In summary, the identity verification method of the present application includes: when the terminal device obtains a setting operation instruction, collecting a first image to be detected and performing face recognition on it; after the face recognition passes, collecting multiple consecutive second images to be detected and performing lip shape recognition on them; and after the lip shape recognition passes, responding to the setting operation instruction.
  • Figure 2 is a schematic flowchart of a second embodiment of the identity verification method provided by the present application, and the method includes:
  • Step 21 When the terminal device obtains the setting operation instruction, it collects the first image to be detected, and performs face recognition on the first image to be detected.
  • Step 22 After the face recognition is passed, a plurality of consecutive second to-be-detected images are collected.
  • the terminal device prompts the user to collect the second to-be-detected image, such as prompting the user to face the camera and speak text information.
  • Since face recognition has already verified the current user's identity to a certain extent, when collecting the second image to be detected the user can be guided by relevant prompts to say one of the text messages in the preset whitelist.
  • Step 23 Perform lip shape recognition on a plurality of second to-be-detected images to obtain recognized text information.
  • Lip shape recognition technology is a technology that interprets the content of speech based on the movement of the lips when speaking.
  • To do this, it is necessary to collect multiple images containing the speaker's lip movement, or a video containing it, and then combine image processing technology and deep learning technology to identify the multi-frame continuous image sequence: the lip shapes in the sequence are identified and mapped to pronunciations, and the corresponding natural-language words and sentences, that is, the content of the speech, are determined from the pronunciations over a continuous period of time.
  • the lip shape feature is input to the lip shape recognition model, so that the lip shape recognition model can obtain corresponding pronunciation information, and based on the pronunciation information, the corresponding recognized text information is calculated.
  • The lip recognition model may be an end-to-end algorithm model based on an encoder-decoder architecture that fuses a spatio-temporal convolutional neural network feature extractor with a word embedding network and uses an attention mechanism.
  • the feature extractor uses a spatio-temporal convolutional neural network (STCNN)
  • the encoder-decoder subunit uses a long short-term memory network (LSTM)
  • the word embedding (Embedding) encoding method uses Word2vec.
  • The lip shape recognition model can be trained on a Mandarin Chinese lip reading data set: an improved multi-task cascaded convolutional neural network (MTCNN) extracts the lip region from the silent video, and the extracted lip region is then fed into the spatio-temporal convolutional network (STCNN) to extract the visual feature information of the lip action.
  • the encoder-decoder based on LSTM is used to encode lip visual feature information and decode it into relevant text information during model inference.
  • the attention mechanism can make the model decoder pay attention to the coded content of the encoder at a specific location, instead of using the entire coded content as a basis for decoding, thereby improving the decoding effect of the model.
  • THULAC (THU Lexical Analyzer for Chinese) is a Chinese lexical analysis toolkit that can be used for word segmentation.
  • the role of this part in the network is essentially to act as a character encoding.
  • The encoder-decoder architecture encodes a variable-length sequence into a fixed-length representation and decodes a given fixed-length vector back into a variable-length sequence. From a probabilistic point of view, the model learns the conditional probability distribution of one variable-length sequence conditioned on another.
  • the lip shape recognition model can use the above scheme or other related schemes to establish different databases according to different languages, so as to be applied to different language regions.
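The full model described above (STCNN features, LSTM encoder-decoder, attention, Word2vec embeddings) requires a deep-learning framework; as a stand-in, the toy decoder below shows only the overall data flow: a sequence of lip-shape codes is mapped to pronunciation units, which are then looked up in a vocabulary. Every table and name here is invented for illustration.

```python
# Toy illustration of the lip-reading data flow: lip shapes -> sounds -> word.
# A real system replaces both lookup tables with learned networks.

LIP_TO_SOUND = {"wide": "a", "round": "o", "closed": "m", "spread": "i"}
VOCABULARY = {("m", "a"): "ma", ("m", "i"): "mi", ("a", "o"): "ao"}

def decode_lip_sequence(lip_shapes):
    """Map a sequence of lip-shape labels to a vocabulary word."""
    sounds = tuple(LIP_TO_SOUND[shape] for shape in lip_shapes)
    return VOCABULARY.get(sounds, "<unknown>")
```

The point of the learned model is precisely that these mappings are ambiguous and context dependent (many sounds share a lip shape), which is why the patent's scheme uses temporal convolution and attention over the whole sequence rather than per-frame lookups.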
  • Step 24 Determine whether the recognized text information is the text information in the preset whitelist.
  • the user enters some text information in the terminal device, and adds the entered text information to the whitelist.
  • If so, step 25 is executed.
  • The whitelist may contain only one paragraph of text or multiple paragraphs.
  • Step 25 Confirm that the lip shape recognition passes.
  • Step 26 After the lip shape recognition is passed, respond to the setting operation instruction.
  • the terminal device contains private short messages, which need to be authenticated before they can be viewed.
  • the terminal device responds to this operation instruction to perform face recognition on the user.
  • After the face recognition passes, the terminal device uses the camera to collect multiple consecutive images of the user reading the text information. The corresponding text information is calculated through the lip shape recognition model and matched with the preset text information in the whitelist. If the match succeeds, the lip shape recognition is determined to pass, and the terminal device responds to the setting operation instruction and displays the private short messages for the user to view.
  • Fig. 3 is a schematic flowchart of a third embodiment of an identity verification method provided by the present application, and the method includes:
  • Step 31 When the terminal device obtains the setting operation instruction, it collects the first image to be detected, and performs face recognition on the first image to be detected.
  • Step 32 After the face recognition is passed, a plurality of consecutive second to-be-detected images are collected.
  • Steps 31-32 have the same or similar technical solutions as the foregoing embodiment, and will not be repeated here.
  • Step 33 Perform lip shape recognition on a plurality of second to-be-detected images to obtain recognized text information.
  • the lip shape feature is input to the lip shape recognition model, so that the lip shape recognition model can obtain corresponding pronunciation information, and based on the pronunciation information, the corresponding recognized text information is calculated.
  • Step 34 Determine whether the recognized text information is the text information in the preset blacklist.
  • the user enters some text information in the terminal device, and adds the entered text information to the blacklist.
  • If so, step 35 is executed.
  • The blacklist may contain only one paragraph of text or multiple paragraphs.
  • At least one piece of text information in the white list is added to the black list, and the text information is deleted from the white list.
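The list-maintenance step above, retiring a phrase from the whitelist into the blacklist as one operation, can be sketched directly. The function name is invented for the example.

```python
# Sketch: move a phrase from the whitelist to the blacklist so that a
# retired phrase can never pass verification again.

def retire_phrase(phrase, whitelist, blacklist):
    whitelist.discard(phrase)  # no error if the phrase was already gone
    blacklist.add(phrase)
```

Doing both updates together matters: a phrase that was merely deleted from the whitelist would be unknown, while a blacklisted phrase actively fails verification, as the blacklist check in steps 34-35 describes.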
  • Step 35 Determine that the lip shape recognition fails.
  • If the recognized text information differs from the text information in the blacklist, it is matched against the text information in the whitelist. If they are the same, the lip recognition passes and the terminal device responds to the setting operation instruction.
  • When a piece of text information is no longer trusted, it is deleted from the whitelist and added to the blacklist, so that it can be used to verify whether the text information recognized from lip shapes is safe.
  • each piece of text information in the whitelist of the terminal device is time-sensitive.
  • the time limit can be two hours, twenty hours, forty-eight hours, and the specific time limit is set by the system or user requirements.
  • When a text message exceeds the time limit, the terminal device automatically deletes it, adds it to the blacklist, and prompts or reminds the user that the text message has exceeded the time limit and may be a security risk.
  • In addition, there may be a limit on the number of times each piece of text information can be used for identity verification (the limit can be ten, twenty, fifty, or one hundred; the specific limit is set by the system or by user needs).
  • When a piece of text information exceeds the usage limit, the terminal device automatically deletes it, adds it to the blacklist, and prompts or reminds the user that the text message has exceeded its number of uses and may be a security risk. This ensures iterative updating of the text information in the whitelist, which helps ensure information security and makes the text difficult to steal. Even if a piece of text information is stolen, it is already in the blacklist, and using it cannot pass identity verification.
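The time limit and use-count limit described above can be sketched together: each whitelist entry carries an expiry timestamp and a remaining-use counter, and an entry that expires or runs out of uses is moved to the blacklist. The class name is invented; the two-hour and ten-use defaults are example values taken from the text.

```python
import time

# Sketch of whitelist entries with a time-to-live and a usage limit.
# Exhausted or expired phrases are automatically moved to the blacklist.

class PhraseList:
    def __init__(self, ttl_seconds=2 * 3600, max_uses=10):
        self.ttl, self.max_uses = ttl_seconds, max_uses
        self.whitelist = {}   # phrase -> (expires_at, uses_left)
        self.blacklist = set()

    def add(self, phrase, now=None):
        now = time.time() if now is None else now
        self.whitelist[phrase] = (now + self.ttl, self.max_uses)

    def check(self, phrase, now=None):
        """True if the phrase is currently valid; retires it when exhausted."""
        now = time.time() if now is None else now
        if phrase not in self.whitelist:
            return False
        expires_at, uses_left = self.whitelist[phrase]
        if now > expires_at or uses_left <= 0:
            del self.whitelist[phrase]
            self.blacklist.add(phrase)
            return False
        self.whitelist[phrase] = (expires_at, uses_left - 1)
        return True
```

The `now` parameter exists only so the expiry logic can be exercised deterministically; a device would use the real clock.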
  • FIG. 4 is a schematic flowchart of a fourth embodiment of an identity verification method provided by the present application, and the method includes:
  • Step 41 When the terminal device obtains the setting operation instruction, it collects the first image to be detected, and performs face recognition on the first image to be detected.
  • Step 42 After the face recognition is passed, the standard text information is displayed, and a plurality of consecutive second to-be-detected images are collected.
  • The terminal device displays standard text on the display screen, prompting the user to face the camera and read the displayed standard text aloud, while the camera collects multiple consecutive second images to be detected as the user reads.
  • the standard text information may be one of multiple text information entered in advance by the user.
  • the standard text information may be randomly selected from a plurality of text information in the database as the standard text information.
  • the standard text information may be randomly selected from the cloud server as the standard text information.
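The random selection described above is a one-line operation; the sketch below assumes the pre-entered text database is available as a list, and the function name is invented.

```python
import random

# Sketch of the random challenge-text selection: one phrase is drawn from
# the pre-entered database each time, so a replayed recording of an old
# phrase is unlikely to match the current challenge.

def pick_standard_text(database, rng=random):
    return rng.choice(database)
```

Passing the `rng` in makes the choice seedable for testing; the same pattern works whether the database is local or fetched from a cloud server.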
  • Step 43 Perform lip recognition on multiple second to-be-detected images to obtain recognized text information.
  • Specifically, step 43 includes: extracting the face information from the multiple consecutively collected second images to be detected; extracting multiple continuously changing lip features from the face information; and inputting those continuously changing lip features into the lip shape recognition model, which obtains the corresponding pronunciation information and calculates the corresponding recognized text information from it.
  • Step 44: Determine whether the recognized text information is the same as the standard text information.
  • it is judged whether the recognized text information is the same as the standard text information; if they are the same, step 45 is executed.
  • Step 45: Confirm that the lip shape recognition passes.
  • after confirming that the lip shape recognition has passed, the terminal device responds to the setting operation instruction to complete the corresponding operation.
  • FIG. 5 is a schematic flowchart of a fifth embodiment of an identity verification method provided by the present application. The method includes:
  • Step 51: The mobile terminal collects a first image to be detected when acquiring the setting operation instruction.
  • Step 52: Extract the face image from the first image to be detected.
  • if no face image can be extracted, the terminal device re-acquires the first image to be detected and prompts the user to face the camera, so that the newly collected first image to be detected contains a face image.
  • Step 53: Extract facial feature information from the face image.
  • a local feature extraction method can be used to extract the facial feature information from the face image.
  • feature extraction based on facial organs can be used.
  • feature extraction based on templates can be used.
  • feature extraction based on elastic map matching methods can be used.
  • alternatively, the method for extracting facial feature information from the face image may adopt a holistic (overall) feature extraction method.
  • feature extraction based on algebraic methods or on neural networks can be used.
  • feature extraction based on wavelet multi-resolution can be used.
  • Step 54: Compare the facial feature information with the pre-stored standard facial feature information for similarity.
  • the pre-stored standard facial feature information is the facial feature information extracted from the facial image information collected in advance by the user.
  • the pre-stored standard facial feature information may be organized in groups, where the facial feature information in each group forms one face image, so that multiple sets of standard facial feature information can be pre-stored in the terminal device.
  • Step 55: When the result of the similarity comparison meets the preset requirement, it is determined that the face recognition passes.
  • the similarity comparison may compare each single feature with the corresponding single standard facial feature and then multiply the comparison results of the multiple single features; when the product is greater than a preset value, the face recognition is determined to pass. Taking the nose, eyes, and mouth as example single features: the similarity value of the nose is 0.95, that of the eyes is 0.85, and that of the mouth is 0.99. Multiplying the three values gives 0.95 × 0.85 × 0.99 ≈ 0.8; the preset value is 0.75, and 0.8 > 0.75, so the similarity comparison result meets the preset requirement and the face recognition is determined to pass.
  • the similarity comparison method may be to compare the overall feature with the overall standard face feature, and when the comparison result is greater than a preset value, it is determined that the face recognition passes.
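The per-feature multiplication scheme in the worked example above can be written out directly; the feature order (nose, eyes, mouth) and the threshold 0.75 are taken from the example in the text:

```python
def face_passes(similarities, threshold=0.75):
    """Multiply per-feature similarity scores (e.g. nose, eyes, mouth)
    and pass face recognition when the product exceeds the preset value."""
    product = 1.0
    for s in similarities:
        product *= s
    return product > threshold

# Worked example from the text: 0.95 * 0.85 * 0.99 is about 0.80, and 0.80 > 0.75
assert face_passes([0.95, 0.85, 0.99], threshold=0.75)
```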
  • Step 56: After the face recognition is passed, a plurality of consecutive second images to be detected are collected, and lip shape recognition is performed on the second images to be detected.
  • Step 57: After the lip shape recognition is passed, respond to the setting operation instruction.
  • FIG. 6 is a schematic flowchart of a sixth embodiment of an identity verification method provided by the present application. The method includes:
  • Step 61: When the terminal device obtains the payment operation instruction, it collects a first image to be detected and performs face recognition on the first image to be detected.
  • Step 62: After the face recognition is passed, collect a plurality of consecutive second images to be detected and perform lip shape recognition on the second images to be detected.
  • Step 63: After the lip shape recognition is passed, respond to the payment operation instruction to complete the corresponding payment.
  • when the terminal device obtains a payment operation instruction and the payment amount is small, the face recognition in step 61 can be skipped; the current user only needs to correctly read out any piece of text information set in the whitelist. After the lip shape recognition is passed, the terminal device responds to the payment operation instruction to complete the corresponding payment.
  • when the terminal device obtains the payment operation instruction, the current user must first confirm through face recognition that he or she has the authority on the terminal device, and then correctly read any piece of text information set in the whitelist. After the lip shape recognition is passed, the terminal device responds to the payment operation instruction to complete the corresponding payment.
  • when the terminal device obtains the payment operation instruction, the current user must first confirm through face recognition that he or she has the authority on the terminal device, and then correctly read the random text information displayed on the terminal device. After the lip shape recognition is passed, the terminal device responds to the payment operation instruction to complete the corresponding payment.
  • if the user finds that a piece of text information in the whitelist on the terminal device poses a hidden security risk, for example because it has been stolen by others, that text information is deleted from the whitelist and added to the blacklist. In this way, even if the face recognition is passed, when lip shape recognition determines that the spoken text is text information in the blacklist, the terminal device is immediately locked and all operation instructions are terminated.
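The whitelist/blacklist behavior described in these embodiments can be sketched as follows. The class name and the three result strings are illustrative, not from the patent:

```python
class TextLists:
    """Sketch of the whitelist/blacklist logic: blacklisted phrases
    lock the device, whitelisted phrases pass, anything else fails."""

    def __init__(self, whitelist=None, blacklist=None):
        self.whitelist = set(whitelist or [])
        self.blacklist = set(blacklist or [])

    def revoke(self, text):
        # Move a compromised phrase from the whitelist to the blacklist.
        self.whitelist.discard(text)
        self.blacklist.add(text)

    def check(self, recognized_text):
        if recognized_text in self.blacklist:
            return "lock"   # lock the device, terminate all instructions
        if recognized_text in self.whitelist:
            return "pass"
        return "fail"
```

Note that the blacklist is checked first, so a revoked phrase locks the device even though it once passed.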
  • users of a terminal device can be divided into identities with different permissions; the user with the highest permission can quickly update the whitelist, determine the list of users with payment permission, and change that list at any time.
  • FIG. 7 is a schematic structural diagram of a first embodiment of a terminal device provided by the present application.
  • the terminal device 70 includes a processor 71, a camera module 72 connected to the processor 71, and a memory 73. The memory 73 is used to store program data, and the processor 71 is used to execute the program data to implement the following method:
  • when the terminal device obtains the setting operation instruction, it collects a first image to be detected and performs face recognition on the first image to be detected; after the face recognition is passed, it collects a plurality of consecutive second images to be detected and performs lip shape recognition on the second images to be detected; and after the lip shape recognition is passed, it responds to the setting operation instruction.
  • the processor 71 is used to execute the program data to implement the following method: after the face recognition is passed, collecting a plurality of consecutive second images to be detected; performing lip shape recognition on the plurality of second images to be detected to obtain recognized text information; determining whether the recognized text information is text information in a preset whitelist; and if so, determining that the lip shape recognition passes.
  • the processor 71 used to execute the program data is also used to implement the following methods: acquiring text information entered by the user; adding the entered text information to the white list.
  • the processor 71 is configured to execute the program data to implement the following method: after the face recognition is passed, collecting a plurality of consecutively read second images to be detected; performing lip shape recognition on the plurality of second images to be detected to obtain recognized text information; determining whether the recognized text information is text information in a preset blacklist; and if so, determining that the lip shape recognition fails.
  • the processor 71 used to execute the program data is also used to implement the following method: adding at least one piece of text information in the whitelist to the blacklist, and deleting that text information from the whitelist.
  • the processor 71 is used to execute the program data to implement the following method: after the face recognition is passed, displaying standard text information and collecting a plurality of consecutive second images to be detected; performing lip shape recognition on the plurality of second images to be detected to obtain recognized text information; determining whether the recognized text information is the same as the standard text information; and if so, determining that the lip shape recognition passes.
  • the processor 71 used to execute the program data is also used to implement the following method: randomly select one text information from a plurality of text information in the database as the standard text information, and display the standard text information.
  • the processor 71 is used to execute the program data to implement the following method: extracting face information from the plurality of second images to be detected; extracting a plurality of continuously changing lip features from the plurality of pieces of face information; and obtaining the recognized text information based on the plurality of continuously changing lip features.
  • the processor 71 used to execute the program data is also used to implement the following method: inputting the plurality of continuously changing lip features into the lip recognition model, so that the lip recognition model outputs the corresponding pronunciation information and, based on the pronunciation information, calculates the corresponding recognized text information.
  • the processor 71 is configured to execute the program data to implement the following method: when the mobile terminal obtains the setting operation instruction, collecting the first image to be detected; extracting the face image from the first image to be detected; and performing face recognition on the face image.
  • the processor 71 is used to execute the program data to implement the following method: extracting facial feature information from the face image; comparing the facial feature information with pre-stored standard facial feature information for similarity; and when the result of the similarity comparison meets the preset requirement, determining that the face recognition passes.
  • the processor 71 used to execute the program data is also used to implement the following method: after the lip shape recognition is passed, respond to the payment operation instruction to complete the corresponding payment.
  • FIG. 8 is a schematic structural diagram of a second embodiment of a terminal device provided by the present application.
  • the terminal device 80 includes a first identification module 81, a second identification module 82 and a response module 83.
  • the first recognition module 81 is configured to collect the first image to be detected and perform face recognition on the first image to be detected when the setting operation instruction is acquired.
  • the second recognition module 82 is configured to collect a plurality of consecutive second to-be-detected images after the face recognition is passed, and perform lip-shape recognition on the second to-be-detected images.
  • the response module 83 is used to respond to the setting operation instruction after the lip shape recognition is passed.
  • Fig. 9 is a schematic structural diagram of an embodiment of a computer storage medium provided by the present application.
  • the computer storage medium 90 is used to store program data 91.
  • when the program data 91 is executed by a processor, it is used to implement the following method:
  • when the terminal device obtains the setting operation instruction, it collects a first image to be detected and performs face recognition on the first image to be detected; after the face recognition is passed, it collects a plurality of consecutive second images to be detected and performs lip shape recognition on the second images to be detected; and after the lip shape recognition is passed, it responds to the setting operation instruction.
  • the disclosed method and device may be implemented in other ways.
  • the device implementation described above is merely illustrative.
  • the division of the modules or units is only a logical functional division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of this embodiment.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated units in the above embodiments are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of this application.
  • the aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

Abstract

Disclosed are an identity authentication method, a terminal device, and a storage medium. The method comprises: when a terminal device acquires a set operation instruction, collecting a first image to be detected and performing a facial recognition with respect to said first image; when the facial recognition is successful, collecting multiple consecutive second images to be detected and performing a lip recognition with respect to said second images; and when the lip recognition is successful, responding to the set operation instruction. By such means, the characteristics of an authentication scheme of being difficult to reproduce, not easily forgotten, and contactless are implemented, the accuracy of identity authentication is increased, and the security in using the terminal device is strengthened.

Description

Identity Verification Method, Terminal Device, and Storage Medium

[Technical Field]

This application relates to the technical field of identity verification, and in particular to an identity verification method, a terminal device, and a storage medium.

[Background Art]

With the development of society, people increasingly rely on terminal devices. For security and privacy reasons, a terminal device needs to verify the identity of the current user to determine whether the user is authorized to use it; for example, most smartphones require the screen to be unlocked, and some private content on a terminal device is encrypted.

Existing identity verification methods include fingerprint verification and character (password) verification. Fingerprint verification cannot provide contactless, fast verification, while character verification is easy to forget and easy to copy. Moreover, fingerprint verification can only confirm that a person's features are valid; it cannot guarantee that a real person is present, since a fingerprint film may be used.
[Summary of the Invention]

The main problem addressed by this application is to provide an identity verification method, a terminal device, and a storage medium whose verification scheme is difficult to copy, resistant to forgetting, and contactless, thereby improving the accuracy of identity verification and making the use of terminal devices more secure.

To solve the above technical problem, a technical solution adopted in this application is to provide an identity verification method. The method includes: when a terminal device obtains a setting operation instruction, collecting a first image to be detected and performing face recognition on the first image to be detected; after the face recognition is passed, collecting a plurality of consecutive second images to be detected and performing lip shape recognition on the second images to be detected; and after the lip shape recognition is passed, responding to the setting operation instruction.

Collecting a plurality of consecutive second images to be detected after the face recognition is passed and performing lip shape recognition on them includes: after the face recognition is passed, collecting a plurality of consecutive second images to be detected; performing lip shape recognition on the plurality of second images to be detected to obtain recognized text information; determining whether the recognized text information is text information in a preset whitelist; and if so, determining that the lip shape recognition passes.

The method further includes: acquiring text information entered by the user, and adding the entered text information to the whitelist.

Collecting a plurality of consecutive second images to be detected after the face recognition is passed and performing lip shape recognition on them may also include: after the face recognition is passed, collecting a plurality of consecutively read second images to be detected; performing lip shape recognition on the plurality of second images to be detected to obtain recognized text information; determining whether the recognized text information is text information in a preset blacklist; and if so, determining that the lip shape recognition fails.

The method further includes: adding at least one piece of text information in the whitelist to the blacklist, and deleting that text information from the whitelist.

Collecting a plurality of consecutive second images to be detected after the face recognition is passed and performing lip shape recognition on them may also include: after the face recognition is passed, displaying standard text information and collecting a plurality of consecutive second images to be detected; performing lip shape recognition on the plurality of second images to be detected to obtain recognized text information; determining whether the recognized text information is the same as the standard text information; and if so, determining that the lip shape recognition passes.

Displaying the standard text information includes: randomly selecting one piece of text information from a plurality of pieces of text information in a database as the standard text information, and displaying it.

Performing lip shape recognition on the plurality of second images to be detected to obtain the recognized text information includes: extracting face information from the plurality of second images to be detected; extracting a plurality of continuously changing lip features from the plurality of pieces of face information; and obtaining the recognized text information based on the plurality of continuously changing lip features.

Obtaining the recognized text information based on the plurality of continuously changing lip features includes: inputting the plurality of continuously changing lip features into a lip recognition model, so that the lip recognition model outputs corresponding pronunciation information and, based on the pronunciation information, calculates the corresponding recognized text information.

Collecting the first image to be detected and performing face recognition on it when the mobile terminal obtains the setting operation instruction includes: when the mobile terminal obtains the setting operation instruction, collecting the first image to be detected; extracting a face image from the first image to be detected; and performing face recognition on the face image.

Performing face recognition on the face image includes: extracting facial feature information from the face image; comparing the facial feature information with pre-stored standard facial feature information for similarity; and when the result of the similarity comparison meets a preset requirement, determining that the face recognition passes.

The setting operation instruction may be a payment operation instruction; responding to the setting operation instruction after the lip shape recognition is passed includes: after the lip shape recognition is passed, responding to the payment operation instruction to complete the corresponding payment.

To solve the above technical problem, another technical solution adopted in this application is to provide a terminal device. The terminal device includes a processor, a camera module connected to the processor, and a memory; the memory is used to store program data, and the processor is used to execute the program data to implement the method described above.

To solve the above technical problem, another technical solution adopted in this application is to provide a computer storage medium for storing program data; when executed by a processor, the program data is used to implement the method described above.

To solve the above technical problem, another technical solution adopted in this application is to provide a terminal device. The terminal device includes: a first recognition module, configured to collect a first image to be detected and perform face recognition on it when a setting operation instruction is obtained; a second recognition module, configured to collect a plurality of consecutive second images to be detected after the face recognition is passed and perform lip shape recognition on them; and a response module, configured to respond to the setting operation instruction after the lip shape recognition is passed.

Through the above solutions, the beneficial effect of this application is that, unlike the prior art, the identity verification method of this application combines face recognition with lip shape recognition, achieving a verification scheme that is difficult to copy, resistant to forgetting, and contactless, improving the accuracy of identity verification and making the use of terminal devices more secure.
[Brief Description of the Drawings]

In order to describe the technical solutions in the embodiments of this application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work. Among them:

FIG. 1 is a schematic flowchart of a first embodiment of the identity verification method provided by this application;

FIG. 2 is a schematic flowchart of a second embodiment of the identity verification method provided by this application;

FIG. 3 is a schematic flowchart of a third embodiment of the identity verification method provided by this application;

FIG. 4 is a schematic flowchart of a fourth embodiment of the identity verification method provided by this application;

FIG. 5 is a schematic flowchart of a fifth embodiment of the identity verification method provided by this application;

FIG. 6 is a schematic flowchart of a sixth embodiment of the identity verification method provided by this application;

FIG. 7 is a schematic structural diagram of a first embodiment of the terminal device provided by this application;

FIG. 8 is a schematic structural diagram of a second embodiment of the terminal device provided by this application;

FIG. 9 is a schematic structural diagram of an embodiment of the computer storage medium provided by this application.
[Detailed Description of the Embodiments]

The technical solutions in the embodiments of this application are described clearly and completely below in conjunction with the drawings. It should be understood that the specific embodiments described here are only used to explain this application, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to this application rather than the complete structure. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of this application.

The terms "first", "second", and so on in this application are used to distinguish different objects, not to describe a specific order. In addition, the terms "including" and "having", and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes unlisted steps or units, or other steps or units inherent to the process, method, product, or device.

Reference to an "embodiment" herein means that a specific feature, structure, or characteristic described in conjunction with the embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor to an independent or alternative embodiment mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a first embodiment of the identity verification method provided by this application. The method includes:

Step 11: When the terminal device obtains a setting operation instruction, it collects a first image to be detected and performs face recognition on the first image to be detected.

When the terminal device obtains the setting operation instruction, it turns on the camera module, collects the first image to be detected, and performs face recognition on it. In this embodiment, the terminal device may be a mobile terminal such as a smartphone, a tablet computer, or a wearable device, and the setting operation instruction may be a screen-unlock instruction. When the terminal obtains the screen-unlock instruction, it turns on the camera, collects image information within the camera's current shooting range, and performs face recognition on the collected image information. It can be understood that when the collected image information contains no face information, the terminal device stops the recognition, or prompts the user to re-collect and recognize again.

Face recognition can be divided into face image acquisition and detection, face image preprocessing, face image feature extraction, and matching and recognition.

Face image acquisition: different kinds of face images can be captured through the camera lens; for example, static images, dynamic images, different positions, and different expressions can all be captured well. When the user is within the shooting range of the acquisition device, the device automatically searches for and captures the user's face image.
Face detection: in practice, face detection is mainly used as preprocessing for face recognition, that is, to accurately locate the position and size of the face in the image. Face images contain rich pattern features, such as histogram features, color features, template features, structural features, and Haar features. Face detection picks out this useful information and uses these features to detect faces. In the face detection process, the AdaBoost algorithm is used to select the rectangular features (weak classifiers) that best represent a face; the weak classifiers are combined into a strong classifier by weighted voting, and several trained strong classifiers are connected in series into a cascade classifier, which effectively improves the detection speed of the classifier.
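The AdaBoost weighted-vote and cascade structure described above can be illustrated with a minimal sketch. This shows only the voting and early-rejection logic, not Haar feature extraction or training; each weak classifier is modeled as a (predict, alpha) pair, where predict returns +1 (face) or -1 (non-face) and alpha is its AdaBoost weight:

```python
def strong_classify(sample, weak_classifiers):
    """Weighted-majority vote of weak classifiers: the core AdaBoost
    combination rule. weak_classifiers is a list of (predict, alpha)
    pairs; the sign of the weighted sum decides the class."""
    score = sum(alpha * predict(sample) for predict, alpha in weak_classifiers)
    return 1 if score >= 0 else -1

def cascade_classify(sample, stages):
    """Cascade of strong classifiers: a sample must pass every stage to
    be accepted; most non-faces are rejected by the early stages, which
    is what makes the cascade fast."""
    for stage in stages:
        if strong_classify(sample, stage) < 0:
            return -1  # rejected early
    return 1
```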
Face image preprocessing: preprocessing of the face image is the process of processing the image based on the face detection result so that it ultimately serves feature extraction. Because of various constraints and random interference, the original image acquired by the system usually cannot be used directly; it must undergo image preprocessing such as grayscale correction and noise filtering in the early stage of image processing. For face images, the preprocessing mainly includes light compensation, grayscale transformation, histogram equalization, normalization, geometric correction, filtering, and sharpening.
Face image feature extraction: the features usable for face recognition are usually divided into visual features, pixel statistical features, face image transform coefficient features, face image algebraic features, and so on. Face feature extraction is performed on certain features of the face. Face feature extraction, also called face representation, is the process of modeling the features of a face. Face feature extraction methods fall into two broad categories: knowledge-based representation methods, and representation methods based on algebraic features or statistical learning.
Knowledge-based representation methods obtain feature data helpful for face classification mainly from the shape descriptions of the facial organs and the distance characteristics between them; the feature components usually include the Euclidean distances, curvatures, and angles between feature points. A face is composed of local parts such as the eyes, nose, mouth, and chin; geometric descriptions of these parts and of the structural relationships between them can serve as important features for recognizing a face, and these features are called geometric features. Knowledge-based face representation mainly includes methods based on geometric features and template matching methods.
Face image matching and recognition: the feature data extracted from the face image is searched and matched against the feature templates stored in a database; a threshold is set, and when the similarity exceeds this threshold, the matching result is output. Face recognition compares the face features to be recognized with the stored face feature templates and judges the identity information of the face according to the degree of similarity. This process falls into two categories: one is confirmation, a one-to-one image comparison process; the other is identification, a one-to-many image matching and comparison process.
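The threshold-based matching step, in both its one-to-one (confirmation) and one-to-many (identification) forms, can be sketched as follows. The cosine-similarity metric, the threshold value, and the identity names are illustrative assumptions, since the text does not fix a particular similarity measure:

```python
import math

# Sketch of threshold-based matching: cosine similarity between a probe
# feature vector and stored templates. Metric and names are illustrative.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def verify(probe, template, threshold=0.9):
    """Confirmation: one-to-one check of the probe against one template."""
    return cosine_similarity(probe, template) >= threshold

def identify(probe, templates, threshold=0.9):
    """Identification: one-to-many search for the best match above threshold."""
    best_id, best_sim = None, threshold
    for identity, tmpl in templates.items():
        sim = cosine_similarity(probe, tmpl)
        if sim >= best_sim:
            best_id, best_sim = identity, sim
    return best_id

db = {"alice": [0.9, 0.1, 0.3], "bob": [0.1, 0.8, 0.5]}
print(identify([0.88, 0.12, 0.33], db))  # alice
```

If no stored template exceeds the threshold, `identify` returns `None` and the match fails.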
Step 12: after the face recognition is passed, capture a plurality of consecutive second images to be detected, and perform lip-shape recognition on the second images to be detected.
After the face recognition is passed, the plurality of consecutive second images to be detected may be captured by having the user speak a passage of text while the camera module of the terminal device captures consecutive images.
First, the face information in the consecutive images is recognized; then the continuously changing lip-shape features are extracted from the faces and lip-shape unit matching is performed: the lip-shape features are input into a lip-shape recognition model to recognize the corresponding pronunciations, and the recognized pronunciations are matched against password characters to obtain the text information spoken by the user.
The lip-shape recognition model models lip-movement sequences with end-to-end deep neural network techniques and builds a vocabulary.
In this embodiment, the terminal device may broadcast a passage of text by voice, and the user repeats this passage; consecutive images of the user repeating the passage are captured, lip-shape features are extracted from the consecutive images, the corresponding pronunciations are recognized by the lip-shape recognition model and matched against password characters to obtain text information, and this text information is matched against the text information broadcast by the terminal device. If the match succeeds, step 13 is executed.
In this embodiment, a passage of text may instead be displayed on the display screen of the terminal device, and the user reads this passage aloud; consecutive images of the user reading the passage are captured, lip-shape features are extracted from the consecutive images, the corresponding pronunciations are recognized by the lip-shape recognition model and matched against password characters to obtain text information, and this text information is matched against the text information displayed by the terminal device. If the match succeeds, step 13 is executed. It can be understood that the text displayed by the terminal device may be preset text information or random text information.
Step 13: after the lip-shape recognition is passed, respond to the set operation instruction.
After the lip-shape recognition is passed: if the set operation instruction is to unlock the screen, the terminal device unlocks the screen and displays the screen content; if the set operation instruction is to unlock a private photo album of the terminal device, the terminal device unlocks the private album and displays its photos; if the set operation instruction is a payment instruction, the terminal device completes the corresponding payment; and if the set operation instruction is to view private information, the terminal device displays the private information.
In one application scenario, when the user needs to use the terminal device to pay a bill, the terminal device obtains a payment instruction and prompts the user for face recognition. After the face recognition is passed, the user is prompted to say a passage; while the user is speaking, consecutive image information is captured synchronously, lip-shape features are extracted from the image information, and lip-shape recognition is performed. After the lip-shape recognition is passed, the terminal device completes the corresponding payment.
In another application scenario, the user taps an application on the terminal device that requires identity authentication. The terminal device obtains the operation instruction and prompts the user for face recognition. After the face recognition is passed, text information is displayed on the display screen and the user is prompted to read it aloud; while the user reads the text information, consecutive image information is captured synchronously, lip-shape features are extracted from the image information, and lip-shape recognition is performed. After the lip-shape recognition is passed, the application is unlocked.
In other embodiments, after obtaining the set operation instruction, the terminal device performs face recognition on the user. After the face recognition is passed, it obtains video information of the user speaking text information and splits the video information into an audio stream and an image stream. Speech recognition is performed on the audio stream to recognize text information; continuous lip-shape features are extracted from the image stream and recognized by the lip-shape recognition model to calculate the text information contained in the lip-shape features of the image stream. The text information recognized from the audio stream is compared with the text information recognized from the image stream; if they are the same, the recognized text information is matched against preset text information. If the match succeeds, the identity authentication is considered successful, and the terminal device responds to the set operation instruction.
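The dual-channel consistency check in this embodiment can be sketched as follows; the text arguments stand in for the outputs of real speech-recognition and lip-reading models, and the phrases are invented:

```python
# Sketch of the dual-channel check: the text decoded from the audio stream
# must agree with the text decoded from the lip-shape image stream, and the
# agreed text must also match the preset phrase.

def authenticate(audio_text, lip_text, preset_text):
    if audio_text != lip_text:
        return False  # the two recognition channels disagree: possible spoof
    return audio_text == preset_text

print(authenticate("open sesame", "open sesame", "open sesame"))   # True
print(authenticate("open sesame", "open says me", "open sesame"))  # False
```

Requiring agreement between the two channels makes replay attacks harder: a recording that fools only one channel fails the cross-check.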
Different from the prior art, the present application provides an application operation method and an identity authentication method, the method including: when a terminal device obtains a set operation instruction, capturing a first image to be detected and performing face recognition on the first image to be detected; after the face recognition is passed, capturing a plurality of consecutive second images to be detected and performing lip-shape recognition on the second images to be detected; and after the lip-shape recognition is passed, responding to the set operation instruction. In this way, the authentication is hard to copy, resistant to forgetting, and contactless, the accuracy of identity authentication is improved, and the terminal device is used more securely.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a second embodiment of the identity authentication method provided by the present application. The method includes:
Step 21: when the terminal device obtains a set operation instruction, capture a first image to be detected, and perform face recognition on the first image to be detected.
Step 22: after the face recognition is passed, capture a plurality of consecutive second images to be detected.
Optionally, after the face recognition is passed, the terminal device prompts the user for the capture of the second images to be detected, for example by prompting the user to face the camera and speak text information.
Optionally, once the face recognition is passed, the identity of the current user is recognized to a certain extent; therefore, when prompting the user for the capture of the second images to be detected, the user may be guided with relevant information and prompted to speak text information from a preset whitelist.
Step 23: perform lip-shape recognition on the plurality of second images to be detected to obtain recognized text information.
Lip-shape recognition is a technique for interpreting what is said from the movements of the lips while speaking. For automatic lip-shape recognition, multiple images or a video containing the speaker's lip movements must be captured; image processing and deep learning techniques are then combined to recognize the multi-frame consecutive image sequence. The lip shapes in the sequence are recognized and mapped to pronunciations, and the corresponding natural-language words and sentences, i.e. the spoken content, are determined from the pronunciations over the continuous time period.
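As a toy illustration of the mapping from lip shapes to pronunciations to words: the viseme labels, lookup table, and lexicon below are all invented, and a real system learns these mappings with the deep networks described later rather than with fixed tables:

```python
# Toy lip-reading pipeline: a sequence of lip shapes ("visemes") is mapped
# to sounds, and the sound sequence is looked up in a small lexicon.
# All labels and tables here are hypothetical.

VISEME_TO_SOUND = {
    "closed": "m", "round": "o", "wide": "a", "teeth": "s",
}

WORDS = {("m", "a"): "ma", ("s", "o"): "so"}  # toy pronunciation lexicon

def decode_lip_sequence(visemes):
    sounds = tuple(VISEME_TO_SOUND[v] for v in visemes)
    return WORDS.get(sounds, "<unknown>")

print(decode_lip_sequence(["closed", "wide"]))  # ma
```

Because several sounds share one lip shape, the real mapping is many-to-one and ambiguous, which is why context over a continuous time period is needed to pick the right words.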
Optionally, face information is extracted from the plurality of consecutively captured second images to be detected; a plurality of continuously changing lip-shape features are extracted from the face information; and the continuously changing lip-shape features are input into the lip-shape recognition model, so that the lip-shape recognition model recognizes the corresponding pronunciation information and calculates the corresponding recognized text information based on the pronunciation information.
Specifically, the lip-shape recognition model may be an end-to-end algorithm model based on an encoder-decoder architecture that combines a spatiotemporal convolutional neural network feature extractor with a word-embedding network and uses an attention mechanism. The feature extractor is a spatiotemporal convolutional neural network (STCNN), the encoder-decoder subunits are long short-term memory (LSTM) networks, and the word embedding uses Word2vec.
Optionally, the lip-shape recognition model may be trained on a Mandarin Chinese lip-shape recognition dataset. An improved multi-stage convolutional neural network (MTCNN) extracts the lip region from the silent video, and the extracted lip region is fed into the spatiotemporal convolutional network STCNN to extract the visual feature information of the lip movements. The LSTM-based encoder-decoder encodes the lip visual feature information and, at inference time, decodes it into the corresponding text information. The attention mechanism lets the decoder focus on the encoder output at specific positions instead of using the entire encoded content as the basis for decoding, thereby improving the decoding performance. The optimized THULAC (THU Lexical Analyzer for Chinese) is used to segment Chinese sentences into words, and the segmentation results are fed into Word2vec; in the network, this part essentially serves as character encoding. The encoder-decoder architecture encodes a variable-length sequence into a fixed-length representation and decodes a given fixed-length vector into a variable-length sequence. From a probabilistic point of view, the model learns, by a general method, the conditional probability distribution of one variable-length sequence given another variable-length sequence.
It can be understood that the lip-shape recognition model may build different databases for different languages using the above scheme or other related schemes, so as to be applied in regions with different languages.
Step 24: determine whether the recognized text information is text information in a preset whitelist.
Optionally, the user enters some pieces of text information into the terminal device and adds the entered text information to the whitelist.
When the text information recognized by the lip-shape detection is determined to be the same as text information in the whitelist, step 25 is executed.
Optionally, the whitelist may contain only one passage of text or several passages. When the whitelist contains several passages, the text information recognized by the lip-shape detection only needs to match any one of them.
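The whitelist check of step 24 amounts to matching the recognized text against any stored passage, which can be sketched as follows (the phrases are invented examples):

```python
# Step 24 sketch: the recognized text passes if it equals any one of the
# whitelisted phrases (one or several passages may be stored).

def lip_text_in_whitelist(recognized, whitelist):
    return any(recognized == phrase for phrase in whitelist)

whitelist = ["the moon is bright tonight", "seven red lanterns"]
print(lip_text_in_whitelist("seven red lanterns", whitelist))  # True
print(lip_text_in_whitelist("open the door", whitelist))       # False
```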
Step 25: determine that the lip-shape recognition is passed.
Step 26: after the lip-shape recognition is passed, respond to the set operation instruction.
In one application scenario, the terminal device contains private short messages that require identity authentication before they can be viewed. When the user taps to view the private short messages, the terminal device responds to this operation instruction and performs face recognition on the user. After the face recognition is passed, the terminal device captures, through the camera, a plurality of consecutive images of the user reading text information aloud and detects them, obtains the lip-shape features in the images, calculates the corresponding text information through the lip-shape recognition model, and matches the corresponding text information against the preset text information in the whitelist. If the match succeeds, the lip-shape recognition is determined to be passed, and the terminal device responds to the set operation instruction and displays the private short messages for the user to view.
Referring to FIG. 3, FIG. 3 is a schematic flowchart of a third embodiment of the identity authentication method provided by the present application. The method includes:
Step 31: when the terminal device obtains a set operation instruction, capture a first image to be detected, and perform face recognition on the first image to be detected.
Step 32: after the face recognition is passed, capture a plurality of consecutive second images to be detected.
Steps 31-32 have the same or similar technical solutions as the foregoing embodiments and are not repeated here.
Step 33: perform lip-shape recognition on the plurality of second images to be detected to obtain recognized text information.
Optionally, face information is extracted from the plurality of consecutively captured second images to be detected; a plurality of continuously changing lip-shape features are extracted from the face information; and the continuously changing lip-shape features are input into the lip-shape recognition model, so that the lip-shape recognition model recognizes the corresponding pronunciation information and calculates the corresponding recognized text information based on the pronunciation information.
Step 34: determine whether the recognized text information is text information in a preset blacklist.
Optionally, the user enters some pieces of text information into the terminal device and adds the entered text information to the blacklist.
When the text information recognized by the lip-shape detection is determined to be the same as text information in the blacklist, step 35 is executed.
Optionally, the blacklist may contain only one passage of text or several passages. When the blacklist contains several passages, the text information recognized by the lip-shape detection only needs to match any one of them.
Optionally, at least one passage of text information in the whitelist is added to the blacklist, and that text information is deleted from the whitelist.
Step 35: determine that the lip-shape recognition fails.
Optionally, when the recognized text information differs from the text information in the blacklist, the recognized text information is matched against the text information in the whitelist; if they are the same, the lip-shape recognition is passed, and the terminal device responds to the set operation instruction.
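Combining steps 34-35 with the optional whitelist fallback gives a decision rule along these lines (a sketch; the phrase sets are illustrative):

```python
# Steps 34-35 sketch: a blacklisted phrase fails immediately; otherwise the
# phrase must appear in the whitelist for the lip-shape recognition to pass.

def lip_auth_result(recognized, blacklist, whitelist):
    if recognized in blacklist:
        return "fail"  # compromised phrase: reject even after face match
    if recognized in whitelist:
        return "pass"
    return "fail"      # unknown phrase: reject by default

print(lip_auth_result("old phrase", {"old phrase"}, {"new phrase"}))  # fail
print(lip_auth_result("new phrase", {"old phrase"}, {"new phrase"}))  # pass
```

Checking the blacklist before the whitelist ensures that a phrase moved from one list to the other can never pass authentication again.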
In one application scenario, if the user finds that some text information in the whitelist of the terminal device is at risk of being stolen or has already been stolen, this text information is deleted from the whitelist and added to the blacklist, so that it is used to verify whether the text information recognized from the lip shapes is safe.
In another application scenario, each passage of text information in the whitelist of the terminal device is time-sensitive. For example, each passage may have a time limit for identity authentication (the time limit may be two hours, twenty hours, or forty-eight hours; the specific limit is set by the system or by user requirements). When text information exceeds the time limit, the terminal device automatically deletes it, adds it to the blacklist, and notifies the user that this text information has exceeded the time limit and may pose a security risk, for the user to handle. Similarly, each passage may have a limit on the number of times it can be used for identity authentication (the limit may be ten, twenty, fifty, or one hundred uses; the specific limit is set by the system or by user requirements). When text information exceeds the usage limit, the terminal device automatically deletes it, adds it to the blacklist, and notifies the user that this text information has exceeded its allowed number of uses and may pose a security risk, for the user to handle. This ensures iterative updating of the text information in the whitelist, which makes information security easy to guarantee and the information hard to steal; even if it is stolen, the text information is already in the blacklist, and using it cannot pass identity authentication.
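The time-limit and use-count rules in this scenario can be sketched as follows. The class name and the concrete limits are illustrative, and timestamps are passed explicitly so the expiry logic is easy to follow:

```python
import time

class PhraseList:
    """Whitelist phrases with an expiry time and a use-count limit; a phrase
    that expires or runs out of uses is retired to the blacklist."""

    def __init__(self, max_age_s=2 * 3600, max_uses=10):
        self.whitelist = {}   # phrase -> {"created": timestamp, "uses": count}
        self.blacklist = set()
        self.max_age_s = max_age_s
        self.max_uses = max_uses

    def add(self, phrase, now=None):
        now = time.time() if now is None else now
        self.whitelist[phrase] = {"created": now, "uses": 0}

    def check(self, phrase, now=None):
        now = time.time() if now is None else now
        entry = self.whitelist.get(phrase)
        if entry is None or phrase in self.blacklist:
            return False
        entry["uses"] += 1
        if now - entry["created"] > self.max_age_s or entry["uses"] > self.max_uses:
            # Retire the phrase: it can never pass authentication again.
            del self.whitelist[phrase]
            self.blacklist.add(phrase)
            return False
        return True

pl = PhraseList(max_age_s=2 * 3600, max_uses=2)
pl.add("lantern", now=0)
print(pl.check("lantern", now=60))   # True  (first use, not expired)
print(pl.check("lantern", now=120))  # True  (second use)
print(pl.check("lantern", now=180))  # False (third use exceeds the limit)
print(pl.check("lantern", now=240))  # False (phrase is now blacklisted)
```

A real implementation would also notify the user when a phrase is retired, as the text describes.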
Referring to FIG. 4, FIG. 4 is a schematic flowchart of a fourth embodiment of the identity authentication method provided by the present application. The method includes:
Step 41: when the terminal device obtains a set operation instruction, capture a first image to be detected, and perform face recognition on the first image to be detected.
Step 42: after the face recognition is passed, display standard text information, and capture a plurality of consecutive second images to be detected.
After the user's face recognition is passed, the terminal device displays standard text on the display screen, prompts the user to face the camera and read the displayed standard text aloud, and at the same time captures, through the camera, a plurality of consecutive second images to be detected while the user reads the standard text.
Optionally, the standard text information may be one of several pieces of text information entered in advance by the user.
Optionally, the standard text information may be one piece of text information randomly selected from several pieces of text information in a database.
Optionally, the standard text information may be one piece of text information randomly selected from a cloud server.
Step 43: perform lip-shape recognition on the plurality of second images to be detected to obtain recognized text information.
Optionally, step 43 specifically includes: extracting face information from the plurality of consecutively captured second images to be detected; extracting a plurality of continuously changing lip-shape features from the face information; and inputting the continuously changing lip-shape features into the lip-shape recognition model, so that the lip-shape recognition model recognizes the corresponding pronunciation information and calculates the corresponding recognized text information based on the pronunciation information.
Step 44: determine whether the recognized text information is the same as the standard text information.
It is determined whether the recognized text information is the same as the standard text information; if so, step 45 is executed.
Step 45: determine that the lip-shape recognition is passed.
After the lip-shape recognition is determined to be passed, the terminal device responds to the set operation instruction and completes the corresponding operation.
Referring to FIG. 5, FIG. 5 is a schematic flowchart of a fifth embodiment of the identity authentication method provided by the present application. The method includes:
Step 51: when the mobile terminal obtains a set operation instruction, capture a first image to be detected.
Step 52: extract the face image in the first image to be detected.
Optionally, if no face image is extracted from the first image to be detected, the terminal device re-captures the first image to be detected and prompts the user to face the camera directly, so that the captured first image to be detected contains a face image.
Step 53: extract face feature information from the face image.
Optionally, a local feature extraction method may be used to extract the face feature information from the face image.
Specifically, feature extraction based on facial organs, template-based feature extraction, or feature extraction based on elastic graph matching may be used.
Optionally, a holistic feature extraction method may be used to extract the face feature information from the face image.
Specifically, feature extraction based on algebraic methods, feature extraction based on neural networks, or feature extraction based on wavelet multi-resolution analysis may be used.
Step 54: compare the face feature information with pre-stored standard face feature information for similarity.
Optionally, the pre-stored standard face feature information is face feature information extracted from face image information collected by the user in advance. The pre-stored standard face feature information may be organized in groups, with the face feature information in each group constituting one face image, so that multiple groups of standard face feature information can be pre-stored in the terminal device.
Step 55: when the result of the similarity comparison meets a preset requirement, determine that the face recognition is passed.
Optionally, the similarity comparison may compare each individual feature with the corresponding individual standard face feature and then multiply the comparison results of the individual features; when the product is greater than a preset value, the face recognition is determined to be passed. Taking the nose, eyes, and mouth as individual features: the nose similarity score is 0.95, the eye similarity score is 0.85, and the mouth similarity score is 0.99. Multiplying the three scores gives 0.95 × 0.85 × 0.99 ≈ 0.8; the preset value is 0.75, and 0.8 > 0.75, so the similarity comparison result exceeds the preset requirement and the face recognition is determined to be passed.
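The per-feature multiplication example above works out as follows (a sketch; the feature names and the preset value 0.75 are taken from the text):

```python
# Step 55 sketch: multiply the per-feature similarity scores and compare
# the product against the preset value.

def face_match(feature_sims, preset=0.75):
    product = 1.0
    for sim in feature_sims.values():
        product *= sim
    return product, product > preset

sims = {"nose": 0.95, "eyes": 0.85, "mouth": 0.99}
product, passed = face_match(sims)
print(round(product, 3), passed)  # 0.799 True
```

Note that multiplying scores penalizes any single weak feature strongly: one low factor pulls the whole product below the preset value.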
Optionally, the similarity comparison may compare the holistic features with the holistic standard face features; when the comparison result is greater than a preset value, the face recognition is determined to be passed.
Step 56: after the face recognition is passed, capture a plurality of consecutive second images to be detected, and perform lip-shape recognition on the second images to be detected.
Step 57: after the lip-shape recognition is passed, respond to the set operation instruction.
Referring to FIG. 6, FIG. 6 is a schematic flowchart of a sixth embodiment of the identity authentication method provided by the present application. The method includes:
Step 61: when the terminal device obtains a payment operation instruction, capture a first image to be detected, and perform face recognition on the first image to be detected.
Step 62: after the face recognition is passed, capture a plurality of consecutive second images to be detected, and perform lip-shape recognition on the second images to be detected.
Step 63: after the lip-shape recognition is passed, respond to the payment operation instruction to complete the corresponding payment.
在一应用场景中,当终端设备获取到支付操作指令时,如支付金额 属于小额支付,则可以跳过步骤61中的人脸识别,只需要当前用户正确读出设置于白名单中的任一段文字信息,当唇形识别通过后,终端设备响应支付操作指令,完成相应支付。In an application scenario, when the terminal device obtains a payment operation instruction, if the payment amount is a small payment, the face recognition in step 61 can be skipped, and the current user only needs to correctly read out the tasks set in the whitelist. A piece of text information. After the lip shape recognition is passed, the terminal device responds to the payment operation instruction to complete the corresponding payment.
在另一应用场景中，当终端设备获取到支付操作指令时，需要当前用户通过人脸识别，确定当前用户在终端设备拥有权限，然后正确读出设置于白名单中的任一段文字信息，当唇形识别通过后，终端设备响应支付操作指令，完成相应支付。In another application scenario, when the terminal device obtains a payment operation instruction, the current user is required to pass face recognition to confirm that he or she has permission on the terminal device, and then correctly read out any piece of text information set in the whitelist. After the lip-shape recognition is passed, the terminal device responds to the payment operation instruction to complete the corresponding payment.
在另一应用场景中，当终端设备获取到支付操作指令时，需要当前用户通过人脸识别，确定当前用户在终端设备拥有权限，然后正确读出终端设备上随机的文字信息，当唇形识别通过后，终端设备响应支付操作指令，完成相应支付。In another application scenario, when the terminal device obtains a payment operation instruction, the current user is required to pass face recognition to confirm that he or she has permission on the terminal device, and then correctly read out the random text information displayed on the terminal device. After the lip-shape recognition is passed, the terminal device responds to the payment operation instruction to complete the corresponding payment.
在另一应用场景中，用户发现终端设备中白名单中的文字信息出现安全隐患，如被他人盗取，则将有安全隐患的文字信息删除并加入黑名单。这样，即使通过了人脸识别，当唇形识别出文字信息为黑名单中的文字信息时，终端设备立即锁死，结束一切操作指令。In another application scenario, if the user finds that a piece of text information in the terminal device's whitelist poses a security risk, for example it has been stolen by someone else, the user deletes that text information and adds it to the blacklist. In this way, even if face recognition is passed, once lip-shape recognition identifies the spoken text as text information in the blacklist, the terminal device locks immediately and ends all operation instructions.
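The whitelist/blacklist handling described in these scenarios can be sketched as follows; this is an illustrative assumption about the logic, with hypothetical function names not taken from the disclosure:

```python
def lip_recognition_decision(recognized_text, whitelist, blacklist):
    """Outcome of lip-shape recognition given the white and black lists."""
    if recognized_text in blacklist:
        return "lock"    # terminal locks immediately, ending all operation instructions
    if recognized_text in whitelist:
        return "pass"    # terminal responds to the operation instruction
    return "fail"        # no match: lip-shape recognition does not pass

def move_to_blacklist(text, whitelist, blacklist):
    """Remove a compromised phrase from the whitelist and add it to the blacklist."""
    whitelist.discard(text)
    blacklist.add(text)
```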
在另一应用场景中,终端设备的用户分为不同权限的身份,最高权限用户可以快速更新白名单,以及确定支付权限用户名单,并且可以随时更改支付权限用户名单。In another application scenario, users of terminal devices are divided into identities with different permissions, and users with the highest permissions can quickly update the white list, determine the list of users with payment permissions, and can change the list of users with payment permissions at any time.
参阅图7，图7是本申请提供的终端设备第一实施例的结构示意图，该终端设备70包括处理器71以及与处理器71连接的摄像头模组72以及存储器73；存储器73用于存储程序数据，处理器71用于执行程序数据，以实现以下方法：Referring to FIG. 7, FIG. 7 is a schematic structural diagram of a first embodiment of a terminal device provided by the present application. The terminal device 70 includes a processor 71, and a camera module 72 and a memory 73 connected to the processor 71; the memory 73 is used to store program data, and the processor 71 is used to execute the program data to implement the following method:
终端设备在获取到设定操作指令时，采集第一待检测图像，并对第一待检测图像进行人脸识别；在人脸识别通过后，采集连续的多个第二待检测图像，并对第二待检测图像进行唇形识别；在唇形识别通过后，响应设定操作指令。When the terminal device obtains a setting operation instruction, it collects a first image to be detected and performs face recognition on the first image to be detected; after the face recognition is passed, it collects a plurality of consecutive second images to be detected and performs lip-shape recognition on the second images to be detected; after the lip-shape recognition is passed, it responds to the setting operation instruction.
可选地，处理器71用于执行该程序数据还用以实现以下的方法：在人脸识别通过后，采集连续的多个第二待检测图像；对多个第二待检测图像进行唇形识别，以得到识别文字信息；判断识别文字信息是否为预设白名单中的文字信息；若是，则确定唇形识别通过。Optionally, the processor 71 is configured to execute the program data to further implement the following method: after the face recognition is passed, collect a plurality of consecutive second images to be detected; perform lip-shape recognition on the plurality of second images to be detected to obtain recognized text information; determine whether the recognized text information is text information in a preset whitelist; if so, determine that the lip-shape recognition passes.
可选地,处理器71用于执行该程序数据还用以实现以下的方法:获取用户录入的文字信息;将录入的文字信息加入白名单中。Optionally, the processor 71 used to execute the program data is also used to implement the following methods: acquiring text information entered by the user; adding the entered text information to the white list.
可选地，处理器71用于执行该程序数据还用以实现以下的方法：在人脸识别通过后，采集连续的多个第二待检测图像；对多个第二待检测图像进行唇形识别，以得到识别文字信息；判断识别文字信息是否为预设黑名单中的文字信息；若是，则确定唇形识别不通过。Optionally, the processor 71 is configured to execute the program data to further implement the following method: after the face recognition is passed, collect a plurality of consecutive second images to be detected; perform lip-shape recognition on the plurality of second images to be detected to obtain recognized text information; determine whether the recognized text information is text information in a preset blacklist; if so, determine that the lip-shape recognition fails.
可选地，处理器71用于执行该程序数据还用以实现以下的方法：将白名单中的至少一段文字信息加入黑名单中，并将文字信息在白名单中删除。Optionally, the processor 71 is configured to execute the program data to further implement the following method: add at least one piece of text information in the whitelist to the blacklist, and delete that text information from the whitelist.
可选地，处理器71用于执行该程序数据还用以实现以下的方法：在人脸识别通过后，显示标准文字信息，并采集连续的多个第二待检测图像；对多个第二待检测图像进行唇形识别，以得到识别文字信息；判断识别文字信息与标准文字信息是否相同；若是，则确定唇形识别通过。Optionally, the processor 71 is configured to execute the program data to further implement the following method: after the face recognition is passed, display standard text information and collect a plurality of consecutive second images to be detected; perform lip-shape recognition on the plurality of second images to be detected to obtain recognized text information; determine whether the recognized text information is the same as the standard text information; if so, determine that the lip-shape recognition passes.
可选地,处理器71用于执行该程序数据还用以实现以下的方法:从数据库中的多个文字信息中随机选择一个文字信息作为标准文字信息,并显示标准文字信息。Optionally, the processor 71 used to execute the program data is also used to implement the following method: randomly select one text information from a plurality of text information in the database as the standard text information, and display the standard text information.
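A minimal sketch of the random standard-text check described in the two paragraphs above, assuming for illustration that the database is a simple list of strings (the function names are hypothetical):

```python
import random

def pick_standard_text(database_texts):
    """Randomly select one text from the database as the standard text to display."""
    return random.choice(database_texts)

def lip_recognition_passes(recognized_text, standard_text):
    # Recognition passes only when the recognized text equals the displayed standard text.
    return recognized_text == standard_text
```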
可选地，处理器71用于执行该程序数据还用以实现以下的方法：提取多个第二待检测图像的人脸信息；从多个人脸信息中提取多个连续变化的唇形特征；基于多个连续变化的唇形特征，得到识别文字信息。Optionally, the processor 71 is configured to execute the program data to further implement the following method: extract face information from the plurality of second images to be detected; extract a plurality of continuously changing lip features from the face information; and obtain the recognized text information based on the plurality of continuously changing lip features.
可选地，处理器71用于执行该程序数据还用以实现以下的方法：将多个连续变化的唇形特征输入至唇形识别模型，以使唇形识别模型输出对应的发音信息，并基于发音信息，计算出对应的识别文字信息。Optionally, the processor 71 is configured to execute the program data to further implement the following method: input the plurality of continuously changing lip features into a lip-shape recognition model so that the lip-shape recognition model outputs corresponding pronunciation information, and calculate the corresponding recognized text information based on the pronunciation information.
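The lip-reading pipeline described above (face information → continuously changing lip features → pronunciation information → recognized text) can be sketched as below; `lip_model` and `decode` are placeholders for the trained lip-shape recognition model and the pronunciation-to-text step, neither of which is specified in the disclosure:

```python
def extract_lip_features(face_frames):
    """Placeholder for lip-region feature extraction from per-frame face information."""
    return [frame["lip"] for frame in face_frames]

def recognize_text_from_lips(face_frames, lip_model, decode):
    """Map consecutive lip features to pronunciation info, then to recognized text."""
    features = extract_lip_features(face_frames)
    pronunciations = [lip_model(f) for f in features]  # model outputs pronunciation info
    return decode(pronunciations)                      # compute the recognized text
```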
可选地，处理器71用于执行该程序数据还用以实现以下的方法：终端设备在获取到设定操作指令时，采集第一待检测图像；提取第一待检测图像中的人脸图像；对人脸图像进行人脸识别。Optionally, the processor 71 is configured to execute the program data to further implement the following method: when the terminal device obtains the setting operation instruction, collect the first image to be detected; extract a face image from the first image to be detected; and perform face recognition on the face image.
可选地，处理器71用于执行该程序数据还用以实现以下的方法：从人脸图像中提取人脸特征信息；将人脸特征信息与预存的标准人脸特征信息进行相似度比对；在相似度比对的结果满足预设要求时，确定人脸识别通过。Optionally, the processor 71 is configured to execute the program data to further implement the following method: extract facial feature information from the face image; perform a similarity comparison between the facial feature information and pre-stored standard facial feature information; and when the result of the similarity comparison meets a preset requirement, determine that the face recognition passes.
可选地,处理器71用于执行该程序数据还用以实现以下的方法:在唇形识别通过后,响应支付操作指令,以完成相应的支付。Optionally, the processor 71 used to execute the program data is also used to implement the following method: after the lip shape recognition is passed, respond to the payment operation instruction to complete the corresponding payment.
参阅图8,图8是本申请提供的终端设备第二实施例的结构示意图,该终端设备80包括:第一识别模块81、第二识别模块82和响应模块83。Referring to FIG. 8, FIG. 8 is a schematic structural diagram of a second embodiment of a terminal device provided by the present application. The terminal device 80 includes a first identification module 81, a second identification module 82 and a response module 83.
第一识别模块81用于在获取到设定操作指令时,采集第一待检测图像,并对第一待检测图像进行人脸识别。The first recognition module 81 is configured to collect the first image to be detected and perform face recognition on the first image to be detected when the setting operation instruction is acquired.
第二识别模块82用于在人脸识别通过后,采集连续的多个第二待检测图像,并对第二待检测图像进行唇形识别。The second recognition module 82 is configured to collect a plurality of consecutive second to-be-detected images after the face recognition is passed, and perform lip-shape recognition on the second to-be-detected images.
响应模块83用于在唇形识别通过后,响应设定操作指令。The response module 83 is used to respond to the setting operation instruction after the lip shape recognition is passed.
参阅图9，图9是本申请提供的计算机存储介质一实施例的结构示意图，该计算机存储介质90用于存储程序数据91，程序数据91在被处理器执行时，用于实现以下方法：Referring to FIG. 9, FIG. 9 is a schematic structural diagram of an embodiment of a computer storage medium provided by the present application. The computer storage medium 90 is used to store program data 91, and the program data 91, when executed by a processor, is used to implement the following method:
终端设备在获取到设定操作指令时，采集第一待检测图像，并对第一待检测图像进行人脸识别；在人脸识别通过后，采集连续的多个第二待检测图像，并对第二待检测图像进行唇形识别；在唇形识别通过后，响应设定操作指令。When the terminal device obtains a setting operation instruction, it collects a first image to be detected and performs face recognition on the first image to be detected; after the face recognition is passed, it collects a plurality of consecutive second images to be detected and performs lip-shape recognition on the second images to be detected; after the lip-shape recognition is passed, it responds to the setting operation instruction.
可以理解,程序数据91在被处理器执行时,还用于实现上述任一实施例方法。It can be understood that, when the program data 91 is executed by the processor, it is also used to implement the method in any of the foregoing embodiments.
在本申请所提供的几个实施方式中，应该理解到，所揭露的方法以及设备，可以通过其它的方式实现。例如，以上所描述的设备实施方式仅仅是示意性的，例如，所述模块或单元的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，例如多个单元或组件可以结合或者可以集成到另一个系统，或一些特征可以忽略，或不执行。In the several implementation manners provided in this application, it should be understood that the disclosed method and device may be implemented in other ways. For example, the device implementations described above are merely illustrative; the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施方式方案的目的。The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of this embodiment.
另外,在本申请各个实施方式中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。In addition, the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
上述其他实施方式中的集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时，可以存储在一个计算机可读取存储介质中。基于这样的理解，本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来，该计算机软件产品存储在一个存储介质中，包括若干指令用以使得一台计算机设备（可以是个人计算机，服务器，或者网络设备等）或处理器（processor）执行本申请各个实施方式所述方法的全部或部分步骤。而前述的存储介质包括：U盘、移动硬盘、只读存储器（ROM，Read-Only Memory）、随机存取存储器（RAM，Random Access Memory）、磁碟或者光盘等各种可以存储程序代码的介质。If the integrated units in the foregoing embodiments are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of this application. The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
以上仅为本申请的实施例，并非因此限制本申请的专利范围，凡是利用本申请说明书及附图内容所作的等效结构或等效流程变换，或直接或间接运用在其他相关的技术领域，均同理包括在本申请的专利保护范围内。The above are merely embodiments of this application and do not thereby limit the patent scope of this application. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the patent protection scope of this application.

Claims (15)

  1. 一种身份验证方法,其特征在于,所述方法包括:An identity verification method, characterized in that the method includes:
    终端设备在获取到设定操作指令时,采集第一待检测图像,并对所述第一待检测图像进行人脸识别;When the terminal device obtains the setting operation instruction, collects the first image to be detected, and performs face recognition on the first image to be detected;
    在所述人脸识别通过后,采集连续的多个第二待检测图像,并对所述第二待检测图像进行唇形识别;After the face recognition is passed, collecting a plurality of consecutive second to-be-detected images, and performing lip-shape recognition on the second to-be-detected images;
    在所述唇形识别通过后,响应所述设定操作指令。After the lip shape recognition is passed, respond to the setting operation instruction.
  2. 根据权利要求1所述的方法,其特征在于,The method according to claim 1, wherein:
    所述在所述人脸识别通过后,采集连续的多个第二待检测图像,并对所述第二待检测图像进行唇形识别,包括:After the face recognition is passed, collecting a plurality of consecutive second to-be-detected images and performing lip-shape recognition on the second to-be-detected images includes:
    在所述人脸识别通过后,采集连续的多个第二待检测图像;After the face recognition is passed, collecting a plurality of consecutive second to-be-detected images;
    对所述多个第二待检测图像进行唇形识别,以得到识别文字信息;Performing lip shape recognition on the plurality of second to-be-detected images to obtain recognized text information;
    判断所述识别文字信息是否为预设白名单中的文字信息;Determine whether the recognized text information is text information in a preset whitelist;
    若是,则确定所述唇形识别通过。If yes, it is determined that the lip shape recognition passes.
  3. 根据权利要求2所述的方法,其特征在于,The method according to claim 2, wherein:
    所述方法还包括:The method also includes:
    获取用户录入的文字信息;Obtain the text information entered by the user;
    将录入的所述文字信息加入所述白名单中。Add the entered text information to the white list.
  4. 根据权利要求1所述的方法,其特征在于,The method according to claim 1, wherein:
    所述在所述人脸识别通过后,采集连续的多个第二待检测图像,并对所述第二待检测图像进行唇形识别,包括:After the face recognition is passed, collecting a plurality of consecutive second to-be-detected images and performing lip-shape recognition on the second to-be-detected images includes:
    在所述人脸识别通过后，采集连续的多个第二待检测图像；After the face recognition is passed, collecting a plurality of consecutive second to-be-detected images;
    对所述多个第二待检测图像进行唇形识别,以得到识别文字信息;Performing lip shape recognition on the plurality of second to-be-detected images to obtain recognized text information;
    判断所述识别文字信息是否为预设黑名单中的文字信息;Determine whether the recognized text information is text information in a preset blacklist;
    若是,则确定所述唇形识别不通过。If yes, it is determined that the lip shape recognition fails.
  5. 根据权利要求4所述的方法,其特征在于,The method according to claim 4, wherein:
    所述方法还包括:The method also includes:
    将白名单中的至少一段文字信息加入所述黑名单中,并将所述文字信息在白名单中删除。At least one piece of text information in the white list is added to the black list, and the text information is deleted from the white list.
  6. 根据权利要求1所述的方法,其特征在于,The method according to claim 1, wherein:
    所述在所述人脸识别通过后,采集连续的多个第二待检测图像,并对所述第二待检测图像进行唇形识别,包括:After the face recognition is passed, collecting a plurality of consecutive second to-be-detected images and performing lip-shape recognition on the second to-be-detected images includes:
    在所述人脸识别通过后,显示标准文字信息,并采集连续的多个第二待检测图像;After the face recognition is passed, display standard text information, and collect consecutive multiple second to-be-detected images;
    对所述多个第二待检测图像进行唇形识别,以得到识别文字信息;Performing lip shape recognition on the plurality of second to-be-detected images to obtain recognized text information;
    判断所述识别文字信息与所述标准文字信息是否相同;Determine whether the recognized text information is the same as the standard text information;
    若是,则确定所述唇形识别通过。If yes, it is determined that the lip shape recognition passes.
  7. 根据权利要求6所述的方法,其特征在于,The method according to claim 6, wherein:
    所述显示标准文字信息,包括:The display standard text information includes:
    从数据库中的多个文字信息中随机选择一个文字信息作为标准文字信息,并显示所述标准文字信息。One text information is randomly selected from a plurality of text information in the database as the standard text information, and the standard text information is displayed.
  8. 根据权利要求2、4、6任一项所述的方法,其特征在于,The method according to any one of claims 2, 4, 6, wherein:
    所述对所述多个第二待检测图像进行唇形识别,以得到识别文字信息,包括:The performing lip recognition on the plurality of second to-be-detected images to obtain recognized text information includes:
    提取所述多个第二待检测图像的人脸信息;Extracting face information of the plurality of second images to be detected;
    从多个所述人脸信息中提取多个连续变化的所述唇形特征;Extracting a plurality of continuously changing lip features from a plurality of facial information;
    基于多个连续变化的所述唇形特征,得到识别文字信息。Based on the plurality of continuously changing lip features, the recognized text information is obtained.
  9. 根据权利要求8所述的方法,其特征在于,The method according to claim 8, wherein:
    所述基于多个连续变化的所述唇形特征,得到识别文字信息,包括:The obtaining the recognized text information based on the plurality of continuously changing lip features includes:
    将多个连续变化的所述唇形特征输入至唇形识别模型，以使所述唇形识别模型输出对应的发音信息，并基于所述发音信息，计算出对应的识别文字信息。Inputting the plurality of continuously changing lip features into a lip-shape recognition model, so that the lip-shape recognition model outputs corresponding pronunciation information, and calculating the corresponding recognized text information based on the pronunciation information.
  10. 根据权利要求1所述的方法,其特征在于,The method according to claim 1, wherein:
    所述终端设备在获取到设定操作指令时，采集第一待检测图像，并对所述第一待检测图像进行人脸识别，包括：When the terminal device acquires the setting operation instruction, collecting the first image to be detected and performing face recognition on the first image to be detected includes:
    终端设备在获取到设定操作指令时，采集第一待检测图像；When the terminal device obtains the setting operation instruction, collecting the first image to be detected;
    提取所述第一待检测图像中的人脸图像;Extracting a face image in the first image to be detected;
    对所述人脸图像进行人脸识别。Perform face recognition on the face image.
  11. 根据权利要求10所述的方法,其特征在于,The method of claim 10, wherein:
    所述对所述人脸图像进行人脸识别,包括:The performing face recognition on the face image includes:
    从所述人脸图像中提取人脸特征信息;Extracting facial feature information from the facial image;
    将所述人脸特征信息与预存的标准人脸特征信息进行相似度比对;Comparing the facial feature information with pre-stored standard facial feature information for similarity;
    在所述相似度比对的结果满足预设要求时,确定所述人脸识别通过。When the result of the similarity comparison meets a preset requirement, it is determined that the face recognition passes.
  12. 根据权利要求1所述的方法,其特征在于,The method according to claim 1, wherein:
    所述设定操作指令为支付操作指令;The setting operation instruction is a payment operation instruction;
    所述在所述唇形识别通过后,响应所述设定操作指令,包括:The responding to the setting operation instruction after the lip shape recognition is passed includes:
    在所述唇形识别通过后,响应所述支付操作指令,以完成相应的支付。After the lip shape recognition is passed, respond to the payment operation instruction to complete the corresponding payment.
  13. 一种终端设备,其特征在于,所述终端设备包括处理器以及与所述处理器连接的摄像头模组以及存储器;A terminal device, characterized in that the terminal device includes a processor, a camera module connected with the processor, and a memory;
    所述存储器用于存储程序数据,所述处理器用于执行所述程序数据,以实现如权利要求1-12任一项所述的方法。The memory is used to store program data, and the processor is used to execute the program data to implement the method according to any one of claims 1-12.
  14. 一种计算机存储介质,其特征在于,所述计算机存储介质用于存储程序数据,所述程序数据在被处理器执行时,用于实现如权利要求1-12任一项所述的方法。A computer storage medium, wherein the computer storage medium is used to store program data, and the program data is used to implement the method according to any one of claims 1-12 when executed by a processor.
  15. 一种终端设备,其特征在于,所述终端设备包括:A terminal device, characterized in that the terminal device includes:
    第一识别模块,用于在获取到设定操作指令时,采集第一待检测图像,并对所述第一待检测图像进行人脸识别;The first recognition module is configured to collect the first image to be detected and perform face recognition on the first image to be detected when the setting operation instruction is acquired;
    第二识别模块,用于在所述人脸识别通过后,采集连续的多个第二待检测图像,并对所述第二待检测图像进行唇形识别;The second recognition module is configured to collect a plurality of consecutive second to-be-detected images after the face recognition is passed, and perform lip-shape recognition on the second to-be-detected images;
    响应模块,用于在所述唇形识别通过后,响应所述设定操作指令。The response module is used to respond to the setting operation instruction after the lip shape recognition is passed.





